Paulo Zanoni [Tue, 19 Jan 2016 13:35:56 +0000 (11:35 -0200)]
drm/i915/fbc: call intel_fbc_pre_update earlier during page flips
Make sure we do the pre_update - which also deactivates FBC - before
we actually schedule the page flip, just to make sure we don't
flip to the new FB with FBC still activated for the previous FB.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:55 +0000 (11:35 -0200)]
drm/i915/fbc: don't store the fb_id on reg_params
We don't actually use fb_id anywhere. We already compare all
parameters that matter to the hardware: pixel format, stride,
fence_reg and ggtt_offset. The ID shouldn't make a difference.
Besides, we already update the FBC data at every modeset/flip, so this
can't change behind our backs.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:54 +0000 (11:35 -0200)]
drm/i915/fbc: don't print no_fbc_reason to dmesg
Our dmesg messages started being misleading after we converted to the
enable+activate model: we always print "Disabling FBC", even when
we're just deactivating it. So, for example, when I boot my machine
and do "dmesg | grep -i fbc", I see:
[drm:intel_fbc_enable] Enabling FBC on pipe A
[drm:set_no_fbc_reason] Disabling FBC: framebuffer not tiled or fenced
but then, if I read the debugfs file, I will see:
$ sudo cat i915_fbc_status
FBC enabled
Compressing: yes
so we can conclude that dmesg is misleading, since FBC is actually
enabled. What happened is that we deactivated FBC due to fbcon not
being tiled, but when we silently reactivated it when the display
manager started. We don't print activation messages since there may be
way too many of these operations per second during normal desktop
usage.
One possible solution would be to change set_no_fbc_reason to
correctly differentiate between disable and deactivation, but we
removed support from printing activation/deactivation messages in the
past because they were too frequent. So instead of doing this, let's
just not print anything on dmesg, and leave the debugfs file if the
user needs to investigate something. We already print when we enable
and disable FBC anyway on a given pipe, so this should already help
triaging bugs.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:52 +0000 (11:35 -0200)]
drm/i915/fbc: make FBC work with fastboot
Move intel_fbc_enable to a place where it is called regardless of the
"modeset" variable, and make sure intel_fbc_enable can be called
multiple times without intel_fbc_disable being called.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:51 +0000 (11:35 -0200)]
drm/i915/fbc: move intel_fbc_{enable, disable} call one level up
Instead of duplicating the calls for every platform, let's just put
them in the correct places inside intel_atomic_commit. This will also
make it easier for us to move the enable call in order to support
fasbtoot.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:50 +0000 (11:35 -0200)]
drm/i915/fbc: choose the new FBC CRTC during atomic check
This opens the possibility of implementing nicer schemes to choose the
CRTC, such as checking the amount of stolen memory available, or
choosing the best pipe on platforms that don't die FBC to pipe or
plane A.
This code was written for another refactor that I ended up discarding,
so I don't actually need it, but I figured this patch would be an
improvement on its own so I kept it on the series.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:49 +0000 (11:35 -0200)]
drm/i915: simplify struct drm_device access at intel_atomic_check()
We already have a dev variable, there's no need to access state->dev.
Also, I plan to add another dev_priv user here, so declare one for the
current user.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:48 +0000 (11:35 -0200)]
drm/i915/fbc: rewrite the multiple_pipes_ok() code for locking
Older FBC platforms have this restriction where FBC can't be enabled
if multiple pipes are enabled. In the current code, we disable FBC
before the second pipe becomes visible.
One of the problems with this code is that the current
multiple_pipes_ok() implementation just iterates through all CRTCs
looking at their states, but it doesn't make sure that the state
locks are grabbed. It also can't just grab the locks for every CRTC
since this would kill one of the biggest advantages of atomic
modesetting.
After the recent FBC changes, we now have the appropriate locks for
the given CRTC, so we can just try to maintain the state of each CRTC
and update it once intel_fbc_pre_update is called.
As a last note, I don't have gen 2/3 machines to test this code. My
current plan is to enable FBC on just the newer platforms, so this
patch is just an attempt to get the gen 2/3 code at least looking
sane, so if one day someone decide to fix FBC on these platforms, they
may have less work to do.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:47 +0000 (11:35 -0200)]
drm/i915/fbc: make sure we cancel the work function at fbc_disable
Just to be sure nothing will survive a module unload. We need to do
this after the unlock in order to make sure the function won't get
stuck trying to grab the lock we already own while we wait for it to
finish.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:45 +0000 (11:35 -0200)]
drm/i915/fbc: unexport intel_fbc_deactivate
With the addition and usage of intel_fbc_pre_update,
intel_fbc_deactivate is not used anymore outside intel_fbc.c, so kill
the exported function and rename __intel_fbc_deactivate.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:44 +0000 (11:35 -0200)]
drm/i915/fbc: fix the FBC state checking code
We'll now call intel_fbc_pre_update instead of intel_fbc_deactivate
during atomic commits. This will continue to guarantee that we
deactivate FBC and it will also update the state checking structures
at the correct time. Then, later, at the point where we were calling
intel_fbc_update, we'll only need to call intel_fbc_post_update.
Also add the proper warnings in case we don't have the appropriate
locks. Daniel mentioned the warnings will have to be removed for async
commits, but let's keep them here while we can.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:43 +0000 (11:35 -0200)]
drm/i915/fbc: split intel_fbc_update into pre and post update
So now pre_update will be responsible for unconditionally deactivating
FBC and updating the state cache, while post_update will be
responsible for checking if it can be enabled, then enabling it.
This is one more step into proper locking.
Notice that intel_fbc_flush now calls post_update directly. The FBC
flush can only happen for drawing operations - since we explicitly
ignore the flips -, so the FBC state is not expected to have changed
at this point. With this we can just run post_update, which will make
sure we won't deactivate+reactivate FBC as would be the case now if we
called pre_update + post_update.
Per the new atomic locking rules, we need to cache the CRTC, plane and
FB state structures we use so we can access them later without needing
more locks. So do this.
Notice that there are some pieces of the FBC code that look at things
that are only computed during the modeset, so we can't just can't
precompute whether FBC can be activated during the update_state_cache
stage. We may be able to do this later.
Paulo Zanoni [Thu, 21 Jan 2016 20:07:17 +0000 (18:07 -0200)]
drm/i915/fbc: unconditionally update FBC during atomic commits
We unconditionally disable/update FBC even during the page flip
IOCTLs, and an unconditional disable/update at every atomic commit
touching the primary plane shouldn't impact PC state residency
noticeably. Besides, the code that checks for rotation is a good hint
that we may be forgetting something else, so let's leave all the
decisions to intel_fbc.c, making the code much safer.
Once we have the code to properly make FBC enable/update decisions
based on atomic states, with proper locking, then we'll be able to
evaluate whether it will be worth trying to optimize the cases where a
disable isn't needed.
v2: Upstream moved and now our patch needs to remove dev_priv.
Paulo Zanoni [Tue, 19 Jan 2016 13:35:39 +0000 (11:35 -0200)]
drm/i915/fbc: don't use the frontbuffer tracking subsystem for flips
Before this patch, page flips would call intel_frontbuffer_flip() and
intel_frontbuffer_flip_complete(), which would call intel_fbc_flush(),
which would call intel_fbc_update(). The problem is that drawing
operations also trigger intel_fbc_flush() calls, so it's not
guaranteed that we have the CRTC and FB locks grabbed when
intel_fbc_flush() happens, since the call trace may come from the
rendering path.
We're trying to make the FBC code grab the appropriate CRTC/FB locks,
so split the drawing and the flipping logic in order to achieve that
in later patches. So now the frontbuffer tracking code is just going
to be used for frontbuffer drawing, and intel_fbc_update() is going to
be used directly for actual page flips.
As a note, we don't need to call intel_fbc_flip() during the two
places where we call intel_frontbuffer_flip() since in one of them we
already have an intel_fbc_update() call, and in the other we have the
planes disabled.
Paulo Zanoni [Mon, 11 Jan 2016 19:44:36 +0000 (17:44 -0200)]
drm/i915/fbc: replace frequent dev_priv->fbc.x with fbc->x
We say "dev_priv->fbc.something" way too many times in our code while
we could be saying just "fbc->something" with a previous declaration
of fbc. This has been bothering me for a while but I didn't want to
patch it since I wanted to fix the real problems first. But as I add
more code I keep thinking about it, especially since it makes the code
easier to read and it can make us fit 80 columns easier, so let's just
do the change now.
While at it, also rename from i915_fbc to intel_fbc because the whole
FBC code uses intel_fbc.
The early return inside __intel_fbc_update does not completely check
all the parameters that affect the FBC register values. For example,
we currently lack looking at crtc->adjusted_y (for the fence Y offset)
and all the parameters that affect the CFB size (for i8xx).
Instead of just adding the missing parameters to the check and hoping
that any changes to the fbc_activate functions also come with a
matching change to the __intel_fbc_update check, introduce a new
structure where we store these parameters and use the structure at the
fbc_activate function. Of course, it's still possible to access
everything from dev_priv in those functions, but IMHO the new code
will be harder to break.
Paulo Zanoni [Thu, 21 Jan 2016 20:03:05 +0000 (18:03 -0200)]
drm/i915/fbc: wait for a vblank instead of 50ms when enabling
Instead of waiting for 50ms, just wait until the next vblank, since
it's the minimum requirement. The whole infrastructure of FBC is based
on vblanks, so waiting for X vblanks instead of X milliseconds sounds
like the correct way to go. Besides, 50ms may be less than a vblank on
super slow modes that may or may not exist.
There are some small improvements in PC state residency (due to the
fact that we're now using 16ms for the common modes instead of 50ms),
but the biggest advantage is still the correctness of being
vblank-based instead of time-based.
v2:
- Rebase after changing the patch order.
- Update the commit message.
v3:
- Fix bogus vblank_get() instead of vblank_count() (Ville).
- Don't forget to call drm_crtc_vblank_{get,put} (Chris, Ville)
- Adjust the performance details on the commit message.
v4:
- Don't grab the FBC mutex just to grab the vblank (Maarten)
Gerd Hoffmann [Mon, 25 Jan 2016 11:02:28 +0000 (12:02 +0100)]
drm/i915: refine qemu south bridge detection
The test for the qemu q35 south bridge added by commit
"39bfcd52 drm/i915: more virtual south bridge detection"
also matches on real hardware. Having the check for
virtual systems last in the list is not enough to avoid
that ...
Refine the check by additionally verifying the pci
subsystem id to see whenever it *really* is qemu.
Ville Syrjälä [Wed, 20 Jan 2016 19:05:28 +0000 (21:05 +0200)]
drm/i915: Fix intel_tile_width() parameters
The fb_modifiers and cpp arguments passed to intel_tile_width() in
intel_fill_fb_ggtt_view() got accidentally swapped around. I'm pretty
sure I fixed this already, but could be I lost the fix accidentally
during some rebases or something. Anyway, fix it up for real.
Ville Syrjälä [Wed, 20 Jan 2016 19:05:26 +0000 (21:05 +0200)]
drm/i915: Standardize on 'cpp' for bytes per pixel
We more or less randomly call the "bytes per pixel" value
'cpp', 'bytes_per_pixel', 'pixel_size', or even 'bpp'. Let's just pick
one and stick to it. I've chosen 'cpp'.
Ville Syrjälä [Wed, 20 Jan 2016 19:05:23 +0000 (21:05 +0200)]
drm/i915: Pass stride to rotate_pages()
Pass stride in addition to width and height to rotate_pages(). For now
width and stride are the same, but once framebuffer offsets enter the
scene that may no longer be the case.
Dave Gordon [Thu, 28 Jan 2016 10:48:09 +0000 (10:48 +0000)]
Fix pointer tests in error-handling paths
In the error-handling paths of i915_gem_do_execbuffer() and
intel_crtc_page_flip(), the local pointer-to-request variables
were expected to be either valid pointers or NULL. Since
2682708 drm/i915: simplify allocation of driver-internal requests
they could also be ERR_PTR() values, so the tests need to be
updated to accommodate this case.
Tvrtko Ursulin [Thu, 28 Jan 2016 10:29:57 +0000 (10:29 +0000)]
drm/i915: Fix premature LRC unpin in GuC mode
In GuC mode LRC pinning lifetime depends exclusively on the
request liftime. Since that is terminated by the seqno update
that opens up a race condition between GPU finishing writing
out the context image and the driver unpinning the LRC.
To extend the LRC lifetime we will employ a similar approach
to what legacy ringbuffer submission does.
We will start tracking the last submitted context per engine
and keep it pinned until it is replaced by another one.
Note that the driver unload path is a bit fragile and could
benefit greatly from efforts to unify the legacy and exec
list submission code paths.
At the moment i915_gem_context_fini has special casing for the
two which are potentialy not needed, and also depends on
i915_gem_cleanup_ringbuffer running before itself.
v2:
* Move pinning into engine->emit_request and actually fix
the reference/unreference logic. (Chris Wilson)
* ring->dev can be NULL on driver unload so use a different
route towards it.
v3:
* Rebase.
* Handle the reset path. (Chris Wilson)
* Exclude default context from the pinning - it is impossible
to get it right before default context special casing in
general is eliminated.
v4:
* Rebased & moved context tracking to
intel_logical_ring_advance_and_submit.
Tvrtko Ursulin [Thu, 28 Jan 2016 10:29:56 +0000 (10:29 +0000)]
drm/i915: Extract context unpinning to its own function
Will enable cleaner implementation of a following fix and
easier code unification in the future.
Idea and code by Chris Wilson.
v2: Do not return before last_contexts on engines are unpinned.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tvrtko Ursulin [Thu, 28 Jan 2016 10:29:55 +0000 (10:29 +0000)]
drm/i915: Make LRC pinning own a reference to the context
Will simplify the following fix and sounds logical.
v2: Add some whitespace to separate logic better. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Nick Hoath <nicholas.hoath@intel.com>
Tvrtko Ursulin [Thu, 28 Jan 2016 10:29:54 +0000 (10:29 +0000)]
drm/i915: Make LRC (un)pinning work on context and engine
Previously intel_lr_context_(un)pin were operating on requests
which is in conflict with their names.
If we make them take a context and an engine, it makes the names
make more sense and it also makes future fixes possible.
v2: Rebase for default_context/kernel_context change.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Nick Hoath <nicholas.hoath@intel.com>
Imre Deak [Tue, 19 Jan 2016 13:26:32 +0000 (15:26 +0200)]
drm/i915: Move stolen memory initialization earlier during loading
The only device specific dependency of the stolen memory setup is the
MMIO mapping and the stolen memory size. Both are already available in
i915_gtt_init(), so move the stolen initialization to there. The
clean-up code for i915_gtt_init() is in i915_global_gtt_cleanup(), so
move the stolen memory clean-up code there too.
This will be needed by an upcoming patch that needs the details of the
memory we reserve, but the change is also part of our generic goal to
move the initialization of resources with no or little dependencies on
other device specific resources towards the beginning of the init
sequence.
Imre Deak [Tue, 19 Jan 2016 13:26:31 +0000 (15:26 +0200)]
drm/i915: Move MCHBAR setup earlier during init
Move the MCHBAR setup right after the MMIO setup, since the two things
are logically related and the MCHBAR setup code doesn't depend on any
other device specific resource. We'll also need MCHBAR to be ready
earlier in an upcoming patch, so this is also a preparation for that.
Factor out the init/clean-up code to separate functions to make things
clearer in the i915_driver_load()/unload() functions.
Imre Deak [Tue, 19 Jan 2016 13:26:30 +0000 (15:26 +0200)]
drm/i915: Move allocation of various workqueues earlier during init
Workqueue initalization doesn't depend on any other device specific
resource, so move it close to the beginning, so we don't need to
consider them when thinking about dependencies for other resources.
Also factor out things to separate init/cleanup functions to make
i915_driver_load()/unload() clearer, atm it's somewhat difficult to
follow there in what order resources are inited/cleaned-up.
Imre Deak [Tue, 19 Jan 2016 13:26:29 +0000 (15:26 +0200)]
drm/i915: Sanitize i915_gem_load() init and clean-up
Factor out common clean-up code for the GEM load time init function.
Also rename i915_gem_load() to i915_gem_load_init() to have a better
match with its new clean-up function.
Imre Deak [Tue, 19 Jan 2016 13:26:28 +0000 (15:26 +0200)]
drm/i915: Sanitize GEM shrinker init and clean-up
Factor out the common GEM shrinker clean-up code and call the shrinker
init function from the same function from where the corresponding
shrinker clean-up function is called. Also add sanity checking to the
shrinker and OOM registration calls.
Clarify the name of the label on the error path, making it clear what's
being cleaned up. The kmem_cache_destroy() calls are NOPs on the
corresponding error path.
"Which would mean the offender is in intel_logical_ring_cleanup is most
likely:
...
if (ring->status_page.obj) {
kunmap(sg_page(ring->status_page.obj->pages->sgl));
ring->status_page.obj = NULL;
}
...
"I think that the i915_gem_context_fini will do a final unref on
dev_priv->kernel_context and then the ring buff has a copy which is
left dangling because:
"Where default_ctx_obj == dev_priv->kernel_context->engine[ring->id].state
So indeed looks like the unload ordering is the trigger. In fact it
is almost the same fragility wrt/ kernel_context hidden dependency I
expressed my worry about in an e-mail yesterday or so. It only shows
if CONFIG_DEBUG_SG is set, otherwise it accesses freed memory and
probably just survives."
This causes serious trouble in our CI system since it took out all
gen8+ machines. Not yet clear why this wasn't caught in pre-merge
testing.
Cc: Nick Hoath <nicholas.hoath@intel.com> Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Tested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Nick Hoath [Thu, 21 Jan 2016 19:37:45 +0000 (19:37 +0000)]
drm/i915: Fix context/engine cleanup order
Swap the order of context & engine cleanup, so that contexts are cleaned
up first, and *then* engines. This is a more sensible order anyway, but
in particular has become necessary since the 'intel_ring_initialized()
must be simple and inline' patch, which now uses ring->dev as an
'initialised' flag, so it can now be NULL after engine teardown. This
in turn can cause a problem in the context code, which (used to) check
the ring->dev->struct_mutex -- causing a fault if ring->dev was NULL.
Also rename the cleanup function to reflect what it actually does
(cleanup engines, not a ringbuffer), and fix an annoying whitespace issue.
v2: Also make the fix in i915_load_modeset_init, not just in
i915_driver_unload (Chris Wilson)
v3: Had extra stuff in it.
v4: Reverted extra stuff (so we're back to v2).
Rebased and updated commentary above (Dave Gordon).
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1453405067-32890-3-git-send-email-david.s.gordon@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Thu, 21 Jan 2016 17:32:43 +0000 (17:32 +0000)]
drm/i915: Improve handling of overlapping objects
The generic interval tree we use to speed up range invalidation is an
augmented rbtree that can report all overlapping intervals for a given
range. Therefore we do not need to degrade to a linear list if we find
overlapping objects. Oops.
Arun Siluvery [Thu, 21 Jan 2016 21:43:53 +0000 (21:43 +0000)]
drm/i915/skl: Enable Per context Preemption granularity control
Per context preemption granularity control is only available from SKL:E0+
Actual WA is to disable percontext preemption granularity control until D0
which is the default case so this is equivalent to the inverse of
WaDisablePerCtxtPreemptionGranularityControl:skl
Arun Siluvery [Thu, 21 Jan 2016 21:43:52 +0000 (21:43 +0000)]
drm/i915/skl: Add GEN8_L3SQCREG4 to HW whitelist
Required for WaDisableLSQCROPERFforOCL:skl
This register is added to HW whitelist to support WA required for future
enabling of pre-emptive command execution, WA implementation will be in
userspace and it cannot program this register if it is not on HW whitelist.
Arun Siluvery [Thu, 21 Jan 2016 21:43:51 +0000 (21:43 +0000)]
drm/i915/bxt: Add GEN8_L3SQCREG4 to HW whitelist
Required for WaDisableLSQCROPERFforOCL:bxt
According to WA database these are only applicable for BXT:A0 but since
A0 and A1 shares the same GT these are extended for A1 as well.
This register is added to HW whitelist to support WA required for future
enabling of pre-emptive command execution, WA implementation will be in
userspace and it cannot program this register if it is not on HW whitelist.
Arun Siluvery [Thu, 21 Jan 2016 21:43:50 +0000 (21:43 +0000)]
drm/i915/bxt: Add GEN9_CS_DEBUG_MODE1 to HW whitelist
Required for,
WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt
WaDisableObjectLevelPreemptionForInstancedDraw:bxt
WaDisableObjectLevelPreemtionForInstanceId:bxt
According to WA database these are only applicable for BXT:A0 but since
A0 and A1 shares the same GT these are extended for A1 as well.
These are also required for SKL until B0 but not adding them because they
are pre-production steppings.
This register is added to HW whitelist to support WA required for future
enabling of pre-emptive command execution, WA implementation will be in
userspace and it cannot program this register if it is not on HW whitelist.
v2: use lower case in register defines (Nick)
v3: explain purpose of changes (Chris)
Arun Siluvery [Thu, 21 Jan 2016 21:43:49 +0000 (21:43 +0000)]
drm/i915/gen9: Add HDC_CHICKEN1 to HW whitelist
Required for WaAllowUMDToModifyHDCChicken1:skl,bxt
This register is added to HW whitelist to support WA required for future
enabling of pre-emptive command execution, WA implementation will be in
userspace and it cannot program this register if it is not on HW whitelist.
Arun Siluvery [Thu, 21 Jan 2016 21:43:48 +0000 (21:43 +0000)]
drm/i915/gen9: Add GEN8_CS_CHICKEN1 to HW whitelist
Required for WaEnablePreemptionGranularityControlByUMD:skl,bxt
This register is added to HW whitelist to support WA required for future
enabling of pre-emptive command execution, WA implementation will be in
userspace and it cannot program this register if it is not on HW whitelist.
Arun Siluvery [Thu, 21 Jan 2016 21:43:47 +0000 (21:43 +0000)]
drm/i915/gen9: Add framework to whitelist specific GPU registers
Some of the HW registers are privileged and cannot be written to from
non-privileged batch buffers coming from userspace unless they are added to
the HW whitelist. This whitelist is maintained by HW and it is different from
SW whitelist. Userspace need write access to them to implement preemption
related WA.
The reason for using this approach is, the register bits that control
preemption granularity at the HW level are not context save/restored; so even
if we set these bits always in kernel they are going to change once the
context is switched out. We can consider making them non-privileged by
default but these registers also contain other chicken bits which should not
be allowed to be modified.
In the later revisions controlling bits are save/restored at context level but
in the existing revisions these are exported via other debug registers and
should be on the whitelist. This patch adds changes to provide HW with a list
of registers to be whitelisted. HW checks this list during execution and
provides access accordingly.
HW imposes a limit on the number of registers on whitelist and it is
per-engine. At this point we are only enabling whitelist for RCS and we don't
foresee any requirement for other engines.
The registers to be whitelisted are added using generic workaround list
mechanism, even these are only enablers for userspace workarounds. But by
sharing this mechanism we get some test assets without additional cost (Mika).
v2: rebase
v3: parameterize RING_FORCE_TO_NONPRIV() as _MMIO() should be limited to
i915_reg.h (Ville), drop inline for wa_ring_whitelist_reg (Mika).
v4: improvements suggested by Chris Wilson.
Clarify that this is HW whitelist and different from the one maintained in
driver. This list is engine specific but it gets initialized along with other
WA which is RCS specific thing, so make it clear that we are not doing any
cross engine setup during initialization.
Make HW whitelist count of each engine available in debugfs.
Andreas Ziegler [Mon, 25 Jan 2016 11:41:19 +0000 (12:41 +0100)]
drm/i915: Remove select to deleted STOP_MACHINE from Kconfig
Commit 5bab6f60cb4d ("drm/i915: Serialise updates to GGTT with access
through GGTT on Braswell") depended upon a working stop_machine() and
so forced the selection of STOP_MACHINE. However, commit 86fffe4a61dd
("kernel: remove stop_machine() Kconfig dependency") removed the option
STOP_MACHINE from init/Kconfig and ensured that stop_machine()
universally works. Due to the order in which the patches were applied,
removing the select from DRM_I915 got lost during merging.
Chris Wilson [Fri, 15 Jan 2016 16:51:46 +0000 (16:51 +0000)]
drm/i915: Seal busy-ioctl uABI and prevent leaking of internal ids
Tvrtko was looking through the execbuffer-ioctl and noticed that the
uABI was tightly coupled to our internal engine identifiers. Close
inspection also revealed that we leak those internal engine identifiers
through the busy-ioctl, and those internal identifiers already do not
match the user identifiers. Fortuitiously, there is only one user of the
set of busy rings from the busy-ioctl, and they only wish to choose
between the RENDER and the BLT engines.
Let's fix the userspace ABI while we still can.
v2: Update the uAPI documentation to explain the identifiers.
Tvrtko Ursulin [Fri, 15 Jan 2016 15:12:50 +0000 (15:12 +0000)]
drm/i915: Decouple execbuf uAPI from internal implementation
At the moment execbuf ring selection is fully coupled to
internal ring ids which is not a good thing on its own.
This dependency is also spread between two source files and
not spelled out at either side which makes it hidden and
fragile.
This patch decouples this dependency by introducing an explicit
translation table of execbuf uAPI to ring id close to the only
call site (i915_gem_do_execbuffer).
This way we are free to change driver internal implementation
details without breaking userspace. All state relating to the
uAPI is now contained in, or next to, i915_gem_do_execbuffer.
As a side benefit, this patch decreases the compiled size
of i915_gem_do_execbuffer.
v2: Extract ring selection into eb_select_ring. (Chris Wilson)
Chris Wilson [Wed, 20 Jan 2016 13:43:35 +0000 (15:43 +0200)]
drm/i915: Use ordered seqno write interrupt generation on gen8+ execlists
Broadwell and later currently use the same unordered command sequence to
update the seqno in the HWS status page and then assert the user
interrupt. We should apply the w/a from legacy (where we do an mmio
read to delay the seqno read after the interrupt), but this is not
enough to enforce coherent seqno visibilty on Skylake. Rather than
search for the proper post-interrupt seqno barrier, use a strongly
ordered command sequence to write the seqno, then assert the user
interrupt from the ring.
v2: Move around the wa tail dwords to avoid adding duplicate code.
v3: Add references, comments on workarounds and bit5 check.
Mika Kuoppala [Wed, 20 Jan 2016 10:32:23 +0000 (12:32 +0200)]
drm/i915: Limit the auto arming of mmio debugs on vlv/chv
The capability to detect unclaimed register access was
recently introduced for vlv/chv platforms. Apparently
there are plenty of unclaimed access on these platforms,
resulting in new dmesg warns. But as we are trying to form
a beachhead for CI/Bat, all new warns are adding to the
noise and thus not desirable at this point in time.
Make it so that if in these platforms the automatic arming
was responsible for mmio_debug enabling, ignore the warns.
If user/dev wants to fix these, he can still do so by
i915.mmio_debug=1234.
Daniel Vetter [Tue, 19 Jan 2016 20:00:56 +0000 (21:00 +0100)]
drm/i915: Tune down "GT register while GT waking disabled" message
We've had this since forever, and's randomly reporting issues and as
such causing piles&piles of CI noise. Mika is working on proper debug
infrastructure for this, and on fixing this properly.
Dave Gordon [Tue, 19 Jan 2016 19:02:55 +0000 (19:02 +0000)]
drm/i915: tidy up a few leftovers
There are a few bits of code which the transformations implemented by
the previous patch reveal to be suboptimal, once the notion of a per-
ring default context has gone away. So this tidies up the leftovers.
It could have been squashed into the previous patch, but that would have
made that patch less clearly a simple transformation. In particular, any
change which alters the code block structure or indentation has been
deferred into this separate patch, because such things tend to make
diffs more difficult to read.
Dave Gordon [Tue, 19 Jan 2016 19:02:54 +0000 (19:02 +0000)]
drm/i915: abolish separate per-ring default_context pointers
Now that we've eliminated a lot of uses of ring->default_context,
we can eliminate the pointer itself.
All the engines share the same default intel_context, so we can just
keep a single reference to it in the dev_priv structure rather than one
in each of the engine[] elements. This make refcounting more sensible
too, as we now have a refcount of one for the one pointer, rather than
a refcount of one but multiple pointers.
From an idea by Chris Wilson.
v2: transform an extra instance of ring->default_context introduced by 42f1cae8c drm/i915: Restore inhibiting the load of the default context
That patch's commentary includes:
v2: Mark the global default context as uninitialized on GPU reset so
that the context-local workarounds are reloaded upon re-enabling
The code implementing that now also benefits from the replacement of
the multiple (per-ring) pointers to the default context with a single
pointer to the unique kernel context.
Dave Gordon [Tue, 19 Jan 2016 19:02:53 +0000 (19:02 +0000)]
drm/i915: simplify allocation of driver-internal requests
There are a number of places where the driver needs a request, but isn't
working on behalf of any specific user or in a specific context. At
present, we associate them with the per-engine default context. A future
patch will abolish those per-engine context pointers; but we can already
eliminate a lot of the references to them, just by making the allocator
allow NULL as a shorthand for "an appropriate context for this ring",
which will mean that the callers don't need to know anything about how
the "appropriate context" is found (e.g. per-ring vs per-device, etc).
So this patch renames the existing i915_gem_request_alloc(), and makes
it local (static inline), and replaces it with a wrapper that provides
a default if the context is NULL, and also has a nicer calling
convention (doesn't require a pointer to an output parameter). Then we
change all callers to use the new convention:
OLD:
err = i915_gem_request_alloc(ring, user_ctx, &req);
if (err) ...
NEW:
req = i915_gem_request_alloc(ring, user_ctx);
if (IS_ERR(req)) ...
OLD:
err = i915_gem_request_alloc(ring, ring->default_context, &req);
if (err) ...
NEW:
req = i915_gem_request_alloc(ring, NULL);
if (IS_ERR(req)) ...
Ville Syrjälä [Tue, 19 Jan 2016 16:23:17 +0000 (18:23 +0200)]
drm/i915: Fix NULL plane->fb oops on SKL
In this atomic age, we can't trust the plane->fb pointer anymore.
It might get update too late. Instead we are supposed to use the
plane_state->fb pointer instead. Let's do that in
intel_plane_obj_offset() and avoid problems from dereferencing the
potentially stale plane->fb pointer.
Paulo found this with 'kms_frontbuffer_tracking --show-hidden --run-subtest nop-1p-rte'
but it can be reproduced with just plain old kms_setplane.
I was too lazy to bisect this, so not sure exactly when it broke. The
most obvious candidate
commit ce7f17285639 ("drm/i915: Fix i915_ggtt_view_equal to handle rotation correctly")
was actually still fine, so it must have broken some time after that.
This commit was triggering some FIFO underrun warnings on ILK-IVB
platforms (but surprisingly not on HSW/BDW that share more or less the
same codepaths). These underruns were caught by the continuous
integration (CI) system and could be reproduced consistently when
running the basic acceptance tests (BAT) on the affected platforms.
Note that this revert will cause a visible regression for some
end-users; the "flicker when mouse moves between monitors in X" issue
that was reported before this patch was merged will now return. However
regressions that are visible to CI have higher priority since they
prevent proper testing of future patches on those platforms. Hopefully
we'll be able to figure out the cause of the underruns quickly and
remerge an improved version of this patch to fix the regression.
Arun Siluvery [Mon, 18 Jan 2016 15:59:36 +0000 (15:59 +0000)]
drm/i915/gen9: Correct max save/restore register count during gpu reset with GuC
In GuC submission mode, driver has to provide a list of registers to be
save/restored during gpu reset, make the max no. of registers value consistent
with that of the value defined in FW. If they are not in sync then register
save/restore during gpu reset won't work as expected.
Chris Wilson [Wed, 13 Jan 2016 17:38:15 +0000 (17:38 +0000)]
drm/i915: Demote user facing DMC firmware load failure message
This is an expected error given the lack of the firmware so emit it at
KERN_NOTICE and not KERN_ERROR. Also include the firmware URL in the
user facing message so that the user can investigate and fix the issue
on their own, and also explain the consequence in plain language.
The complete failure message, including the first line from the firmware
loader, becomes
i915 0000:00:02.0: Direct firmware load for i915/skl_dmc_ver1.bin failed with error -2
i915 0000:00:02.0: Failed to load DMC firmware [https://01.org/linuxgraphics/intel-linux-graphics-firmwares], disabling runtime power management.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Sunil Kamath <sunil.kamath@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Animesh Manna <animesh.manna@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452706695-13518-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ville Syrjälä [Thu, 15 Oct 2015 14:01:58 +0000 (17:01 +0300)]
drm/i915: skl_update_scaler() wants a rotation bitmask instead of bit number
Pass BIT(DRM_ROTATE_0) instead of DRM_ROTATE_0 to skl_update_scaler().
The former is a mask, the latter just the bit number.
Fortunately the only thing skl_update_scaler() does with the rotation
is check if it's 90/270 degrees or not, and so in this case it would
still do the right thing.
Ville Syrjälä [Fri, 15 Jan 2016 18:46:53 +0000 (20:46 +0200)]
drm/i915: Don't reject primary plane windowing with color keying enabled on SKL+
On SKL+ plane scaling is mutually exclusive with color keying. The code
check for this, but during some refactoring the code got changed to
also reject primary plane windowing when color keying is used. There is
no such restriction in the hardware, so restore the original logic.
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: 061e4b8d650a ("drm/i915: clean up atomic plane check functions, v2.") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452883613-28549-1-git-send-email-ville.syrjala@linux.intel.com Cc: stable@vger.kernel.org Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Alex Dai [Wed, 13 Jan 2016 19:01:50 +0000 (11:01 -0800)]
drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed
During driver unloading, the guc_client created for command submission
needs to be released to avoid memory leak.
The struct_mutex needs to be held before tearing down GuC.
v1: Move i915_guc_submission_disable out of i915_guc_submission_fini and
take struct_mutex lock before release GuC client. (Dave Gordon)
v2: Add the locking for failure case in guc_fw_fetch. (Dave Gordon)
Add i915_guc_submission_fini for failure case in intel_guc_ucode_load.
Tvrtko Ursulin [Fri, 15 Jan 2016 17:12:45 +0000 (17:12 +0000)]
drm/i915: Cache LRC state page in the context
LRC lifetime is well defined so we can cache the page pointing
to the object backing store in the context in order to avoid
walking over the object SG page list from the interrupt context
without the big lock held.
v2: Also cache the mapping. (Chris Wilson)
v3: Unmap on the error path.
v4: No need to cache the page. (Chris Wilson)
v5: No need to dirty the page on unpin. (Chris Wilson)
v6: kmap() cannot fail and use kmap_to_page to simplify unpin.
(Chris Wilson)
Tvrtko Ursulin [Fri, 15 Jan 2016 15:10:27 +0000 (15:10 +0000)]
drm/i915: Do not call API requiring struct_mutex where it is not available
LRC code was calling GEM API like i915_gem_obj_ggtt_offset from
places where the struct_mutex cannot be grabbed (irq handlers).
To avoid that this patch caches some interesting bits and values
in the engine and context structures.
Some usages are also removed where they are not needed like a
few asserts which are either impossible or have been checked
already during engine initialization.
Side benefit is also that interrupt handlers and command
submission stop evaluating invariant conditionals, like what
Gen we are running on, on every interrupt and every command
submitted.
This patch deals with logical ring context id and descriptors
while subsequent patches will deal with the remaining issues.
v2:
* Cache the VMA instead of the address. (Chris Wilson)
* Incorporate Dave Gordon's good comments and function name.
v3:
* Extract ctx descriptor template to a function and group
functions dealing with ctx descriptor & co together near
top of the file. (Dave Gordon)
Dave Airlie [Sun, 17 Jan 2016 21:02:19 +0000 (07:02 +1000)]
Merge tag 'drm-intel-next-fixes-2016-01-14' of git://anongit.freedesktop.org/drm-intel into drm-next
misc i915 fixes all over the place.
* tag 'drm-intel-next-fixes-2016-01-14' of git://anongit.freedesktop.org/drm-intel:
drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page
drm/i915: Widen return value for reservation_object_wait_timeout_rcu to long.
drm/i915: intel_hpd_init(): Fix suspend/resume reprobing
drm/i915: shut up gen8+ SDE irq dmesg noise, again
drm/i915: Restore inhibiting the load of the default context
drm/i915: Tune down rpm wakelock debug checks
drm/i915: Avoid writing relocs with addresses in non-canonical form
drm/i915: Move Braswell stop_machine GGTT insertion workaround
Dave Airlie [Sun, 17 Jan 2016 21:01:16 +0000 (07:01 +1000)]
Merge tag 'topic/drm-misc-2016-01-17' of git://anongit.freedesktop.org/drm-intel into drm-next
Since your main drm-next pull isn't out of the door yet I figured I might
as well flush out drm-misc instead of delaying for 4.6. It's really just
random stuff all over, biggest thing probably connector_mask tracking from
Maarten.
* tag 'topic/drm-misc-2016-01-17' of git://anongit.freedesktop.org/drm-intel: (24 commits)
drm/fb_cma_helper: Remove implicit call to disable_unused_functions
drm/sysfs: use kobj_to_dev()
drm/i915: Init power domains early in driver load
drm: Do not set connector->encoder in drivers
apple-gmux: Add initial documentation
drm: move MODULE_PARM_DESC to other file
drm/edid: index CEA/HDMI mode tables using the VIC
drm/atomic: Remove drm_atomic_connectors_for_crtc.
drm/i915: Update connector_mask during readout, v2.
drm: Remove opencoded drm_gem_object_release_handle()
drm: Do not set outparam on error during GEM handle allocation
drm/docs: more leftovers from the big vtable documentation pile
drm/atomic-helper: Reject legacy flips on a disabled pipe
drm/atomic: add connector mask to drm_crtc_state.
drm/tegra: Use __drm_atomic_helper_reset_connector for subclassing connector state, v2.
drm/atomic: Add __drm_atomic_helper_connector_reset, v2.
drm/i915: Set connector_state->connector using the helper.
drm: Use a normal idr allocation for the obj->name
drm: Only bump object-reference count when adding first handle
drm: Balance error path for GEM handle allocation
...
Maxime Ripard [Thu, 14 Jan 2016 15:24:56 +0000 (16:24 +0100)]
drm/fb_cma_helper: Remove implicit call to disable_unused_functions
The drm_fbdev_cma_init function always calls the
drm_helper_disable_unused_functions. Since it's part of the usual probe
process, all the drivers using that helper will end up having their encoder
and CRTC disable functions called at probe if their device has not been
reported as enabled.
This could be fixed by reading out from the registers the current state of
the device if it is enabled, but even that will not handle the case where
the device is actually disabled.
Moreover, the drivers using the atomic modesetting expect that their enable
and disable callback to be called when the device is already enabled or
disabled (respectively).
We can however fix this issue by moving the call to
drm_helper_disable_unused_functions out of drm_fbdev_cma_init and make the
drivers needing it (all the drivers calling drm_fbdev_cma_init and not
using the atomic modesetting) explicitly call it.
Jani Nikula [Wed, 13 Jan 2016 14:35:20 +0000 (16:35 +0200)]
drm/i915/dp: fall back to 18 bpp when sink capability is unknown
Per DP spec, the source device should fall back to 18 bpp, VESA range
RGB when the sink capability is unknown. Fix the color depth
clamping. 18 bpp color depth should ensure full color range in automatic
mode.
The clamping has been HDMI specific since its introduction in
Jani Nikula [Thu, 14 Jan 2016 15:12:07 +0000 (17:12 +0200)]
drm/i915/bios: Fix the sequence size calculations for MIPI seq v3
Two errors in a single line. The size was read from the wrong offset,
and the end index didn't take the five bytes for sequence byte and size
of sequence into account. Fix it all, and break up the calculations a
bit to make it clearer.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reported-and-tested-by: Mika Kahola <mika.kahola@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: 2a33d93486f2 ("drm/i915/bios: add support for MIPI sequence block v3") Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452784327-27258-1-git-send-email-jani.nikula@intel.com
Francisco Jerez [Thu, 14 Jan 2016 02:59:39 +0000 (18:59 -0800)]
drm/i915: Make sure DC writes are coherent on flush.
We need to set the DC FLUSH PIPE_CONTROL bit on Gen7+ to guarantee
that writes performed via the HDC are visible in memory. Fixes an
intermittent failure in a Piglit test that writes to a BO from a
shader using GL atomic counters (implemented as HDC untyped atomics)
and then expects the memory to read back the same value after mapping
it on the CPU.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91298 Tested-by: Mark Janes <mark.a.janes@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452740379-3194-1-git-send-email-currojerez@riseup.net Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Ville Syrjälä [Thu, 14 Jan 2016 12:53:35 +0000 (14:53 +0200)]
drm/i915: Use the active wm config for merging on ILK-BDW
ilk_program_watermarks() is supposed to merge the active watermarks from
all pipes. Thus we need to use the active config too instead of some
precomputed stuff.
Ville Syrjälä [Thu, 14 Jan 2016 12:53:34 +0000 (14:53 +0200)]
drm/i915: Start WM computation from scratch on ILK-BDW
ilk_compute_pipe_wm() assumes as zeroed pipe_wm structure when it
starts. We used to pass such a zeroed struct in, but this got broken
when the pipe_wm structure got embedded in the crtc state.
To fix it without too much fuzz, we need to resort to a memset().
Dave Airlie [Wed, 13 Jan 2016 21:59:21 +0000 (07:59 +1000)]
Merge branch 'exynos-drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next
This 2nd pull request includes the following,
- add configurable plane support and relevant cleanups.
- fixup kernel panic issue at drm releasing.
- remove unnecessary codes.
This has been delayed to resolve a critical issue - which incurrs
a kernel panic when driver is released - and review it.
* 'exynos-drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm/exynos: fix kernel panic issue at drm releasing
drm/exynos: crtc: do not wait for the scanout completion
drm/exynos: mixer: properly update all planes on the same vblank event
drm/exynos: crtc: rework atomic_{begin,flush}
drm/exynos: mixer: unify a check for video-processor window
drm/exynos: mixer: also allow ARGB1555 and ARGB4444
drm/exynos: mixer: refactor layer setup
drm/exynos: mixer: remove all static blending setup
drm/exynos: mixer: set window priority based on zpos
drm/exynos: make zpos property configurable
drm/exynos: rename zpos to index
Dave Airlie [Wed, 13 Jan 2016 21:58:06 +0000 (07:58 +1000)]
Merge branch 'drm-etnaviv-fixes' of git://git.pengutronix.de/git/lst/linux into drm-next
etnaviv fixes.
* 'drm-etnaviv-fixes' of git://git.pengutronix.de/git/lst/linux:
drm/etnaviv: fix workaround for GC500
drm/etnaviv: unlock on error in etnaviv_gem_get_iova()