Ville Syrjälä [Tue, 12 Jul 2016 12:59:28 +0000 (15:59 +0300)]
drm/i915: Fix iboost setting for DDI with 4 lanes on SKL
Bspec says:
"For DDIA with x4 capability (DDI_BUF_CTL DDIA Lane Capability Control =
DDIA x4), the I_boost value has to be programmed in both
tx_blnclegsctl_0 and tx_blnclegsctl_4."
Currently we only program tx_blnclegsctl_0. Let's do the other one as
well.
Chris Wilson [Tue, 2 Aug 2016 10:15:27 +0000 (11:15 +0100)]
drm/i915: Protect older gen against intel_gt_init_powersave()
In the middle of intel_gt_init_powersave() we have an if-chain that ends
with a universal else clause to read gen6+ registers. Older platforms
like Pineview that end up here do not like those registers and may even
OOPS whilst reading them!
Fixes: 3ea9a80132 ("drm/i915: Perform static RPS frequency setup ...") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470132927-1821-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
David Weinehall [Mon, 1 Aug 2016 14:33:27 +0000 (17:33 +0300)]
drm/i915/debugfs: Take runtime_pm ref for sseu
When reading the SSEU statistics, we need to call
intel_runtime_pm_get() first, otherwise we might end up
triggering "Device suspended during HW access".
Chris Wilson [Thu, 28 Jul 2016 23:45:35 +0000 (00:45 +0100)]
drm/i915: Add missing ring_mask to Pineview
It appears that we never told Pineview it has a RENDER_RING. This was
all fine until we started using the ring_mask for determining all the
available rings to initialise for legacy ringbuffer submission in commit 88d2ba2e95c8 ("drm/i915: Unify engine init loop"). Though really it is a
latent bug since the ring_mask inception in commit 73ae478cdf6a
("drm/i915: Replace has_bsd/blt/vebox with a mask").
To prevent similar mishaps in future, add a WARN_ON() if we find
ourselves with a device without any rings.
Fixes: 73ae478cdf6a ("drm/i915: Replace has_bsd/blt/vebox with a mask") Fixes: 88d2ba2e95c8 ("drm/i915: Unify engine init loop") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Ben Widawsky <ben@bwidawsk.net> Link: http://patchwork.freedesktop.org/patch/msgid/1469749535-2382-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: drm-intel-fixes@lists.freedesktop.org
Ville Syrjälä [Mon, 23 May 2016 14:42:48 +0000 (17:42 +0300)]
drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
SNB (and IVB too I suppose) starts to misbehave if the GPU gets stuck
in an infinite batch buffer loop. The GPU apparently hogs something
critical and CPUs start to lose interrupts and whatnot. We can keep
the system limping along by unmasking some interrupts in
GEN6_PMINTRMSK. The EI up interrupt has been previously chosen for
that task, so let's never mask it.
Chris Wilson [Wed, 27 Jul 2016 08:07:30 +0000 (09:07 +0100)]
drm/i915: Update a couple of hangcheck comments to talk about engines
We still have lots of comments that refer to the old ring when we mean
struct intel_engine_cs and its hardware correspondence. This patch fixes
an instance inside hangcheck to talk about engines.
Chris Wilson [Wed, 27 Jul 2016 08:07:28 +0000 (09:07 +0100)]
drm/i915: Avoid using intel_engine_cs *ring for GPU error capture
Inside the error capture itself, we refer to not only the hardware
engine, its ringbuffer but also the capture state. Finding clear names
for each whilst avoiding mixing ring/intel_engine_cs is tricky. As a
compromise we keep using ering for the error capture.
v2: Use 'ee' locals for struct drm_i915_error_engine
Chris Wilson [Wed, 27 Jul 2016 08:07:27 +0000 (09:07 +0100)]
drm/i915: Use engine to refer to the user's BSD intel_engine_cs
This patch transitions the execbuf engine selection away from using the
ring nomenclature - though we still refer to the user's incoming
selector as their user_ring_id.
Ville Syrjälä [Wed, 13 Jul 2016 13:32:03 +0000 (16:32 +0300)]
drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
Bspec tells us to keep bashing the PCU for up to 3ms when trying to
inform it about an upcoming change in the cdclk frequency. Currently
we only keep at it for 15*10usec (+ whatever delays gets added by
the sandybridge_pcode_read() itself). Let's change the limit to 3ms.
I decided to keep 10 usec delay per iteration for now, even though
the spec doesn't really tell us to do that.
Cc: stable@vger.kernel.org Fixes: 5d96d8afcfbb ("drm/i915/skl: Deinit/init the display at suspend/resume") Cc: David Weinehall <david.weinehall@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468416723-23440-1-git-send-email-ville.syrjala@linux.intel.com Tested-by: David Weinehall <david.weinehall@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Tue, 26 Jul 2016 11:01:53 +0000 (12:01 +0100)]
drm/i915: Only drop the batch-pool's object reference
The obj->batch_pool_link is only inspected when traversing the batch
pool list and when on the batch pool list the object is referenced. Thus
when freeing the batch pool list, we only need to unreference the object
and do not have to worry about the obj->batch_pool_link.
Chris Wilson [Tue, 26 Jul 2016 11:01:52 +0000 (12:01 +0100)]
drm/i915: Only clear the client pointer when tearing down the file
Upon release of the file (i.e. the user calls close(fd)), we decouple
all objects from the client list so that we don't chase the dangling
file_priv. As we always inspect file_priv first, we only need to nullify
that pointer and can safely ignore the list_head.
Chris Wilson [Tue, 26 Jul 2016 11:01:50 +0000 (12:01 +0100)]
drm/i915: Reduce breadcrumb lock coverage for intel_engine_enable_signaling()
Since intel_engine_enable_signaling() is now only called via
fence_enable_sw_signaling(), we can rely on it to provide serialisation
and run-once for us and so make ourselves slightly simpler.
Chris Wilson [Sun, 24 Jul 2016 09:10:21 +0000 (10:10 +0100)]
drm/i915: Update the breadcrumb interrupt counter before enabling
In order to close a race with a long running hangcheck comparing a stale
interrupt counter with a just started waiter, we need to first bump the
counter as we start the fresh wait.
Chris Wilson [Sun, 24 Jul 2016 09:10:20 +0000 (10:10 +0100)]
drm/i915: Drop racy markup of missed-irqs from idle-worker
During the idle-worker we disable the hangcheck and so kick any waiters
that should have been completed (since the GPU is now idle). Unlike the
hangcheck, we do not take any care to avoid the race between the irq
handler and ourselves, and so it is possible for us to declare a missed
interrupt even as the bottom-half is being scheduled to run. Let's
ignore this race to stop a potential false-positive error.
Chris Wilson [Thu, 21 Jul 2016 20:16:19 +0000 (21:16 +0100)]
Revert "drm/i915: Enable RC6 immediately"
This reverts commit b12e0ee2080c ("drm/i915: Enable RC6 immediately"),
as it was never meant to be sent anywhere other than the bug report for
experimentation.
I forgot to remove these when reworking the firmware loading sequence
last year. The new sequence is that we load firmware, and if it's not
there we entirely (and permanently) fail dmc setup.
Dave Gordon [Thu, 21 Jul 2016 17:39:38 +0000 (18:39 +0100)]
drm/i915: use i915_gem_object_put_unlocked() after releasing mutex
The exit path in intel_overlay_put_image_ioctl() first unlocks the
struct_mutex, then drops its reference to 'new_bo' by calling
i915_gem_object_put(). As it isn't holding the mutex at this point,
this should be i915_gem_object_put_unlocked().
This was previously correct but got splatted in the recent
s/drm_gem_object_unreference/i915_gem_object_put/
where the _unlocked suffix was lost in this one case.
v2: don't bother fixing whitespace glitch [Chris Wilson]
Chris can do it next time he touches gem_evict.c ;)
Dave Gordon [Wed, 20 Jul 2016 17:16:07 +0000 (18:16 +0100)]
drm/i915: rename & update eb_select_ring()
'ring' is an old deprecated term for a GPU engine, so we're trying to
phase out all such terminology. eb_select_ring() not only has 'ring'
(meaning engine) in its name, but it has an ugly calling convention
whereby it returns an errno and stores a pointer-to-engine indirectly
through an output parameter. As there is only one error it ever returns
(-EINVAL), we can make it return the pointer directly, and have the
caller pass back the error code -EINVAL if the pointer result is NULL.
Thus we can replace
- ret = eb_select_ring(dev_priv, file, args, &engine);
- if (ret)
- return ret;
with
+ engine = eb_select_engine(dev_priv, file, args);
+ if (!engine)
+ return -EINVAL;
for increased clarity and maybe save a few cycles too.
Dave Gordon [Wed, 20 Jul 2016 17:16:06 +0000 (18:16 +0100)]
drm/i915: rename 'ring' where it refers to an engine or engine_id
'ring' is an old deprecated term for a GPU engine. Chris Wilson wants to
use the name for what is currently known as an intel_ringbuffer, but it
will be dreadfully confusing if some rings are ringbuffers but other
rings are still engines. So this patch changes the names of a bunch of
parameters called 'ring' to either 'engine' or 'engine_id' according to
what they actually are.
Dave Gordon [Wed, 20 Jul 2016 17:16:05 +0000 (18:16 +0100)]
drm/i915: rename macro parameter(ring) to (engine)
'ring' is an old deprecated term for a GPU engine. Here we make the
terminology more consistent by renaming the 'ring' parameter of lots of
macros that calculate addresses within the MMIO space of an engine.
Add WaDisableGatherAtSetShaderCommonSlice for all gen9 as stated
by bspec. The bspec told to put this workaround to the per ctx bb.
Initial implementation and subsequent review were done based on
bspec. Arun raised a suspicion that this would belong to indirect bb
instead and he conducted more throughout investigation on the matter
and indeed the documentation was wrong.
v2: Move to indirect_ctx wa bb, as it is correct place (Arun)
References: HSD#2135817 Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> (v1) Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469013973-24104-1-git-send-email-mika.kuoppala@intel.com
Chris Wilson [Wed, 20 Jul 2016 12:31:57 +0000 (13:31 +0100)]
drm/i915: Convert i915_semaphores_is_enabled over to early sanitize
Rather than recomputing whether semaphores are enabled, we can do that
computation once during early initialisation as the i915.semaphores
module parameter is now read-only.
Chris Wilson [Wed, 20 Jul 2016 12:31:55 +0000 (13:31 +0100)]
drm/i915: Treat ringbuffer writes as write to normal memory
Ringbuffers are now being written to either through LLC or WC paths, so
treating them as simply iomem is no longer adequate. However, for the
older !llc hardware, the hardware is documentated as treating the TAIL
register update as serialising, so we can relax the barriers when filling
the rings (but even if it were not, it is still an uncached register write
and so serialising anyway.).
For simplicity, let's ignore the iomem annotation.
v2: Remove iomem from ringbuffer->virtual_address
v3: And for good measure add iomem elsewhere to keep sparse happy
As these are wrappers around kref_get/kref_put() it is preferable to
follow the naming convention and use the same verb get/put in our
wrapper names for manipulating a reference to the context.
Chris Wilson [Wed, 20 Jul 2016 08:21:15 +0000 (09:21 +0100)]
drm/i915: Wait on external rendering for GEM objects
When transitioning to the GTT or CPU domain we wait on all rendering
from i915 to complete (with the optimisation of allowing concurrent read
access by both the GPU and client). We don't yet ensure all rendering
from third parties (tracked by implicit fences on the dma-buf) is
complete. Since implicitly tracked rendering by third parties will
ignore our cache-domain tracking, we have to always wait upon rendering
from third-parties when transitioning to direct access to the backing
store. We still rely on clients notifying us of cache domain changes
(i.e. they need to move to the GTT read or write domain after doing a CPU
access before letting the third party render again).
v2:
This introduces a potential WARN_ON into i915_gem_object_free() as the
current i915_vma_unbind() calls i915_gem_object_wait_rendering(). To
hit this path we first need to render with the GPU, have a dma-buf
attached with an unsignaled fence and then interrupt the wait. It does
get fixed later in the series (when i915_vma_unbind() only waits on the
active VMA and not all, including third-party, rendering.
To offset that risk, use the __i915_vma_unbind_no_wait hack.
Chris Wilson [Wed, 20 Jul 2016 08:21:14 +0000 (09:21 +0100)]
drm/i915: Mark imported dma-buf objects as being coherent
A foreign dma-buf does not share our cache domain tracking, and we rely
on the producer ensuring cache coherency. Marking them as being in the
CPU domain is incorrect.
v2: Add commentary about the GTT domain. This is not the best place for
it, but pending an actual overhaul of our domain tracking and explaining
each one, this comment should help the next reader...
Chris Wilson [Wed, 20 Jul 2016 08:21:13 +0000 (09:21 +0100)]
drm/i915: Disable waitboosting for mmioflips/semaphores
Since commit a6f766f39751 ("drm/i915: Limit ring synchronisation (sw
sempahores) RPS boosts") and commit bcafc4e38b6a ("drm/i915: Limit mmio
flip RPS boosts") we have limited the waitboosting for semaphores and
flips. Ideally we do not want to boost in either of these instances as no
userspace consumer is waiting upon the results (though a userspace producer
may be stalled trying to submit an execbuf - but in this case the
producer is being throttled due to the engine being saturated with
work). With the introduction of NO_WAITBOOST in the previous patch, we
can finally disable these needless boosts.
Chris Wilson [Wed, 20 Jul 2016 08:21:12 +0000 (09:21 +0100)]
drm/i915: Disable waitboosting for fence_wait()
We want to restrict waitboosting to known process contexts, where we can
track which clients are receiving waitboosts and prevent excessive power
wasting. For fence_wait() we do not have any client tracking and so that
leaves it open to abuse.
v2: Hide the IS_ERR_OR_NULL testing for special clients
Chris Wilson [Wed, 20 Jul 2016 08:21:11 +0000 (09:21 +0100)]
drm/i915: Derive GEM requests from dma-fence
dma-buf provides a generic fence class for interoperation between
drivers. Internally we use the request structure as a fence, and so with
only a little bit of interfacing we can rebase those requests on top of
dma-buf fences. This will allow us, in the future, to pass those fences
back to userspace or between drivers.
v2: The fence_context needs to be globally unique, not just unique to
this device.
Chris Wilson [Wed, 20 Jul 2016 08:21:10 +0000 (09:21 +0100)]
drm/i915: Mark all current requests as complete before resetting them
Following a GPU reset upon hang, we retire all the requests and then
mark them all as complete. If we mark them as complete first, we both
keep the normal retirement order (completed first then retired) and
provide a small optimisation for concurrent lookups.
Chris Wilson [Wed, 20 Jul 2016 08:21:09 +0000 (09:21 +0100)]
drm/i915: Retire oldest completed request before allocating next
In order to keep the memory allocated for requests reasonably tight, try
to reuse the oldest request (so long as it is completed and has no
external references) for the next allocation.
v2: Throw in a comment to hopefully make sure no one mistakes the
optimistic retirement of the oldest request for simply stealing it.
We have latency issues that might impact the performance: #96606.
and hangs and loading issues on resume after S4: #96526.
This is also blocking a platform milestone so let's disable
this for now while we make sure we don't have any more loading
issue, or related basic hangs and it pass BAT for real in all
platofmrs.
In case BAT is wrong let's first fix BAT before re-enable it here.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96606
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96526 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Stable <stable@vger.kernel.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Christophe Prigent <christophe.prigent@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1468884477-30086-1-git-send-email-rodrigo.vivi@intel.com
Imre Deak [Fri, 1 Jul 2016 14:32:08 +0000 (17:32 +0300)]
drm/i915: Give proper names to MOCS entries
The purpose for each MOCS entry isn't well defined atm. Defining these
is important to remove any uncertainty about the use of these entries
for example in terms of performance and GPU/CPU coherency.
Suggested by Ville.
v4:
- Rename I915_MOCS_AUTO to I915_MOCS_PTE. (Chris)
Imre Deak [Fri, 1 Jul 2016 13:40:05 +0000 (16:40 +0300)]
drm/i915/bxt: Fix inadvertent CPU snooping due to incorrect MOCS config
Setting a write-back cache policy in the MOCS entry definition also
implies snooping, which has a considerable overhead. This is
unexpected for a few reasons:
- From user-space's point of view since it didn't want a coherent
surface (it didn't set the buffer as such via the set caching IOCTL).
- There is a separate MOCS entry field for snooping (which we never
set).
- This MOCS table is about caching in (e)LLC and there is no (e)LLC on
BXT. There is a separate table for L3 cache control.
Considering the above the current behavior of snooping looks like an
unintentional side-effect of the WB setting. Changing it to be LLC-UC
gets rid of the snooping without any ill-effects. For a coherent
surface the application would use a separate MOCS entry at index 1 and
call the set caching IOCTL to setup the PTE entries for the
corresponding buffer to be snooped. In the future we could also add a
new MOCS entry for coherent surfaces.
This resulted in 70% improvement in synthetic texturing benchmarks.
Kudos to Valtteri Rantala, Eero Tamminen and Michael T Frederick and
Ville who helped to narrow the source of problem to the kernel and to
the snooping behaviour in particular.
With a follow-up change to adjust the 3rd entry value
igt/gem_mocs_settings is passing after this change.
v2:
- Rebase on v2 of patch 1/2.
v3:
- Set the entry as LLC uncached instead of PTE-passthrough. This way
we also keep snooping disabled, but we also make the cacheability/
coherency setting indepent of the PTE which is managed by the
kernel. (Chris)
CC: Rong R Yang <rong.r.yang@intel.com> CC: Yakui Zhao <yakui.zhao@intel.com> CC: Valtteri Rantala <valtteri.rantala@intel.com> CC: Eero Tamminen <eero.t.tamminen@intel.com> CC: Michael T Frederick <michael.t.frederick@intel.com> CC: Ville Syrjälä <ville.syrjala@linux.intel.com> CC: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Imre Deak <imre.deak@intel.com> Acked-by: Zhao Yakui <yakui.zhao@intel.com> Tested-by: Rong R Yang <rong.r.yang@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467380406-11954-3-git-send-email-imre.deak@intel.com
Imre Deak [Fri, 1 Jul 2016 13:40:04 +0000 (16:40 +0300)]
drm/i915/gen9: Clean up MOCS table definitions
Use named struct initializers for clarity. Also fix the target cache
definition to reflect its role in GEN9 onwards. On GEN8 a TC value of 0
meant ELLC but on GEN9+ it means the TC and LRU controls are taken from
the PTE.
No functional change, igt/gem_mocs_settings still passing after this
change.
v2: (Chris)
- Add back the hexa literals for the entries.
Add note that igt/gem_mocs_settings still passes.
Daniel Vetter [Fri, 15 Jul 2016 19:48:06 +0000 (21:48 +0200)]
drm/i915: Clean up kerneldoc for intel_lrc.c
Fairly minimal, there's still lots of functions without any docs, and
which aren't static. But probably we want to first clean this up some more.
- Drop the bogus const. Marking argument pointers themselves (instead of
what they point at) as const provides roughly 0 value. And it's confusing,
since the data the pointer points at _is_ being changed.
- Remove kerneldoc for static functions. Keep comments where they seem valuable.
- Indent and whitespace fixes.
- Blockquote the bit field definitions of the descriptor for correct layouting.
Daniel Vetter [Fri, 15 Jul 2016 19:48:05 +0000 (21:48 +0200)]
drm/i915: Fixup kerneldoc code snippets in intel_uncore.c
We need :: before, blank lines around and indentation with 4 _additional_
spaces to make it work. Also, don't use @param in code snippets, it results
in confusion.
Chris Wilson [Sat, 16 Jul 2016 17:42:36 +0000 (18:42 +0100)]
drm/i915: Handle ENOSPC after failing to insert a mappable node
Even after adding individual page support for GTT mmaping, we can still
fail to find any space within the mappable region, and
drm_mm_insert_node() will then report ENOSPC. We have to then handle
this error by using the shmem access to the pages.
Fixes: b50a53715f09 ("drm/i915: Support for pread/pwrite ... objects")
Testcase: igt/gem_concurrent_blit Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com Link: http://patchwork.freedesktop.org/patch/msgid/1468690956-23480-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Dave Gordon [Thu, 14 Jul 2016 13:52:04 +0000 (14:52 +0100)]
drm/i915: refactor eb_get_batch()
Precursor for fix to secure batch execution. We will need to be able to
retrieve the batch VMA (as well as the batch itself) from the eb list,
so this patch extracts that part of eb_get_batch() into a separate
function, and moves both parts to a more logical place in the file, near
where the eb list is created.
Also, it may not be obvious, but the current execbuffer2 ioctl interface
requires that the buffer object containing the batch-to-be-executed be
the LAST entry in the exec2_list[] array (I expected it to be the
first!).
To clarify this, we can replace the rather obscure construct
"list_entry(eb->vmas.prev, ...)"
in the old version of eb_get_batch() with the equivalent but more
explicit
"list_last_entry(&eb->vmas,...)"
in the new eb_get_batch_vma() and of course add an explanatory comment.
Dave Gordon [Thu, 14 Jul 2016 13:52:03 +0000 (14:52 +0100)]
drm/i915: compile-time consistency check on __EXEC_OBJECT flags
Two different sets of flag bits are stored in the 'flags' member of a
'struct drm_i915_gem_exec_object2', and they're defined in two different
source files, increasing the risk of an accidental clash.
Some flags in this field are supplied by the user; these are defined in
i915_drm.h, and they start from the LSB and work up.
Other flags are defined in i915_gem_execbuffer, for internal use within
that file only; they start from the MSB and work down.
So here we add a compile-time check that the two sets of flags do not
overlap, which would cause all sorts of confusion.
Bob Paauwe [Fri, 15 Jul 2016 13:59:02 +0000 (14:59 +0100)]
drm/i915: Set legacy properties when using legacy gamma set IOCTL. (v2)
The i915 driver is now using atomic properties and atomic commit
to handle the legacy set gamma IOCTL. However, if the driver is
configured without atomic (nuclear_pageflip = false), it won't
update the legacy properties for degamma_lut, gamma_lut and ctm
leaving them out of sync with the atomic version of the properties.
Until the driver is full atomic, make sure we update the non-atomic
version of the properties.
v2: Update the comment with a FIXME. (Daniel)
v3: Update arguments of the gamma_set vfunc (Lionel)
Ville Syrjälä [Mon, 18 Jul 2016 10:15:14 +0000 (13:15 +0300)]
drm/i915: Treat eDP as always connected, again
eDP should be treated as connected even if doesn't have an EDID. In that
case we'll use the timings from the VBT. That used to be the case until
commit f21a21983ef1 ("drm/i915: Splitting intel_dp_detect")
broke things by considering even eDP disconnected if we fail to get
an EDID for it.
Fix things up again by treating eDP as always connected.
Cc: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com> Cc: Nathan D Ciobanu <nathan.d.ciobanu@intel.com> Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Cc: Ander Conselvan de Oliveira <conselvan2@gmail.com> Cc: Larry Finger <larry.finger@lwfinger.net> Reported-by: Larry Finger <larry.finger@lwfinger.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96675 Cc: drm-intel-fixes@lists.freedesktop.org Fixes: f21a21983ef1 ("drm/i915: Splitting intel_dp_detect") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Larry Finger <larry.finger@lwfinger.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1468836914-16537-1-git-send-email-ville.syrjala@linux.intel.com
Chris Wilson [Fri, 15 Jul 2016 13:56:20 +0000 (14:56 +0100)]
drm/i915: Flush logical context image out to memory upon suspend
Before suspend, and especially before building the hibernation image, we
need to context image to be coherent in memory. To do this we require
that we perform a context switch to a disposable context (i.e. the
dev_priv->kernel_context) - when that switch is complete, all other
context images will be complete. This leaves the kernel_context image as
incomplete, but fortunately that is disposable and we can do a quick
fixup of the logical state after resuming.
v2: Share the nearly identical code to switch to the kernel context with
eviction.
v3: Explain why we need the switch and reset.
Testcase: igt/gem_exec_suspend # bsw
References: https://bugs.freedesktop.org/show_bug.cgi?id=96526 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468590980-6186-2-git-send-email-chris@chris-wilson.co.uk
Chris Wilson [Fri, 15 Jul 2016 13:56:19 +0000 (14:56 +0100)]
drm/i915/evict: Always switch away from the current context
Currently execlists is exempt from emitting a request to switch each
ring away from the current context over to the dev_priv->kernel_context
(for whatever reason, just under execlists the GGTT is unlikely to be as
fragmented, however the switch may help in some extreme cases). Extract
the switcher and enable it for execlsts as well, as we need to do so in
a later patch to force the context switch before suspend. (And since for
that switch we explicitly require the disposable kernel context, rename
the extracted function.)
Chris Wilson [Wed, 13 Jul 2016 17:34:45 +0000 (18:34 +0100)]
drm/i915/fbdev: Check for the framebuffer before use
If the fbdev probing fails, and in our error path we fail to clear the
dev_priv->fbdev, then we can try and use a dangling fbdev pointer, and
in particular a NULL fb. This could also happen in pathological cases
where we try to operate on the fbdev prior to it being probed.
Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468431285-28264-2-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Chris Wilson [Wed, 13 Jul 2016 17:34:44 +0000 (18:34 +0100)]
drm/i915/fbdev: Drain the suspend worker on retiring
Since the suspend_work can arm itself if the console_lock() is currently
held elsewhere, simply calling flush_work() doesn't guarantee that the
work is idle upon return. To do so requires using cancel_work_sync().
drm/i915: add missing condition for committing planes on crtc
The i915 driver checks for color management properties changes as part
of a plane update. Therefore a color management update must imply a
plane update, otherwise we never update the transformation matrixes
and degamma/gamma LUTs.
v2: add comment about moving the commit of color management registers
to an async worker
v3: Commit color management register right after vblank
v4: Move back color management commit condition together with planes
commit
v5: Trigger color management commit through the planes commit (Daniel)
v6: Make plane change update more readable
Fixes: 20a34e78f0d7 (drm/i915: Update color management during vblank evasion.) Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: drm-intel-fixes@lists.freedesktop.org Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
References: https://lkml.org/lkml/2016/7/14/614 Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1464183041-8478-1-git-send-email-lionel.g.landwerlin@intel.com
Lyude [Tue, 21 Jun 2016 21:03:44 +0000 (17:03 -0400)]
drm/i915: Enable polling when we don't have hpd
Unfortunately, there's two situations where we lose hpd right now:
- Runtime suspend
- When we've shut off all of the power wells on Valleyview/Cherryview
While it would be nice if this didn't cause issues, this has the
ability to get us in some awkward states where a user won't be able to
get their display to turn on. For instance; if we boot a Valleyview
system without any monitors connected, it won't need any of it's power
wells and thus shut them off. Since this causes us to lose HPD, this
means that unless the user knows how to ssh into their machine and do a
manual reprobe for monitors, none of the monitors they connect after
booting will actually work.
Eventually we should come up with a better fix then having to enable
polling for this, since this makes rpm a lot less useful, but for now
the infrastructure in i915 just isn't there yet to get hpd in these
situations.
Changes since v1:
- Add comment explaining the addition of the if
(!mode_config->poll_running) in intel_hpd_init()
- Remove unneeded if (!dev->mode_config.poll_enabled) in
i915_hpd_poll_init_work()
- Call to drm_helper_hpd_irq_event() after we disable polling
- Add cancel_work_sync() call to intel_hpd_cancel_work()
Changes since v2:
- Apparently dev->mode_config.poll_running doesn't actually reflect
whether or not a poll is currently in progress, and is actually used
for dynamic module paramter enabling/disabling. So now we instead
keep track of our own poll_running variable in dev_priv->hotplug
- Clean i915_hpd_poll_init_work() a little bit
Changes since v3:
- Remove the now-redundant connector loop in intel_hpd_init(), just
rely on intel_hpd_poll_enable() for setting connector->polled
correctly on each connector
- Get rid of poll_running
- Don't assign enabled in i915_hpd_poll_init_work before we actually
lock dev->mode_config.mutex
- Wrap enabled assignment in i915_hpd_poll_init_work() in READ_ONCE()
for doc purposes
- Do the same for dev_priv->hotplug.poll_enabled with WRITE_ONCE in
intel_hpd_poll_enable()
- Add some comments about racing not mattering in intel_hpd_poll_enable
Changes since v4:
- Rename intel_hpd_poll_enable() to intel_hpd_poll_init()
- Drop the bool argument from intel_hpd_poll_init()
- Remove redundant calls to intel_hpd_poll_init()
- Rename poll_enable_work to poll_init_work
- Add some kerneldoc for intel_hpd_poll_init()
- Cross-reference intel_hpd_poll_init() in intel_hpd_init()
- Just copy the loop from intel_hpd_init() in intel_hpd_poll_init()
Changes since v5:
- Minor kerneldoc nitpicks
Cc: stable@vger.kernel.org Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Lyude <cpaul@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Lyude [Tue, 21 Jun 2016 21:03:43 +0000 (17:03 -0400)]
drm/i915/vlv: Disable HPD in valleyview_crt_detect_hotplug()
One of the things preventing us from using polling is the fact that
calling valleyview_crt_detect_hotplug() when there's a VGA cable
connected results in sending another hotplug. With polling enabled when
HPD is disabled, this results in a scenario like this:
- We enable power wells and reset the ADPA
- output_poll_exec does force probe on VGA, triggering a hpd
- HPD handler waits for poll to unlock dev->mode_config.mutex
- output_poll_exec shuts off the ADPA, unlocks dev->mode_config.mutex
- HPD handler runs, resets ADPA and brings us back to the start
This results in an endless irq storm getting sent from the ADPA
whenever a VGA connector gets detected in the middle of polling.
Somewhat based off of the "drm/i915: Disable CRT HPD around force
trigger" patch Ville Syrjälä sent a while back
Cc: stable@vger.kernel.org Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Lyude <cpaul@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Lyude [Tue, 21 Jun 2016 21:03:42 +0000 (17:03 -0400)]
drm/i915/vlv: Reset the ADPA in vlv_display_power_well_init()
While VGA hotplugging worked(ish) before, it looks like that was mainly
because we'd unintentionally enable it in
valleyview_crt_detect_hotplug() when we did a force trigger. This
doesn't work reliably enough because whenever the display powerwell on
vlv gets disabled, the values set in VLV_ADPA get cleared and
consequently VGA hotplugging gets disabled. This causes bugs such as one
we found on an Intel NUC, where doing the following sequence of
hotplugs:
Would result in VGA hotplugging becoming disabled, due to the powerwells
getting toggled in the process of connecting HDMI.
Changes since v3:
- Expose intel_crt_reset() through intel_drv.h and call that in
vlv_display_power_well_init() instead of
encoder->base.funcs->reset(&encoder->base);
Changes since v2:
- Use intel_encoder structs instead of drm_encoder structs
Changes since v1:
- Instead of handling the register writes ourself, we just reuse
intel_crt_detect()
- Instead of resetting the ADPA during display IRQ installation, we now
reset them in vlv_display_power_well_init()
Cc: stable@vger.kernel.org Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Lyude <cpaul@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Rebase over dev_priv/drm_device embedding.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Wed, 13 Jul 2016 08:10:37 +0000 (09:10 +0100)]
drm/i915: Defer enabling rc6 til after we submit the first batch/context
Some hardware requires a valid render context before it can initiate
rc6 power gating of the GPU; the default state of the GPU is not
sufficient and may lead to undefined behaviour. The first execution of
any batch will load the "golden render state", at which point it is safe
to enable rc6. As we do not forcibly load the kernel context at resume,
we have to hook into the batch submission to be sure that the render
state is setup before enabling rc6.
However, since we don't enable powersaving until that first batch, we
queued a delayed task in order to guarantee that the batch is indeed
submitted.
v2: Rearrange intel_disable_gt_powersave() to match.
v3: Apply user specified cur_freq (or idle_freq if not set).
v4: Give in, and supply a delayed work to autoenable rc6
v5: Mika suggested a couple of better names for delayed_resume_work
v6: Rebalance rpm_put around the autoenable task
Chris Wilson [Wed, 13 Jul 2016 08:10:35 +0000 (09:10 +0100)]
drm/i915: Define a separate variable and control for RPS waitboost frequency
To allow the user finer control over waitboosting, allow them to set the
frequency we request for the boost. This also them allows to effectively
disable the boosting by setting the boost request to a low frequency.
Chris Wilson [Wed, 13 Jul 2016 08:10:32 +0000 (09:10 +0100)]
drm/i915: Preserve current RPS frequency across init
Select idle frequency during initialisation, then reset the last known
frequency when re-enabling. This allows us to preserve the user selected
frequency across resets.
v2: Stop CHV from overriding the user's choice in cherryview_enable_rps()
Chris Wilson [Wed, 13 Jul 2016 08:10:31 +0000 (09:10 +0100)]
drm/i915: Flush GT idle status upon reset
Upon resetting the GPU, we force the engines to be idle by clearing
their request lists. However, I neglected to clear the GT active status
and so the next request following the reset was not marking the device
as busy again. (We had to wait until any outstanding retire worker
finally ran and cleared the active status.)
Ville Syrjälä [Tue, 12 Jul 2016 12:00:37 +0000 (15:00 +0300)]
drm/i915: Ignore panel type from OpRegion on SKL
Dell XPS 13 9350 apparently doesn't like it when we use the panel type
from OpRegion. The OpRegion panel type (0) tells us to use use low
vswing for eDP, whereas the VBT panel type (2) tells us to use normal
vswing. The problem is that low vswing results in some display flickers.
Since no one seems to know how this stuff is supposed to be handled,
let's just ignore the OpRegion panel type on SKL for now.
v2: Print the panel type correctly in the debug output
Reported-by: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: drm-intel-fixes@lists.freedesktop.org
References: https://lists.freedesktop.org/archives/intel-gfx/2016-June/098826.html Fixes: a05628195a0d ("drm/i915: Get panel_type from OpRegion panel details") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468324837-29237-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris-wilson.co.uk>
With the unified common engine setup done, and the execlist engine
initialization loop clearly split into two phases, we can eliminate
the separate legacy engine initialization code.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris-wilson.co.uk>
Move the execlist engine setup to vfuncs so that the engine
init loop is clearly split into the mode agnostic and
specific steps.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris-wilson.co.uk>
Dave Gordon [Wed, 13 Jul 2016 15:03:35 +0000 (16:03 +0100)]
drm/i915: unify first-stage engine struct setup
intel_lrc.c has a table of "logical rings" (meaning engines), while
intel_ringbuffer.c has separately open-coded initialisation for each
engine. We can deduplicate this somewhat by using the same first-stage
engine-setup function for both modes.
So here we expose the function that transfers information from the
static table of (all) known engines to the dev_priv->engine array of
engines available on this device (adjusting the names along the way)
and then embed calls to it in both the LRC and the legacy-mode setup.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Chris Wilson <chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Ville Syrjälä [Tue, 12 Jul 2016 16:24:47 +0000 (19:24 +0300)]
drm/i915: Unbreak interrupts on pre-gen6
Prior to gen6 we didn't have per-ring IMR registers, which means that
since commit 61ff75ac20ff ("drm/i915: Simplify enabling
user-interrupts with L3-remapping") we're now masking off all interrupts
when init_render_ring() gets called. That's rather rude. Let's limit
the ring IMR frobbing to machines that actually have the per-ring IMR
registers.
Chris Wilson [Tue, 12 Jul 2016 11:55:29 +0000 (12:55 +0100)]
drm/i915: Provide argument names for static stubs
Make sure we keep kbuilder happy in all of its random configs by
providing argument names for compile-time stubs.
In file included from drivers/gpu/drm/i915/intel_dp_mst.c:27:0:
drivers/gpu/drm/i915/i915_drv.h: In function 'i915_debugfs_register':
>> drivers/gpu/drm/i915/i915_drv.h:3612:48: error: parameter name omitted
static inline int i915_debugfs_register(struct drm_i915_private *) {return 0;}
^~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/i915_drv.h: In function 'i915_debugfs_unregister':
drivers/gpu/drm/i915/i915_drv.h:3613:51: error: parameter name omitted
static inline void i915_debugfs_unregister(struct drm_i915_private *) {}
Chris Wilson [Mon, 11 Jul 2016 13:46:17 +0000 (14:46 +0100)]
drm/i915: Update ifdeffery for mutex->owner
In commit 7608a43d8f2e ("locking/mutexes: Use MUTEX_SPIN_ON_OWNER when
appropriate") the owner field in the mutex was updated from being
dependent upon CONFIG_SMP to using optimistic spin. Update our peek
function to suite.
Now that the last couple of hacks have been removed from the runtime
powermanagement users, we can fully enable the asserts by preventing the
temptation to disable them when our code is buggy.
Chris Wilson [Sat, 9 Jul 2016 09:12:06 +0000 (10:12 +0100)]
drm/i915: Kick hangcheck from retire worker
Let's ensure that we cannot run indefinitely without the hangcheck
worker being queued. We removed it from being kicked on every request
because we were kicking it a few millions times in every hangcheck
interval and only once is necessary! However, that leaves us with the
issue of what if userspace never waits for a request, or runs out of
resources, what if userspace just issues a request then spins on
BUSY_IOCTL?
Chris Wilson [Sat, 9 Jul 2016 09:12:05 +0000 (10:12 +0100)]
drm/i915/breadcrumbs: Queue hangcheck before sleeping
Never go to sleep waiting on the GPU without first ensuring that we will
get woken up.
We have a choice of queuing the hangcheck before every schedule() or the
first time we wakeup. In order to simply accommodate both the signaler
and the ordinary waiter, move the queuing to the common point of
enabling the irq. We lose the paranoid safety of ensuring that the
hangcheck is active before the sleep, but avoid code duplication (and
redundant hangcheck queuing).
Chris Wilson [Fri, 24 Jun 2016 13:07:14 +0000 (14:07 +0100)]
drm/i915: Fill unused GGTT with scratch pages for VT-d
One of the numerous VT-d workarounds we require is that the display
hardware reads past the end of the buffer triggering VT-d faults. This
is acknowledged in the code as being safe "since we fill the unused
portions of the GGTT with the scratch page". Alas, that is no longer
always true and so we trigger DMAR read faults.
Skylake also requires another workaround to avoid mixing VT-d and
unpopulated PTE, and so there we also need to ensure we fill unused
entries with the scratch page.
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96584 Fixes: f7770bfd9fd2 ("drm/i915: Skip clearing the GGTT on full-ppgtt systems") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: David Weinehall <david.weinehall@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466773634-8106-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: David Weinehall <david.weinehall@intel.com>
drm/i915: Introduce Kabypoint PCH for Kabylake H/DT.
Some Kabylake SKUs are going to use Kabypoint PCH.
It is mainly for Halo and DT ones.
>From our specs it doesn't seem that KBP brings
any change on the display south engine. So let's consider
this as a continuation of SunrisePoint, i.e., SPT+.
Since it is easy to get confused by a letter change:
KBL = Kabylake - CPU/GPU codename.
KBP = Kabypoint - PCH codename.
Ville Syrjälä [Wed, 22 Jun 2016 18:57:09 +0000 (21:57 +0300)]
drm/i915: Check for invalid cloning earlier during modeset
Move the encoder cloning check to happen earlier in the modeset. The
main benefit will be that the debug output from a failed modeset will
be less confusing as output_types can not indicate an invalid
configuration during the later computation stages.
For instance, what happened to me was kms_setmode was attempting one
of its invalid cloning checks during which it asked for DP+VGA cloning
on HSW. In this case the DP .compute_config() was executed after
the FDI .compute_config() leaving the DP link clock (1.62 in this case)
in port_clock, and then later the FDI BW computation tried to use that
as the FDI link clock (which should always be 2.7). 1.62 x 2 wasn't
enough for the mode it was trying to use, and so it ended up rejecting
the modeset, not because of an invalid cloning configuration, but
because of supposedly running out of FDI bandwidth. Took me a while
to figure out what had actually happened.