Chris Wilson [Wed, 15 Feb 2017 13:15:47 +0000 (13:15 +0000)]
drm/i915: Only enable hotplug interrupts if the display interrupts are enabled
In order to prevent accessing the hpd registers outside of the display
power wells, we should refrain from writing to the registers before the
display interrupts are enabled.
v2: Set dev_priv->display_irqs_enabled to true for all platforms other
than vlv/chv that manually control the display power domain.
Fixes: 19625e85c6ec ("drm/i915: Enable polling when we don't have hpd")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97798
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lyude <cpaul@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Hans de Goede <jwrdegoede@fedoraproject.org>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/20170215131547.5064-1-chris@chris-wilson.co.uk
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
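A minimal sketch of the guard this commit describes, assuming a much simplified device structure. Only display_irqs_enabled is a name taken from the commit message; struct fake_dev, fake_hpd_irq_setup() and the register field are hypothetical stand-ins and not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's private state. */
struct fake_dev {
	bool display_irqs_enabled;	/* flag name from the commit message */
	uint32_t hpd_enable_reg;	/* pretend hotplug-enable register */
	unsigned int hpd_writes;	/* counts register writes for inspection */
};

/* Write the hotplug register only once display interrupts are enabled. */
static bool fake_hpd_irq_setup(struct fake_dev *dev, uint32_t hpd_mask)
{
	assert(dev);

	if (!dev->display_irqs_enabled)
		return false;	/* the power well may be off: skip the write */

	dev->hpd_enable_reg = hpd_mask;
	dev->hpd_writes++;
	return true;
}
```

With the flag clear the call returns false and leaves the register untouched; once the flag is set, the same call performs the write.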
Uma Shankar [Wed, 8 Feb 2017 10:50:50 +0000 (16:20 +0530)]
drm/i915: Check for platform specific GPIO config
Panel GPIO control should be done based on platform. Add a check
to restrict VLV and CHT specific GPIO configuration, so that
they don't apply to other platforms.
The VBT spec fails to mention the PMIC backlight control option is valid
only for VLV/CHT, and the field may be set to "PMIC" for BXT even if
PMIC is not desired or possible.
Manasi Navare [Wed, 8 Feb 2017 00:54:11 +0000 (16:54 -0800)]
drm/i915/dp: Reset the link params on HPD/connected boot/resume
The max link parameters should be set/reset only on HPD or
connected boot case or on system resume.
Add a flag reset_link_params to intel_dp to decide when
to reset the max link parameters. This prevents the parameters
from getting reset/overwritten through all other
connector->funcs->detect() calls. This is important when link
training fails and the max link params are modified to the
lower fallback values.
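A rough sketch of how such a flag could gate the reset, assuming a simplified connector state. reset_link_params is the flag name from the commit message; struct fake_dp, its other fields and both functions are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified DP state. */
struct fake_dp {
	int sink_max_rate;	/* what the sink advertises */
	int sink_max_lanes;
	int max_link_rate;	/* current cap, possibly lowered by fallback */
	int max_lane_count;
	bool reset_link_params;	/* flag name from the commit message */
};

/* HPD, connected boot and resume would all request a reset like this. */
static void fake_dp_request_reset(struct fake_dp *dp)
{
	assert(dp);
	dp->reset_link_params = true;
}

/* Called for every detect(); only resets the caps when asked to. */
static void fake_dp_detect(struct fake_dp *dp)
{
	assert(dp);

	if (!dp->reset_link_params)
		return;	/* keep any fallback values from failed training */

	dp->max_link_rate = dp->sink_max_rate;
	dp->max_lane_count = dp->sink_max_lanes;
	dp->reset_link_params = false;
}
```

A detect() that follows a link-training fallback then leaves the lowered values alone until the next hotplug or resume sets the flag again.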
Chris Wilson [Wed, 15 Feb 2017 08:43:55 +0000 (08:43 +0000)]
drm/i915: Only preallocate the aliasing GTT to the extents of the global GTT
As the aliasing GTT is only accessed via the global GTT, we will never
use more of it than we expose via the Global GTT and so we only need to
preallocate sufficient space within the ppgtt for the full GTT. Equally,
if the aliasing GTT is smaller than the global GTT, we have a serious
issue and must bail.
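The size check above can be sketched as follows. The function name, its parameters and the choice of -ENODEV are assumptions made for illustration; the commit message does not say which error the driver returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Decide how much of the aliasing ppgtt to preallocate.
 * Both totals are sizes in bytes; *length receives the range to allocate.
 */
static int fake_aliasing_prealloc_range(uint64_t ggtt_total,
					uint64_t ppgtt_total,
					uint64_t *length)
{
	assert(length);

	/* An aliasing GTT smaller than the global GTT cannot back it. */
	if (ppgtt_total < ggtt_total)
		return -ENODEV;	/* error code is an assumption */

	/* Never preallocate more than the global GTT can expose. */
	*length = ggtt_total;
	return 0;
}
```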
Chris Wilson [Wed, 15 Feb 2017 08:43:54 +0000 (08:43 +0000)]
drm/i915: Remove i915_address_space.start
Once upon a time, back in the UMS days, we supported userspace
initialising the GTT and sharing portions of the GTT with other users.
Now, we own the GTT (both global and per-process) and the tables always
start at 0 - so we can remove i915_address_space.start and forget about
this old complication.
Chris Wilson [Wed, 15 Feb 2017 08:43:49 +0000 (08:43 +0000)]
drm/i915: Remove bitmap tracking for used-pml4
We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.
Chris Wilson [Wed, 15 Feb 2017 08:43:48 +0000 (08:43 +0000)]
drm/i915: Remove bitmap tracking for used-pdpes
We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.
Chris Wilson [Wed, 15 Feb 2017 08:43:47 +0000 (08:43 +0000)]
drm/i915: Remove bitmap tracking for used-pdes
We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.
Chris Wilson [Wed, 15 Feb 2017 08:43:46 +0000 (08:43 +0000)]
drm/i915: Remove bitmap tracking for used-ptes
We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.
Chris Wilson [Wed, 15 Feb 2017 08:43:43 +0000 (08:43 +0000)]
drm/i915: Always preallocate gen6/7 ppgtt
The hardware does not cope very well with us changing the PD within an
active context (the context must be idle for it to re-read the PD). As
we only check whether the page is idle before changing the entry (and on
through the PD tree), we cannot reliably replace PD entries on
gen6/gen7. To fully avoid changing the tree at runtime, preallocate it
on init.
Chris Wilson [Wed, 15 Feb 2017 08:43:42 +0000 (08:43 +0000)]
drm/i915: Move allocate_va_range to GTT
In the future, we need to call allocate_va_range on the aliasing-ppgtt
which means moving the call down from the vma into the vm (which is
more appropriate for calling the vm function).
Chris Wilson [Wed, 15 Feb 2017 08:43:41 +0000 (08:43 +0000)]
drm/i915: Remove kmap/kunmap wrappers
As these are now both plain and simple kmap_atomic/kunmap_atomic pairs,
we can remove the wrappers for a small gain of clarity (in particular,
not hiding the atomic critical sections!).
Chris Wilson [Wed, 15 Feb 2017 08:43:40 +0000 (08:43 +0000)]
drm/i915: Convert clflushed pagetables over to WC maps
We flush the entire page every time we update a few bytes, making the
update of a page table many, many times slower than is required. If we
create a WC map of the page for our updates, we can avoid the clflush
but incur additional cost for creating the pagetable. We amortize that
cost by reusing page vmappings, and only changing the page protection in
batches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Improve the sg iteration and in the process eliminate a bug in
miscomputing the pml4 length as orig_nents<<PAGE_SHIFT is no longer the
full length of the sg table.
v2: Check for the end of the fourth level page table (the final pdpe)
and move onto the next.
v3: Assert that 3lvl insert_pte_entries doesn't overflow its smaller set
of PDP.
Inline the address computation to avoid the vfunc call for every page.
We still have to pay the high overhead of sg_page_iter_next(), but now
at least GCC can optimise the inner most loop, giving a significant
boost to some thrashing Unreal Engine workloads.
Hans de Goede [Tue, 14 Feb 2017 16:12:38 +0000 (18:12 +0200)]
drm/i915: Fix not finding the VBT when it overlaps with OPREGION_ASLE_EXT
If there is no OPREGION_ASLE_EXT then a VBT stored in mailbox #4 may
use the ASLE_EXT parts of the opregion. Adjust the vbt_size calculation
for a vbt in mailbox #4 for this.
This fixes the driver not finding the VBT on a jumper ezpad mini3
cherrytrail tablet and on an ACER SW5_017 machine.
Chris Wilson [Fri, 10 Feb 2017 15:03:48 +0000 (15:03 +0000)]
drm/i915: Only apply the jump to the "efficient RPS" frequency on startup
Currently we apply the jump to rpe if we are below it and the GPU needs
more power. For some GPUs, the rpe is 75% of the maximum range, causing
us to dramatically overshoot for low power applications *and* leaving us
unable to reach the low frequency that can most efficiently deliver their
workload.
Chris Wilson [Fri, 10 Feb 2017 15:03:47 +0000 (15:03 +0000)]
drm/i915: Don't accidentally increase the frequency in handling DOWN rps
If we receive a DOWN_TIMEOUT rps interrupt, we respond by reducing the
GPU clocks significantly. Before we do, double check that the frequency
we pick is actually a decrease.
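A small sketch of the double check, assuming the candidate targets are an "efficient" frequency and a soft minimum. The function and parameter names are hypothetical, and the order in which candidates are tried is an assumption rather than a description of the driver.

```c
#include <assert.h>

/*
 * Pick a new frequency for a DOWN_TIMEOUT event.
 * A candidate is only accepted if it is strictly below the current value.
 */
static int fake_down_timeout_freq(int cur, int efficient, int soft_min)
{
	int new_freq = cur;	/* default: leave the clocks alone */

	assert(soft_min <= efficient);

	if (efficient < cur)
		new_freq = efficient;
	else if (soft_min < cur)
		new_freq = soft_min;

	return new_freq;	/* never greater than cur */
}
```

When the current frequency is already below both candidates, the function returns it unchanged instead of jumping up.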
Chris Wilson [Fri, 10 Feb 2017 15:03:46 +0000 (15:03 +0000)]
drm/i915: Enable fine-tuned RPS for cherryview
When the RPS tuning was applied to Baytrail, in commit 8fb55197e64d
("drm/i915: Agressive downclocking on Baytrail"), concern was given that
it might cause Cherryview excess wakeups of the common power well.
However, the static thresholds perform poorly for Kodi, and the GPU is
unable to deliver the video frames on time. Enabling the dynamic, finer
thresholds used on all other platforms (including Skylake and Broxton
that also have the same multiple powerwell concerns) allows the GPU to
pick a more appropriate frequency and not drop frames.
Chris Wilson [Tue, 14 Feb 2017 16:46:11 +0000 (16:46 +0000)]
drm/i915: The return of i915_gpu_info to debugfs
Once upon a time before we had automated GPU state capture upon hangs,
we had intel_gpu_dump. Now we come almost full circle and reinstate that
view of the current GPU queues and registers by using the error capture
facility to snapshot the GPU state when debugfs/.../i915_gpu_info is
opened - which should provide useful debugging to both the error
capture routines (without having to cause a hang and avoid the error
state being eaten by igt) and generally.
v2: Rename drm_i915_error_state to i915_gpu_state to alleviate some name
collisions between the error state dump and inspecting the gpu state.
Chris Wilson [Tue, 14 Feb 2017 13:34:20 +0000 (13:34 +0000)]
drm/i915/guc: Don't take struct_mutex for object unreference
We no longer need to take the struct_mutex for freeing objects, and on
the finalisation paths here the mutex is not being used for serialisation
of the pointer access, so remove the BKL wart.
Chris Wilson [Tue, 14 Feb 2017 14:35:09 +0000 (14:35 +0000)]
drm/i915: Silence compiler warning for selftests/i915_gem_coherency
In general, the compiler should not be able to detect if we do any
passes through the test loops:
In file included from drivers/gpu/drm/i915/i915_gem.c:5029:
drivers/gpu/drm/i915/selftests/i915_gem_coherency.c: In function 'igt_gem_coherency':
drivers/gpu/drm/i915/selftests/i915_gem_coherency.c:274: error: 'err' may be used uninitialized in this function
Chris Wilson [Tue, 14 Feb 2017 11:37:56 +0000 (11:37 +0000)]
drm/i915: Silence compiler for GTT selftests
gcc-4.7 spotted that
In file included from drivers/gpu/drm/i915/i915_gem_gtt.c:3791:0:
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c: In function ‘pot_hole’:
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c:594:6: error: ‘err’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
So set it to 0 should we ever skip over a hole smaller than a few pages.
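The pattern behind this warning can be sketched as below, assuming a loop that may run zero times. fake_walk_hole() and fake_check_size() are hypothetical names and do not mirror the selftest's real structure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-size check: rejects anything above 1 MiB. */
static int fake_check_size(uint64_t size)
{
	return size > (1ull << 20) ? -EINVAL : 0;
}

/* Try every power-of-two multiple of min_size that fits in the hole. */
static int fake_walk_hole(uint64_t hole_size, uint64_t min_size)
{
	uint64_t size;
	int err = 0;	/* a hole smaller than min_size skips the loop */

	assert(min_size > 0);

	/* size != 0 stops the walk if the shift wraps around. */
	for (size = min_size; size != 0 && size <= hole_size; size <<= 1) {
		err = fake_check_size(size);
		if (err)
			break;
	}

	return err;
}
```

Without the initialiser, a hole smaller than min_size would return whatever happened to be in err.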
Chris Wilson [Tue, 14 Feb 2017 09:23:44 +0000 (09:23 +0000)]
drm/i915: Avoid overflow in computing pot_hole loop termination
When using the mock_ppgtt selftest, the GTT is large enough to cause an
overflow in pot_hole() when adding 2 pages to the address. Avoid the
overflow by computing the final valid address and iterating up to that
address.
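One way to write such a loop, sketched under the assumption of a 4 KiB page and a two-page object. FAKE_PAGE_SIZE and fake_count_slots() are hypothetical; the point is only that the last valid start address is computed once by subtraction, so nothing is ever added past the end of the range.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_PAGE_SIZE 4096ull	/* assumed page size */

/* Count the 2-page objects that fit in [start, end) at 'step' intervals. */
static uint64_t fake_count_slots(uint64_t start, uint64_t end, uint64_t step)
{
	const uint64_t size = 2 * FAKE_PAGE_SIZE;
	uint64_t last, addr, count = 0;

	assert(step > 0);

	/* Written as subtractions so that nothing can wrap. */
	if (end < size || end - size < start)
		return 0;
	last = end - size;	/* final valid start address */

	for (addr = start; addr <= last; addr += step) {
		count++;
		if (last - addr < step)
			break;	/* the next step would pass 'last' or wrap */
	}

	return count;
}
```

A naive condition such as addr + size <= end would wrap for an address space that ends near the top of the 64-bit range.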
Deepak S [Fri, 12 Aug 2016 13:16:41 +0000 (18:46 +0530)]
drm/i915/chv: Set min freq to RPn on CHV.
With the latest Punit FW, the vgg input voltage drop falling to minimum is fixed.
So revert the WA patch and move the turbo freq operation range to [RPn -> RP0].
This is not a 1:1 revert of the commit 5b7c91b78b1ce6663e0f1f037f6cb4d7c9537d44.
You can refer to commit 5b5929cbe3f7 ("drm/i915/chv: remove
pre-production hardware workarounds") as the reason for the discrepancy.
Ville Syrjälä [Wed, 21 Dec 2016 14:31:14 +0000 (16:31 +0200)]
drm/i915: Dump more configuration information for DSI
Dump out more of the DSI configuration details during init.
This includes pclk, burst_mode_ratio, lane_count, pixel_overlap,
video_mode_format and reset_timer_val.
v2: Dump more info (Chris)
v3: Use the VIDEO_MODE_ defines for consistency (Chris)
Dump dphy_reg too (Chris)
Tvrtko Ursulin [Tue, 14 Feb 2017 11:32:42 +0000 (11:32 +0000)]
drm/i915: Emit to ringbuffer directly
This removes the usage of intel_ring_emit in favour of
directly writing to the ring buffer.
intel_ring_emit was preventing the compiler from optimising
fetch and increment of the current ring buffer pointer and
therefore generating very verbose code for every write.
It had no useful purpose since all ringbuffer operations
are started and ended with intel_ring_begin and
intel_ring_advance respectively, with no bail out in the
middle possible, so it is fine to increment the tail in
intel_ring_begin and let the code manage the pointer
itself.
Useless instruction removal amounts to approximately
two and a half kilobytes of saved text on my build.
Not sure if this has any measurable performance
implications but executing a ton of useless instructions
on fast paths cannot be good.
v2:
* Change return from intel_ring_begin to error pointer by
popular demand.
* Move tail increment to intel_ring_advance to enable some
error checking.
v3:
* Move tail advance back into intel_ring_begin.
* Rebase and tidy.
v4:
* Complete rebase after a few months since v3.
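The shape of the change can be sketched with a toy ring, assuming a fixed-size buffer with no wrap handling. struct fake_ring, fake_ring_begin() and fake_emit_noop() are hypothetical; the sketch returns NULL on failure for brevity, whereas v2 above mentions an error pointer, and the opcode value is a placeholder.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_RING_DWORDS 64
#define FAKE_OPCODE 0x04000000u	/* placeholder, not a real command */

struct fake_ring {
	uint32_t buf[FAKE_RING_DWORDS];
	size_t tail;	/* next free slot, in dwords */
};

/* Reserve n dwords and hand back a pointer the caller writes through. */
static uint32_t *fake_ring_begin(struct fake_ring *ring, size_t n)
{
	uint32_t *cs;

	assert(ring);
	if (n > FAKE_RING_DWORDS - ring->tail)
		return NULL;	/* the sketch does not wrap or wait */

	cs = &ring->buf[ring->tail];
	ring->tail += n;	/* tail advanced up front, as in v3 */
	return cs;
}

/* The caller keeps the cursor in a local, so it can live in a register. */
static int fake_emit_noop(struct fake_ring *ring)
{
	uint32_t *cs = fake_ring_begin(ring, 2);

	if (!cs)
		return -ENOSPC;

	*cs++ = FAKE_OPCODE;
	*cs++ = 0;	/* padding dword */
	return 0;
}
```

Because the write cursor is a plain local pointer, the compiler is free to keep it in a register across the stores instead of reloading and updating a field in the ring structure for every dword.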
drm/i915: Convert remaining users of 32bit power domain masks
I screwed up the rebase of commit d8fc70b7367b ("drm/i915: Make power
domain masks 64 bit long") before sending v2, causing a couple of
conversions from 32 to 64 bit masks to be lost.
Fixes: d8fc70b7367b ("drm/i915: Make power domain masks 64 bit long")
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213145733.8779-1-ander.conselvan.de.oliveira@intel.com
Chris Wilson [Sun, 12 Feb 2017 21:53:43 +0000 (21:53 +0000)]
drm/i915: Pass timeout==0 on to i915_gem_object_wait_fence()
The i915_gem_object_wait_fence() uses an incoming timeout=0 to query
whether the current fence is busy or idle, without waiting. This can be
used by the wait-ioctl to implement a busy query.
Fixes: e95433c73a11 ("drm/i915: Rearrange i915_wait_request() accounting with callers")
Testcase: igt/gem_wait/basic-busy-write-all
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/20170212215344.16600-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Wed, 9 Nov 2016 10:39:05 +0000 (10:39 +0000)]
drm/i915/gvt: Disable access to stolen memory as a guest
Explicitly disable stolen memory when running as a guest in a virtual
machine, since the memory is not mediated between clients and reserved
entirely for the host. The actual size should be reported as zero, but
like every other quirk we want to tell the user what is happening.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99028
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161109103905.17860-1-chris@chris-wilson.co.uk
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org
Chris Wilson [Mon, 13 Feb 2017 17:15:57 +0000 (17:15 +0000)]
drm/i915: Exercise crossing pot boundaries in the GTT
As the page-table trees within the GTT are naturally aligned to
power-of-two boundaries, by inserting an object that crosses a
power-of-two (and the power-of-two intervals) we can quickly check the
code for errors in switching between levels in the tree.
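A sketch of how offsets that straddle each power-of-two boundary might be generated, assuming a 4 KiB page and a two-page object placed one page below each boundary. The names and the exact placement are assumptions for illustration, not the selftest's code.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_PAGE_SIZE 4096ull	/* assumed page size */

/*
 * Fill 'out' with start offsets such that a 2-page object at each offset
 * crosses a power-of-two boundary inside [0, total). Returns the count.
 */
static unsigned int fake_straddle_offsets(uint64_t total, uint64_t *out,
					  unsigned int max)
{
	unsigned int n = 0;
	uint64_t pot;

	assert(out);

	/* pot != 0 ends the walk if the shift wraps around. */
	for (pot = FAKE_PAGE_SIZE; pot != 0 && pot < total; pot <<= 1) {
		if (total - pot < FAKE_PAGE_SIZE || n == max)
			break;	/* second page would not fit, or out is full */
		out[n++] = pot - FAKE_PAGE_SIZE;
	}

	return n;
}
```

Each returned offset places the first page just below a boundary and the second page just above it, which is where a page-table walker has to switch to the next entry or level.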
Chris Wilson [Mon, 13 Feb 2017 17:15:44 +0000 (17:15 +0000)]
drm/i915: Use fault-injection to force the shrinker to run in live GTT tests
It is possible whilst allocating the page-directory tree for a ppgtt
bind that the shrinker may run and reap unused parts of the tree. If the
shrinker happens to remove a chunk of the tree that the
allocate_va_range has already processed, we may then try to insert into
the dangling tree. This test uses the fault-injection framework to force
the shrinker to be invoked before we allocate new pages, i.e. new chunks
of the PD tree.
References: https://bugs.freedesktop.org/show_bug.cgi?id=99295
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Chris Wilson [Mon, 13 Feb 2017 17:15:43 +0000 (17:15 +0000)]
drm/i915: Live testing of lowlevel GTT operations
Directly test allocating the va range and clearing it, this bypasses the
use of i915_vma_bind() and inserting the pages to focus on testing of
the pagetables.
Chris Wilson [Mon, 13 Feb 2017 17:15:39 +0000 (17:15 +0000)]
drm/i915: Exercise filling the top/bottom portions of the ppgtt
Allocate objects with varying number of pages (which should hopefully
consist of a mixture of contiguous page chunks and so coalesced sg
lists) and check that the sg walkers in insert_pages cope.
Chris Wilson [Mon, 13 Feb 2017 17:15:37 +0000 (17:15 +0000)]
drm/i915: Add a live dmabuf selftest
Though we have good coverage of our dmabuf interface through the mock
tests, we also want to check the heavy module unload paths of the live
i915 driver.
Chris Wilson [Mon, 13 Feb 2017 17:15:35 +0000 (17:15 +0000)]
drm/i915: Sanity check all registers for matching fw domains
Add a late selftest that walks over all forcewake registers (those below
0x40000) and uses the mmio debug register to check to see if any are
unclaimed. This is possible if we fail to wake the appropriate
powerwells for the register.
Chris Wilson [Mon, 13 Feb 2017 17:15:34 +0000 (17:15 +0000)]
drm/i915: Test all fw tables during mock selftests
In addition to just testing the fw table we load, during the initial
mock testing we can test that all tables are valid (so the testing is
not limited to just the platforms that load that particular table).
Chris Wilson [Mon, 13 Feb 2017 17:15:28 +0000 (17:15 +0000)]
drm/i915: Add selftests for object allocation, phys
The phys object is a rarely used device (only very old machines require
a chunk of physically contiguous pages for a few hardware interactions).
As such, it is not exercised by CI and to combat that we want to add a
test that exercises the phys object on all platforms.
v2: Always set err on error paths and not rely on inheriting the err.
Chris Wilson [Mon, 13 Feb 2017 17:15:21 +0000 (17:15 +0000)]
drm/i915: Add selftests for i915_gem_request
Simple starting point for adding selftests for i915_gem_request, first
mock a device (with engines and contexts) that allows us to construct
and execute a request, along with waiting for the request to complete.
Chris Wilson [Mon, 13 Feb 2017 17:15:20 +0000 (17:15 +0000)]
drm/i915: Create a fake object for testing huge allocations
We would like to be able to exercise huge allocations even on memory
constrained devices. To do this we create an object that allocates only
a few pages and remaps them across its whole range - each page is reused
multiple times. We can therefore pretend we are rendering into a much
larger object.
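The remapping idea can be sketched in userspace as below, assuming four real backing pages reused round-robin. struct fake_huge_object and fake_huge_page() are hypothetical names; the real object builds a scatterlist rather than indexing an array.

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_PAGE_SIZE 4096
#define FAKE_NREAL 4	/* number of pages actually allocated */

/* An object that claims nvirt pages but is backed by only FAKE_NREAL. */
struct fake_huge_object {
	unsigned char backing[FAKE_NREAL][FAKE_PAGE_SIZE];
	size_t nvirt;	/* pages the object pretends to have */
};

/* Return the backing page for virtual page n, or NULL if out of range. */
static unsigned char *fake_huge_page(struct fake_huge_object *obj, size_t n)
{
	assert(obj);

	if (n >= obj->nvirt)
		return NULL;

	return obj->backing[n % FAKE_NREAL];	/* each real page is reused */
}
```

A write through virtual page 5 is visible through virtual page 1, since both resolve to the same backing page; that aliasing is acceptable when only the page tables are being tested and not the contents.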
Chris Wilson [Mon, 13 Feb 2017 17:15:13 +0000 (17:15 +0000)]
drm/i915: Add some selftests for sg_table manipulation
Start exercising the scattergather lists, especially looking at
iteration after coalescing.
v2: Comment on the peculiarity of table construction (i.e. why this
sg_table might be interesting).
v3: Added one __func__ to identify expect_pfn_sg()
v4: Loop until we have crossed the chain boundary (forcing sg_table to
do multiple allocations) before squelching a potential ENOMEM from oom.
Chris Wilson [Mon, 13 Feb 2017 17:15:12 +0000 (17:15 +0000)]
drm/i915: Provide a hook for selftests
Some pieces of code are independent of hardware but are very tricky to
exercise through the normal userspace ABI or via debugfs hooks. Being
able to create mock unit tests and execute them through CI is vital.
Start by adding a central point where we can execute unit tests and
a parameter to enable them. This is disabled by default as the
expectation is that these tests will occasionally explode.
To facilitate integration with igt, any parameter beginning with
i915.igt__ is interpreted as a subtest executable independently via
igt/drv_selftest.
Two classes of selftests are recognised: mock unit tests and integration
tests. Mock unit tests are run as soon as the module is loaded, before
the device is probed. At that point there is no driver instantiated and
all hw interactions must be "mocked". This is very useful for writing
universal tests to exercise code not typically run on a broad range of
architectures. Alternatively, you can hook into the live selftests and
run when the device has been instantiated - hw interactions are real.
v2: Add a macro for compiling conditional code for mock objects inside
real objects.
v3: Differentiate between mock unit tests and late integration test.
v4: List the tests in natural order, use igt to sort after modparam.
v5: s/late/live/
v6: s/unsigned long/unsigned int/
v7: Use igt_ prefixes for long helpers.
v8: Deobfuscate macros overriding functions, stop using -I$(src)
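As a rough illustration of the parameter convention mentioned above, the following sketch tests a parameter name for the i915.igt__ prefix. fake_subtest_name() is hypothetical, and how the remainder of the name is structured is not described here, so the sketch simply returns whatever follows the prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the part of 'param' after the "i915.igt__" prefix,
 * or NULL if the parameter does not name a subtest.
 */
static const char *fake_subtest_name(const char *param)
{
	static const char prefix[] = "i915.igt__";
	const size_t len = sizeof(prefix) - 1;	/* exclude the NUL */

	assert(param);

	if (strncmp(param, prefix, len) != 0)
		return NULL;

	return param + len;
}
```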
Chris Wilson [Sun, 12 Feb 2017 17:20:02 +0000 (17:20 +0000)]
drm/i915: Clear the last_retired_context following a hang/reset
Following a hang and reset, we know that the engine is idle and all
context state has been saved or lost. Consequently, we know that the
engine is no longer referencing the last context and we can relinquish
our tracking.
Chris Wilson [Sun, 12 Feb 2017 17:20:01 +0000 (17:20 +0000)]
drm/i915: Park the breadcrumbs signaler across a GPU reset
The signal threads may be running concurrently with the GPU reset. The
completions from the GPU run asynchronously with the reset, so two threads
may see different snapshots of the state, and the signaler may mark a
request as complete as we try to reset it. We don't tolerate 2 different
views of the same state and complain if we try to mark a request as
failed if it is already complete. Disable the signal threads during
reset to prevent this conflict (even though the conflict implies that
the state we are resetting to is invalid, we have already made our
decision!).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99733
References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170212172002.23072-4-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Chris Wilson [Sun, 12 Feb 2017 17:20:00 +0000 (17:20 +0000)]
drm/i915: Kill the tasklet then disable
Disabling the tasklet leaves it, if scheduled, on the ready-to-run list
until it is re-enabled. This will leave the ksoftirqd thread spinning
until satisfied. To prevent this situation on starting the GPU reset, we
want to kill the tasklet first and then disable. The same problem will
arise when a tasklet is scheduled from another device, so a better
solution is required for the general case.
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 1f7b847d72c3 ("drm/i915: Disable engine->irq_tasklet around resets")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170212172002.23072-3-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>