Chris Wilson [Wed, 15 Feb 2017 10:59:18 +0000 (10:59 +0000)]
drm/i915: struct_mutex is not required for allocating the framebuffer
We do not need the BKL struct_mutex in order to allocate a GEM object,
nor to create the framebuffer, so resist the temptation to take the BKL
willy nilly. As this changes the locking contract around internal API
calls, the patch is a little larger than a plain removal of a pair of
mutex_lock/unlock.
Chris Wilson [Thu, 16 Feb 2017 09:46:21 +0000 (09:46 +0000)]
drm/i915: Remove struct_mutex for destroying framebuffers
We do not need to hold struct_mutex for destroying drm_i915_gem_objects
any longer, and with a little care taken over tracking
obj->framebuffer_references, we can relinquish BKL locking around the
destroy of intel_framebuffer.
v2: Use atomic check for WARN_ON framebuffer miscounting
Chris Wilson [Wed, 15 Feb 2017 16:39:00 +0000 (16:39 +0000)]
drm/i915: Unwind conversion to i915_gem_phys_ops on failure
The physical object is treated as permanently pinned. If we fail to take
this initial pin during i915_gem_object_attach_phys() we need to revert
it back to an ordinary shmemfs object before reporting the failure.
Chris Wilson [Thu, 16 Feb 2017 12:54:41 +0000 (12:54 +0000)]
drm/i915: Squelch any ktime/jiffie rounding errors for wait-ioctl
We wait upon jiffies, but report the time elapsed using a
high-resolution timer. This discrepancy can lead to us timing out the
wait prior to us reporting the elapsed time as complete.
This restores the squelching lost in commit e95433c73a11 ("drm/i915:
Rearrange i915_wait_request() accounting with callers").
Fixes: e95433c73a11 ("drm/i915: Rearrange i915_wait_request() accounting with callers") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20170216125441.30923-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Wed, 15 Feb 2017 13:15:47 +0000 (13:15 +0000)]
drm/i915: Only enable hotplug interrupts if the display interrupts are enabled
In order to prevent accessing the hpd registers outside of the display
power wells, we should refrain from writing to the registers before the
display interrupts are enabled.
v2: Set dev_priv->display_irqs_enabled to true for all platforms other
than vlv/chv that manually control the display power domain.
Fixes: 19625e85c6ec ("drm/i915: Enable polling when we don't have hpd")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97798 Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lyude <cpaul@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Hans de Goede <jwrdegoede@fedoraproject.org> Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20170215131547.5064-1-chris@chris-wilson.co.uk Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Uma Shankar [Wed, 8 Feb 2017 10:50:50 +0000 (16:20 +0530)]
drm/i915: Check for platform specific GPIO config
Panel GPIO control should be done based on platform. Add a check
to restrict VLV and CHT specific GPIO confirguration, so that
they dont apply to other platforms.
The VBT spec fails to mention the PMIC backlight control option is valid
only for VLV/CHT, and the field may be set to "PMIC" for BXT even if
PMIC is not desired or possible.
Manasi Navare [Wed, 8 Feb 2017 00:54:11 +0000 (16:54 -0800)]
drm/i915/dp: Reset the link params on HPD/connected boot/resume
The max link parameters should be set/reset only on HPD or
connected boot case or on system resume.
Add a flag reset_link_params to intel_dp to decide when
to reset the max link parameters. This prevents the parameters
from getting reset/overwritten through all other
connector->funcs->detect() calls. This is important when link
training fails and the max link params are modified to the
lower fallback values.
Chris Wilson [Wed, 15 Feb 2017 08:43:55 +0000 (08:43 +0000)]
drm/i915: Only preallocate the aliasing GTT to the extents of the global GTT
As the aliasing GTT is only accessed via the global GTT, we will never
use more of it than we expose via the Global GTT and so we only need to
preallocate sufficient space within the ppgtt for the full GTT. Equally,
if the aliasing GTT is smaller than the global GTT, we have a serious
issue and must bail.
Chris Wilson [Wed, 15 Feb 2017 08:43:54 +0000 (08:43 +0000)]
drm/i915: Remove i915_address_space.start
Once upon a time, back in the UMS days, we supported userspace
initialising the GTT and sharing portions of the GTT with other users.
Now, we own the GTT (both global and per-process) and the tables always
start at 0 - so we can remove i915_address_space.start and forget about
this old complication.
Chris Wilson [Wed, 15 Feb 2017 08:43:49 +0000 (08:43 +0000)]
drm/i915: Remove bitmap tracking for used-pml4
We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.
Chris Wilson [Wed, 15 Feb 2017 08:43:48 +0000 (08:43 +0000)]
drm/i915: Remove bitmap tracking for used-pdpes
We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.
Chris Wilson [Wed, 15 Feb 2017 08:43:47 +0000 (08:43 +0000)]
drm/i915: Remove bitmap tracking for used-pdes
We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.
Chris Wilson [Wed, 15 Feb 2017 08:43:46 +0000 (08:43 +0000)]
drm/i915: Remove bitmap tracking for used-ptes
We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.
Chris Wilson [Wed, 15 Feb 2017 08:43:43 +0000 (08:43 +0000)]
drm/i915: Always preallocate gen6/7 ppgtt
The hardware does not cope very well with us changing the PD within an
active context (the context must be idle for it to re-read the PD). As
we only check whether the page is idle before changing the entry (and on
through the PD tree), we cannot reliably replace PD entries on
gen6/gen7. To fully avoid changing the tree at runtime, preallocate it
on init.
Chris Wilson [Wed, 15 Feb 2017 08:43:42 +0000 (08:43 +0000)]
drm/i915: Move allocate_va_range to GTT
In the future, we need to call allocate_va_range on the aliasing-ppgtt
which means moving the call down from the vma into the vm (which is
more appropriate for calling the vm function).
Chris Wilson [Wed, 15 Feb 2017 08:43:41 +0000 (08:43 +0000)]
drm/i915: Remove kmap/kunmap wrappers
As these are now both plain and simple kmap_atomic/kunmap_atomic pairs,
we can remove the wrappers for a small gain of clarity (in particular,
not hiding the atomic critical sections!).
Chris Wilson [Wed, 15 Feb 2017 08:43:40 +0000 (08:43 +0000)]
drm/i915: Convert clflushed pagetables over to WC maps
We flush the entire page every time we update a few bytes, making the
update of a page table many, many times slower than is required. If we
create a WC map of the page for our updates, we can avoid the clflush
but incur additional cost for creating the pagetable. We amoritize that
cost by reusing page vmappings, and only changing the page protection in
batches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Improve the sg iteration and in hte process eliminate a bug in
miscomputing the pml4 length as orig_nents<<PAGE_SHIFT is no longer the
full length of the sg table.
v2: Check for the end of the fourth level page table (the final pdpe)
and move onto the next.
v3: Assert that 3lvl insert_pte_entries doesn't overflow its smaller set
of PDP.
Inline the address computation to avoid the vfunc call for every page.
We still have to pay the high overhead of sg_page_iter_next(), but now
at least GCC can optimise the inner most loop, giving a significant
boost to some thrashing Unreal Engine workloads.
Hans de Goede [Tue, 14 Feb 2017 16:12:38 +0000 (18:12 +0200)]
drm/i915: Fix not finding the VBT when it overlaps with OPREGION_ASLE_EXT
If there is no OPREGION_ASLE_EXT then a VBT stored in mailbox #4 may
use the ASLE_EXT parts of the opregion. Adjust the vbt_size calculation
for a vbt in mailbox #4 for this.
This fixes the driver not finding the VBT on a jumper ezpad mini3
cherrytrail tablet and on a ACER SW5_017 machine.
Chris Wilson [Fri, 10 Feb 2017 15:03:48 +0000 (15:03 +0000)]
drm/i915: Only apply the jump to the "efficient RPS" frequency on startup
Currently we apply the jump to rpe if we are below it and the GPU needs
more power. For some GPUs, the rpe is 75% of the maximum range causing
us to dramatically overshoot low power applications *and* unable to
reach the low frequency that can most efficiently deliver their
workload.
Chris Wilson [Fri, 10 Feb 2017 15:03:47 +0000 (15:03 +0000)]
drm/i915: Don't accidentally increase the frequency in handling DOWN rps
If we receive a DOWN_TIMEOUT rps interrupt, we respond by reducing the
GPU clocks significantly. Before we do, double check that the frequency
we pick is actually a decrease.
Chris Wilson [Fri, 10 Feb 2017 15:03:46 +0000 (15:03 +0000)]
drm/i915: Enable fine-tuned RPS for cherryview
When the RPS tuning was applied to Baytrail, in commit 8fb55197e64d
("drm/i915: Agressive downclocking on Baytrail"), concern was given that
it might cause Cherryview excess wakeups of the common power well.
However, the static thresholds perform poorly for Kodi, and the GPU is
unable to deliver the video frames on time. Enabling the dynamic, finer
thresholds used on all other platforms (including Skylake and Broxton
that also have the same multiple powerwell concerns) allows the GPU to
pick a more appropriate frequency and not drop frames.
Chris Wilson [Tue, 14 Feb 2017 16:46:11 +0000 (16:46 +0000)]
drm/i915: The return of i915_gpu_info to debugfs
Once upon a time before we had automated GPU state capture upon hangs,
we had intel_gpu_dump. Now we come almost full circle and reinstate that
view of the current GPU queues and registers by using the error capture
facility to snapshot the GPU state when debugfs/.../i915_gpu_info is
opened - which should provided useful debugging to both the error
capture routines (without having to cause a hang and avoid the error
state being eaten by igt) and generally.
v2: Rename drm_i915_error_state to i915_gpu_state to alleviate some name
collisions between the error state dump and inspecting the gpu state.
Chris Wilson [Tue, 14 Feb 2017 13:34:20 +0000 (13:34 +0000)]
drm/i915/guc: Don't take struct_mutex for object unreference
We no longer need to take the struct_mutex for freeing objects, and on
the finalisation paths here the mutex is not been used for serialisation
of the pointer access, so remove the BKL wart.
Chris Wilson [Tue, 14 Feb 2017 14:35:09 +0000 (14:35 +0000)]
drm/i915: Silence compiler warning for seltests/i915_gem_coherency
In general, the compiler should not be able to detect if we do any
passes through the test loops:
In file included from drivers/gpu/drm/i915/i915_gem.c:5029:
drivers/gpu/drm/i915/selftests/i915_gem_coherency.c: In function 'igt_gem_coherency':
drivers/gpu/drm/i915/selftests/i915_gem_coherency.c:274: error: 'err' may be used uninitialized in this function
Chris Wilson [Tue, 14 Feb 2017 11:37:56 +0000 (11:37 +0000)]
drm/i915: Silence compiler for GTT selftests
gcc-4.7 spotted that
In file included from drivers/gpu/drm/i915/i915_gem_gtt.c:3791:0:
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c: In function ‘pot_hole’:
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c:594:6: error: ‘err’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
So set it to 0 should we ever skip over a hole smaller than a few pages.
Chris Wilson [Tue, 14 Feb 2017 09:23:44 +0000 (09:23 +0000)]
drm/i915: Avoid overflow in computing pot_hole loop termination
When using the mock_ppgtt selftest, the GTT is large enough to cause an
overflow in pot_hole() when adding 2 pages to the address. Avoid the
overflow by computing the final valid address and iterating up to that
address.
Deepak S [Fri, 12 Aug 2016 13:16:41 +0000 (18:46 +0530)]
drm/i915/chv: Set min freq to RPn on CHV.
With latest Punit FW, vgg input voltag drop falling to minimum is fixed.
So reverting the WA patch & moving to turbo freq opreation range to [RPn -> RP0]
This is not a 1:1 revert of the commit 5b7c91b78b1ce6663e0f1f037f6cb4d7c9537d44.
You can refer to commit 5b5929cbe3f7 ("drm/i915/chv: remove
pre-production hardware workarounds") as the reason for the discrepancy
Ville Syrjälä [Wed, 21 Dec 2016 14:31:14 +0000 (16:31 +0200)]
drm/i915: Dump more configuration information for DSI
Dump out more of the DSI configuration details during init.
This includes pclk, burst_mode_ratio, lane_count, pixel_overlap,
video_mode_format and reset_timer_val.
v2: Dump more info (Chris)
v3: Use the VIDEO_MODE_ defines for consistency (Chris)
Dump dphy_reg too (Chris)
Tvrtko Ursulin [Tue, 14 Feb 2017 11:32:42 +0000 (11:32 +0000)]
drm/i915: Emit to ringbuffer directly
This removes the usage of intel_ring_emit in favour of
directly writing to the ring buffer.
intel_ring_emit was preventing the compiler for optimising
fetch and increment of the current ring buffer pointer and
therefore generating very verbose code for every write.
It had no useful purpose since all ringbuffer operations
are started and ended with intel_ring_begin and
intel_ring_advance respectively, with no bail out in the
middle possible, so it is fine to increment the tail in
intel_ring_begin and let the code manage the pointer
itself.
Useless instruction removal amounts to approximately
two and half kilobytes of saved text on my build.
Not sure if this has any measurable performance
implications but executing a ton of useless instructions
on fast paths cannot be good.
v2:
* Change return from intel_ring_begin to error pointer by
popular demand.
* Move tail increment to intel_ring_advance to enable some
error checking.
v3:
* Move tail advance back into intel_ring_begin.
* Rebase and tidy.
v4:
* Complete rebase after a few months since v3.
drm/i915: Convert remaining users of 32bit power domain masks
I screwed up the rebase of commit d8fc70b7367b ("drm/i915: Make power
domain masks 64 bit long") before sending v2, causing a couple of
conversions from 32 to 64 bit masks to be lost.
Fixes: d8fc70b7367b ("drm/i915: Make power domain masks 64 bit long") Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: intel-gfx@lists.freedesktop.org Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213145733.8779-1-ander.conselvan.de.oliveira@intel.com
Chris Wilson [Sun, 12 Feb 2017 21:53:43 +0000 (21:53 +0000)]
drm/i915: Pass timeout==0 on to i915_gem_object_wait_fence()
The i915_gem_object_wait_fence() uses an incoming timeout=0 to query
whether the current fence is busy or idle, without waiting. This can be
used by the wait-ioctl to implement a busy query.
Fixes: e95433c73a11 ("drm/i915: Rearrange i915_wait_request() accounting with callers")
Testcase: igt/gem_wait/basic-busy-write-all Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20170212215344.16600-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Wed, 9 Nov 2016 10:39:05 +0000 (10:39 +0000)]
drm/i915/gvt: Disable access to stolen memory as a guest
Explicitly disable stolen memory when running as a guest in a virtual
machine, since the memory is not mediated between clients and reserved
entirely for the host. The actual size should be reported as zero, but
like every other quirk we want to tell the user what is happening.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99028 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161109103905.17860-1-chris@chris-wilson.co.uk Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: stable@vger.kernel.org
Chris Wilson [Mon, 13 Feb 2017 17:15:57 +0000 (17:15 +0000)]
drm/i915: Exercise crossing pot boundaries in the GTT
As the page-table trees within the GTT are naturally aligned to
power-of-two boundaries, by inserting an object that crosses a
power-of-two (and the power-of-two intervals) we can quickly check the
code for errors in switching between levels in the tree.
Chris Wilson [Mon, 13 Feb 2017 17:15:44 +0000 (17:15 +0000)]
drm/i915: Use fault-injection to force the shrinker to run in live GTT tests
It is possible whilst allocating the page-directory tree for a ppgtt
bind that the shrinker may run and reap unused parts of the tree. If the
shrinker happens to remove a chunk of the tree that the
allocate_va_range has already processed, we may then try to insert into
the dangling tree. This test uses the fault-injection framework to force
the shrinker to be invoked before we allocate new pages, i.e. new chunks
of the PD tree.
References: https://bugs.freedesktop.org/show_bug.cgi?id=99295 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Chris Wilson [Mon, 13 Feb 2017 17:15:43 +0000 (17:15 +0000)]
drm/i915: Live testing of lowlevel GTT operations
Directly test allocating the va range and clearing it, this bypasses the
use of i915_vma_bind() and inserting the pages to focus on testing of
the pagetables.
Chris Wilson [Mon, 13 Feb 2017 17:15:39 +0000 (17:15 +0000)]
drm/i915: Exercise filling the top/bottom portions of the ppgtt
Allocate objects with varying number of pages (which should hopefully
consist of a mixture of contiguous page chunks and so coalesced sg
lists) and check that the sg walkers in insert_pages cope.
Chris Wilson [Mon, 13 Feb 2017 17:15:37 +0000 (17:15 +0000)]
drm/i915: Add a live dmabuf selftest
Though we have good coverage of our dmabuf interface through the mock
tests, we also want to check the heavy module unload paths of the live
i915 driver.
Chris Wilson [Mon, 13 Feb 2017 17:15:35 +0000 (17:15 +0000)]
drm/i915: Sanity check all registers for matching fw domains
Add a late selftest that walks over all forcewake registers (those below
0x40000) and uses the mmio debug register to check to see if any are
unclaimed. This is possible if we fail to wake the appropriate
powerwells for the register.
Chris Wilson [Mon, 13 Feb 2017 17:15:34 +0000 (17:15 +0000)]
drm/i915: Test all fw tables during mock selftests
In addition to just testing the fw table we load, during the initial
mock testing we can test that all tables are valid (so the testing is
not limited to just the platforms that load that particular table).
Chris Wilson [Mon, 13 Feb 2017 17:15:28 +0000 (17:15 +0000)]
drm/i915: Add selftests for object allocation, phys
The phys object is a rarely used device (only very old machines require
a chunk of physically contiguous pages for a few hardware interactions).
As such, it is not exercised by CI and to combat that we want to add a
test that exercises the phys object on all platforms.
v2: Always set err on error paths and not rely on inheriting the err.
Chris Wilson [Mon, 13 Feb 2017 17:15:21 +0000 (17:15 +0000)]
drm/i915: Add selftests for i915_gem_request
Simple starting point for adding seltests for i915_gem_request, first
mock a device (with engines and contexts) that allows us to construct
and execute a request, along with waiting for the request to complete.
Chris Wilson [Mon, 13 Feb 2017 17:15:20 +0000 (17:15 +0000)]
drm/i915: Create a fake object for testing huge allocations
We would like to be able to exercise huge allocations even on memory
constrained devices. To do this we create an object that allocates only
a few pages and remaps them across its whole range - each page is reused
multiple times. We can therefore pretend we are rendering into a much
larger object.