Jani Nikula [Wed, 1 Apr 2015 07:58:05 +0000 (10:58 +0300)]
drm/i915: add bxt gmbus support
For BXT gmbus is pulled from PCH to CPU. From implementation point of
view only pin pair configuration will change. The existing
implementation supports all platforms previous to GEN8 and also SKL. But
for BXT pin pair configuration is completely different than SKL or other
previous GEN's. This patch introduces the new pin pair configuration
structure specific to BXT and also ensures every real gmbus port has a
gpio pin.
v3 by Jani: with the platform independent prep work in place, the bxt
enabling reduces to a fairly trivial patch. Credits are due Sunil for
giving me the ideas (with his patches) what the platform independent
parts should look like.
v4: Fix intel_hdmi_init_connector() for bxt. Abstract gmbus_pin access
more. s/GPU/PCH/ in commit message.
v5: Rebase.
Issue: VIZ-3574 Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Vandana Kannan [Tue, 19 Aug 2014 06:35:01 +0000 (12:05 +0530)]
drm/i915/bxt: don't use unsupported port detection
The port detection register flags in SFUSE_STRAP and DDI_BUF_CTL_A are
not defined for BXT, so don't use them.
Suggested by Satheesh.
v2:
- DDI_BUF_CTL_A bit 0 is not useful on BXT. Making changes to use this
bit when simulator or BXT is not applicable. Code re-arranged as per
Damien's suggestion.
Robert Beckett [Wed, 11 Mar 2015 08:28:25 +0000 (10:28 +0200)]
drm/i915/bxt: add workaround to avoid PTE corruption
Set TLBPF in TILECTL. This fixes an issue with BXT HW seeing
corrupted pte entries.
v2:
- move the workaround to bxt_init_clock_gating (imre)
Signed-off-by: Robert Beckett <robert.beckett@intel.com> (v1) Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Imre Deak [Fri, 27 Mar 2015 12:00:04 +0000 (14:00 +0200)]
drm/i915/bxt: add bxt_init_clock_gating
v2:
- Make the condition to select between SKL and BXT consistent with the
corresponding condition in init_workarounds_ring (Nick)
Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Imre Deak [Sun, 25 Jan 2015 21:27:11 +0000 (13:27 -0800)]
drm/i915/gen9: fix PIPE_CONTROL flush for VS_INVALIDATE
On GEN9+ per specification a NULL PIPE_CONTROL needs to be emitted
before any PIPE_CONTROL command with the VS_INVALIDATE flag set.
Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 16:28:24 +0000 (17:28 +0100)]
drm/i915: Remove obj->pin_mappable
The obj->pin_mappable flag only exists for debug purposes and is a
hindrance that is mistreated with rotated GGTT views. For debug
purposes, it suffices to mark objects with pin_display as being of note.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:41 +0000 (16:20 +0100)]
drm/i915: Optimistically spin for the request completion
This provides a nice boost to mesa in swap bound scenarios (as mesa
throttles itself to the previous frame and given the scenario that will
complete shortly). It will also provide a good boost to systems running
with semaphores disabled and so frequently waiting on the GPU as it
switches rings. In the most favourable of microbenchmarks, this can
increase performance by around 15% - though in practice improvements
will be marginal and rarely noticeable.
v2: Account for user timeouts
v3: Limit the spinning to a single jiffie (~1us) at most. On an
otherwise idle system, there is no scheduler contention and so without a
limit we would spin until the GPU is ready.
v4: Drop forcewake - the lazy coherent access doesn't require it, and we
have no reason to believe that the forcewake itself improves seqno
coherency - it only adds delay.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm/i915: skylake panel fitting using shared scalers
Enabling skylake panel fitting feature using shared scalers
v2:
-added force detach parameter for pfit disable purpose (me)
-read crtc scaler state from hw state (Daniel)
-replaced both skylake_pfit_enable and disable with skylake_pfit_update (me)
-added scaler id check to intel_pipe_config_compare (Daniel)
v3:
-updated function header to kerneldoc format (Matt)
-dropped need_scaling checks (Matt)
v4:
-move clearing of scaler id from commit path to check path (Matt)
-updated colorkey checks based on recent updates (me)
-squashed scaler check while enabling colorkey to here (me)
-use values in plane_state->src as regular integers (me)
-changes made not to modify state in commit path (Matt)
v5:
-squashed helper function to update scaler users to here (Matt)
-squashed helper function to detach scaler to here (Matt, me)
-changes to align with updated scaler structures (Matt, me)
Signed-off-by: Chandra Konduru <chandra.konduru@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm/i915: Ensure setting up scalers into staged crtc_state
From intel_atomic_check, call intel_atomic_setup_scalers() to
assign scalers based on staged scaling requests. Fail the
transaction if setup returns error.
Setting up of scalers should be moved to atomic crtc check once
atomic crtc is ready.
v2:
-updated parameter passing to setup_scalers (me)
Signed-off-by: Chandra Konduru <chandra.konduru@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Added intel_atomic_setup_scalers to setup scalers based on
staged scaling requests from a crtc and its planes. If staged
requests are supportable, this function assigns scalers to
requested planes and crtc. Note that the scaler assignement
itself is staged into crtc_state and respective plane_states
for later commit after all checks have been done.
overall high level flow:
- scaler requests are staged into crtc_state by planes/crtc
- check whether staged scaling requests can be supported
- add planes using scalers that aren't in current transaction
- assign scalers to requested users
- as part of plane commit, scalers will be committed
(i.e., either attached or detached) to respective planes in hw
- as part of crtc_commit, scaler will be either attached or detached
to crtc in hw
crtc_compute_config calls intel_atomic_setup_scalers() to start
scaler assignments as per scaler state in crtc config. This call
should be moved to atomic crtc once it is available.
v2:
-removed a log message (me)
-changed input parameter to crtc_state (me)
v3:
-remove assigning plane_state returned by drm_atomic_get_plane_state (Matt)
-fail if there is an error from drm_atomic_get_plane_state (Matt)
v4:
-changes to align with updated scaler structure (Matt, me)
v5:
-added addtional checks before enabling HQ mode (me)
-added comments to enable HQ mode (Matt)
Signed-off-by: Chandra Konduru <chandra.konduru@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm/i915: Preserve scaler state when clearing crtc_state
crtc_state is cleared during mode set which wipes out complete
scaler state too. This is causing issues. To fix, ensure scaler
state is preserved because it contains not only crtc
scaler usage, but also planes using scalers on this crtc.
Signed-off-by: Chandra Konduru <chandra.konduru@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm/i915: Allocate connector state together with the connectors
Connector states were being allocated in intel_setup_outputs() in loop
over all connectors. That meant hot-added connectors would have a NULL
state. Since the change to use a struct drm_atomic_state for the legacy
modeset, connector states are necessary for the i915 driver to function
properly, so that would lead to oopses.
v2: Fix test for intel_connector_init() success in lvds and sdvo (PRTS)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reported-and-tested-by: Nicolas Kalkhof <nkalkhof@web.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm/i915: add i915 specific connector debugfs file for DPCD
Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Mika Kuoppala [Fri, 10 Apr 2015 12:54:58 +0000 (15:54 +0300)]
drm/i915: Move vm page allocation in proper place
Move to i915_vma_bind as it is part of the binding.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
v2: Moving creation of property in a function, checking for 90/270
rotation simultaneously (Chris)
Letting primary plane to be positioned
v3: Adding if/else for 90/270 and rest params programming, adding check for
pixel_format, some cleanup (review comments)
v4: Adding right pixel_formats, using src_* params instead of crtc_* for offset
and size programming (Ville)
v5: Rebased on -nightly and Tvrtko's series for gtt remapping.
v6: Rebased on -nightly (Tvrtko's series merged)
v7: Moving pixel_format check to intel_atomic_plane_check (Matt)
Signed-off-by: Sonika Jindal <sonika.jindal@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm/i915: Remove stale comment from __intel_set_mode()
Since the following commit, the PLL calculations are done earlier, so
the code following the comment doesn't do anything PLL or encoder
related. It only updates the primary plane now.
commit f3019a4d92f08b2dd92443a4b567a066a51c6ec0
Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Date: Wed Oct 29 11:32:37 2014 +0200
drm/i915: Remove crtc_mode_set() hook
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 16:28:17 +0000 (17:28 +0100)]
drm/i915: Simplify object is-pinned checking for shrinker
When looking for viable candidates to shrink, we only want objects that
are not pinned. However to do so we performed a double iteration over
the vma in the objects, first looking for the pin-count, then looking
for allocations. We can do both at once and be slightly more explicit in
our validity test.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:21:11 +0000 (16:21 +0100)]
drm/i915: Allocate context objects from stolen
As we never expose context objects directly to userspace, we can forgo
allocating a first-class GEM object for them and prefer to use the
limited resource of reserved/stolen memory for them. Note this means
that their initial contents are undefined.
However, a downside of using stolen objects for execlists is that we
cannot access the physical address directly (thanks MCH!) which prevents
their use.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:21:09 +0000 (16:21 +0100)]
drm/i915: Remove request->uniq
We already assign a unique identifier to every request: seqno. That
someone felt like adding a second one without even mentioning why and
tweaking ABI smells very fishy.
drm/i915: Fix a use after free, and unbalanced refcounting
v2: Rebase
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Nick Hoath <nicholas.hoath@intel.com> Cc: Thomas Daniel <thomas.daniel@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jani Nikula <jani.nikula@intel.com>
[danvet: Fixup because different merge order.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:21:04 +0000 (16:21 +0100)]
drm/i915: Reduce locking in gen8 IRQ handler
Similar in vain in reducing the number of unrequired spinlocks used for
execlist command submission (where the forcewake is required but
manually controlled), we know that the IRQ registers are outside of the
powerwell and so we can access them directly. Since we now have direct
access exported via I915_READ_FW/I915_WRITE_FW, lets put those to use in
the irq handlers as well.
In the process, reorder the execlist submission to happen as early as
possible.
v2: Restrict the untraced register mmio to just the GT path (i.e. the
hotpath for execlists)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:21:02 +0000 (16:21 +0100)]
drm/i915: Reduce locking in execlist command submission
This eliminates six needless spin lock/unlock pairs when writing out
ELSP.
v2: Respin with my preferred colour.
v3: Mostly back to the original colour
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v1] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Matt Roper [Thu, 9 Apr 2015 17:48:38 +0000 (10:48 -0700)]
drm/i915: Clear crtc atomic flags at beginning of transaction
Once we have full atomic modeset, these kind of flags should be in a
real intel_crtc_state that's tracked properly. In the meantime, make
sure we clear out any old flags at the beginning of a transaction so
that we don't wind up seeing leftover flags from old transactions that
were checked, but never went to the commit step. At the moment, a
failed check or prepare could leave stale flags behind that interfere
with the next atomic transaction.
v2: Just do a memset; the series this patch was originally part of
placed additional fields into the structure that shouldn't be
cleared, but that's no longer the case.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Matt Roper [Thu, 9 Apr 2015 01:56:53 +0000 (18:56 -0700)]
drm/i915: Switch to full atomic helpers for plane updates/disable, take two
Switch from our plane update/disable entrypoints to use the full atomic
helpers (which generate a top-level atomic transaction) rather than the
transitional helpers (which only create/manipulate orphaned plane states
independent of a top-level transaction). Various upcoming work (SKL
scalers, atomic watermarks, etc.) requires a full atomic transaction to
behave properly/cleanly.
Last time we tried this, we had to back out the change because we still
call the drm_plane vfuncs directly from within our legacy modesetting
code. This potentially results in nested atomic transactions, locking
collisions, and other failures. To avoid that problem again, we
sidestep the issue by calling the transitional helpers directly (rather
than through a vfunc) when we're nested inside of other legacy
modesetting code. However this does allow legacy SetPlane() ioctl's to
process an entire drm_atomic_state transaction, which is important for
upcoming patches.
Cc: Chandra Konduru <chandra.konduru@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:51 +0000 (16:20 +0100)]
drm/i915: Remove vestigal DRI1 ring quiescing code
After the removal of DRI1, all access to the rings are through requests
and so we can always be sure that there is a request to wait upon to
free up available space. The fallback code only existed so that we could
quiesce the GPU following unmediated access by DRI1.
v2: Rebase
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:49 +0000 (16:20 +0100)]
drm/i915: Use the global runtime-pm wakelock for a busy GPU for execlists
When we submit a request to the GPU, we first take the rpm wakelock, and
only release it once the GPU has been idle for a small period of time
after all requests have been complete. This means that we are sure no
new interrupt can arrive whilst we do not hold the rpm wakelock and so
can drop the individual get/put around every single request inside
execlists.
Note: to close one potential issue we should mark the GPU as busy
earlier in __i915_add_request.
To elaborate: The issue is that we emit the irq signalling sequence
before we grab the rpm reference, which means we could miss the
resulting interrupt (since that's not set up when suspended). The only
bad side effect is a missed interrupt, gt mmio writes automatically
wake up the hw itself. But otoh we have an umbrella rpm reference for
the entirety of execbuf, as long as that's there we're covered.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Explain a bit more about the add_request issue, which after
some irc chatting with Chris turns out to not be an issue really.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ville Syrjälä [Wed, 25 Mar 2015 16:45:58 +0000 (18:45 +0200)]
drm/i915: Fix the VBT child device parsing for BSW
Recent BSW VBT has a VBT child device size 37 bytes instead of the 33
bytes our code assumes. This means we fail to parse the VBT and thus
fail to detect eDP ports properly and just register them as DP ports
instead.
Fix it up by using the reported child device size from the VBT instead
of assuming it matches out struct defintions.
The latest spec I have shows that the child device size should be 36
bytes for rev >= 195, however on my BSW the size is actually 37 bytes.
And our current struct definition is 33 bytes.
Feels like the entire VBT parses would need to be rewritten to handle
changes in the layout better, but for now I've decided to do just the
bare minimum to get my eDP port back.
Cc: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:35 +0000 (12:13 +0100)]
drm/i915: Use complete address space in true PPGTT
True PPGTT is capable of having a full address space, even if the system
has less allocated memory.
Note that aliasing PPGTT always aliases the GGTT and thus should remain
of the same size.
Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:34 +0000 (12:13 +0100)]
drm/i915/gen8: Dynamic page table allocations
This finishes off the dynamic page tables allocations, in the legacy 3
level style that already exists. Most everything has already been setup
to this point, the patch finishes off the enabling by setting the
appropriate function pointers.
In LRC mode, contexts need to know the PDPs when they are populated. With
dynamic page table allocations, these PDPs may not exist yet. Check if
PDPs have been allocated and use the scratch page if they do not exist yet.
Before submission, update the PDPs in the logic ring context as PDPs
have been allocated.
v2: Update aliasing/true ppgtt allocate/teardown/clear functions for
gen 6 & 7.
v3: Rebase.
v4: Remove BUG() from ppgtt_unbind_vma, but keep checking that either
teardown_va_range or clear_range functions exist (Daniel).
v5: Similar to gen6, in init, gen8_ppgtt_clear_range call is only needed
for aliasing ppgtt. Zombie tracking was originally added for teardown
function and is no longer required.
v6: Update err_out case in gen8_alloc_va_range (missed from lastest
rebase).
v7: Rebase after s/page_tables/page_table/.
v8: Updated scratch_pt check after scratch flag was removed in previous
patch.
v9: Note that lrc mode needs to be updated to support init state without
any PDP.
v10: Unmap correct page_table in gen8_alloc_va_range's error case, clean-up
gen8_aliasing_ppgtt_init (remove duplicated map), and initialize PTs
during page table allocation.
v11: Squashed LRC enabling commit, otherwise LRC mode would be left broken
until it was updated to handle the init case without any PDP.
v12: Do not overallocate new_pts bitmap, make alloc_gen8_temp_bitmaps
static and don't abuse of inline functions. (Mika)
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:33 +0000 (12:13 +0100)]
drm/i915/gen8: begin bitmap tracking
Like with gen6/7, we can enable bitmap tracking with all the
preallocations to make sure things actually don't blow up.
v2: Rebased to match changes from previous patches.
v3: Without teardown logic, rely on used_pdpes and used_pdes when
freeing page tables.
v4: Rebased after s/page_tables/page_table/.
v5: Rebased after page table generalizations.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:32 +0000 (12:13 +0100)]
drm/i915/gen8: Split out mappings
When we do dynamic page table allocations for gen8, we'll need to have
more control over how and when we map page tables, similar to gen6.
In particular, DMA mappings for page directories/tables occur at allocation
time.
This patch adds the functionality and calls it at init, which should
have no functional change.
The PDPEs are still a special case for now. We'll need a function for
that in the future as well.
v2: Handle renamed unmap_and_free_page functions.
v3: Updated after teardown_va logic was removed.
v4: Rebase after s/page_tables/page_table/.
v5: No longer allocate all PDPs in GEN8+ systems with less than 4GB of
memory, and update populate_lr_context to handle this new case (proper
tracking will be added later in the patch series).
v6: Assign lrc page directory pointer addresses using a macro. (Mika)
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:31 +0000 (12:13 +0100)]
drm/i915: Extract PPGTT param from page_directory alloc
This will be useful for when we move to 48b addressing, and the PDP isn't
the root of the page table structure.
v2: Rebase after changes for Gen8+ systems with less than 4GB of memory.
v3: Rebase after Mika's code review.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These values are never quite useful for dynamic allocations of the page
tables. Getting rid of them will help prevent later confusion.
v2: Updated to use unmap_and_free_pd functions.
v3: Updated gen8_ppgtt_free after teardown logic was removed.
v4: Rebase after s/page_tables/page_table/.
v5: Keep allocating all page directories in GEN8+ systems with less
than 4GB of memory. Updated gen6_for_all_pdes.
v6: Prevent (harmless) out of range access in gen6_for_all_pdes.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:29 +0000 (12:13 +0100)]
drm/i915/gen8: Update pdp switch and point unused PDPs to scratch page
One important part of this patch is we now write a scratch page
directory into any unused PDP descriptors. This matters for 2 reasons,
first, we're not allowed to just use 0, or an invalid pointer, and second,
we must wipe out any previous contents from the last context.
The latter point only matters with full PPGTT. The former point only
effect platforms with less than 4GB memory.
v2: Updated commit message to point that we must set unused PDPs to the
scratch page.
v3: Unmap scratch_pd in gen8_ppgtt_free.
v4: Initialize scratch_pd. (Mika)
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:28 +0000 (12:13 +0100)]
drm/i915/gen8: pagetable allocation rework
Start using gen8_for_each_pde macro to allocate page tables.
v2: teardown_va_range references removed.
v3: Rebase after s/page_tables/page_table/.
v4: Keep setting up page tables for all page directories in systems with
less than 4GB of memory.
v5: Also initialize the page tables. (Mika)
v6: Initialize all page tables, including the extra ones from systems
with less than 4GB of memory. (Mika)
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:27 +0000 (12:13 +0100)]
drm/i915/gen8: page directories rework allocation
Start using gen8_for_each_pdpe macro to allocate the page directories.
Similar to PTs, while setting up a page directory, make all entries of
the pd point to the scratch pd before mapping (and make all its entries
point to the scratch page); this is to be safe in case of out of bound
access or proactive prefetch. Systems without LLC require an explicit
flush.
v2: Rebased after s/free_pt_*/unmap_and_free_pt/ change.
v3: Rebased after teardown va range logic was removed.
v4: Keep setting up all page directories for systems with less than 4GB
of memory.
v5: Initialize PDs. (Mika)
v6: Initialize also the extra PDs from systems with less than 4GB of
memory. (Mika)
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:26 +0000 (12:13 +0100)]
drm/i915/gen8: Add dynamic allocation macros and helper functions
Similar to gen6, we will use for_each_pde/for_each_pdpe
and pte/pde/pdpe_index to iterate over these new structures.
v2: Match trace_i915_va_teardown params
v3: Multiple rebases.
v4: Updated to use unmap_and_free_pt.
v5: teardown_va_range logic no longer needed.
v6: Rebase after s/page_tables/page_table/.
v7: Renamed commit to match what it does now (it was "Use dynamic
allocation idioms on free").
v8: Prevent (harmless) out of range access in gen8_for_each_pde and
gen8_for_each_pdpe_e.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: s/BUG/WARN/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:25 +0000 (12:13 +0100)]
drm/i915/gen8: Initialize page tables
Similar to gen6, while setting up a page table, make all entries of the
pt point to the scratch page before mapping; this is to be safe in case
of out of bound access or proactive prefetch.
Systems without LLC require an explicit flush.
v2: Expanded commit text and fixed indentation (Mika)
Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We are already unmapping them in gen8_ppgtt_free. This function became
redundant since commit 06fda602dbca9c59d87db7da71192e4b54c9f5ff
("drm/i915: Create page table allocators").
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Michel Thierry [Wed, 8 Apr 2015 11:13:23 +0000 (12:13 +0100)]
drm/i915: Remove _entry from PPGTT page structures
Lets try to keep this consistent:
Page Directory Pointer (PDP).
Page Directory (PD), also known as page directory pointer entries.
Page Table (PT), also known as page directory entries.
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:47 +0000 (16:20 +0100)]
drm/i915: Record ring->start address in error state
This is mostly useful for execlists where the rings switch between
contexts (and so checking that the ring's start register matches the
context is important).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:40 +0000 (16:20 +0100)]
drm/i915: Suppress empty lines from debugfs/i915_gem_objects
This is just so that I don't have to read about the batch pool on
systems that are not using it! Rather than using a newline between the
kernel clients and userspace clients, just distinguish the internal
allocations with a '[k]'
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:39 +0000 (16:20 +0100)]
drm/i915: Include active flag when describing objects in debugfs
Since we use obj->active as a hint in many places throughout the code,
knowing its state in debugfs is extremely useful.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:38 +0000 (16:20 +0100)]
drm/i915: Split batch pool into size buckets
Now with the trimmed memcpy before the command parser, we try to
allocate many different sizes of batches, predominantly one or two
pages. We can therefore speed up searching for a good sized batch by
keeping the objects of buckets of roughly the same size.
v2: Add a comment about bucket sizes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:37 +0000 (16:20 +0100)]
drm/i915: Free batch pool when idle
At runtime, this helps ensure that the batch pools are kept trim and
fast. Then at suspend, this releases memory that we do not need to
restore. It also ties into the oom-notifier to ensure that we recover as
much kernel memory as possible during OOM.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:36 +0000 (16:20 +0100)]
drm/i915: Split the batch pool by engine
I woke up one morning and found 50k objects sitting in the batch pool
and every search seemed to iterate the entire list... Painting the
screen in oils would provide a more fluid display.
One issue with the current design is that we only check for retirements
on the current ring when preparing to submit a new batch. This means
that we can have thousands of "active" batches on another ring that we
have to walk over. The simplest way to avoid that is to split the pools
per ring and then our LRU execution ordering will also ensure that the
inactive buffers remain at the front.
v2: execlists still requires duplicate code.
v3: execlists requires more duplicate code
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:35 +0000 (16:20 +0100)]
drm/i915: Tidy batch pool logic
Move the madvise logic out of the execbuffer main path into the
relatively rare allocation path, making the execbuffer manipulation less
fragile.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:34 +0000 (16:20 +0100)]
drm/i915: Split i915_gem_batch_pool into its own header
In the next patch, I want to use the structure elsewhere and so require
it defined earlier. Rather than move the definition to an earlier location
where it feels very odd, place it in its own header file.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The premise that media/blitter workloads are not affected by boosting is
patently false with a trip through igt. The question that remains is
what exactly is going wrong with the media workload that prompted this?
Hopefully that would be fixed by the missing agressive downclocking, in
addition to the extra restrictions imposed on how frequent a process is
allowed to boost.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Deepak S <deepak.s@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll> Acked-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:32 +0000 (16:20 +0100)]
drm/i915: Deminish contribution of wait-boosting from clients
With boosting for missed pageflips, we have a much stronger indication
of when we need to (temporarily) boost GPU frequency to ensure smooth
delivery of frames. So now only allow each client to perform one RPS boost
in each period of GPU activity due to stalling on results.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Deepak S <deepak.s@linux.intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:31 +0000 (16:20 +0100)]
drm/i915: Boost GPU frequency if we detect outstanding pageflips
If we hit a vblank and see that have a pageflip queue but not yet
processed, ensure that the GPU is running at maximum in order to clear
the backlog. Pageflips are only queued for the following vblank, if we
miss it, there will be a visible stutter. Boosting the GPU frequency
doesn't prevent us from missing the target vblank, but it should help
the subsequent frames hitting theirs.
v2: Reorder vblank vs flip-complete so that we only check for a missed
flip after processing the completion events, and avoid spurious boosts.
v3: Rename missed_vblank
v4: Rebase
v5: Cancel the outstanding work in runtime suspend
v6: Rebase
v7: Rebase required fixing
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Deepak S<deepak.s@linux.intel.com> Reviewed-by: Deepak S<deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:29 +0000 (16:20 +0100)]
drm/i915: Fix computation of last_adjustment for RPS autotuning
The issue is that by computing the last_adj value after applying the
clamping, we can end up with a bogus value for feeding into the next RPS
autotuning step.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Deepak S <deepak.s@linux.intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:28 +0000 (16:20 +0100)]
drm/i915: Agressive downclocking on Baytrail
Reuse the same reclocking strategy for Baytail as on its bigger brethren,
Sandybridge and Ivybridge. In particular, this makes the device quicker
to reclock (both up and down) though the tendency now is to downclock
more aggressively to compensate for the RPS boosts.
v2: Rebase
v3: Exclude Cherrytrail as Deepak was concerned that the increased
number of register writes would wake the common powerwell too often.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Deepak S <deepak.s@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:26 +0000 (16:20 +0100)]
drm/i915: Fix the flip synchronisation to consider mmioflips
Currently we emit semaphore synchronisation as if we were going to flip
using the target CS engine, but we then change our minds and do the flip
using the CPU. Consequently we write instructions to the ring but never
use them - even to the point of filling that ring up entirely and never
submitting a request.
The wrinkle in the ointment is that we have to tell a white lie to
pin-to-display for it to skip the synchronisation for mmioflips as we
will create a task specifically for that slow synchronisation. An oddity
of note is the discrepancy in requests that we tell to pin-display to
serialise to and that we then eventually wait upon. This is due to a
limitation in the i915_gem_object_sync() routine that will be lifted
later.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 7 Apr 2015 15:20:25 +0000 (16:20 +0100)]
drm/i915: Cache last obj->pages location for i915_gem_object_get_page()
The biggest user of i915_gem_object_get_page() is the relocation
processing during execbuffer. Typically userspace passes in a set of
relocations in sorted order. Sadly, we alternate between relocations
increasing from the start of the buffers, and relocations decreasing
from the end. However the majority of consecutive lookups will still be
in the same page. We could cache the start of the last sg chain, however
for most callers, the entire sgl is inside a single chain and so we see
no improve from the extra layer of caching.
v2: Avoid the double increment inside unlikely()
References: https://bugs.freedesktop.org/show_bug.cgi?id=88308 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm/i915: Don't use staged config in intel_mst_pre_enable_dp()
For the conversion to atomic. The pre_enable() hooks are called as part
of the crtc enable sequence, at which point the staged config was
already made effective. Furthermore, the function actually changes
hardware state, so it should anyway deal with current and not staged
config.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm/i915: Don't use staged config for VLV cdclk calculations
Now that we use a drm atomic state for the legacy modeset, it is
possible to get rid of the usage of intel_crtc->new_config in the
function intel_mode_max_pixclk().
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Thu, 2 Apr 2015 09:35:08 +0000 (10:35 +0100)]
drm/i915: Allow disabling the destination colorkey for overlay
Sometimes userspace wants a true overlay that is never clipped. In such
cases, we need to disable the destination colorkey. However, it is
currently unconditionally enabled in the overlay with no means of
disabling. So rectify that by always default to on, and extending the
UPDATE_ATTR ioctl to support explicit disabling of the colorkey.
This is contrast to the spite code which requires explicit enabling of
either the destination or source colorkey. Handling source colorkey is
still todo for the overlay. (Of course it may be worth migrating overlay
to sprite before then.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jeff McGee [Sat, 4 Apr 2015 01:13:18 +0000 (18:13 -0700)]
drm/i915/bxt: Support BXT in SSEU device status dump
Modify the Gen9 SSEU device status logic to support Broxton.
Broxton reuses the Skylake power gate acknowledgment registers but
has at most 1 slice and 3 subslices. Broxton supports subslice
power gating within its single slice.
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jeff McGee [Sat, 4 Apr 2015 01:13:16 +0000 (18:13 -0700)]
drm/i915/bxt: Determine BXT slice/subslice/EU info
Modify the Gen9 SSEU info initialization logic to support
Broxton. Broxton reuses the SKL fuse registers but has at most
1 slice and 6 EU per subslice.
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Nick Hoath [Fri, 20 Mar 2015 09:03:52 +0000 (09:03 +0000)]
drm/i915/bxt: Add Broxton steppings
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Nick Hoath [Tue, 17 Mar 2015 09:39:38 +0000 (11:39 +0200)]
drm/i915/bxt: HardWare WorkAround ring initialisation for Broxton
Adds framework for Broxton HW WAs
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Imre Deak [Fri, 27 Mar 2015 11:07:33 +0000 (13:07 +0200)]
drm/i915/bxt: map GTT as uncached
On Broxton per specification the GTT has to be mapped as uncached.
This was caught by the PTE write readback warning, which showed a
corrupted PTE value with using the current write-combine mapping.
v2:
- add comment explaining how the problem with WC mapping manifests
(Daniel)
Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>