New GuC code is logging errors at runtime suspend and resume which
causes CI testing to log "orange" status. Default to not trying to
load the firmware until this is resolved.
Ville Syrjälä [Mon, 16 May 2016 13:59:40 +0000 (16:59 +0300)]
drm/i915: Assert the dbuf is enabled when disabling DC5/6
Like with cdclk, the DMC is supposed to manage dbuf enabling/disabling.
Let's make sure it has correctly restored the dbuf state to enabled
when we disable the DC states.
Ville Syrjälä [Fri, 13 May 2016 20:41:40 +0000 (23:41 +0300)]
drm/i915: Set BXT cdclk to minimum initially
In case the driver is initialized without active displays, we should
just drop the cdclk to the minimum frequency right off the bat. There
might not be a modeset to drop it to the minimum late rafter all.
With DMC supposedly we should always have the cdclk up and running.
The DMC will shut the DE PLL down when appropriate, so let's nuke
the related FIXMEs as well. Trying to do anything different would
go against the expectations of the DMC firmware, and we all know
how fragile the DMC firmware is.
Ville Syrjälä [Fri, 13 May 2016 20:41:39 +0000 (23:41 +0300)]
drm/i915: Replace bxt_verify_cdclk_state() with a more generic cdclk check
Rather than having a BXT specific function to make sure the DE PLL is
enabled after disabling DC6, let's just make sure the current cdclk
is the same as what we last programmed.
Having another check in bxt_display_core_init() almost immediately after
the cdclk init seems redundant, so let's just kill that one.
Ville Syrjälä [Fri, 13 May 2016 20:41:38 +0000 (23:41 +0300)]
drm/i915: Make bxt_set_cdclk() operate in terms of the current vs target DE PLL vco
Make bxt_set_cdclk() more readable by looking at current vs. target
DE PLL vco to determine if the DE PLL needs disabling and/or enabling.
We can also calculate the CD2X divider simply as (vco/cdclk) instead of
depending on magic numbers.
The magic numbers are still needed though, but only to map the supported
cdclk frequencies to corresponding DE PLL frequencies.
Note that w'll now program CDCLK_CTL correctly even for the bypass case.
Actually the CD2X divider should not matter in that case since the
hardware will bypass it too, but the "decimal" part should matter (if we
want to do gmbus/aux with the bypass enabled).
Ville Syrjälä [Fri, 13 May 2016 20:41:34 +0000 (23:41 +0300)]
drm/i915: Extract bxt DE PLL enable/disable from broxton_set_cdclk()
Enabling and disalbing the DE PLL are two nice self contained
operations, so let's move them into a few small helper functions.
Makes it easier to see the forest from the trees in broxton_set_cdclk().
Ville Syrjälä [Fri, 13 May 2016 20:41:33 +0000 (23:41 +0300)]
drm/i915: Store cdclk PLL reference clock under dev_priv
Future platforms will have multiple options for the cdclk PLL reference
clock, so let's start tracking that under dev_priv alreday on SKL,
although on SKL it's always 24 MHz.
Ville Syrjälä [Fri, 13 May 2016 20:41:32 +0000 (23:41 +0300)]
drm/i915: Rename skl_vco_freq to cdclk_pll.vco
We'll want to store the cdclk PLL (whatever PLL that is in reality) vco
frequency somewhere on other platforms too, so let's rename the
skl_vco_freq to cdclk_pll.vco, and let's store it in kHz instead of MHz
to match most of the other clocks.
Ville Syrjälä [Fri, 13 May 2016 20:41:31 +0000 (23:41 +0300)]
drm/i915: Make 308 and 671 MHz cdclks more accurate on SKL
The SKL 308.57 MHz cdclk is probably 8640/28 = ~308.571 Mhz.
Similartly the 617.14 MHz cdclk is probably 8640/14 = ~617.143 MHz.
Let's use the slightly more accurate numbers. Potentially we might
change to computing all of these based on dividers, but let's
stick to the current theme for now..
Ville Syrjälä [Fri, 13 May 2016 20:41:30 +0000 (23:41 +0300)]
drm/i915: Move SKL+ DBUF enable/disable to display core init/uninit
SKL and BXT have the same snippets of code for enabling disabling the
DBUF. Extract those into helpers and move the calls from
init/unit_cdclk() to the display core init/init since this stuff isn't
really about cdclk. Also doing the enable twice shouldn't hurt since
you're just setting the request bit again when it was already set.
We can also toss in a few WARNs about the register values into
skl_get_dpll0_vco() now that we know that things should always be
sane there.
Ville Syrjälä [Fri, 13 May 2016 20:41:29 +0000 (23:41 +0300)]
drm/i915: Unify SKL cdclk init paths
Currently we initialize cdclk on SKL from two different places,
depending on whether it's during driver init or resume. Let's
unify it to happen from the same place always, and that place will be
the display core init function.
To do this we first run through the cdclk sanitation code, which will
first verify that the PLL is programmed correctly, after which we can
read out the current cdclk frequency, and once the cdclk is known we
verify that the cdclk "decimal" frequency is programmed correctly. If
any of these fail we will force a cdclk change, and to be safe we also
force the PLL to be turned off and on again. If the sanitation step
didn't notice anything amiss, we'll skip the cdclk programming which
will prevent cdclk reprogramming when the displays might be active.
We can also toss in a few WARNs about the register values into
skl_update_dpll0() since we now know that the PLL state should
always be sane when that function is called.
Ville Syrjälä [Fri, 13 May 2016 20:41:28 +0000 (23:41 +0300)]
drm/i915: Beef up skl_sanitize_cdclk() a bit
Also verify the DPLL_CTRL1 register value in skl_sanitize_cdclk(), throw
out a few unneeded variables, and write the CDCLK_CTL check a bit more
legible way.
Ville Syrjälä [Fri, 13 May 2016 20:41:27 +0000 (23:41 +0300)]
drm/i915: Keep track of preferred cdclk vco frequency on SKL
Now that skl_vco_freq tracks the actual DPLL0 vco frequency, we'll need
something that keeps track of which vco frequency we want to use in case
the current vco is 0. This would be important across supend/resume since
we'll disable DPLL0 around those parts.
We'll also update our idea of max cdclk/dotclock when the preferred
vco changes. That could happen if out initial guess was wrong, and
later eDP would force us to change it. One issue here could be that
changing the max dotclock could cause our mode list to change during
next time the displays get probed. But I don't see a good way to avoid
that, except perhaps by allowing either vco frequency to be used as
needed. But the docs suggest that such usage wasn't really inteded.
Also need to make sure we don't update our max_cdclk value before we
have a preferred vco value, which means moving that to happen after
the cdclk sanitation.
Ville Syrjälä [Fri, 13 May 2016 20:41:24 +0000 (23:41 +0300)]
drm/i915: Actually read out DPLL0 vco on skl from hardware
Currently we're trying to guess which lcpll vco frequency is used
use based on the cdclk. That doesn't work for cdclk==540 since
both vco frequencies can generate a 540 Mhz output. Let's stop
guessing and just read the actual vco frequency from the
hardware.
Ville Syrjälä [Fri, 13 May 2016 20:41:23 +0000 (23:41 +0300)]
drm/i915: Extract skl_calc_cdclk()
We have many places where we want to pick a suitable cdclk frequency for
skl based on the dotclock and lcpll vco. Split that code into a small
helper and call it from all over.
Ville Syrjälä [Fri, 13 May 2016 20:41:22 +0000 (23:41 +0300)]
drm/i915: Move the SKL DPLL0 VCO computation into intel_dp_compute_config()
Shared plls won't get assigned until the .compute_clocks() hook gets
called, which happens from the crtc .atomic_check hook. That's too late
as the cdclk computation has already happened. So let's move the DPLL0
VCO computation into intel_dp_compute_config() so that it's done when
the cdclk computation happens. Also only do it for eDP since we only
pick DPLL0 for eDP.
Clint Taylor [Fri, 13 May 2016 20:41:21 +0000 (23:41 +0300)]
drm/i915/skl: SKL CDCLK change on modeset tracking VCO
WARNING: Using ChromeOS with an eDP panel and a 4K@60 DP monitor connected
to DDI1 the system will hard hang during a cold boot. Occurs when DDI1
is enabled when the cdclk is less then required. DP connected to DDI2
and HPD on either port works correctly.
Set cdclk based on the max required pixel clock based on VCO
selected. Track boot vco instead of boot cdclk.
The vco is now tracked at the atomic level and all CRTCs updated if
the required vco is changed. Not tested with eDP v1.4 panels that
require 8640 vco due to availability.
V1: initial version
V2: add vco tracking in intel_dp_compute_config(), rename
skl_boot_cdclk.
V3: rebase, V2 feedback not possible as encoders are not aware of
atomic.
V4: track target vco is atomic state. modeset all CRTCs if vco changes
V5: rename atomic variable, cleaner if/else logic, use existing vco if
encoder does not return a new vco value. check_patch.pl cleanup
V6: simplify logic in intel_modeset_checks.
V7: reorder an IF for readability and whitespace fix.
V8: use dev_cdclk for tracking new cdclk during atomic
V9: correctly handle vco 8640 when crtcs==0
V10: Clean up if else in crtcs==0
V11: Rebase for new intel_dpll_mgr.c
Ville Syrjälä [Fri, 13 May 2016 20:41:20 +0000 (23:41 +0300)]
drm/i915: Fix BXT min_pixclk after state readout
commit 4e5ca60fd35a ("drm/i915: Use ilk_max_pixel_rate() for BXT cdclk calculation")
tried to change BXT to use ilk_max_pixel_rate() to compute the
pipe pixel rate. I failed to notice that there was another place
in the state readout code that needs the same treatment. So let's
change that one too.
Should probably just change things to always compuyte the pipe pixel
rates, instead of just doing on platforms that can change cdclk
dynamically. But for now let's just move BXT fully over to the
side that uses ilk_pipe_pixel_rate().
Chris Wilson [Mon, 23 May 2016 14:08:10 +0000 (15:08 +0100)]
drm/i915/opregion: Rename init/fini functions to register/unregister
Current intel_opregion_init is called during the driver registration
phase and intel_opregion_fini from the unregistration phase. Rename the
functions so that this is clear from their names. The phases tell us
what we expect the existing hw state to be, e.g. whether interrupts are
still enabled etc.
It should be noted that the opregion init/fini routines are asymmetric
and this is carried across into their new names. Indeed, their new names
make it even clearer that perhaps all is not well in the opregion
suspend/resume sequence (as well in the module unload).
Chris Wilson [Mon, 23 May 2016 14:08:09 +0000 (15:08 +0100)]
drm/i915/opregion: Convert to using native drm_i915_private
Prefer passing struct drm_i915_private to internal interfaces as this
saves us having to dance between drm_device and our native struct. The
savings hare are small (only 70 bytes of unrequired dancing), but
progressive!
Ville Syrjälä [Thu, 19 May 2016 09:14:43 +0000 (12:14 +0300)]
drm/i915: Enable GSE interrupt on BDW+
We've never actually enabled or unmasked the GSE interrupt on BDW+,
even though the interrupt handler was always prepared for it.
Let's enable it and see what happens.
Credit to Mark Kettenis who fixed this in the OpenBSD fork of the
driver. He reports that it fixed the "ACPI _BCM/_BCQ-based
brightness mechanism on a MacBookPro12,1 and a 3rd gen Lenovo X1
Carbon" for them.
Mark says:
"FWIW, this *is* needed if you want ACPI-based backlight control to
work. On Linux you probably don't notice, since "hardware" backlight
control is preferred over "firmware" or "platform" backlight control.
It would help me if this did land in the Linux tree though, as it will
make future imports of the i915 driver into OpenBSD easier."
So even though we don't really need this, let's put it in to help Mark
with future porting efforts. Should be harmless to have it enabled in
any case.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
References: http://lists.freedesktop.org/archives/intel-gfx/2015-December/081799.html Reported-by: Mark Kettenis <mark.kettenis@xs4all.nl> Cc: Mark Kettenis <mark.kettenis@xs4all.nl> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463649283-28698-1-git-send-email-ville.syrjala@linux.intel.com Acked-by: Jani Nikula <jani.nikula@intel.com>
Dave Gordon [Fri, 13 May 2016 14:36:34 +0000 (15:36 +0100)]
drm/i915/guc: rework guc_add_workqueue_item()
Mostly little optimisations and future-proofing against code breakage.
For instance, if the driver is correctly following the submission
protocol, the "out of space" condition is impossible, so the previous
runtime WARN_ON() is promoted to a GEM_BUG_ON() for a more dramatic
effect in development and less impact in end-user systems.
Similarly we can make alignment checking more stringent and replace
other WARN_ON() conditions that don't relate to the runtime hardware
state with either BUILD_BUG_ON() for compile-time-detectable issues, or
GEM_BUG_ON() for logical "can't happen" errors.
With those changes, we can convert it to void, as suggested by Chris
Wilson, and update the calling code appropriately.
v2:
Note that we're now putting the request seqno in the "fence_id"
field of each GuC-work-item, in case it turns up somewhere useful
(e.g. in a GuC log) [Tvrtko Ursulin].
Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Dave Gordon [Fri, 13 May 2016 14:36:33 +0000 (15:36 +0100)]
drm/i915/guc: don't spinwait if the GuC's workqueue is full
Rather than wait to see whether more space becomes available in the GuC
submission workqueue, we can just return -EAGAIN and let the caller try
again in a little while. This gets rid of an uninterruptable sleep in
the polling code :)
We'll also add a counter to the GuC client statistics, to see how often
we find the WQ full.
v2:
Flag the likely() code path (Tvtrko Ursulin).
v4:
Add/update comments about failure counters (Tvtrko Ursulin).
Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Dave Gordon [Fri, 13 May 2016 14:36:32 +0000 (15:36 +0100)]
drm/i915/guc: pass request (not client) to i915_guc_{wq_check_space, submit}()
The knowledge of how to derive the relevant client from the request
should be localised within i915_guc_submission.c; the LRC code shouldn't
have to know about the internal details of the GuC submission process.
And all the information the GuC code needs should be encapsulated in (or
reachable from) the request.
v2:
GEM_BUG_ON() for bad GuC client (Tvrtko Ursulin).
Add/update kerneldoc explaining check_space/submit protocol
Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Dave Gordon [Fri, 20 May 2016 10:42:42 +0000 (11:42 +0100)]
drm/i915/guc: add enable_guc_loading parameter
Split the function of "enable_guc_submission" into two separate
options. The new one ("enable_guc_loading") controls only the
*fetching and loading* of the GuC firmware image. The existing
one is redefined to control only the *use* of the GuC for batch
submission once the firmware is loaded.
In addition, the degree of control has been refined from a simple
bool to an integer key, allowing several options:
-1 (default) whatever the platform default is
0 DISABLE don't load/use the GuC
1 BEST EFFORT try to load/use the GuC, fallback if not available
2 REQUIRE must load/use the GuC, else leave the GPU wedged
The new platform default (as coded here) will be to attempt to
load the GuC iff the device has a GuC that requires firmware,
but not yet to use it for submission. A later patch will change
to enable it if appropriate.
v4:
Changed some error-message levels, mostly ERROR->INFO, per
review comments by Tvrtko Ursulin.
v5:
Dropped one more error message, disabled GuC submission on
hypothetical firmware-free devices [Tvrtko Ursulin].
v6:
Logging tidy by Tvrtko Ursulin:
* Do not log falling back to execlists when wedging the GPU.
* Do not log fw load errors when load was disabled by user.
* Pass down some error code from fw load for log message to
make more sense.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v5) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Tested-by: Fiedorowicz, Lukasz <lukasz.fiedorowicz@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v5) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> (v6)
Dave Gordon [Fri, 13 May 2016 14:36:30 +0000 (15:36 +0100)]
drm/i915/guc: distinguish HAS_GUC() from HAS_GUC_UCODE/HAS_GUC_SCHED
For now, anything with a GuC requires uCode loading, and then supports
command submission once loaded. But these are logically distinct from
simply "having a GuC", so we need a separate macro for the latter. Then,
various tests should use this new macro rather than HAS_GUC_UCODE() or
testing enable_guc_submission.
v4:
Added a couple more uses of the new macro.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Dave Gordon [Fri, 13 May 2016 14:36:29 +0000 (15:36 +0100)]
drm/i915/guc: rename loader entry points
The GuC initialisation code could do other things apart from loading
firmware, so here we rename the three primary entry points to remove any
specific reference to "ucode" (no functional changes, just renaming).
Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Dave Gordon [Fri, 20 May 2016 10:54:07 +0000 (11:54 +0100)]
drm/i915: Inline sg_next() for the optimised SGL iterator
Avoiding the out-of-line call to sg_next() reduces the kernel execution
overhead by 10% in some workloads (for example the Unreal Engine 4 demo
Atlantis on 2GiB GTTs) which are dominated by the cost of inserting PTEs
due to texture thrashing. We can demonstrate this in a microbenchmark
that forces us to rebind the object on every execbuf, where we can
measure a 25% improvement, in the time required to execute an execbuf
requiring a texture to be rebound, for inlining the sg_next() for large
texture sizes.
Dave Gordon [Fri, 20 May 2016 10:54:06 +0000 (11:54 +0100)]
drm/i915: Introduce & use new lightweight SGL iterators
The existing for_each_sg_page() iterator is somewhat heavyweight, and is
limiting i915 driver performance in a few benchmarks. So here we
introduce somewhat lighter weight iterators, primarily for use with GEM
objects or other case where we need only deal with whole aligned pages.
Unlike the old iterator, the new iterators use an internal state
structure which is not intended to be accessed by the caller; instead
each takes as a parameter an output variable which is set before each
iteration. This makes them particularly simple to use :)
One of the new iterators provides the caller with the DMA address of
each page in turn; the other provides the 'struct page' pointer required
by many memory management operations.
Various uses of for_each_sg_page() are then converted to the new macros.
v2: Force inlining of the sg_iter constructor and make the union
anonymous.
Dave Gordon [Fri, 20 May 2016 10:54:05 +0000 (11:54 +0100)]
drm/i915: optimise i915_gem_object_map() for small objects
We're using this function for ringbuffers and other "small" objects, so
it's worth avoiding an extra malloc()/free() cycle if the page array is
small enough to put on the stack. Here we've chosen an arbitrary cutoff
of 32 (4k) pages, which is big enough for a ringbuffer (4 pages) or a
context image (currently up to 22 pages).
Dave Gordon [Fri, 20 May 2016 10:54:04 +0000 (11:54 +0100)]
drm/i915: refactor i915_gem_object_pin_map()
The recently-added i915_gem_object_pin_map() can be further optimised
for "small" objects. To facilitate this, and simplify the error paths
before adding the new code, this patch pulls out the "mapping" part of
the operation (involving local allocations which must be undone before
return) into its own subfunction.
The next patch will then insert the new optimisation into the middle of
the now-separated subfunction.
This reorganisation will probably not affect the generated code, as the
compiler will most likely inline it anyway, but it makes the logical
structure a bit clearer and easier to modify.
Daniel Vetter [Wed, 18 May 2016 16:47:15 +0000 (18:47 +0200)]
drm/i915/psr: Use ->get_aux_send_ctl functions
I just wanted to get rid of the rmw cycle for gen9, but this also
fixes some bugs we haven't carried over, like using recommended
precharge and timeout values.
Also I noticed that we don't set the fastwake sync length on skl, and
that's used by PSR2 selective updates. Fix that.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Sonika Jindal <sonika.jindal@intel.com> Cc: Durgadoss R <durgadoss.r@intel.com> Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463590036-17824-6-git-send-email-daniel.vetter@ffwll.ch
Daniel Vetter [Wed, 18 May 2016 16:47:14 +0000 (18:47 +0200)]
drm/i915/psr: Order DP aux transactions correctly
On bdw/hsw we have a separate psr dp aux registers to set up, but on
bdw it's shared with the main dp aux thing. Which means any subsequent
dp aux transaction will trample over it, and hence must be done
beforehand.
Also this means we can't do any dp aux transactions while PSR is
active, or at least we must restore the old state.
Probably need a psr disable/enable pair around dp aux transactions in
general.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Sonika Jindal <sonika.jindal@intel.com> Cc: Durgadoss R <durgadoss.r@intel.com> Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463590036-17824-5-git-send-email-daniel.vetter@ffwll.ch
without the hack to use 2 idle frames when VBT says 1. We keep the + 1
just for safety, although I haven't really figured out why that one
exists.
It's nonsense. idle_frames = number of frames where the screen is
entirely idle before we think about entering PSR.
idle_patter = part of link training, and we probably totally butchered
link training because we told the hw to entirely skip it. No wonder
PSR occasionally just fell over.
I suspect the reason we've increased idle frames is that it makes PSR
entry slightly less likely, and more likely to happen in a quite
system, which probably increased the changes the panel came back up
without link training. The proper fix is to implement link training
for PSR.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Sonika Jindal <sonika.jindal@intel.com> Cc: Durgadoss R <durgadoss.r@intel.com> Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463590036-17824-3-git-send-email-daniel.vetter@ffwll.ch
Daniel Vetter [Wed, 18 May 2016 16:47:11 +0000 (18:47 +0200)]
drm/i915/psr: Try to program link training times correctly
The default of 0 is 500us of link training, but that's not enough for
some platforms. Decoding this correctly means we're using 2.5ms of
link training on these platforms, which fixes flickering issues
associated with enabling PSR.
Chris Wilson [Thu, 19 May 2016 15:17:16 +0000 (16:17 +0100)]
drm/i915/userptr: Convert to drm_i915_private
userptr directly only uses drm_device in a single interface where it
meant to use drm_i915_private (everywhere else we have to derive it from
the drm_i915_gem_object and so require going from drm_device).
If planes are added to the state after the call to
drm_atomic_helper_check_planes planes_changed may not be set
and we will not unpin the old framebuffer. This results in a
pin leak long after the framebuffer is destroyed, so to find
this add some checks when it happens.
All of intel_post_plane_update is handled there now, so move it over.
This is run after the hw state checker because it can't handle checking
crtc's separately yet.
drm/i915: Prepare connectors for nonblocking checks.
intel_unpin_work may not take the list lock because it requires the connector_mutex.
To prevent taking locks we add an array of old and new state. The old state to free,
the new state to commit and verify.
drm/i915: Pass atomic states to fbc update functions.
This is required to let fbc updates run async. It has a lot of
checks whether certain locks are taken, which can be removed when
the relevant states are passed in as pointers.
drm/i915: Rework intel_crtc_page_flip to be almost atomic, v3.
Create a work structure that will be used for all changes. This will
be used later on in the atomic commit function.
Changes since v1:
- Free old_crtc_state from unpin_work_fn properly.
Changes since v2:
- Add hunk for calling hw state verifier.
- Add missing support for color spaces.
Set plane_state->base.fence to the dma_buf exclusive fence,
and add a wait to the mmio function. This will make it easier
to unify plane updates later on.
drm/i915: Allow mmio updates on all platforms, v2.
With intel_pipe_update begin/end we ensure that the mmio updates
don't run during vblank interrupt, using the hw counter we can
be sure that when current vblank count != vblank count at the time
of pipe_update_end the mmio update is complete.
This allows us to use mmio updates on all platforms, using the
update_plane call.
With Chris Wilson's patch to skip waiting for vblanks for
legacy_cursor_update this potentially leaves a small race
condition, in which update_plane can be called with a freed
crtc_state. Because of this commit acf4e84d61673
("drm/i915: Avoid stalling on pending flips for legacy cursor updates")
is temporarily reverted.
Changes since v1:
- Split out the flip_work rename.
Revert "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
This reverts commit acf4e84d6167317ff21be5c03e1ea76ea5783701.
Unfortunately this breaks the next commit with a use-after-free, so
temporarily revert until we can apply a solution.
drm/i915: Unify unpin_work and mmio_work into flip_work, v2.
Rename intel_unpin_work to intel_flip_work and use it for mmio flips
and unpinning. Use flip_queued_req to hold the wait request in the
mmio case, and the vblank counter from intel_crtc_get_vblank_counter.
MMIO flips get their own path through intel_finish_page_flip_mmio,
handled on vblank. CS page flips go through *_cs.
Changes since v1:
- Clean up destinction between MMIO and CS flips.
Instead of calling prepare_flip right before calling finish_page_flip
do everything from prepare_page_flip in finish_page_flip.
Putting prepare and finish page_flip in a single step removes the need
for INTEL_FLIP_COMPLETE, so it can be removed. This simplifies the code
slightly.
Changes since v1:
- Invert if case to simplify code.
- Add missing barrier.
- Reword commit message.
Changes since v2:
- intel_page_flip_plane is removed.
- work->pending is turned into a bool.
Both intel_unpin_work.pending and intel_unpin_work.enable_stall_check
were used to see if work should be enabled. By only using pending
some special cases are gone, and access to unpin_work can be simplified.
A flip could previously be queued before
stallcheck was active. With the addition of the pending member
enable_stall_check became obsolete and can thus be removed.
Use this to only access work members untilintel_mark_page_flip_active
is called, or intel_queue_mmio_flip is used. This will prevent
use-after-free, and makes it easier to verify accesses.
Changes since v1:
- Reword commit message.
- Do not access unpin_work after intel_mark_page_flip_active.
- Add the right memory barriers.
Changes since v2:
- atomic_read() needs a full smp_rmb.
This function is useful for gen2 intel devices which have no frame
counter, but need a way to determine the current vblank count without
racing with the vblank interrupt handler.
intel_pipe_update_start checks if no vblank interrupt will occur
during vblank evasion, but cannot check whether the vblank handler has
run to completion. This function uses the timestamps to determine
when the last vblank has happened, and interpolates from there.
Changes since v1:
- Take vblank_time_lock and don't use drm_vblank_count_and_time.
Changes since v2:
- Don't return time of last vblank.
Changes since v3:
- Change pipe to unsigned int. (Ville)
- Remove unused documentation for tv_ret. (kbuild)
Changes since v4:
- Add warning to docs when the function is useful.
- Add a WARN_ON when get_vblank_timestamp is unavailable.
- Use drm_vblank_count.
Cc: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v4 Acked-by: David Airlie <airlied@linux.ie> #irc, v4 Link: http://patchwork.freedesktop.org/patch/msgid/1463490484-19540-2-git-send-email-maarten.lankhorst@linux.intel.com Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Peter Antoine [Tue, 17 May 2016 14:12:45 +0000 (15:12 +0100)]
drm/i915/bxt: reserve space for RC6 in the the GuC WOPCM
This patch resizes the GuC WOPCM (specifically on BXT)
so that the GuC and RC6 memory spaces do not overlap.
v2:
Made calculation of WOPCM size into a separate function,
so that it's consistent between the firmware size-check
and the register-programming operations [Dave Gordon].
Issue: https://jira01.devtools.intel.com/browse/VIZ-6638 Signed-off-by: Peter Antoine <peter.antoine@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Tested-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463494365-26330-1-git-send-email-david.s.gordon@intel.com
Daniel Vetter [Mon, 9 May 2016 07:31:25 +0000 (09:31 +0200)]
drm/i915: Simplify control flow in intel_atomic_check a bit.
- Unconditionally add plane states. Core helpers would have done this
in drm_atomic_helper_check_modeset, doing it once more won't cause
harm and is less fragile.
- Simplify the continue logic when disabling a pipe.
Deepak M [Tue, 26 Apr 2016 13:14:26 +0000 (16:14 +0300)]
drm/i915/dsi: CABC support for Panel PWM backlight control
In CABC (Content Adaptive Brightness Control) content grey level
scale can be increased while simultaneously decreasing
brightness of the backlight to achieve same perceived brightness.
The CABC is not standardized and panel vendors are free to follow
their implementation. The CABC implementaion here assumes that the
panels use standard SW register for control.
CABC is supported only when the PWM source for backlight is
from the panel.
v2 by Jani: rebase, renames, check cabc support earlier, etc.
Jani Nikula [Tue, 26 Apr 2016 13:14:25 +0000 (16:14 +0300)]
drm/i915/dsi: Add DCS control for Panel PWM
If the source of the backlight PWM is from the
panel then the PWM can be controlled by DCS
command, this patch adds the support to
enable/disbale panel PWM, control backlight level
etc...
v2: Moving the CABC bkl functions to new file.(Jani)
v3: Rebase
v4: Rebase
v5: Use mipi_dsi_dcs_write() instead of mipi_dsi_dcs_write_buffer() (Jani)
Move DCS macro`s to include/video/mipi_display.h (Jani)
v6: Rename the file to intel_dsi_panel_pwm.c
Removing the CABC operations
v7 by Jani: renames, rebases, etc.
v8 by Jani: s/INTEL_BACKLIGHT_CABC/INTEL_BACKLIGHT_DSI_DCS/
v9 by Jani: rename init function to intel_dsi_dcs_init_backlight_funcs
Dave Airlie [Mon, 16 May 2016 21:06:14 +0000 (07:06 +1000)]
Merge tag 'topic/drm-misc-2016-05-13' of git://anongit.freedesktop.org/drm-intel into drm-next
I kinda hoped that I could still sneak in Noralf's
drm_simple_display_pipe, since there's intereset by others now (for tilcdc
at least). But it wasn't ready by a hair. Oh well.
Otherwise random stuff plus prep patches from Noralf.
* tag 'topic/drm-misc-2016-05-13' of git://anongit.freedesktop.org/drm-intel:
drm/atomic: Add drm_atomic_helper_best_encoder()
drm/atomic: Don't skip drm_bridge_*() calls if !drm_encoder_helper_funcs
drm/fb-cma-helper: Hook up to DocBook and fix some docs
drm/fb-helper: Remove mention of CONFIG_FB_DEFERRED_IO in docs
drm/sti: include linux/seq_file.h where needed
drm/tegra: Use lockless gem BO free callback
drm/exynos: Use lockless gem BO free callback
drm: Make drm_encoder_helper_funcs optional
Dave Airlie [Mon, 16 May 2016 20:36:08 +0000 (06:36 +1000)]
Merge branch 'topic-arcpgu-updates' of https://github.com/foss-for-synopsys-dwc-arc-processors/linux into drm-next
Please pull this mini-series that allows ARC PGU to use
dedicated memory location as framebuffer backing storage.
* 'topic-arcpgu-updates' of https://github.com/foss-for-synopsys-dwc-arc-processors/linux:
ARC: [axs10x] Specify reserved memory for frame buffer
drm/arcpgu: use dedicated memory area for frame buffer
Chris Wilson [Sat, 14 May 2016 06:26:35 +0000 (07:26 +0100)]
drm/i915: Skip clearing the GGTT on full-ppgtt systems
Under full-ppgtt, access to the global GTT is carefully regulated
through hardware functions (i.e. userspace cannot read and write to
arbitrary locations in the GGTT via the GPU). With this restriction in
place, we can forgo clearing stale entries from the GGTT as they will
not be accessed.
For aliasing-ppgtt, we could almost do the same except that we do allow
userspace access to the global-GTT via execbuf in order to workraound
some quirks of certain instructions. (This execbuf path is filtered out
with EINVAL on full-ppgtt.)
The most dramatic effect this will have will be during resume, as with
full-ppgtt the GGTT is only used sparingly.
Chris Wilson [Sat, 14 May 2016 06:26:34 +0000 (07:26 +0100)]
drm/i915: Lazily migrate the objects after hibernation
Now that we mark the object domains for having been restored from the
hibernation image, we not need to flush everything during resume and
can instead rely on the normal domain tracking to flush only when
required. The only caveat here are objects that are pinned for use by
the hardware, whose contents must be coherent for when the device
resumes reading from then (shortly afterwards with the driver assuming
the objects are in the correct domain).
References: https://bugs.freedesktop.org/show_bug.cgi?id=94722 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Cc: David Weinehall <david.weinehall@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Tested-by: David Weinehall <david.weinehall@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463207195-22076-3-git-send-email-chris@chris-wilson.co.uk
Chris Wilson [Sat, 14 May 2016 06:26:33 +0000 (07:26 +0100)]
drm/i915: Update domain tracking for GEM objects on hibernation
When creating the hibernation image, the CPU will read the pages of all
objects and thus conflict with our domain tracking. We need to update
our domain tracking to accurately reflect the state on restoration.
v2: Perform the domain tracking inside freeze, before the image is
written, rather than upon restoration.
Chris Wilson [Sat, 14 May 2016 06:26:32 +0000 (07:26 +0100)]
drm/i915: Add distinct stubs for PM hibernation phases
Currently for handling the extra hibernation phases we just call the
equivalent suspend/resume phases. In the next couple of patches, I wish
to specialise the hibernation phases to reduce the amount of work
required for handling GEM objects.
v2: There are more! Don't forget the freeze phases.
Ville Syrjälä [Fri, 13 May 2016 17:10:42 +0000 (10:10 -0700)]
drm/i915: Ignore stale wm register values on resume on ilk-bdw (v2)
When we resume the watermark register may contain some BIOS leftovers,
or just the hardware reset values. We should ignore those as the
pipes will be off anyway, and so frobbing around with intermediate
watermarks doesn't make much sense.
In fact I think we should just throw the skip_intermediate_wm flag
out, and instead properly sanitize the "active" watermarks to match
the current plane and pipe states. The actual wm state readout might
also need a bit of work. But for now, let's continue with the
skip_intermediate_wm to keep the fix more minimal.
Fixes this sort of errors on resume
[drm:ilk_validate_pipe_wm] LP0 watermark invalid
[drm:intel_crtc_atomic_check] No valid intermediate pipe watermarks are possible
[drm:intel_display_resume [i915]] *ERROR* Restoring old state failed with -22
and a boatload of subsequent modeset BAT fails on my ILK.
v2:
- Rebase; the SKL atomic WM patches that just landed changed the WM
structure fields in intel_crtc_state slightly. (Matt)
Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: ed4a6a7ca853 ("drm/i915: Add two-stage ILK-style watermark programming (v11)") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463159442-20478-1-git-send-email-matthew.d.roper@intel.com
Ville Syrjälä [Fri, 13 May 2016 14:55:17 +0000 (17:55 +0300)]
drm/i915: Don't leave old junk in ilk active watermarks on readout
When we read out the watermark state from the hardware we're supposed to
transfer that into the active watermarks, but currently we fail to any
part of the active watermarks that isn't explicitly written. Let's clear
it all upfront.
Looks like this has been like this since the beginning, when I added the
readout. No idea why I didn't clear it up.
Ville Syrjälä [Wed, 11 May 2016 19:44:51 +0000 (22:44 +0300)]
drm/i915: Program BXT_CDCLK_CD2X_PIPE
BXT could change the CD2X divider synchronized with a single pipe.
So assuming the DE PLL frequency doesn't need to be changed, we could
change cdclk without shutting off the pipe (when only a single pipe is
enabled). In the meantime let's configure CDCLK_CTL for non-double
buffered CD2X update, although it shouldn't really matter as long as
the selected pipe is disabled when reprogramming the divider.
Ville Syrjälä [Wed, 11 May 2016 19:44:49 +0000 (22:44 +0300)]
drm/i915: s/freq/cdclk/
Rename the generic sounding freq/frequency parameters to the cdclk
functions to 'cdclk' so that we'll know which clock we're talking about
once we have to deal with the vco frequencies as well.
Ville Syrjälä [Wed, 11 May 2016 19:44:47 +0000 (22:44 +0300)]
drm/i915: Extract skl_dpll0_disable()
Make thins a bit easier to read by extracting the SKL DPLL0
disable into separate functions. We already have the enable
counterpart. Down the line this will also help make the cdclk
programming on SKL, BXT, and following platforms look rather
consistent.
Ville Syrjälä [Wed, 11 May 2016 19:44:45 +0000 (22:44 +0300)]
drm/i915: Use skl_cdclk_decimal() on bxt
Both SKL and BXT need to fill in the "decimal" cdclk frequency into
the CDCLK_CTL register. SKL uses a small helper to do the kHz->"decimal"
conversion, whereas BXT has it open-coded. Use the helper on BXT too.
While at it, change it to round to closest rather than down. It doesn't
actually matter with the frequencies we have to deal with, but it seems
like the right thing to do.
Ville Syrjälä [Wed, 11 May 2016 19:44:44 +0000 (22:44 +0300)]
drm/i915: Use ilk_max_pixel_rate() for BXT cdclk calculation
BXT uses the "pch" panel fitter configuration, so we can use
ilk_max_pixel_rate() instead of intel_mode_max_pixclk() to compute the
pipe pixel rate. ilk_max_pixel_rate() will account for the pipe
scaler downscaling factor whereas intel_mode_max_pixclk() will not.
I'm pretty sure the same limitation is there on GMCH platforms, but
no one just bothered to implement the downscaling adjustment for them.
Probably should just unify the panel fitter setup more across the
platforms and use the exact same code on all platforms for this.
But in the meantime, let's at least make BXT a bit more correct.
Ville Syrjälä [Wed, 11 May 2016 19:44:42 +0000 (22:44 +0300)]
drm/i915: Untangle .fdi_link_train and cdclk vfunc setup
Split the .fdi_link_train and .modeset_commit_cdclk/.modeset_calc_cdclk
into two separate if ladders. Much easier to read when you're not
confusing two totally separate subjects.
Ville Syrjälä [Wed, 11 May 2016 19:44:40 +0000 (22:44 +0300)]
drm/i915: Drop checks for max_pixclk failures in cdclk computation
commit 565602d7501a ("drm/i915: Do not acquire crtc state to check clock during modeset, v4.")
removed the possibility that intel_mode_max_pixclk() or
ilk_max_pixel_rate() might return an error, so let's get rid of the
error checks in the callers as well.
Matt Roper [Thu, 12 May 2016 14:06:11 +0000 (07:06 -0700)]
drm/i915: Remove wm_config from dev_priv/intel_atomic_state
We calculate the watermark config into intel_atomic_state and then save
it into dev_priv, but never actually use it from there. This is
left-over from some early ILK-style watermark programming designs that
got changed over time.
Matt Roper [Thu, 12 May 2016 14:06:10 +0000 (07:06 -0700)]
drm/i915/gen9: Reject display updates that exceed wm limitations (v2)
If we can't find any valid level 0 watermark values for the requested
atomic transaction, reject the configuration before we try to start
programming the hardware.
v2:
- Add extra debugging output when we reject level 0 watermarks so that
we can more easily debug how/why they were rejected.
Matt Roper [Thu, 12 May 2016 22:11:40 +0000 (15:11 -0700)]
drm/i915/gen9: Calculate watermarks during atomic 'check' (v2)
Moving watermark calculation into the check phase will allow us to to
reject display configurations for which there are no valid watermark
values before we start trying to program the hardware (although those
tests will come in a subsequent patch).
Another advantage of moving this calculation to the check phase is that
we can calculate the watermarks in a single shot as part of the atomic
transaction. The watermark interfaces we inherited from our legacy
modesetting days are a bit broken in the atomic design because they use
per-crtc entry points but actually re-calculate and re-program something
that is really more of a global state. That worked okay in the legacy
modesetting world because operations only ever updated a single CRTC at
a time. However in the atomic world, a transaction can involve multiple
CRTC's, which means we wind up computing and programming the watermarks
NxN times (where N is the number of CRTC's involved). With this patch
we eliminate the redundant re-calculation of watermark data for atomic
states (which was the cause of the WARN_ON(!wm_changed) problems that
have plagued us for a while).
We still need to work on the 'commit' side of watermark handling so that
we aren't doing redundant NxN programming of watermarks, but that's
content for future patches.
v2:
- Bail out of skl_write_wm_values() if the CRTC isn't active. Now that
we set dirty_pipes to ~0 if the active pipes change (because
we need to deal with DDB changes), we can now wind up here for
disabled pipes, whereas we couldn't before.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89055
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92181 Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Tested-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463091100-13747-1-git-send-email-matthew.d.roper@intel.com
Matt Roper [Thu, 12 May 2016 14:06:08 +0000 (07:06 -0700)]
drm/i915/gen9: Propagate watermark calculation failures up the call chain
Once we move watermark calculation to the atomic check phase, we'll want
to start rejecting display configurations that exceed out watermark
limits. At the moment we just assume that there's always a valid set of
watermarks, even though this may not actually be true. Let's prepare by
passing return codes up through the call stack in preparation.
Matt Roper [Thu, 12 May 2016 14:06:06 +0000 (07:06 -0700)]
drm/i915/gen9: Allow watermark calculation on in-flight atomic state (v3)
In an upcoming patch we'll move this calculation to the atomic 'check'
phase so that the display update can be rejected early if no valid
watermark programming is possible.
v2:
- Drop intel_pstate_for_cstate_plane() helper and add note about how
the code needs to evolve in the future if we start allowing more than
one pending commit against a CRTC. (Maarten)
v3:
- Only have skl_compute_wm_level calculate watermarks for enabled
planes; we can just set the other planes on a CRTC to disabled
without having to look at the plane state. This is important because
despite our CRTC lock we can still have racing commits that modify
a disabled plane's property without turning it on. (Maarten)
Matt Roper [Thu, 12 May 2016 14:06:04 +0000 (07:06 -0700)]
drm/i915/gen9: Drop re-allocation of DDB at atomic commit (v2)
Now that we're properly pre-allocating the DDB during the atomic check
phase and we trust that the allocation is appropriate, let's actually
use the allocation computed and not duplicate that work during the
commit phase.
v2:
- Significant rebasing now that we can use cached data rates and
minimum block allocations to avoid grabbing additional plane states.
Matt Roper [Thu, 12 May 2016 14:06:03 +0000 (07:06 -0700)]
drm/i915/gen9: Compute DDB allocation at atomic check time (v4)
Calculate the DDB blocks needed to satisfy the current atomic
transaction at atomic check time. This is a prerequisite to calculating
SKL watermarks during the 'check' phase and rejecting any configurations
that we can't find valid watermarks for.
Due to the nature of DDB allocation, it's possible for the addition of a
new CRTC to make the watermark configuration already in use on another,
unchanged CRTC become invalid. A change in which CRTC's are active
triggers a recompute of the entire DDB, which unfortunately means we
need to disallow any other atomic commits from racing with such an
update. If the active CRTC's change, we need to grab the lock on all
CRTC's and run all CRTC's through their 'check' handler to recompute and
re-check their per-CRTC DDB allocations.
Note that with this patch we only compute the DDB allocation but we
don't actually use the computed values during watermark programming yet.
For ease of review/testing/bisecting, we still recompute the DDB at
watermark programming time and just WARN() if it doesn't match the
precomputed values. A future patch will switch over to using the
precomputed values once we're sure they're being properly computed.
Another clarifying note: DDB allocation itself shouldn't ever fail with
the algorithm we use today (i.e., we have enough DDB blocks on BXT to
support the minimum needs of the worst-case scenario of every pipe/plane
enabled at full size). However the watermarks calculations based on the
DDB may fail and we'll be moving those to the atomic check as well in
future patches.
v2:
- Skip DDB calculations in the rare case where our transaction doesn't
actually touch any CRTC's at all. Assuming at least one CRTC state
is present in our transaction, then it means we can't race with any
transactions that would update dev_priv->active_crtcs (which requires
_all_ CRTC locks).
v3:
- Also calculate DDB during initial hw readout, to prevent using
incorrect bios values. (Maarten)
v4:
- Use new distrust_bios_wm flag instead of skip_initial_wm (which was
never actually set).
- Set intel_state->active_pipe_changes instead of just realloc_pipes