Ville Syrjälä [Fri, 6 May 2016 18:35:55 +0000 (21:35 +0300)]
drm/i915: Re-enable GGTT earlier during resume on pre-gen6 platforms
Move the intel_enable_gtt() call to happen before we touch the GTT
during resume. Right now it's done way too late. Before
commit ebb7c78d358b ("agp/intel-gtt: Only register fake agp driver for gen1")
it was actually done earlier on account of also getting called from
the resume hook of the fake agp driver. With the fake agp driver
no longer getting registered we must move the call up.
The symptoms I've seen on my 830 machine include lowmem corruption,
other kinds of memory corruption, and straight up hung machine during
or just after resume. Not really sure what causes the memory corruption,
but so far I've not seen any with this fix.
I think we shouldn't really need to call this during init, but we have
been doing that so I've decided to keep the call. However moving that
call earlier could be prudent as well. Doing it right after the
intel-gtt probe seems appropriate.
Also tested this on 946gz,elk,ilk and all seemed quite happy with
this change.
v2: Reorder init_hw vs. enable_hw functions (Chris)
Chris Wilson [Mon, 9 May 2016 16:39:42 +0000 (17:39 +0100)]
x86: Silence 32bit compiler warning in intel_graphics_stolen()
arch/x86/kernel/early-quirks.c: In function ‘intel_graphics_stolen’:
arch/x86/kernel/early-quirks.c:539:9: warning: format ‘%llx’ expects
argument of type ‘long long unsigned int’, but argument 2 has type ‘phys_addr_t’ [-Wformat=]
"0x%llx-0x%llx\n", base, base + size - 1);
^
arch/x86/kernel/early-quirks.c:539:9: warning: format ‘%llx’ expects
argument of type ‘long long unsigned int’, but argument 3 has type ‘phys_addr_t’ [-Wformat=]
v2: Use %pa for addresses
Fixes: ee0629cfd3c16 (drm/i915: Function per early graphics quirk) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v1 Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462811982-1567-1-git-send-email-chris@chris-wilson.co.uk
Ville Syrjälä [Fri, 29 Apr 2016 14:31:18 +0000 (17:31 +0300)]
drm/i915: Add a FIXME about crtc !active vs. watermarks
When the crtc is enabled but !active, we should still compute the
watermarks as if the planes were visible. That would make it more
likely that the we can later transition to active without errors.
Add a FIXME to remind people that we're doing the wrong thing now.
We should perhaps just move the wm computation for each individual plane
into the .check_plane hook, and later we'd just combine the results from
all active planes.
Ville Syrjälä [Fri, 29 Apr 2016 14:31:17 +0000 (17:31 +0300)]
drm/i915: Calculate IPS linetime watermark based on future cdclk
Use the cdclk we're going to be using when the pipe gets enabled to
compute the IPS linetime watermark. The current cdclk frequency is
irrelevant at this point since it can still change.
Chris Wilson [Fri, 6 May 2016 14:40:20 +0000 (15:40 +0100)]
drm/i915/execlists: Refactor common engine setup
Move all of the constant assignments up front and into a common
function. This is primarily to ensure the backpointers are set as early
as possible for later use during initialisation.
v2: Use a constant struct so that all the similar values are set
together.
v3: Sanitize the engine's IMR to disable any potential interrupt before
we are ready (enabled in init_hw).
v4: Ignore the engine's IMR, to be resolved later
Tvrtko Ursulin [Fri, 6 May 2016 13:48:28 +0000 (14:48 +0100)]
drm/i915: Small display interrupt handlers tidy
I have noticed some of our interrupt handlers use both dev and
dev_priv while they could get away with only dev_priv in the
huge majority of cases.
Tidying that up had a cascading effect on changing functions
prototypes, so relatively big churn factor, but I think it is
for the better.
For example even where changes cascade out of i915_irq.c, for
functions prefixed with intel_, genX_ or <plat>_, it makes more
sense to take dev_priv directly anyway.
This allows us to eliminate local variables and intermixed usage
of dev and dev_priv where only one is good enough.
End result is shrinkage of both source and the resulting binary.
So small shrinkage but it is all fast paths so doesn't harm.
Situation is similar in other interrupt handlers as well.
v2: Tidy intel_queue_rps_boost_for_request as well. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Ville Syrjälä [Wed, 4 May 2016 11:45:22 +0000 (14:45 +0300)]
drm/i915: Determine DP++ type 1 DVI adaptor presence based on VBT
DP dual mode type 1 DVI adaptors aren't required to implement any
registers, so it's a bit hard to detect them. The best way would
be to check the state of the CONFIG1 pin, but we have no way to
do that. So as a last resort, check the VBT to see if the HDMI
port is in fact a dual mode capable DP port.
v2: Deal with VBT code reorganization
Deal with DRM_DP_DUAL_MODE_UNKNOWN
Reduce DEVICE_TYPE_DP_DUAL_MODE_BITS a bit
Accept both DP and HDMI dvo_port in VBT as my BSW
at least declare its DP port as HDMI :(
v3: Ignore DEVICE_TYPE_NOT_HDMI_OUTPUT (Shashank)
Cc: stable@vger.kernel.org Cc: Tore Anderson <tore@fud.no> Reported-by: Tore Anderson <tore@fud.no> Fixes: 7a0baa623446 ("Revert "drm/i915: Disable 12bpc hdmi for now"") Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462362322-31278-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Ville Syrjälä [Mon, 2 May 2016 19:08:24 +0000 (22:08 +0300)]
drm/i915: Enable/disable TMDS output buffers in DP++ adaptor as needed
To save a bit of power, let's try to turn off the TMDS output buffers
in DP++ adaptors when we're not driving the port.
v2: Let's not forget DDI, toss in a debug message while at it
v3: Just do the TMDS output control based on adaptor type. With the
helper getting passed the type, we wouldn't actually have to
check at all in the driver, but the check eliminates the debug
output more honest
Ville Syrjälä [Mon, 2 May 2016 19:08:23 +0000 (22:08 +0300)]
drm/i915: Respect DP++ adaptor TMDS clock limit
Try to detect the max TMDS clock limit for the DP++ adaptor (if any)
and take it into account when checking the port clock.
Note that as with the sink (HDMI vs. DVI) TMDS clock limit we'll ignore
the adaptor TMDS clock limit in the modeset path, in case users are
already "overclocking" their TMDS links. One subtle change here is that
we'll have to respect the adaptor TMDS clock limit when we decide whether
to do 12bpc or 8bpc, otherwise we might end up picking 12bpc and
accidentally driving the TMDS link out of spec even when the user chose
a mode that fits wihting the limits at 8bpc. This means you can't
"overclock" your DP++ dongle at 12bpc anymore, but you can continue to
do so at 8bpc.
Note that for simplicity we'll use the I2C access method for all dual
mode adaptors including type 2. Otherwise we'd have to start mixing
DP AUX and HDMI together. In the future we may need to do that if we
come across any board designs that don't hook up the DDC pins to the
DP++ connectors. Such boards would obviously only work with type 2
dual mode adaptors, and not type 1.
v2: Store adaptor type under indel_hdmi->dp_dual_mode
Deal with DRM_DP_DUAL_MODE_UNKNOWN
Pass adaptor type to drm_dp_dual_mode_max_tmds_clock(),
and use it for type1 adaptors as well
Cc: stable@vger.kernel.org Reported-by: Tore Anderson <tore@fud.no> Fixes: 7a0baa623446 ("Revert "drm/i915: Disable 12bpc hdmi for now"") Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462216105-20881-3-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Ville Syrjälä [Fri, 6 May 2016 13:46:52 +0000 (16:46 +0300)]
drm: Add helper for DP++ adaptors
Add a helper which aids in the identification of DP dual mode
(aka. DP++) adaptors. There are several types of adaptors
specified: type 1 DVI, type 1 HDMI, type 2 DVI, type 2 HDMI
Type 1 adaptors have a max TMDS clock limit of 165MHz, type 2 adaptors
may go as high as 300MHz and they provide a register informing the
source device what the actual limit is. Supposedly also type 1 adaptors
may optionally implement this register. This TMDS clock limit is the
main reason why we need to identify these adaptors.
Type 1 adaptors provide access to their internal registers and the sink
DDC bus through I2C. Type 2 adaptors provide this access both via I2C
and I2C-over-AUX. A type 2 source device may choose to implement either
of these methods. If a source device implements the I2C-over-AUX
method, then the driver will obviously need specific support for such
adaptors since the port is driven like an HDMI port, but DDC
communication happes over the AUX channel.
This helper should be enough to identify the adaptor type (some
type 1 DVI adaptors may be a slight exception) and the maximum TMDS
clock limit. Another feature that may be available is control over
the TMDS output buffers on the adaptor, possibly allowing for some
power saving when the TMDS link is down.
Other user controllable features that may be available in the adaptors
are downstream i2c bus speed control when using i2c-over-aux, and
some control over the CEC pin. I chose not to provide any helper
functions for those since I have no use for them in i915 at this time.
The rest of the registers in the adaptor are mostly just information,
eg. IEEE OUI, hardware and firmware revision, etc.
v2: Pass adaptor type to helper functions to ease driver implementation
Fix a bunch of typoes (Paulo)
Add DRM_DP_DUAL_MODE_UNKNOWN for the case where we don't (yet) know
the type (Paulo)
Reject 0x00 and 0xff DP_DUAL_MODE_MAX_TMDS_CLOCK values (Paulo)
Adjust drm_dp_dual_mode_detect() type2 vs. type1 detection to
ease future LSPCON enabling
Remove the unused DP_DUAL_MODE_LAST_RESERVED define
v3: Fix kernel doc function argument descriptions (Jani)
s/NONE/UNKNOWN/ in drm_dp_dual_mode_detect() docs
Add kernel doc for enum drm_dp_dual_mode_type
Actually build the docs
Fix more typoes
v4: Adjust code indentation of type2 adaptor detection (Shashank)
Add debug messages for failurs cases (Shashank)
v5: EXPORT_SYMBOL(drm_dp_dual_mode_read) (Paulo)
Cc: stable@vger.kernel.org Cc: Tore Anderson <tore@fud.no> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Shashank Sharma <shashank.sharma@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Shashank Sharma <shashank.sharma@intel.com> (v4) Link: http://patchwork.freedesktop.org/patch/msgid/1462542412-25533-1-git-send-email-ville.syrjala@linux.intel.com
I mixed up maintainers and thought Linus' ack was for the mfd tree.
But Lee Jones (the real maintainer) wants to merge this through the
mfd tree, so revert here.
Cc: Lee Jones <lee.jones@linaro.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Kenneth Graunke [Fri, 6 May 2016 07:50:14 +0000 (08:50 +0100)]
drm/i915: Allow MI_LOAD_REGISTER_REG between whitelisted registers.
Allowing register copies where the source and destination are both
whitelisted should be safe, and is useful. For example, Mesa uses
this to load the command streamer math registers with data from the
pipeline statistics counters.
v2: Reject writes to OACONTROL (and reads as well :(
Chris Wilson [Wed, 4 May 2016 13:25:36 +0000 (14:25 +0100)]
drm/i915: Report command parser version 0 if disabled
If the command parser is not active, then it is appropriate to report it
as operating at version 0 as no higher mode is supported. This greatly
simplifies userspace querying for the command parser as we then do not
need to second guess when it will be active (a mixture of module
parameters and generational support, which may change over time).
Daniel Vetter [Tue, 3 May 2016 08:33:01 +0000 (10:33 +0200)]
drm/i915: Bail out of pipe config compute loop on LPT
LPT is pch, so might run into the fdi bandwidth constraint (especially
since it has only 2 lanes). But right now we just force pipe_bpp back
to 24, resulting in a nice loop (which we bail out with a loud
WARN_ON). Fix this.
Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=93477 Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1462264381-7573-1-git-send-email-daniel.vetter@ffwll.ch
Lyude [Tue, 3 May 2016 15:01:32 +0000 (11:01 -0400)]
Revert "drm/i915: start adding dp mst audio"
Right now MST audio is causing too many kernel panics to really keep
around in the kernel. On top of that, even after fixing said panics it's
still basically non-functional (at least on all the setups I've tested
it on). Revert until we have a proper solution for this.
Imre Deak [Tue, 3 May 2016 12:54:21 +0000 (15:54 +0300)]
drm/i915/chv: Tune L3 SQC credits based on actual latencies
While browsing BSpec I bumped into a note saying we need to tune these
values based on actual measurements done after initial enabling. I've
checked that it indeed improves things on BXT. I haven't checked this on
CHV, but here it is if someone wants to give it a go.
v2:
- Add note about the discrepancy wrt. to the spec in the formula
calculating the credit encodings. (Mika, Ville)
- Move the WA comment to the new function. (Ville)
v3:
- Keep the comment about the SQC WA in the caller. (Ville)
Praveen Paneri [Mon, 2 May 2016 08:40:29 +0000 (14:10 +0530)]
drm/i915: Add rpm get/put in oom and vmap notifier
i915_gem_shrink() will scan the bound list only if device is not
suspended but in OOM failure scenario it becomes absolutely necessary
to release as much memory as possible. Also in allocation failure from
vmap address space, it is incumbent on the Driver to reap all its
vmaps. So, adding rpm get/put in i915_gem_shrinker_oom() and
i915_gem_shrinker_vmap() to ensure shrinking of bound objects as well.
Praveen Paneri [Mon, 2 May 2016 08:40:28 +0000 (14:10 +0530)]
drm/i915: Unbind objects in shrinker only if device is runtime active
When the system is running low on memory, gem shrinker is invoked.
In this process objects will be unbounded from GTT and unbinding process
will require access to GTT(GTTADR) and also to fence register potentially.
That requires a resume of gfx device, if suspended, in the shrinker path.
Considering the power leakage due to intermediate resume, perform unbinding
operation only if device is already runtime active.
v2: Use newly implemented intel_runtime_pm_get_if_in_use (Chris)
Jani Nikula [Fri, 29 Apr 2016 12:34:02 +0000 (15:34 +0300)]
drm/i915/lvds: separate border enable readout from panel fitter
The LVDS border enable is independent from the panel fitter. Move the
readout of the "border bits" from i9xx_get_pfit_config() to
intel_lvds_get_config(), where it will be read if LVDS is enabled even
if the panel fitter is not.
This fixes the state checker warning:
[drm:intel_pipe_config_compare [i915]] *ERROR* mismatch in
gmch_pfit.lvds_border_bits (expected 0x00008000, found 0x00000000)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: drm-intel-fixes@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87632 Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1461933243-2140-1-git-send-email-jani.nikula@intel.com
Chris Wilson [Fri, 29 Apr 2016 11:20:22 +0000 (12:20 +0100)]
drm/i915: Enable semaphores for legacy submission on gen8
We have sufficient evidence from igt to support that semaphores are in
a working state. Enabling semaphores now for legacy provides a better
comparison of execlists against legacy ring submission.
Chris Wilson [Fri, 29 Apr 2016 12:18:24 +0000 (13:18 +0100)]
drm/i915: Fix serialisation of pipecontrol write vs semaphore signal
In order for the MI_SEMAPHORE_SIGNAL command to wait until after the
pipecontrol writing the signal value is complete, we have to pause the
CS inside the PIPE_CONTROL with the CS_STALL bit.
Chris Wilson [Fri, 29 Apr 2016 12:18:23 +0000 (13:18 +0100)]
drm/i915: Fix gen8 semaphores id for legacy mode
With the introduction of a distinct engine->id vs the hardware id, we need
to fix up the value we use for selecting the target engine when signaling
a semaphore. Note that these values can be merged with engine->guc_id.
Chris Wilson [Fri, 29 Apr 2016 12:18:21 +0000 (13:18 +0100)]
drm/i915: Apply strongly ordered RCS breadcrumb to gen8/legacy
For legacy ringbuffer mode, we need the new ordered breadcrumb emission
tried and tested on execlists in order to avoid the dreaded "missed
interrupt" syndrome. A secondary advantage of the execlists method is
that it writes to an arbitrary address, useful if one wants to write a
breadcrumb elsewhere.
This fix is taken from commit 7c17d377374dd (drm/i915: Use ordered seqno
write interrupt generation on gen8+ execlists).
Chris Wilson [Fri, 29 Apr 2016 08:07:06 +0000 (09:07 +0100)]
drm/i915: Trim the flush for the execlists request emission
At the start of request emission, we flush some space for the request,
estimating the typical size for the request body. The common tail is now
much larger than the typical body, so we can shrink the flush
substantially.
Chris Wilson [Fri, 29 Apr 2016 08:07:05 +0000 (09:07 +0100)]
drm/i915: Trim the flush for the legacy request emission
At the start of request emission, we flush some space for the request,
estimating the typical size for the request body. The tail is now much
larger than the typical body, so we can shrink the flush slightly.
Chris Wilson [Fri, 29 Apr 2016 08:07:04 +0000 (09:07 +0100)]
drm/i915: Bump reserved size for legacy gen8 semaphore emission
With 5 rings and a flush, we need 192 bytes of space to emit the
breadcrumb and semaphores. However, we need some spare room the size of
the single largest packet (36 dwords, 144 bytes) to accommodate
wraparound giving a grand total of 336 bytes
drm/i915: Unduplicate VLV phy pre pll enabling code
The code used by the DP and HDMI paths was very similar, so make them
share it. Note that this removes the write to signal level registers
from the HDMI pre pll enable path, but that's OK since those are set
in vlv_hdmi_pre_enable() function.
The logic for setting signal levels is used for both HDMI and DP with
small variations. But it is similar enough to put behind a function
called from the encoders.
v2: Remove unrelated MST changes due to rebase fumble. (Jim Bride)
Fix typo in the commit message. (Jim Bride)
drm/i915: Unduplicate CHV encoders' post pll disable code
The exact same code was used by HDMI and DP encoders, so move it to
intel_dpio_phy.c.
v2: Fix typo in the commit message. (Jim Bride)
v3: Call the new function chv_phy_post_pll_disable() instead of
chv_phy_post_disable(), as it should be called after the pll
is disabled. (Ville)
The only difference between the DP and HDMI versions was the lane count.
Since lane_count is now set appropriately for HDMI too, get rid of the
duplication and move this to intel_dpio_phy.c
Set the lane count for HDMI to 4. This will make it easier to
unduplicate CHV phy code.
This also fixes the the soft reset programming for HDMI with CHV. After
commit a8f327fb8464 ("drm/i915: Clean up CHV lane soft reset
programming"), it wouldn't set the right bits for PCS23 since it relied
on a lane count that was never set.
v2: Set lane_count in *_get_config() to please state checker. (0day)
v3: Set lane_count for DDI in DVI mode too. (CI)
v4: Add note about CHV soft lane reset. (Ander)
Ville Syrjälä [Tue, 26 Apr 2016 16:46:34 +0000 (19:46 +0300)]
drm/i915: Fix comments about GMBUSFREQ register
The comment about GMBUSFREQ is confused. The spec actually explains
the 4MHz thing perfectly by noting that the 4MHz divider values is
actually just bits [9:2] not [9:0], hence the divide by 1000 correct.
Replace the confused note with a quote from the spec, and eliminate
the duplicated comment that snuck in.
Ville Syrjälä [Tue, 26 Apr 2016 16:46:33 +0000 (19:46 +0300)]
drm/i915: Use cached cdclk value in i915_audio_component_get_cdclk_freq()
No point in reading the cdclk out from the hardware every single time
since we have it cached already. Just return the cached value to the
audio driver.
Ville Syrjälä [Tue, 26 Apr 2016 16:46:32 +0000 (19:46 +0300)]
drm/i915: Update CDCLK_FREQ register on BDW after changing cdclk frequency
Update CDCLK_FREQ on BDW after changing the cdclk frequency. Not sure
if this is a late addition to the spec, or if I simply overlooked this
step when writing the original code.
This is what Bspec has to say about CDCLK_FREQ:
"Program this field to the CD clock frequency minus one. This is used to
generate a divided down clock for miscellaneous timers in display."
And the "Broadwell Sequences for Changing CD Clock Frequency" section
clarifies this further:
"For CD clock 337.5 MHz, program 337 decimal.
For CD clock 450 MHz, program 449 decimal.
For CD clock 540 MHz, program 539 decimal.
For CD clock 675 MHz, program 674 decimal."
Ramalingam C [Tue, 19 Apr 2016 08:18:14 +0000 (13:48 +0530)]
drm/i915/bxt: Adjusting the error in horizontal timings retrieval
In BXT DSI there is no regs programmed with few horizontal timings
in Pixels but txbyteclkhs.. So retrieval process adds some
ROUND_UP ERRORS in the process of PIXELS<==>txbyteclkhs.
Actually here for the given adjusted_mode, we are calculating the
value programmed to the port and then back to the horizontal timing
param in pixels. This is the expected value at the end of get_config,
including roundup errors. And if that is same as retrieved value
from port, then retrieved (HW state) adjusted_mode's horizontal
timings are corrected to match with SW state to nullify the errors.
Chris Wilson [Thu, 28 Apr 2016 08:56:59 +0000 (09:56 +0100)]
drm/i915: Unify GPU resets upon shutdown
Both execlists and legacy need to reset the context (and mode) of the
GPU before we lose control of the system. By resetting the GPU, we
revert back to default settings. This simplifies the life of any
subsequent driver (in particular for virtualized setups) as it does not
then have to try and recover from an unknown condition. As both paths
need to reset for the same reason, move the reset to a common point.
This unifies the resets added in a647828afc (drm/i915: Also perform gpu
reset under execlist mode) and 8e96d9c4d9 (drm/i915: reset the GPU on
context fini).
v2: Restrict the reset to "modern" gen (where we enable HW contexts) to
try and avoid leaving the machine in an unusable state with a risky
reset on older GPU. This should keep the status quo as to who performs
resets (i.e. currently only GPUs with HW contexts perform a reset on
shutdown).
With the previous patch having extended the pinned lifetime of
contexts by referencing the previous context from the current
request until the latter is retired (completed by the GPU),
we can now remove usage of execlist retired queue entirely.
This is because the above now guarantees that all execlist
object access requirements are satisfied by this new tracking,
and we can stop taking additional references and stop keeping
request on the execlists retired queue.
The latter was a source of significant scalability issues in
the driver causing performance hits on some tests. Most
dramatical of which was igt/gem_close_race which had run time
in tens of minutes which is now reduced to tens of seconds.
Chris Wilson [Thu, 28 Apr 2016 08:56:56 +0000 (09:56 +0100)]
drm/i915: Track the previous pinned context inside the request
As the contexts are accessed by the hardware until the switch is completed
to a new context, the hardware may still be writing to the context object
after the breadcrumb is visible. We must not unpin/unbind/prune that
object whilst still active and so we keep the previous context pinned until
the following request. We can generalise the tracking we already do via
the engine->last_context and move it to the request so that it works
equally for execlists and GuC.
v2: Drop the execlists double pin as that exposes a race inside the lrc
irq handler as it tries to access the context after it may be retired.
Chris Wilson [Thu, 28 Apr 2016 08:56:55 +0000 (09:56 +0100)]
drm/i915: Move releasing of the GEM request from free to retire/cancel
If we move the release of the GEM request (i.e. decoupling it from the
various lists used for client and context tracking) after it is complete
(either by the GPU retiring the request, or by the caller cancelling the
request), we can remove the requirement that the final unreference of
the GEM request need to be under the struct_mutex.
The careful reader may notice that one or two impossible NULL pointer
tests are dropped for readability. These pointers cannot be NULL since
they are assigned during request construction and never unset.
v2,v3: Rebalance execlists by moving the context unpinning.
v4: Rebase onto -nightly
v5: Avoid trying to rebalance execlist/GuC context pinning, leave that
to the next step
Chris Wilson [Thu, 28 Apr 2016 08:56:54 +0000 (09:56 +0100)]
drm/i915: Move the magical deferred context allocation into the request
We can hide more details of execlists from higher level code by removing
the explicit call to create an execlist context from execbuffer and
into its first use by execlists.
Refactor pinning and unpinning of contexts, such that the default
context for an engine is pinned during initialisation and unpinned
during teardown (pinning of the context handles the reference counting).
Thus we can eliminate the special case handling of the default context
that was required to mask that it was not being pinned normally.
v2: Rebalance context_queue after rebasing.
v3: Rebase to -nightly (not 40 patches in)
v4: Rebase onto request_alloc unwinding
Chris Wilson [Thu, 28 Apr 2016 08:56:52 +0000 (09:56 +0100)]
drm/i915: Replace the pinned context address with its unique ID
Rather than reuse the current location of the context in the global GTT
for its hardware identifier, use the context's unique ID assigned to it
for its whole lifetime.
Chris Wilson [Thu, 28 Apr 2016 08:56:51 +0000 (09:56 +0100)]
drm/i915: Assign every HW context a unique ID
The hardware tracks contexts and expects all live contexts (those active
on the hardware) to have a unique identifier. This is used by the
hardware to assign pagefaults and the like to a particular context.
v2: Reorder to make sure ctx->link is not left dangling if the
assignment of a hw_id fails (Mika).
Chris Wilson [Thu, 28 Apr 2016 08:56:49 +0000 (09:56 +0100)]
drm/i915: Preallocate enough space for the average request
Rather than being interrupted when we run out of space halfway through
the request, and having to restart from the beginning (and returning to
userspace), flush a little more free space when we prepare the request.
Chris Wilson [Thu, 28 Apr 2016 08:56:48 +0000 (09:56 +0100)]
drm/i915: Manually unwind after a failed request allocation
In the next patches, we want to move the work out of freeing the request
and into its retirement (so that we can free the request without
requiring the struct_mutex). This means that we cannot rely on
unreferencing the request to completely teardown the request any more
and so we need to manually unwind the failed allocation. In doing so, we
reorder the allocation in order to make the unwind simple (and ensure
that we don't try to unwind a partial request that may have modified
global state) and so we end up pushing the initial preallocation down
into the engine request initialisation functions where we have the
requisite control over the state of the request.
Moving the initial preallocation into the engine is less than ideal: it
moves logic to handle a specific problem with request handling out of
the common code. On the other hand, it does allow those backends
significantly more flexibility in performing its allocations.
Chris Wilson [Thu, 28 Apr 2016 08:56:47 +0000 (09:56 +0100)]
drm/i915: Remove the identical implementations of request space reservation
Now that we share intel_ring_begin(), reserving space for the tail of
the request is identical between legacy/execlists and so the tautology
can be removed. In the process, we move the reserved space tracking
from the ringbuffer on to the request. This is to enable us to reorder
the reserved space allocation in the next patch.
Chris Wilson [Thu, 28 Apr 2016 08:56:46 +0000 (09:56 +0100)]
drm/i915: Unify intel_ring_begin()
Combine the near identical implementations of intel_logical_ring_begin()
and intel_ring_begin() - the only difference is that the logical wait
has to check for a matching ring (which is assumed by legacy).
In the process some debug messages are culled as there were following a
WARN if we hit an actual error.
Chris Wilson [Thu, 28 Apr 2016 08:56:45 +0000 (09:56 +0100)]
drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
The code to switch_mm() is already handled by i915_switch_context(), the
only difference required to setup the aliasing ppgtt is that we need to
emit te switch_mm() on the first context, i.e. when transitioning from
engine->last_context == NULL. This allows us to defer the
initialisation of the GPU from early device initialisation to first use,
which should marginally speed up both. The caveat is that we then defer
the context initialisation until first use - i.e. we cannot assume that
the GPU engines are initialised. For example, this means that power
contexts for rc6 (Ironlake) need to explicitly loaded, as they are.
Chris Wilson [Thu, 28 Apr 2016 08:56:43 +0000 (09:56 +0100)]
drm/i915: Consolidate L3 remapping LRI
We can use a single MI_LOAD_REGISTER_IMM command packet to write all the
L3 remapping registers, shrinking the number of bytes required to emit
the context switch.
Chris Wilson [Thu, 28 Apr 2016 08:56:41 +0000 (09:56 +0100)]
drm/i915: Mark the current context as lost on suspend
In order to force a reload of the context image upon resume, we first
need to mark its absence on suspend. Currently we are failing to restore
the golden context state and any context w/a to the default context
after resume.
One oversight corrected, is that we had forgotten to reapply the L3
remapping when restoring the lost default context.
Chris Wilson [Thu, 28 Apr 2016 08:56:40 +0000 (09:56 +0100)]
drm/i915: Use i915_vma_pin_iomap on the ringbuffer object
Similarly to i915_gem_object_pin_map on LLC platforms, we can
use the new VMA based io mapping on !LLC to amoritize the cost
of ringbuffer pinning and unpinning.
Chris Wilson [Thu, 28 Apr 2016 08:56:39 +0000 (09:56 +0100)]
drm/i915: Move ioremap_wc tracking onto VMA
By tracking the iomapping on the VMA itself, we can share that area
between multiple users. Also by only revoking the iomapping upon
unbinding from the mappable portion of the GGTT, we can keep that iomap
across multiple invocations (e.g. execlists context pinning).
Note that by moving the iounnmap tracking to the VMA, we actually end up
fixing a leak of the iomapping in intel_fbdev.
v1.5: Rebase prompted by Tvrtko
v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma.
v3: Move handling of ioremap space exhaustion to vmap_purge and also
allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc.
v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap
v5: Back to i915_vm_to_ggtt
v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical
sections and ensure the VMA cannot be reaped whilst mapped.
v7: Move i915_vma_iounmap so that consumers of the API are not tempted,
and add iomem annotations
Chris Wilson [Thu, 28 Apr 2016 08:56:38 +0000 (09:56 +0100)]
drm/i915: Introduce i915_vm_to_ggtt()
In a couple of places, we have an i915_address_space that we know is
really an i915_ggtt that we want to use. Create an inline helper to
convert from the i915_address_space subclass into its container.
Chris Wilson [Thu, 28 Apr 2016 08:56:37 +0000 (09:56 +0100)]
io-mapping: Specify mapping size for io_mapping_map_wc()
The ioremap() hidden behind the io_mapping_map_wc() convenience helper
can be used for remapping multiple pages. Extend the helper so that
future callers can use it for larger ranges.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: David Airlie <airlied@linux.ie> Cc: Yishai Hadas <yishaih@mellanox.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: Luis R. Rodriguez <mcgrof@kernel.org> Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: netdev@vger.kernel.org Cc: linux-rdma@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org> Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-3-git-send-email-chris@chris-wilson.co.uk
Chris Wilson [Thu, 28 Apr 2016 08:56:36 +0000 (09:56 +0100)]
drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
When setting up the overlay page, we pin it into the GGTT (when using
virtual addresses) and store the offset as overlay->flip_addr. Rather
than doing a lookup of the GGTT address everytime, we can use the known
address instead.
Chris Wilson [Mon, 25 Apr 2016 12:32:13 +0000 (13:32 +0100)]
drm/i915: Propagate error from drm_gem_object_init()
Propagate the real error from drm_gem_object_init(). Note this also
fixes some confusion in the error return from i915_gem_alloc_object...
v2:
(Matthew Auld)
- updated new users of gem_alloc_object from latest drm-nightly
- replaced occurrences of IS_ERR_OR_NULL() with IS_ERR()
v3:
(Joonas Lahtinen)
- fix double "From:" in commit message
- add goto teardown path
v4:
(Matthew Auld)
- rebase with i915_gem_alloc_object name change
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1461587533-8841-1-git-send-email-matthew.auld@intel.com
[Joonas: Removed spurious " = NULL" from _init() function] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Wed, 27 Apr 2016 08:02:01 +0000 (09:02 +0100)]
drm/i915: Protect gen7 irq_seqno_barrier with uncore lock
Faced with sporadic machine hangs on gen7, that mimic the issue of
concurrent writes to the same cacheline and seem to start with
commit 9b9ed3093613 (drm/i915: Remove forcewake dance from seqno/irq
barrier on legacy gen6+), let us restore the spinlock around the mmio
read.
Ville Syrjälä [Wed, 27 Apr 2016 14:43:22 +0000 (17:43 +0300)]
drm/i915: Update RAWCLK_FREQ register on VLV/CHV
I just noticed that VLV/CHV have a RAWCLK_FREQ register just like PCH
platforms. It lives in the display power well, so we should update it
when enabling the power well.
Interestingly the BIOS seems to leave it at the reset value (125) which
doesn't match the rawclk frequency on VLV/CHV (200 MHz). As always with
these register, the spec is extremely vague what the register does. All
it says is: "This is used to generate a divided down clock for
miscellaneous timers in display." Based on a quick test, at least AUX
and PWM appear to be unaffected by this.
But since the register is there, let's configure it in accordance with
the spec.
Note that we have to move intel_update_rawclk() to occur before we
touch the power wells, so that the dev_priv->rawclk_freq is already
populated when the disp2 enable hook gets called for the first time.
I think this should be safe to do on other platforms as well.
Check for VLV/CHV instead if !BXT when re-enabling DPOunit clock gating
after DSI disable. That's what we checked when disabling the clock
gating when enabling DSI.
Also use the same temporary variable name in both cases, and toss in a
bit of dev vs. dev_priv cleanup while at it.
Deepak M [Wed, 30 Mar 2016 14:03:40 +0000 (17:03 +0300)]
drm/i915: Parsing the PWM cntrl and CABC ON/OFF fields in VBT
For dual link panel scenarios there are new fields added in the
VBT which indicate on which port the PWM cntrl and CABC ON/OFF
commands needs to be sent.
drm/i915: Add Backlight Control using DPCD for eDP connectors (v9)
This patch adds support for eDP backlight control using DPCD registers to
backlight hooks in intel_panel.
It checks for backlight control over AUX channel capability and sets up
function pointers to get and set the backlight brightness level if
supported.
v2: Moved backlight functions from intel_dp.c into a new file
intel_dp_aux_backlight.c. Also moved reading of eDP display control
registers to intel_dp_get_dpcd
v3: Correct some formatting mistakes
v4: Updated to use AUX backlight control if PWM control is not possible
(Jani)
v5: Moved call to initialize backlight registers to dp_aux_setup_backlight
v6: Check DP_EDP_BACKLIGHT_PIN_ENABLE_CAP is disabled before setting up AUX
backlight control. To fix BLM_PWM_ENABLE igt test warnings on bdw_ultra
v7: Add enable_dpcd_backlight module parameter.
v8: Rebase onto latest drm-intel-nightly branch
v9: Remove changes to intel_dp_dpcd_read_wake
Split addition edp_dpcd variable into a separate patch
Dave Gordon [Fri, 22 Apr 2016 18:14:32 +0000 (19:14 +0100)]
drm/i915: rename i915_gem_alloc_object() to i915_gem_object_create()
Because having both i915_gem_object_alloc() and i915_gem_alloc_object()
(with different return conventions) is just too confusing!
(i915_gem_object_alloc() is the low-level memory allocator, and remains
unchanged, whereas i915_gem_alloc_object() is a constructor that ALSO
initialises the newly-allocated object.)
Move graphics stolen memory related early quirk into a function to
allow easy adding of other graphics quirks to fix memory maps on
machines running old BIOS versions.
While at it;
- _funcs -> _ops to follow de facto naming
- make the iteration code tad more readable
- remove unused variables
Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Move the better constructs/comments from i915_gem_stolen.c to
early-quirks.c and increase readability in preparation of only
having one set of functions.
- intel_stolen_base -> gen3_stolen_base
- use phys_addr_t instead of u32 for address for future proofing
v2:
- Print the invalid register values (Chris)
(Omitting the register prefix as it's visible from backtrace.)
Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
This patch applies a performance enhancement workaround
based on analysis of DX and OCL S-Curve workloads. We
increase the General Priority Credits for L3SQ from the
hardware default of 56 to the max value 62, and decrease
the High Priority credits from 8 to 2.
v2: Only apply to B0 onwards
v3: Move w/a to per engine init, ie bxt_init_workarounds