git.karo-electronics.de Git - linux-beck.git/log

drm/i915: move intel_hrawclk() to intel_display.c

Make it available outside of intel_dp.c.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Notify GuC rc6 state

If rc6 is enabled, notify GuC so it can do proper forcewake before
command submission.

Signed-off-by: Alex Dai <yu.dai@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/guc: Support GuC version 4.3

The firmware layout changes that now it only has css header +
uCode + RSA signature. Plus, other trivial changes to support
GuC V4.3.

Signed-off-by: Alex Dai <yu.dai@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix module initialisation, v2.

The driver doesn't support UMS any more, so set DRIVER_MODESET by default,
remove the legacy s/r callbacks, and rename the s/r functions to make it more clear
they're only in use by switcheroo now.

Also remove an obsolete comment about atomic. Normal updates are supported only
async updates aren't yet.

v2: Don't unconditionally set DRIVER_ATOMIC, we're not yet there.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Factor out intel_crtc_has_encoders()

Make the code mode readable by pulling the "does this crtc have any
encoders?" deduction into a separate function.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix clock readout when pipes are enabled w/o ports

The BIOS sometimes likes to enable pipes w/o any ports, at least on
older machines. Currently we fail to assign anything sensible to
crtc->hwmode.crtc_clock which leads to complaints from the vblank code.
Deal with active pipes w/o ports and assign something sensible to
crtc_clock in i9xx_get_pipe_config(). The encoder .get_config() will
override this if the port is enabled.

Gets rid of rest of these on my gen4:
[drm:drm_calc_timestamping_constants [drm]] *ERROR* crtc 24: Can't calculate constants, dotclock = 0!
[drm:i915_get_vblank_timestamp] crtc 1 is disabled

v2: Fill out crtc_clock already in i9xx_get_pipe_config() (Maarten)

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add CHV PHY LDO power sanity checks

At various points when changing the DPIO lane/phy power states,
construct an expected value of the DISPLAY_PHY_STATUS register
and compare it with the real thing.

To construct the expected value we look at our shadow PHY_CONTROL
register value (which should match what we've just written to the
hardware), and we also need to look at the actual state of the cmn
power wells as a disabled power well causes the relevant LDO status
to be reported as 'on' in DISPLAY_PHY_STATUS.

When initially powering up the PHY it performs various internal
calibrations for which it fully powers up. That means that if we check
for the expetected power state immediately upon releasing cmnreset we
would get the occasional false positive. But we can of course
poll until the expected value appears. It shouldn't be too long so
this shouldn't make modesets substantially longer.

One extra complication is introduced when we cross the streams, ie.
drive port B with pipe B. In this case we trick CL2 (where the DPLL lives)
into life by temporaily powering up the lanes in the second channel,
and once the pipe is up and runnign we release the lane power override.
At that point the power state of CL2 has somehow gotten entangled with
the power state of the first channel. That means that constructing the
expected DISPLAY_PHY_STATUS value is a bit tricky since based on the
lane power states in the second channel, CL2 should also be powered
down. But we can use the DPLL enable bit to determine when CL2 should
be alive even if the lanes are powered down. However the power state
of CL2 isn't actually tied in with the DPLL state, but to the state
of the lanes in first channel, so we have to avoid checking the
expected state between shutting down the DPLL and powering down
the lanes in the first channel. So no calling assert_chv_phy_status()
before the DISPLAY_PHY_CONTROL write in chv_phy_powergate_lanes(),
but after the write is a safe time to check.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add some CHV DPIO lane power state asserts

Add some checks that the state of the DPIO lanes is more or less what we
expect based on the overrides.

The hardware only provides two bits per channel indicating whether all
or some of the lanes are powered down, so we can't do an exact check.

Additionally, CL2 powering down before we can check it adds another
twist. To work around this we simply check for the 0 value of the
CL2 register (which is what we get when it's powered down) and
adjust our expectations.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Clean up CHV lane soft reset programming

Currently we release the lane soft reset before lane stagger settings
have been programmed. I believe that means we don't actually do lane
staggering. So move the soft reset deassert to happen after lane
staggering has been programmed.

The one confusing thing in this is that when we remove the power down
override from the lanes, they power up with defaul register values,
which do not have the soft reset overrides enabled. And according to
some docs by default the data lane resets are tied to cmnreset. So that
would mean that lanes would come out of reset without staggering as
soon as the power down overrides are removed. But since we can't access
either the lane stagger register nor the soft reset override registers
until the lanes are powered on, we can't really do anything about it.
So let's just set the soft reset overrides as soon as the lane is
powered on and hope for the best.

v2: Fix typos in commit message (Daniel)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Bump command parser version number.

This was forgotten in

commit d351f6d94893f3ba98b1b20c5ef44c35fc1da124
Author: Francisco Jerez <currojerez@riseup.net>
Date: Fri May 29 16:44:15 2015 +0300

drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/dp: use the drm dp helper for determining sink tps3 support

No functional changes.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: s/intel_dp_tps/drm_dp_tps/.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/dp: add drm_dp_tps3_supported helper

Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Update DRIVER_DATE to 20150828

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Partially revert "drm/i915: Use full atomic modeset."

This partially reverts commit 74c090b1bdc57b1c9f1361908cca5a3d8a80fb08.

The DRIVER_ATOMIC cap cannot yet be exported because i915 lacks async
support.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: gen 9 can check for unclaimed registers too

Dear git bisect user,

Even though this is the patch that introduced the WARN() you're
bisecting, please notice that it's very likely that the problem you're
facing was already present before this commit. In other words: this
commit adds code to detect errors and give WARN()s about them, but the
errors were already there.

In order to continue your debug, please use the i915.mmio_debug
option, check the backtraces and try to discover which read or write
operation is causing the error message. Then check if this is
happening because the register does not exist or because its power
well is down when the operation is being done.

On my SKL machine, if I use i915.mmio_debug=999, this patch triggers
42 WARNs just by booting. I didn't investigate them yet. Normal users
are only going to get a single WARN due to the default i915.mmio_debug
setting.

Thank you for your comprehension,
Paulo

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Force CL2 off in CHV x1 PHY

We can choose to leave the display PHY CL2 powerdown up to some hardware
signals, or we can force it. The BXT code forces the nonexistent CL2 in
the x1 PHY to power down. Follow suit on CHV. Maybe it can still save
some extra power by disabling some extra logic in CL1, or something.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Enable DPIO SUS clock gating on CHV

CHV has supports some form of automagic clock gating for the
DPIO SUS clock. We can simply enable the magic bits and the
hardware should take care of the rest.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Force common lane on for the PPS kick on CHV

With DPIO powergating active the DPLL can't be accessed unless
something else is keeping the common lane in the channel on.
That means the PPS kick procedure could fail to enable the PLL.

Power up some data lanes to force the common lane to power up
so that the PLL can be enabled temporarily.

v2: Avoid gcc uninitilized variable warning

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Trick CL2 into life on CHV when using pipe B with port B

Normmally the common lane in a PHY channel gets powered up when some
of the data lanes get powered up. But when we're driving port B with
pipe B we don't want to enabled any of the data lanes, and just want
the DPLL in the common lane to be active.

To make that happens we have to temporarily enable some data lanes
after which we can access the DPLL registers in the common lane. Once
the pipe is up and running we can drop the power override on the data
lanes allowing them to shut down. From this point forward the common
lane will in fact stay powered on until the data lanes in the other
channel get powered down.

Ville's extended explanation from the review thread:

On Wed, Aug 19, 2015 at 07:47:41AM +0530, Deepak wrote:
> One Q, why only for port B? Port C is also in same common lane right?

Port B is in the first PHY channel which also houses CL1. CL1 always
powers up whenever any lanes in either PHY channel are powered up.
CL2 only powers up if lanes in the second channel (ie. the one with
port C) powers up.

So in this scenario (pipe B->port B) we want the DPLL from CL2, but
ideally we only want to power up the lanes for port B. Powering up
port B lanes will only power up CL1, but as we need CL2 instead we
need to, temporarily, power up some lanes in port C as well.

Crossing the streams the other way (pipe A->port C) is not a problem
since CL1 powers up whenever anything else powers up. So powering up
some port C lanes is enough on its own to make the CL1 DPLL
operational, even though CL1 and the lanes live in separate channels.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
[danvet: Amend commit message with extended explanation.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Implement PHY lane power gating for CHV

Powergate the PHY lanes when they're not needed. For HDMI all four lanes
are needed always, but for DP we can enable only the needed lanes. To
power down the unused lanes we use some power down override bits in the
DISPLAY_PHY_CONTROL register. Without the overrides it appears that the
hardware always powers on all the lanes. When the port is disabled the
power down override is not needed and the lanes will shut off on their
own. That also means the override is critical to actually be able to
access the DPIO registers before the port is actually enabled.

Additionally the common lanes will power down when not needed. CL1
remains on as long as anything else is on, CL2 will shut down when
all the lanes in the same channel will shut down. There is one exception
for CL2 that will be dealt in a separate patch for clarity.

With potentially some lanes powered down, the DP code now has to check
the number of active lanes before accessing PCS/TX registers. All
registers in powered down blocks will reads as 0xffffffff, and soe we
would drown in warnings from vlv_dpio_read() if we allowed the code
to access all those registers.

Another important detail in the DP code is the "TX latency optimal"
setting. Normally the second TX lane acts as some kind of reset master,
with the other lanes as slaves. But when only a single lane is enabled,
that single lane obviously has to be the master.

A bit of extra care is needed to reconstruct the initial state of the
DISPLAY_PHY_CONTROL register since it can't be read safely. So instead
read the actual lane status from the DPLL/PHY_STATUS registers and
use that to determine which lanes ought to be powergated initially.

We also need to switch the PHY power modes to "deep PSR" to avoid
a hard system hang when powering down the single channel PHY.

Also sprinkle a few debug prints around so that we can monitor the
DISPLAY_PHY_STATUS changes without having to read it and risk
corrupting it.

v2: Add locking to chv_powergate_phy_lanes()
v3: Actually enable dynamic powerdown in the PHY and deal with the
fallout

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move DPLL ref/cri/VGA mode frobbing to the disp2d well enable

Bunch of stuff needs the DPLL ref/cri clocks on both VLV and CHV,
and having VGA mode enabled causes some problems for CHV. So let's just
pull the code to configure those bits into the disp2d well enable hook.
With the DPLL disable code also fixed to leave those bits alone we
should now have a consistent DPLL state all the time even if the DPLL
is disabled.

This also neatly removes some duplicated code between the VLV and
CHV codepaths.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make some string arrays const

Most of our char* arrays are markes as const already, but a few slipped
through the cracks. Fix it.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use ARRAY_SIZE() instead of hand rolling it

A couple of hand rolled ARRAY_SIZE()s caught my eye. Get rid of them.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix some gcc warnings

Simple one:
drivers/gpu/drm/i915/i915_debugfs.c:2449:57: warning: Using plain integer as NULL pointer

And something a bit more peculiar:
drivers/gpu/drm/i915/i915_debugfs.c:4953:18: warning: Variable length array is used.
drivers/gpu/drm/i915/i915_debugfs.c:4953:32: warning: Variable length array is used.

We pass a 'const int' as the array size which results in the warning,
dropping the const gets rid of the warning. Weird, but I think getting
rid of the warnings is better than holding on to the const.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: Use correct live status register for BXT platform

BXT platform uses live status bits from 0x44440 register to obtain DP
status on hotplug. The existing g4x_digital_port_connected() uses a
different register and hence misses DP hotplug events on BXT
platform. This patch fixes it by using the appropriate register(0x44440)
and live status bits(3:5).

Based on a patch by Durgadoss R <durgadoss.r@intel.com>, from whom the
commit message is shamelessly copy pasted.

Reported-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: split g4x_digital_port_connected to g4x and vlv variants

Choose the right function at the intel_digital_port_connected level.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: split ibx_digital_port_connected to ibx and cpt variants

Choose the right function at the intel_digital_port_connected level.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: add common intel_digital_port_connected function

Add a common intel_digital_port_connected() that splits out to functions
for different platforms. No functional changes.

v2: make the function return a boolean

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: add MISSING_CASE annotation to ibx_digital_port_connected

With the case added for eDP on port A (always connected from this
function's point of view), we should not be hitting any of the default
cases in ibx_digital_port_connected, so add MISSING_CASE annotation.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: make g4x_digital_port_connected return boolean status

We should not be hitting any of the default cases in
g4x_digital_port_connected, so add MISSING_CASE annotation and return
boolean status. The current behaviour is just cargo culting from the
days of yonder when the display port support was added to i915.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: move ibx_digital_port_connected to intel_dp.c

The function can be made static there. No functional changes.

Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: DVO pixel clock check

It is possible the we request to have a mode that has
higher pixel clock than our HW can support. This patch
checks if requested pixel clock is lower than the one
supported by the HW. The requested mode is discarded
if we cannot support the requested pixel clock.

This patch applies to DVO.

V2:
- removed computation for max pixel clock

V3:
- cleanup by removing unnecessary lines

V4:
- clock check against max dotclock moved inside 'if (fixed_mode)'

V5:
- dot clock check against fixed_mode clock when available

Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: DSI pixel clock check

It is possible the we request to have a mode that has
higher pixel clock than our HW can support. This patch
checks if requested pixel clock is lower than the one
supported by the HW. The requested mode is discarded
if we cannot support the requested pixel clock.

This patch applies to DSI.

V2:
- removed computation for max pixel clock

V3:
- cleanup by removing unnecessary lines

V4:
- max_pixclk variable renamed as max_dotclk
- moved dot clock checking inside 'if (fixed_mode)'

V5:
- dot clock checked against fixed_mode clock

Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: LVDS pixel clock check

It is possible the we request to have a mode that has
higher pixel clock than our HW can support. This patch
checks if requested pixel clock is lower than the one
supported by the HW. The requested mode is discarded
if we cannot support the requested pixel clock.

This patch applies to LVDS.

V2:
- removed computation for max pixel clock

V3:
- cleanup by removing unnecessary lines

V4:
- moved supported dotclock check from mode_valid() to intel_lvds_init()

V5:
- dotclock check moved back to mode_valid() function
- dotclock check for fixed mode

Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Store max dotclock

Store max dotclock into dev_priv structure so we are able
to filter out the modes that are not supported by our
platforms.

V2:
- limit the max dot clock frequency to max CD clock frequency
  for the gen9 and above
- limit the max dot clock frequency to 90% of the max CD clock
  frequency for the older gens
- for Cherryview the max dot clock frequency is limited to 95%
  of the max CD clock frequency
- for gen2 and gen3 the max dot clock limit is set to 90% of the
  2X max CD clock frequency

V3:
- max_dotclk variable renamed as max_dotclk_freq in i915_drv.h
- in intel_compute_max_dotclk() the rounding method changed from
  round up to round down when computing max dotclock

V4:
- Haswell and Broadwell supports now dot clocks up to max CD clock
  frequency

Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add vlv_dport_to_phy()

Add vlv_dport_to_phy() and fix up the return values of
vlv_dport_to_channel() and vlv_pipe_to_channel() to use
the appropriate enums.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move VLV/CHV prepare_pll later

With DPIO powergating active on CHV, we can't even access the DPIO PLL
registers until the lane power state overrides have been enabled. That
will happen from the encoder .pre_pll_enable() hook, so move
chv_prepare_pll() to happen after that point, which puts it just before
chv_enable_pll() actually.

Do the same for VLV to avoid accumulating weird differences between the
platforms. Both platforms seem happy with the new arrangement.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add locking around chv_phy_control_init()

dev_priv->chv_phy_control is protected by the power_domains->lock
elsewhere, so also grab it when initializing chv_phy_control.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move DPIO port init earlier

To implement DPIO lane power gating on CHV we're going to need to access
DPIO registers from the cmn power well enable hook. That gets called
rather early, so we need to move the DPIO port IOSF sideband port
assignment earlier as well.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add encoder->post_pll_disable() hooks and move CHV clock buffer disables there

Move the CHV clock buffer disable from chv_disable_pll() to the new
encoder .post_pll_disable() hook. This is more symmetric since the
clock buffer enable happens from the .pre_pll_enable() hook.

We'll have more use for the new hook soon.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Always program unique transition scale for CHV

The docs give you the impression that the unique transition scale
value shouldn't matter when unique transition scale is enabled. But
as Imre found on BXT (and I verfied also on BSW) the value does
matter. So from now on just program the same value 0x9a always.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Always program m2 fractional value on CHV

When fractional m2 divider isn't used on CHV the fractional part
is ignore by the hardware. Despite that, program the fractional
value (0 in this case) to the hardware register just to keep
things a bit more consistent. Might at least make register dumps
a bit less confusing when there isn't some stale fractional part
hanging around.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fix driver's versions of WARN_ON & WARN_ON_ONCE

The current versions of these two macros don't work correctly if the
argument expression happens to contain a modulo operator (%) -- when
stringified, it gets interpreted as a printf formatting character!
With a specifically crafted parameter, this could probably cause a
kernel OOPS; consider WARN_ON(p%s) or WARN_ON(f %*pEp).

Instead, we should use an explicit "%s" format, with the stringified
expression as the coresponding literal-string argument.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Put back lane_count into intel_dp and add link_rate too

With MST there won't be a crtc assigned to the main link encoder, so
trying to dig up the pipe_config from there is a recipe for an oops.

Instead store the parameters (lane_count and link_rate) in the encoder,
and use those values during link training etc. Since those parameters
are now assigned only when the link is actually enabled,
.compute_config() won't clobber them as it did before.

Hardware state readout is still bonkers though as we don't transfer the
link parameters from pipe_config intel_dp. We should do that during
encoder sanitation. But since we don't even do a proper job of reading
out the main link encoder state for MST there's littel point in
worrying about this now.

Fixes a regression with MST caused by:
commit 90a6b7b052b1aa17fbb98b049e9c8b7f729c35a7
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Mon Jul 6 16:39:15 2015 +0300

    drm/i915: Move intel_dp->lane_count into pipe_config

v2: Different apporoach that should keep intel_dp_check_mst_status()
    somewhat less oopsy

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: don't allow cached GEM mappings on A stepping

Due to a coherency issue on BXT A steppings we can't guarantee a
coherent view of cached (CPU snooped) GPU mappings, so fail such
requests. User space is supposed to fall back to uncached mappings in
this case.

v2:
- limit the WA to A steppings, on later stepping this HW issue is fixed
v3:
- return error instead of trying to work around the issue in kernel,
since that could confuse user space (Chris)

Testcast: igt/gem_store_dword_batches_loop/cached-mapping
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: work around HW coherency issue when accessing GPU seqno

By running igt/store_dword_loop_render on BXT we can hit a coherency
problem where the seqno written at GPU command completion time is not
seen by the CPU. This results in __i915_wait_request seeing the stale
seqno and not completing the request (not considering the lost
interrupt/GPU reset mechanism). I also verified that this isn't a case
of a lost interrupt, or that the command didn't complete somehow: when
the coherency issue occured I read the seqno via an uncached GTT mapping
too. While the cached version of the seqno still showed the stale value
the one read via the uncached mapping was the correct one.

Work around this issue by clflushing the corresponding CPU cacheline
following any store of the seqno and preceding any reading of it. When
reading it do this only when the caller expects a coherent view.

v2:
- fix using the proper logical && instead of a bitwise & (Jani, Mika)
- limit the workaround to A stepping, on later steppings this HW issue
is fixed
v3:
- use a separate get_seqno/set_seqno vfunc (Chris)

Testcase: igt/store_dword_loop_render
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Also call frontbuffer flip when disabling planes.

We also need to call the frontbuffer flip to trigger proper
invalidations when disabling planes. Otherwise we will miss
screen updates when disabling sprites or cursor.

On core platforms where HW tracking also works, this issue
is totally masked because HW tracking triggers PSR exit
however on VLV/CHV that has only SW tracking we miss screen
updates when disabling planes.

It was caught with kms_psr_sink_crc sprite_plane_onoff
and cursor_plane_onoff subtests running on VLV/CHV.

This is probably a regression since I can also get this
with the manual test case, but with so many changes on atomic
modeset I couldn't track exactly when this was introduced.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Change SRM, LRM instructions to use correct length

MI_STORE_REGISTER_MEM, MI_LOAD_REGISTER_MEM instructions are not really
variable length instructions unlike MI_LOAD_REGISTER_IMM where it expects
(reg, addr) pairs so use fixed length for these instructions.

v2: rebase

Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Appease checkpatch as Mika spotted in i915_reg.h - it seems
terminally unhappy about i915_cmd_parser.c so that would be a separate
patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

doc: drm: Fix mis-spelling of i915_guc_submission includes

In commit
d1675198e: drm/i915: Integrate GuC-based command submission

the drm.tmpl include lines reference the intel_guc_submission.c but the
patch adds the file i915_guc_submission.c. drm.tmpl fails to build with:
docproc: .//drivers/gpu/drm/i915/intel_guc_submission.c:
No such file or directory

Change the file reference to the actual file.

Signed-off-by: Graham Whaley <graham.whaley@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: remove excessive scaler debugging messages

There's so much scaler debugging messages that it makes other debugging
hard. Remove them.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Debugfs interface for GuC submission statistics

This provides a means of reading status and counts relating
to GuC actions and submissions.

v2:
    Remove surplus blank line in output [Chris Wilson]

v5:
    Added GuC per-engine submission & seqno statistics

v6:
    Add per-ring statistics to client, refactor client-dumper.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Alex Dai <yu.dai@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Integrate GuC-based command submission

GuC-based submission is mostly the same as execlist mode, up to
intel_logical_ring_advance_and_submit(), where the context being
dispatched would be added to the execlist queue; at this point
we submit the context to the GuC backend instead.

There are, however, a few other changes also required, notably:
1.  Contexts must be pinned at GGTT addresses accessible by the GuC
    i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the
    PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls.

2.  The GuC's TLB must be invalidated after a context is pinned at
    a new GGTT address.

3.  GuC firmware uses the one page before Ring Context as shared data.
    Therefore, whenever driver wants to get base address of LRC, we
    will offset one page for it. LRC_PPHWSP_PN is defined as the page
    number of LRCA.

4.  In the work queue used to pass requests to the GuC, the GuC
    firmware requires the ring-tail-offset to be represented as an
    11-bit value, expressed in QWords. Therefore, the ringbuffer
    size must be reduced to the representable range (4 pages).

v2:
    Defer adding #defines until needed [Chris Wilson]
    Rationalise type declarations [Chris Wilson]

v4:
    Squashed kerneldoc patch into here [Daniel Vetter]

v5:
    Update request->tail in code common to both GuC and execlist modes.
    Add a private version of lr_context_update(), as sharing the
        execlist version leads to race conditions when the CPU and
        the GuC both update TAIL in the context image.
    Conversion of error-captured HWS page to string must account
        for offset from start of object to actual HWS (LRC_PPHWSP_PN).

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Interrupt routing for GuC submission

Turn on interrupt steering to route necessary interrupts to GuC.

v6:
Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Implementation of GuC submission client

A GuC client has its own doorbell and workqueue. It maintains the
doorbell cache line, process description object and work queue item.

A default guc_client is created for the i915 driver to use for
normal-priority in-order submission.

Note that the created client is not yet ready for use; doorbell
allocation will fail as we haven't yet linked the GuC's context
descriptor to the default contexts for each ring (see later patch).

v2:
    Defer adding structure members until needed [Chris Wilson]
    Rationalise type declarations [Chris Wilson]

v5:
    Add GuC per-engine submission & seqno statistics.
    Move wq locking to encompass both get_space() and add_item().
    Take forcewake lock in host2guc_action() [Tom O'Rourke]

v6:
    Fix GuC doorbell cacheline selection code (the
        cacheline-within-page calculation was wrong).
    Rename GuC priorities to make them closer to the names used in
        the GuC firmware source, matching what the autogenerated
        versions will (probably) be.
    Add per-ring statistics to client.

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Enable GuC firmware log

Allocate a GEM object to hold GuC log data. A debugfs interface
(i915_guc_log_dump) is provided to print out the log content.

v2:
Add struct members at point of use [Chris Wilson]

v6:
Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Prepare for GuC-based command submission

This adds the first of the data structures used to communicate with the
GuC (the pool of guc_context structures).

We create a GuC-specific wrapper round the GEM object allocator as all
GEM objects shared with the GuC must be pinned into GGTT space at an
address that is NOT in the range [0..WOPCM_TOP), as that range of GGTT
addresses is not accessible to the GuC (from the GuC's point of view,
it's permanently reserved for other objects such as the BootROM & SRAM).

Later, we will need to allocate additional GuC-sharable objects for the
submission client(s) and the GuC's debug log.

v2:
    Remove redundant initialisation [Chris Wilson]
    Defer adding struct members until needed [Chris Wilson]
    Local functions should pass dev_priv rather than dev [Chris Wilson]

v5:
    Invalidate GuC TLB after allocating and pinning a new object

v6:
    Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Expose one LRC function for GuC submission mode

GuC submission is basically execlist submission, but with the GuC
handling the actual writes to the ELSP and the resulting context
switch interrupts.  So to describe a context for submission via
the GuC, we need one of the same functions used in execlist mode.
This commit exposes one such function, changing its name to better
describe what it does (it's related to logical ring contexts rather
than to execlists per se).

v2:
    Replaces previous "drm/i915: Move execlists defines from .c to .h"

v3:
    Incorporates a change to one of the functions exposed here that was
        previously part of an internal patch, but which was omitted from
        the version recently committed to drm-intel-nightly:
    7a01a0a drm/i915/lrc: Update PDPx registers with lri commands
        So we reinstate this change here.

v4:
    Drop v3 change, update function parameters due to collision with
        8ee3615 drm/i915: Convert execlists_ctx_descriptor() for requests

v5:
    Don't expose execlists_update_context() after all. The current
        version is no longer compatible with GuC submission; trying to
        share the execlist version of this function results in both GuC
        and CPU updating TAIL in the context image, with bad results when
        they get out of step. The GuC submission path now has its own
        private version that just updates the ringbuffer start address,
        and not TAIL or PDPx.

v6:
    Rebased

Issue: VIZ-4884
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Debugfs interface to read GuC load status

The new node provides access to the status of the GuC-specific loader;
also the scratch registers used for communication between the i915
driver and the GuC firmware.

v2:
Changes to output formats per Chris Wilson's suggestions

v6:
Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: GuC-specific firmware loader

This fetches the required firmware image from the filesystem,
then loads it into the GuC's memory via a dedicated DMA engine.

This patch is derived from GuC loading work originally done by
Vinit Azad and Ben Widawsky.

v2:
    Various improvements per review comments by Chris Wilson

v3:
    Removed 'wait' parameter to intel_guc_ucode_load() as firmware
        prefetch is no longer supported in the common firmware loader,
per Daniel Vetter's request.
    Firmware checker callback fn now returns errno rather than bool.

v4:
    Squash uC-independent code into GuC-specifc loader [Daniel Vetter]
    Don't keep the driver working (by falling back to execlist mode)
        if GuC firmware loading fails [Daniel Vetter]

v5:
    Clarify WOPCM-related #defines [Tom O'Rourke]
    Delete obsolete code no longer required with current h/w & f/w
        [Tom O'Rourke]
    Move the call to intel_guc_ucode_init() later, so that it can
        allocate GEM objects, and have it fetch the firmware; then
intel_guc_ucode_load() doesn't need to fetch it later.
        [Daniel Vetter].

v6:
    Update comment describing intel_guc_ucode_load() [Tom O'Rourke]

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Kill intel_dp->{link_bw, rate_select}

We only need the link_bw/rate_select parameters when starting link
training, and they should be computed based on the currently active
config, so throw them out from intel_dp and just compute on demand.

Toss in an extra debug print to see rate_select in addition to link_bw,
as the latter may be 0 for eDP 1.4.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't use link_bw to select between TP1 and TP3

intel_dp->link_bw is going away, so consul the port_clock instead when
choosing between TP1 and TP3.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move intel_dp->lane_count into pipe_config

Currently we clobber intel_dp->lane_count in compute config, which means
after a rejected modeset we may no longer be able to retrain the current
link. Move lane_count into pipe_config to avoid that.

v2: Add missing ':' to the pipe config debug dump

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Avoid confusion between DP and TRANS_DP_CTL in DP .get_config()

Use a separate variable for the TRANS_DP_CTL value instead of reusing
'tmp' that otherwise contains the DP port register value.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't pass clock to DDI PLL select functions

All the *_ddi_pll_select() functions get passed the port_clock and pipe
config as parameters. We only need to pass the pipe config, and the
functions can dig up the port_clock themselves.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't use link_bw for PLL setup

Use port_clock instead of link_bw when picking the PLL parameters for
DP. link_bw may be zero with an eDP 1.4 sink that supports
DP_LINK_RATE_SET so we shouldn't use it for anything other than feed it
to the sink appropriately.

v2: Fix typo in commit message (Sivakumar)

Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Clean up DP/HDMI limited color range handling

Currently we treat intel_{dp,hdmi}->color_range as partly user
controller value (via the property) but we also change it during
.compute_config() when using the "Automatic" mode. That is a bit
confusing, so let's just change things so that we store the user
property values in intel_dp, and only change what's stored in
pipe_config during .compute_config().

There should be no functional change.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Do not check or a stalled pageflip prior to it being queued

When we queue the command or operation to change the scanout address, we
mark the flip as in progress. We can use this flag to prevent us from
checking for a stalled flip prior to its existence!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: clflush on pin_to_display after pwrite to UC bo in LLC

Currently we don't clflush on pin_to_display if the bo is already
UC/WT and is not in the CPU write domain. This causes problems with
pwrite since pwrite doesn't change the write domain, and it avoids
clflushing on UC/WT buffers on LLC platforms unless the buffer is
currently being scanned out.

Fix the problem by marking the cache dirty and adjusting
i915_gem_object_set_cache_level() to clflush when the cache is dirty
even if the cache_level doesn't change.

My last attempt [1] at fixing this via write domain frobbing was shot
down, but now with the cache_dirty flag we can do things in a nicer way.

[1] http://lists.freedesktop.org/archives/intel-gfx/2014-November/055390.html

v2: Drop the I915_CACHE_NONE/WT checks from pwrite

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86422
Testcase: igt/kms_pwrite_crc
Testcase: igt/gem_pwrite_snooped
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: WA for swapped HPD pins in A stepping

WA for BXT A0/A1, where DDIB's HPD pin is swapped to DDIA, so enabling
DDIA HPD pin in place of DDIB.

v2: For DP, irq_port is used to determine the encoder instead of
hpd_pin and removing the edp HPD logic because port A HPD is not
present(Imre)
v3: Rebased on top of Imre's patchset for enabling HPD on PORT A.
Added hpd_pin swapping for intel_dp_init_connector, setting encoder
for PORT_A as per the WA in irq_port (Imre)
v4: Dont enable interrupt for edp, also reframe the description (Siva)
v5: Don’t check for PORT_A in intel_ddi_init to update dig_port,
instead avoid setting hpd_pin itself (Imre)

Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: Add HPD support for DDIA

Also remove redundant comments.

Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Always pass dev pointer in pdp_init

And fix 0-DAY kernel test infrastructure warning.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use complete virtual address range on 32-bit platforms

With the offset length being taken care of in ("drm/i915/gtt: Allow >=
4GB offsets in X86_32"), the code should be finally safe in 32-bit
kernels.

This reverts commit 501fd70fcaebc911b6b96a7b331e6960e5af67e7
Author: Michel Thierry <michel.thierry@intel.com>
Date: Fri May 29 14:15:05 2015 +0100

drm/i915: limit PPGTT size to 2GB in 32-bit platforms

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gtt: Allow >= 4GB offsets in X86_32

Similar to commit c44ef60e437019b8ca1dab8b4d2e8761fd4ce1e9 ("drm/i915/gtt:
Allow >= 4GB sizes for vm"), i915_gem_obj_offset and i915_gem_obj_ggtt_offset
return an unsigned long, which in only 4-bytes long in 32-bit kernels.

Change return type (and other related offset variables) to u64.

Since Global GTT is always limited to 4GB, this change would not be required
in i915_gem_obj_ggtt_offset, but this is done for consistency.

v2: Remove unnecessary offset variable in do_pin, as we already have
vma->node.start (Chris).
Update GGTT offset too (Tvrtko).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Dont -ETIMEDOUT on identical new and previous (count, crc).

By Vesa DP 1.2 spec TEST_CRC_COUNT is a "4 bit wrap counter which
increments each time the TEST_CRC_x_x are updated."

However if we are trying to verify the screen hasn't changed we get
same (count, crc) pair twice. Without this patch we would return
-ETIMEOUT in this case.

So, if in 6 vblanks the pair (count, crc) hasn't changed we
return it anyway instead of returning error and let test case decide
if it was right or not.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Save latest known sink CRC to compensate delayed counter reset.

By Vesa DP 1.2 Spec TEST_CRC_COUNT should be
"reset to 0 when TEST_SINK bit 0 = 0."

However for some strange reason when PSR is enabled in
certain platforms this is not true. At least not immediatelly.

So we face cases like this:

first get_sink_crc operation:
     count: 0, crc: 000000000000
     count: 1, crc: c101c101c101
returned expected crc: c101c101c101

secont get_sink_crc operation:
     count: 1, crc: c101c101c101
     count: 0, crc: 000000000000
     count: 1, crc: 0000c1010000
should return expected crc: 0000c1010000

But also the reset to 0 should be faster resulting into:

get_sink_crc operation:
     count: 1, crc: c101c101c101
     count: 1, crc: 0000c1010000
should return expected crc: 0000c1010000

So in order to know that the second one is valid one
we need to compare the pair (count, crc) with latest (count, crc).

If the pair changed you have your valid CRC.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Force sink crc stop before start.

By Vesa DP spec, test counter at DP_TEST_SINK_MISC just reset to 0
when unsetting DP_TEST_SINK_START, so let's force this stop here.

But let's minimize the aux transactions and just do it when we know
it hasn't been properly stoped.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/userptr: Kill user_size limit check

GTT was only 32b and its max value is 4GB. In order to allow objects
bigger than 4GB in 48b PPGTT, i915_gem_userptr_ioctl we could check
against max 48b range (1ULL << 48).

But since the check no longer applies, just kill the limit.

v2: Use the default ctx to infer the ppgtt max size (Akash).
v3: Just kill the limit, it was only there for early detection of an
error when used for execbuffer (Chris).

Cc: Akash Goel <akash.goel@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: batch_obj vm offset must be u64

Otherwise it can overflow in 48-bit mode, and cause an incorrect
exec_start.

Before commit 5f19e2bffa63a91cd4ac1adcec648e14a44277ce ("drm/i915: Merged
the many do_execbuf() parameters into a structure"), it was already an u64.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: object size needs to be u64

In a 48b world, users can try to allocate buffers bigger than 4GB; in
these cases it is important that size is a 64b variable.

v2: Drop the warning about bind with size 0, it shouldn't happen anyway.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add ppgtt info and debug_dump

v2: Clean up patch after rebases.
v3: gen8_dump_ppgtt for 32b and 48b PPGTT.
v4: Use used_pml4es/pdpes (Akash).
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rely on used_px bits instead of null checking (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Expand error state's address width to 64b

v2: For semaphore errors, object is mapped to GGTT and offset will not
be > 4GB, print only lower 32-bits (Akash)
v3: Print gtt_offset in groups of 32-bit (Chris)

Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Initialize PDPs and PML4

Similar to PDs, while setting up a page directory pointer, make all entries
of the pdp point to the scratch pd before mapping (and make all its entries
point to the scratch page); this is to be safe in case of out of bound
access or proactive prefetch.

Also add a scratch pdp, which the PML4 entries point to.

v2: Handle scratch_pdp allocation failure correctly, and keep
initialize_px functions together (Akash)
v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Rely on
the added macros to initialize the pdps.
v4: Rebase after final merged version of Mika's ppgtt/scratch patches
(and removed commit message part related to v3).
v5: Update commit message to also mention PML4 table initialization and
the new scratch pdp (Akash).

Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add 4 level support in insert_entries and clear_range

When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page Map
Level 4 (PML4), before it selects which Page Directory Pointer (PDP)
it will write to.

Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range.

This patch was inspired by Ben's "Depend exclusively on map and
unmap_vma".

v2: Rebase after s/page_tables/page_table/.
v3: Remove unnecessary pdpe loop in gen8_ppgtt_clear_range_4lvl and use
clamp_pdp in gen8_ppgtt_insert_entries (Akash).
v4: Merge gen8_ppgtt_clear_range_4lvl into gen8_ppgtt_clear_range to
maintain symmetry with gen8_ppgtt_insert_entries (Akash).
v5: Do not mix pages and bytes in insert_entries (Akash).
v6: Prevent overflow in sg_nents << PAGE_SHIFT, when inserting 4GB at
once.
v7: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
Use gen8_px_index functions, and remove unnecessary number of pages
parameter in insert_pte_entries.
v8: Change gen8_ppgtt_clear_pte_range to stop at PDP boundary, instead of
adding and extra clamp function; remove unnecessary pdp_start/pdp_len
variables (Akash).
v9: pages->orig_nents instead of sg_nents(pages->sgl) to get the
length (Akash).
v10: Remove pdp warning check ingen8_ppgtt_insert_pte_entries until this
commit (Akash).

Reviewed-by: Akash Goel <akash.goel@intel.com> (v9)
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Pass sg_iter through pte inserts

As a step towards implementing 4 levels, while not discarding the
existing pte insert functions, we need to pass the sg_iter through.
The current function understands to the page directory granularity.
An object's pages may span the page directory, and so using the iter
directly as we write the PTEs allows the iterator to stay coherent
through a VMA insert operation spanning multiple page table levels.

v2: Rebase after s/page_tables/page_table/.
v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series;
updated commit message (s/map/insert).
v4: Rebase.

Reviewed-by: Akash Goel <akash.goel@intel.com> (v3)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add 4 level switching infrastructure and lrc support

In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains
the base address to PML4, while the other PDP registers are ignored.

In LRC, the addressing mode must be specified in every context
descriptor, and the base address to PML4 is stored in the reg state.

v2: PML4 update in legacy context switch is left for historic reasons,
the preferred mode of operation is with lrc context based submission.
v3: s/gen8_map_page_directory/gen8_setup_page_directory and
s/gen8_map_page_directory_pointer/gen8_setup_page_directory_pointer.
Also, clflush will be needed for bxt. (Akash)
v4: Squashed lrc-specific code and use a macro to set PML4 register.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
PDP update in bb_start is only for legacy 32b mode.
v6: Rebase after final merged version of Mika's ppgtt/scratch
patches.
v7: There is no need to update the pml4 register value in
execlists_update_context. (Akash)
v8: Move pd and pdp setup functions to a previous patch, they do not
belong here. (Akash)
v9: Check USES_FULL_48BIT_PPGTT instead of GEN8_CTX_ADDRESSING_MODE in
gen8_emit_bb_start to check if emit pdps is needed. (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: implement alloc/free for 4lvl

PML4 has no special attributes, and there will always be a PML4.
So simply initialize it at creation, and destroy it at the end.

The code for 4lvl is able to call into the existing 3lvl page table code
to handle all of the lower levels.

v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
compiler happy. And define ret only in one place.
Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.
v3: Use i915_dma_unmap_single instead of pci API. Fix a
couple of incorrect checks when unmapping pdp and pd pages (Akash).
v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list.
v5: Prevent (harmless) out of range access in gen8_for_each_pml4e.
v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error
paths. (Akash)
v7: Rebase, s/gen8_ppgtt_free_*/gen8_ppgtt_cleanup_*/.
v8: Change location of pml4_init/fini. It will make next patches
cleaner.
v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while
trying to reuse as much as possible for pdp alloc. pml4_init/fini
replaced by setup/cleanup_px macros.
v10: Rebase after Mika's merged ppgtt cleanup patch series.
v11: Rebase after final merged version of Mika's ppgtt/scratch
patches.
v12: Fix pdpe start value in trace (Akash)
v13: Define all 4lvl functions in this patch directly, instead of
previous patches, add i915_page_directory_pointer_entry_alloc here,
use test_bit to detect when pdp is already allocated (Akash).
v14: Move pdp allocation into a new gen8_ppgtt_alloc_page_dirpointers
funtion, as we do for pds and pts; move pd and pdp setup functions to
this patch (Akash).
v15: Added kfree(pdp) from previous patch to this (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add PML4 structure

Introduces the Page Map Level 4 (PML4), ie. the new top level structure
of the page tables.

To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.

v2: Remove unnecessary CONFIG_X86_64 checks, ppgtt code is already
32/64-bit safe (Chris).
v3: Add goto free_scratch in temp 48-bit mode init code (Akash).
v4: kfree the pdp until the 4lvl alloc/free patch (Akash).
v5: Postpone 48-bit code in sanitize_enable_ppgtt (Akash).
v6: Keep _insert_pte_entries changes outside this patch (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add dynamic page trace events

The dynamic page allocation patch series added it for GEN6, this patch
adds them for GEN8.

v2: Consolidate pagetable/page_directory events
v3: Multiple rebases.
v4: Rebase after s/page_tables/page_table/.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after gen8_map_pagetable_range removal.
v7: Use generic page name (px) in DECLARE_EVENT_CLASS (Akash)
v8: Defer define of i915_page_directory_pointer_entry_alloc (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT

The insert_entries function was the function used to write PTEs. For the
PPGTT it was "hardcoded" to only understand two level page tables, which
was the case for GEN7. We can reuse this for 4 level page tables, and
remove the concept of insert_entries, which was never viable past 2
level page tables anyway, but it requires a bit of rework to make the
function a bit more generic.

v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v3: Rebase after final merged version of Mika's ppgtt/scratch patches.
v4: Check and warn for NULL value of pdp pointer (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Abstract PDP usage

Up until now, ppgtt->pdp has always been the root of our page tables.
Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs.

In preparation for 4 level page tables, we need to stop using ppgtt->pdp
directly unless we know it's what we want. The future structure will use
ppgtt->pml4 for the top level, and the pdp is just one of the entries
being pointed to by a pml4e. The temporal pdp local variable will be
removed once the rest of the 4-level code lands.

Also, start passing the vm pointer to the alloc functions, instead of
ppgtt.

v2: Updated after dynamic page allocation changes.
v3: Rebase after s/page_tables/page_table/.
v4: Rebase after changes in "Dynamic page table allocations" patch.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after final merged version of Mika's ppgtt/scratch patches.
v7: Keep pagetable map in-line (and avoid unnecessary for_each_pde
loops), remove redundant ppgtt pointer in _alloc_pagetabs (Akash)
v8: Fix text indentation in _alloc_pagetabs/page_directories (Chris)
v9: Defer gen8_alloc_va_range_4lvl definition until 4lvl is implemented,
clean-up gen8_ppgtt_cleanup [pun intended] (Akash).
v10: Clean-up commit message (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: "Akash Goel" <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Make pdp allocation more dynamic

This transitional patch doesn't do much for the existing code. However,
it should make upcoming patches to use the full 48b address space a bit
easier.

32-bit ppgtt uses just 4 PDPs, while 48-bit ppgtt will have up-to 512;
this patch prepares the existing functions to query the right number of pdps
at run-time. This also means that used_pdpes should also be allocated during
ppgtt_init, as the bitmap size will depend on the ppgtt address range
selected.

v2: Renamed pdp_free to be similar to pd/pt (unmap_and_free_pdp).
v3: To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.
v4: Rebase after s/page_tables/page_table/, added extra information
about 4-level page table formats and use IS_ENABLED macro.
v5: Check CONFIG_X86_64 instead of CONFIG_64BIT.
v6: Rebase after Mika's ppgtt cleanup / scratch merge patch series, and
follow
his nomenclature in pdp functions (there is no alloc_pdp yet).
v7: Rebase after merged version of Mika's ppgtt cleanup patch series.
v8: Rebase after final merged version of Mika's ppgtt/scratch patches.
v9: Introduce PML4 (and 48-bit checks) until next patch (Akash).
v10: Also use test_bit to detect when pd/pt are already allocated (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
[danvet: Amend commit message as suggested by Michel.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove unnecessary gen8_clamp_pd

gen8_clamp_pd clamps to the next page directory boundary, but the macro
gen8_for_each_pde already has a check to stop at the page directory
boundary.

Furthermore, i915_pte_count also restricts to the next page table
boundary.

v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.

Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: "Akash Goel" <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Per-DDI I_boost override

An OEM may request increased I_boost beyond the recommended values
by specifying an I_boost value to be applied to all swing entries for
a port. These override values are specified in VBT.

v2: rebase and remove unused iboost_bit variable

Issue: VIZ-5676
Signed-off-by: Antti Koskipaa <antti.koskipaa@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Merge tag 'drm-intel-fixes-2015-08-14' into drm-intel-next-fixes

Backmerge drm-intel-fixes because a bunch of atomic patch backporting
we had to do lead to horrible conflicts.

Conflicts:
drivers/gpu/drm/drm_crtc.c
Just a bit of context conflict between -next and -fixes.
drivers/gpu/drm/i915/intel_atomic.c
drivers/gpu/drm/i915/intel_display.c
Atomic conflicts, always pick the code from -next.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915/skl: WaIgnoreDDIAStrap is forever, always init DDI A

There is currently conflicting documentation on which steppings the
workaround is needed, up to C vs. forever. However there is post-C
stepping hardware that doesn't report port presence on DDI A, leading to
black screen on eDP. Assume the strap isn't connected, and try to enable
DDI A on these machines. (We'll still check the VBT for the info in DDI
init.)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fix checksum write for automated test reply

DP spec requires the checksum of the last block read to be written
when replying to TEST_EDID_READ. This patch fixes the current code
to do the same.

v2: removed loop for jumping blocks and performed direct addition
as recommended by Daniel

Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Contain the WA_REG macro

Prevent leaking the if scoping by containing the WA_REG
macro inside its own scope.

Reported-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
[danvet: Appease checkpatch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove the failed context from the fpriv->context_idr

If we encounter an allocation failure during ppggt creation (trivial
even with 16Gib+ RAM!), we need to remove the dead context from the
fpriv->context_idr along with the references.

gem_exec_ctx: page allocation failure: order:0, mode:0x8004
CPU: 3 PID: 27272 Comm: gem_exec_ctx Tainted: G W 4.2.0-rc5+ #37
0000000000000000 ffff880086ff7a78 ffffffff816b947a ffff88041ed90038
0000000000008004 ffff880086ff7b08 ffffffff8114b1a5 ffff880086ff7ac8
ffffffff8108d848 0000000000000000 ffffffff81ce84b8 0000000000000000
Call Trace:
[<ffffffff816b947a>] dump_stack+0x45/0x57
[<ffffffff8114b1a5>] warn_alloc_failed+0xd5/0x120
[<ffffffff8108d848>] ? __wake_up+0x48/0x60
[<ffffffff8114e0ed>] __alloc_pages_nodemask+0x73d/0x8e0
[<ffffffffc0472238>] ? i915_gem_execbuffer2+0x148/0x240 [i915]
[<ffffffffc0474240>] __setup_page_dma+0x30/0x110 [i915]
[<ffffffffc0477f61>] gen8_ppgtt_init+0x31/0x2f0 [i915]
[<ffffffffc04785e0>] i915_ppgtt_init+0x30/0x80 [i915]
[<ffffffffc0478928>] i915_ppgtt_create+0x48/0xc0 [i915]
[<ffffffffc046c9c2>] i915_gem_create_context+0x1c2/0x390 [i915]
[<ffffffffc046d9cb>] i915_gem_context_create_ioctl+0x5b/0xa0 [i915]

leading to an oops in i915_gem_context_close. Also note that this
benchmark should not be running out of memory in the first place...

Testcase: igt/benchmark/gem_exec_ctx -b create # ppgtt >= 2
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Report IOMMU enabled status for GPU hangs

The IOMMU for Intel graphics has historically had many issues resulting
in random GPU hangs. Lets include its status when capturing the GPU hang
error state for post-mortem analysis.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Check idle to active before processing CSQ

If idle to active bit is set, the rest of the fields
in CSQ are not valid.

Bail out early if this is the case in order to prevent
rest of the loop inspecting stale values.

This was found by Bspec/code inspection. Doesn't seem to fix any of
the known issues.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
[danvet: Add note about how this was found.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>