git.karo-electronics.de Git - linux-beck.git/log

Merge tag 'drm-intel-next-2013-02-01' of git://people.freedesktop.org/~danvet/drm-intel into drm-next

Daniel writes:
"Probably the last feature pull for 3.9, there's some fixes outstanding
thought that I'd like to sneak in. And maybe 3.8 takes a bit longer ...
Anyway, highlights of this pull:
- Kill the horrible IS_DISPLAYREG hack to handle the mmio offset movements
  on vlv, big thanks to Ville.
- Dynamic power well support for Haswell, shaves away a bit when only
  using the eDP port on pipe A (Paulo). Plus unclaimed register fixes
  uncovered by this.
- Clarifications of the gpu hang/reset state transitions, hopefully fixing
  a few spurious -EIO deaths in userspace.
- Haswell ELD fixes.
- Some more (pp)gtt cleanups from Ben.
- A few smaller things all over.

Plus all the stuff from the previous rather small pull request:
- Broadcast RBG improvements and reduced color range fixes from Ville.
- Ben is on a "kill legacy gtt code for good" spree, first pile of patches
  included.
- No-relocs and bo lut improvements for faster execbuf from Chris.
- Some refactorings from Imre."

* tag 'drm-intel-next-2013-02-01' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits)
  GPU/i915: Fix acpi_bus_get_device() check in drivers/gpu/drm/i915/intel_opregion.c
  drm/i915: Set the SR01 "screen off" bit in i915_redisable_vga() too
  drm/i915: Kill IS_DISPLAYREG()
  drm/i915: Introduce i915_vgacntrl_reg()
  drm/i915: gen6_gmch_remove can be static
  drm/i915: dynamic Haswell display power well support
  drm/i915: check the power down well on assert_pipe()
  drm/i915: don't send DP "idle" pattern before "normal" on HSW PORT_A
  drm/i915: don't run hsw power well code on !hsw
  drm/i915: kill cargo-culted locking from power well code
  drm/i915: Only run idle processing from i915_gem_retire_requests_worker
  drm/i915: Fix CAGF for HSW
  drm/i915: Reclaim GTT space for failed PPGTT
  drm/i915: remove intel_gtt structure
  drm/i915: Add probe and remove to the gtt ops
  drm/i915: extract hw ppgtt setup/cleanup code
  drm/i915: pte_encode is gen6+
  drm/i915: vfuncs for ppgtt
  drm/i915: vfuncs for gtt_clear_range/insert_entries
  drm/i915: Error state should print /sys/kernel/debug
  ...

GPU/i915: Fix acpi_bus_get_device() check in drivers/gpu/drm/i915/intel_opregion.c

acpi_bus_get_device() returns int not acpi_status.

The patch change not to apply ACPI_FAILURE() to the return value of
acpi_bus_get_device().

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Set the SR01 "screen off" bit in i915_redisable_vga() too

From BSpec / SR01 - Clocking Mode:
"The following sequence must be used when disabling the VGA plane.
Write SR01 to set bit 5 = 1 to disable video output.
Wait for 100us.
Disable the VGA plane via Bit 31 of the MMIO VGA control."

So simply call i915_disable_vga() from i915_redisable_vga().

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Kill IS_DISPLAYREG()

All display registers should now include the proper offset on VLV.
That means IS_DISPLAYREG() is now useless, and we can eliminate it.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Introduce i915_vgacntrl_reg()

The VGACNTRL register has moved around between different platforms.
To handle the differences add i915_vgacntrl_reg() which returns the
correct offset for the VGACNTRL register.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: gen6_gmch_remove can be static

Signed-off-by: Changlong Xie <changlongx.xie@intel.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: dynamic Haswell display power well support

We can disable (almost) all the display hw if we only use pipe A, with
the integrated edp transcoder on port A. Because we don't set the cpu
transcoder that early (yet), we need to help us with a trick to simply
check for any edp encoders.

v2: Paulo Zanoni pointed out that we also need to configure the eDP
cpu transcoder correctly.

v3: Made by Paulo Zanoni
  - Rebase patch to be on top of "fix intel_init_power_wells" patch
  - Fix typos
  - Fix a small bug by adding a "connectors_active" check
  - Restore the initial code that unconditionally enables the power
    well when taking over from the BIOS

v4: Made by Paulo Zanoni
  - One more typo spotted by Jani Nikula

v5: Made by Paulo Zanoni
  - Rebase

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: check the power down well on assert_pipe()

If the power well is disabled, we should not try to read its
registers, otherwise we'll get "unclaimed register" messages.

V2: Don't check whether the power well is enabled or not, just check
whether we asked it to be enabled or not: if we asked to disable the
power well, don't use the registers on it, even if it's still enabled.

V3: Fix bug that breaks all non-Haswell machines.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: don't send DP "idle" pattern before "normal" on HSW PORT_A

The DP_TP_STATUS register for PORT_A doesn't exist. Our documentation
will be fixed soon, so the code does not match it for now.

This solves "Timed out waiting for DP idle patterns" and "unclaimed
register" messages on eDP.

V1: Was called "drm/i915: don't read DP_TP_STATUS(PORT_A)"
V2: Was called "drm/i915: don't send DP idle pattern before normal
pattern on HSW"
V3: Only change the code that touches PORT_A.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: don't run hsw power well code on !hsw

Dumps annoying noise into the dmesg:

[drm:intel_set_power_well] *ERROR* Timeout enabling power well

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: kill cargo-culted locking from power well code

We may not concurrently change the power wells code. Which
is already guaranteed since modesets aren't concurrent. That
leaves races against setup/teardown/suspend/resume, and for
those we already (try) rather hard not to hit concurrent
modesets.

No debug WARN_ON added since that would require us to grab the
modeset locks in init/suspend code. Which is again just cargo
culting since just grabbing the locks in those paths isn't good
enough, we need the right order of operations, too.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Only run idle processing from i915_gem_retire_requests_worker

When adding the fb idle detection to mark-inactive, it was forgotten
that userspace can drive the processing of retire-requests. We assumed
that it would be principally driven by the retire requests worker,
running once every second whilst active and so we would get the deferred
timer for free. Instead we spend too many CPU cycles reclocking the LVDS
preventing real work from being done.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-and-tested-by: Alexander Lam <lambchop468@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58843
Cc: stable@vger.kernel.org
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix CAGF for HSW

The shift changed, hurray.

Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Paulo Zanoni <przanoni@gmail.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Reclaim GTT space for failed PPGTT

When the PPGTT init fails, we may as well reuse the space that we were
reserving for the PPGTT PDEs.

This also fixes an extraneous mutex_unlock.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: remove intel_gtt structure

With the probe call in our dispatch table, we can now cut away the
last three remaining members in the intel_gtt shared struct and so
remove it completely.

v2: Rebased on top of Daniel's series

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: bikeshed commit message a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add probe and remove to the gtt ops

The idea, and much of the code came originally from:

commit 0712f0249c3148d8cf42a3703403c278590d4de5
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Jan 18 17:23:16 2013 -0800

drm/i915: Create a vtable for i915 gtt

Daniel didn't like the color of that patch series, and so I asked him to
start something which appealed to his sense of color. The preceding
patches are those, and now this is going on top of that.

[extracted from the original commit message]

One immediately obvious thing to implement is our gmch probing. The init
function was getting massively bloated. Fundamentally, all that's needed
from GMCH probing is the GTT size, and the stolen size. It makes design
sense to put the mappable calculation in there as well, but the code
turns out a bit nicer without it (IMO)

The intel_gtt bridge thing is still here, but the subsequent patches
will finish ripping that out.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Bikeshedded one comment (GMADR is just the PCI aperture, we
use it for other things than just accessing tiled surfaces through a
linear view) and cut the newly added long lines a bit. Also one
checkpatch error.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: extract hw ppgtt setup/cleanup code

At the moment only cosmetics, but being able to initialize/cleanup
arbitrary ppgtt address spaces paves the way to have more than one of
them ... Just in case we ever get around to implementing real
per-process address spaces. Note that in that case another vfunc for
ppgtt would be beneficial though. But that can wait until the code
grows a second place which initializes ppgtts.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: pte_encode is gen6+

All the other gen6+ hw code has the gen6_ prefix, so be consistent
about it.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vfuncs for ppgtt

Like for the global gtt we want a notch more flexibility here. Only
big change (besides a few tiny function parameter adjustments) was to
move gen6_ppgtt_insert_entries up (and remove _sg_ from its name, we
only have one kind of insert_entries since the last gtt cleanup).

We could also extract the platform ppgtt setup/teardown code a bit
better, but I don't care that much.

With this we have the hw details of pte writing nicely hidden away
behind a bit of abstraction. Which should pave the way for
different/multiple ppgtts (e.g. what we need for real ppgtt support).

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vfuncs for gtt_clear_range/insert_entries

We have a few too many differences here, so finally take the prepared
abstraction and run with it. A few smaller changes are required to get
things into shape:

- move i915_cache_level up since we need it in the gt funcs
- split up i915_ggtt_clear_range and move the two functions down to
where the relevant insert_entries functions are
- adjustments to a few function parameter lists

Now we have 2 functions which deal with the gen6+ global gtt
(gen6_ggtt_ prefix) and 2 functions which deal with the legacy gtt
code in the intel-gtt.c fake agp driver (i915_ggtt_ prefix).

Init is still a bit a mess, but honestly I don't care about that.

One thing I've thought about while deciding on the exact interfaces is
a flag parameter for ->clear_range: We could use that to decide
between writing invalid pte entries or scratch pte entries. In case we
ever get around to fixing all our bugs which currently prevent us from
filling the gtt with empty ptes for the truly unused ranges ...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwidawsk: Moved functions to the gtt struct]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Error state should print /sys/kernel/debug

/sys/kernel/debug has more or less been the standard location of debugfs
for several years now. Other parts of DRM already use this location, so
we should as well.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: split up long line.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: move DP save/restore into i915_ums.c

Note that this slightly changes the order, but we only move it within
the block of registers that restore encoder state. Specifically LVDS
is now restored after DP, whereas previously it was done before.

Legacy vga is still restored afterwards, which seems to be the
important thing (if there's anything important in this restore
ordering at all).

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: dont save/restore VGA state for kms

The only thing we really care about that it is off. To do so, reuse
the recently created i915_redisable_vga function, which is already
used to put obnoxious firmware into check on lid reopening.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: extract ums suspend/resume into i915_ums.c

Similarly to how i915_dma.c is shaping up to be the dungeon hole for
all things supporting dri1, create a new one to hide all the crazy
things which are only really useful for ums support. Biggest part is
the register suspend/resume support.

Unfortunately a lot of it is still intermingled with bits and pieces
we might still need, so needs more analysis and needs to stay in
i915_suspend.c for now.

Reviewed-by: Imre Deak <imre.deak@intel.com>
v2: s/modeset_reg/display_reg/ as suggested by Imre, to avoid
confusion between the kernel modeset code and display save/restore to
support ums.

v3: Fixup alphabetical order in the Makefile, spotted by Chris Wilson.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: move modeset checks out of save/restore_modeset_reg

That way the control flow is clearer, and it prepares the stage
to extract these ums functions and hide them somewhere.

There's still tons of display stuff outside of these, but that
requires more work.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Implement WaVSRefCountFullforceMissDisable

Implements WaVSRefCountFullforceMissDisable as documented in the BSpec
3D workarounds chapter.

Cc: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: turn on the power well before suspending

Our suspend code touches a lot of registers all over the place, so we
need to enable the power well before suspending.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Fixup compilation by stealing the header decl from the
dynamic power wells patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: set TRANSCODER_EDP even earlier

Instead of setting it at the beginning of haswell_crtc_mode_set, let's
set it at the beginning of intel_crtc_mode_set. When
intel_crt_mode_set calls drm_vblank_pre_modeset we already need to
have the transcoder_edp correctly set, because eventually
drm_vblank_pre_modeset calls functions that call i915_pipe_enabled
from i915_irq.c, which will read PIPECONF(cpu_transcoder).

This is a bug that affects us since we added support for
TRANSCODER_EDP, but I was only able to see the problem after
suspending a machine with the power well disabled (got an "unclaimed
register" error.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fixup per-crtc locking in intel_release_load_detect_pipe

One of the early return cases missed the mutex unlocking. Hilarity
ensued.

This regression has been introduced in

commit 7b24056be6db7ce907baffdd4cf142ab774ea60c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Dec 12 00:35:33 2012 +0100

drm: don't hold crtc mutexes for connector ->detect callbacks

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59750
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Cancan Feng <cancan.feng@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/i915: only disable enabled planes on intel_fb_restore_mode

We should avoid touching registers that are on the power down well
when we don't need to, because if we touch these registers when the
power well is disabled we'll get tons of "unclaimed register"
messages. This commit fixes some of these messages.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fix intel_init_power_wells

The current code was wrong in many different ways, so this is a full
rewrite. We don't have "different power wells for different parts of
the GPU", we have a single power well, but we have multiple registers
that can be used to request enabling/disabling the power well. So
let's be a good citizen and only use the register we're suppose to
use, except when we're loading the driver, where we clear the request
made by the BIOS.

If any of the registers is requesting the power well to be enabled, it
will be enabled. If none of the registers is requesting the power well
to be enabled, it will be disabled.

For now we're just forcing the power well to be enabled, but in the
next commits we'll change this.

V2:
  - Remove debug messages that could be misleading due to possible
    race conditions with KVMr, Debug and BIOS.
  - Don't wait on disabling: after a conversaion with a hardware
    engineer we discovered that the "restriction" on bit 31 is just
    for the "enable" case, and we don't even need to wait on the
    "disable" case.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: SWF screatch registers need an offset on VLV

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Include display_mmio_offset in sequencer index/data registers

SR01 needs to be touched to disable VGA on non-UMS setups too.
So the sequencer registers need to include the appripriate offset
on VLV.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Pass VLV_DISPLAY_BASE + reg to intel_{hdmi, dp}_init on VLV

When passing the DP/HDMI/SDVO registers to the encoder init functions,
include the VLV specific offset in the value.

v2: Resolved conflicts w/ VLV SDVO elimination

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: VLV doesn't have SDVO

Don't call intel_sdvo_init() for VLV.

Preserve the same behaviour as when intel_sdvo_init() would
have returned false.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Always use adpa_reg

Instead of using ADPA/VLV_ADPA/PCH_ADPA in various parts of
intel_crt code, just use adpa_reg which always contains the
correct value for the platform.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: PLL registers need an offset on VLV

v2: Dropped the clock gating registers

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Set display_mmio_offset for VLV

This will cause display registers to include the correct
offset on VLV.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: GPIO/GMBUS registers need an offset on VLV

GPIO/GMBUS registers must be offset on VLV, so simply
adjust gpio_mmio_base to include the correct offset.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: DPIO registers are VLV only and need an offset

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Spell out VLV_DISPLAY_BASE for interrupt registers

Instead of 0x18xxxx use (VLV_DISPLAY_BASE + xxxx).

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make VLV_GUNIT_CLOCK_GATE register value more readable

Instead of 0x18xxxx use (VLV_DISPLAY_BASE + xxxx).

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: FB_BLC_SELF_VLV is VLV only and needs an offset

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Pipe palette registers need an offset on VLV

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Pipe timing registers need an offset on VLV

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: PORT_HOTPLUG registers need an offset on VLV

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Panel fitter registers need an offset on VLV

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: DPFLIPSTAT and DPINVGTT registers are VLV only and need an offset

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: DSPFW registers need an offset on VLV

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: VLV_DDL is VLV only and needs an offset

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Cursor registers need an offset on VLV

CURSIZE is not present on VLV, so it was left out, as were the IVB
specific cursor B registers.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Pipe registers need an offset on VLV

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Primary plane registers need an offset on VLV

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: PIPE M/N registers need an offset on VLV

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: VLV_VIDEO_DIP_CTL is for VLV only

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Per-pipe PP registers are for VLV only

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: AUD_VID_DID needs an offset on VLV

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add display_display_mmio_offset to intel_device_info

Add an optional offset to intel_device_info, which will added
to most display register offsets.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert intel_dp to enum port

Use intel_dig_port->port rather than intel_dp->output_reg.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert intel_hdmi to enum port

Use intel_dig_port->port rather than intel_hdmi->sdvox_erg.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: don't save/restore DSPARB on gen5+

Because the register does not exist in gen5+.

This patch solves "unclaimed register" messages on Haswell after
suspend/resume.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fixup sbi_read/write locking

commit 09153000b8ca32a539a1207edebabd0d40b6c61b
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Dec 12 14:06:44 2012 +0100

    drm/i915: rework locking for intel_dpio|sbi_read|write

reworked the locking around sbi_read/write functions for 3.8-fixes.
But

commit dde86e2db54545ef981b64805097a7b4c3156d6e
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date:   Sat Dec 1 12:04:25 2012 -0200

    drm/i915: add lpt_init_pch_refcl

Added new use-cases in the -next tree which has not been updated in
the merge. Fix it up.

Reported-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: HDMI/DP - ELD info refresh support for Haswell

ELD info should be updated dynamically according to hot plug event.
For haswell chip, clear/set the eld valid bit and output enable bit
from callback intel_disable/eanble_ddi().

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Wang Xingchao <xingchao.wang@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: use gem_set_seqno() on hardware init

When machine was rebooted or module was reloaded,
gem_hw_init() set last_seqno to be identical to next_seqno.
This lead to situation that waits for first ever request
always passed immediately regardless if it was actually
executed.

Use gem_set_seqno() to be consistent how hw is
initialized on init, wrap and on resume.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: add quirk to invert brightness on Packard Bell NCL20

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44156
Reported-by: Alan Zimmerman <alan.zimm@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: add quirk to invert brightness on eMachines e725

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=31522#c35
[Note: There are more than one broken setups in the bug. This fixes one.]
Reported-by: Martins <andrissr@inbox.lv>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: add quirk to invert brightness on eMachines G725

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59628
Reported-by: Roland Gruber <post@rolandgruber.de>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: clarify concurrent hang detect/gpu reset consistency

Damien Lespiau wondered how race the gpu reset/hang detection code is
against concurrent gpu resets/hang detections or combinations thereof.
Luckily the single work item is guranteed to never run concurrently,
so reset handling is already single-threaded.

Hence we only have to worry about concurrent hang detections, or a
hang detection firing off while we're still processing an older gpu
reset request. Due to the new mechanism of setting the reset in
progress flag and the ordering guaranteed by the schedule_work
function there's nothing to do but add a comment explaining why we're
safe.

The only thing I've noticed is that we still try to reset the gpu now,
even when it is declared terminally wedged. Add a check for that to
avoid continous warnings about failed resets, in case the hangcheck
timer ever gets stuck.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: create a race-free reset detection

With the previous patch the state transition handling of the reset
code itself is now (hopefully) race free and solid. But that still
leaves out everyone else - with the various lock-free wait paths
we have there's the possibility that the reset happens between the
point where we read the seqno we should wait on and the actual wait.

And if __wait_seqno then never sees the RESET_IN_PROGRESS state, we'll
happily wait for a seqno which will in all likelyhood never signal.

In practice this is not a big problem since the X server gets
constantly interrupted, and can then submit more work (hopefully) to
unblock everyone else: As soon as a new seqno write lands, all waiters
will unblock. But running the i-g-t reset testcase ZZ_hangman can
expose this race, especially on slower hw with fewer cpu cores.

Now looking forward to ARB_robustness and friends that's not the best
possible behaviour, hence this patch adds a reset_counter to be able
to detect any reset, even if a given thread never observed the
in-progress state.

The important part is to correctly order things:
- The write side needs to increment the counter after any seqno gets
  reset.  Hence we need to do that at the end of the reset work, and
  again wake everyone up. We also need to place a barrier in between
  any possible seqno changes and the counter increment, since any
  unlock operations only guarantee that nothing leaks out, but not
  that at later load operation gets moved ahead.
- On the read side we need to ensure that no reset can sneak in and
  invalidate the seqno. In all cases we can use the one-sided barrier
  that unlock operations guarantee (of the lock protecting the
  respective seqno/ring pair) to ensure correct ordering. Hence it is
  sufficient to place the atomic read before the mutex/spin_unlock and
  no additional barriers are required.

The end-result of all this is that we need to wake up everyone twice
in a reset operation:
- First, before the reset starts, to get any lockholders of the locks,
  so that the reset can proceed.
- Second, after the reset is completed, to allow waiters to properly
  and reliably detect the reset condition and bail out.

I admit that this entire reset_counter thing smells a bit like
overkill, but I think it's justified since it makes it really explicit
what the bail-out condition is. And we need a reset counter anyway to
implement ARB_robustness, and imo with finer-grained locking on the
horizont this is the most resilient scheme I could think of.

v2: Drop spurious change in the wait_for_error EXIT_COND - we only
need to wait until we leave the reset-in-progress wedged state.

v3: Don't play tricks with barriers in the throttle ioctl, the
spin_unlock is barrier enough.

I've also considered using a little helper to grab the current
reset_counter, but then decided that hiding the atomic_read isn't a
great idea, since having it explicitly show up in the code is a nice
remainder to reviews to check the memory barriers.

v4: Add a comment to explain why we need to fall through in
__wait_seqno in the end variable assignments.

v5: Review from Damien:
- s/smb/smp/ in a comment
- don't increment the reset counter after we've set it to WEDGED. Now
  we (again) properly wedge the gpu when the reset fails.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Merge branch 'drm-kms-locking' of git://people.freedesktop.org/~danvet/drm-intel into drm-next

The aim of this locking rework is that ioctls which a compositor should be
might call for every frame (set_cursor, page_flip, addfb, rmfb and
getfb/create_handle) should not be able to block on kms background
activities like output detection. And since each EDID read takes about
25ms (in the best case), that always means we'll drop at least one frame.

The solution is to add per-crtc locking for these ioctls, and restrict
background activities to only use the global lock. Change-the-world type
of events (modeset, dpms, ...) need to grab all locks.

Two tricky parts arose in the conversion:
- A lot of current code assumes that a kms fb object can't disappear while
  holding the global lock, since the current code serializes fb
  destruction with it. Hence proper lifetime management using the already
  created refcounting for fbs need to be instantiated for all ioctls and
  interfaces/users.

- The rmfb ioctl removes the to-be-deleted fb from all active users. But
  unconditionally taking the global kms lock to do so introduces an
  unacceptable potential stall point. And obviously changing the userspace
  abi isn't on the table, either. Hence this conversion opportunistically
  checks whether the rmfb ioctl holds the very last reference, which
  guarantees that the fb isn't in active use on any crtc or plane (thanks
  to the conversion to the new lifetime rules using proper refcounting).
  Only if this is not the case will the code go through the slowpath and
  grab all modeset locks. Sane compositors will never hit this path and so
  avoid the stall, but userspace relying on these semantics will also not
  break.

All these cases are exercised by the newly added subtests for the i-g-t
kms_flip, tested on a machine where a full detect cycle takes around 100
ms.  It works, and no frames are dropped any more with these patches
applied.  kms_flip also contains a special case to exercise the
above-describe rmfb slowpath.

* 'drm-kms-locking' of git://people.freedesktop.org/~danvet/drm-intel: (335 commits)
  drm/fb_helper: check whether fbcon is bound
  drm/doc: updates for new framebuffer lifetime rules
  drm: don't hold crtc mutexes for connector ->detect callbacks
  drm: only grab the crtc lock for pageflips
  drm: optimize drm_framebuffer_remove
  drm/vmwgfx: add proper framebuffer refcounting
  drm/i915: dump refcount into framebuffer debugfs file
  drm: refcounting for crtc framebuffers
  drm: refcounting for sprite framebuffers
  drm: fb refcounting for dirtyfb_ioctl
  drm: don't take modeset locks in getfb ioctl
  drm: push modeset_lock_all into ->fb_create driver callbacks
  drm: nest modeset locks within fpriv->fbs_lock
  drm: reference framebuffers which are on the idr
  drm: revamp framebuffer cleanup interfaces
  drm: create drm_framebuffer_lookup
  drm: revamp locking around fb creation/destruction
  drm: only take the crtc lock for ->cursor_move
  drm: only take the crtc lock for ->cursor_set
  drm: add per-crtc locks
  ...

drm/fb_helper: check whether fbcon is bound

We need to make sure that the fbcon is still bound when touching the
hw, since otherwise we might corrupt the modeset state of kms clients.
X mostly works around that with VT switching and setting the VT into
raw mode, which disables most fbcon events.

Raw kms test programs though don't do that dance, and in the future
we might want to aim to abolish CONFIG_VT anyway. So improve preventive
measures a bit. To do so, extract the existing logic for handling hotplug
events (which X can't block with the current set of tricks) and reuse
it for the fbdev blanking helper.

Long-term we really need to either scrap this all and only have a OOPS
console, or come up with a saner model for device ownership sharing
between fbdev/fbcon and kms userspace.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/doc: updates for new framebuffer lifetime rules

Now that framebuffer are reference-counted for all use-sites, update
the documentation accordingly to stress the new rules for
initialization and teardown.

Also add a short paragraph about the implications for drivers of the
new locking rules.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: don't hold crtc mutexes for connector ->detect callbacks

The coup de grace of the entire journey. No more dropped frames every
10s on my testbox!

I've tried to audit all ->detect and ->get_modes callbacks, but things
became a bit fuzzy after trying to piece together the umpteenth
implemenation. Afaict most drivers just have bog-standard output
register frobbing with a notch of i2c edid reading, nothing which
could potentially race with the newly concurrent pageflip/set_cursor
code. The big exception is load-detection code which requires a
running pipe, but radeon/nouveau seem to to this without touching any
state which can be observed from page_flip (e.g. disabled crtcs
temporarily getting enabled and so a pageflip succeeding).

The only special case I could find is the i915 load detect code. That
uses the normal modeset interface to enable the load-detect crtc, and
so userspace could try to squeeze in a pageflip on the load-detect
pipe. So we need to grab the relevant crtc mutex in there, to avoid
the temporary crtc enabling to sneak out and be visible to userspace.

Note that the sysfs files already stopped grabbing the per-crtc locks,
since I didn't want to bother with doing a interruptible
modeset_lock_all. But since there's very little in-between breakage
(essentially just the ability for userspace to pageflip on load-detect
crtcs when it shouldn't on the i915 driver) I figured I don't need to
bother.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: only grab the crtc lock for pageflips

The pagelip ioctl itself is rather simply, so the hard work for this
patch is auditing all the drivers:

- exynos: Pageflip is protect with dev->struct_mutex and ...
  synchronous. But nothing fancy going on, besides a check whether the
  crtc is enabled, which should probably be somewhere in the drm core
  so that we have unified behaviour across all drivers.

- i915: hw-state is protected with dev->struct_mutex, the delayed
  unpin work together with the other stuff the pageflip complete irq
  handler needs is protected by the event_lock spinlock.

- nouveau: With the pin/unpin functions fixed, everything looks safe:
  A bit of ttm wrestling and refcounting, and a few channel accesses.
  The later are either already proteced sufficiently, or are now safe
  with the channel locking introduced to make cursor updates safe.

- radeon: The irq_get/put functions look a bit race, since the
  atomic_inc/dec isn't protect with locks. Otoh they're all per-crtc,
  so we should be safe with per-crtc locking from the drm core. Then
  there's tons of per-crtc register access, which could potentially go
  through the indirect reg acces. But that's fixed to make cursor
  updates concurrent. Bookeeping for the drm even is also protected
  with the even_lock, which also protects against the pageflip irq
  handler since radeon hw seems to have no way to queue these up
  asynchronously. Otherwise just a bit of ttm-based buffer handling
  and fencing, which is now safe with the previous patch to hold
  bdev->fence_lock while grabbing the ttm fence.

- shmob: Only one crtc. That's an easy one ...

- vmwgfx: As usual a bit special with tons different things:
  - Flippable check using is_implicit and num_implicit. Changes to
    those seem to be nicely covered with the global modeset lock, so
    we should be fine.
  - Some dirty cliprect handling stuff, or at least that is my guess.
    Looks like it's fine since either it's per-crtc, invariant or
    (like the execbuf stuff launched) protected otherwise.
  - Adding the actual flip to the fence_event list. On a quick look
    this seems to have solid locking in place, too.
  ... but generally this is all way over my head.

- imx: Impressive display of races between the page_flip
  implementation and the irq handler. Also, ipu_drm_set_base which
  gets eventually called from the irq handler to update the display
  base isn't really protected against concurrent set_config calls from
  process context.  In any case, going for per-crtc locking won't make
  this worse, so nothing to do.

- omap: The new async callback code merged into 3.8 seems to have
  solid locking in place, and there doesn't seem to be any shared
  state at risk. Especially since the callbacks still use
  modeset_lock_all and are so not converted.

v2: Update omapdrm analysis to 3.8 code per the discussion with Rob
Clark.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: optimize drm_framebuffer_remove

Now that all framebuffer usage is properly refcounted, we are no
longer required to hold the modeset locks while dropping the last
reference. Hence implemented a fastpath which avoids the potential
stalls associated with grabbing mode_config.lock for the case where
there's no other reference around.

Explain in a big comment why it is safe. Also update kerneldocs with
the new locking rules around drm_framebuffer_remove.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/vmwgfx: add proper framebuffer refcounting

Afact vmwgfx already has all the right refcounting implemented on the
backing storage, and we only need to ensure that the drm fb doesn't
disappear untimely. So holding onto the fb reference from _lookup
until vmw_kms_present has completed should be enough.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: dump refcount into framebuffer debugfs file

Useful for checking whether the new refcounting works as advertised.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: refcounting for crtc framebuffers

With the prep patch to encapsulate ->set_crtc calls, this is now
rather easy. Hooray for inconsistent semantics between ->set_crtc and
->page_flip, where the driver callback is supposed to update the fb
pointer, and ->update_plane, where the drm core does the same.

Also, since the drm core functions check crtc->fb before calling into
driver callbacks, we can't really reduce the critical sections
protected by the mode_config locks.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: refcounting for sprite framebuffers

Now plane->fb holds a reference onto it's framebuffer. Nothing too
fancy going on here:
- Extract __drm_framebuffer_unreference to be called when we know
  we're not dropping the last reference, e.g. useful in the fb cleanup
  code.
- Reduce the locked sections in the set_plane ioctl to only protect
  plane->fb/plane->crtc and the driver callback (i.e. hw state).
  Everything either doesn't disappear (crtc, plane) or is refcounted
  (fb), and all the data we check is invariant over the respective
  object's lifetimes.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: fb refcounting for dirtyfb_ioctl

We only need to ensure that the fb stays around for long enough. While
at it, only grab the modeset locks when we need them (since most
drivers don't implement the dirty callback, this should help jitter
and stalls when using the generic modeset driver).

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: don't take modeset locks in getfb ioctl

We only need to push the fb unreference a bit down. While at it,
properly pass the return value from ->create_handle back to userspace.

Most drivers either return -ENODEV if they don't have a concept of
buffer objects (ast, cirrus, ...) or just install a handle for the
underlying gem object (which is ok since we hold a reference on that
through the framebuffer).

v2: Split out the ->create_handle rework in the individual drivers.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: push modeset_lock_all into ->fb_create driver callbacks

And drop it where it's not needed. Most driver just lookup the gem
object, allocate an fb struct, fill in all the useful fields and then
register it with drm_framebuffer_init.

All of these operations are already separately locked, and since we
only put the fb into the fpriv->fbs list _after_ having called
->fb_create, we can't also race with rmfb. We can otoh race with other
ioctls that put the framebuffer to use, but all drivers have been
reorganized already to call drm_framebuffer_init last in the fb
creation sequence.

So essentially, we can completely remove any modeset locks from the
addfb ioctl paths. Yeah!

Also, reference-counting is solid - we get a reference from fb_create
which we transfer to the fpriv->fbs list. And after unlocking the
fpriv->fbs_lock we don't touch the framebuffer any longer. Furthermore
drm_framebuffer_init has added a 2nd reference for the idr lookup, and
any access through that table will do it's own refcounting.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: nest modeset locks within fpriv->fbs_lock

Atm we still need to unconditionally take the modeset locks in the
rmfb paths. But eventually we only want to take them if there are
other users around as a slow-path. This way sane userspace avoids
blocking on edid reads and other stuff in rmfb if it ensures that the
fb isn't used anywhere by a crtc/plane.

We can do a quick check for such other users once framebuffers are
properly refcounting by locking at the refcount - if it's more than 1,
there are other users left. Again, rmfb racing against other ioctls
isn't a real problem, userspace is allowed to shoot its foot.

This patch just prepares this by moving the modeset locks to nest
within fpriv->fbs_lock. Now the distinction between the fbs_lock and
the device-global fb_lock is clear, since we need to hold the fbs_lock
outside of any modeset_locks in fb_release.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: reference framebuffers which are on the idr

Since otherwise looking and reference-counting around
drm_framebuffer_lookup will be an unmanageable mess. With this change,
an object can either be found in the idr and will stay around once we
incremented the reference counter. Or it will be gone for good and
can't be looked up using its id any more.

Atomicity is guaranteed by the dev->mode_config.fb_lock. The
newly-introduce fpriv->fbs_lock looks a bit redundant, but the next
patch will shuffle the locking order between these two locks and all
the modeset locks taken in modeset_lock_all, so we'll need it.

Also, since userspace could do really funky stuff and race e.g. a
getresources with an rmfb, we need to make sure that the kernel
doesn't fall over trying to look-up an inexistent fb, or causing
confusion by having two fbs around with the same id. Simply reset the
framebuffer id to 0, which marks it as reaped. Any lookups of that id
will fail, so the object is really gone for good from userspace's pov.

Note that we still need to protect the "remove framebuffer from all
use-cases" and the final unreference with the modeset-lock, since most
framebuffer use-sites don't implement proper reference counting yet.
We can only lift this once _all_ users are converted.

With this change, two references are held on alife, but unused
framebuffers:
- The reference for the idr lookup, created in this patch.
- For user-created framebuffers the fpriv->fbs reference, for
driver-private fbs the driver is supposed to hold it's own last
reference.

Note that the dev->mode_config.fb_list itself does _not_ hold a
reference onto the framebuffers (this list is essentially only used
for debugfs files). Hence if there's anything left there when the
driver has cleaned up all it's modeset resources, this is a ref-leak.
WARN about it.

Now we only need to fix up all other places to properly reference
count framebuffers.

v2: Fix spelling fail in a comment spotted by Rob Clark.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: revamp framebuffer cleanup interfaces

We have two classes of framebuffer
- Created by the driver (atm only for fbdev), and the driver holds
  onto the last reference count until destruction.
- Created by userspace and associated with a given fd. These
  framebuffers will be reaped when their assoiciated fb is closed.

Now these two cases are set up differently, the framebuffers are on
different lists and hence destruction needs to clean up different
things. Also, for userspace framebuffers we remove them from any
current usage, whereas for internal framebuffers it is assumed that
the driver has done this already.

Long story short, we need two different ways to cleanup such drivers.
Three functions are involved in total:
- drm_framebuffer_remove: Convenience function which removes the fb
  from all active usage and then drops the passed-in reference.
- drm_framebuffer_unregister_private: Will remove driver-private
  framebuffers from relevant lists and drop the corresponding
  references. Should be called for driver-private framebuffers before
  dropping the last reference (or like for a lot of the drivers where
  the fbdev is embedded someplace else, before doing the cleanup
  manually).
- drm_framebuffer_cleanup: Final cleanup for both classes of fbs,
  should be called by the driver's ->destroy callback once the last
  reference is gone.

This patch just rolls out the new interfaces and updates all drivers
(by adding calls to drm_framebuffer_unregister_private at all the
right places)- no functional changes yet. Follow-on patches will move
drm core code around and update the lifetime management for
framebuffers, so that we are no longer required to keep framebuffers
alive by locking mode_config.mutex.

I've also updated the kerneldoc already.

vmwgfx seems to again be a bit special, at least I haven't figured out
how the fbdev support in that driver works. It smells like it's
external though.

v2: The i915 driver creates another private framebuffer in the
load-detect code. Adjust its cleanup code, too.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: create drm_framebuffer_lookup

And replace all fb lookups with it. Also add a WARN to
drm_mode_object_find since that is now no longer the blessed interface
to look up an fb. And add kerneldoc to both functions.

This only updates all callsites, but immediately drops the acquired
refence again. Hence all callers still rely on the fact that a mode fb
can't disappear while they're holding the struct mutex. Subsequent
patches will instate proper use of refcounts, and then rework the rmfb
and unref code to no longer serialize fb destruction with the
mode_config lock. We don't want that since otherwise a compositor
might end up stalling for a few frames in rmfb.

v2: Don't use kref_get_unless_zero - Greg KH doesn't like that kind of
interface.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: revamp locking around fb creation/destruction

Well, at least step 1. The goal here is that framebuffer objects can
survive outside of the mode_config lock, with just a reference held
as protection. The first step to get there is to introduce a special
fb_lock which protects fb lookup, creation and destruction, to make
them appear atomic.

This new fb_lock can nest within the mode_config lock. But the idea is
(once the reference counting part is completed) that we only quickly
take that fb_lock to lookup a framebuffer and grab a reference,
without any other locks involved.

vmwgfx is the only driver which does framebuffer lookups itself, also
wrap those calls to drm_mode_object_find with the new lock.

Also protect the fb_list walking in i915 and omapdrm with the new lock.

As a slight complication there's also the list of user-created fbs
attached to the file private. The problem now is that at fclose() time
we need to walk that list, eventually do a modeset call to remove the
fb from active usage (and are required to be able to take the
mode_config lock), but in the end we need to grab the new fb_lock to
remove the fb from the list. The easiest solution is to add another
mutex to protect this per-file list.

Currently that new fbs_lock nests within the modeset locks and so
appears redudant. But later patches will switch around this sequence
so that taking the modeset locks in the fb destruction path is
optional in the fastpath. Ultimately the goal is that addfb and rmfb
do not require the mode_config lock, since otherwise they have the
potential to introduce stalls in the pageflip sequence of a compositor
(if the compositor e.g. switches to a fullscreen client or if it
enables a plane). But that requires a few more steps and hoops to jump
through.

Note that framebuffer creation/destruction is now double-protected -
once by the fb_lock and in parts by the idr_lock. The later would be
unnecessariy if framebuffers would have their own idr allocator. But
that's material for another patch (series).

v2: Properly initialize the fb->filp_head list in _init, otherwise the
newly added WARN to check whether the fb isn't on a fpriv list any
more will fail for driver-private objects.

v3: Fixup two error-case unlock bugs spotted by Richard Wilbur.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: only take the crtc lock for ->cursor_move

->cursor_move uses mostly the same facilities in drivers as
->cursor_set, so pretty much nothing to fix up:

- ast/gma500/i915: They all use per-crtc registers to update the
  cursor position. ast again touches the global cursor cache, but
  that's ok since there's only one crtc.

- nouveau: nv50+ is again special, updates happen through the per-crtc
  channel (without pushbufs), so it's not protected by the new evo
  lock introduced earlier. But since this channel is per-crtc, we
  should be fine anyway.

- radeon: A bit a mess: avivo asics need a workaround when both output
  pipes are enabled, which means it'll access the crtc list. Just
  reading that flag is ok though as long as radeon _always_ grabs all
  locks when changing the crtc configuration. Which means with the
  current scheme it cannot do an optimized modeset which only locks
  the relevant crtcs. This can be fixed though by introducing a bit of
  global state with separate locks and ensure in the modeset code that
  the cursor will be updated appropriately when enabling the 2nd pipe
  (on affected asics).

- vmwgfx: I still don't understand what it's doing exactly, so apply
  the same trick for now.

v2: Fixup unlocking for the error cases, spotted by Richard Wilbur.

v3: Another error-case fixup.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: only take the crtc lock for ->cursor_set

First convert ->cursor_set to only take the crtc lock, since that
seems to be the function with the least amount of state - the core
ioctl function doesn't check anything which can change at runtime, so
we don't have any object lifetime issues to contend.

The only thing which is important is that the driver's implementation
doesn't touch any state outside of that single crtc which is not yet
properly protected by other locking:

- ast: access the global ast->cache_kmap. Luckily we only have on crtc
  on this driver, so this is fine. Add a comment.

- gma500: calls gma_power_begin|and and psb_gtt_pin|unpin, both which
  have their own locking to protect their state. Everything else is
  crtc-local.

- i915: touches a bit of global gem state, all protected by the One
  Lock to Rule Them All (dev->struct_mutex).

- nouveau: Pre-nv50 is all nice, nv50+ uses the evo channels to queue
  up all display changes. And some of these channels are device
  global. But this is fine now since the previous patch introduced an
  evo channel mutex.

- radeon: Uses some indirect register access for cursor updates, but
  with the previous patches to protect these indirect 2-register
  access patterns with a spinlock, this should be fine now, too.

- vmwgfx: I have no idea how that works - update_cursor_position
  doesn't take any per-crtc argument and I haven't figured out any
  other place where this could be set in some form of a side-channel.
  But vmwgfx definitely has more than one crtc (or at least can
  register more than one), so I have no idea how this is supposed to
  not fail with the current code already. Hence take the easy way out
  and simply acquire all locks (which requires dropping the crtc lock
  the core acquired for us). That way it's not worse off for
  consistency than the old code.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: add per-crtc locks

*drumroll*

The basic idea is to protect per-crtc state which can change without
touching the output configuration with separate mutexes, i.e.  all the
input side state to a crtc like framebuffers, cursor settings or plane
configuration. Holding such a crtc lock gives a read-lock on all the
other crtc state which can be changed by e.g. a modeset.

All non-crtc state is still protected by the mode_config mutex.
Callers that need to change modeset state of a crtc (e.g. dpms or
set_mode) need to grab both the mode_config lock and nested within any
crtc locks.

Note that since there can only ever be one holder of the mode_config
lock we can grab the subordinate crtc locks in any order (if we need
to grab more than one of them). Lockdep can handle such nesting with
the mutex_lock_nest_lock call correctly.

With this functions that only touch connectors/encoders but not crtcs
only need to take the mode_config lock. The biggest such case is the
output probing, which means that we can now pageflip and move cursors
while the output probe code is reading an edid.

Most cases neatly fall into the three buckets:
- Only touches connectors and similar output state and so only needs
  the mode_config lock.
- Touches the global configuration and so needs all locks.
- Only touches the crtc input side and so only needs the crtc lock.

But a few cases that need special consideration:

- Load detection which requires a crtc. The mode_config lock already
  prevents a modeset change, so we can use any unused crtc as we like
  to do load detection. The only thing to consider is that such
  temporary state changes don't leak out to userspace through ioctls
  that only take the crtc look (like a pageflip). Hence the load
  detect code needs to grab the crtc of any output pipes it touches
  (but only if it touches state used by the pageflip or cursor
  ioctls).

- Atomic pageflip when moving planes. The first case is sane hw, where
  planes have a fixed association with crtcs - nothing needs to be
  done there. More insane^Wflexible hw needs to have plane->crtc
  mapping which is separately protect with a lock that nests within
  the crtc lock. If the plane is unused we can just assign it to the
  current crtc and continue. But if a plane is already in use by
  another crtc we can't just reassign it.

  Two solution present themselves: Either go back to a slow-path which
  takes all modeset locks, potentially incure quite a hefty delay. Or
  simply disallowing such changes in one atomic pageflip - in general
  the vblanks of two crtcs are not synced, so there's no sane way to
  atomically flip such plane changes accross more than one crtc. I'd
  heavily favour the later approach, going as far as mandating it as
  part of the ABI of such a new a nuclear pageflip.

  And if we _really_ want such semantics, we can always get them by
  introducing another pageflip mutex between the mode_config.mutex and
  the individual crtc locks. Pageflips crossing more than one crtc
  would then need to take that lock first, to lock out concurrent
  multi-crtc pageflips.

- Optimized global modeset operations: We could just take the
  mode_config lock and then lazily lock all crtc which are affected by
  a modeset operation. This has the advantage that pageflip could
  continue unhampered on unaffected crtc. But if e.g. global resources
  like plls need to be reassigned and so affect unrelated crtcs we can
  still do that - nested locking works in any order.

This patch just adds the locks and takes them in drm_modeset_lock_all,
no real locking changes yet.

v2: Need to initialize the new lock in crtc_init and lock it righ
away, for otherwise the modeset_unlock_all below will try to unlock a
not-locked mutex.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

omapdrm: use modeset_lock_all

I've left the locking in the debugfs code as-is, it's essentially just
used to keep the framebuffer object alive (which won't be necessary
any more later on). We don't need fb refcounting either, since the new
mode_config.fb_lock ensures that the framebuffers can't disappear
(once mode_config.mutex doesn't guarantee this any more later on in
the series).

The fbcon restore needs all modeset locks. The crtc callbacks seem to
only need the crtc locks, but I've quickly discussed things with Rob
Clark and he's fine with just using modeset_lock_all for those, too.
He'll look into converting things over later.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/vmwgfx: use drm_modeset_lock_all

Ok, this one here is a bit more complicated, and I can't really claim
to fully understand the locking and lifetime rules of the vmwgfx
driver. So just convert ever mutex_lock call, including the
interruptible one. Since other places (e.g. in the execbuf ioctl) take
the mode_config.mutex without bothering with interruptible handling,
I've figured I should be able to get away with this in a few more
places ...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/shmobile: use drm_modeset_lock_all

Only a resume method to account for.

v2: Fixup compilation.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/ast: use drm_modeset_lock_all

Just a call to drm_helper_resume_force_mode, obviously wants full
locking for that.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/gma500: use drm_modeset_lock_all

Only two places:
- suspend/resume
- Some really strange mode validation tool with too much funny-lucking
hand-rolled conversion code.
- The recently-added lastclose fbdev restore code.

Better safe than sorry, so convert both places to keep the locking
semantics as much as possible.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: use drm_modeset_lock_all

Two exceptions:
- debugfs files only read information which is not related to crtc, so
  can stay on the modeset_config lock.
- Same holds for the edp vdd work in intel_dp.c. Add a corresponding
  WARN_ON and a comment next to the intel_dp struct fields for
  documentation.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: add drm_modeset_lock|unlock_all

This is the first step towards introducing the new modeset locking
scheme. The plan is to put helper functions into place at all the
right places step-by-step, so that the final patch to switch on the
new locking scheme doesn't need to touch every single driver.

This helper here will serve as the shotgun solutions for all places
where a more fine-grained locking isn't (yet) implemented.

v2: Fixup kerneldoc for unlock_all.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: encapsulate crtc->set_config calls

With refcounting we need to adjust framebuffer refcounts at each
callsite - much easier to do if they all call the same little helper
function.

Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/<drivers>: Unified handling of unimplemented fb->create_handle

Some drivers don't have real ->create_handle callbacks.

- cirrus/ast/mga200: Returns either 0 or -EINVAL.

- udl: Didn't even bother with a callback, leading to a nice
userspace-triggerable OOPS.

- vmwgfx: This driver bothered with an implementation to return 0 as
the handle (which is the canonical no-obj gem handle).

All have in common that ->create_handle doesn't really make too much
sense for them - that ioctl is used only for seamless fb takeover in
the radeon/nouveau/i915 ddx drivers. So allow drivers to not implement
this and return a consistent -ENODEV.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/nouveau: try to protect nbo->pin_refcount

... by moving the bo_pin/bo_unpin manipulation of the pin_refcount
under the protection of the ttm reservation lock. pin/unpin seems
to get called from all over the place, so atm this is completely racy.

After this patch there are only a few places in cleanup functions
left which access ->pin_refcount without locking. But I'm hoping that
those are safe and some other code invariant guarantees that this
won't blow up.

In any case, I only need to fix up pin/unpin to make ->pageflip work
safely, so let's keep it at that.

Add a comment to the header to explain the new locking rule.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>