git.karo-electronics.de Git - linux-beck.git/log

drm/i915: Move vblank wait determination to 'check' phase

Determining whether we'll need to wait for vblanks is something we
should determine during the atomic 'check' phase, not the 'commit'
phase.  Note that we only set these bits in the branch of 'check' where
intel_crtc->active is true so that we don't try to wait on a disabled
CRTC.

The whole 'wait for vblank after update' flag should go away in the
future, once we start handling watermarks in a proper atomic manner.

This regression has been introduced in

commit 2fdd7def16dd7580f297827930126c16b152ec11
Author: Matt Roper <matthew.d.roper@intel.com>
Date:   Wed Mar 4 10:49:04 2015 -0800
    drm/i915: Don't clobber plane state on internal disables

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Root-cause-analysis-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89550
Testcase: igt/pm_rpm/legacy-planes
Testcase: igt/pm_rpm/legacy-planes-dpms
Testcase: igt/pm_rpm/universal-planes
Testcase: igt/pm_rpm/universal-planes-dpms
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: use vlv_PLL_is_optimal in chv_find_best_dpll

Prepare chv_find_best_dpll to be used for BXT too, where we want to
consider the error between target and calculated frequency too when
choosing a better PLL configuration.

No functional change.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: check for div-by-zero in vlv_PLL_is_optimal

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: factor out vlv_PLL_is_optimal

Factor out the logic to decide whether the newly calculated dividers are
better than the best found so far. Do this for clarity and to prepare
for the upcoming BXT helper needing the same.

No functional change.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Kill intel_plane->obj

intel_plane->obj is not used anymore so kill it. Also don't pass both
the fb and obj to the sprite .update_plane() hook, as just passing the fb
is enough.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Initialize all contexts

The problem is we're going to switch to a new context, which could be
the default context. The plan was to use restore inhibit, which would be
fine, except if we are using dynamic page tables (which we will). If we
use dynamic page tables and we don't load new page tables, the previous
page tables might go away, and future operations will fault.

CTXA runs.
switch to default, restore inhibit
CTXA dies and has its address space taken away.
Run CTXB, tries to save using the context A's address space - this
fails.

The general solution is to make sure every context has it's own state,
and its own address space. For cases when we must restore inhibit, first
thing we do is load a valid address space. I thought this would be
enough, but apparently there are references within the context itself
which will refer to the old address space - therefore, we also must
reinitialize.

v2: to->ppgtt is only valid in full ppgtt.
v3: Rebased.
v4: Make post PDP update clearer.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Track page table reload need

This patch was formerly known as, "Force pd restore when PDEs change,
gen6-7." I had to change the name because it is needed for GEN8 too.

The real issue this is trying to solve is when a new object is mapped
into the current address space. The GPU does not snoop the new mapping
so we must do the gen specific action to reload the page tables.

GEN8 and GEN7 do differ in the way they load page tables for the RCS.
GEN8 does so with the context restore, while GEN7 requires the proper
load commands in the command streamer. Non-render is similar for both.

Caveat for GEN7
The docs say you cannot change the PDEs of a currently running context.
We never map new PDEs of a running context, and expect them to be
present - so I think this is okay. (We can unmap, but this should also
be okay since we only unmap unreferenced objects that the GPU shouldn't
be tryingto va->pa xlate.) The MI_SET_CONTEXT command does have a flag
to signal that even if the context is the same, force a reload. It's
unclear exactly what this does, but I have a hunch it's the right thing
to do.

The logic assumes that we always emit a context switch after mapping new
PDEs, and before we submit a batch. This is the case today, and has been
the case since the inception of hardware contexts. A note in the comment
let's the user know.

It's not just for gen8. If the current context has mappings change, we
need a context reload to switch

v2: Rebased after ppgtt clean up patches. Split the warning for aliasing
and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt
is always null.

v3: Invalidate PPGTT TLBs inside alloc_va_range.

v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move
pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when
neither ctx->ppgtt and aliasing_ppgtt exist.

v5: Removed references to teardown_va_range.

v6: Updated needs_pd_load_pre/post.

v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move
comment about updated PDEs to object_pin/bind (Mika).

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Track GEN6 page table usage

Instead of implementing the full tracking + dynamic allocation, this
patch does a bit less than half of the work, by tracking and warning on
unexpected conditions. The tracking itself follows which PTEs within a
page table are currently being used for objects. The next patch will
modify this to actually allocate the page tables only when necessary.

With the current patch there isn't much in the way of making a gen
agnostic range allocation function. However, in the next patch we'll add
more specificity which makes having separate functions a bit easier to
manage.

One important change introduced here is that DMA mappings are
created/destroyed at the same page directories/tables are
allocated/deallocated.

Notice that aliasing PPGTT is not managed here. The patch which actually
begins dynamic allocation/teardown explains the reasoning for this.

v2: s/pdp.page_directory/pdp.page_directories
Make a scratch page allocation helper

v3: Rebase and expand commit message.

v4: Allocate required pagetables only when it is needed, _bind_to_vm
instead of bind_vma (Daniel).

v5: Rebased to remove the unnecessary noise in the diff, also:
- PDE mask is GEN agnostic, renamed GEN6_PDE_MASK to I915_PDE_MASK.
- Removed unnecessary checks in gen6_alloc_va_range.
- Changed map/unmap_px_single macros to use dma functions directly and
   be part of a static inline function instead.
- Moved drm_device plumbing through page tables operation to its own
   patch.
- Moved allocate/teardown_va_range calls until they are fully
   implemented (in subsequent patch).
- Merged pt and scratch_pt unmap_and_free path.
- Moved scratch page allocator helper to the patch that will use it.

v6: Reduce complexity by not tearing down pagetables dynamically, the
same can be achieved while freeing empty vms. (Daniel)

v7: s/i915_dma_map_px_single/i915_dma_map_single
s/gen6_write_pdes/gen6_write_pde
Prevent a NULL case when only GGTT is available. (Mika)

v8: Rebased after s/page_tables/page_table/.

v9: Reworked i915_pte_index and i915_pte_count.
Also exercise bitmap allocation here (gen6_alloc_va_range) and fix
incorrect write_page_range in i915_gem_restore_gtt_mappings (Mika).

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Extract context switch skip and add pd load logic

In Gen8, PDPs are saved and restored with legacy contexts (legacy contexts
only exist on the render ring). So change the ordering of LRI vs MI_SET_CONTEXT
for the initialization of the context. Also the only cases in which we
need to manually update the PDPs are when MI_RESTORE_INHIBIT has been
set in MI_SET_CONTEXT (i.e. when the context is not yet initialized or
it is the default context).

Legacy submission is not available post GEN8, so it isn't necessary to
add extra checks for newer generations.

v2: Use new functions to replace the logic right away (Daniel)
v3: Add missing pd load logic.
v4: Add warning in case pd_load_pre & pd_load_post are true, and add
missing trace_switch_mm. Cleaned up pd_load conditions. Add more
information about when is pd_load_post needed. (Mika)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: page table generalizations

No functional changes, but will improve code clarity and removed some
duplicated defines.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Send out the full AUX address

AUX addresses are 20 bits long. Send out the entire address instead of
just the low 16 bits.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: kerneldoc for i915_gem_shrinker.c

And remove one bogus * from i915_gem_gtt.c since that's not a
kerneldoc there.

v2: Review from Chris:
- Clarify memory space to better distinguish from address space.
- Add note that shrink doesn't guarantee the freed memory and that
users must fall back to shrink_all.
- Explain how pinning ties in with eviction/shrinker.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Extract i915_gem_shrinker.c

Two code changes:
- Extract i915_gem_shrinker_init.
- Inline i915_gem_object_is_purgeable since we open-code it everywhere
else too.

This already has the benefit of pulling all the shrinker code
together, next patch adds a bit of kerneldoc.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use down ei for manual Baytrail RPS calculations

Use both up/down manual ei calcuations for symmetry and greater
flexibility for reclocking, instead of faking the down interrupt based
on a fixed integer number of up interrupts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Improved w/a for rps on Baytrail

Rewrite commit 31685c258e0b0ad6aa486c5ec001382cf8a64212
Author: Deepak S <deepak.s@linux.intel.com>
Date:   Thu Jul 3 17:33:01 2014 -0400

    drm/i915/vlv: WA for Turbo and RC6 to work together.

Other than code clarity, the major improvement is to disable the extra
interrupts generated when idle.  However, the reclocking remains rather
slow under the new manual regime, in particular it fails to downclock as
quickly as desired. The second major improvement is that for certain
workloads, like games, we need to combine render+media activity counters
as the work of displaying the frame is split across the engines and both
need to be taken into account when deciding the global GPU frequency as
memory cycles are shared.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Relax RPS contraints to allows setting minfreq on idle

When we idle, we set the GPU frequency to the hardware minimum (not user
minimum). We introduce a new variable to distinguish between the
different roles, and to allow easy tuning of the idle frequency without
impacting over aspects of RPS. Setting the minimum frequency should be a
safety blanket as the pcu on the GPU should be power gating itself
anyway. However, in order for us to do set the absolute minimum
frequency, we need to relax a few of our assertions that we do not
exceed the user limits.

v2: Add idle_freq
v3: Init idle_freq for vlv and add a bunch of WARNs

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fallback to using CPU relocations for large batch buffers

If the batch buffer is too large to fit into the aperture and we need a
GTT mapping for relocations, we currently fail. This only applies to a
subset of machines for a subset of environments, quite undesirable. We
can simply check after failing to insert the batch into the GTT as to
whether we only need a mappable binding for relocation and, if so, we can
revert to using a non-mappable binding and an alternate relocation
method. However, using relocate_entry_cpu() is excruciatingly slow for
large buffers on non-LLC as the entire buffer requires clflushing before
and after the relocation handling. Alternatively, we can implement a
third relocation method that only clflushes around the relocation entry.
This is still slower than updating through the GTT, so we prefer using
the GTT where possible, but is orders of magnitude faster as we
typically do not have to then clflush the entire buffer.

An alternative idea of using a temporary WC mapping of the backing store
is promising (it should be faster than using the GTT itself), but
requires fairly extensive arch/x86 support - along the lines of
kmap_atomic_prof_pfn() (which is not universally implemented even for
x86).

Testcase: igt/gem_exec_big #pnv,byt
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88392
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add a WARN_ONCE for the impossible reloc case and explain in
a short comment why we want to avoid ping-pong.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Turn on PIN_GLOBAL in i915_gem_object_ggtt_pin

This makes the interface consistent to old i915_gem_obj_ggtt_pin.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: memory leak in __i915_gem_vma_create()

In the original code then if WARN_ON(i915_is_ggtt(vm) != !!ggtt_view)
was true then we leak "vma". Presumably that doesn't happen often but
static checkers complain and this bug is easy to fix.

Fixes: c3bbb6f2825d ('drm/i915: Do not use ggtt_view with (aliasing) PPGTT')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/dp: return number of bytes written for short aux/i2c writes

Allow for a larger receive data size, and check if the receiver returned
the number of bytes written. Without this, we've basically skipped all
the unwritten bytes for short writes.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Unconfuse DP link rate array names

To keep things clear rename the intel_dp->supported_rates[] to
intel_dp->sink_rates[], and rename the supported_rates[] name we used
elsewhere for the intersection of source and sink rates to
common_rates[].

Cc: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Include the sink/source/supported rates in debug output

TODO: Is there an actually nice way to print an array of ints?

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add eDP intermediate frequencies for CHV

"P1273_DPLL_Programming Spreadsheet.xlsm" lists a boatload of
frequencies for eDP. Try to use them all.

For now I've decided not to add hardcoded DPLL dividers for these cases
since chv_find_best_dpll() works just fine.

I've not actually tested any of these since I don't have an eDP 1.4 panel.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Avoid overflowing the DP link rate arrays

Complain loudly if we ever attempt to overflow the the supported_rates[]
array. This should never happen since the sink_rates[] array will always
be smaller or of equal size. But should someone change that we want to
catch it without scribblign over the stack.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix MST link rate handling

Now that intel_dp_max_link_bw() no longer considers the source
restrictions we may try to enable MST with 5.4GHz even when the source
doesn't support it. To fix that switch the code over to handle the link
rate in the same way as the SST code handles it.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use DP_LINK_RATE_SET whenever possible

Drop the gen9 checks from the code and issue DP_LINK_RATE_SET whenever
the sink reports to support it.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix max link rate in intel_dp_mode_valid()

Consider the link rates reported by the sink via
DP_SUPPORTED_LINK_RATES when checking modes against the max link
rate.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Hide the source vs. sink rate handling from intel_dp_compute_config()

intel_dp_compute_config() only really needs to know the rates supported
by both source and sink, so hide the raw source and sink arrays from it.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fully separate source vs. sink rates

Remove the sink vs. source limit mess from intel_dp_max_link_bw() and
just move the source restriction checks to intel_dp_source_rates().

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
[danvet: Resolve conflict with WaDisableHBR2:skl patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove special case from intel_supported_rates()

Now that both source and sink rates are always filled in there's no need
for any special cases in intel_supported_rates().

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't copy sink rates either

Once we've read the rates from the sink we don't have to mess with them,
so the caller can just look at the stored rates without doing extra
copies. If the sink doesn't support the new link rate stuff, we just
point the caller at the default_rates[] array.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't copy the DP source rates arrays

The source rates don't change, so we can just point the caller at the
const arrays.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Todd Previte <tprevite@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Store the converted link rates in intel_dp->supported_rates[]

No point in converting from hardware format every single time, just
store the rates in the final format under intel_dp.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Todd Previte <tprevite@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make the DP rates int instead of uint32_t

No point in using uint32_t here, just plain old int will do.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Todd Previte <tprevite@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Do not use ggtt_view with (aliasing) PPGTT

GGTT views are only applicable when dealing with GGTT. Change the code to
reject ggtt_view where it should not be used and require it when it should
be.

v2:
- Dropped _ppgtt_ infixes, allow both types to be passed
- Disregard other but normal views when no view is specified
- More checks that valid parameters are passed
- More readable error checking

v3:
- Prefer WARN_ONCE over BUG_ON when there is code path for failure

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[danvet: Drop unecessary forward decl from earlier patch iterations.]
[danvet: Remove unused variable spotted by Tvrtko.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix sink crc connector iteration

Regressed by this commit:

commit 3455454e18ca3f92c565700539e744c620d8276b
Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Date: Tue Mar 3 15:21:56 2015 +0200

drm/i915: Add a for_each_intel_connector macro

Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Merge tag 'drm-intel-fixes-2015-03-19' into drm-intel-next

Backmerge because of numerous and interleaving conflicts and git
rerere getting confused a bit too often.

Conflicts:
drivers/gpu/drm/i915/intel_display.c

All conflicts are because of -next patches backported to -fixes, so
just go with the code in -next.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: Make sure the primary plane is enabled before reading out the fb state

We don't want to end up in a state where we track that the pipe has its
primary plane enabled when primary plane registers are programmed with
values that look possible but the plane actually disabled.

Refuse to read out the fb state when the primary plane isn't enabled.

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reported-by: Andrey Skvortsov <andrej.skvortzov@gmail.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reference: http://mid.gmane.org/20150203191507.GA2374@crion86
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Update DRIVER_DATE to 20150313

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix vmap_batch page iterator overrun

vmap_batch() calculates amount of needed pages for the mapping
we are going to create. And it uses this page count as an
argument for the for_each_sg_pages() macro. The macro takes the number
of sg list entities as an argument, not the page count. So we ended
up iterating through all the pages on the mapped object, corrupting
memory past the smaller pages[] array.

Fix this by bailing out when we have enough pages.

This regression has been introduced in

commit 17cabf571e50677d980e9ab2a43c5f11213003ae
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jan 14 11:20:57 2015 +0000

drm/i915: Trim the command parser allocations

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Export total subslice and EU counts

Setup new I915_GETPARAM ioctl entries for subslice total and
EU total. Userspace drivers need these values when constructing
GPGPU commands. This kernel query method is intended to replace
the PCI ID-based tables that userspace drivers currently maintain.
The kernel driver can employ fuse register reads as needed to
ensure the most accurate determination of GT config attributes.
This first became important with Cherryview in which the config
could differ between devices with the same PCI ID.

The kernel detection of these values is device-specific and not
included in this patch. Because zero is not a valid value for any of
these parameters, a value of zero is interpreted as unknown for the
device. Userspace drivers should continue to maintain ID-based tables
for older devices not supported by the new query method.

v2: Increment our I915_GETPARAM indices to fit after REVISION
which was merged ahead of us.

For: VIZ-4636
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Acked-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: redefine WARN_ON_ONCE to include the condition

Same as

commit c883ef1b1c998d2d66866772fd0fc34afa45641e
Author: Mika Kuoppala <miku@iki.fi>
Date:   Tue Oct 28 17:32:30 2014 +0200

    drm/i915: Redefine WARN_ON to include the condition

but for WARN_ON_ONCE. Since the kernel WARN_ON_ONCE actually picks up
*our* version of WARN_ON, we end up with messages like

[  838.285319] WARN_ON(!__warned)

which are not that helpful.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Implement WaDisableHBR2

v2: Use the recently introduced INTEL_REVID() and SKL_REVID defines
(Nick Hoath)

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Tested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89554
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove the preliminary_hw_support shackles from CHV

CHV should be in a good enough shape now, so let's drop the
.is_preliminary flag.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Read CHV_PLL_DW8 from the correct offset

We accidentally pass 'pipe' instead of 'port' to CHV_PLL_DW8() and
with PIPE_C we end up at register offset 0x8320 which isn't the
0x8020 we wanted. Fix it.

The problem was fortunately caught by the sanity check in vlv_dpio_read():
WARNING: CPU: 1 PID: 238 at ../drivers/gpu/drm/i915/intel_sideband.c:200 vlv_dpio_read+0x77/0x80 [i915]()
DPIO read pipe C reg 0x8320 == 0xffffffff

The problem got introduced with this commit:
commit 71af07f91f12bbab96335e202c82525d31680960
Author: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Date: Thu Mar 5 19:33:08 2015 +0530

drm/i915: Update prop, int co-eff and gain threshold for CHV

Cc: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Todd Previte <tprevite@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Rewrite IVB FDI bifurcation conflict checks

Ignore the current state of the pipe and just check crtc_state->enable
and the number of FDI lanes required. This means we don't accidentally
mistake the FDI lanes as being available of one of the pipes just
happens to be disabled at the time of the check. Also we no longer
consider pipe C to require FDI lanes when it's driving the eDP
transcoder.

We also take the opportunity to make the code a bit nicer looking by
hiding the ugly bits in the new pipe_required_fdi_lanes() function.

Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Rewrite some some of the FDI lane checks

The logic in the FDI lane checks is very hard for my poor brain to
grasp. Rewrite it in a more straightforward way.

Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Enable the RPS interrupts programming

Enable the RPS interrupts programming(enable/disable/reset) for GEN9,
as missing changes to enable the RPS support on GEN9 have been added.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Enabling processing of Turbo interrupts

Earlier Turbo interrupts were not being processed for SKL,
as something was amiss in turbo programming for SKL.
Now missing changes have been added, so enabling the Turbo
interrupt processing for SKL.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Updated the i915_frequency_info debugfs function

Added support for SKL in the i915_frequency_info debugfs function

v2:
- corrected the handling of reqf (Damien)
- Reorderd the platform check for cagf (Ville)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Simplify the way BC bifurcation state consistency is kept

Remove the global modeset resource function that would disable the
bifurcation bit, and instead enable/disable it when enabling the pch
transcoder. The mode set consistency check should prevent us from
disabling the bit if pipe C is enabled so the change should be safe.

Note that this doens't affect the logic that prevents the bit being
set while a pipe is active, since the patch retains the behavior of
only chaging the bit if necessary. Because of the checks during mode
set, the first change would necessarily happen with both pipes B and
C disabled, and any subsequent write would be skipped.

v2: Only change the bit during pch trancoder enable. (Ville)

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Updated the act_freq_mhz_show sysfs function

Added support for SKL in the act_freq_mhz_show sysfs function

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Updated the gen9_enable_rps function

On SKL, GT frequency is programmed in units of 16.66 MHZ units compared
to 50 MHZ for older platforms. Also the time value specified for Up/Down EI &
Up/Down thresholds are expressed in units of 1.33 us, compared to 1.28
us for older platforms. So updated the gen9_enable_rps function as per that.

v2: Updated to use new macro GT_INTERVAL_FROM_US

v3: Removed the initial setup of certain registers, from gen9_enable_rps,
    which gets overridden later from gen6_set_rps (Damien)

v4: Removed the enabling of rps interrupts, from gen9_enable_rps.
    To be done from intel_gen6_powersave_work only, as done for other
    platforms also.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Updated the gen6_rps_limits function

RP Interrupt Up/Down Frequency Limits register (A014) definition
has changed for SKL. Updated the gen6_rps_limits function as per that

v2: Renamed the function to intel_rps_limits (Chris)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Restructured the gen6_set_rps_thresholds function

Prior to SKL, the time period programmed in Up/Down EI & Up/Down
threshold registers was in units of 1.28 micro seconds. But for
SKL, the units have changed (1.333 micro seconds).
Have generalized the implementation of gen6_set_rps_thresholds function,
by removing the hard coding done in it as per 1.28 micro seconds.

v2: Renamed the local variables & removed superfluous comments (Chris)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Updated the gen6_set_rps function

On SKL, the frequency is programmed differently in RPNSWREQ (A008)
register (from bits 23 to 31, compared to bits 24 to 31). So updated
the gen6_set_rps function, as per this change.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Updated the gen6_init_rps_frequencies function

On SKL the frequency is specified in units of 16.66 MHZ, barring the
RP_STATE_CAP(0x5998) register, which still reports frequency in units
of 50 MHZ. So an extra conversion is required in gen6_init_rps_frequencies
function for SKL, to store the frequency values as per the actual hardware unit.

v2: Corrected the conversion from 50 to 16.66 MHZ (Ville)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Updated intel_gpu_freq() and intel_freq_opcode()

On SKL, frequency is specified in units of 16.66 MHZ.
Updated the intel_gpu_freq() and intel_freq_opecode() functions
to do the conversion appropriately.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Added new macros

For SKL, register definition for RPNSWREQ (A008), RPSTAT1(A01C)
have changed slightly. Also on SKL, frequency is specified in
units of 16.66 MHZ, compared to 50 MHZ for most of the earlier
platforms and the time values are expressed in units of 1.33 us,
compared to 1.28 us for earlier platforms.
Added new macros for the aforementioned changes.

v2: Renamed the GT_FREQ_FROM_PERIOD macro to GT_INTERVAL_FROM_US (Damien)

v3: Removed the implicit use of dev_priv in GT_INTERVAL_FROM_US macro (Chris)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: remove indirection in the PCI ID macros

Spell all the PCI IDs out to be able to quickly grep for the IDs. No
functional changes.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Add GT1/2 to comments to not loose that distinction.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use FW_WM() macro for older gmch platforms too

Use the FW_WM() macro from the VLV wm code to polish up the wm
code for older gmch platforms.

Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add polish to VLV WM shift+mask operations

Wrap the FW register value shift+mask operations into a macro to hide
the ugliness a bit. Also might avoid bugs due to typos.

Also rename all the primary/sprite plane low order bit masks to have the
_VLV suffix, so that we can use the FW_WM_VLV() macro instead of the
FW_WM() macro for them in a consistent manner. Cursor and all the high
order bits are left to use the FW_WM() macro as there's no real
confusion with them.

Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use plane->state->fb instead of plane->fb in intel_plane_restore()

plane->fb is not as reliable as plane->state->fb so let's convert
intel_plane_restore() over the the new way of thinking as well.

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Reduce clutter by using the local plane pointer

No need to go dig throguh intel_crtc->base.cursor when we already have
the same thing as 'plane' local variable.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove debug prints from primary plane update funcs

These are now called from the plane commit hooks, so they really need to
be fast or else we risk atomic update failures. So kill the debug prints
which are slowing things down massively.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add ULL postfix to VGT_MAGIC constant

Without this Dave's 32bit rhel compiler is annoyed. Don't ask me about
the exact rules for this stuff though, but this should be safe.

Reported-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/fourcc: 64 #defines need ULL postfix

I have no idea about the exact rules, but this angered Dave's 32bit
rhel gcc.

Reported-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: Don't assume primary & cursor are always on for wm calculation (v4)

Current ILK-style watermark code assumes the primary plane and cursor
plane are always enabled.  This assumption, along with the combination
of two independent commits that got merged at the same time, results in
a NULL dereference.  The offending commits are:

        commit fd2d61341bf39d1054256c07d6eddd624ebc4241
        Author: Matt Roper <matthew.d.roper@intel.com>
        Date:   Fri Feb 27 10:12:01 2015 -0800

            drm/i915: Use plane->state->fb in watermark code (v2)

and

        commit 0fda65680e92545caea5be7805a7f0a617fb6c20
        Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
        Date:   Fri Feb 27 15:12:35 2015 +0000

            drm/i915/skl: Update watermarks for Y tiling

The first commit causes us to use the FB from plane->state->fb rather
than the legacy plane->fb, which is updated a bit later in the process.

The second commit includes a change that now triggers watermark
reprogramming on primary plane enable/disable where we didn't have one
before (which wasn't really correct, but we had been getting lucky
because we always calculated as if the primary plane was on).

Together, these two commits cause the watermark calculation to
(properly) see plane->state->fb = NULL when we're in the process of
disabling the primary plane.  However the existing watermark code
assumes there's always a primary fb and tries to dereference it to find
out pixel format / bpp information.

The fix is to make ILK-style watermark calculation actually check the
true status of primary & cursor planes and adjust our watermark logic
accordingly.

v2: Update unchecked uses of state->fb for other platforms (pnv, skl,
    etc.).  Note that this is just a temporary fix.  Ultimately the
    useful information is going to be computed at check time and stored
    right in the state structures so that we don't have to figure this
    all out while we're supposed to be programming the watermarks.
    (caught by Tvrtko)

v3: Fix a couple copy/paste mistakes in SKL code. (Tvrtko)

v4: Only add FB checks for ILK/SKL codepaths.  Older platforms still use
    intel_crtc_active() and will shortcircuit out of watermark
    calculations before ever trying to dereference the primary plane's
    framebuffer.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Michael Leuchtenburg <michael@slashhome.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89388
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move drm_framebuffer_unreference out of struct_mutex for flips

intel_user_framebuffer_destroy() requires the struct_mutex for its
object bookkeeping, so this means that all calls to
drm_framebuffer_unreference must not hold that lock.

Regression from commit ab8d66752a9c28cd6c94fa173feacdfc1554aa03
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date: Mon Feb 2 15:44:15 2015 +0000

drm/i915: Track old framebuffer instead of object

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89166
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[danvet: Clarify commit message slightly.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Disable DDR DVFS on CHV

DDR DVFS introduces massive memory latencies which can't be handled by
the PND deadline stuff. Instead the watermarks will need to be
programmed to compensate for the latency and the deadlines will need to
be programmed to tight fixed values. That means DDR DVFS can only be
enabled if the display FIFOs are large enough, and that pretty much
means we have to manually repartition them to suit the needs of the
moment.

That's a lot of change, so in the meantime let's just disable DDR DVFS
to get the display(s) to be stable.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Enable the maxfifo PM5 mode when appropriate on CHV

CHV has a new knob in Punit to select between some memory power savings
modes PM2 and PM5. We can allow the deeper PM5 when maxfifo mode is
enabled, so let's do so in the hopes for moar power savings.

v2: Put the thing into a separate function to avoid churn later
v3: Don't break VLV

Reviewed-by: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Program PFI credits for VLV

PFI credit programming is required when CD clock (related to data flow from
display pipeline to end display) is greater than CZ clock (related to data
flow from memory to display plane). This programming should be done when all
planes are OFF to avoid intermittent hangs while accessing memory even from
different Gfx units (not just display).

If cdclk/czclk >=1, PFI credits could be set as any number. To get better
performance, larger PFI credit can be assigned to PND. Otherwise if
cdclk/czclk<1, the default PFI credit of 8 should be set.

v2:
    - Change log to lower log level instead of DRM_ERROR
    - Change function name to valleyview_program_pfi_credits
    - Move program PFI credits to modeset_init instead of intel_set_mode
    - Change magic numbers to logical constants

[vsyrjala v3:
- only program in response to cdclk update
- program the credits also when cdclk<czclk
- add CHV bits
v4:
- Change CHV cdclk<czclk credits to 12 (Vijay)]

Signed-off-by: Vidya Srinivas <vidya.srinivas@intel.com>
Signed-off-by: Gajanan Bhat <gajanan.bhat@intel.com>
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Rewrite VLV/CHV watermark code

Assuming the PND deadline mechanism works reasonably we should do
memory requests as early as possible so that PND has schedule the
requests more intelligently. Currently we're still calculating
the watermarks as if VLV/CHV are identical to g4x, which isn't
the case.

The current code also seems to calculate insufficient watermarks
and hence we're seeing some underruns, especially on high resolution
displays.

To fix it just rip out the current code and replace is with something
that tries to utilize PND as efficiently as possible.

We now calculate the WM watermark to trigger when the FIFO still has
256us worth of data. 256us is the maximum deadline value supoorted by
PND, so issuing memory requests earlier would mean we probably couldn't
utilize the full FIFO as PND would attempt to return the data at
least in at least 256us. We also clamp the watermark to at least 8
cachelines as that's the magic watermark that enabling trickle feed
would also impose. I'm assuming it matches some burst size.

In theory we could just enable trickle feed and ignore the WM values,
except trickle feed doesn't work with max fifo mode anyway, so we'd
still need to calculate the SR watermarks. It seems cleaner to just
disable trickle feed and calculate all watermarks the same way. Also
trickle feed wouldn't account for the 256us max deadline value, thoguh
that may be a moot point in non-max fifo mode sicne the FIFOs are fairly
small.

On VLV max fifo mode can be used with either primary or sprite planes.
So the code now also checks all the planes (apart from the cursor)
when calculating the SR plane watermark.

We don't have to worry about the WM1 watermarks since we're using the
PND deadline scheme which means the hardware ignores WM1 values.

v2: Use plane->state->fb instead of plane->fb

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make sure we invalidate frontbuffer on fbcon.

There are some cases like suspend/resume or dpms off/on sequences
that can flush frontbuffer bits. In these cases features that relies
on frontbuffer tracking can start working and user can stop getting
screen updates on fbcon having impression the system is frozen.

So, let's make sure we also invalidate frontbuffer on fbdev blank.

v2: Daniel was right, backtrace didn't show other path than this blank
one so let's make sure frontbuffer bits gets invalidate here instead of
on random write operations that doesn't garantee we track all frontbuffer
writes.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[danvet: Exchange code comments for one that complains about the
locking, like in set_par.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Update prop, int co-eff and gain threshold for CHV

This patch implements latest PHY changes in Gain, prop and int co-efficients
based on the vco freq.

v2: Split the original changes into multiple smaller patches based on
review by Ville

v3: Addressed Ville's review comments. Fixed the error introduced in v2.
Clear the old bits before we modify those bits as part of RMW.

v4: TDC target cnt is 10 bits and not 8 bits (Ville)

Signed-off-by: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Initialize CHV digital lock detect threshold

Initialize lock detect threshold and select coarse threshold for the
case where M2 fraction division is disabled.

v2: Split the changes into multiple smaller patches (Ville)
v3: Clear out the old bits before we modify those bits as RMW (Ville)
v4: Reset coarse threshold when M2 fraction is enabled (Ville)

Signed-off-by: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Reviewed-by: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Disable M2 frac division for integer case

v2 : Handle M2 frac division for both M2 frac and int cases

v3 : Addressed Ville's review comments. Cleared the old bits for RMW

v4 : Fix feedfwd gain (Ville)

Signed-off-by: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Reviewed-by: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Spelling s/auxilliary/auxiliary/

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use crtc->state->active in ilk/skl watermark calculations (v3)

Existing watermark code calls intel_crtc_active() to determine whether a CRTC
is active for the purpose of watermark calculations (and bails out early if it
determines the CRTC is not active).  However intel_crtc_active() only returns
true if crtc->primary->fb is non-NULL, which isn't appropriate in the modern
age of universal planes and atomic modeset since userspace can now disable the
primary plane, but leave the CRTC (and other planes) running.

Note that commit

        commit 0fda65680e92545caea5be7805a7f0a617fb6c20
        Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
        Date:   Fri Feb 27 15:12:35 2015 +0000

            drm/i915/skl: Update watermarks for Y tiling

adds a test for primary plane enable/disable to trigger a watermark update
(previously we ignored updates to primary planes, which wasn't really correct,
but we got lucky since we always pretended the primary plane was on).  Tvrtko's
patch tries to update watermarks when we re-enable the primary plane, but that
watermark computation gets aborted early because intel_crtc_active() returns
false due to the disabled primary plane.

Switch the ILK and SKL watermark code over to use crtc->state->active rather
than calling intel_crtc_active() so that we'll properly compute watermarks when
re-enabling the primary plane.

Note that this commit doesn't touch callsites in the watermark code for
older platforms since there were concerns that doing so would lead to
other types of breakage.

Also note that all of the watermark calculation at the moment takes place after
new crtc/plane states are swapped into the DRM objects.  This will change in
the future, so we'll be working with in-flight state objects, but for the time
being, crtc->state is what we want to operate on.

v2: Don't drop primary->fb check from intel_crtc_active(), but rather replace
    ILK/SKL callsites with direct tests of crtc->state->active.  There is
    concern that messing with intel_crtc_active() will lead to other breakage for
    old hardware platforms.  (Ville)

v3: Use intel_crtc->active for now rather than crtc->state->active since
    we don't have CRTC states properly hooked up and initialized yet.
    We'll defer the switch to crtc->state->active until the atomic CRTC
    state work is farther along. (Ville)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Update intel_crtc_active() to use state values (v2)

With the switch to atomic plumbing for planes, some of our commit-time
work (e.g., watermarks) is done after the new atomic state is swapped
into the relevant DRM object, but before the DRM core has a chance to
update its legacy state values.  Switch intel_crtc_active() to look at
the state objects rather than legacy fields to ensure we operate on the
proper values.

Note that we're continuing to use intel_crtc->active here for the time
being since crtc->state isn't really hooked up yet.  Once CRTC states
are wired up properly, we'll want to switch this over to use
crtc->state->active instead.

v2: Switch back to intel_crtc->active for now; when Ander's work on CRTC
    states is ready, we can flip this over to use crtc->state->active
    instead. (Ville)

Cc: Ander Conselvan De Oliveira <conselvan2@gmail.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Exit early from psr_status if PSR is not supported by the device

Static analysis was complaining that a path existed where we could use
stat[] uninitialized. Fix this by simplifying the logic to exit early if
PSR isn't supported.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix chv cdclk support

The specs seem to be full of misinformation wrt. the Punit register
0x36. Some versions still show the old VLV bit layout, some the new
layout, and all of them seem to tell us nonsense about the cdclk
value encoding.

Testing on actual hardware has shown that we simply need to program
the desired CCK divider into the Punit register using the new layout of
the bits. Doing that, the status bit change to indicate the same value,
and the CCK 0x6b register also changes accordingly to indicate that CCK
is now using the new divider.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Reviewed-by: Yogesh Mohan Marimuthu <yogesh.mohan.marimuthu@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Allow pixel clock up to 95% of cdclk on CHV

Supposedly CHV can sustain a pixel clock of up to 95% of
cdclk, as opposed to the 90% limit that was used old older
platforms. Update the cdclk selection code to allow for this.

This will allow eg. HDMI 4k modes with their 297MHz pixel clock
while still respecting the 320 MHz cdclk limit on CHV.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Reviewed-by: Yogesh Mohan Marimuthu <yogesh.mohan.marimuthu@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: port A fuse straps don't work on early SKL steppings

So try to enumerate eDP unconditionally in those cases.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Add wa tag Damien dug out.]
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Restore the DDI translation tables when enabling PW1

I was dumping the DDI translation tables to make sure my patch updating
the HDMI entry was doing the right thing when I noticed that the table
was showing reset values after DPMS.

And indeed, the DDI translation registers are in power well 1 on SKL,
and so we're losing their values when shutting down eDP.

Calling intel_prepare_ddi() on PW1 enabling re-programs the table.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove unused condition in hsw_power_well_post_enable()

We don't use this function on gen9, no need for that test here.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Restore pipe interrupt registers after power well enabling

The pipe interrupt registers are in the actual pipe power well, so we
need to restore them when re-enable the corresponding power well.

I've also copied what we do on HSW/BDW for VGA, even if the we haven't
enabled unclaimed registers just yet.

v2: Don't run skl_power_well_post_enable() if the power well is already
enabled (Paulo)

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Mirror what we do on HSW for the power well enable log message

Just to be more consistent with what we do on HSW.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Introduce enable_requested and is_enabled in the power well code

Just like what we do for HSW/BDW, having those variables makes it a bit
easier to parse the code.

Suggested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Make gen8_irq_power_well_post_enable() take a pipe mask

While we only need to restore pipe B/C interrupt registers on BDW when
enabling the power well, skylake a bit more flexible and we'll also need
to restore the pipe A registers as it has its own power well that can be
toggled.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add CHV HW status to SSEU status

Collect the currently enabled counts of slice, subslice, and
execution units using the power gate control ack message
registers specific to Cherryview.

Slice/subslice/EU info and hardware status can now be
determined for CHV, so allow the debugfs SSEU status dump
to proceed for CHV devices.

Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Determine CHV slice/subslice/EU info

Total EU was already being detected on CHV, so we just add the
additional info parameters. The detection method is changed to
be more robust in the case of subslice fusing - we don't want
to trust the EU fuse bits corresponding to subslices which are
fused-off.

v2: Fixed subslice disable bitmasks and removed unnecessary ?
operation (Ville)

Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make sure PND deadline mode is enabled on VLV/CHV

Poke at the CBR1_VLV register during init_clock_gating to make sure the
PND deadline scheme is used.

The hardware has two modes of operation wrt. watermarks:

1) PND deadline mode:
- memory request deadline is calculated from actual FIFO level * DDL
- WM1 watermark values are unused (AFAIK)
- WM watermark level defines when to start fetching data from memory
(assuming trickle feed is not used)

2) backup mode
- deadline is based on FIFO status, DDL is unused
- FIFO split into three regions with WM and WM1 watermarks, each
part specifying a different FIFO status

We want to use the PND deadline mode, so let's make sure the chicken
bit is in the correct position on init.

Also take the opportunity to refactor the shared code between VLV and
CHV to a shared function.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Read out display FIFO size on VLV/CHV

VLV/CHV have similar DSPARB registers as older platforms, just more of
them due to more planes. Add a bit of code to read out the current FIFO
split from the registers. Will be useful later when we improve the WM
calculations.

v2: Add display_mmio_offset to DSPARB

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Pass plane to vlv_compute_drain_latency()

Now that we have drm_planes for the cursor and primary we can move the
pixel_size handling into vlv_compute_drain_latency() and just pass the
appropriate plane to it.

v2: Check plane->state->fb instead of plane->fb

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Resolve conflict with Matt's s/plane->fb/plane->state->fb/
patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Reorganize VLV DDL setup

Introduce struct vlv_wm_values to house VLV watermark/drain latency
values. We start by using it when computing the drain latency values.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Hide VLV DDL precision handling

Move the DDL precision handling into vlv_compute_drain_latency() so the
callers don't have to duplicate the same code to deal with it.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Simplify VLV drain latency computation

The current drain lantency computation relies on hardcoded limits to
determine when the to use the low vs. high precision multiplier.
Rewrite the code to use a more straightforward approach.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Kill DRAIN_LATENCY_PRECISION_* defines

Kill the silly DRAIN_LATENCY_PRECISION_* defines and just use the raw
number instead.

v2: Move the sprite 32/16 -> 16/8 preision multiplier
change to another patch (Jesse)

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Reduce CHV DDL multiplier to 16/8

Apparently we must yet halve the DDL drain latency from what we're
using currently. This little nugget is not in any spec, but came
down through the grapevine.

This makes the displays a bit more stable. Not quite fully stable but at
least they don't fall over immediately on driver load.

v2: Update high_precision in valleyview_update_sprite_wm() too (Jesse)

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>