Chris Wilson [Mon, 19 Dec 2016 12:43:45 +0000 (12:43 +0000)]
drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping
If we at first do not succeed with attempting to remap our physical
pages using a coalesced scattergather list, try again with one
scattergather entry per page. This should help with swiotlb as it uses a
limited buffer size and only searches for contiguous chunks within its
buffer aligned up to the next boundary - i.e. we may prematurely cause a
failure as we are unable to utilize the unused space between large
chunks and trigger an error such as:
i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes)
Reported-by: Juergen Gross <jgross@suse.com> Tested-by: Juergen Gross <jgross@suse.com> Fixes: 871dfbd67d4e ("drm/i915: Allow compaction upto SWIOTLB max segment size") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Link: http://patchwork.freedesktop.org/patch/msgid/20161219124346.550-1-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Tue, 20 Dec 2016 10:40:03 +0000 (10:40 +0000)]
drm/i915/breadcrumbs: s/container_of/rb_entry/
In keeping with commit f802cf7e0986 ("drm/i915/debugfs: use
rb_entry()"), convert the primary user of the rbtrees over to using
rb_entry rather than the equivalent container_of.
Rodrigo Vivi [Mon, 19 Dec 2016 19:05:47 +0000 (11:05 -0800)]
drm/i915: Simplify gem stolen initialization.
Let's take usage of IS_LP to simplify the gem stolen
initialization as suggest by Tvrtko.
Also assume that all new LP platforms follows the chv+
and others bdw+.
v2: Remove the wrong commit message about bxt and glk. (Ander)
Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/1482174347-24911-1-git-send-email-rodrigo.vivi@intel.com
Rodrigo Vivi [Sun, 18 Dec 2016 21:36:27 +0000 (13:36 -0800)]
drm/i915: Rename get stolen functions for LP platforms chv+
gen8 is used for both Broadwell and Cherryview but this
function here is only Cherryview and all next atom LP platforms.
So let's rename it to avoid confusion as suggested by Ville.
Ville Syrjälä [Wed, 14 Dec 2016 18:00:23 +0000 (20:00 +0200)]
drm/i915: Prevent PPS stealing from a normal DP port on VLV/CHV
VLV apparently gets upset if the PPS for a pipe currently driving an
external DP port gets used for VDD stuff on another eDP port. The DP
port falls over and fails to retrain when this happens, leaving the
user staring at a black screen.
Let's fix it by also tracking which pipe is driving which DP/eDP port.
We'll track this under intel_dp so that we'll share the protection
of the pps_mutex alongside the pps_pipe tracking, since the two
things are intimately related.
I had plans to reduce the protection of pps_mutex to cover only eDP
ports, but with this we can't do that. Well, for for VLV/CHV at least.
For other platforms it should still be possible, which would allow
AUX communication to occur in parallel for multiple DP ports.
v2: Drop stray crap from a comment (Imre)
Grab pps_mutex when clearing active_pipe
Fix a typo in the commit message
v3: Make vlv_active_pipe() static
Chris Wilson [Sun, 18 Dec 2016 15:37:24 +0000 (15:37 +0000)]
drm/i915: Swap if(enable_execlists) in i915_gem_request_alloc for a vfunc
A fairly trivial move of a matching pair of routines (for preparing a
request for construction) onto an engine vfunc. The ulterior motive is
to be able to create a mock request implementation.
Chris Wilson [Sun, 18 Dec 2016 15:37:23 +0000 (15:37 +0000)]
drm/i915/execlists: Request the kernel context be pinned high
PIN_HIGH is an expensive operation (in comparison to allocating from the
hole stack) unsuitable for frequent use (such as switching between
contexts). However, the kernel context should be pinned just once for
the lifetime of the driver, and here it is appropriate to keep it out of
the mappable range (in order to maximise mappable space for users).
Chris Wilson [Sun, 18 Dec 2016 15:37:22 +0000 (15:37 +0000)]
drm/i915: Mark the shadow gvt context as closed
As the shadow gvt is not user accessible and does not have an associated
vm, we can mark it as closed during its construction. This saves leaking
the internal knowledge of i915_gem_context into gvt/.
Chris Wilson [Sun, 18 Dec 2016 15:37:21 +0000 (15:37 +0000)]
drm/i915: Simplify releasing context reference
A few users only take the struct_mutex in order to release a reference
to a context. We can expose a kref_put_mutex() wrapper in order to
simplify these users, and optimise taking of the mutex to the final
unref.
Chris Wilson [Sun, 18 Dec 2016 15:37:20 +0000 (15:37 +0000)]
drm/i915: Unify active context tracking between legacy/execlists/guc
The requests conversion introduced a nasty bug where we could generate a
new request in the middle of constructing a request if we needed to idle
the system in order to evict space for a context. The request to idle
would be executed (and waited upon) before the current one, creating a
minor havoc in the seqno accounting, as we will consider the current
request to already be completed (prior to deferred seqno assignment) but
ring->last_retired_head would have been updated and still could allow
us to overwrite the current request before execution.
We also employed two different mechanisms to track the active context
until it was switched out. The legacy method allowed for waiting upon an
active context (it could forcibly evict any vma, including context's),
but the execlists method took a step backwards by pinning the vma for
the entire active lifespan of the context (the only way to evict was to
idle the entire GPU, not individual contexts). However, to circumvent
the tricky issue of locking (i.e. we cannot take struct_mutex at the
time of i915_gem_request_submit(), where we would want to move the
previous context onto the active tracker and unpin it), we take the
execlists approach and keep the contexts pinned until retirement.
The benefit of the execlists approach, more important for execlists than
legacy, was the reduction in work in pinning the context for each
request - as the context was kept pinned until idle, it could short
circuit the pinning for all active contexts.
We introduce new engine vfuncs to pin and unpin the context
respectively. The context is pinned at the start of the request, and
only unpinned when the following request is retired (this ensures that
the context is idle and coherent in main memory before we unpin it). We
move the engine->last_context tracking into the retirement itself
(rather than during request submission) in order to allow the submission
to be reordered or unwound without undue difficultly.
And finally an ulterior motive for unifying context handling was to
prepare for mock requests.
v2: Rename to last_retired_context, split out legacy_context tracking
for MI_SET_CONTEXT.
Chris Wilson [Sun, 18 Dec 2016 15:37:19 +0000 (15:37 +0000)]
drm/i915: Move intel_lrc_context_pin() to avoid the forward declaration
Just a simple move to avoid a forward declaration, though the diff likes
to present itself as a move of intel_logical_ring_alloc_request_extras()
in the opposite direction.
Matthew Auld [Tue, 13 Dec 2016 20:32:22 +0000 (20:32 +0000)]
drm/i915: convert to using range_overflows
Convert some of the obvious hand-rolled ranged overflow sanity checks to
our shiny new range_overflows macro.
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-4-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In a number places we hand-roll the overflow sanity check for ranges, so
roll that into single macro, conceived by Chris, along with its typed
variant.
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-3-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Matthew Auld [Tue, 13 Dec 2016 20:32:20 +0000 (20:32 +0000)]
drm/i915: move vma sanity checking into i915_vma_bind
If we move the sanity checking from gen8_alloc_va_range_3lvl and
gen6_alloc_va_range into i915_vma_bind, we will increase our coverage to
now both callbacks. We also convert each WARN_ON over to a GEM_WARN_ON.
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-2-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Matthew Auld [Tue, 13 Dec 2016 20:32:19 +0000 (20:32 +0000)]
drm/i915: introduce GEM_WARN_ON
In a similar spirit to GEM_BUG_ON we now also have GEM_WARN_ON, with the
simple goal of expressing warnings which are truly insane, and so are
only really useful for CI where we have some abusive tests.
v2:
- use BUILD_BUG_ON_INVALID for !DEBUG_GEM
- clarify commit message
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-1-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tvrtko Ursulin [Fri, 16 Dec 2016 13:18:42 +0000 (13:18 +0000)]
drm/i915: Fix use after free in logical_render_ring_init
Commit 3b3f1650b1ca ("drm/i915: Allocate intel_engine_cs
structure only for the enabled engines") introduced the
dynanically allocated engine instances and created an
potential use after free scenario in logical_render_ring_init
where lrc_destroy_wa_ctx_obj could be called after the engine
instance has been freed.
This can only happen during engine setup/init error handling
which luckily does not happen ever in practice.
Fix is to not call lrc_destroy_wa_ctx_obj since it would have
already been executed from the preceding engine cleanup.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 3b3f1650b1ca ("drm/i915: Allocate intel_engine_cs structure only for the enabled engines") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1481894322-2145-1-git-send-email-tvrtko.ursulin@linux.intel.com
udelay_range(1, 2) is inefficient and as discussions with Jani Nikula
<jani.nikula@linux.intel.com> unnecessary here. This replaces this
tight setting with a relaxed delay of min=20 and max=50 which helps
the hrtimer subsystem optimize timer handling.
udelay_range(2, 3) is inefficient and as discussions with Jani Nikula
<jani.nikula@linux.intel.com> unnecessary here. This replaces this
tight setting with a relaxed delay of min=20 and max=50. which helps
the hrtimer subsystem optimize timer handling.
Paulo Zanoni [Tue, 13 Dec 2016 20:57:44 +0000 (18:57 -0200)]
drm/i915: disable PSR by default on HSW/BDW
We've been ignoring the poor bugzilla reporters that say PSR causes
system lockups and all other sorts of problems. The earliest bug
report is from April, so I think we can use the "revert the offending
commit if no fixes are presented within 8 months" rule here.
Fixes: 9b58e352b463 ("drm/i915: Enable PSR by default on Haswell and Broadwell.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97602
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97515
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96736
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96704
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96569
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95176
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94985 Cc: <stable@vger.kernel.org> # v4.6+ Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Jim Bride <jim.bride@linux.intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1481662664-18986-1-git-send-email-paulo.r.zanoni@intel.com
Mika Kuoppala [Wed, 14 Dec 2016 12:26:20 +0000 (14:26 +0200)]
drm/i915: Fix setting of boost freq tunable
For limiting the max frequency of gpu, the max freq tunable
is not enough to hard limit the max gap. We now have also per
client boost max freq. When this tunable was introduced,
it was mistakenly made read only. Allow user to gain control by
setting it writable.
Fixes: 29ecd78d3b79 ("drm/i915: Define a separate variable and control for RPS waitboost frequency") Cc: <stable@vger.kernel.org> # v4.9+ Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1481718380-9170-1-git-send-email-mika.kuoppala@intel.com
Jani Nikula [Tue, 13 Dec 2016 11:10:59 +0000 (13:10 +0200)]
drm/i915: simplify check for I915G/I945G in bit 6 swizzling detection
Commit c9c4b6f6c283 ("drm/i915: fix swizzle detection for gen3") added a
complicated check for I915G/I945G. Pineview and other gen3 devices match
IS_MOBILE() anyway. Simplify.
Daniel Vetter [Tue, 13 Dec 2016 19:54:14 +0000 (20:54 +0100)]
drm/i915: tune down the fast link training vs boot fail
It's been unfixed since a while and no one is immediately working on
this. And we have the FIXME already. And now also a task in the DP
team's backlog.
Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
References: https://lists.freedesktop.org/archives/intel-gfx/2016-July/101951.html Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Adjust comment per Ville's feedback.] Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161213195414.28923-1-daniel.vetter@ffwll.ch
Manasi Navare [Sat, 10 Dec 2016 00:22:50 +0000 (16:22 -0800)]
drm/i915: Move all the DP compliance data to a separate struct
This patch does not change anything functionally, just cleans up
the DP compliance related variables and stores them all together
in a separate struct intel_dp_compliance. There is another struct
intel_dp_compliance_data to store all the test data. This makes it easy to
reset the compliance variables through a memset instead of
individual resetting.
v2:
* Removed functional changes for EDID (Jani Nikula)
Manasi Navare [Fri, 9 Dec 2016 03:05:12 +0000 (19:05 -0800)]
drm/i915: Find fallback link rate/lane count
If link training fails, then we need to fallback to lower
link rate first and if link training fails at RBR, then
fallback to lower lane count.
This function finds the next lower link rate/lane count
value after link training failure and limits the max
link_rate and lane_count values to these fallback values.
v7:
* Remove unnecessary intializations and remove redundant
call to intel_dp_common_rates (Jani Nikula)
v6:
* Cap the max link rate and lane count to the max
values obtained during fallback link training (Daniel Vetter)
v5:
* Start the fallback at the lane count value passed not
the max lane count (Jani Nikula)
v4:
* Remove the redundant variable link_train_failed
v3:
* Remove fallback_link_rate_index variable, just obtain
that using the helper intel_dp_link_rate_index (Jani Nikula)
v2:
Squash the patch that returns the link rate index (Jani Nikula)
Acked-by: Tony Cheng <tony.cheng@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1481252712-12925-1-git-send-email-manasi.d.navare@intel.com
Manasi Navare [Tue, 6 Dec 2016 00:27:36 +0000 (16:27 -0800)]
drm/i915: Compute sink's max lane count/link BW at Hotplug
Sink's capabilities are advertised through DPCD registers and get
updated only on hotplug. So they should be computed only once in the
long pulse handler and saved off in intel_dp structure for the use
later. For this reason two new fields max_sink_lane_count and
max_sink_link_bw are added to intel_dp structure.
This also simplifies the fallback link rate/lane count logic
to handle link training failure. In that case, the max_sink_link_bw
and max_sink_lane_count can be reccomputed to match the fallback
values lowering the sink capabilities due to link train failure.
Jani Nikula [Mon, 5 Dec 2016 07:30:34 +0000 (09:30 +0200)]
drm/i915/bxt: add bxt dsi gpio element support
Request the GPIO by index through the consumer API. For now, use a quick
hack to store the already requested ones, simply because I have no idea
whether this actually works or not, and I have no way to test it.
v2 by Mika: switch *NULL* to *"panel"* when requesting gpio for MIPI/DSI
panel.
Chris Wilson [Fri, 9 Dec 2016 15:05:55 +0000 (15:05 +0000)]
drm/i915: Retire before attempting to evict from the active lists
Some object retain an extra pin whilst they are active (e.g. contexts).
This excludes them from being considered for eviction unless we idle the
GPU. If before we look at the active list, we retire beforehand we can
hopefully remove a few excess pins and reduce the amount of searching
required.
Chris Wilson [Wed, 7 Dec 2016 13:34:11 +0000 (13:34 +0000)]
drm/i915: Reorder phys backing storage release
In commit a4f5ea64f0a8 ("drm/i915: Refactor object page API"), I
reordered the object->pages teardown to be more friendly wrt to a
separate obj->mm.lock. However, I overlooked the phys object and left it
with a dangling use-after-free of its phys_handle. Move the allocation
of the phys handle to get_pages and it release to put_pages to prevent
the invalid access and to improve symmetry.
v2: Add commentary about always aligning to page size.
Testcase: igt/drv_selftest/objects Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: a4f5ea64f0a8 ("drm/i915: Refactor object page API") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161207133411.8028-1-chris@chris-wilson.co.uk
drm/i915/psr: report psr2 hw enabled from psr2_ctl
For PSR2 , as per spec, PSR2_CTL bit 31 to be set.
for psr1, bit 31 in SRD_CTL to be set. Reporting
"HW Enabled & Active bit" status for psr2 from SRD_CTL
gives wrong status.
Robert Bragg [Wed, 7 Dec 2016 21:40:33 +0000 (21:40 +0000)]
drm/i915/perf: More documentation hooked to i915.rst
This adds a 'Perf' section to i915.rst with the following sub sections:
- Overview
- Comparison with Core Perf
- i915 Driver Entry Points
- i915 Perf Stream
- i915 Perf Observation Architecture Stream
- All i915 Perf Internals
v2:
section headers in i915.rst (Daniel Vetter)
missing symbol docs + other fixups (Matthew Auld)
Imre Deak [Mon, 5 Dec 2016 16:27:38 +0000 (18:27 +0200)]
drm/i915/gen9: Fix PCODE polling during SAGV disabling
According to the previous patch, it's possible atm that we call
intel_do_sagv_disable() only once during the 1ms period and time out if
that call fails. As opposed to this the spec says that we need to keep
retrying this request for a 1ms duration, so let's do this similarly to
the CDCLK change notification request.
v4-5:
- Rebased on the reply_mask, reply change.
v6:
- Remove w/s change. (Lyude)
- Rebased on the timeout_base argument change.
Cc: Lyude <cpaul@redhat.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Fixes: 656d1b89e5ff ("drm/i915/skl: Add support for the SAGV, fix underrun hangs") Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Lyude <lyude@redhat.com> (v4) Link: http://patchwork.freedesktop.org/patch/msgid/1480955258-26311-2-git-send-email-imre.deak@intel.com
drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
increased the timeout to match the spec, but we still see a timeout on
at least one SKL. A CDCLK change request following the failed one will
succeed nevertheless.
I could reproduce this problem easily by running kms_pipe_crc_basic in a
loop. In all failure cases _wait_for() was pre-empted for >3ms and so in
the worst case - when the pre-emption happened right after calculating
timeout__ in _wait_for() - we called skl_cdclk_wait_for_pcu_ready() only
once which failed and so _wait_for() timed out. As opposed to this the
spec says to keep retrying the request for at most a 3ms period.
To fix this send the first request explicitly to guarantee that there is
3ms between the first and last request. Though this matches the spec, I
noticed that in rare cases this can still time out if we sent only a few
requests (in the worst case 2) _and_ PCODE is busy for some reason even
after a previous request and a 3ms delay. To work around this retry the
polling with pre-emption disabled to maximize the number of requests.
Also increase the timeout to 10ms to account for interrupts that could
reduce the number of requests. With this change I couldn't trigger
the problem.
v2:
- Use 1ms poll period instead of 10us. (Chris)
v3:
- Poll with pre-emption disabled to increase the number of request
attempts. (Ville, Chris)
- Factor out a helper to poll, it's also needed by the next patch.
v4:
- Pass reply_mask, reply to skl_pcode_request(), instead of assuming the
reply is generic. (Ville)
v5:
- List the request specific timeout values as code comment. (Ville)
v6:
- Try the poll first with preemption enabled.
- Add code comment about first request being queued by PCODE. (Art)
- Add timeout_base_ms argument. (Ville)
v7:
- Clarify code comment about first queued request. (Chris)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Art Runyan <arthur.j.runyan@intel.com> Cc: <stable@vger.kernel.org> # v4.2- : 3b2c171 : drm/i915: Wait up to 3ms Cc: <stable@vger.kernel.org> # v4.2- Fixes: 5d96d8afcfbb ("drm/i915/skl: Deinit/init the display at suspend/resume")
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=97929
Testcase: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-B Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1480955258-26311-1-git-send-email-imre.deak@intel.com
Jani Nikula [Wed, 7 Dec 2016 20:48:09 +0000 (22:48 +0200)]
drm/i915: distinguish G33 and Pineview from each other
Pineview deserves to use its own platform enum (which was already added,
unused, previously). IS_G33() no longer matches Pineview, and gets
replaced by IS_G33() || IS_PINEVIEW() or equivalent. Pineview is no
longer an outlier among platform definitions.
Mahesh Kumar [Thu, 1 Dec 2016 15:49:37 +0000 (21:19 +0530)]
drm/i915/skl+: change WM calc to fixed point 16.16
This patch changes Watermak calculation to fixed point calculation.
Problem with current calculation is during plane_blocks_per_line
calculation we divide intermediate blocks with min_scanlines and
takes floor of the result because of integer operation.
hence we end-up assigning less blocks than required. Which leads to
flickers.
Changes since V1:
- Add fixed point data type as per Paulo's review
Changes since V2:
- use fixed_point instead of fp_16_16
Changes since V3:
- rebase
Changes since V4 (from Paulo):
- My original renaming suggestion was misunderstood, so implement it
- Simplify fixed_16_16_to_u32 implementation
- Fix indentation
Mahesh Kumar [Thu, 1 Dec 2016 15:49:34 +0000 (21:19 +0530)]
drm/i915/bxt: IPC WA for Broxton
Display Workarounds #1135
If IPC is enabled in BXT, display underruns are observed.
WA: The Line Time programmed in the WM_LINETIME register should be
half of the actual calculated Line Time.
Programmed Line Time = 1/2*Calculated Line Time
Changes since V1:
- Add Workaround number in commit & code
Changes since V2 (from Paulo):
- Bikeshed white space and make the WA tag look like the others
Mahesh Kumar [Thu, 1 Dec 2016 15:49:33 +0000 (21:19 +0530)]
drm/i915/skl: Add variables to check x_tile and y_tile
This patch adds variable to check for X_tiled & y_tiled planes, instead
of always checking against framebuffer-modifiers.
Changes:
- Created separate patch as per Paulo's comment
- Added x_tiled variable as well
Changes since V2:
- Incorporate Paulo's comments
- Rebase
Changes since V3 (from Paulo):
- Bikeshed indentation
Hans de Goede [Thu, 1 Dec 2016 20:29:09 +0000 (21:29 +0100)]
drm/i915/dsi: Fix chv_exec_gpio disabling the GPIOs it is setting
Set the CHV_GPIO_GPIOEN bit when updating GPIOs from chv_exec_gpio.
Fixes: a0a6d4ffd2ad ("drm/i915/dsi: add support for gpio elements on CHV") Cc: stable@vger.kernel.org Cc: Jani Nikula <jani.nikula@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161201202925.12220-3-hdegoede@redhat.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Hans de Goede [Fri, 2 Dec 2016 15:01:28 +0000 (16:01 +0100)]
drm/i915/dsi: Fix swapping of MIPI_SEQ_DEASSERT_RESET / MIPI_SEQ_ASSERT_RESET
Looking at the ADF code from the Android kernel sources for a
cherrytrail tablet I noticed that it is calling the
MIPI_SEQ_ASSERT_RESET sequence from the panel prepare hook.
Until commit b1cb1bd29189 ("drm/i915/dsi: update reset and power sequences
in panel prepare/unprepare hooks") the mainline i915 code was doing the
same. That commits effectively swaps the calling of MIPI_SEQ_ASSERT_RESET /
MIPI_SEQ_DEASSERT_RESET.
Looking at the naming of the sequences that is the right thing to do,
but the problem is, that the old mainline code and the ADF code was
actually calling the right sequence (tested on a cube iwork8 air tablet),
and the swapping of the calling breaks things.
This breakage was likely not noticed in testing because on cherrytrail,
currently chv_exec_gpio ends up disabling the gpio pins rather then
setting them (this is fixed in the next patch in this patch-set).
This commit fixes the swapping by fixing MIPI_SEQ_ASSERT/DEASSERT_RESET's
places in the enum defining them, so that their (new) names match their
actual use.
Changes in v2:
-Add a comment to the enum explaining that the assert/reassert names
are swapped in the spec
Fixes: b1cb1bd29189 ("drm/i915/dsi: update reset and power sequences...") Cc: Jani Nikula <jani.nikula@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161202150128.29871-1-hdegoede@redhat.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Robert Bragg [Thu, 1 Dec 2016 17:21:52 +0000 (17:21 +0000)]
drm/i915/perf: use DRM_DEBUG for userspace issues
Avoid using DRM_ERROR for conditions userspace can trigger with a bad
config when opening a stream or from not reading data in a timely
fashion (whereby the OA buffer fills up). These conditions are tested
by i-g-t which treats error messages as failures if using the test
runner. This wasn't an issue while the i915-perf igt tests were being
run in isolation.
One message relating to seeing a spurious zeroed report was changed to
use DRM_NOTE instead of DRM_ERROR. Ideally this warning shouldn't be
seen, but it's not a serious problem if it is. Considering that the
tail margin mechanism is only a heuristic it's possible we might see
this from time to time.
Ville Syrjälä [Mon, 5 Dec 2016 14:13:28 +0000 (16:13 +0200)]
drm/i915: Protect DSPARB registers with a spinlock
Each DSPARB register can house bits for two separate pipes, hence
we must protect the registers during reprogramming so that parallel
FIFO reconfigurations happening simultaneosly on multiple pipes won't
corrupt each others values.
We'll use a new spinlock for this instead of the wm_mutex since we'll
have to move the DSPARB programming to happen from the vblank evade
critical section, and we can't use mutexes in there.
v2: Document why we use a spinlock instead of a mutex (Maarten)
Jani Nikula [Thu, 1 Dec 2016 12:49:55 +0000 (14:49 +0200)]
drm/i915: replace platform flags with a platform enum
The platform flags in device info are (mostly) mutually
exclusive. Replace the flags with an enum. Add the platform enum also
for platforms that previously didn't have a flag, and give them codename
logging in dmesg.
Pineview remains an exception, the platform being G33 for that.
Arkadiusz Hiler [Mon, 5 Dec 2016 16:04:29 +0000 (17:04 +0100)]
drm/i915/guc: Drop comment on fwif autogeneration
The firmware interface file was initially partially autogenerated, but
this is no longer the case.
It was never updated automatically, and a lot manual changes were
introduced since.
>From now on any changes to the firmware interface will be managed by
hand, which gives us flexibility when it comes to structure reuse
(HuC/GuC) and naming conventions.
Chris Wilson [Tue, 6 Dec 2016 12:40:51 +0000 (12:40 +0000)]
drm/i915: Use memcpy_from_wc for GPU error capture
On all platforms we now always read the contents of buffers via the GTT,
i.e. using WC cpu access. Reads are slow, but they can be accelerated
with an internal read buffer using sse4.1 (movntqda). This is our
i915_memcpy_from_wc() routine which also checks for sse4.1 support and
so we can fallback to using a regular slow memcpy if we need to.
When compressing the pages, the reads are currently done inside zlib's
fill_window() routine and so we must copy the page into a temporary
which is then already inside the CPU cache and fast for zlib's
compression. When not compressing the pages, we don't need a temporary
and can just use the accelerated read from WC into the destination.
v2: Use zstream locals to reduce diff and allocate the additional
temporary storage only if sse4.1 is supported.
v3: Use length=0 for the sse4.1 support check
Matthew Auld [Fri, 2 Dec 2016 18:47:50 +0000 (18:47 +0000)]
drm/i915: allow GEM_BUG_ON expr checking with !DEBUG_GEM
Use BUILD_BUG_ON_INVALID(expr) in GEM_BUG_ON when building without
DEBUG_GEM. This means the compiler can now check the validity of expr
without generating any code, in turn preventing us from inadvertently
breaking the build when DEBUG_GEM is not enabled.
Chris Wilson [Mon, 5 Dec 2016 14:29:41 +0000 (14:29 +0000)]
drm/i915/execlists: Use list_safe_reset_next() instead of opencoding
list.h provides a macro for updating the next element in a safe
list-iter, so let's use it so that it is hopefully clearer to the reader
about the unusual behaviour, and also easier to grep.
Chris Wilson [Mon, 5 Dec 2016 14:29:40 +0000 (14:29 +0000)]
drm/i915: Enable swfence debugobject support for i915.ko
Only once the debugobject symbols are exported can we enable support for
debugging swfences when i915 is built as a module. Requires commit 2617fdca3f68 ("lib/debugobjects: export for use in modules")
Chris Wilson [Mon, 5 Dec 2016 14:29:39 +0000 (14:29 +0000)]
drm/i915: Implement local atomic_state_free callback
As we use debugobjects to track the lifetime of fences within our atomic
state, we ideally want to mark those objects as freed along with their
containers. This merits us hookin into config->funcs->atomic_state_free
for this purpose.
This allows us to enable debugobjects for sw-fences without triggering
known issues.
Chris Wilson [Mon, 5 Dec 2016 14:29:38 +0000 (14:29 +0000)]
drm/i915: Tidy i915_gem_valid_gtt_space()
We can replace a couple of tests with an assertion that the passed in
node is already allocated (as matches the existing call convention) and
by a small bit of refactoring we can bring the line lengths to under
80cols.
Soft-pinning depends upon being able to check for availabilty of an
interval and evict overlapping object from a drm_mm range manager very
quickly. Currently it uses a linear list, and so performance is dire and
not suitable as a general replacement. Worse, the current code will oops
if it tries to evict an active buffer.
It also helps if the routine reports the correct error codes as expected
by its callers and emits a tracepoint upon use.
For posterity since the wrong patch was pushed (i.e. that missed these
key points and had known bugs), this is the changelog that should have
been on commit 506a8e87d8d2 ("drm/i915: Add soft-pinning API for
execbuffer"):
Userspace can pass in an offset that it presumes the object is located
at. The kernel will then do its utmost to fit the object into that
location. The assumption is that userspace is handling its own object
locations (for example along with full-ppgtt) and that the kernel will
rarely have to make space for the user's requests.
This extends the DRM_IOCTL_I915_GEM_EXECBUFFER2 to do the following:
* if the user supplies a virtual address via the execobject->offset
*and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then
that object is placed at that offset in the address space selected
by the context specifier in execbuffer.
* the location must be aligned to the GTT page size, 4096 bytes
* as the object is placed exactly as specified, it may be used by this
execbuffer call without relocations pointing to it
It may fail to do so if:
* EINVAL is returned if the object does not have a 4096 byte aligned
address
* the object conflicts with another pinned object (either pinned by
hardware in that address space, e.g. scanouts in the aliasing ppgtt)
or within the same batch.
EBUSY is returned if the location is pinned by hardware
EINVAL is returned if the location is already in use by the batch
* EINVAL is returned if the object conflicts with its own alignment (as meets
the hardware requirements) or if the placement of the object does not fit
within the address space
All other execbuffer errors apply.
Presence of this execbuf extension may be queried by passing
I915_PARAM_HAS_EXEC_SOFTPIN to DRM_IOCTL_I915_GETPARAM and checking for
a reported value of 1 (or greater).
v2: Combine the hole/adjusted-hole ENOSPC checks
v3: More color, more splitting, more blurb.
Chris Wilson [Mon, 5 Dec 2016 14:29:36 +0000 (14:29 +0000)]
drm/i915: Mark all non-vma being inserted into the address spaces
We need to distinguish between full i915_vma structs and simple
drm_mm_nodes when considering eviction (i.e. we must be careful not to
treat a mere drm_mm_node as a much larger i915_vma causing memory
corruption, if we are lucky). To do this, color these not-a-vma with -1
(I915_COLOR_UNEVICTABLE).
Hans de Goede [Fri, 2 Dec 2016 14:29:04 +0000 (15:29 +0100)]
drm/i915/dsi: Do not clear DPOUNIT_CLOCK_GATE_DISABLE from vlv_init_display_clock_gating
On my Cherrytrail CUBE iwork8 Air tablet PIPE-A would get stuck on loading
i915 at boot 1 out of every 3 boots, resulting in a non functional LCD.
Once the i915 driver has successfully loaded, the panel can be disabled /
enabled without hitting this issue.
The getting stuck is caused by vlv_init_display_clock_gating() clearing
the DPOUNIT_CLOCK_GATE_DISABLE bit in DSPCLK_GATE_D when called from
chv_pipe_power_well_ops.enable() on driver load, while a pipe is enabled
driving the DSI LCD by the BIOS.
Clearing this bit while DSI is in use is a known issue and
intel_dsi_pre_enable() / intel_dsi_post_disable() already set / clear it
as appropriate.
This commit modifies vlv_init_display_clock_gating() to leave the
DPOUNIT_CLOCK_GATE_DISABLE bit alone fixing the pipe getting stuck.
Changes in v2:
-Replace PIPE-A with "a pipe" or "the pipe" in the commit msg and
comment
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97330 Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161202142904.25613-1-hdegoede@redhat.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Ville Syrjälä [Mon, 28 Nov 2016 17:37:15 +0000 (19:37 +0200)]
drm/i915: Write all DDL registers in one go
We'll want to decouple the vlv/chv wm register reprogramming from any
single pipe. So let's just write all the DDL registers in one go. We
already write all the wm registers anyway since the bits are sprinkled
all over the place and so writing them for just a single pipe would have
been too messy anyway.
Ville Syrjälä [Mon, 28 Nov 2016 17:37:14 +0000 (19:37 +0200)]
drm/i915: Zero out HOWM registers before writing new WM/HOWM register values
On VLV/CHV some of the watermark values are split across two registers:
low order bits in one, and high order bits in another. So we may not be
able to update a single watermark value atomically, and thus we must be
careful that we don't temporarily introduce out of bounds values during
the reprogramming. To prevent this we can simply zero out all the high
order bits initially, then we update the low order bits, and finally
we update the high order bits with the final value.
Ville Syrjälä [Mon, 28 Nov 2016 17:37:12 +0000 (19:37 +0200)]
drm/i915: Skip vblank wait if cxsr was already off
Before we attempt to turn any planes on or off we must first exit
csxr. That's due to cxsr effectively making the plane enable bits
read-only. Currently we achieve that with a vblank wait right after
toggling the cxsr enable bit. We do the vblank wait even if cxsr was
already off, which seems wasteful, so let's try to only do it when
absolutely necessary.
We could start tracking the cxsr state fully somewhere, but for now
it seems easiest to just have intel_set_memory_cxsr() return the
previous cxsr state.
Ville Syrjälä [Mon, 28 Nov 2016 17:37:11 +0000 (19:37 +0200)]
drm/i915: Protect cxsr state with wm_mutex
Let's protect the cxsr state with the wm_mutex, since it might
get poked from multiple places if there's a parallel plane update
happening with a pipe getting enable/disabled.
It's still pretty racy for the old platforms, but for vlv/chv it
should work, I think. If not, we'll improve it later anyway.
Ville Syrjälä [Mon, 28 Nov 2016 17:37:09 +0000 (19:37 +0200)]
drm/i915: Introduce vlv_invert_wm_value()
Add a small helper to do invert the vlv/chv values. Less fragile
perhaps, and let's us clearly mark all overlarge wateramarks as
disabled (by just making them all USHRT_MAX).
Ville Syrjälä [Mon, 28 Nov 2016 17:37:08 +0000 (19:37 +0200)]
drm/i915: Organize vlv/chv watermarks by plane_id
Store the vlv/chv watermark values in straight up arrays indexed by
enum plane_id. Avoids a lot of useless checks for the plane type when
we don't have to think which structure member we need to access.
Ville Syrjälä [Mon, 28 Nov 2016 17:37:06 +0000 (19:37 +0200)]
drm/i915: Clean up VLV/CHV maxfifo watermark setup
Let's compute the maxfifo watermarks using max() instead of min().
Can't even recall why I did it the other way originally. Anyways
using max() avoids having to initialize the watermarks to the max
value first.
Ville Syrjälä [Mon, 28 Nov 2016 17:37:05 +0000 (19:37 +0200)]
drm/i915: Fix the level 0 max_wm hack on VLV/CHV
The watermark should never exceed the FIFO size, so we need to
check against the current FIFO size instead of the theoretical
maximum when we clamp the level 0 watermark.
Ville Syrjälä [Mon, 28 Nov 2016 17:37:04 +0000 (19:37 +0200)]
drm/i915: Use the ilk_disable_lp_wm() return value
ilk_disable_lp_wm() will tell us whether the LP1+ watermarks were
disabled or not, and hence whether we need to for the vblank wait or
not. Let's use that information to eliminate some useless vblank
waits.
Ville Syrjälä [Mon, 28 Nov 2016 17:37:03 +0000 (19:37 +0200)]
drm/i915: Drop the nop intel_update_watermarks() call from haswell_crtc_enable()
HSW+ all use the .initial_watermarks() hook, so there's no point in
calling intel_update_watermarks() from HSW+ specific code. We'll still
hang on to the .initial_watermarks NULL check since theoretically if the
memory latencies are not populated we would not populate the function
pointer either.
drm/i915: Validate mode against max. link data rate for DP MST
Not validating the mode rate against max. link rate results in not pruning
invalid modes. For e.g, a HBR2 5.4 Gbps 2-lane configuration does not
support 4k@60Hz. But, we do not reject this mode.
So, make use of the helpers in intel_dp to validate mode data rate against
max. link data rate of a configuration.
v3: Renamed local variables again for consistency (Manasi)
v2: Renamed mode data rate local variable to be more explanatory.
We store DP link rates as link clock frequencies in kHz, just like all
other clock values. But, DP link rates in the DP Spec. are expressed in
Gbps/lane, which seems to have led to some confusion.
E.g., for HBR2
Max. data rate = 5.4 Gbps/lane x 4 lane x 8/10 x 1/8 = 2160000 kBps
where, 8/10 is for channel encoding and 1/8 is for bit to Byte conversion
Using link clock frequency, like we do
Max. data rate = 540000 kHz * 4 lanes = 2160000 kSymbols/s
Because, each symbol has 8 bit of data, this is 2160000 kBps
and there is no need to account for channel encoding here.
But, currently we do 540000 kHz * 4 lanes * (8/10) = 1728000 kBps
Similarly, while computing the required link bandwidth for a mode,
there is a mysterious 1/10 term.
This should simply be pixel_clock kHz * (bpp/8) to give the final result in
kBps
v2: Changed to DIV_ROUND_UP() and comment changes (Ville)
Imre Deak [Fri, 2 Dec 2016 16:35:41 +0000 (18:35 +0200)]
drm/i915: Add I2C and DP-AUX char devices to debug kconfig
These char devices exposing the driver's I2C and DP-AUX adapters for
user space tools are useful to debug display output related issues.
Enable them with the rest of additional driver debug options.
Linus Torvalds [Sun, 4 Dec 2016 00:40:21 +0000 (16:40 -0800)]
Merge tag 'drm-fixes-for-v4.9-rc8' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"A pretty small pull request: a couple of AMD powerxpress regression
fixes and a power management fix, a couple of i915 fixes and one hdlcd
fix, along with one core don't oops because of incorrect API usage fix"
* tag 'drm-fixes-for-v4.9-rc8' of git://people.freedesktop.org/~airlied/linux:
drm/i915: drop the struct_mutex when wedged or trying to reset
drm/i915: Don't touch NULL sg on i915_gem_object_get_pages_gtt() error
drm: Don't call drm_for_each_crtc with a non-KMS driver
drm/radeon: fix check for port PM availability
drm/amdgpu: fix check for port PM availability
drm/amd/powerplay: initialize the soft_regs offset in struct smu7_hwmgr
drm: hdlcd: Fix cleanup order
Dave Airlie [Sat, 3 Dec 2016 20:31:26 +0000 (06:31 +1000)]
Merge tag 'drm-intel-fixes-2016-12-01' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
2 intel fixes.
* tag 'drm-intel-fixes-2016-12-01' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: drop the struct_mutex when wedged or trying to reset
drm/i915: Don't touch NULL sg on i915_gem_object_get_pages_gtt() error
Linus Torvalds [Sat, 3 Dec 2016 02:48:11 +0000 (18:48 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge more fixes from Andrew Morton:
"2 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm, vmscan: add cond_resched() into shrink_node_memcg()
mm: workingset: fix NULL ptr in count_shadow_nodes
Michal Hocko [Sat, 3 Dec 2016 01:26:48 +0000 (17:26 -0800)]
mm, vmscan: add cond_resched() into shrink_node_memcg()
Boris Zhmurov has reported RCU stalls during the kswapd reclaim:
INFO: rcu_sched detected stalls on CPUs/tasks:
23-...: (22 ticks this GP) idle=92f/140000000000000/0 softirq=2638404/2638404 fqs=23
(detected by 4, t=6389 jiffies, g=786259, c=786258, q=42115)
Task dump for CPU 23:
kswapd1 R running task 0 148 2 0x00000008
Call Trace:
shrink_node+0xd2/0x2f0
kswapd+0x2cb/0x6a0
mem_cgroup_shrink_node+0x160/0x160
kthread+0xbd/0xe0
__switch_to+0x1fa/0x5c0
ret_from_fork+0x1f/0x40
kthread_create_on_node+0x180/0x180
a closer code inspection has shown that we might indeed miss all the
scheduling points in the reclaim path if no pages can be isolated from
the LRU list. This is a pathological case but other reports from Donald
Buczek have shown that we might indeed hit such a path:
this is minute long snapshot which didn't take a single page from the
LRU. It is not entirely clear why only 1303 pages have been scanned
during that time (maybe there was a heavy IRQ activity interfering).
In any case it looks like we can really hit long periods without
scheduling on non preemptive kernels so an explicit cond_resched() in
shrink_node_memcg which is independent on the reclaim operation is due.
Link: http://lkml.kernel.org/r/20161202095841.16648-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Boris Zhmurov <bb@kernelpanic.ru> Tested-by: Boris Zhmurov <bb@kernelpanic.ru> Reported-by: Donald Buczek <buczek@molgen.mpg.de> Reported-by: "Christopher S. Aker" <caker@theshore.net> Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Sat, 3 Dec 2016 01:26:45 +0000 (17:26 -0800)]
mm: workingset: fix NULL ptr in count_shadow_nodes
Commit 0a6b76dd23fa ("mm: workingset: make shadow node shrinker memcg
aware") has made the workingset shadow nodes shrinker memcg aware. The
implementation is not correct though because memcg_kmem_enabled() might
become true while we are doing a global reclaim when the sc->memcg might
be NULL which is exactly what Marek has seen:
This patch fixes the issue by checking sc->memcg rather than
memcg_kmem_enabled() which is sufficient because shrink_slab makes sure
that only memcg aware shrinkers will get non-NULL memcgs and only if
memcg_kmem_enabled is true.
Fixes: 0a6b76dd23fa ("mm: workingset: make shadow node shrinker memcg aware") Link: http://lkml.kernel.org/r/20161201132156.21450-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Marek Marczykowski-Górecki <marmarek@mimuw.edu.pl> Tested-by: Marek Marczykowski-Górecki <marmarek@mimuw.edu.pl> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: <stable@vger.kernel.org> [4.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>