From 7b7a119e8546e27227a7969883a3c34ed7dbb0cf Mon Sep 17 00:00:00 2001
From: Chris Wilson
Date: Mon, 31 Oct 2016 12:40:48 +0000
Subject: [PATCH] drm/i915: Mark up obj->mm.lock for shrinker

As we may allocate from within the obj->mm.lock we may enter the
shrinker for direct reclaim. Operating on the current object is
prevented by checking for obj->mm.pages (which is only set as the last
operation in the allocation path). However, we need to identify the
single recursion of accessing another object's obj->mm.lock as the two
locks have identical class and so appear to be the same to lockdep,
convincing it that a deadlock is possible. Use mutex_lock_nested() to
remove the false positive.

[ 2165.945734] =================================
[ 2165.945749] [ INFO: inconsistent lock state ]
[ 2165.945765] 4.9.0-rc2+ #2 Tainted: G W
[ 2165.945781] ---------------------------------
[ 2165.945796] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 2165.945816] kswapd0/62 [HC0[0]:SC0[0]:HE1:SE1] takes: (&obj->mm.lock){+.+.?.}, at: [] i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.945904] {RECLAIM_FS-ON-W} state was registered at:
[ 2165.945931] [] mark_held_locks+0x6f/0xa0
[ 2165.945956] [] lockdep_trace_alloc+0x69/0xc0
[ 2165.945982] [] kmem_cache_alloc_trace+0x33/0x2a0
[ 2165.946019] [] i915_gem_object_get_pages_stolen+0x6a/0xd0 [i915]
[ 2165.946060] [] ____i915_gem_object_get_pages+0x20/0x60 [i915]
[ 2165.946098] [] __i915_gem_object_get_pages+0x58/0x70 [i915]
[ 2165.946138] [] _i915_gem_object_create_stolen+0xec/0x120 [i915]
[ 2165.946177] [] i915_gem_object_create_stolen_for_preallocated+0xf3/0x3f0 [i915]
[ 2165.946222] [] intel_alloc_initial_plane_obj.isra.125+0xd3/0x200 [i915]
[ 2165.946266] [] intel_modeset_init+0x931/0x1530 [i915]
[ 2165.946301] [] i915_driver_load+0xa14/0x14a0 [i915]
[ 2165.946335] [] i915_pci_probe+0x4f/0x70 [i915]
[ 2165.946362] [] local_pci_probe+0x42/0xa0
[ 2165.946386] [] pci_device_probe+0x103/0x150
[ 2165.946411] [] driver_probe_device+0x223/0x430
[ 2165.946436] [] __driver_attach+0xe3/0xf0
[ 2165.946461] [] bus_for_each_dev+0x73/0xc0
[ 2165.946485] [] driver_attach+0x1e/0x20
[ 2165.946508] [] bus_add_driver+0x173/0x270
[ 2165.946533] [] driver_register+0x60/0xe0
[ 2165.946557] [] __pci_register_driver+0x5d/0x60
[ 2165.946606] [] soundcore_open+0x17/0x230 [soundcore]
[ 2165.946636] [] do_one_initcall+0x50/0x180
[ 2165.946661] [] do_init_module+0x5f/0x1f1
[ 2165.946685] [] load_module+0x2174/0x2a80
[ 2165.946709] [] SYSC_finit_module+0xdf/0x110
[ 2165.946734] [] SyS_finit_module+0xe/0x10
[ 2165.946758] [] entry_SYSCALL_64_fastpath+0x18/0xad
[ 2165.946776] irq event stamp: 90871
[ 2165.946788] hardirqs last enabled at (90871):
[ 2165.946805] [] __mutex_unlock_slowpath+0x11a/0x1c0
[ 2165.946823] hardirqs last disabled at (90870):
[ 2165.946839] [] __mutex_unlock_slowpath+0x5b/0x1c0
[ 2165.946856] softirqs last enabled at (90858):
[ 2165.946872] [] __do_softirq+0x39a/0x4c6
[ 2165.946887] softirqs last disabled at (90671):
[ 2165.946902] [] irq_exit+0xea/0xf0
[ 2165.946916] other info that might help us debug this:
[ 2165.946936] Possible unsafe locking scenario:
[ 2165.946955] CPU0
[ 2165.946965] ----
[ 2165.946975] lock(&obj->mm.lock);
[ 2165.947000]
[ 2165.947010] lock(&obj->mm.lock);
[ 2165.947035] *** DEADLOCK ***
[ 2165.947054] 2 locks held by kswapd0/62:
[ 2165.947067] #0: (shrinker_rwsem){++++..}, at: [] shrink_slab.part.40+0x5e/0x5d0
[ 2165.947120] #1: (&dev->struct_mutex){+.+.+.}, at: [] i915_gem_shrinker_lock+0x1b/0x60 [i915]
[ 2165.948909] stack backtrace:
[ 2165.950650] CPU: 2 PID: 62 Comm: kswapd0 Tainted: G W 4.9.0-rc2+ #2
[ 2165.951587] Hardware name: LENOVO 80MX/Lenovo E31-80, BIOS DCCN34WW(V2.03) 12/01/2015
[ 2165.952484] ffffc90000b5f8c8 ffffffffb137f645 ffff88016c5a2700 ffffffffb25f20a0
[ 2165.953395] ffffc90000b5f918 ffffffffb10bcecd 0000000000000000 ffff880100000001
[ 2165.954305] 0000000000000001 000000000000000a ffff88016c5a2fd0 ffff88016c5a2700
[ 2165.955240] Call Trace:
[ 2165.956170] [] dump_stack+0x68/0x93
[ 2165.957071] [] print_usage_bug+0x1dd/0x1f0
[ 2165.957979] [] mark_lock+0x559/0x5c0
[ 2165.958875] [] ? print_shortest_lock_dependencies+0x1b0/0x1b0
[ 2165.959829] [] __lock_acquire+0x66d/0x12a0
[ 2165.960729] [] ? __slab_free+0xa1/0x340
[ 2165.961625] [] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 2165.962530] [] ? mark_held_locks+0x6f/0xa0
[ 2165.963457] [] lock_acquire+0xf0/0x1f0
[ 2165.964368] [] ? i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.965269] [] ? i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.966150] [] mutex_lock_nested+0x77/0x420
[ 2165.967030] [] ? i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.967952] [] ? __i915_gem_object_put_pages.part.58+0x161/0x1b0 [i915]
[ 2165.968835] [] i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.969712] [] i915_gem_shrinker_scan+0x70/0xb0 [i915]
[ 2165.970591] [] shrink_slab.part.40+0x1fe/0x5d0
[ 2165.971504] [] shrink_node+0x22c/0x320
[ 2165.972371] [] kswapd+0x38b/0x9b0
[ 2165.973238] [] ? mem_cgroup_shrink_node+0x330/0x330
[ 2165.974068] [] kthread+0xff/0x120
[ 2165.974929] [] ? kthread_park+0x60/0x60
[ 2165.975847] [] ret_from_fork+0x27/0x40

Reported-by: Tvrtko Ursulin
Fixes: 1233e2db199d ("drm/i915: Move object backing storage manipulation...")
Testcase: igt/gem_ctx_create/maximum-swap
Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Joonas Lahtinen
Link: http://patchwork.freedesktop.org/patch/msgid/20161031124048.30355-1-chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 0241658af16b..0daa09cabbcc 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -223,7 +223,9 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
 				continue;
 
 			if (unsafe_drop_pages(obj)) {
-				mutex_lock(&obj->mm.lock);
+				/* May arrive from get_pages on another bo */
+				mutex_lock_nested(&obj->mm.lock,
+						  SINGLE_DEPTH_NESTING);
 				if (!obj->mm.pages) {
 					__i915_gem_object_invalidate(obj);
 					count += obj->base.size >> PAGE_SHIFT;
-- 
2.39.5