When unloading the driver, nvme_disable_io_queues() calls nvme_delete_queue(),
which sends an nvme_admin_delete_cq command to the admin SQ. So when that
command completes, the lock acquired by nvme_irq() actually belongs to the
admin queue, while the lock that nvme_del_cq_end() tries to acquire belongs
to an I/O queue. The two locks are always taken in that order, so this
cannot deadlock.
This patch adds lock nesting notation (sketched after the trace below) to
fix the following false-positive report.
[ 109.840952] =============================================
[ 109.846379] [ INFO: possible recursive locking detected ]
[ 109.851806] 4.5.0+ #180 Tainted: G E
[ 109.856533] ---------------------------------------------
[ 109.861958] swapper/0/0 is trying to acquire lock:
[ 109.866771]  (&(&nvmeq->q_lock)->rlock){-.....}, at: [<ffffffffc0820bc6>] nvme_del_cq_end+0x26/0x70 [nvme]
[ 109.876535]
[ 109.876535] but task is already holding lock:
[ 109.882398]  (&(&nvmeq->q_lock)->rlock){-.....}, at: [<ffffffffc0820c2b>] nvme_irq+0x1b/0x50 [nvme]
[ 109.891547]
[ 109.891547] other info that might help us debug this:
[ 109.898107] Possible unsafe locking scenario:
[ 109.898107]
[ 109.904056] CPU0
[ 109.906515] ----
[ 109.908974] lock(&(&nvmeq->q_lock)->rlock);
[ 109.913381] lock(&(&nvmeq->q_lock)->rlock);
[ 109.917787]
[ 109.917787] *** DEADLOCK ***
[ 109.917787]
[ 109.923738] May be due to missing lock nesting notation
[ 109.923738]
[ 109.930558] 1 lock held by swapper/0/0:
[ 109.934413]  #0:  (&(&nvmeq->q_lock)->rlock){-.....}, at: [<ffffffffc0820c2b>] nvme_irq+0x1b/0x50 [nvme]
[ 109.944010]
[ 109.944010] stack backtrace:
[ 109.948389] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G E 4.5.0+ #180
[ 109.955734] Hardware name: Dell Inc. OptiPlex 7010/0YXT71, BIOS A15 08/12/2013
[ 109.962989]  0000000000000000 ffff88011e203c38 ffffffff81383d9c ffffffff81c13540
[ 109.970478]  ffffffff826711d0 ffff88011e203ce8 ffffffff810bb429 0000000000000046
[ 109.977964]  0000000000000046 0000000000000000 0000000000b2e597 ffffffff81f4cb00
[ 109.985453] Call Trace:
[ 109.987911]  <IRQ>  [<ffffffff81383d9c>] dump_stack+0x85/0xc9
[ 109.993711]  [<ffffffff810bb429>] __lock_acquire+0x19b9/0x1c60
[ 109.999575]  [<ffffffff810b6d1d>] ? trace_hardirqs_off+0xd/0x10
[ 110.005524]  [<ffffffff810b386d>] ? complete+0x3d/0x50
[ 110.010688]  [<ffffffff810bb760>] lock_acquire+0x90/0xf0
[ 110.016029]  [<ffffffffc0820bc6>] ? nvme_del_cq_end+0x26/0x70 [nvme]
[ 110.022418]  [<ffffffff81772afb>] _raw_spin_lock_irqsave+0x4b/0x60
[ 110.028632]  [<ffffffffc0820bc6>] ? nvme_del_cq_end+0x26/0x70 [nvme]
[ 110.035019]  [<ffffffffc0820bc6>] nvme_del_cq_end+0x26/0x70 [nvme]
[ 110.041232]  [<ffffffff8135b485>] blk_mq_end_request+0x35/0x60
[ 110.047095]  [<ffffffffc0821ad8>] nvme_complete_rq+0x68/0x190 [nvme]
[ 110.053481]  [<ffffffff8135b53f>] __blk_mq_complete_request+0x8f/0x130
[ 110.060043]  [<ffffffff8135b611>] blk_mq_complete_request+0x31/0x40
[ 110.066343]  [<ffffffffc08209e3>] __nvme_process_cq+0x83/0x240 [nvme]
[ 110.072818]  [<ffffffffc0820c35>] nvme_irq+0x25/0x50 [nvme]
[ 110.078419]  [<ffffffff810cdb66>] handle_irq_event_percpu+0x36/0x110
[ 110.084804]  [<ffffffff810cdc77>] handle_irq_event+0x37/0x60
[ 110.090491]  [<ffffffff810d0ea3>] handle_edge_irq+0x93/0x150
[ 110.096180]  [<ffffffff81012306>] handle_irq+0xa6/0x130
[ 110.101431]  [<ffffffff81011abe>] do_IRQ+0x5e/0x120
[ 110.106333]  [<ffffffff8177384c>] common_interrupt+0x8c/0x8c
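
The annotation itself is small. Below is a minimal sketch of the change,
assuming the 4.5-era drivers/nvme/host/pci.c code paths shown in the trace
(nvme_del_queue_end() is the surrounding completion helper from that file):
nvme_del_cq_end() takes the I/O queue's q_lock with
spin_lock_irqsave_nested() and SINGLE_DEPTH_NESTING, so lockdep treats it
as a distinct nesting level from the admin queue's q_lock held by
nvme_irq().

	static void nvme_del_cq_end(struct request *req, int error)
	{
		struct nvme_queue *nvmeq = req->end_io_data;

		if (!error) {
			unsigned long flags;

			/*
			 * We might be called with the admin queue's q_lock
			 * held, and the I/O queue's q_lock always nests
			 * inside it, so tell lockdep this acquisition is
			 * one level deep rather than a recursive take of
			 * the same lock.
			 */
			spin_lock_irqsave_nested(&nvmeq->q_lock, flags,
						 SINGLE_DEPTH_NESTING);
			nvme_process_cq(nvmeq);
			spin_unlock_irqrestore(&nvmeq->q_lock, flags);
		}

		nvme_del_queue_end(req, error);
	}
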
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>