git.karo-electronics.de Git - karo-tx-linux.git/commit

aio: v4 ensure access to ctx->ring_pages is correctly serialised for migration

commit fa8a53c39f3fdde98c9eace6a9b412143f0f6ed6 upstream.

As reported by Tang Chen, Gu Zheng and Yasuaki Isimatsu, the following issues
exist in the aio ring page migration support.

As a result, for example, we have the following problem:

            thread 1                      |              thread 2
                                          |
aio_migratepage()                         |
|-> take ctx->completion_lock            |
|-> migrate_page_copy(new, old)          |
|   *NOW*, ctx->ring_pages[idx] == old   |
                                          |
                                          |    *NOW*, ctx->ring_pages[idx] == old
                                          |    aio_read_events_ring()
                                          |     |-> ring = kmap_atomic(ctx->ring_pages[0])
                                          |     |-> ring->head = head;          *HERE, write to the old ring page*
                                          |     |-> kunmap_atomic(ring);
                                          |
|-> ctx->ring_pages[idx] = new           |
|   *BUT NOW*, the content of            |
|    ring_pages[idx] is old.             |
|-> release ctx->completion_lock         |

As above, the new ring page will not be updated.

Fix this issue, as well as prevent races in aio_ring_setup() by holding
the ring_lock mutex during kioctx setup and page migration.  This avoids
the overhead of taking another spinlock in aio_read_events_ring() as Tang's
and Gu's original fix did, pushing the overhead into the migration code.

Note that to handle the nesting of ring_lock inside of mmap_sem, the
migratepage operation uses mutex_trylock().  Page migration is not a 100%
critical operation in this case, so the ocassional failure can be
tolerated.  This issue was reported by Sasha Levin.

Based on feedback from Linus, avoid the extra taking of ctx->completion_lock.
Instead, make page migration fully serialised by mapping->private_lock, and
have aio_free_ring() simply disconnect the kioctx from the mapping by calling
put_aio_ring_file() before touching ctx->ring_pages[].  This simplifies the
error handling logic in aio_migratepage(), and should improve robustness.

v4: always do mutex_unlock() in cases when kioctx setup fails.

Reported-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

author	Benjamin LaHaise <bcrl@kvack.org>
	Fri, 28 Mar 2014 14:14:45 +0000 (10:14 -0400)
committer	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
	Tue, 13 May 2014 11:32:56 +0000 (13:32 +0200)
commit	0b729b32b5f62a15059208731ac3d6b5a7712e20
tree	e1c6fdbd7ac2c789d033f86a62061f5373a160b6	tree \| snapshot
parent	0451fd9938dd008f59256190611a3a91c2308775	commit \| diff