git.karo-electronics.de Git - karo-tx-linux.git/commit
mm: filemap: update find_get_pages_tag() to deal with shadow entries
author Johannes Weiner <hannes@cmpxchg.org>
Thu, 24 Apr 2014 22:55:25 +0000 (08:55 +1000)
committer Stephen Rothwell <sfr@canb.auug.org.au>
Thu, 24 Apr 2014 22:55:25 +0000 (08:55 +1000)
commit 2454d37dd2bab4c02f3d326072b6f2e5cdb5dd4b
tree f3ffc4b3d65ce5b6104a6ebf346a17e59d9c34c4
parent 8842e9c875805b68a08d038c03ce3ee7d27dbea9

Dave Jones reports the following crash when find_get_pages_tag() runs into
an exceptional entry:

kernel BUG at mm/filemap.c:1347!
RIP: 0010:[<ffffffffb815aeab>]  [<ffffffffb815aeab>] find_get_pages_tag+0x1cb/0x220
Call Trace:
 [<ffffffffb815ad16>] ? find_get_pages_tag+0x36/0x220
 [<ffffffffb8168511>] pagevec_lookup_tag+0x21/0x30
 [<ffffffffb81595de>] filemap_fdatawait_range+0xbe/0x1e0
 [<ffffffffb8159727>] filemap_fdatawait+0x27/0x30
 [<ffffffffb81f2fa4>] sync_inodes_sb+0x204/0x2a0
 [<ffffffffb874d98f>] ? wait_for_completion+0xff/0x130
 [<ffffffffb81fa5b0>] ? vfs_fsync+0x40/0x40
 [<ffffffffb81fa5c9>] sync_inodes_one_sb+0x19/0x20
 [<ffffffffb81caab2>] iterate_supers+0xb2/0x110
 [<ffffffffb81fa864>] sys_sync+0x44/0xb0
 [<ffffffffb875c4a9>] ia32_do_call+0x13/0x13

1343                         /*
1344                          * This function is never used on a shmem/tmpfs
1345                          * mapping, so a swap entry won't be found here.
1346                          */
1347                         BUG();

After 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page cache
radix trees") this comment and BUG() are out of date because exceptional
entries can now appear in all mappings - as shadows of recently evicted
pages.

However, as Hugh Dickins notes,

  "it is truly surprising for a PAGECACHE_TAG_WRITEBACK (and probably
   any other PAGECACHE_TAG_*) to appear on an exceptional entry.

   I expect it comes down to an occasional race in RCU lookup of the
   radix_tree: lacking absolute synchronization, we might sometimes
   catch an exceptional entry, with the tag which really belongs with
   the unexceptional entry which was there an instant before."

And indeed, not only is the tree walk lockless, the tags are also read in
chunks, one radix tree node at a time.  There is plenty of time for page
reclaim to swoop in and replace a page that was already looked up as
tagged with a shadow entry.

Remove the BUG() and update the comment.  While reviewing all other lookup
sites for whether they properly deal with shadow entries of evicted pages,
update all the comments and fix memcg file charge moving to not miss
shmem/tmpfs swapcache pages.
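
To illustrate the behavior change, here is a minimal userspace sketch, not
the actual patch.  It assumes exceptional entries are distinguished from
page pointers by a low tag bit (modeled here as EXCEPTIONAL_ENTRY); the
names make_shadow(), is_exceptional() and collect_tagged_pages() are
hypothetical stand-ins for the kernel's radix tree helpers and lookup loop.
A tagged slot that turns out to hold a shadow entry is skipped instead of
triggering BUG():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: models the low-bit tagging of non-page radix tree entries. */
#define EXCEPTIONAL_ENTRY 2UL

/* Simplified stand-in for the kernel's struct page. */
struct page {
	unsigned long index;
};

/* Build a hypothetical shadow entry: payload stored above the tag bits. */
static void *make_shadow(unsigned long payload)
{
	assert(payload <= (UINTPTR_MAX >> 2));
	return (void *)(uintptr_t)((payload << 2) | EXCEPTIONAL_ENTRY);
}

/* Non-zero if the slot holds a shadow entry rather than a page pointer. */
static int is_exceptional(const void *entry)
{
	return ((uintptr_t)entry & EXCEPTIONAL_ENTRY) != 0;
}

/*
 * Collect up to nr_pages pages from tagged slots.  Empty slots and
 * exceptional entries are skipped: a tag seen on a shadow entry is
 * treated as stale, left over from the page that reclaim just replaced.
 */
static size_t collect_tagged_pages(void *const *slots,
				   const unsigned char *tagged,
				   size_t nr_slots,
				   struct page **pages, size_t nr_pages)
{
	size_t found = 0;

	for (size_t i = 0; i < nr_slots && found < nr_pages; i++) {
		void *entry = slots[i];

		if (!tagged[i] || entry == NULL)
			continue;
		if (is_exceptional(entry))
			continue;	/* shadow entry with a stale tag */
		pages[found++] = entry;
	}
	return found;
}
```

In this model, a caller that passes three tagged slots holding a page, a
shadow entry and another page gets back the two pages and never sees the
shadow entry.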

Fixes: 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page cache radix trees")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/filemap.c
mm/memcontrol.c
mm/truncate.c