Futex code is smarter than most other gup_fast O_DIRECT code and knows
about the compound internals. However now doing a put_page(head_page)
will not release the pin on the tail page taken by gup-fast, leading to
all sort of refcounting bugchecks. Getting a stable head_page is a little
tricky.
page_head = page is there because if this is not a tail page it's also the
page_head. Only in case this is a tail page, compound_head is called,
otherwise it's guaranteed unnecessary. And if it's a tail page
compound_head has to run atomically inside irq disabled section
__get_user_pages_fast before returning. Otherwise ->first_page won't be a
stable pointer.
Disableing irq before __get_user_page_fast and releasing irq after running
compound_head is needed because if __get_user_page_fast returns == 1, it
means the huge pmd is established and cannot go away from under us.
pmdp_splitting_flush_notify in __split_huge_page_splitting will have to
wait for local_irq_enable before the IPI delivery can return. This means
__split_huge_page_refcount can't be running from under us, and in turn
when we run compound_head(page) we're not reading a dangling pointer from
tailpage->first_page. Then after we get to stable head page, we are
always safe to call compound_lock and after taking the compound lock on
head page we can finally re-check if the page returned by gup-fast is
still a tail page. in which case we're set and we didn't need to split
the hugepage in order to take a futex on it.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>