Buggy drivers (e.g. fsl_udc) could call dma_pool_alloc from atomic
context with GFP_KERNEL. In most instances, the first pool_alloc_page
call would succeed and the sleeping functions would never be called. This
allowed the buggy drivers to slip through the cracks.
Add a might_sleep_if() checking for __GFP_WAIT in flags.