Rather than calling vmalloc() repeatedly to grow the firmware image as
we receive data from userspace, just allocate and fill individual pages.
Then vmap() the whole lot in one go when we're done.
A quick test with a 337KiB iwlagn firmware shows the time taken for
request_firmware() going from ~32ms to ~5ms after I apply this patch.
[v2: define PAGE_KERNEL_RO as PAGE_KERNEL where necessary, use min_t()]
[v3: kunmap() takes the struct page *, not the virtual address]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Tested-by: Sachin Sant <sachinp@in.ibm.com>