thp: set compound tail page _count to zero
Commit 70b50f94f1644 ("mm: thp: tail page refcounting fix") keeps
page_tail->_count at zero at all times. However, the current kernel does
not set page_tail->_count to zero when a 1GB page is used. As a result,
when KVM maps a 1GB page for the IOMMU, the kernel oopses because a tail
page's _count is not zero:
kernel BUG at include/linux/mm.h:386!
invalid opcode: 0000 [#1] SMP
Call Trace:
[<ffffffff81072f7f>] gup_pud_range+0xb8/0x19d
[<ffffffff8107312f>] get_user_pages_fast+0xcb/0x192
[<ffffffff810bc450>] ? trace_hardirqs_off+0xd/0xf
[<ffffffff81006a24>] hva_to_pfn+0x119/0x2f2
[<ffffffff81006c29>] gfn_to_pfn_memslot+0x2c/0x2e
[<ffffffff8100b909>] kvm_iommu_map_pages+0xfd/0x1c1
[<ffffffff8100ba49>] kvm_iommu_map_memslots+0x7c/0xbd
[<ffffffff8100b9cd>] ? kvm_iommu_map_pages+0x1c1/0x1c1
[<ffffffff8100bb34>] kvm_iommu_map_guest+0xaa/0xbf
[<ffffffff8100aeb0>] kvm_vm_ioctl_assigned_device+0x2ef/0xa47
[<ffffffff8100ac6d>] ? kvm_vm_ioctl_assigned_device+0xac/0xa47
[<ffffffff8104f2a6>] ? native_sched_clock+0x32/0x6b
[<ffffffff810b0c02>] ? sched_clock_cpu+0x45/0xd4
[<ffffffff810bc450>] ? trace_hardirqs_off+0xd/0xf
[<ffffffff810b0cd2>] ? local_clock+0x41/0x5a
[<ffffffff810bc8a1>] ? lock_release_holdtime+0x2c/0x129
[<ffffffff8115762d>] ? cmpxchg_double_slab+0xd0/0x12b
[<ffffffff81248f47>] ? avc_has_perm_noaudit+0x388/0x399
[<ffffffff8104f2a6>] ? native_sched_clock+0x32/0x6b
[<ffffffff8104f2e8>] ? sched_clock+0x9/0xd
[<ffffffff81007dcb>] kvm_vm_ioctl+0x36c/0x3a2
[<ffffffff8104f2a6>] ? native_sched_clock+0x32/0x6b
[<ffffffff8104f2e8>] ? sched_clock+0x9/0xd
[<ffffffff81174b10>] do_vfs_ioctl+0x49e/0x4e4
[<ffffffff81174bb0>] sys_ioctl+0x5a/0x7c
[<ffffffff81500e02>] system_call_fastpath+0x16/0x1b
RIP [<ffffffff81072d13>] gup_huge_pud+0xf2/0x159
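
The invariant behind this oops can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the kernel
code: the simplified struct page and the helpers prep_compound_model(),
tail_count_ok() and all_tails_ok() are hypothetical names invented for
illustration. It shows tail pages having their count reset to zero while
the compound page is prepared, so a later "tail count must be zero" check
passes.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct page; not the real layout. */
struct page {
	int count;               /* models page->_count */
	int is_tail;             /* models the tail-page flag */
	struct page *first_page; /* head page of the compound page */
};

/* Models the kind of check that fired: a tail page must have count 0. */
static int tail_count_ok(const struct page *p)
{
	return !p->is_tail || p->count == 0;
}

/*
 * Hypothetical model of preparing a compound page of nr_pages pages.
 * Every page after the head is marked as a tail, linked to the head,
 * and has its count forced to zero, as the subject line describes.
 */
static void prep_compound_model(struct page *pages, size_t nr_pages)
{
	size_t i;

	pages[0].is_tail = 0;
	pages[0].first_page = &pages[0];
	for (i = 1; i < nr_pages; i++) {
		pages[i].is_tail = 1;
		pages[i].count = 0; /* without this, stale counts survive */
		pages[i].first_page = &pages[0];
	}
}

/* Returns 1 if every tail page in the range satisfies the invariant. */
static int all_tails_ok(const struct page *pages, size_t nr_pages)
{
	size_t i;

	for (i = 1; i < nr_pages; i++)
		if (!tail_count_ok(&pages[i]))
			return 0;
	return 1;
}
```

In this model, pages that start with a nonzero count (as freshly
allocated pages may) only pass all_tails_ok() because the preparation
step resets the tail counts; the head page's count is left untouched.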
Signed-off-by: Youquan Song <youquan.song@intel.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>