Bharat Bhushan [Fri, 15 Nov 2013 05:31:15 +0000 (11:01 +0530)]
kvm: powerpc: define a linux pte lookup function
We need to search linux "pte" to get "pte" attributes for setting TLB in KVM.
This patch defines a lookup_linux_ptep() function which returns pte pointer.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
Bharat Bhushan [Fri, 15 Nov 2013 05:31:13 +0000 (11:01 +0530)]
kvm: booke: clear host tlb reference flag on guest tlb invalidation
On booke, "struct tlbe_ref" contains host tlb mapping information
(pfn: for guest-pfn to pfn, flags: attribute associated with this mapping)
for a guest tlb entry. So when a guest creates a TLB entry then
"struct tlbe_ref" is set to point to valid "pfn" and set attributes in
"flags" field of the above said structure. When a guest TLB entry is
invalidated then flags field of corresponding "struct tlbe_ref" is
updated to point that this is no more valid, also we selectively clear
some other attribute bits, example: if E500_TLB_BITMAP was set then we clear
E500_TLB_BITMAP, if E500_TLB_TLB0 is set then we clear this.
Ideally we should clear complete "flags" as this entry is invalid and does not
have anything to re-used. The other part of the problem is that when we use
the same entry again then also we do not clear (started doing or-ing etc).
So far it was working because the selectively clearing mentioned above
actually clears "flags" what was set during TLB mapping. But the problem
starts coming when we add more attributes to this then we need to selectively
clear them and which is not needed.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Tue, 15 Oct 2013 09:43:04 +0000 (20:43 +1100)]
KVM: PPC: Book3S HV: Use load/store_fp_state functions in HV guest entry/exit
This modifies kvmppc_load_fp and kvmppc_save_fp to use the generic
FP/VSX and VMX load/store functions instead of open-coding the
FP/VSX/VMX load/store instructions. Since kvmppc_load/save_fp don't
follow C calling conventions, we make them private symbols within
book3s_hv_rmhandlers.S.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Tue, 15 Oct 2013 09:43:03 +0000 (20:43 +1100)]
KVM: PPC: Load/save FP/VMX/VSX state directly to/from vcpu struct
Now that we have the vcpu floating-point and vector state stored in
the same type of struct as the main kernel uses, we can load that
state directly from the vcpu struct instead of having extra copies
to/from the thread_struct. Similarly, when the guest state needs to
be saved, we can have it saved it directly to the vcpu struct by
setting the current->thread.fp_save_area and current->thread.vr_save_area
pointers. That also means that we don't need to back up and restore
userspace's FP/vector state. This all makes the code simpler and
faster.
Note that it's not necessary to save or modify current->thread.fpexc_mode,
since nothing in KVM uses or is affected by its value. Nor is it
necessary to touch used_vr or used_vsr.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Tue, 15 Oct 2013 09:43:02 +0000 (20:43 +1100)]
KVM: PPC: Store FP/VSX/VMX state in thread_fp/vr_state structures
This uses struct thread_fp_state and struct thread_vr_state to store
the floating-point, VMX/Altivec and VSX state, rather than flat arrays.
This makes transferring the state to/from the thread_struct simpler
and allows us to unify the get/set_one_reg implementations for the
VSX registers.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Tue, 15 Oct 2013 09:43:01 +0000 (20:43 +1100)]
KVM: PPC: Use load_fp/vr_state rather than load_up_fpu/altivec
The load_up_fpu and load_up_altivec functions were never intended to
be called from C, and do things like modifying the MSR value in their
callers' stack frames, which are assumed to be interrupt frames. In
addition, on 32-bit Book S they require the MMU to be off.
This makes KVM use the new load_fp_state() and load_vr_state() functions
instead of load_up_fpu/altivec. This means we can remove the assembler
glue in book3s_rmhandlers.S, and potentially fixes a bug on Book E,
where load_up_fpu was called directly from C.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
Bharat Bhushan [Tue, 8 Oct 2013 04:02:20 +0000 (09:32 +0530)]
kvm/powerpc: move kvm_hypercall0() and friends to epapr_hypercall0()
kvm_hypercall0() and friends have nothing KVM specific so moved to
epapr_hypercall0() and friends. Also they are moved from
arch/powerpc/include/asm/kvm_para.h to arch/powerpc/include/asm/epapr_hcalls.h
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
Liu Ping Fan [Tue, 19 Nov 2013 06:12:48 +0000 (14:12 +0800)]
powerpc: kvm: optimize "sc 1" as fast return
In some scene, e.g openstack CI, PR guest can trigger "sc 1" frequently,
this patch optimizes the path by directly delivering BOOK3S_INTERRUPT_SYSCALL
to HV guest, so powernv can return to HV guest without heavy exit, i.e,
no need to swap TLB, HTAB,.. etc
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
Using the address of 'empty_zero_page' as source address in order to
clear a page is wrong. On some architectures empty_zero_page is only the
pointer to the struct page of the empty_zero_page. Therefore the clear
page operation would copy the contents of a couple of struct pages instead
of clearing a page. For kvm only arm/arm64 are affected by this bug.
To fix this use the ZERO_PAGE macro instead which will return the struct
page address of the empty_zero_page on all architectures.
Christoffer Dall [Fri, 15 Nov 2013 21:14:12 +0000 (13:14 -0800)]
arm/arm64: KVM: Fix hyp mappings of vmalloc regions
Using virt_to_phys on percpu mappings is horribly wrong as it may be
backed by vmalloc. Introduce kvm_kaddr_to_phys which translates both
types of valid kernel addresses to the corresponding physical address.
At the same time resolves a typing issue where we were storing the
physical address as a 32 bit unsigned long (on arm), truncating the
physical address for addresses above the 4GB limit. This caused
breakage on Keystone.
Cc: <stable@vger.kernel.org> [3.10+] Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Linus Torvalds [Fri, 15 Nov 2013 04:51:36 +0000 (13:51 +0900)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM changes from Paolo Bonzini:
"Here are the 3.13 KVM changes. There was a lot of work on the PPC
side: the HV and emulation flavors can now coexist in a single kernel
is probably the most interesting change from a user point of view.
On the x86 side there are nested virtualization improvements and a few
bugfixes.
ARM got transparent huge page support, improved overcommit, and
support for big endian guests.
Finally, there is a new interface to connect KVM with VFIO. This
helps with devices that use NoSnoop PCI transactions, letting the
driver in the guest execute WBINVD instructions. This includes some
nVidia cards on Windows, that fail to start without these patches and
the corresponding userspace changes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits)
kvm, vmx: Fix lazy FPU on nested guest
arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
arm/arm64: KVM: MMIO support for BE guest
kvm, cpuid: Fix sparse warning
kvm: Delete prototype for non-existent function kvm_check_iopl
kvm: Delete prototype for non-existent function complete_pio
hung_task: add method to reset detector
pvclock: detect watchdog reset at pvclock read
kvm: optimize out smp_mb after srcu_read_unlock
srcu: API for barrier after srcu read unlock
KVM: remove vm mmap method
KVM: IOMMU: hva align mapping page size
KVM: x86: trace cpuid emulation when called from emulator
KVM: emulator: cleanup decode_register_operand() a bit
KVM: emulator: check rex prefix inside decode_register()
KVM: x86: fix emulation of "movzbl %bpl, %eax"
kvm_host: typo fix
KVM: x86: emulate SAHF instruction
MAINTAINERS: add tree for kvm.git
Documentation/kvm: add a 00-INDEX file
...
Linus Torvalds [Fri, 15 Nov 2013 04:34:37 +0000 (13:34 +0900)]
Merge tag 'stable/for-linus-3.13-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen updates from Konrad Rzeszutek Wilk:
"This has tons of fixes and two major features which are concentrated
around the Xen SWIOTLB library.
The short <blurb> is that the tracing facility (just one function) has
been added to SWIOTLB to make it easier to track I/O progress.
Additionally under Xen and ARM (32 & 64) the Xen-SWIOTLB driver
"is used to translate physical to machine and machine to physical
addresses of foreign[guest] pages for DMA operations" (Stefano) when
booting under hardware without proper IOMMU.
There are also bug-fixes, cleanups, compile warning fixes, etc.
The commit times for some of the commits is a bit fresh - that is b/c
we wanted to make sure we have the Ack's from the ARM folks - which
with the string of back-to-back conferences took a bit of time. Rest
assured - the code has been stewing in #linux-next for some time.
Features:
- SWIOTLB has tracing added when doing bounce buffer.
- Xen ARM/ARM64 can use Xen-SWIOTLB. This work allows Linux to
safely program real devices for DMA operations when running as a
guest on Xen on ARM, without IOMMU support. [*1]
- xen_raw_printk works with PVHVM guests if needed.
Bug-fixes:
- Make memory ballooning work under HVM with large MMIO region.
- Inform hypervisor of MCFG regions found in ACPI DSDT.
- Remove deprecated IRQF_DISABLED.
- Remove deprecated __cpuinit.
[*1]:
"On arm and arm64 all Xen guests, including dom0, run with second
stage translation enabled. As a consequence when dom0 programs a
device for a DMA operation is going to use (pseudo) physical
addresses instead machine addresses. This work introduces two trees
to track physical to machine and machine to physical mappings of
foreign pages. Local pages are assumed mapped 1:1 (physical address
== machine address). It enables the SWIOTLB-Xen driver on ARM and
ARM64, so that Linux can translate physical addresses to machine
addresses for dma operations when necessary. " (Stefano)"
* tag 'stable/for-linus-3.13-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (32 commits)
xen/arm: pfn_to_mfn and mfn_to_pfn return the argument if nothing is in the p2m
arm,arm64/include/asm/io.h: define struct bio_vec
swiotlb-xen: missing include dma-direction.h
pci-swiotlb-xen: call pci_request_acs only ifdef CONFIG_PCI
arm: make SWIOTLB available
xen: delete new instances of added __cpuinit
xen/balloon: Set balloon's initial state to number of existing RAM pages
xen/mcfg: Call PHYSDEVOP_pci_mmcfg_reserved for MCFG areas.
xen: remove deprecated IRQF_DISABLED
x86/xen: remove deprecated IRQF_DISABLED
swiotlb-xen: fix error code returned by xen_swiotlb_map_sg_attrs
swiotlb-xen: static inline xen_phys_to_bus, xen_bus_to_phys, xen_virt_to_bus and range_straddles_page_boundary
grant-table: call set_phys_to_machine after mapping grant refs
arm,arm64: do not always merge biovec if we are running on Xen
swiotlb: print a warning when the swiotlb is full
swiotlb-xen: use xen_dma_map/unmap_page, xen_dma_sync_single_for_cpu/device
xen: introduce xen_dma_map/unmap_page and xen_dma_sync_single_for_cpu/device
tracing/events: Fix swiotlb tracepoint creation
swiotlb-xen: use xen_alloc/free_coherent_pages
xen: introduce xen_alloc/free_coherent_pages
...
Linus Torvalds [Fri, 15 Nov 2013 04:28:47 +0000 (13:28 +0900)]
Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio updates from Rusty Russell:
"Nothing really exciting: some groundwork for changing virtio endian,
and some robustness fixes for broken virtio devices, plus minor
tweaks"
* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
virtio_scsi: verify if queue is broken after virtqueue_get_buf()
x86, asmlinkage, lguest: Pass in globals into assembler statement
virtio: mmio: fix signature checking for BE guests
virtio_ring: adapt to notify() returning bool
virtio_net: verify if queue is broken after virtqueue_get_buf()
virtio_console: verify if queue is broken after virtqueue_get_buf()
virtio_blk: verify if queue is broken after virtqueue_get_buf()
virtio_ring: add new function virtqueue_is_broken()
virtio_test: verify if virtqueue_kick() succeeded
virtio_net: verify if virtqueue_kick() succeeded
virtio_ring: let virtqueue_{kick()/notify()} return a bool
virtio_ring: change host notification API
virtio_config: remove virtio_config_val
virtio: use size-based config accessors.
virtio_config: introduce size-based accessors.
virtio_ring: plug kmemleak false positive.
virtio: pm: use CONFIG_PM_SLEEP instead of CONFIG_PM
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
modpost: fix bogus 'exported twice' warnings.
init: fix in-place parameter modification regression
asmlinkage, module: Make ksymtab and kcrctab symbols and __this_module __visible
kernel: add support for init_array constructors
modpost: Optionally ignore secondary errors seen if a single module build fails
module: remove rmmod --wait option.
Linus Torvalds [Fri, 15 Nov 2013 00:32:31 +0000 (09:32 +0900)]
Merge branch 'akpm' (patch-bomb from Andrew Morton)
Merge patches from Andrew Morton:
- memstick fixes
- the rest of MM
- various misc bits that were awaiting merges from linux-next into
mainline: seq_file, printk, rtc, completions, w1, softirqs, llist,
kfifo, hfsplus
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (72 commits)
cmdline-parser: fix build
hfsplus: Fix undefined __divdi3 in hfsplus_init_header_node()
kfifo API type safety
kfifo: kfifo_copy_{to,from}_user: fix copied bytes calculation
sound/core/memalloc.c: use gen_pool_dma_alloc() to allocate iram buffer
llists-move-llist_reverse_order-from-raid5-to-llistc-fix
llists: move llist_reverse_order from raid5 to llist.c
kernel: fix generic_exec_single indentation
kernel-provide-a-__smp_call_function_single-stub-for-config_smp-fix
kernel: provide a __smp_call_function_single stub for !CONFIG_SMP
kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS
revert "softirq: Add support for triggering softirq work on softirqs"
drivers/w1/masters/w1-gpio.c: use dev_get_platdata()
sched: remove INIT_COMPLETION
tree-wide: use reinit_completion instead of INIT_COMPLETION
sched: replace INIT_COMPLETION with reinit_completion
drivers/rtc/rtc-hid-sensor-time.c: enable HID input processing early
drivers/rtc/rtc-hid-sensor-time.c: use dev_get_platdata()
vsprintf: ignore %n again
seq_file: remove "%n" usage from seq_file users
...
include/linux/cmdline-parser.h:17:12: error: 'BDEVNAME_SIZE' undeclared here
block/cmdline-parser.c:17:2: error: implicit declaration of function 'kzalloc'
Signed-off-by: Alexander Beregalov <alexander.beregalov@intel.com> Cc: CaiZhiyong <caizhiyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
i_size_read() returns loff_t, which is long long, i.e. 64-bit. node_size
is size_t, which is either 32-bit or 64-bit. Hence
"i_size_read(attr_file) / node_size" is a 64-by-32 or 64-by-64 division,
causing (some versions of) gcc to emit a call to __divdi3().
Fortunately node_size is actually 16-bit, as the sole caller of
hfsplus_init_header_node() passes a u16. Hence change its type from
size_t to u16, and use do_div() to perform a 64-by-32 division.
Not seen in m68k/allmodconfig in -next, so it really depends on the
verion of gcc.
Stefani Seibold [Thu, 14 Nov 2013 22:32:17 +0000 (14:32 -0800)]
kfifo API type safety
This patch enhances the type safety for the kfifo API. It is now safe
to put const data into a non const FIFO and the API will now generate a
compiler warning when reading from the fifo where the destination
address is pointing to a const variable.
As a side effect the kfifo_put() does now expect the value of an element
instead a pointer to the element. This was suggested Russell King. It
make the handling of the kfifo_put easier since there is no need to
create a helper variable for getting the address of a pointer or to pass
integers of different sizes.
IMHO the API break is okay, since there are currently only six users of
kfifo_put().
The code is also cleaner by kicking out the "if (0)" expressions.
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Stefani Seibold <stefani@seibold.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
llists: move llist_reverse_order from raid5 to llist.c
Make this useful helper available for other users.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kernel/up.c:25: error: redefinition of '__smp_call_function_single'
include/linux/smp.h:154: note: previous definition of '__smp_call_function_single' was here
Cc: Christoph Hellwig <hch@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kernel: provide a __smp_call_function_single stub for !CONFIG_SMP
Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We've switched over every architecture that supports SMP to it, so
remove the new useless config variable.
Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
revert "softirq: Add support for triggering softirq work on softirqs"
This commit was incomplete in that code to remove items from the per-cpu
lists was missing and never acquired a user in the 5 years it has been in
the tree. We're going to implement what it seems to try to archive in a
simpler way, and this code is in the way of doing so.
Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jingoo Han [Thu, 14 Nov 2013 22:32:04 +0000 (14:32 -0800)]
drivers/w1/masters/w1-gpio.c: use dev_get_platdata()
Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly. This is a cosmetic change to make
the code simpler and enhance the readability.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wolfram Sang [Thu, 14 Nov 2013 22:32:01 +0000 (14:32 -0800)]
sched: replace INIT_COMPLETION with reinit_completion
For the casual device driver writer, it is hard to remember when to use
init_completion (to init a completion structure) or INIT_COMPLETION (to
*reinit* a completion structure). Furthermore, while all other
completion functions exepct a pointer as a parameter, INIT_COMPLETION
does not. To make it easier to remember which function to use and to
make code more readable, introduce a new inline function with the proper
name and consistent argument type. Update the kernel-doc for
init_completion while we are here.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13) Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Holler [Thu, 14 Nov 2013 22:32:00 +0000 (14:32 -0800)]
drivers/rtc/rtc-hid-sensor-time.c: enable HID input processing early
Enable the processing of HID input records before the RTC will be
registered, in order to allow the RTC register function to read clock.
Without doing that the clock can only be read after the probe function
has finished.
Signed-off-by: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jingoo Han [Thu, 14 Nov 2013 22:31:59 +0000 (14:31 -0800)]
drivers/rtc/rtc-hid-sensor-time.c: use dev_get_platdata()
Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly. This is a cosmetic change to
make the code simpler and enhance the readability.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Thu, 14 Nov 2013 22:31:58 +0000 (14:31 -0800)]
vsprintf: ignore %n again
This ignores %n in printf again, as was originally documented.
Implementing %n poses a greater security risk than utility, so it should
stay ignored. To help anyone attempting to use %n, a warning will be
emitted if it is encountered.
Based on an earlier patch by Joe Perches.
Because %n was designed to write to pointers on the stack, it has been
frequently used as an attack vector when bugs are found that leak
user-controlled strings into functions that ultimately process format
strings. While this class of bug can still be turned into an
information leak, removing %n eliminates the common method of elevating
such a bug into an arbitrary kernel memory writing primitive,
significantly reducing the danger of this class of bug.
For seq_file users that need to know the length of a written string for
padding, please see seq_setwidth() and seq_pad() instead.
Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Joe Perches <joe@perches.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tetsuo Handa [Thu, 14 Nov 2013 22:31:56 +0000 (14:31 -0800)]
seq_file: introduce seq_setwidth() and seq_pad()
There are several users who want to know bytes written by seq_*() for
alignment purpose. Currently they are using %n format for knowing it
because seq_*() returns 0 on success.
This patch introduces seq_setwidth() and seq_pad() for allowing them to
align without using %n format.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Joe Perches <joe@perches.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: create a separate slab for page->ptl allocation
If DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC are enabled spinlock_t on x86_64
is 72 bytes. For page->ptl they will be allocated from kmalloc-96 slab,
so we loose 24 on each. An average system can easily allocate few tens
thousands of page->ptl and overhead is significant.
Let's create a separate slab for page->ptl allocation to solve this.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Thu, 14 Nov 2013 22:31:52 +0000 (14:31 -0800)]
mm: properly separate the bloated ptl from the regular case
Use kernel/bounds.c to convert build-time spinlock_t size check into a
preprocessor symbol and apply that to properly separate the page::ptl
situation.
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: dynamically allocate page->ptl if it cannot be embedded to struct page
If split page table lock is in use, we embed the lock into struct page
of table's page. We have to disable split lock, if spinlock_t is too
big be to be embedded, like when DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC
enabled.
This patch add support for dynamic allocation of split page table lock
if we can't embed it to struct page.
page->ptl is unsigned long now and we use it as spinlock_t if
sizeof(spinlock_t) <= sizeof(long), otherwise it's pointer to spinlock_t.
The spinlock_t allocated in pgtable_page_ctor() for PTE table and in
pgtable_pmd_page_ctor() for PMD table. All other helpers converted to
support dynamically allocated page->ptl.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
At the moment xtensa uses slab allocator for PTE table. It doesn't work
with enabled split page table lock: slab uses page->slab_cache and
page->first_page for its pages. These fields share stroage with
page->ptl.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Chris Zankel <chris@zankel.net> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Richard Kuo <rkuo@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds
In split page table lock case, we embed spinlock_t into struct page.
For obvious reason, we don't want to increase size of struct page if
spinlock_t is too big, like with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC or
on -rt kernel. So we disable split page table lock, if spinlock_t is
too big.
This patchset allows to allocate the lock dynamically if spinlock_t is
big. In this page->ptl is used to store pointer to spinlock instead of
spinlock itself. It costs additional cache line for indirect access,
but fix page fault scalability for multi-threaded applications.
LOCK_STAT depends on DEBUG_SPINLOCK, so on current kernel enabling
LOCK_STAT to analyse scalability issues breaks scalability. ;)
The patchset mostly fixes this. Results for ./thp_memscale -c 80 -b 512M
on 4-socket machine:
baseline, no CONFIG_LOCK_STAT: 9.115460703 seconds time elapsed
baseline, CONFIG_LOCK_STAT=y: 53.890567123 seconds time elapsed
patched, no CONFIG_LOCK_STAT: 8.852250368 seconds time elapsed
patched, CONFIG_LOCK_STAT=y: 11.069770759 seconds time elapsed
Patch count is scary, but most of them trivial. Overview:
Patches 1-4 Few bug fixes. No dependencies to other patches.
Probably should applied as soon as possible.
Patch 5 Changes signature of pgtable_page_ctor(). We will use it
for dynamic lock allocation, so it can fail.
Patches 6-8 Add missing constructor/destructor calls on few archs.
It's fixes NR_PAGETABLE accounting and prepare to use
split ptl.
Patches 9-33 Add pgtable_page_ctor() fail handling to all archs.
Patches 34 Finally adds support of dynamically-allocated page->pte.
Also contains documentation for split page table lock.
This patch (of 34):
I've missed that we preallocate few pmds on pgd_alloc() if X86_PAE
enabled. Let's add missed constructor/destructor calls.
I haven't noticed it during testing since prep_new_page() clears
page->mapping and therefore page->ptl. It's effectively equal to
spin_lock_init(&page->ptl).
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Chris Zankel <chris@zankel.net> Cc: Christoph Lameter <cl@linux.com> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The basic idea is the same as with PTE level: the lock is embedded into
struct page of table's page.
We can't use mm->pmd_huge_pte to store pgtables for THP, since we don't
take mm->page_table_lock anymore. Let's reuse page->lru of table's page
for that.
pgtable_pmd_page_ctor() returns true, if initialization is successful
and false otherwise. Current implementation never fails, but assumption
that constructor can fail will help to port it to -rt where spinlock_t
is rather huge and cannot be embedded into struct page -- dynamic
allocation is required.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently mm->pmd_huge_pte protected by page table lock. It will not
work with split lock. We have to have per-pmd pmd_huge_pte for proper
access serialization.
For now, let's just introduce wrapper to access mm->pmd_huge_pte.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: avoid increase sizeof(struct page) due to split page table lock
Alex Thorlton noticed that some massively threaded workloads work poorly,
if THP enabled. This patchset fixes this by introducing split page table
lock for PMD tables. hugetlbfs is not covered yet.
This patchset is based on work by Naoya Horiguchi.
: akpm result summary:
:
: THP off, v3.12-rc2: 18.059261877 seconds time elapsed
: THP off, patched: 16.768027318 seconds time elapsed
:
: THP on, v3.12-rc2: 42.162306788 seconds time elapsed
: THP on, patched: 8.397885779 seconds time elapsed
:
: HUGETLB, v3.12-rc2: 47.574936948 seconds time elapsed
: HUGETLB, patched: 19.447481153 seconds time elapsed
CONFIG_GENERIC_LOCKBREAK increases sizeof(spinlock_t) to 8 bytes. It
leads to increase sizeof(struct page) by 4 bytes on 32-bit system if split
page table lock is in use, since page->ptl shares space in union with
longs and pointers.
Let's disable split page table lock on 32-bit systems with
GENERIC_LOCKBREAK enabled.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 14 Nov 2013 23:45:16 +0000 (08:45 +0900)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs update frm Chris Mason:
"This is our usual merge window set of bug fixes, performance
improvements and cleanups. Miao Xie has some really nice
optimizations for writeback.
Josef also expanded our sanity checks quite a bit; these make up a big
chunk of the new lines"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (98 commits)
Btrfs: rename btrfs_start_all_delalloc_inodes
Btrfs: don't wait for the completion of all the ordered extents
Btrfs: don't wait for all the async delalloc when shrinking delalloc
Btrfs: fix the confusion between delalloc bytes and metadata bytes
Btrfs: pick up the code for the item number calculation in flush_space()
Btrfs: wait for the ordered extent only when we want
Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()
Btrfs: avoid unnecessary scrub workers allocation
Btrfs: check file extent type before anything else
btrfs: Remove useless variable in write_ctree_super()
btrfs: Fix checkpatch.pl warning of spacing issues
btrfs: Replace kmalloc with kmalloc_array
btrfs: Enclose macros with complex values within parenthesis
btrfs: Use WARN_ON()'s return value in place of WARN_ON(1)
btrfs: Remove redundant local zero structure
btrfs: Pack struct btrfs_device
btrfs: Replace multiple atomic_inc() with atomic_add()
btrfs: Add helper function for free_root_pointers()
Btrfs: fix a crash when running balance and defrag concurrently
Btrfs: do not run snapshot-aware defragment on error
...
Linus Torvalds [Thu, 14 Nov 2013 08:19:58 +0000 (17:19 +0900)]
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 changes from Ted Ts'o:
"Ext4 updates for 3.13. Mostly bug fixes and cleanups"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: add prototypes for macro-generated functions
ext4: return non-zero st_blocks for inline data
ext4: use prandom_u32() instead of get_random_bytes()
ext4: remove unreachable code after ext4_can_extents_be_merged()
ext4: remove unreachable code in ext4_can_extents_be_merged()
ext4: avoid bh leak in retry path of ext4_expand_extra_isize_ea()
ext4: don't count free clusters from a corrupt block group
ext4: fix FITRIM in no journal mode
ext4: drop set but otherwise unused variable from ext4_add_dirent_to_inline()
ext4: change ext4_read_inline_dir() to return 0 on success
ext4: pair trace_ext4_writepages & trace_ext4_writepages_result
ext4: add ratelimiting to ext4 messages
ext4: fix performance regression in ext4_writepages
ext4: fixup kerndoc annotation of mpage_map_and_submit_extent()
ext4: fix assertion in ext4_add_complete_io()
Linus Torvalds [Thu, 14 Nov 2013 08:16:35 +0000 (17:16 +0900)]
Merge tag 'xfs-for-linus-v3.13-rc1' of git://oss.sgi.com/xfs/xfs
Pull xfs update from Ben Myers:
"For 3.13-rc1 we have an eclectic assortment of bugfixes, cleanups, and
refactoring. Bugfixes that stand out are the fix for the AGF/AGI
deadlock, incore extent list fixes, verifier fixes for v4 superblocks
and growfs, and memory leaks. There are some asserts, warnings, and
strings that were cleaned up. There was further rearrangement of code
to make libxfs and the kernel sync up more easily, differences between
v2 and v3 directory code were abstracted using an ops vector,
xfs_inactive was reworked, and the preallocation/hole punching code
was refactored.
- simplify kmem_zone_zalloc
- add traces for AGF/AGI read ops
- add additional AIL traces
- fix xfs_remove AGF vs AGI deadlock
- fix the extent count of new incore extent page in the indirection
array
- don't fail bad secondary superblocks verification on v4 filesystems
due to unzeroed bits after v4 fields
- fix possible NULL dereference in xlog_verify_iclog
- remove redundant assert in xfs_dir2_leafn_split
- prevent stack overflows from page cache allocation
- fix some sparse warnings
- fix directory block format verifier to check the leaf entry count
- abstract the differences in dir2/dir3 via an ops vector
- continue process of reorganization to make libxfs/kernel code
merges easier
- refactor the preallocation and hole punching code
- fix for growfs and verifiers
- remove unnecessary scary corruption error when probing non-xfs
filesystems
- remove extra newlines from strings passed to printk
- prevent deadlock trying to cover an active log
- rework xfs_inactive()
- add the inode directory type support to XFS_IOC_FSGEOM
- cleanup (remove) usage of is_bad_inode
- fix miscalculation in xfs_iext_realloc_direct which results in
oversized direct extent list
- remove unnecessary count arg to xfs_iomap_write_allocate
- fix memory leak in xlog_recover_add_to_trans
- check superblock instead of block magic to determine if dtype field
is present
- fix lockdep annotation due to project quotas
- fix regression in xfs_node_toosmall which can lead to incorrect
directory btree node collapse
- make log recovery verify filesystem uuid of recovering blocks
- fix XFS_IOC_FREE_EOFBLOCKS definition
- remove invalid assert in xfs_inode_free
- fix for AIL lock regression"
* tag 'xfs-for-linus-v3.13-rc1' of git://oss.sgi.com/xfs/xfs: (49 commits)
xfs: simplify kmem_{zone_}zalloc
xfs: add tracepoints to AGF/AGI read operations
xfs: trace AIL manipulations
xfs: xfs_remove deadlocks due to inverted AGF vs AGI lock ordering
xfs: fix the extent count when allocating an new indirection array entry
xfs: be more forgiving of a v4 secondary sb w/ junk in v5 fields
xfs: fix possible NULL dereference in xlog_verify_iclog
xfs:xfs_dir2_node.c: pointer use before check for null
xfs: prevent stack overflows from page cache allocation
xfs: fix static and extern sparse warnings
xfs: validity check the directory block leaf entry count
xfs: make dir2 ftype offset pointers explicit
xfs: convert directory vector functions to constants
xfs: convert directory vector functions to constants
xfs: vectorise encoding/decoding directory headers
xfs: vectorise DA btree operations
xfs: vectorise directory leaf operations
xfs: vectorise directory data operations part 2
xfs: vectorise directory data operations
xfs: vectorise remaining shortform dir2 ops
...
Linus Torvalds [Thu, 14 Nov 2013 07:56:32 +0000 (16:56 +0900)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"A number of fixes:
- Fix segfault on perf trace -i perf.data, from Namhyung Kim.
- Fix segfault with --no-mmap-pages, from David Ahern.
- Don't force a refresh during progress update in the TUI, greatly
reducing startup costs, fix from Patrick Palka.
- Fix sw clock event period test wrt not checking if using >
max_sample_freq.
- Handle throttle events in 'object code reading' test, fix from
Adrian Hunter.
- Prevent condition that all sort keys are elided, fix from Namhyung
Kim.
- Round mmap pages to power 2, from David Ahern.
And a number of late arrival changes:
- Add summary only option to 'perf trace', suppressing the decoding
of events, from David Ahern
- 'perf trace --summary' formatting simplifications, from Pekka
Enberg.
- Beautify fifth argument of mmap() as fd, in 'perf trace', from
Namhyung Kim.
- Add direct access to dynamic arrays in libtraceevent, from Steven
Rostedt.
- Synthesize non-exec MMAP records when --data used, allowing the
resolution of data addresses to symbols (global variables, etc), by
Arnaldo Carvalho de Melo.
- Code cleanups by David Ahern and Adrian Hunter"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
tools lib traceevent: Add direct access to dynamic arrays
perf target: Shorten perf_target__ to target__
perf tests: Handle throttle events in 'object code reading' test
perf evlist: Refactor mmap_pages parsing
perf evlist: Round mmap pages to power 2 - v2
perf record: Fix segfault with --no-mmap-pages
perf trace: Add summary only option
perf trace: Simplify '--summary' output
perf trace: Change syscall summary duration order
perf tests: Compensate lower sample freq with longer test loop
perf trace: Fix segfault on perf trace -i perf.data
perf trace: Separate tp syscall field caching into init routine to be reused
perf trace: Beautify fifth argument of mmap() as fd
perf tests: Use lower sample_freq in sw clock event period test
perf tests: Check return of perf_evlist__open sw clock event period test
perf record: Move existing write_output into helper function
perf record: Use correct return type for write()
perf tools: Prevent condition that all sort keys are elided
perf machine: Simplify synthesize_threads method
perf machine: Introduce synthesize_threads method out of open coded equivalent
...
Linus Torvalds [Thu, 14 Nov 2013 07:55:56 +0000 (16:55 +0900)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull two x86 fixes from Ingo Molnar.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/amd: Tone down printk(), don't treat a missing firmware file as an error
x86/dumpstack: Fix printk_address for direct addresses
Linus Torvalds [Thu, 14 Nov 2013 07:30:30 +0000 (16:30 +0900)]
Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking changes from Ingo Molnar:
"The biggest changes:
- add lockdep support for seqcount/seqlocks structures, this
unearthed both bugs and required extra annotation.
- move the various kernel locking primitives to the new
kernel/locking/ directory"
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
block: Use u64_stats_init() to initialize seqcounts
locking/lockdep: Mark __lockdep_count_forward_deps() as static
lockdep/proc: Fix lock-time avg computation
locking/doc: Update references to kernel/mutex.c
ipv6: Fix possible ipv6 seqlock deadlock
cpuset: Fix potential deadlock w/ set_mems_allowed
seqcount: Add lockdep functionality to seqcount/seqlock structures
net: Explicitly initialize u64_stats_sync structures for lockdep
locking: Move the percpu-rwsem code to kernel/locking/
locking: Move the lglocks code to kernel/locking/
locking: Move the rwsem code to kernel/locking/
locking: Move the rtmutex code to kernel/locking/
locking: Move the semaphore core to kernel/locking/
locking: Move the spinlock code to kernel/locking/
locking: Move the lockdep code to kernel/locking/
locking: Move the mutex code to kernel/locking/
hung_task debugging: Add tracepoint to report the hang
x86/locking/kconfig: Update paravirt spinlock Kconfig description
lockstat: Report avg wait and hold times
lockdep, x86/alternatives: Drop ancient lockdep fixup message
...