Joonsoo Kim [Wed, 15 Aug 2012 15:02:40 +0000 (00:02 +0900)]
slub: remove one code path and reduce lock contention in __slab_free()
When we try to free object, there is some of case that we need
to take a node lock. This is the necessary step for preventing a race.
After taking a lock, then we try to cmpxchg_double_slab().
But, there is a possible scenario that cmpxchg_double_slab() is failed
with taking a lock. Following example explains it.
CPU A CPU B
need lock
... need lock
... lock!!
lock..but spin free success
spin... unlock
lock!!
free fail
In this case, retry with taking a lock is occured in CPU A.
I think that in this case for CPU A,
"release a lock first, and re-take a lock if necessary" is preferable way.
There are two reasons for this.
First, this makes __slab_free()'s logic somehow simple.
With this patch, 'was_frozen = 1' is "always" handled without taking a lock.
So we can remove one code path.
Second, it may reduce lock contention.
When we do retrying, status of slab is already changed,
so we don't need a lock anymore in almost every case.
"release a lock first, and re-take a lock if necessary" policy is
helpful to this.
Signed-off-by: Joonsoo Kim <js1304@gmail.com> Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Linus Torvalds [Sun, 14 Oct 2012 21:39:05 +0000 (14:39 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS update from Ralf Baechle:
"Cleanups and fixes for breakage that occured earlier during this merge
phase. Also a few patches that didn't make the first pull request.
Of those is the Alchemy work that merges code for many of the SOCs and
evaluation boards thus among other code shrinkage, reduces the number
of MIPS defconfigs by 5."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (22 commits)
MIPS: SNI: Switch RM400 serial to SCCNXP driver
MIPS: Remove unused empty_bad_pmd_table[] declaration.
MIPS: MT: Remove kspd.
MIPS: Malta: Fix section mismatch.
MIPS: asm-offset.c: Delete unused irq_cpustat_t struct offsets.
MIPS: Alchemy: Merge PB1100/1500 support into DB1000 code.
MIPS: Alchemy: merge PB1550 support into DB1550 code
MIPS: Alchemy: Single kernel for DB1200/1300/1550
MIPS: Optimize TLB refill for RI/XI configurations.
MIPS: proc: Cleanup printing of ASEs.
MIPS: Hardwire detection of DSP ASE Rev 2 for systems, as required.
MIPS: Add detection of DSP ASE Revision 2.
MIPS: Optimize pgd_init and pmd_init
MIPS: perf: Add perf functionality for BMIPS5000
MIPS: perf: Split the Kconfig option CONFIG_MIPS_MT_SMP
MIPS: perf: Remove unnecessary #ifdef
MIPS: perf: Add cpu feature bit for PCI (performance counter interrupt)
MIPS: perf: Change the "mips_perf_event" table unsupported indicator.
MIPS: Align swapper_pg_dir to 64K for better TLB Refill code.
vmlinux.lds.h: Allow architectures to add sections to the front of .bss
...
Linus Torvalds [Sun, 14 Oct 2012 20:39:34 +0000 (13:39 -0700)]
Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module signing support from Rusty Russell:
"module signing is the highlight, but it's an all-over David Howells frenzy..."
Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.
* 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
X.509: Fix indefinite length element skip error handling
X.509: Convert some printk calls to pr_devel
asymmetric keys: fix printk format warning
MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
MODSIGN: Make mrproper should remove generated files.
MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
MODSIGN: Use the same digest for the autogen key sig as for the module sig
MODSIGN: Sign modules during the build process
MODSIGN: Provide a script for generating a key ID from an X.509 cert
MODSIGN: Implement module signature checking
MODSIGN: Provide module signing public keys to the kernel
MODSIGN: Automatically generate module signing keys if missing
MODSIGN: Provide Kconfig options
MODSIGN: Provide gitignore and make clean rules for extra files
MODSIGN: Add FIPS policy
module: signature checking hook
X.509: Add a crypto key parser for binary (DER) X.509 certificates
MPILIB: Provide a function to read raw data into an MPI
X.509: Add an ASN.1 decoder
X.509: Add simple ASN.1 grammar compiler
...
Matt Fleming [Fri, 12 Oct 2012 10:19:59 +0000 (11:19 +0100)]
x86, boot: Explicitly include autoconf.h for hostprogs
The hostprogs need access to the CONFIG_* symbols found in
include/generated/autoconf.h. But commit abbf1590de22 ("UAPI: Partition
the header include path sets and add uapi/ header directories") replaced
$(LINUXINCLUDE) with $(USERINCLUDE) which doesn't contain the necessary
include paths.
This has the undesirable effect of breaking the EFI boot stub because
the #ifdef CONFIG_EFI_STUB code in arch/x86/boot/tools/build.c is
never compiled.
It should also be noted that because $(USERINCLUDE) isn't exported by
the top-level Makefile it's actually empty in arch/x86/boot/Makefile.
Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 14 Oct 2012 00:18:53 +0000 (17:18 -0700)]
Merge branch 'late-for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM update from Russell King:
"This is the final round of stuff for ARM, left until the end of the
merge window to reduce the number of conflicts. This set contains the
ARM part of David Howells UAPI changes, and a fix to the ordering of
'select' statements in ARM Kconfig files (see the appropriate commit
for why this happened - thanks to Andrew Morton for pointing out the
problem.)
I've left this as long as I dare for this window to avoid conflicts,
and I regenerated the config patch yesterday, posting it to our
mailing list for review and testing. I have several acks which
include successful test reports for it.
However, today I notice we've got new conflicts with previously unseen
code... though that conflict should be trivial (it's my changes vs a
one liner.)"
* 'late-for-linus' of git://git.linaro.org/people/rmk/linux-arm:
ARM: config: make sure that platforms are ordered by option string
ARM: config: sort select statements alphanumerically
UAPI: (Scripted) Disintegrate arch/arm/include/asm
Fix up fairly conflict in arch/arm/Kconfig (the select re-organization
vs recent addition of GENERIC_KERNEL_EXECVE)
Linus Torvalds [Sat, 13 Oct 2012 20:28:32 +0000 (13:28 -0700)]
Merge tag 'disintegrate-main-20121013' of git://git.infradead.org/users/dhowells/linux-headers
Pull UAPI disintegration for include/linux/{,byteorder/}*.h from David Howells:
"The patches contained herein do the following:
(1) Remove kernel-only stuff in linux/ppp-comp.h from the UAPI. I checked
this with Paul Mackerras before I created the patch and he suggested some
extra bits to unexport.
(2) Remove linux/blk_types.h entirely from the UAPI as none of it is userspace
applicable, and remove from the UAPI that part of linux/fs.h that was the
reason for linux/blk_types.h being exported in the first place. I
discussed this with Jens Axboe before creating the patch.
(3) The big patch of the series to disintegrate include/linux/*.h as a unit.
This could be split up, though there would be collisions in moving stuff
between the two Kbuild files when the parts are merged as that file is
sorted alphabetically rather than being grouped by subsystem.
Of this set of headers, 17 files have changed in the UAPI exported region
since the 4th and only 8 since the 9th so there isn't much change in this
area - as one might expect.
It should be pretty obvious and straightforward if it does come to fixing
up: stuff in __KERNEL__ guards stays where it is and stuff outside moves
to the same file in the include/uapi/linux/ directory.
If a new file appears then things get a bit more complicated as the
"headers +=" line has to move to include/uapi/linux/Kbuild. Only one new
file has appeared since the 9th and I judge this type of event relatively
unlikely.
(4) A patch to disintegrate include/linux/byteorder/*.h as a unit.
Signed-off-by: David Howells <dhowells@redhat.com>"
* tag 'disintegrate-main-20121013' of git://git.infradead.org/users/dhowells/linux-headers:
UAPI: (Scripted) Disintegrate include/linux/byteorder
UAPI: (Scripted) Disintegrate include/linux
UAPI: Unexport linux/blk_types.h
UAPI: Unexport part of linux/ppp-comp.h
Linus Torvalds [Sat, 13 Oct 2012 20:26:39 +0000 (13:26 -0700)]
Merge tag 'disintegrate-spi-20121009' of git://git.infradead.org/users/dhowells/linux-headers
Pull spi UAPI disintegration from David Howells:
"This is to complete part of the Userspace API (UAPI) disintegration
for which the preparatory patches were pulled recently. After these
patches, userspace headers will be segregated into:
include/uapi/linux/.../foo.h
for the userspace interface stuff, and:
include/linux/.../foo.h
for the strictly kernel internal stuff.
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Grant Likely <grant.likely@secretlab.ca>"
* tag 'disintegrate-spi-20121009' of git://git.infradead.org/users/dhowells/linux-headers:
UAPI: (Scripted) Disintegrate include/linux/spi
Linus Torvalds [Sat, 13 Oct 2012 20:23:39 +0000 (13:23 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace compile fixes from Eric W Biederman:
"This tree contains three trivial fixes. One compiler warning, one
thinko fix, and one build fix"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
btrfs: Fix compilation with user namespace support enabled
userns: Fix posix_acl_file_xattr_userns gid conversion
userns: Properly print bluetooth socket uids
Linus Torvalds [Sat, 13 Oct 2012 20:22:01 +0000 (13:22 -0700)]
Merge tag 'md-3.7' of git://neil.brown.name/md
Pull md updates from NeilBrown:
- "discard" support, some dm-raid improvements and other assorted bits
and pieces.
* tag 'md-3.7' of git://neil.brown.name/md: (29 commits)
md: refine reporting of resync/reshape delays.
md/raid5: be careful not to resize_stripes too big.
md: make sure manual changes to recovery checkpoint are saved.
md/raid10: use correct limit variable
md: writing to sync_action should clear the read-auto state.
Subject: [PATCH] md:change resync_mismatches to atomic64_t to avoid races
md/raid5: make sure to_read and to_write never go negative.
md: When RAID5 is dirty, force reconstruct-write instead of read-modify-write.
md/raid5: protect debug message against NULL derefernce.
md/raid5: add some missing locking in handle_failed_stripe.
MD: raid5 avoid unnecessary zero page for trim
MD: raid5 trim support
md/bitmap:Don't use IS_ERR to judge alloc_page().
md/raid1: Don't release reference to device while handling read error.
raid: replace list_for_each_continue_rcu with new interface
add further __init annotations to crypto/xor.c
DM RAID: Fix for "sync" directive ineffectiveness
DM RAID: Fix comparison of index and quantity for "rebuild" parameter
DM RAID: Add rebuild capability for RAID10
DM RAID: Move 'rebuild' checking code to its own function
...
Russell King [Fri, 12 Oct 2012 13:20:52 +0000 (14:20 +0100)]
ARM: config: make sure that platforms are ordered by option string
The large platform selection choice should be sorted by option string
so it's easy to find the platform you're looking for. Fix the few
options which are out of this order.
Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This is a pet peeve of mine. Any time there's a long list of items
(header file inclusions, kconfig entries, array initalisers, etc) and
someone wants to add a new item, they *always* go and stick it at the
end of the list.
Guys, don't do this. Either put the new item into a randomly-chosen
position or, probably better, alphanumerically sort the list.
lets sort all our select statements alphanumerically. This commit was
created by the following perl:
while (<>) {
while (/\\\s*$/) {
$_ .= <>;
}
undef %selects if /^\s*config\s+/;
if (/^\s+select\s+(\w+).*/) {
if (defined($selects{$1})) {
if ($selects{$1} eq $_) {
print STDERR "Warning: removing duplicated $1 entry\n";
} else {
print STDERR "Error: $1 differently selected\n".
"\tOld: $selects{$1}\n".
"\tNew: $_\n";
exit 1;
}
}
$selects{$1} = $_;
next;
}
if (%selects and (/^\s*$/ or /^\s+help/ or /^\s+---help---/ or
/^endif/ or /^endchoice/)) {
foreach $k (sort (keys %selects)) {
print "$selects{$k}";
}
undef %selects;
}
print;
}
if (%selects) {
foreach $k (sort (keys %selects)) {
print "$selects{$k}";
}
}
and they are identical duplicates, hence the shrinkage in the diffstat
of two lines.
We have four testers reporting success of this change (Tony, Stephen,
Linus and Sekhar.)
Acked-by: Jason Cooper <jason@lakedaemon.net> Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Stephen Warren <swarren@nvidia.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
David Howells [Sat, 13 Oct 2012 09:46:48 +0000 (10:46 +0100)]
UAPI: (Scripted) Disintegrate include/linux
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
David Howells [Sat, 13 Oct 2012 09:45:06 +0000 (10:45 +0100)]
UAPI: Unexport linux/blk_types.h
It seems that was linux/blk_types.h incorrectly exported to fix up some missing
bits required by the exported parts of linux/fs.h (READ, WRITE, READA, etc.).
So unexport linux/blk_types.h and unexport the relevant bits of linux/fs.h.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jens Axboe <jaxboe@fusionio.com>
cc: Tejun Heo <tj@kernel.org>
cc: Al Viro <viro@ZenIV.linux.org.uk>
Jonas Bonn [Sat, 13 Oct 2012 05:38:37 +0000 (07:38 +0200)]
Merge tag 'disintegrate-openrisc-20121009' of git://git.infradead.org/users/dhowells/linux-headers
UAPI Disintegration 2012-10-09
* tag 'disintegrate-openrisc-20121009' of git://git.infradead.org/users/dhowells/linux-headers:
UAPI: (Scripted) Disintegrate arch/openrisc/include/asm
Linus Torvalds [Sat, 13 Oct 2012 02:29:00 +0000 (11:29 +0900)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull TPM bugfixes from James Morris.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
tpm: Propagate error from tpm_transmit to fix a timeout hang
driver/char/tpm: fix regression causesd by ppi
Linus Torvalds [Sat, 13 Oct 2012 02:27:59 +0000 (11:27 +0900)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull ACPI & Thermal updates from Len Brown:
"The generic Linux thermal layer is gaining some new capabilities
(generic cooling via cpufreq) and some new customers (ARM).
Also, an ACPI EC bug fix plus a regression fix."
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (30 commits)
tools/power/acpi/acpidump: remove duplicated include from acpidump.c
ACPI idle, CPU hotplug: Fix NULL pointer dereference during hotplug
cpuidle / ACPI: fix potential NULL pointer dereference
ACPI: EC: Add a quirk for CLEVO M720T/M730T laptop
ACPI: EC: Make the GPE storm threshold a module parameter
thermal: Exynos: Fix NULL pointer dereference in exynos_unregister_thermal()
Thermal: Fix bug on cpu_cooling, cooling device's id conflict problem.
thermal: exynos: Use devm_* functions
ARM: exynos: add thermal sensor driver platform data support
thermal: exynos: register the tmu sensor with the kernel thermal layer
thermal: exynos5: add exynos5250 thermal sensor driver support
hwmon: exynos4: move thermal sensor driver to driver/thermal directory
thermal: add generic cpufreq cooling implementation
Fix a build error.
thermal: Fix potential NULL pointer accesses
thermal: add Renesas R-Car thermal sensor support
thermal: fix potential out-of-bounds memory access
Thermal: Introduce locking for cdev.thermal_instances list.
Thermal: Unify the code for both active and passive cooling
Thermal: Introduce simple arbitrator for setting device cooling state
...
Linus Torvalds [Sat, 13 Oct 2012 02:25:41 +0000 (11:25 +0900)]
Merge tag 'for-3.7' of git://openrisc.net/jonas/linux
Pull OpenRISC updates from Jonas Bonn:
"Fixups for some corner cases, build issues, and some obvious bugs in
IRQ handling. No major changes."
* tag 'for-3.7' of git://openrisc.net/jonas/linux:
openrisc: mask interrupts in irq_mask_ack function
openrisc: fix typos in comments and warnings
openrisc: PIC should act on domain-local irqs
openrisc: Make cpu_relax() invoke barrier()
audit: define AUDIT_ARCH_OPENRISC
openrisc: delay: fix handling of counter overflow
openrisc: delay: fix loops calculation for __const_udelay
Linus Torvalds [Sat, 13 Oct 2012 02:20:04 +0000 (11:20 +0900)]
Merge tag 'arm64-uapi' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64
Pull arm64 uapi disintegration from Catalin Marinas:
"UAPI headers for arm64 together with some clean-up to make it
possible:
- Do not export the COMPAT_* definitions to user
- Simplify the compat unistd32.h definitions and remove the
__SYSCALL_COMPAT guard
- Disintegrate the arch/arm64/include/asm/* headers"
* tag 'arm64-uapi' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
UAPI: (Scripted) Disintegrate arch/arm64/include/asm
arm64: Do not export the compat-specific definitions to the user
arm64: Do not include asm/unistd32.h in asm/unistd.h
arm64: Remove unused definitions from asm/unistd32.h
Linus Torvalds [Sat, 13 Oct 2012 02:16:58 +0000 (11:16 +0900)]
Merge tag 'for_linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb
Pull KGDB/KDB fixes and cleanups from Jason Wessel:
"Cleanups
- Clean up compile warnings in kgdboc.c and x86/kernel/kgdb.c
- Add module event hooks for simplified debugging with gdb
Fixes
- Fix kdb to stop paging with 'q' on bta and dmesg
- Fix for data that scrolls off the vga console due to line wrapping
when using the kdb pager
New
- The debug core registers for kernel module events which allows a
kernel aware gdb to automatically load symbols and break on entry
to a kernel module
- Allow kgdboc=kdb to setup kdb on the vga console"
* tag 'for_linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
tty/console: fix warnings in drivers/tty/serial/kgdboc.c
kdb,vt_console: Fix missed data due to pager overruns
kdb: Fix dmesg/bta scroll to quit with 'q'
kgdboc: Accept either kbd or kdb to activate the vga + keyboard kdb shell
kgdb,x86: fix warning about unused variable
mips,kgdb: fix recursive page fault with CONFIG_KPROBES
kgdb: Add module event hooks
Linus Torvalds [Sat, 13 Oct 2012 01:57:57 +0000 (10:57 +0900)]
Merge tag 'dm-3.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm
Pull device-mapper changes from Alasdair G Kergon:
"Remove the power-of-2 block size constraint on discards in dm thin
provisioning and factor the bio_prison code out into a separate module
(for sharing with the forthcoming cache target).
Use struct bio's front_pad to eliminate the use of one separate
mempool by bio-based devices.
A few other tiny clean-ups."
* tag 'dm-3.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
dm: store dm_target_io in bio front_pad
dm thin: move bio_prison code to separate module
dm thin: prepare to separate bio_prison code
dm thin: support discard with non power of two block size
dm persistent data: convert to use le32_add_cpu
dm: use ACCESS_ONCE for sysfs values
dm bufio: use list_move
dm mpath: fix check for null mpio in end_io fn
Linus Torvalds [Sat, 13 Oct 2012 01:57:01 +0000 (10:57 +0900)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull misc SCSI updates from James Bottomley:
"This is an assorted set of stragglers into the merge window with
driver updates for megaraid_sas, lpfc, bfi and mvumi. It also
includes some fairly major fixes for virtio-scsi (scatterlist init),
scsi_debug (off by one error), storvsc (use after free) and qla2xxx
(potential deadlock).
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (49 commits)
[SCSI] storvsc: Account for in-transit packets in the RESET path
[SCSI] qla2xxx: fix potential deadlock on ha->hardware_lock
[SCSI] scsi_debug: Fix off-by-one bug when unmapping region
[SCSI] Shorten the path length of scsi_cmd_to_driver()
[SCSI] virtio-scsi: support online resizing of disks
[SCSI] virtio-scsi: fix LUNs greater than 255
[SCSI] virtio-scsi: initialize scatterlist structure
[SCSI] megaraid_sas: Version, Changelog, Copyright update
[SCSI] megaraid_sas: Remove duplicate code
[SCSI] megaraid_sas: Add SystemPD FastPath support
[SCSI] megaraid_sas: Add array boundary check for SystemPD
[SCSI] megaraid_sas: Load io_request DataLength in bytes
[SCSI] megaraid_sas: Add module param for configurable MSI-X vector count
[SCSI] megaraid_sas: Remove un-needed completion_lock spinlock calls
[SCSI] lpfc 8.3.35: Update lpfc version for 8.3.35 driver release
[SCSI] lpfc 8.3.35: Fixed not reporting logical link speed to SCSI midlayer when QoS not on
[SCSI] lpfc 8.3.35: Fix error with fabric service parameters causing performance issues
[SCSI] lpfc 8.3.35: Fixed SCSI host create showing wrong link speed on SLI3 HBA ports
[SCSI] lpfc 8.3.35: Fixed not checking solicition in progress bit when verifying FCF record for use
[SCSI] lpfc 8.3.35: Fixed messages for misconfigured port errors
...
Linus Torvalds [Sat, 13 Oct 2012 01:56:03 +0000 (10:56 +0900)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input layer updates from Dmitry Torokhov:
"2nd round of updates for the input subsystem. With it input core no
longer limits number of character devices per event handler (such as
evdev) to 32, but switches to dynamic minors once legacy range is
exhausted. This should get multi-seat installations that currently
run our of event devices very quickly.
You will also get an update for Wacom driver and a couple of driver
fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: extend the number of event (and other) devices
Input: mousedev - mark mousedev interfaces as non-seekable
Input: mousedev - rename mixdev_open to opened_by_mixdev
Input: mousedev - reformat structure initializers
Input: mousedev - factor out psaux code to reduce #ifdefery
Input: samsung-keypad - add clk_prepare and clk_unprepare
Input: atmel_mxt_ts - simplify mxt_dump_message
Input: wacom - clean up wacom_query_tablet_data
Input: wacom - introduce wacom_fix_phy_from_hid
Input: wacom - allow any multi-input Intuos device to set prox
Input: wacom - report correct touch contact size for I5/Bamboo
Linus Torvalds [Sat, 13 Oct 2012 01:53:54 +0000 (10:53 +0900)]
Merge branch 'for-3.7' of git://linux-nfs.org/~bfields/linux
Pull nfsd update from J Bruce Fields:
"Another relatively quiet cycle. There was some progress on my
remaining 4.1 todo's, but a couple of them were just of the form
"check that we do X correctly", so didn't have much affect on the
code.
Other than that, a bunch of cleanup and some bugfixes (including an
annoying NFSv4.0 state leak and a busy-loop in the server that could
cause it to peg the CPU without making progress)."
* 'for-3.7' of git://linux-nfs.org/~bfields/linux: (46 commits)
UAPI: (Scripted) Disintegrate include/linux/sunrpc
UAPI: (Scripted) Disintegrate include/linux/nfsd
nfsd4: don't allow reclaims of expired clients
nfsd4: remove redundant callback probe
nfsd4: expire old client earlier
nfsd4: separate session allocation and initialization
nfsd4: clean up session allocation
nfsd4: minor free_session cleanup
nfsd4: new_conn_from_crses should only allocate
nfsd4: separate connection allocation and initialization
nfsd4: reject bad forechannel attrs earlier
nfsd4: enforce per-client sessions/no-sessions distinction
nfsd4: set cl_minorversion at create time
nfsd4: don't pin clientids to pseudoflavors
nfsd4: fix bind_conn_to_session xdr comment
nfsd4: cast readlink() bug argument
NFSD: pass null terminated buf to kstrtouint()
nfsd: remove duplicate init in nfsd4_cb_recall
nfsd4: eliminate redundant nfs4_free_stateid
fs/nfsd/nfs4idmap.c: adjust inconsistent IS_ERR and PTR_ERR
...
1) Alexey Kuznetsov noticed we routed TCP resets improperly in the
assymetric routing case, fix this by reverting a change that made us
use the incoming interface in the outgoing route key when we didn't
have a socket context to work with.
2) TCP sysctl kernel memory leakage to userspace fix from Alan Cox.
3) Move UAPI bits from David Howells, WIMAX and CAN this time.
4) Fix TX stalls in e1000e wrt. Byte Queue Limits, from Hiroaki
SHIMODA, Denys Fedoryshchenko, and Jesse Brandeburg.
5) Fix IPV6 crashes in packet generator module, from Amerigo Wang.
6) Tidies and fixes in the new VXLAN driver from Stephen Hemminger.
7) Bridge IP options parse doesn't check first if SKB header has at
least an IP header's worth of content present. Fix from Sarveshwar
Bandi.
8) The kernel now generates compound pages on transmit and the Xen
netback drivers needs some adjustments in order to handle this. Fix
from Ian Campbell.
9) Turn off ASPM in JME driver, from Kevin Bardon and Matthew Garrett.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
mcs7830: Fix link state detection
net: add doc for in4_pton()
net: add doc for in6_pton()
vti: fix sparse bit endian warnings
tcp: resets are misrouted
usbnet: Support devices reporting idleness
Add CDC-ACM support for the CX93010-2x UCMxx USB Modem
net/ethernet/jme: disable ASPM
tcp: sysctl interface leaks 16 bytes of kernel memory
kaweth: print correct debug ptr
e1000e: Change wthresh to 1 to avoid possible Tx stalls
ipv4: fix route mark sparse warning
xen: netback: handle compound page fragments on transmit.
bridge: Pull ip header into skb->data before looking into ip header.
isdn: fix a wrapping bug in isdn_ppp_ioctl()
vxlan: fix oops when give unknown ifindex
vxlan: fix receive checksum handling
vxlan: add additional headroom
vxlan: allow configuring port range
vxlan: associate with tunnel socket on transmit
...
Pull tile arch update from Chris Metcalf:
"The bulk of this change is the tile uapi disintegration. There is
also a one-line change in here to enable interrupts in
do_work_pending() to avoid a WARN_ON in _local_bh_enable_ip()."
Linus Torvalds [Sat, 13 Oct 2012 01:20:11 +0000 (10:20 +0900)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"This tree includes some late late perf items that missed the first
round:
tools:
- Bash auto completion improvements, now we can auto complete the
tools long options, tracepoint event names, etc, from Namhyung Kim.
- Look up thread using tid instead of pid in 'perf sched'.
- Move global variables into a perf_kvm struct, from David Ahern.
- Hists refactorings, preparatory for improved 'diff' command, from
Jiri Olsa.
- Hists refactorings, preparatory for event group viewieng work, from
Namhyung Kim.
- Remove double negation on optional feature macro definitions, from
Namhyung Kim.
- Remove several cases of needless global variables, on most
builtins.
- misc fixes
kernel:
- sysfs support for IBS on AMD CPUs, from Robert Richter.
- Support for an upcoming Intel CPU, the Xeon-Phi / Knights Corner
HPC blade PMU, from Vince Weaver.
- misc fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
perf: Fix perf_cgroup_switch for sw-events
perf: Clarify perf_cpu_context::active_pmu usage by renaming it to ::unique_pmu
perf/AMD/IBS: Add sysfs support
perf hists: Add more helpers for hist entry stat
perf hists: Move he->stat.nr_events initialization to a template
perf hists: Introduce struct he_stat
perf diff: Removing the total_period argument from output code
perf tool: Add hpp interface to enable/disable hpp column
perf tools: Removing hists pair argument from output path
perf hists: Separate overhead and baseline columns
perf diff: Refactor diff displacement possition info
perf hists: Add struct hists pointer to struct hist_entry
perf tools: Complete tracepoint event names
perf/x86: Add support for Intel Xeon-Phi Knights Corner PMU
perf evlist: Remove some unused methods
perf evlist: Introduce add_newtp method
perf kvm: Move global variables into a perf_kvm struct
perf tools: Convert to BACKTRACE_SUPPORT
perf tools: Long option completion support for each subcommands
perf tools: Complete long option names of perf command
...
Linus Torvalds [Sat, 13 Oct 2012 01:05:52 +0000 (10:05 +0900)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull third pile of kernel_execve() patches from Al Viro:
"The last bits of infrastructure for kernel_thread() et.al., with
alpha/arm/x86 use of those. Plus sanitizing the asm glue and
do_notify_resume() on alpha, fixing the "disabled irq while running
task_work stuff" breakage there.
At that point the rest of kernel_thread/kernel_execve/sys_execve work
can be done independently for different architectures. The only
pending bits that do depend on having all architectures converted are
restrictred to fs/* and kernel/* - that'll obviously have to wait for
the next cycle.
I thought we'd have to wait for all of them done before we start
eliminating the longjump-style insanity in kernel_execve(), but it
turned out there's a very simple way to do that without flagday-style
changes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
alpha: switch to saner kernel_execve() semantics
arm: switch to saner kernel_execve() semantics
x86, um: convert to saner kernel_execve() semantics
infrastructure for saner ret_from_kernel_thread semantics
make sure that kernel_thread() callbacks call do_exit() themselves
make sure that we always have a return path from kernel_execve()
ppc: eeh_event should just use kthread_run()
don't bother with kernel_thread/kernel_execve for launching linuxrc
alpha: get rid of switch_stack argument of do_work_pending()
alpha: don't bother passing switch_stack separately from regs
alpha: take SIGPENDING/NOTIFY_RESUME loop into signal.c
alpha: simplify TIF_NEED_RESCHED handling
Linus Torvalds [Sat, 13 Oct 2012 01:04:42 +0000 (10:04 +0900)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull third pile of VFS updates from Al Viro:
"Stuff from Jeff Layton, mostly. Sanitizing interplay between audit
and namei, removing a lot of insanity from audit_inode() mess and
getting things ready for his ESTALE patchset."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
procfs: don't need a PATH_MAX allocation to hold a string representation of an int
vfs: embed struct filename inside of names_cache allocation if possible
audit: make audit_inode take struct filename
vfs: make path_openat take a struct filename pointer
vfs: turn do_path_lookup into wrapper around struct filename variant
audit: allow audit code to satisfy getname requests from its names_list
vfs: define struct filename and have getname() return it
vfs: unexport getname and putname symbols
acct: constify the name arg to acct_on
vfs: allocate page instead of names_cache buffer in mount_block_root
audit: overhaul __audit_inode_child to accomodate retrying
audit: optimize audit_compare_dname_path
audit: make audit_compare_dname_path use parent_len helper
audit: remove dirlen argument to audit_compare_dname_path
audit: set the name_len in audit_inode for parent lookups
audit: add a new "type" field to audit_names struct
audit: reverse arguments to audit_inode_child
audit: no need to walk list in audit_inode if name is NULL
audit: pass in dentry to audit_copy_inode wherever possible
audit: remove unnecessary NULL ptr checks from do_path_lookup
Jeff Layton [Wed, 10 Oct 2012 20:43:13 +0000 (16:43 -0400)]
vfs: embed struct filename inside of names_cache allocation if possible
In the common case where a name is much smaller than PATH_MAX, an extra
allocation for struct filename is unnecessary. Before allocating a
separate one, try to embed the struct filename inside the buffer first. If
it turns out that that's not long enough, then fall back to allocating a
separate struct filename and redoing the copy.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Wed, 10 Oct 2012 20:43:13 +0000 (16:43 -0400)]
audit: make audit_inode take struct filename
Keep a pointer to the audit_names "slot" in struct filename.
Have all of the audit_inode callers pass a struct filename ponter to
audit_inode instead of a string pointer. If the aname field is already
populated, then we can skip walking the list altogether and just use it
directly.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Wed, 10 Oct 2012 20:43:10 +0000 (16:43 -0400)]
vfs: make path_openat take a struct filename pointer
...and fix up the callers. For do_file_open_root, just declare a
struct filename on the stack and fill out the .name field. For
do_filp_open, make it also take a struct filename pointer, and fix up its
callers to call it appropriately.
For filp_open, add a variant that takes a struct filename pointer and turn
filp_open into a wrapper around it.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Wed, 10 Oct 2012 19:25:28 +0000 (15:25 -0400)]
audit: allow audit code to satisfy getname requests from its names_list
Currently, if we call getname() on a userland string more than once,
we'll get multiple copies of the string and multiple audit_names
records.
Add a function that will allow the audit_names code to satisfy getname
requests using info from the audit_names list, avoiding a new allocation
and audit_names records.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Wed, 10 Oct 2012 19:25:28 +0000 (15:25 -0400)]
vfs: define struct filename and have getname() return it
getname() is intended to copy pathname strings from userspace into a
kernel buffer. The result is just a string in kernel space. It would
however be quite helpful to be able to attach some ancillary info to
the string.
For instance, we could attach some audit-related info to reduce the
amount of audit-related processing needed. When auditing is enabled,
we could also call getname() on the string more than once and not
need to recopy it from userspace.
This patchset converts the getname()/putname() interfaces to return
a struct instead of a string. For now, the struct just tracks the
string in kernel space and the original userland pointer for it.
Later, we'll add other information to the struct as it becomes
convenient.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
btrfs: Fix compilation with user namespace support enabled
When compiling with user namespace support btrfs fails like:
fs/btrfs/tree-log.c: In function ‘fill_inode_item’:
fs/btrfs/tree-log.c:2955:2: error: incompatible type for argument 3 of ‘btrfs_set_inode_uid’
fs/btrfs/ctree.h:2026:1: note: expected ‘u32’ but argument is of type ‘kuid_t’
fs/btrfs/tree-log.c:2956:2: error: incompatible type for argument 3 of ‘btrfs_set_inode_gid’
fs/btrfs/ctree.h:2027:1: note: expected ‘u32’ but argument is of type ‘kgid_t’
Fix this by using i_uid_read and i_gid_read in
Cc: Chris Mason <chris.mason@fusionio.com> Cc: Josef Bacik <jbacik@fusionio.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
With user namespace support enabled building bluetooth generated the warning.
net/bluetooth/af_bluetooth.c: In function ‘bt_seq_show’:
net/bluetooth/af_bluetooth.c:598:7: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 7 has type ‘kuid_t’ [-Wformat]
Convert sock_i_uid from a kuid_t to a uid_t before printing, to avoid
this problem.
Mikulas Patocka [Fri, 12 Oct 2012 20:02:15 +0000 (21:02 +0100)]
dm: store dm_target_io in bio front_pad
Use the recently-added bio front_pad field to allocate struct dm_target_io.
Prior to this patch, dm_target_io was allocated from a mempool. For each
dm_target_io, there is exactly one bio allocated from a bioset.
This patch merges these two allocations into one allocation: we create a
bioset with front_pad equal to the size of dm_target_io so that every
bio allocated from the bioset has sizeof(struct dm_target_io) bytes
before it. We allocate a bio and use the bytes before the bio as
dm_target_io.
_tio_cache is removed and the tio_pool mempool is now only used for
request-based devices.
This idea was introduced by Kent Overstreet.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: tj@kernel.org Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Bill Pemberton <wfp5p@viridian.itc.virginia.edu> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Fri, 12 Oct 2012 20:02:07 +0000 (21:02 +0100)]
dm thin: support discard with non power of two block size
Support discards when the pool's block size is not a power of 2.
The block layer assumes discard_granularity is a power of 2 (in
blkdev_issue_discard), so we set this to the largest power of 2 that is
a divides into the number of sectors in each block, but never less than
DATA_DEV_BLOCK_SIZE_MIN_SECTORS.
This patch eliminates the "Discard support must be disabled when the
block size is not a power of 2" constraint that was imposed in commit 55f2b8b ("dm thin: support for non power of 2 pool blocksize"). That
commit was incomplete: using a block size that is not a power of 2
shouldn't mean disabling discard support on the device completely.
Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Ondrej Zary [Thu, 11 Oct 2012 22:51:41 +0000 (22:51 +0000)]
mcs7830: Fix link state detection
The device had an undocumented "feature": it can provide a sequence of
spurious link-down status data even if the link is up all the time.
A sequence of 10 was seen so update the link state only after the device
reports the same link state 20 times.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Reported-by: Michael Leun <lkml20120218@newton.leun.net> Tested-by: Michael Leun <lkml20120218@newton.leun.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Kuznetsov [Fri, 12 Oct 2012 04:34:17 +0000 (04:34 +0000)]
tcp: resets are misrouted
After commit e2446eaa ("tcp_v4_send_reset: binding oif to iif in no
sock case").. tcp resets are always lost, when routing is asymmetric.
Yes, backing out that patch will result in misrouting of resets for
dead connections which used interface binding when were alive, but we
actually cannot do anything here. What's died that's died and correct
handling normal unbound connections is obviously a priority.
Comment to comment:
> This has few benefits:
> 1. tcp_v6_send_reset already did that.
It was done to route resets for IPv6 link local addresses. It was a
mistake to do so for global addresses. The patch fixes this as well.
Actually, the problem appears to be even more serious than guaranteed
loss of resets. As reported by Sergey Soloviev <sol@eqv.ru>, those
misrouted resets create a lot of arp traffic and huge amount of
unresolved arp entires putting down to knees NAT firewalls which use
asymmetric routing.
Al Viro [Thu, 11 Oct 2012 01:28:25 +0000 (21:28 -0400)]
infrastructure for saner ret_from_kernel_thread semantics
* allow kernel_execve() leave the actual return to userland to
caller (selected by CONFIG_GENERIC_KERNEL_EXECVE). Callers
updated accordingly.
* architecture that does select GENERIC_KERNEL_EXECVE in its
Kconfig should have its ret_from_kernel_thread() do this:
call schedule_tail
call the callback left for it by copy_thread(); if it ever
returns, that's because it has just done successful kernel_execve()
jump to return from syscall
IOW, its only difference from ret_from_fork() is that it does call the
callback.
* such an architecture should also get rid of ret_from_kernel_execve()
and __ARCH_WANT_KERNEL_EXECVE
This is the last part of infrastructure patches in that area - from
that point on work on different architectures can live independently.
Mikulas Patocka [Fri, 12 Oct 2012 15:59:46 +0000 (16:59 +0100)]
dm: use ACCESS_ONCE for sysfs values
Use the ACCESS_ONCE macro in dm-bufio and dm-verity where a variable
can be modified asynchronously (through sysfs) and we want to prevent
compiler optimizations that assume that the variable hasn't changed.
(See Documentation/atomic_ops.txt.)
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Linus Torvalds [Fri, 12 Oct 2012 13:20:28 +0000 (22:20 +0900)]
Merge tag 'stable/for-linus-3.7-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen fixes from Konrad Rzeszutek Wilk:
"This has four bug-fixes and one tiny feature that I forgot to put
initially in my tree due to oversight.
The feature is for kdump kernels to speed up the /proc/vmcore reading.
There is a ram_is_pfn helper function that the different platforms can
register for. We are now doing that.
The bug-fixes cover some embarrassing struct pv_cpu_ops variables
being set to NULL on Xen (but not baremetal). We had a similar issue
in the past with {write|read}_msr_safe and this fills the three
missing ones. The other bug-fix is to make the console output (hvc)
be capable of dealing with misbehaving backends and not fall flat on
its face. Lastly, a quirk for older XenBus implementations that came
with an ancient v3.4 hypervisor (so RHEL5 based) - reading of certain
non-existent attributes just hangs the guest during bootup - so we
take precaution of not doing that on such older installations.
Feature:
- Register a pfn_is_ram helper to speed up reading of /proc/vmcore.
Bug-fixes:
- Three pvops call for Xen were undefined causing BUG_ONs.
- Add a quirk so that the shutdown watches (used by kdump) are not
used with older Xen (3.4).
- Fix ungraceful state transition for the HVC console."
* tag 'stable/for-linus-3.7-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pv-on-hvm kexec: add quirk for Xen 3.4 and shutdown watches.
xen/bootup: allow {read|write}_cr8 pvops call.
xen/bootup: allow read_tscp call for Xen PV guests.
xen pv-on-hvm: add pfn_is_ram helper for kdump
xen/hvc: handle backend CLOSED without CLOSING
Linus Torvalds [Fri, 12 Oct 2012 13:17:48 +0000 (22:17 +0900)]
Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core update from Thomas Gleixner:
- Bug fixes (one for a longstanding dead loop issue)
- Rework of time related vsyscalls
- Alarm timer updates
- Jiffies updates to remove compile time dependencies
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Cast raw_interval to u64 to avoid shift overflow
timers: Fix endless looping between cascade() and internal_add_timer()
time/jiffies: bring back unconditional LATCH definition
time: Convert x86_64 to using new update_vsyscall
time: Only do nanosecond rounding on GENERIC_TIME_VSYSCALL_OLD systems
time: Introduce new GENERIC_TIME_VSYSCALL
time: Convert CONFIG_GENERIC_TIME_VSYSCALL to CONFIG_GENERIC_TIME_VSYSCALL_OLD
time: Move update_vsyscall definitions to timekeeper_internal.h
time: Move timekeeper structure to timekeeper_internal.h for vsyscall changes
jiffies: Remove compile time assumptions about CLOCK_TICK_RATE
jiffies: Kill unused TICK_USEC_TO_NSEC
alarmtimer: Rename alarmtimer_remove to alarmtimer_dequeue
alarmtimer: Remove unused helpers & defines
alarmtimer: Use hrtimer per-alarm instead of per-base
alarmtimer: Implement minimum alarm interval for allowing suspend
Linus Torvalds [Fri, 12 Oct 2012 13:13:05 +0000 (22:13 +0900)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"A CPU hotplug related crash fix and a nohz accounting fixlet."
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Update sched_domains_numa_masks[][] when new cpus are onlined
sched: Ensure 'sched_domains_numa_levels' is safe to use in other functions
nohz: Fix one jiffy count too far in idle cputime
Linus Torvalds [Fri, 12 Oct 2012 13:12:07 +0000 (22:12 +0900)]
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU fixes from Ingo Molnar:
"This tree includes a shutdown/cpu-hotplug deadlock fix and a
documentation fix."
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Advise most users not to enable RCU user mode
rcu: Grace-period initialization excludes only RCU notifier
xen/pv-on-hvm kexec: add quirk for Xen 3.4 and shutdown watches.
The commit 254d1a3f02ebc10ccc6e4903394d8d3f484f715e, titled
"xen/pv-on-hvm kexec: shutdown watches from old kernel" assumes that the
XenBus backend can deal with reading of values from:
"control/platform-feature-xs_reset_watches":
... a patch for xenstored is required so that it
accepts the XS_RESET_WATCHES request from a client (see changeset
23839:42a45baf037d in xen-unstable.hg). Without the patch for xenstored
the registration of watches will fail and some features of a PVonHVM
guest are not available. The guest is still able to boot, but repeated
kexec boots will fail."
Sadly this is not true when using a Xen 3.4 hypervisor and booting a PVHVM
guest. We end up hanging at:
Reverting the commit or using the attached patch fixes the issue. This fix
checks whether the hypervisor is older than 4.0 and if so does not try to
perform the read.
Fixes-Oracle-Bug: 14708233 CC: stable@vger.kernel.org Acked-by: Olaf Hering <olaf@aepfle.de>
[v2: Added a comment in the source code] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen/bootup: allow read_tscp call for Xen PV guests.
The hypervisor will trap it. However without this patch,
we would crash as the .read_tscp is set to NULL. This patch
fixes it and sets it to the native_read_tscp call.
CC: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
tty/console: fix warnings in drivers/tty/serial/kgdboc.c
The con_debug_leave/con_debug_enter functions are stubbed out
by defining them to (0), which causes harmless build warnings.
Using proper inline functions is the normal way to deal with
this.
Without this patch, building the ARM bcm2835_defconfig results in:
drivers/tty/serial/kgdboc.c: In function 'kgdboc_pre_exp_handler':
drivers/tty/serial/kgdboc.c:279:3: warning: statement with no effect [-Wunused-value]
drivers/tty/serial/kgdboc.c: In function 'kgdboc_post_exp_handler':
drivers/tty/serial/kgdboc.c:293:3: warning: statement with no effect [-Wunused-value]
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Jason Wessel [Mon, 27 Aug 2012 03:37:03 +0000 (22:37 -0500)]
kdb,vt_console: Fix missed data due to pager overruns
It is possible to miss data when using the kdb pager. The kdb pager
does not pay attention to the maximum column constraint of the screen
or serial terminal. This result is not incrementing the shown lines
correctly and the pager will print more lines that fit on the screen.
Obviously that is less than useful when using a VGA console where you
cannot scroll back.
The pager will now look at the kdb_buffer string to see how many
characters are printed. It might not be perfect considering you can
output ASCII that might move the cursor position, but it is a
substantially better approximation for viewing dmesg and trace logs.
This also means that the vt screen needs to set the kdb COLUMNS
variable.
Cc: <stable@vger.kernel.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Jason Wessel [Fri, 10 Aug 2012 17:12:23 +0000 (12:12 -0500)]
kgdboc: Accept either kbd or kdb to activate the vga + keyboard kdb shell
It is a common enough mistake for people to specify "kdb" when they
meant to type "kbd" that the kgdboc can just accept both since they
both mean the same thing anyway. Specifically it is for the case
where you want kdb to be active on your graphics console + keyboard
(where kbd was the original abbreviation for keyboard).
With this change kgdboc will now accept either to mean the same thing:
kgdboc=kbd
kgdboc=kdb
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Jason Wessel [Fri, 10 Aug 2012 17:21:15 +0000 (12:21 -0500)]
mips,kgdb: fix recursive page fault with CONFIG_KPROBES
This fault was detected using the kgdb test suite on boot and it
crashes recursively due to the fact that CONFIG_KPROBES on mips adds
an extra die notifier in the page fault handler. The crash signature
looks like this:
kgdbts:RUN bad memory access test
KGDB: re-enter exception: ALL breakpoints killed
Call Trace:
[<807b7548>] dump_stack+0x20/0x54
[<807b7548>] dump_stack+0x20/0x54
The fix for now is to have kgdb return immediately if the fault type
is DIE_PAGE_FAULT and allow the kprobe code to decide what is supposed
to happen.
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: David S. Miller <davem@davemloft.net> Cc: <stable@vger.kernel.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Jason Wessel [Fri, 12 Oct 2012 11:37:33 +0000 (06:37 -0500)]
kgdb: Add module event hooks
Allow gdb to auto load kernel modules when it is attached,
which makes it trivially easy to debug module init functions
or pre-set breakpoints in a kernel module that has not loaded yet.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Jeff Layton [Wed, 10 Oct 2012 19:25:26 +0000 (15:25 -0400)]
vfs: allocate page instead of names_cache buffer in mount_block_root
First, it's incorrect to call putname() after __getname_gfp() since the
bare __getname_gfp() call skips the auditing code, while putname()
doesn't.
mount_block_root allocates a PATH_MAX buffer via __getname_gfp, and then
calls get_fs_names to fill the buffer. That function can call
get_filesystem_list which assumes that that buffer is a full page in
size. On arches where PAGE_SIZE != 4k, then this could potentially
overrun.
In practice, it's hard to imagine the list of filesystem names even
approaching 4k, but it's best to be safe. Just allocate a page for this
purpose instead.
With this, we can also remove the __getname_gfp() definition since there
are no more callers.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Wed, 10 Oct 2012 19:25:25 +0000 (15:25 -0400)]
audit: overhaul __audit_inode_child to accomodate retrying
In order to accomodate retrying path-based syscalls, we need to add a
new "type" argument to audit_inode_child. This will tell us whether
we're looking for a child entry that represents a create or a delete.
If we find a parent, don't automatically assume that we need to create a
new entry. Instead, use the information we have to try to find an
existing entry first. Update it if one is found and create a new one if
not.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Wed, 10 Oct 2012 19:25:25 +0000 (15:25 -0400)]
audit: optimize audit_compare_dname_path
In the cases where we already know the length of the parent, pass it as
a parm so we don't need to recompute it. In the cases where we don't
know the length, pass in AUDIT_NAME_FULL (-1) to indicate that it should
be determined.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Wed, 10 Oct 2012 19:25:23 +0000 (15:25 -0400)]
audit: set the name_len in audit_inode for parent lookups
Currently, this gets set mostly by happenstance when we call into
audit_inode_child. While that might be a little more efficient, it seems
wrong. If the syscall ends up failing before audit_inode_child ever gets
called, then you'll have an audit_names record that shows the full path
but has the parent inode info attached.
Fix this by passing in a parent flag when we call audit_inode that gets
set to the value of LOOKUP_PARENT. We can then fix up the pathname for
the audit entry correctly from the get-go.
While we're at it, clean up the no-op macro for audit_inode in the
!CONFIG_AUDITSYSCALL case.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Wed, 10 Oct 2012 19:25:21 +0000 (15:25 -0400)]
audit: reverse arguments to audit_inode_child
Most of the callers get called with an inode and dentry in the reverse
order. The compiler then has to reshuffle the arg registers and/or
stack in order to pass them on to audit_inode_child.
Reverse those arguments for a micro-optimization.
Reported-by: Eric Paris <eparis@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Wed, 10 Oct 2012 19:25:21 +0000 (15:25 -0400)]
audit: no need to walk list in audit_inode if name is NULL
If name is NULL then the condition in the loop will never be true. Also,
with this change, we can eliminate the check for n->name == NULL since
the equivalence check will never be true if it is.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>