Paul E. McKenney [Thu, 18 Dec 2008 20:55:32 +0000 (21:55 +0100)]
"Tree RCU": scalable classic RCU implementation
This patch fixes a long-standing performance bug in classic RCU that
results in massive internal-to-RCU lock contention on systems with
more than a few hundred CPUs. Although this patch creates a separate
flavor of RCU for ease of review and patch maintenance, it is intended
to replace classic RCU.
This patch still handles stress better than does mainline, so I am still
calling it ready for inclusion. This patch is against the -tip tree.
Nevertheless, experience on an actual 1000+ CPU machine would still be
most welcome.
Most of the changes noted below were found while creating an rcutiny
(which should permit ejecting the current rcuclassic) and while doing
detailed line-by-line documentation.
Updates from v9 (http://lkml.org/lkml/2008/12/2/334):
o Fixes from remainder of line-by-line code walkthrough,
including comment spelling, initialization, undesirable
narrowing due to type conversion, removing redundant memory
barriers, removing redundant local-variable initialization,
and removing redundant local variables.
I do not believe that any of these fixes address the CPU-hotplug
issues that Andi Kleen was seeing, but please do give it a whirl
in case the machine is smarter than I am.
A writeup from the walkthrough may be found at the following
URL, in case you are suffering from terminal insomnia or
masochism:
o Made rcutree tracing use seq_file, as suggested some time
ago by Lai Jiangshan.
o Added a .csv variant of the rcudata debugfs trace file, to allow
people having thousands of CPUs to drop the data into
a spreadsheet. Tested with oocalc and gnumeric. Updated
documentation to suit.
Updates from v8 (http://lkml.org/lkml/2008/11/15/139):
o Fix a theoretical race between grace-period initialization and
force_quiescent_state() that could occur if more than three
jiffies were required to carry out the grace-period
initialization. Which it might, if you had enough CPUs.
o Apply Ingo's printk-standardization patch.
o Substitute local variables for repeated accesses to global
variables.
o Fix comment misspellings and redundant (but harmless) increments
of ->n_rcu_pending (this latter after having explicitly added it).
o Apply checkpatch fixes.
Updates from v7 (http://lkml.org/lkml/2008/10/10/291):
o Fixed a number of problems noted by Gautham Shenoy, including
the cpu-stall-detection bug that he was having difficulty
convincing me was real. ;-)
o Changed cpu-stall detection to wait for ten seconds rather than
three in order to reduce false positives, as suggested by Ingo
Molnar.
o Produced a design document (http://lwn.net/Articles/305782/).
The act of writing this document uncovered a number of both
theoretical and "here and now" bugs as noted below.
o Fix dynticks_nesting accounting confusion, simplify WARN_ON()
condition, fix kerneldoc comments, and add memory barriers
in dynticks interface functions.
o Add more data to tracing.
o Remove unused "rcu_barrier" field from rcu_data structure.
o Count calls to rcu_pending() from scheduling-clock interrupt
to use as a surrogate timebase should jiffies stop counting.
o Fix a theoretical race between force_quiescent_state() and
grace-period initialization. Yes, initialization does have to
go on for some jiffies for this race to occur, but given enough
CPUs...
Updates from v6 (http://lkml.org/lkml/2008/9/23/448):
o Fix a number of checkpatch.pl complaints.
o Apply review comments from Ingo Molnar and Lai Jiangshan
on the stall-detection code.
o Fix several bugs in !CONFIG_SMP builds.
o Fix a misspelled config-parameter name so that RCU now announces
at boot time if stall detection is configured.
o Run tests on numerous combinations of configuration parameters,
which, after the fixes above, now build and run correctly.
Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line):
o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a
changeset some time ago, and finally got around to retesting
this option).
o Fix some tracing bugs in rcupreempt that caused incorrect
totals to be printed.
o I now test with a more brutal random-selection online/offline
script (attached). Probably more brutal than it needs to be
on the people reading it as well, but so it goes.
o A number of optimizations and usability improvements:
o Make rcu_pending() ignore the grace-period timeout when
there is no grace period in progress.
o Make force_quiescent_state() avoid going for a global
lock in the case where there is no grace period in
progress.
o Rearrange struct fields to improve struct layout.
o Make call_rcu() initiate a grace period if RCU was
idle, rather than waiting for the next scheduling
clock interrupt.
o Invoke rcu_irq_enter() and rcu_irq_exit() only when
idle, as suggested by Andi Kleen. I still don't
completely trust this change, and might back it out.
o Make CONFIG_RCU_TRACE be the single config variable
manipulated for all forms of RCU, instead of the prior
confusion.
o Document tracing files and formats for both rcupreempt
and rcutree.
Updates from v4 for those missing v5 given its bad subject line:
o Separated dynticks interface so that NMIs and irqs call separate
functions, greatly simplifying it. In particular, this code
no longer requires a proof of correctness. ;-)
o Separated dynticks state out into its own per-CPU structure,
avoiding the duplicated accounting.
o The case where a dynticks-idle CPU runs an irq handler that
invokes call_rcu() is now correctly handled, forcing that CPU
out of dynticks-idle mode.
o Review comments have been applied (thank you all!!!).
For but one example, fixed the dynticks-ordering issue that
Manfred pointed out, saving me much debugging. ;-)
o Adjusted rcuclassic and rcupreempt to handle dynticks changes.
Attached is an updated patch to Classic RCU that applies a hierarchy,
greatly reducing the contention on the top-level lock for large machines.
This passes 10-hour concurrent rcutorture and online-offline testing on
128-CPU ppc64 without dynticks enabled, and exposes some timekeeping
bugs in presence of dynticks (exciting working on a system where
"sleep 1" hangs until interrupted...), which were fixed in the
2.6.27 kernel. It is getting more reliable than mainline by some
measures, so the next version will be against -tip for inclusion.
See also Manfred Spraul's recent patches (or his earlier work from
2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2).
We will converge onto a common patch in the fullness of time, but are
currently exploring different regions of the design space. That said,
I have already gratefully stolen quite a few of Manfred's ideas.
This patch provides CONFIG_RCU_FANOUT, which controls the bushiness
of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on
64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT,
there is no hierarchy. By default, the RCU initialization code will
adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA
architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable
this balancing, allowing the hierarchy to be exactly aligned to the
underlying hardware. Up to two levels of hierarchy are permitted
(in addition to the root node), allowing up to 32,768 CPUs on 32-bit
systems and up to 262,144 CPUs on 64-bit systems. I just know that I
am going to regret saying this, but this seems more than sufficient
for the foreseeable future. (Some architectures might wish to set
CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs.
If this becomes a real problem, additional levels can be added, but I
doubt that it will make a significant difference on real hardware.)
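The CPU limits quoted above for a fanout of 64 (262,144) and a fanout of 4
(64) follow from raising the fanout to the number of levels, counting the
root plus two further levels. A minimal sketch of that arithmetic, where
rcu_tree_capacity() is a hypothetical helper name and not a kernel function:

```c
#include <stdint.h>

/*
 * Illustrative only: the largest CPU count a tree can cover when it has
 * `levels` levels (root included) and every node has `fanout` children.
 * rcu_tree_capacity() is a made-up name for this sketch.
 */
static uint64_t rcu_tree_capacity(unsigned int fanout, unsigned int levels)
{
	uint64_t cpus = 1;

	while (levels--)
		cpus *= fanout;
	return cpus;
}
```

With three levels, rcu_tree_capacity(64, 3) gives the 64-bit limit and
rcu_tree_capacity(4, 3) gives the 64-CPU limit mentioned for
CONFIG_RCU_FANOUT=4.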
In the common case, a given CPU will manipulate its private rcu_data
structure and the rcu_node structure that it shares with its immediate
neighbors. This can reduce both lock and memory contention by multiple
orders of magnitude, which should eliminate the need for the strange
manipulations that are reported to be required when running Linux on
very large systems.
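To illustrate why contention drops, here is a rough sketch that assumes CPUs
are assigned to leaf rcu_node structures in contiguous groups of
CONFIG_RCU_FANOUT. That grouping is an assumption made for illustration (the
real assignment is done by the RCU initialization code), and leaf_index() is
a hypothetical name:

```c
/*
 * Hypothetical mapping from a CPU number to the leaf node it shares with
 * its neighbors, assuming contiguous groups of `fanout` CPUs per leaf.
 */
static unsigned int leaf_index(unsigned int cpu, unsigned int fanout)
{
	return cpu / fanout;
}
```

Under this assumption, at most `fanout` CPUs ever compete for a given leaf
lock, no matter how many CPUs the machine has.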
Some shortcomings:
o More bugs will probably surface as a result of an ongoing
line-by-line code inspection.
Patches will be provided as required.
o There are probably hangs, rcutorture failures, &c. Seems
quite stable on a 128-CPU machine, but that is kind of small
compared to 4096 CPUs. However, seems to do better than
mainline.
Patches will be provided as required.
o The memory footprint of this version is several KB larger
than rcuclassic.
A separate UP-only rcutiny patch will be provided, which will
reduce the memory footprint significantly, even compared
to the old rcuclassic. One such patch passes light testing,
and has a memory footprint smaller even than rcuclassic.
Initial reaction from various embedded guys was "it is not
worth it", so am putting it aside.
Credits:
o Manfred Spraul for ideas, review comments, and bugs spotted,
as well as some good friendly competition. ;-)
o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers,
Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton
for reviews and comments.
o Thomas Gleixner for much-needed help with some timer issues
(see patches below).
o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos,
Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton
Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines
alive despite my heavy abuse^Wtesting.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Wed, 17 Dec 2008 23:05:26 +0000 (15:05 -0800)]
Merge branch 'i2c-fixes' of git://aeryn.fluff.org.uk/bjdooks/linux
* 'i2c-fixes' of git://aeryn.fluff.org.uk/bjdooks/linux:
i2c-s3c2410: fix check for being in suspend.
i2c-cpm: Detect and report NAK right away instead of timing out
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: pl2303: add id for Hewlett-Packard LD220-HP POS pole display
USB: set correct configuration in probe of ti_usb_3410_5052
USB: add 5372:2303 to pl2303
USB: skip Set-Interface(0) if already in altsetting 0
USB: fix comment about endianness of descriptors
USB: Documentation/usb/gadget_serial.txt: update to match driver use_acm behaviour
usbmon: drop bogus 0t from usbmon.txt
USB: gadget: fix rndis working at high speed
USB: ftdi_sio: Adding Ewert Energy System's CANdapter PID
USB: tty: SprogII DCC controller identifiers
usb-storage: update unusual_devs entry for Nokia 5310
USB: Unusual devs patch for Nokia 3500c
USB: storage: unusual_devs.h: Nokia 3109c addition
USB: fix problem with usbtmc driver not loading properly
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
STAGING: Move staging drivers back to staging-specific menu
driver core: add newlines to debugging enabled/disabled messages
xilinx_hwicap: remove improper wording in license statement
driver core: fix using 'ret' variable in unregister_dynamic_debug_module
Jeff Layton [Wed, 17 Dec 2008 11:31:53 +0000 (06:31 -0500)]
cifs: fix buffer overrun in parse_DFS_referrals
While testing a kernel with memory poisoning enabled, I saw some warnings
about the redzone getting clobbered when chasing DFS referrals. The
buffer allocation for the unicode converted version of the searchName is
too small and needs to take null termination into account.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
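As a generic illustration of the sizing mistake described in the cifs fix
above (this is not the cifs code), the sketch below computes the bytes needed
for a 16-bit-per-character copy of a name, terminator included. It assumes
one 16-bit unit per input byte; utf16_buf_size() is a hypothetical helper:

```c
#include <stddef.h>
#include <string.h>

/*
 * Bytes required to hold `name` converted to 16-bit units, including the
 * 16-bit null terminator. Assumes one output unit per input byte.
 */
static size_t utf16_buf_size(const char *name)
{
	return (strlen(name) + 1) * 2;
}
```

Forgetting the "+ 1" leaves the buffer two bytes short, which is the kind of
overrun that memory poisoning reports as a clobbered redzone.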
Linus Torvalds [Wed, 17 Dec 2008 22:58:56 +0000 (14:58 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc:
powerpc: Fix corruption error in rh_alloc_fixed()
powerpc/fsl-booke: Fix the miss interrupt restore
driver core: fix using 'ret' variable in unregister_dynamic_debug_module
The 'ret' variable is assigned, but not used in the return statement. Fix this.
Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Matthew Arnold [Sat, 13 Dec 2008 11:42:53 +0000 (22:42 +1100)]
USB: add 5372:2303 to pl2303
This patch adds the "Superial" USB-Serial converter to pl2303 so that it
is detected by the correct driver. Adds the relevant vendor:product
(5372:2303) to the device tables in pl2303.c & pl2303.h. The patch has
been tested against 2.6.24-22-generic.
Signed-off-by: Matthew D Arnold <matthew.arnold-1@uts.edu.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Mon, 1 Dec 2008 15:24:41 +0000 (10:24 -0500)]
USB: skip Set-Interface(0) if already in altsetting 0
When a driver unbinds from an interface, usbcore always sends a
Set-Interface request to reinstall altsetting 0. Unfortunately, quite
a few devices have buggy firmware that crashes when it receives this
request.
To avoid such problems, this patch (as1180) arranges to send the
Set-Interface request only when the interface is not already in
altsetting 0.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
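The behavior described in the Set-Interface patch above can be sketched in
plain C as follows. The struct and function names are hypothetical stand-ins,
not usbcore's, and the counter merely stands in for issuing the control
request:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-interface state tracked by the core. */
struct iface_state {
	int cur_altsetting;	/* currently selected alternate setting */
	int requests_sent;	/* counts Set-Interface requests issued */
};

/* Returns true if a Set-Interface(0) request was actually sent. */
static bool reset_altsetting_on_unbind(struct iface_state *intf)
{
	if (intf->cur_altsetting == 0)
		return false;		/* already in altsetting 0: send nothing */

	intf->requests_sent++;		/* stands in for the control transfer */
	intf->cur_altsetting = 0;
	return true;
}
```

Devices that never leave altsetting 0 therefore never see the request that
their firmware mishandles.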
Phil Endecott [Mon, 1 Dec 2008 15:22:33 +0000 (10:22 -0500)]
USB: fix comment about endianness of descriptors
This patch fixes a comment and clarifies the documentation about the
endianness of descriptors. The current policy is that descriptors will
be little-endian at the API even on big-endian systems; however the
/proc/bus/usb API predates this policy and presents descriptors with
some multibyte fields byte-swapped.
Signed-off-by: Phil Endecott <usb_endian_patch@chezphil.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Peter Korsgaard [Thu, 4 Dec 2008 15:30:53 +0000 (16:30 +0100)]
USB: Documentation/usb/gadget_serial.txt: update to match driver use_acm behaviour
Commit 7bb5ea54 (usb gadget serial: use composite gadget framework)
changed the default for the use_acm parameter from 0 to 1.
Update the documentation to match.
Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Brownell [Tue, 25 Nov 2008 07:11:03 +0000 (23:11 -0800)]
USB: gadget: fix rndis working at high speed
Fix a bug specific to highspeed mode in the recently updated RNDIS
support: it wasn't setting up the high speed notification endpoint,
which prevented high speed RNDIS links from working.
Andrew Ewert [Thu, 4 Dec 2008 15:09:59 +0000 (09:09 -0600)]
USB: ftdi_sio: Adding Ewert Energy System's CANdapter PID
The following patch adds in the USB PID for Ewert Energy System's CANdapter
device (CANBUS to USB-Serial which uses the FTDI 245R chipset) to the ftdi_sio
device driver.
The patch was tested successfully on Linux kernel 2.6.27 under Ubuntu.
Relevant output from /proc/bus/usb/devices (With patch installed):
Alan Cox [Sun, 7 Dec 2008 07:46:04 +0000 (23:46 -0800)]
USB: tty: SprogII DCC controller identifiers
Someone on rmweb reminded me this had been overlooked from ages ago..
Add the identifiers for the Sprog II USB. This is a DCC control interface
using the FTDI-SIO hardware: http://www.sprog-dcc.co.uk/. People have been
using it with insmod options for ages, this just puts it into the driver
data.
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
CSÉCSY László [Tue, 9 Dec 2008 22:39:14 +0000 (23:39 +0100)]
USB: storage: unusual_devs.h: Nokia 3109c addition
2.6.26(.x, cannot remember) could handle the microSD card in my Nokia
3109c attached via USB as mass storage, 2.6.27(.x, up to and including
2.6.27.8) cannot. Please find the attached patch which fixes this
regression, and a copy of /proc/bus/usb/devices with my phone plugged in
running with this patch on Frugalware.
There is an error in rh_alloc_fixed() of the Remote Heap code:
If there is at least one free block blk won't be NULL at the end of the
search loop, so -ENOMEM won't be returned and the else branch of
"if (bs == s || be == e)" will be taken, corrupting the management
structures.
Signed-off-by: Guillaume Knispel <gknispel@proformatique.com> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
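The pitfall described in the rh_alloc_fixed() fix above, a loop cursor that
is not NULL after an unsuccessful search, can be shown with a small
self-contained sketch. This is not the Remote Heap code: the struct, the
sentinel-headed circular list, and find_block() are all assumptions made for
illustration:

```c
#include <stddef.h>

struct blk {
	unsigned long start, size;
	struct blk *next;	/* circular list; the head is a sentinel */
};

/* Return the block fully containing [start, start + size), or NULL. */
static struct blk *find_block(struct blk *head, unsigned long start,
			      unsigned long size)
{
	struct blk *found = NULL;	/* set only on a real match */
	struct blk *b;

	for (b = head->next; b != head; b = b->next) {
		if (b->start <= start && start + size <= b->start + b->size) {
			found = b;
			break;
		}
	}
	/* Testing `b` here would be wrong: it is never NULL after the loop. */
	return found;
}
```

Recording the match in a separate variable lets the caller distinguish "no
suitable block" from "loop finished", so the error path is taken when it
should be.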
Dave Liu [Wed, 17 Dec 2008 10:24:15 +0000 (18:24 +0800)]
powerpc/fsl-booke: Fix the miss interrupt restore
Commit e5e774d8833de1a0037be2384efccadf16935675
(powerpc/fsl-booke: Fix problem with _tlbil_va being interrupted)
introduced an issue that causes a problem like this:
Paul E. McKenney [Tue, 16 Dec 2008 00:13:07 +0000 (16:13 -0800)]
rcu: fix rcutorture behavior during reboot
Impact: fix very rare reboot hang
Because rcutorture ignores all signals, it does not terminate in
response to the signals sent at shutdown time. This can cause strange
failures due to its continuing to make use of kernel functions too late
in the shutdown sequence. This patch therefore adds a shutdown notifier
to rcutorture, causing it to shut down in response to a reboot or an
orderly shutdown.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tao Ma [Fri, 5 Dec 2008 01:14:10 +0000 (09:14 +0800)]
ocfs2: Always update xattr search when creating bucket.
When we create xattr bucket during the process of xattr set, we always
need to update the ocfs2_xattr_search since even if the bucket size is
the same as block size, the offset will change because of the removal
of the ocfs2_xattr_block header.
Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Ben Dooks [Fri, 31 Oct 2008 16:10:22 +0000 (16:10 +0000)]
i2c-s3c2410: fix check for being in suspend.
As noted by Julia Lawall <julia@diku.dk>, we can never
trigger the check for being in suspend due to the result
of !readl(i2c->regs + S3C2410_IICCON) & S3C2410_IICCON_IRQEN
always being 0.
Add suspend/resume hooks to stop i2c transactions happening
until the driver has been resumed.
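The reason the old check could never fire is C operator precedence: unary
"!" binds tighter than "&". A small sketch follows; the bit position given to
IICCON_IRQEN is an assumption made for illustration, and both function names
are hypothetical. Note that the patch itself adds suspend/resume hooks; this
only shows why the original expression was always 0:

```c
#include <stdint.h>

#define IICCON_IRQEN (1u << 5)	/* assumed bit position, illustration only */

/* How "!reg & IICCON_IRQEN" is parsed: (!reg) is 0 or 1, so bit 5 is never set. */
static unsigned int check_as_written(uint32_t reg)
{
	return (!reg) & IICCON_IRQEN;
}

/* What a "bit is clear" test needs: mask first, then negate. */
static unsigned int check_as_intended(uint32_t reg)
{
	return !(reg & IICCON_IRQEN);
}
```

check_as_written() returns 0 for every register value, whereas
check_as_intended() returns 1 exactly when the bit is clear.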
Mike Ditto [Tue, 16 Dec 2008 20:17:09 +0000 (20:17 +0000)]
i2c-cpm: Detect and report NAK right away instead of timing out
Make the driver report an ENXIO error immediately upon NAK instead of
waiting for another interrupt and getting a timeout.
When reading from a device that is not present or declines to respond
to, e.g., a non-existent register address, CPM immediately reports a
NAK condition in the TxBD, but the driver kept waiting until a timeout,
which takes 1 second and causes an ugly console error message.
Signed-off-by: Mike Ditto <mditto@consentry.com> Acked-by: Jochen Friedrich <jochen@scram.de>
[ben-linux@fluff.org: reordered description text] Signed-off-by: Ben Dooks <ben-linux@fluff.org>
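A rough sketch of the early-exit idea in the i2c-cpm change above: inspect
the transmit descriptor status as soon as it is available and return -ENXIO
on NAK rather than waiting for a timeout. TXBD_NAK and its bit position are
hypothetical; the driver's actual flag names are not given in the text:

```c
#include <errno.h>
#include <stdint.h>

#define TXBD_NAK (1u << 2)	/* hypothetical flag bit, illustration only */

/* Return -ENXIO as soon as the descriptor reports a NAK, else 0. */
static int check_tx_status(uint16_t txbd_status)
{
	if (txbd_status & TXBD_NAK)
		return -ENXIO;
	return 0;
}
```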
Linus Torvalds [Tue, 16 Dec 2008 17:47:58 +0000 (09:47 -0800)]
Merge branch 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Disable GENERIC_HARDIRQS_NO__DO_IRQ for unconverted platforms.
sh: maple: Do not pass SLAB_POISON to kmem_cache_create()
Linus Torvalds [Tue, 16 Dec 2008 17:47:43 +0000 (09:47 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc/cell/axon-msi: Fix MSI after kexec
powerpc: Fix bootmem reservation on uninitialized node
powerpc: Check for valid hugepage size in hugetlb_get_unmapped_area
Tejun Heo [Tue, 9 Dec 2008 08:13:19 +0000 (17:13 +0900)]
pata_hpt366: fix cable detection
pata_hpt366 is strange in that its two channels occupy two PCI
functions and both are primary channels and bit1 of PCI configuration
register 0x5A indicates cable for both channels.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Matt Fleming [Tue, 16 Dec 2008 00:15:31 +0000 (09:15 +0900)]
sh: maple: Do not pass SLAB_POISON to kmem_cache_create()
SLAB_POISON is not a valid flag for kmem_cache_create() unless
CONFIG_DEBUG_SLAB is set, so remove it from the flags argument.
Acked-by: Adrian McMenamin <adrian@newgolddream.dyndns.info> Signed-off-by: Matt Fleming <mjf@gentoo.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Arnd Bergmann [Fri, 12 Dec 2008 09:19:50 +0000 (09:19 +0000)]
powerpc/cell/axon-msi: Fix MSI after kexec
Commit d015fe995 'powerpc/cell/axon-msi: Retry on missing interrupt'
has turned a rare failure to kexec on QS22 into a reproducible
error, which we have now analysed.
The problem is that after a kexec, the MSIC hardware still points
into the middle of the old ring buffer. We set up the ring buffer
during reboot, but not the offset into it. On older kernels, this
would cause a storm of thousands of spurious interrupts after a
kexec, which would most of the time get dropped silently.
With the new code, we time out on each interrupt, waiting for
it to become valid. If more interrupts come in that we time
out on, this goes on indefinitely, which eventually leads to
a hard crash.
The solution in this commit is to read the current offset from
the MSIC when reinitializing it. This now works correctly, as
expected.
Reported-by: Dirk Herrendoerfer <d.herrendoerfer@de.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
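The essence of the axon-msi fix above can be sketched as follows. Both
structs are hypothetical stand-ins (msic_sim plays the role of the hardware
register holding the current offset), and ring_reinit() is not the driver's
function name:

```c
/* Stand-in for the device: the offset at which it will write next. */
struct msic_sim {
	unsigned int hw_offset;
};

/* Stand-in for the driver's view of the ring buffer. */
struct ring {
	unsigned int read_offset;
};

/* On reinitialization, resume from the hardware's offset instead of 0. */
static void ring_reinit(struct ring *r, const struct msic_sim *hw)
{
	r->read_offset = hw->hw_offset;
}
```

Starting from 0 after a kexec would leave the driver reading entries the
hardware is not writing, which matches the stale-offset symptom described.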
Dave Hansen [Thu, 11 Dec 2008 08:36:06 +0000 (08:36 +0000)]
powerpc: Fix bootmem reservation on uninitialized node
careful_allocation() was calling into the bootmem allocator for
nodes which had not been fully initialized and caused a previous
bug: http://patchwork.ozlabs.org/patch/10528/ So, I merged a
few broken out loops in do_init_bootmem() to fix it. That changed
the code ordering.
I think this bug is triggered by having reserved areas for a node
which are spanned by another node's contents. In the
mark_reserved_regions_for_nid() code, we attempt to reserve the
area for a node before we have allocated the NODE_DATA() for that
nid. We do this since I reordered that loop. I suck.
This is causing crashes at bootup on some systems, as reported
by Jon Tollefson.
This may only present on some systems that have 16GB pages
reserved. But, it can probably happen on any system that is
trying to reserve large swaths of memory that happen to span other
nodes' contents.
This commit ensures that we do not touch bootmem for any node which
has not been initialized, and also removes a compile warning about
an unused variable.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Brian King [Thu, 4 Dec 2008 04:07:54 +0000 (04:07 +0000)]
powerpc: Check for valid hugepage size in hugetlb_get_unmapped_area
It looks like most of the hugetlb code is doing the correct thing if
hugepages are not supported, but the mmap code is not. If we get into
the mmap code when hugepages are not supported, such as in an LPAR
which is running Active Memory Sharing, we can oops the kernel. This
fixes the oops being seen in this path.
Linus Torvalds [Tue, 16 Dec 2008 00:31:05 +0000 (16:31 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5348/1: fix documentation wrt location of the alignment trap interface
[ARM] Ensure linux/hardirqs.h is included where required
[ARM] fix kernel-doc syntax
[ARM] arch/arm/common/sa1111.c: Correct error handling code
[ARM] 5341/2: there is no copy_page on nommu ARM
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
Phonet: keep TX queue disabled when the device is off
SCHED: netem: Correct documentation comment in code.
netfilter: update rwlock initialization for nat_table
netlabel: Compiler warning and NULL pointer dereference fix
e1000e: fix double release of mutex
IA64: HP_SIMETH needs to depend upon NET
netpoll: fix race on poll_list resulting in garbage entry
ipv6: silence log messages for locally generated multicast
sungem: improve ethtool output with internal pcs and serdes
tcp: tcp_vegas cong avoid fix
sungem: Make PCS PHY support partially work again.
Rusty Russell [Mon, 15 Dec 2008 08:34:35 +0000 (19:04 +1030)]
Define smp_call_function_many for UP
Otherwise those using it in transition patches (eg. kvm) can't compile
with CONFIG_SMP=n:
arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function 'make_all_cpus_request':
arch/x86/kvm/../../../virt/kvm/kvm_main.c:380: error: implicit declaration of function 'smp_call_function_many'
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Menage [Mon, 15 Dec 2008 21:54:22 +0000 (13:54 -0800)]
cgroups: fix a race between rmdir and remount
When a cgroup is removed, it's unlinked from its parent's children list,
but not actually freed until the last dentry on it is released (at which
point cgrp->root->number_of_cgroups is decremented).
Currently rebind_subsystems checks for the top cgroup's child list being
empty in order to rebind subsystems into or out of a hierarchy - this can
result in the set of subsystems bound to a hierarchy changing while a
removed-but-not-freed cgroup still exists.
The simplest fix for this is to forbid remounts that change the set of
subsystems on a hierarchy that has removed-but-not-freed cgroups. This
bug can be reproduced via:
ACPI toshiba: only register rfkill if bt is enabled
Part of the rfkill initialization was done whether BT was on or not. The
following patch checks for BT presence before registering the rfkill to
the input layer. Some minor cleanups (> 80 char lines) were also added in
the process.
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Tested-by: Andrey Borzenkov <arvidjaar@mail.ru> Acked-by: Len Brown <len.brown@intel.com> Cc: Richard Purdie <rpurdie@rpsys.net> Acked-by: Philip Langdale <philipl@overt.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Catalin Marinas [Mon, 15 Dec 2008 21:54:16 +0000 (13:54 -0800)]
slob: do not pass the SLAB flags as GFP in kmem_cache_create()
The kmem_cache_create() function in the slob allocator passes the SLAB
flags as GFP flags to the slob_alloc() function. The patch changes this
call to pass GFP_KERNEL as the other allocators seem to do.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Rostedt [Mon, 15 Dec 2008 08:19:14 +0000 (00:19 -0800)]
netfilter: update rwlock initialization for nat_table
The commit e099a173573ce1ba171092aee7bb3c72ea686e59
(netfilter: netns nat: per-netns NAT table) renamed the
nat_table from __nat_table to nat_table without updating the
__RW_LOCK_UNLOCKED(__nat_table.lock).
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Zachary Amsden [Sat, 13 Dec 2008 20:36:58 +0000 (12:36 -0800)]
x86 Fix VMI crash on boot in 2.6.28-rc8
VMI initialization can relocate the fixmap, causing early_ioremap to
malfunction if it is initialized before the relocation. To fix this,
VMI activation is split into two phases; the detection, which must
happen before setting up ioremap, and the activation, which must happen
after parsing early boot parameters.
This fixes a crash on boot when VMI is enabled under VMware.
Randy Dunlap [Mon, 1 Dec 2008 22:15:37 +0000 (14:15 -0800)]
[ARM] fix kernel-doc syntax
Fix kernel-doc notation to use correct syntax. Even though this should be
moved to where the function is actually implemented...
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Stefan Richter [Sat, 13 Dec 2008 00:43:59 +0000 (01:43 +0100)]
ieee1394: add quirk fix for Freecom HDD
According to http://bugzilla.kernel.org/show_bug.cgi?id=12206, Freecom
FireWire Hard Drive 1TB reports max_rom=2 but returns garbage if block
read requests are used to read the config ROM. Force max_rom=0 to limit
them to quadlet read requests.
Reported-by: Christian Mueller <cm1@mumac.de> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
We got interrupted after setting up the MAS registers before the
tlbwe and the interrupt handler that caused the interrupt also did
a kmap_atomic (ide code) and thus on returning from the interrupt
the MAS registers no longer contained the proper values.
Since we don't save/restore MAS registers for normal interrupts we
need to disable interrupts in _tlbil_va to ensure atomicity.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Ingo Brueckl [Wed, 10 Dec 2008 22:35:00 +0000 (23:35 +0100)]
console ASCII glyph 1:1 mapping
For the console, there is a 1:1 mapping of glyphs which cannot be found
in the current font. This seems to be meant as a kind of 'emergency
fallback' for fonts without unicode mapping which otherwise would
display nothing readable on the screen.
At the moment it affects all chars for which no substitution character
is defined. In particular this means that for all chars (>= 128) where
there is no iso8859-1/unicode character (e.g. control character area)
you'll get the very strange 1:1 mapping of the (cp437) graphics card
glyphs.
I'm pretty sure that the 1:1 mapping should only affect strict ASCII
code characters, i.e. chars < 128.
The patch limits the mapping as it probably was meant anyway.
Signed-off-by: Ingo Brueckl <ib@wupperonline.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Egmont Koblinger <egmont@uhulinux.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
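The restricted fallback described in the glyph-mapping patch above might look
like the sketch below. It is not the console code: glyph_fallback() is a
hypothetical helper, and the convention that -1 means "no glyph" is an
assumption:

```c
#include <stdint.h>

/*
 * `font_glyph` is the glyph the font provides for `ch`, or -1 if it has
 * none. Returns the glyph to draw, or -1 when nothing should be mapped.
 */
static int glyph_fallback(uint32_t ch, int font_glyph)
{
	if (font_glyph >= 0)
		return font_glyph;	/* the font knows this character */
	if (ch < 128)
		return (int)ch;		/* 1:1 fallback, strict ASCII only */
	return -1;			/* no 1:1 mapping for chars >= 128 */
}
```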
Ingo Brueckl [Wed, 10 Dec 2008 22:34:00 +0000 (23:34 +0100)]
unicode table for cp437
There is a major bug in the cp437 to unicode translation table. Char
0x7c is mapped to U+00a5 which is the Yen sign and wrong. The right
mapping is U+00a6 (broken bar).
Furthermore, a mapping for U+00b4 (a widely used character) is missing
even though easily possible.
The patch fixes these and also provides a few other useful
mappings.
The changes are as follows:
0x0f (enhancement) enables a sort of currency symbol
0x27 (bug) enables a sort of acute accent which is a widely used character
0x44 (enhancement) enables a sort of icelandic capital letter eth
0x7c (major bug) corrects mapping
0xeb (enhancement) enables a sort of icelandic small letter eth
0xee (enhancement) enables a sort of math 'element of'
Signed-off-by: Ingo Brueckl <ib@wupperonline.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dmitri Vorobiev [Wed, 10 Dec 2008 20:38:36 +0000 (22:38 +0200)]
MIPS: Kconfig: Fix the arch-specific header path
The header path in the help text for the RUNTIME_DEBUG config option is
obsolete and needs to be updated to match the new location of
architecture-specific header files. While at it, fix the spelling mistake.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commands needing to be retried require a complete re-initialization.
The test-unit-ready portion of this patch was causing boots to fail on
my test machine (as in http://lkml.org/lkml/2008/12/5/161). With this
patch in place, the system is booting reliably.
Mike Anderson found the same problem in the hp_hw_start_stop code,
and I applied the same solution in cdrom_read_cdda_bpc.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Cc: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Paul Moore [Fri, 12 Dec 2008 05:31:50 +0000 (21:31 -0800)]
netlabel: Compiler warning and NULL pointer dereference fix
Fix the two compiler warnings shown below. Thanks to Geert Uytterhoeven for
finding and reporting the problem.
net/netlabel/netlabel_unlabeled.c:567: warning: 'entry' may be used
uninitialized in this function
net/netlabel/netlabel_unlabeled.c:629: warning: 'entry' may be used
uninitialized in this function
Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Fri, 12 Dec 2008 05:28:11 +0000 (21:28 -0800)]
e1000e: fix double release of mutex
During a reset, releasing the swflag after it failed to be acquired would
cause a double unlock of the mutex. Instead, test whether acquisition of
the swflag was successful and if not, do not release the swflag. The reset
must still be done to bring the device to a quiescent state.
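The control flow described above can be sketched roughly as follows. This is a userspace illustration only: struct swflag, swflag_acquire(), swflag_release() and reset_hw() are hypothetical names and do not correspond to the driver's real functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the software flag and its mutex. */
struct swflag {
    int  held;          /* lock depth: must only ever be 0 or 1 */
    bool fail_acquire;  /* simulate a failed acquisition */
    int  resets;        /* number of resets issued */
};

static int swflag_acquire(struct swflag *f)
{
    if (f->fail_acquire)
        return -1;      /* nothing was locked */
    f->held++;
    return 0;
}

static void swflag_release(struct swflag *f)
{
    assert(f->held == 1);   /* releasing an unheld flag is the bug */
    f->held--;
}

static void reset_hw(struct swflag *f)
{
    int ret = swflag_acquire(f);

    /* The reset is issued whether or not the flag was acquired. */
    f->resets++;

    /* Release only what was actually acquired. */
    if (ret == 0)
        swflag_release(f);
}
```

Calling swflag_release() unconditionally after a failed acquire would trip the assertion, which plays the role of the "bad unlock balance" report here.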
This resolves [BUG 12200] BUG: bad unlock balance detected! e1000e
http://bugzilla.kernel.org/show_bug.cgi?id=12200
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Josh Boyer [Tue, 25 Nov 2008 06:33:35 +0000 (06:33 +0000)]
powerpc/40x: Add proper BOOTCFLAGS for cuboot-acadia
The cuboot-acadia.c wrapper can cause assembler errors on some
toolchains due to the lack of the proper BOOTCFLAGS. This adds
the proper flags for the file.
Harvey Harrison [Thu, 11 Dec 2008 11:11:21 +0000 (12:11 +0100)]
i2c-highlander: Trivial endian casting fixes
Fixes sparse warnings:
drivers/i2c/busses/i2c-highlander.c:95:26: warning: incorrect type in argument 1 (different base types)
drivers/i2c/busses/i2c-highlander.c:95:26: expected restricted __be16 const [usertype] *p
drivers/i2c/busses/i2c-highlander.c:95:26: got unsigned short [usertype] *<noident>
drivers/i2c/busses/i2c-highlander.c:106:15: warning: incorrect type in assignment (different base types)
drivers/i2c/busses/i2c-highlander.c:106:15: expected unsigned short [unsigned] [short] [usertype] <noident>
drivers/i2c/busses/i2c-highlander.c:106:15: got restricted __be16
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ben Dooks <ben-linux@fluff.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>
Harvey Harrison [Thu, 11 Dec 2008 11:11:20 +0000 (12:11 +0100)]
i2c-pmcmsp: Fix endianness misannotation
tmp is used as host-endian and is loaded from a be64, fix the cast and the
endian accessor used.
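For illustration, a portable userspace stand-in for loading a big-endian 64-bit value into a host-endian variable might look like the sketch below. load_be64() is a hypothetical helper written for this example; it is not the accessor the driver uses.

```c
#include <stdint.h>

/* Assemble a host-endian value from eight big-endian bytes.
 * Works identically on little- and big-endian hosts. */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];    /* most significant byte first */
    return v;
}
```

The result is an ordinary host-endian integer, so the variable holding it should carry a plain integer type rather than a big-endian annotated one.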
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>
because even when disabled, it breaks for people. See
http://bugzilla.kernel.org/show_bug.cgi?id=12191
for the latest example.
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Krzysztof Halasa <khc@pm.waw.pl> Cc: James Cloos <cloos@jhcloos.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Jean-Luc Coulon <jean.luc.coulon@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 10 Dec 2008 20:48:52 +0000 (20:48 +0000)]
fix mapping_writably_mapped()
Lee Schermerhorn noticed yesterday that I broke the mapping_writably_mapped
test in 2.6.7! Bad bad bug, good good find.
The i_mmap_writable count must be incremented for VM_SHARED (just as
i_writecount is for VM_DENYWRITE, but while holding the i_mmap_lock)
when dup_mmap() copies the vma for fork: it has its own more optimal
version of __vma_link_file(), and I missed this out. So the count
was later going down to 0 (dangerous) when one end unmapped, then
wrapping negative (inefficient) when the other end unmapped.
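The accounting error can be modelled in a few lines of userspace C. This is a sketch under assumptions: struct mapping and the helper functions are simplified stand-ins, the counter is a plain int rather than the kernel's type, and the "fixed" flag merely selects between the old and the corrected fork behaviour.

```c
#include <stdbool.h>

/* Simplified stand-in for the per-file mapping bookkeeping. */
struct mapping {
    int i_mmap_writable;    /* number of VM_SHARED mappings */
};

/* Normal link/unlink paths keep the count balanced. */
static void link_shared_vma(struct mapping *m)   { m->i_mmap_writable++; }
static void unlink_shared_vma(struct mapping *m) { m->i_mmap_writable--; }

/* The fork path copies the vma; before the fix it skipped the increment. */
static void fork_copy_vma(struct mapping *m, bool fixed)
{
    if (fixed)
        m->i_mmap_writable++;
}

/* Any non-zero count is treated as "writably mapped". */
static bool writably_mapped(const struct mapping *m)
{
    return m->i_mmap_writable != 0;
}
```

Without the increment, one mmap plus one fork leaves the count at 1 for two mappings: the first unmap drops it to 0 while a shared mapping still exists, and the second drops it below zero, after which the test reports true forever.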
The only impact on x86 would have been that setting a mandatory lock on
a file which has at some time been opened O_RDWR and mapped MAP_SHARED
(but not necessarily PROT_WRITE) across a fork, might fail with -EAGAIN
when it should succeed, or succeed when it should fail.
But those architectures which rely on flush_dcache_page() to flush
userspace modifications back into the page before the kernel reads it,
may in some cases have skipped the flush after such a fork - though any
repetitive test will soon wrap the count negative, in which case it will
flush_dcache_page() unnecessarily.
Fix would be a two-liner, but mapping variable added, and comment moved.
Manfred Spraul [Wed, 10 Dec 2008 17:17:06 +0000 (18:17 +0100)]
lib/idr.c: Fix bug introduced by RCU fix
The last patch to lib/idr.c caused a bug if idr_get_new_above() was
called on an empty idr.
Usually, nodes stay on the same layer. New layers are added to the top
of the tree.
The exception is idr_get_new_above() on an empty tree: In this case, the
new root node is first added on layer 0, then moved upwards. p->layer
was not updated.
As usual: You shall never rely on the source code comments, they will
only mislead you.
Akira Takeuchi [Wed, 10 Dec 2008 12:43:34 +0000 (12:43 +0000)]
MN10300: Fix __put_user_asm8()
Fix __put_user_asm8() by jumping to the end label (3:) from the exception
handler, rather than jumping back to retry the second store instruction (label
2:).
Akira Takeuchi [Wed, 10 Dec 2008 12:43:29 +0000 (12:43 +0000)]
MN10300: Fix the preemption resume_kernel() routine
Fix the preemption resume_kernel() routine by inverting the test to see
whether interrupts are off (IM7 is all enabled, not all disabled).
Furthermore, interrupts should be disabled on entry to resume_kernel() so that
they're correctly set for jumping to restore_all() and doing the need
reschedule test.
Akira Takeuchi [Wed, 10 Dec 2008 12:43:24 +0000 (12:43 +0000)]
MN10300: Discard low-priority Tx interrupts when closing an on-chip serial port
Discard low-priority Tx interrupts when closing an MN10300 on-chip serial port.
The MN10300 on-chip serial port uses three interrupts to manage its serial
ports:
(1) A very high priority interrupt that drives virtual DMA for Rx.
(2) A very high priority interrupt that drives virtual DMA for Tx.
(3) A normal priority virtual interrupt that does the normal UART interrupt
stuff and is shared between Rx and Tx.
mn10300_serial_stop_tx() only disables the high priority Tx interrupt. It
doesn't also disable the normal priority one because it is shared with Rx.
However, the high priority interrupt may interrupt local_irq_disabled()
sections, and so may have queued up a low priority virtual interrupt whilst the
UART driver is asking for the Tx interrupt to be disabled.
The result of this can be an oops when we try to process the interrupt in
mn10300_serial_transmit_interrupt() as port->uart.info and port->uart.info->tty
may have gone away.
To deal with this, if either of those pointers is NULL, we make sure the
high-priority Tx interrupt is disabled and discard the interrupt. The low
priority interrupt is disabled by the mn10300_serial_pic irq_chip table.
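The guard described above might be sketched like this. The types and the function below are simplified, hypothetical stand-ins for the driver's structures; only the shape of the NULL check is taken from the description.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified port state. */
struct tty       { int unused; };
struct uart_info { struct tty *tty; };
struct port {
    struct uart_info *info;     /* may be gone once the port is closed */
    bool tx_irq_enabled;        /* high-priority Tx interrupt state */
    int  bytes_sent;
};

/* Returns true if the interrupt was processed, false if discarded. */
static bool transmit_interrupt(struct port *p)
{
    if (p->info == NULL || p->info->tty == NULL) {
        /* Port is closing: make sure Tx is off and drop the event. */
        p->tx_irq_enabled = false;
        return false;
    }

    p->bytes_sent++;            /* normal transmit work would go here */
    return true;
}
```

The short-circuit evaluation matters: p->info->tty is only read once p->info is known to be non-NULL.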
Linus Torvalds [Wed, 10 Dec 2008 18:04:50 +0000 (10:04 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch
PCI: stop leaking 'slot_name' in pci_create_slot
Linus Torvalds [Wed, 10 Dec 2008 18:04:25 +0000 (10:04 -0800)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] SN: prevent IRQ retargetting in request_irq()
[IA64] Fix section mismatch ioc3uart_init()/ioc3uart_submodule
[IA64] Clear up section mismatch for ioc4_ide_attach_one.
[IA64] Clear up section mismatch with arch_unregister_cpu()
[IA64] Clear up section mismatch for sn_check_wars.
[IA64] Updated the generic_defconfig to work with the 2.6.28-rc7 kernel.
[IA64] Fix GRU compile error w/o CONFIG_HUGETLB_PAGE
[IA64] eliminate NULL test and memset after alloc_bootmem
[IA64] remove BUILD_BUG_ON from paravirt_getreg()
Kay Sievers [Sat, 6 Dec 2008 03:38:11 +0000 (04:38 +0100)]
pktcdvd: remove broken dev_t export of class devices
The pktcdvd created class devices only export some sysfs files,
but have no char dev_t registered in the driver.
At class device creation time they copy the dev_t value of the
block device to the char device, which will register a new char
device in the driver core and userspace, with a conflicting dev_t
value.
In many cases the class device's dev_t just points to a random
USB device. This fixes the sysfs "duplicate entry" errors.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Peter Osterlund <petero2@telia.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>