Dimitri Sivanich [Fri, 29 Jun 2012 21:17:29 +0000 (14:17 -0700)]
rcu: Segregate rcu_state fields to improve cache locality
The fields in the rcu_state structure that are protected by the
root rcu_node structure's ->lock can share a cache line with the
fields protected by ->onofflock. This can result in excessive
memory contention on large systems, so this commit applies
____cacheline_internodealigned_in_smp to the ->onofflock field in
order to segregate them.
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Tue, 12 Jun 2012 00:39:43 +0000 (17:39 -0700)]
rcu: Provide OOM handler to motivate lazy RCU callbacks
In kernels built with CONFIG_RCU_FAST_NO_HZ=y, CPUs can accumulate a
large number of lazy callbacks, which as the name implies will be slow
to be invoked. This can be a problem on small-memory systems, where the
default 6-second sleep for CPUs having only lazy RCU callbacks could well
be fatal. This commit therefore installs an OOM hander that ensures that
every CPU with lazy callbacks has at least one non-lazy callback, in turn
ensuring timely advancement for these callbacks.
Updated to fix bug that disabled OOM killing, noted by Lai Jiangshan.
Updated to push the for_each_rcu_flavor() loop into rcu_oom_notify_cpu(),
thus reducing the number of IPIs, as suggested by Steven Rostedt. Also
to make the for_each_online_cpu() loop be preemptible. (Later, it might
be good to use smp_call_function(), as suggested by Peter Zijlstra.)
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Sasha Levin <levinsasha928@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Thu, 21 Jun 2012 16:54:10 +0000 (09:54 -0700)]
rcu: Prevent offline CPUs from executing RCU core code
Earlier versions of RCU invoked the RCU core from the CPU_DYING notifier
in order to note a quiescent state for the outgoing CPU. Because the
CPU is marked "offline" during the execution of the CPU_DYING notifiers,
the RCU core had to tolerate being invoked from an offline CPU. However,
commit b1420f1c (Make rcu_barrier() less disruptive) left only tracing
code in the CPU_DYING notifier, so the RCU core need no longer execute
on offline CPUs. This commit therefore enforces this restriction.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Fri, 22 Jun 2012 18:08:41 +0000 (11:08 -0700)]
rcu: Break up rcu_gp_kthread() into subfunctions
Then rcu_gp_kthread() function is too large and furthermore needs to
have the force_quiescent_state() code pulled in. This commit therefore
breaks up rcu_gp_kthread() into rcu_gp_init() and rcu_gp_cleanup().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Thu, 21 Jun 2012 15:19:05 +0000 (08:19 -0700)]
rcu: Allow RCU grace-period cleanup to be preempted
RCU grace-period cleanup is currently carried out with interrupts
disabled, which can result in excessive latency spikes on large systems
(many hundreds or thousands of CPUs). This patch therefore makes the
RCU grace-period cleanup be preemptible, including voluntary preemption
points, which should eliminate those latency spikes. Similar spikes from
forcing of quiescent states will be dealt with similarly by later patches.
Updated to replace uses of spin_lock_irqsave() with spin_lock_irq(), as
suggested by Peter Zijlstra.
Reported-by: Mike Galbraith <mgalbraith@suse.de> Reported-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Thu, 21 Jun 2012 00:07:14 +0000 (17:07 -0700)]
rcu: Move RCU grace-period cleanup into kthread
As a first step towards allowing grace-period cleanup to be preemptible,
this commit moves the RCU grace-period cleanup into the same kthread
that is now used to initialize grace periods. This is needed to keep
scheduling latency down to a dull roar.
[ paulmck: Get rid of stray spin_lock_irqsave() calls. ]
Reported-by: Mike Galbraith <mgalbraith@suse.de> Reported-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 20 Jun 2012 00:18:20 +0000 (17:18 -0700)]
rcu: Allow RCU grace-period initialization to be preempted
RCU grace-period initialization is currently carried out with interrupts
disabled, which can result in 200-microsecond latency spikes on systems
on which RCU has been configured for 4096 CPUs. This patch therefore
makes the RCU grace-period initialization be preemptible, which should
eliminate those latency spikes. Similar spikes from grace-period cleanup
and the forcing of quiescent states will be dealt with similarly by later
patches.
Reported-by: Mike Galbraith <mgalbraith@suse.de> Reported-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The next step in reducing RCU's grace-period initialization latency on
large systems will make this initialization preemptible. Unfortunately,
making the grace-period initialization subject to interrupts (let alone
preemption) exposes the following race on systems whose rcu_node tree
contains more than one node:
1. CPU 31 starts initializing the grace period, including the
first leaf rcu_node structures, and is then preempted.
2. CPU 0 refers to the first leaf rcu_node structure, and notes
that a new grace period has started. It passes through a
quiescent state shortly thereafter, and informs the RCU core
of this rite of passage.
3. CPU 0 enters an RCU read-side critical section, acquiring
a pointer to an RCU-protected data item.
4. CPU 31 takes an interrupt whose handler removes the data item
referenced by CPU 0 from the data structure, and registers an
RCU callback in order to free it.
5. CPU 31 resumes initializing the grace period, including its
own rcu_node structure. In invokes rcu_start_gp_per_cpu(),
which advances all callbacks, including the one registered
in #4 above, to be handled by the current grace period.
6. The remaining CPUs pass through quiescent states and inform
the RCU core, but CPU 0 remains in its RCU read-side critical
section, still referencing the now-removed data item.
7. The grace period completes and all the callbacks are invoked,
including the one that frees the data item that CPU 0 is still
referencing. Oops!!!
One way to avoid this race is to remove grace-period acceleration from
rcu_start_gp_per_cpu(). Now, the only reason for this acceleration was
to allow CPUs bringing RCU out of idle state to have their callbacks
invoked after only one grace period, rather than the two grace periods
that would otherwise be required. But this acceleration does not
work when RCU grace-period initialization is moved to a kthread because
the CPU posting the callback is no longer necessarily the CPU that is
initializing the resulting grace period.
This commit therefore removes this now-pointless (and soon to be dangerous)
grace-period acceleration, thus avoiding the above race.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Tue, 19 Jun 2012 01:36:08 +0000 (18:36 -0700)]
rcu: Move RCU grace-period initialization into a kthread
As the first step towards allowing grace-period initialization to be
preemptible, this commit moves the RCU grace-period initialization
into its own kthread. This is needed to keep large-system scheduling
latency at reasonable levels.
Also change raw_spin_lock_irqsave() to raw_spin_lock_irq() as suggested
by Peter Zijlstra in review comments.
Reported-by: Mike Galbraith <mgalbraith@suse.de> Reported-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
trace: Don't declare trace_*_rcuidle functions in modules
Tracepoints declare a static inline trace_*_rcuidle variant of the trace
function, to support safely generating trace events from the idle loop.
Module code never actually uses that variant of trace functions, because
modules don't run code that needs tracing with RCU idled. However, the
declaration of those otherwise unused functions causes the module to
reference rcu_idle_exit and rcu_idle_enter, which RCU does not export to
modules.
To avoid this, don't generate trace_*_rcuidle functions for tracepoints
declared in module code.
Link: http://lkml.kernel.org/r/20120905062306.GA14756@leaf Reported-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Merge branch 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA-mapping fixes from Marek Szyprowski:
"Another set of fixes for ARM dma-mapping subsystem.
Commit e9da6e9905e6 replaced custom consistent buffer remapping code
with generic vmalloc areas. It however introduced some regressions
caused by limited support for allocations in atomic context. This
series contains fixes for those regressions.
For some subplatforms the default, pre-allocated pool for atomic
allocations turned out to be too small, so a function for setting its
size has been added.
Another set of patches adds support for atomic allocations to
IOMMU-aware DMA-mapping implementation.
The last part of this pull request contains two fixes for Contiguous
Memory Allocator, which relax too strict requirements."
* 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC
ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages()
ARM: dma-mapping: Refactor out to introduce __in_atomic_pool
ARM: dma-mapping: atomic_pool with struct page **pages
ARM: Kirkwood: increase atomic coherent pool size
ARM: DMA-Mapping: print warning when atomic coherent allocation fails
ARM: DMA-Mapping: add function for setting coherent pool size from platform code
ARM: relax conditions required for enabling Contiguous Memory Allocator
mm: cma: fix alignment requirements for contiguous regions
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem updates from Dmitry Torokhov.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: wacom - add support for EMR on Cintiq 24HD touch
Input: i8042 - add Gigabyte T1005 series netbooks to noloop table
Input: imx_keypad - reset the hardware before enabling
Input: edt-ft5x06 - fix build error when compiling wthout CONFIG_DEBUG_FS
Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID updates from Jiri Kosina:
"It contains a fix for Eaton Ellipse MAX UPS from Alan Stern,
performance improvement (not processing debug data if noone is
interested), by Henrik Rydberg, and allowing tpkbd-driven devices to
work even with generic driver in a crippled mode, by Andres Freund."
* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
HID: Only dump input if someone is listening
HID: add NOGET quirk for Eaton Ellipse MAX UPS
Andres Freund [Thu, 30 Aug 2012 12:37:14 +0000 (14:37 +0200)]
HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
c1dcad2d32d0252e8a3023d20311b52a187ecda3 added a new driver configured by
HID_LENOVO_TPKBD but made the hid_have_special_driver entry non-optional which
lead to a recognized but non-working device if the new driver wasn't
configured (which is the correct default).
Signed-off-by: Andres Freund <andres@anarazel.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Merge tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
* Fix for TLB flushing introduced in v3.6
* Fix Xen-SWIOTLB not using proper DMA mask - device had 64bit but
in a 32-bit kernel we need to allocate for coherent pages from a
32-bit pool.
* When trying to re-use P2M nodes we had a one-off error and triggered
a BUG_ON check with specific CONFIG_ option.
* When doing FLR in Xen-PCI-backend we would first do FLR then save the
PCI configuration space. We needed to do it the other way around.
* tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pciback: Fix proper FLR steps.
xen: Use correct masking in xen_swiotlb_alloc_coherent.
xen: fix logical error in tlb flushing
xen/p2m: Fix one-off error in checking the P2M tree directory.
Merge tag '3.6-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Power management
- PCI/PM: Enable D3/D3cold by default for most devices
- PCI/PM: Keep parent bridge active when probing device
- PCI/PM: Fix config reg access for D3cold and bridge suspending
- PCI/PM: Add ABI document for sysfs file d3cold_allowed
Core
- PCI: Don't print anything while decoding is disabled"
* tag '3.6-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Don't print anything while decoding is disabled
PCI/PM: Add ABI document for sysfs file d3cold_allowed
PCI/PM: Fix config reg access for D3cold and bridge suspending
PCI/PM: Keep parent bridge active when probing device
PCI/PM: Enable D3/D3cold by default for most devices
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC bug fixes from Olof Johansson:
"Mostly Renesas and Atmel bugfixes this time, targeting boot and build
problems. A couple of patches for gemini and kirkwood as well. On a
whole nothing very controversial."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: gemini: fix the gemini build
ARM: shmobile: armadillo800eva: enable rw rootfs mount
ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
ARM: shmobile: mackerel: fixup usb module order
ARM: shmobile: armadillo800eva: fixup: sound card detection order
ARM: shmobile: marzen: fixup smsc911x id for regulator
ARM: at91/feature-removal-schedule: delay at91_mci removal
ARM: mach-shmobile: armadillo800eva: Enable power button as wakeup source
ARM: mach-shmobile: armadillo800eva: Fix GPIO buttons descriptions
ARM: at91/dts: remove partial parameter in at91sam9g25ek.dts
ARM: at91/clock: fix PLLA overclock warning
ARM: at91: fix rtc-at91sam9 irq issue due to sparse irq support
ARM: at91: fix system timer irq issue due to sparse irq support
ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull a hwmon fix from Guenter Roeck:
"One patch, fixing DIV_ROUND_CLOSEST to support negative dividends.
While the changes are not in the drivers/hwmon directory, the problem
primarily affects hwmon drivers, and it makes sense to push the patch
through the hwmon tree."
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends
Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild fixes from Michal Marek:
"These are two fixes that should go into 3.6. The link-vmlinux.sh one
is obvious.
The other one fixes make firmware_install with certain configurations,
where a file in the toplevel firmware tree gets installed first, and
$(INSTALL_FW_PATH)/$$(dir <file>) results in /lib/firmware/./, which
confuses make 3.82 for some reason."
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
firmware: fix directory creation rule matching with make 3.82
link-vmlinux.sh: Fix stray "echo" in error message
When we do FLR and save PCI config we did it in the wrong order.
The end result was that if a PCI device was unbind from
its driver, then binded to xen-pciback, and then back to its
driver we would get:
Merge tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Pull MMC fixes from Chris Ball:
- a firmware bug on several Samsung MoviNAND eMMC models causes
permanent corruption on the device when secure erase and secure trim
requests are made, so we disable those requests on these eMMC devices.
- atmel-mci: fix a hang with some SD cards by waiting for not-busy flag.
- dw_mmc: low-power mode breaks SDIO interrupts; fix PIO error handling;
fix handling of error interrupts.
- mxs-mmc: fix deadlocks; fix compile error due to dma.h arch change.
- omap: fix broken PIO mode causing memory corruption.
- sdhci-esdhc: fix card detection.
* tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
mmc: omap: fix broken PIO mode
mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
mmc: dw_mmc: fix error handling in PIO mode
mmc: dw_mmc: correct mishandling error interrupt
mmc: dw_mmc: amend using error interrupt status
mmc: atmel-mci: not busy flag has also to be used for read operations
mmc: sdhci-esdhc: break out early if clock is 0
mmc: mxs-mmc: fix deadlock caused by recursion loop
mmc: mxs-mmc: fix deadlock in SDIO IRQ case
mmc: bfin_sdh: fix dma_desc_array build error
arch/um/os-Linux/time.c: In function 'deliver_alarm':
arch/um/os-Linux/time.c:117:3: error: too few arguments to function 'alarm_handler'
arch/um/os-Linux/internal.h:1:6: note: declared here
The error was introduced by commit d3c1cfcd ("um: pass siginfo to guest
process") in 3.6-rc1.
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Benjamin Herrenschmidt:
"Here are a few fixes for 3.6 that were piling up while I was away or
busy (I was mostly MIA a week or two before San Diego).
Some fixes from Anton fixing up issues with our relatively new DSCR
control feature, and a few other fixes that are either regressions or
bugs nasty enough to warrant not waiting."
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Don't use __put_user() in patch_instruction
powerpc: Make sure IPI handlers see data written by IPI senders
powerpc: Restore correct DSCR in context switch
powerpc: Fix DSCR inheritance in copy_thread()
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
powerpc/powernv: Always go into nap mode when CPU is offline
powerpc: Give hypervisor decrementer interrupts their own handler
powerpc/vphn: Fix arch_update_cpu_topology() return value
Merge tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"These are some GPIO regression fixes for v3.6:
- Erroneous debug message from of_get_named_gpio_flags()
- Make sure the MC9S08DZ60 GPIO driver depend on I2C being compiled
in (not module) or allmodconfig breaks.
- Check return value from irq_alloc_descs() in the Emma Mobile GPIO
driver.
- Assign the owner field for the rdc321x driver so the module won't
be removed if it has active GPIOs."
* tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: rdc321x: Prevent removal of modules exporting active GPIOs
gpio: em: Fix checking return value of irq_alloc_descs
gpio: mc9s08dz60: Fix build error if I2C=m
gpio: Fix debug message in of_get_named_gpio_flags()
Merge tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"There are nothing scaring, contains only small fixes for HD-audio and
USB-audio:
- EPSS regression fix and GPIO fix for HD-audio IDT codecs
- A series of USB-audio regression fixes that are found since 3.5
kernel"
* tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: snd-usb: fix cross-interface streaming devices
ALSA: snd-usb: fix calls to next_packet_size
ALSA: snd-usb: restore delay information
ALSA: snd-usb: use list_for_each_safe for endpoint resources
ALSA: snd-usb: Fix URB cancellation at stream start
ALSA: hda - Don't trust codec EPSS bit for IDT 92HD83xx & co
ALSA: hda - Avoid unnecessary parameter read for EPSS
ALSA: hda - Do not set GPIOs for speakers on IDT if there are no speakers
Merge tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6
Pull fbdev fixes from Florian Tobias Schandinat:
- a fix by Paul Cercueil to prevent a possible buffer overflow
- a fix by Bruno Prémont to prevent a rare sleep in invalid context
- a fix by Julia Lawall for a double free in auo_k190x
- a fix by Dan Carpenter to prevent a division by zero in mb862xxfb
- a regression fix by Tomi Valkeinen for the SDI output in OMAP
- a fix by Grazvydas Ignotas to fix the console colors in OMAP
* tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6:
OMAPFB: fix framebuffer console colors
OMAPDSS: Fix SDI PLL locking
video: mb862xxfb: prevent divide by zero bug
drivers/video/auo_k190x.c: drop kfree of devm_kzalloc's data
fbcon: Fix bit_putcs() call to kmalloc(s, GFP_KERNEL)
fbcon: prevent possible buffer overflow.
Merge tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi
Pull ubi fix from Artem Bityutskiy:
"A single small fix for memory deallocation: we allocated memory using
'kmem_cache_alloc()' but were freeing it using 'kfree()' in some
cases. Now we fix this by using 'kmem_cache_free()' instead."
* tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi:
UBI: fix a horrible memory deallocation bug
Fix order of arguments to compat_put_time[spec|val]
Commit 644595f89620 ("compat: Handle COMPAT_USE_64BIT_TIME in
net/socket.c") introduced a bug where the helper functions to take
either a 64-bit or compat time[spec|val] got the arguments in the wrong
order, passing the kernel stack pointer off as a user pointer (and vice
versa).
Because of the user address range check, that in turn then causes an
EFAULT due to the user pointer range checking failing for the kernel
address. Incorrectly resuling in a failed system call for 32-bit
processes with a 64-bit kernel.
On odder architectures like HP-PA (with separate user/kernel address
spaces), it can be used read kernel memory.
Ronny Hegewald [Fri, 31 Aug 2012 09:57:52 +0000 (09:57 +0000)]
xen: Use correct masking in xen_swiotlb_alloc_coherent.
When running 32-bit pvops-dom0 and a driver tries to allocate a coherent
DMA-memory the xen swiotlb-implementation returned memory beyond 4GB.
The underlaying reason is that if the supplied driver passes in a
DMA_BIT_MASK(64) ( hwdev->coherent_dma_mask is set to 0xffffffffffffffff)
our dma_mask will be u64 set to 0xffffffffffffffff even if we set it to
DMA_BIT_MASK(32) previously. Meaning we do not reset the upper bits.
By using the dma_alloc_coherent_mask function - it does the proper casting
and we get 0xfffffffff.
This caused not working sound on a system with 4 GB and a 64-bit
compatible sound-card with sets the DMA-mask to 64bit.
On bare-metal and the forward-ported xen-dom0 patches from OpenSuse a coherent
DMA-memory is always allocated inside the 32-bit address-range by calling
dma_alloc_coherent_mask.
This patch adds the same functionality to xen swiotlb and is a rebase of the
original patch from Ronny Hegewald which never got upstream b/c the
underlaying reason was not understood until now.
The original email with the original patch is in:
http://old-list-archives.xen.org/archives/html/xen-devel/2010-02/msg00038.html
the original thread from where the discussion started is in:
http://old-list-archives.xen.org/archives/html/xen-devel/2010-01/msg00928.html
Signed-off-by: Ronny Hegewald <ronny.hegewald@online.de> Signed-off-by: Stefano Panella <stefano.panella@citrix.com> Acked-By: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: stable@vger.kernel.org
Alex Shi [Fri, 24 Aug 2012 08:55:13 +0000 (08:55 +0000)]
xen: fix logical error in tlb flushing
While TLB_FLUSH_ALL gets passed as 'end' argument to
flush_tlb_others(), the Xen code was made to check its 'start'
parameter. That may give a incorrect op.cmd to MMUEXT_INVLPG_MULTI
instead of MMUEXT_TLB_FLUSH_MULTI. Then it causes some page can not
be flushed from TLB.
This patch fixed this issue.
Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Jan Beulich <jbeulich@suse.com> Tested-by: Yongjie Ren <yongjie.ren@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* commit '4cb38750d49010ae72e718d46605ac9ba5a851b4': (6849 commits)
bcma: fix invalid PMU chip control masks
[libata] pata_cmd64x: whitespace cleanup
libata-acpi: fix up for acpi_pm_device_sleep_state API
sata_dwc_460ex: device tree may specify dma_channel
ahci, trivial: fixed coding style issues related to braces
ahci_platform: add hibernation callbacks
libata-eh.c: local functions should not be exposed globally
libata-transport.c: local functions should not be exposed globally
sata_dwc_460ex: support hardreset
ata: use module_pci_driver
drivers/ata/pata_pcmcia.c: adjust suspicious bit operation
pata_imx: Convert to clk_prepare_enable/clk_disable_unprepare
ahci: Enable SB600 64bit DMA on MSI K9AGM2 (MS-7327) v2
[libata] Prevent interface errors with Seagate FreeAgent GoFlex
drivers/acpi/glue: revert accidental license-related 6b66d95895c bits
libata-acpi: add missing inlines in libata.h
i2c-omap: Add support for I2C_M_STOP message flag
i2c: Fall back to emulated SMBus if the operation isn't supported natively
i2c: Add SCCB support
i2c-tiny-usb: Add support for the Robofuzz OSIF USB/I2C converter
...
xen/p2m: Fix one-off error in checking the P2M tree directory.
We would traverse the full P2M top directory (from 0->MAX_DOMAIN_PAGES
inclusive) when trying to figure out whether we can re-use some of the
P2M middle leafs.
Which meant that if the kernel was compiled with MAX_DOMAIN_PAGES=512
we would try to use the 512th entry. Fortunately for us the p2m_top_index
has a check for this:
powerpc: Don't use __put_user() in patch_instruction
patch_instruction() can be called very early on ppc32, when the kernel
isn't yet running at it's linked address. That can cause the !
is_kernel_addr() test in __put_user() to trip and call might_sleep()
which is very bad at that point during boot.
Use a lower level function instead for now, at least until we get to
rework ppc32 boot process to do the code patching later, like ppc64
does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Tue, 4 Sep 2012 18:33:08 +0000 (18:33 +0000)]
powerpc: Make sure IPI handlers see data written by IPI senders
We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state. This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux(). Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.
In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_message_pass() is a full barrier to ensure that
there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.
This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store. This adds an explicit barrier in the two remaining cases.
These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.
The analysis of the reason why processes were not waking up properly
is due to Milton Miller.
Cc: stable@vger.kernel.org # v3.0+ Reported-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 3 Sep 2012 16:51:10 +0000 (16:51 +0000)]
powerpc: Restore correct DSCR in context switch
During a context switch we always restore the per thread DSCR value.
If we aren't doing explicit DSCR management
(ie thread.dscr_inherit == 0) and the default DSCR changed while
the process has been sleeping we end up with the wrong value.
Check thread.dscr_inherit and select the default DSCR or per thread
DSCR as required.
This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):
Anton Blanchard [Mon, 3 Sep 2012 16:49:47 +0000 (16:49 +0000)]
powerpc: Fix DSCR inheritance in copy_thread()
If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.
We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.
Anton Blanchard [Mon, 3 Sep 2012 16:48:46 +0000 (16:48 +0000)]
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.
If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.
Anton Blanchard [Mon, 3 Sep 2012 16:47:56 +0000 (16:47 +0000)]
powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.
This issue was found with the following test case:
Paul Mackerras [Thu, 26 Jul 2012 18:51:09 +0000 (18:51 +0000)]
powerpc/powernv: Always go into nap mode when CPU is offline
The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode. Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Thu, 26 Jul 2012 13:56:11 +0000 (13:56 +0000)]
powerpc: Give hypervisor decrementer interrupts their own handler
At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event. In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.
When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1. Thus this adds an empty handler
function for them. We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Walleij [Thu, 30 Aug 2012 17:22:36 +0000 (19:22 +0200)]
ARM: gemini: fix the gemini build
Test-compiling obscure machines I notice that the gemini (which
by the way lacks a defconfig) is broken since some time back.
Adding a simple missing include makes it build again.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
[ 2.086181] Waiting for root device /dev/mmcblk0p1...
[ 2.324066] Unhandled fault: imprecise external abort (0x406) at 0x00000000
[ 2.331451] Internal error: : 406 [#1] ARM
[ 2.335784] Modules linked in:
[ 2.339050] CPU: 0 Not tainted (3.6.0-rc3 #60)
[ 2.344146] PC is at default_idle+0x28/0x30
[ 2.348602] LR is at trace_hardirqs_on_caller+0x15c/0x1b0
...
This turned out to be due to memory corruption caused by long-broken
PIO code in drivers/mmc/host/omap.c. (Previously, this driver had
been using DMA; but the above commit caused the MMC driver to fall
back to PIO mode with an unmodified Kconfig.)
The PIO code, added with the rest of the driver in commit 730c9b7e6630f786fcec026fb11d2e6f2c90fdcb ("[MMC] Add OMAP MMC host
driver"), confused bytes with 16-bit words. This bug caused memory
located after the PIO transfer buffer to be corrupted with transfers
larger than 32 bytes. The driver also did not increment the buffer
pointer after the transfer occurred. This bug resulted in data
corruption during any transfer larger than 64 bytes.
Signed-off-by: Paul Walmsley <paul@pwsan.com> Reviewed-by: Felipe Balbi <balbi@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Chris Ball <cjb@laptop.org>
Ian Chen [Wed, 29 Aug 2012 06:05:36 +0000 (15:05 +0900)]
mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
For several MoviNAND eMMC parts, there are known issues with secure
erase and secure trim. For these specific MoviNAND devices, we skip
these operations.
Specifically, there is a bug in the eMMC firmware that causes
unrecoverable corruption when the MMC is erased with MMC_CAP_ERASE
enabled.
Doug Anderson [Wed, 25 Jul 2012 15:33:17 +0000 (08:33 -0700)]
mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
The documentation for the dw_mmc part says that the low power
mode should normally only be set for MMC and SD memory and should
be turned off for SDIO cards that need interrupts detected.
The best place I could find to do this is when the SDIO interrupt
was first enabled. I rely on the fact that dw_mci_setup_bus()
will be called when it's time to reenable.
Signed-off-by: Doug Anderson <dianders@chromium.org> Acked-by: Seungwon Jeon <tgih.jun@samsung.com> Signed-off-by: Chris Ball <cjb@laptop.org>
Seungwon Jeon [Wed, 1 Aug 2012 00:30:46 +0000 (09:30 +0900)]
mmc: dw_mmc: fix error handling in PIO mode
Data transfer will be continued until all the bytes are transmitted,
even if data crc error occurs during a multiple-block data transfer.
This means RXDR/TXDR interrupts will occurs until data transfer is
terminated. Early setting of host->sg to NULL prevents going into
xxx_data_pio functions, hence permanent unhandled RXDR/TXDR interrupts
occurs. And checking error interrupt status in the xxx_data_pio functions
is no need because dw_mci_interrupt does do the same. This patch also
removes it.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com> Acked-by: Jaehoon Chung <jh80.chung@samsung.com> Acked-by: Will Newton <will.newton@imgtec.com> Signed-off-by: Chris Ball <cjb@laptop.org>
Seungwon Jeon [Wed, 1 Aug 2012 00:30:40 +0000 (09:30 +0900)]
mmc: dw_mmc: correct mishandling error interrupt
Datasheet of SYNOPSYS mentions that DTO(Data Transfer Over) interrupt
will be raised even if some error interrupts, however it is actually
found that DTO does not occur. SYNOPSYS has confirmed this issue.
Current implementation defers the call of tasklet_schedule until DTO
when the error interrupts is happened. This patch fixes error handling.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com> Acked-by: Jaehoon Chung <jh80.chung@samsung.com> Acked-by: Will Newton <will.newton@imgtec.com> Signed-off-by: Chris Ball <cjb@laptop.org>
Seungwon Jeon [Wed, 1 Aug 2012 00:30:30 +0000 (09:30 +0900)]
mmc: dw_mmc: amend using error interrupt status
RINTSTS status includes masked interrupts as well as unmasked.
data_status and cmd_status are set by value of RINTSTS in interrupt handler
and tasklet finally uses it to decide whether error is happened or not.
In addition, MINTSTS status is used for setting data_status in PIO.
Masked error interrupt will not be handled and that status can be considered
non-error case.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Reviewed By: Girish K S <girish.shivananjappa@linaro.org> Acked-by: Jaehoon Chung <jh80.chung@samsung.com> Acked-by: Will Newton <will.newton@imgtec.com> Signed-off-by: Chris Ball <cjb@laptop.org>
mmc: atmel-mci: not busy flag has also to be used for read operations
Even if the datasheet says that the not busy flag has to be used only
for write operations, it's false except for version lesser than v2xx.
Not waiting on the not busy flag for read operations can cause the
controller to hang-up during the initialization of some SD cards
with DMA after the first CMD6 -- the next command is sent too early.
Shawn Guo [Wed, 22 Aug 2012 15:10:01 +0000 (23:10 +0800)]
mmc: sdhci-esdhc: break out early if clock is 0
Since commit 30832ab56 ("mmc: sdhci: Always pass clock request value
zero to set_clock host op") was merged, esdhc_set_clock starts hitting
"if (clock == 0)" where ESDHC_SYSTEM_CONTROL has been operated. This
causes SDHCI card-detection function being broken. Fix the regression
by moving "if (clock == 0)" above ESDHC_SYSTEM_CONTROL operation.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>
UBI was mistakingly using 'kfree()' instead of 'kmem_cache_free()' when
freeing "attach eraseblock" structures in vtbl.c. Thankfully, this happened
only when we were doing auto-format, so many systems were unaffected. However,
there are still many users affected.
It is strange, but the system did not crash and nothing bad happened when
the SLUB memory allocator was used. However, in case of SLOB we observed an
crash right away.
This problem was introduced in 2.6.39 by commit
"6c1e875 UBI: add slab cache for ubi_scan_leb objects"
A note for stable trees:
Because variable were renamed, this won't cleanly apply to older kernels.
Changing names like this should help:
1. ai -> si
2. aeb_slab_cache -> seb_slab_cache
3. new_aeb -> new_seb
Reported-by: Richard Genoud <richard.genoud@gmail.com> Tested-by: Richard Genoud <richard.genoud@gmail.com> Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: stable@vger.kernel.org [v2.6.39+] Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
ARM: shmobile: armadillo800eva: enable rw rootfs mount
armadillo800eva default boot loader is "hermit",
and it's tag->u.core.flags has flag when kernel boots.
Because of this, ${LINUX}/arch/arm/kernel/setup.c :: parse_tag_core()
didn't remove MS_RDONLY flag from root_mountflags.
Thus, the rootfs is mounted as "readonly".
This patch adds "rw" kernel parameter,
and enable read/write mounts for rootfs
1) NLA_PUT* --> nla_put_* conversion got one case wrong in
nfnetlink_log, fix from Patrick McHardy.
2) Missed error return check in ipw2100 driver, from Julia Lawall.
3) PMTU updates in ipv4 were setting the expiry time incorrectly, fix
from Eric Dumazet.
4) SFC driver erroneously reversed src and dst when reporting filters
via ethtool.
5) Memory leak in CAN protocol and wrong setting of IRQF_SHARED in
sja1000 can platform driver, from Alexey Khoroshilov and Sven
Schmitt.
6) Fix multicast traffic scaling regression in ipv4_dst_destroy, only
take the lock when we really need to. From Eric Dumazet.
7) Fix non-root process spoofing in netlink, from Pablo Neira Ayuso.
8) CWND reduction in TCP is done incorrectly during non-SACK recovery,
fix from Yuchung Cheng.
9) Revert netpoll change, and fix what was actually a driver specific
problem. From Amerigo Wang. This should cure bootup hangs with
netconsole some people reported.
10) Fix xen-netfront invoking __skb_fill_page_desc() with a NULL page
pointer. From Ian Campbell.
11) SIP NAT fix for expectiontation creation, from Pablo Neira Ayuso.
12) __ip_rt_update_pmtu() needs RCU locking, from Eric Dumazet.
13) Fix usbnet deadlock on resume, can't use GFP_KERNEL in this
situation. From Oliver Neukum.
14) The davinci ethernet driver triggers an OOPS on removal because it
frees an MDIO object before unregistering it. Fix from Bin Liu.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
net: qmi_wwan: add several new Gobi devices
fddi: 64 bit bug in smt_add_para()
net: ethernet: fix kernel OOPS when remove davinci_mdio module
net/xfrm/xfrm_state.c: fix error return code
net: ipv6: fix error return code
net: qmi_wwan: new device: Foxconn/Novatel E396
usbnet: fix deadlock in resume
cs89x0 : packet reception not working
netfilter: nf_conntrack: fix racy timer handling with reliable events
bnx2x: Correct the ndo_poll_controller call
bnx2x: Move netif_napi_add to the open call
ipv4: must use rcu protection while calling fib_lookup
bnx2x: fix 57840_MF pci id
net: ipv4: ipmr_expire_timer causes crash when removing net namespace
e1000e: DoS while TSO enabled caused by link partner with small MSS
l2tp: avoid to use synchronize_rcu in tunnel free function
gianfar: fix default tx vlan offload feature flag
netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation
xen-netfront: use __pskb_pull_tail to ensure linear area is big enough on RX
netfilter: nfnetlink_log: fix error return code in init path
...
Henrik Rydberg [Sat, 1 Sep 2012 19:47:11 +0000 (21:47 +0200)]
HID: Only dump input if someone is listening
Going through the motions of printing the debug message information
takes a long time; using the keyboard can lead to a 160 us irqsoff
latency. This patch skips hid_dump_input() when there are no open
handles, which brings latency down to 100 us.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Gobi devices are composite, needing both the qcserial and
qmi_wwan drivers to support all functions. Re-syncing the
list of supported devices with qcserial.
Cc: Aleksander Morgado <aleksander@lanedo.com> Cc: Thomas Tuttle <ttuttle@chromium.org> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@tempietto.lan>
Dan Carpenter [Sat, 1 Sep 2012 09:57:40 +0000 (09:57 +0000)]
fddi: 64 bit bug in smt_add_para()
The intent was to set 4 bytes of data so that's why the sp_len is set
to 4 on the next line. The cast to u_long pointer clears 8 bytes
on 64 bit arches.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@tempietto.lan>
Guenter Roeck [Sat, 25 Aug 2012 00:25:01 +0000 (17:25 -0700)]
linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends
DIV_ROUND_CLOSEST returns a bad result for negative dividends:
DIV_ROUND_CLOSEST(-2, 2) = 0
Most of the time this does not matter. However, in the hardware monitoring
subsystem, DIV_ROUND_CLOSEST is sometimes used on integers which can be
negative (such as temperatures).
Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Jean Delvare <khali@linux-fr.org>
John Stultz [Fri, 31 Aug 2012 17:30:06 +0000 (13:30 -0400)]
time: Move ktime_t overflow checking into timespec_valid_strict
Andreas Bombe reported that the added ktime_t overflow checking added to
timespec_valid in commit 4e8b14526ca7 ("time: Improve sanity checking of
timekeeping inputs") was causing problems with X.org because it caused
timeouts larger then KTIME_T to be invalid.
Previously, these large timeouts would be clamped to KTIME_MAX and would
never expire, which is valid.
This patch splits the ktime_t overflow checking into a new
timespec_valid_strict function, and converts the timekeeping codes
internal checking to use this more strict function.
Reported-and-tested-by: Andreas Bombe <aeb@debian.org> Cc: Zhouping Liu <zliu@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6
Pull PARISC fixes from James Bottomley:
"This is a set of two bug fixes. One is the ATOMIC problem which is
now causing a compile failure in certain situations. The other is
mishandling of PER_LINUX32 which may also cause user visible effects.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
[PARISC] fix personality flag check in copy_thread()
[PARISC] Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"A couple of s390 bug fixes for 3.5-rc4"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/32: Don't clobber personality flags on exec
s390/smp: add missing smp_store_status() for !SMP
s390/dasd: fix ioctl return value
s390: Always use "long" for ssize_t to match size_t
Axel Lin [Wed, 29 Aug 2012 01:35:24 +0000 (09:35 +0800)]
gpio: mc9s08dz60: Fix build error if I2C=m
Make GPIO_MC9S08DZ60 depend on I2C=y, this fixes below build error:
LD init/built-in.o
drivers/built-in.o: In function `mc9s08dz60_get_value':
clk-fixed-factor.c:(.text+0x7214): undefined reference to `i2c_smbus_read_byte_data'
drivers/built-in.o: In function `mc9s08dz60_set':
clk-fixed-factor.c:(.text+0x727c): undefined reference to `i2c_smbus_read_byte_data'
clk-fixed-factor.c:(.text+0x72bc): undefined reference to `i2c_smbus_write_byte_data'
drivers/built-in.o: In function `mc9s08dz60_i2c_driver_init':
clk-fixed-factor.c:(.init.text+0x290): undefined reference to `i2c_register_driver'
drivers/built-in.o: In function `mc9s08dz60_i2c_driver_exit':
clk-fixed-factor.c:(.exit.text+0x2c): undefined reference to `i2c_del_driver'
make: *** [vmlinux] Error 1
Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Dan Williams <dcbw@redhat.com> Cc: Bjørn Mork <bjorn@mork.no> Cc: Ben Chan <benchan@google.com> Signed-off-by: Aleksander Morgado <aleksander@lanedo.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Neukum [Sun, 26 Aug 2012 20:41:38 +0000 (20:41 +0000)]
usbnet: fix deadlock in resume
A usbnet device can share a multifunction device
with a storage device. If the storage device is autoresumed
the usbnet devices also needs to be autoresumed. Allocating
memory with GFP_KERNEL can deadlock in this case.
Commit 68e67f40b ("ALSA: snd-usb: move calls to usb_set_interface")
saved us some unnecessary calls to snd_usb_set_interface() but ignored
the fact that there is at least one device out there which operates on
two endpoint in different interfaces simultaniously.
Take care for this by catching the case where data and sync endpoints
are located on different interfaces and calling snd_usb_set_interface()
between the start of the two endpoints.
Signed-off-by: Daniel Mack <zonque@gmail.com> Reported-by: Robert M. Albrecht <linux@romal.de> Cc: stable@kernel.org [v3.5+] Signed-off-by: Takashi Iwai <tiwai@suse.de>
Daniel Mack [Thu, 30 Aug 2012 16:52:30 +0000 (18:52 +0200)]
ALSA: snd-usb: fix calls to next_packet_size
In order to support devices with implicit feedback streaming models,
packet sizes are now stored with each individual urb, and the PCM
handling code which fills the buffers purely relies on the size fields
now.
However, calling snd_usb_audio_next_packet_size() for all possible
packets in an URB at once, prior to letting the PCM code do its job
does in fact not lead to the same behaviour than what the old code did:
The PCM code will break its loop once a period boundary is reached,
consequently using up less packets that it really could.
As snd_usb_audio_next_packet_size() implements a feedback mechanism to
the endpoints phase accumulator, the number of calls to that function
matters, and when called too often, the data rate runs out of bounds.
Fix this by making the next_packet function public, and call it from the
PCM code as before if the packet data sizes are not defined.
Daniel Mack [Thu, 30 Aug 2012 16:52:29 +0000 (18:52 +0200)]
ALSA: snd-usb: restore delay information
Parts of commit 294c4fb8 ("ALSA: usb: refine delay information with USB
frame counter") were unfortunately lost during the refactoring of the
snd-usb driver in 3.5.
This patch adds them back, restoring the correct delay information
behaviour.
Andrew Lunn [Thu, 30 Aug 2012 05:39:12 +0000 (07:39 +0200)]
ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
Linux-next has failed to compile for kirkwood since 23 August with:
arch/arm/mach-kirkwood/db88f6281-bp-setup.c:29: error: 'SZ_1M' undeclared here (not in a function)
arch/arm/mach-kirkwood/db88f6281-bp-setup.c:33: error: 'SZ_4M' undeclared here (not in a function)
Add missing <linux/sizes.h>
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
netfilter: nf_conntrack: fix racy timer handling with reliable events
Existing code assumes that del_timer returns true for alive conntrack
entries. However, this is not true if reliable events are enabled.
In that case, del_timer may return true for entries that were
just inserted in the dying list. Note that packets / ctnetlink may
hold references to conntrack entries that were just inserted to such
list.
This patch fixes the issue by adding an independent timer for
event delivery. This increases the size of the ecache extension.
Still we can revisit this later and use variable size extensions
to allocate this area on demand.
Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
renesas_usbhs driver can play role as both Host and Gadget.
In case of Gadget, it requires not only renesas_usbhs
but also usb gadget module (like g_ether).
So, renesas_usbhs driver calls usb_add_gadget_udc() on probe time.
Because of this behavior,
Host port plays also Gadget role if kernel has both Host/Gadget support.
In mackerel case, from 0ada2da51800a4914887a9bcf22d563be80e50be
(ARM: mach-shmobile: mackerel: use renesas_usbhs instead of r8a66597_hcd)
usb0 plays Gadget role, and usb1 plays Host role,
and current mackerel board probes as usb1 -> usb0.
Thus, 1st installed usb gadget module (like g_ether) will be
assigned to usb1 (= usb Host port), and 2nd module to usb0 (= usb Gadget port).
It is very confusable for user.
This patch fixup usb modes probing order as usb0 -> usb1.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Simon Horman <horms@verge.net.au>
Merav Sicron [Mon, 27 Aug 2012 03:26:19 +0000 (03:26 +0000)]
bnx2x: Move netif_napi_add to the open call
Move netif_napi_add for all queues from the probe call to the open call, to
avoid the case that napi objects are added for queues that may eventually not
be initialized and activated. With the former behavior, the driver could crash
when netpoll was calling ndo_poll_controller.
Signed-off-by: Merav Sicron <meravs@broadcom.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 26 Aug 2012 00:35:45 +0000 (00:35 +0000)]
bnx2x: fix 57840_MF pci id
Commit c3def943c7117d42caaed3478731ea7c3c87190e have added support for
new pci ids of the 57840 board, while failing to change the obsolete value
in 'pci_ids.h'.
This patch does so, allowing the probe of such devices.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ipv4: ipmr_expire_timer causes crash when removing net namespace
When tearing down a net namespace, ipv4 mr_table structures are freed
without first deactivating their timers. This can result in a crash in
run_timer_softirq.
This patch mimics the corresponding behaviour in ipv6.
Locking and synchronization seem to be adequate.
We are about to kfree mrt, so existing code should already make sure that
no other references to mrt are pending or can be created by incoming traffic.
The functions invoked here do not cause new references to mrt or other
race conditions to be created.
Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive.
Both ipmr_expire_process (whose completion we may have to wait in
del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock
or other synchronizations when needed, and they both only modify mrt.
Tested in Linux 3.4.8.
Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Fri, 24 Aug 2012 20:38:11 +0000 (20:38 +0000)]
e1000e: DoS while TSO enabled caused by link partner with small MSS
With a low enough MSS on the link partner and TSO enabled locally, the
networking stack can periodically send a very large (e.g. 64KB) TCP
message for which the driver will attempt to use more Tx descriptors than
are available by default in the Tx ring. This is due to a workaround in
the code that imposes a limit of only 4 MSS-sized segments per descriptor
which appears to be a carry-over from the older e1000 driver and may be
applicable only to some older PCI or PCIx parts which are not supported in
e1000e. When the driver gets a message that is too large to fit across the
configured number of Tx descriptors, it stops the upper stack from queueing
any more and gets stuck in this state. After a timeout, the upper stack
assumes the adapter is hung and calls the driver to reset it.
Remove the unnecessary limitation of using up to only 4 MSS-sized segments
per Tx descriptor, and put in a hard failure test to catch when attempting
to check for message sizes larger than would fit in the whole Tx ring.
Refactor the remaining logic that limits the size of data per Tx descriptor
from a seemingly arbitrary 8KB to a limit based on the dynamic size of the
Tx packet buffer as described in the hardware specification.
Also, fix the logic in the check for space in the Tx ring for the next
largest possible packet after the current one has been successfully queued
for transmit, and use the appropriate defines for default ring sizes in
e1000_probe instead of magic values.
This issue goes back to the introduction of e1000e in 2.6.24 when it was
split off from e1000.
Reported-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Cc: Stable <stable@vger.kernel.org> [2.6.24+] Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Thu, 23 Aug 2012 21:46:25 +0000 (21:46 +0000)]
gianfar: fix default tx vlan offload feature flag
Commit -
"b852b72 gianfar: fix bug caused by 87c288c6e9aa31720b72e2bc2d665e24e1653c3e"
disables by default (on mac init) the hw vlan tag insertion.
The "features" flags were not updated to reflect this, and
"ethtool -K" shows tx-vlan-offload to be "on" by default.
Cc: Sebastian Poehn <sebastian.poehn@belden.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 22 Aug 2012 00:26:47 +0000 (00:26 +0000)]
xen-netfront: use __pskb_pull_tail to ensure linear area is big enough on RX
I'm slightly concerned by the "only in exceptional circumstances"
comment on __pskb_pull_tail but the structure of an skb just created
by netfront shouldn't hit any of the especially slow cases.
This approach still does slightly more work than the old way, since if
we pull up the entire first frag we now have to shuffle everything
down where before we just received into the right place in the first
place.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Mel Gorman <mgorman@suse.de> Cc: xen-devel@lists.xensource.com Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>