Dan Williams [Thu, 7 Oct 2010 23:44:50 +0000 (16:44 -0700)]
async_tx: make async_tx channel switching opt-in
The majority of drivers in drivers/dma/ will never establish cross
channel operation chains and do not need the extra overhead in struct
dma_async_tx_descriptor. Make channel switching opt-in by default.
Cc: Anatolij Gustschin <agust@denx.de> Cc: Ira Snyder <iws@ovro.caltech.edu> Cc: Linus Walleij <linus.walleij@stericsson.com> Cc: Saeed Bishara <saeed@marvell.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Thu, 7 Oct 2010 22:25:04 +0000 (15:25 -0700)]
move async raid6 test to lib/Kconfig.debug
The prompt for "Self test for hardware accelerated raid6 recovery" does not
belong in the top level configuration menu. All the options in
crypto/async_tx/Kconfig are selected and do not depend on CRYPTO.
Kconfig.debug seems like a reasonable fit.
Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Sascha Hauer [Wed, 6 Oct 2010 08:25:55 +0000 (10:25 +0200)]
dmaengine: Add Freescale i.MX1/21/27 DMA driver
This driver is currently implemented as a user to the old i.MX
DMA API. This allows us to convert each user of the old API to
the dmaengine API one by one. Once this is done the old DMA
driver can be merged into the i.MX dmaengine driver.
V2: remove some debug leftovers and unused variables
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Ramesh Babu K V [Mon, 4 Oct 2010 10:37:53 +0000 (10:37 +0000)]
intel_mid_dma: Add sg list support to DMA driver
For a very high speed DMA various periphral devices need
scatter-gather list support. The DMA hardware support link list items.
This list can be circular also (adding new flag DMA_PREP_CIRCULAR_LIST)
Right now this flag is in driver header and should be moved to
dmaengine header file eventually
Signed-off-by: Ramesh Babu K V <ramesh.b.k.v@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Yong Wang [Mon, 4 Oct 2010 10:37:27 +0000 (10:37 +0000)]
intel_mid_dma: Allow DMAC2 to share interrupt
Allow DMAC2 to share interrupt since exclusive interrupt line
for mrst DMAC2 is not provided on other platforms.
Signed-off-by: Yong Wang <yong.y.wang@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Yong Wang [Mon, 4 Oct 2010 10:37:02 +0000 (10:37 +0000)]
intel_mid_dma: Allow IRQ sharing
intel_mid_dma driver allows interrupt sharing. Thus it needs
to check whether IRQ source is the DMA controller and return
the appropriate IRQ return.
Signed-off-by: Yong Wang <yong.y.wang@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Koul, Vinod [Mon, 4 Oct 2010 10:42:40 +0000 (10:42 +0000)]
intel_mid_dma: Add runtime PM support
This patch adds runtime PM support in this dma driver
for 4 PCI Controllers
Whenever the driver is idle (no channels grabbed), it
can go to low power state
It also adds the PCI suspend and resume support
Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Linus Walleij [Wed, 6 Oct 2010 09:05:29 +0000 (09:05 +0000)]
DMAENGINE: define a dummy filter function for ste_dma40
All platform data has to be made conditional on CONFIG_STEDMA40
or we can provide a simple dummy filter functions as to avoid
cluttering the code with other #ifdef:s.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Per Forlin [Wed, 6 Oct 2010 09:05:28 +0000 (09:05 +0000)]
DMAENGINE: Remove stedma40_set_psize and pre_transfer hook in ste_dma40
Remove obsolete pre_transfer hook in stedma40_chan_cfg. The
intent of this hook is merely to handle burst size
compensation for ux500 variant MMCI. Remove obsolete stedma40_set_psize
since it is only called from pre_transfer. DMAEngine device_control
replaces the functionality of stedma40_set_psize.
Signed-off-by: Per Forlin <per.forlin@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Per Forlin [Wed, 6 Oct 2010 09:05:27 +0000 (09:05 +0000)]
DMAENGINE: Set burst size for phy and log chans in ste_dma40 dev_control
Set burst for physical or logical channels respectively.
Convert the values in dma_cfg to dma reg bits
for physical or logical channels.
Signed-off-by: Per Forlin <per.forlin@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Rabin Vincent [Wed, 6 Oct 2010 08:20:38 +0000 (08:20 +0000)]
DMAENGINE: ste_dma40: fix resource leaks in error paths.
Fix some leaks of allocated descriptors in error paths.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Rabin Vincent [Wed, 6 Oct 2010 08:20:37 +0000 (08:20 +0000)]
DMAENGINE: ste_dma40: fix desc_get
Fix desc_get to alloc a descriptor from the cache if the ones in the
list are waiting for the ack. Also, memzero the descriptor when
allocated from the list to ensure all fields are cleared.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Rabin Vincent [Wed, 6 Oct 2010 08:20:36 +0000 (08:20 +0000)]
DMAENGINE: ste_dma40: fix clk_get failure path
clk_get returns an ERR_PTR.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Rabin Vincent [Wed, 6 Oct 2010 08:20:35 +0000 (08:20 +0000)]
DMAENGINE: ste_dma40: fix disabled channels list
The value in the array, not the index, specifies the channel to be
disabled.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Ira Snyder [Thu, 30 Sep 2010 11:46:46 +0000 (11:46 +0000)]
fsldma: improved DMA_SLAVE support
Now that the generic DMAEngine API has support for scatterlist to
scatterlist copying, the device_prep_slave_sg() portion of the
DMA_SLAVE API is no longer necessary and has been removed.
However, the device_control() portion of the DMA_SLAVE API is still
useful to control device specific parameters, such as externally
controlled DMA transfers and maximum burst length.
A special dma_ctrl_cmd has been added to enable externally controlled
DMA transfers. This is currently specific to the Freescale DMA
controller, but can easily be made generic when another user is found.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Ira Snyder [Thu, 30 Sep 2010 11:46:44 +0000 (11:46 +0000)]
dma: add support for scatterlist to scatterlist copy
This adds support for scatterlist to scatterlist DMA transfers. A
similar interface is exposed by the fsldma driver (through the DMA_SLAVE
API) and by the ste_dma40 driver (through an exported function).
This patch paves the way for making this type of copy operation a part
of the generic DMAEngine API. Futher patches will add support in
individual drivers.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This patch adds support for the Freescale i.MX SDMA engine.
The SDMA engine is a scatter/gather DMA engine which is implemented
as a seperate coprocessor. SDMA needs its own firmware which is
requested using the standard request_firmware mechanism. The firmware
has different entry points for each peripheral type, so drivers
have to pass the peripheral type to the DMA engine which in turn
picks the correct firmware entry point from a table contained in
the firmware image itself.
The original Freescale code also supports support for transfering
data to the internal SRAM which needs different entry points to
the firmware. Support for this is currently not implemented. Also,
support for the ASRC (asymmetric sample rate converter) is skipped.
I took a very simple approach to implement dmaengine support. Only
a single descriptor is statically assigned to a each channel. This
means that transfers can't be queued up but only a single transfer
is in progress. This simplifies implementation a lot and is sufficient
for the usual device/memory transfers.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Linus Walleij <linus.ml.walleij@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
dmaengine: add wrapper functions for device control functions
Add wrapper functions around the dma_device->device_control function
to bring back type safety. Also, add a wrapper function around
dma_async_tx_descriptor->tx_submit. This is named dmaengine_submit
instead of dmaengine_tx_submit to get rid of the confusing 'tx' in the
function name
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cyclic transfers are useful for audio where a single buffer divided
in periods has to be transfered endlessly until stopped. After being
prepared the transfer is started using the dma_async_descriptor->tx_submit
function. dma_async_descriptor->callback is called after each period.
The transfer is stopped using the DMA_TERMINATE_ALL callback.
While being used for cyclic transfers the channel cannot be used
for other transfer types.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
dmaengine: driver for the ARM PL080/PL081 PrimeCells v5
This creates a DMAengine driver for the ARM PL080/PL081 PrimeCells
based on the implementation earlier submitted by Peter Pearse.
This is working like a charm for memcpy and slave DMA to the PL011
PrimeCell on the PB11MPCore.
This DMA controller is used in mostly unmodified form in the ARM
RealView and Versatile platforms, in the ST-Ericsson Nomadik, and
in the ST SPEAr platform.
It has been converted to use the header from the Samsung PL080
derivate instead of its own defintions. The Samsungs have a custom
driver in their mach-* folders though, atleast we can share the
register definitions.
Cc: Peter Pearse <peter.pearse@arm.com> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Kukjin Kim <kgene.kim@samsung.com> Cc: Alessandro Rubini <rubini@unipv.it> Acked-by: Viresh Kumar <viresh.kumar@st.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
[GFP_KERNEL to GFP_NOWAIT in pl08x_prep_dma_memcpy] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Simon Guinot [Fri, 17 Sep 2010 21:33:51 +0000 (23:33 +0200)]
dmaengine: fix interrupt clearing for mv_xor
When using simultaneously the two DMA channels on a same engine, some
transfers are never completed. For example, an endless lock can occur
while writing heavily on a RAID5 array (with async-tx offload support
enabled).
Note that this issue can also be reproduced by using the DMA test
client.
On a same engine, the interrupt cause register is shared between two
DMA channels. This patch make sure that the cause bit is only cleared
for the requested channel.
Signed-off-by: Simon Guinot <sguinot@lacie.com> Tested-by: Luc Saillard <luc@saillard.org> Acked-by: saeed bishara <saeed.bishara@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jonas Aaberg [Mon, 9 Aug 2010 12:09:21 +0000 (12:09 +0000)]
DMAENGINE: ste_dma40: added kernel doc for struct
The half-channel struct was undocumented.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jonas Aaberg [Mon, 9 Aug 2010 12:09:04 +0000 (12:09 +0000)]
DMAENGINE: ste_dma40: removed non-used variable from struct
The reqrite of the LCLA code rendered this variable unused.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
LLI allocation is now done on job level instead of channel level.
Previously the maximum length of a linked job in hw on a logical
channel was 8, since the LLIs where evenly divided. Now only
executing jobs have allocated LLIs which increase the length to
a maximum of 64 links in HW.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jonas Aaberg [Mon, 9 Aug 2010 12:08:49 +0000 (12:08 +0000)]
DMAENGINE: ste_dma40: fix possible use of uninitialized variable
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The handling of pause detection was slightly incorrect.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jonas Aaberg [Mon, 9 Aug 2010 12:08:34 +0000 (12:08 +0000)]
DMAENGINE: ste_dma40: code clean-up
This patch includes non functional code clean up changes,
file header updates and a few magic numbers got defined.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jonas Aaberg [Mon, 9 Aug 2010 12:08:26 +0000 (12:08 +0000)]
DMAENGINE: ste_dma40: added support for link jobs in hw
If a new job is added on a physical channel that already has
a job, the new job is linked in hw to the old job instead of
queueing up the jobs.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jonas Aaberg [Mon, 9 Aug 2010 12:08:18 +0000 (12:08 +0000)]
DMAENGINE: ste_dma40: removed a few magic numbers
Make sure to extract the revision field explicitly and document
what bits are being accessed here without magic numbers.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jonas Aaberg [Mon, 9 Aug 2010 12:08:10 +0000 (12:08 +0000)]
DMAENGINE: ste_dma40: fix bug related to callback handling
The callback got called even when it was not supposed to. Also
removed some not needed interrupt trigger on/off code.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jonas Aaberg [Mon, 9 Aug 2010 12:08:02 +0000 (12:08 +0000)]
DMAENGINE: ste_dma40: Code clean-up and removed an unneeded suspend request
This patch cleans up some code and removes a suspend request that was pointless
since the hw was never configured nor running when it was called.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jonas Aaberg [Mon, 9 Aug 2010 12:07:54 +0000 (12:07 +0000)]
DMAENGINE: ste_dma40: No need reading, masking and setting a set register
Removes an unnecessary register read and a few lines of code.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jonas Aaberg [Mon, 9 Aug 2010 12:07:44 +0000 (12:07 +0000)]
DMAENGINE: ste_dma40: Fix failed to restart logical channel bug
A transfer that runs in the different direction on the same
channel will now be resumed when the other is suspend/stopped.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Linus Walleij [Mon, 9 Aug 2010 12:07:36 +0000 (12:07 +0000)]
DMAENGINE: ste_dma40: config checks
Added various configuration checks.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Carpenter [Mon, 6 Sep 2010 12:32:30 +0000 (14:32 +0200)]
Staging: vt6655: fix buffer overflow
"param->u.wpa_associate.wpa_ie_len" comes from the user. We should
check it so that the copy_from_user() doesn't overflow the buffer.
Also further down in the function, we assume that if
"param->u.wpa_associate.wpa_ie_len" is set then "abyWPAIE[0]" is
initialized. To make that work, I changed the test here to say that if
"wpa_ie_len" is set then "wpa_ie" has to be a valid pointer or we return
-EINVAL.
Oddly, we only use the first element of the abyWPAIE[] array. So I
suspect there may be some other issues in this function.
The netfilter hook seems to be misused and may leak skbs in situations
when NF_HOOK returns NF_STOLEN. It may not filter everything as
expected. Also the ethernet bridge tables are not yet capable to
understand batman-adv packet correctly.
It was only added for testing purposes and can be removed again.
Mika Westerberg [Sat, 4 Sep 2010 07:23:23 +0000 (10:23 +0300)]
serial: amba-pl010: fix set_ldisc
Commit d87d9b7d1 ("tty: serial - fix tty referencing in set_ldisc") changed
set_ldisc to take ldisc number as parameter. This patch fixes AMBA PL010 driver
according the new prototype.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi> Cc: Alan Cox <alan@linux.intel.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The TIOCGICOUNT device ioctl in both mos7720.c and mos7840.c allows
unprivileged users to read uninitialized stack memory, because the
"reserved" member of the serial_icounter_struct struct declared on the
stack is not altered or zeroed before being copied back to the user.
This patch takes care of it.
Ming Lei [Mon, 6 Sep 2010 15:27:09 +0000 (23:27 +0800)]
USB: otg: twl4030: fix phy initialization(v1)
Commit 461c317705eca5cac09a360f488715927fd0a927(into 2.6.36-v3)
is put forward to power down phy if no usb cable is connected,
but does introduce the two issues below:
1), phy is not into work state if usb cable is connected
with PC during poweron, so musb device mode is not usable
in such case, follows the reasons:
-twl4030_phy_resume is not called, so
regulators are not enabled
i2c access are not enabled
usb mode not configurated
2), The kernel warings[1] of regulators 'unbalanced disables'
is caused if poweron without usb cable connected
with PC or b-device.
This patch fixes the two issues above:
-power down phy only if no usb cable is connected with PC
and b-device
-do phy initialization(via __twl4030_phy_resume) if usb cable
is connected with PC(vbus event) or another b-device(ID event) in
twl4030_usb_probe.
This patch also doesn't put VUSB3V1 LDO into active mode in
twl4030_usb_ldo_init until VBUS/ID change detected, so we can
save more power consumption than before.
This patch is verified OK on Beagle board either connected with
usb cable or not when poweron.
Signed-off-by: Ming Lei <tom.leiming@gmail.com> Cc: David Brownell <dbrownell@users.sourceforge.net> Cc: Felipe Balbi <me@felipebalbi.com> Cc: Anand Gadiyar <gadiyar@ti.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
usb: musb_debugfs: don't use the struct file private_data field with seq_files
seq_files use the private_data field of a file struct for storing a seq_file structure,
data should be stored in seq_file's own private field (e.g. file->private_data->private)
Otherwise seq_release() will free the private data when the file is closed.
Al Viro [Mon, 20 Sep 2010 14:13:25 +0000 (15:13 +0100)]
frv: double syscall restarts, syscall restart in sigreturn()
We need to make sure that only the first do_signal() to be handled on
the way out syscall will bother with syscall restarts; additionally, the
check on the "signal has user handler" path had been wrong - compare
with restart prevention in sigreturn()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Mon, 20 Sep 2010 14:13:14 +0000 (15:13 +0100)]
frv: avoid infinite loop of SIGSEGV delivery
Use force_sigsegv() rather than force_sig(SIGSEGV, ...) as the former
resets the SEGV handler pointer which will kill the process, rather than
leaving it open to an infinite loop if the SEGV handler itself caused a
SEGV signal.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Mon, 20 Sep 2010 14:13:09 +0000 (15:13 +0100)]
frv: fix address verification holes in setup_frame/setup_rt_frame
a) sa_handler might be maliciously set to point to kernel memory;
blindly dereferencing it in FDPIC case is a Bad Idea(tm).
b) I'm not sure you need that set_fs(USER_DS) there at all, but if you
do, you'd better do it *before* checking the frame you've decided to
use with access_ok(), lest sigaltstack() becomes a convenient
roothole.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 4969c1192d15 ("mm: fix swapin race condition") is now agreed to
be incomplete. There's a race, not very much less likely than the
original race envisaged, in which it is further necessary to check that
the swapcache page's swap has not changed.
Here's the reasoning: cast in terms of reuse_swap_page(), but probably
could be reformulated to rely on try_to_free_swap() instead, or on
swapoff+swapon.
A, faults into do_swap_page(): does page1 = lookup_swap_cache(swap1) and
comes through the lock_page(page1).
B, a racing thread of the same process, faults on the same address: does
page1 = lookup_swap_cache(swap1) and now waits in lock_page(page1), but
for whatever reason is unlucky not to get the lock any time soon.
A carries on through do_swap_page(), a write fault, but cannot reuse the
swap page1 (another reference to swap1). Unlocks the page1 (but B
doesn't get it yet), does COW in do_wp_page(), page2 now in that pte.
C, perhaps the parent of A+B, comes in and write faults the same swap
page1 into its mm, reuse_swap_page() succeeds this time, swap1 is freed.
kswapd comes in after some time (B still unlucky) and swaps out some
pages from A+B and C: it allocates the original swap1 to page2 in A+B,
and some other swap2 to the original page1 now in C. But does not
immediately free page1 (actually it couldn't: B holds a reference),
leaving it in swap cache for now.
B at last gets the lock on page1, hooray! Is PageSwapCache(page1)? Yes.
Is pte_same(*page_table, orig_pte)? Yes, because page2 has now been
given the swap1 which page1 used to have. So B proceeds to insert page1
into A+B's page_table, though its content now belongs to C, quite
different from what A wrote there.
B ought to have checked that page1's swap was still swap1.
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
alpha: deal with multiple simultaneously pending signals
alpha: fix a 14 years old bug in sigreturn tracing
alpha: unb0rk sigsuspend() and rt_sigsuspend()
alpha: belated ERESTART_RESTARTBLOCK race fix
alpha: Shift perf event pending work earlier in timer interrupt
alpha: wire up fanotify and prlimit64 syscalls
alpha: kill big kernel lock
alpha: fix build breakage in asm/cacheflush.h
alpha: remove unnecessary cast from void* in assignment.
alpha: Use static const char * const where possible
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
dca: disable dca on IOAT ver.3.0 multiple-IOH platforms
netpoll: Disable IRQ around RCU dereference in netpoll_rx
sctp: Do not reset the packet during sctp_packet_config().
net/llc: storing negative error codes in unsigned short
MAINTAINERS: move atlx discussions to netdev
drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory
drivers/net/eql.c: prevent reading uninitialized stack memory
drivers/net/usb/hso.c: prevent reading uninitialized memory
xfrm: dont assume rcu_read_lock in xfrm_output_one()
r8169: Handle rxfifo errors on 8168 chips
3c59x: Remove atomic context inside vortex_{set|get}_wol
tcp: Prevent overzealous packetization by SWS logic.
net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS
phylib: fix PAL state machine restart on resume
net: use rcu_barrier() in rollback_registered_many
bonding: correctly process non-linear skbs
ipv4: enable getsockopt() for IP_NODEFRAG
ipv4: force_igmp_version ignored when a IGMPv3 query received
ppp: potential NULL dereference in ppp_mp_explode()
net/llc: make opt unsigned in llc_ui_setsockopt()
...
Jan Harkes [Sat, 18 Sep 2010 03:26:01 +0000 (23:26 -0400)]
Coda: mount hangs because of missed REQ_WRITE rename
Coda's REQ_* defines were renamed to avoid clashes with the block layer
(commit 4aeefdc69f7b: "coda: fixup clash with block layer REQ_*
defines").
However one was missed and response messages are no longer matched with
requests and waiting threads are no longer woken up. This patch fixes
this.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
[ Also fixed up whitespace while at it -Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sat, 18 Sep 2010 12:42:27 +0000 (08:42 -0400)]
alpha: deal with multiple simultaneously pending signals
Unlike the other targets, alpha sets _one_ sigframe and
buggers off until the next syscall/interrupt, even if
more signals are pending. It leads to quite a few unpleasant
inconsistencies, starting with SIGSEGV potentially arriving
not where it should and including e.g. mess with sigsuspend();
consider two pending signals blocked until sigsuspend()
unblocks them. We pick the first one; then, if we are hit
by interrupt while in the handler, we process the second one
as well. If we are not, and if no syscalls had been made,
we get out of the first handler and leave the second signal
pending; normally sigreturn() would've picked it anyway, but
here it starts with restoring the original mask and voila -
the second signal is blocked again. On everything else we
get both delivered consistently.
It's actually easy to fix; the only thing to watch out for
is prevention of double syscall restart. Fortunately, the
idea I've nicked from arm fix by rmk works just fine...
Testcase demonstrating the behaviour in question; on alpha
we get one or both flags set (usually one), on everything
else both are always set.
#include <signal.h>
#include <stdio.h>
int had1, had2;
void f1(int sig) { had1 = 1; }
void f2(int sig) { had2 = 1; }
main()
{
sigset_t set1, set2;
sigemptyset(&set1);
sigemptyset(&set2);
sigaddset(&set2, 1);
sigaddset(&set2, 2);
signal(1, f1);
signal(2, f2);
sigprocmask(SIG_SETMASK, &set2, NULL);
raise(1);
raise(2);
sigsuspend(&set1);
printf("had1:%d had2:%d\n", had1, had2);
}
Tested-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
Al Viro [Sat, 18 Sep 2010 12:41:16 +0000 (08:41 -0400)]
alpha: fix a 14 years old bug in sigreturn tracing
The way sigreturn() is implemented on alpha breaks PTRACE_SYSCALL,
all way back to 1.3.95 when alpha has grown PTRACE_SYSCALL support.
What happens is direct return to ret_from_syscall, in order to bypass
mangling of a3 (error indicator) and prevent other mutilations of
registers (e.g. by syscall restart). That's fine, but... the entire
TIF_SYSCALL_TRACE codepath is kept separate on alpha and post-syscall
stopping/notifying the tracer is after the syscall. And the normal
path we are forcibly switching to doesn't have it.
So we end up with *one* stop in traced sigreturn() vs. two in other
syscalls. And yes, strace is visibly broken by that; try to strace
the following
#include <signal.h>
#include <stdio.h>
void f(int sig) {}
main()
{
signal(SIGHUP, f);
raise(SIGHUP);
write(1, "eeeek\n", 6);
}
and watch the show. The
close(1) = 405
in the end of strace output is coming from return value of write() (6 ==
__NR_close on alpha) and syscall number of exit_group() (__NR_exit_group ==
405 there).
The fix is fairly simple - the only thing we end up missing is the call
of syscall_trace() and we can tell whether we'd been called from the
SYSCALL_TRACE path by checking ra value. Since we are setting the
switch_stack up (that's what sys_sigreturn() does), we have the right
environment for calling syscall_trace() - just before we call
undo_switch_stack() and return. Since undo_switch_stack() will overwrite
s0 anyway, we can use it to store the result of "has it been called from
SYSCALL_TRACE path?" check. The same thing applies in rt_sigreturn().
Tested-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
Al Viro [Sat, 18 Sep 2010 12:40:07 +0000 (08:40 -0400)]
alpha: unb0rk sigsuspend() and rt_sigsuspend()
Old code used to set regs->r0 and regs->r19 to force the right
return value. Leaving that after switch to ERESTARTNOHAND
was a Bad Idea(tm), since now that screws the restart - if we
hit the case when get_signal_to_deliver() returns 0, we will
step back to syscall insn, with v0 set to EINTR and a3 to 1.
The latter won't matter, since EINTR is 4, aka __NR_write.
results on alpha in immediate message to stdout...
Fix is obvious; moreover, since we don't need regs anymore, we can
switch to normal prototypes for these guys and lose the wrappers.
Even better, rt_sigsuspend() is identical to generic version in
kernel/signal.c now.
Tested-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
Michael Cree [Sun, 19 Sep 2010 06:05:40 +0000 (02:05 -0400)]
alpha: Shift perf event pending work earlier in timer interrupt
Pending work from the performance event subsystem is executed in
the timer interrupt. This patch shifts the call to
perf_event_do_pending() before the call to update_process_times()
as the latter may call back into the perf event subsystem and it
is prudent to have the pending work executed first.
Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Matt Turner <mattst88@gmail.com>
All uses of the BKL on alpha are totally bogus, nothing
is really protected by this. Remove the remaining users
so we don't have to mark alpha as 'depends on BKL'.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: linux-alpha@vger.kernel.org Signed-off-by: Matt Turner <mattst88@gmail.com>
Alpha SMP flush_icache_user_range() is implemented as an inline
function inside include/asm/cacheflush.h. It dereferences @current
but doesn't include linux/sched.h and thus causes build failure if
linux/sched.h wasn't included previously. Fix it by including the
needed header file explicitly.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Matt Turner <mattst88@gmail.com>
dca: disable dca on IOAT ver.3.0 multiple-IOH platforms
Direct Cache Access is not supported on IOAT ver.3.0 multiple-IOH platforms.
This patch blocks registering of dca providers when multiple IOH detected with IOAT ver.3.0.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 17 Sep 2010 23:55:03 +0000 (16:55 -0700)]
netpoll: Disable IRQ around RCU dereference in netpoll_rx
We cannot use rcu_dereference_bh safely in netpoll_rx as we may
be called with IRQs disabled. We could however simply disable
IRQs as that too causes BH to be disabled and is safe in either
case.
Thanks to John Linville for discovering this bug and providing
a patch.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
sctp: Do not reset the packet during sctp_packet_config().
sctp_packet_config() is called when getting the packet ready
for appending of chunks. The function should not touch the
current state, since it's possible to ping-pong between two
transports when sending, and that can result packet corruption
followed by skb overlfow crash.
Reported-by: Thomas Dreibholz <dreibh@iem.uni-due.de> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (lm95241) Replace rate sysfs attribute with update_interval
hwmon: (adm1031) Replace update_rate sysfs attribute with update_interval
hwmon: (w83627ehf) Use proper exit sequence
hwmon: (emc1403) Remove unnecessary hwmon_device_unregister
hwmon: (f75375s) Do not overwrite values read from registers
hwmon: (f75375s) Shift control mode to the correct bit position
hwmon: New subsystem maintainers
hwmon: (lis3lv02d) Prevent NULL pointer dereference
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md: fix v1.x metadata update when a disk is missing.
md: call md_update_sb even for 'external' metadata arrays.
Al Viro [Fri, 17 Sep 2010 13:34:39 +0000 (14:34 +0100)]
arm: fix really nasty sigreturn bug
If a signal hits us outside of a syscall and another gets delivered
when we are in sigreturn (e.g. because it had been in sa_mask for
the first one and got sent to us while we'd been in the first handler),
we have a chance of returning from the second handler to location one
insn prior to where we ought to return. If r0 happens to contain -513
(-ERESTARTNOINTR), sigreturn will get confused into doing restart
syscall song and dance.
Incredible joy to debug, since it manifests as random, infrequent and
very hard to reproduce double execution of instructions in userland
code...
The fix is simple - mark it "don't bother with restarts" in wrapper,
i.e. set r8 to 0 in sys_sigreturn and sys_rt_sigreturn wrappers,
suppressing the syscall restart handling on return from these guys.
They can't legitimately return a restart-worthy error anyway.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hwmon: (adm1031) Replace update_rate sysfs attribute with update_interval
The attribute reflects an interval, not a rate.
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Acked-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Jean Delvare <khali@linux-fr.org>
hwmon: (f75375s) Do not overwrite values read from registers
All bits in the values read from registers to be used for the next
write were getting overwritten, avoid doing so to not mess with the
current configuration.
hwmon: (f75375s) Shift control mode to the correct bit position
The spec notes that fan0 and fan1 control mode bits are located in bits
7-6 and 5-4 respectively, but the FAN_CTRL_MODE macro was making the
bits shift by 5 instead of by 4.
Jean Delvare [Fri, 17 Sep 2010 15:24:11 +0000 (17:24 +0200)]
hwmon: New subsystem maintainers
Guenter Roeck volunteered to adopt the hwmon subsystem as long as he
wasn't the only maintainer. As this was also my own condition, we can
add the two of us as co-maintainers of the hwmon subsystem.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>