Jialing Fu [Fri, 28 Aug 2015 03:13:09 +0000 (11:13 +0800)]
mmc: core: fix race condition in mmc_wait_data_done
The following panic is captured in ker3.14, but the issue still exists
in latest kernel.
---------------------------------------------------------------------
[ 20.738217] c0 3136 (Compiler) Unable to handle kernel NULL pointer dereference
at virtual address 00000578
......
[ 20.738499] c0 3136 (Compiler) PC is at _raw_spin_lock_irqsave+0x24/0x60
[ 20.738527] c0 3136 (Compiler) LR is at _raw_spin_lock_irqsave+0x20/0x60
[ 20.740134] c0 3136 (Compiler) Call trace:
[ 20.740165] c0 3136 (Compiler) [<ffffffc0008ee900>] _raw_spin_lock_irqsave+0x24/0x60
[ 20.740200] c0 3136 (Compiler) [<ffffffc0000dd024>] __wake_up+0x1c/0x54
[ 20.740230] c0 3136 (Compiler) [<ffffffc000639414>] mmc_wait_data_done+0x28/0x34
[ 20.740262] c0 3136 (Compiler) [<ffffffc0006391a0>] mmc_request_done+0xa4/0x220
[ 20.740314] c0 3136 (Compiler) [<ffffffc000656894>] sdhci_tasklet_finish+0xac/0x264
[ 20.740352] c0 3136 (Compiler) [<ffffffc0000a2b58>] tasklet_action+0xa0/0x158
[ 20.740382] c0 3136 (Compiler) [<ffffffc0000a2078>] __do_softirq+0x10c/0x2e4
[ 20.740411] c0 3136 (Compiler) [<ffffffc0000a24bc>] irq_exit+0x8c/0xc0
[ 20.740439] c0 3136 (Compiler) [<ffffffc00008489c>] handle_IRQ+0x48/0xac
[ 20.740469] c0 3136 (Compiler) [<ffffffc000081428>] gic_handle_irq+0x38/0x7c
----------------------------------------------------------------------
Because in SMP, "mrq" has race condition between below two paths:
path1: CPU0: <tasklet context>
static void mmc_wait_data_done(struct mmc_request *mrq)
{
mrq->host->context_info.is_done_rcv = true;
//
// If CPU0 has just finished "is_done_rcv = true" in path1, and at
// this moment, IRQ or ICache line missing happens in CPU0.
// What happens in CPU1 (path2)?
//
// If the mmcqd thread in CPU1(path2) hasn't entered to sleep mode:
// path2 would have chance to break from wait_event_interruptible
// in mmc_wait_for_data_req_done and continue to run for next
// mmc_request (mmc_blk_rw_rq_prep).
//
// Within mmc_blk_rq_prep, mrq is cleared to 0.
// If below line still gets host from "mrq" as the result of
// compiler, the panic happens as we traced.
wake_up_interruptible(&mrq->host->context_info.wait);
}
mmc: host: omap_hsmmc: use ios->vdd for setting vmmc voltage
vdd voltage is set in mmc core to ios->vdd and vmmc should actually
be set to this voltage. Modify omap_hsmmc_enable_supply
to not take vdd as argument since now it's directly set to
the voltage in ios->vdd.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
mmc: host: omap_hsmmc: don't use ->set_power to set initial regulator state
If the regulator is enabled on boot (checked using regulator_is_enabled),
invoke regulator_enable() so that the usecount reflects the correct
state of the regulator and then disable the regulator so that the
initial state of the regulator is disabled. Avoid using ->set_power,
since set_power also takes care of setting the voltages which is not
needed at this point.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
mmc: host: omap_hsmmc: return error if any of the regulator APIs fail
Return error if any of the regulator APIs (regulator_enable,
regulator_disable, regulator_set_voltage) fails in
omap_hsmmc_set_power to avoid undefined behavior.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Remove the unnecessary pbias regulator_set_voltage done after
pbias regulator_disable in omap_hsmmc_set_power.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Reviewed-by: Roger Quadros <rogerq@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
mmc: host: omap_hsmmc: use mmc_host's vmmc and vqmmc
No functional change. Instead of using omap_hsmmc_host's vcc and vcc_aux
members, use vmmc and vqmmc present in mmc_host which is present
for the same purpose.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Reviewed-by: Roger Quadros <rogerq@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
mmc: host: omap_hsmmc: use the ocrmask provided by the vmmc regulator
If the vmmc regulator provides a valid ocrmask, use it. By this even if
the pdata has a valid ocrmask, it will be overwritten with the ocrmask
of the vmmc regulator.
Also remove the unnecessary compatibility check between the ocrmask in
the pdata and the ocrmask from the vmmc regulator.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
No functional change. Instead of using a local regulator variable
in omap_hsmmc_reg_get() for holding the return value of
devm_regulator_get_optional() and then assigning to omap_hsmmc_host
regulator members: vcc, vcc_aux and pbias, directly use the
omap_hsmmc_host regulator members.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Reviewed-by: Roger Quadros <rogerq@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
mmc: host: omap_hsmmc: return on fatal errors from omap_hsmmc_reg_get
Now return error only if the return value of
devm_regulator_get_optional() is not the same as -ENODEV, since with
-EPROBE_DEFER, the regulator can be obtained later and all other
errors are fatal.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
mmc: host: omap_hsmmc: use devm_regulator_get_optional() for vmmc
Since vmmc can be optional for some platforms, use
devm_regulator_get_optional() for vmmc. Now return error only
if the return value of devm_regulator_get_optional() is not the
same as -ENODEV, since with -EPROBE_DEFER, the regulator can be
obtained later and all other errors are fatal.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Rabin Vincent [Wed, 19 Aug 2015 13:41:36 +0000 (15:41 +0200)]
mmc: usdhi6rol0: fix ack register write
The intent appears to be to clear only the bits which are set in status
(by setting them to zero in the ack write), like in the other interrupt
handlers, and not to always clear everything (by always writing zero).
Use the correct not operator.
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Rabin Vincent [Wed, 19 Aug 2015 13:41:34 +0000 (15:41 +0200)]
mmc: usdhi6rol0: handle probe deferral for regulator
We ignore errors from mmc_regulator_get_supply() because the usage of
the regulators is optional for the driver, but we still need to check
for and handle EPROBE_DEFER, like it's done in for example dw_mmc.
Otherwise we might end up not using the specified regulators just
because of probe order.
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Shawn Lin [Tue, 11 Aug 2015 07:57:05 +0000 (15:57 +0800)]
mmc: sdhci-of-arasan: Add the support for sdhci-5.1
This patch adds the compatible string in sdhci-of-arasan.c to
support sdhci-arasan5.1 version of controller. No documented
controller IP version is found in the TRM, so we use ths version
of command queueing engine integrated into this controller by arasan
to specify our controller.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Jisheng Zhang [Tue, 18 Aug 2015 08:21:39 +0000 (16:21 +0800)]
mmc: sdhci: also get preset value and driver type for MMC_DDR52
commit bb8175a8aa42 ("mmc: sdhci: clarify DDR timing mode between
SD-UHS and eMMC") added MMC_DDR52 as eMMC's DDR mode to be
distinguished from SD-UHS, but it missed setting driver type for
MMC_DDR52 timing mode.
So sometimes we get the following error on Marvell BG2Q DMP board:
[ 1.559598] mmcblk0: error -84 transferring data, sector 0, nr 8, cmd
response 0x900, card status 0xb00
[ 1.569314] mmcblk0: retrying using single block read
[ 1.575676] mmcblk0: error -84 transferring data, sector 2, nr 6, cmd
response 0x900, card status 0x0
[ 1.585202] blk_update_request: I/O error, dev mmcblk0, sector 2
[ 1.591818] mmcblk0: error -84 transferring data, sector 3, nr 5, cmd
response 0x900, card status 0x0
[ 1.601341] blk_update_request: I/O error, dev mmcblk0, sector 3
This patches fixes this by adding the missing driver type setting.
Shawn Lin [Wed, 12 Aug 2015 05:08:32 +0000 (13:08 +0800)]
mmc: block: skip trim for some kingston eMMCs
For some mass production of kingston eMMCs which adopt Phison's
firmware will meet an unrecoverable data conrruption occasionally
if performing trim due to a firmware bug confirmed by vendor. We
found it on Intel-C3230RK platform. So we add fixup of broken trim
for it.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Haibo Chen [Tue, 11 Aug 2015 11:38:31 +0000 (19:38 +0800)]
mmc: sdhci-esdhc-imx: change default watermark level and burst length
By default, for all imx SoC types, the watermark level is 16, and the
burst length is 8. But if the SDIO/SD/MMC I/O speed is fast enough,
this default watermark level and burst length will be the performance
bottleneck.
For example, i.MX7D support eMMC HS400 mode, this mode can run in 8 bit,
200MHZ DDR mode. So the I/O speed improve a lot compare to SD3.0.
The default burst length is 8, if we don't change this value, in
HS400 mode, when we do eMMC read operation, we can find that the
clock signal will stop for a period of time. This means the speed
of data moving on AHB bus is slower than I/O speed. So we should
improve the speed of data moving on AHB bus.
This patch set the default burst length as 16, and set the default
watermark level as 64. The test result is the clock signal has
no stop during the eMMC HS400 operation.
Haibo Chen [Tue, 11 Aug 2015 11:38:30 +0000 (19:38 +0800)]
mmc: sdhci-esdhc-imx: set back the burst_length_enable bit to 1
Currently we find that if a usdhc is choosed to boot system, then ROM
code will set the burst length enable bit of this usdhc as 0.
This will make performance drop a lot if this usdhc's burst length is
configed. So this patch set back the burst_length_enable bit as 1,
which is the default value, and means burst length is enabled for INCR.
Haibo Chen [Tue, 11 Aug 2015 11:38:27 +0000 (19:38 +0800)]
mmc: sdhci-esdhc-imx: add tuning-step setting support
tuning-step is the delay cell steps in tuning procedure. The default value
of tuning-step is 1. Some boards or cards need another value to pass the
tuning procedure. For example, imx7d-sdb board need the tuning-step value
as 2, otherwise it can't pass the tuning procedure.
So this patch add the tuning-step setting in driver, so that user can set
the tuning-step value in dts.
Haibo Chen [Tue, 25 Aug 2015 02:02:11 +0000 (10:02 +0800)]
mmc: sdhci: fix dma memory leak in sdhci_pre_req()
Currently one mrq->data maybe execute dma_map_sg() twice
when mmc subsystem prepare over one new request, and the
following log show up:
sdhci[sdhci_pre_dma_transfer] invalid cookie: 24, next-cookie 25
In this condition, mrq->date map a dma-memory(1) in sdhci_pre_req
for the first time, and map another dma-memory(2) in sdhci_prepare_data
for the second time. But driver only unmap the dma-memory(2), and
dma-memory(1) never unmapped, which cause the dma memory leak issue.
This patch use another method to map the dma memory for the mrq->data
which can fix this dma memory leak issue.
Adam Lee [Mon, 3 Aug 2015 06:33:28 +0000 (14:33 +0800)]
mmc: sdhci-pci: set the clear transfer mode register quirk for O2Micro
This patch fixes MMC not working issue on O2Micro/BayHub Host, which
requires transfer mode register to be cleared when sending no DMA
command.
Signed-off-by: Peter Guo <peter.guo@bayhubtech.com> Signed-off-by: Adam Lee <adam.lee@canonical.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
mmc: sdhci: switch from programmable clock mode to divided one if needed
In programmable mode, if the clock frequency is too high, the divider
can be too small to meet the clock frequency requirement especially to
init the SD card. In this case, switch to the divided clock mode.
Addy Ke [Mon, 10 Aug 2015 16:27:18 +0000 (01:27 +0900)]
mmc: dw_mmc: add quirk for broken data transfer over scheme
This patch add a new quirk to add a s/w timer to notify the driver
to terminate current transfer and report a data timeout to the core,
if DTO interrupt does NOT come within the given time.
dw_mmc call mmc_request_done func to finish transfer depends on
DTO interrupt. If DTO interrupt does not come in sending data state,
the current transfer will be blocked.
We got the reply from synopsys:
There are two counters but both use the same value of [31:8] bits.
Data timeout counter doesn't wait for stop clock and you should get
DRTO even when the clock is not stopped.
Host Starvation timeout counter is triggered with stop clock condition.
This means that host should get DRTO and DTO interrupt.
But this case really exists, when driver reads tuning data from
card on RK3288-pink2 board. I measured waveforms by oscilloscope
and found that card clock was always on and data lines were always
holded high level in sending data state.
There are two possibility that data over interrupt doesn't come in
reading data state on RK3X SoCs:
- get command done interrupt, but doesn't get any data-related interrupt.
- get data error interrupt, but doesn't get data over interrupt.
Heiko Stuebner [Mon, 3 Aug 2015 15:04:10 +0000 (17:04 +0200)]
mmc: dw_mmc: fix pio mode when internal dmac is enabled
The dw_mci_init_dma() may decide to not use dma, but pio instead, caused
by things like wrong dma settings in the system.
Till now the code dw_mci_init_slot() always assumed that dma is available
when CONFIG_MMC_DW_IDMAC was defined, ignoring the host->use_dma var
set during dma init.
So when now the dma init failed for whatever reason, the transfer sizes
would still be set for dma transfers, especially including the maximum
block-count calculated from host->ring_size and resulting in a
[ 4.991109] ------------[ cut here ]------------
[ 4.991111] kernel BUG at drivers/mmc/core/core.c:256!
[ 4.991113] Internal error: Oops - BUG: 0 [#1] SMP ARM
because host->ring_size is 0 in this case and the slot init code uses
the wrong code to calculate the values.
Fix this by selecting the correct calculations using the host->use_dma
variable instead of the CONFIG_MMC_DW_IDMAC config option.
Shawn Lin [Mon, 3 Aug 2015 07:07:21 +0000 (15:07 +0800)]
mmc: dw_mmc: Fix coding style issues
This patch fixes the following issues reported by checkpatch.pl:
- use -EINVAL instead of -ENOSYS, to fix warning message:
"ENOSYS means 'invalid syscall nr' and nothing else"
- split lines whose length is greater than 80 characters
- avoid quoted string split across lines
- use min_t instead of min, to fix warning message:
"min() should probably be min_t(int, cnt, host->part_buf_count)"
- fix missing a blank line after declarations
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Alexey Brodkin [Thu, 25 Jun 2015 08:25:07 +0000 (11:25 +0300)]
mmc: dw_mmc: handle data blocks > than 4kB if IDMAC is used
As per DW MobileStorage databook "each descriptor can transfer up to 4kB
of data in chained mode", moreover buffer size that is put in "des1" is
limited to 13 bits, i.e. for example on attempt to
IDMAC_SET_BUFFER1_SIZE(desc, 8192) size value that's effectively written
will be 0.
On the platform with 8kB PAGE_SIZE I see dw_mmc gets data blocks in
SG-list of 8kB size and that leads to unpredictable behavior of the
SD/MMC controller.
In particular on write to FAT partition of SD-card the controller will
stuck in the middle of DMA transaction.
Solution to the problem is simple - we need to pass large (> 4kB) data
buffers to the controller via multiple descriptors. And that's what
that change does.
What's interesting I did try original driver on same platform but
configured with 4kB PAGE_SIZE and may confirm that data blocks passed
in SG-list to dw_mmc never exeed 4kB limit - that explains why nobody
ever faced a problem I did.
Yangbo Lu [Fri, 10 Jul 2015 03:44:03 +0000 (11:44 +0800)]
mmc: block: add fixup of broken CMD23 for Sandisk card
Some Sandisk cards(such as "SDMB-32" and "SDM032" cards)
can't support CMD23, and would generate CMD timeout. So add
FIX-UP for these two types Sandisk cards.
Error log:
mmcblk0: timed out sending SET_BLOCK_COUNT command, card status 0x400900
mmcblk0: timed out sending SET_BLOCK_COUNT command, card status 0x400900
mmcblk0: timed out sending SET_BLOCK_COUNT command, card status 0x400900
end_request: I/O error, dev mmcblk0, sector 0
Buffer I/O error on device mmcblk0, logical block 0
mmcblk0: timed out sending SET_BLOCK_COUNT command, card status 0x400900
Signed-off-by: Yangbo Lu <yangbo.lu@freescale.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Yangbo Lu [Fri, 10 Jul 2015 03:36:45 +0000 (11:36 +0800)]
mmc: sdio: avoid using NULL sdio_irq_thread pointer
For Freescale QorIQ LS1021AQDS board, there is a SDIO interrupt
in the process of resume without inserting SD adapter because of
some unknown issue. But the driver doesn't assign sdio_irq_thread
pointer. This will block the resume of kernel. This patch is used
to avoid using NULL sdio_irq_thread pointer.
Signed-off-by: Yangbo Lu <yangbo.lu@freescale.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
If no pdata.set_power was set by the platform code, the driver
was updating pdata with its own fallback function. This is a no-no
since pdata shall be read-only.
This patch pushes the check 'pdata->set_power != NULL' down into
the fallback functions. If pdata.set_power is really set, it calls them
and exits, otherwise the fallback code is used.
Signed-off-by: Andreas Fenkart <afenkart@gmail.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Before 5b83b2234be6733cf the driver was hard coding the wakeup irq to
be active low. The generic pm wakeirq does not override the active
high/low parameter, hence it must be specified correctly in the
device tree.
Mind that SDIO IRQ is active low as defined in the SDIO specification
Signed-off-by: Andreas Fenkart <afenkart@gmail.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ivan T. Ivanov [Mon, 6 Jul 2015 12:16:21 +0000 (15:16 +0300)]
mmc: sdhci: properly check card present state when quirk NO_CARD_NO_RESET is set
Controller could have both NO_CARD_NO_RESET and BROKEN_CARD_DETECTION
quirks set. Use sdhci_do_get_cd() when applying NO_CARD_NO_RESET, which
properly check for BROKEN_CARD_DETECTION quirk.
Signed-off-by: Ivan T. Ivanov <ivan.ivanov@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ivan T. Ivanov [Mon, 6 Jul 2015 12:16:19 +0000 (15:16 +0300)]
mmc: sdhci: let GPIO based card detection have higher precedence
Controller could have BROKEN_CARD_DETECTION quirk set, but drivers
could use GPIO to detect card present state. Let, when defined, GPIO
take precedence, so drivers could properly detect card state and not
use polling.
Signed-off-by: Ivan T. Ivanov <ivan.ivanov@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ivan T. Ivanov [Mon, 6 Jul 2015 11:53:38 +0000 (14:53 +0300)]
mmc: sdhci-msm: Boost controller core clock
Ensure SDCC is working with maximum clock otherwise card
detection could be extremely slow, up to 7 seconds.
Signed-off-by: Ivan T. Ivanov <ivan.ivanov@linaro.org> Reviewed-by: Georgi Djakov <georgi.djakov@linaro.org> Acked-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Daniel Mack [Sat, 6 Jun 2015 21:15:22 +0000 (23:15 +0200)]
mmc: pxamci: switch over to dmaengine use
Switch over pxamci to dmaengine. This prepares the devicetree full
support of pxamci.
This was successfully tested on a PXA3xx board, as well as PXA27x.
Signed-off-by: Daniel Mack <zonque@gmail.com>
[adapted to pxa-dma] Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
David Jander [Tue, 23 Jun 2015 09:43:52 +0000 (11:43 +0200)]
mmc: core: Optimize case for exactly one erase-group budget
In the (not so unlikely) case that the mmc controller timeout budget is
enough for exactly one erase-group, the simplification of allowing one
sector has an enormous performance penalty. We optimize this special case
by introducing a flag that prohibits erase-group boundary crossing, so
that we can allow trimming more than one sector at a time.
Signed-off-by: David Jander <david@protonic.nl> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Torvalds [Sun, 16 Aug 2015 22:44:33 +0000 (15:44 -0700)]
Merge tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A smallish batch of fixes, a little more than expected this late, but
all fixes are contained to their platforms and seem reasonably low
risk:
- a somewhat large SMP fix for ux500 that still seemed warranted to
include here
- OMAP DT fixes for pbias regulator specification that broke due to
some DT reshuffling
- PCIe IRQ routing bugfix for i.MX
- networking fixes for keystone
- runtime PM for OMAP GPMC
- a couple of error path bug fixes for exynos"
* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: keystone: Fix the mdio bindings by moving it to soc specific file
ARM: dts: keystone: fix the clock node for mdio
memory: omap-gpmc: Don't try to save uninitialized GPMC context
ARM: imx6: correct i.MX6 PCIe interrupt routing
ARM: ux500: add an SMP enablement type and move cpu nodes
ARM: dts: dra7: Fix broken pbias device creation
ARM: dts: OMAP5: Fix broken pbias device creation
ARM: dts: OMAP4: Fix broken pbias device creation
ARM: dts: omap243x: Fix broken pbias device creation
ARM: EXYNOS: fix double of_node_put() on error path
ARM: EXYNOS: Fix potentian kfree() of ro memory
Linus Torvalds [Sun, 16 Aug 2015 22:11:25 +0000 (15:11 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Merge x86 fixes from Ingo Molnar:
"Two followup fixes related to the previous LDT fix"
Also applied a further FPU emulation fix from Andy Lutomirski to the
branch before actually merging it.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
x86/ldt: Further fix FPU emulation
x86/ldt: Correct FPU emulation access to LDT
x86/ldt: Correct LDT access in single stepping logic
Olof Johansson [Sun, 16 Aug 2015 19:29:57 +0000 (21:29 +0200)]
Merge tag 'keystone-dts-late-fixes-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into fixes
ARM: Couple of Keysyone MDIO DTS fixes for 4.2-rc6+
These are necessary to get the NIC card working on all Keystone
EVMs. Couple of boards are broken without these two fixes.
* tag 'keystone-dts-late-fixes-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
ARM: dts: keystone: Fix the mdio bindings by moving it to soc specific file
ARM: dts: keystone: fix the clock node for mdio
Markos Chandras [Thu, 13 Aug 2015 07:47:59 +0000 (08:47 +0100)]
MIPS: Fix seccomp syscall argument for MIPS64
Commit 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
fixed indirect system calls on O32 but it also introduced a bug for MIPS64
where it erroneously modified the v0 (syscall) register with the assumption
that the sycall offset hasn't been taken into consideration. This breaks
seccomp on MIPS64 n64 and n32 ABIs. We fix this by replacing the addition
with a move instruction.
Linus Torvalds [Sat, 15 Aug 2015 20:54:53 +0000 (13:54 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This has two libfc fixes for bugs causing rare crashes, one iscsi fix
for a potential hang on shutdown, and a fix for an I/O blocksize issue
which caused a regression"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
sd: Fix maximum I/O size for BLOCK_PC requests
libfc: Fix fc_fcp_cleanup_each_cmd()
libfc: Fix fc_exch_recv_req() error path
libiscsi: Fix host busy blocking during connection teardown
Linus Torvalds [Sat, 15 Aug 2015 00:27:52 +0000 (17:27 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Just two very small & simple patches"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Use adjustment in guest cycles when handling MSR_IA32_TSC_ADJUST
KVM: x86: zero IDT limit on entry to SMM
Linus Torvalds [Sat, 15 Aug 2015 00:05:26 +0000 (17:05 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"11 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
Update maintainers for DRM STI driver
mm: cma: mark cma_bitmap_maxno() inline in header
zram: fix pool name truncation
memory-hotplug: fix wrong edge when hot add a new node
.mailmap: Andrey Ryabinin has moved
ipc/sem.c: update/correct memory barriers
mm/hwpoison: fix panic due to split huge zero page
ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()
ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits
mm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held
mm/hwpoison: fix page refcount of unknown non LRU page
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Cc: Vincent Abriou <vincent.abriou@st.com> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gregory Fong [Fri, 14 Aug 2015 22:35:21 +0000 (15:35 -0700)]
mm: cma: mark cma_bitmap_maxno() inline in header
cma_bitmap_maxno() was marked as static and not static inline, which can
cause warnings about this function not being used if this file is included
in a file that does not call that function, and violates the conventions
used elsewhere. The two options are to move the function implementation
back to mm/cma.c or make it inline here, and it's simple enough for the
latter to make sense.
Signed-off-by: Gregory Fong <gregory.0xf0@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
However, it defines pool name buffer to be only 8 bytes long (minus
trailing zero), which means that we can have only 1000 pool names: zram0
-- zram999.
With CONFIG_ZSMALLOC_STAT enabled an attempt to create a device zram1000
can fail if device zram100 already exists, because snprintf() will
truncate new pool name to zram100 and pass it debugfs_create_dir(),
causing:
debugfs dir <zram100> creation failed
zram: Error creating memory pool
... and so on.
Fix it by passing zram->disk->disk_name to zram_meta_alloc() instead of
divice_id. We construct zram%d name earlier and keep it as a ->disk_name,
no need to snprintf() it again.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xishi Qiu [Fri, 14 Aug 2015 22:35:16 +0000 (15:35 -0700)]
memory-hotplug: fix wrong edge when hot add a new node
When we add a new node, the edge of memory may be wrong.
e.g. system has 4 nodes, and node3 is movable, node3 mem:[24G-32G],
1. hotremove the node3,
2. then hotadd node3 with a part of memory, mem:[26G-30G],
3. call hotadd_new_pgdat()
free_area_init_node()
get_pfn_range_for_nid()
4. it will return wrong start_pfn and end_pfn, because we have not
update the memblock.
This patch also fixes a BUG_ON during hot-addition, please see
http://marc.info/?l=linux-kernel&m=142961156129456&w=2
Manfred Spraul [Fri, 14 Aug 2015 22:35:10 +0000 (15:35 -0700)]
ipc/sem.c: update/correct memory barriers
sem_lock() did not properly pair memory barriers:
!spin_is_locked() and spin_unlock_wait() are both only control barriers.
The code needs an acquire barrier, otherwise the cpu might perform read
operations before the lock test.
As no primitive exists inside <include/spinlock.h> and since it seems
noone wants another primitive, the code creates a local primitive within
ipc/sem.c.
With regards to -stable:
The change of sem_wait_array() is a bugfix, the change to sem_lock() is a
nop (just a preprocessor redefinition to improve the readability). The
bugfix is necessary for all kernels that use sem_wait_array() (i.e.:
starting from 3.10).
Huge zero page is allocated if page fault w/o FAULT_FLAG_WRITE flag.
The get_user_pages_fast() which called in madvise_hwpoison() will get
huge zero page if the page is not allocated before. Huge zero page is a
tranparent huge page, however, it is not an anonymous page.
memory_failure will split the huge zero page and trigger
BUG_ON(is_huge_zero_page(page));
After commit 98ed2b0052e6 ("mm/memory-failure: give up error handling
for non-tail-refcounted thp"), memory_failure will not catch non anon
thp from madvise_hwpoison path and this bug occur.
Fix it by catching non anon thp in memory_failure in order to not split
huge zero page in madvise_hwpoison path.
After this patch:
Injecting memory failure for page 0x202800 at 0x7fd8ae800000
MCE: 0x202800: non anonymous thp
[...]
[akpm@linux-foundation.org: remove second split, per Wanpeng] Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()
After we acquire the sma->sem_perm lock in exit_sem(), we are protected
against a racing IPC_RMID operation. Also at that point, we are the last
user of sem_undo_list. Therefore it isn't required that we acquire or use
ulp->lock.
Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Acked-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Rafael Aquini <aquini@redhat.com> CC: Aristeu Rozanski <aris@redhat.com> Cc: David Jeffery <djeffery@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits
The current semaphore code allows a potential use after free: in
exit_sem we may free the task's sem_undo_list while there is still
another task looping through the same semaphore set and cleaning the
sem_undo list at freeary function (the task called IPC_RMID for the same
semaphore set).
For example, with a test program [1] running which keeps forking a lot
of processes (which then do a semop call with SEM_UNDO flag), and with
the parent right after removing the semaphore set with IPC_RMID, and a
kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
in the kernel log:
I wasn't able to trigger any badness on a recent kernel without the
proper config debugs enabled, however I have softlockup reports on some
kernel versions, in the semaphore code, which are similar as above (the
scenario is seen on some servers running IBM DB2 which uses semaphore
syscalls).
The patch here fixes the race against freeary, by acquiring or waiting
on the sem_undo_list lock as necessary (exit_sem can race with freeary,
while freeary sets un->semid to -1 and removes the same sem_undo from
list_proc or when it removes the last sem_undo).
After the patch I'm unable to reproduce the problem using the test case
[1].
void create_set()
{
int i, j;
pid_t p;
union {
int val;
struct semid_ds *buf;
unsigned short int *array;
struct seminfo *__buf;
} un;
/* Create and initialize semaphore set */
for (i = 0; i < NSET; i++) {
sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
if (sid[i] < 0) {
perror("semget");
exit(EXIT_FAILURE);
}
}
un.val = 0;
for (i = 0; i < NSET; i++) {
for (j = 0; j < NSEM; j++) {
if (semctl(sid[i], j, SETVAL, un) < 0)
perror("semctl");
}
}
/* Launch threads that operate on semaphore set */
for (i = 0; i < NSEM * NSET * NSET; i++) {
p = fork();
if (p < 0)
perror("fork");
if (p == 0)
thread();
}
/* Free semaphore set */
for (i = 0; i < NSET; i++) {
if (semctl(sid[i], NSEM, IPC_RMID))
perror("IPC_RMID");
}
/* Wait for forked processes to exit */
while (wait(NULL)) {
if (errno == ECHILD)
break;
};
}
int main(int argc, char **argv)
{
pid_t p;
srand(time(NULL));
while (1) {
p = fork();
if (p < 0) {
perror("fork");
exit(EXIT_FAILURE);
}
if (p == 0) {
create_set();
goto end;
}
/* Wait for forked processes to exit */
while (wait(NULL)) {
if (errno == ECHILD)
break;
};
}
end:
return 0;
}
[akpm@linux-foundation.org: use normal comment layout] Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Acked-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Rafael Aquini <aquini@redhat.com> CC: Aristeu Rozanski <aris@redhat.com> Cc: David Jeffery <djeffery@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wanpeng Li [Fri, 14 Aug 2015 22:34:59 +0000 (15:34 -0700)]
mm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held
Hugetlbfs pages will get a refcount in get_any_page() or
madvise_hwpoison() if soft offlining through madvise. The refcount which
is held by the soft offline path should be released if we fail to isolate
hugetlbfs pages.
Fix it by reducing the refcount for both isolation success and failure.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> [3.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wanpeng Li [Fri, 14 Aug 2015 22:34:56 +0000 (15:34 -0700)]
mm/hwpoison: fix page refcount of unknown non LRU page
After trying to drain pages from pagevec/pageset, we try to get reference
count of the page again, however, the reference count of the page is not
reduced if the page is still not on LRU list.
Fix it by adding the put_page() to drop the page reference which is from
__get_any_page().
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> [3.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 14 Aug 2015 18:06:43 +0000 (11:06 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"A single clocksource driver suspend/resume fix"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clockevents/drivers/sh_cmt: Only perform clocksource suspend/resume if enabled
Linus Torvalds [Fri, 14 Aug 2015 17:39:32 +0000 (10:39 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Back from holidays, found these in the cracks: one nouveau revert, one
vmwgfx locking fix and a bunch of exynos fixes"
This commit seems to cause crashes in gk104_fifo_intr_runlist() by
returning 0xbad0da00 when register 0x2a00 is read. Since this commit was
intended for GM20B which is not completely supported yet, let's revert
it for the time being.
Reported-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Tested-by: Afzal Mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Thomas Hellstrom [Wed, 12 Aug 2015 05:31:17 +0000 (22:31 -0700)]
drm/vmwgfx: Fix execbuf locking issues
This addresses two issues that cause problems with viewperf maya-03 in
situation with memory pressure.
The first issue causes attempts to unreserve buffers if batched
reservation fails due to, for example, a signal pending. While previously
the ttm_eu api was resistant against this type of error, it is no longer
and the lockdep code will complain about attempting to unreserve buffers
that are not reserved. The issue is resolved by avoid calling
ttm_eu_backoff_reservation in the buffer reserve error path.
The second issue is that the binding_mutex may be held when user-space
fence objects are created and hence during memory reclaims. This may cause
recursive attempts to grab the binding mutex. The issue is resolved by not
holding the binding mutex across fence creation and submission.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Cc: <stable@vger.kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
Linus Torvalds [Thu, 13 Aug 2015 23:34:56 +0000 (16:34 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Another few small ARM fixes, mostly addressing some VDSO issues"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8410/1: VDSO: fix coarse clock monotonicity regression
ARM: 8409/1: Mark ret_fast_syscall as a function
ARM: 8408/1: Fix the secondary_startup function in Big Endian case
ARM: 8405/1: VDSO: fix regression with toolchains lacking ld.bfd executable
Linus Torvalds [Thu, 13 Aug 2015 23:19:44 +0000 (16:19 -0700)]
x86: fix error handling for 32-bit compat out-of-range system call numbers
Commit 3f5159a9221f ("x86/asm/entry/32: Update -ENOSYS handling to match
the 64-bit logic") broke the ENOSYS handling for the 32-bit compat case.
The proper error return value was never loaded into %rax, except if
things just happened to go through the audit paths, which ended up
reloading the return value.
This moves the loading or %rax into the normal system call path, just to
make sure the error case triggers it. It's kind of sad, since it adds a
useless instruction to reload the register to the fast path, but it's
not like that single load from the stack is going to be noticeable.
Linus Torvalds [Thu, 13 Aug 2015 20:52:46 +0000 (13:52 -0700)]
Merge tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- two stable fixes for corruption seen in a snapshot of thinp metadata;
metadata snapshots aren't widely used but help provide a consistent
view of the metadata associated with an active thin-pool.
- a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"
* tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache policy smq: move 'dm-cache-default' module alias to SMQ
dm btree: add ref counting ops for the leaves of top level btrees
dm thin metadata: delete btrees when releasing metadata snapshot
Linus Torvalds [Thu, 13 Aug 2015 20:44:32 +0000 (13:44 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull xen block driver fixes from Jens Axboe:
"A few small bug fixes for xen-blk{front,back} that have been sitting
over my vacation"
* 'for-linus' of git://git.kernel.dk/linux-block:
xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
xen-blkfront: don't add indirect pages to list when !feature_persistent
xen-blkfront: introduce blkfront_gather_backend_features()