Caesar Wang [Mon, 24 Nov 2014 04:58:59 +0000 (12:58 +0800)]
thermal: rockchip: add driver for thermal
Thermal is TS-ADC Controller module supports
user-defined mode and automatic mode.
User-defined mode refers,TSADC all the control signals entirely by
software writing to register for direct control.
Automaic mode refers to the module automatically poll TSADC output,
and the results were checked.If you find that the temperature High
in a period of time,an interrupt is generated to the processor
down-measures taken;If the temperature over a period of time High,
the resulting TSHUT gave CRU module,let it reset the entire chip,
or via GPIO give PMIC.
Signed-off-by: zhaoyifeng <zyf@rock-chips.com> Signed-off-by: Caesar Wang <caesar.wang@rock-chips.com> Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
There is no longer need to share defines between exynos_tmu.c
and exynos_tmu_data.c (as they are now only used by the former
file) so move them accordingly. Then move externs for struct
exynos_tmu_init_data instances to exynos_tmu.h and remove no
longer needed exynos_tmu_data.h include.
There should be no functional changes caused by this patch.
Maximum theoretical size saving (i.e. with only Exynos5410
SoC support enabled in kernel config so all SoC dependend
Exynos thermal driver code was dropped) is 4096 bytes so
there is no much sense in keeping these ifdefs (especially
given that they are useless once the driver gets updated to
use device tree).
While at it remove needless 'void *' casts.
There should be no functional changes caused by this patch.
thermal: exynos: remove TMU_SUPPORT_ADDRESS_MULTIPLE flag
Replace TMU_SUPPORT_ADDRESS_MULTIPLE flag check in exynos_map_dt_data()
by an explicit check for a SoC type (only Exynos5420 with TRIMINFO
quirk and Exynos5440 have TMU_SUPPORT_ADDRESS_MULTIPLE flag set in
their struct exynos_tmu_init_data instances).
Please note that this requires moving SoC type assignment and verification
from exynos_tmu_probe() to exynos_map_dt_data() so it happens earlier
(which is a good thing in itself).
There should be no functional changes caused by this patch.
thermal: exynos: remove TMU_SUPPORT_EMULATION flag
Replace TMU_SUPPORT_EMULATION flag check in exynos_tmu_set_emulation()
by an explicit check for a SoC type (all SoC types except Exynos4210
have TMU_SUPPORT_EMULATION flag set in their struct exynos_tmu_init_data
instances).
There should be no functional changes caused by this patch.
thermal: exynos: remove TMU_SUPPORT_EMUL_TIME flag
Replace TMU_SUPPORT_EMUL_TIME flag check in get_emul_con_reg()
by an explicit check for a SoC type (all SoC types except
Exynos4210 and Exynos5440 have TMU_SUPPORT_EMUL_TIME flag set
in their struct exynos_tmu_init_data instances).
There should be no functional changes caused by this patch.
thermal: exynos: remove TMU_SUPPORT_FALLING_TRIP flag
Replace TMU_SUPPORT_FALLING_TRIP flag check in
exynos[4210,5440]_tmu_control() by an explicit check
for a SoC type (all SoC types except Exynos4210 have
TMU_SUPPORT_FALLING_TRIP flag set in their struct
exynos_tmu_init_data instances).
There should be no functional changes caused by this patch.
Add ->tmu_clear_irqs method to struct exynos_tmu_data and use
it instead exynos_tmu_clear_irqs(). Then add ->tmu_clear_irqs
implementations for Exynos4210+ and Exynos5440. Finally
remove no longer needed reg->tmu_int[stat,clear] abstractions
and struct exynos_tmu_registers instances.
There should be no functional changes caused by this patch.
Add ->tmu_set_emulation method to struct exynos_tmu_data and
use it in exynos_tmu_set_emulation(). Then add ->tmu_set_emulation
implementations for Exynos4412+ and Exynos5440. Finally remove
no longer needed reg->emul_con abstraction.
There should be no functional changes caused by this patch.
Add ->tmu_read method to struct exynos_tmu_data and use it
in exynos_tmu_control(). Then add ->tmu_read implementations
for Exynos4210, Exynos4412+ and Exynos5440. Finally remove
no longer needed reg->tmu_cur_temp abstractions.
There should be no functional changes caused by this patch.
Add ->tmu_control method to struct exynos_tmu_data and use it
in exynos_tmu_control(). Then add ->tmu_control implementations
for Exynos4210+ and Exynos5440. Finally remove no longer needed
reg->tmu_[ctrl,inten], reg->inten_rise[0,1,2,3]_shift and
reg->inten_fall0_shift abstractions.
There should be no functional changes caused by this patch.
Add ->tmu_initialize method to struct exynos_tmu_data and
use it in exynos_tmu_initialize(). Then add ->tmu_initialize
implementations for Exynos4210, Exynos4412+ and Exynos5440.
Finally remove no longer needed reg->threshold_th[0,1],
reg->intclr_[fall,rise]_shift and reg->intclr_[rise,fall]_mask
abstractions.
There are more improvements available in the future on top
of this patch like merging HW_TRIP level setting with setting
of other levels for Exynos4412+ or adding separate method
for clearing IRQs using INTCLEAR register (for Exynos5420,
Exynos5260 and Exynos4412+).
There should be no functional changes caused by this patch.
thermal: exynos: remove TMU_SUPPORT_TRIM_RELOAD flag
Replace TMU_SUPPORT_TRIM_RELOAD flag check in exynos_tmu_initialize()
by an explicit check for a SoC type (only Exynos3250, Exynos4412 and
Exynos5250 have TMU_SUPPORT_READY_STATUS flag set in their struct
exynos_tmu_init_data instances). Please note that this requires
adding separate SoC type for Exynos5420 so it doesn't get mistaken
with Exynos5250.
This is a preparation for introducing per-SoC type tmu_initialize
method.
There should be no functional changes caused by this patch.
thermal: exynos: remove TMU_SUPPORT_READY_STATUS flag
Replace TMU_SUPPORT_READY_STATUS flag check in
exynos_tmu_initialize() by an explicit check for a SoC type
(all SoC types except Exynos5440 have TMU_SUPPORT_READY_STATUS
flag set in their struct exynos_tmu_init_data instances).
This is a preparation for introducing per-SoC type tmu_initialize
method.
There should be no functional changes caused by this patch.
thermal: exynos: replace threshold_falling check by Exynos SoC type one
Replace pdata->threshold_falling check for non-zero value in
exynos_tmu_initialize() by an explicit check for a SoC type
(all SoC types except Exynos5440 have pdata->threshold_falling
assigned to non-zero value in their struct exynos_tmu_registers
instances).
This is a preparation for introducing per-SoC type tmu_initialize
method.
There should be no functional changes caused by this patch.
Simplify HW_TRIP level setting in exynos_tmu_initialize() (don't
pretend that the current code is hardware and configuration
independent and just do SoC type check explicitly). Then remove
no longer needed reg->threshold_[th2,th3_l0_shift] abstractions
(only assigned for Exynos5440 in exynos5440_tmu_registers) and
EXYNOS_MAX_TRIGGER_PER_REG define.
There should be no functional changes caused by this patch.
thermal: exynos: replace tmu_pmin check by Exynos5440 one
reg->tmu_pmin is set to non-zero value only for Exynos5440
so replace check for non-zero value of reg->tmu_pmin by
explicitly checking for Exynos5440 SoC type. Then remove no
longer needed reg->tmu_pmin register abstraction.
There should be no functional changes caused by this patch.
thermal: exynos: replace tmu_irqstatus check by Exynos5440 one
reg->tmu_irqstatus is set to non-zero value only for Exynos5440
so replace check for non-zero value of reg->tmu_irqstatus by
explicitly checking for Exynos5440 SoC type. Then remove no
longer needed reg->tmu_irqstatus register abstraction.
There should be no functional changes caused by this patch.
reg->emul_time_shift is used only in exynos_tmu_set_emulation()
and accessed only if TMU_SUPPORT_EMUL_TIME flag is set. This
flag is not set for Exynos4210 and Exynos5440 (reg->emul_time_shift
field is not even assigned in exynos[4210,5440]_tmu_registers
and is assigned to identical value for all other SoC types) so
the abstraction is not needed and can be removed.
There should be no functional changes caused by this patch.
reg->emul_temp_shift is used only in exynos_tmu_set_emulation()
and accessed only if TMU_SUPPORT_EMULATION flag is set. This
flag is not set for Exynos4210 (reg->emul_temp_shift field is
not even assigned in exynos4210_tmu_registers and is assigned
to identical value for all other SoC types) so the abstraction
is not needed and can be removed.
There should be no functional changes caused by this patch.
reg->therm_trip_en_shift is used only in exynos_tmu_initialize()
and not accessed on Exynos4210 (also reg->therm_trip_en_shift is
not even assigned in exynos4210_tmu_registers but it is assigned
to identical value for all other SoC types) so the register
abstraction is not needed and can be removed.
There should be no functional changes caused by this patch.
reg->therm_trip_mode_shift and reg->therm_trip_mode_mask are
used only in exynos_tmu_control() and accessed only if
pdata->noise_cancel_mode is non-zero. pdata->noise_cancel
field is not defined on Exynos4210 (also therm_trip_mode_shift
and therm_trip_mode_mask entries are not even assigned in
exynos4210_tmu_registers but they are assigned to identical
values for all other SoC types) so the abstractions are not
needed and can be removed.
There should be no functional changes caused by this patch.
reg->test_mux_addr_shift is used only if pdata->test_mux is
non-zero. pdata->test_mux is defined only on Exynos3250 and
Exynos4412 (other SoC types don't even have pdata->test_mux
entry assigned in their struct exynos_tmu_registers instances)
so the abstraction is not needed and can be removed.
There should be no functional changes caused by this patch.
reg->triminfo_ctrl[] is used in only exynos_tmu_initialize() and
accessed only if TMU_SUPPORT_TRIM_RELOAD flag is set. This flag
is set only on Exynos3250, Exynos4412 and Exynos5250 (other SoC
types don't even have triminfo_ctrl[] entries assigned in their
struct exynos_tmu_registers instances) so the register abstraction
is not needed and can be removed.
There should be no functional changes caused by this patch.
reg->threshold_temp is used only in exynos_tmu_initialize() and
is accessed only on Exynos4210 (other SoC types don't even have
threshold_temp entry assigned in their struct exynos_tmu_registers
instances) so the register abstraction is not needed and can be
removed.
There should be no functional changes caused by this patch.
reg->tmu_status is used only in exynos_tmu_initialize() and it
is accessed only if TMU_SUPPORT_READY_STATUS flag is set. This
flag is not set for Exynos5440 and TMU_STATUS register offset
is identical for all other SoC types so the abstraction is not
needed and can be removed.
There should be no functional changes caused by this patch.
reg->triminfo_data is used only in exynos_tmu_initialize() and
the code has already different paths for Exynos5440 and other
SoC types (on which TRIMINFO_DATA register offset is identical)
so the register abstraction is not needed and can be removed.
There should be no functional changes caused by this patch.
thermal: of: improve of-thermal sensor registration API
Different drivers request API extensions in of-thermal. For this reason,
additional callbacks are required to fit the new drivers needs.
The current API implementation expects the registering sensor driver
to provide a get_temp and get_trend callbacks as function parameters.
As the amount of callbacks is growing, this patch changes the existing
implementation to use a .ops field to hold all the of thermal callbacks
to sensor drivers.
This patch also changes the existing of-thermal users to fit the new
API design. No functional change is introduced in this patch.
This adds support for the Tegra SOCTHERM thermal sensing and management
system found in the Tegra124 system-on-chip. This initial driver supports
temperature polling for four thermal zones.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Mikko Perttunen [Fri, 26 Sep 2014 09:43:12 +0000 (12:43 +0300)]
ARM: tegra: Add thermal trip points for Jetson TK1
This adds critical trip points to the Jetson TK1 device tree.
The device will do a controlled shutdown when either the CPU, GPU
or MEM thermal zone reaches 101 degrees Celsius.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Mikko Perttunen [Fri, 26 Sep 2014 09:43:11 +0000 (12:43 +0300)]
ARM: tegra: Add soctherm and thermal zones to Tegra124 device tree
This adds the soctherm thermal sensing and management unit to the
Tegra124 device tree along with the four thermal zones corresponding
to the four thermal sensors provided by soctherm.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
This patch introduces a new thermal cooling device based on common clock
framework. The original motivation to write this cooling device is to be
able to cool down thermal zones using clocks that feed co-processors, such
as GPUs, DSPs, Image Processing Co-processors, etc. But it is written
in a way that it can be used on top of any clock.
The implementation is pretty straight forward. The code creates
a thermal cooling device based on a pair of a struct device and a clock name.
The struct device is assumed to be usable by the OPP layer. The OPP layer
is used as source of the list of possible frequencies. The (cpufreq) frequency
table is then used as a map from frequencies to cooling states. Cooling
states are indexes to the frequency table.
The logic sits on top of common clock framework, specifically on clock
pre notifications. Any PRE_RATE_CHANGE is hijacked, and the transition is
only allowed when the new rate is within the thermal limit (cooling state -> freq).
When a thermal cooling device state transition is requested, the clock
is also checked to verify if the current clock rate is within the new
thermal limit.
Cc: Zhang Rui <rui.zhang@intel.com> Cc: Mike Turquette <mturquette@linaro.org> Cc: Nishanth Menon <nm@ti.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Len Brown <len.brown@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-pm@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Linus Torvalds [Mon, 17 Nov 2014 00:21:57 +0000 (16:21 -0800)]
Merge tag 'armsoc-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Another small set of fixes:
- some DT compatible typo fixes
- irq setup fix dealing with irq storms on orion
- i2c quirk generalization for mvebu
- a handful of smaller fixes for OMAP
- a couple of added file patterns for OMAP entries in MAINTAINERS"
* tag 'armsoc-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: at91/dt: Fix sama5d3x typos
pinctrl: dra: dt-bindings: Fix output pull up/down
MAINTAINERS: Update entry for omap related .dts files to cover new SoCs
MAINTAINERS: add more files under OMAP SUPPORT
ARM: dts: AM437x-SK-EVM: Fix DCDC3 voltage
ARM: dts: AM437x-GP-EVM: Fix DCDC3 voltage
ARM: dts: AM43x-EPOS-EVM: Fix DCDC3 voltage
ARM: dts: am335x-evm: Fix 5th NAND partition's name
ARM: orion: Fix for certain sequence of request_irq can cause irq storm
ARM: mvebu: armada xp: Generalize use of i2c quirk
1) Fix NULL oops in Schizo PCI controller error handler.
2) Fix race between xchg and other operations on 32-bit sparc, from
Andreas Larsson.
3) swab*() helpers need a dummy memory input operand to show data flow
on 64-bit sparc.
4) Fix RCU warnings due to missing irq_{enter,exit}() around
generic_smp_call_function*() calls.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix constraints on swab helpers.
sparc32: Implement xchg and atomic_xchg using ATOMIC_HASH locks
sparc64: Do irq_{enter,exit}() around generic_smp_call_function*().
sparc64: Fix crashes in schizo_pcierr_intr_other().
Olof Johansson [Sun, 16 Nov 2014 23:09:53 +0000 (15:09 -0800)]
Merge tag 'omap-fixes-against-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Merge "omap fixes against v3.18-rc4" from Tony Lindgren:
Few omap fixes for hangs and wrong pinctrl defines, and update
MAINTAINERS file to avoid missing PMIC and SoC related patches:
- Fix random hangs on am437x because of incorrect default
value for the DDR regulator
- Fix wrong partition name for NAND on am335x-evm
- Fix wrong pinctrl defines for dra7xx
- Update maintainers entries for PMICs and SoCs
* tag 'omap-fixes-against-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
pinctrl: dra: dt-bindings: Fix output pull up/down
MAINTAINERS: Update entry for omap related .dts files to cover new SoCs
MAINTAINERS: add more files under OMAP SUPPORT
ARM: dts: AM437x-SK-EVM: Fix DCDC3 voltage
ARM: dts: AM437x-GP-EVM: Fix DCDC3 voltage
ARM: dts: AM43x-EPOS-EVM: Fix DCDC3 voltage
ARM: dts: am335x-evm: Fix 5th NAND partition's name
Olof Johansson [Sun, 16 Nov 2014 23:07:37 +0000 (15:07 -0800)]
Merge tag 'mvebu-fixes-3.18' of git://git.infradead.org/linux-mvebu into fixes
Merge "mvebu fixes for v3.18" from Jason Cooper:
- Armada XP
- Generalize i2c quirk
- orion
- Fix irq storm caused by specific sequence of request_irq
* tag 'mvebu-fixes-3.18' of git://git.infradead.org/linux-mvebu:
ARM: orion: Fix for certain sequence of request_irq can cause irq storm
ARM: mvebu: armada xp: Generalize use of i2c quirk
NeilBrown [Tue, 28 Oct 2014 21:49:50 +0000 (08:49 +1100)]
md: Always set RECOVERY_NEEDED when clearing RECOVERY_FROZEN
md_check_recovery will skip any recovery and also clear
MD_RECOVERY_NEEDED if MD_RECOVERY_FROZEN is set.
So when we clear _FROZEN, we must set _NEEDED and ensure that
md_check_recovery gets run.
Otherwise we could miss out on something that is needed.
In particular, this can make it impossible to remove a
failed device from an array is the 'recovery-needed' processing
didn't happen.
Suitable for stable kernels since 3.13.
Cc: stable@vger.kernel.org (3.13+) Reported-and-tested-by: Joe Lawrence <joe.lawrence@stratus.com> Fixes: 30b8feb730f9b9b3c5de02580897da03f59b6b16 Signed-off-by: NeilBrown <neilb@suse.de>
Linus Torvalds [Sun, 16 Nov 2014 19:36:54 +0000 (11:36 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of six fixes and a MAINTAINER update.
The fixes are two multipath (one in Test Unit Ready handling for the
path checkers and one in the section of code that sends a start unit
after failover; both of these were perturbed by the scsi-mq update), a
CD-ROM door locking fix that was likewise introduced by scsi-mq and
three driver fixes for a previous code update in cxgb4i, megaraid_sas
and bnx2fc"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
bnx2fc: fix tgt spinlock locking
megaraid_sas: fix bug in handling return value of pci_enable_msix_range()
cxgb4i: send abort_rpl correctly
cxgbi: add maintainer for cxgb3i/cxgb4i
scsi: TUR path is down after adapter gets reset with multipath
scsi: call device handler for failed TUR command
scsi: only re-lock door after EH on devices that were reset
Linus Torvalds [Sun, 16 Nov 2014 19:19:25 +0000 (11:19 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Microcode fixes, a Xen fix and a KASLR boot loading fix with certain
memory layouts"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, microcode, AMD: Fix ucode patch stashing on 32-bit
x86/core, x86/xen/smp: Use 'die_complete' completion when taking CPU down
x86, microcode: Fix accessing dis_ucode_ldr on 32-bit
x86, kaslr: Prevent .bss from overlaping initrd
x86, microcode, AMD: Fix early ucode loading on 32-bit
Linus Torvalds [Sun, 16 Nov 2014 19:00:42 +0000 (11:00 -0800)]
x86-64: make csum_partial_copy_from_user() error handling consistent
Al Viro pointed out that the x86-64 csum_partial_copy_from_user() is
somewhat confused about what it should do on errors, notably it mostly
clears the uncopied end result buffer, but misses that for the initial
alignment case.
All users should check for errors, so it's dubious whether the clearing
is even necessary, and Al also points out that we should probably clean
up the calling conventions, but regardless of any future changes to this
function, the fact that it is inconsistent is just annoying.
So make the __get_user() failure path use the same error exit as all the
other errors do.
Reported-by: Al Viro <viro@zeniv.linux.org.uk> Cc: David Miller <davem@davemloft.net> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 15 Nov 2014 23:45:07 +0000 (15:45 -0800)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Two fixes this time, one to ensure that the kuser helper option
depends on MMU as they aren't available for noMMU targets (and if the
option is selected, we end up oopsing.)
The second fix plugs a corner case with the decompressor, ensuring
that the instruction stream can see the relocated code in every case
on ARMv7 CPUs"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8198/1: make kuser helpers depend on MMU
ARM: 8191/1: decompressor: ensure I-side picks up relocated code
Linus Torvalds [Sat, 15 Nov 2014 23:26:48 +0000 (15:26 -0800)]
Merge branch 'parisc-3.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"Changes include:
- wire up the bpf syscall
- remove CONFIG_64BIT usage from some userspace-exported header files
- use compat functions for msgctl, shmat, shmctl and semtimedop
syscalls"
* 'parisc-3.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Avoid using CONFIG_64BIT in userspace exported headers
parisc: Use compat layer for msgctl, shmat, shmctl and semtimedop syscalls
parisc: Use BUILD_BUG() instead of undefined functions
parisc: Wire up bpf syscall
Linus Torvalds [Sat, 15 Nov 2014 22:58:40 +0000 (14:58 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm gixes from Dave Airlie:
- exynos: infinite loop regressions fixed
- i915: one regression
- radeon: one race condition on monitor probing
- noveau: two regressions
- tegra: one vblank regression fix
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/tegra: dc: Add missing call to drm_vblank_on()
drm/nouveau/nv50/disp: Fix modeset on G94
drm/gk20a/fb: fix setting of large page size bit
drm/radeon: add locking around atombios scratch space usage
drm/i915: Fix obj->map_and_fenceable across tiling changes
drm/exynos: fix possible infinite loop issue
drm/exynos: g2d: fix null pointer dereference
drm/exynos: resolve infinite loop issue on non multi-platform
drm/exynos: resolve infinite loop issue on multi-platform
Sasha Levin reports:
"gcc5 changes the default standard to c11, which makes kernel build
unhappy
Explicitly define the kernel standard to be gnu89 which should keep
everything working exactly like it was before gcc5"
There are multiple small issues with the new default, but the biggest
issue seems to be that the old - and very useful - GNU extension to
allow a cast in front of an initializer has gone away.
Patch updated by Kirill:
"I'm pretty sure all gcc versions you can build kernel with supports
-std=gnu89. cc-option is redunrant.
We also need to adjust HOSTCFLAGS otherwise allmodconfig fails for me"
Note by Andrew Pinski:
"Yes it was reported and both problems relating to this extension has
been added to gnu99 and gnu11. Though there are other issues with the
kernel dealing with extern inline have different semantics between
gnu89 and gnu99/11"
End result: we may be able to move up to a newer stdc model eventually,
but right now the newer models have some annoying deficiencies, so the
traditional "gnu89" model ends up being the preferred one.
Linus Torvalds [Sat, 15 Nov 2014 22:15:16 +0000 (14:15 -0800)]
Merge tag 'nfs-for-3.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
- stable patches to fix NFSv4.x delegation reclaim error paths
- fix a bug whereby we were advertising NFSv4.1 but using NFSv4.2
features
- fix a use-after-free problem with pNFS block layouts
- fix a memory leak in the pNFS files O_DIRECT code
- replace an intrusive and Oops-prone performance fix in the NFSv4
atomic open code with a safer one-line version and revert the two
original patches"
* tag 'nfs-for-3.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
sunrpc: fix sleeping under rcu_read_lock in gss_stringify_acceptor
NFS: Don't try to reclaim delegation open state if recovery failed
NFSv4: Ensure that we call FREE_STATEID when NFSv4.x stateids are revoked
NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return
NFSv4.1: nfs41_clear_delegation_stateid shouldn't trust NFS_DELEGATED_STATE
NFSv4: Ensure that we remove NFSv4.0 delegations when state has expired
NFS: SEEK is an NFS v4.2 feature
nfs: Fix use of uninitialized variable in nfs_getattr()
nfs: Remove bogus assignment
nfs: remove spurious WARN_ON_ONCE in write path
pnfs/blocklayout: serialize GETDEVICEINFO calls
nfs: fix pnfs direct write memory leak
Revert "NFS: nfs4_do_open should add negative results to the dcache."
Revert "NFS: remove BUG possibility in nfs4_open_and_get_state"
NFSv4: Ensure nfs_atomic_open set the dentry verifier on ENOENT
Linus Torvalds [Fri, 14 Nov 2014 22:31:54 +0000 (14:31 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem updates from Dmitry Torokhov:
"Mostly small fixups to PS/2 tochpad drivers (ALPS, Elantech,
Synaptics) to better deal with specific hardware"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elantech - update the documentation
Input: elantech - provide a sysfs knob for crc_enabled
Input: elantech - report the middle button of the touchpad
Input: alps - ignore bad data on Dell Latitudes E6440 and E7440
Input: alps - allow up to 2 invalid packets without resetting device
Input: alps - ignore potential bare packets when device is out of sync
Input: elantech - fix crc_enabled for Fujitsu H730
Input: elantech - use elantech_report_trackpoint for hardware v4 too
Input: twl4030-pwrbutton - ensure a wakeup event is recorded.
Input: synaptics - add min/max quirk for Lenovo T440s
Linus Torvalds [Fri, 14 Nov 2014 22:24:33 +0000 (14:24 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- fix EFI stub cache maintenance causing aborts during boot on certain
platforms
- handle byte stores in __clear_user without panicking
- fix race condition in aarch64_insn_patch_text_sync() (instruction
patching)
- Couple of type fixes
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: ARCH_PFN_OFFSET should be unsigned long
Correct the race condition in aarch64_insn_patch_text_sync()
arm64: __clear_user: handle exceptions on strb
arm64: Fix data type for physical address
arm64: efi: Fix stub cache maintenance
Linus Torvalds [Fri, 14 Nov 2014 22:20:46 +0000 (14:20 -0800)]
Merge tag 'platform-drivers-x86-v3.18-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
Pull x86 platform drivers fixlets from Darren Hart:
"Just two patches to remove hp_accel events from the keyboard bus
stream via an i8042 filter"
* tag 'platform-drivers-x86-v3.18-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
platform: hp_accel: Add SERIO_I8042 as a dependency since it now includes i8042.h/serio.h
platform: hp_accel: add a i8042 filter to remove HPQ6000 data from kb bus stream
Linus Torvalds [Fri, 14 Nov 2014 22:09:19 +0000 (14:09 -0800)]
Merge branch 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"The most notable is the revert of lock splitting optimization in ahci.
This also made the IRQ handling threaded even when there's only one
IRQ in use. The conversion missed IRFQ_SHARED leading to screaming
IRQs problem in some cases and the threaded IRQ handling showed
performance regression in some LKP test cases. The changes are
reverted for now. It'll probably be retried once threaded IRQ
handling is removed from ahci.
Other than that, there's one fix for ahci and several patches adding
device IDs"
* 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: fix AHCI parameters not taken into account
ata: sata_rcar: Add r8a7793 device support
ahci: Add Device IDs for Intel Sunrise Point PCH
ahci: disable MSI instead of NCQ on Samsung pci-e SSDs on macbooks
Revert "AHCI: Optimize single IRQ interrupt processing"
Revert "AHCI: Do not acquire ata_host::lock from single IRQ handler"
ata: sata_rcar: Disable DIPM mode for r8a7790 ES1
Linus Torvalds [Fri, 14 Nov 2014 22:03:23 +0000 (14:03 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"Four small fixes that should be merged for the current 3.18-rc series.
This pull request contains:
- a minor bugfix for computation of best IO priority given two
merging requests. From Jan Kara.
- the final (final) merge count issue that has been plaguing
virtio-blk. From Ming Lei.
- enable parallel reinit notify for blk-mq queues, to combine the
cost of an RCU grace period across lots of devices. From Tejun
Heo.
- an error handling fix for the SCSI_IOCTL_SEND_COMMAND ioctl. From
Tony Battersby"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: blk-merge: fix blk_recount_segments()
scsi: Fix more error handling in SCSI_IOCTL_SEND_COMMAND
blk-mq: make mq_queue_reinit_notify() freeze queues in parallel
block: Fix computation of merged request priority
Linus Torvalds [Fri, 14 Nov 2014 21:38:02 +0000 (13:38 -0800)]
Merge tag 'pm+acpi-3.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These are three regression fixes, two recent (generic power domains,
suspend-to-idle) and one older (cpufreq), an ACPI blacklist entry for
one more machine having problems with Windows 8 compatibility, a minor
cpufreq driver fix (cpufreq-dt) and a fixup for new callback
definitions (generic power domains).
Specifics:
- Fix a crash in the suspend-to-idle code path introduced by a recent
commit that forgot to check a pointer against NULL before
dereferencing it (Dmitry Eremin-Solenikov).
- Fix a boot crash on Exynos5 introduced by a recent commit making
that platform use generic Device Tree bindings for power domains
which exposed a weakness in the generic power domains framework
leading to that crash (Ulf Hansson).
- Fix a crash during system resume on systems where cpufreq depends
on Operation Performance Points (OPP) for functionality, but
CONFIG_OPP is not set. This leads the cpufreq driver registration
to fail, but the resume code attempts to restore the pre-suspend
cpufreq configuration (which does not exist) nevertheless and
crashes. From Geert Uytterhoeven.
- Add a new ACPI blacklist entry for Dell Vostro 3546 that has
problems if it is reported as Windows 8 compatible to the BIOS
(Adam Lee).
- Fix swapped arguments in an error message in the cpufreq-dt driver
(Abhilash Kesavan).
- Fix up the prototypes of new callbacks in struct generic_pm_domain
to make them more useful. Users of those callbacks will be added
in 3.19 and it's better for them to be based on the correct struct
definition in mainline from the start. From Ulf Hansson and Kevin
Hilman"
* tag 'pm+acpi-3.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Domains: Fix initial default state of the need_restore flag
PM / sleep: Fix entering suspend-to-IDLE if no freeze_oops is set
PM / Domains: Change prototype for the attach and detach callbacks
cpufreq: Avoid crash in resume on SMP without OPP
cpufreq: cpufreq-dt: Fix arguments in clock failure error message
ACPI / blacklist: blacklist Win8 OSI for Dell Vostro 3546
Linus Torvalds [Fri, 14 Nov 2014 20:44:48 +0000 (12:44 -0800)]
Merge tag 'firewire-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fix from Stefan Richter:
"IEEE 1394 (FireWire) subsystem fix: The character device file
interface for raw 1394 I/O took uninitialized kernel stack as
substitute for missing ioctl() argument data. This could partially
show up in subsequent read() output"
* tag 'firewire-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: cdev: prevent kernel stack leaking into ioctl arguments
Stefan Richter [Tue, 11 Nov 2014 16:16:44 +0000 (17:16 +0100)]
firewire: cdev: prevent kernel stack leaking into ioctl arguments
Found by the UC-KLEE tool: A user could supply less input to
firewire-cdev ioctls than write- or write/read-type ioctl handlers
expect. The handlers used data from uninitialized kernel stack then.
This could partially leak back to the user if the kernel subsequently
generated fw_cdev_event_'s (to be read from the firewire-cdev fd)
which notably would contain the _u64 closure field which many of the
ioctl argument structures contain.
The fact that the handlers would act on random garbage input is a
lesser issue since all handlers must check their input anyway.
The fix simply always null-initializes the entire ioctl argument buffer
regardless of the actual length of expected user input. That is, a
runtime overhead of memset(..., 40) is added to each firewirew-cdev
ioctl() call. [Comment from Clemens Ladisch: This part of the stack is
most likely to be already in the cache.]
Remarks:
- There was never any leak from kernel stack to the ioctl output
buffer itself. IOW, it was not possible to read kernel stack by a
read-type or write/read-type ioctl alone; the leak could at most
happen in combination with read()ing subsequent event data.
- The actual expected minimum user input of each ioctl from
include/uapi/linux/firewire-cdev.h is, in bytes:
[0x00] = 32, [0x05] = 4, [0x0a] = 16, [0x0f] = 20, [0x14] = 16,
[0x01] = 36, [0x06] = 20, [0x0b] = 4, [0x10] = 20, [0x15] = 20,
[0x02] = 20, [0x07] = 4, [0x0c] = 0, [0x11] = 0, [0x16] = 8,
[0x03] = 4, [0x08] = 24, [0x0d] = 20, [0x12] = 36, [0x17] = 12,
[0x04] = 20, [0x09] = 24, [0x0e] = 4, [0x13] = 40, [0x18] = 4.
Reported-by: David Ramos <daramos@stanford.edu> Cc: <stable@vger.kernel.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
1) sunhme driver lacks DMA mapping error checks, based upon a report by
Meelis Roos.
2) Fix memory leak in mvpp2 driver, from Sudip Mukherjee.
3) DMA memory allocation sizes are wrong in systemport ethernet driver,
fix from Florian Fainelli.
4) Fix use after free in mac80211 defragmentation code, from Johannes
Berg.
5) Some networking uapi headers missing from Kbuild file, from Stephen
Hemminger.
6) TUN driver gets csum_start offset wrong when VLAN accel is enabled,
and macvtap has a similar bug, from Herbert Xu.
7) Adjust several tunneling drivers to set dev->iflink after registry,
because registry sets that to -1 overwriting whatever we did. From
Steffen Klassert.
8) Geneve forgets to set inner tunneling type, causing GSO segmentation
to fail on some NICs. From Jesse Gross.
9) Fix several locking bugs in stmmac driver, from Fabrice Gasnier and
Giuseppe CAVALLARO.
10) Fix spurious timeouts with NewReno on low traffic connections, from
Marcelo Leitner.
11) Fix descriptor updates in enic driver, from Govindarajulu
Varadarajan.
12) PPP calls bpf_prog_create() with locks held, which isn't kosher.
Fix from Takashi Iwai.
13) Fix NULL deref in SCTP with malformed INIT packets, from Daniel
Borkmann.
14) psock_fanout selftest accesses past the end of the mmap ring, fix
from Shuah Khan.
15) Fix PTP timestamping for VLAN packets, from Richard Cochran.
16) netlink_unbind() calls in netlink pass wrong initial argument, from
Hiroaki SHIMODA.
17) vxlan socket reuse accidently reuses a socket when the address
family is different, so we have to explicitly check this, from
Marcelo Lietner.
18) Fix missing include in nft_reject_bridge.c breaking the build on ppc
and other architectures, from Guenter Roeck.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
vxlan: Do not reuse sockets for a different address family
smsc911x: power-up phydev before doing a software reset.
lib: rhashtable - Remove weird non-ASCII characters from comments
net/smsc911x: Fix delays in the PHY enable/disable routines
net/smsc911x: Fix rare soft reset timeout issue due to PHY power-down mode
netlink: Properly unbind in error conditions.
net: ptp: fix time stamp matching logic for VLAN packets.
cxgb4 : dcb open-lldp interop fixes
selftests/net: psock_fanout seg faults in sock_fanout_read_ring()
net: bcmgenet: apply MII configuration in bcmgenet_open()
net: bcmgenet: connect and disconnect from the PHY state machine
net: qualcomm: Fix dependency
ixgbe: phy: fix uninitialized status in ixgbe_setup_phy_link_tnx
net: phy: Correctly handle MII ioctl which changes autonegotiation.
ipv6: fix IPV6_PKTINFO with v4 mapped
net: sctp: fix memory leak in auth key management
net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet
net: ppp: Don't call bpf_prog_create() in ppp_lock
net/mlx4_en: Advertize encapsulation offloads features only when VXLAN tunnel is set
cxgb4 : Fix bug in DCB app deletion
...
Ulrik De Bie [Fri, 14 Nov 2014 01:48:06 +0000 (17:48 -0800)]
Input: elantech - update the documentation
A chapter is added to describe the trackpoint packets.
A section is added to describe the behaviour of the knob crc_enabled in
sysfs.
The introduction of the documentation only mentioned v1/v2, but in the
last part it already contains explanation of v3 and v4. The introduction
is updated.
Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Ulrik De Bie [Fri, 14 Nov 2014 01:47:04 +0000 (17:47 -0800)]
Input: elantech - provide a sysfs knob for crc_enabled
The detection of crc_enabled is known to fail for Fujitsu H730. A DMI
blacklist is added for that, but it can be expected that other laptops will
pop up with this.
Here a sysfs knob is provided to alter the behaviour of crc_enabled.
Writing 0 or 1 to it sets the variable to 0 or 1. Reading it will show the
crc_enabled variable (0 or 1).
Reported-by: Stefan Valouch <stefan@valouch.com> Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Ulrik De Bie [Fri, 14 Nov 2014 01:45:12 +0000 (17:45 -0800)]
Input: elantech - report the middle button of the touchpad
In the past, no elantech was known with 3 touchpad mouse buttons.
Fujitsu H730 is the first known elantech with a middle button. This commit
enables this middle button. For backwards compatibility, the Fujitsu is
detected via DMI, and only for this one 3 buttons will be announced.
Reported-by: Stefan Valouch <stefan@valouch.com> Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Pali Rohár [Sun, 9 Nov 2014 07:36:09 +0000 (23:36 -0800)]
Input: alps - ignore bad data on Dell Latitudes E6440 and E7440
Sometimes on Dell Latitude laptops psmouse/alps driver receive invalid ALPS
protocol V3 packets with bit7 set in last byte. More often it can be
reproduced on Dell Latitude E6440 or E7440 with closed lid and pushing
cover above touchpad.
If bit7 in last packet byte is set then it is not valid ALPS packet. I was
told that ALPS devices never send these packets. It is not know yet who
send those packets, it could be Dell EC, bug in BIOS and also bug in
touchpad firmware...
With this patch alps driver does not process those invalid packets, but
instead of reporting PSMOUSE_BAD_DATA, getting into out of sync state,
getting back in sync with the next byte and spam dmesg we return
PSMOUSE_FULL_PACKET. If driver is truly out of sync we'll fail the checks
on the next byte and report PSMOUSE_BAD_DATA then.
Linus Torvalds [Fri, 14 Nov 2014 00:57:25 +0000 (16:57 -0800)]
Merge branch 'akpm' (fixes from Andrew Morton)
Merge misc fixes from Andrew Morton:
"15 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MAINTAINERS: add IIO include files
kernel/panic.c: update comments for print_tainted
mem-hotplug: reset node present pages when hot-adding a new pgdat
mem-hotplug: reset node managed pages when hot-adding a new pgdat
mm/debug-pagealloc: correct freepage accounting and order resetting
fanotify: fix notification of groups with inode & mount marks
mm, compaction: prevent infinite loop in compact_zone
mm: alloc_contig_range: demote pages busy message from warn to info
mm/slab: fix unalignment problem on Malta with EVA due to slab merge
mm/page_alloc: restrict max order of merging on isolated pageblock
mm/page_alloc: move freepage counting logic to __free_one_page()
mm/page_alloc: add freepage on isolate pageblock to correct buddy list
mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype
mm/compaction: skip the range until proper target pageblock is met
zram: avoid kunmap_atomic() of a NULL pointer
Linus Torvalds [Fri, 14 Nov 2014 00:36:42 +0000 (16:36 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
"There is an overflow bug fix for cephfs from Zheng, a fix for handling
large authentication ticket buffers in libceph from Ilya, and a few
fixes for the request handling code from Ilya that affect RBD volumes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: change from BUG to WARN for __remove_osd() asserts
libceph: clear r_req_lru_item in __unregister_linger_request()
libceph: unlink from o_linger_requests when clearing r_osd
libceph: do not crash on large auth tickets
ceph: fix flush tid comparision
Linus Torvalds [Fri, 14 Nov 2014 00:19:14 +0000 (16:19 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- fix for an oops in HID core upon repeated subdriver insertion/removal
under certain circumstances, by Benjamin Tissoires
- quirk for another Elan Touchscreen device, by Adel Gadllah
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: core: cleanup .claimed field on disconnect
HID: usbhid: enable always-poll quirk for Elan Touchscreen 0103
Xie XiuQi [Thu, 13 Nov 2014 23:19:44 +0000 (15:19 -0800)]
kernel/panic.c: update comments for print_tainted
Commit 69361eef9056 ("panic: add TAINT_SOFTLOCKUP") added the 'L' flag,
but failed to update the comments for print_tainted(). So, update the
comments.
Tang Chen [Thu, 13 Nov 2014 23:19:41 +0000 (15:19 -0800)]
mem-hotplug: reset node present pages when hot-adding a new pgdat
When memory is hot-added, all the memory is in offline state. So clear
all zones' present_pages because they will be updated in online_pages()
and offline_pages(). Otherwise, /proc/zoneinfo will corrupt:
Tang Chen [Thu, 13 Nov 2014 23:19:39 +0000 (15:19 -0800)]
mem-hotplug: reset node managed pages when hot-adding a new pgdat
In free_area_init_core(), zone->managed_pages is set to an approximate
value for lowmem, and will be adjusted when the bootmem allocator frees
pages into the buddy system.
But free_area_init_core() is also called by hotadd_new_pgdat() when
hot-adding memory. As a result, zone->managed_pages of the newly added
node's pgdat is set to an approximate value in the very beginning.
Even if the memory on that node has node been onlined,
/sys/device/system/node/nodeXXX/meminfo has wrong value:
This patch fixes this problem by reset node managed pages to 0 after
hot-adding a new node.
1. Move reset_managed_pages_done from reset_node_managed_pages() to
reset_all_zones_managed_pages()
2. Make reset_node_managed_pages() non-static
3. Call reset_node_managed_pages() in hotadd_new_pgdat() after pgdat
is initialized
Joonsoo Kim [Thu, 13 Nov 2014 23:19:36 +0000 (15:19 -0800)]
mm/debug-pagealloc: correct freepage accounting and order resetting
One thing I did in this patch is fixing freepage accounting. If we
clear guard page and link it onto isolate buddy list, we should not
increase freepage count. This patch adds conditional branch to skip
counting in this case. Without this patch, this overcounting happens
frequently if guard order is set and CMA is used.
Another thing fixed in this patch is the target to reset order. In
__free_one_page(), we check the buddy page whether it is a guard page or
not. And, if so, we should clear guard attribute on the buddy page and
reset order of it to 0. But, current code resets original page's order
rather than buddy one's. Maybe, this doesn't have any problem, because
whole merged page's order will be re-assigned soon. But, it is better
to correct code.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Gioh Kim <gioh.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Thu, 13 Nov 2014 23:19:33 +0000 (15:19 -0800)]
fanotify: fix notification of groups with inode & mount marks
fsnotify() needs to merge inode and mount marks lists when notifying
groups about events so that ignore masks from inode marks are reflected
in mount mark notifications and groups are notified in proper order
(according to priorities).
Currently the sorting of the lists done by fsnotify_add_inode_mark() /
fsnotify_add_vfsmount_mark() and fsnotify() differed which resulted
ignore masks not being used in some cases.
Fix the problem by always using the same comparison function when
sorting / merging the mark lists.
Thanks to Heinrich Schuchardt for improvements of my patch.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=87721 Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Thu, 13 Nov 2014 23:19:30 +0000 (15:19 -0800)]
mm, compaction: prevent infinite loop in compact_zone
Several people have reported occasionally seeing processes stuck in
compact_zone(), even triggering soft lockups, in 3.18-rc2+.
Testing a revert of commit e14c720efdd7 ("mm, compaction: remember
position within pageblock in free pages scanner") fixed the issue,
although the stuck processes do not appear to involve the free scanner.
Finally, by code inspection, the bug was found in isolate_migratepages()
which uses a slightly different condition to detect if the migration and
free scanners have met, than compact_finished(). That has not been a
problem until commit e14c720efdd7 allowed the free scanner position
between individual invocations to be in the middle of a pageblock.
In a relatively rare case, the migration scanner position can end up at
the beginning of a pageblock, with the free scanner position in the
middle of the same pageblock. If it's the migration scanner's turn,
isolate_migratepages() exits immediately (without updating the
position), while compact_finished() decides to continue compaction,
resulting in a potentially infinite loop. The system can recover only
if another process creates enough high-order pages to make the watermark
checks in compact_finished() pass.
This patch fixes the immediate problem by bumping the migration
scanner's position to meet the free scanner in isolate_migratepages(),
when both are within the same pageblock. This causes compact_finished()
to terminate properly. A more robust check in compact_finished() is
planned as a cleanup for better future maintainability.
Fixes: e14c720efdd73 ("mm, compaction: remember position within pageblock in free pages scanner) Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: P. Christeas <xrg@linux.gr> Tested-by: P. Christeas <xrg@linux.gr> Link: http://marc.info/?l=linux-mm&m=141508604232522&w=2 Reported-by: Norbert Preining <preining@logic.at> Tested-by: Norbert Preining <preining@logic.at> Link: https://lkml.org/lkml/2014/11/4/904 Reported-by: Pavel Machek <pavel@ucw.cz> Link: https://lkml.org/lkml/2014/11/7/164 Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: alloc_contig_range: demote pages busy message from warn to info
Having test_pages_isolated failure message as a warning confuses users
into thinking that it is more serious than it really is. In reality, if
called via CMA, allocation will be retried so a single
test_pages_isolated failure does not prevent allocation from succeeding.
Demote the warning message to an info message and reformat it such that
the text "failed" does not appear and instead a less worrying "PFNS
busy" is used.
This message is trivially reproducible on a 10GB x86 machine on 3.16.y
kernels configured with CONFIG_DMA_CMA.
Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Thu, 13 Nov 2014 23:19:25 +0000 (15:19 -0800)]
mm/slab: fix unalignment problem on Malta with EVA due to slab merge
Unlike SLUB, sometimes, object isn't started at the beginning of the
slab in SLAB. This causes the unalignment problem after slab merging is
supported by commit 12220dea07f1 ("mm/slab: support slab merge").
Following is the report from Markos that fail to boot on Malta with EVA.
alloc_unbound_pwq() allocates slab object from pool_workqueue. This
kmem_cache requires 256 bytes alignment, but, current merging code
doesn't honor that, and merge it with kmalloc-256. kmalloc-256 requires
only cacheline size alignment so that above failure occurs. However, in
x86, kmalloc-256 is luckily aligned in 256 bytes, so the problem didn't
happen on it.
To fix this problem, this patch introduces alignment mismatch check in
find_mergeable(). This will fix the problem.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reported-by: Markos Chandras <Markos.Chandras@imgtec.com> Tested-by: Markos Chandras <Markos.Chandras@imgtec.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Thu, 13 Nov 2014 23:19:21 +0000 (15:19 -0800)]
mm/page_alloc: restrict max order of merging on isolated pageblock
Current pageblock isolation logic could isolate each pageblock
individually. This causes freepage accounting problem if freepage with
pageblock order on isolate pageblock is merged with other freepage on
normal pageblock. We can prevent merging by restricting max order of
merging to pageblock order if freepage is on isolate pageblock.
A side-effect of this change is that there could be non-merged buddy
freepage even if finishing pageblock isolation, because undoing
pageblock isolation is just to move freepage from isolate buddy list to
normal buddy list rather than to consider merging. So, the patch also
makes undoing pageblock isolation consider freepage merge. When
un-isolation, freepage with more than pageblock order and it's buddy are
checked. If they are on normal pageblock, instead of just moving, we
isolate the freepage and free it in order to get merged.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Heesub Shin <heesub.shin@samsung.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Ritesh Harjani <ritesh.list@gmail.com> Cc: Gioh Kim <gioh.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Thu, 13 Nov 2014 23:19:18 +0000 (15:19 -0800)]
mm/page_alloc: move freepage counting logic to __free_one_page()
All the caller of __free_one_page() has similar freepage counting logic,
so we can move it to __free_one_page(). This reduce line of code and
help future maintenance.
This is also preparation step for "mm/page_alloc: restrict max order of
merging on isolated pageblock" which fix the freepage counting problem
on freepage with more than pageblock order.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Heesub Shin <heesub.shin@samsung.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Ritesh Harjani <ritesh.list@gmail.com> Cc: Gioh Kim <gioh.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Thu, 13 Nov 2014 23:19:14 +0000 (15:19 -0800)]
mm/page_alloc: add freepage on isolate pageblock to correct buddy list
In free_pcppages_bulk(), we use cached migratetype of freepage to
determine type of buddy list where freepage will be added. This
information is stored when freepage is added to pcp list, so if
isolation of pageblock of this freepage begins after storing, this
cached information could be stale. In other words, it has original
migratetype rather than MIGRATE_ISOLATE.
There are two problems caused by this stale information.
One is that we can't keep these freepages from being allocated.
Although this pageblock is isolated, freepage will be added to normal
buddy list so that it could be allocated without any restriction. And
the other problem is incorrect freepage accounting. Freepages on
isolate pageblock should not be counted for number of freepage.
Following is the code snippet in free_pcppages_bulk().
/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
__free_one_page(page, page_to_pfn(page), zone, 0, mt);
trace_mm_page_pcpu_drain(page, 0, mt);
if (likely(!is_migrate_isolate_page(page))) {
__mod_zone_page_state(zone, NR_FREE_PAGES, 1);
if (is_migrate_cma(mt))
__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
}
As you can see above snippet, current code already handle second
problem, incorrect freepage accounting, by re-fetching pageblock
migratetype through is_migrate_isolate_page(page).
But, because this re-fetched information isn't used for
__free_one_page(), first problem would not be solved. This patch try to
solve this situation to re-fetch pageblock migratetype before
__free_one_page() and to use it for __free_one_page().
In addition to move up position of this re-fetch, this patch use
optimization technique, re-fetching migratetype only if there is isolate
pageblock. Pageblock isolation is rare event, so we can avoid
re-fetching in common case with this optimization.
This patch also correct migratetype of the tracepoint output.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Heesub Shin <heesub.shin@samsung.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Ritesh Harjani <ritesh.list@gmail.com> Cc: Gioh Kim <gioh.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Thu, 13 Nov 2014 23:19:11 +0000 (15:19 -0800)]
mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype
Before describing bugs itself, I first explain definition of freepage.
1. pages on buddy list are counted as freepage.
2. pages on isolate migratetype buddy list are *not* counted as freepage.
3. pages on cma buddy list are counted as CMA freepage, too.
Now, I describe problems and related patch.
Patch 1: There is race conditions on getting pageblock migratetype that
it results in misplacement of freepages on buddy list, incorrect
freepage count and un-availability of freepage.
Patch 2: Freepages on pcp list could have stale cached information to
determine migratetype of buddy list to go. This causes misplacement of
freepages on buddy list and incorrect freepage count.
Patch 4: Merging between freepages on different migratetype of
pageblocks will cause freepages accouting problem. This patch fixes it.
Without patchset [3], above problem doesn't happens on my CMA allocation
test, because CMA reserved pages aren't used at all. So there is no
chance for above race.
With patchset [3], I did simple CMA allocation test and get below
result:
- Virtual machine, 4 cpus, 1024 MB memory, 256 MB CMA reservation
- run kernel build (make -j16) on background
- 30 times CMA allocation(8MB * 30 = 240MB) attempts in 5 sec interval
- Result: more than 5000 freepage count are missed
With patchset [3] and this patchset, I found that no freepage count are
missed so that I conclude that problems are solved.
On my simple memory offlining test, these problems also occur on that
environment, too.
This patch (of 4):
There are two paths to reach core free function of buddy allocator,
__free_one_page(), one is free_one_page()->__free_one_page() and the
other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
Each paths has race condition causing serious problems. At first, this
patch is focused on first type of freepath. And then, following patch
will solve the problem in second type of freepath.
In the first type of freepath, we got migratetype of freeing page
without holding the zone lock, so it could be racy. There are two cases
of this race.
1. pages are added to isolate buddy list after restoring orignal
migratetype
CPU1 CPU2
get migratetype => return MIGRATE_ISOLATE
call free_one_page() with MIGRATE_ISOLATE
grab the zone lock
unisolate pageblock
release the zone lock
grab the zone lock
call __free_one_page() with MIGRATE_ISOLATE
freepage go into isolate buddy list,
although pageblock is already unisolated
This may cause two problems. One is that we can't use this page anymore
until next isolation attempt of this pageblock, because freepage is on
isolate buddy list. The other is that freepage accouting could be wrong
due to merging between different buddy list. Freepages on isolate buddy
list aren't counted as freepage, but ones on normal buddy list are
counted as freepage. If merge happens, buddy freepage on normal buddy
list is inevitably moved to isolate buddy list without any consideration
of freepage accouting so it could be incorrect.
2. pages are added to normal buddy list while pageblock is isolated.
It is similar with above case.
This also may cause two problems. One is that we can't keep these
freepages from being allocated. Although this pageblock is isolated,
freepage would be added to normal buddy list so that it could be
allocated without any restriction. And the other problem is same as
case 1, that it, incorrect freepage accouting.
This race condition would be prevented by checking migratetype again
with holding the zone lock. Because it is somewhat heavy operation and
it isn't needed in common case, we want to avoid rechecking as much as
possible. So this patch introduce new variable, nr_isolate_pageblock in
struct zone to check if there is isolated pageblock. With this, we can
avoid to re-check migratetype in common case and do it only if there is
isolated pageblock or migratetype is MIGRATE_ISOLATE. This solve above
mentioned problems.
Changes from v3:
Add one more check in free_one_page() that checks whether migratetype is
MIGRATE_ISOLATE or not. Without this, abovementioned case 1 could happens.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Heesub Shin <heesub.shin@samsung.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Ritesh Harjani <ritesh.list@gmail.com> Cc: Gioh Kim <gioh.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Thu, 13 Nov 2014 23:19:07 +0000 (15:19 -0800)]
mm/compaction: skip the range until proper target pageblock is met
Commit 7d49d8868336 ("mm, compaction: reduce zone checking frequency in
the migration scanner") has a side-effect that changes the iteration
range calculation. Before the change, block_end_pfn is calculated using
start_pfn, but now it blindly adds pageblock_nr_pages to the previous
value.
This causes the problem that isolation_start_pfn is larger than
block_end_pfn when we isolate the page with more than pageblock order.
In this case, isolation would fail due to an invalid range parameter.
To prevent this, this patch implements skipping the range until a proper
target pageblock is met. Without this patch, CMA with more than
pageblock order always fails but with this patch it will succeed.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Weijie Yang [Thu, 13 Nov 2014 23:19:05 +0000 (15:19 -0800)]
zram: avoid kunmap_atomic() of a NULL pointer
zram could kunmap_atomic() a NULL pointer in a rare situation: a zram
page becomes a full-zeroed page after a partial write io. The current
code doesn't handle this case and performs kunmap_atomic() on a NULL
pointer, which panics the kernel.
This patch fixes this issue.
Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Weijie Yang <weijie.yang.kh@gmail.com> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nathan Lynch [Mon, 10 Nov 2014 22:46:27 +0000 (23:46 +0100)]
ARM: 8198/1: make kuser helpers depend on MMU
The kuser helpers page is not set up on non-MMU systems, so it does
not make sense to allow CONFIG_KUSER_HELPERS to be enabled when
CONFIG_MMU=n. Allowing it to be set on !MMU results in an oops in
set_tls (used in execve and the arm_syscall trap handler):
As best I can tell this issue existed for the set_tls ARM syscall
before commit fbfb872f5f41 "ARM: 8148/1: flush TLS and thumbee
register state during exec" consolidated the TLS manipulation code
into the set_tls helper function, but now that we're using it to flush
register state during execve, !MMU users encounter the oops at the
first exec.
Prevent CONFIG_MMU=n configurations from enabling
CONFIG_KUSER_HELPERS.
Fixes: fbfb872f5f41 (ARM: 8148/1: flush TLS and thumbee register state during exec) Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Reported-by: Stefan Agner <stefan@agner.ch> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Will Deacon [Tue, 4 Nov 2014 10:40:46 +0000 (11:40 +0100)]
ARM: 8191/1: decompressor: ensure I-side picks up relocated code
To speed up decompression, the decompressor sets up a flat, cacheable
mapping of memory. However, when there is insufficient space to hold
the page tables for this mapping, we don't bother to enable the caches
and subsequently skip all the cache maintenance hooks.
Skipping the cache maintenance before jumping to the relocated code
allows the processor to predict the branch and populate the I-cache
with stale data before the relocation loop has completed (since a
bootloader may have SCTLR.I set, which permits normal, cacheable
instruction fetches regardless of SCTLR.M).
This patch moves the cache maintenance check into the maintenance
routines themselves, allowing the v6/v7 versions to invalidate the
I-cache regardless of the MMU state.
Cc: <stable@vger.kernel.org> Reported-by: Marc Carino <marc.ceeeee@gmail.com> Tested-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Dave Airlie [Thu, 13 Nov 2014 20:24:50 +0000 (06:24 +1000)]
Merge branch 'linux-3.18' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
One modesetting, one gk20a fix.
* 'linux-3.18' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau/nv50/disp: Fix modeset on G94
drm/gk20a/fb: fix setting of large page size bit
Marcelo Leitner [Thu, 13 Nov 2014 16:43:08 +0000 (14:43 -0200)]
vxlan: Do not reuse sockets for a different address family
Currently, we only match against local port number in order to reuse
socket. But if this new vxlan wants an IPv6 socket and a IPv4 one bound
to that port, vxlan will reuse an IPv4 socket as IPv6 and a panic will
follow. The following steps reproduce it:
# ip link add vxlan6 type vxlan id 42 group 229.10.10.10 \
srcport 5000 6000 dev eth0
# ip link add vxlan7 type vxlan id 43 group ff0e::110 \
srcport 5000 6000 dev eth0
# ip link set vxlan6 up
# ip link set vxlan7 up
<panic>
So address family must also match in order to reuse a socket.
Reported-by: Jean-Tsung Hsiao <jhsiao@redhat.com> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>