There's no reason for the pending and masked interrupt bitmasks
to be global. Just declare them on the stack in gic_get_int()
since they only consume (256*2)/8 = 64 bytes.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: John Crispin <blogic@openwrt.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/8131/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Kevin Cernekee [Tue, 21 Oct 2014 04:28:05 +0000 (21:28 -0700)]
MIPS: bcm3384: Initial commit of bcm3384 platform support
This supports SMP Linux running on the BCM3384 Zephyr (BMIPS5000)
application processor, with fully functional UART and USB 1.1/2.0.
Device Tree is used to configure the following items:
- All peripherals
- Early console base address
- SMP or UP mode
- MIPS counter frequency
- Memory size / regions
- DMA offset
- Kernel command line
The DT-enabled bootloader and build instructions are posted at
https://github.com/Broadcom/aeolus
Kevin Cernekee [Tue, 21 Oct 2014 04:28:01 +0000 (21:28 -0700)]
MIPS: BMIPS: Add PRId for BMIPS5200 (Whirlwind)
This is a dual core (quad thread) BMIPS5000. It needs a little extra
code to boot the second core (CPU2/CPU3), but for now we can treat it the
same as a single core BMIPS5000.
Kevin Cernekee [Tue, 21 Oct 2014 04:28:00 +0000 (21:28 -0700)]
MIPS: BMIPS: Add special cache handling in c-r4k.c
BMIPS435x and BMIPS438x have a single shared L1 D$ and load/store unit,
so it isn't necessary to raise IPIs to keep both CPUs coherent.
BMIPS5000 has VIPT L1 caches that handle aliases in hardware, and its I$
fills from D$. But a special sequence with 2 SYNCs and 32 NOPs is needed
to ensure coherency.
Kevin Cernekee [Tue, 21 Oct 2014 04:27:59 +0000 (21:27 -0700)]
MIPS: BMIPS: Let each platform customize the CPU1 IRQ mask
On some chips like bcm3384, "other stuff" gets wired up to CPU1's IE_IRQ1
input, generating spurious IRQs. In this case we want the platform code
to be able to mask it off.
Kevin Cernekee [Tue, 21 Oct 2014 04:27:58 +0000 (21:27 -0700)]
MIPS: BMIPS: Select the appropriate L1_CACHE_SHIFT for 438x and 5000 CPUs
BMIPS438x has a 64-byte D$ line size and BMIPS5000 has a 128-byte L2
line size. If L1_CACHE_SHIFT is undersized, DMA buffers will not be
cacheline-aligned and terrible things will happen.
Kevin Cernekee [Tue, 21 Oct 2014 04:27:57 +0000 (21:27 -0700)]
MIPS: Allow MIPS_CPU_SCACHE to be used with different line sizes
CONFIG_MIPS_CPU_SCACHE determines whether to build sc-mips.c. However,
it is currently hardwired to use an L1_SHIFT of 6 (64 bytes). Move the
L1_SHIFT selection into the CPU or SoC section so that other SoCs can
select different values.
Kevin Cernekee [Tue, 21 Oct 2014 04:27:56 +0000 (21:27 -0700)]
MIPS: BMIPS: Explicitly configure reset vectors prior to secondary boot
The secondary CPU's reset vector needs to be set to KSEG1 for a cold
boot (release from reset), or KSEG0 for a warm restart. On a cold boot
KSEG0 may be unavailable (BMIPS4380), and on a warm restart KSEG1 may
be unavailable (XKS01 mode on 4380 or 5000).
Jon Fraser [Tue, 21 Oct 2014 04:27:55 +0000 (21:27 -0700)]
MIPS: BMIPS: Mask off timer IRQs when hot-unplugging a CPU
CPU interrupts need to be disabled on a cpu being taken down.
When a cpu is hot-plugged out of the system the following sequence occurs.
On the CPU where the hotplug sequence was initiated:
cpu_down
_cpu_down {
__cpu_notify(CPU_DOWN_PREPARE
__stop_machine(take_cpu_down
wait for cpu to run disable code.
__cpu_die
}
On the CPU being disabled:
take_cpu_down
__cpu_disable {
mp_ops->cpu_disable
bmips_cpu_disable
clear_c0_status(IE_IRQ5) (added)
cpu_notify(CPU_DYING...
}
Before the cpu_notifier is called with CPU_DYING, all interrupts on the
dying cpu must be disabled. This guarantees that before tick_notify is
called with the CPU_DYING event and sets the clock device pointer to
NULL, there can not be any more clock interrupts.
When this wasn't done, an unfortunately-timed timer interrupt sometimes
caused hangs immediately prior to system suspend:
Debug PM is not enabled. To enable partial suspend, rebuild kernel with CONFIG_PM_DEBUG
Pass 1 out of 1,PM: Syncing filesystems ... mode=none, tp1=done.
1, flags=5, cycle_tp=, sleep=
Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
PM: suspend of devices complete after 54.199 msecs
PM: late suspend of devices complete after 0.172 msecs
Disabling non-boot CPUs ...
SMP: CPU1 is offline
INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 0, t=62537 jiffies)
Call Trace:
[<804baa78>] dump_stack+0x8/0x34
[<8008a2d8>] __rcu_pending+0x4b8/0x55c
[<8008adf4>] rcu_check_callbacks+0x78/0x180
[<80037830>] update_process_times+0x40/0x6c
[<80072fe4>] tick_sched_timer+0x74/0xe4
[<80050180>] __run_hrtimer.clone.30+0x64/0x140
[<80051150>] hrtimer_interrupt+0x19c/0x4bc
[<8000cdb8>] c0_compare_interrupt+0x50/0x88
[<80081b18>] handle_irq_event_percpu+0x5c/0x2f4
[<80086490>] handle_percpu_irq+0x8c/0xc0
[<800811b4>] generic_handle_irq+0x34/0x54
[<800067dc>] do_IRQ+0x18/0x2c
[<8000375c>] plat_irq_dispatch+0xd0/0x128
[<80004a04>] ret_from_irq+0x0/0x4
[<80004c40>] r4k_wait+0x20/0x40
[<80006b6c>] cpu_idle+0x98/0xf0
[<805d3988>] start_kernel+0x424/0x440
Jon Fraser [Tue, 21 Oct 2014 04:27:54 +0000 (21:27 -0700)]
MIPS: BMIPS: Allow BMIPS3300 to utilize SMP ebase relocation code
BMIPS3300 processors do not have the hardware to support SMP, but with a
small tweak, the SMP ebase relocation code allows BMIPS3300-based
platforms to reuse the S2/S3 power management code from BMIPS4380-based
chips. Normally this is as simple as adding one line to prom_init():
Kevin Cernekee [Tue, 21 Oct 2014 04:27:53 +0000 (21:27 -0700)]
MIPS: BMIPS: Introduce helper function to change the reset vector
This will need to be called from a few different places, and the logic
is starting to get a bit hairy (with the need for IPIs, CPU bug
workarounds, and hazards).
Kevin Cernekee [Tue, 21 Oct 2014 04:27:52 +0000 (21:27 -0700)]
MIPS: BMIPS: Align secondary boot sequence with latest firmware releases
On some older BMIPS5200 (dual core / quad thread) platforms, the
PROM code set up CPU2/CPU3 so they would be started through an NMI
instead of through the ACTION register. But this was incompatible with
some power management features that were later added, so the scheme was
changed so that Linux is fully responsible for booting CPU2/CPU3.
Kelvin Cheung [Fri, 10 Oct 2014 03:40:01 +0000 (11:40 +0800)]
MIPS: Loongson1B: Some fixes/updates for LS1B
- Fix hanging ethernet issue of LS1B v2.0 by adding pbl field in plat data.
(It seems that the MAC controller of LS1B v2.0 can only accept pbl=1)
- Add GMAC1 support and setup MUX in terms of PHY mode.
- Add CPUFreq support.
- Add MUX Register Definitions.
- Add PWM Register Definitions.
- Update clock register bitfields according to the latest spec.
- Update clock related stuff.
Rafał Miłecki [Thu, 30 Oct 2014 11:50:03 +0000 (12:50 +0100)]
MIPS: BCM47XX: Clean up nvram header
1) Move private defines to the .c file
2) Move SPROM helper to the sprom.c
3) Drop unused code
4) Rename magic to the NVRAM_MAGIC
5) Add const to the char pointer we never modify
Rafał Miłecki [Wed, 29 Oct 2014 09:05:06 +0000 (10:05 +0100)]
MIPS: BCM47XX: Use mtd as an alternative way/API to get NVRAM content
NVRAM can be read using magic memory offset, but after all it's just a
flash partition. On platforms where NVRAM isn't needed early we can get
it using mtd subsystem.
Paul Burton [Thu, 11 Sep 2014 07:30:23 +0000 (08:30 +0100)]
MIPS: Kconfig option to better exercise/debug hybrid FPRs
The hybrid FPR scheme exists to allow for compatibility between existing
FP32 code and newly compiled FP64A code. Such code should hopefully be
rare in the real world, and for the moment is difficult to come across.
All code except that built for the FP64 ABI can correctly execute using
the hybrid FPR scheme, so debugging the hybrid FPR implementation can
be eased by forcing all such code to use it. This is undesirable in
general due to the trap & emulate overhead of the hybrid FPR
implementation, but is a very useful option to have for debugging.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7680/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Thu, 11 Sep 2014 07:30:22 +0000 (08:30 +0100)]
MIPS: ELF: Set FP mode according to .MIPS.abiflags
This patch reads the .MIPS.abiflags section when it is present, and sets
the FP mode of the task accordingly. Any loaded ELF files which do not
contain a .MIPS.abiflags section will continue to observe the previous
behaviour, that is FR=1 if EF_MIPS_FP64 is set else FR=0.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7681/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Thu, 11 Sep 2014 07:30:21 +0000 (08:30 +0100)]
MIPS: ELF: Add definition for the .MIPS.abiflags section
New toolchains will generate a .MIPS.abiflags section, referenced by a
new PT_MIPS_ABIFLAGS program header. This section will provide
information about the requirements of the ELF, including the ISA level
the code is built for, the ASEs it requires, the size of various
registers and its expectations of the floating point mode. This patch
introduces a definition of the structure of this section and the program
header, for use in a subsequent patch.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7682/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Thu, 11 Sep 2014 07:30:20 +0000 (08:30 +0100)]
MIPS: Support for hybrid FPRs
Hybrid FPRs is a scheme where scalar FP registers are 64b wide, but
accesses to odd indexed single registers use bits 63:32 of the
preceeding even indexed 64b register. In this mode all FP code
except that built for the plain FP64 ABI can execute correctly. Most
notably a combination of FP64A & FP32 code can execute correctly,
allowing for existing FP32 binaries to be linked with new FP64A binaries
that can make use of 64 bit FP & MSA.
Hybrid FPRs are implemented by setting both the FR & FRE bits, trapping
& emulating single precision FP instructions (via Reserved Instruction
exceptions) whilst allowing others to execute natively. It therefore has
a penalty in terms of execution speed, and should only be used when no
fully native mode can be. As more binaries are recompiled to use either
the FPXX or FP64(A) ABIs, the need for hybrid FPRs should diminish.
However in the short to mid term it allows for a gradual transition
towards that world, rather than a complete ABI break which is not
feasible for some users & not desirable for many.
A task will be executed using the hybrid FPR scheme when its
TIF_HYBRID_FPREGS flag is set & TIF_32BIT_FPREGS is clear. A further
patch will set the flags as necessary, this patch simply adds the
infrastructure necessary for the hybrid FPR mode to work.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7683/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Thu, 11 Sep 2014 07:30:18 +0000 (08:30 +0100)]
MIPS: detect presence of the FRE & UFR bits
Detect the presence of the Config5 FRE & UFE bits, as indicated by the
FREP bit in FPIR. Record this as a CPU option bit, and provide a
cpu_has_fre macro to ease checking of that option bit.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7678/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Thu, 11 Sep 2014 07:30:17 +0000 (08:30 +0100)]
MIPS: define bits introduced for hybrid FPRs
Add definitions for the FRE & UFE bits in Config5, and the FREP bit in
FPIR. These bits are used to support a hybrid FPR scheme allowing a
mixture of FP32 & FP64 code to execute within a task.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7674/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Thu, 11 Sep 2014 07:30:16 +0000 (08:30 +0100)]
binfmt_elf: allow arch code to examine PT_LOPROC ... PT_HIPROC headers
MIPS is introducing new variants of its O32 ABI which differ in their
handling of floating point, in order to enable a gradual transition
towards a world where mips32 binaries can take advantage of new hardware
features only available when configured for certain FP modes. In order
to do this ELF binaries are being augmented with a new section that
indicates, amongst other things, the FP mode requirements of the binary.
The presence & location of such a section is indicated by a program
header in the PT_LOPROC ... PT_HIPROC range.
In order to allow the MIPS architecture code to examine the program
header & section in question, pass all program headers in this range
to an architecture-specific arch_elf_pt_proc function. This function
may return an error if the header is deemed invalid or unsuitable for
the system, in which case that error will be returned from
load_elf_binary and upwards through the execve syscall.
A means is required for the architecture code to make a decision once
it is known that all such headers have been seen, but before it is too
late to return from an execve syscall. For this purpose the
arch_check_elf function is added, and called once, after all PT_LOPROC
to PT_HIPROC headers have been passed to arch_elf_pt_proc but before
the code which invoked execve has been lost. This enables the
architecture code to make a decision based upon all the headers present
in an ELF binary and its interpreter, as is required to forbid
conflicting FP ABI requirements between an ELF & its interpreter.
In order to allow data to be stored throughout the calls to the above
functions, struct arch_elf_state is introduced.
Finally a variant of the SET_PERSONALITY macro is introduced which
accepts a pointer to the struct arch_elf_state, allowing it to act
based upon state observed from the architecture specific program
headers.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7679/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Thu, 11 Sep 2014 07:30:15 +0000 (08:30 +0100)]
binfmt_elf: load interpreter program headers earlier
Load the program headers of an ELF interpreter early enough in
load_elf_binary that they can be examined before it's too late to return
an error from an exec syscall. This patch does not perform any such
checking, it merely lays the groundwork for a further patch to do so.
No functional change is intended.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7675/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Thu, 11 Sep 2014 07:30:14 +0000 (08:30 +0100)]
binfmt_elf: Hoist ELF program header loading to a function
load_elf_binary & load_elf_interp both load program headers from an ELF
executable in the same way, duplicating the code. This patch introduces
a helper function (load_elf_phdrs) which performs this common task &
calls it from both load_elf_binary & load_elf_interp. In addition to
reducing code duplication, this is part of preparing to load the ELF
interpreter headers earlier such that they can be examined before it's
too late to return an error from an exec syscall.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7676/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Huacai Chen [Tue, 4 Nov 2014 06:15:31 +0000 (14:15 +0800)]
MIPS: Loongson-3: Add RS780/SBX00 HPET support
CPUFreq driver need external timer, so add hpet at first.
In Loongson 3, only Core-0 can receive external interrupt. As a result,
timekeeping cannot absolutely use HPET timer. We use a hybrid solution:
Core-0 use HPET as its clock event device, but other cores still use
MIPS; clock source is global and doesn't need interrupt, so use HPET.
Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Hongliang Tao <taohl@lemote.com> Cc: John Crispin <john@phrozen.org> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/8329/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Huacai Chen [Tue, 4 Nov 2014 06:15:07 +0000 (14:15 +0800)]
MIPS: Loongson-3: Add oprofile support
Loongson-3 has two groups of performance counters, they are 4 sub-
registers of CP0's REG25. This patch add oprofile support.
REG25, sel 0: Perf Control of group 0;
REG25, sel 1: Perf Counter of group 0;
REG25, sel 2: Perf Control of group 1;
REG25, sel 3: Perf Counter of group 1.
Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: John Crispin <john@phrozen.org> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/8328/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Huacai Chen [Tue, 4 Nov 2014 06:13:27 +0000 (14:13 +0800)]
MIPS: Loongson: Improve LEFI firmware interface
Machtypes of Loongson-3 machines become more and more, but there are
only small differences among different machtypes. Keeping a large table
of machtypes is very ugly and hard to extend. We found that the major
machtype differences are UARTs information (number of UARTs, UART IRQs,
UART clocks, etc.), platform devices (EC, temperature sensors, fan
controllers, etc.) and some workarounds (because of some CPU bugs or
mainboard bugs).
In this patch we improve the UEFI-like (LEFI) interface to make all
Loongson-3 machines use a same machtype "generic-loongson-machine".
Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: John Crispin <john@phrozen.org> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/8324/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Huacai Chen [Tue, 4 Nov 2014 06:13:26 +0000 (14:13 +0800)]
MIPS: Loongson: Allow booting from any core
By offering Logical->Physical core id mapping, so as to reserve some
physical cores via mask. This allow booting from any core when core-0
has problems. Since the maximun cores supported by Loongson-3 is 16,
32-bit cpu_startup_core_id can be split to 16-bit cpu_startup_core_id
and 16-bit reserved_cores_mask for compatibility.
Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: John Crispin <john@phrozen.org> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/8323/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Huacai Chen [Tue, 4 Nov 2014 06:13:24 +0000 (14:13 +0800)]
MIPS: Loongson-3: Add PHYS48_TO_HT40 support
The width of HT-bus is only 40-bit, but Loongson-3 has 48-bit physical
address. This implies only node-0's memory is DMAable because high bits
(Node ID) will lost. Fortunately, by configuring address windows in
firmware, we can extract 2bit Node ID (bit 44~47, only bit 44~45 used
now) from Loongson-3's 48-bit address space and embed it into 40-bit
(bit 37~38). Every NUMA node can do DMA now (however, maximum memory of
each node is reduced to 2^37 = 128GB).
Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: John Crispin <john@phrozen.org> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/8321/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: BCM47XX: Make ssb init NVRAM instead of bcm47xx polling it
This makes NVRAM code less bcm47xx/ssb specific allowing it to become a
standalone driver in the future. A similar patch for bcma will follow
when it's ready.
Get rid of the ugly GICREAD/GICWRITE/GICBIS macros and use proper
iomem accessors instead. Since the GIC registers are not directly
accessed outside of the GIC driver any more, make gic_base static
and move all the GIC register manipulation macros out of gic.h,
converting them to static inline functions.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: John Crispin <blogic@openwrt.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/8127/
Patchwork: https://patchwork.linux-mips.org/patch/8229/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
irqchip: mips-gic: Export function to read counter width
Export the function gic_get_count_width to read the width of
the GIC global counter from GIC_SH_CONFIG. Update the GIC
clocksource driver to use this new function.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: John Crispin <blogic@openwrt.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/8124/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: Malta: Use gic_read_count() to read GIC timer
Instead of reading the GIC registers directly, use the interface the GIC
driver already exposes for reading the global timer. Also get rid of
the unnecessary #ifdefs.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: John Crispin <blogic@openwrt.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/8123/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Now that all GIC interrupt routing and handling logic is in the GIC
driver itself, un-export variables/functions which are no longer used
outside the GIC driver. This also allows us to remove gic_compare_int
and combine gic_get_int_mask with gic_get_int since these interfaces
are no longer used.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Acked-by: Jason Cooper <jason@lakedaemon.net> Reviewed-by: Qais Yousef <qais.yousef@imgtec.com> Tested-by: Qais Yousef <qais.yousef@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeffrey Deans <jeffrey.deans@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: John Crispin <blogic@openwrt.org> Cc: David Daney <ddaney.cavm@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7820/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The MIPS GIC supports 7 local interrupts, 2 of which are the GIC
local watchdog and count/compare timer. The remainder are CPU
interrupts which may optionally be re-routed through the GIC.
GIC hardware IRQs 0-6 are now used for local interrupts while
hardware IRQs 7+ are used for external (shared) interrupts.
Note that the 5 CPU interrupts may not be re-routable through
the GIC. In that case mapping will fail and the vectors reported
in C0_IntCtl should be used instead. gic_get_c0_compare_int() and
gic_get_c0_perfcount_int() will return the correct IRQ number to
use for the C0 timer and perfcounter interrupts based on the
routability of those interrupts through the GIC.
A separate irq_chip, with callbacks that mask/unmask the local
interrupt on all CPUs, is used for the C0 timer and performance
counter interrupts since all other platforms do not use the percpu
IRQ API for those interrupts.
Malta, SEAD-3, and the GIC clockevent driver have been updated
to use local interrupts and the R4K clockevent driver has been
updated to poll for C0 timer interrupts through the GIC when
the GIC is present.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Acked-by: Jason Cooper <jason@lakedaemon.net> Reviewed-by: Qais Yousef <qais.yousef@imgtec.com> Tested-by: Qais Yousef <qais.yousef@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeffrey Deans <jeffrey.deans@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: John Crispin <blogic@openwrt.org> Cc: David Daney <ddaney.cavm@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7819/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
irqchip: mips-gic: Use separate edge/level irq_chips
GIC edge-triggered interrupts must be acknowledged by clearing the edge
detector via a write to GIC_SH_WEDGE. Create a separate edge-triggered
irq_chip with the appropriate irq_ack() callback. This also allows us
to get rid of gic_irq_flags.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Acked-by: Jason Cooper <jason@lakedaemon.net> Reviewed-by: Qais Yousef <qais.yousef@imgtec.com> Tested-by: Qais Yousef <qais.yousef@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeffrey Deans <jeffrey.deans@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: John Crispin <blogic@openwrt.org> Cc: David Daney <ddaney.cavm@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7818/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
irqchip: mips-gic: Stop using per-platform mapping tables
Now that the GIC properly uses IRQ domains, kill off the per-platform
routing tables that were used to make the GIC appear transparent.
This includes:
- removing the mapping tables and the support for applying them,
- moving GIC IPI support to the GIC driver,
- properly routing the i8259 through the GIC on Malta, and
- updating IRQ assignments on SEAD-3 when the GIC is present.
Platforms no longer will pass an interrupt mapping table to gic_init.
Instead, they will pass the CPU interrupt vector (2 - 7) that they
expect the GIC to route interrupts to. Note that in EIC mode this
value is ignored and all GIC interrupts are routed to EIC vector 1.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Acked-by: Jason Cooper <jason@lakedaemon.net> Reviewed-by: Qais Yousef <qais.yousef@imgtec.com> Tested-by: Qais Yousef <qais.yousef@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeffrey Deans <jeffrey.deans@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: John Crispin <blogic@openwrt.org> Cc: David Daney <ddaney.cavm@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7816/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
There's no need for platforms to have their own GIC irq_ack/irq_eoi
callbacks. irq_ack need only clear the GIC's edge detector on
edge-triggered interrupts and there's no need at all for irq_eoi.
Also get rid of the mask_ack callback since it's not necessary either.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Acked-by: Jason Cooper <jason@lakedaemon.net> Reviewed-by: Qais Yousef <qais.yousef@imgtec.com> Tested-by: Qais Yousef <qais.yousef@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeffrey Deans <jeffrey.deans@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: John Crispin <blogic@openwrt.org> Cc: David Daney <ddaney.cavm@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7809/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The GIC on Malta boards supports a total of 47 interrupts (40 shared
and 7 local) and is assigned a base of 24. This overlaps with the
MSC01 interrupt assignments which have a base of 64, so move the MSC01
interrupt base back a bit to give the GIC some room.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Reviewed-by: Qais Yousef <qais.yousef@imgtec.com> Tested-by: Qais Yousef <qais.yousef@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Jeffrey Deans <jeffrey.deans@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: John Crispin <blogic@openwrt.org> Cc: David Daney <ddaney.cavm@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7815/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In preparation for GIC IRQ domain support, assign a GIC IRQ base
that does not overlap with the CPU IRQs.
Note that this breaks SEAD-3 when the GIC is in EIC mode, though
I'm not convinced it was working before either. It will be fixed
in the following patches.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Reviewed-by: Qais Yousef <qais.yousef@imgtec.com> Tested-by: Qais Yousef <qais.yousef@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Jeffrey Deans <jeffrey.deans@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: John Crispin <blogic@openwrt.org> Cc: David Daney <ddaney.cavm@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7813/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: smp-cps: Enable all hardware interrupts on secondary CPUs
Currently interrupt vectors 2 and 5 are left disabled on secondary CPUs.
Since systems using CPS must also have a GIC, which is responsible for
routing all external interrupts and can map them to any hardware interrupt
vector, enable the remaining vectors. The two software interrupt vectors
are left disabled since they are not used with CPS.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Reviewed-by: Qais Yousef <qais.yousef@imgtec.com> Tested-by: Qais Yousef <qais.yousef@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Andrew Bresticker <abrestic@chromium.org> Cc: Jeffrey Deans <jeffrey.deans@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: John Crispin <blogic@openwrt.org> Cc: David Daney <ddaney.cavm@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7803/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: Add hook to get C0 performance counter interrupt
The hardware perf event driver and oprofile interpret the global
cp0_perfcount_irq differently: in the hardware perf event driver
it is an offset from MIPS_CPU_IRQ_BASE and in oprofile it is the
actual IRQ number. This still works most of the time since
MIPS_CPU_IRQ_BASE is usually 0, but is clearly wrong. Since the
performance counter interrupt may vary from platform to platform
like the C0 timer interrupt, add the optional get_c0_perfcount_int
hook which returns the IRQ number of the performance counter.
The hook should return < 0 if the performance counter interrupt is
shared with the timer. If the hook is not present, the CPU vector
reported in C0_IntCtl (cp0_perfcount_irq) is used.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Reviewed-by: Qais Yousef <qais.yousef@imgtec.com> Tested-by: Qais Yousef <qais.yousef@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Andrew Bresticker <abrestic@chromium.org> Cc: Jeffrey Deans <jeffrey.deans@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: John Crispin <blogic@openwrt.org> Cc: David Daney <ddaney.cavm@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7805/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
For platforms which boot with device-tree or have correctly chained
all external interrupt controllers, a generic plat_irq_dispatch() can
be used. Implement a plat_irq_dispatch() which simply handles all the
pending interrupts as reported by C0_Cause.
Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Reviewed-by: Qais Yousef <qais.yousef@imgtec.com> Tested-by: Qais Yousef <qais.yousef@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Andrew Bresticker <abrestic@chromium.org> Cc: Jeffrey Deans <jeffrey.deans@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: John Crispin <blogic@openwrt.org> Cc: David Daney <ddaney.cavm@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7801/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Markos Chandras [Fri, 29 Aug 2014 08:37:26 +0000 (09:37 +0100)]
MIPS: cpu: Add 'noftlb' kernel command line option to disable the FTLB
Add new 'noftlb' kernel command line option to disable the FTLB.
Since the kernel command line is not available when probing and
enabling the CPU features in cpu_probe(), we let the kernel configure
the FTLB during the config4 decode operation and we disable the FTLB later
on, once the command line has become available to us. This should have
no negative effects since FTLB isn't used so early in the boot process.
FTLB increases the effective TLB size leading to less TLB misses. However,
sometimes it's useful to be able to disable it when debugging memory related
core features or other hardware components.
Sergey Ryazanov [Sat, 30 Aug 2014 02:06:27 +0000 (06:06 +0400)]
MIPS: pci-rt3883: Remove odd locking in PCI config space access code
Caller (generic PCI code) already do proper locking so no need to add
another one here. Local PCI read/write functions are never called
simultaneously, also they do not require synchronization with the PCI
controller ops, since they are used before the controller registration.
Eunbong Song [Wed, 22 Oct 2014 06:39:56 +0000 (06:39 +0000)]
MIPS: Add arch_trigger_all_cpu_backtrace() function
Currently, arch_trigger_all_cpu_backtrace() is defined in only x86 and
sparc which have an NMI. But in case of softlockup, it could be possible
to dump backtrace of all cpus. and this could be helpful for debugging.
for example, if system has 2 cpus.
CPU 0 CPU 1
acquire read_lock()
try to do write_lock()
,,,
missing read_unlock()
In this case, softlockup will occur becasuse CPU 0 does not call
read_unlock(). And dump_stack() print only backtrace for "CPU 0". If
CPU1's backtrace is printed it's very helpful.
[ralf@linux-mips.org: Fixed whitespace and formatting issues.]
Andy Lutomirski [Fri, 21 Nov 2014 21:26:07 +0000 (13:26 -0800)]
uprobes, x86: Fix _TIF_UPROBE vs _TIF_NOTIFY_RESUME
x86 call do_notify_resume on paranoid returns if TIF_UPROBE is set but
not on non-paranoid returns. I suspect that this is a mistake and that
the code only works because int3 is paranoid.
Setting _TIF_NOTIFY_RESUME in the uprobe code was probably a workaround
for the x86 bug. With that bug fixed, we can remove _TIF_NOTIFY_RESUME
from the uprobes code.
Thomas Gleixner [Sun, 23 Nov 2014 22:04:52 +0000 (23:04 +0100)]
sched: Provide update_curr callbacks for stop/idle scheduling classes
Chris bisected a NULL pointer deference in task_sched_runtime() to
commit 6e998916dfe3 'sched/cputime: Fix clock_nanosleep()/clock_gettime()
inconsistency'.
Chris observed crashes in atop or other /proc walking programs when he
started fork bombs on his machine. He assumed that this is a new exit
race, but that does not make any sense when looking at that commit.
What's interesting is that, the commit provides update_curr callbacks
for all scheduling classes except stop_task and idle_task.
While nothing can ever hit that via the clock_nanosleep() and
clock_gettime() interfaces, which have been the target of the commit in
question, the author obviously forgot that there are other code paths
which invoke task_sched_runtime()
If the stats are read for a stomp machine task, aka 'migration/N' and
that task is current on its cpu, this will happily call the NULL pointer
of stop_task->update_curr. Ooops.
Chris observation that this happens faster when he runs the fork bomb
makes sense as the fork bomb will kick migration threads more often so
the probability to hit the issue will increase.
Add the missing update_curr callbacks to the scheduler classes stop_task
and idle_task. While idle tasks cannot be monitored via /proc we have
other means to hit the idle case.
Linus Torvalds [Sun, 23 Nov 2014 21:56:55 +0000 (13:56 -0800)]
Merge branch 'x86-traps' (trap handling from Andy Lutomirski)
Merge x86-64 iret fixes from Andy Lutomirski:
"This addresses the following issues:
- an unrecoverable double-fault triggerable with modify_ldt.
- invalid stack usage in espfix64 failed IRET recovery from IST
context.
- invalid stack usage in non-espfix64 failed IRET recovery from IST
context.
It also makes a good but IMO scary change: non-espfix64 failed IRET
will now report the correct error. Hopefully nothing depended on the
old incorrect behavior, but maybe Wine will get confused in some
obscure corner case"
* emailed patches from Andy Lutomirski <luto@amacapital.net>:
x86_64, traps: Rework bad_iret
x86_64, traps: Stop using IST for #SS
x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C
Andy Lutomirski [Sun, 23 Nov 2014 02:00:33 +0000 (18:00 -0800)]
x86_64, traps: Rework bad_iret
It's possible for iretq to userspace to fail. This can happen because
of a bad CS, SS, or RIP.
Historically, we've handled it by fixing up an exception from iretq to
land at bad_iret, which pretends that the failed iret frame was really
the hardware part of #GP(0) from userspace. To make this work, there's
an extra fixup to fudge the gs base into a usable state.
This is suboptimal because it loses the original exception. It's also
buggy because there's no guarantee that we were on the kernel stack to
begin with. For example, if the failing iret happened on return from an
NMI, then we'll end up executing general_protection on the NMI stack.
This is bad for several reasons, the most immediate of which is that
general_protection, as a non-paranoid idtentry, will try to deliver
signals and/or schedule from the wrong stack.
This patch throws out bad_iret entirely. As a replacement, it augments
the existing swapgs fudge into a full-blown iret fixup, mostly written
in C. It's should be clearer and more correct.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Lutomirski [Sun, 23 Nov 2014 02:00:32 +0000 (18:00 -0800)]
x86_64, traps: Stop using IST for #SS
On a 32-bit kernel, this has no effect, since there are no IST stacks.
On a 64-bit kernel, #SS can only happen in user code, on a failed iret
to user space, a canonical violation on access via RSP or RBP, or a
genuine stack segment violation in 32-bit kernel code. The first two
cases don't need IST, and the latter two cases are unlikely fatal bugs,
and promoting them to double faults would be fine.
This fixes a bug in which the espfix64 code mishandles a stack segment
violation.
This saves 4k of memory per CPU and a tiny bit of code.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 23 Nov 2014 19:46:01 +0000 (11:46 -0800)]
Merge tag 'armsoc-for-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A collection of fixes this week:
- A set of clock fixes for shmobile platforms
- A fix for tegra that moves serial port labels to be per board.
We're choosing to merge this for 3.18 because the labels will start
being parsed in 3.19, and without this change serial port numbers
that used to be stable since the dawn of time will change numbers.
- A few other DT tweaks for Tegra.
- A fix for multi_v7_defconfig that makes it stop spewing cpufreq
errors on Arndale (Exynos)"
* tag 'armsoc-for-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: multi_v7_defconfig: fix failure setting CPU voltage by enabling dependent I2C controller
ARM: tegra: roth: Fix SD card VDD_IO regulator
ARM: tegra: Remove eMMC vmmc property for roth/tn7
ARM: dts: tegra: move serial aliases to per-board
ARM: tegra: Add serial port labels to Tegra124 DT
ARM: shmobile: kzm9g legacy: Set i2c clks_per_count to 2
ARM: shmobile: r8a7740 dtsi: Correct IIC0 parent clock
ARM: shmobile: r8a7790: Fix SD3CKCR address to device tree
ARM: shmobile: r8a7740 legacy: Correct IIC0 parent clock
ARM: shmobile: r8a7740 legacy: Add missing INTCA clock for irqpin module
ARM: shmobile: r8a7790: Fix SD3CKCR address
ARM: dts: sun6i: Re-parent ahb1_mux to pll6 as required by dma controller
Linus Torvalds [Sun, 23 Nov 2014 19:33:49 +0000 (11:33 -0800)]
Merge branch 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu fix from Tejun Heo:
"This contains one patch to fix a race condition which can lead to
percpu_ref using a percpu pointer which is corrupted with a set DEAD
bit. The bug was introduced while separating out the ATOMIC mode flag
from the DEAD flag. The fix is pretty straight forward.
I just committed the patch to the percpu tree but am sending out the
pull request early as I'll be on vacation for a week. The patch
should be fairly safe and while the latency will be higher I'll be
checking emails"
* 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu-ref: fix DEAD flag contamination of percpu pointer
Linus Torvalds [Sun, 23 Nov 2014 19:16:36 +0000 (11:16 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs deadlock fix from Chris Mason:
"This has a fix for a long standing deadlock that we've been trying to
nail down for a while. It ended up being a bad interaction with the
fair reader/writer locks and the order btrfs reacquires locks in the
btree"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: fix lockups from btrfs_clear_path_blocking
Tejun Heo [Sat, 22 Nov 2014 14:22:42 +0000 (09:22 -0500)]
percpu-ref: fix DEAD flag contamination of percpu pointer
While decoupling ATOMIC and DEAD flags, f47ad4578461 ("percpu_ref:
decouple switching to percpu mode and reinit") updated
__ref_is_percpu() so that it only tests ATOMIC flag to determine
whether the ref is in percpu mode or not; however, while DEAD implies
ATOMIC, the two flags are set separately during percpu_ref_kill() and
if __ref_is_percpu() races percpu_ref_kill(), it may see DEAD w/o
ATOMIC. Because __ref_is_percpu() returns @ref->percpu_count_ptr
value verbatim as the percpu pointer after testing ATOMIC, the pointer
may now be contaminated with the DEAD flag.
This can be fixed by clearing the flag bits before returning the
pointer which was the fix proposed by Shaohua; however, as DEAD
implies ATOMIC, we can just test for both flags at once and avoid the
explicit masking.
Update __ref_is_percpu() so that it tests that both ATOMIC and DEAD
are clear before returning @ref->percpu_count_ptr as the percpu
pointer.