Gavin Shan [Fri, 20 May 2016 06:41:39 +0000 (16:41 +1000)]
powerpc/powernv: Use PCI slot reset infrastructure
The (OPAL) firmware might provide the PCI slot reset capability
which is identified by property "ibm,reset-by-firmware" on the
PCI slot associated device node.
This routes the reset request to firmware if "ibm,reset-by-firmware"
exists in the PCI slot device node. Otherwise, the reset is done
inside kernel as before.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Fri, 20 May 2016 06:41:38 +0000 (16:41 +1000)]
powerpc/powernv: Support PCI slot ID
The reset and poll functionality from (OPAL) firmware supports
PHB and PCI slot at same time. They are identified by ID. This
supports PCI slot ID by:
* Rename the argument name for opal_pci_reset() and opal_pci_poll()
accordingly
* Rename pnv_eeh_phb_poll() to pnv_eeh_poll() and adjust its argument
name.
* One macro is added to produce PCI slot ID.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Fri, 20 May 2016 06:41:37 +0000 (16:41 +1000)]
powerpc/pci: Delay populating pdn
The pdn (struct pci_dn) instances are allocated from memblock or
bootmem when creating PCI controller (hoses) in setup_arch(). PCI
hotplug, which will be supported by proceeding patches, releases
PCI device nodes and their corresponding pdn on unplugging event.
The memory chunks for pdn instances allocated from memblock or
bootmem are hard to reused after being released.
This delays creating pdn by pci_devs_phb_init() from setup_arch()
to core_initcall() so that they are allocated from slab. The memory
consumed by pdn can be released to system without problem during
PCI unplugging time. It indicates that pci_dn is unavailable in
setup_arch() and the the fixup on pdn (like AGP's) can't be carried
out that time. We have to do that in pcibios_root_bridge_prepare()
on maple/pasemi/powermac platforms where/when the pdn is available.
pcibios_root_bridge_prepare is called from subsys_initcall() which
is executed after core_initcall() so the code flow does not change.
At the mean while, the EEH device is created when pdn is populated,
meaning pdn and EEH device have same life cycle. In turn, we needn't
call eeh_dev_init() to create EEH device explicitly.
Gavin Shan [Fri, 20 May 2016 06:41:36 +0000 (16:41 +1000)]
powerpc/pci: Update bridge windows on PCI plug
On the PCI plugging event, PCI slot's subordinate devices are
scanned and their (IO and MMIO) resources are assigned. Platform
dependent resources (PE#, IO/MMIO/DMA windows) are allocated or
created on updating windows of the slot's upstream bridge.
This updates the windows of the hot plugged slot's upstream bridge
in pcibios_finish_adding_to_bus() so that the platform resources
(PE#, IO/MMIO/DMA segments) are allocated or created accordingly.
Gavin Shan [Fri, 20 May 2016 06:41:35 +0000 (16:41 +1000)]
powerpc/powernv: Dynamically release PE
This supports releasing PEs dynamically. A reference count is
introduced to PE representing number of PCI devices associated
with the PE. The reference count is increased when PCI device
joins the PE and decreased when PCI device leaves the PE in
pnv_pci_release_device(). When the count becomes zero, the PE
and its consumed resources are released. Note that the count
is accessed concurrently. So a counter with "int" type is enough
here.
In order to release the sources consumed by the PE, couple of
helper functions are introduced as below:
Gavin Shan [Fri, 20 May 2016 06:41:34 +0000 (16:41 +1000)]
powerpc/powernv: Make pnv_ioda_deconfigure_pe() visible
pnv_ioda_deconfigure_pe() is visible only when CONFIG_PCI_IOV is
enabled. The function will be used to tear down PE's associated
mapping in PCI hotplug path that doesn't depend on CONFIG_PCI_IOV.
This makes pnv_ioda_deconfigure_pe() visible and not depend on
CONFIG_PCI_IOV.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Fri, 20 May 2016 06:41:33 +0000 (16:41 +1000)]
powerpc/powernv: Extend PCI bridge resources
The PCI slots are associated with root port or downstream ports
of the PCIe switch connected to root port. When adapter is hot
added to the PCI slot, it usually requests more IO or memory
resource from the directly connected parent bridge (port) and
update the bridge's windows accordingly. The resource windows
of upstream bridges can't be updated automatically. It possibly
leads to unbalanced resource across the bridges: The window of
downstream bridge is overruning that of upstream bridge. The
IO or MMIO path won't work.
This resolves the above issue by extending bridge windows of
root port and upstream port of the PCIe switch connected to
the root port to PHB's windows.
The windows of root port and bridge behind that are extended to
the PHB's windows to accomodate the PCI hotplug happening in
future. The PHB's 64KB 32-bits MSI region is included in bridge's
M32 windows (in hardware) though it's excluded in the corresponding
resource, as the bridge's M32 windows have 1MB as their minimal
alignment. We observed EEH error during system boot when the MSI
region is included in bridge's M32 window.
This excludes top 1MB (including 64KB 32-bits MSI region) region
from bridge's M32 windows when extending them.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Fri, 20 May 2016 06:41:32 +0000 (16:41 +1000)]
powerpc/powernv: Setup PE for root bus
There is no parent bridge for root bus, meaning pcibios_setup_bridge()
isn't invoked for root bus. The PE for root bus is the ancestor of
other PEs in PELTV. It means we need PE for root bus populated before
all others.
This populates the PE for root bus in pcibios_setup_bridge() path
if it's not populated yet. The PE number next to the reserved one
is used as the PE# to avoid holes in continuous M64 space.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Fri, 20 May 2016 06:41:31 +0000 (16:41 +1000)]
powerpc/powernv: Create PEs in pcibios_setup_bridge()
Currently, the PEs and their associated resources are assigned in
ppc_md.pcibios_fixup() except those used by SRIOV VFs. The function
is called for once after PCI probing and resources assignment is
completed. So it's obviously not hotplug friendly.
This creates PEs dynamically in pcibios_setup_bridge() that is
called for the event during system bootup and PCI hotplug: updating
PCI bridge's windows after resource assignment/reassignment are done.
In partial hotplug case, not all PCI devices included to one particular
PE are unplugged and plugged again, we just need unbinding/binding the
hot added PCI devices with the corresponding PE without creating new
one. The change is applied to IODA1 and IODA2 PHBs only. The behaviour
on NPU PHBs aren't changed. There are no PCI bridges on NPU PHBs,
meaning pcibios_setup_bridge() won't be invoked there. We have to use
old path (pnv_pci_ioda_fixup()) to setup PEs on NPU PHBs.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Fri, 20 May 2016 06:41:30 +0000 (16:41 +1000)]
powerpc/powernv: Allocate PE# in reverse order
PE number for one particular PE can be allocated dynamically or
reserved according to the consumed M64 (64-bits prefetchable)
segments of the PE. The M64 segment can't be remapped to arbitrary
PE, meaning the PE number is determined according to the index
of the consumed M64 segment. As below figure shows, M64 resource
grows from low to high end, meaning the PE (number) reserved
according to M64 segment grows from low to high end as well,
so does the dynamically allocated PE number. It will lead to
conflict: PE number (M64 segment) reserved by dynamic allocation
is required by hot added PCI adapter at later point. It fails
the PCI hotplug because of the PE number can't be reserved
based on the index of the consumed M64 segment.
PE number for dynamic allocation ----------------->
PE number reserved for M64 segment ----------------->
To resolve above conflicts, this forces the PE number to be
allocated dynamically in reverse order. With this patch applied,
the PE numbers are reserved in ascending order, but allocated
dynamically in reverse order.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Fri, 20 May 2016 06:41:29 +0000 (16:41 +1000)]
powerpc/powernv: Increase PE# capacity
Each PHB maintains an array helping to translate 2-bytes Request
ID (RID) to PE# with the assumption that PE# takes one byte, meaning
that we can't have more than 256 PEs. However, pci_dn->pe_number
already had 4-bytes for the PE#.
This extends the PE# capacity for every PHB. After that, the PE number
is represented by 4-bytes value. Then we can reuse IODA_INVALID_PE to
check the PE# in phb->pe_rmap[] is valid or not.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Fri, 20 May 2016 06:41:28 +0000 (16:41 +1000)]
powerpc/powernv: Move pnv_pci_ioda_setup_opal_tce_kill() around
pnv_pci_ioda_setup_opal_tce_kill() called by pnv_ioda_setup_dma()
to remap the TCE kill regiter. What's done in pnv_ioda_setup_dma()
will be covered in pcibios_setup_bridge() which is invoked on each
PCI bridge. It means we will possibly remap the TCE kill register
for multiple times and it's unnecessary.
This moves pnv_pci_ioda_setup_opal_tce_kill() to where the PHB is
initialized (pnv_pci_init_ioda_phb()) to avoid above issue.
Gavin Shan [Fri, 20 May 2016 06:41:26 +0000 (16:41 +1000)]
powerpc/pci: Override pcibios_setup_bridge()
This overrides pcibios_setup_bridge() that is called to update PCI
bridge windows when PCI resource assignment is completed, to assign
PE and setup various (resource) mapping for the PE in subsequent
patches.
Gavin Shan [Fri, 20 May 2016 06:41:25 +0000 (16:41 +1000)]
PCI: Add pcibios_setup_bridge()
Currently, PowerPC PowerNV platform utilizes ppc_md.pcibios_fixup(),
which is called for once after PCI probing and resource assignment
are completed, to allocate platform required resources for PCI devices:
PE#, IO and MMIO mapping, DMA address translation (TCE) table etc.
Obviously, it's not hotplug friendly.
This adds weak function pcibios_setup_bridge(), which is called by
pci_setup_bridge(). PowerPC PowerNV platform will reuse the function
to assign above platform required resources to newly plugged PCI devices
during PCI hotplug in subsequent patches.
Export cpu_to_core_id(). This will be used by the lpfc driver.
This enables topology_core_id() from <linux/topology.h> (defined
to cpu_to_core_id() in arch/powerpc/include/asm/topology.h) to be
used by (non-builtin) modules.
That is arch-neutral, already used by eg, drivers/base/topology.c,
but it is builtin (obj-y in Makefile) thus didn't need the export.
Since the module uses topology_core_id() and this is defined to
cpu_to_core_id(), it needs the export, otherwise:
Jack Miller [Thu, 9 Jun 2016 02:31:10 +0000 (12:31 +1000)]
selftests/powerpc: Load Monitor Register Tests
Adds two tests. One is a simple test to ensure that the new registers
LMRR and LMSER are properly maintained. The other actually uses the
existing EBB test infrastructure to test that LMRR and LMSER behave as
documented.
Signed-off-by: Jack Miller <jack@codezen.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Jack Miller [Thu, 9 Jun 2016 02:31:09 +0000 (12:31 +1000)]
powerpc: Load Monitor Register Support
This enables new registers, LMRR and LMSER, that can trigger an EBB in
userspace code when a monitored load (via the new ldmx instruction)
loads memory from a monitored space. This facility is controlled by a
new FSCR bit, LM.
This patch disables the FSCR LM control bit on task init and enables
that bit when a load monitor facility unavailable exception is taken
for using it. On context switch, this bit is then used to determine
whether the two relevant registers are saved and restored. This is
done lazily for performance reasons.
Signed-off-by: Jack Miller <jack@codezen.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Neuling [Thu, 9 Jun 2016 02:31:08 +0000 (12:31 +1000)]
powerpc: Improve FSCR init and context switching
This fixes a few issues with FSCR init and switching.
In commit 152d523e6307 ("powerpc: Create context switch helpers
save_sprs() and restore_sprs()") we moved the setting of the FSCR
register from inside an CPU_FTR_ARCH_207S section to inside just a
CPU_FTR_ARCH_DSCR section. Hence we are setting FSCR on POWER6/7 where
the FSCR doesn't exist. This is harmless but we shouldn't do it.
Also, we can simplify the FSCR context switch. We don't need to go
through the calculation involving dscr_inherit. We can just restore
what we saved last time.
We also set an initial value in INIT_THREAD, so that pid 1 which is
cloned from that gets a sane value.
Based on patch by Jack Miller.
Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc: Fix misleading comment in early_setup_secondary()
Current comment in the early_setup_secondary() for paca->soft_enabled
update is misleading. Comment should say to Mark interrupts "disabled"
instead of "enabled". Fix the typo.
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The problem is that the kretprobe_trampoline symbol looks like this:
$ eu-readelf -s /boot/vmlinux G kretprobe_trampoline
2431: c000000001302368 24 NOTYPE LOCAL DEFAULT 37 kretprobe_trampoline_holder
2432: c00000000003d300 8 FUNC LOCAL DEFAULT 1 .kretprobe_trampoline_holder
97543: c00000000003d300 0 NOTYPE GLOBAL DEFAULT 1 kretprobe_trampoline
Its type is NOTYPE, and its size is 0, and this is a problem because
symbol-elf.c:dso__load_sym skips function symbols that are not STT_FUNC
or STT_GNU_IFUNC (this is determined by elf_sym__is_function). Even
if the type is changed to STT_FUNC, when dso__load_sym calls
symbols__fixup_duplicate, the kretprobe_trampoline symbol is dropped in
favour of .kretprobe_trampoline_holder because the latter has non-zero
size (as determined by choose_best_symbol).
With this patch, all vmlinux symbols match /proc/kallsyms and the
testcase passes.
Commit c1c355ce14c0 ("x86/kprobes: Get rid of
kretprobe_trampoline_holder()") gets rid of kretprobe_trampoline_holder
altogether on x86. This commit does the same on powerpc. This change
introduces no regressions on the perf and ftracetest testsuite results.
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Russell Currey [Fri, 17 Jun 2016 05:25:17 +0000 (15:25 +1000)]
powerpc/pci: Fix SRIOV not building without EEH enabled
On Book3E CPUs (and possibly other configs), it is possible to have SRIOV
(CONFIG_PCI_IOV) set without CONFIG_EEH. The SRIOV code does not check
for this, and if EEH is disabled, pci_dn.c fails to build.
Fix this by gating the EEH-specific code in the SRIOV implementation
behind CONFIG_EEH.
Fixes: 39218cd0 ("powerpc/eeh: EEH device for VF") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Frederic Barrat [Wed, 15 Jun 2016 14:42:16 +0000 (16:42 +0200)]
cxl: Make vPHB device node match adapter's
On bare-metal, when a device is attached to the cxl card, lsvpd shows
a location code such as (with cxlflash):
# lsvpd -l sg22
...
*YL U78CB.001.WZS0073-P1-C33-B0-T0-L0
which makes it hard to easily identify the cxl adapter owning the
flash device, since in this example C33 refers to a P8 processor.
lsvpd looks in the parent devices until it finds a location code, so the
device node for the vPHB ends up being used.
By reusing the device node of the adapter for the vPHB, lsvpd shows:
# lsvpd -l sg16
...
*YL U78C9.001.WZS09XA-P1-C7-B1-T0-L3
where C7 is the PCI slot of the cxl adapter.
On powerVM, the vPHB was already using the adapter device node, so
there's no change there.
Tested by cxlflash on bare-metal and powerVM.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Ian Munsie [Wed, 8 Jun 2016 05:09:54 +0000 (15:09 +1000)]
cxl: Add support for CAPP DMA mode
This adds support for using CAPP DMA mode, which is required for XSL
based cards such as the Mellanox CX4 to function.
This is currently an RFC as it depends on the corresponding support to
be merged into skiboot first, which was submitted here:
http://patchwork.ozlabs.org/patch/625582/
In the event that the skiboot on the system does not have the above
support, it will indicate as such in the kernel log and abort the init
process.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Frederic Barrat [Mon, 23 May 2016 17:39:18 +0000 (03:39 +1000)]
cxl: Abstract the differences between the PSL and XSL
The XSL (Translation Service Layer) is a stripped down version of the
PSL (Power Service Layer) used in some cards such as the Mellanox CX4.
Like the PSL, it implements the CAIA architecture, but has a number of
differences, mostly in it's implementation dependent registers. This
adds an ops structure to abstract these differences to bring initial
support for XSL CAPI devices.
The XSL does not implement the optional architected SERR register,
however while it treats it as a reserved register and should work with
no special treatment, attempting to access it will cause the XSL_FEC
(First Error Capture) register to be filled out, preventing it from
capturing any subsequent errors. Therefore, this patch also prevents the
kernel from trying to set up the SERR register so that the FEC register
may still be useful, and to save one interrupt.
The XSL also uses a special DMA cxl mode, which uses a slightly
different init sequence for the CAPP and PHB. The kernel support for
this will be in a future patch once the corresponding support has been
merged into skiboot.
Co-authored-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Ian Munsie [Mon, 23 May 2016 16:14:05 +0000 (02:14 +1000)]
cxl: Update process element after allocating interrupts
In the kernel API, it is possible to attempt to allocate AFU interrupts
after already starting a context. Since the process element structure
used by the hardware is only filled out at the time the context is
started, it will not be updated with the interrupt numbers that have
just been allocated and therefore AFU interrupts will not work unless
they were allocated prior to starting the context.
This can present some difficulties as each CAPI enabled PCI device in
the kernel API has a default context, which may need to be started very
early to enable translations, potentially before interrupts can easily
be set up.
This patch makes the API more flexible to allow interrupts to be
allocated after a context has already been started and takes care of
updating the PE structure used by the hardware and notifying it to
discard any cached copy it may have.
The update is currently performed via a terminate/remove/add sequence.
This is necessary on some hardware such as the XSL that does not
properly support the update LLCMD.
Note that this is only supported on powernv at present - attempting to
perform this ordering on PowerVM will raise a warning.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Andrew Donnellan [Mon, 18 Apr 2016 05:03:50 +0000 (15:03 +1000)]
cxl: static-ify variables to fix sparse warnings
Make a couple more variables static. Found by sparse.
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: fbarrat@linux.vnet.ibm.com Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
REG_BYTE is defined differently based on whether we're compiling for
LE, BE32 or BE64. Sparse apparently doesn't provide __BIG_ENDIAN__ or
__LITTLE_ENDIAN__, which means we get no definition.
Rather than check for __BIG_ENDIAN__ and then separately for
__LITTLE_ENDIAN__, just switch the #ifdef to check for __BIG_ENDIAN__
and then #else we define the little endian version. Technically that's
dicey because PDP_ENDIAN is also a possibility, but we already do it in
a lot of places so one more hardly matters.
Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Daniel Axtens [Wed, 18 May 2016 01:16:50 +0000 (11:16 +1000)]
powerpc: Introduce asm-prototypes.h
Sparse picked up a number of functions that are implemented in C and
then only referred to in asm code.
This introduces asm-prototypes.h, which provides a place for
prototypes of these functions.
This silences some sparse warnings.
Signed-off-by: Daniel Axtens <dja@axtens.net>
[mpe: Add include guards, clean up copyright & GPL text] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc: Add array bounds checking to crash_shutdown_handlers
The array crash_shutdown_handles is an array of size CRASH_HANDLER_MAX+1
containing up to CRASH_HANDLER_MAX shutdown_handlers. It is assumed to
be NULL terminated, which it is under normal circumstances. Array
accesses in the functions crash_shutdown_unregister() and
default_machine_crash_shutdown() rely on this NULL termination property
when traversing this list and don't protect again out of bounds accesses.
If the NULL terminator were somehow overwritten these functions could
potentially access out of the bounds of the array.
Shrink the array to size CRASH_HANDLER_MAX and implement explicit array
bounds checking when accessing the elements of the
crash_shutdown_handles[] array in crash_shutdown_unregister() and
default_machine_crash_shutdown().
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The mm zone mechanism was traditionally used by arch specific code to
partition memory into allocation zones. However there are several zones
that are managed by the mm subsystem rather than the architecture. Most
architectures set the max PFN of these special zones to zero, however on
powerpc we set them to ~0ul. This, in conjunction with a bug in
free_area_init_nodes() results in all of system memory being placed in
ZONE_DEVICE when enabled. Device memory cannot be used for regular kernel
memory allocations so this will cause a kernel panic at boot. Given the
planned addition of more mm managed zones (ZONE_CMA) we should aim to be
consistent with every other architecture and set the max PFN for these
zones to zero.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rashmica Gupta [Wed, 1 Jun 2016 22:56:47 +0000 (08:56 +1000)]
powerpc/asm: Remove unused symbols in asm-offsets.c
THREAD_DSCR:
Added in efcac6589a27 "powerpc: Per process DSCR + some fixes (try#4)"
Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"
THREAD_DSCR_INHERIT:
Added in 714332858bfd "powerpc: Restore correct DSCR in context switch"
Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"
THREAD_TAR:
Added in 2468dcf641e4 "powerpc: Add support for context switching the TAR register"
Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"
THREAD_BESCR, THREAD_EBBHR and THREAD_EBBRR:
Added in 9353374b8e15 "powerpc: Context switch the new EBB SPRs"
Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"
THREAD_SIAR, THREAD_SDAR, THREAD_SIER, THREAD_MMCR0, and THREAD_MMCR2:
Added in 59affcd3e460 "powerpc: Context switch more PMU related SPRs"
Last usage removed in b11ae95100f7 "powerpc: Partial revert of "Context switch more PMU related SPRs""
PACA_LOCK_TOKEN:
Added in 9e368f291560 "KVM: PPC: book3s_hv: Add support for PPC970-family processors"
Last usage removed in c17b98cf6028 "KVM: PPC: Book3S HV: Remove code for PPC970 processors"
HCALL_STAT_SIZE, HCALL_STAT_CALLS, HCALL_STAT_TB and HCALL_STAT_PURR:
Added in 57852a853b0d "[POWERPC] powerpc: Instrument Hypervisor Calls"
Last usage removed in c8cd093a6e9f "powerpc: tracing: Add hypervisor call tracepoints"
VCPU_EPLC:
Added in d30f6e480055 "KVM: PPC: booke: category E.HV (GS-mode) support"
Never used.
CPU_DOWN_FLUSH:
Added in e7affb1dba0e "powerpc/cache: add cache flush operation for various e500"
Never used.
CFG_STAMP_XSEC:
Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
Last usage removed in 0e469db8f70c "powerpc: Rework VDSO gettimeofday to prevent time going backwards"
KVM_LPCR:
Added in aa04b4cc5be6 "KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests"
Last usage removed in a0144e2a6b0b "KVM: PPC: Book3S HV: Store LPCR value for each virtual core"
GPR15, GPR16, GPR17, GPR18, GPR19, GPR20, GPR21, GPR22, GPR23, GPR24,
GPR25, GPR26, GPR27, GPR28, GPR29, GPR30 and GPR31:
Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
Never used.
VCPU_SHADOW_FSCR:
Added in 616dff860282 "KVM: PPC: Book3S PR: Handle Facility interrupt and FSCR"
Never used.
VCPU_SHADOW_SRR1:
Added in a2d56020d1d9 "KVM: PPC: Book3S PR: Keep volatile reg values in vcpu rather than shadow_vcpu"
Never used.
KVM_SPLIT_SIZE:
Added in b4deba5c41e9 "KVM: PPC: Book3S HV: Implement dynamicmicro-threading on POWER8"
Never used.
VCPU_VCPUID:
Added in de56a948b918 "KVM: PPC: Add support for Book3S processors in hypervisor mode"
Last usage removed 1b400ba0cd24 "KVM: PPC: Book3S HV: Improve handling of local vs. global TLB invalidations"
_MQ:
Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
Never used.
AUDITCONTEXT:
Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
Last usage removed in 401d1f029beb "[PATCH] syscall entry/exit revamp"
CLONE_VM:
Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
Currently unused.
CLONE_UNTRACED:
Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
Currently unused.
Bharata B Rao [Thu, 12 May 2016 13:34:15 +0000 (19:04 +0530)]
powerpc/numa: Fix multiple bugs in memory_hotplug_max()
memory_hotplug_max() uses hot_add_drconf_memory_max() to get maxmimum
addressable memory by referring to ibm,dyanamic-memory property. There
are three problems with the current approach:
1 hot_add_drconf_memory_max() assumes that ibm,dynamic-memory includes
all the LMBs of the guest, but that is not true for PowerKVM which
populates only DR LMBs (LMBs that can be hotplugged/removed) in that
property.
2 hot_add_drconf_memory_max() multiplies lmb-size with lmb-count to arrive
at the max possible address. Since ibm,dynamic-memory doesn't include
RMA LMBs, the address thus obtained will be less than the actual max
address. For example, if max possible memory size is 32G, with lmb-size
of 256MB there can be 127 LMBs in ibm,dynamic-memory (1 LMB for RMA
which won't be present here). hot_add_drconf_memory_max() would then
return the max addressable memory as 127 * 256MB = 31.75GB, the max
address should have been 32G which is what ibm,lrdr-capacity shows.
3 In PowerKVM, there can be a gap between the end of boot time RAM and
beginning of hotplug RAM area. So just multiplying lmb-count with
lmb-size will not provide the correct max possible address for PowerKVM.
This patch fixes 1 by using ibm,lrdr-capacity property to return the max
addressable memory whenever the property is present. Then it fixes 2 & 3
by fetching the address of the last LMB in ibm,dynamic-memory property.
Fixes: cd34206e949b ("powerpc: Add memory_hotplug_max()") Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Bharata B Rao [Thu, 12 May 2016 13:34:14 +0000 (19:04 +0530)]
powerpc/numa: Fix whitespace in hot_add_drconf_memory_max()
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Boqun Feng [Fri, 10 Jun 2016 03:51:28 +0000 (11:51 +0800)]
powerpc/spinlock: Fix spin_unlock_wait()
There is an ordering issue with spin_unlock_wait() on powerpc, because
the spin_lock primitive is an ACQUIRE and an ACQUIRE is only ordering
the load part of the operation with memory operations following it.
Therefore the following event sequence can happen:
CPU 1 CPU 2 CPU 3
================== ==================== ==============
spin_unlock(&lock);
spin_lock(&lock):
r1 = *lock; // r1 == 0;
o = object; o = READ_ONCE(object); // reordered here
object = NULL;
smp_mb();
spin_unlock_wait(&lock);
*lock = 1;
smp_mb();
o->dead = true; < o = READ_ONCE(object); > // reordered upwards
if (o) // true
BUG_ON(o->dead); // true!!
To fix this, we add a "nop" ll/sc loop in arch_spin_unlock_wait() on
ppc, the "nop" ll/sc loop reads the lock
value and writes it back atomically, in this way it will synchronize the
view of the lock on CPU1 with that on CPU2. Therefore in the scenario
above, either CPU2 will fail to get the lock at first or CPU1 will see
the lock acquired by CPU2, both cases will eliminate this bug. This is a
similar idea as what Will Deacon did for ARM64 in:
d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")
Furthermore, if the "nop" ll/sc figures out the lock is locked, we
actually don't need to do the "nop" ll/sc trick again, we can just do a
normal load+check loop for the lock to be released, because in that
case, spin_unlock_wait() is called when someone is holding the lock, and
the store part of the "nop" ll/sc happens before the lock release of the
current lock holder:
"nop" ll/sc -> spin_unlock()
and the lock release happens before the next lock acquisition:
spin_unlock() -> spin_lock() <next holder>
which means the "nop" ll/sc happens before the next lock acquisition:
This patch therefore fixes the issue and also cleans the
arch_spin_unlock_wait() a little bit by removing superfluous memory
barriers in loops and consolidating the implementations for PPC32 and
PPC64 into one.
Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
[mpe: Inline the "nop" ll/sc loop and set EH=0, munge change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We're approaching 20 locations where we need to check for ELF ABI v2.
That's fine, except the logic is a bit awkward, because we have to check
that _CALL_ELF is defined and then what its value is.
So check it once in asm/types.h and define PPC64_ELF_ABI_v2 when ELF ABI
v2 is detected.
We also have a few places where what we're really trying to check is
that we are using the 64-bit v1 ABI, ie. function descriptors. So also
add a #define for that, which simplifies several checks.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rashmica Gupta [Mon, 30 May 2016 06:18:15 +0000 (16:18 +1000)]
powerpc/pseries: Remove MPIC from pseries event sources
MPIC was only used by Power3 which is now unsupported, so remove MPIC
code. XICS is now the only supported interrupt controller for
pSeries so do some cleanups too.
Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rashmica Gupta [Mon, 30 May 2016 06:18:13 +0000 (16:18 +1000)]
powerpc/pseries: Remove MPIC from pseries kexec
MPIC was only used by Power3 which is now unsupported, so remove MPIC
code. XICS is now the only supported interrupt controller for
pSeries so do some cleanups too.
Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rashmica Gupta [Mon, 30 May 2016 06:18:12 +0000 (16:18 +1000)]
powerpc/pseries: Remove MPIC from pseries smp
MPIC was only used by Power3 which is now unsupported, so remove MPIC
code. XICS is now the only supported interrupt controller for
pSeries so do some cleanups too.
Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rashmica Gupta [Mon, 30 May 2016 06:18:11 +0000 (16:18 +1000)]
powerpc/pseries: Drop support for MPIC in pseries
MPIC was only used by Power3 which is now unsupported, so drop support
for MPIC. XICS is now the only supported interrupt controller for
pSeries so make the XICS functions generic.
Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Anton Blanchard [Wed, 25 May 2016 22:38:13 +0000 (08:38 +1000)]
powerpc: Remove assembly versions of strcpy, strcat, strlen and strcmp
A number of our assembly implementations of string functions do not
align their hot loops. I was going to align them manually, but I
realised that they are are almost instruction for instruction
identical to what gcc produces, with the advantage that gcc does
align them.
In light of that, let's just remove the assembly versions.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Anton Blanchard [Sun, 29 May 2016 12:03:51 +0000 (22:03 +1000)]
powerpc: Avoid load hit store in setup_sigcontext()
In setup_sigcontext(), we set current->thread.vrsave then use it
straight after. Since current is hidden from the compiler via inline
assembly, it cannot optimise this and we end up with a load hit store.
Fix this by using a temporary.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Anton Blanchard [Sun, 29 May 2016 12:03:50 +0000 (22:03 +1000)]
powerpc: Avoid load hit store in __giveup_fpu() and __giveup_altivec()
In both __giveup_fpu() and __giveup_altivec() we make two modifications
to tsk->thread.regs->msr. gcc decides to do a read/modify/write of
each change, so we end up with a load hit store:
Linus Torvalds [Sun, 12 Jun 2016 13:30:39 +0000 (06:30 -0700)]
Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management fixes from Zhang Rui:
- fix an ordering issue in cpu cooling that cooling device is
registered before it's ready (freq_table being populated).
(Lukasz Luba)
- fix a missing comment update (Caesar Wang)
* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
thermal: add the note for set_trip_temp
thermal: cpu_cooling: fix improper order during initialization
Linus Torvalds [Sun, 12 Jun 2016 01:42:59 +0000 (18:42 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"A small collection of fixes for the current series. This contains:
- Two fixes for xen-blkfront, from Bob Liu.
- A bug fix for NVMe, releasing only the specific resources we
requested.
- Fix for a debugfs flags entry for nbd, from Josef.
- Plug fix from Omar, fixing up a case of code being switched between
two functions.
- A missing bio_put() for the new discard callers of
submit_bio_wait(), fixing a regression causing a leak of the bio.
From Shaun.
- Improve dirty limit calculation precision in the writeback code,
fixing a case where setting a limit lower than 1% of memory would
end up being zero. From Tejun"
* 'for-linus' of git://git.kernel.dk/linux-block:
NVMe: Only release requested regions
xen-blkfront: fix resume issues after a migration
xen-blkfront: don't call talk_to_blkback when already connected to blkback
nbd: pass the nbd pointer for flags debugfs
block: missing bio_put following submit_bio_wait
blk-mq: really fix plug list flushing for nomerge queues
writeback: use higher precision calculation in domain_dirty_limits()
Linus Torvalds [Sun, 12 Jun 2016 01:03:39 +0000 (18:03 -0700)]
Merge tag 'gpio-v4.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"A new bunch of GPIO fixes for v4.7.
This time I am very grateful that Ricardo Ribalda Delgado went in and
fixed my stupid refcounting mistakes in the removal path for GPIO
chips. I had a feeling something was wrong here and so it was. It
exploded on OMAP and it fixes their problem. Now it should be (more)
solid.
The rest i compilation, Kconfig and driver fixes. Some tagged for
stable.
Summary:
- Fix a NULL pointer dereference when we are searching the GPIO
device list but one of the devices have been removed (struct
gpio_chip pointer is NULL).
- Fix unaligned reference counters: we were ending on +3 after all
said and done. It should be 0. Remove an extraneous get_device(),
and call cdev_del() followed by device_del() in gpiochip_remove()
instead and the count goes to zero and calls the release() function
properly.
- Fix a compile warning due to a missing #include in the OF/device
tree portions.
- Select ANON_INODES for GPIOLIB, we're using that for our character
device. Some randconfig tests disclosed the problem.
- Make sure the Zynq driver clock runs also without CONFIG_PM enabled
- Fix an off-by-one error in the 104-DIO-48E driver
- Fix warnings in bcm_kona_gpio_reset()"
* tag 'gpio-v4.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: bcm-kona: fix bcm_kona_gpio_reset() warnings
gpio: select ANON_INODES
gpio: include <linux/io-mapping.h> in gpiolib-of
gpiolib: Fix unaligned used of reference counters
gpiolib: Fix NULL pointer deference
gpio: zynq: initialize clock even without CONFIG_PM
gpio: 104-dio-48e: Fix control port offset computation off-by-one error
Linus Torvalds [Sat, 11 Jun 2016 18:42:08 +0000 (11:42 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two current fixes:
- one affects Qemu CD ROM emulation, which stopped working after the
updates in SCSI to require VPD pages from all conformant devices.
Fix temporarily by blacklisting Qemu (we can relax later when they
come into compliance).
- The other is a fix to the optimal transfer size. We set up a
minefield for ourselves by being confused about whether the limits
are in bytes or sectors (SCSI optimal is in blocks and the queue
parameter is in bytes).
This tries to fix the problem (wrong setting for queue limits
max_sectors) and make the problem more obvious by introducing a
wrapper function"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
sd: Fix rw_max for devices that report an optimal xfer size
scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist
Linus Torvalds [Sat, 11 Jun 2016 18:24:54 +0000 (11:24 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- a bigger fix for i801 to finally be able to be loaded on some
machines again
- smaller driver fixes
- documentation update because of a renamed file
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: mux: reg: Provide of_match_table
i2c: mux: refer to i2c-mux.txt
i2c: octeon: Avoid printk after too long SMBUS message
i2c: octeon: Missing AAK flag in case of I2C_M_RECV_LEN
i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR
Linus Torvalds [Sat, 11 Jun 2016 17:55:30 +0000 (10:55 -0700)]
Merge tag '20160610_uvc_compat_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux
Pull uvc compat XU ioctl fixes from Andy Lutomirski:
"uvc's compat XU ioctls go through tons of potentially buggy
indirection. The first patch removes the indirection. The second one
cleans up the code.
Compile-tested only. I have the hardware, but I have absolutely no
idea what XU does, how to use it, what software to recompile as
32-bit, or what to test in that software"
* tag '20160610_uvc_compat_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux:
uvc_v4l2: Simplify compat ioctl implementation
uvc: Forward compat ioctls to their handlers directly
Linus Torvalds [Fri, 10 Jun 2016 21:13:27 +0000 (14:13 -0700)]
Merge branch 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"Has some fixes and some new self tests for btrfs. The self tests are
usually disabled in the .config file (unless you're doing btrfs dev
work), and this bunch is meant to find problems with the 64K page size
patches.
Jeff has a patch to help people see if they are using the hardware
assist crc32c module, which really helps us nail down problems when
people ask why crcs are using so much CPU.
Otherwise, it's small fixes"
* 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: self-tests: Fix extent buffer bitmap test fail on BE system
Btrfs: self-tests: Fix test_bitmaps fail on 64k sectorsize
Btrfs: self-tests: Use macros instead of constants and add missing newline
Btrfs: self-tests: Support testing all possible sectorsizes and nodesizes
Btrfs: self-tests: Execute page straddling test only when nodesize < PAGE_SIZE
btrfs: advertise which crc32c implementation is being used at module load
Btrfs: add validadtion checks for chunk loading
Btrfs: add more validation checks for superblock
Btrfs: clear uptodate flags of pages in sys_array eb
Btrfs: self-tests: Support non-4k page size
Btrfs: Fix integer overflow when calculating bytes_per_bitmap
Btrfs: test_check_exists: Fix infinite loop when searching for free space entries
Btrfs: end transaction if we abort when creating uuid root
btrfs: Use __u64 in exported linux/btrfs.h.
Linus Torvalds [Fri, 10 Jun 2016 19:23:49 +0000 (12:23 -0700)]
Merge tag 'powerpc-4.7-3Michael Ellerman:' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from
- ptrace: Fix out of bounds array access warning from Khem Raj
- pseries: Fix PCI config address for DDW from Gavin Shan
- pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added
from Michael Ellerman
- of: fix autoloading due to broken modalias with no 'compatible' from
Wolfram Sang
- radix: Fix always false comparison against MMU_NO_CONTEXT from Aneesh
Kumar K.V
- hash: Compute the segment size correctly for ISA 3.0 from Aneesh
Kumar K.V
- nohash: Fix build break with 64K pages from Michael Ellerman
* tag 'powerpc-4.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/nohash: Fix build break with 64K pages
powerpc/mm/hash: Compute the segment size correctly for ISA 3.0
powerpc/mm/radix: Fix always false comparison against MMU_NO_CONTEXT
of: fix autoloading due to broken modalias with no 'compatible'
powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added
powerpc/pseries: Fix PCI config address for DDW
powerpc/ptrace: Fix out of bounds array access warning
Linus Torvalds [Fri, 10 Jun 2016 19:18:34 +0000 (12:18 -0700)]
Merge tag 'hwmon-for-linus-v4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- fix regression in fam15h_power driver
- minor variable type fix in lm90 driver
- document compatible statement for ina2xx driver
* tag 'hwmon-for-linus-v4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (lm90) use proper type for update_interval
hwmon: (ina2xx) Document compatible for INA231
hwmon: (fam15h_power) Disable preemption when reading registers
Linus Torvalds [Fri, 10 Jun 2016 19:10:02 +0000 (12:10 -0700)]
Merge branch 'stacking-fixes' (vfs stacking fixes from Jann)
Merge filesystem stacking fixes from Jann Horn.
* emailed patches from Jann Horn <jannh@google.com>:
sched: panic on corrupted stack end
ecryptfs: forbid opening files without mmap handler
proc: prevent stacking filesystems on top
Jann Horn [Wed, 1 Jun 2016 09:55:07 +0000 (11:55 +0200)]
sched: panic on corrupted stack end
Until now, hitting this BUG_ON caused a recursive oops (because oops
handling involves do_exit(), which calls into the scheduler, which in
turn raises an oops), which caused stuff below the stack to be
overwritten until a panic happened (e.g. via an oops in interrupt
context, caused by the overwritten CPU index in the thread_info).
Jann Horn [Wed, 1 Jun 2016 09:55:06 +0000 (11:55 +0200)]
ecryptfs: forbid opening files without mmap handler
This prevents users from triggering a stack overflow through a recursive
invocation of pagefault handling that involves mapping procfs files into
virtual memory.
Jann Horn [Wed, 1 Jun 2016 09:55:05 +0000 (11:55 +0200)]
proc: prevent stacking filesystems on top
This prevents stacking filesystems (ecryptfs and overlayfs) from using
procfs as lower filesystem. There is too much magic going on inside
procfs, and there is no good reason to stack stuff on top of procfs.
(For example, procfs does access checks in VFS open handlers, and
ecryptfs by design calls open handlers from a kernel thread that doesn't
drop privileges or so.)
Linus Torvalds [Fri, 10 Jun 2016 18:57:17 +0000 (11:57 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"A fix for an issue that Alex saw whilst swapping with hardware
access/dirty bit support enabled in the kernel: Fix a failure to fault
in old pages on a write when CONFIG_ARM64_HW_AFDBM is enabled"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: always take dirty state from new pte in ptep_set_access_flags
Linus Torvalds [Fri, 10 Jun 2016 18:36:04 +0000 (11:36 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes from all around the map, plus a commit that introduces a
new header of Intel model name symbols (unused) that will make the
next merge window easier"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ioapic: Fix incorrect pointers in ioapic_setup_resources()
x86/entry/traps: Don't force in_interrupt() to return true in IST handlers
x86/cpu/AMD: Extend X86_FEATURE_TOPOEXT workaround to newer models
x86/cpu/intel: Introduce macros for Intel family numbers
x86, build: copy ldlinux.c32 to image.iso
x86/msr: Use the proper trace point conditional for writes
Linus Torvalds [Fri, 10 Jun 2016 18:15:41 +0000 (11:15 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"A handful of tooling fixes, two PMU driver fixes and a cleanup of
redundant code that addresses a security analyzer false positive"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Remove a redundant check
perf/x86/intel/uncore: Remove SBOX support for Broadwell server
perf ctf: Convert invalid chars in a string before set value
perf record: Fix crash when kptr is restricted
perf symbols: Check kptr_restrict for root
perf/x86/intel/rapl: Fix pmus free during cleanup
Linus Torvalds [Fri, 10 Jun 2016 17:53:46 +0000 (10:53 -0700)]
Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Misc fixes:
- a file-based futex fix
- one more spin_unlock_wait() fix
- a ww-mutex deadlock detection improvement/fix
- and a raw_read_seqcount_latch() barrier fix"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Calculate the futex key based on a tail page for file-based futexes
locking/qspinlock: Fix spin_unlock_wait() some more
locking/ww_mutex: Report recursive ww_mutex locking early
locking/seqcount: Re-fix raw_read_seqcount_latch()
Linus Torvalds [Fri, 10 Jun 2016 17:47:22 +0000 (10:47 -0700)]
Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"Two fixes: a regression/crash fix, and a message output fix"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/arm: Fix the format of EFI debug messages
efi: Fix for_each_efi_memory_desc_in_map() for empty memmaps
1) nfnetlink timestamp taken from wrong skb, fix from Florian Westphal.
2) Revert some msleep conversions in rtlwifi as these spots are in
atomic context, from Larry Finger.
3) Validate that NFTA_SET_TABLE attribute is actually specified when we
call nf_tables_getset(). From Phil Turnbull.
4) Don't do mdio_reset in stmmac driver with spinlock held as that can
sleep, from Vincent Palatin.
5) sk_filter() does things other than run a BPF filter, so we should
not elide it's call just because sk->sk_filter is NULL. Fix from
Eric Dumazet.
6) Fix missing backlog updates in several packet schedulers, from Cong
Wang.
7) bnx2x driver should allow VLAN add/remove while the interface is
down, from Michal Schmidt.
8) Several RDS/TCP race fixes from Sowmini Varadhan.
9) fq_codel scheduler doesn't return correct queue length in dumps,
from Eric Dumazet.
10) Fix TCP stats for tail loss probe and early retransmit in ipv6, from
Yuchung Cheng.
11) Properly initialize udp_tunnel_socket_cfg in l2tp_tunnel_create(),
from Guillaume Nault.
12) qfq scheduler leaks SKBs if a kzalloc fails, fix from Florian
Westphal.
13) sock_fprog passed into PACKET_FANOUT_DATA needs compat handling,
from Willem de Bruijn.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits)
vmxnet3: segCnt can be 1 for LRO packets
packet: compat support for sock_fprog
stmmac: fix parameter to dwmac4_set_umac_addr()
net/mlx5e: Fix blue flame quota logic
net/mlx5e: Use ndo_stop explicitly at shutdown flow
net/mlx5: E-Switch, always set mc_promisc for allmulti vports
net/mlx5: E-Switch, Modify node guid on vf set MAC
net/mlx5: E-Switch, Fix vport enable flow
net/mlx5: E-Switch, Use the correct error check on returned pointers
net/mlx5: E-Switch, Use the correct free() function
net/mlx5: Fix E-Switch flow steering capabilities check
net/mlx5: Fix flow steering NIC capabilities check
net/mlx5: Fix root flow table update
net/mlx5: Fix MLX5_CMD_OP_MAX to be defined correctly
net/mlx5: Fix masking of reserved bits in XRCD number
net/mlx5: Fix the size of modify QP mailbox
mlxsw: spectrum: Don't sleep during ndo_get_phys_port_name()
mlxsw: spectrum: Make split flow match firmware requirements
wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel
cfg80211: remove get/set antenna and tx power warnings
...
Linus Torvalds [Fri, 10 Jun 2016 15:27:30 +0000 (08:27 -0700)]
Merge tag 'sound-4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"We have only few, mainly HD-audio device-specific fixes. Realtek
codec driver got a slightly more LOC, but they are all for the new
codec chip, and won't affect others at all"
* tag 'sound-4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add PCI ID for Kabylake
ALSA: hda/realtek: Add T560 docking unit fixup
ALSA: hda - Fix headset mic detection problem for Dell machine
ALSA: uapi: Add three missing header files to Kbuild file
ALSA: hda/realtek - Add support for new codecs ALC700/ALC701/ALC703
ALSA: hda/realtek - ALC256 speaker noise issue
Linus Torvalds [Fri, 10 Jun 2016 15:21:06 +0000 (08:21 -0700)]
Merge tag 'drm-fixes-for-v4.7-rc3' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"This weeks instalment of fixes:
amdgpu:
Lots of memory leak and firmware leak fixes
nouveau:
Collection of display fixes, KASAN fixes
vc4:
vblank/pageflipping fixes
fsl-dcu:
Regmap cache fix
omap:
Unused variable warning fix.
Nothing too surprising so far"
* tag 'drm-fixes-for-v4.7-rc3' of git://people.freedesktop.org/~airlied/linux: (46 commits)
drm/amdgpu: fix warning with powerplay disabled.
drm/amd/powerplay: delete useless code as pptable changed in vbios.
drm/amd/powerplay: fix bug visit array out of bounds
drm/amdgpu: fix smu ucode memleak (v2)
drm/amdgpu: add release firmware for cgs
drm/amdgpu: fix tonga smu_fini mem leak
drm/amdgpu: fix fiji smu fini mem leak
drm/amdgpu: fix cik sdma ucode memleak
drm/amdgpu: fix sdma24 ucode mem leak
drm/amdgpu: fix sdma3 ucode mem leak
drm/amdgpu: fix uvd fini mem leak
drm/amdgpu: fix gfx 7 ucode mem leak
drm/amdgpu: fix gfx8 ucode mem leak
drm/amdgpu: fix missing free wb for cond_exec
drm/amdgpu: fix memleak in pptable_init
drm/amdgpu: fix mem leak in atombios
drm/amdgpu: fix mem leak in pplib/hwmgr
drm/amdgpu: fix mem leak in smumgr
drm/amdgpu: add pipeline sync while vmid switch in same ctx
drm/amdgpu: vBIOS post only call when mem_size zero
...
Linus Torvalds [Fri, 10 Jun 2016 15:15:37 +0000 (08:15 -0700)]
Merge tag 'acpi-4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"A recently introduced boot regression related to the ACPI EC
initialization is addressed by restoring the previous behavior (Lv
Zheng)"
* tag 'acpi-4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / EC: Fix a boot EC regresion by restoring boot EC support for the DSDT EC
Linus Torvalds [Fri, 10 Jun 2016 15:09:12 +0000 (08:09 -0700)]
Merge tag 'pm-4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"Stable-candidate fixes for the intel_pstate driver and the cpuidle
core.
Specifics:
- Fix two intel_pstate initialization issues, one of which was
introduced during the 4.4 cycle (Srinivas Pandruvada)
- Fix kernel build with CONFIG_UBSAN set and CONFIG_CPU_IDLE unset
(Catalin Marinas)"
* tag 'pm-4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Fix ->set_policy() interface for no_turbo
cpufreq: intel_pstate: Fix code ordering in intel_pstate_set_policy()
cpuidle: Do not access cpuidle_devices when !CONFIG_CPU_IDLE
Linus Torvalds [Fri, 10 Jun 2016 15:00:47 +0000 (08:00 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"7 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/fadvise.c: do not discard partial pages with POSIX_FADV_DONTNEED
mm: introduce dedicated WQ_MEM_RECLAIM workqueue to do lru_add_drain_all
kernel/relay.c: fix potential memory leak
mm: thp: broken page count after commit aa88b68c3b1d
revert "mm: memcontrol: fix possible css ref leak on oom"
kasan: change memory hot-add error messages to info messages
mm/hugetlb: fix huge page reserve accounting for private mappings
Rui Wang [Wed, 8 Jun 2016 06:59:52 +0000 (14:59 +0800)]
x86/ioapic: Fix incorrect pointers in ioapic_setup_resources()
On a 4-socket Brickland system, hot-removing one ioapic is fine.
Hot-removing the 2nd one causes panic in mp_unregister_ioapic()
while calling release_resource().
It is because the iomem_res pointer has already been released
when removing the first ioapic.
To explain the use of &res[num] here: res is assigned to ioapic_resources,
and later in ioapic_insert_resources() we do:
Here 'r' is treated as an arry of 'struct resource', and the r++ ensures
that each element of the array is inserted separately. Thus we should call
release_resouce() on each element at &res[num].
Fix it by assigning the correct pointers to ioapics[i].iomem_res in
ioapic_setup_resources().
Andy Lutomirski [Tue, 24 May 2016 22:54:04 +0000 (15:54 -0700)]
x86/entry/traps: Don't force in_interrupt() to return true in IST handlers
Forcing in_interrupt() to return true if we're not in a bona fide
interrupt confuses the softirq code. This fixes warnings like:
NOHZ: local_softirq_pending 282
... which can happen when running things like selftests/x86.
This will change perf's static percpu buffer usage in IST context.
I think this is okay, and it's changing the behavior to match
historical (pre-4.0) behavior.
Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: 959274753857 ("x86, traps: Track entry into and exit from IST context") Link: http://lkml.kernel.org/r/cdc215f94d118d691d73df35275022331156fb45.1464130360.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Socket option PACKET_FANOUT_DATA takes a struct sock_fprog as argument
if PACKET_FANOUT has mode PACKET_FANOUT_CBPF. This structure contains
a pointer into user memory. If userland is 32-bit and kernel is 64-bit
the two disagree about the layout of struct sock_fprog.
Add compat setsockopt support to convert a 32-bit compat_sock_fprog to
a 64-bit sock_fprog. This is analogous to compat_sock_fprog support for
SO_REUSEPORT added in commit 1957598840f4 ("soreuseport: add compat
case for setsockopt SO_ATTACH_REUSEPORT_CBPF").
Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks [Wed, 8 Jun 2016 18:21:17 +0000 (19:21 +0100)]
stmmac: fix parameter to dwmac4_set_umac_addr()
The dwmac4_set_umac_addr() takes a struct mac_device_info as
the first parameter, but is being passed a ioaddr instead from
dwmac4_set_filter(). Fix the warning/bug by changing the first
parameter.
drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c:159:46: warning: incorrect type in argument 1 (different address spaces)
drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c:159:46: expected struct mac_device_info *hw
drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c:159:46: got void [noderef] <asn:2>*ioaddr
Note, only compile tested this as do not have any
hardware with it in.
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Jun 2016 05:06:27 +0000 (22:06 -0700)]
Merge branch 'mlx5-fixes'
Saeed Mahameed says:
====================
Mellanox 100G mlx5 fixes for 4.7-rc
The following series provides some small fixes for mlx5 driver.
Two small fixes for the mlx5e netdev, the 1st is for the blue flame
quota accounting and the 2nd is a small refactoring in shutdown flow.
Five trivial fixes for mlx5 E-Switch.
- Allmulti mc_promisc flag was not set in a specific flow.
- Modify VF node guid when admin mac is changed.
- Race in vport enable flow.
- Misc code fixes (kvfree when needed and error pointers checking).
Three in mlx5 steering area. Correct capabilities checking and root flow table update.
Three misc fixes in mlx5 commands enum and layouts.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eli Cohen [Thu, 9 Jun 2016 21:07:40 +0000 (00:07 +0300)]
net/mlx5e: Fix blue flame quota logic
Blue flame is a latency enhancement feature that allows the driver to
write the packet data directly to the NIC's registers thus making the
read of the packet data from host memory redundant.
We maintain a quota for the blue flame which is reloaded whenever we
identify that the hardware is processing send requests and processes
them fast enough so by the time we post the next send request it was
able to process all the pending ones. This indicates that the hardware
is capable of processing more blue flame requests efficiently. The blue
flame quota is decremented whenever we send using blue flame.
The current code erroneously clears the budget if we did not use blue
flame for the current post send operation and we fix it here.
Fixes: 88a85f99e51f ('net/mlx5e: TX latency optimization to save DMA reads') Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eran Ben Elisha [Thu, 9 Jun 2016 21:07:39 +0000 (00:07 +0300)]
net/mlx5e: Use ndo_stop explicitly at shutdown flow
The current implementation copies the flow of ndo_stop instead of
calling it explicitly, Fixed it.
Fixes: 5fc7197d3a25 ("net/mlx5: Add pci shutdown callback") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Noa Osherovich [Thu, 9 Jun 2016 21:07:37 +0000 (00:07 +0300)]
net/mlx5: E-Switch, Modify node guid on vf set MAC
In RoCE, the RDMA-CM needs the node guid to establish connection
between nodes.
Today, the node guid exposed to mlx5 Ethernet VFs is zero, therefore
RDMA-CM on the VF is broken.
Whenever the administrator sets a MAC for a VF, derive the node guid
from it and set it as well in the following way:
MAC: e4:1d:2d:b3:f4:01 -> node_guid: e4:1d:2d:ff:fe:b3:f4:01
Fixes: 77256579c6b43 ('net/mlx5: E-Switch, Introduce Vport...') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reorder vport enable flow to mark the vport as enabled before calling
the vport change handler which was modified to handle the case for
when vport is not enabled.
This fixes the case for when the PF netdev is open before sriov is
enabled, once sriov is enabled at esw_enable_vport,
esw_vport_change_handle_locked didn't read the PF context since it
thought the PF vport was not enabled.
When we enable the vport, arming for events is not required anymore,
since it's done on the vport change handle
Fixes: 586cfa7f1d58 ('net/mlx5: E-Switch, Use vport event handler for vport cleanup') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Thu, 9 Jun 2016 21:07:35 +0000 (00:07 +0300)]
net/mlx5: E-Switch, Use the correct error check on returned pointers
The mlx5 flow-steering API (mlx5_create_flow_table/group/rule) never
returns null pointer on error. Even if it was doing that, checking
for IS_ERR_OR_NULL(p) and then returning PTR_ERR(p) would have cause
bugs, since PTR_ERR(NULL) --> success, crash.
To make things more robust and protect against related future bugs,
convert all IS_ERR_OR_NULL checks on returned values to IS_ERR.
Fixes: 5742df0f7dbe ('net/mlx5: E-Switch, Introduce VST vport ingress/egress ACLs') Fixes: 86d722ad2c3b ('net/mlx5: Use flow steering infrastructure for mlx5_en') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Maor Gottlieb [Thu, 9 Jun 2016 21:07:32 +0000 (00:07 +0300)]
net/mlx5: Fix flow steering NIC capabilities check
Flow steering infrastructure is currently used only on link layer
ethernet, therefore the driver should initialize the flow steering
when the device link layer is ethernet.
In addition, add missing capability check before initializing the
namespace of NIC RX flow tables.
Fixes: 2530236303d9 ('net/mlx5_core: Flow steering tree initialization') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Maor Gottlieb [Thu, 9 Jun 2016 21:07:31 +0000 (00:07 +0300)]
net/mlx5: Fix root flow table update
When we destroy the last flow table we need to update
the root_ft to NULL.
It fixes an issue for when the last flow table is destroyed
and recreated again, root_ft pointer will not be updated,
as a result traffic will be dropped.
Fixes: 2cc43b494a6c ('net/mlx5_core: Managing root flow table') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shahar Klein [Thu, 9 Jun 2016 21:07:30 +0000 (00:07 +0300)]
net/mlx5: Fix MLX5_CMD_OP_MAX to be defined correctly
Having MLX5_CMD_OP_MAX on another file causes us to repeatedly miss
accounting new commands added to the driver and hence there're no entries
for them in debugfs. To solve that, we integrate it into the commands enum
as the last entry.
Fixes: 34a40e689393 ('net/mlx5_core: Introduce modify flow table command') Signed-off-by: Shahar Klein <shahark@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Majd Dibbiny [Thu, 9 Jun 2016 21:07:29 +0000 (00:07 +0300)]
net/mlx5: Fix masking of reserved bits in XRCD number
Mask the reserved bits when reading the number of newly
created XRCD.
Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>