powerpc: tm: Always use fp_state and vr_state to store live registers
There is currently an inconsistency as to how the entire CPU register
state is saved and restored when a thread uses transactional memory
(TM).
Using transactional memory results in the CPU having duplicated
(almost) all of its register state. This duplication results in a set
of registers which can be considered 'live', those being currently
modified by the instructions being executed and another set that is
frozen at a point in time.
On context switch, both sets of state have to be saved and (later)
restored. These two states are often called a variety of different
things. Common terms for the state which only exists after the CPU has
entered a transaction (performed a TBEGIN instruction) in hardware are
'transactional' or 'speculative'.
Between a TBEGIN and a TEND or TABORT (or an event that causes the
hardware to abort), regardless of the use of TSUSPEND the
transactional state can be referred to as the live state.
The second state is often to referred to as the 'checkpointed' state
and is a duplication of the live state when the TBEGIN instruction is
executed. This state is kept in the hardware and will be rolled back
to on transaction failure.
Currently all the registers stored in pt_regs are ALWAYS the live
registers, that is, when a thread has transactional registers their
values are stored in pt_regs and the checkpointed state is in
ckpt_regs. A strange opposite is true for fp_state/vr_state. When a
thread is non transactional fp_state/vr_state holds the live
registers. When a thread has initiated a transaction fp_state/vr_state
holds the checkpointed state and transact_fp/transact_vr become the
structure which holds the live state (at this point it is a
transactional state).
This method creates confusion as to where the live state is, in some
circumstances it requires extra work to determine where to put the
live state and prevents the use of common functions designed (probably
before TM) to save the live state.
With this patch pt_regs, fp_state and vr_state all represent the
same thing and the other structures [pending rename] are for
checkpointed state.
Acked-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
selftests/powerpc: Add checks for transactional VSXs in signal contexts
If a thread receives a signal while transactional the kernel creates a
second context to show the transactional state of the process. This
test loads some known values and waits for a signal and confirms that
the expected values are in the signal context.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
selftests/powerpc: Add checks for transactional VMXs in signal contexts
If a thread receives a signal while transactional the kernel creates a
second context to show the transactional state of the process. This
test loads some known values and waits for a signal and confirms that
the expected values are in the signal context.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
selftests/powerpc: Add checks for transactional FPUs in signal contexts
If a thread receives a signal while transactional the kernel creates a
second context to show the transactional state of the process. This
test loads some known values and waits for a signal and confirms that
the expected values are in the signal context.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
selftests/powerpc: Add checks for transactional GPRs in signal contexts
If a thread receives a signal while transactional the kernel creates a
second context to show the transactional state of the process. This
test loads some known values and waits for a signal and confirms that
the expected values are in the signal context.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
selftests/powerpc: Rework FPU stack placement macros and move to header file
The FPU regs are placed at the top of the stack frame. Currently the
position expected to be passed to the macro. The macros now should be
passed the stack frame size and from there they can calculate where to
put the regs, this makes the use simpler.
Also move them to a header file to be used in an different area of the
powerpc selftests
Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
selftests/powerpc: Check for VSX preservation across userspace preemption
Ensure the kernel correctly switches VSX registers correctly. VSX
registers are all volatile, and despite the kernel preserving VSX
across syscalls, it doesn't have to. Test that during interrupts and
timeslices ending the VSX regs remain the same.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc: signals: Stop using current in signal code
Much of the signal code takes a pt_regs on which it operates. Over
time the signal code has needed to know more about the thread than
what pt_regs can supply, this information is obtained as needed by
using 'current'.
This approach is not strictly incorrect however it does mean that
there is now a hard requirement that the pt_regs being passed around
does belong to current, this is never checked. A safer approach is for
the majority of the signal functions to take a task_struct from which
they can obtain pt_regs and any other information they need. The
caveat that the task_struct they are passed must be current doesn't go
away but can more easily be checked for.
Functions called from outside powerpc signal code are passed a pt_regs
and they can confirm that the pt_regs is that of current and pass
current to other functions, furthurmore, powerpc signal functions can
check that the task_struct they are passed is the same as current
avoiding possible corruption of current (or the task they are passed)
if this assertion ever fails.
CC: paulus@samba.org Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc: Never giveup a reclaimed thread when enabling kernel {fp, altivec, vsx}
After a thread is reclaimed from its active or suspended transactional
state the checkpointed state exists on CPU, this state (along with the
live/transactional state) has been saved in its entirety by the
reclaiming process.
There exists a sequence of events that would cause the kernel to call
one of enable_kernel_fp(), enable_kernel_altivec() or
enable_kernel_vsx() after a thread has been reclaimed. These functions
save away any user state on the CPU so that the kernel can use the
registers. Not only is this saving away unnecessary at this point, it
is actually incorrect. It causes a save of the checkpointed state to
the live structures within the thread struct thus destroying the true
live state for that thread.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc: Return the new MSR from msr_check_and_set()
msr_check_and_set() always performs a mfmsr() to determine if it needs
to perform an mtmsr(), as mfmsr() can be a costly operation
msr_check_and_set() could return the MSR now on the CPU to avoid
callers of msr_check_and_set having to make their own mfmsr() call.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc: Add check_if_tm_restore_required() to giveup_all()
giveup_all() causes FPU/VMX/VSX facilities to be disabled in a threads
MSR. If the thread performing the giveup was transactional, the kernel
must record which facilities were in use before the giveup as the
thread must have these facilities re-enabled on return to userspace.
>From process.c:
/*
* This is called if we are on the way out to userspace and the
* TIF_RESTORE_TM flag is set. It checks if we need to reload
* FP and/or vector state and does so if necessary.
* If userspace is inside a transaction (whether active or
* suspended) and FP/VMX/VSX instructions have ever been enabled
* inside that transaction, then we have to keep them enabled
* and keep the FP/VMX/VSX state loaded while ever the transaction
* continues. The reason is that if we didn't, and subsequently
* got a FP/VMX/VSX unavailable interrupt inside a transaction,
* we don't know whether it's the same transaction, and thus we
* don't know which of the checkpointed state and the transactional
* state to use.
*/
Calling check_if_tm_restore_required() will set TIF_RESTORE_TM and
save the MSR if needed.
powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use
Comment from arch/powerpc/kernel/process.c:967:
If userspace is inside a transaction (whether active or
suspended) and FP/VMX/VSX instructions have ever been enabled
inside that transaction, then we have to keep them enabled
and keep the FP/VMX/VSX state loaded while ever the transaction
continues. The reason is that if we didn't, and subsequently
got a FP/VMX/VSX unavailable interrupt inside a transaction,
we don't know whether it's the same transaction, and thus we
don't know which of the checkpointed state and the ransactional
state to use.
restore_math() restore_fp() and restore_altivec() currently may not
restore the registers. It doesn't appear that this is more serious
than a performance penalty. If the math registers aren't restored the
userspace thread will still be run with the facility disabled.
Userspace will not be able to read invalid values. On the first access
it will take an facility unavailable exception and the kernel will
detected an active transaction, at which point it will abort the
transaction. There is the possibility for a pathological case
preventing any progress by transactions, however, transactions
are never guaranteed to make progress.
Fixes: 70fe3d9 ("powerpc: Restore FPU/VEC/VSX if previously used") Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Tue, 2 Aug 2016 04:10:30 +0000 (14:10 +1000)]
powerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag()
The hub diag-data type is filled with big-endian data by OPAL call
opal_pci_get_hub_diag_data(). We need convert it to CPU-endian value
before using it. The issue is reported by sparse as pointed by Michael
Ellerman:
eeh-powernv.c:1309:21: warning: restricted __be16 degrades to integer
This converts hub diag-data type to CPU-endian before using it in
pnv_eeh_get_and_dump_hub_diag().
Fixes: 2a485ad7c88d ("powerpc/powernv: Drop PHB operation next_error()") Cc: stable@vger.kernel.org # v4.1+ Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Tue, 2 Aug 2016 04:10:29 +0000 (14:10 +1000)]
powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear()
The PE number (@frozen_pe_no), filled by opal_pci_next_error() is in
big-endian format. It should be converted to CPU-endian before it is
passed to opal_pci_eeh_freeze_clear() when clearing the frozen state if
the PE is invalid one. As Michael Ellerman pointed out, the issue is
also detected by sparse:
eeh-powernv.c:1541:41: warning: incorrect type in argument 2 (different base types)
This passes CPU-endian PE number to opal_pci_eeh_freeze_clear() and it
should be part of commit <0f36db77643b> ("powerpc/eeh: Fix wrong printed
PE number"), which was merged to 4.3 kernel.
Fixes: 71b540adffd9 ("powerpc/powernv: Don't escalate non-existing frozen PE") Cc: stable@vger.kernel.org # v4.3+ Suggested-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
drivers/pci/hotplug: Use of_property_read_u32() in powernv driver
This replaces of_get_property() with of_property_read_u32() or
of_property_read_string() so that we needn't consider the endian
issue, the returned value always is in CPU-endian.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
[mpe: Fold in the change to the "ibm,slot-surprise-pluggable" case] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Andrew Donnellan [Fri, 29 Jul 2016 03:55:34 +0000 (13:55 +1000)]
cxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put()
Rewrite the cxl_guest_init_afu() loop in cxl_of_probe() to use
for_each_child_of_node() rather than a hand-coded for loop.
Remove the useless of_node_put(afu_np) call after the loop, where it's
guaranteed that afu_np == NULL.
Reported-by: SF Markus Elfring <elfring@users.sourceforge.net> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Frederic Barrat [Mon, 3 Oct 2016 19:36:02 +0000 (21:36 +0200)]
cxl: Flush PSL cache before resetting the adapter
If the capi link is going down while the PSL owns a dirty cache line,
any access from the host for that data could lead to an Uncorrectable
Error.
So when resetting the capi adapter through sysfs, make sure the PSL
cache is flushed. It won't help if there are any active Process
Elements on the card, as the cache would likely get new dirty cache
lines immediately, but if resetting an idle adapter, it should avoid
any bad surprises from data left over from terminated Process Elements.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Anton Blanchard [Sun, 25 Sep 2016 12:35:41 +0000 (22:35 +1000)]
powerpc: Set default CPU type to POWER8 for little endian builds
We supported POWER7 CPUs for bootstrapping little endian, but the
target was always POWER8. Now that POWER7 specific issues are
impacting performance, change the default target to POWER8.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Anton Blanchard [Sun, 25 Sep 2016 12:35:40 +0000 (22:35 +1000)]
powerpc: Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian
POWER8 handles unaligned accesses in little endian mode, but commit 0b5e6661ac69 ("powerpc: Don't set HAVE_EFFICIENT_UNALIGNED_ACCESS on
little endian builds") disabled it for all.
The issue with unaligned little endian accesses is specific to POWER7,
so update the Kconfig check to match. Using the stat() testcase from
commit a75c380c7129 ("powerpc: Enable DCACHE_WORD_ACCESS on ppc64le"),
performance improves 15% on POWER8.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Anton Blanchard [Mon, 3 Oct 2016 06:03:03 +0000 (17:03 +1100)]
powerpc: Remove static branch prediction in atomic{, 64}_add_unless
I see quite a lot of static branch mispredictions on a simple
web serving workload. The issue is in __atomic_add_unless(), called
from _atomic_dec_and_lock(). There is no obvious common case, so it
is better to let the hardware predict the branch.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Anton Blanchard [Mon, 3 Oct 2016 06:40:29 +0000 (17:40 +1100)]
powerpc: During context switch, check before setting mm_cpumask
During context switch, switch_mm() sets our current CPU in mm_cpumask.
We can avoid this atomic sequence in most cases by checking before
setting the bit.
Testing on a POWER8 using our context switch microbenchmark:
Anton Blanchard [Fri, 30 Sep 2016 10:39:51 +0000 (20:39 +1000)]
powerpc/configs: Bump kernel ring buffer size on 64 bit configs
When we issue a system reset, every CPU in the box prints an Oops,
including a backtrace. Each of these can be quite large (over 4kB)
and we may end up wrapping the ring buffer and losing important
information.
Bump the base size from 128kB to 256kB and the per CPU size from
4kB to 8kB.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Wed, 21 Sep 2016 07:44:07 +0000 (17:44 +1000)]
powerpc/64s: Remove unused exception code, small cleanups
This was not done before the big patches because I only noticed
them afterwards. It has become much easier to see which handlers
are branched to from which exception vectors now, and to see
exactly what vector space is being used for what.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Wed, 28 Sep 2016 01:31:48 +0000 (11:31 +1000)]
powerpc: Use gas sections for arranging exception vectors
Use assembler sections of fixed size and location to arrange the 64-bit
Book3S exception vector code (64-bit Book3E also uses it in head_64.S
for 0x0..0x100).
This allows better flexibility in arranging exception code and hiding
unimportant details behind macros.
Gas sections can be a bit painful to use this way, mainly because the
assembler does not know where they will be finally linked. Taking
absolute addresses requires a bit of trickery for example, but it can
be hidden behind macros for the most part.
Generated code is mostly the same except locations, offsets, alignments.
The "+ 0x2" is only required for the trap number / kvm exit number,
which gets loaded as a constant into a register.
Previously, code also used + 0x2 for label names, but we changed to
using "H" to distinguish HV case for that. Remove the last vestiges
of that.
__after_prom_start is taking absolute address of a label in another
fixed section. Newer toolchains seemed to compile this okay, but older
ones do not. FIXED_SYMBOL_ABS_ADDR is more foolproof, it just takes an
additional line to define.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Wed, 28 Sep 2016 01:31:47 +0000 (11:31 +1000)]
powerpc/64: Change the way relocation copy is calculated
With a subsequent patch to put text into different sections,
(_end - _stext) can no longer be computed at link time to determine
the end of the copy. Instead, calculate it at runtime with
(copy_to_here - _stext) + (_end - copy_to_here).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Move exception handler alignment directives into the head-64.h macros,
beause they will no longer work in-place after the next patch. This
slightly changes functions that have alignments applied and therefore
code generation, which is why it was not done initially (see earlier
patch).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Fri, 30 Sep 2016 09:43:18 +0000 (19:43 +1000)]
powerpc/64s: Add new exception vector macros
Create arch/powerpc/include/asm/head-64.h with macros that specify
an exception vector (name, type, location), which will be used to
label and lay out exceptions into the object file.
Naming is moved out of exception-64s.h, which is used to specify the
implementation of exception handlers.
objdump of generated code in exception vectors is unchanged except for
names. Alignment directives scattered around are annoying, but done
this way so that disassembly can verify identical instruction
generation before and after patch. These get cleaned up in future
patch.
We change the way KVMTEST works, explicitly passing EXC_HV or EXC_STD
rather than overloading the trap number. This removes the need to have
SOFTEN values for the overloaded trap numbers, eg. 0x502.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Anton Blanchard [Sun, 25 Sep 2016 07:16:53 +0000 (17:16 +1000)]
powerpc/vdso64: Use double word compare on pointers
__kernel_get_syscall_map() and __kernel_clock_getres() use cmpli to
check if the passed in pointer is non zero. cmpli maps to a 32 bit
compare on binutils, so we ignore the top 32 bits.
A simple test case can be created by passing in a bogus pointer with
the bottom 32 bits clear. Using a clk_id that is handled by the VDSO,
then one that is handled by the kernel shows the problem:
The bigger issue is if we pass a valid pointer with the bottom 32 bits
clear, in this case we will return success but won't write any data
to the pointer.
I stumbled across this issue because the LLVM integrated assembler
doesn't accept cmpli with 3 arguments. Fix this by converting them to
cmpldi.
Fixes: a7f290dad32e ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel") Cc: stable@vger.kernel.org # v2.6.15+ Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
KVM: PPC: Book3S HV: Migrate pinned pages out of CMA
When PCI Device pass-through is enabled via VFIO, KVM-PPC will
pin pages using get_user_pages_fast(). One of the downsides of
the pinning is that the page could be in CMA region. The CMA
region is used for other allocations like the hash page table.
Ideally we want the pinned pages to be from non CMA region.
This patch (currently only for KVM PPC with VFIO) forcefully
migrates the pages out (huge pages are omitted for the moment).
There are more efficient ways of doing this, but that might
be elaborate and might impact a larger audience beyond just
the kvm ppc implementation.
The magic is in new_iommu_non_cma_page() which allocates the
new page from a non CMA region.
I've tested the patches lightly at my end. The full solution
requires migration of THP pages in the CMA region. That work
will be done incrementally on top of this.
Signed-off-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[mpe: Merged via powerpc tree as that's where the changes are] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
drivers/pci/hotplug: Support surprise hotplug in powernv driver
This supports PCI surprise hotplug. The design is highlighted as
below:
* The PCI slot's surprise hotplug capability is exposed through
device node property "ibm,slot-surprise-pluggable", meaning
PCI surprise hotplug will be disabled if skiboot doesn't support
it yet.
* The interrupt because of presence or link state change is raised
on surprise hotplug event. One event is allocated and queued to
the PCI slot for workqueue to pick it up and process in serialized
fashion. The code flow for surprise hotplug is same to that for
managed hotplug except: the affected PEs are put into frozen state
to avoid unexpected EEH error reporting in surprise hot remove path.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
drivers/pci/hotplug: Remove likely() and unlikely() in powernv driver
This removes likely() and unlikely() in pnv_php.c as the code isn't
running in hot path. Those macros to affect CPU's branch stream don't
help a lot for performance. I used them to identify the cases are
likely or unlikely to happen. No logical changes introduced.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This unfreezes PE when it's initialized because the PE might be put
into frozen state in the last hot remove path. It's not harmful to
do so if the PE is already in unfrozen state.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This exports eeh_pe_state_mark(). It will be used to mark the surprise
hot removed PE as isolated to avoid unexpected EEH error reporting in
surprise remove path.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc/eeh: Allow to freeze PE in eeh_pe_set_option()
Function eeh_pe_set_option() is used to apply the requested options
(enable, disable, unfreeze) in EEH virtualization path. The semantics
of this function isn't complete until freezing is supported.
This allows to freeze the indicated PE. The new semantics is going to
be used in PCI surprise hot remove path, to freeze removed PCI devices
(PE) to avoid unexpected EEH error reporting.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gavin Shan [Fri, 24 Jun 2016 06:44:19 +0000 (16:44 +1000)]
powerpc/powernv: Call opal_pci_poll() if needed
When issuing PHB reset, OPAL API opal_pci_poll() is called to drive
the state machine in OPAL forward. However, we needn't always call
the function under some circumstances like reset deassert.
This avoids calling opal_pci_poll() when OPAL_SUCCESS is returned
from opal_pci_reset(). Except the overhead introduced by additional
one unnecessary OPAL call, I didn't run into real issue because of
this.
This patch adds an option to use XZ compression for the kernel image.
Currently this is only enabled for 64-bit Book3S targets, which is
roughly equivalent to the platforms that use the kernel's zImage
wrapper, and that have been tested.
The bulk of the 32-bit platforms and 64-bit BookE use uboot images,
which relies on uboot implementing XZ. In future we can enable XZ
support for those targets once someone has tested it.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc/boot: Add XZ support to the wrapper script
This modifies the wrapper script so that the -Z option takes an argument
to specify the compression type. It can either be 'gz', 'xz' or 'none'.
The legazy --no-gzip and -z options are still supported and will set the
compression to none and gzip respectively, but they are not documented.
Only XZ -6 is used for compression rather than XZ -9. Using compression
levels higher than 6 requires the decompressor to build a large (64MB)
dictionary when decompressing and some environments cannot satisfy such
large allocations (e.g. POWER 6 LPAR partition firmware).
Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently the powerpc boot wrapper has its own wrapper around zlib to
handle decompressing gzipped kernels. The kernel decompressor library
functions now provide a generic interface that can be used in the
pre-boot environment. This allows boot wrappers to easily support
different compression algorithms. This patch converts the wrapper to use
this new API, but does not add support for using new algorithms.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Most architectures allow the compression algorithm used to produced the
vmlinuz image to be selected as a kernel config option. In preperation
for supporting algorithms other than gzip in the powerpc boot wrapper
the makefile needs to be modified to use these config options.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The powerpc boot wrapper is potentially compiled with a separate
toolchain and/or toolchain flags than the rest of the kernel. The usual
case is a 64-bit big endian kernel builds a 32-bit big endian wrapper.
The main problem with this is that the wrapper does not have access to
the kernel headers (without a lot of gross hacks). To get around this
the required headers are copied into the build directory via several sed
scripts which rewrite problematic includes. This patch moves these
fixups out of the makefile into a separate .sed script file to clean up
makefile slightly.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Reword first paragraph of change log a little] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rui Teng [Fri, 2 Sep 2016 06:17:26 +0000 (14:17 +0800)]
powerpc: Clean up tm_abort duplication in hash_utils_64.c
The same logic appears twice and should probably be pulled out into a
function.
Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Rui Teng <rui.teng@linux.vnet.ibm.com>
[mpe: Rename to tm_flush_hash_page() and move comment into the function] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
CLR_TOP32() is defined as blank. Last useful instance of CLR_TOP32()
was removed by commit 40ef8cbc6d360 ("powerpc: Get 64-bit configs to
compile with ARCH=powerpc") in 2005.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On some CPUs like the 8xx, _PAGE_RW hence _PAGE_WRITE is defined
as 0 and _PAGE_RO has to be set when a page is not writable
_PAGE_RO is defined by default in pte-common.h, however BOOK3S/64
doesn't include that file so _PAGE_RO has to be defined explicitly
in book3s/64/pgtable.h
Fixes: a7b9f671f2d14 ("powerpc32: adds handling of _PAGE_RO") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Russell Currey [Mon, 12 Sep 2016 04:17:24 +0000 (14:17 +1000)]
powerpc/eeh: Skip finding bus until after failure reporting
In eeh_handle_special_event(), eeh_pe_bus_get() is called before calling
eeh_report_failure() on every device under a PE. If a PE was missing a
bus for some reason, the error would occur before reporting failure, even
though eeh_report_failure() doesn't require a bus.
Fix this by moving the bus retrieval and error check after the
eeh_report_failure() calls.
Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Russell Currey [Mon, 12 Sep 2016 04:17:23 +0000 (14:17 +1000)]
powerpc/powernv/eeh: Skip finding bus for VF resets
When the PE used in pnv_eeh_reset() is that of a VF,
pnv_eeh_reset_vf_pe() is used. Unlike the other reset functions called
in pnv_eeh_reset(), the VF reset doesn't require a bus, and if a bus was
missing the function would error out before resetting the VF PE.
To avoid this, reorder the VF reset function to occur before finding and
checking the bus.
Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Russell Currey [Mon, 12 Sep 2016 04:17:22 +0000 (14:17 +1000)]
powerpc/eeh: Null check uses of eeh_pe_bus_get
eeh_pe_bus_get() can return NULL if a PCI bus isn't found for a given PE.
Some callers don't check this, and can cause a null pointer dereference
under certain circumstances.
Fix this by checking NULL everywhere eeh_pe_bus_get() is called.
Fixes: 8a6b1bc70dbb ("powerpc/eeh: EEH core to handle special event") Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>