git.karo-electronics.de Git - linux-beck.git/log

cxl: Fix force unmapping mmaps of contexts allocated through the kernel api

The cxl user api uses the address_space associated with the file when we
need to force unmap all cxl mmap regions (e.g. on eeh, driver detach,
etc). Currently, contexts allocated through the kernel api do not do
this and instead skip the mmap invalidation, potentially allowing them
to poke at the hardware after such an event, which may cause all sorts
of trouble.

This patch allocates an address_space for cxl contexts allocated through
the kernel api so that the same invalidate path will for these contexts
as well. We don't use the anonymous inode's address_space, as doing so
could invalidate any mmaps of completely unrelated drivers using
anonymous file descriptors.

This patch also introduces a kernelapi flag, so we know when freeing the
context if the address_space was allocated by us and needs to be freed.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Fix + cleanup error paths in cxl_dev_context_init

If the cxl_context_alloc() call fails, we return immediately without
releasing the reference on the AFU device, allowing it to leak.

This patch switches to using goto style error handling so that the
device is released in common code for both error paths, and will also
simplify things if we add additional initialisation in this function in
the future.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()

The config space of some PCI devices can't be accessed when their
PEs are in frozen state. Otherwise, fenced PHB might be seen.
Those PEs are identified with flag EEH_PE_CFG_RESTRICTED, meaing
EEH_PE_CFG_BLOCKED is set automatically when the PE is put to
frozen state (EEH_PE_ISOLATED). eeh_slot_error_detail() restores
PCI device BARs with eeh_pe_restore_bars(), which then calls
eeh_ops->restore_config() to reinitialize the PCI device in
(OPAL) firmware. eeh_ops->restore_config() produces PCI config
access that causes fenced PHB. The problem was reported on below
adapter:

   0001:01:00.0 0200: 14e4:168e (rev 10)
   0001:01:00.0 Ethernet controller: Broadcom Corporation \
                NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)

This fixes the issue by skipping eeh_pe_restore_bars() in
eeh_slot_error_detail() when EEH_PE_CFG_BLOCKED is set for the PE.

Fixes: b6541db1 ("powerpc/eeh: Block PCI config access upon frozen PE")
Cc: stable@vger.kernel.org # v4.0+
Reported-by: Manvanthara B. Puttashankar <mputtash@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/pseries: Cleanup on pci_dn_reconfig_notifier()

This applies cleanup on pci_dn_reconfig_notifier(), no functional
changes:

   * Rename variable "pci" to "pdn" to indicate its purpose clearly.
   * The parent node can be released at any time. So it should be
     hold with of_get_parent() before accessing it.
   * The device node doesn't have to have parent node in theory.
     More check on this.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/pseries: Fix corrupted pdn list

Commit cca87d30 ("powerpc/pci: Refactor pci_dn") introduced pdn
list for SRIOV VFs. It means the pdn is be put into the child list
of its parent pdn when the pdn is created. When doing PCI hot
unplugging on pSeries, the PCI device node as well as its pdn are
released through procfs entry "powerpc/ofdt". Some one else grabs
the memory chunk of the pdn and update it accordingly. At the same
time, the pdn is still tracked in the child list of parent pdn. It
leads to corrupted child list in the parent pdn.

This fixes above issue by removing the pdn from the child list of
its parent pdn when the device node is detached from the system.
Note the pdn is free'd when the device node is released if the
device node is dynamic one. Otherwise, the device node as well
as the pdn won't be released.

Fixes: cca87d30 ("powerpc/pci: Refactor pci_dn")
Cc: stable@vger.kernel.org # 4.1+
Reported-by: Santwana Samantray <santwana.samantray@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next

Freescale updates from Scott:

"Highlights include 32-bit memcpy/memset optimizations, checksum
optimizations, 85xx config fragments and updates, device tree updates,
e6500 fixes for non-SMP, and misc cleanup and minor fixes."

powerpc/powernv: Enable LEDS support

Commit 84ad6e5c added LEDS support for PowerNV platform. Lets
update ppc64_defconfig to pick LEDS driver.

PowerNV LEDS driver looks for "/ibm,opal/leds" node in device
tree and loads if this node exists. Hence added it as 'm'.

Also note that powernv LEDS driver needs NEW_LEDS and LEDS_CLASS
as well. Hence added them to config file.

mpe: Also add them to pseries_defconfig, which is currently also used
for powernv systems.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/iommu: Set default DMA offset in dma_dev_setup

Commit e91c25111aa3 "powerpc/iommu: Cleanup setting of DMA base/offset"
expects that the default DMA offset is set from pnv_ioda_setup_bus_dma()
which is correct unless it is SRIOV where the code flow is different -
at the moment when pnv_ioda_setup_bus_dma() is called, PCI devices for
VFs are not created yet.

This adds missing set_dma_offset() to pnv_pci_ioda_dma_dev_setup() to
cover the case of SRIOV.

Note that we still need set_dma_offset() in pnv_ioda_setup_bus_dma() as
at the boot time pnv_pci_ioda_dma_dev_setup() is called when no PE was
created yet, this happens at the PHB fixup stage.

Fixes: e91c25111aa3 ("powerpc/iommu: Cleanup setting of DMA base/offset")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Remove racy attempt to force EEH invocation in reset

cxl_reset currently PERSTs the slot, and then repeatedly tries to
read MMIO space in order to kick off EEH.

There are 2 problems with this: it's unnecessary, and it's racy.

It's unnecessary because the PERST will bring down the PHB link.
That will be picked up by the CAPP, which will send out an HMI.
Skiboot, noticing an HMI from the CAPP, will send an OPAL
notification to the kernel, which will trigger EEH recovery.

It's also racy: the EEH recovery triggered by the CAPP will
eventually cause the MMIO space to have its mapping invalidated
and the pointer NULLed out. This races with our attempt to read
the MMIO space. This is causing OOPSes in testing.

Simply drop all the attempts to force EEH detection, and trust
that Skiboot will send the notification and that we'll act on it.
The Skiboot code to send the EEH notification has been in Skiboot
for as long as CAPP recovery has been supported, so we don't need
to worry about breaking obscure setups with ancient firmware.

Cc: Ryan Grimm <grimm@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 62fa19d4b4fd ("cxl: Add ability to reset the card")
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Release irqs if memory allocation fails

This minor patch plugs a potential irq leak in case of a memory
allocation failure inside function the afu_allocate_irqs. Presently the
irqs allocated to the context gets leaked if allocation of either
one of context irq_bitmap or irq_names fails.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Remove use of macro DEFINE_PCI_DEVICE_TABLE

Macro DEFINE_PCI_DEVICE_TABLE is deprecated. So, here use
struct pci_device_id instead of DEFINE_PCI_DEVICE_TABLE with
the goal of getting rid of this macro completely.

The Coccinelle semantic patch that performs this transformation
is as follows:

@@
identifier a;
declarer name DEFINE_PCI_DEVICE_TABLE;
initializer i;
@@
- DEFINE_PCI_DEVICE_TABLE(a)
+ const struct pci_device_id a[]
= i;

Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Fix mis-merge of OPAL support for LEDS driver

When I merged the OPAL support for the powernv LEDS driver I missed a
hunk.

This is slightly modified from the original patch, as the original added
code to opal-api.h which is not in the skiboot version, which is
discouraged.

Instead those values are moved into the driver, which is the only place
they are used.

Fixes: 8a8d91817aec ("powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states")
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Reset HILE before kexec_sequence()

On powernv secondary cpus are returned to OPAL, and will then enter
the target kernel in big-endian. However if it is set the HILE bit
will persist, causing the first exception in the target kernel to be
delivered in litte-endian regardless of the current endianness.

If running on top of OPAL make sure the HILE bit is reset once we've
finished waiting for all of the secondaries to be returned to OPAL.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/kexec: Reset secondary cpu endianness before kexec

If the target kernel does not inlcude the FIXUP_ENDIAN check, coming
from a different-endian kernel will cause the target kernel to panic.
All ppc64 kernels can handle starting in big-endian mode, so return to
big-endian before branching into the target kernel.

This mainly affects pseries as secondaries on powernv are returned to
OPAL.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/hvsi: Fix endianness issues in the HVSI driver

This patch fixes several endianness issues detected when running the HVSI
driver in little endian mode.

These issues are raised in little endian mode because the data exchanged in
memory between the kernel and the hypervisor has to be in big endian
format. This exhibits as errors such as:

irq: (null) didn't like hwirq-0x1000a00 to VIRQ16 mapping (rc=-22)
hvsi_console_init: couldn't create irq mapping for 0x1000a00

The data structures already have endian annotations, and sparse is
generating numerous warnings based on those. This commit fixes all of
them.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Jiri Slaby <jslaby@suse.cz>
CC: linuxppc-dev@lists.ozlabs.org
CC: linux-kernel@vger.kernel.org
[mpe: Flesh out change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

leds/powernv: Add driver for PowerNV platform

This patch implements LED driver for PowerNV platform using the existing
generic LED class framework.

PowerNV platform has below type of LEDs:
  - System attention
      Indicates there is a problem with the system that needs attention.
  - Identify
      Helps the user locate/identify a particular FRU or resource in the
      system.
  - Fault
      Indicates there is a problem with the FRU or resource at the
      location with which the indicator is associated.

We register classdev structures for all individual LEDs detected on the
system through LED specific device tree nodes. Device tree nodes specify
what all kind of LEDs present on the same location code. It registers
LED classdev structure for each of them.

All the system LEDs can be found in the same regular path /sys/class/leds/.
We don't use LED colors. We use LED node and led-types property to form
LED classdev. Our LEDs have names in this format.

        <location_code>:<attention|identify|fault>

Any positive brightness value would turn on the LED and a zero value would
turn off the LED. The driver will return LED_FULL (255) for any turned on
LED and LED_OFF (0) for any turned off LED.

The platform level implementation of LED get and set state has been
achieved through OPAL calls. These calls are made available for the
driver by exporting from architecture specific codes.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Tested-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Acked-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Create LED platform device

This patch adds platform devices for leds. Also export LED related
OPAL API's so that led driver can use these APIs.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states

This patch registers the following two new OPAL interfaces calls
for the platform LED subsystem. With the help of these new OPAL calls,
the kernel will be able to get or set the state of various individual
LEDs on the system at any given location code which is passed through
the LED specific device tree nodes.

(1) OPAL_LEDS_GET_INDICATOR opal_leds_get_ind
(2) OPAL_LEDS_SET_INDICATOR opal_leds_set_ind

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Tested-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Fix the log message when disabling VF

On powernv platform, IOV BAR would be shifted if necessary. While the log
message is not correct when disabling VFs.

This patch fixes this by print correct message based on the offset value.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Allow release of contexts which have been OPENED but not STARTED

If we open a context but do not start it (either because we do not attempt
to start it, or because it fails to start for some reason), we are left
with a context in state OPENED. Previously, cxl_release_context() only
allowed releasing contexts in state CLOSED, so attempting to release an
OPENED context would fail.

In particular, this bug causes available contexts to run out after some EEH
failures, where drivers attempt to release contexts that have failed to
start.

Allow releasing contexts in any state with a value lower than STARTED, i.e.
OPENED or CLOSED (we can't release a STARTED context as it's currently
using the hardware, and we assume that contexts in any new states which may
be added in future with a value higher than STARTED are also unsafe to
release).

Cc: stable@vger.kernel.org
Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

macintosh/therm_windtunnel: Export OF module alias information

The I2C core always reports the MODALIAS uevent as "i2c:<client name"
regardless if the driver was matched using the I2C id_table or the
of_match_table. So technically there's no need for a driver to export
the OF table since currently it's not used.

In fact, the I2C device ID table is mandatory for I2C drivers since
a i2c_device_id is passed to the driver's probe function even if the
I2C core used the OF table to match the driver.

And since the I2C core uses different tables, OF-only drivers needs to
have duplicated data that has to be kept in sync and also the dev node
compatible manufacturer prefix is stripped when reporting the MODALIAS.

To avoid the above, the I2C core behavior may be changed in the future
to not require an I2C device table for OF-only drivers and report the
OF module alias. So, it's better to also export the OF table to prevent
breaking module autoloading if that happens.

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

macintosh/therm_windtunnel: Export I2C module alias information

The I2C core always reports the MODALIAS uevent as "i2c:<client name"
regardless if the driver was matched using the I2C id_table or the
of_match_table. So the driver needs to export the I2C table and this
be built into the module or udev won't have the necessary information
to auto load the correct module when the device is added.

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/512x: silence a USB Kconfig dependency warning

the PPC_MPC512x config automatically selected USB_EHCI_BIG_ENDIAN_*
switches, which made Kconfig warn about "unmet direct dependencies":

  scripts/kconfig/conf --silentoldconfig Kconfig
  warning: (PPC_MPC512x && 440EPX) selects USB_EHCI_BIG_ENDIAN_DESC which has unmet direct dependencies (USB_SUPPORT && USB && USB_EHCI_HCD)
  warning: (PPC_MPC512x && PPC_PS3 && PPC_CELLEB && 440EPX) selects USB_EHCI_BIG_ENDIAN_MMIO which has unmet direct dependencies (USB_SUPPORT && USB && USB_EHCI_HCD)
  warning: (PPC_MPC512x && 440EPX) selects USB_EHCI_BIG_ENDIAN_DESC which has unmet direct dependencies (USB_SUPPORT && USB && USB_EHCI_HCD)
  warning: (PPC_MPC512x && PPC_PS3 && PPC_CELLEB && 440EPX) selects USB_EHCI_BIG_ENDIAN_MMIO which has unmet direct dependencies (USB_SUPPORT && USB && USB_EHCI_HCD)

make the selected entries additionally depend on USB_EHCI_HCD which
silences the warning

Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/nvram: print no error when pstore backend is not nvram

Pstore only supports one backend at a time. The preferred
pstore backend is set by passing the pstore.backend=<name>
argument to the kernel at boot time. Currently, while trying
to register with pstore, nvram throws an error message even
when "pstore.backend != nvram", which is unnecessary. This
patch removes the error message in case "pstore.backend != nvram".

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Add alternate MMIO error handling

userspace programs using cxl currently have to use two strategies for
dealing with MMIO errors simultaneously. They have to check every read
for a return of all Fs in case the adapter has gone away and the kernel
has not yet noticed, and they have to deal with SIGBUS in case the
kernel has already noticed, invalidated the mapping and marked the
context as failed.

In order to simplify things, this patch adds an alternative approach
where the kernel will return a page filled with Fs instead of delivering
a SIGBUS. This allows userspace to only need to deal with one of these
two error paths, and is intended for use in libraries that use cxl
transparently and may not be able to safely install a signal handler.

This approach will only work if certain constraints are met. Namely, if
the application is both reading and writing to an address in the problem
state area it cannot assume that a non-FF read is OK, as it may just be
reading out a value it has previously written. Further - since only one
page is used per context a write to a given offset would be visible when
reading the same offset from a different page in the mapping (this only
applies within a single context, not between contexts).

An application could deal with this by e.g. making sure it also reads
from a read-only offset after any reads to a read/write offset.

Due to these constraints, this functionality must be explicitly
requested by userspace when starting the context by passing in the
CXL_START_WORK_ERR_FF flag.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/nvram: use kmemdup rather than duplicating its implementation

The patch was generated using fixed coccinelle semantic patch
scripts/coccinelle/api/memdup.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2014320

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/pseries: use kmemdup rather than duplicating its implementation

The patch was generated using fixed coccinelle semantic patch
scripts/coccinelle/api/memdup.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2014320

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: Disable automatically blocked PCI config

pcibios_set_pcie_reset_state() could be called to complete
reset request when passing through PCI device, flag
EEH_PE_ISOLATED is set before saving the PCI config sapce.
On some Broadcom adapters, EEH_PE_CFG_BLOCKED is automatically
set when the flag EEH_PE_ISOLATED is marked. It caused bogus
data saved from the PCI config space, which will be restored
to the PCI adapter after the reset. Eventually, the hardware
can't work with corrupted data in PCI config space.

The patch fixes the issue with eeh_pe_state_mark_no_cfg(), which
doesn't set EEH_PE_CFG_BLOCKED when seeing EEH_PE_ISOLATED on the
PE, in order to avoid the bogus data saved and restored to the PCI
config space.

Reported-by: Rajanikanth H. Adaveeshaiah <rajanikanth.ha@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc: Export include/uapi/asm/eeh.h

This adds include/uapi/asm/eeh.h to kbuild so that the header
file will be exported automatically with below command. The
header file was added by commit ed3e81ff2016 ("powerpc/eeh: Move PE
state constants around")

make INSTALL_HDR_PATH=/tmp/headers \
SRCARCH=powerpc headers_install

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/pseries: enable RTC class support

A working rtc kernel driver is needed so that hwclock can synchronize
system clock to rtc during shutdown/boot. We already have a powernv
platform rtc driver located at drivers/rtc/rtc-opal.c. However it depends
on CONFIG_RTC_CLASS which is disabled by default. Hence the driver isn't
enabled and not compiled for the powernv kernel.

We fix this by enabling rtc class support in pseries defconfig which
enables this driver and compiles it into the pseries kernel. In case
CONFIG_PPC_POWERNV is not enabled we fallback to 'Generic RTC support'
driver which emulates the legacy 'PC RTC driver'.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/numa: initialize distance lookup table from drconf path

In some situations, a NUMA guest that supports
ibm,dynamic-memory-reconfiguration node will end up having flat NUMA
distances between nodes. This is because of two problems in the
current code.

1) Different representations of associativity lists.

   There is an assumption about the associativity list in
   initialize_distance_lookup_table(). Associativity list has two forms:

   a) [cpu,memory]@x/ibm,associativity has following
      format:
           <N> <N integers>

   b) ibm,dynamic-reconfiguration-memory/ibm,associativity-lookup-arrays

           <M> <N> <M associativity lists each having N integers>
           M = the number of associativity lists
           N = the number of entries per associativity list

   Fix initialize_distance_lookup_table() so that it does not assume
   "case a". And update the caller to skip the length field before
   sending the associativity list.

2) Distance table not getting updated from drconf path.

   Node distance table will not get initialized in certain cases as
   ibm,dynamic-reconfiguration-memory path does not initialize the
   lookup table.

   Call initialize_distance_lookup_table() from drconf path with
   appropriate associativity list.

Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: move dma_get_required_mask from pnv_phb to pci_controller_ops

Simplify the dma_get_required_mask call chain by moving it from pnv_phb to
pci_controller_ops, similar to commit 763d2d8df1ee ("powerpc/powernv:
Move dma_set_mask from pnv_phb to pci_controller_ops").

Previous call chain:

  0) call dma_get_required_mask() (kernel/dma.c)
  1) call ppc_md.dma_get_required_mask, if it exists. On powernv, that
     points to pnv_dma_get_required_mask() (platforms/powernv/setup.c)
  2) device is PCI, therefore call pnv_pci_dma_get_required_mask()
     (platforms/powernv/pci.c)
  3) call phb->dma_get_required_mask if it exists
  4) it only exists in the ioda case, where it points to
       pnv_pci_ioda_dma_get_required_mask() (platforms/powernv/pci-ioda.c)

New call chain:

  0) call dma_get_required_mask() (kernel/dma.c)
  1) device is PCI, therefore call pci_controller_ops.dma_get_required_mask
     if it exists
  2) in the ioda case, that points to pnv_pci_ioda_dma_get_required_mask()
     (platforms/powernv/pci-ioda.c)

In the p5ioc2 case, the call chain remains the same -
dma_get_required_mask() does not find either a ppc_md call or
pci_controller_ops call, so it calls __dma_get_required_mask().

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/mm: Drop CONFIG_PPC_HAS_HASH_64K

The relation between CONFIG_PPC_HAS_HASH_64K and CONFIG_PPC_64K_PAGES is
painfully complicated.

But if we rearrange it enough we can see that PPC_HAS_HASH_64K
essentially depends on PPC_STD_MMU_64 && PPC_64K_PAGES.

We can then notice that PPC_HAS_HASH_64K is used in files that are only
built for PPC_STD_MMU_64, meaning it's equivalent to PPC_64K_PAGES.

So replace all uses and drop it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm: Simplify page size kconfig dependencies

For config options with only a single value, guarding the single value
with 'if' is the same as adding a 'depends' statement. And it's more
standard to just use 'depends'.

And if the option has both an 'if' guard and a 'depends' we can collapse
them into a single 'depends' by combining them with &&.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm: Drop the 64K on 4K version of pte_pagesize_index()

Now that support for 64k pages with a 4K kernel is removed, this code is
unreachable.

CONFIG_PPC_HAS_HASH_64K can only be true when CONFIG_PPC_64K_PAGES is
also true.

But when CONFIG_PPC_64K_PAGES is true we include pte-hash64.h which
includes pte-hash64-64k.h, which defines both pte_pagesize_index() and
crucially __real_pte, which means this definition can never be used.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/cell: Drop support for 64K local store on 4K kernels

Back in the olden days we added support for using 64K pages to map the
SPU (Synergistic Processing Unit) local store on Cell, when the main
kernel was using 4K pages.

This was useful at the time because distros were using 4K pages, but
using 64K pages on the SPUs could reduce TLB pressure there.

However these days the number of Cell users is approaching zero, and
supporting this option adds unpleasant complexity to the memory
management code.

So drop the option, CONFIG_SPU_FS_64K_LS, and all related code.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Jeremy Kerr <jk@ozlabs.org>

powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hash

The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K
PAGE_SIZE.

However when built with a 4K PAGE_SIZE there is an additional config
option which can be enabled, PPC_HAS_HASH_64K, which means the kernel
also knows how to hash a 64K page even though the base PAGE_SIZE is 4K.

This is used in one obscure configuration, to support 64K pages for SPU
local store on the Cell processor when the rest of the kernel is using
4K pages.

In this configuration, pte_pagesize_index() is defined to just pass
through its arguments to get_slice_psize(). However pte_pagesize_index()
is called for both user and kernel addresses, whereas get_slice_psize()
only knows how to handle user addresses.

This has been broken forever, however until recently it happened to
work. That was because in get_slice_psize() the large kernel address
would cause the right shift of the slice mask to return zero.

However in commit 7aa0727f3302 ("powerpc/mm: Increase the slice range to
64TB"), the get_slice_psize() code was changed so that instead of a
right shift we do an array lookup based on the address. When passed a
kernel address this means we index way off the end of the slice array
and return random junk.

That is only fatal if we happen to hit something non-zero, but when we
do return a non-zero value we confuse the MMU code and eventually cause
a check stop.

This fix is ugly, but simple. When we're called for a kernel address we
return 4K, which is always correct in this configuration, otherwise we
use the slice mask.

Fixes: 7aa0727f3302 ("powerpc/mm: Increase the slice range to 64TB")
Reported-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/t1023rdb/dts: set ifc nand chip select from 2 to 1

IFC NAND chip select is wrongly mapped to 2 in reg property of
NAND node. Due to this kernel is not able probe NAND flash. Set
chip select to 1 in reg property.

Signed-off-by: Jaiprakash Singh <b44839@freescale.com>
Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/mpc85xx:Add SCFG device tree support of T104x

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/85xx: Add binding for SCFG

SCFG provides SoC specific configuration and status registers for
the chip. Add this for powerpc platform.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/fsl-booke: Add T1040D4RDB/T1042D4RDB board support

T1040D4RDB/T1042D4RDB are Freescale Reference Design Board
which can support T1040/T1042 QorIQ Power
Architecture™ processor respectively

T1040D4RDB/T1042D4RDB board Overview
-------------------------------------
- SERDES Connections, 8 lanes supporting:
        - PCI
        - SGMII
        - SATA 2.0
        - QSGMII(only for T1040D4RDB)
    - DDR Controller
        - Supports rates of up to 1600 MHz data-rate
        - Supports one DDR4 UDIMM
    -IFC/Local Bus
        - NAND flash: 1GB 8-bit NAND flash
        - NOR: 128MB 16-bit NOR Flash
    - Ethernet
        - Two on-board RGMII 10/100/1G ethernet ports.
        - PHY #0 remains powered up during deep-sleep
    - CPLD
    - Clocks
        - System and DDR clock (SYSCLK, “DDRCLK”)
        - SERDES clocks
    - Power Supplies
    - USB
        - Supports two USB 2.0 ports with integrated PHYs
        - Two type A ports with 5V@1.5A per port.
    - SDHC
        - SDHC/SDXC connector
    - SPI
        - On-board 64MB SPI flash
    - I2C
        - Devices connected: EEPROM, thermal monitor, VID controller
    - Other IO
        - Two Serial ports
        - ProfiBus port

    Add support for T1040/T1042D4RDB board:
    -add device tree
    -Add entry in corenet_generic.c

Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/85xx: Remove unused pci fixup hooks on c293pcie

The c293pcie board is an endpoint device and it doesn't need PM,
so remove hooks pcibios_fixup_phb and pcibios_fixup_bus.

Signed-off-by: Hou Zhiqiang <B48286@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/e6500: hw tablewalk: optimize a bit for tcd lock acquiring codes

It makes no sense to put the instructions for calculating the lock
value (cpu number + 1) and the clearing of eq bit of cr1 in lbarx/stbcx
loop. And when the lock is acquired by the other thread, the current
lock value has no chance to equal with the lock value used by current
cpu. So we can skip the comparing for these two lock values in the
lbz/bne loop.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/e6500: remove the stale TCD_LOCK macro

Since we moved the "lock" to be the first element of
struct tlb_core_data in commit 82d86de25b9c ("powerpc/e6500: Make TLB
lock recursive"), this macro is not used by any code. Just delete it.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc: Add a vga alias node for P1022

In u-boot, when set the video as console, the name 'vga' is used
as a general name for the video device, during the fdt_fixup_stdout
process, the 'vga' name is used to search in the dtb to setup the
'linux,stdout-path' node. Though the P1022 DIU is not VGA-compatible
device, to meet the 'vga' name used in u-boot, the vga alias node is
added for P1022 in this patch. At the same time, a display alias is
also added so that no other components grow dependencies on the vga
alias node.

Signed-off-by: Jason Jin <Jason.Jin@freescale.com>
Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

selftests/powerpc: Install tempfile so the subpage_prot_file test works

We forgot to install the tempfile, so when the selftests are installed
and then run the subpage_prot_file test fails.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Plug irq_bitmap getting leaked in cxl_context

This patch plugs the leak of irq_bitmap, allocated as part of
initialization of cxl_context struct; during the call to
afu_allocate_irqs. The bitmap is now release during the call to function
afu_release_irqs.

Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Add CONFIG_CXL_EEH symbol

CONFIG_CXL_EEH is for CXL's EEH related code.

Other drivers can depend on or #ifdef on this symbol to configure
PERST behaviour, allowing CXL to participate in the EEH process.

Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: EEH support

EEH (Enhanced Error Handling) allows a driver to recover from the
temporary failure of an attached PCI card. Enable basic CXL support
for EEH.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Allow the kernel to trust that an image won't change on PERST.

Provide a kernel API and a sysfs entry which allow a user to specify
that when a card is PERSTed, it's image will stay the same, allowing
it to participate in EEH.

cxl_reset is used to reflash the card. In that case, we cannot safely
assert that the image will not change. Therefore, disallow cxl_reset
if the flag is set.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Don't remove AFUs/vPHBs in cxl_reset

If the driver doesn't participate in EEH, the AFUs will be removed
by cxl_remove, which will be invoked by EEH.

If the driver does particpate in EEH, the vPHB needs to stick around
so that the it can particpate.

In both cases, we shouldn't remove the AFU/vPHB.

Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Refactor AFU init/teardown

As with an adapter, some aspects of initialisation are done only once
in the lifetime of an AFU: for example, allocating memory, or setting
up sysfs/debugfs files.

However, we may want to be able to do some parts of the initialisation
multiple times: for example, in error recovery we want to be able to
tear down and then re-map IO memory and IRQs.

Therefore, refactor AFU init/teardown as follows.

- Create two new functions: 'cxl_configure_afu', and its pair
   'cxl_deconfigure_afu'. As with the adapter functions,
   these (de)configure resources that do not need to last the entire
   lifetime of the AFU.

- Allocating and releasing memory remain the task of 'cxl_alloc_afu'
   and 'cxl_release_afu'.

- Once-only functions that do not involve allocating/releasing memory
   stay in the overarching 'cxl_init_afu'/'cxl_remove_afu' pair.
   However, the task of picking an AFU mode and activating it has been
   broken out.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Refactor adaptor init/teardown

Some aspects of initialisation are done only once in the lifetime of
an adapter: for example, allocating memory for the adapter,
allocating the adapter number, or setting up sysfs/debugfs files.

However, we may want to be able to do some parts of the
initialisation multiple times: for example, in error recovery we
want to be able to tear down and then re-map IO memory and IRQs.

Therefore, refactor CXL init/teardown as follows.

- Keep the overarching functions 'cxl_init_adapter' and its pair,
   'cxl_remove_adapter'.

- Move all 'once only' allocation/freeing steps to the existing
   'cxl_alloc_adapter' function, and its pair 'cxl_release_adapter'
   (This involves moving allocation of the adapter number out of
   cxl_init_adapter.)

- Create two new functions: 'cxl_configure_adapter', and its pair
   'cxl_deconfigure_adapter'. These two functions 'wire up' the
   hardware --- they (de)configure resources that do not need to
   last the entire lifetime of the adapter

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Clean up adapter MMIO unmap path.

- MMIO pointer unmapping is guarded by a null pointer check.
   However, iounmap doesn't null the pointer, just invalidate it.
   Therefore, explicitly null the pointer after unmapping.

- afu_desc_mmio also needs to be unmapped.

- PCI regions are allocated in cxl_map_adapter_regs.
   Therefore they should be released in unmap, not elsewhere.

Acked-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Make IRQ release idempotent

Check if an IRQ is mapped before releasing it.

This will simplify future EEH code by allowing unconditional unmapping
of IRQs.

Acked-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Allocate and release the SPA with the AFU

Previously the SPA was allocated and freed upon entering and leaving
AFU-directed mode. This causes some issues for error recovery - contexts
hold a pointer inside the SPA, and they may persist after the AFU has
been detached.

We would ideally like to allocate the SPA when the AFU is allocated, and
release it until the AFU is released. However, we don't know how big the
SPA needs to be until we read the AFU descriptor.

Therefore, restructure the code:

- Allocate the SPA only once, on the first attach.

- Release the SPA only when the entire AFU is being released (not
detached). Guard the release with a NULL check, so we don't free
if it was never allocated (e.g. dedicated mode)

Acked-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Drop commands if the PCI channel is not in normal state

If the PCI channel has gone down, don't attempt to poke the hardware.

We need to guard every time cxl_whatever_(read|write) is called. This
is because a call to those functions will dereference an offset into an
mmio register, and the mmio mappings get invalidated in the EEH
teardown.

Check in the read/write functions in the header.
We give them the same semantics as usual PCI operations:
- a write to a channel that is down is ignored.
- a read from a channel that is down returns all fs.

Also, we try to access the MMIO space of a vPHB device as part of the
PCI disable path. Because that's a read that bypasses most of our usual
checks, we handle it explicitly.

As far as user visible warnings go:
- Check link state in file ops, return -EIO if down.
- Be reasonably quiet if there's an error in a teardown path,
   or when we already know the hardware is going down.
- Throw a big WARN if someone tries to start a CXL operation
   while the card is down. This gives a useful stacktrace for
   debugging whatever is doing that.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Convert MMIO read/write macros to inline functions

We're about to make these more complex, so make them functions
first.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: Probe after unbalanced kref check

In the complete hotplug case, EEH PEs are supposed to be released
and set to NULL. Normally, this is done by eeh_remove_device(),
which is called from pcibios_release_device().

However, if something is holding a kref to the device, it will not
be released, and the PE will remain. eeh_add_device_late() has
a check for this which will explictly destroy the PE in this case.

This check in eeh_add_device_late() occurs after a call to
eeh_ops->probe(). On PowerNV, probe is a pointer to pnv_eeh_probe(),
which will exit without probing if there is an existing PE.

This means that on PowerNV, devices with outstanding krefs will not
be rediscovered by EEH correctly after a complete hotplug. This is
affecting CXL (CAPI) devices in the field.

Put the probe after the kref check so that the PE is destroyed
and affected devices are correctly rediscovered by EEH.

Fixes: d91dafc02f42 ("powerpc/eeh: Delay probing EEH device during hotplug")
Cc: stable@vger.kernel.org
Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc: Add an inline function to update POWER8 HID0

Section 3.7 of Version 1.2 of the Power8 Processor User's Manual
prescribes that updates to HID0 be preceded by a SYNC instruction and
followed by an ISYNC instruction (Page 91).

Create an inline function name update_power8_hid0() which follows this
recipe and invoke it from the static split core path.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Tested-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/prom: Use DRCONF flags while processing detected LMBs

Replace hard coded values with existing DRCONF flags while procesing
detected LMBs from the device tree. Does not change any functionality.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/xmon: Drop the valid variable completely in dump_segments()

The value of 'valid' is always zero when 'esid' is zero, and if 'esid'
is non-zero then the value of 'valid' is irrelevant because we are using
logical or in the if expression.

In fact 'valid' can be dropped completely from dump_segments() by
simply doing the check with SLB_ESID_V directly in the if.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
[mpe: Rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/prom: Simplify the logic to fetch SLB size

The code to fetch the SLB size from the device tree wants to first look
for "slb-size" and then if that's not found "ibm,slb-size".

We can simplify the code by looking for the properties and then if we
find one of them we set mmu_slb_size.

We also change the function name from check_cpu_slb_size() to
init_mmu_slb_size() as the function doesn't check anything, it only
initialises mmu_slb_size.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
[mpe: Rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/slb: Add documentation on runtime patching of SLB encoding

This patch adds some documentation to patch_slb_encoding() explaining
how it works.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
[mpe: Update change log and mention the signedness of the immediate]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/slb: Rename all the 'slot' occurrences to 'entry'

The SLB code uses 'slot' and 'entry' interchangeably, change it to always
use 'entry'.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
[mpe: Rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/slb: Remove a duplicate extern variable

This patch just removes one redundant entry for one extern variable
'slb_compare_rr_to_size' from the scope. This patch does not change
any functionality.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: sparse: Silence iomem warning in debugfs file creation

An IO address, tagged with __iomem, is passed to debugfs_create_file
as private data. This requires that it be cast to void *. The cast
drops the __iomem annotation and so creates a sparse warning:

drivers/misc/cxl/debugfs.c:51:57: warning: cast removes address space of expression

The address space marker is added back in the file operations
(fops_io_u64).

Silence the warning with __force.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: sparse: Make declarations static

A few declarations were identified by sparse as needing to be static:

  drivers/misc/cxl/irq.c:408:6: warning: symbol 'afu_irq_name_free' was not declared. Should it be static?
  drivers/misc/cxl/irq.c:467:6: warning: symbol 'afu_register_hwirqs' was not declared. Should it be static?
  drivers/misc/cxl/file.c:254:6: warning: symbol 'afu_compat_ioctl' was not declared. Should it be static?
  drivers/misc/cxl/file.c:399:30: warning: symbol 'afu_master_fops' was not declared. Should it be static?

Make them static.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Compile with -Werror

It's a good idea, and it brings us in line with the rest of arch/powerpc.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/fsl: Force coherent memory on e500mc derivatives

In CoreNet systems it is not allowed to mix M and non-M mappings to the
same memory, and coherent DMA accesses are considered to be M mappings
for this purpose. Ignoring this has been observed to cause hard
lockups in non-SMP kernels on e6500.

Furthermore, e6500 implements the LRAT (logical to real address table)
which allows KVM guests to control the WIMGE bits. This means that
KVM cannot force the M bit on the way it usually does, so the guest had
better set it itself.

Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/booke64: Move mb() to __set_pte_at() with kernel-addr test

map_kernel() doesn't catch all places that create kernel PTEs. In
particular, vmalloc() calls set_pte_at() directly. This causes a
crash when booting a non-SMP kernel on e6500.

Move the sync to __set_pte(), to be executed only for kernel addresses.

Signed-off-by: Scott Wood <scottwood@freescale.com>

fsl_ifc: Change IO accessor based on endianness

IFC IO accressor are set at run time based
on IFC IP registers endianness.IFC node in
DTS file contains information about
endianness.

Signed-off-by: Jaiprakash Singh <b44839@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Acked-by: Brian Norris <computersforpeace@gmail.com>

powerpc/config: enable aquantia PHY

Aquantia PHYs used on platforms such as T2080RDB, T1024RDB.

Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/85xx: enable teranetics PHY

The PHY uses XAUI interface to connect to MAC, mostly the PHY used on
riser card.

Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/t1023rdb: add ina220 current sensor node

Add support for INA220 current sensor.

Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/t1024rdb: add ina220 current sensor node

Add support for INA220 current sensor.

Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/32: Few optimisations in memcpy

This patch adds a few optimisations in memcpy functions by using
lbzu/stbu instead of lxb/stb and by re-ordering insn inside a loop
to reduce latency due to loading

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/32: cacheable_memcpy becomes memcpy

cacheable_memcpy uses dcbz instruction and is more efficient than
memcpy when the destination is in RAM. If the destination is in an
io area, memcpy_toio() is normally used, not memcpy

This patch renames memcpy as generic_memcpy, and renames
cacheable_memcpy as memcpy

On MPC885, we get approximatly 7% increase of the transfer rate
on an FTP reception

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/32: Merge the new memset() with the old one

cacheable_memzero() which has become the new memset() and the old
memset() are quite similar, so just merge them.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/32: memset(0): use cacheable_memzero

cacheable_memzero uses dcbz instruction and is more efficient than
memset(0) when the destination is in RAM

This patch renames memset as generic_memset, and defines memset
as a prolog to cacheable_memzero. This prolog checks if the byte
to set is 0. If not, it falls back to generic_memcpy()

cacheable_memzero disappears as it is not referenced anywhere anymore

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>

Partially revert "powerpc: Remove duplicate cacheable_memcpy/memzero functions"

This partially reverts
commit 'powerpc: Remove duplicate cacheable_memcpy/memzero functions
("b05ae4ee602b7dc90771408ccf0972e1b3801a35")'

Functions cacheable_memcpy/memzero are more efficient than
memcpy/memset as they use the dcbz instruction which avoids refill
of the cacheline with the data that we will overwrite.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc: use memset_io() to clear CPM Muram

CPM muram is not cached, so use memset_io() instead of memset()

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/mm: Don't call __flush_dcache_icache_phys() with PA>VA

__flush_dcache_icache_phys() requires the ability to access the
memory with the MMU disabled, which means that on a 32-bit system
any memory above 4 GiB is inaccessible. In particular, mpc86xx is
32-bit and can have more than 4 GiB of RAM.

Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc: add support for csum_add()

The C version of csum_add() as defined in include/net/checksum.h gives
the following assembly in ppc32:
       0:       7c 04 1a 14     add     r0,r4,r3
       4:       7c 64 00 10     subfc   r3,r4,r0
       8:       7c 63 19 10     subfe   r3,r3,r3
       c:       7c 63 00 50     subf    r3,r3,r0
and the following in ppc64:
   0xc000000000001af8 <+0>: add     r3,r3,r4
   0xc000000000001afc <+4>: cmplw   cr7,r3,r4
   0xc000000000001b00 <+8>: mfcr    r4
   0xc000000000001b04 <+12>: rlwinm  r4,r4,29,31,31
   0xc000000000001b08 <+16>: add     r3,r4,r3
   0xc000000000001b0c <+20>: clrldi  r3,r3,32
   0xc000000000001b10 <+24>: blr

include/net/checksum.h also offers the possibility to define an arch
specific function.  This patch provides a specific csum_add() inline
function.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc: put csum_tcpudp_magic inline

csum_tcpudp_magic() is only a few instructions, and does modify
really few registers. So it is not worth having it as a separate
function and suffer function branching and saving of volatile
registers.

This patch makes it inline by use of the already existing
csum_tcpudp_nofold() function.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/85xx: Use kconfig fragments

Unify mpc85xx and corenet configs using fragments, to ease maintenance
and avoid the sort of drift that the previous patch fixed.

Hardware and software options are separated, with the hope that other
embedded platforms could share the software options, and to make it
easier to maintain custom/alternate configs that focus on either
hardware or software options.

Due to the previous patch, this patch should not affect the results of
any of the affected defconfigs -- only how those results are achieved.
The resulting config is more or less the union of the options that any
of the configs previously selected. No attempt was made in this (or
the previous) patch to edit out questionable options, but this patch
will make it easier to do so in future patches.

Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/85xx: Make defconfigs consistent

The mpc85xx and corenet configs have many differences between them that
can't be explained by the target hardware of each config. The next
patch will consolidate these targets using kconfig fragments; this
patch shows what the resulting defconfigs will look like (generated by
using savedefconfig on a fragment-generated config).

Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc: Update corenet32_smp_defconfig for modern distros

corenet32_smp_defconfig is missing some things that modern distros
require, enable them.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/corenet32: enable DMA in defconfig

By default we enable DMA(CONFIG_FSL_DMA) support
which are needed on P2041RDB, P3041DS, P4080DS,
B4860QDS, etc.

Signed-off-by: Yuan Yao <yao.yuan@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/corenet: enable eSDHC

Signed-off-by: Yangbo Lu <yangbo.lu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

powerpc/ftrace: add powerpc timebase as a trace clock source

Add a new powerpc-specific trace clock using the timebase register,
similar to x86-tsc. This gives us
- a fast, monotonic, hardware clock source for trace entries, and
- a clock that can be used to correlate events across cpus as well as across
hypervisor and guests.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/4xx: Fix return value check in hsta_msi_probe()

In case of error, the functions platform_get_resource() and kmalloc()
returns NULL not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

windfarm: remove three exported but unused functions

wf_find_control(), wf_find_sensor(), and wf_is_overtemp() are exported
but unused. Remove these three functions.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

windfarm: make wf_critical_overtemp() static

wf_critical_overtemp() is exported. But nothing uses that export.
That's unsurprising because there's no header that defines it. Stop
exporting that function and make it static.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

windfarm: decrement client count when unregistering

wf_unregister_client() increments the client count when a client
unregisters. That is obviously incorrect. Decrement that client count
instead.

Fixes: 75722d3992f5 ("[PATCH] ppc64: Thermal control for SMU based machines")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc: Remove redundant breaks

break; break; isn't useful.

Remove one.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc: pci: use %pR for printing struct resource

Use %pR to simplify the debug code. This also make the debug info more
readable.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
[mpe: Unsplit multi-line printk strings]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Don't ignore add_process_element() result when attaching context

Currently when attaching a context in dedicated mode, we ignore the
result of add_process_element(), which could potentially fail.

If add_process_element() returns an error, pass it back to the caller.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Invoke opal_cec_reboot2() on unrecoverable HMI.

Invoke new opal_cec_reboot2() call with reboot type
OPAL_REBOOT_PLATFORM_ERROR (for unrecoverable HMI interrupts) to inform
BMC/OCC about this error, so that BMC can collect relevant data for error
analysis and decide what component to de-configure before rebooting.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Invoke opal_cec_reboot2() on unrecoverable machine check errors.

On non-recoverable MCE errors in kernel space, Linux kernel panics
and system reboots. On BMC based system opal-prd runs as a daemon
in the host. Hence, kernel crash may prevent opal-prd to detect and
analyze this MCE error. This may land us in a situation where the faulty
memory never gets de-configured and Linux would keep hitting same MCE error
again and again. If this happens in early stage of kernel initialization,
then Linux will keep crashing and rebooting in a loop.

This patch fixes this issue by invoking new opal_cec_reboot2() call with
reboot type OPAL_REBOOT_PLATFORM_ERROR to inform BMC/OCC about this
error, so that BMC can collect relevant data for error analysis and
decide what component to de-configure before rebooting.

This patch is dependent on OPAL patchset posted on skiboot mailing list
at https://lists.ozlabs.org/pipermail/skiboot/2015-July/001771.html that
introduces opal_cec_reboot2() opal call.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>