Bjorn Helgaas [Mon, 12 Dec 2016 17:25:13 +0000 (11:25 -0600)]
Merge branch 'pci/host-vmd' into next
* pci/host-vmd:
PCI: vmd: Fix suspend handlers defined-but-not-used warning
PCI: vmd: Use SRCU as a local RCU to prevent delaying global RCU
PCI: vmd: Remove unnecessary pci_set_drvdata()
Bjorn Helgaas [Mon, 12 Dec 2016 17:25:11 +0000 (11:25 -0600)]
Merge branch 'pci/host-rockchip' into next
* pci/host-rockchip:
PCI: rockchip: Move the deassert of pm/aclk/pclk after phy_init()
PCI: rockchip: Split out rockchip_cfg_atu()
PCI: rockchip: Clean up bit definitions for PCIE_RC_CONFIG_LCS
PCI: rockchip: Correct the use of FTS mask
PCI: rockchip: Remove the pointer to L1 substate cap
PCI: rockchip: Specify the link capability
PCI: rockchip: Fix negotiated lanes calculation
PCI: rockchip: Add Kconfig COMPILE_TEST
PCI: rockchip: Mark RC as common clock architecture
PCI: rockchip: Provide captured slot power limit and scale
PCI: rockchip: Add three new resets as required properties
PCI: Don't attempt to claim shadow copies of ROM
PCI: designware: Check for iATU unroll support after initializing host
PCI: qcom: Fix pp->dev usage before assignment
PCI: designware-plat: Update author email address
PCI: layerscape: Fix drvdata usage before assignment
PCI: designware-plat: Change maintainer to Jose Abreu
Bjorn Helgaas [Mon, 12 Dec 2016 17:25:10 +0000 (11:25 -0600)]
Merge branch 'pci/host-rcar' into next
* pci/host-rcar:
PCI: rcar: Add gen3 fallback compatibility string for pcie-rcar
PCI: rcar: Use gen2 fallback compatibility last
PCI: rcar-gen2: Use gen2 fallback compatibility last
Bjorn Helgaas [Mon, 12 Dec 2016 17:25:07 +0000 (11:25 -0600)]
Merge branch 'pci/host-hv' into next
* pci/host-hv:
PCI: hv: Allocate physically contiguous hypercall params buffer
PCI: hv: Delete the device earlier from hbus->children for hot-remove
PCI: hv: Fix hv_pci_remove() for hot-remove
PCI: hv: Use the correct buffer size in new_pcichild_device()
PCI: hv: Make unnecessarily global IRQ masking functions static
Bjorn Helgaas [Mon, 12 Dec 2016 17:25:06 +0000 (11:25 -0600)]
Merge branch 'pci/host-altera' into next
* pci/host-altera:
PCI: altera: Remove redundant error message in altera_pcie_parse_dt()
PCI: altera: Use builtin_platform_driver() to simplify the code
Bjorn Helgaas [Mon, 12 Dec 2016 17:25:05 +0000 (11:25 -0600)]
Merge branch 'pci/virtualization' into next
* pci/virtualization:
PCI: Add comments about ROM BAR updating
PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
PCI: Don't update VF BARs while VF memory space is enabled
PCI: Separate VF BAR updates from standard BAR updates
PCI: Update BARs using property bits appropriate for type
PCI: Ignore BAR updates on virtual functions
PCI: Do any VF BAR updates before enabling the BARs
PCI: Support INTx masking on ConnectX-4 with firmware x.14.1100+
PCI: Convert Mellanox broken INTx quirks to be for listed devices only
PCI: Convert broken INTx masking quirks from HEADER to FINAL
net/mlx4_core: Use device ID defines
PCI: Add Mellanox device IDs
Bjorn Helgaas [Mon, 12 Dec 2016 17:25:04 +0000 (11:25 -0600)]
Merge branch 'pci/pm' into next
* pci/pm:
x86/platform/intel-mid: Constify mid_pci_platform_pm
PCI: pciehp: Add runtime PM support for PCIe hotplug ports
ACPI / hotplug / PCI: Make device_is_managed_by_native_pciehp() public
ACPI / hotplug / PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
PCI: Unfold conditions to block runtime PM on PCIe ports
PCI: Consolidate conditions to allow runtime PM on PCIe ports
PCI: Activate runtime PM on a PCIe port only if it can suspend
PCI: Speed up algorithm in pci_bridge_d3_update()
PCI: Autosense device removal in pci_bridge_d3_update()
PCI: Don't acquire ref on parent in pci_bridge_d3_update()
USB: UHCI: report non-PME wakeup signalling for Intel hardware
PCI: Check for PME in targeted sleep state
PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)
There is at least one Chelsio 10Gb card which uses VPD area to store some
non-standard blocks (example below). However pci_vpd_size() returns the
length of the first block only assuming that there can be only one VPD "End
Tag".
Since 4e1a635552d3 ("vfio/pci: Use kernel VPD access functions"), VFIO
blocks access beyond that offset, which prevents the guest "cxgb3" driver
from probing the device. The host system does not have this problem as its
driver accesses the config space directly without pci_read_vpd().
Add a quirk to override the VPD size to a bigger value. The maximum size
is taken from EEPROMSIZE in drivers/net/ethernet/chelsio/cxgb3/common.h.
We do not read the tag as the cxgb3 driver does as the driver supports
writing to EEPROM/VPD and when it writes, it only checks for 8192 bytes
boundary. The quirk is registered for all devices supported by the cxgb3
driver.
This adds a quirk to the PCI layer (not to the cxgb3 driver) as the cxgb3
driver itself accesses VPD directly and the problem only exists with the
vfio-pci driver (when cxgb3 is not running on the host and may not be even
loaded) which blocks accesses beyond the first block of VPD data. However
vfio-pci itself does not have quirks mechanism so we add it to PCI.
This is the controller:
Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]
This is what I parsed from its VPD:
===
b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 FN\x0746K'
0000 Large item 42 bytes; name 0x2 Identifier String
b'10 Gigabit Ethernet-SR PCI Express Adapter'
002d Large item 74 bytes; name 0x10
#00 [EC] len=7: b'D76809 '
#0a [FN] len=7: b'46K7897'
#14 [PN] len=7: b'46K7897'
#1e [MN] len=4: b'1037'
#25 [FC] len=4: b'5769'
#2c [SN] len=12: b'YL102035603V'
#3b [NA] len=12: b'00145E992ED1'
007a Small item 1 bytes; name 0xf End Tag
Bjorn Helgaas [Tue, 15 Nov 2016 13:57:30 +0000 (07:57 -0600)]
PCI: pciehp: Remove loading message
Remove the "PCI Express Hot Plug Controller Driver" version message. I
don't think it contains any useful information. Remove unused #defines
and move the author information to a comment.
Bjorn Helgaas [Tue, 15 Nov 2016 13:55:51 +0000 (07:55 -0600)]
PCI: hotplug: Remove hotplug core message
Remove the "PCI Hot Plug PCI Core" version message. I don't think it
contains any useful information. Remove unused #defines and move the
author information to a comment.
Bjorn Helgaas [Tue, 15 Nov 2016 13:54:19 +0000 (07:54 -0600)]
PCI: Remove service driver load/unload messages
Remove the "service driver %s loaded" and unloaded messages. All service
drivers already log something in their probe functions, where they can log
more useful details.
Bjorn Helgaas [Mon, 21 Nov 2016 21:07:53 +0000 (15:07 -0600)]
PCI/PME: Log PME IRQ when claiming Root Port
We already log a "Signaling PME" whenever the PME service driver claims a
Root Port. In fact, we also log the same message for every device in the
hierarchy below the Root Port.
Log the "Signaling PME" once (only for the Root Port, since we can
trivially find out which devices are below the Root Port), and include the
IRQ number in the message to help connect the dots with /proc/interrupts.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Wang Sheng-Hui [Thu, 22 Sep 2016 01:05:46 +0000 (09:05 +0800)]
PCI: Move config space size macros to pci_regs.h
Move PCI configuration space size macros (PCI_CFG_SPACE_SIZE and
PCI_CFG_SPACE_EXP_SIZE) from drivers/pci/pci.h to
include/uapi/linux/pci_regs.h so they can be used by more drivers and
eliminate duplicate definitions.
[bhelgaas: Expand comment to include PCI-X details] Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
David Daney [Thu, 17 Nov 2016 22:25:01 +0000 (14:25 -0800)]
PCI/ASPM: Don't retrain link if ASPM not possible
Some (defective) PCIe devices are not able to reliably do link retraining.
Check to see if ASPM is possible between link partners before configuring
common clocking, and doing the resulting link retraining. If ASPM is not
possible, there is no reason to risk losing access to a device due to an
unnecessary link retraining.
Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Andy Gospodarek [Thu, 1 Dec 2016 20:34:52 +0000 (15:34 -0500)]
PCI: iproc: Skip check for legacy IRQ on PAXC buses
PAXC and PAXCv2 buses do not support legacy IRQs so there is no reason to
even try and map them. Without a change like this, one cannot create VFs
on Nitro ports since legacy interrupts are checked as part of the PCI
device creation process. Testing on PAXC hardware showed that VFs are
properly created with only the change to not set pcie->map_irq, but just to
be safe the change in iproc_pcie_setup() will ensure that pdev_fixup_irq()
will not panic.
Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Ray Jui <ray.jui@broadcom.com>
Ashok Raj [Sat, 19 Nov 2016 08:32:46 +0000 (00:32 -0800)]
PCI: pciehp: Leave power indicator on when enabling already-enabled slot
If an error occurs when enabling a slot, pciehp_power_thread() turns off
the power indicator. But if the only error is that the slot was already
enabled, we should leave the power indicator on.
Return success if called to enable an already-enabled slot.
This is in the same spirit of the special handling for EEXISTS when
pciehp_configure_device() determines the slot devices already exist.
Ashok Raj [Sat, 19 Nov 2016 08:32:45 +0000 (00:32 -0800)]
PCI: pciehp: Prioritize data-link event over presence detect
If Slot Status indicates changes in both Data Link Layer Status and
Presence Detect, prioritize the Link status change.
When both events are observed, pciehp currently relies on the Slot Status
Presence Detect State (PDS) to agree with the Link Status Data Link Layer
Active status. The Presence Detect State, however, may be set to 1 through
out-of-band presence detect even if the link is down, which creates
conflicting events.
Since the Link Status accurately reflects the reachability of the
downstream bus, the Link Status event should take precedence over a
Presence Detect event. Skip checking the PDC status if we handled a link
event in the same handler.
Simon Horman [Tue, 6 Dec 2016 15:51:31 +0000 (16:51 +0100)]
PCI: rcar: Add gen3 fallback compatibility string for pcie-rcar
Add fallback compatibility string for the R-Car Gen 3 family. This is in
keeping with the both the existing fallback compatibility string for the
R-Car Gen 2 family and the fallback scheme being adopted wherever
appropriate for drivers for Renesas SoCs.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Shawn Lin [Thu, 24 Nov 2016 01:54:21 +0000 (09:54 +0800)]
PCI: rockchip: Move the deassert of pm/aclk/pclk after phy_init()
Move deassert of pm/aclk/pclk after phy_init() as we want to optimize the
logic of reset control and reuse rockchip_pcie_init_port() later which
should fully follow the cold boot procedure of ROM code.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org>
Shawn Lin [Wed, 7 Dec 2016 21:06:00 +0000 (15:06 -0600)]
PCI: rockchip: Clean up bit definitions for PCIE_RC_CONFIG_LCS
PCIE_RC_CONFIG_LCS contains control and status bits specific to the PCIe
link. The layout for this register looks the same as the existing
PCI_EXP_LNKCTL and PCI_EXP_LNKSTA. So let's reuse them.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Brian Norris [Wed, 7 Dec 2016 21:06:00 +0000 (15:06 -0600)]
PCI: rockchip: Correct the use of FTS mask
We're trying to mask out bits[23:8] while retaining [32:24, 7:0], but we're
doing the inverse. That doesn't have too much effect, since we're setting
all the [23:8] bits to 1, and the other bits are only relevant for modes
we're currently not using. But we should get this right.
Shawn Lin [Wed, 7 Dec 2016 21:05:59 +0000 (15:05 -0600)]
PCI: rockchip: Specify the link capability
rk3399 supports PCIe 2.x link speeds marginally at best, and on some
boards, the link won't train at 5 GT/s at all. Rather than sacrifice 500ms
waiting for training that will never happen, let's use the helper function,
of_pci_get_max_link_speed(), to get the max link speed from DT and specify
link capability.
Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Shawn Lin [Wed, 7 Dec 2016 21:05:59 +0000 (15:05 -0600)]
PCI: rockchip: Fix negotiated lanes calculation
The calculation of negotiated lanes is wrong: it should be shifted by
PCIE_CORE_PL_CONF_LANE_SHIFT, but it is shifted by
PCIE_CORE_PL_CONF_LANE_MASK instead. Let's fix it.
Shawn Lin [Wed, 7 Dec 2016 21:05:58 +0000 (15:05 -0600)]
PCI: rockchip: Mark RC as common clock architecture
The default value of common clock configuration is zero indicating
Rockchip's RC is using asynchronous clock architecture but actually we are
using common clock. This will confuse some EP drivers if they need some
different settings referring to this value.
Set the Common Clock Configuration bit in the Link Control Register.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Bjorn Helgaas [Tue, 6 Dec 2016 20:27:59 +0000 (14:27 -0600)]
PCI: Explain ARM64 ACPI/MCFG quirk Kconfig and build strategy
Add Makefile comments to explain the Kconfig and build strategy for ARM64
drivers that work around not-quite-ECAM issues. No functional change
intended.
drivers/pci/host/vmd.c:731:12: warning: ‘vmd_suspend’ defined but not used [-Wunused-function]
static int vmd_suspend(struct device *dev)
^
drivers/pci/host/vmd.c:739:12: warning: ‘vmd_resume’ defined but not used [-Wunused-function]
static int vmd_resume(struct device *dev)
^
Jon Derrick [Fri, 11 Nov 2016 23:08:45 +0000 (16:08 -0700)]
PCI: vmd: Use SRCU as a local RCU to prevent delaying global RCU
SRCU lets synchronize_srcu() depend on VMD-local RCU primitives, preventing
long delays from locking up RCU in other systems. VMD performs a
synchronize when removing a device, but will hit all IRQ lists if the
device uses all VMD vectors. This patch will not help VMD's RCU
synchronization, but will isolate the read side delays to the VMD
subsystem. Additionally, the use of SRCU in VMD's ISR will keep it
isolated from any other RCU waiters in the rest of the system.
Thierry Reding [Fri, 25 Nov 2016 10:57:15 +0000 (11:57 +0100)]
PCI: tegra: Add Tegra210 support
The PCIe host controller found on Tegra X1 is very similar to its
predecessor on Tegra K1. A bug was introduced in the new revision that
is worked around by always enabling the performance counter, otherwise
accesses to configuration space will block for a number of seconds.
Thierry Reding [Fri, 25 Nov 2016 10:57:14 +0000 (11:57 +0100)]
PCI: tegra: Implement PCA enable workaround
Tegra210's PCIe controller has a bug that requires the PCA (performance
counter) feature to be enabled. If this isn't done, accesses to device
configuration space will hang the chip for tens of seconds. Implement the
workaround.
Based on commit 514e19138af2 ("pci: tegra: implement PCA enable
workaround") from U-Boot by Stephen Warren <swarren@nvidia.com>.
Thierry Reding [Fri, 25 Nov 2016 10:57:13 +0000 (11:57 +0100)]
dt-bindings: pci: tegra: Add Tegra210 support
Add support for the PCI host controller found on Tegra210 SoCs. It is very
similar to the variant found on Tegra124, with a couple of small
differences regarding the power supplies.
Arnd Bergmann [Fri, 25 Nov 2016 10:57:12 +0000 (11:57 +0100)]
PCI: tegra: Use new pci_register_host_bridge() interface
Tegra is one of the remaining platforms that still use the traditional
pci_common_init_dev() interface for probing PCI host bridges.
This demonstrates how to convert it to the pci_register_host interface I
just added in a previous patch. This leads to a more linear probe sequence
that can handle errors better because we avoid callbacks into the driver,
and it makes the driver architecture independent.
Arnd Bergmann [Fri, 25 Nov 2016 10:57:09 +0000 (11:57 +0100)]
PCI: Add pci_register_host_bridge() interface
Make the existing pci_host_bridge structure a proper device that is usable
by PCI host drivers in a more standard way. In addition to the existing
pci_scan_bus(), pci_scan_root_bus(), pci_scan_root_bus_msi(), and
pci_create_root_bus() interfaces, this unfortunately means having to add
yet another interface doing basically the same thing, and add some extra
code in the initial step.
However, this time it's more likely to be extensible enough that we won't
have to do another one again in the future, and we should be able to reduce
code much more as a result.
The main idea is to pull the allocation of 'struct pci_host_bridge' out of
the registration, and let individual host drivers and architecture code
fill the members before calling the registration function.
There are a number of things we can do based on this:
* Use a single memory allocation for the driver-specific structure
and the generic PCI host bridge
* consolidate the contents of driver-specific structures by moving
them into pci_host_bridge
* Add a consistent interface for removing a PCI host bridge again
when unloading a host driver module
* Replace the architecture specific __weak pcibios_*() functions with
callbacks in a pci_host_bridge device
* Move common boilerplate code from host drivers into the generic
function, based on contents of the structure
* Extend pci_host_bridge with additional members when needed without
having to add arguments to pci_scan_*().
* Move members of struct pci_bus into pci_host_bridge to avoid
having lots of identical copies.
Duc Dang [Fri, 2 Dec 2016 02:27:07 +0000 (18:27 -0800)]
PCI: Add MCFG quirks for X-Gene host controller
PCIe controllers in X-Gene SoCs are not ECAM compliant: software needs to
configure additional controller's register to address device at
bus:dev:function.
Add a quirk to discover controller MMIO register space and configure
controller registers to select and address the target secondary device.
The quirk will only be applied for X-Gene PCIe MCFG table with
OEM revison 1, 2, 3 or 4 (PCIe controller v1 and v2 on X-Gene SoCs).
Tomasz Nowicki [Thu, 1 Dec 2016 05:16:34 +0000 (23:16 -0600)]
PCI: Add MCFG quirks for Cavium ThunderX pass1.x host controller
ThunderX pass1.x requires to emulate the EA headers for on-chip devices
hence it has to use custom pci_thunder_ecam_ops for accessing PCI config
space (pci-thunder-ecam.c). Add new entries to MCFG quirk array where it
can be applied while probing ACPI based PCI host controller.
ThunderX pass1.x is using the same way for accessing off-chip devices
(so-called PEM) as silicon pass-2.x so we need to add PEM quirk entries
too.
Quirk is considered for ThunderX silicon pass1.x only which is identified
via MCFG revision 2.
ThunderX pass 1.x requires the following accessors:
NUMA node 0 PCI segments 0- 3: pci_thunder_ecam_ops (MCFG quirk)
NUMA node 0 PCI segments 4- 9: thunder_pem_ecam_ops (MCFG quirk)
NUMA node 1 PCI segments 10-13: pci_thunder_ecam_ops (MCFG quirk)
NUMA node 1 PCI segments 14-19: thunder_pem_ecam_ops (MCFG quirk)
[bhelgaas: change Makefile/ifdefs so quirk doesn't depend on
CONFIG_PCI_HOST_THUNDER_ECAM] Signed-off-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tomasz Nowicki [Thu, 1 Dec 2016 06:07:56 +0000 (00:07 -0600)]
PCI: Add MCFG quirks for Cavium ThunderX pass2.x host controller
ThunderX PCIe controller to off-chip devices (so-called PEM) is not fully
compliant with ECAM standard. It uses non-standard configuration space
accessors (see thunder_pem_ecam_ops) and custom configuration space
granulation (see bus_shift = 24). In order to access configuration space
and probe PEM as ACPI-based PCI host controller we need to add MCFG quirk
infrastructure. This involves:
1. A new thunder_pem_acpi_init() init function to locate PEM-specific
register ranges using ACPI.
2. Export PEM thunder_pem_ecam_ops structure so it is visible to MCFG quirk
code.
3. New quirk entries for each PEM segment. Each contains platform IDs,
mentioned thunder_pem_ecam_ops and CFG resources.
Quirk is considered for ThunderX silicon pass2.x only which is identified
via MCFG revision 1.
ThunderX pass 2.x requires the following accessors:
NUMA Node 0 PCI segments 0- 3: pci_generic_ecam_ops (ECAM-compliant)
NUMA Node 0 PCI segments 4- 9: thunder_pem_ecam_ops (MCFG quirk)
NUMA Node 1 PCI segments 10-13: pci_generic_ecam_ops (ECAM-compliant)
NUMA Node 1 PCI segments 14-19: thunder_pem_ecam_ops (MCFG quirk)
[bhelgaas: adapt to use acpi_get_rc_resources(), update Makefile/ifdefs so
quirk doesn't depend on CONFIG_PCI_HOST_THUNDER_PEM] Signed-off-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Dongdong Liu [Thu, 1 Dec 2016 06:45:35 +0000 (00:45 -0600)]
PCI: Add MCFG quirks for HiSilicon Hip05/06/07 host controllers
The PCIe controller in Hip05/Hip06/Hip07 SoCs is not completely
ECAM-compliant. It is non-ECAM only for the RC bus config space; for any
other bus underneath the root bus it does support ECAM access.
Add specific quirks for PCI config space accessors. This involves:
1. New initialization call hisi_pcie_init() to obtain RC base
addresses from PNP0C02 at the root of the ACPI namespace (under \_SB).
2. New entry in common quirk array.
[bhelgaas: move to pcie-hisi.c and change Makefile/ifdefs so quirk doesn't
depend on CONFIG_PCI_HISI] Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Dongdong Liu [Thu, 1 Dec 2016 06:33:42 +0000 (00:33 -0600)]
PCI/ACPI: Provide acpi_get_rc_resources() for ARM64 platform
The acpi_get_rc_resources() is used to get the RC register address that can
not be described in MCFG. It takes the _HID & segment to look for and
outputs the RC address resource. Use PNP0C02 devices to describe such RC
address resource. Use _UID to match segment to tell which root bus the
PNP0C02 resource belongs to.
[bhelgaas: add dev argument, wrap in #ifdef CONFIG_PCI_QUIRKS] Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> Signed-off-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tomasz Nowicki [Fri, 9 Sep 2016 19:24:04 +0000 (21:24 +0200)]
PCI/ACPI: Check for platform-specific MCFG quirks
The PCIe spec (r3.0, sec 7.2.2) specifies an "Enhanced Configuration Access
Mechanism" (ECAM) for memory-mapped access to configuration space. ECAM is
required for PCIe systems unless there's a standard firmware interface for
config access.
In the absence of a firmware interface, we use pci_generic_ecam_ops, and on
ACPI systems, we discover the ECAM space via the MCFG table and/or the _CBA
method.
Unfortunately some systems provide MCFG but don't implement ECAM according
to spec, so we need a mechanism for quirks to make those systems work.
Add an MCFG quirk mechanism to override the config accessor functions
and/or the memory-mapped address space.
A quirk is selected if it matches all of the following:
- OEM ID
- OEM Table ID
- OEM Revision
- PCI segment (from _SEG)
- PCI bus number range (from _CRS, wildcard allowed)
If the quirk specifies config accessor functions or a memory-mapped address
range, these override the defaults.
[bhelgaas: changelog, reorder quirk matching, fix oem_revision typo per
Duc, add under #ifdef CONFIG_PCI_QUIRKS] Signed-off-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tomasz Nowicki [Fri, 9 Sep 2016 19:24:03 +0000 (21:24 +0200)]
PCI/ACPI: Extend pci_mcfg_lookup() to return ECAM config accessors
pci_mcfg_lookup() is the external interface to the generic MCFG code.
Previously it merely looked up the ECAM base address for a given domain and
bus range. We want a way to add MCFG quirks, some of which may require
special config accessors and adjustments to the ECAM address range.
Extend pci_mcfg_lookup() so it can return a pointer to a pci_ecam_ops
structure and a struct resource for the ECAM address space. For now, it
always returns &pci_generic_ecam_ops (the standard accessor) and the
resource described by the MCFG.
No functional changes intended.
[bhelgaas: changelog] Signed-off-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Bjorn Helgaas [Fri, 2 Dec 2016 23:25:54 +0000 (17:25 -0600)]
arm64: PCI: Exclude ACPI "consumer" resources from host bridge windows
On x86 and ia64, we have treated all ACPI _CRS resources of PNP0A03 host
bridge devices as "producers", i.e., as host bridge windows. That's partly
because some x86 BIOSes improperly used "consumer" descriptors to describe
windows and partly because Linux didn't have good support for handling
consumer and producer descriptors differently.
One result is that x86 BIOSes describe host bridge "consumer" resources in
the _CRS of a PNP0C02 device, not the PNP0A03 device itself. On arm64 we
don't have a legacy of firmware that has this consumer/producer confusion,
so we can handle PNP0A03 "consumer" descriptors as host bridge registers
instead of windows.
Exclude non-window ("consumer") resources from the list of host bridge
windows. This allows the use of "consumer" PNP0A03 descriptors for bridge
register space.
Tomasz Nowicki [Thu, 24 Nov 2016 11:05:23 +0000 (12:05 +0100)]
arm64: PCI: Manage controller-specific data on per-controller basis
Currently we use one shared global acpi_pci_root_ops structure to keep
controller-specific ops. We pass its pointer to acpi_pci_root_create() and
associate it with a host bridge instance for good. Such a design implies
serious drawback. Any potential manipulation on the single system-wide
acpi_pci_root_ops leads to kernel crash. The structure content is not
really changing even across multiple host bridges creation; thus it was not
an issue so far.
In preparation for adding ECAM quirks mechanism (where controller-specific
PCI ops may be different for each host bridge) allocate new
acpi_pci_root_ops and fill in with data for each bridge. Now it is safe to
have different controller-specific info. As a consequence free
acpi_pci_root_ops when host bridge is released.
No functional changes in this patch.
Signed-off-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Bjorn Helgaas [Wed, 30 Nov 2016 20:48:33 +0000 (14:48 -0600)]
arm64: PCI: Search ACPI namespace to ensure ECAM space is reserved
The static MCFG table tells us the base of ECAM space, but it does not
reserve the space -- the reservation should be done via a device in the
ACPI namespace whose _CRS includes the ECAM region.
Use acpi_resource_consumer() to check whether the ECAM space is reserved by
an ACPI namespace device. If it is, emit a message showing which device
reserves it. If not, emit a "[Firmware Bug]" warning.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Bjorn Helgaas [Wed, 30 Nov 2016 20:47:13 +0000 (14:47 -0600)]
ACPI: Add acpi_resource_consumer() to find device that claims a resource
Add acpi_resource_consumer(). This takes a struct resource and searches
the ACPI namespace for a device whose current resource settings (_CRS)
includes the resource. It returns the device if it exists, or NULL if no
device uses the resource.
If more than one device uses the resource (this may happen in the case of
bridges), acpi_resource_consumer() returns the first one found by
acpi_get_devices() in its modified depth-first walk of the namespace.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Mon, 28 Nov 2016 22:17:41 +0000 (16:17 -0600)]
PCI: Add comments about ROM BAR updating
pci_update_resource() updates a hardware BAR so its address matches the
kernel's struct resource UNLESS it's a disabled ROM BAR. We only update
those when we enable the ROM.
It's not obvious from the code why ROM BARs should be handled specially.
Apparently there are Matrox devices with defective ROM BARs that read as
zero when disabled. That means that if pci_enable_rom() reads the disabled
BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and
writes it back, it would enable the ROM at address zero.
Add comments and references to explain why we can't make the code look more
rational.
The code changes are from 755528c860b0 ("Ignore disabled ROM resources at
setup") and 8085ce084c0f ("[PATCH] Fix PCI ROM mapping").
Bjorn Helgaas [Mon, 28 Nov 2016 23:21:02 +0000 (17:21 -0600)]
PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.
Bjorn Helgaas [Mon, 28 Nov 2016 22:43:06 +0000 (16:43 -0600)]
PCI: Don't update VF BARs while VF memory space is enabled
If we update a VF BAR while it's enabled, there are two potential problems:
1) Any driver that's using the VF has a cached BAR value that is stale
after the update, and
2) We can't update 64-bit BARs atomically, so the intermediate state
(new lower dword with old upper dword) may conflict with another
device, and an access by a driver unrelated to the VF may cause a bus
error.
Warn about attempts to update VF BARs while they are enabled. This is a
programming error, so use dev_WARN() to get a backtrace.
Bjorn Helgaas [Mon, 28 Nov 2016 15:15:52 +0000 (09:15 -0600)]
PCI: Separate VF BAR updates from standard BAR updates
Previously pci_update_resource() used the same code path for updating
standard BARs and VF BARs in SR-IOV capabilities.
Split the VF BAR update into a new pci_iov_update_resource() internal
interface, which makes it simpler to compute the BAR address (we can get
rid of pci_resource_bar() and pci_iov_resource_bar()).
This patch:
- Renames pci_update_resource() to pci_std_update_resource(),
- Adds pci_iov_update_resource(),
- Makes pci_update_resource() a wrapper that calls the appropriate one,
hv_do_hypercall() assumes that we pass a segment from a physically
contiguous buffer. A buffer allocated on the stack may not work if
CONFIG_VMAP_STACK=y is set.
Use kmalloc() to allocate this buffer.
Reported-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Bjorn Helgaas [Tue, 29 Nov 2016 14:14:47 +0000 (08:14 -0600)]
PCI: Update BARs using property bits appropriate for type
The BAR property bits (0-3 for memory BARs, 0-1 for I/O BARs) are supposed
to be read-only, but we do save them in res->flags and include them when
updating the BAR.
Mask the I/O property bits with ~PCI_BASE_ADDRESS_IO_MASK (0x3) instead of
PCI_REGION_FLAG_MASK (0xf) to make it obvious that we can't corrupt bits
2-3 of I/O addresses.
Use PCI_ROM_ADDRESS_MASK for ROM BARs. This means we'll only check the top
21 bits (instead of the 28 bits we used to check) of a ROM BAR to see if
the update was successful.
Bjorn Helgaas [Mon, 28 Nov 2016 17:19:27 +0000 (11:19 -0600)]
PCI: Ignore BAR updates on virtual functions
VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.
We already ignore these updates because of 70675e0b6a1a ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.
Gavin Shan [Wed, 26 Oct 2016 01:15:35 +0000 (12:15 +1100)]
PCI: Do any VF BAR updates before enabling the BARs
Previously we enabled VFs and enable their memory space before calling
pcibios_sriov_enable(). But pcibios_sriov_enable() may update the VF BARs:
for example, on PPC PowerNV we may change them to manage the association of
VFs to PEs.
Because 64-bit BARs cannot be updated atomically, it's unsafe to update
them while they're enabled. The half-updated state may conflict with other
devices in the system.
Call pcibios_sriov_enable() before enabling the VFs so any BAR updates
happen while the VF BARs are disabled.
Ray Jui [Tue, 22 Nov 2016 01:48:30 +0000 (17:48 -0800)]
PCI: iproc: Fix incorrect MSI address alignment
In the code to handle PAXB v2 based MSI steering, the logic aligns the MSI
register address to the size of supported inbound mapping range. This is
incorrect since it rounds "up" the starting address to the next aligned
address, but what we want is the starting address to be rounded "down" to
the aligned address.
This patch fixes the issue and allows MSI writes to be properly steered to
the GIC.
Fixes: 4b073155fbd3 ("PCI: iproc: Add support for the next-gen PAXB controller") Signed-off-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCI: qcom: Add support for MSM8996 PCIe controller
Add support for the MSM8996/APQ8096 PCIe controller. MSM8996 supports Gen
1/2, one lane, 3 PCIe root complexes with support for MSI and legacy
interrupts, and it conforms to PCI Express Base 2.1 specification.
Add a post_init callback to qcom_pcie_ops, as the PCIe pipe clocks are only
setup after the phy is powered on. It also adds an ltssm_enable callback
as it is very much different from other supported SoCs in the driver.
Noa Osherovich [Tue, 15 Nov 2016 08:00:00 +0000 (10:00 +0200)]
PCI: Support INTx masking on ConnectX-4 with firmware x.14.1100+
Mellanox devices were marked as having INTx masking ability broken. As a
result, the VFIO driver fails to start when more than one device function
is passed-through to a VM if both have the same INTx pin.
Prior to Connect-IB, Mellanox devices exposed to the operating system one
PCI function per all ports. Starting from Connect-IB, the devices are
function-per-port. When passing the second function to a VM, VFIO will
fail to start.
Exclude ConnectX-4, ConnectX4-Lx and Connect-IB from the list of Mellanox
devices marked as having broken INTx masking:
- ConnectX-4 and ConnectX4-LX firmware version is checked. If INTx
masking is supported, we unmark the broken INTx masking.
- Connect-IB does not support INTx currently so will not cause any
problem.
[bhelgaas: call pci_disable_device() always, after iounmap()] Fixes: 11e42532ada3 ("PCI: Assume all Mellanox devices have broken INTx masking") Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Bjorn Helgaas [Mon, 31 Oct 2016 21:00:01 +0000 (16:00 -0500)]
PCI: Warn on possible RW1C corruption for sub-32 bit config writes
Hardware that supports only 32-bit config writes is not spec-compliant.
For example, if software performs a 16-bit write, we must do a 32-bit read,
merge in the 16 bits we intend to write, followed by a 32-bit write. If
the 16 bits we *don't* intend to write happen to have any RW1C (write-one-
to-clear) bits set, we just inadvertently cleared something we shouldn't
have.
Add a rate-limited warning when we do sub-32 bit config writes. Remove
similar probe-time warnings from some of the affected host bridge drivers.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Enthusiastically-Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> # rockchip Acked-by: Thierry Reding <treding@nvidia.com>
Emil Velikov [Mon, 21 Nov 2016 22:24:49 +0000 (16:24 -0600)]
PCI: Create revision file in sysfs
Currently the revision isn't available via sysfs/libudev thus if one wants
to know the value one needs to read through the config file, which can be
quite time-consuming because it wakes/powers up the device.
There are at least two userspace components which could make use the new
file: libpciaccess and libdrm. The former wakes up _every_ PCI device,
which can be observed via glxinfo when using Mesa 10.0+ drivers. The
latter, in association with Mesa 13.0, can lead to 2-3 second delays while
starting firefox, thunderbird or chromium.
Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502 Tested-by: Mauro Santos <registo.mailling@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch CC: Greg KH <gregkh@linuxfoundation.org>
Lukas Wunner [Fri, 28 Oct 2016 08:52:06 +0000 (10:52 +0200)]
PCI: pciehp: Add runtime PM support for PCIe hotplug ports
Linux 4.8 added support for runtime suspending PCIe ports to D3hot with
commit 006d44e49a25 ("PCI: Add runtime PM support for PCIe ports"), but
excluded hotplug ports. Those are now afforded runtime PM by the present
commit.
Hotplug ports require a few extra considerations:
- The configuration space of the port remains accessible in D3hot, so all
the functions to read or modify the Slot Status and Slot Control
registers need not be modified. Even turning on slot power doesn't seem
to require the port to be in D0, at least the PCIe spec doesn't say so
and I confirmed that by testing with a Thunderbolt controller.
- However D0 is required to access devices on the secondary bus. This
happens in pciehp_check_link_status() and pciehp_configure_device() (both
called from board_added()) and in pciehp_unconfigure_device() (called
from remove_board()), so acquire a runtime PM ref for their invocation.
- The hotplug port stays active as long as it has active children. If all
hotplugged devices below the port runtime suspend, the port is allowed to
runtime suspend as well. Plug and unplug detection continues to work in
D3hot.
- Hotplug interrupts are delivered in-band, so while the hotplug port
itself is allowed to go to D3hot, its parent ports must stay in D0 for
interrupts to come through. Add a corresponding restriction to
pci_dev_check_d3cold().
- Runtime PM may only be allowed if the hotplug port is handled natively by
the OS. On ACPI systems, the port may alternatively be handled by the
firmware and things break if the OS puts the port into D3 behind the
firmware's back: E.g. Thunderbolt hotplug ports on non-Macs are handled
by Intel's firmware in System Management Mode and the firmware is known
to access devices on the port's secondary bus without checking first if
the port is in D0: https://bugzilla.kernel.org/show_bug.cgi?id=53811
Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CC: Mika Westerberg <mika.westerberg@linux.intel.com>
Lukas Wunner [Fri, 28 Oct 2016 08:52:06 +0000 (10:52 +0200)]
ACPI / hotplug / PCI: Make device_is_managed_by_native_pciehp() public
We're about to add runtime PM of hotplug ports, but we need to restrict it
to ports that are handled natively by the OS: If they're handled by the
firmware (which is the case for Thunderbolt on non-Macs), things would
break if the OS put the ports into D3hot behind the firmware's back.
To determine if a hotplug port is handled natively, one has to walk up from
the port to the root bridge and check the cached _OSC Control Field for the
value of the "PCI Express Native Hot Plug control" bit. There's already a
function to do that, device_is_managed_by_native_pciehp(), but it's private
to drivers/pci/hotplug/acpiphp_glue.c and only compiled in if
CONFIG_HOTPLUG_PCI_ACPI is enabled.
Make it public and move it to drivers/pci/pci-acpi.c, so that it is
available in the more general CONFIG_ACPI case.
The function contains a check if the device in question is a hotplug port
and returns false if it's not. The caller we're going to add doesn't need
this as it only calls the function if it actually *is* a hotplug port.
Move the check out of the function into the single existing caller.
Rename it to pciehp_is_native() and add some kerneldoc and polish.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukas Wunner [Fri, 28 Oct 2016 08:52:06 +0000 (10:52 +0200)]
ACPI / hotplug / PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
We cache the PCI_EXP_SLTCAP_HPC bit in pci_dev->is_hotplug_bridge on device
probe, so there's no need to read it again when adding the ACPI hotplug
context.
Here's the call chain to prove that no ordering issue is introduced:
pci_scan_child_bus [drivers/pci/probe.c]
pci_scan_slot
pci_scan_single_device
pci_scan_device
pci_setup_device
set_pcie_hotplug_bridge
[is_hotplug_bridge bit is set here]
pci_scan_bridge
pci_add_new_bus
pci_alloc_child_bus
pcibios_add_bus [arch/(x86|arm64|ia64)/...]
acpi_pci_add_bus [drivers/pci/pci-acpi.c]
acpiphp_enumerate_slots [drivers/pci/hotplug/acpiphp_glue.c]
acpiphp_add_context
device_is_managed_by_native_pciehp
[is_hotplug_bridge bit is queried here]
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>