git.karo-electronics.de Git - linux-beck.git/log

Merge branch 'pci/host-vmd' into next

* pci/host-vmd:
  PCI: vmd: Fix suspend handlers defined-but-not-used warning
  PCI: vmd: Use SRCU as a local RCU to prevent delaying global RCU
  PCI: vmd: Remove unnecessary pci_set_drvdata()

Merge branch 'pci/host-tegra' into next

* pci/host-tegra:
  arm64: tegra: Enable PCIe on Jetson TX1
  arm64: tegra: Add PCIe host bridge on Tegra210
  PCI: tegra: Enable the driver on 64-bit ARM
  PCI: tegra: Add Tegra210 support
  PCI: tegra: Implement PCA enable workaround
  dt-bindings: pci: tegra: Add Tegra210 support
  PCI: tegra: Use new pci_register_host_bridge() interface
  PCI: Export host bridge registration interface
  PCI: Allow driver-specific data in host bridge
  PCI: Add pci_register_host_bridge() interface

Merge branch 'pci/host-spear' into next

* pci/host-spear:
PCI: spear: Use builtin_platform_driver() to simplify the code

Merge branch 'pci/host-rockchip' into next

* pci/host-rockchip:
  PCI: rockchip: Move the deassert of pm/aclk/pclk after phy_init()
  PCI: rockchip: Split out rockchip_cfg_atu()
  PCI: rockchip: Clean up bit definitions for PCIE_RC_CONFIG_LCS
  PCI: rockchip: Correct the use of FTS mask
  PCI: rockchip: Remove the pointer to L1 substate cap
  PCI: rockchip: Specify the link capability
  PCI: rockchip: Fix negotiated lanes calculation
  PCI: rockchip: Add Kconfig COMPILE_TEST
  PCI: rockchip: Mark RC as common clock architecture
  PCI: rockchip: Provide captured slot power limit and scale
  PCI: rockchip: Add three new resets as required properties
  PCI: Don't attempt to claim shadow copies of ROM
  PCI: designware: Check for iATU unroll support after initializing host
  PCI: qcom: Fix pp->dev usage before assignment
  PCI: designware-plat: Update author email address
  PCI: layerscape: Fix drvdata usage before assignment
  PCI: designware-plat: Change maintainer to Jose Abreu

Merge branch 'pci/host-rcar' into next

* pci/host-rcar:
  PCI: rcar: Add gen3 fallback compatibility string for pcie-rcar
  PCI: rcar: Use gen2 fallback compatibility last
  PCI: rcar-gen2: Use gen2 fallback compatibility last

Merge branch 'pci/host-qcom' into next

* pci/host-qcom:
PCI: qcom: Add support for MSM8996 PCIe controller

Merge branch 'pci/host-layerscape' into next

* pci/host-layerscape:
PCI: layerscape: Add LS1046a support
PCI: layerscape: Remove redundant error message from ls_pcie_probe()

Merge branch 'pci/host-iproc' into next

* pci/host-iproc:
  PCI: iproc: Skip check for legacy IRQ on PAXC buses
  PCI: iproc: Fix incorrect MSI address alignment
  PCI: iproc: Add support for the next-gen PAXB controller
  PCI: iproc: Add PAXBv2 binding info
  PCI: iproc: Add inbound DMA mapping support
  PCI: iproc: Add optional dma-ranges
  PCI: iproc: Make outbound mapping code more generic
  PCI: iproc: Remove redundant outbound properties
  PCI: iproc: Add PAXC v2 support
  PCI: iproc: Add PAXCv2 related binding
  PCI: iproc: Fix exception with multi-function devices
  PCI: iproc: Add BCMA type
  PCI: iproc: Do not reset PAXC when initializing the driver
  PCI: iproc: Improve core register population

Merge branch 'pci/host-imx6' into next

* pci/host-imx6:
MAINTAINERS: Add devicetree binding to PCI i.MX6 entry
MAINTAINERS: Update Richard Zhu's email address

Merge branch 'pci/host-hv' into next

* pci/host-hv:
  PCI: hv: Allocate physically contiguous hypercall params buffer
  PCI: hv: Delete the device earlier from hbus->children for hot-remove
  PCI: hv: Fix hv_pci_remove() for hot-remove
  PCI: hv: Use the correct buffer size in new_pcichild_device()
  PCI: hv: Make unnecessarily global IRQ masking functions static

Merge branch 'pci/host-hisi' into next

* pci/host-hisi:
PCI: hisi: Remove redundant error message from hisi_pcie_probe()

Merge branch 'pci/host-altera' into next

* pci/host-altera:
PCI: altera: Remove redundant error message in altera_pcie_parse_dt()
PCI: altera: Use builtin_platform_driver() to simplify the code

Merge branch 'pci/host' into next

* pci/host:
of/pci: Add of_pci_get_max_link_speed() to parse max-link-speed from DT
Documentation/devicetree: Add PCIe max-link-speed property

Merge branch 'pci/virtualization' into next

* pci/virtualization:
  PCI: Add comments about ROM BAR updating
  PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
  PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
  PCI: Don't update VF BARs while VF memory space is enabled
  PCI: Separate VF BAR updates from standard BAR updates
  PCI: Update BARs using property bits appropriate for type
  PCI: Ignore BAR updates on virtual functions
  PCI: Do any VF BAR updates before enabling the BARs
  PCI: Support INTx masking on ConnectX-4 with firmware x.14.1100+
  PCI: Convert Mellanox broken INTx quirks to be for listed devices only
  PCI: Convert broken INTx masking quirks from HEADER to FINAL
  net/mlx4_core: Use device ID defines
  PCI: Add Mellanox device IDs

Merge branch 'pci/pm' into next

* pci/pm:
  x86/platform/intel-mid: Constify mid_pci_platform_pm
  PCI: pciehp: Add runtime PM support for PCIe hotplug ports
  ACPI / hotplug / PCI: Make device_is_managed_by_native_pciehp() public
  ACPI / hotplug / PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
  PCI: Unfold conditions to block runtime PM on PCIe ports
  PCI: Consolidate conditions to allow runtime PM on PCIe ports
  PCI: Activate runtime PM on a PCIe port only if it can suspend
  PCI: Speed up algorithm in pci_bridge_d3_update()
  PCI: Autosense device removal in pci_bridge_d3_update()
  PCI: Don't acquire ref on parent in pci_bridge_d3_update()
  USB: UHCI: report non-PME wakeup signalling for Intel hardware
  PCI: Check for PME in targeted sleep state

Merge branch 'pci/msi' into next

* pci/msi:
PCI/MSI: Check for NULL affinity mask in pci_irq_get_affinity()

Merge branch 'pci/misc' into next

* pci/misc:
  PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)
  PCI: Expand "VPD access disabled" quirk message
  PCI: pciehp: Remove loading message
  PCI: hotplug: Remove hotplug core message
  PCI: Remove service driver load/unload messages
  PCI/AER: Log AER IRQ when claiming Root Port
  PCI/AER: Log errors with PCI device, not PCIe service device
  PCI/AER: Remove unused version macros
  PCI/PME: Log PME IRQ when claiming Root Port
  PCI/PME: Drop unused support for PMEs from Root Complex Event Collectors
  PCI: Move config space size macros to pci_regs.h

Merge branch 'pci/hotplug' into next

* pci/hotplug:
  PCI: pciehp: Leave power indicator on when enabling already-enabled slot
  PCI: pciehp: Prioritize data-link event over presence detect
  PCI: cpqphp: Add missing call to pci_disable_device()

Merge branch 'pci/enumeration' into next

* pci/enumeration:
PCI: Warn on possible RW1C corruption for sub-32 bit config writes
PCI: Create revision file in sysfs

Merge branch 'pci/ecam' into next

* pci/ecam:
  PCI: Explain ARM64 ACPI/MCFG quirk Kconfig and build strategy
  PCI: Add MCFG quirks for X-Gene host controller
  PCI: Add MCFG quirks for Cavium ThunderX pass1.x host controller
  PCI: Add MCFG quirks for Cavium ThunderX pass2.x host controller
  PCI: thunder-pem: Factor out resource lookup
  PCI: Add MCFG quirks for HiSilicon Hip05/06/07 host controllers
  PCI: Add MCFG quirks for Qualcomm QDF2432 host controller
  PCI/ACPI: Provide acpi_get_rc_resources() for ARM64 platform
  PCI/ACPI: Check for platform-specific MCFG quirks
  PCI/ACPI: Extend pci_mcfg_lookup() to return ECAM config accessors
  arm64: PCI: Exclude ACPI "consumer" resources from host bridge windows
  arm64: PCI: Manage controller-specific data on per-controller basis
  arm64: PCI: Search ACPI namespace to ensure ECAM space is reserved
  arm64: PCI: Add local struct device pointers
  ACPI: Add acpi_resource_consumer() to find device that claims a resource

Merge branch 'pci/aspm' into next

* pci/aspm:
PCI/ASPM: Don't retrain link if ASPM not possible
PCI/ASPM: Use permission-specific DEVICE_ATTR variants

PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)

There is at least one Chelsio 10Gb card which uses VPD area to store some
non-standard blocks (example below).  However pci_vpd_size() returns the
length of the first block only assuming that there can be only one VPD "End
Tag".

Since 4e1a635552d3 ("vfio/pci: Use kernel VPD access functions"), VFIO
blocks access beyond that offset, which prevents the guest "cxgb3" driver
from probing the device.  The host system does not have this problem as its
driver accesses the config space directly without pci_read_vpd().

Add a quirk to override the VPD size to a bigger value.  The maximum size
is taken from EEPROMSIZE in drivers/net/ethernet/chelsio/cxgb3/common.h.
We do not read the tag as the cxgb3 driver does as the driver supports
writing to EEPROM/VPD and when it writes, it only checks for 8192 bytes
boundary.  The quirk is registered for all devices supported by the cxgb3
driver.

This adds a quirk to the PCI layer (not to the cxgb3 driver) as the cxgb3
driver itself accesses VPD directly and the problem only exists with the
vfio-pci driver (when cxgb3 is not running on the host and may not be even
loaded) which blocks accesses beyond the first block of VPD data.  However
vfio-pci itself does not have quirks mechanism so we add it to PCI.

This is the controller:
Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]

This is what I parsed from its VPD:
===
b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 FN\x0746K'
0000 Large item 42 bytes; name 0x2 Identifier String
b'10 Gigabit Ethernet-SR PCI Express Adapter'
002d Large item 74 bytes; name 0x10
#00 [EC] len=7: b'D76809 '
#0a [FN] len=7: b'46K7897'
#14 [PN] len=7: b'46K7897'
#1e [MN] len=4: b'1037'
#25 [FC] len=4: b'5769'
#2c [SN] len=12: b'YL102035603V'
#3b [NA] len=12: b'00145E992ED1'
007a Small item 1 bytes; name 0xf End Tag

0c00 Large item 16 bytes; name 0x2 Identifier String
b'S310E-SR-X      '
0c13 Large item 234 bytes; name 0x10
#00 [PN] len=16: b'TBD             '
#13 [EC] len=16: b'110107730D2     '
#26 [SN] len=16: b'97YL102035603V  '
#39 [NA] len=12: b'00145E992ED1'
#48 [V0] len=6: b'175000'
#51 [V1] len=6: b'266666'
#5a [V2] len=6: b'266666'
#63 [V3] len=6: b'2000  '
#6c [V4] len=2: b'1 '
#71 [V5] len=6: b'c2    '
#7a [V6] len=6: b'0     '
#83 [V7] len=2: b'1 '
#88 [V8] len=2: b'0 '
#8d [V9] len=2: b'0 '
#92 [VA] len=2: b'0 '
#97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
0d00 Large item 252 bytes; name 0x11
#00 [VC] len=16: b'122310_1222 dp  '
#13 [VD] len=16: b'610-0001-00 H1\x00\x00'
#26 [VE] len=16: b'122310_1353 fp  '
#39 [VF] len=16: b'610-0001-00 H1\x00\x00'
#4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
0dff Small item 0 bytes; name 0xf End Tag

10f3 Large item 13315 bytes; name 0x62
!!! unknown item name 98: b'\xd0\x03\x00@`\x0c\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00'
===

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: Expand "VPD access disabled" quirk message

It's not very enlightening to see

pci 0000:07:00.0: [Firmware Bug]: VPD access disabled

in the dmesg log because there's no clue about what the firmware bug is.
Expand the message to explain why we're disabling VPD.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Remove loading message

Remove the "PCI Express Hot Plug Controller Driver" version message. I
don't think it contains any useful information. Remove unused #defines
and move the author information to a comment.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: hotplug: Remove hotplug core message

Remove the "PCI Hot Plug PCI Core" version message. I don't think it
contains any useful information. Remove unused #defines and move the
author information to a comment.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: Remove service driver load/unload messages

Remove the "service driver %s loaded" and unloaded messages. All service
drivers already log something in their probe functions, where they can log
more useful details.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI/AER: Log AER IRQ when claiming Root Port

Add a log message when we enable AER on a Root Port and the hierarchy below
it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI/AER: Log errors with PCI device, not PCIe service device

All other AER-related log messages use the PCI device, e.g.,
"pci 0000:00:1c.0", not the PCIe service device, e.g.,
"aer 0000:00:1c.0:pcie02".

Change the probe error messages to match the rest and include a little
context.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI/AER: Remove unused version macros

Remove the unused DRIVER_VERSION, DRIVER_AUTHOR, and DRIVER_DESC macros.
The author information is already included in a comment above.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI/PME: Log PME IRQ when claiming Root Port

We already log a "Signaling PME" whenever the PME service driver claims a
Root Port. In fact, we also log the same message for every device in the
hierarchy below the Root Port.

Log the "Signaling PME" once (only for the Root Port, since we can
trivially find out which devices are below the Root Port), and include the
IRQ number in the message to help connect the dots with /proc/interrupts.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

PCI/PME: Drop unused support for PMEs from Root Complex Event Collectors

Since we register pcie_pme_driver only for PCI_EXP_TYPE_ROOT_PORT, the PME
driver never claims Root Complex Event Collectors.

Remove unused code related to Root Complex Event Collectors.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

PCI: Move config space size macros to pci_regs.h

Move PCI configuration space size macros (PCI_CFG_SPACE_SIZE and
PCI_CFG_SPACE_EXP_SIZE) from drivers/pci/pci.h to
include/uapi/linux/pci_regs.h so they can be used by more drivers and
eliminate duplicate definitions.

[bhelgaas: Expand comment to include PCI-X details]
Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

x86/platform/intel-mid: Constify mid_pci_platform_pm

This struct never needs to be modified.  The size of pci-mid.o ELF
sections changes thusly:

  -.data          56
  +.data           0
  -.rodata        32
  +.rodata        88

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>

PCI/ASPM: Don't retrain link if ASPM not possible

Some (defective) PCIe devices are not able to reliably do link retraining.

Check to see if ASPM is possible between link partners before configuring
common clocking, and doing the resulting link retraining. If ASPM is not
possible, there is no reason to risk losing access to a device due to an
unnecessary link retraining.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: iproc: Skip check for legacy IRQ on PAXC buses

PAXC and PAXCv2 buses do not support legacy IRQs so there is no reason to
even try and map them. Without a change like this, one cannot create VFs
on Nitro ports since legacy interrupts are checked as part of the PCI
device creation process. Testing on PAXC hardware showed that VFs are
properly created with only the change to not set pcie->map_irq, but just to
be safe the change in iproc_pcie_setup() will ensure that pdev_fixup_irq()
will not panic.

Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Ray Jui <ray.jui@broadcom.com>

PCI: pciehp: Leave power indicator on when enabling already-enabled slot

If an error occurs when enabling a slot, pciehp_power_thread() turns off
the power indicator. But if the only error is that the slot was already
enabled, we should leave the power indicator on.

Return success if called to enable an already-enabled slot.
This is in the same spirit of the special handling for EEXISTS when
pciehp_configure_device() determines the slot devices already exist.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>

PCI: pciehp: Prioritize data-link event over presence detect

If Slot Status indicates changes in both Data Link Layer Status and
Presence Detect, prioritize the Link status change.

When both events are observed, pciehp currently relies on the Slot Status
Presence Detect State (PDS) to agree with the Link Status Data Link Layer
Active status. The Presence Detect State, however, may be set to 1 through
out-of-band presence detect even if the link is down, which creates
conflicting events.

Since the Link Status accurately reflects the reachability of the
downstream bus, the Link Status event should take precedence over a
Presence Detect event. Skip checking the PDC status if we handled a link
event in the same handler.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>

PCI: rcar: Add gen3 fallback compatibility string for pcie-rcar

Add fallback compatibility string for the R-Car Gen 3 family. This is in
keeping with the both the existing fallback compatibility string for the
R-Car Gen 2 family and the fallback scheme being adopted wherever
appropriate for drivers for Renesas SoCs.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: rcar: Use gen2 fallback compatibility last

Improve readability by listing fallback compatibility strings after the
more-specific compatibility strings they provide a fallback for.

This does not affect run-time behaviour as it is the order in the DTB that
determines which compatibility string is used.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: rcar-gen2: Use gen2 fallback compatibility last

Improve readability by listing fallback compatibility strings after the
more-specific compatibility strings they provide a fallback for.

This does not affect run-time behaviour as it is the order in the DTB that
determines which compatibility string is used.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: rockchip: Move the deassert of pm/aclk/pclk after phy_init()

Move deassert of pm/aclk/pclk after phy_init() as we want to optimize the
logic of reset control and reuse rockchip_pcie_init_port() later which
should fully follow the cold boot procedure of ROM code.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>

PCI: rockchip: Split out rockchip_cfg_atu()

Split out a new function, rockchip_cfg_atu(), in order to re-configure the
ATU when missing these information after wakeup from S3.

[bhelgaas: add "dev" temporary, return 0 when known]
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>

PCI: rockchip: Clean up bit definitions for PCIE_RC_CONFIG_LCS

PCIE_RC_CONFIG_LCS contains control and status bits specific to the PCIe
link. The layout for this register looks the same as the existing
PCI_EXP_LNKCTL and PCI_EXP_LNKSTA. So let's reuse them.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: rockchip: Correct the use of FTS mask

We're trying to mask out bits[23:8] while retaining [32:24, 7:0], but we're
doing the inverse. That doesn't have too much effect, since we're setting
all the [23:8] bits to 1, and the other bits are only relevant for modes
we're currently not using. But we should get this right.

Fixes: ca1989084054 ("PCI: rockchip: Fix wrong transmitted FTS count")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>

PCI: rockchip: Remove the pointer to L1 substate cap

Per the errata of TRM, the RC can't support L1 substate, so remove the L1
substate cap as well as operation for PCIE_RC_CONFIG_L1_SUBSTATE_CTRL2.

Tested-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: rockchip: Specify the link capability

rk3399 supports PCIe 2.x link speeds marginally at best, and on some
boards, the link won't train at 5 GT/s at all. Rather than sacrifice 500ms
waiting for training that will never happen, let's use the helper function,
of_pci_get_max_link_speed(), to get the max link speed from DT and specify
link capability.

Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: rockchip: Fix negotiated lanes calculation

The calculation of negotiated lanes is wrong: it should be shifted by
PCIE_CORE_PL_CONF_LANE_SHIFT, but it is shifted by
PCIE_CORE_PL_CONF_LANE_MASK instead. Let's fix it.

Fixes: e77f847df54c ("PCI: rockchip: Add Rockchip PCIe controller support")
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: rockchip: Add Kconfig COMPILE_TEST

Allow selection of the Rockchip driver for compile testing, even if we
aren't building for ARCH_ROCKCHIP.

[bhelgaas: changelog]
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: rockchip: Mark RC as common clock architecture

The default value of common clock configuration is zero indicating
Rockchip's RC is using asynchronous clock architecture but actually we are
using common clock. This will confuse some EP drivers if they need some
different settings referring to this value.

Set the Common Clock Configuration bit in the Link Control Register.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: rockchip: Provide captured slot power limit and scale

If vpcie3v3 is available, we could provide these information via RC's
configure register to make EP able to know the power limit.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: Explain ARM64 ACPI/MCFG quirk Kconfig and build strategy

Add Makefile comments to explain the Kconfig and build strategy for ARM64
drivers that work around not-quite-ECAM issues. No functional change
intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: spear: Use builtin_platform_driver() to simplify the code

Use builtin_platform_driver() helper to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: vmd: Fix suspend handlers defined-but-not-used warning

Fix the following warnings:

  drivers/pci/host/vmd.c:731:12: warning: ‘vmd_suspend’ defined but not used [-Wunused-function]
   static int vmd_suspend(struct device *dev)
              ^
  drivers/pci/host/vmd.c:739:12: warning: ‘vmd_resume’ defined but not used [-Wunused-function]
   static int vmd_resume(struct device *dev)
              ^

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Reviewed-by: Keith Busch <keith.busch@intel.com>

PCI: vmd: Use SRCU as a local RCU to prevent delaying global RCU

SRCU lets synchronize_srcu() depend on VMD-local RCU primitives, preventing
long delays from locking up RCU in other systems.  VMD performs a
synchronize when removing a device, but will hit all IRQ lists if the
device uses all VMD vectors.  This patch will not help VMD's RCU
synchronization, but will isolate the read side delays to the VMD
subsystem.  Additionally, the use of SRCU in VMD's ISR will keep it
isolated from any other RCU waiters in the rest of the system.

Tested using concurrent FIO and NVMe resets:

  [global]
  rw=read
  bs=4k
  direct=1
  ioengine=libaio
  iodepth=32
  norandommap
  timeout=300
  runtime=1000000000

  [nvme0]
  cpus_allowed=0-63
  numjobs=8
  filename=/dev/nvme0n1

  [nvme1]
  cpus_allowed=0-63
  numjobs=8
  filename=/dev/nvme1n1

  while (true) do
    for i in /sys/class/nvme/nvme*; do
      echo "Resetting ${i##*/}"
      echo 1 > $i/reset_controller;
      sleep 5
    done;
  done

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>

arm64: tegra: Enable PCIe on Jetson TX1

Enable the x4 PCIe and M.2 Key E slots on Jetson TX1. The Key E slot is
currently untested due to lack of hardware.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>

arm64: tegra: Add PCIe host bridge on Tegra210

Add the PCIe host bridge found on Tegra X1. It implements two root ports
that support x4 and x1 configurations, respectively.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>

PCI: tegra: Enable the driver on 64-bit ARM

The Tegra PCI host controller driver no longer relies on any of the 32-bit
ARM glue for PCI, so it can be enabled on 64-bit configurations.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>

PCI: tegra: Add Tegra210 support

The PCIe host controller found on Tegra X1 is very similar to its
predecessor on Tegra K1. A bug was introduced in the new revision that
is worked around by always enabling the performance counter, otherwise
accesses to configuration space will block for a number of seconds.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>

PCI: tegra: Implement PCA enable workaround

Tegra210's PCIe controller has a bug that requires the PCA (performance
counter) feature to be enabled. If this isn't done, accesses to device
configuration space will hang the chip for tens of seconds. Implement the
workaround.

Based on commit 514e19138af2 ("pci: tegra: implement PCA enable
workaround") from U-Boot by Stephen Warren <swarren@nvidia.com>.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>

dt-bindings: pci: tegra: Add Tegra210 support

Add support for the PCI host controller found on Tegra210 SoCs. It is very
similar to the variant found on Tegra124, with a couple of small
differences regarding the power supplies.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>

PCI: tegra: Use new pci_register_host_bridge() interface

Tegra is one of the remaining platforms that still use the traditional
pci_common_init_dev() interface for probing PCI host bridges.

This demonstrates how to convert it to the pci_register_host interface I
just added in a previous patch. This leads to a more linear probe sequence
that can handle errors better because we avoid callbacks into the driver,
and it makes the driver architecture independent.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>

PCI: Export host bridge registration interface

Allow PCI host bridge drivers to use the new host bridge interfaces to
register their host bridge.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>

PCI: Allow driver-specific data in host bridge

Provide a way to allocate driver-specific data along with a PCI host bridge
structure. The bridge's ->private field points to this data.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>

PCI: Add pci_register_host_bridge() interface

Make the existing pci_host_bridge structure a proper device that is usable
by PCI host drivers in a more standard way. In addition to the existing
pci_scan_bus(), pci_scan_root_bus(), pci_scan_root_bus_msi(), and
pci_create_root_bus() interfaces, this unfortunately means having to add
yet another interface doing basically the same thing, and add some extra
code in the initial step.

However, this time it's more likely to be extensible enough that we won't
have to do another one again in the future, and we should be able to reduce
code much more as a result.

The main idea is to pull the allocation of 'struct pci_host_bridge' out of
the registration, and let individual host drivers and architecture code
fill the members before calling the registration function.

There are a number of things we can do based on this:

* Use a single memory allocation for the driver-specific structure
  and the generic PCI host bridge
* consolidate the contents of driver-specific structures by moving
  them into pci_host_bridge
* Add a consistent interface for removing a PCI host bridge again
  when unloading a host driver module
* Replace the architecture specific __weak pcibios_*() functions with
  callbacks in a pci_host_bridge device
* Move common boilerplate code from host drivers into the generic
  function, based on contents of the structure
* Extend pci_host_bridge with additional members when needed without
  having to add arguments to pci_scan_*().
* Move members of struct pci_bus into pci_host_bridge to avoid
  having lots of identical copies.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>

PCI: Add MCFG quirks for X-Gene host controller

PCIe controllers in X-Gene SoCs are not ECAM compliant: software needs to
configure additional controller's register to address device at
bus:dev:function.

Add a quirk to discover controller MMIO register space and configure
controller registers to select and address the target secondary device.

The quirk will only be applied for X-Gene PCIe MCFG table with
OEM revison 1, 2, 3 or 4 (PCIe controller v1 and v2 on X-Gene SoCs).

Tested-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Duc Dang <dhdang@apm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: Add MCFG quirks for Cavium ThunderX pass1.x host controller

ThunderX pass1.x requires to emulate the EA headers for on-chip devices
hence it has to use custom pci_thunder_ecam_ops for accessing PCI config
space (pci-thunder-ecam.c). Add new entries to MCFG quirk array where it
can be applied while probing ACPI based PCI host controller.

ThunderX pass1.x is using the same way for accessing off-chip devices
(so-called PEM) as silicon pass-2.x so we need to add PEM quirk entries
too.

Quirk is considered for ThunderX silicon pass1.x only which is identified
via MCFG revision 2.

ThunderX pass 1.x requires the following accessors:

  NUMA node 0 PCI segments  0- 3: pci_thunder_ecam_ops (MCFG quirk)
  NUMA node 0 PCI segments  4- 9: thunder_pem_ecam_ops (MCFG quirk)
  NUMA node 1 PCI segments 10-13: pci_thunder_ecam_ops (MCFG quirk)
  NUMA node 1 PCI segments 14-19: thunder_pem_ecam_ops (MCFG quirk)

[bhelgaas: change Makefile/ifdefs so quirk doesn't depend on
CONFIG_PCI_HOST_THUNDER_ECAM]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: Add MCFG quirks for Cavium ThunderX pass2.x host controller

ThunderX PCIe controller to off-chip devices (so-called PEM) is not fully
compliant with ECAM standard. It uses non-standard configuration space
accessors (see thunder_pem_ecam_ops) and custom configuration space
granulation (see bus_shift = 24). In order to access configuration space
and probe PEM as ACPI-based PCI host controller we need to add MCFG quirk
infrastructure. This involves:
1. A new thunder_pem_acpi_init() init function to locate PEM-specific
   register ranges using ACPI.
2. Export PEM thunder_pem_ecam_ops structure so it is visible to MCFG quirk
   code.
3. New quirk entries for each PEM segment. Each contains platform IDs,
   mentioned thunder_pem_ecam_ops and CFG resources.

Quirk is considered for ThunderX silicon pass2.x only which is identified
via MCFG revision 1.

ThunderX pass 2.x requires the following accessors:

  NUMA Node 0 PCI segments  0- 3: pci_generic_ecam_ops (ECAM-compliant)
  NUMA Node 0 PCI segments  4- 9: thunder_pem_ecam_ops (MCFG quirk)
  NUMA Node 1 PCI segments 10-13: pci_generic_ecam_ops (ECAM-compliant)
  NUMA Node 1 PCI segments 14-19: thunder_pem_ecam_ops (MCFG quirk)

[bhelgaas: adapt to use acpi_get_rc_resources(), update Makefile/ifdefs so
quirk doesn't depend on CONFIG_PCI_HOST_THUNDER_PEM]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: thunder-pem: Factor out resource lookup

Pull the register resource lookup out of thunder_pem_init() so we can
easily add a corresponding lookup using ACPI. No functional change
intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: Add MCFG quirks for HiSilicon Hip05/06/07 host controllers

The PCIe controller in Hip05/Hip06/Hip07 SoCs is not completely
ECAM-compliant. It is non-ECAM only for the RC bus config space; for any
other bus underneath the root bus it does support ECAM access.

Add specific quirks for PCI config space accessors. This involves:
1. New initialization call hisi_pcie_init() to obtain RC base
addresses from PNP0C02 at the root of the ACPI namespace (under \_SB).
2. New entry in common quirk array.

[bhelgaas: move to pcie-hisi.c and change Makefile/ifdefs so quirk doesn't
depend on CONFIG_PCI_HISI]
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: Add MCFG quirks for Qualcomm QDF2432 host controller

The Qualcomm Technologies QDF2432 SoC does not support accesses smaller
than 32 bits to the PCI configuration space. Register the appropriate
quirk.

[bhelgaas: add QCOM_ECAM32 macro, ifdef for ACPI and PCI_QUIRKS]
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI/ACPI: Provide acpi_get_rc_resources() for ARM64 platform

The acpi_get_rc_resources() is used to get the RC register address that can
not be described in MCFG.  It takes the _HID & segment to look for and
outputs the RC address resource.  Use PNP0C02 devices to describe such RC
address resource.  Use _UID to match segment to tell which root bus the
PNP0C02 resource belongs to.

[bhelgaas: add dev argument, wrap in #ifdef CONFIG_PCI_QUIRKS]
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI/ACPI: Check for platform-specific MCFG quirks

The PCIe spec (r3.0, sec 7.2.2) specifies an "Enhanced Configuration Access
Mechanism" (ECAM) for memory-mapped access to configuration space.  ECAM is
required for PCIe systems unless there's a standard firmware interface for
config access.

In the absence of a firmware interface, we use pci_generic_ecam_ops, and on
ACPI systems, we discover the ECAM space via the MCFG table and/or the _CBA
method.

Unfortunately some systems provide MCFG but don't implement ECAM according
to spec, so we need a mechanism for quirks to make those systems work.

Add an MCFG quirk mechanism to override the config accessor functions
and/or the memory-mapped address space.

A quirk is selected if it matches all of the following:

  - OEM ID
  - OEM Table ID
  - OEM Revision
  - PCI segment (from _SEG)
  - PCI bus number range (from _CRS, wildcard allowed)

If the quirk specifies config accessor functions or a memory-mapped address
range, these override the defaults.

[bhelgaas: changelog, reorder quirk matching, fix oem_revision typo per
Duc, add under #ifdef CONFIG_PCI_QUIRKS]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI/ACPI: Extend pci_mcfg_lookup() to return ECAM config accessors

pci_mcfg_lookup() is the external interface to the generic MCFG code.
Previously it merely looked up the ECAM base address for a given domain and
bus range. We want a way to add MCFG quirks, some of which may require
special config accessors and adjustments to the ECAM address range.

Extend pci_mcfg_lookup() so it can return a pointer to a pci_ecam_ops
structure and a struct resource for the ECAM address space. For now, it
always returns &pci_generic_ecam_ops (the standard accessor) and the
resource described by the MCFG.

No functional changes intended.

[bhelgaas: changelog]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

arm64: PCI: Exclude ACPI "consumer" resources from host bridge windows

On x86 and ia64, we have treated all ACPI _CRS resources of PNP0A03 host
bridge devices as "producers", i.e., as host bridge windows.  That's partly
because some x86 BIOSes improperly used "consumer" descriptors to describe
windows and partly because Linux didn't have good support for handling
consumer and producer descriptors differently.

One result is that x86 BIOSes describe host bridge "consumer" resources in
the _CRS of a PNP0C02 device, not the PNP0A03 device itself.  On arm64 we
don't have a legacy of firmware that has this consumer/producer confusion,
so we can handle PNP0A03 "consumer" descriptors as host bridge registers
instead of windows.

Exclude non-window ("consumer") resources from the list of host bridge
windows.  This allows the use of "consumer" PNP0A03 descriptors for bridge
register space.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

arm64: PCI: Manage controller-specific data on per-controller basis

Currently we use one shared global acpi_pci_root_ops structure to keep
controller-specific ops. We pass its pointer to acpi_pci_root_create() and
associate it with a host bridge instance for good. Such a design implies
serious drawback. Any potential manipulation on the single system-wide
acpi_pci_root_ops leads to kernel crash. The structure content is not
really changing even across multiple host bridges creation; thus it was not
an issue so far.

In preparation for adding ECAM quirks mechanism (where controller-specific
PCI ops may be different for each host bridge) allocate new
acpi_pci_root_ops and fill in with data for each bridge. Now it is safe to
have different controller-specific info. As a consequence free
acpi_pci_root_ops when host bridge is released.

No functional changes in this patch.

Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

arm64: PCI: Search ACPI namespace to ensure ECAM space is reserved

The static MCFG table tells us the base of ECAM space, but it does not
reserve the space -- the reservation should be done via a device in the
ACPI namespace whose _CRS includes the ECAM region.

Use acpi_resource_consumer() to check whether the ECAM space is reserved by
an ACPI namespace device. If it is, emit a message showing which device
reserves it. If not, emit a "[Firmware Bug]" warning.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

arm64: PCI: Add local struct device pointers

Use a local "struct device *dev" for brevity. No functional change
intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

ACPI: Add acpi_resource_consumer() to find device that claims a resource

Add acpi_resource_consumer(). This takes a struct resource and searches
the ACPI namespace for a device whose current resource settings (_CRS)
includes the resource. It returns the device if it exists, or NULL if no
device uses the resource.

If more than one device uses the resource (this may happen in the case of
bridges), acpi_resource_consumer() returns the first one found by
acpi_get_devices() in its modified depth-first walk of the namespace.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

PCI: Add comments about ROM BAR updating

pci_update_resource() updates a hardware BAR so its address matches the
kernel's struct resource UNLESS it's a disabled ROM BAR. We only update
those when we enable the ROM.

It's not obvious from the code why ROM BARs should be handled specially.
Apparently there are Matrox devices with defective ROM BARs that read as
zero when disabled. That means that if pci_enable_rom() reads the disabled
BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and
writes it back, it would enable the ROM at address zero.

Add comments and references to explain why we can't make the code look more
rational.

The code changes are from 755528c860b0 ("Ignore disabled ROM resources at
setup") and 8085ce084c0f ("[PATCH] Fix PCI ROM mapping").

Link: https://lkml.org/lkml/2005/8/30/138
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE

Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

PCI: Remove pci_resource_bar() and pci_iov_resource_bar()

pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.

Compute the BAR address inline and remove pci_resource_bar(). That makes
pci_iov_resource_bar() unused, so remove that as well.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

PCI: Don't update VF BARs while VF memory space is enabled

If we update a VF BAR while it's enabled, there are two potential problems:

  1) Any driver that's using the VF has a cached BAR value that is stale
     after the update, and

  2) We can't update 64-bit BARs atomically, so the intermediate state
     (new lower dword with old upper dword) may conflict with another
     device, and an access by a driver unrelated to the VF may cause a bus
     error.

Warn about attempts to update VF BARs while they are enabled.  This is a
programming error, so use dev_WARN() to get a backtrace.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

PCI: Separate VF BAR updates from standard BAR updates

Previously pci_update_resource() used the same code path for updating
standard BARs and VF BARs in SR-IOV capabilities.

Split the VF BAR update into a new pci_iov_update_resource() internal
interface, which makes it simpler to compute the BAR address (we can get
rid of pci_resource_bar() and pci_iov_resource_bar()).

This patch:

  - Renames pci_update_resource() to pci_std_update_resource(),
  - Adds pci_iov_update_resource(),
  - Makes pci_update_resource() a wrapper that calls the appropriate one,

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

PCI: hv: Allocate physically contiguous hypercall params buffer

hv_do_hypercall() assumes that we pass a segment from a physically
contiguous buffer. A buffer allocated on the stack may not work if
CONFIG_VMAP_STACK=y is set.

Use kmalloc() to allocate this buffer.

Reported-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>

PCI: Update BARs using property bits appropriate for type

The BAR property bits (0-3 for memory BARs, 0-1 for I/O BARs) are supposed
to be read-only, but we do save them in res->flags and include them when
updating the BAR.

Mask the I/O property bits with ~PCI_BASE_ADDRESS_IO_MASK (0x3) instead of
PCI_REGION_FLAG_MASK (0xf) to make it obvious that we can't corrupt bits
2-3 of I/O addresses.

Use PCI_ROM_ADDRESS_MASK for ROM BARs. This means we'll only check the top
21 bits (instead of the 28 bits we used to check) of a ROM BAR to see if
the update was successful.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: Ignore BAR updates on virtual functions

VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.

We already ignore these updates because of 70675e0b6a1a ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

PCI: Do any VF BAR updates before enabling the BARs

Previously we enabled VFs and enable their memory space before calling
pcibios_sriov_enable(). But pcibios_sriov_enable() may update the VF BARs:
for example, on PPC PowerNV we may change them to manage the association of
VFs to PEs.

Because 64-bit BARs cannot be updated atomically, it's unsafe to update
them while they're enabled. The half-updated state may conflict with other
devices in the system.

Call pcibios_sriov_enable() before enabling the VFs so any BAR updates
happen while the VF BARs are disabled.

[bhelgaas: changelog]
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: iproc: Fix incorrect MSI address alignment

In the code to handle PAXB v2 based MSI steering, the logic aligns the MSI
register address to the size of supported inbound mapping range. This is
incorrect since it rounds "up" the starting address to the next aligned
address, but what we want is the starting address to be rounded "down" to
the aligned address.

This patch fixes the issue and allows MSI writes to be properly steered to
the GIC.

Fixes: 4b073155fbd3 ("PCI: iproc: Add support for the next-gen PAXB controller")
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: qcom: Add support for MSM8996 PCIe controller

Add support for the MSM8996/APQ8096 PCIe controller. MSM8996 supports Gen
1/2, one lane, 3 PCIe root complexes with support for MSI and legacy
interrupts, and it conforms to PCI Express Base 2.1 specification.

Add a post_init callback to qcom_pcie_ops, as the PCIe pipe clocks are only
setup after the phy is powered on. It also adds an ltssm_enable callback
as it is very much different from other supported SoCs in the driver.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Stanimir Varbanov <svarbanov@mm-sol.com>

PCI: cpqphp: Add missing call to pci_disable_device()

Most error branches following the call to pci_enable_device() contain a
call to pci_disable_device(). Add these calls where they are missing.

This issue was found with Hector.

Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: iproc: Add support for the next-gen PAXB controller

Add support for the next generation of the iProc PAXB host controller, used
in Stingray.

Signed-off-by: Oza Oza <oza.oza@broadcom.com>
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>

PCI: iproc: Add PAXBv2 binding info

Add new compatible string 'brcm,iproc-pcie-paxb-v2', for the next
generation of the iProc PAXB PCIe host controller.

Signed-off-by: Oza Oza <oza.oza@broadcom.com>
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>

PCI: Support INTx masking on ConnectX-4 with firmware x.14.1100+

Mellanox devices were marked as having INTx masking ability broken.  As a
result, the VFIO driver fails to start when more than one device function
is passed-through to a VM if both have the same INTx pin.

Prior to Connect-IB, Mellanox devices exposed to the operating system one
PCI function per all ports.  Starting from Connect-IB, the devices are
function-per-port.  When passing the second function to a VM, VFIO will
fail to start.

Exclude ConnectX-4, ConnectX4-Lx and Connect-IB from the list of Mellanox
devices marked as having broken INTx masking:

- ConnectX-4 and ConnectX4-LX firmware version is checked. If INTx
  masking is supported, we unmark the broken INTx masking.
- Connect-IB does not support INTx currently so will not cause any
  problem.

[bhelgaas: call pci_disable_device() always, after iounmap()]
Fixes: 11e42532ada3 ("PCI: Assume all Mellanox devices have broken INTx masking")
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

PCI: Convert Mellanox broken INTx quirks to be for listed devices only

Change Mellanox's broken_intx_masking() quirk from an "all Mellanox
devices" to a quirk for listed devices only.

[bhelgaas: remove #defines, reorder to keep other quirks together]
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

PCI: Convert broken INTx masking quirks from HEADER to FINAL

Convert all quirk_broken_intx_masking() quirks from HEADER to FINAL.

The quirk sets dev->broken_intx_masking, which is only used by
pci_intx_mask_supported(), which is not needed until after FINAL
quirks have been run.

[bhelgaas: changelog]
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

PCI: Warn on possible RW1C corruption for sub-32 bit config writes

Hardware that supports only 32-bit config writes is not spec-compliant.
For example, if software performs a 16-bit write, we must do a 32-bit read,
merge in the 16 bits we intend to write, followed by a 32-bit write. If
the 16 bits we *don't* intend to write happen to have any RW1C (write-one-
to-clear) bits set, we just inadvertently cleared something we shouldn't
have.

Add a rate-limited warning when we do sub-32 bit config writes. Remove
similar probe-time warnings from some of the affected host bridge drivers.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Enthusiastically-Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com> # rockchip
Acked-by: Thierry Reding <treding@nvidia.com>

PCI: Create revision file in sysfs

Currently the revision isn't available via sysfs/libudev thus if one wants
to know the value one needs to read through the config file, which can be
quite time-consuming because it wakes/powers up the device.

There are at least two userspace components which could make use the new
file: libpciaccess and libdrm. The former wakes up _every_ PCI device,
which can be observed via glxinfo when using Mesa 10.0+ drivers. The
latter, in association with Mesa 13.0, can lead to 2-3 second delays while
starting firefox, thunderbird or chromium.

Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502
Tested-by: Mauro Santos <registo.mailling@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch
CC: Greg KH <gregkh@linuxfoundation.org>

PCI: pciehp: Add runtime PM support for PCIe hotplug ports

Linux 4.8 added support for runtime suspending PCIe ports to D3hot with
commit 006d44e49a25 ("PCI: Add runtime PM support for PCIe ports"), but
excluded hotplug ports.  Those are now afforded runtime PM by the present
commit.

Hotplug ports require a few extra considerations:

- The configuration space of the port remains accessible in D3hot, so all
  the functions to read or modify the Slot Status and Slot Control
  registers need not be modified.  Even turning on slot power doesn't seem
  to require the port to be in D0, at least the PCIe spec doesn't say so
  and I confirmed that by testing with a Thunderbolt controller.

- However D0 is required to access devices on the secondary bus.  This
  happens in pciehp_check_link_status() and pciehp_configure_device() (both
  called from board_added()) and in pciehp_unconfigure_device() (called
  from remove_board()), so acquire a runtime PM ref for their invocation.

- The hotplug port stays active as long as it has active children.  If all
  hotplugged devices below the port runtime suspend, the port is allowed to
  runtime suspend as well.  Plug and unplug detection continues to work in
  D3hot.

- Hotplug interrupts are delivered in-band, so while the hotplug port
  itself is allowed to go to D3hot, its parent ports must stay in D0 for
  interrupts to come through.  Add a corresponding restriction to
  pci_dev_check_d3cold().

- Runtime PM may only be allowed if the hotplug port is handled natively by
  the OS.  On ACPI systems, the port may alternatively be handled by the
  firmware and things break if the OS puts the port into D3 behind the
  firmware's back:  E.g. Thunderbolt hotplug ports on non-Macs are handled
  by Intel's firmware in System Management Mode and the firmware is known
  to access devices on the port's secondary bus without checking first if
  the port is in D0: https://bugzilla.kernel.org/show_bug.cgi?id=53811

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: Mika Westerberg <mika.westerberg@linux.intel.com>

ACPI / hotplug / PCI: Make device_is_managed_by_native_pciehp() public

We're about to add runtime PM of hotplug ports, but we need to restrict it
to ports that are handled natively by the OS:  If they're handled by the
firmware (which is the case for Thunderbolt on non-Macs), things would
break if the OS put the ports into D3hot behind the firmware's back.

To determine if a hotplug port is handled natively, one has to walk up from
the port to the root bridge and check the cached _OSC Control Field for the
value of the "PCI Express Native Hot Plug control" bit.  There's already a
function to do that, device_is_managed_by_native_pciehp(), but it's private
to drivers/pci/hotplug/acpiphp_glue.c and only compiled in if
CONFIG_HOTPLUG_PCI_ACPI is enabled.

Make it public and move it to drivers/pci/pci-acpi.c, so that it is
available in the more general CONFIG_ACPI case.

The function contains a check if the device in question is a hotplug port
and returns false if it's not.  The caller we're going to add doesn't need
this as it only calls the function if it actually *is* a hotplug port.
Move the check out of the function into the single existing caller.

Rename it to pciehp_is_native() and add some kerneldoc and polish.

No functional change intended.

Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

ACPI / hotplug / PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit

We cache the PCI_EXP_SLTCAP_HPC bit in pci_dev->is_hotplug_bridge on device
probe, so there's no need to read it again when adding the ACPI hotplug
context.

Here's the call chain to prove that no ordering issue is introduced:

pci_scan_child_bus [drivers/pci/probe.c]
  pci_scan_slot
    pci_scan_single_device
      pci_scan_device
        pci_setup_device
          set_pcie_hotplug_bridge
            [is_hotplug_bridge bit is set here]
  pci_scan_bridge
    pci_add_new_bus
      pci_alloc_child_bus
        pcibios_add_bus  [arch/(x86|arm64|ia64)/...]
          acpi_pci_add_bus [drivers/pci/pci-acpi.c]
            acpiphp_enumerate_slots [drivers/pci/hotplug/acpiphp_glue.c]
              acpiphp_add_context
                device_is_managed_by_native_pciehp
                  [is_hotplug_bridge bit is queried here]

No functional change intended.

Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>