Andrew Lunn [Wed, 30 Dec 2015 15:28:25 +0000 (16:28 +0100)]
ethtool: Add phy statistics
Ethernet PHYs can maintain statistics, for example errors while idle
and receive errors. Add an ethtool mechanism to retrieve these
statistics, using the same model as MAC statistics.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 30 Dec 2015 21:34:53 +0000 (16:34 -0500)]
Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2015-12-29
This series contains updates to ixgbe and ixgbevf.
William Dauchy provides a fix for ixgbevf that was implemented for ixgbe,
commit 5d6002b7b822c7 ("ixgbe: Fix handling of NAPI budget when multiple
queues are enabled per vector"). The issue was that the polling routine
would increase the budget for receive to at least 1 per queue if multiple
queues were present, which resulted in receive packets being processed
when the budget was 0.
Emil provides minor cleanups for ixgbevf, one being that we need to
check rx_itr_setting with == and not &, since it is not a mask. Added
QSFP PHY support in ixgbe to allow for more accurate reporting of port
settings. Fixed the max RSS limit for X550 which is 63, not 64.
Veola fixes ixgbe ethtool reporting of backplane type interfaces as
1000/10000baseT link modes, instead, report the media as KR, KX or KX4
based on the backplane interface present.
Mark cleans up redundancy in the setting of hw_enc_features that makes
it appear that X550 has more encapsulation features than other devices.
Also do not set NETIF_F_SG any longer since that is set by the
register_netdev() call. Also fixed the X550EM_x revision check, which
needs to check a value, not just a bit.
Alex Duyck fixes additional bugs in ixgbe_clear_vf_vlans(), one being
that the mask was using a divide instead of a modulus, which resulted
in the mask bit being incorrectly set to 0 or 1 based on the value of
the VF being tested. Alex also found that he was not consistent in
using the "word" argument as an offset or as a register offset, so
made the code consistently use word as the offset in the array.
v2: dropped patch 8 of the original series, as it was undoing a part of
the fix Alex Duyck was doing in patch 9 of the original series.
Dropped based on feedback from Emil (the author).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 30 Dec 2015 21:33:46 +0000 (16:33 -0500)]
Merge branch 'be2net-next'
Sathya Perla says:
====================
be2net: patch set
The following patch set contains some feature additions, code re-organization
and cleanup and a few non-critical fixes. Pls consider applying this to
the net-next tree. Thanks.
v3 changes: add a default case to the switch statement in patch 5 to
satisfy the compiler (-Wswitch).
v2 changes: replaced an if/else block that checks for error values with a
switch/case statement in patch 5.
Patch 1 fixes VF link state transition from disabled to auto that did
not work due to an issue in the FW. This issue could not be fixed in FW due
to some backward compatibility issues it causes with released drivers.
The issue has been fixed by introducing a new version (v2) of the cmd
from 10.6 FW onwards. This patch adds support for v2 version of this cmd.
Patch 2 reports a EOPNOTSUPP status to ethtool when the user tries to
configure BE3/SRIOV in VEPA mode as it is not supported by the chip.
Patch 3 cleansup FW flash image related constant definitions. Many of these
definitions (such as section offset values) were defined in decimal format
rather than hexa-decimal. This makes this part of the code un-readable.
Also some defines related to BE2 are labeld "g2" and defines related to BE3
are labeled "g3". This patch cleans up all of this to make this code more
readable.
Patch 4 moves the FW cmd code to be_cmds.c. All code relating to FW cmds
has been in be_cmds.[ch], excepting FW flash cmd related code.
This patch moves these routines from be_main.c to be_cmds.c.
Patch 5 adds a log message to report digital signature errors while
flashing a FW image. From FW version 11.0 onwards, the FW supports a new
"secure mode" feature (based on a jumper setting on the adapter.) In this
mode, the FW image when flashed is authenticated with a digital signature.
Patch 6 removes a line of code that has no effect.
Patch 7 removes some unused variables.
Patch 8 fixes port resource descriptor query via the GET_PROFILE FW cmd.
An earlier commit passed a specific pf_num while issuing this cmd as FW
returns descriptors for all functions when pf_num is zero. But, when pf_num
is set to a non-zero value, FW does not return the port resource descriptor.
This patch fixes this by setting pf_num to 0 while issuing the query cmd
and adds code to pick the correct NIC resource descriptor from the list of
descriptors returned by FW.
Patch 9 adds support for ethtool get-dump feature. In the past when this
option was not yet available, this feature was supported via the
--register-dump option as a workaround. This patch removes support for
FW-dump via --register-dump option as it is now available via --get-dump
option. Even though the "ethtool --register-dump" cmd which used to work
earlier, will now fail with ENOTSUPP error, we feel it is not an issue as
this is used only for diagnostics purpose.
Patch 10 bumps up the driver version.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Wed, 30 Dec 2015 06:29:05 +0000 (01:29 -0500)]
be2net: bump up the driver version to 11.0.0.0
Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Venkat Duvvuru [Wed, 30 Dec 2015 06:29:04 +0000 (01:29 -0500)]
be2net: support ethtool get-dump option
This patch adds support for ethtool's --get-dump option in be2net,
to retrieve FW dump. In the past when this option was not yet available,
this feature was supported via the --register-dump option as a workaround.
This patch removes support for FW-dump via --register-dump option as it is
now available via --get-dump option. Even though the
"ethtool --register-dump" cmd which used to work earlier, will now fail
with ENOTSUPP error, we feel it is not an issue as this is used only
for diagnostics purpose.
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Wed, 30 Dec 2015 06:29:03 +0000 (01:29 -0500)]
be2net: fix port-res desc query of GET_PROFILE_CONFIG FW cmd
Commit 72ef3a88fa8e ("be2net: set pci_func_num while issuing
GET_PROFILE_CONFIG cmd") passed a specific pf_num while issuing a
GET_PROFILE_CONFIG cmd as FW returns descriptors for all functions when
pf_num is zero. But, when pf_num is set to a non-zero value, FW does not
return the Port resource descriptor.
This patch fixes this by setting pf_num to 0 while issuing the query cmd
and adds code to pick the correct NIC resource descriptor from the list of
descriptors returned by FW.
Fixes: 72ef3a88fa8e ("be2net: set pci_func_num while issuing
GET_PROFILE_CONFIG cmd") Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Venkat Duvvuru [Wed, 30 Dec 2015 06:29:02 +0000 (01:29 -0500)]
be2net: remove unused error variables
eeh_error, fw_timeout, hw_error variables in the be_adapter structure are
not used anymore. An earlier patch that introduced adapter->err_flags to
store this information missed removing these variables.
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla [Wed, 30 Dec 2015 06:29:01 +0000 (01:29 -0500)]
be2net: remove a line of code that has no effect
This patch removes a line of code that changes adapter->recommended_prio
value followed by yet another assignment.
Also, the variable is used to store the vlan priority value that is already
shifted to the PCP bits position in the vlan tag format. Hence, the name of
this variable is changed to recommended_prio_bits.
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Wed, 30 Dec 2015 06:29:00 +0000 (01:29 -0500)]
be2net: log digital signature errors while flashing FW image
(based on a jumper setting on the adapter.) In this mode, the FW image when
flashed is authenticated with a digital signature. This patch logs
appropriate error messages and return a status to ethtool when errors
relating to FW image authentication occur.
Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Wed, 30 Dec 2015 06:28:59 +0000 (01:28 -0500)]
be2net: move FW flash cmd code to be_cmds.c
All code relating to FW cmds is in be_cmds.[ch] excepting FW flash cmd
related code. This patch moves these routines from be_main.c to be_cmds.c
Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Wed, 30 Dec 2015 06:28:58 +0000 (01:28 -0500)]
be2net: cleanup FW flash image related macro defines
Many constant definitions relating to the FW-image layout
(such as section offset values) were defined in decimal format rather than
hexa-decimal. This makes this part of the code un-readable. Also some
defines related to BE2 are labeld "g2" and defines related to BE3 are
labeled "g3". This patch cleans up all of this to make this code more
readable.
Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Wed, 30 Dec 2015 06:28:57 +0000 (01:28 -0500)]
be2net: avoid configuring VEPA mode on BE3
BE3 chip doesn't support VEPA mode.
Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Wed, 30 Dec 2015 06:28:56 +0000 (01:28 -0500)]
be2net: fix VF link state transition from disabled to auto
The VF link state setting transition from "disable" to "auto" does not work
due to a bug in SET_LOGICAL_LINK_CONFIG_V1 cmd in FW. This issue could not
be fixed in FW due to some backward compatibility issues it causes with
some released drivers. The issue has been fixed by introducing a new
version (v2) of the cmd from 10.6 FW onwards. In v2, to set the VF link
state to auto, both PLINK_ENABLE and PLINK_TRACK bits have to be set to 1.
The VF link state setting feature now works on Lancer chips too from
FW ver 10.6.315.0 onwards.
Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 23 Dec 2015 17:00:35 +0000 (09:00 -0800)]
ixgbe: Fix bugs in ixgbe_clear_vf_vlans()
When I had rewritten the code for ixgbe_clear_vf_vlans() it looks like I
had transitioned back and forth between using word as an offset and using
word as a register offset. As a result I honestly don't see how the code
was working before other than the fact that resetting the VLANs on the VF
like didn't do much to clear them.
Another issue found is that the mask was using a divide instead of a
modulus. As a result the mask bit was incorrectly being set to either bit
0 or 1 based on the value of the VF being tested. As a result the wrong
VFs were having their VLANs cleared if they were enabled.
I have updated the code so that word represents the offset in the array.
This way we can use the modulus and xor operations and they will make sense
instead of being performed on a 4 byte aligned value.
I replaced the statement "(word % 2) ^ 1" with "~word % 2" in order to
reduce the line length as the line exceeded 80 characters with the register
name inserted. The two should be equivalent so the change should be safe.
Reported-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Fri, 20 Nov 2015 21:12:17 +0000 (13:12 -0800)]
ixgbe: Correct X550EM_x revision check
The X550EM_x revision check needs to check a value, not just a bit.
Use a mask and check the value. Also remove the redundant check
inside the ixgbe_enter_lplu_t_x550em, because it can only be called
when both the mac type and revision check pass.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Fri, 20 Nov 2015 21:02:16 +0000 (13:02 -0800)]
ixgbe: fix RSS limit for X550
X550 allows for up to 64 RSS queues, but the driver can have max
of 63 (-1 MSIX vector for link).
On systems with >= 64 CPUs the driver will set the redirection table
for all 64 queues which will result in packets being dropped.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Wed, 18 Nov 2015 23:37:04 +0000 (15:37 -0800)]
ixgbe: Clean up redundancy in hw_enc_features
Clean up minor redundancy in the setting of hw_enc_features that
makes it appears that X550 uniquely has more encapsulation features
than other devices. The driver only supports one more feature, so
make it look that way. No longer set NETIF_F_SG since that is set
by the register_netdev call. Thanks to Alex Duyck for noticing this
slight confusion.
Reported-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Veola Nazareth [Wed, 11 Nov 2015 23:22:59 +0000 (16:22 -0700)]
ixgbe: report correct media type for KR, KX and KX4 interfaces
Ethtool reports backplane type interfaces as 1000/10000baseT link modes.
This has been corrected to report the media as KR, KX or KX4 based on the
backplane interface present.
Signed-off-by: Veola Nazareth <veola.nazareth@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Mon, 9 Nov 2015 23:07:12 +0000 (15:07 -0800)]
ixgbe: add support for QSFP PHY types in ixgbe_get_settings()
Add missing QSFP PHY types to allow for more accurate reporting of
port settings.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Thu, 5 Nov 2015 00:02:21 +0000 (16:02 -0800)]
ixgbevf: minor cleanups for ixgbevf_set_itr()
adapter->rx_itr_setting is not a mask so check it with == instead of &
do not default to 12K interrupts in ixgbevf_set_itr()
There should be no functional effect from these changes.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
William Dauchy [Fri, 30 Oct 2015 17:16:30 +0000 (18:16 +0100)]
ixgbevf: Fix handling of NAPI budget when multiple queues are enabled per vector
This is the same patch as for ixgbe but applied differently according to
busy polling. See commit 5d6002b7b822c74 ("ixgbe: Fix handling of NAPI
budget when multiple queues are enabled per vector")
Signed-off-by: William Dauchy <william@gandi.net> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Tue, 29 Dec 2015 20:13:45 +0000 (15:13 -0500)]
Merge branch 'bpf_hash-locking'
Ming Lei says:
====================
bpf: hash: use per-bucket spinlock
This patchset tries to optimize ebpf hash map, and follows
the idea:
Both htab_map_update_elem() and htab_map_delete_elem()
can be called from eBPF program, and they may be in kernel
hot path, it isn't efficient to use a per-hashtable lock
in this two helpers, so this patch converts the lock into
per-bucket spinlock.
With this patchset, looks the performance penalty from eBPF
decreased a lot, see the following test:
1) run 'tools/biolatency' of bcc before running block test;
2) run fio to test block throught over /dev/nullb0,
(randread, 16jobs, libaio, 4k bs) and the test box
is one 24cores(dual sockets) VM server:
- without patchset: 607K IOPS
- with this patchset: 1184K IOPS
- without running eBPF prog: 1492K IOPS
TODO:
- remove the per-hashtable atomic counter
V2:
- fix checking on buckets size
V1:
- fix the wrong 3/3 patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Both htab_map_update_elem() and htab_map_delete_elem() can be
called from eBPF program, and they may be in kernel hot path,
so it isn't efficient to use a per-hashtable lock in this two
helpers.
The per-hashtable spinlock is used for protecting bucket's
hlist, and per-bucket lock is just enough. This patch converts
the per-hashtable lock into per-bucket spinlock, so that
contention can be decreased a lot.
Signed-off-by: Ming Lei <tom.leiming@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Added the PCI IDs for the BCM57301 and BCM57402 controllers.
Signed-off-by: David Christensen <davidch@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 27 Dec 2015 23:19:27 +0000 (18:19 -0500)]
bnxt_en: Keep track of the ring group resource.
Newer firmware will return the ring group resource when we call
hwrm_func_qcaps(). To be compatible with older firmware, use the
number of tx rings as the number of ring groups if the older firmware
returns 0. When determining how many rx rings we can support, take
the ring group resource in account as well in _bnxt_get_max_rings().
Divide and assign the ring groups to VFs.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 27 Dec 2015 23:19:26 +0000 (18:19 -0500)]
bnxt_en: Improve VF resource accounting.
We need to keep track of all resources, such as rx rings, tx rings,
cmpl rings, rss contexts, stats contexts, vnics, after we have
divided them for the VFs. Otherwise, subsequent ring changes on
the PF may not work correctly.
We adjust all max resources in struct bnxt_pf_info after they have been
assigned to the VFs. There is no need to keep the separate
max_pf_tx_rings and max_pf_rx_rings.
When SR-IOV is disabled, we call bnxt_hwrm_func_qcaps() to restore the
max resources for the PF.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 27 Dec 2015 23:19:24 +0000 (18:19 -0500)]
bnxt_en: Check hardware resources before enabling NTUPLE.
The hardware resources required to enable NTUPLE varies depending on
how many rx channels are configured. We need to make sure we have the
resources before we enable NTUPLE. Add bnxt_rfs_capable() to do the
checking.
In addition, we need to do the same checking in ndo_fix_features(). As
the rx channels are changed using ethtool -L, we call
netdev_update_features() to make the necessary adjustment for NTUPLE.
Calling netdev_update_features() in netif_running() state but before
calling bnxt_open_nic() would be a problem. To make this work,
bnxt_set_features() has to be modified to test for BNXT_STATE_OPEN for
the true hardware state instead of checking netif_running().
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 27 Dec 2015 23:19:23 +0000 (18:19 -0500)]
bnxt_en: Don't treat single segment rx frames as GRO frames.
If hardware completes single segment rx frames, don't bother setting
up all the GRO related fields. Pass the SKB up as a normal frame.
Reviewed-by: vasundhara volam <vvolam@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_en: Increment checksum error counter only if NETIF_F_RXCSUM is set.
rx_l4_csum_error is now incremented only when offload is enabled
Signed-off-by: Satish Baddipadige <sbaddipa@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rob Swindell [Sun, 27 Dec 2015 23:19:20 +0000 (18:19 -0500)]
bnxt_en: Add support for upgrading APE/NC-SI firmware via Ethtool FLASHDEV
NC-SI firmware of type apeFW (10) is now supported.
Signed-off-by: Rob Swindell <swindell@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jeffrey Huang [Sun, 27 Dec 2015 23:19:18 +0000 (18:19 -0500)]
bnxt_en: support hwrm_func_drv_unrgtr command
During remove_one, the driver should issue hwrm_func_drv_unrgtr
command to inform firmware that this function has been unloaded.
This is to let firmware keep track of driver present/absent state
when driver is gracefully unloaded. A keep alive timer is needed
later to keep track of driver state during abnormal shutdown.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Mon, 21 Dec 2015 17:26:06 +0000 (11:26 -0600)]
Driver for IBM System i/p VNIC protocol
This is a new device driver for a high performance SR-IOV assisted virtual
network for IBM System p and IBM System i systems. The SR-IOV VF will be
attached to the VIOS partition and mapped to the Linux client via the
hypervisor's VNIC protocol that this driver implements.
This driver is able to perform basic tx and rx, new features
and improvements will be added as they are being developed and tested.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 28 Dec 2015 01:51:40 +0000 (20:51 -0500)]
Merge branch 'fsl-fmain'
Igal Liberman says:
====================
Freescale DPAA FMan
The Freescale Data Path Acceleration Architecture (DPAA) is a set
of hardware components on specific QorIQ multicore processors.
This architecture provides the infrastructure to support
simplified sharing of networking interfaces and accelerators
by multiple CPU cores and the accelerators.
One of the DPAA accelerators is the Frame Manager (FMan)
which contains a series of hardware blocks: ports, Ethernet MACs,
a multi user RAM (MURAM) and Storage Profile (SP).
This patch set introduce the FMan drivers.
Each driver configures and initializes the corresponding
FMan hardware module (described above).
The MAC driver offers support for three different
types of MACs (eTSEC, TGEC, MEMAC).
v9 --> v10:
- Addressed feedback from David Miller
Remove private CRC implementation
- Addressed feedback from Kenneth Klette Jonassen:
- Use Kernel PHY API to configure dTSEC TBI
- Use Kernel PHY API to configure mEMAC PCS
This patchset requires device tree update:
https://patchwork.ozlabs.org/patch/559501/
- Addressed feedback from Andy Fleming
v8 --> v9:
No changes
v7 --> v8:
- Addressed feedback from David Miller
- Support for ARM:
- Device tree parsing
- IO Accessors
- Addressed compilation issue on non-PPC targets
v6 --> v7:
- Addressed compilation issue on non-PPC targets
- Removed B4860 rev 1 support
v5 --> v6:
- Addressed feedback from Scott:
- Moved kernel doc to source files
- Removed a series of configurable settings
- Miscellaneous code updates
v4 --> v5:
- Addressed feedback from David Miller:
- Removed driver layering
- Reduce namespace pollution
- Reduce code complexity and size
v3 --> v4:
- Remove device_initcall call in driver registration (redundant)
- Remove hot/cold labels
- Minor update in FMan Clock read from device-tree
- Update fixed-link support
- Addressed feedback from Stephen Hemminger
- Remove bogus blank line
v2 --> v3:
- Addressed feedback from Scott:
- Remove typedefs
- Remove unnecessary memory barriers
- Remove unnecessary casting
- Remove KConfig options
- Remove early_params
- Remove Hungarian notation
- Remove __packed__ attribute and padding from structures
- Remove unlikely attribute (where it's not needed)
- Use proper error codes and remove unnecessary prints
- Use proper values for sleep routines
- Replace complex Macros with functions
- Improve device tree processing code
- Use symbolic defines
- Add time-out in busy-wait loops
- Removed exit code (loadable module support will be added later)
- Fixed "fixed-link" issue raised by Joakim Tjernlund
v1 --> v2:
- Addressed feedback from Paul Bolle:
- General feedback of FMan Driver layer
- Remove Errata defines
- Aligned comments to Kernel Doc
- Remove Loadable Module support (not yet supported)
- Removed not needed KConfig dependencies
- Addressed feedback from Scott Wood
- Use Kernel ioread/iowrite services
- Squash FLIB source and header patches together
This submission is based on the prior Freescale DPAA FMan V3,RFC submission.
Several issues addresses in this submission:
- Reduced MAC layering and complexity
- Reduced code base
- T1024/T2080 10G best effort support
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Igal Liberman [Mon, 21 Dec 2015 00:21:29 +0000 (02:21 +0200)]
fsl/fman: Add FMan Port Support
Add the Data Path Acceleration Architecture Frame Manger Port Driver.
The FMan driver uses a module called "Port" to represent the physical
TX and RX ports.
Each FMan version has different number of physical ports.
This patch adds The FMan Port configuration, initialization and
runtime control routines for both TX and RX.
Signed-off-by: Igal Liberman <igal.liberman@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igal Liberman [Mon, 21 Dec 2015 00:21:27 +0000 (02:21 +0200)]
fsl/fman: Add FMan MAC support
Add the Data Path Acceleration Architecture Frame Manger MAC support.
This patch adds The FMan MAC configuration, initialization and
runtime control routines.
This patch contains support for these types of MACs:
- dTSEC: Three speed Ethernet controller (10/100/1000 Mbps)
- tGEC: 10G Ethernet controller (10 Gbps)
- mEMAC: Multi-rate Ethernet MAC (10/100/1000/10000 Mbps)
Different FMan revisions have different type and number of MACs.
Signed-off-by: Igal Liberman <igal.liberman@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igal Liberman [Mon, 21 Dec 2015 00:21:26 +0000 (02:21 +0200)]
fsl/fman: Add FMan support
Add the Data Path Acceleration Architecture Frame Manger Driver.
The FMan embeds a series of hardware blocks that implement a group
of Ethernet interfaces. This patch adds The FMan configuration,
initialization and runtime control routines.
The FMan driver supports several hardware versions
differentiated by things like:
- Different type of MACs
- Number of MAC and ports
- Available resources
- Different hardware errata
Signed-off-by: Igal Liberman <igal.liberman@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igal Liberman [Mon, 21 Dec 2015 00:21:25 +0000 (02:21 +0200)]
fsl/fman: Add FMan MURAM support
Add Frame Manager Multi-User RAM support.
This internal FMan memory block is used by the
FMan hardware modules, the management being made
through the generic allocator.
The FMan Internal memory, for example, is used for
allocating transmit and receive FIFOs.
Signed-off-by: Igal Liberman <Igal.Liberman@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pravin B Shelar [Thu, 24 Dec 2015 22:34:54 +0000 (14:34 -0800)]
ip_tunnel: Move stats update to iptunnel_xmit()
By moving stats update into iptunnel_xmit(), we can simplify
iptunnel_xmit() usage. With this change there is no need to
call another function (iptunnel_xmit_stats()) to update stats
in tunnel xmit code path.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 24 Dec 2015 03:34:45 +0000 (22:34 -0500)]
Merge branch 'cxgb4-T6-update'
Hariprasad Shenai says:
====================
Update support for T6 adapters
This patch changes updates the various code changes related to
register, stats and hardware related changes for T6 family of
adapters.
This patch series has been created against net-next tree and includes
patches on cxgb4 and cxgb4vf driver.
We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4: Update mps_tcam output to include T6 fields
In T6, MPS classification has a 512 deep TCAM to do the match lookup.
Each entry has 80x2b sets containing 48 bit MAC address, port number,
VLAN Valid/ID, VNI, lookup type (outer or inner packet header).
[71:48] bit locations are overloaded for outer vs. inner lookup types.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4: Update Congestion Channel map for T6 adapter
Updating Congestion Channel/Priority Map in Congestion Manager Context
for T6. In T6 port 0 is mapped to channel 0 and port 1 is mapped to
channel 1. For 2 port T4/T5 adapter, port 0 is mapped to channel 0,1 and
port 1 is mapped to channel 2,3
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Updating the driver to set the correct boundary values in SGE_CONTROL to
32B.
Also, need to take care of this fl alignment change when calculating the
next packet offset.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Wed, 23 Dec 2015 12:23:17 +0000 (13:23 +0100)]
dsa: mv88e6xxx: Add Second back of statistics
The 6320 family of switch chips has a second bank for statistics, but
is missing three statistics in the port registers. Generalise and
extend the code:
* adding a field to the statistics table indicating the bank/register
set where each statistics is.
* add a function indicating if an individual statistics
is available on this device
* calculate at run time the sset_count.
* return strings based on the available statistics of the device
* return statistics based on the available statistics of the device
* Add support for reading from the second bank.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 24 Dec 2015 03:06:39 +0000 (22:06 -0500)]
Merge branch 'sfc-vf'
Bert Kenward says:
====================
sfc: additional virtual function support
This introduces the client side of a mechanism to defer authorisation of
operations, for example multicast subscription. Although primarily aimed at
SRIOV VFs this can also apply to unprivileged PFs.
Also handle reboot ordering corner cases better and reduce the level of some
logging.
v2: remove #ifdef DEBUG around new WARN_ON in mcdi.c.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Bert Kenward [Wed, 23 Dec 2015 08:58:15 +0000 (08:58 +0000)]
sfc: Downgrade or remove some error messages
Depending on configuration the NIC may return errors for unprivileged
functions and/or VFs. Where these are expected and handled, reduce the
level of any output.
Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bert Kenward [Wed, 23 Dec 2015 08:57:36 +0000 (08:57 +0000)]
sfc: Make failed filter removal less noisy
There are situations - mostly reset related - where our view of the
filter table differs from the hardware. In this case we may try and
remove filters that aren't actually installed. This isn't that
interesting in most situations, so downgrade the logging.
Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bert Kenward [Wed, 23 Dec 2015 08:57:18 +0000 (08:57 +0000)]
sfc: Handle MCDI proxy authorisation
For unprivileged functions operations can be authorised by an admin
function. Extra steps are introduced to the MCDI protocol in this
situation - the initial response from the MCDI tells us that the
operation has been deferred, and we must retry when told. We then
receive an event telling us to retry.
Note that this provides only the functionality for the unprivileged
functions, not the handling of the administrative side.
Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bert Kenward [Wed, 23 Dec 2015 08:56:40 +0000 (08:56 +0000)]
sfc: Retry MCDI after NO_EVB_PORT error on a VF
After reboot the vswitch configuration from the PF may not be
complete before the VF attempts to restore filters. In that
case we see NO_EVB_PORT errors from the MC. Retry up to a time
limit or until a different result is seen.
Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 23 Dec 2015 17:05:53 +0000 (12:05 -0500)]
Merge branch 'cxgb4-next'
Hariprasad Shenai says:
====================
Trivial enhancements for cxgb4
This series adds a debug message if adapter isn't inserted in right PCI
slot. Changes naming conventions for iSCSI rx queues, use node info while
allocating rx queue and use napi_complete_done() api in napi handler.
This patch series has been created against net-next tree and includes
patches on cxgb4 driver.
We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.
Thanks
V2: Dropped 'dcb_info' debug entry patch, since the same can be achieved
using lldp tool.
Based on review comments by Or Gerlitz <gerlitz.or@gmail.com> and
David Miller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
All the upper level protocols like rdma, iscsi have their own offload rx
queues, so instead of using the generic naming convention be specific
while naming them. Improves code readability
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4: Warn if device doesn't have enough PCI bandwidth
Check if the device get enough bandwidth from the entire PCI chain to
satisfy its capabilities. This patch determines the PCIe device's
bandwidth capabilities by reading its PCIe Link Capabilities registers
and then call the pcie_get_minimum_link function to ensure that the
adapter is hooked into a slot which is capable of providing the
necessary bandwidth capabilities.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 22 Dec 2015 22:03:06 +0000 (17:03 -0500)]
Merge branch 'bindtodevice_tw_rst'
Florian Westphal says:
====================
tcp: honour SO_BINDTODEVICE for TW_RST case too
This is V2, this time as a small series since I followed Erics advice
to split this into smaller chunks, I hope this makes it easier to
review.
First patch adds inet_sk_transparent helper.
Second patch contains an if/else swap that I split from the
original TW_RST v1 one.
Third patch is the actual change without the superfluous sock_net change.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Mon, 21 Dec 2015 20:29:26 +0000 (21:29 +0100)]
tcp: honour SO_BINDTODEVICE for TW_RST case too
Hannes points out that when we generate tcp reset for timewait sockets we
pretend we found no socket and pass NULL sk to tcp_vX_send_reset().
Make it cope with inet tw sockets and then provide tw sk.
This makes RSTs appear on correct interface when SO_BINDTODEVICE is used.
Packetdrill test case:
// want default route to be used, we rely on BINDTODEVICE
`ip route del 192.0.2.0/24 via 192.168.0.2 dev tun0`
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
// test case still works due to BINDTODEVICE
0.001 setsockopt(3, SOL_SOCKET, SO_BINDTODEVICE, "tun0", 4) = 0
0.100...0.200 connect(3, ..., ...) = 0
// more data while in FIN_WAIT2, expect RST
1.300 < P. 1:1001(1000) ack 1 win 46
// fails without this change -- default route is used
1.301 > R 1:1(0) win 0
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Switch the two branches: check if we have a socket first, then
fall back to a listener lookup if we saw a md5 option (hash_location).
Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Mon, 21 Dec 2015 20:29:24 +0000 (21:29 +0100)]
net: add inet_sk_transparent() helper
Avoids cluttering tcp_v4_send_reset when followup patch extends
it to deal with timewait sockets.
Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 22 Dec 2015 08:43:07 +0000 (09:43 +0100)]
mlxsw: core: Use devm_kzalloc to allocate mlxsw_hwmon structure
KASan reported use-after-free for the hwmon structure. So fix this by
using devm_kzalloc and let the core take care about freeing the memory
during device dettach.
Reported-by: Ido Schimmel <idosch@mellanox.com> Fixes: 89309da39 ("mlxsw: core: Implement temperature hwmon interface") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Colitti [Mon, 21 Dec 2015 15:03:44 +0000 (00:03 +0900)]
net: tcp: deal with listen sockets properly in tcp_abort.
When closing a listen socket, tcp_abort currently calls
tcp_done without clearing the request queue. If the socket has a
child socket that is established but not yet accepted, the child
socket is then left without a parent, causing a leak.
Fix this by setting the socket state to TCP_CLOSE and calling
inet_csk_listen_stop with the socket lock held, like tcp_close
does.
Tested using net_test. With this patch, calling SOCK_DESTROY on a
listen socket that has an established but not yet accepted child
socket results in the parent and the child being closed, such
that they no longer appear in sock_diag dumps.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like an attempt to use CPU notifier here which was never
completed. Nobody tried to wire it up completely since 2k9. So I unwind
this code and get rid of everything not required. Oh look! 19 lines were
removed while code still does the same thing.
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Tested-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 22 Dec 2015 19:49:03 +0000 (14:49 -0500)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2015-12-22
This series contains updates to fm10k only.
Bruce cleans up the initialization of fm10k_workqueue at the global level,
which fixes a checkpatch.pl error. Made several other cleanups of the
driver, like making structures that do not change constant, remove unused
code, cleanup code comments and use boolean states true/false instead of
an integer since a bool is all that is needed.
Jacob fixed the TLV format for little endian structures which are 4 byte
aligned copy, so add an additional __aligned(4) and __packed to ensure
that these structures are actually 4 byte aligned and packed correctly.
Updated the driver to use ether_addr_equal() instead of memcmp() to
compare MAC addresses.
Alex Duyck cleans up the exception handling so all of the paths result in
a similar state if we fail. Specifically the driver will now unload the
mailbox interrupt, free the queue vectors and MSI-X, and then detach the
interface.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Wed, 9 Dec 2015 01:20:49 +0000 (17:20 -0800)]
fm10k: IS_ENABLED() is not appropriate for boolean kconfig option
Tri-states need 'if IS_ENABLED()', booleans should use 'ifdef'.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 9 Dec 2015 01:20:44 +0000 (17:20 -0800)]
fm10k: cleanup mailbox code comments etc
Cleanup a number of issues with function header comments, lower-case
acronyms (i.e. FIFO, TLV), duplicate comments and a stubbed-out header
comment for fm10k_sm_mbx_init.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 8 Dec 2015 23:51:11 +0000 (15:51 -0800)]
fm10k: use true/false for boolean get_host_state
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 8 Dec 2015 23:51:04 +0000 (15:51 -0800)]
fm10k: remove unused struct element
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 8 Dec 2015 23:50:39 +0000 (15:50 -0800)]
fm10k: constify fm10k_mac_ops, fm10k_iov_ops and fm10k_info structures
These structures never change so declare them as const.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 8 Dec 2015 23:50:34 +0000 (15:50 -0800)]
fm10k: address operator not needed when declaring function pointers
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Mon, 16 Nov 2015 23:33:34 +0000 (15:33 -0800)]
fm10k: use ether_addr_equal instead of memcmp
When comparing MAC addresses, use ether_addr_equal instead of memcmp to
ETH_ALEN length. Found and replaced using the following sed:
sed -e 's/memcmp\x28\(.*\), ETH_ALEN\x29/!ether_addr_equal\x28\1\x29/'
Reported-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Tue, 10 Nov 2015 17:40:30 +0000 (09:40 -0800)]
fm10k: Cleanup exception handling for changing queues
This patch is meant to cleanup the exception handling for the paths where
we reset the interrupts and then reconfigure them. In all of these paths
we had very different levels of exception handling. I have updated the
driver so that all of the paths should result in a similar state if we
fail.
Specifically the driver will now unload the mailbox interrupt, free the
queue vectors and MSI-X, and then detach the interface.
In addition for any of the PCIe related resets I have added a check with
the hw_ready function to just make sure the registers are in a readable
state prior to reopening the interface.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Mon, 9 Nov 2015 22:04:08 +0000 (14:04 -0800)]
fm10k: correctly pack TLV structures and explain reasoning
The TLV format for little endian structures is actually 4 byte aligned
copy. To this end, we need to add an additional __aligned(4) marker
along with __packed to ensure that these structures are actually 4 byte
aligned and packed correctly. Use of just __packed will not work as this
will result in 1byte alignment which is incorrect. Add a comment
explaining the reasoning behind why these structures need the special
treatment.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>