git.karo-electronics.de Git - linux-beck.git/log
Anjali Singhai Jain [Tue, 7 Apr 2015 23:45:35 +0000 (19:45 -0400)]
i40e: For VF reset (VFR and VFLR) add some more delay

Due to a recently discovered HW issue, the HW might indicate reset
completion a little too early after a VFLR. So wait another 10 msec for
the cache to be cleaned up.

Change-ID: I6a24dcf5dd7ffcd6500246e717411ef58532d1e9
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Tue, 7 Apr 2015 23:45:34 +0000 (19:45 -0400)]
i40e: move VF notification routines up

Move the VF notification functions to the top of the file. This
eliminates an unnecessary declaration.

Change-ID: I036171f14180ee9f0ce4e0a21334d6a217d06c94
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Tue, 7 Apr 2015 23:45:33 +0000 (19:45 -0400)]
i40e: notify VFs of link state

Gratuitously notify VFs of link state when they activate their queues.
In general, this is the last thing that a VF driver will do as it opens
its interface, so this is a good time to notify the VF.

Currently, VF devices assume link is up unless told otherwise, which
means that VFs instantiated on a PF with no link will report the wrong
state. This change corrects that issue.

Change-ID: Iea53622904ecc681ac3f8938d81c30033ef9a0a6
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Tue, 7 Apr 2015 23:45:32 +0000 (19:45 -0400)]
i40evf: remove aq_pending

The aq_pending field in the adapter structure is actually redundant with
the current_op field. Remove the aq_pending field and expunge all traces
of it from the official record. This simplifies the code significantly,
especially in the virtual channel completion routine.

Change-ID: Ib2957c8c19882bd0cecc6fcd133912c24b46a1ff
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Tue, 7 Apr 2015 23:45:31 +0000 (19:45 -0400)]
i40e: Add support to program FDir SB rules for VF from PF through ethtool

With this patch we can now add Flow Director Sideband rules for a VF from
its PF. Here is an example of how it can be done when VF id = 5 and
queue = 2:

"ethtool -N ethx flow-type udp4 src-ip x.x.x.x dst-ip y.y.y.y src-port p1 dst-port p2 action 2 user-def 5"

User-def specifies VF id and action specifies queue.

Change-ID: Ib37d6dff3823a4d85caffde638473891c38c2b89
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Tue, 7 Apr 2015 23:45:30 +0000 (19:45 -0400)]
i40evf: fix bad indentation

Not sure how this slipped through. Cosmetic change only.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Tue, 7 Apr 2015 18:32:55 +0000 (11:32 -0700)]
i40e: stop VF rings

Explicitly stop the rings belonging to each VF when disabling SR-IOV.
Even though the VFs were gone, and the associated VSIs were removed,
the rings were not stopped, and in some circumstances the hardware would
continue to access the memory formerly used by the rings, causing
memory corruption or DMAR errors, both of which would lead to general
malaise of the kernel.

To relieve this condition, explicitly stop all the rings associated with
each VF before releasing its resources.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:33:10 +0000 (13:33 -0700)]
fm10k: Bump driver version to 0.15.2

With the recent driver changes, bump the version.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:15 +0000 (13:27 -0700)]
fm10k: corrected VF multicast update

VFs were being improperly added to the switch's multicast group. The
error stems from the fact that incorrect arguments were passed to the
"update_mc_addr" function. It appears to be a copy-paste error, since
the parameters are similar to those of the "update_uc_addr" function.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:14 +0000 (13:27 -0700)]
fm10k: mbx_update_max_size does not drop all oversized messages

When we call update_max_size it does not drop all oversized messages.
This is due to the difficulty in performing this operation, since it is
a FIFO which makes updating anything other than head or tail very
difficult. To fix this, modify validate_msg_size to ensure that we error
out later when trying to transmit the message that could be oversized.
This will generally be a rare condition, as it requires the FIFO to
include a message larger than the max_size negotiated during mailbox
connect. Note that max_size is always smaller than rx.size so it should
be safe to use here.

Also, update the update_max_size function header comment to clearly
indicate that it does not drop all oversized messages, but only those at
the head of the FIFO.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:13 +0000 (13:27 -0700)]
fm10k: reset head instead of calling update_max_size

When we forcefully shut down the mailbox, we currently reset max size
to 0 and clear all messages in the FIFO. Instead, we should just reset
the head pointer so that the FIFO becomes empty, rather than changing
the max size to 0. This helps prevent the tx_dropped counter from
incrementing during mailbox negotiation, which is confusing to viewers
of Linux ethtool statistics output.
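
As an illustration only, a minimal userspace sketch of emptying a FIFO by
moving the head pointer. The struct and function names are hypothetical
and the free-running 16-bit indices are an assumption, not the driver's
actual layout:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical FIFO bookkeeping: head and tail are free-running indices,
 * so the number of queued entries is simply tail - head (mod 2^16). */
struct sketch_fifo {
    uint16_t size;  /* capacity in entries; untouched by a drop */
    uint16_t head;  /* next entry to read */
    uint16_t tail;  /* next entry to write */
};

/* Number of entries currently queued; correct across index wraparound. */
uint16_t sketch_fifo_used(const struct sketch_fifo *f)
{
    assert(f != 0);
    return (uint16_t)(f->tail - f->head);
}

/* Empty the FIFO by advancing head to tail. The capacity (and therefore
 * any negotiated maximum message size) is left alone. */
void sketch_fifo_drop_all(struct sketch_fifo *f)
{
    assert(f != 0);
    f->head = f->tail;
}
```

Because only the head index moves, nothing later in the transmit path sees
a zero maximum size, which is the property the change above relies on.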

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:12 +0000 (13:27 -0700)]
fm10k: renamed mbx_tx_dropped to mbx_tx_oversized

The use of dropped doesn't really mean dropped mailbox messages, but
rather specifically messages which were too large to fit in the remote
Rx FIFO. Rename the stat to more clearly indicate what it means.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:11 +0000 (13:27 -0700)]
fm10k: update xcast mode before synchronizing multicast addresses

When the PF receives a request to update a multicast address for the VF,
it checks the enabled multicast mode first. Fix a bug where the VF tried
to set a multicast address before requesting the required xcast mode.
This ensures the multicast addresses are honored as long as the xcast
mode was allowed.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:09 +0000 (13:27 -0700)]
fm10k: start service timer on probe

Since the service task handles varying work that doesn't all require the
interface to be up, launch the service timer immediately. This ensures
that we continually check the mailbox, as well as handle other tasks
while the device is down.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:08 +0000 (13:27 -0700)]
fm10k: fix function header comment

The header comment included a miscopy of a C-code line, and also
misused Rx FIFO when it clearly meant Tx FIFO.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:07 +0000 (13:27 -0700)]
fm10k: comment next_vf_mbx flow

Add a header comment explaining why we have the somewhat crazy mailbox
flow. This flow is necessary as it prevents the PF<->SM mailbox from
being flooded by the VF messages, which normally trigger a message to
the PF. This helps prevent the case where we see a PF mailbox timeout.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Sat, 11 Apr 2015 00:20:17 +0000 (17:20 -0700)]
fm10k: don't handle mailbox events in iov_event path and always process mailbox

Since we already schedule the service task, we can just wait for this
task to handle the mailbox events from the VF. This reduces some complex
code flow, and makes it so we have a single path for handling the VF
messages. There is a possibility that we have a slight delay in handling
VF messages, but it should be minimal.

The result of tx_complete and !rx_ready is insufficient to determine
whether we need to process the mailbox. There is a possible race
condition whereby the VF fills up the mbmem for us, but we have already
recently processed the mailboxes in the interrupt. During this time,
the interrupt is disabled. Thus, our Rx FIFO is empty, but the mbmem now
has data in it. Since we continually check whether Rx FIFO is empty, we
then never call process. This can prevent the PF from ever handling
the VF mailbox messages.

Instead, just call process every time, despite the fact that we may or
may not have anything to process for the VF. There should be minimal
overhead for doing this, and it resolves an issue where the VF never
comes up due to never getting a response to its SET_LPORT_STATE message.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:05 +0000 (13:27 -0700)]
fm10k: use separate workqueue for fm10k driver

Since we run the watchdog periodically, which might take a while and
potentially monopolize the system default workqueue, create our own
separate work queue. This also helps reduce and stabilize latency
between scheduling the work in our interrupt and actually performing
the work. Still use a timer for the regular scheduled interval but
queue the work onto its own work queue.

It seemed overkill to create a separate workqueue per interface, so we
just spawn a single workqueue for all interfaces upon driver load. For
this reason, use a multi-threaded workqueue with one thread per
processor, rather than a single-threaded queue.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:04 +0000 (13:27 -0700)]
fm10k: Set PF queues to unlimited bandwidth during virtualization

When returning virtualization queues from the VF back to the PF, do not
retain the VF rate limiter.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Todd Russell <todd.a.russell@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:03 +0000 (13:27 -0700)]
fm10k: expose tx_timeout_count as an ethtool stat

Named it tx_hang_count to differentiate it from tx_hwtstamp_timeout.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:02 +0000 (13:27 -0700)]
fm10k: only increment tx_timeout_count in Tx hang path

We were incrementing the tx_timeout_count for both the Tx hang
and then for all reset flows.  Instead, we should only increment
tx_timeout_count in the Tx hang path, so that our Tx hang counter
does not increment when it was not caused by a Tx hang.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:01 +0000 (13:27 -0700)]
fm10k: remove extraneous "Reset interface" message

Since we already print this message when a reset is requested via the
RESET_REQUESTED flag, we do not need to print it before setting the
flag.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:27:00 +0000 (13:27 -0700)]
fm10k: separate PF only stats so that VF does not display them

This patch resolves an issue with ethtool stats displaying useless
values on the VF, because some stats simply have no meaning to the VF.
Resolve this by splitting these out into PF_STATS and only showing them
if we aren't the VF.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:26:59 +0000 (13:26 -0700)]
fm10k: use hw->mac.max_queues for stats

Even though it shouldn't strictly matter, don't count queue stats higher
than the max_queues value stored for this mac. This ensures that we
don't attempt to check queues which don't belong to us in VFs. This
shouldn't be a visible change, as the VFs should see zero for queues
which don't belong to them.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:26:58 +0000 (13:26 -0700)]
fm10k: only show actual queues, not the maximum in hardware

Currently, we show statistics for all 128 queues, even though we don't
necessarily have that many queues available especially in the VF case.
Instead, use the hw->mac.max_queues value, which tells us how many
queues we actually have, rather than the space for the rings we
allocated. In this way, we prevent dumping statistics that are useless
on the VF.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 10 Apr 2015 23:48:19 +0000 (16:48 -0700)]
fm10k: allow creation of VLAN on default vid

Previously, the user was not allowed to create a VLAN interface on top
of the switch default vid.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:26:56 +0000 (13:26 -0700)]
fm10k: fix unused warnings

There were several functions with parameters that were never, or only
sometimes, used. To resolve possible compiler warnings, mark those
parameters with the __always_unused or __maybe_unused kernel macros.
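
As an illustration only, a userspace sketch for GCC/Clang of how such
annotations look. In the kernel both macros expand to the compiler's
"unused" attribute; the stand-in definitions, function names, and the
DEBUG_LOG switch below are hypothetical and are not taken from fm10k:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel macros (assumption: GCC or Clang). */
#ifndef __always_unused
#define __always_unused __attribute__((__unused__))
#endif
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

/* The signature is imagined to be fixed by a callback table, but this
 * implementation never reads ctx, so it is marked __always_unused. */
unsigned int sketch_sum_bytes(void *ctx __always_unused,
                              const unsigned char *buf, size_t len)
{
    unsigned int sum = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* verbose is only read when DEBUG_LOG is defined, so __maybe_unused keeps
 * the non-debug build free of unused-parameter warnings. */
unsigned int sketch_checked_sum(const unsigned char *buf, size_t len,
                                int verbose __maybe_unused)
{
#ifdef DEBUG_LOG
    if (verbose)
        fprintf(stderr, "summing %zu bytes\n", len);
#endif
    return sketch_sum_bytes(NULL, buf, len);
}
```

The two names expand identically; the difference is documentation for the
reader about whether the parameter is ever expected to be used.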

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Sat, 11 Apr 2015 02:14:31 +0000 (19:14 -0700)]
fm10k: Add netconsole support

This change adds a function called "fm10k_netpoll" that's used to define
"ndo_poll_controller" in "fm10k_netdev_ops". This is required to enable
support for "netconsole" in fm10k.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:26:54 +0000 (13:26 -0700)]
fm10k: Have the VF get the default VLAN during init

Currently, the VFs do not read the default VLAN during initialization,
so they will not be able to indicate untagged frames properly.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:26:53 +0000 (13:26 -0700)]
fm10k: Correct spelling mistake

Corrected a spelling mistake that was found over time.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:26:52 +0000 (13:26 -0700)]
fm10k: Remove redundant rx_errors in ethtool

Output of ethtool was reporting 2 rx_errors entries. This change
removes one of the redundant entries.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Jeff Kirsher [Fri, 3 Apr 2015 20:26:51 +0000 (13:26 -0700)]
fm10k: Corrected an error in Tx statistics

The function collecting Tx statistics was actually using values from the Rx
ring. Thus, the Tx and Rx statistics reported by "ifconfig" were
identical. This change corrects the error so that the Tx statistics are
now read from the Tx ring.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
David S. Miller [Tue, 14 Apr 2015 19:44:14 +0000 (15:44 -0400)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

The dwmac-socfpga.c conflict was a case of a bug fix overlapping
changes in net-next to handle an error pointer differently.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Apr 2015 19:08:52 +0000 (15:08 -0400)]
Merge branch 'cxgb4-next'

Hariprasad Shenai says:

====================
cxgb4: Misc. fixes for sge

Increase the value of MAX_IMM_TX_PKT_LEN to improve latency, set the freelist
starving threshold based on adapter type, add comments for the tx flits and sge
length code, and don't call t4_slow_intr_handler when we are not the master PF.

This patch series has been created against net-next tree and includes patches on
cxgb4 driver

We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 14 Apr 2015 20:32:34 +0000 (02:02 +0530)]
cxgb4: Don't call t4_slow_intr_handler when we're not the Master PF

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 14 Apr 2015 20:32:33 +0000 (02:02 +0530)]
cxgb4: Add comment for calculate tx flits and sge length code

Add comments for the tx flit and sge length calculation code; also remove
a hardcoded value.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 14 Apr 2015 20:32:32 +0000 (02:02 +0530)]
cxgb4: Use device node in page allocation

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 14 Apr 2015 20:32:31 +0000 (02:02 +0530)]
cxgb4: Freelist starving threshold varies from adapter to adapter

fl_starv_thres could be different from adapter to adapter, don't use
hardcoded values

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 14 Apr 2015 20:32:30 +0000 (02:02 +0530)]
cxgb4: Increased the value of MAX_IMM_TX_PKT_LEN from 128 to 256 bytes

This allows a significant latency drop for packets of sizes between 128 and 192
bytes

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Tue, 14 Apr 2015 10:08:02 +0000 (12:08 +0200)]
bgmac: drop ring->num_slots

The ring size is always known at compile time, so make the code a bit
more efficient

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Tue, 14 Apr 2015 10:08:01 +0000 (12:08 +0200)]
bgmac: fix DMA rx corruption

The driver needs to inform the hardware about the first invalid (not yet
filled) rx slot, by writing its DMA descriptor pointer offset to the
BGMAC_DMA_RX_INDEX register.

This register was set to a value exceeding the rx ring size, effectively
allowing the hardware constant access to the full ring, regardless of
which slots are initialized.

To fix this issue, always mark the last filled rx slot as invalid.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Tue, 14 Apr 2015 10:08:00 +0000 (12:08 +0200)]
bgmac: simplify dma init/cleanup

Instead of allocating buffers at device init time and initializing
descriptors at device open, do both at the same time (during open).
Free all buffers when closing the device.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Acked-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Tue, 14 Apr 2015 10:07:59 +0000 (12:07 +0200)]
bgmac: increase rx ring size from 511 to 512

Limiting it to 511 looks like a failed attempt at leaving one descriptor
empty to allow the hardware to stop processing a buffer that has not
been prepared yet. However, this doesn't work because this affects the
total ring size as well

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Tue, 14 Apr 2015 10:07:58 +0000 (12:07 +0200)]
bgmac: add check for oversized packets

In very rare cases, the MAC can catch an internal buffer that is bigger
than it's supposed to be. Instead of crashing the kernel, simply pass
the buffer back to the hardware

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Tue, 14 Apr 2015 10:07:57 +0000 (12:07 +0200)]
bgmac: simplify/optimize rx DMA error handling

Allocate a new buffer before processing the completed one. If allocation
fails, reuse the old buffer.
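
As an illustration only, a userspace sketch of the "allocate first, then
swap" pattern described above. The slot struct, function names, and the
allocator callback are hypothetical, and treating an allocation failure
as a dropped frame is an assumption of this sketch:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical rx ring slot holding one receive buffer. */
struct sketch_rx_slot {
    void *buf;
};

/* Stand-in allocator that always fails; used to exercise the reuse path. */
void *sketch_alloc_fail(size_t len)
{
    (void)len;
    return NULL;
}

/* Try to allocate the replacement buffer before touching the completed
 * one. On success the slot gets the fresh buffer and the completed buffer
 * is returned for further processing. On failure the slot keeps its old
 * buffer (so the ring never has a hole) and NULL is returned. */
void *sketch_rx_swap(struct sketch_rx_slot *slot,
                     void *(*alloc)(size_t), size_t len)
{
    void *fresh;
    void *done;

    assert(slot != NULL && alloc != NULL);
    fresh = alloc(len);
    if (fresh == NULL)
        return NULL;    /* old buffer stays in the ring and is reused */

    done = slot->buf;
    slot->buf = fresh;
    return done;
}
```

Ordering the steps this way means the error path needs no cleanup: either
the swap happens completely or the slot is left exactly as it was.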

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Acked-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Tue, 14 Apr 2015 10:07:56 +0000 (12:07 +0200)]
bgmac: set received skb headroom to NET_SKB_PAD

A packet buffer offset of 30 bytes is inefficient, because the first 2
bytes end up in a different cacheline.
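
As an illustration only, the arithmetic behind that statement, assuming a
32-byte cache line (the line size is an assumption of this sketch, and the
helper name is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes between a buffer offset and the next cache line
 * boundary. A result of 0 means the data starts exactly on a boundary;
 * a small non-zero result means the first few bytes sit alone at the end
 * of one line while the rest of the data starts in the next line. */
size_t sketch_bytes_before_boundary(size_t offset, size_t line_size)
{
    assert(line_size != 0);
    return (line_size - offset % line_size) % line_size;
}
```

With 32-byte lines, an offset of 30 leaves 2 bytes before the boundary,
while any offset that is a multiple of the line size leaves none.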

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Tue, 14 Apr 2015 10:07:55 +0000 (12:07 +0200)]
bgmac: leave interrupts disabled as long as there is work to do

Always poll rx and tx during NAPI poll instead of relying on the status
of the first interrupt. This prevents bgmac_poll from leaving unfinished
work around until the next IRQ.
In my tests this makes bridging/routing throughput under heavy load more
stable and ensures that no new IRQs arrive as long as bgmac_poll uses up
the entire budget.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Tue, 14 Apr 2015 10:07:54 +0000 (12:07 +0200)]
bgmac: simplify tx ring index handling

Keep incrementing ring->start and ring->end instead of pointing them to
the actual ring slot entries. This simplifies the calculation of the
number of free slots.
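
As an illustration only, a userspace sketch of free-running ring indices.
The struct, the names, and the ring size of 128 (a power of two, so that
the slot mapping stays consistent when the 32-bit counters wrap) are
assumptions of this sketch rather than the driver's definitions:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TX_RING_SLOTS 128u   /* assumed power-of-two ring size */

/* start and end only ever increase; they wrap naturally at 2^32. */
struct sketch_tx_ring {
    uint32_t start; /* first slot still owned by the hardware */
    uint32_t end;   /* next slot the driver will fill */
};

/* Free slots follow directly from the difference of the two counters;
 * unsigned subtraction stays correct across wraparound. */
uint32_t sketch_tx_free_slots(const struct sketch_tx_ring *r)
{
    uint32_t used = r->end - r->start;

    assert(used <= SKETCH_TX_RING_SLOTS);
    return SKETCH_TX_RING_SLOTS - used;
}

/* Map a free-running counter onto an actual slot in the ring. */
uint32_t sketch_tx_slot(uint32_t index)
{
    return index % SKETCH_TX_RING_SLOTS;
}
```

The slot number is derived only at the point of use, so the empty case
(start == end) and the full case (end - start == ring size) can never be
confused.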

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Acked-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Axtens [Tue, 14 Apr 2015 05:28:44 +0000 (15:28 +1000)]
toshiba: Remove celleb from Kconfig options

The toshiba drivers had celleb as an optional dependency.
celleb has been dropped [1], so clean that out of Kconfig.

[1] http://patchwork.ozlabs.org/patch/451730/

CC: netdev@vger.kernel.org
CC: Valentin Rothberg <valentinrothberg@gmail.com>
CC: mpe@ellerman.id.au
CC: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Mon, 13 Apr 2015 23:34:35 +0000 (16:34 -0700)]
hv_netvsc: Implement partial copy into send buffer

If the remaining space in a send buffer slot is too small for the whole
message, we copy only the RNDIS header and PPI data into the send buffer, so
we can batch one more packet each time. This reduces the vmbus per-message
overhead.
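
As an illustration only, a sketch of the decision described above. The
enum, the function name, and the parameters are hypothetical; header_len
stands for the combined size of the RNDIS header and PPI data:

```c
#include <assert.h>
#include <stddef.h>

/* What to place into the remaining space of the current send buffer slot. */
enum sketch_copy {
    SKETCH_COPY_NONE,   /* nothing fits; the slot cannot take this packet */
    SKETCH_COPY_HEADER, /* partial copy: header part only */
    SKETCH_COPY_ALL     /* the whole message fits */
};

enum sketch_copy sketch_choose_copy(size_t remain, size_t header_len,
                                    size_t total_len)
{
    assert(header_len <= total_len);

    if (total_len <= remain)
        return SKETCH_COPY_ALL;
    if (header_len <= remain)
        return SKETCH_COPY_HEADER;
    return SKETCH_COPY_NONE;
}
```

The middle case is the new behavior: a slot that would otherwise be closed
early can still absorb the header part of one more packet.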

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 13 Apr 2015 22:18:05 +0000 (18:18 -0400)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Al Viro says:

====================
netdev-related stuff in vfs.git

There are several commits sitting in vfs.git that probably ought to go in
via net-next.git.  First of all, there's merge with vfs.git#iocb - that's
Christoph's aio rework, which has triggered conflicts with the ->sendmsg()
and ->recvmsg() patches a while ago.  It's not so much Christoph's stuff
that ought to be in net-next, as (pretty simple) conflict resolution on merge.
The next chunk is switch to {compat_,}import_iovec/import_single_range - new
safer primitives for initializing iov_iter.  The primitives themselves come
from vfs.git#iov_iter (and they are used quite a lot in vfs part of queue),
conversion of net/socket.c syscalls belongs in net-next, IMO.  Next there's
afs and rxrpc stuff from dhowells.  And then there's sanitizing kernel_sendmsg
et al. + missing inlined helper for "how much data is left in msg->msg_iter" -
this stuff is used in e.g.  cifs stuff, but it belongs in net-next.

That pile is pullable from
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem

I'll post the individual patches in there in followups; could you take a look
and tell if everything in there is OK with you?
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 13 Apr 2015 01:51:09 +0000 (18:51 -0700)]
tcp/dccp: get rid of central timewait timer

Using a timer wheel for timewait sockets was nice ~15 years ago when
memory was expensive and machines had a single processor.

This does not scale; the code is ugly and a source of huge latencies
(typically 30 ms have been seen, with cpus spinning on the death_lock spinlock).

We can afford to use an extra 64 bytes per timewait sock and spread
timewait load to all cpus to have better behavior.

Tested:

In the following test, /proc/sys/net/ipv4/tcp_tw_recycle is set to 1
on the target (lpaa24)

Before patch :

lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
419594

lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
437171

While test is running, we can observe 25 or even 33 ms latencies.

lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23
...
1000 packets transmitted, 1000 received, 0% packet loss, time 20601ms
rtt min/avg/max/mdev = 0.020/0.217/25.771/1.535 ms, pipe 2

lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23
...
1000 packets transmitted, 1000 received, 0% packet loss, time 20702ms
rtt min/avg/max/mdev = 0.019/0.183/33.761/1.441 ms, pipe 2

After patch :

About 90% increase of throughput :

lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
810442

lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
800992

And latencies are kept to minimal values during this load, even
if network utilization is 90% higher :

lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23
...
1000 packets transmitted, 1000 received, 0% packet loss, time 19991ms
rtt min/avg/max/mdev = 0.023/0.064/0.360/0.042 ms

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
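
The "about 90%" figure in the tcp/dccp commit above can be checked against
the four super_netperf results it quotes. A minimal Python sketch, using only
those numbers; the helper name percent_increase is made up for illustration:

```python
from statistics import mean

# Transactions per second reported by super_netperf in the commit message.
before = [419594, 437171]
after = [810442, 800992]


def percent_increase(old, new):
    """Relative increase of mean(new) over mean(old), in percent."""
    return (mean(new) - mean(old)) / mean(old) * 100.0


gain = percent_increase(before, after)
print(f"throughput gain: {gain:.1f}%")
```

Averaged over the two runs on each side, the gain comes out at roughly 88%,
consistent with the rounded figure in the commit message.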
9 years ago netfilter: Fix format string of nfnetlink_log proc file
Richard Weinberger [Sun, 12 Apr 2015 22:52:39 +0000 (00:52 +0200)]
netfilter: Fix format string of nfnetlink_log proc file

The printed values are all of type unsigned integer, therefore use
%u instead of %d. Otherwise a user can see negative values.

Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netfilter: Fix format string of nfnetlink_queue proc file
Richard Weinberger [Sun, 12 Apr 2015 22:52:38 +0000 (00:52 +0200)]
netfilter: Fix format string of nfnetlink_queue proc file

The printed values are all of type unsigned integer, therefore use
%u instead of %d. Otherwise a user can see negative values.

Fixes:
$ cat /proc/net/netfilter/nfnetlink_queue
    0  29508   278 2 65531     0 2004213241 -2129885586  1
    1 -27747     0 2 65531     0     0        0  1
    2 -27748     0 2 65531     0     0        0  1

Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
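
To see why the negative numbers in the sample output above are just large
unsigned values printed with the wrong conversion, here is a small Python
sketch. It assumes a 32-bit int, and the two helper names are hypothetical:

```python
def as_signed32(value):
    """Interpret a 32-bit pattern the way a %d conversion would."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def as_unsigned32(value):
    """Interpret a 32-bit pattern the way a %u conversion would."""
    return value & 0xFFFFFFFF


# Negative values taken from the sample output in the commit message.
for shown in (-27747, -27748, -2129885586):
    print(shown, "->", as_unsigned32(shown))
```

Each negative value maps back to an unsigned number above 2^31, which is what
the fixed format string prints.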
9 years ago netfilter: Fix portid types
Richard Weinberger [Sun, 12 Apr 2015 22:52:37 +0000 (00:52 +0200)]
netfilter: Fix portid types

The netlink portid is an unsigned integer, use this type
also in netfilter.

Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago nfc: Fix portid type in urelease_work
Richard Weinberger [Sun, 12 Apr 2015 22:52:36 +0000 (00:52 +0200)]
nfc: Fix portid type in urelease_work

portid is an unsigned integer. Fix urelease_work to
match all other portid users in the kernel.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netlink: Fix portid type in netlink_notify
Richard Weinberger [Sun, 12 Apr 2015 22:52:35 +0000 (00:52 +0200)]
netlink: Fix portid type in netlink_notify

portid is an unsigned integer. Fix netlink_notify to
match all other portid users in the kernel.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago tcp: fix bogus RTT for CC when retransmissions are acked
Kenneth Klette Jonassen [Sat, 11 Apr 2015 00:17:49 +0000 (02:17 +0200)]
tcp: fix bogus RTT for CC when retransmissions are acked

Since retransmitted segments are not used for RTT estimation, previously
SACKed segments present in the rtx queue are used. This estimation can be
several times larger than the actual RTT. When a cumulative ack covers both
previously SACKed and retransmitted segments, CC may thus get a bogus RTT.

Such segments previously had an RTT estimation in tcp_sacktag_one(), so it
seems reasonable to not reuse them in tcp_clean_rtx_queue() at all.

Afaik, this has had no effect on SRTT/RTO because of Karn's check.

Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: use jump label patching for ingress qdisc in __netif_receive_skb_core
Daniel Borkmann [Fri, 10 Apr 2015 21:07:54 +0000 (23:07 +0200)]
net: use jump label patching for ingress qdisc in __netif_receive_skb_core

Even if we make use of classifier and actions from the egress
path, we're going into handle_ing(), executing additional code
at a per-packet cost for the ingress qdisc, just to realize that
nothing is attached on ingress.

Instead, this can just be blinded out as a no-op entirely with
the use of a static key. On input fast-path, we already make
use of static keys in various places, e.g. skb time stamping,
in RPS, etc. It makes sense to not waste time when we're assured
that no ingress qdisc is attached anywhere.

Enabling/disabling of that code path is done via two
helpers, namely net_{inc,dec}_ingress_queue(), that are
invoked under RTNL mutex when an ingress qdisc is either
initialized or destroyed.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago Merge branch 'netdev_diet'
David S. Miller [Mon, 13 Apr 2015 17:15:14 +0000 (13:15 -0400)]
Merge branch 'netdev_diet'

Thomas Graf says:

====================
Bring sizeof(net_device) down to < 2K bytes

The size of struct net_device crossed the 2K boundary a while ago which
is a waste in combination with many net namespaces. This series brings
the size of struct net_device down to well below 2K in total size with
a typical configuration. Some reserves and several holes leave room for
further expansion.

Before:
/* size: 2176, cachelines: 34, members: 121 */

After:
/* size: 1984, cachelines: 31, members: 120 */
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net_device: Reorder members to fill holes
Thomas Graf [Fri, 10 Apr 2015 13:52:38 +0000 (15:52 +0200)]
net_device: Reorder members to fill holes

Some trivial reorders while preserving the RX/TX cache lines
split to fill a couple of holes.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago e1000e: Move pm_qos_req to e1000e adapter
Thomas Graf [Fri, 10 Apr 2015 13:52:37 +0000 (15:52 +0200)]
e1000e: Move pm_qos_req to e1000e adapter

e1000e is the only driver requiring pm_qos_req. Instead of causing
every device to waste up to 240 bytes, allocate it for the specific
driver.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago selinux/nlmsg: add a build time check for rtnl/xfrm cmds
Nicolas Dichtel [Mon, 13 Apr 2015 13:20:37 +0000 (15:20 +0200)]
selinux/nlmsg: add a build time check for rtnl/xfrm cmds

When a new rtnl or xfrm command is added, this part of the code is frequently
missing. Let's help the developer with a build time test.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Mon, 13 Apr 2015 01:36:57 +0000 (21:36 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-04-11

This series contains updates to if_link, ixgbe and ixgbevf.

The entire set of changes comes from Vlad Zolotarov to ultimately add
the ethtool ops to VF driver to allow querying the RSS indirection table
and RSS random key.

Currently we support only 82599 and x540 devices.  On those devices, VFs
share the RSS redirection table and hash key with a PF.  Letting the VF
query this information may introduce some security risks, therefore this
feature will be disabled by default.

The new netdev op allows a system administrator to change the default
behaviour with the "ip link set" command.  The relevant iproute2 patch has
already been sent and awaits this series upstream.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago Merge branch 'fou-next'
David S. Miller [Mon, 13 Apr 2015 01:25:14 +0000 (21:25 -0400)]
Merge branch 'fou-next'

Cong Wang says:

====================
fou: some fixes and updates

Patch 1~3 fix some minor bugs in net/ipv4/fou.c, the only
thing I am not sure about is whether it's too late to change the
byte order of FOU_ATTR_PORT, if so we have to fix iproute2
instead of kernel.

Patch 4~5 add some new features to make it complete.

v2: make fou->port be16 too
====================

Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago fou: implement FOU_CMD_GET
WANG Cong [Fri, 10 Apr 2015 19:00:30 +0000 (12:00 -0700)]
fou: implement FOU_CMD_GET

Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago fou: add network namespace support
WANG Cong [Fri, 10 Apr 2015 19:00:29 +0000 (12:00 -0700)]
fou: add network namespace support

Also convert the spinlock to a mutex.

Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago fou: always use be16 for port
WANG Cong [Fri, 10 Apr 2015 19:00:28 +0000 (12:00 -0700)]
fou: always use be16 for port

udp_config.local_udp_port is be16. And iproute2 passes
network order for FOU_ATTR_PORT.

This doesn't fix any bug, just for consistency.

Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
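
As an illustration of what "be16" means for a port number in the commit
above, the following Python sketch packs a port in network byte order and, for
contrast, in little-endian order. The port value 7777 is only an example,
borrowed from the "ip fou del port 7777" command that appears elsewhere in
this log:

```python
import struct

port = 7777  # example port only

be16 = struct.pack("!H", port)   # network (big-endian) byte order
le16 = struct.pack("<H", port)   # little-endian, e.g. host order on x86

print(be16.hex(), le16.hex())

# Reading the big-endian bytes back as big-endian recovers the port;
# reading them with the wrong byte order does not.
print(struct.unpack("!H", be16)[0], struct.unpack("<H", be16)[0])
```

Keeping one byte order for the attribute end to end avoids having to guess
which of the two interpretations a given value uses.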
9 years ago fou: exit early when parsing config fails
WANG Cong [Fri, 10 Apr 2015 19:00:27 +0000 (12:00 -0700)]
fou: exit early when parsing config fails

Not a big deal, just for correctness.

Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago fou: avoid calling udp_del_offload() twice
WANG Cong [Fri, 10 Apr 2015 19:00:26 +0000 (12:00 -0700)]
fou: avoid calling udp_del_offload() twice

This fixes the following harmless warning:

./ip/ip fou del port 7777
[  122.907516] udp_del_offload: didn't find offload for port 7777

Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago Merge branch 'selinux_xfrm_nl_cmd'
David S. Miller [Mon, 13 Apr 2015 01:19:40 +0000 (21:19 -0400)]
Merge branch 'selinux_xfrm_nl_cmd'

Nicolas Dichtel says:

====================
selinux: add missing xfrm nl cmd

With this series, xfrm commands are fully synchronized.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago selinux/nlmsg: add XFRM_MSG_MAPPING
Nicolas Dichtel [Fri, 10 Apr 2015 14:24:28 +0000 (16:24 +0200)]
selinux/nlmsg: add XFRM_MSG_MAPPING

This command is missing.

Fixes: 3a2dfbe8acb1 ("xfrm: Notify changes in UDP encapsulation via netlink")
CC: Martin Willi <martin@strongswan.org>
Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago selinux/nlmsg: add XFRM_MSG_MIGRATE
Nicolas Dichtel [Fri, 10 Apr 2015 14:24:27 +0000 (16:24 +0200)]
selinux/nlmsg: add XFRM_MSG_MIGRATE

This command is missing.

Fixes: 5c79de6e79cd ("[XFRM]: User interface for handling XFRM_MSG_MIGRATE")
Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago selinux/nlmsg: add XFRM_MSG_REPORT
Nicolas Dichtel [Fri, 10 Apr 2015 14:24:26 +0000 (16:24 +0200)]
selinux/nlmsg: add XFRM_MSG_REPORT

This command is missing.

Fixes: 97a64b4577ae ("[XFRM]: Introduce XFRM_MSG_REPORT.")
Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago tcp: do not cache align timewait sockets
Eric Dumazet [Fri, 10 Apr 2015 13:07:18 +0000 (06:07 -0700)]
tcp: do not cache align timewait sockets

With recent adoption of skc_cookie in struct sock_common,
struct tcp_timewait_sock size increased from 192 to 200 bytes
on 64bit arches. SLAB then rounds this up to 256 bytes.

It is time to drop SLAB_HWCACHE_ALIGN constraint for twsk_slab.

This saves about 12 MB of memory on typical configuration reaching
262144 timewait sockets, and has no noticeable impact on performance.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
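
The rounding described in the timewait commit above can be reproduced with a
few lines of Python. This is only a sketch: the 64-byte cache line is an
assumption, the helper name is made up, and the result is an upper bound on
the padding rather than a model of what the allocator does once the flag is
dropped:

```python
def round_up(size, align):
    """Round size up to the next multiple of align (align must be a power of two)."""
    return (size + align - 1) & ~(align - 1)


CACHE_LINE = 64      # assumed cache line size in bytes
TW_SOCK_SIZE = 200   # size of struct tcp_timewait_sock quoted in the commit
TW_SOCKS = 262144    # timewait socket count quoted in the commit

aligned = round_up(TW_SOCK_SIZE, CACHE_LINE)
padding_total = (aligned - TW_SOCK_SIZE) * TW_SOCKS

print(aligned, padding_total / 2**20)
```

With cache-line alignment each 200-byte object occupies 256 bytes, so up to
14 MiB of padding is spread over 262144 sockets. The commit reports about
12 MB actually saved; the sketch does not try to account for the difference.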
9 years ago Merge tag 'mac80211-next-for-davem-2015-04-10' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Mon, 13 Apr 2015 00:43:46 +0000 (20:43 -0400)]
Merge tag 'mac80211-next-for-davem-2015-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
There isn't much left, but we have
 * new mac80211 internal software queue to allow drivers to have
   shorter hardware queues and pull on-demand
 * use rhashtable for mac80211 station table
 * minstrel rate control debug improvements and some refactoring
 * fix noisy message about TX power reduction
 * fix continuous message printing and activity if CRDA doesn't respond
 * fix VHT-related capabilities with "iw connect" or "iwconfig ..."
 * fix Kconfig for cfg80211 wireless extensions compatibility
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/macb: sqe_test_errors are TX errors, not RX errors
Wolfgang Steinwender [Fri, 10 Apr 2015 09:42:56 +0000 (11:42 +0200)]
net/macb: sqe_test_errors are TX errors, not RX errors

The statistics are grouped by TX and RX errors.
The SQE Test Errors Register indicates problems with TX.

Signed-off-by: Wolfgang Steinwender <wsteinwender@pcs.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago new helper: msg_data_left()
Al Viro [Tue, 16 Dec 2014 02:39:31 +0000 (21:39 -0500)]
new helper: msg_data_left()

convert open-coded instances

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years ago Merge remote-tracking branch 'dh/afs' into for-davem
Al Viro [Sat, 11 Apr 2015 19:51:09 +0000 (15:51 -0400)]
Merge remote-tracking branch 'dh/afs' into for-davem

9 years ago get rid of the size argument of sock_sendmsg()
Al Viro [Thu, 11 Dec 2014 05:02:50 +0000 (00:02 -0500)]
get rid of the size argument of sock_sendmsg()

it's equal to iov_iter_count(&msg->msg_iter) in all cases

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years ago ixgbevf: Add the appropriate ethtool ops to query RSS indirection table and key
Vlad Zolotarov [Mon, 30 Mar 2015 18:35:29 +0000 (21:35 +0300)]
ixgbevf: Add the appropriate ethtool ops to query RSS indirection table and key

Added get_rxfh_indir_size, get_rxfh_key_size and get_rxfh ethtool_ops
callback implementations.

This enables the ethtool's "-x" and "--show-rxfh[-indir]" options for VF
devices.

This patch adds the support for 82599 and x540 devices only. Support for
other devices will be added later.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbevf: Add RSS Key query code
Vlad Zolotarov [Mon, 30 Mar 2015 18:35:28 +0000 (21:35 +0300)]
ixgbevf: Add RSS Key query code

Add the ixgbevf_get_rss_key() function that queries the PF for an RSS
Random Key using a new VF-PF channel IXGBE_VF_GET_RSS_KEY command.

This patch adds the support for 82599 and x540 devices only. Support for
other devices will be added later.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbe: Add GET_RSS_KEY command to VF-PF channel commands set
Vlad Zolotarov [Mon, 30 Mar 2015 18:35:27 +0000 (21:35 +0300)]
ixgbe: Add GET_RSS_KEY command to VF-PF channel commands set

For 82599 and x540, VFs and PF share the same RSS Key. Therefore we will
return the same RSS key for all VFs.

Support for other devices will be added later.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbevf: Add a RETA query code
Vlad Zolotarov [Mon, 30 Mar 2015 18:35:26 +0000 (21:35 +0300)]
ixgbevf: Add a RETA query code

We will currently support only 82599 and x540 devices. Support for other
devices will be added later.

   - Added a new API version support.
   - Added the query implementation in the ixgbevf.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbe: Add a RETA query command to VF-PF channel API
Vlad Zolotarov [Wed, 1 Apr 2015 08:24:54 +0000 (11:24 +0300)]
ixgbe: Add a RETA query command to VF-PF channel API

Add this new command for 82599 and x540 devices only. Support for other
devices will be added later.

82599 and x540 VFs and PF share the same RSS redirection table (RETA).
Therefore we just return it for all VFs.

For 82599 and x540, the RETA is an array of 32 registers (128 bytes) and
the maximum number of registers that may be delivered in a single VF-PF
channel command is 15. On the other hand, VFs of these devices can be
configured to have up to 4 RSS queues. Therefore we will "compress" the
RETA by transferring only 2 bits per entry, so that it takes only 8
registers (DWORDS) to transfer the whole VF RETA.

Thus this patch does the following:

  - Adds a new API version (to specify a new commands set).
  - Adds the IXGBE_VF_GET_RETA command to the VF-PF commands set.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
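
The RETA "compression" described in the ixgbe commit above amounts to packing
128 two-bit queue indexes into eight 32-bit words. A Python sketch of that
idea follows; the function names, the sample table and the bit order within
each word (lowest bits first) are assumptions made for illustration, not
taken from the driver:

```python
RETA_ENTRIES = 128                        # 8 DWORDs x 16 two-bit entries
BITS_PER_ENTRY = 2                        # enough for queue indexes 0..3
ENTRIES_PER_DWORD = 32 // BITS_PER_ENTRY  # 16 entries per 32-bit word


def compress_reta(reta):
    """Pack 2-bit queue indexes into 32-bit words, lowest bits first (assumed order)."""
    dwords = [0] * (len(reta) // ENTRIES_PER_DWORD)
    for i, queue in enumerate(reta):
        if not 0 <= queue < (1 << BITS_PER_ENTRY):
            raise ValueError("queue index does not fit in 2 bits")
        shift = BITS_PER_ENTRY * (i % ENTRIES_PER_DWORD)
        dwords[i // ENTRIES_PER_DWORD] |= queue << shift
    return dwords


def expand_reta(dwords):
    """Inverse of compress_reta(): unpack every 2-bit field back into a list."""
    return [(word >> (BITS_PER_ENTRY * j)) & 0x3
            for word in dwords
            for j in range(ENTRIES_PER_DWORD)]


reta = [i % 4 for i in range(RETA_ENTRIES)]  # made-up table spread over 4 queues
packed = compress_reta(reta)
print(len(packed), [hex(word) for word in packed[:2]])
```

Eight words fit comfortably under the 15-register limit of a single VF-PF
channel command, which is the point of sending 2 bits per entry.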
9 years ago ixgbe: Add a new netdev op to allow/prevent a VF from querying an RSS info
Vlad Zolotarov [Mon, 30 Mar 2015 18:35:24 +0000 (21:35 +0300)]
ixgbe: Add a new netdev op to allow/prevent a VF from querying an RSS info

Implements the new netdev op to allow the user to enable/disable the ability
of a specific VF to query its RSS Indirection Table and RSS Hash Key.

This patch limits the new feature support to 82599 and x540 devices only.
Support for other devices will be added later.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago if_link: Add an additional parameter to ifla_vf_info for RSS querying
Vlad Zolotarov [Mon, 30 Mar 2015 18:35:23 +0000 (21:35 +0300)]
if_link: Add an additional parameter to ifla_vf_info for RSS querying

Add a configuration setting for drivers to allow/block querying of the RSS
Redirection Table and Hash Key for discrete VFs.

On some devices VFs share the above-mentioned information with the PF, and
querying it may introduce a theoretical security risk. We want to let a
system administrator decide whether to take this risk or not.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbe: Add the appropriate ethtool ops to query RSS indirection table and key
Vlad Zolotarov [Mon, 30 Mar 2015 18:18:58 +0000 (21:18 +0300)]
ixgbe: Add the appropriate ethtool ops to query RSS indirection table and key

Added get_rxfh_indir_size, get_rxfh_key_size and get_rxfh ethtool_ops
callback implementations.

This enables the ethtool's "-x" and "--show-rxfh[-indir]" options.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbe: Refactor the RSS configuration code
Vlad Zolotarov [Mon, 30 Mar 2015 18:18:57 +0000 (21:18 +0300)]
ixgbe: Refactor the RSS configuration code

This patch is a preparation for enablement of ethtool RSS indirection
table and hash key querying. We don't want to read registers every time
the RSS info is queried. Therefore we will store its current content in the
arrays in the adapter struct and will read it from there (instead of from
registers) when requested.

Also change the code that writes the indirection table and hash key into
the HW registers to take its content from these arrays. This will also
simplify the indirection table updating ethtool callback implementation
in the future.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Fri, 10 Apr 2015 19:49:34 +0000 (12:49 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-04-10

This series contains updates to ixgbe and documentation for igb,
ixgbe and ixgb.

Stephen cleans up documentation to igb, ixgbe and ixgb.

Don updates how bridge mode is stored to minimize obfuscation and
makes updates for future silicon easier.  Adds a new bridge mode
support function which gathers all the logic needed to configure
bridge modes.  Adds Source Address Pruning for VEPA bridge mode
for x550 devices.

Vasu adds specific FCoE offloads for x550 for DDP context programming
and increased DDP exchanges.

Alex Duyck cleans up the use of HW_VLAN_CTAG_FILTER in hw_features,
where the driver was actually ignoring the value of the bit and was
just assuming it was always set.  Also cleans up the use of rcu_barrier()
since the driver has not used call_rcu() to free the rings for some
time now.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago rtnetlink: Mark name argument of rtnl_create_link() const
Thomas Graf [Thu, 9 Apr 2015 23:45:53 +0000 (01:45 +0200)]
rtnetlink: Mark name argument of rtnl_create_link() const

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago ixgbe: Drop unnecessary call to rcu_barrier
Alexander Duyck [Fri, 10 Apr 2015 05:03:24 +0000 (22:03 -0700)]
ixgbe: Drop unnecessary call to rcu_barrier

The ixgbe driver hasn't used call_rcu to free the rings for some time now.
Since that is the case the call to rcu_barrier can be dropped since calls
to kfree_rcu don't require it.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbe: Remove NETIF_F_HW_VLAN_CTAG_FILTER from hw_features
Alexander Duyck [Fri, 10 Apr 2015 05:03:24 +0000 (22:03 -0700)]
ixgbe: Remove NETIF_F_HW_VLAN_CTAG_FILTER from hw_features

This change makes it so that the HW_VLAN_CTAG_FILTER bit is not falsely
advertised as being a feature that can be toggled on ixgbe parts.  The
driver was setting the bit in features and letting it be inherited by
hw_features, however the driver was actually ignoring the value of the bit
and just assuming it was always set.  As a result VLAN filtering was always
enabled which is a requirement for SR-IOV, VMDq, DCB, FCoE, and possibly
other features within the adapters.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbe: adds x550 specific FCoE offloads
Vasu Dev [Fri, 10 Apr 2015 05:03:23 +0000 (22:03 -0700)]
ixgbe: adds x550 specific FCoE offloads

Adds x550 specific FCoE offloads for DDP context programming and
increased DDP exchanges.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbe: add support for X550 source_address_prunning
Don Skidmore [Fri, 10 Apr 2015 05:03:23 +0000 (22:03 -0700)]
ixgbe: add support for X550 source_address_prunning

This patch will enable X550 Source Address Pruning for VEPA
bridge mode.  This requires that we also have replication enabled
while in this mode.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbe: add new bridge mode support function.
Don Skidmore [Fri, 10 Apr 2015 05:03:22 +0000 (22:03 -0700)]
ixgbe: add new bridge mode support function.

This patch gathers together all the logic needed to configure bridge
modes.  Currently it is rather simple, but this is really laying
the groundwork for future X550 feature enhancement.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbe: Move bridge mode from flag to variable
Don Skidmore [Fri, 10 Apr 2015 05:03:22 +0000 (22:03 -0700)]
ixgbe: Move bridge mode from flag to variable

We are currently storing our BRIDGE_MODE as a bit in our adapter flags.
This patch will store the actual mode instead which minimizes obfuscation
and makes following patches for X550 simpler.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgb: remove references to ifconfig
Stephen Hemminger [Fri, 10 Apr 2015 05:03:21 +0000 (22:03 -0700)]
ixgb: remove references to ifconfig

Move documentation into this century, even if this device hasn't
been available for some time.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago ixgbe: fix documentation
Stephen Hemminger [Fri, 10 Apr 2015 05:03:21 +0000 (22:03 -0700)]
ixgbe: fix documentation

The MTU values in the documentation do not match the source.
The source has a frame limit of IXGBE_MAX_JUMBO_FRAME_SIZE (9728),
which corresponds to an MTU of 9710 once the Ethernet header
and CRC are accounted for.

Also, don't refer to the obsolete ifconfig command.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
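
The MTU arithmetic in the ixgbe documentation fix above can be spelled out as
follows. This Python sketch assumes an untagged frame, that is a 14-byte
Ethernet header and a 4-byte CRC with no VLAN tag; the function name is made
up for illustration:

```python
ETH_HLEN = 14     # Ethernet header: two 6-byte MAC addresses + 2-byte EtherType
ETH_FCS_LEN = 4   # trailing CRC (frame check sequence)

IXGBE_MAX_JUMBO_FRAME_SIZE = 9728  # value quoted in the commit message


def max_mtu(max_frame_size):
    """Largest MTU that fits in a frame of max_frame_size bytes (no VLAN tag assumed)."""
    return max_frame_size - ETH_HLEN - ETH_FCS_LEN


print(max_mtu(IXGBE_MAX_JUMBO_FRAME_SIZE))
```

Subtracting the 18 bytes of header and CRC from 9728 gives the 9710 stated in
the commit message.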
9 years ago igb: doc don't refer to ifconfig
Stephen Hemminger [Fri, 10 Apr 2015 04:02:02 +0000 (21:02 -0700)]
igb: doc don't refer to ifconfig

ifconfig command is obsolete, best to remove all references so that
new users learn ip.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>