We shouldn't restart Admin queue subtask if PF reset fails since we do
not have the AQ setup at that point. This patch makes sure we disable AQ
clean subtask when PF reset fails.
This will resolve an occasional kernel panic when PF reset fails for
some reason.
Change-ID: I11a747773362a8c5c0ad7a10cd34be0bda8eb9e8 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver is unnecessarily printing a warning that is only marginally
useful to the user. Make the warning print only if extended driver
string printing is enabled; other messages related to a reset event
will continue to print.
Change-ID: I5e8beca6516a2f176cd2e72b0ac2b3b909e6c953 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
i40e: Tell OS link is going down when calling set_phy_config
Since we don't seem to be getting an LSE telling us link is going down
during set_phy_config (but we do get an LSE telling us we are coming
back up), fake one for the OS and tell them link is going down. Also
do an atomic restart no matter what because there are times the user
may want to end with link up even if they started with link down (like
if they accidentally set it to a speed that can't link and are trying to
fix it).
Change-ID: I0a642af9c1d0feb67bce741aba1a9c33bd349ed6 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Serey Kong [Tue, 29 Jul 2014 04:03:53 +0000 (04:03 +0000)]
i40e: Remove unnecessary assignment
Remove unnecessary setting of "ret" variable as it's already set at
the top of the function.
Change-ID: Icaccfc67f335817a23579b7c43625d59ad6c9925 Signed-off-by: Serey Kong <serey.kong@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Serey Kong [Sat, 12 Jul 2014 07:28:14 +0000 (07:28 +0000)]
i40e: Change wording to be more consistent
Change "spoofck" to "spoofchk" to be consistent with the spelling used in netdev.
Change-ID: I9866d6284cb5f92c8d71dc0776c6d1e71dfb62a5 Signed-off-by: Serey Kong <serey.kong@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
i40e: Allow user to change link settings if link is down
Allow the user to change auto-negotiation and speed settings if
link is down.
Change-ID: I372967c627682b5e1835f623a7cbf41b21b51043 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Now that fw has implemented dual speed module support, we can add ours.
Also, add the phy type for 1G LR/SR and set its media type to fiber.
Lastly, instead of a WARN_ON if the phy type is not recognized just print
a warning.
Change-ID: I2e5227d4a8c2907b0ed423038e5dbce774e466b0 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Rick Jones [Wed, 3 Sep 2014 16:18:00 +0000 (09:18 -0700)]
mlx4_en: Convert the normal skb free path to dev_consume_skb_any()
It would appear the mlx4_en driver was still making a call to
dev_kfree_skb_any() where dev_consume_skb_any() would be more
appropriate. This should make dropped packet profiling/tracking
easier/better over a NIC driven by mlx4_en.
Signed-off-by: Rick Jones <rick.jones2@hp.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
lib/rhashtable: allow user to set the minimum shifts of shrinking
Although the rhashtable library allows the user to specify a quite big
size for a newly created hash table, the table may be shrunk to a
very small size - HASH_MIN_SIZE(4) - the first time an object is
removed from the table. Subsequently, even if the total number of
objects stored in the table stays well below the user's initial
setting for a long time, the hash table size is still dynamically
adjusted by rhashtable_shrink() or rhashtable_expand() each time an
object is inserted or removed. However, as synchronize_rcu() has to
be called whenever the table is shrunk or expanded by these two
functions, we should permit the user to set the minimum table size
by configuring the minimum number of shifts according to the user's
specific requirements, avoiding the expensive shrink and expand
operations and their synchronize_rcu() calls.
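The effect of a minimum shift can be sketched in plain user-space C. This is only an illustration, not the rhashtable code: `struct table_params`, its `min_shift` field, and `next_shrink_shift()` are hypothetical names, and halving the table by one shift per shrink is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameters: the table holds (1 << shift) buckets. */
struct table_params {
	unsigned int min_shift;	/* user-configured floor for shrinking */
};

/*
 * Return the shift the table would use after one shrink attempt.
 * The table halves (shift - 1) unless it is already at or below
 * min_shift, in which case the resize is skipped entirely.
 */
static unsigned int next_shrink_shift(const struct table_params *p,
				      unsigned int cur_shift)
{
	assert(p != NULL);
	if (cur_shift <= p->min_shift)
		return cur_shift;	/* at the floor: no costly resize */
	return cur_shift - 1;
}
```

With a floor of 10, a table at shift 12 may still shrink to 11, while a table already at shift 10 is left alone, so no grace-period wait is triggered for it.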
Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
qdisc: validate frames going through the direct_xmit path
In commit 50cbe9ab5f8d ("net: Validate xmit SKBs right when we
pull them out of the qdisc") the validation code was moved out of
dev_hard_start_xmit and into dequeue_skb.
However this overlooked the fact that we do not always enqueue
the skb onto a qdisc. The first situation is when the qdisc has the
TCQ_F_CAN_BYPASS flag set and the qdisc is empty. The second is when
there is no qdisc on the device, which is a common case for
software devices.
Originally spotted and initial patch by Alexander Duyck.
As a result Alex was seeing issues trying to connect to a
vhost_net interface after commit 50cbe9ab5f8d was applied.
Added a call to validate_xmit_skb() in __dev_xmit_skb(), in the
code path for qdiscs with TCQ_F_CAN_BYPASS flag, and in
__dev_queue_xmit() when no qdisc.
Also handle the error situation where dev_hard_start_xmit() returns
an skb list, dev_xmit_complete(rc) is false, and the code falls
through to kfree_skb(). In that situation it should call
kfree_skb_list().
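Why a list-aware free matters can be shown with a small user-space model. `struct buf`, `free_buf_list()`, and `make_chain()` are hypothetical stand-ins for struct sk_buff and the kernel helpers; this is a sketch of the idea, not the kernel implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff: only the list link matters. */
struct buf {
	struct buf *next;
};

/* Free every buffer on the chain and return how many were freed. */
static unsigned int free_buf_list(struct buf *head)
{
	unsigned int freed = 0;

	while (head) {
		struct buf *next = head->next;	/* read the link before freeing */

		free(head);
		head = next;
		freed++;
	}
	return freed;
}

/* Demo helper: build a chain of n buffers, or return NULL on failure. */
static struct buf *make_chain(unsigned int n)
{
	struct buf *head = NULL;

	assert(n <= 64);	/* keep demo chains short */
	while (n--) {
		struct buf *b = malloc(sizeof(*b));

		if (!b) {
			free_buf_list(head);
			return NULL;
		}
		b->next = head;
		head = b;
	}
	return head;
}
```

Freeing only the head of a three-element chain would leak the other two elements, which is the situation the fix avoids.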
Fixes: 50cbe9ab5f8d ("net: Validate xmit SKBs right when we pull them out of the qdisc") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
qdisc: exit case fixes for skb list handling in qdisc layer
More minor fixes to merge commit 53fda7f7f9e (Merge branch 'xmit_list')
that allows us to work with a list of SKBs.
Fixing exit cases in qdisc_reset() and qdisc_destroy(), where a
leftover requeued SKB (qdisc->gso_skb) can have the potential of
being a skb list, thus use kfree_skb_list().
This is a followup to commit 10770bc2d1 ("qdisc: adjustments for
API allowing skb list xmits").
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 2 Sep 2014 19:46:04 +0000 (12:46 -0700)]
Merge branch 'be2net-next'
Sathya Perla says:
====================
be2net: patch set
v2 changes: add a new line after variable declaration in patch 12.
***
Patch 1 adds a few new log messages to help debugging in failure cases.
Patch 2 uses new macros for parsing RX/TX completions and TX wrbs to
help shorten the lines.
Patch 3 adds a description for the RX counter rx_input_fifo_overflow_drop.
Patch 4 adds TX completion error statistics reporting via ethtool.
Patch 5 adds a dma_mapping_error counter and its reporting via ethtool.
Patch 6 fixes up log messages in the Lancer FW download path.
Patch 7 replaces gotos with direct return statements.
Patch 8 cleans up be_change_mtu() code by using a new macro BE_MAX_MTU
Patch 9 makes the be_cmd_get_regs() routine return an integer status
similar to other FW cmd routines in be_cmds.c
Patch 10 gets rid of TX budget as enforcing a budget on TX completion
processing in NAPI is neither suggested nor does it provide a performance benefit.
Patch 11 defines and uses a new macro for_all_tx_queues_on_eq() similar
to the RX processing code.
Patch 12 queries max_tx_qs from the FW for BE3 super-nic profiles.
For those profiles, the driver cannot assume a constant BE3_MAX_TX_QS value,
as the value may change for each function.
Please consider applying this patch set to the net-next tree. Thanks!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
be2net: query max_tx_qs for BE3 super-nic profile from FW
In the BE3 super-nic profile, the max_tx_qs value can vary for each function.
So the driver needs to query this value from FW instead of using the
pre-defined constant BE3_MAX_TX_QS.
Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the for() loop that traverses all the TX queues on an EQ
with the macro for_all_tx_queues_on_eq(). With this explanatory
name, the one line comment is not required anymore.
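A loop-wrapping macro of this kind might look like the following user-space sketch. The struct, the `_model` macro name, and the round-robin mapping of TX queues to EQs (queue i belongs to EQ i % num_evt_qs) are assumptions made for illustration; the driver's real macro and data structures may differ.

```c
#include <assert.h>

#define MAX_TXQ 8

/* Hypothetical adapter model, not the driver's struct be_adapter. */
struct adapter_model {
	int num_tx_qs;		/* total TX queues */
	int num_evt_qs;		/* total event queues (EQs) */
	int tx_hits[MAX_TXQ];	/* visit counter per TX queue */
};

/* Visit every TX queue index i served by EQ number eq_idx. */
#define for_all_tx_queues_on_eq_model(a, eq_idx, i) \
	for ((i) = (eq_idx); (i) < (a)->num_tx_qs; (i) += (a)->num_evt_qs)

/* Mark each TX queue on the given EQ and return how many were visited. */
static int count_tx_queues_on_eq(struct adapter_model *a, int eq_idx)
{
	int i, n = 0;

	assert(a->num_evt_qs > 0 && a->num_tx_qs <= MAX_TXQ);
	for_all_tx_queues_on_eq_model(a, eq_idx, i) {
		a->tx_hits[i]++;
		n++;
	}
	return n;
}
```

The macro name carries the intent that a bare for() loop would otherwise need a comment to explain.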
Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
There are a few failure cases in be_cmd_get_regs() that ideally must return
an error value. This style is used across all the routines in be_cmds.c with
this routine being an exception. This patch fixes this.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kalesh AP [Tue, 2 Sep 2014 04:26:53 +0000 (09:56 +0530)]
be2net: define BE_MAX_MTU
This patch defines a new macro BE_MAX_MTU to make the code in be_change_mtu()
more readable.
Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kalesh AP [Tue, 2 Sep 2014 04:26:52 +0000 (09:56 +0530)]
be2net: remove unnecessary gotos
In cases where there is no extra code to handle an error, this patch replaces
gotos with a direct return statement.
Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kalesh AP [Tue, 2 Sep 2014 04:26:51 +0000 (09:56 +0530)]
be2net: fix log messages in lancer FW download path
Log messages in the Lancer FW download path have issues such as:
- a single message spanning multiple lines
- the success message is logged even in failure cases
- status codes are already logged in the FW cmd routines
This patch fixes these issues.
Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
be2net: Add a dma_mapping_error counter in ethtool
Add a dma_mapping_error counter to count the number of packets dropped
due to DMA mapping errors.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kalesh AP [Tue, 2 Sep 2014 04:26:49 +0000 (09:56 +0530)]
be2net: Add TX completion error statistics in ethtool
HW reports TX completion errors in TX completion. This patch adds these
counters to ethtool statistics.
Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The AMAP_GET/SET_BITS() macro calls take structure name as a parameter
and hence are long and span more than one line. Replace these calls
with wrapper macros for RX/TX compls and TX wrb. This results in fewer
lines and more readable code in be_main.c
Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the following log messages to help debugging
failure cases:
1) log FW version number: this is useful when driver initialization
fails and the FW version number cannot be queried via ethtool
2) per function resource limits for BEx chips: these values are
currently being printed only for Skyhawk and Lancer
3) PCI BAR mapping failure
4) function_mode/caps queried from FW: this helps catch any FW bugs
that could advertise wrong capabilities to the driver
Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
sk->sk_error_queue is dequeued in four locations. All share the
exact same logic. Deduplicate.
Also collapse the two critical sections for dequeue (at the top of
the recv handler) and signal (at the bottom).
This moves signal generation for the next packet forward, which should
be harmless.
It also changes the behavior if the recv handler exits early with an
error. Previously, a signal for follow-up packets on the errqueue
would then not be scheduled. The new behavior, to always signal, is
arguably a bug fix.
For rxrpc, the change causes the same function to be called repeatedly
for each queued packet (because the recv handler == sk_error_report).
It is likely that all packets will fail for the same reason (e.g.,
memory exhaustion).
This code runs without sk_lock held, so it is not safe to trust that
sk->sk_err is immutable in between releasing q->lock and the subsequent
test. Introduce int err just to avoid this potential race.
Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 2 Sep 2014 04:36:35 +0000 (21:36 -0700)]
Merge branch 'csums-next'
Tom Herbert says:
====================
net: Checksum offload changes - Part VII
I am working on overhauling RX checksum offload. Goals of this effort
are:
- Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
- Preserve CHECKSUM_COMPLETE through encapsulation layers
- Don't do skb_checksum more than once per packet
- Unify GRO and non-GRO csum verification as much as possible
- Unify the checksum functions (checksum_init)
- Simplify code
What is in this seventh patch set:
- Add skb->csum_bad. This allows a device or GRO to indicate that an
invalid checksum was detected.
- Checksum unnecessary to checksum complete conversions.
With these changes, I believe that the third goal of the overhaul is
now mostly achieved. In the case of no encapsulation or one layer of
encapsulation, there should only be at most one skb_checksum over
each packet (between GRO and normal path). In the case of two layers
of encapsulation, it is still possible with the right combination of
non-zero and zero UDP checksums to have >1 skb_checksum. For instance:
IP>GRE(with csum)>IP>UDP(zero csum)>VXLAN>IP>UDP(non-zero csum),
would likely necessitate an skb_checksum in GRO and normal path.
This doesn't seem like a common scenario at all so I'm inclined to
not address this now; if multiple layers of encapsulation become
popular we can reassess.
Note that checksum conversion shows a nice improvement for RX VXLAN when
outer UDP checksum is enabled (12.65% CPU compared to 20.94%). This
is not only from the fact that we don't need checksum calculation on
the host, but also allows GRO for VXLAN in this case. Checksum
conversion does not help send side (which still needs to perform
a checksum on host). For that we will implement remote checksum offload
in a later patch
(http://tools.ietf.org/html/draft-herbert-remotecsumoffload-00).
Please review carefully and test if possible, mucking with basic
checksum functions is always a little precarious :-)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 31 Aug 2014 22:12:44 +0000 (15:12 -0700)]
gre: Add support for checksum unnecessary conversions
Call skb_checksum_try_convert and skb_gro_checksum_try_convert
after checksum is found present and validated in the GRE header
for normal and GRO paths respectively.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 31 Aug 2014 22:12:43 +0000 (15:12 -0700)]
udp: Add support for doing checksum unnecessary conversion
Add support for doing CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE
conversion in UDP tunneling path.
In the normal UDP path, we call skb_checksum_try_convert after locating
the UDP socket. The check is that checksum conversion is enabled for
the socket (new flag in UDP socket) and that checksum field is
non-zero.
In the UDP GRO path, we call skb_gro_checksum_try_convert after
checksum is validated and checksum field is non-zero. Since this is
already in GRO we assume that checksum conversion is always wanted.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 31 Aug 2014 22:12:42 +0000 (15:12 -0700)]
net: Infrastructure for checksum unnecessary conversions
For normal path, added skb_checksum_try_convert which is called
to attempt to convert CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE. The
primary condition to allow this is that ip_summed is CHECKSUM_NONE
and csum_valid is true, which will be the state after consuming
a CHECKSUM_UNNECESSARY.
For GRO path, added skb_gro_checksum_try_convert which is the GRO
analogue of skb_checksum_try_convert. The primary condition to allow
this is that NAPI_GRO_CB(skb)->csum_cnt == 0 and
NAPI_GRO_CB(skb)->csum_valid is set. This implies that we have consumed
all available CHECKSUM_UNNECESSARY checksums in the GRO path.
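The normal-path condition can be modelled in a few lines of user-space C. The enum, `struct pkt_model`, and `try_convert()` are hypothetical stand-ins, and the caller-supplied `complete_csum` value is an assumption; the kernel derives the stored checksum itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum csum_state { CSUM_NONE, CSUM_UNNECESSARY, CSUM_COMPLETE };

/* Hypothetical packet model carrying only the checksum-related state. */
struct pkt_model {
	enum csum_state ip_summed;
	bool csum_valid;
	unsigned int csum;	/* meaningful only for CSUM_COMPLETE */
};

/*
 * Convert to CSUM_COMPLETE only in the state left behind after a
 * CHECKSUM_UNNECESSARY has been consumed: ip_summed is CSUM_NONE and
 * csum_valid is set. Returns true if the conversion happened.
 */
static bool try_convert(struct pkt_model *p, unsigned int complete_csum)
{
	assert(p != NULL);
	if (p->ip_summed != CSUM_NONE || !p->csum_valid)
		return false;
	p->csum = complete_csum;
	p->ip_summed = CSUM_COMPLETE;
	return true;
}
```

A packet that still reports CSUM_UNNECESSARY, or one whose checksum was never validated, is left untouched.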
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 31 Aug 2014 22:12:41 +0000 (15:12 -0700)]
net: Support for csum_bad in skbuff
This flag indicates that an invalid checksum was detected in the
packet. __skb_mark_checksum_bad helper function was added to set this.
Checksums can be marked bad from a driver or the GRO path (the latter
is implemented in this patch). csum_bad is checked in
__skb_checksum_validate_complete (i.e. calling that when ip_summed ==
CHECKSUM_NONE).
csum_bad works in conjunction with ip_summed value. In the case that
ip_summed is CHECKSUM_NONE and csum_bad is set, this implies that the
first (or next) checksum encountered in the packet is bad. When
ip_summed is CHECKSUM_UNNECESSARY, the first checksum after the last
one validated is bad. For example, if ip_summed == CHECKSUM_UNNECESSARY,
csum_level == 1, and csum_bad is set, then the third checksum in the
packet is bad. In the normal path, the packet will be dropped when
processing the protocol layer of the bad checksum:
__skb_decr_checksum_unnecessary called twice for the good checksums
changing ip_summed to CHECKSUM_NONE so that
__skb_checksum_validate_complete is called to validate the third
checksum and that will fail since csum_bad is set.
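The example above can be traced with a small user-space model. All names here (`struct pkt_model`, `decr_unnecessary()`, `first_bad_checksum()`) are hypothetical, and the model assumes that any software-validated checksum passes unless csum_bad is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum csum_state { CSUM_NONE, CSUM_UNNECESSARY };

struct pkt_model {
	enum csum_state ip_summed;
	unsigned int csum_level;	/* device-validated checksums minus one */
	bool csum_bad;
};

/* Consume one device-validated checksum. */
static void decr_unnecessary(struct pkt_model *p)
{
	if (p->ip_summed != CSUM_UNNECESSARY)
		return;
	if (p->csum_level == 0)
		p->ip_summed = CSUM_NONE;	/* last covered checksum used up */
	else
		p->csum_level--;
}

/*
 * Walk nr_csums checksums in order and return the 1-based index of the
 * first one that fails, or 0 if none fails. A checksum fails here only
 * once ip_summed has dropped to CSUM_NONE and csum_bad is set.
 */
static unsigned int first_bad_checksum(const struct pkt_model *pkt,
				       unsigned int nr_csums)
{
	struct pkt_model p;
	unsigned int i;

	assert(pkt != NULL);
	p = *pkt;	/* work on a copy so the caller's state is kept */
	for (i = 1; i <= nr_csums; i++) {
		if (p.ip_summed == CSUM_UNNECESSARY) {
			decr_unnecessary(&p);	/* covered by the device */
			continue;
		}
		if (p.csum_bad)
			return i;
		/* Otherwise software validation runs; assumed to pass. */
	}
	return 0;
}
```

Starting from CSUM_UNNECESSARY with csum_level == 1 and csum_bad set, the first two checksums are consumed and the third one is reported as bad.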
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/phy/mdio-bcm-unimac.c:195:37-38: unimac_mdio_ids is not NULL
terminated at line 195
Make sure of_device_id tables are NULL terminated
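The reason the terminator matters is that table walkers stop at the first empty entry rather than at a known length. A user-space sketch, with a hypothetical `struct id_entry` standing in for struct of_device_id and made-up compatible strings:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct of_device_id. */
struct id_entry {
	const char *compatible;
};

static const struct id_entry demo_ids[] = {
	{ .compatible = "vendor,device-a" },
	{ .compatible = "vendor,device-b" },
	{ .compatible = NULL },	/* sentinel: marks the end of the table */
};

/* Return 1 if compat appears in the sentinel-terminated table, else 0. */
static int id_table_match(const struct id_entry *tbl, const char *compat)
{
	assert(tbl != NULL && compat != NULL);
	for (; tbl->compatible; tbl++)
		if (strcmp(tbl->compatible, compat) == 0)
			return 1;
	return 0;
}
```

Without the sentinel entry, a lookup for an absent string would keep reading past the end of the array.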
Generated by: scripts/coccinelle/misc/of_table.cocci
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 29 Aug 2014 19:42:07 +0000 (12:42 -0700)]
net: dsa: make dsa_pack_type static
net/dsa/dsa.c:624:20: sparse: symbol 'dsa_pack_type' was not declared.
Should it be static?
Fixes: 3e8a72d1dae374 ("net: dsa: reduce number of protocol hooks") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bonding: add slave_changelink support and use it for queue_id
This patch adds support for slave_changelink to the bonding and uses it
to give the ability to change the queue_id of the enslaved devices via
netlink. It sets slave_maxtype and uses bond_changelink as a prototype for
bond_slave_changelink.
Example/test command after the iproute2 patch:
ip link set eth0 type bond_slave queue_id 10
CC: David S. Miller <davem@davemloft.net> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Suggested-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 28 Aug 2014 22:11:03 +0000 (15:11 -0700)]
net: systemport: tell RXCHK if we are using Broadcom tags
When Broadcom tags are enabled, e.g: when interfaced to an Ethernet
switch, make sure that we tell the RXCHK engine that it should be
expecting a 4-byte Broadcom tag after the Ethernet MAC Source Address.
Use netdev_uses_dsa() to check for that condition since that will tell
us if a switch is attached to our network interface.
Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
pktgen: add flag NO_TIMESTAMP to disable timestamping
When testing the TX limits of the stack, it is useful to be able
to disable the do_gettimeofday() timestamping on every packet.
This implements a pktgen flag NO_TIMESTAMP which will disable this
call to do_gettimeofday().
The performance change on my system (E5-2695) with skb_clone=0 goes
from TX 2,423,751 pps to 2,567,165 pps with flag NO_TIMESTAMP. Thus,
the cost of do_gettimeofday(), i.e. the saving, is approx 23 nanosec.
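The 23 nanosecond figure follows from converting both rates to time per packet and subtracting. A minimal sketch of that arithmetic, with hypothetical helper names:

```c
#include <assert.h>

/* Time spent per packet, in nanoseconds, at a given packets-per-second rate. */
static double ns_per_packet(double pps)
{
	assert(pps > 0.0);
	return 1e9 / pps;
}

/* Per-packet saving in nanoseconds when the rate rises from slow_pps to fast_pps. */
static double ns_saved_per_packet(double slow_pps, double fast_pps)
{
	return ns_per_packet(slow_pps) - ns_per_packet(fast_pps);
}
```

For the rates quoted above, this is roughly 412.6 ns minus 389.5 ns per packet, which is about 23 ns.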
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Erik Hugne [Thu, 28 Aug 2014 07:08:47 +0000 (09:08 +0200)]
tipc: add name distributor resiliency queue
TIPC name table updates are distributed asynchronously in a cluster,
entailing a risk of certain race conditions. E.g., if two nodes
simultaneously issue conflicting (overlapping) publications, this may
not be detected until both publications have reached a third node, in
which case one of the publications will be silently dropped on that
node. Hence, we end up with an inconsistent name table.
In most cases this conflict is just a temporary race, e.g., one
node is issuing a publication under the assumption that a previous,
conflicting, publication has already been withdrawn by the other node.
However, because of the (rtt related) distributed update delay, this
may not yet hold true on all nodes. The symptom of this failure is a
syslog message: "tipc: Cannot publish {%u,%u,%u}, overlap error".
In this commit we add a resiliency queue at the receiving end of
the name table distributor. When insertion of an arriving publication
fails, we retain it in this queue for a short amount of time, assuming
that another update will arrive very soon and clear the conflict. If so
happens, we insert the publication, otherwise we drop it.
The (configurable) retention value defaults to 2000 ms. Knowing from
experience that the situation described above is extremely rare, there
is no risk that the queue will accumulate any large number of items.
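The retry-or-drop decision for one deferred publication can be sketched as follows. This is a simplified user-space model: the struct, the enum, and `retry_deferred()` are hypothetical, time is passed in as plain milliseconds, and the order of the checks is an assumption rather than the TIPC implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RETENTION_MS 2000	/* default retention value from the text */

struct deferred_pub {
	unsigned long queued_at_ms;	/* when the insert first failed */
	bool pending;			/* still waiting in the queue */
};

enum retry_result { RETRY_INSERTED, RETRY_KEEP, RETRY_DROPPED };

/*
 * Decide what to do with one deferred publication at time now_ms.
 * insert_ok says whether a fresh insert attempt succeeded.
 */
static enum retry_result retry_deferred(struct deferred_pub *d,
					unsigned long now_ms, bool insert_ok)
{
	assert(d != NULL && d->pending);
	if (insert_ok) {
		d->pending = false;
		return RETRY_INSERTED;	/* the conflict cleared in time */
	}
	if (now_ms - d->queued_at_ms >= RETENTION_MS) {
		d->pending = false;
		return RETRY_DROPPED;	/* retention expired, give up */
	}
	return RETRY_KEEP;		/* try again on the next update */
}
```

An entry queued at t = 1000 ms is kept at t = 1500 ms and dropped at t = 3000 ms if the conflict never clears.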
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Erik Hugne [Thu, 28 Aug 2014 07:08:46 +0000 (09:08 +0200)]
tipc: refactor name table updates out of named packet receive routine
We need to perform the same actions when processing deferred name
table updates, so this functionality is moved to a separate
function.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Thu, 28 Aug 2014 02:24:18 +0000 (10:24 +0800)]
r8152: reduce the number of Tx
Because the Tx has the features of stopping queue and aggregation,
we don't need many tx buffers. Change the tx number from 10 to 4
to reduce the usage of the memory. This could save 16K * 6 bytes
memory.
Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 2 Sep 2014 00:40:01 +0000 (17:40 -0700)]
Merge branch 'xmit_list'
David Miller says:
====================
net: Make dev_hard_start_xmit() work fundamentally on lists
After this patch set, dev_hard_start_xmit() will work fundamentally on
any and all SKB lists.
This opens the path for a clean implementation of pulling multiple
packets out during qdisc_restart(), and then passing that blob in one
shot to dev_hard_start_xmit().
There were two main architectural blockers to this:
1) The GSO handling, we kept the original GSO head SKB around simply
because dev_hard_start_xmit() had no way to communicate to the
caller how far into the segmented list it was able to go. Now it
can, so the head GSO can be liberated immediately.
All of the special GSO head SKB destructor et al. handling goes
away too.
2) Validation of VLAN, CSUM, and segmentation characteristics was being
performed inside of dev_hard_start_xmit(). If we want to truly batch,
we have to let the higher levels do this. In particular, this is
now dequeue_skb()'s job.
And with those two issues out of the way, it should now be trivial to
build experiments on top of this patch set, all of the framework
should be there now. You could do something as simple as:
	skb = q->dequeue(q);
	if (skb)
		skb = validate_xmit_skb(skb, qdisc_dev(q));
	if (skb) {
		struct sk_buff *new, *head = skb;
		int limit = 5;
		do {
			new = q->dequeue(q);
			if (new)
				new = validate_xmit_skb(new, qdisc_dev(q));
			if (new) {
				skb->next = new;
				skb = new;
			}
		} while (new && --limit);
		skb = head;
	}
inside of the else branch of dequeue_skb().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 30 Aug 2014 04:19:14 +0000 (21:19 -0700)]
net: Move main gso loop out of dev_hard_start_xmit() into helper.
There is a slight policy change happening here as well.
The previous code would drop the entire rest of the GSO skb if any of
them got, for example, a congestion notification.
That makes no sense, anything NET_XMIT_MASK and below is something
like congestion or policing. And in the congestion case it doesn't
even mean the packet was actually dropped.
Just continue until dev_xmit_complete() evaluates to false.
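The new policy can be illustrated with a user-space model of the segment loop. The constant values below mirror the kernel's netdevice.h definitions as best recalled and should be treated as assumptions; `xmit_complete()` and `xmit_segments()` are hypothetical names, and the per-segment return codes are supplied by the caller instead of a real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match the kernel's values; verify against netdevice.h. */
#define NETDEV_TX_OK	0x00
#define NET_XMIT_DROP	0x01
#define NET_XMIT_CN	0x02
#define NET_XMIT_MASK	0x0f
#define NETDEV_TX_BUSY	0x10

/* True when the segment was consumed, even if congestion was signalled. */
static bool xmit_complete(int rc)
{
	return rc < NET_XMIT_MASK;
}

/*
 * Hand over segments in order until one is not consumed.
 * rcs[i] is the return code the driver would give for segment i.
 * Returns the number of segments consumed.
 */
static size_t xmit_segments(const int *rcs, size_t n)
{
	size_t sent = 0;

	assert(rcs != NULL || n == 0);
	while (sent < n && xmit_complete(rcs[sent]))
		sent++;
	return sent;
}
```

A congestion notification on the second segment no longer stops the loop; only a code such as NETDEV_TX_BUSY does.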
Signed-off-by: David S. Miller <davem@davemloft.net>
Ley Foon Tan [Thu, 28 Aug 2014 04:59:46 +0000 (12:59 +0800)]
net: stmmac: fix warning from Sparse for socfpga
Warning:
drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c:122:41:
sparse: cast removes address space of expression
drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c:122:38:
sparse: incorrect type in assignment (different address spaces)
Signed-off-by: Ley Foon Tan <lftan@altera.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 30 Aug 2014 03:41:17 +0000 (20:41 -0700)]
Merge branch 'csums-next'
Tom Herbert says:
====================
net: Checksum offload changes - Part VI
I am working on overhauling RX checksum offload. Goals of this effort
are:
- Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
- Preserve CHECKSUM_COMPLETE through encapsulation layers
- Don't do skb_checksum more than once per packet
- Unify GRO and non-GRO csum verification as much as possible
- Unify the checksum functions (checksum_init)
- Simplify code
What is in this sixth patch set:
- Clarify the specific requirements of devices returning
CHECKSUM_UNNECESSARY (comments in skbuff.h).
- Add csum_level field to skbuff. This is used to express how
many checksums are covered by CHECKSUM_UNNECESSARY (stores n - 1).
- Change __skb_checksum_validate_needed to "consume" each checksum
as indicated by csum_level as layers of the packet are parsed.
- Remove skb_pop_rcv_encapsulation, no longer needed in the new
csum_level model.
- Allow GRO path to "consume" checksums provided in CHECKSUM_UNNECESSARY
and to report new verified checksums for use in normal path fallback.
- Add proper support to SCTP to accept CHECKSUM_UNNECESSARY to validate
header CRC.
- Modify drivers to set skb->csum_level instead of setting
skb->encapsulation to indicate validation of an encapsulated
checksum on receive.
v2:
Allocate a new 16 bits for flags in skbuff.
Please review carefully and test if possible, mucking with basic
checksum functions is always a little precarious :-)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Thu, 28 Aug 2014 04:27:06 +0000 (21:27 -0700)]
sctp: Change sctp to implement csum_levels
CHECKSUM_UNNECESSARY may be applied to the SCTP CRC so we need to
appropriately account for this by decrementing csum_level. This is
done by calling __skb_decr_checksum_unnecessary.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Thu, 28 Aug 2014 04:26:56 +0000 (21:26 -0700)]
net: Allow GRO to use and set levels of checksum unnecessary
Allow GRO path to "consume" checksums provided in CHECKSUM_UNNECESSARY
and to report new checksums verified for use in fallback to normal
path.
Change GRO checksum path to track csum_level using a csum_cnt field
in NAPI_GRO_CB. On GRO initialization, if ip_summed is
CHECKSUM_UNNECESSARY set NAPI_GRO_CB(skb)->csum_cnt to
skb->csum_level + 1. For each checksum verified, decrement
NAPI_GRO_CB(skb)->csum_cnt while it is greater than zero. If a checksum
is verified and NAPI_GRO_CB(skb)->csum_cnt == 0, we have verified a
deeper checksum than originally indicated in skbuff so increment
csum_level (or initialize to CHECKSUM_UNNECESSARY if ip_summed is
CHECKSUM_NONE or CHECKSUM_COMPLETE).
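The counting rules above can be followed in a small user-space model. `struct gro_model`, `gro_init()`, and `gro_checksum_verified()` are hypothetical names used only for this sketch; the real state lives in NAPI_GRO_CB(skb) and the skb.

```c
#include <assert.h>
#include <stddef.h>

enum csum_state { CSUM_NONE, CSUM_UNNECESSARY, CSUM_COMPLETE };

struct gro_model {
	enum csum_state ip_summed;
	unsigned int csum_level;	/* device-validated checksums minus one */
	unsigned int csum_cnt;		/* device-validated checksums not yet consumed */
};

/* GRO initialization: load the counter from the device's report. */
static void gro_init(struct gro_model *g)
{
	assert(g != NULL);
	g->csum_cnt = (g->ip_summed == CSUM_UNNECESSARY) ? g->csum_level + 1 : 0;
}

/* Called once for every checksum that GRO verifies. */
static void gro_checksum_verified(struct gro_model *g)
{
	assert(g != NULL);
	if (g->csum_cnt > 0) {
		g->csum_cnt--;		/* consume a device-validated checksum */
	} else if (g->ip_summed == CSUM_UNNECESSARY) {
		g->csum_level++;	/* verified deeper than the device reported */
	} else {
		/* CSUM_NONE or CSUM_COMPLETE: start reporting from level 0 */
		g->ip_summed = CSUM_UNNECESSARY;
		g->csum_level = 0;
	}
}
```

With csum_level == 1 on entry, the first two verifications only drain csum_cnt; a third raises csum_level to 2 for the normal-path fallback.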
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Thu, 28 Aug 2014 04:26:46 +0000 (21:26 -0700)]
net: Clarification of CHECKSUM_UNNECESSARY
This patch:
- Clarifies the specific requirements of devices returning
CHECKSUM_UNNECESSARY (comments in skbuff.h).
- Adds csum_level field to skbuff. This is used to express how
many checksums are covered by CHECKSUM_UNNECESSARY (stores n - 1).
This replaces the overloading of skb->encapsulation; that field
is now only used to indicate inner headers are valid.
- Changes __skb_checksum_validate_needed to "consume" each checksum
as indicated by csum_level as layers of the packet are parsed.
- Removes skb_pop_rcv_encapsulation, no longer needed in the new
csum_level model.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Thu, 28 Aug 2014 05:07:32 +0000 (08:07 +0300)]
bnx2x: Fix sparse warnings
This fixes a sparse warning introduced recently by commit eeed018cbfa30 ("bnx2x: Add timestamping and PTP hardware clock support"),
as well as another unrelated sparse endian issue.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rasmus Villemoes [Thu, 28 Aug 2014 11:44:34 +0000 (13:44 +0200)]
include/rxrpc/types.h: Remove unused header
The header file include/rxrpc/types.h does not seem to be used
anywhere. It was orphaned by 63b6be55 "[AF_RXRPC]: Delete the old
RxRPC code.". Remove it.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
Rasmus Villemoes [Thu, 28 Aug 2014 11:44:33 +0000 (13:44 +0200)]
include/linux/phonedev.h: Remove unused header
The header file include/linux/phonedev.h does not seem to be used
anywhere. It was orphaned by 7326446c "Staging: remove telephony
drivers". Remove it.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
Rasmus Villemoes [Thu, 28 Aug 2014 11:44:32 +0000 (13:44 +0200)]
include/linux/i82593.h: Remove unused header
The header file include/linux/i82593.h does not seem to be used
anywhere. It was orphaned by 8a594170 "drivers/net: delete intel
i825xx based znet notebook driver". Remove it.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
Rasmus Villemoes [Thu, 28 Aug 2014 11:44:31 +0000 (13:44 +0200)]
include/linux/cycx_x25.h: Remove unused header
The header file include/linux/cycx_x25.h does not seem to be used
anywhere. It was orphaned by 6fcdf4facb "wanrouter: delete now
orphaned header content, files/drivers". Remove it.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Thu, 28 Aug 2014 02:02:41 +0000 (10:02 +0800)]
tipc: fix a potential oops
Commit 6c9808ce09f7 ("tipc: remove port_lock") accidentally introduced
a potential bug: when tipc_sk_get() cannot find a tipc socket instance
(tsk) for the given reference number, tsk is set to NULL. We then jump
to the exit label, where tipc_sk_put() decrements the socket reference
counter through the tsk pointer. Because tsk is NULL at that point, an
oops may occur when the NULL pointer is dereferenced.
Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Erik Hugne <erik.hugne@ericsson.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
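A minimal userspace sketch of this bug class, and of one way to avoid
it, follows. All names here (struct sock_ref, sk_get(), sk_put(),
handle_ref()) are hypothetical stand-ins rather than the TIPC code, and
the actual patch may structure its exit path differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted object looked up by index. */
struct sock_ref {
	int refcnt;
};

/* Returns the object with its reference count raised, or NULL if not found. */
static struct sock_ref *sk_get(struct sock_ref *table, size_t n, size_t ref)
{
	if (ref >= n)
		return NULL;
	table[ref].refcnt++;
	return &table[ref];
}

/* Dereferences s, so it must never be called with NULL. */
static void sk_put(struct sock_ref *s)
{
	assert(s != NULL);
	s->refcnt--;
}

static int handle_ref(struct sock_ref *table, size_t n, size_t ref)
{
	struct sock_ref *s = sk_get(table, n, ref);
	int rc = -1;

	if (!s)
		goto exit;	/* lookup failed: there is no reference to drop */

	rc = 0;			/* ... work on s would go here ... */
	sk_put(s);		/* drop the reference only on the success path */
exit:
	return rc;
}
```

The point of the sketch is that the put is reached only when the get
succeeded, so the failure path never touches the NULL pointer.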
Florian Fainelli [Wed, 27 Aug 2014 18:44:33 +0000 (11:44 -0700)]
net: phy: properly report internal PHYs through sysfs
Internal PHYs may not have a valid PHY interface defined, which will
show up in sysfs as "". Add an explicit check of internal PHYs to report
their interface correctly.
Fixes: 3d055d8d1c24 ("net: phy: expose PHY device interface mode") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
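The explicit check can be sketched in userspace as below. The types,
the mode names and the "internal" label are assumptions made for
illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of interface modes; IFACE_NA means "not set". */
enum iface_mode { IFACE_NA, IFACE_MII, IFACE_RGMII };

struct phy {
	bool is_internal;
	enum iface_mode iface;
};

/* An unset mode maps to the empty string, as described above. */
static const char *iface_mode_name(enum iface_mode m)
{
	switch (m) {
	case IFACE_MII:
		return "mii";
	case IFACE_RGMII:
		return "rgmii";
	default:
		return "";
	}
}

/* String to report for the PHY's interface attribute. */
static const char *phy_sysfs_interface(const struct phy *p)
{
	assert(p != NULL);
	if (p->is_internal)
		return "internal";	/* assumed label for internal PHYs */
	return iface_mode_name(p->iface);
}
```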
David S. Miller [Sat, 30 Aug 2014 03:15:42 +0000 (20:15 -0700)]
Merge branch 'qlcnic-next'
Shahed Shaikh says:
====================
qlcnic: Feature addition and enhancements
This series contains the following feature additions and enhancements:
- Update Link speed and Port type information for 83xx series adapters
- Support 0x8830 device ID
- Support for Power on Self Test (POST) feature for 83xx
- Use usleep_range() instead of msleep() for values less than 20ms
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Wed, 27 Aug 2014 16:43:20 +0000 (12:43 -0400)]
qlcnic: Add support to run firmware POST
This patch adds support to run Power On Self Test (POST) for 83xx adapters.
POST can be run in 3 different speed modes:
i) Fast mode (takes about 690 ms)
ii) Medium mode (takes about 2930 ms)
iii) Slow mode (takes about 7500 ms)
To run POST, a firmware file named "83xx_post_fw.bin" must be present under
the /lib/firmware directory. The load_fw_file module parameter specifies the
POST operation and its speed mode.
load_fw_file = 2 : Fast mode
load_fw_file = 3 : Medium mode
load_fw_file = 4 : Slow mode
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
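The mapping above can be written down as a small lookup. This is only a
sketch: the enum and function names are hypothetical, and treating every
other load_fw_file value as "no POST" is an assumption, since the
message does not describe those values.

```c
#include <assert.h>

/* Hypothetical names for the three POST speed modes listed above. */
enum post_mode { POST_NONE, POST_FAST, POST_MEDIUM, POST_SLOW, POST_MODE_COUNT };

/* Map a load_fw_file value to a POST mode. */
static enum post_mode post_mode_from_param(int load_fw_file)
{
	switch (load_fw_file) {
	case 2:
		return POST_FAST;
	case 3:
		return POST_MEDIUM;
	case 4:
		return POST_SLOW;
	default:
		return POST_NONE;	/* assumption: other values do not run POST */
	}
}

/* Approximate run time in milliseconds, using the figures quoted above. */
static unsigned int post_approx_ms(enum post_mode mode)
{
	static const unsigned int approx_ms[POST_MODE_COUNT] = {
		0, 690, 2930, 7500
	};

	assert((unsigned int)mode < POST_MODE_COUNT);
	return approx_ms[mode];
}
```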
qlcnic: Use usleep_range() instead of msleep() for sleep less than 20ms
msleep() may sleep longer than the intended time for values below
20ms, so use usleep_range() instead of msleep(), as recommended.
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
qlcnic: Update Link speed and port type info for 83xx adapter
o Update the port type information
o Advertise correct link modes and autonegotiation
o Add support to change link speed
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 28 Aug 2014 21:19:38 +0000 (14:19 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-08-27
This series contains updates to i40e and i40evf.
Carolyn provides two patches, first changes the wording of the flow
director add/remove and asynchronous failure messages to include the
fd_id to try and add some way to track the operations on a given fd_id.
Second adds a check during handle_link_event for unqualified modules
when link is down and there is a module plugged in.
Anjali provides four patches to i40e/i40evf. First update flow director
messages so that a user can tell if a filter was added or deleted. Then
updates the ATR policy to not auto-disable ATR when we have errors in
programming. The disabling of ATR when we got programming errors was
buggy and was still adding new rules and causing continuous errors.
With this policy change, we flush instead when we see too many errors.
In addition she adds a flow director flush counter to ethtool to help
know how many times the interface had to flush and replay the flow
director filter table. Updates the driver to ignore a driver
perceived transmit hang if the number of descriptors pending is less
than 4, and instead log a stat when this situation happens. This is
because the queue progresses forward and the stack never experiences
a real hang in these situations.
Shannon provides three patches for i40e/i40evf, first enables the
l2tsel bit on receive queue contexts that are assigned to VFs so that
the VF can get the stripped VLAN tag. Then adds a max buffer size
parameter to the print helper to be sure the code knows when to stop.
Lastly, remove the complaint when removing the default MAC VLAN filter.
This was because old firmware had an incorrect MAC VLAN filter that
needed to be replaced at startup, and now newer firmware does not have
this problem. So now we only add the new filter if the removal
succeeded and no need to complain if the removal fails.
Ashish provides a change to vsi->num_queue_pairs to equal the number
that is configured by the VF. This limits the number of queues that
are enabled/disabled and fixes the mismatch case for when a VF
configures fewer queues than are allocated to it by the PF.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Tue, 26 Aug 2014 17:34:18 +0000 (19:34 +0200)]
ixgbe: flush when in xmit_more mode and under descriptor pressure
When xmit_more mode is being used and the ring is about to
become full or the stack has stopped the ring, enforce a tail
pointer write to the hw. Otherwise, we could risk a TX hang.
Code suggested by Alexander Duyck.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
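The rule described above can be modelled in userspace as follows. This
is a sketch under stated assumptions: struct tx_ring, its fields and
ring_xmit() are hypothetical, and the stop threshold is a made-up
parameter rather than the value the driver uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a transmit descriptor ring. */
struct tx_ring {
	unsigned int size;		/* total descriptors */
	unsigned int used;		/* descriptors queued, not yet completed */
	unsigned int sw_tail;		/* next slot the driver will fill */
	unsigned int hw_tail;		/* last tail value written to the device */
	unsigned int stop_thresh;	/* stop the ring below this many free slots */
	bool stopped;
};

/*
 * Queue one descriptor. Returns true if the tail pointer was written
 * to the (modelled) hardware. With xmit_more set the write is normally
 * deferred, but it is forced once the ring has been stopped, because
 * no later transmit call would arrive to perform it.
 */
static bool ring_xmit(struct tx_ring *r, bool xmit_more)
{
	assert(r->used < r->size);

	r->sw_tail = (r->sw_tail + 1) % r->size;
	r->used++;

	if (r->size - r->used < r->stop_thresh)
		r->stopped = true;

	if (!xmit_more || r->stopped) {
		r->hw_tail = r->sw_tail;
		return true;
	}
	return false;
}
```

Without the forced write, descriptors queued under xmit_more could stay
invisible to the device after the ring stops, which is the TX hang risk
the message refers to.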
David S. Miller [Thu, 28 Aug 2014 06:16:19 +0000 (23:16 -0700)]
Merge branch 'bcm7xxx'
Florian Fainelli says:
====================
Broadcom BCM7xxx PHY updates for new entries
Another week, another set of updates for the Broadcom BCM7xxx PHY driver. This
patch set cleans up the existing definitions, adds a macro to ease the addition
of future chips, and finally adds two new SoCs to the list of supported chips.
Resending since the first patch did not make it to the list, sorry about that.
Changes in v2:
- rephrased commit message for patch 1 to make it pass majordomo
capital triple X was rejected
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 26 Aug 2014 20:15:27 +0000 (13:15 -0700)]
net: phy: bcm7xxx: add BCM7250 and BCM7364 PHY entries
Add two new entries to the Broadcom BCM7xxx internal PHY driver for
BCM7250 and BCM7364 chips. Those chips share the usual 28nm process
Gigabit PHY sequence and require the same workarounds so far.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 26 Aug 2014 20:15:24 +0000 (13:15 -0700)]
net: phy: bcm7xxx: introduce helper macro
All 28nm Gigabit PHYs supported by the driver have the same
callbacks, the only differences being the 32-bit OUI and the name. Use
a macro to factor this out, making it easier to add new entries in the
future.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
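The idea of factoring shared callbacks into a macro can be sketched in
plain C as below. Everything here is hypothetical: the struct layout,
the macro name GPHY_28NM_ENTRY, the mask and the two IDs are invented
for illustration and are not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced driver table entry. */
struct phy_entry {
	unsigned int phy_id;
	unsigned int phy_id_mask;
	const char *name;
	int (*config_init)(void);
};

/* Callback shared by every entry of this family. */
static int gphy_28nm_config_init(void)
{
	return 0;
}

/* Only the ID and the name differ, so they are the only parameters. */
#define GPHY_28NM_ENTRY(_id, _name)			\
{							\
	.phy_id		= (_id),			\
	.phy_id_mask	= 0xfffffff0,			\
	.name		= (_name),			\
	.config_init	= gphy_28nm_config_init,	\
}

/* Made-up IDs; adding a chip becomes a one-line change. */
static const struct phy_entry phy_table[] = {
	GPHY_28NM_ENTRY(0x0badc0d0, "Example SoC A"),
	GPHY_28NM_ENTRY(0x0badc0e0, "Example SoC B"),
};

/* Find the entry whose masked ID matches, or NULL if none does. */
static const struct phy_entry *phy_lookup(const struct phy_entry *tbl,
					  size_t n, unsigned int id)
{
	size_t i;

	assert(tbl != NULL);
	for (i = 0; i < n; i++)
		if ((id & tbl[i].phy_id_mask) ==
		    (tbl[i].phy_id & tbl[i].phy_id_mask))
			return &tbl[i];
	return NULL;
}
```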