Taku Izumi [Thu, 9 Dec 2010 15:17:13 +0000 (15:17 +0000)]
bonding: add the debugfs facility to the bonding driver
This patch provides the debugfs facility to the bonding driver.
The "bonding" directory is created in the debugfs root and directories of
each bonding interface (like bond0, bond1...) are created in that.
# mount -t debugfs none /sys/kernel/debug
# ls /sys/kernel/debug/bonding
bond0 bond1
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Fri, 10 Dec 2010 12:02:33 +0000 (12:02 +0000)]
enic: Move enic port profile handling code to a new 802.1Qbh provisioning info type
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Mason [Fri, 10 Dec 2010 14:03:01 +0000 (14:03 +0000)]
vxge: independent interrupt moderation
Configure the workload clock register and TIM register for independent
interrupt moderation based on the individual vpath utilization instead
of common link utilization. This greatly improves latency.
Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Mason [Fri, 10 Dec 2010 14:03:00 +0000 (14:03 +0000)]
vxge: hotplug stall
When hot-unplugging a vxge adapter while running, the driver's remove
routine prints warning and then stalls the calling thread. This is due
to vxge_remove calling vxge_device_unregister to unregister the netdev
before calling flush_scheduled_work clear any pending work. Swapping
the order of these two functions resolves the issue.
Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Mason [Fri, 10 Dec 2010 14:02:59 +0000 (14:02 +0000)]
vxge: transmit timeout deadlock
Use a workqueue to handle the device reset during a transmit timeout, as
there can be a deadlock during bringup. Also, set the netif carrier off
before the watchdog reset is started to prevent the timeout from
reoccurring while still processing the first.
Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Mason [Fri, 10 Dec 2010 14:02:58 +0000 (14:02 +0000)]
vxge: use pci_request_region()
Only BAR0 is ever accessed, thus making the calls to pci_request_regions
overkill. Change calls of pci_request_regions to pci_request_region to
reduce the size of the mapped area.
Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Mason [Fri, 10 Dec 2010 14:02:57 +0000 (14:02 +0000)]
vxge: fix crash of VF when unloading PF
Calling pci_disable_sriov when unloading a SR-IOV physical function
driver from a host when a guest is using a virtual function from that
device can cause a host crash or VM crash. The crash is caused by the
virtual config space no longer being present when PF is removed (due to
the pci_disable_sriov). This can be avoided by not calling
pci_disable_sriov to disable the PCI space when shutting down the PF.
Each function in the X3100 operates independently and in this case will
operate properly in the absence of the PF.
Also, added improved logic in the detection of SR-IOV initialization.
Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 9 Dec 2010 17:38:11 +0000 (17:38 +0000)]
bridge: Use consistent NF_DROP returns in nf_pre_routing
The nf_pre_routing functions in bridging have collected two
distinct ways of returning NF_DROP over the years, inline and
via goto. There is no reason for preferring either one.
So this patch arbitrarily picks the inline variant and converts
the all the gotos.
Also removes a redundant comment.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Thu, 9 Dec 2010 12:10:25 +0000 (12:10 +0000)]
netdev: Use default implementation of ethtool_ops::get_link where possible
Various drivers are using implementations of ethtool_ops::get_link
that are equivalent to the default ethtool_op_get_link(). Change
them to use that instead.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Thu, 9 Dec 2010 12:08:35 +0000 (12:08 +0000)]
ethtool: Report link-down while interface is down
While an interface is down, many implementations of
ethtool_ops::get_link, including the default, ethtool_op_get_link(),
will report the last link state seen while the interface was up. In
general the current physical link state is not available if the
interface is down.
Define ETHTOOL_GLINK to reflect whether the interface *and* any
physical port have a working link, and consistently return 0 when the
interface is down.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Wed, 8 Dec 2010 13:54:03 +0000 (13:54 +0000)]
enic: Use VF mac set by IFLA_VF_MAC in port profile provisioning data
This patch adds support in enic 802.1Qbh port profile provisioning code
to use the mac address set by IFLA_VF_MAC. For now we handle this mac as a
special case for a VM mac address sent to us by libvirt. The VM mac address
is sent to the switch along with the rest of the port profile provisioning
data. This patch also adds calls to register and deregister the mac address
during port profile association/deassociation.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Wed, 8 Dec 2010 13:53:58 +0000 (13:53 +0000)]
enic: Add ndo_set_vf_mac support for enic dynamic devices
This patch implements the ndo_set_vf_mac netdev operation for enic
dynamic devices. It treats the mac address set by IFLA_VF_MAC as a
special case to use it in the port profile provisioning data.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Wed, 8 Dec 2010 13:19:58 +0000 (13:19 +0000)]
enic: Add ndo_set_rx_mode support for enic vnics
Add ndo_set_rx_mode support to register unicast and multicast
address filters for enic devices
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Junchang Wang [Wed, 8 Dec 2010 16:55:16 +0000 (16:55 +0000)]
pktgen: adding prefetchw() call
We know for sure pktgen is going to write skb->data right after
*_alloc_skb, causing unnecessary cache misses.
Idea is to add a prefetchw() call to prefetch the first cache line
indicated by skb->data. On systems with Adjacent Cache Line Prefetch,
it's probably two cache lines are prefetched.
With this patch, pktgen on Intel SR1625 server with two E5530
quad-core processors and a single ixgbe-based NIC went from 8.63Mpps
to 9.03Mpps, with 4.6% improvement.
Signed-off-by: Junchang Wang <junchangwang@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This driver now uses Generic Receive Offload, not the older LRO.
Change references to LRO in names and comments.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Willi [Wed, 8 Dec 2010 04:37:51 +0000 (04:37 +0000)]
xfrm: Traffic Flow Confidentiality for IPv6 ESP
Add TFC padding to all packets smaller than the boundary configured
on the xfrm state. If the boundary is larger than the PMTU, limit
padding to the PMTU.
Signed-off-by: Martin Willi <martin@strongswan.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Willi [Wed, 8 Dec 2010 04:37:50 +0000 (04:37 +0000)]
xfrm: Traffic Flow Confidentiality for IPv4 ESP
Add TFC padding to all packets smaller than the boundary configured
on the xfrm state. If the boundary is larger than the PMTU, limit
padding to the PMTU.
Signed-off-by: Martin Willi <martin@strongswan.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
The XFRMA_TFCPAD attribute for XFRM state installation configures
Traffic Flow Confidentiality by padding ESP packets to a specified
length.
Signed-off-by: Martin Willi <martin@strongswan.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Glauber [Wed, 8 Dec 2010 02:58:01 +0000 (02:58 +0000)]
qeth: buffer count imbalance
The used buffers counter is not incremented in case of an error so
the counter can become negative. Increment the used buffers counter
before checking for errors.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Einar Lueck [Wed, 8 Dec 2010 02:57:59 +0000 (02:57 +0000)]
qeth: support VIPA add/del in offline mode
Only work through the IP adddress to do list if the card is UP or
SOFTSETUP. Enables to configure VIPA add/del in offline mode.
Signed-off-by: Einar Lueck <elelueck@de.ibm.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Einar Lueck [Wed, 8 Dec 2010 02:57:58 +0000 (02:57 +0000)]
qeth: support ipv6 query arp cache for HiperSockets
Function qeth_l3_arp_query now queries for IPv6 addresses, too, if
QETH_QARP_WITH_IPV6 is passed as parameter to the ioctl. HiperSockets
and GuestLAN in HiperSockets mode provide corresponding entries.
Signed-off-by: Einar Lueck <elelueck@de.ibm.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Fri, 10 Dec 2010 03:18:04 +0000 (03:18 +0000)]
bridge: Fix return values of br_multicast_add_group/br_multicast_new_group
If br_multicast_new_group returns NULL, we would return 0 (no error) to
the caller of br_multicast_add_group, which is not what we want. Instead
br_multicast_new_group should return ERR_PTR(-ENOMEM) in this case.
Also propagate the error number returned by br_mdb_rehash properly.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Shan Wei [Fri, 10 Dec 2010 11:49:23 +0000 (12:49 +0100)]
dccp: remove unused macros
Remove macros which have been unused since the initial implementation
(commit 7c657876b63cb1d8a2ec06f8fc6c37bb8412e66c, [DCCP]: Initial
implementation from Tue Aug 9 20:14:34 2005 -0700).
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2x_src_init_t2() is used only when BCM_CNIC is defined.
So, to avoid a compilation warning, we won't define it unless
BCM_CNIC is defined.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2x: Use dma_alloc_coherent() semantics for ILT memory allocation
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Make the LSO code work on BE platforms: parsing_data field of
a parsing BD (PBD) for 57712 was improperly composed which made FW read wrong
values for TCP header's length and offset and, as a result, the corresponding
PCI device was performing bad DMA reads triggering EEH.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The current jhash.h implements the lookup2() hash function by Bob Jenkins.
However, lookup2() is outdated as Bob wrote a new hash function called
lookup3(). The patch replaces the lookup2() implementation of the 'jhash*'
functions with that of lookup3().
You can read a longer comparison of the two and other hash functions at
http://burtleburtle.net/bob/hash/doobs.html.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 30 Nov 2010 19:04:07 +0000 (19:04 +0000)]
net: optimize INET input path further
Followup of commit b178bb3dfc30 (net: reorder struct sock fields)
Optimize INET input path a bit further, by :
1) moving sk_refcnt close to sk_lock.
This reduces number of dirtied cache lines by one on 64bit arches (and
64 bytes cache line size).
2) moving inet_daddr & inet_rcv_saddr at the beginning of sk
(same cache line than hash / family / bound_dev_if / nulls_node)
This reduces number of accessed cache lines in lookups by one, and dont
increase size of inet and timewait socks.
inet and tw sockets now share same place-holder for these fields.
David S. Miller [Thu, 9 Dec 2010 05:16:57 +0000 (21:16 -0800)]
net: Abstract away all dst_entry metrics accesses.
Use helper functions to hide all direct accesses, especially writes,
to dst_entry metrics values.
This will allow us to:
1) More easily change how the metrics are stored.
2) Implement COW for metrics.
In particular this will help us put metrics into the inetpeer
cache if that is what we end up doing. We can make the _metrics
member a pointer instead of an array, initially have it point
at the read-only metrics in the FIB, and then on the first set
grab an inetpeer entry and point the _metrics member there.
Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Eric Dumazet [Tue, 7 Dec 2010 12:20:47 +0000 (12:20 +0000)]
tcp: protect sysctl_tcp_cookie_size reads
Make sure sysctl_tcp_cookie_size is read once in
tcp_cookie_size_check(), or we might return an illegal value to caller
if sysctl_tcp_cookie_size is changed by another cpu.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: William Allen Simpson <william.allen.simpson@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 7 Dec 2010 12:03:55 +0000 (12:03 +0000)]
tcp: avoid a possible divide by zero
sysctl_tcp_tso_win_divisor might be set to zero while one cpu runs in
tcp_tso_should_defer(). Make sure we dont allow a divide by zero by
reading sysctl_tcp_tso_win_divisor exactly once.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Breno Leitao [Wed, 8 Dec 2010 20:19:14 +0000 (12:19 -0800)]
ehea: Fixing LRO configuration
In order to set LRO on ehea, the user must set a module parameter, which
is not the standard way to do so. This patch adds a way to set LRO using
the ethtool tool.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 8 Dec 2010 20:16:33 +0000 (12:16 -0800)]
tcp: Replace time wait bucket msg by counter
Rather than printing the message to the log, use a mib counter to keep
track of the count of occurences of time wait bucket overflow. Reduces
spam in logs.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
x25 does not decrement the network device reference counts on module unload.
Thus unregistering any pre-existing interface after unloading the x25 module
hangs and results in
unregister_netdevice: waiting for tap0 to become free. Usage count = 1
This patch decrements the reference counts of all interfaces in x25_link_free,
the way it is already done in x25_link_device_down for NETDEV_DOWN events.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr> Signed-off-by: David S. Miller <davem@davemloft.net>
Regarding benet be_cmd_multicast_set() function, now using
netdev_for_each_mc_addr() helper for mac address copy, but
when copying to req->mac[] did not increase of the index.
Cc: Sathya Perla <sathyap@serverengines.com> Cc: Subbu Seetharaman <subbus@serverengines.com> Cc: Sarveshwar Bandi <sarveshwarb@serverengines.com> Cc: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: Joe Jin <joe.jin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Luethi [Mon, 6 Dec 2010 00:59:40 +0000 (00:59 +0000)]
via-rhine: hardware VLAN support
This patch adds VLAN hardware support for Rhine chips.
The driver uses up to 3 additional bytes of buffer space when extracting
802.1Q headers; PKT_BUF_SZ should still be sufficient.
The initial code was provided by David Lv. I reworked it to use standard
kernel facilities. Coding style clean up mostly follows via-velocity.
Adapted to new interface for VLAN acceleration (per request of Jesse Gross).
Signed-off-by: David Lv <DavidLv@viatech.com.cn> Signed-off-by: Roger Luethi <rl@hellgate.ch>
drivers/net/via-rhine.c | 326 +++++++++++++++++++++++++++++++++++++++++++++--
1 files changed, 312 insertions(+), 14 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 5 Dec 2010 01:23:53 +0000 (01:23 +0000)]
net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()
Le dimanche 05 décembre 2010 à 09:19 +0100, Eric Dumazet a écrit :
> Hmm..
>
> If somebody can explain why RTNL is held in arp_ioctl() (and therefore
> in arp_req_delete()), we might first remove RTNL use in arp_ioctl() so
> that your patch can be applied.
>
> Right now it is not good, because RTNL wont be necessarly held when you
> are going to call arp_invalidate() ?
While doing this analysis, I found a refcount bug in llc, I'll send a
patch for net-2.6
Meanwhile, here is the patch for net-next-2.6
Your patch then can be applied after mine.
Thanks
[PATCH] net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()
dev_getbyhwaddr() was called under RTNL.
Rename it to dev_getbyhwaddr_rcu() and change all its caller to now use
RCU locking instead of RTNL.
Change arp_ioctl() to use RCU instead of RTNL locking.
Note: this fix a dev refcount bug in llc
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 5 Dec 2010 02:03:26 +0000 (02:03 +0000)]
llc: fix a device refcount imbalance
Le dimanche 05 décembre 2010 à 12:23 +0100, Eric Dumazet a écrit :
> Le dimanche 05 décembre 2010 à 09:19 +0100, Eric Dumazet a écrit :
>
> > Hmm..
> >
> > If somebody can explain why RTNL is held in arp_ioctl() (and therefore
> > in arp_req_delete()), we might first remove RTNL use in arp_ioctl() so
> > that your patch can be applied.
> >
> > Right now it is not good, because RTNL wont be necessarly held when you
> > are going to call arp_invalidate() ?
>
> While doing this analysis, I found a refcount bug in llc, I'll send a
> patch for net-2.6
Oh well, of course I must first fix the bug in net-2.6, and wait David
pull the fix in net-next-2.6 before sending this rcu conversion.
Note: this patch should be sent to stable teams (2.6.34 and up)
[PATCH net-2.6] llc: fix a device refcount imbalance
commit abf9d537fea225 (llc: add support for SO_BINDTODEVICE) added one
refcount imbalance in llc_ui_bind(), because dev_getbyhwaddr() doesnt
take a reference on device, while dev_get_by_index() does.
Fix this using RCU locking. And since an RCU conversion will be done for
2.6.38 for dev_getbyhwaddr(), put the rcu_read_lock/unlock exactly at
their final place.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: stable@kernel.org Cc: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Changli Gao [Sat, 4 Dec 2010 14:09:08 +0000 (14:09 +0000)]
ifb: goto resched directly if error happens and dp->tq isn't empty
If we break the loop when there are still skbs in tq and no skb in
rq, the skbs will be left in txq until new skbs are enqueued into rq.
In rare cases, no new skb is queued, then these skbs will stay in rq
forever.
After this patch, if tq isn't empty when we break the loop, we goto
resched directly.
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
The bug has to do with boundary checks on the initial receive window.
If the initial receive window falls between init_cwnd and the
receive window specified by the user, the initial window is incorrectly
brought down to init_cwnd. The correct behavior is to allow it to
remain unchanged.
Signed-off-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the calculation of the inexact hash-based MAC address filter.
It's 64 bits but current code is missing a ULL. Results in filtering out
some legitimate packets.
Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Hartkopp [Thu, 2 Dec 2010 10:57:59 +0000 (10:57 +0000)]
can: add slcan driver for serial/USB-serial CAN adapters
This patch adds support for serial/USB-serial CAN adapters implementing the
LAWICEL ASCII protocol for CAN frame transport over serial lines.
The driver implements the SLCAN line discipline and is heavily based on the
slip.c driver. Therefore the code style remains similar to slip.c to be able
to apply changes of the SLIP driver to the SLCAN driver easily.
For more details see the slcan Kconfig entry.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Thu, 2 Dec 2010 02:45:08 +0000 (02:45 +0000)]
atm: lanai: use kernel's '%pM' format option to print MAC
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Chas Williams <chas@cmf.nrl.navy.mil> Cc: linux-atm-general@lists.sourceforge.net Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Tue, 7 Dec 2010 19:24:45 +0000 (19:24 +0000)]
sfc: Fix crash in legacy onterrupt handler during ring reallocation
If we are using a legacy interrupt, our IRQ may be shared and our
interrupt handler may be called even though interrupts are disabled on
the NIC. When we change ring sizes, we reallocate the event queue and
the interrupt handler may use an invalid pointer when called for
another device's interrupt.
Maintain a legacy_irq_enabled flag and test that at the top of the
interrupt handler. Note that this problem results from the need to
work around broken INT_ISR0 reads, and does not affect the legacy
interrupt handler for Falcon A1.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 7 Dec 2010 19:11:26 +0000 (19:11 +0000)]
sfc: Generalise filter spec initialisation
Move search_depth arrays into per-table state.
Define initialisation function efx_filter_init_rx() which sets
everything apart from the match fields.
Define efx_filter_set_{ipv4_local,ipv4_full,eth_local}() to set the
match fields. This allows some simplification of callers and later
support for additional protocols and more flexible matching using
multiple calls to these functions.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 7 Dec 2010 19:02:27 +0000 (19:02 +0000)]
sfc: Remove filter table IDs from filter functions
The separation between filter tables is largely an internal detail
and it may be removed in future hardware. To prepare for that:
- Merge table ID with filter index to make an opaque filter ID
- Wrap efx_filter_table_clear() with a function that clears filters
from both RX tables, which is all that the current caller requires
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 7 Dec 2010 18:29:52 +0000 (18:29 +0000)]
sfc: Log start and end of ethtool self-test at INFO level
Add message at start of self-test and increase log level of message at
end of self-test, so that any other messages produced during the
test are clearly associated with it.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
This patch adds a generic infrastructure for policy-based dequeueing of
TX packets and provides two policies:
* a simple FIFO policy (which is the default) and
* a priority based policy (set via socket options).
Both policies honour the tx_qlen sysctl for the maximum size of the write
queue (can be overridden via socket options).
The priority policy uses skb->priority internally to assign an u32 priority
identifier, using the same ranking as SO_PRIORITY. The skb->priority field
is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
data using cmsg(3), the patch also provides the requisite parsing routines.
Signed-off-by: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Ben Hutchings [Mon, 15 Nov 2010 23:53:11 +0000 (23:53 +0000)]
sfc: Use TX push whenever adding descriptors to an empty queue
Whenever we add DMA descriptors to a TX ring and update the ring
pointer, the TX DMA engine must first read the new DMA descriptors and
then start reading packet data. However, all released Solarflare 10G
controllers have a 'TX push' feature that allows us to reduce latency
by writing the first new DMA descriptor along with the pointer update.
This is only useful when the queue is empty. The hardware should
ignore the pushed descriptor if the queue is not empty, but this check
is buggy, so we must do it in software.
In order to tell whether a TX queue is empty, we need to compare the
previous transmission count (write_count) and completion count
(read_count). However, if we do that every time we update the ring
pointer then read_count may ping-pong between the caches of two CPUs
running the transmission and completion paths for the queue.
Therefore, we split the check for an empty queue between the
completion path and the transmission path:
- Add an empty_read_count field representing a point at which the
completion path saw the TX queue as empty.
- Add an old_write_count field for use on the completion path.
- On the completion path, whenever read_count reaches or passes
old_write_count the TX queue may be empty. We then read
write_count, set empty_read_count if read_count == write_count,
and update old_write_count.
- On the transmission path, we read empty_read_count. If it's set, we
compare it with the value of write_count before the current set of
descriptors was added. If they match, the queue really is empty and
we can use TX push.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Mon, 6 Dec 2010 22:58:41 +0000 (22:58 +0000)]
sfc: Remove locking from implementation of efx_writeo_paged()
It is not necessary to serialise writes to the paged 128-bit
registers. However, if we don't then we must always write the last
dword separately, not as part of a qword write.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Mon, 6 Dec 2010 22:53:15 +0000 (22:53 +0000)]
sfc: Reorder struct efx_nic to separate fields by volatility
Place the regularly updated fields (locks, MAC stats, etc.) on a
separate cache-line from fields which are mostly constant. This
should reduce cache misses for access to the latter on the data path.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Tobias Klauser [Thu, 2 Dec 2010 07:20:05 +0000 (07:20 +0000)]
net: am79c961a: Omit private ndo_get_stats function
am79c961_getstats() just returns dev->stats so we can leave it out
alltogether and let dev_get_stats() do the job.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>