Nick Nunley [Tue, 27 Apr 2010 13:10:03 +0000 (13:10 +0000)]
ixgb: use DMA API instead of PCI DMA functions
Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nick Nunley [Tue, 27 Apr 2010 13:09:44 +0000 (13:09 +0000)]
igbvf: use DMA API instead of PCI DMA functions
Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Tue, 27 Apr 2010 13:09:25 +0000 (13:09 +0000)]
igb: convert igb from using PCI DMA functions to using DMA API functions
This patch makes it so that igb now uses the DMA API functions instead of
the PCI API functions. To do this the pci_dev pointer that was in the
rings has been replaced with a device pointer, and as a result all
references to [tr]x_ring->pdev have been replaced with [tr]x_ring->dev.
This patch is based of of work originally done by Nicholas Nunley. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nick Nunley [Tue, 27 Apr 2010 13:09:05 +0000 (13:09 +0000)]
e1000e: use DMA API instead of PCI DMA functions
Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nick Nunley [Tue, 27 Apr 2010 13:08:45 +0000 (13:08 +0000)]
e1000: use DMA API instead of PCI DMA functions
Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
doing it within the driver does not look good.
And surely isn't how platform devices were meat to be used.
Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
net: disallow to use net_assign_generic externally
Now there's no need to use this fuction directly because it's handled by
register_pernet_device. So to make this simple and easy to understand,
make this static to do not tempt potentional users.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4: parse the VPD instead of relying on a static VPD layout
Some boards' VPDs contain additional keywords or have longer serial numbers,
meaning the keyword locations are variable. Ditch the static layout and
use the pci_vpd_* family of functions to parse the VPD instead.
Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 27 Apr 2010 22:13:20 +0000 (15:13 -0700)]
net: sk_add_backlog() take rmem_alloc into account
Current socket backlog limit is not enough to really stop DDOS attacks,
because user thread spend many time to process a full backlog each
round, and user might crazy spin on socket lock.
We should add backlog size and receive_queue size (aka rmem_alloc) to
pace writers, and let user run without being slow down too much.
Introduce a sk_rcvqueues_full() helper, to avoid taking socket lock in
stress situations.
Under huge stress from a multiqueue/RPS enabled NIC, a single flow udp
receiver can now process ~200.000 pps (instead of ~100 pps before the
patch) on a 8 core machine.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: batch skb dequeueing from softnet input_pkt_queue
batch skb dequeueing from softnet input_pkt_queue to reduce potential lock
contention when RPS is enabled.
Note: in the worst case, the number of packets in a softnet_data may
be double of netdev_max_backlog.
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ixgbe: Properly display 1 gig downshift warning for backplane
Description: When using Intel smartspeed, the patch displays a
warning when the link down shifts to 1 Gig.
Signed-off-by: Anjali Singhai <anjali.singhai@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Don Skidmore [Tue, 27 Apr 2010 11:31:06 +0000 (11:31 +0000)]
ixgbe: cleanup ethtool autoneg input
The way we were setting autoneg via ethtool was inconstant with that
of our other drivers. It will change the following:
If autoneg is off:
>ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: off
RX: off
TX: off
Before:
>ethtool -A eth0 autoneg on
>ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: off
RX: off
TX: off
Now:
>ethtool -A eth0 autoneg on
>ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: on
RX: on
TX: on
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Rose [Tue, 27 Apr 2010 11:31:45 +0000 (11:31 +0000)]
ixgbevf: Fix link speed display
The ixgbevf driver would always report 10Gig speeds even when the link
speed is downshifted to 1Gig. This patch fixes that problem.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Tue, 27 Apr 2010 00:50:58 +0000 (00:50 +0000)]
ixgb: Use pr_<level> and netdev_<level>
Convert DEBUGOUTx to pr_debug
Convert DEBUGFUNC to more commonly used ENTER
Convert mac address output to %pM
Use #define pr_fmt
Convert a few printks to pr_<level>
Improve ixgb_mc_addr_list_update: use a temporary for current mc address
Use etherdevice.h functions for mac address testing
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Tue, 27 Apr 2010 02:13:39 +0000 (02:13 +0000)]
ixgbe: ixgbe_down needs to stop dev_watchdog
There is a small race between when the tx queues are stopped
and when netif_carrier_off() is called in ixgbe_down. If the
dev_watchdog() timer fires during this time it is possible for
a false tx timeout to occur.
This patch moves the netif_carrier_off() so that it is called before
the tx queues are stopped preventing the dev_watchdog timer from
detecting false tx timeouts. The race is seen occosionally when
FCoE or DCB settings are being configured or changed.
Testing note, running ifconfig up/down will not reproduce this
issue because dev_open/dev_close call dev_deactivate() and then
dev_activate().
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Tue, 27 Apr 2010 01:02:40 +0000 (01:02 +0000)]
igb: add support for reporting 5GT/s during probe on PCIe Gen2
This change corrects the fact that we were not reporting Gen2 link speeds
when we were in fact connected at Gen2 rates.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
writebacks can be held indefinitely by hardware if EITR=0, when
combined with TXDCTL.WTHRESH=8. When EITR=0, WTHRESH should be
set back to zero.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
82598/82599 can support EITR == 0, which allows for the
absolutely lowest latency setting in the hardware. This disables
writeback batching and anything else that relies upon a delayed
interrupt. This patch enables the feature of "override" when a
user sets rx-usecs to zero, the driver will respect that setting
over using RSC, and automatically disable RSC. If rx-usecs is
used to set the EITR value to 0, then the driver should disable
LRO (aka RSC) internally until EITR is set to non-zero again.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Koki Sanagi [Tue, 27 Apr 2010 01:01:39 +0000 (01:01 +0000)]
igbvf: double increment nr_frags
There is no need to increment nr_frags because skb_fill_page_desc increments
it.
Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Koki Sanagi [Tue, 27 Apr 2010 01:01:19 +0000 (01:01 +0000)]
igb: double increment nr_frags
There is no need to increment nr_frags because skb_fill_page_desc increments
it.
Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 26 Apr 2010 20:40:43 +0000 (20:40 +0000)]
net: fix a lockdep rcu warning in __sk_dst_set()
__sk_dst_set() might be called while no state can be integrated in a
rcu_dereference_check() condition.
So use rcu_dereference_raw() to shutup lockdep warnings (if
CONFIG_PROVE_RCU is set)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
TCP: avoid to send keepalive probes if receiving data
RFC 1122 says the following:
...
Keep-alive packets MUST only be sent when no data or
acknowledgement packets have been received for the
connection within an interval.
...
The acknowledgement packet is reseting the keepalive
timer but the data packet isn't. This patch fixes it by
checking the timestamp of the last received data packet
too when the keepalive timer expires.
Signed-off-by: Flavio Leitner <fleitner@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
I prefer that the hlist be only accessed through the hlist macro
objects. Explicit twiddling of links (especially with RCU) exposes
the code to future bugs.
Compile tested only.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul E. McKenney [Tue, 27 Apr 2010 06:22:01 +0000 (06:22 +0000)]
net: suppress RCU lockdep false positive in twsk_net()
Calls to twsk_net() are in some cases protected by reference counting
as an alternative to RCU protection. Cases covered by reference counts
include __inet_twsk_kill(), inet_twsk_free(), inet_twdr_do_twkill_work(),
inet_twdr_twcal_tick(), and tcp_timewait_state_process(). RCU is used
by inet_twsk_purge(). Locking is used by established_get_first()
and established_get_next(). Finally, __inet_twsk_hashdance() is an
initialization case.
It appears to be non-trivial to locate the appropriate locks and
reference counts from within twsk_net(), so used rcu_dereference_raw().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andre Detsch [Mon, 26 Apr 2010 05:38:27 +0000 (05:38 +0000)]
cxgb3: Wait longer for control packets on initialization
In some Power7 platforms, when using VIOS (Virtual I/O Server), we
need to wait longer for control packets to finish transfer during
initialization.
Without this change, initialization may fail prematurely.
Signed-off-by: Wen Xiong <wenxiong@us.ibm.com> Signed-off-by: Andre Detsch <adetsch@br.ibm.com> Acked-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Tue, 27 Apr 2010 03:33:04 +0000 (03:33 +0000)]
e1000e: enable/disable ASPM L0s and L1 and ERT according to hardware errata
Prompted by a previous patch submitted by Matthew Garret <mjg@redhat.com>,
further digging into errata documentation reveals the current enabling or
disabling of ASPM L0s and L1 states for certain parts supported by this
driver are incorrect. 82571 and 82572 should always disable L1. For
standard frames, 82573/82574/82583 can enable L1 but L0s must be disabled,
and for jumbo frames 82573/82574 must disable L1. This allows for some
parts to enable L1 in certain configurations leading to better power
savings.
Also according to the same errata, Early Receive (ERT) should be disabled
on 82573 when using jumbo frames.
Cc: Matthew Garret <mjg@redhat.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Waskiewicz [Tue, 27 Apr 2010 00:38:15 +0000 (00:38 +0000)]
ixgbe: Power down PHY during driver resets
The PHY laser is still on during driver init. It's allowing
garbage to hit our FIFO, which eventually can cause the entire
device to die. Power down the laser while setting up the device,
and re-enable the laser before getting link.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 27 Apr 2010 17:16:54 +0000 (10:16 -0700)]
bridge: Fix build of ipv6 multicast code.
Based upon a report from Stephen Rothwell:
--------------------
net/bridge/br_multicast.c: In function 'br_ip6_multicast_alloc_query':
net/bridge/br_multicast.c:469: error: implicit declaration of function 'csum_ipv6_magic'
bridge br_multicast: Ensure to initialize BR_INPUT_SKB_CB(skb)->mrouters_only.
Even with commit 32dec5dd0233ebffa9cae25ce7ba6daeb7df4467 ("bridge
br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only
without IGMP snooping."), BR_INPUT_SKB_CB(skb)->mrouters_only is
not appropriately initialized if IGMP/MLD snooping support is
compiled and disabled, so we can see garbage.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
bridge br_multicast: Ensure to initialize BR_INPUT_SKB_CB(skb)->mrouters_only.
Even with commit 32dec5dd0233ebffa9cae25ce7ba6daeb7df4467 ("bridge
br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only
without IGMP snooping."), BR_INPUT_SKB_CB(skb)->mrouters_only is
not appropriately initialized if IGMP snooping support is
compiled and disabled, so we can see garbage.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Schmidt [Mon, 26 Apr 2010 18:20:32 +0000 (11:20 -0700)]
ieee802154: Fix oops during ieee802154_sock_ioctl
Trying to run izlisten (from lowpan-tools tests) on a device that does not
exists I got the oops below. The problem is that we are using get_dev_by_name
without checking if we really get a device back. We don't in this case and
writing to dev->type generates this oops.
[Oops code removed by Dmitry Eremin-Solenikov]
If possible this patch should be applied to the current -rc fixes branch.
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andre Detsch [Mon, 26 Apr 2010 07:27:07 +0000 (07:27 +0000)]
tg3: Fix INTx fallback when MSI fails
tg3: Fix INTx fallback when MSI fails
MSI setup changes the value of irq_vec in struct tg3 *tp.
This attribute must be taken into account and restored before
we try to do a new request_irq for INTx fallback.
In powerpc, the original code was leading to an EINVAL return within
request_irq, because the driver was trying to use the disabled MSI
virtual irq number instead of tp->pdev->irq.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Mon, 26 Apr 2010 14:02:08 +0000 (16:02 +0200)]
net: ipmr: add support for dumping routing tables over netlink
The ipmr /proc interface (ip_mr_cache) can't be extended to dump routes
from any tables but the main table in a backwards compatible fashion since
the output format ends in a variable amount of output interfaces.
Introduce a new netlink interface to dump multicast routes from all tables,
similar to the netlink interface for regular routes.
Patrick McHardy [Mon, 26 Apr 2010 14:02:05 +0000 (16:02 +0200)]
net: rtnetlink: decouple rtnetlink address families from real address families
Decouple rtnetlink address families from real address families in socket.h to
be able to add rtnetlink interfaces to code that is not a real address family
without increasing AF_MAX/NPROTO.
This will be used to add support for multicast route dumping from all tables
as the proc interface can't be extended to support anything but the main table
without breaking compatibility.
This partialy undoes the patch to introduce independant families for routing
rules and converts ipmr routing rules to a new rtnetlink family. Similar to
that patch, values up to 127 are reserved for real address families, values
above that may be used arbitrarily.
Patrick McHardy [Mon, 26 Apr 2010 14:02:04 +0000 (16:02 +0200)]
net: fib_rules: mark arguments to fib_rules_register const and __net_initdata
fib_rules_register() duplicates the template passed to it without modification,
mark the argument as const. Additionally the templates are only needed when
instantiating a new namespace, so mark them as __net_initdata, which means
they can be discarded when CONFIG_NET_NS=n.
Eric Dumazet [Sun, 25 Apr 2010 22:09:42 +0000 (15:09 -0700)]
ipv6: Fix inet6_csk_bind_conflict()
Commit fda48a0d7a84 (tcp: bind() fix when many ports are bound)
introduced a bug on IPV6 part.
We should not call ipv6_addr_any(inet6_rcv_saddr(sk2)) but
ipv6_addr_any(inet6_rcv_saddr(sk)) because sk2 can be IPV4, while sk is
IPV6.
Reported-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sky2 hardware supports hardware receive hash calculation.
Now that Receive Packet Steering is available, add support
to enable it.
This version does not depend on CONFIG_RPS. Also set_flags rejects
all values except RXHASH, so driver won't have to change next time
somebody adds a new one.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Brian Haley [Fri, 23 Apr 2010 11:26:09 +0000 (11:26 +0000)]
IPv6: Complete IPV6_DONTFRAG support
Finally add support to detect a local IPV6_DONTFRAG event
and return the relevant data to the user if they've enabled
IPV6_RECVPATHMTU on the socket. The next recvmsg() will
return no data, but have an IPV6_PATHMTU as ancillary data.
Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Brian Haley [Fri, 23 Apr 2010 11:26:07 +0000 (11:26 +0000)]
IPv6: data structure changes for new socket options
Add underlying data structure changes and basic setsockopt()
and getsockopt() support for IPV6_RECVPATHMTU, IPV6_PATHMTU,
and IPV6_DONTFRAG. IPV6_PATHMTU is actually fully functional
at this point.
Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Since .size is set properly in "struct pernet_operations l2tp_eth_net_ops",
allocating space for "struct l2tp_eth_net" by hand is not correct, even causes
memory leakage.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Since .size is set properly in "struct pernet_operations l2tp_net_ops",
allocating space for "struct l2tp_net" by hand is not correct, even causes
memory leakage.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Vorontsov [Fri, 23 Apr 2010 07:12:44 +0000 (07:12 +0000)]
gianfar: Fix potential oops during OF address translation
gianfar driver may pass NULL pointer to the of_translate_address(),
which may lead to a kernel oops. Fix this by using of_iomap(), which
is also much simpler and shorter.
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 21 Apr 2010 09:26:15 +0000 (09:26 +0000)]
tcp: bind() fix when many ports are bound
Port autoselection done by kernel only works when number of bound
sockets is under a threshold (typically 30000).
When this threshold is over, we must check if there is a conflict before
exiting first loop in inet_csk_get_port()
Change inet_csk_bind_conflict() to forbid two reuse-enabled sockets to
bind on same (address,port) tuple (with a non ANY address)
Same change for inet6_csk_bind_conflict()
Reported-by: Gaspar Chilingarov <gasparch@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Sky2 status ring must be big enough to handle worst case number
of status messages. It was being oversized (to handle dual port cards),
and excessive number of tx ring entries were allowed. This patch reduces
the footprint and makes sure the value is enough.
Later patch to add RSS increases the number of possible Rx status elements.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
andrew hendry [Mon, 19 Apr 2010 13:29:06 +0000 (13:29 +0000)]
X25: Use identifiers for isdn device to x25 interface
Change magic numbers to identifiers for X25 interface.
also minor check patch formatting.
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com> Acked-by: Karsten Keil <isdn@linux-pingi.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Hendry [Thu, 22 Apr 2010 23:12:36 +0000 (16:12 -0700)]
X25: Add if_x25.h and x25 to device identifiers
V2 Feedback from John Hughes.
- Add header for userspace implementations such as xot/xoe to use
- Use explicit values for interface stability
- No changes to driver patches
V1
- Use identifiers instead of magic numbers for X25 layer 3 to device interface.
- Also fixed checkpatch notes on updated code.
[ Add new user header to include/linux/Kbuild -DaveM ]
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Socket filter ancilliary data access for skb->dev->type
Add an SKF_AD_HATYPE field to the packet ancilliary data area, giving
access to skb->dev->type, as reported in the sll_hatype field.
When capturing packets on a PF_PACKET/SOCK_RAW socket bound to all
interfaces, there doesn't appear to be a way for the filter program to
actually find out the underlying hardware type the packet was captured
on. This patch adds such ability.
This patch also handles the case where skb->dev can be NULL, such as on
netlink sockets.
Signed-off-by: Paul Evans <leonerd@leonerd.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Thu, 22 Apr 2010 07:00:24 +0000 (07:00 +0000)]
tcp: fix outsegs stat for TSO segments
Account for TSO segments of an skb in TCP_MIB_OUTSEGS counter. Without
doing this, the counter can be off by orders of magnitude from the
actual number of segments sent.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 21 Apr 2010 23:55:27 +0000 (23:55 +0000)]
rdma: potential ERR_PTR dereference
In the original code, the "goto out" calls "rdma_destroy_id(cm_id);"
That isn't needed here and would cause problems because "cm_id" is an
ERR_PTR. The new code just returns directly.
Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
We do netif_device_attach, even if resource allocation fails.
Driver callbacks can be called, if device is attached.
All these callbacks need to be protected by ADAPTER_UP_MAGIC check.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Different class of drivers co-exist for CNA device,
there is some minimal interaction that will be required amongst
the drivers for performing some device level operations.
All the driver should follow inter driver coexistence document.
Fixing polling interval and spelling mistake.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds IPv6 support for RFC5082 Generalized TTL Security Mechanism.
Not to users of mapped address; the IPV6 and IPV4 socket options are seperate.
The server does have to deal with both IPv4 and IPv6 socket options
and the client has to handle the different for each family.
On client:
int ttl = 255;
getaddrinfo(argv[1], argv[2], &hint, &result);
for (rp = result; rp != NULL; rp = rp->ai_next) {
s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
if (s < 0) continue;
if (connect(s, rp->ai_addr, rp->ai_addrlen) == 0) {
...
On server:
int minttl = 255 - maxhops;
getaddrinfo(NULL, port, &hints, &result);
for (rp = result; rp != NULL; rp = rp->ai_next) {
s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
if (s < 0) continue;
Eric Dumazet [Thu, 22 Apr 2010 07:22:45 +0000 (00:22 -0700)]
rps: immediate send IPI in process_backlog()
If some skb are queued to our backlog, we are delaying IPI sending at
the end of net_rx_action(), increasing latencies. This defeats the
queueing, since we want to quickly dispatch packets to the pool of
worker cpus, then eventually deeply process our packets.
It's better to send IPI before processing our packets in upper layers,
from process_backlog().
Change the _and_disable_irq suffix to _and_enable_irq(), since we enable
local irq in net_rps_action(), sorry for the confusion.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Olsa [Tue, 20 Apr 2010 21:21:26 +0000 (21:21 +0000)]
net: ipv6 bind to device issue
The issue raises when having 2 NICs both assigned the same
IPv6 global address.
If a sender binds to a particular NIC (SO_BINDTODEVICE),
the outgoing traffic is being sent via the first found.
The bonded device is thus not taken into an account during the
routing.
From the ip6_route_output function:
If the binding address is multicast, linklocal or loopback,
the RT6_LOOKUP_F_IFACE bit is set, but not for global address.
So binding global address will neglect SO_BINDTODEVICE-binded device,
because the fib6_rule_lookup function path won't check for the
flowi::oif field and take first route that fits.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Scott Otto <scott.otto@alcatel-lucent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Tue, 20 Apr 2010 21:06:07 +0000 (21:06 +0000)]
ethernet: print protocol in host byte order
Eric's recent patch added __force, but this
place would seem to require actually doing
a byte order conversion so the printk is
consistent across architectures.
Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>