Tom Herbert [Thu, 2 Jan 2014 19:48:26 +0000 (11:48 -0800)]
ipv4: Cache dst in tunnels
Avoid doing a route lookup on every packet being tunneled.
In ip_tunnel.c, cache the route returned from ip_route_output if
the tunnel is "connected", meaning that all the routing parameters are
taken from tunnel parms for a packet. Specifically, the tunnel is not
NBMA and tos is taken from tunnel parms (not from the inner packet).
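The caching idea can be sketched in userspace C. This is only a model under stated assumptions, not the ip_tunnel.c code: every name below (tunnel_model, route_lookup_model, tunnel_get_route) is hypothetical, and a destination of 0 stands in for an NBMA tunnel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a routing result. */
struct route_model {
    uint32_t daddr;
    int id;
};

/* Hypothetical tunnel: "connected" means fixed destination and fixed TOS. */
struct tunnel_model {
    uint32_t parms_daddr;      /* 0 models an NBMA tunnel (no fixed peer) */
    bool tos_inherit;          /* true: TOS is copied from the inner packet */
    struct route_model cache;  /* last route looked up for this tunnel */
    bool cache_valid;
};

static int lookups;            /* counts full route lookups, for illustration */

/* Stand-in for the expensive per-packet route lookup. */
static struct route_model route_lookup_model(uint32_t daddr)
{
    struct route_model rt = { daddr, ++lookups };
    return rt;
}

static bool tunnel_is_connected(const struct tunnel_model *t)
{
    return t->parms_daddr != 0 && !t->tos_inherit;
}

/* Return the route for one packet, reusing the cached route when safe. */
static struct route_model tunnel_get_route(struct tunnel_model *t,
                                           uint32_t pkt_daddr)
{
    assert(t != NULL);
    if (!tunnel_is_connected(t))
        return route_lookup_model(pkt_daddr);  /* parameters vary per packet */
    if (!t->cache_valid) {
        t->cache = route_lookup_model(t->parms_daddr);
        t->cache_valid = true;
    }
    return t->cache;
}
```

With a connected tunnel, any number of calls to tunnel_get_route() performs a single lookup. The sketch leaves out invalidating the cached entry when routing changes, which a real cache also has to handle.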
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Thu, 2 Jan 2014 17:54:27 +0000 (12:54 -0500)]
sctp: Add process name and pid to deprecation warnings
Recently I updated the sctp socket option deprecation warnings to be both a bit
more clear and ratelimited to prevent user processes from spamming the log file.
Ben Hutchings suggested that I add the process name and pid to these warnings so
that users can tell who is responsible for using the deprecated APIs. This
patch accomplishes that.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: Ben Hutchings <bhutchings@solarflare.com> CC: "David S. Miller" <davem@davemloft.net> CC: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Thu, 2 Jan 2014 16:28:49 +0000 (17:28 +0100)]
net: tulip: delete useless tests on netdev_priv
netdev_priv performs an addition, not a pointer dereference, so it seems
quite unlikely that its result would ever be NULL.
A semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
statement S;
@@
- if (!netdev_priv(...)) S
// </smpl>
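To illustrate why the test is useless, here is a small userspace model of what netdev_priv computes. It is a sketch under assumptions: struct netdev_model is a made-up stand-in for struct net_device, and the 32-byte alignment is assumed to mirror the kernel's NETDEV_ALIGN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct net_device; only its size matters here. */
struct netdev_model {
    char name[16];
    unsigned long state;
};

/* Assumed alignment of the private area. */
#define PRIV_ALIGN_MODEL 32u

/* Model of netdev_priv(): base address plus the aligned struct size.
 * No memory is read, so a valid dev can never yield a NULL result. */
static void *netdev_priv_model(struct netdev_model *dev)
{
    /* The input is what could be NULL, not the result. */
    assert(dev != NULL);
    size_t off = (sizeof(*dev) + PRIV_ALIGN_MODEL - 1)
                 & ~(size_t)(PRIV_ALIGN_MODEL - 1);
    return (char *)dev + off;
}
```

Because the result is always the device pointer plus a fixed positive offset, a check of the form `if (!netdev_priv_model(dev))` can only succeed if dev itself was already invalid.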
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
net/sched/cls_cgroup.c:106:29: error: static declaration of ‘net_cls_subsys’ follows non-static declaration
static struct cgroup_subsys net_cls_subsys = {
^
In file included from include/linux/cgroup.h:654:0,
from net/sched/cls_cgroup.c:18:
include/linux/cgroup_subsys.h:35:29: note: previous declaration of ‘net_cls_subsys’ was here
SUBSYS(net_cls)
^
make[2]: *** [net/sched/cls_cgroup.o] Error 1
Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
No need to export functions only used in one file.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shawn Bohrer [Tue, 31 Dec 2013 17:39:40 +0000 (11:39 -0600)]
mlx4_en: Only cycle port if HW timestamp config changes
If the hwtstamp_config matches what is currently set for the device then
simply return. Without this change any program that tries to enable
hardware timestamps will cause the link to cycle even if hardware
timestamps were already enabled.
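The early return can be modelled as follows. This is a hedged sketch, not the mlx4_en code: the two-field configuration struct and the link_cycles counter are hypothetical and only stand in for hwtstamp_config and for the port restart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of a hardware timestamp configuration. */
struct tstamp_cfg_model {
    int tx_type;
    int rx_filter;
};

struct port_model {
    struct tstamp_cfg_model cfg;  /* configuration currently applied */
    int link_cycles;              /* how many times the port was restarted */
};

/* Apply a requested config; returns true if the port had to be cycled. */
static bool set_tstamp_cfg(struct port_model *p,
                           const struct tstamp_cfg_model *req)
{
    assert(p != NULL && req != NULL);
    if (p->cfg.tx_type == req->tx_type &&
        p->cfg.rx_filter == req->rx_filter)
        return false;             /* nothing changes: leave the link up */
    p->cfg = *req;
    p->link_cycles++;             /* stand-in for stopping and restarting */
    return true;
}
```

A program that requests the configuration that is already active gets `false` back and the link stays up.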
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Acked-By: Hadar Hen Zion <hadarh@mellanox.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shawn Bohrer [Tue, 31 Dec 2013 17:39:39 +0000 (11:39 -0600)]
mlx4_en: Add PTP hardware clock
This adds a PHC to the mlx4_en driver. We use reader/writer spinlocks to
protect the timecounter since every packet received needs to call
timecounter_cycle2time() when timestamping is enabled. This can become
a performance bottleneck with RSS and multiple receive queues if normal
spinlocks are used.
This driver has been tested with both Documentation/ptp/testptp and the
linuxptp project (http://linuxptp.sourceforge.net/) on a Mellanox
ConnectX-3 card.
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Acked-By: Hadar Hen Zion <hadarh@mellanox.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sachin Kamat [Mon, 30 Dec 2013 05:11:56 +0000 (10:41 +0530)]
net: Cleanup in eth-netx.h
Commit 2960ed346877 ("ARM: netx: move platform_data definitions")
moved the file to the current location but forgot to remove the pointer
to its previous location. Clean it up. While at it also change the header
file protection macros appropriately.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 27 Dec 2013 08:32:38 +0000 (16:32 +0800)]
ipv6: remove prune parameter for fib6_clean_all
Since the prune parameter of fib6_clean_all is always 0, remove it.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
wangweidong [Fri, 27 Dec 2013 02:09:39 +0000 (10:09 +0800)]
tipc: make the code look more readable
Commit 3b8401fe9d ("tipc: kill unnecessary goto's") didn't leave
the code as readable as it could be, so fix it. This patch is cosmetic
and does not change the operation of TIPC in any way.
Suggested-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Salam Noureddine [Tue, 24 Dec 2013 22:17:02 +0000 (14:17 -0800)]
ipv4: arp: update neighbour address when a gratuitous arp is received and arp_accept is set
Gratuitous arp packets are useful in switchover scenarios to update
client arp tables as quickly as possible. Currently, the mac address
of a neighbour is only updated after a locktime period has elapsed
since the last update. In most use cases such delays are unacceptable
for network admins. Moreover, the "updated" field of the neighbour
structure doesn't record the last time the address of a neighbour
changed but records any change that happens to the neighbour. This is
clearly a bug since locktime uses that field as meaning "addr_updated".
With this observation, I was able to perpetuate a stale address by
sending a stream of gratuitous arp packets spaced less than locktime
apart. With this change the address is updated when a gratuitous arp
is received and the arp_accept sysctl is set.
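The stale-address scenario can be reproduced in a toy model. This is a sketch under explicit assumptions: neigh_model and neigh_rx_model are hypothetical names, time is an abstract counter, and the model assumes that every received packet refreshes `updated`, which is a simplification of "records any change that happens to the neighbour".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct neigh_model {
    uint8_t mac[6];
    long updated;   /* time of the last change of any kind */
};

/* Process one ARP packet at time `now`; returns true if the stored
 * address was overwritten. `accept_gratuitous` models "the packet is a
 * gratuitous ARP and the arp_accept sysctl is set". */
static bool neigh_rx_model(struct neigh_model *n, const uint8_t mac[6],
                           long now, long locktime, bool accept_gratuitous)
{
    assert(n != NULL && mac != NULL);
    bool override = accept_gratuitous || now - n->updated >= locktime;
    if (override)
        memcpy(n->mac, mac, 6);
    n->updated = now;   /* assumption: every packet counts as a change */
    return override;
}
```

With locktime 100 and packets arriving every 50 time units, the address is never overwritten while `accept_gratuitous` is false, because each packet pushes `updated` forward. Passing `true` overwrites it immediately.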
Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
This addresses some of those warnings by:
* make icmpv6_route_lookup static
* move inlines out of ip6_route.h since they are only used in route.c
* move rt6_bind_peer into route.c
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Cleanups in netlink_tap code
* remove unused function netlink_clear_multicast_users
* make local function static
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 2 Jan 2014 03:59:13 +0000 (22:59 -0500)]
Merge branch 'bonding'
Ding Tianhong says:
====================
bonding: slight optimization for bonding
This series of patches slightly optimizes the MAC address comparison
and the xmit path for bonding, and also makes some cleanups.
Julia has been replacing ether_addr_equal with ether_addr_equal_64bits,
which is a lot of work, and she may not have made a patch for bonding yet,
so I have done it in this patchset so that she can skip the bonding drivers.
Resent with Julia added to Cc.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Thu, 2 Jan 2014 01:13:12 +0000 (09:13 +0800)]
bonding: remove the return value for bond_3ad_bind_slave()
The slave and bond operands of this function are never NULL, so the
check on bond makes no sense; remove it. The return value was useless
as well, so remove it too.
The comments for bond 3ad are out of date; clean up some errors
and warnings.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Thu, 2 Jan 2014 01:13:06 +0000 (09:13 +0800)]
bonding: slight optimizztion for bond_slave_override()
When an skb is transmitted by bond_slave_override(), the slave state
is checked twice, which costs a little performance (perhaps
negligible). Simplify the function and remove the redundant check.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Thu, 2 Jan 2014 01:13:02 +0000 (09:13 +0800)]
bonding: slight optimization for bond_alb_xmit()
bond_alb_xmit() checks the return value of bond_dev_queue_xmit()
every time, but bond_dev_queue_xmit() always returns 0, so there is
no need to check it. Remove the unneeded check from the xmit path.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Thu, 2 Jan 2014 01:12:59 +0000 (09:12 +0800)]
bonding: slight optimization for bond_3ad_xmit_xor()
bond_dev_queue_xmit() always returns 0, and in a fast path it is
inappropriate to check the res value for every transmitted packet,
so remove the res check and save one test per xmit.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Thu, 2 Jan 2014 01:12:54 +0000 (09:12 +0800)]
bonding: use ether_addr_equal_unaligned for bond addr compare
Use the possibly more efficient ether_addr_equal_64bits
instead of memcmp.
Rename MAC_ADDR_COMPARE to MAC_ADDR_EQUAL, which is more
appropriate.
The comments for bond 3ad are out of date; clean up some errors
and warnings.
Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
These patches were tucked-in with me for my long winter's nap!
Please pull them for the 3.14 stream...
For the mac80211 bits, Johannes says:
"Here I just have a collection of fixes/improvements/cleanups, very
little really stands out apart from CSA fixes, vendor command support
and the RCU speedups."
For the iwlwifi bits, Emmanuel says:
"I have here quite a few things. Alex continues his work on power
management. Arik is reworking the transport API by unifying redundant
APIs and making error handling more generic. Eyal keeps on digging in
the rate scaling code.
We also have two new features - Max is using the brand new generic
cipher infrastructure in mac80211, and Lilach implements the smart fifo
which allows to save power by making interrupt coalescing smarter."
Along with those, Arend and company bring a batch of brcmfmac.
Sujith and Felix bring the usual high level of ath9k activity as well.
Bing gives mwifiex some love as well, and a handful of other bits
get updates here and there.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eddie Wai [Wed, 1 Jan 2014 07:18:34 +0000 (23:18 -0800)]
cnic: Add a signature to indicate valid doorbell offset.
The buffer that is used to pass doorbell offset to the userspace UIO
driver may contain a nonzero value in older versions of the bnx2x driver.
Userspace cannot easily tell whether it contains a valid doorbell
offset or not. With the added signature, userspace will only use
the doorbell offset if the signature is present.
Update version to 2.5.19.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 1 Jan 2014 07:22:33 +0000 (23:22 -0800)]
bnx2: Enable auto-mdix when autoneg is disabled.
Auto-mdix currently only works if autoneg is enabled. This patch enables
auto-mdix all the time by setting a bit in a PHY register. Define
meaningful constants for these PHY registers.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 1 Jan 2014 07:22:32 +0000 (23:22 -0800)]
bnx2: Advertise nothing when speed is forced
The current code does not reset the advertisement register when the speed
is forced, leaving the default advertisement value of 10 Mbps. This does
not work with some link partners when the next patch enables auto-mdix.
Set advertisement register to 0 if the speed is forced.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Rose [Sat, 21 Dec 2013 06:13:16 +0000 (06:13 +0000)]
i40evf: add driver to kernel build system
Modify the existing Kconfig, Makefile, and MAINTAINERS to add the driver
to the kernel. Add a Makefile and documentation.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Sat, 21 Dec 2013 06:13:11 +0000 (06:13 +0000)]
i40evf: init code and hardware support
This patch implements the hardware specific init and management.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Sat, 21 Dec 2013 06:13:06 +0000 (06:13 +0000)]
i40evf: driver core headers
This patch contains the main driver header files, containing structures
and data types specific to the Linux driver.
i40e_osdep.h contains some code that helps us adapt our OS agnostic code
to Linux.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Sat, 21 Dec 2013 06:13:01 +0000 (06:13 +0000)]
i40evf: virtual channel interface
This PCI-E SR-IOV virtual function (VF) driver is dependent upon the
physical function (PF) driver (i40e) for nearly all of its hardware
configuration. Requests from the VF driver are passed to the PF using
the hardware's Admin Queue.
This patch contains the functionality for communicating with the PF
driver. Because of the delay inherent in this communications channel,
most of the replies from the PF driver are handled asynchronously. The
exceptions are the "send API version" and "get VF config" messages,
which busy-wait because they are done so early during init that
interrupts are not yet configured.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Sat, 21 Dec 2013 06:12:56 +0000 (06:12 +0000)]
i40evf: core ethtool functionality
This patch contains the ethtool interface and related functionality.
Since the VF driver is mostly unaware of link, much of that
functionality is unused. The driver implements ethtool hooks for
statistics, driver info, and some basic non-link-related driver
settings.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Sat, 21 Dec 2013 06:12:51 +0000 (06:12 +0000)]
i40evf: transmit and receive functionality
This file contains the transmit, receive, and NAPI functionality.
Some of the functions in this module are extracted from the i40e driver
but functions that are not appropriate for virtual function devices have
been removed.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Sat, 21 Dec 2013 06:12:45 +0000 (06:12 +0000)]
i40evf: main driver core
This is the driver for the Intel(R) XL710 X710 Virtual Function.
This patch contains the main driver entry points, but does not include
transmit and receive or ethtool functionality, which are presented as
separate patches in this series.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Tue, 31 Dec 2013 21:48:37 +0000 (16:48 -0500)]
Merge branch 'addr_compare'
Ding Tianhong says:
====================
slight optimization of addr compare for net modules
This is the second patchset for slight optimization of address compare,
mainly for the net tree. Following Joe's suggestion, it is meant to make
the code easier to review for maintainers.
v2: Change some style for patch 2.
Following Eric's suggestion, use ether_addr_equal_64bits instead
of ether_addr_equal for patch 19.
In fact, there are many places that could use ether_addr_equal_64bits
instead of ether_addr_equal, but not this time; thanks to Joe for the
suggestion.
v3: Change some style for patch 11/19:
(net: packetengines: slight optimization of addr compare).
Joe pointed out that is_broadcast_ether_addr(addr) would be appropriate here,
but that change does not belong in this patch, so it will be fixed later.
In the patch (net: caif: slight optimization of addr compare), the operand of
memcmp is not a MAC address, and ether_addr_equal is unsuitable for comparing
non-MAC addresses, so that patch has been removed from the series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:41:33 +0000 (15:41 +0800)]
net: plip: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal_64bits
instead of memcmp.
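For readers unfamiliar with the helpers named throughout this series, the difference between a byte-wise and a word-wise address comparison can be sketched in portable userspace C. The function names below are hypothetical models, not the kernel helpers. The model assumes, as the kernel's 64-bit variant requires, that each 6-byte address is followed by at least 2 bytes of readable padding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Baseline: byte-wise comparison of two 6-byte addresses. */
static bool addr_equal_memcmp(const uint8_t a[6], const uint8_t b[6])
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, 6) == 0;
}

/* Word-wise variant: callers must provide 8 readable bytes per address
 * (6 address bytes followed by 2 bytes of padding that are ignored). */
static bool addr_equal_64bits_model(const uint8_t a[8], const uint8_t b[8])
{
    static const uint8_t mask_bytes[8] =
        { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00 };
    uint64_t x, y, mask;

    assert(a != NULL && b != NULL);
    memcpy(&x, a, 8);              /* memcpy sidesteps alignment issues */
    memcpy(&y, b, 8);
    memcpy(&mask, mask_bytes, 8);  /* built from bytes: endian-independent */
    return ((x ^ y) & mask) == 0;  /* one XOR instead of a byte loop */
}
```

The word-wise version is only safe where the padding bytes are guaranteed to exist, which is why a plain 6-byte buffer cannot be passed to it.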
Cc: "David S. Miller" <davem@davemloft.net> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:41:30 +0000 (15:41 +0800)]
net: fddi: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:41:27 +0000 (15:41 +0800)]
net: ti: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:41:24 +0000 (15:41 +0800)]
net: sun: optimization of addr compare
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:41:21 +0000 (15:41 +0800)]
net: seeq: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:41:17 +0000 (15:41 +0800)]
net: renesas: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:41:06 +0000 (15:41 +0800)]
net: packetengines: slight optimization of addr
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:40:59 +0000 (15:40 +0800)]
net: ksz884x: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:40:55 +0000 (15:40 +0800)]
net: mlx4: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: Amir Vadai <amirv@mellanox.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-By: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:40:50 +0000 (15:40 +0800)]
net: ixgbe: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:40:46 +0000 (15:40 +0800)]
net: igbvf: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:40:32 +0000 (15:40 +0800)]
net: bnx2x: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal or
ether_addr_equal_unaligned instead of memcmp.
Cc: Ariel Elior <ariele@broadcom.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Mon, 30 Dec 2013 07:40:28 +0000 (15:40 +0800)]
net: 3com: slight optimization of addr compare
Use the possibly more efficient ether_addr_equal
instead of memcmp.
Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net, rps: fix build failure when CONFIG_RPS isn't set
In file included from net/socket.c:99:0:
include/net/sock.h: In function ‘sock_rps_record_flow’:
include/net/sock.h:849:30: error: ‘const struct sock’ has no member named ‘sk_rxhash’
include/net/sock.h: In function ‘sock_rps_reset_flow’:
include/net/sock.h:854:29: error: ‘const struct sock’ has no member named ‘sk_rxhash’
Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
wangweidong [Thu, 26 Dec 2013 05:55:56 +0000 (13:55 +0800)]
sctp: move skb_dst_set() a bit downwards in sctp_packet_transmit()
skb_dst_set() uses dst. A NULL dst is not a problem there, but in that
case we go to 'no_route' and free nskb, so calling skb_dst_set() first
is pointless. Move skb_dst_set() after the dst check.
Remove the unnecessary initialization as well.
v2: fix the subject line because it would confuse people,
as pointed out by Daniel.
Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Wed, 25 Dec 2013 09:35:15 +0000 (17:35 +0800)]
sch_netem: support of 64bit rates
Add a new attribute to support 64bit rates so that
tc can use them to break the 32bit limit.
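One common way to extend a 32-bit field without breaking old readers is to saturate the legacy field and carry the real value in a new one. The sketch below shows that pattern in plain C. It is an assumption about the general approach, not the netem attribute layout: rate_msg_model and its fields are hypothetical, and rates are taken to be in bytes per second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoded form: a legacy 32-bit field plus an optional
 * 64-bit field that is only present for large rates. */
struct rate_msg_model {
    uint32_t rate32;
    bool has_rate64;
    uint64_t rate64;
};

/* Encode a rate given in bytes per second. */
static struct rate_msg_model rate_encode(uint64_t bps)
{
    struct rate_msg_model m = { 0, false, 0 };
    if (bps >= (1ULL << 32)) {
        m.rate32 = UINT32_MAX;   /* saturate the legacy field */
        m.has_rate64 = true;
        m.rate64 = bps;
    } else {
        m.rate32 = (uint32_t)bps;
    }
    return m;
}

/* Decode, preferring the 64-bit value when it is present. */
static uint64_t rate_decode(const struct rate_msg_model *m)
{
    assert(m != NULL);
    return m->has_rate64 ? m->rate64 : m->rate32;
}
```

A 32-bit byte rate tops out just under 4.3 GB/s, so a rate such as 5000000000 bytes per second only survives the round trip through the 64-bit field.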
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Wed, 25 Dec 2013 09:35:14 +0000 (17:35 +0800)]
sch_netem: more precise length of packets
With TSO/GSO/GRO packets, skb->len doesn't represent
the precise number of bytes on the wire.
This patch replaces skb->len with qdisc_pkt_len(skb),
which is more precise.
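The gap between the two lengths comes from headers that are counted once in an aggregated packet but sent once per segment. A rough userspace model, with hypothetical names and a simplified view of segmentation (this is not the qdisc_pkt_len implementation):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of an aggregated (GSO) packet. */
struct gso_pkt_model {
    unsigned int len;       /* headers once + all payload, like skb->len */
    unsigned int hdr_len;   /* header bytes repeated in every segment */
    unsigned int gso_size;  /* payload bytes per segment, 0 if not GSO */
};

/* Estimate the number of bytes that actually go on the wire. */
static unsigned int wire_len_model(const struct gso_pkt_model *p)
{
    assert(p != NULL && p->len >= p->hdr_len);
    if (p->gso_size == 0)
        return p->len;                      /* not aggregated */
    unsigned int payload = p->len - p->hdr_len;
    unsigned int segs = (payload + p->gso_size - 1) / p->gso_size;
    if (segs == 0)
        segs = 1;
    /* Every segment after the first carries its own copy of the headers. */
    return p->len + (segs - 1) * p->hdr_len;
}
```

For example, ten 1448-byte segments with 66 bytes of headers give a len of 14546 but 15140 bytes on the wire, a difference that matters when netem computes transmission delays from a configured rate.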
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Mon, 23 Dec 2013 14:09:44 +0000 (16:09 +0200)]
net/mlx4_en: Add netdev support for TCP/IP offloads of vxlan tunneling
When the device tunneling offloads mode is vxlan do the following
- call SET_PORT with the relevant setting
- add DMFS steering vxlan rule for the device self and multicast mac addresses
of the form: {<ETH, outer-mac> <VXLAN, ANY vnid> <ETH, ANY mac>} --> RSS QP
- set relevant QPC fields in RSS context and RX ring QPs
- in TX flow, set WQE fields to generate HW checksum, and handle gso skbs
which are marked for encapsulation such that the HW will segment them properly.
- in RX flow, read HW offloaded checksum for encapsulated packets from the CQE
- advertise hw_enc_features and NETIF_F_GSO_UDP_TUNNEL to the networking stack
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Mon, 23 Dec 2013 14:09:43 +0000 (16:09 +0200)]
net/mlx4_core: Add basic support for TCP/IP offloads under tunneling
Add the low-level device commands and definitions used for TCP/IP HW offloads
of tunneled/vxlan traffic which are supported by the ConnectX3-pro NIC.
This is done through the following elements:
- read tunneling device caps in QUERY_DEV_CAP
- add helper function to do SET_PORT for tunneling
- add DMFS VXLAN steering rule definitions
- add CQE and WQE checksum offload field definitions
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 23 Dec 2013 13:35:56 +0000 (14:35 +0100)]
netlink: specify netlink packet direction for nlmon
In order to facilitate development for netlink protocol dissector,
fill the unused field skb->pkt_type of the cloned skb with a hint
of the address space of the new owner (receiver) socket in the
notion of "to kernel" resp. "to user".
At the time we invoke __netlink_deliver_tap_skb(), we already have
set the new skb owner via netlink_skb_set_owner_r(), so we can use
that for netlink_is_kernel() probing.
In normal PF_PACKET network traffic, this field denotes if the
packet is destined for us (PACKET_HOST), if it's broadcast
(PACKET_BROADCAST), etc.
As we only have 3 bits reserved, we can use the value (= 6) of
PACKET_FASTROUTE as it's _not used_ anywhere in the whole kernel
and not supported anywhere, and packets of such type were never
exposed to user space, so there are no overlapping users of such
kind. Thus, as wished, that seems the only way to make both
PACKET_* values non-overlapping and therefore device agnostic.
By using those two flags for netlink skbs on nlmon devices, they
can be made available and picked up via sll_pkttype (previously
unused in netlink context) in struct sockaddr_ll. We now have
these two directions:
- PACKET_USER (= 6) -> to user space
- PACKET_KERNEL (= 7) -> to kernel space
Partial `ip a` example strace for sa_family=AF_NETLINK with
detected nl msg direction:
syscall: direction:
sendto(3, ...) = 40 /* to kernel */
recvmsg(3, ...) = 3404 /* to user */
recvmsg(3, ...) = 1120 /* to user */
recvmsg(3, ...) = 20 /* to user */
sendto(3, ...) = 40 /* to kernel */
recvmsg(3, ...) = 168 /* to user */
recvmsg(3, ...) = 144 /* to user */
recvmsg(3, ...) = 20 /* to user */
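The tagging itself is small enough to model directly. In this sketch the two numeric values are the ones quoted above; everything else (skb_model, tap_set_direction, direction_name) is a hypothetical stand-in for the kernel structures and for what a dissector might do with sll_pkttype.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values quoted in the commit message; names are local to this sketch. */
enum { PKT_USER_MODEL = 6, PKT_KERNEL_MODEL = 7 };

/* pkt_type is a 3-bit field, so only values 0..7 fit. */
struct skb_model {
    unsigned int pkt_type : 3;
};

/* Tag a cloned message with the address space of its receiver. */
static void tap_set_direction(struct skb_model *skb, bool receiver_is_kernel)
{
    assert(skb != NULL);
    skb->pkt_type = receiver_is_kernel ? PKT_KERNEL_MODEL : PKT_USER_MODEL;
}

/* What a dissector might print for the tag. */
static const char *direction_name(const struct skb_model *skb)
{
    assert(skb != NULL);
    switch (skb->pkt_type) {
    case PKT_KERNEL_MODEL: return "to kernel";
    case PKT_USER_MODEL:   return "to user";
    default:               return "other";
    }
}
```

Both values fit in the 3-bit field, which is why reusing the otherwise unused value 6 is enough to make the two directions distinguishable.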
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 23 Dec 2013 13:35:55 +0000 (14:35 +0100)]
netlink: only do not deliver to tap when both sides are kernel sks
We should also deliver packets to nlmon devices when we are in
netlink_unicast_kernel(), and only one of the {src,dst} sockets
is user sk and the other one kernel sk. That's e.g. the case in
netlink diag, netlink route, etc. Still, do not deliver messages
from kernel to kernel sks.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 31 Dec 2013 18:59:21 +0000 (13:59 -0500)]
Merge branch 'sctp_logspam'
Neil Horman says:
====================
sctp: Consolidate and ratelimit deprecation warnings
The SCTP protocol has several deprecation warnings in its setsockopt path that
can be triggered by unprivileged users. Since these are not ratelimited, we can
spam the logs quite easily here. Since these are all deprecation warnings, and
that type of warning isn't uncommon in the rest of the kernel, let's make a
common pr_warn_deprecated macro to produce somewhat generalized ratelimited
deprecation warnings easily.
====================
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Mon, 23 Dec 2013 13:29:43 +0000 (08:29 -0500)]
SCTP: Reduce log spamming for sctp setsockopt
During a recent discussion regarding some sctp socket options, it was noted that
we have several points at which we issue log warnings that can be flooded at an
unbounded rate by any user. Fix this by converting all the pr_warns in the
sctp_setsockopt path to be pr_warn_ratelimited.
Note there are several debug level messages as well. I'm leaving those alone,
as, if you turn on pr_debug, you likely want lots of verbosity.
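Rate limiting of this kind usually means "at most N messages per time window". The following userspace sketch shows that policy. It is a model only: ratelimit_model and ratelimit_ok are hypothetical names, time is an abstract counter, and the interval and burst values are chosen by the caller rather than taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ratelimit_model {
    long interval;  /* length of one window, in abstract time units */
    int burst;      /* messages allowed per window */
    long begin;     /* start of the current window */
    int printed;    /* messages emitted in the current window */
    int missed;     /* messages suppressed so far */
};

/* Returns true if a message may be logged at time `now`. */
static bool ratelimit_ok(struct ratelimit_model *rs, long now)
{
    assert(rs != NULL && rs->interval > 0 && rs->burst > 0);
    if (now - rs->begin >= rs->interval) {
        rs->begin = now;       /* open a new window */
        rs->printed = 0;
    }
    if (rs->printed < rs->burst) {
        rs->printed++;
        return true;
    }
    rs->missed++;              /* suppressed: the caller logs nothing */
    return false;
}
```

With a burst of 10, a user who triggers the warning 100 times in one window produces 10 log lines instead of 100.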
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: David Miller <davem@davemloft.net> CC: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Mon, 23 Dec 2013 13:29:42 +0000 (08:29 -0500)]
printk: Add a DEPRECATED macro
sctp has several points in its setsockopt path in which it issues deprecation
warnings. It seems like it might be handy to macrotize such a warning so other
subsystems can use it easily
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: "David S. Miller" <davem@davemloft.net> CC: linux-kernel@vger.kernel.org CC: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Mon, 23 Dec 2013 01:32:38 +0000 (09:32 +0800)]
ipv6: unneccessary to get address prefix in addrconf_get_prefix_route
addrconf_get_prefix_route passes the address prefix to fib6_locate,
which does not use any data beyond prefix_len, so there is no need
to call ipv6_addr_prefix to get the address prefix.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 31 Dec 2013 18:31:42 +0000 (13:31 -0500)]
Merge branch 'tun_rfs'
Zhi Yong Wu says:
====================
tun: add the RFS support
Since Tom Herbert's hash-related patchset was modified and merged,
his patchset adding support for RFS on tun flows also needs to be
adjusted accordingly. I have tried to update the patches, and before I start
doing perf tests I hope to have a correct code base, so it is time
to post them now. Any constructive comments are welcome, thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 22 Dec 2013 10:54:32 +0000 (18:54 +0800)]
tun: Add support for RFS on tun flows
This patch adds support so that the rps_flow_tables (RFS) can be
programmed using the tun flows which are already set up to track flows
for the purposes of queue selection.
On the receive path (corresponding to select_queue and tun_net_xmit) the
rxhash is saved in the flow_entry. The original code only does flow
lookup in select_queue, so this patch adds a flow lookup in tun_net_xmit
if num_queues == 1 (select_queue is not called from
dev_queue_xmit->netdev_pick_tx in that case).
The flow is recorded (processing CPU) in tun_flow_update (TX path), and
reset when flow is deleted.
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 22 Dec 2013 10:54:31 +0000 (18:54 +0800)]
net: Allow setting sock flow hash without a sock
This patch adds sock_rps_record_flow_hash and sock_rps_reset_flow_hash,
which take a hash value as an argument and set the sock_flow_table
accordingly. This allows the table to be populated in cases where flow
is being tracked outside of a sock structure.
sock_rps_record_flow and sock_rps_reset_flow call these functions
with the hash taken from sk_rxhash.
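The split between hash-based and sock-based entry points can be sketched as follows. This is a userspace model with hypothetical names and a tiny fixed-size table. It assumes that a hash of 0 means "no hash" and that the table is indexed by the hash masked to the table size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_CPU_MODEL 0xffffu   /* marker for "no CPU recorded" */
#define FLOW_ENTRIES 8u        /* table size, must be a power of two */

struct flow_table_model {
    uint32_t mask;                 /* FLOW_ENTRIES - 1 */
    uint16_t ents[FLOW_ENTRIES];   /* last CPU seen per hash bucket */
};

struct sock_model {
    uint32_t rxhash;               /* plays the role of sk_rxhash */
};

static void flow_table_init(struct flow_table_model *t)
{
    assert(t != NULL);
    t->mask = FLOW_ENTRIES - 1;
    for (size_t i = 0; i < FLOW_ENTRIES; i++)
        t->ents[i] = NO_CPU_MODEL;
}

/* Hash-based entry points: usable without any socket. */
static void flow_record_hash(struct flow_table_model *t, uint32_t hash,
                             uint16_t cpu)
{
    if (t == NULL || hash == 0)
        return;                    /* no table or no hash: nothing to do */
    t->ents[hash & t->mask] = cpu;
}

static void flow_reset_hash(struct flow_table_model *t, uint32_t hash)
{
    if (t == NULL || hash == 0)
        return;
    t->ents[hash & t->mask] = NO_CPU_MODEL;
}

/* Sock-based wrappers: just forward the hash stored in the socket. */
static void flow_record_sock(struct flow_table_model *t,
                             const struct sock_model *sk, uint16_t cpu)
{
    flow_record_hash(t, sk->rxhash, cpu);
}

static void flow_reset_sock(struct flow_table_model *t,
                            const struct sock_model *sk)
{
    flow_reset_hash(t, sk->rxhash);
}
```

A caller that tracks flows on its own, as tun does, can use the hash-based functions directly, while socket code keeps its existing interface.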
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Sat, 21 Dec 2013 06:40:12 +0000 (14:40 +0800)]
bonding: add option lp_interval for loading module
Allow the bonding driver to set lp_interval when the module is loaded.
Suggested-by: Scott Feldman <sfeldma@cumulusnetworks.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>