Daniel Borkmann [Thu, 7 Feb 2013 23:22:58 +0000 (23:22 +0000)]
net: sctp: sctp_auth_make_key_vector: use sctp_auth_create_key
In sctp_auth_make_key_vector, we allocate a temporary sctp_auth_bytes
structure with kmalloc instead of the sctp_auth_create_key allocator.
Change this to sctp_auth_create_key as it is the case everywhere else,
so that we also can properly free it via sctp_auth_key_put. This makes
it easier for future code changes in the structure and allocator itself,
since a single API is consistently used for this purpose. Also, by
using sctp_auth_create_key we're doing sanity checks over the arguments.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 7 Feb 2013 16:41:02 +0000 (16:41 +0000)]
macvlan: add a salt to mc_hash()
Some multicast addresses are common to all macvlans,
so if a multicast message has a hash value collision, we
have to deliver a copy to all macvlans, adding significant
latency and possible packet drops if netdev_max_backlog
limit is hit.
Having a per macvlan hash function permits to reduce the
impact of hash collisions.
Suggested-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 7 Feb 2013 16:02:57 +0000 (16:02 +0000)]
macvlan: broadcast addr should be part of mc_filter
commit cd431e738509e (macvlan: add multicast filter) forgot
the broadcast case.
Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> SIgned-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
we should use rcu_dereference_bh() when we hold rcu_read_lock_bh().
Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 7 Feb 2013 11:46:27 +0000 (11:46 +0000)]
drivers: net: Remove remaining alloc/OOM messages
alloc failures already get standardized OOM
messages and a dump_stack.
For the affected mallocs around these OOM messages:
Converted kmallocs with multiplies to kmalloc_array.
Converted a kmalloc/memcpy to kmemdup.
Removed now unused stack variables.
Removed unnecessary parentheses.
Neatened alignment.
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Arend van Spriel <arend@broadcom.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rafał Miłecki [Thu, 7 Feb 2013 05:40:38 +0000 (05:40 +0000)]
bgmac: fix "cmdcfg" calls for promisc and loopback modes
The last (bool) parameter in bgmac_cmdcfg_maskset says if the write
should be made, even if value didn't change. Currently driver doesn't
match the specs about (not) forcing some changes. This makes it follow
them.
Reported-by: Nathan Hintz <nlhintz@hotmail.com> Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Fri, 8 Feb 2013 10:17:15 +0000 (10:17 +0000)]
skbuff: Move definition of NETDEV_FRAG_PAGE_MAX_SIZE
In order to address the fact that some devices cannot support the full 32K
frag size we need to have the value accessible somewhere so that we can use it
to do comparisons against what the device can support. As such I am moving
the values out of skbuff.c and into skbuff.h.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Feb 2013 20:18:40 +0000 (15:18 -0500)]
Merge branch 'wireless'
John W. Linville says:
====================
Please accept this pull request intended for the 3.9 stream!
Included are a mac80211 pull, an iwlwifi pull (actually two -- one
was a fast-forward), a wl12xx pull, and a couple of Bluetooth pulls.
On mac80211, Johannes says:
"I've included
* AKM definitions from Bing,
* mesh fixes from Thomas, including a fix from him for me breaking his
patch while applying,
* channel check fix from Simon,
* an old patch from Yoni Divinsky who doesn't even work for TI any
more, to configure the WEP TX key for ARP offload etc.
* MAC ACL API from Vasanth
* a fix for the infamous chanctx_conf warning from Arnd
* from myself, a fix for my previous aggregation changes, some cleanup
and some improvements and fixes for WoWLAN"
On iwlwifi, Johannes says:
"Two small changes for iwlwifi-next, one to update all our Copyright
notices and one to provide the RX page order."
And also:
"So what I have here is some cleanups, preparations and the new MVM
(multi-virtual MAC) driver itself and (this is new) some work on the
transport API as well as a message flooding fix."
On wl12xx, Luca says:
"Lots of bugfixes and improvements in our TI wireless drivers,
including support for multi-channel. Intended for 3.9."
On Bluetooth, Gustavo says:
"This is my first pull request to 3.9. The biggest changes here are from Johan
Hedberg who made a lot of fixes in the Management interface. The issues arose
due to a new test tool we wrote and the usage of the Management interface as
default in BlueZ 5. The rest of the patches are more clean ups and small
fixes."
And also:
"Here goes another batch intended for 3.9, the majority of the patch here are
from Johan who is fixing many issues in the management interface that have
appeared lately. The rest of the patches are just small improvements, fixes
and clean ups."
Along with those are the usual variety of updates/enhancements to
the mwl8k, mwifiex, ath9k, rtlwifi, and rt2x00 drivers as well as
a few updates for the ssb and bcma busses. I don't think there are
any big headliners there.
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Feb 2013 04:48:33 +0000 (23:48 -0500)]
Merge branch 'tg3'
Attention linux-next maintainer, you will hit a merge conflict between
this merge and the mips tree, the resolution is to preserve the removal
of uses of nvram_geenv() and nvram_parse_macaddr() from the net-next
side.
Hauke Mehrtens says:
====================
These patches are adding support for the Ethernet core found in the
BCM4705/BCM4785 SoC.
This is based on current master of davem/net-next.git.
v4:
* move setting of DMA_RWCTRL_ONE_DMA
v3:
* combined first two patches into one patch
v2:
* use of struct sprom in ssb_gige_get_macaddr() instead of accessing
the nvram directly
* add return value to ssb_gige_get_macaddr()
* try to read the mac address from ssb core before accessing the own
registers.
* fix two checkpatch warnings
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hauke Mehrtens [Thu, 7 Feb 2013 05:37:39 +0000 (05:37 +0000)]
tg3: add support for Ethernet core in bcm4785
The BCM4785 or sometimes named BMC4705 is a Broadcom SoC which a
Gigabit 5750 Ethernet core. The core is connected via PCI with the rest
of the SoC, but it uses some extension.
This core does not use a firmware or an eeprom.
Some devices only have a switch which supports 100MBit/s, this
currently does not work with this driver.
This patch was original written by Michael Buesch <m@bues.ch> and is in
OpenWrt for some years now.
This was tested on a Linksys WRT610N V1 and older versions of this patch
were tested by other people on different devices.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hauke Mehrtens [Thu, 7 Feb 2013 05:37:38 +0000 (05:37 +0000)]
tg3: make it possible to provide phy_id in ioctl
In OpenWrt we currently use a switch driver which uses the ioctls to
configure the switch in the phy. We have to provide the phy_id to do
so, but without this patch this is not possible.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hauke Mehrtens [Thu, 7 Feb 2013 05:37:37 +0000 (05:37 +0000)]
ssb: get mac address from sprom struct for gige driver
The mac address is already stored in the sprom structure by the
platform code of the SoC this Ethernet core is found on, it just has to
be fetched from this structure instead of accessing the nvram here.
This patch also adds a return value to indicate if a mac address could
be fetched from the sprom structure.
When CONFIG_SSB_DRIVER_GIGE is not set the header file now also declares
ssb_gige_get_macaddr().
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Michael Buesch <m@bues.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of calling 3 times ntohs(random->param_hdr.length), 2 times
ntohs(hmacs->param_hdr.length), and 3 times ntohs(chunks->param_hdr.length)
within the same function, we only call each once and store it in a
variable.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Li [Wed, 6 Feb 2013 14:59:59 +0000 (14:59 +0000)]
net: fec: fix spin_lock dead lock
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.8.0-rc5+ #82 Not tainted
---------------------------------------------------------
swapper/0/0 just changed the state of lock:
(&(&fep->hw_lock)->rlock){..-...}, at: [<8034e2f8>] fec_enet_start_xmit+0x48/0x 2cc
but this lock took another, SOFTIRQ-unsafe lock in the past:
(prepare_lock){+.+.+.}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
Tom Herbert [Wed, 6 Feb 2013 07:58:41 +0000 (07:58 +0000)]
mlx4_en: Fix BQL reset TX queue call point
Fix issue in Mellanox driver related to BQL. netdev_tx_reset_queue
was not being called in certain situations where the device was
being start and stopped. Moved netdev_tx_reset_queue from the reset
device path to mlx4_en_free_tx_buf which is where the rings are
cleaned in a reset (specifically from device being stopped).
Signed-off-by: Tom Herbert <therbert@google.com> Acked-By: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Feb 2013 04:28:32 +0000 (23:28 -0500)]
Merge branch 'mlx4'
Amir Vadai says:
====================
This series from Yan Burman adds support for unicast MAC address filtering and
ndo FDB operations. It also includes some optimizations to loopback related
decisions and checks in the TX/RX fast path and one cleanup, all in separate
patches.
Today, when adding macvlan devices, the NIC goes into promiscuous mode, since
unicast MAC filtering is not supported. With these changes, macvlan devices can
be added without the penalty of promiscuous mode.
If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).
Also, now it is possible to have bridge under multi-function configuration that include
PF and VFs. In order to use bridge over PF/VFs, VM MAC fdb entries must be added e.g.
using 'bridge fdb add' command.
Changes from v1 - based on more comments from Eric Dumazet:
* added failure handling when adding unicast address filter
Changes from v0 - based on comments from Eric Dumazet:
* Removed unneeded synchronize_rcu()
* Use kfree_rcu() instead of synchronize_rcu() + kfree()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:27 +0000 (02:25 +0000)]
net/mlx4_en: Implement ndo fdb functionality
Add support for setting embedded switch fdb in case of SRIOV, by
implementing ndo_fdb_{add, del, dump}. This will allow to use
bridged configuration with multi-function. In order to add VM MAC
to the eSwitch fdb, the following command may be used over the relevant function interface:
bridge fdb add <MAC> permanent self dev <IFACE>
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:26 +0000 (02:25 +0000)]
net/mlx4_en: Add unicast MAC filtering
Implement and advertise unicast MAC filtering, such that setting macvlan
instance over mlx4_en interfaces will not require the networking core
to put mlx4_en devices in promiscuous mode.
If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:25 +0000 (02:25 +0000)]
net/mlx4_en: Manage hash of MAC addresses per port
As a preparation step for supporting multiple unicast addresses, store MAC addresses in hash table.
Remove the radix tree for MAC addresses per QP, as it's not in use.
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:24 +0000 (02:25 +0000)]
net/mlx4_en: Save previous MAC address of the port so we can replace it later
In preparation to having more than one unicast MAC per port, we need to keep track
of the previous MAC address in the flow of ndo_set_mac_address,
so that mlx4_en_replace_mac will know what to replace.
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:23 +0000 (02:25 +0000)]
net/mlx4_en: Re-arrange ndo_set_rx_mode related code
Currently, mlx4_en_do_set_multicast serves as the ndo_set_rx_mode entry for mlx4_en,
doing all related work. Split it to few calls, one per required functionality
(e.g multicast, promiscuous, etc) and rename some structures and calls
to use rx_mode notation instead of multicast.
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:22 +0000 (02:25 +0000)]
net/mlx4: Move Ethernet related functionality from mlx4_core to mlx4_en
Move low level code that deals with management of Ethernet MACs and QPs from mlx4_core to mlx4_en.
Also convert the new functions to deal with MACs in form of char array instead of u64.
Actual functions moved:
mlx4_replace_mac
mlx4_get_eth_qp
mlx4_put_eth_qp
To conduct this change, some functionality had to be exported from the core,
the following functions were added:
mlx4_get_base_qp
__mlx4_replace_mac (low level function for CX1/A0 compatibility)
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:20 +0000 (02:25 +0000)]
net/mlx4_en: Optimize Rx fast path filter checks
Currently, RX path code that does RX filtering is not optimized
and does an expensive conversion. In order to use ether_addr_equal_64bits
which is optimized for such cases, we need the MAC address kept by the device
to be in the form of unsigned char array instead of u64. Store the MAC address
as unsigned char array and convert to/from u64 out of the fast path when needed.
Side effect of this is that we no longer need priv->mac, since it's the same
as dev->dev_addr.
This optimization was suggested by Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:19 +0000 (02:25 +0000)]
net/mlx4_en: Optimize loopback related checks in data path
Currently there are relatively complex conditional checks in the fast path,
for TX loopback enabling and resulting RX filter logic.
Move elaborate if's out of data path, replace them with a single flag
for each state and update that state from appropriate places.
Also, in native (non SRIOV) mode and not in loopback or in selftest,
there is no need to try and filter out packets that HW loopback-ed,
as in native mode we do not loopback packets anymore.
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hauke Mehrtens [Wed, 6 Feb 2013 05:51:49 +0000 (05:51 +0000)]
bgmac: add ndo_set_rx_mode netdev ops
When changing the device from or to promisc mode this only affects the
device after the device is bought up the next time. For bridging it is
needed to change the device to promisc mode while it is up, which is
possible with this patch.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Tue, 5 Feb 2013 16:36:38 +0000 (16:36 +0000)]
net: adjust skb_gso_segment() for calling in rx path
skb_gso_segment() is almost always called in tx path,
except for openvswitch. It calls this function when
it receives the packet and tries to queue it to user-space.
In this special case, the ->ip_summed check inside
skb_gso_segment() is no longer true, as ->ip_summed value
has different meanings on rx path.
This patch adjusts skb_gso_segment() so that we can at least
avoid such warnings on checksum.
Cc: Jesse Gross <jesse@nicira.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Flavio Leitner [Tue, 5 Feb 2013 09:30:55 +0000 (09:30 +0000)]
team: allow userspace to take control over carrier
Some modes don't require any special carrier handling so
in these cases, the kernel can control the carrier as for
any other interface. However, some other modes, e.g. lacp,
requires more than just that, so userspace needs to control
the carrier itself.
The daemon today is ready to control it, but the kernel
still can change it based on events.
This fix so that either kernel or userspace is controlling
the carrier.
Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Tue, 5 Feb 2013 08:05:43 +0000 (08:05 +0000)]
netpoll: protect napi_poll and poll_controller during dev_[open|close]
Ivan Vercera was recently backporting commit 9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initalization, it does nothing for the driver
during shutdown. I.e. it would be entirely possible to have the
ndo_poll_controller method (or subsequently the ndo_poll) routine called for a
driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or
ndo_open routine could be called. Given that the two latter routines tend to
initizlize and free many data structures that the former two rely on, the result
can easily be data corruption or various other crashes. Furthermore, it seems
that this is potentially a problem with all net drivers that support netpoll,
and so this should ideally be fixed in a common path.
As Ben H Pointed out to me, we can't preform dev_open/dev_close in atomic
context, so I've come up with this solution. We can use a mutex to sleep in
open/close paths and just do a mutex_trylock in the napi poll path and abandon
the poll attempt if we're locked, as we'll just retry the poll on the next send
anyway.
I've tested this here by flooding netconsole with messages on a system whos nic
driver I modfied to periodically return NETDEV_TX_BUSY, so that the netpoll tx
workqueue would be forced to send frames and poll the device. While this was
going on I rapidly ifdown/up'ed the interface and watched for any problems.
I've not found any.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Ivan Vecera <ivecera@redhat.com> CC: "David S. Miller" <davem@davemloft.net> CC: Ben Hutchings <bhutchings@solarflare.com> CC: Francois Romieu <romieu@fr.zoreil.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Calling icmpv6_send() on a local message size error leads to an
incorrect update of the path mtu in the case when IPsec is used.
So use ipv6_local_error() instead to notify the socket about the
error.
Reported-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Mon, 4 Feb 2013 18:22:29 +0000 (18:22 +0000)]
drivers: net: misc: Remove unused OOM variables
commits 9d11bd159
("wimax: Remove unnecessary alloc/OOM messages, alloc cleanups")
and b2adaca92
("ethernet: Remove unnecessary alloc/OOM messages, alloc cleanups")
added a couple of unused variable warnings.
Remove the now unused variables.
Noticed-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Feb 2013 19:54:49 +0000 (14:54 -0500)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
This series contains updates to e1000e and ixgbe. Majority of the patches
are against e1000e, where Bruce makes several cosmetic #define moves into
header files. In addition, Bruce does a cleanup of braces to resolve
checkpatch warnings (when using the strict option).
Ixgbe patches contain several fixes as well as updating the copyright. The
fixes from Josh Hay, resolved a possible NULL pointer dereference and
resolved Smatch warnings by fixing return values and memcpy parameters.
Alex provides 2 fixes, the first is to replace rmb() with
read_barrier_depends() in the Tx cleanup. The second fixes an MTU
warning when using SR-IOV which corrects the fact that we were using 1522
to test for the max frame size in ixgbe_change_mtu and 1518 in
ixgbe_set_vf_lpe. The difference was the addition of VLAN_HLEN, which we
only need to add in the case of computing a buffer size, but not a filter
size. Lastly, a patch from Emil which is based on a community patch from
Aurélien Guillaume which adds functions needed for reading SFF-8472
diagnostic data from SFP modules.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP Appropriate Byte Count was added by me, but later disabled.
There is no point in maintaining it since it is a potential source
of bugs and Linux already implements other better window protection
heuristics.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 31 Jan 2013 23:43:03 +0000 (23:43 +0000)]
l2tp: create tunnel sockets in the right namespace
When creating unmanaged tunnel sockets we should honour the network namespace
passed to l2tp_tunnel_create. Furthermore, unmanaged tunnel sockets should
not hold a reference to the network namespace lest they accidentally keep
alive a namespace which should otherwise have been released.
Unmanaged tunnel sockets now drop their namespace reference via sk_change_net,
and are released in a new pernet exit callback, l2tp_exit_net.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 31 Jan 2013 23:43:02 +0000 (23:43 +0000)]
l2tp: prevent tunnel creation on netns mismatch
l2tp_tunnel_create is passed a pointer to the network namespace for the
tunnel, along with an optional file descriptor for the tunnel which may
be passed in from userspace via. netlink.
In the case where the file descriptor is defined, ensure that the namespace
associated with that socket matches the namespace explicitly passed to
l2tp_tunnel_create.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 31 Jan 2013 23:43:01 +0000 (23:43 +0000)]
l2tp: set netnsok flag for netlink messages
The L2TP netlink code can run in namespaces. Set the netnsok flag in
genl_family to true to reflect that fact.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 31 Jan 2013 23:43:00 +0000 (23:43 +0000)]
l2tp: put tunnel socket release on a workqueue
To allow l2tp_tunnel_delete to be called from an atomic context, place the
tunnel socket release calls on a workqueue for asynchronous execution.
Tunnel memory is eventually freed in the tunnel socket destructor.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The ipv6 route.c conflict is simple, just ignore the 'net' side change
as we fixed the same problem in 'net-next' by eliminating cached
neighbours from ipv6 routes.
The e1000e conflict is an addition of a new statistic in the ethtool
code, trivial.
The vmxnet3 conflict is about one change in 'net' removing a guarding
conditional, whilst in 'net-next' we had a netdev_info() conversion.
The iwlwifi conflict is dealing with a WARN_ON() conversion in
'net-next' vs. a revert happening in 'net'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 9 Jan 2013 08:50:42 +0000 (08:50 +0000)]
ixgbe: Fix SR-IOV MTU warning
This change corrects the fact that we were using 1522 to test for the
max frame size in ixgbe_change_mtu and 1518 in ixgbe_set_vf_lpe. The
difference was the addition of VLAN_HLEN which we only need to add in the case
of computing a buffer size, but not a filter size.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Sibai Li <Sibai.li@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Tue, 8 Jan 2013 07:00:58 +0000 (07:00 +0000)]
ixgbe: Replace rmb in Tx cleanup with read_barrier_depends
The rmb in the Tx cleanup path is a much stronger barrier than we really need.
All that is really needed is a read_barrier_depends since the location of the
EOP descriptor is dependent on the eop_desc value.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Tue, 8 Jan 2013 05:02:28 +0000 (05:02 +0000)]
ixgbe: update date to 2013
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Josh Hay [Fri, 4 Jan 2013 03:34:42 +0000 (03:34 +0000)]
ixgbe: fix return values and memcpy parameters to eliminate Smatch warnings
This patch removes the rval variable returns from function and replaces
them with direct returns in ixgbe_dcbnl_getnumtcs. It also changes how
ixgbe_gstrings_test is copied into data with memcpy in ixgbe_get_strings
because "*ixgbe_gstrings_test too small (32 vs 160)".
Signed-off-by: Josh Hay <joshua.a.hay@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Josh Hay [Fri, 4 Jan 2013 03:34:36 +0000 (03:34 +0000)]
ixgbe: fix potential null dereference
This patch adds a default case which goes to the next loop iteration
in the case where p is not set, preventing p from being dereferenced.
Signed-off-by: Josh Hay <joshua.a.hay@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 24 Jan 2013 00:50:18 +0000 (00:50 +0000)]
e1000e: cleanup checkpatch braces checks
Resolve the following strict checkpatch checks:
CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
CHECK:BRACES: braces {} should be used on all arms of this statement
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 5 Feb 2013 08:30:59 +0000 (00:30 -0800)]
e1000e: convert enums of register offsets and move #defines to regs.h
There are enough register offsets to warrant being in their own header
file, and doing so logically separates them from other header file content.
They have been converted from an enumerated data type to #defines as is
done in all the other Intel wired ethernet drivers.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:35 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines and prototypes to the new manage.h
Move #defines, function prototypes and data types which are applicable to
all/most devices supported by the driver but are specific to the
manageability component of each device to the new manage.h header file.
These #defines, function prototypes and data types can be used by other
files in the driver and moving them to the manageability-specific file
makes it clearer to which component they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:30 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines and function prototypes to the new nvm.h
Move #defines and function prototypes which are applicable to all/most
devices supported by the driver and are specific to the NVM component of
each device to the new nvm.h header file. These #defines and function
prototypes can be used by other files in the driver and moving them to the
NVM-specific file makes it clearer to which component they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:25 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines and function prototypes to the new phy.h
Move #defines and function prototypes which are applicable to all/most
devices supported by the driver and are specific to the PHY component of
each device to the new phy.h header file. These function prototypes can be
used by other files in the driver and moving them to the PHY-specific file
makes it clearer to which component they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:19 +0000 (08:44 +0000)]
e1000e: cosmetic move of function prototypes to the new mac.h
Move prototypes for functions which are applicable to all/most devices
supported by the driver and are specific to the MAC component of each
device to the new mac.h header file. These function prototypes can be used
by other files in the driver and moving them to the MAC-specific file makes
it clearer to which component they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:14 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines and prototypes to the new ich8lan.h
Move #defines and function prototypes specific to the ICH/PCH family of
devices (ICH8/82562, ICH8/82566, ICH8/82567, ICH9/82562, ICH9/82566,
ICH9/82567, ICH10/82567, 82577, 82578, 82579, I217, I218) to the new
ich8lan.h header file (the convention for Intel wired ethernet drivers is
to use the name of the first device in the family for related file and
function names). These defines and function prototypes can be used by
other files in the driver and moving them to the ICH/PCH-family-specific
file makes it clearer to which devices they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:09 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines to the new 80003es2lan.h
Move #defines specific to the ESB2/82563 family of devices to the new
80003es2lan.h header file. These defines can be used by other files in the
driver and moving them to the 80003es2lan-family-specific file makes it
clearer to which devices they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:04 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines and prototypes to the new 82571.h
Move #defines and function prototypes specific to the 8257x family of
devices (82571, 82572, 82573, 82574, 82583) to the new 82571.h header file
(the convention for Intel wired ethernet drivers is to use the name of the
first device in the family for related file and function names). These
defines and function prototypes can be used by other files in the driver
and moving them to the 8257x-family-specific file makes it clearer to which
devices they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
"gianfar: Pack struct gfar_priv_grp into three cachelines"
causes the following null dereference at driver init on sbc8548:
libphy: Freescale PowerQUICC MII Bus: probed
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc01d6a38
Oops: Kernel access of bad area, sig: 11 [#1]
[...]
NIP [c01d6a38] gfar_parse_group+0x228/0x280
LR [c01d6a34] gfar_parse_group+0x224/0x280
Call Trace:
[ef82dd60] [c01d6a34] gfar_parse_group+0x224/0x280 (unreliable)
[ef82dd90] [c01d73a4] gfar_probe+0x284/0xfe0
The reason is that the commit also changed the allocation of the
Rx and error handling irq structs to be skipped for !MQ_MG_MODE.
In the !MQ_MG_MODE case, only the Tx irq struct is allocated.
Digging further, we see that MQ_MG_MODE is set only if we find
the OF compatible string "fsl,etsec2".
A quick grep in the dts directory shows lots of boards that support
Rx/Tx/Err, but without this specific compat string. And hence they
go after the unallocated Rx/Error structs and cause the above oops.
Hence such a change can not be deployed until all the dts files
are updated and sufficiently deployed. Further, the optimization
is of limited value, since the kmalloc'd struct in question has only
a single unsigned int, and an (IFNAMSIZ + 6) sized string.
Note that no changes to the freeing code are needed here, as it
already did an unconditional free of Rx/Tx/Error gfar_irqinfo.
Cc: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 1 Feb 2013 08:17:25 +0000 (08:17 +0000)]
team: move netlink event notifiers after team_port_leave()
In team_port_del(), there is need to be do all the cleanup related
things first and netlink event notifiers should be called after that.
This fixes two problems:
team carrier is now correctly set (port is removed from list first)
mode can set option as changed in .port_leave op
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 1 Feb 2013 08:17:24 +0000 (08:17 +0000)]
team: handle sending port list in the same way option list is sent
Essentially do the same thing with port list as with option list.
Multipart netlink message.
Side effect is that port event message can send port which is not longer
in team->port_list.
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Hadar Hen Zion [Mon, 4 Feb 2013 03:01:21 +0000 (03:01 +0000)]
net/mlx4_en: Fix compilation error when CONFIG_INET isn't defined
ip_eth_mc_map function can't be used when CONFIG_INET isn't defined.
Fixed compilation error by adding CONFIG_INET define check before using the
function.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Sun, 3 Feb 2013 20:32:57 +0000 (20:32 +0000)]
net: remove redundant check for timer pending state before del_timer
As in del_timer() there has already placed a timer_pending() function
to check whether the timer to be deleted is pending or not, it's
unnecessary to check timer pending state again before del_timer() is
called.
Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla [Sun, 3 Feb 2013 20:30:11 +0000 (20:30 +0000)]
be2net: fix re-loaded PF driver to re-gain control of its VFs
Currently, when the PF driver is unloaded and re-loaded while VFs are attached
to VMs, it loses control of its VFs.
The PF driver now uses the newly defined/created GET_IFACE_LIST cmd
(available in FW ver >= 4.6) to query the if_id of the VFs
(enabled in its previous life). The PF driver then uses the if_id for
further VF configuration.
The GET_IFACE_MAC_LIST cmd has also implemented in BE3 FW for PF to
query pmac-ids used by its VFs.
Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
alloc failures already get standardized OOM
messages and a dump_stack.
Convert kzalloc's with multiplies to kcalloc.
Convert kmalloc's with multiplies to kmalloc_array.
Remove now unused variables.
Remove unnecessary memset after kzalloc->kcalloc.
Whitespace cleanups for these changes.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
alloc failures already get standardized OOM
messages and a dump_stack.
Convert kzalloc's with multiplies to kcalloc.
Convert kmalloc's with multiplies to kmalloc_array.
Fix a few whitespace defects.
Convert a constant 6 to ETH_ALEN.
Use parentheses around sizeof.
Convert vmalloc/memset to vzalloc.
Remove now unused size variables.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 1 Feb 2013 04:37:43 +0000 (04:37 +0000)]
sctp: sctp_close: fix release of bindings for deferred call_rcu's
It seems due to RCU usage, i.e. within SCTP's address binding list,
a, say, ``behavioral change'' was introduced which does actually
not conform to the RFC anymore. In particular consider the following
(fictional) scenario to demonstrate this:
do:
Two SOCK_SEQPACKET-style sockets are opened (S1, S2)
S1 is bound to 127.0.0.1, port 1024 [server]
S2 is bound to 127.0.0.1, port 1025 [client]
listen(2) is invoked on S1
From S2 we call one sendmsg(2) with msg.msg_name and
msg.msg_namelen parameters set to the server's
address
S1, S2 are closed
goto do
The first pass of this loop passes successful, while the second round
fails during binding of S1 (address still in use). What is happening?
In the first round, the initial handshake is being done, and, at the
time close(2) is called on S1, a non-graceful shutdown is performed via
ABORT since in S1's receive queue an unprocessed packet is present,
thus stating an error condition. This can be considered as a correct
behavior.
During close also all bound addresses are freed, thus nothing *must*
be active anymore. In reference to RFC2960:
After checking the Verification Tag, the receiving endpoint shall
remove the association from its record, and shall report the
termination to its upper layer. (9.1 Abort of an Association)
Also, no half-open states are supported, thus after an ungraceful
shutdown, we leave nothing behind. However, this seems not to be
happening though. In a real-world scenario, this is exactly where
it breaks the lksctp-tools functional test suite, *for instance*:
./test_sockopt
test_sockopt.c 1 PASS : getsockopt(SCTP_STATUS) on a socket with no assoc
test_sockopt.c 2 PASS : getsockopt(SCTP_STATUS)
test_sockopt.c 3 PASS : getsockopt(SCTP_STATUS) with invalid associd
test_sockopt.c 4 PASS : getsockopt(SCTP_STATUS) with NULL associd
test_sockopt.c 5 BROK : bind: Address already in use
The underlying problem is that sctp_endpoint_destroy() hasn't been
triggered yet while the next bind attempt is being done. It will be
triggered eventually (but too late) by sctp_transport_destroy_rcu()
after one RCU grace period:
Thus, we move out the condition with sctp_association_put() as well as
the sctp_packet_free() invocation and the issue can be solved. We also
better free the SCTP chunks first before putting the ref of the association.
With this patch, the example above (which simulates a similar scenario
as in the implementation of this test case) and therefore also the test
suite run successfully through. Tested by myself.
Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
since the mdb table is belong to bridge device,and the
bridge device can only be seen in one netns.
So it's safe to allow unprivileged user which is the
creator of userns and netns to modify the mdb table.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gao feng [Thu, 31 Jan 2013 16:30:58 +0000 (16:30 +0000)]
netns: ebtable: allow unprivileged users to operate ebtables
ebt_table is a private resource of netns, operating ebtables
in one netns will not affect other netns, we can allow the
creator user of userns and netns to change the ebtables.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gao feng [Thu, 31 Jan 2013 16:30:57 +0000 (16:30 +0000)]
netns: fdb: allow unprivileged users to add/del fdb entries
Right now,only ixgdb,macvlan,vxlan and bridge implement
fdb_add/fdb_del operations.
these operations only operate the private data of net
device. So allowing the unprivileged users who creates
the userns and netns to add/del fdb entries will do no
harm to other netns.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Thu, 31 Jan 2013 08:36:05 +0000 (08:36 +0000)]
net: usbnet: fix tx_dropped statistics
It is normal for minidrivers accumulating frames to return NULL
from their tx_fixup function. We do not want to count this as a
drop, or log any debug messages. A different exit path is
therefore chosen for such drivers, skipping the debug message
and the tx_dropped increment.
The test for accumulating drivers was however completely bogus,
making the exit path selection depend on whether the user had
enabled tx_err logging or not. This would arbitrarily mess up
accounting for both accumulating and non-accumulating minidrivers,
and would result in unwanted debug messages for the accumulating
drivers.
Fix by testing for FLAG_MULTI_PACKET instead, which probably was
the intention from the beginning. This usage match the documented
behaviour of this flag:
Indicates to usbnet, that USB driver accumulates multiple IP packets.
Affects statistic (counters) and short packet handling.
Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates LINUX_MIB_LISTENDROPS and LINUX_MIB_LISTENOVERFLOWS in
tcp_v6_conn_request() and tcp_v6_err(). tcp_v6_conn_request() in particular can
drop SYNs for various reasons which are not currently tracked.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates LINUX_MIB_LISTENDROPS in tcp_v4_conn_request() and
tcp_v4_err(). tcp_v4_conn_request() in particular can drop SYNs for various
reasons which are not currently tracked.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>