git.karo-electronics.de Git - linux-beck.git/log

net: sctp: sctp_auth_make_key_vector: use sctp_auth_create_key

In sctp_auth_make_key_vector, we allocate a temporary sctp_auth_bytes
structure with kmalloc instead of the sctp_auth_create_key allocator.
Change this to sctp_auth_create_key as it is the case everywhere else,
so that we also can properly free it via sctp_auth_key_put. This makes
it easier for future code changes in the structure and allocator itself,
since a single API is consistently used for this purpose. Also, by
using sctp_auth_create_key we're doing sanity checks over the arguments.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

macvlan: add a salt to mc_hash()

Some multicast addresses are common to all macvlans,
so if a multicast message has a hash value collision, we
have to deliver a copy to all macvlans, adding significant
latency and possible packet drops if netdev_max_backlog
limit is hit.

Having a per macvlan hash function permits to reduce the
impact of hash collisions.

Suggested-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

macvlan: broadcast addr should be part of mc_filter

commit cd431e738509e (macvlan: add multicast filter) forgot
the broadcast case.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
SIgned-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: fix a RCU warning in net/ipv6/ip6_flowlabel.c

This patch fixes the following RCU warning:

[   51.680236] ===============================
[   51.681914] [ INFO: suspicious RCU usage. ]
[   51.683610] 3.8.0-rc6-next-20130206-sasha-00028-g83214f7-dirty #276 Tainted: G        W
[   51.686703] -------------------------------
[   51.688281] net/ipv6/ip6_flowlabel.c:671 suspicious rcu_dereference_check() usage!

we should use rcu_dereference_bh() when we hold rcu_read_lock_bh().

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: net: Remove remaining alloc/OOM messages

alloc failures already get standardized OOM
messages and a dump_stack.

For the affected mallocs around these OOM messages:

Converted kmallocs with multiplies to kmalloc_array.
Converted a kmalloc/memcpy to kmemdup.
Removed now unused stack variables.
Removed unnecessary parentheses.
Neatened alignment.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Arend van Spriel <arend@broadcom.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bgmac: fix "cmdcfg" calls for promisc and loopback modes

The last (bool) parameter in bgmac_cmdcfg_maskset says if the write
should be made, even if value didn't change. Currently driver doesn't
match the specs about (not) forcing some changes. This makes it follow
them.

Reported-by: Nathan Hintz <nlhintz@hotmail.com>
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bgmac: validate (and random if needed) MAC addr

This adds check for a valid Ethernet MAC address and in case it is not,
it will generate a valid random one, such that the adapter is still
usable.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

skbuff: Move definition of NETDEV_FRAG_PAGE_MAX_SIZE

In order to address the fact that some devices cannot support the full 32K
frag size we need to have the value accessible somewhere so that we can use it
to do comparisons against what the device can support. As such I am moving
the values out of skbuff.c and into skbuff.h.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'wireless'

John W. Linville says:

====================
Please accept this pull request intended for the 3.9 stream!

Included are a mac80211 pull, an iwlwifi pull (actually two -- one
was a fast-forward), a wl12xx pull, and a couple of Bluetooth pulls.

On mac80211, Johannes says:

"I've included
* AKM definitions from Bing,
* mesh fixes from Thomas, including a fix from him for me breaking his
   patch while applying,
* channel check fix from Simon,
* an old patch from Yoni Divinsky who doesn't even work for TI any
   more, to configure the WEP TX key for ARP offload etc.
* MAC ACL API from Vasanth
* a fix for the infamous chanctx_conf warning from Arnd
* from myself, a fix for my previous aggregation changes, some cleanup
   and some improvements and fixes for WoWLAN"

On iwlwifi, Johannes says:

"Two small changes for iwlwifi-next, one to update all our Copyright
notices and one to provide the RX page order."

And also:

"So what I have here is some cleanups, preparations and the new MVM
(multi-virtual MAC) driver itself and (this is new) some work on the
transport API as well as a message flooding fix."

On wl12xx, Luca says:

"Lots of bugfixes and improvements in our TI wireless drivers,
including support for multi-channel.  Intended for 3.9."

On Bluetooth, Gustavo says:

"This is my first pull request to 3.9. The biggest changes here are from Johan
Hedberg who made a lot of fixes in the Management interface. The issues arose
due to a new test tool we wrote and the usage of the Management interface as
default in BlueZ 5. The rest of the patches are more clean ups and small
fixes."

And also:

"Here goes another batch intended for 3.9, the majority of the patch here are
from Johan who is fixing many issues in the management interface that have
appeared lately. The rest of the patches are just small improvements, fixes
and clean ups."

Along with those are the usual variety of updates/enhancements to
the mwl8k, mwifiex, ath9k, rtlwifi, and rt2x00 drivers as well as
a few updates for the ssb and bcma busses.  I don't think there are
any big headliners there.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem

Merge branch 'tg3'

Attention linux-next maintainer, you will hit a merge conflict between
this merge and the mips tree, the resolution is to preserve the removal
of uses of nvram_geenv() and nvram_parse_macaddr() from the net-next
side.

Hauke Mehrtens says:

====================
These patches are adding support for the Ethernet core found in the
BCM4705/BCM4785 SoC.

This is based on current master of davem/net-next.git.

v4:
* move setting of DMA_RWCTRL_ONE_DMA

v3:
* combined first two patches into one patch

v2:
* use of struct sprom in ssb_gige_get_macaddr() instead of accessing
the nvram directly
* add return value to ssb_gige_get_macaddr()
* try to read the mac address from ssb core before accessing the own
registers.
* fix two checkpatch warnings
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tg3: add support for Ethernet core in bcm4785

The BCM4785 or sometimes named BMC4705 is a Broadcom SoC which a
Gigabit 5750 Ethernet core. The core is connected via PCI with the rest
of the SoC, but it uses some extension.

This core does not use a firmware or an eeprom.

Some devices only have a switch which supports 100MBit/s, this
currently does not work with this driver.

This patch was original written by Michael Buesch <m@bues.ch> and is in
OpenWrt for some years now.

This was tested on a Linksys WRT610N V1 and older versions of this patch
were tested by other people on different devices.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tg3: make it possible to provide phy_id in ioctl

In OpenWrt we currently use a switch driver which uses the ioctls to
configure the switch in the phy. We have to provide the phy_id to do
so, but without this patch this is not possible.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ssb: get mac address from sprom struct for gige driver

The mac address is already stored in the sprom structure by the
platform code of the SoC this Ethernet core is found on, it just has to
be fetched from this structure instead of accessing the nvram here.
This patch also adds a return value to indicate if a mac address could
be fetched from the sprom structure.
When CONFIG_SSB_DRIVER_GIGE is not set the header file now also declares
ssb_gige_get_macaddr().

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Michael Buesch <m@bues.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sctp: sctp_auth_make_key_vector: remove duplicate ntohs calls

Instead of calling 3 times ntohs(random->param_hdr.length), 2 times
ntohs(hmacs->param_hdr.length), and 3 times ntohs(chunks->param_hdr.length)
within the same function, we only call each once and store it in a
variable.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: fix spin_lock dead lock

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.8.0-rc5+ #82 Not tainted
---------------------------------------------------------
swapper/0/0 just changed the state of lock:
(&(&fep->hw_lock)->rlock){..-...}, at: [<8034e2f8>] fec_enet_start_xmit+0x48/0x 2cc
but this lock took another, SOFTIRQ-unsafe lock in the past:
(prepare_lock){+.+.+.}

and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:

CPU0 CPU1
---- ----
lock(prepare_lock);
local_irq_disable()
lock(&(&fep->hw_lock)->rlock);
lock(prepare_lock);
<Interrupt>
lock(&(&fep->hw_lock)->rlock);

*** DEADLOCK ***

Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: correct fix method about miss init spinlock

Old method will cause init spinlock twice.
New method will avoid init spinlock twice and fix miss init spinlock
at fec_restart.

BUG: spinlock bad magic on CPU#1, swapper/0/1
lock: 0xbfae0f8c, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
Backtrace:
[<80011d54>] (dump_backtrace+0x0/0x10c) from [<804e7800>] (dump_stack+0x18/0x1c)
r6:bfae0000 r5:bfae0f8c r4:00000000 r3:806c1310
[<804e77e8>] (dump_stack+0x0/0x1c) from [<804e9f20>] (spin_dump+0x80/0x94)
[<804e9ea0>] (spin_dump+0x0/0x94) from [<804e9f60>] (spin_bug+0x2c/0x30)
r5:805f6f8c r4:bfae0f8c
[<804e9f34>] (spin_bug+0x0/0x30) from [<80257984>] (do_raw_spin_lock+0x170/0x1b0                                         )
r5:806b4950 r4:bfae0f8c
[<80257814>] (do_raw_spin_lock+0x0/0x1b0) from [<804ed15c>] (_raw_spin_lock_irqs                                         ave+0x18/0x20)
[<804ed144>] (_raw_spin_lock_irqsave+0x0/0x20) from [<8033c694>] (fec_ptp_start_                                         cyclecounter+0x3c/0x120)
r4:bfae0f8c r3:00000002
[<8033c658>] (fec_ptp_start_cyclecounter+0x0/0x120) from [<80339e08>] (fec_resta                                         rt+0x56c/0x5f8)
r8:00000000 r7:806e6f48 r6:00000112 r5:806b4950 r4:bfae0000
[<8033989c>] (fec_restart+0x0/0x5f8) from [<8033b9e4>] (fec_probe+0x508/0xa48)

Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4_en: Fix BQL reset TX queue call point

Fix issue in Mellanox driver related to BQL. netdev_tx_reset_queue
was not being called in certain situations where the device was
being start and stopped. Moved netdev_tx_reset_queue from the reset
device path to mlx4_en_free_tx_buf which is where the rings are
cleaned in a reset (specifically from device being stopped).

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlx4'

Amir Vadai says:

====================
This series from Yan Burman adds support for unicast MAC address filtering and
ndo FDB operations. It also includes some optimizations to loopback related
decisions and checks in the TX/RX fast path and one cleanup, all in separate
patches.

Today, when adding macvlan devices, the NIC goes into promiscuous mode, since
unicast MAC filtering is not supported. With these changes, macvlan devices can
be added without the penalty of promiscuous mode.

If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).

Also, now it is possible to have bridge under multi-function configuration that include
PF and VFs. In order to use bridge over PF/VFs, VM MAC fdb entries must be added e.g.
using 'bridge fdb add' command.

Changes from v1 - based on more comments from Eric Dumazet:
* added failure handling when adding unicast address filter

Changes from v0 - based on comments from Eric Dumazet:
* Removed unneeded synchronize_rcu()
* Use kfree_rcu() instead of synchronize_rcu() + kfree()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Implement ndo fdb functionality

Add support for setting embedded switch fdb in case of SRIOV, by
implementing ndo_fdb_{add, del, dump}. This will allow to use
bridged configuration with multi-function. In order to add VM MAC
to the eSwitch fdb, the following command may be used over the relevant function interface:
bridge fdb add <MAC> permanent self dev <IFACE>

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Add unicast MAC filtering

Implement and advertise unicast MAC filtering, such that setting macvlan
instance over mlx4_en interfaces will not require the networking core
to put mlx4_en devices in promiscuous mode.

If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Manage hash of MAC addresses per port

As a preparation step for supporting multiple unicast addresses, store MAC addresses in hash table.
Remove the radix tree for MAC addresses per QP, as it's not in use.

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Save previous MAC address of the port so we can replace it later

In preparation to having more than one unicast MAC per port, we need to keep track
of the previous MAC address in the flow of ndo_set_mac_address,
so that mlx4_en_replace_mac will know what to replace.

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Re-arrange ndo_set_rx_mode related code

Currently, mlx4_en_do_set_multicast serves as the ndo_set_rx_mode entry for mlx4_en,
doing all related work. Split it to few calls, one per required functionality
(e.g multicast, promiscuous, etc) and rename some structures and calls
to use rx_mode notation instead of multicast.

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4: Move Ethernet related functionality from mlx4_core to mlx4_en

Move low level code that deals with management of Ethernet MACs and QPs from mlx4_core to mlx4_en.
Also convert the new functions to deal with MACs in form of char array instead of u64.

Actual functions moved:
mlx4_replace_mac
mlx4_get_eth_qp
mlx4_put_eth_qp

To conduct this change, some functionality had to be exported from the core,
the following functions were added:
mlx4_get_base_qp
__mlx4_replace_mac (low level function for CX1/A0 compatibility)

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Cleanup multiline strings

Make the code consistent in regard to error messages
not spanning multiple lines.

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Optimize Rx fast path filter checks

Currently, RX path code that does RX filtering is not optimized
and does an expensive conversion. In order to use ether_addr_equal_64bits
which is optimized for such cases, we need the MAC address kept by the device
to be in the form of unsigned char array instead of u64. Store the MAC address
as unsigned char array and convert to/from u64 out of the fast path when needed.
Side effect of this is that we no longer need priv->mac, since it's the same
as dev->dev_addr.

This optimization was suggested by Eric Dumazet <eric.dumazet@gmail.com>

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Optimize loopback related checks in data path

Currently there are relatively complex conditional checks in the fast path,
for TX loopback enabling and resulting RX filter logic.
Move elaborate if's out of data path, replace them with a single flag
for each state and update that state from appropriate places.
Also, in native (non SRIOV) mode and not in loopback or in selftest,
there is no need to try and filter out packets that HW loopback-ed,
as in native mode we do not loopback packets anymore.

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bgmac: add ndo_set_rx_mode netdev ops

When changing the device from or to promisc mode this only affects the
device after the device is bought up the next time. For bridging it is
needed to change the device to promisc mode while it is up, which is
possible with this patch.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

bgmac: add generic ndo_validate_addr netdev ops

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

bgmac: write mac address to hardware in ndo_set_mac_address

The generic implementation just changes the netdev struct and does not
write the new mac address to the hardware or issues some command to do
so.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

bgmac: implement missing code for BCM53572

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

macvlan: add multicast filter

Setting up IPv6 addresses on configurations with many macvlans
is not really working, as many multicast messages are dropped.

Add a multicast filter to macvlan to reduce the amount of cloned
skbs and overhead.

Successfully tested with 1024 macvlans on one ethernet device.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: reset mac header in dev_start_xmit()

On 64 bit arches :

There is a off-by-one error in qdisc_pkt_len_init() because
mac_header is not set in xmit path.

skb_mac_header() returns an out of bound value that was
harmless because hdr_len is an 'unsigned int'

On 32bit arches, the error is abysmal.

This patch is also a prereq for "macvlan: add multicast filter"

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: adjust skb_gso_segment() for calling in rx path

skb_gso_segment() is almost always called in tx path,
except for openvswitch. It calls this function when
it receives the packet and tries to queue it to user-space.
In this special case, the ->ip_summed check inside
skb_gso_segment() is no longer true, as ->ip_summed value
has different meanings on rx path.

This patch adjusts skb_gso_segment() so that we can at least
avoid such warnings on checksum.

Cc: Jesse Gross <jesse@nicira.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wpan: use stack buffer instead of heap

head buffer is only temporary available in mac802154_header_create.
So it's not necessary to put it on the heap.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6lowpan: use stack buffer instead of heap

head buffer is only temporary available in lowpan_header_create.
So it's not necessary to put it on the heap.

Also fixed a comment codestyle issue.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6lowpan: Remove __init tag from lowpan_netlink_fini().

It's called from both __init and __exit code, so neither
tag is appropriate.

Signed-off-by: David S. Miller <davem@davemloft.net>

team: allow userspace to take control over carrier

Some modes don't require any special carrier handling so
in these cases, the kernel can control the carrier as for
any other interface. However, some other modes, e.g. lacp,
requires more than just that, so userspace needs to control
the carrier itself.

The daemon today is ready to control it, but the kernel
still can change it based on events.

This fix so that either kernel or userspace is controlling
the carrier.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: net:ethernet: cpsw: add support for VLAN

adding support for VLAN interface for cpsw.

CPSW VLAN Capability
* Can filter VLAN packets in Hardware

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: net: cpsw: Add helper functions for VLAN ALE implementation

Add helper functions for VLAN ALE implementations for Add, Delete
Dump VLAN related ALE entries

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netpoll: protect napi_poll and poll_controller during dev_[open|close]

Ivan Vercera was recently backporting commit
9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initalization, it does nothing for the driver
during shutdown. I.e. it would be entirely possible to have the
ndo_poll_controller method (or subsequently the ndo_poll) routine called for a
driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or
ndo_open routine could be called.  Given that the two latter routines tend to
initizlize and free many data structures that the former two rely on, the result
can easily be data corruption or various other crashes.  Furthermore, it seems
that this is potentially a problem with all net drivers that support netpoll,
and so this should ideally be fixed in a common path.

As Ben H Pointed out to me, we can't preform dev_open/dev_close in atomic
context, so I've come up with this solution.  We can use a mutex to sleep in
open/close paths and just do a mutex_trylock in the napi poll path and abandon
the poll attempt if we're locked, as we'll just retry the poll on the next send
anyway.

I've tested this here by flooding netconsole with messages on a system whos nic
driver I modfied to periodically return NETDEV_TX_BUSY, so that the netpoll tx
workqueue would be forced to send frames and poll the device.  While this was
going on I rapidly ifdown/up'ed the interface and watched for any problems.
I've not found any.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ivan Vecera <ivecera@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Francois Romieu <romieu@fr.zoreil.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wpan: whitespace fix

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Don't send packet to big messages to self

Calling icmpv6_send() on a local message size error leads to an
incorrect update of the path mtu in the case when IPsec is used.
So use ipv6_local_error() instead to notify the socket about the
error.

Reported-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: net: misc: Remove unused OOM variables

commits 9d11bd159
("wimax: Remove unnecessary alloc/OOM messages, alloc cleanups")
and b2adaca92
("ethernet: Remove unnecessary alloc/OOM messages, alloc cleanups")
added a couple of unused variable warnings.

Remove the now unused variables.

Noticed-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: core: Remove unnecessary alloc/OOM messages

alloc failures already get standardized OOM
messages and a dump_stack.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
This series contains updates to e1000e and ixgbe.  Majority of the patches
are against e1000e, where Bruce makes several cosmetic #define moves into
header files.  In addition, Bruce does a cleanup of braces to resolve
checkpatch warnings (when using the strict option).

Ixgbe patches contain several fixes as well as updating the copyright.  The
fixes from Josh Hay, resolved a possible NULL pointer dereference and
resolved Smatch warnings by fixing return values and memcpy parameters.
Alex provides 2 fixes, the first is to replace rmb() with
read_barrier_depends() in the Tx cleanup.  The second fixes an MTU
warning when using SR-IOV which corrects the fact that we were using 1522
to test for the max frame size in ixgbe_change_mtu and 1518 in
ixgbe_set_vf_lpe.  The difference was the addition of VLAN_HLEN, which we
only need to add in the case of computing a buffer size, but not a filter
size.  Lastly, a patch from Emil which is based on a community patch from
Aurélien Guillaume which adds functions needed for reading SFF-8472
diagnostic data from SFP modules.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: remove Appropriate Byte Count support

TCP Appropriate Byte Count was added by me, but later disabled.
There is no point in maintaining it since it is a potential source
of bugs and Linux already implements other better window protection
heuristics.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: Disallow non-namespace aware protocols to register.

All in-tree ipv4 protocol implementations are now namespace
aware. Therefore all the run-time checks are superfluous.

Reject registry of any non-namespace aware ipv4 protocol.
Eventually we'll remove prot->netns_ok and this registry
time check as well.

Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: Make ipv4 protocol handler namespace aware.

The infrastructure is already pretty much entirely there
to allow this conversion.

The tunnel and session lookups have per-namespace tables,
and the ipv4 bind lookup includes the namespace in the
lookup key.

Set netns_ok in l2tp_ip_protocol.

Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: create tunnel sockets in the right namespace

When creating unmanaged tunnel sockets we should honour the network namespace
passed to l2tp_tunnel_create. Furthermore, unmanaged tunnel sockets should
not hold a reference to the network namespace lest they accidentally keep
alive a namespace which should otherwise have been released.

Unmanaged tunnel sockets now drop their namespace reference via sk_change_net,
and are released in a new pernet exit callback, l2tp_exit_net.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: prevent tunnel creation on netns mismatch

l2tp_tunnel_create is passed a pointer to the network namespace for the
tunnel, along with an optional file descriptor for the tunnel which may
be passed in from userspace via. netlink.

In the case where the file descriptor is defined, ensure that the namespace
associated with that socket matches the namespace explicitly passed to
l2tp_tunnel_create.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: set netnsok flag for netlink messages

The L2TP netlink code can run in namespaces. Set the netnsok flag in
genl_family to true to reflect that fact.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: put tunnel socket release on a workqueue

To allow l2tp_tunnel_delete to be called from an atomic context, place the
tunnel socket release calls on a workqueue for asynchronous execution.

Tunnel memory is eventually freed in the tunnel socket destructor.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/intel/e1000e/ethtool.c
drivers/net/vmxnet3/vmxnet3_drv.c
drivers/net/wireless/iwlwifi/dvm/tx.c
net/ipv6/route.c

The ipv6 route.c conflict is simple, just ignore the 'net' side change
as we fixed the same problem in 'net-next' by eliminating cached
neighbours from ipv6 routes.

The e1000e conflict is an addition of a new statistic in the ethtool
code, trivial.

The vmxnet3 conflict is about one change in 'net' removing a guarding
conditional, whilst in 'net-next' we had a netdev_info() conversion.

The iwlwifi conflict is dealing with a WARN_ON() conversion in
'net-next' vs. a revert happening in 'net'.

Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: Fix SR-IOV MTU warning

This change corrects the fact that we were using 1522 to test for the
max frame size in ixgbe_change_mtu and 1518 in ixgbe_set_vf_lpe. The
difference was the addition of VLAN_HLEN which we only need to add in the case
of computing a buffer size, but not a filter size.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Sibai Li <Sibai.li@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: Replace rmb in Tx cleanup with read_barrier_depends

The rmb in the Tx cleanup path is a much stronger barrier than we really need.
All that is really needed is a read_barrier_depends since the location of the
EOP descriptor is dependent on the eop_desc value.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: update date to 2013

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: fix return values and memcpy parameters to eliminate Smatch warnings

This patch removes the rval variable returns from function and replaces
them with direct returns in ixgbe_dcbnl_getnumtcs. It also changes how
ixgbe_gstrings_test is copied into data with memcpy in ixgbe_get_strings
because "*ixgbe_gstrings_test too small (32 vs 160)".

Signed-off-by: Josh Hay <joshua.a.hay@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: fix potential null dereference

This patch adds a default case which goes to the next loop iteration
in the case where p is not set, preventing p from being dereferenced.

Signed-off-by: Josh Hay <joshua.a.hay@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: allow reading of SFF-8472 data over i2c

This patch adds functions needed for reading SFF-8472 diagnostic data
from SFP modules.

Based on original patch from Aurélien Guillaume <footplus@gmail.com>

CC: Aurélien Guillaume <footplus@gmail.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: cleanup checkpatch braces checks

Resolve the following strict checkpatch checks:
CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
CHECK:BRACES: braces {} should be used on all arms of this statement

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: convert enums of register offsets and move #defines to regs.h

There are enough register offsets to warrant being in their own header
file, and doing so logically separates them from other header file content.
They have been converted from an enumerated data type to #defines as is
done in all the other Intel wired ethernet drivers.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: cosmetic move of #defines and prototypes to the new manage.h

Move #defines, function prototypes and data types which are applicable to
all/most devices supported by the driver but are specific to the
manageability component of each device to the new manage.h header file.
These #defines, function prototypes and data types can be used by other
files in the driver and moving them to the manageability-specific file
makes it clearer to which component they are applicable.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: cosmetic move of #defines and function prototypes to the new nvm.h

Move #defines and function prototypes which are applicable to all/most
devices supported by the driver and are specific to the NVM component of
each device to the new nvm.h header file. These #defines and function
prototypes can be used by other files in the driver and moving them to the
NVM-specific file makes it clearer to which component they are applicable.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: cosmetic move of #defines and function prototypes to the new phy.h

Move #defines and function prototypes which are applicable to all/most
devices supported by the driver and are specific to the PHY component of
each device to the new phy.h header file. These function prototypes can be
used by other files in the driver and moving them to the PHY-specific file
makes it clearer to which component they are applicable.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: cosmetic move of function prototypes to the new mac.h

Move prototypes for functions which are applicable to all/most devices
supported by the driver and are specific to the MAC component of each
device to the new mac.h header file. These function prototypes can be used
by other files in the driver and moving them to the MAC-specific file makes
it clearer to which component they are applicable.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: cosmetic move of #defines and prototypes to the new ich8lan.h

Move #defines and function prototypes specific to the ICH/PCH family of
devices (ICH8/82562, ICH8/82566, ICH8/82567, ICH9/82562, ICH9/82566,
ICH9/82567, ICH10/82567, 82577, 82578, 82579, I217, I218) to the new
ich8lan.h header file (the convention for Intel wired ethernet drivers is
to use the name of the first device in the family for related file and
function names). These defines and function prototypes can be used by
other files in the driver and moving them to the ICH/PCH-family-specific
file makes it clearer to which devices they are applicable.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: cosmetic move of #defines to the new 80003es2lan.h

Move #defines specific to the ESB2/82563 family of devices to the new
80003es2lan.h header file. These defines can be used by other files in the
driver and moving them to the 80003es2lan-family-specific file makes it
clearer to which devices they are applicable.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: cosmetic move of #defines and prototypes to the new 82571.h

Move #defines and function prototypes specific to the 8257x family of
devices (82571, 82572, 82573, 82574, 82583) to the new 82571.h header file
(the convention for Intel wired ethernet drivers is to use the name of the
first device in the family for related file and function names). These
defines and function prototypes can be used by other files in the driver
and moving them to the 8257x-family-specific file makes it clearer to which
devices they are applicable.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

qlcnic: Updating copyright information.

We recently refactored the driver source, this patch will take care of
updating copyright date and adding it to newly added files.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gianfar: dont conditionally alloc Rx/Err irq structs

Commit ee873fda3bec7c668407b837fc5519eb961fcd37

    "gianfar: Pack struct gfar_priv_grp into three cachelines"

causes the following null dereference at driver init on sbc8548:

   libphy: Freescale PowerQUICC MII Bus: probed
   Unable to handle kernel paging request for data at address 0x00000000
   Faulting instruction address: 0xc01d6a38
   Oops: Kernel access of bad area, sig: 11 [#1]
   [...]
   NIP [c01d6a38] gfar_parse_group+0x228/0x280
   LR [c01d6a34] gfar_parse_group+0x224/0x280
   Call Trace:
   [ef82dd60] [c01d6a34] gfar_parse_group+0x224/0x280 (unreliable)
   [ef82dd90] [c01d73a4] gfar_probe+0x284/0xfe0

The reason is that the commit also changed the allocation of the
Rx and error handling irq structs to be skipped for !MQ_MG_MODE.
In the !MQ_MG_MODE case, only the Tx irq struct is allocated.

Digging further, we see that MQ_MG_MODE is set only if we find
the OF compatible string "fsl,etsec2".

A quick grep in the dts directory shows lots of boards that support
Rx/Tx/Err, but without this specific compat string.  And hence they
go after the unallocated Rx/Error structs and cause the above oops.

Hence such a change can not be deployed until all the dts files
are updated and sufficiently deployed.  Further, the optimization
is of limited value, since the kmalloc'd struct in question has only
a single unsigned int, and an (IFNAMSIZ + 6) sized string.

Note that no changes to the freeing code are needed here, as it
already did an unconditional free of Rx/Tx/Error gfar_irqinfo.

Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipcomp: Mark as netns_ok.

This module is namespace aware, netns_ok was just disabled by default
for sanity.

Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: fix miss init spinlock

BUG: spinlock bad magic on CPU#1, swapper/0/1
lock: 0xbfae0f8c, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
Backtrace:
[<80011d54>] (dump_backtrace+0x0/0x10c) from [<804e7800>] (dump_stack+0x18/0x1c)
r6:bfae0000 r5:bfae0f8c r4:00000000 r3:806c1310
[<804e77e8>] (dump_stack+0x0/0x1c) from [<804e9f20>] (spin_dump+0x80/0x94)
[<804e9ea0>] (spin_dump+0x0/0x94) from [<804e9f60>] (spin_bug+0x2c/0x30)
r5:805f6f8c r4:bfae0f8c
[<804e9f34>] (spin_bug+0x0/0x30) from [<80257984>] (do_raw_spin_lock+0x170/0x1b0                                         )
r5:806b4950 r4:bfae0f8c
[<80257814>] (do_raw_spin_lock+0x0/0x1b0) from [<804ed15c>] (_raw_spin_lock_irqs                                         ave+0x18/0x20)
[<804ed144>] (_raw_spin_lock_irqsave+0x0/0x20) from [<8033c694>] (fec_ptp_start_                                         cyclecounter+0x3c/0x120)
r4:bfae0f8c r3:00000002
[<8033c658>] (fec_ptp_start_cyclecounter+0x0/0x120) from [<80339e08>] (fec_resta                                         rt+0x56c/0x5f8)
r8:00000000 r7:806e6f48 r6:00000112 r5:806b4950 r4:bfae0000
[<8033989c>] (fec_restart+0x0/0x5f8) from [<8033b9e4>] (fec_probe+0x508/0xa48)

Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

team: ab: set active port option as changed when port is leaving

In case port is leaving the team, set the option "activeport" as changed
so the change can be properly propagated to userspace

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

team: move netlink event notifiers after team_port_leave()

In team_port_del(), there is need to be do all the cleanup related
things first and netlink event notifiers should be called after that.
This fixes two problems:
team carrier is now correctly set (port is removed from list first)
mode can set option as changed in .port_leave op

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

team: handle sending port list in the same way option list is sent

Essentially do the same thing with port list as with option list.
Multipart netlink message.
Side effect is that port event message can send port which is not longer
in team->port_list.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Fix compilation error when CONFIG_INET isn't defined

ip_eth_mc_map function can't be used when CONFIG_INET isn't defined.
Fixed compilation error by adding CONFIG_INET define check before using the
function.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Fix error propagation for ethtool helper function

Propagate return value of mlx4_en_ethtool_add_mac_rule_by_ipv4 in case of
failure.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mcast: do not check 'rv' twice in a row

With the loop, don't check 'rv' twice in a row. Without the loop, 'rv'
doesn't even need to be checked.

Make the comment more grammar-friendly.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: remove redundant check for timer pending state before del_timer

As in del_timer() there has already placed a timer_pending() function
to check whether the timer to be deleted is pending or not, it's
unnecessary to check timer pending state again before del_timer() is
called.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: update driver version to 4.6.x

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix re-loaded PF driver to re-gain control of its VFs

Currently, when the PF driver is unloaded and re-loaded while VFs are attached
to VMs, it loses control of its VFs.

The PF driver now uses the newly defined/created GET_IFACE_LIST cmd
(available in FW ver >= 4.6) to query the if_id of the VFs
(enabled in its previous life). The PF driver then uses the if_id for
further VF configuration.

The GET_IFACE_MAC_LIST cmd has also implemented in BE3 FW for PF to
query pmac-ids used by its VFs.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers:net:misc: Remove unnecessary alloc/OOM messages

alloc failures already get standardized OOM
messages and a dump_stack.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wireless: Remove unnecessary alloc/OOM messages, alloc cleanups

alloc failures already get standardized OOM
messages and a dump_stack.

Convert kzalloc's with multiplies to kcalloc.
Convert kmalloc's with multiplies to kmalloc_array.
Remove now unused variables.
Remove unnecessary memset after kzalloc->kcalloc.
Whitespace cleanups for these changes.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wimax: Remove unnecessary alloc/OOM messages, alloc cleanups

alloc failures already get standardized OOM
messages and a dump_stack.

Convert kzalloc's with multiplies to kcalloc.
Remove now unused size variables.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wan: Remove unnecessary alloc/OOM messages

alloc failures already get standardized OOM
messages and a dump_stack.

Hoist assigns from if tests.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: net: usb: Remove unnecessary alloc/OOM messages

alloc failures already get standardized OOM
messages and a dump_stack.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet: Remove unnecessary alloc/OOM messages, alloc cleanups

alloc failures already get standardized OOM
messages and a dump_stack.

Convert kzalloc's with multiplies to kcalloc.
Convert kmalloc's with multiplies to kmalloc_array.
Fix a few whitespace defects.
Convert a constant 6 to ETH_ALEN.
Use parentheses around sizeof.
Convert vmalloc/memset to vzalloc.
Remove now unused size variables.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: Remove unnecessary alloc/OOM messages

alloc failures already get standardized OOM
messages and a dump_stack.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

caif: Remove unnecessary alloc/OOM messages

alloc failures already get standardized OOM
messages and a dump_stack.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: sctp_close: fix release of bindings for deferred call_rcu's

It seems due to RCU usage, i.e. within SCTP's address binding list,
a, say, ``behavioral change'' was introduced which does actually
not conform to the RFC anymore. In particular consider the following
(fictional) scenario to demonstrate this:

  do:
    Two SOCK_SEQPACKET-style sockets are opened (S1, S2)
    S1 is bound to 127.0.0.1, port 1024 [server]
    S2 is bound to 127.0.0.1, port 1025 [client]
    listen(2) is invoked on S1
    From S2 we call one sendmsg(2) with msg.msg_name and
       msg.msg_namelen parameters set to the server's
       address
    S1, S2 are closed
    goto do

The first pass of this loop passes successful, while the second round
fails during binding of S1 (address still in use). What is happening?
In the first round, the initial handshake is being done, and, at the
time close(2) is called on S1, a non-graceful shutdown is performed via
ABORT since in S1's receive queue an unprocessed packet is present,
thus stating an error condition. This can be considered as a correct
behavior.

During close also all bound addresses are freed, thus nothing *must*
be active anymore. In reference to RFC2960:

  After checking the Verification Tag, the receiving endpoint shall
  remove the association from its record, and shall report the
  termination to its upper layer. (9.1 Abort of an Association)

Also, no half-open states are supported, thus after an ungraceful
shutdown, we leave nothing behind. However, this seems not to be
happening though. In a real-world scenario, this is exactly where
it breaks the lksctp-tools functional test suite, *for instance*:

  ./test_sockopt
  test_sockopt.c  1 PASS : getsockopt(SCTP_STATUS) on a socket with no assoc
  test_sockopt.c  2 PASS : getsockopt(SCTP_STATUS)
  test_sockopt.c  3 PASS : getsockopt(SCTP_STATUS) with invalid associd
  test_sockopt.c  4 PASS : getsockopt(SCTP_STATUS) with NULL associd
  test_sockopt.c  5 BROK : bind: Address already in use

The underlying problem is that sctp_endpoint_destroy() hasn't been
triggered yet while the next bind attempt is being done. It will be
triggered eventually (but too late) by sctp_transport_destroy_rcu()
after one RCU grace period:

  sctp_transport_destroy()
    sctp_transport_destroy_rcu() ----.
      sctp_association_put() [*]  <--+--> sctp_packet_free()
        sctp_association_destroy()          [...]
          sctp_endpoint_put()                 skb->destructor
            sctp_endpoint_destroy()             sctp_wfree()
              sctp_bind_addr_free()               sctp_association_put() [*]

Thus, we move out the condition with sctp_association_put() as well as
the sctp_packet_free() invocation and the issue can be solved. We also
better free the SCTP chunks first before putting the ref of the association.

With this patch, the example above (which simulates a similar scenario
as in the implementation of this test case) and therefore also the test
suite run successfully through. Tested by myself.

Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb3: Update VLAN extraction stats in the GRO path

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns: bond: allow unprivileged users to control bond device

reduce the permission check of bond device's ioctl.
allow the userns root to control the bond device.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns: bridge: allow unprivileged users add/delete mdb entry

since the mdb table is belong to bridge device,and the
bridge device can only be seen in one netns.
So it's safe to allow unprivileged user which is the
creator of userns and netns to modify the mdb table.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns: ebtable: allow unprivileged users to operate ebtables

ebt_table is a private resource of netns, operating ebtables
in one netns will not affect other netns, we can allow the
creator user of userns and netns to change the ebtables.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns: fdb: allow unprivileged users to add/del fdb entries

Right now,only ixgdb,macvlan,vxlan and bridge implement
fdb_add/fdb_del operations.

these operations only operate the private data of net
device. So allowing the unprivileged users who creates
the userns and netns to add/del fdb entries will do no
harm to other netns.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: usbnet: fix tx_dropped statistics

It is normal for minidrivers accumulating frames to return NULL
from their tx_fixup function. We do not want to count this as a
drop, or log any debug messages. A different exit path is
therefore chosen for such drivers, skipping the debug message
and the tx_dropped increment.

The test for accumulating drivers was however completely bogus,
making the exit path selection depend on whether the user had
enabled tx_err logging or not. This would arbitrarily mess up
accounting for both accumulating and non-accumulating minidrivers,
and would result in unwanted debug messages for the accumulating
drivers.

Fix by testing for FLAG_MULTI_PACKET instead, which probably was
the intention from the beginning. This usage match the documented
behaviour of this flag:

Indicates to usbnet, that USB driver accumulates multiple IP packets.
Affects statistic (counters) and short packet handling.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: ipv6: Update MIB counters for drops

This patch updates LINUX_MIB_LISTENDROPS and LINUX_MIB_LISTENOVERFLOWS in
tcp_v6_conn_request() and tcp_v6_err(). tcp_v6_conn_request() in particular can
drop SYNs for various reasons which are not currently tracked.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Update MIB counters for drops

This patch updates LINUX_MIB_LISTENDROPS in tcp_v4_conn_request() and
tcp_v4_err(). tcp_v4_conn_request() in particular can drop SYNs for various
reasons which are not currently tracked.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>