git.karo-electronics.de Git - linux-beck.git/log
11 years ago  net: fec: Fix multicast list setup in fec_restart().
Christoph Müllner [Thu, 27 Jun 2013 19:18:23 +0000 (21:18 +0200)]
net: fec: Fix multicast list setup in fec_restart().

Set up the multicast list of the net_device instead of
clearing it blindly. This restores the multicast groups
in case of a link down/up event or when resuming from
suspend.

Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  ipv6: fix ecmp lookup when oif is specified
Nicolas Dichtel [Fri, 28 Jun 2013 15:35:48 +0000 (17:35 +0200)]
ipv6: fix ecmp lookup when oif is specified

There is no reason to skip the ECMP lookup when oif is specified, but this means
the oif given by the user must be checked when selecting another route.
When the new route does not match the oif requirement, we simply keep the initial
one.

Spotted-by: dingzhi <zhi.ding@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  ipv6: only apply anti-spoofing checks to not-pointopoint tunnels
Hannes Frederic Sowa [Thu, 27 Jun 2013 20:46:04 +0000 (22:46 +0200)]
ipv6: only apply anti-spoofing checks to not-pointopoint tunnels

Because of commit 218774dc341f219bfcf940304a081b121a0e8099 ("ipv6: add
anti-spoofing checks for 6to4 and 6rd") the sit driver dropped packets
for 2002::/16 destinations and sources even when configured to work as a
tunnel with fixed endpoint. We may only apply the 6rd/6to4 anti-spoofing
checks if the device is not in pointopoint mode.

This was an oversight on my part in the above commit, sorry. Thanks to
Roman Mamedov for reporting this!

Reported-by: Roman Mamedov <rm@romanrm.ru>
Cc: David Miller <davem@davemloft.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville...
David S. Miller [Mon, 1 Jul 2013 20:21:17 +0000 (13:21 -0700)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next

John W. Linville says:

====================
Yet one more pull request for wireless updates intended for 3.11...

For the mac80211 bits, Johannes says:

"Here we have a few memory leak fixes related to BSS struct handling
mostly from Ben, including a fix for a more theoretical problem
(associating while a BSS struct times out) from myself, a compilation
warning fix from Arend, mesh fixes from Thomas, tracking the beacon
bitrate (Alex), a bandwidth change event fix (Ilan) and some initial
work for 5/10 MHz channels from Simon."

Regarding the iwlwifi bits, Johannes says:

"Emmanuel removed some unneeded/unsupported module parameters and adds a
Bluetooth 1x1 lookup-table for some upcoming products. From Alex I have
an older patch to add low-power receive support, this depended on a
mac80211 commit that only just came in with the merge from wireless-next
I did. Ilan made beacon timings better, and Eytan added some debug
statements for thermal throttling. I have a few cleanups, a fix for a
long-standing but rare warning, and, arguably the most important patch
here, the firmware API version bump for the 7260/3160 devices."

Also included is a Bluetooth pull -- Gustavo says:

"Here goes a set of patches to 3.11. The biggest work here is from Andre Guedes
on the move of the Discovery to use the new request framework. Other than that
Johan provided a bunch of fixes to the L2CAP code. The rest are just small
fixes and clean ups."

On top of all that, there are a variety of updates and fixes to
brcmfmac, rt2x00, wil6210, ath9k, ath10k, and a few others here and
there.  This also includes a pull of the wireless tree, in order to
prevent some merge conflicts.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  openvswitch: Add Kconfig dependency on GRE-DEMUX.
Pravin B Shelar [Fri, 28 Jun 2013 23:07:40 +0000 (16:07 -0700)]
openvswitch: Add Kconfig dependency on GRE-DEMUX.

Openvswitch uses functions from the NET_IPGRE_DEMUX module.
Add a Kconfig dependency to fix the following compilation errors:
http://marc.info/?l=linux-netdev&m=137244035226634

CC: Jesse Gross <jesse@nicira.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Pravin Shelar <pshelar@nicira.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  alx: fix ethtool support code
Johannes Berg [Sat, 29 Jun 2013 17:23:19 +0000 (19:23 +0200)]
alx: fix ethtool support code

A number of places treated features wrongly, listing not-supported
features instead of supported ones. Also, the get_drvinfo ethtool
callback isn't needed, and alx_get_pauseparam can be simplified.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  alx: fix MAC address alignment problem
Johannes Berg [Sat, 29 Jun 2013 17:23:18 +0000 (19:23 +0200)]
alx: fix MAC address alignment problem

In two places, parts of MAC addresses are used as u32/u16
values. This can cause alignment problems; use put_unaligned
and get_unaligned to fix this.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
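
The alignment fix in the alx commit above can be illustrated with a small userspace sketch. The kernel's get_unaligned()/put_unaligned() helpers are not available outside the kernel, so the three functions below are hypothetical stand-ins built on memcpy(); they are an assumption for illustration, not the driver's code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for get_unaligned() on a 32-bit value:
 * memcpy() lets the compiler emit a load that is safe at any address,
 * unlike dereferencing a casted (uint32_t *) into a byte array. */
static uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}

/* Same idea for a 16-bit value, e.g. the last two bytes of a MAC. */
static uint16_t load_u16_unaligned(const void *p)
{
	uint16_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical stand-in for put_unaligned() on a 32-bit value. */
static void store_u32_unaligned(uint32_t v, void *p)
{
	assert(p != NULL);
	memcpy(p, &v, sizeof(v));
}
```

A 6-byte MAC address stored at an odd offset can then be read as one 32-bit and one 16-bit piece without assuming anything about the buffer's alignment.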
11 years ago  alx: separate link speed/duplex fields
Johannes Berg [Sat, 29 Jun 2013 17:23:17 +0000 (19:23 +0200)]
alx: separate link speed/duplex fields

As suggested by Ben Hutchings, use separate fields to track
current link speed and duplex setting.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  alx: make sizes unsigned
Johannes Berg [Sat, 29 Jun 2013 17:23:16 +0000 (19:23 +0200)]
alx: make sizes unsigned

The ring sizes should be unsigned, as pointed out by Ben Hutchings.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  alx: remove NET_CORE Kconfig select
Johannes Berg [Sat, 29 Jun 2013 17:23:15 +0000 (19:23 +0200)]
alx: remove NET_CORE Kconfig select

That select doesn't make any sense; remove it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  alx: fix 100mbit/half duplex speed translation
Johannes Berg [Sat, 29 Jun 2013 17:23:14 +0000 (19:23 +0200)]
alx: fix 100mbit/half duplex speed translation

100mbit half duplex is ADVERTISED_100baseT_Half, not
ADVERTISED_10baseT_Half.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
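
The translation fixed by the commit above can be sketched as a small lookup. The function name speed_duplex_to_adv() and the ADV_* constants are hypothetical; the bit values are assumed to mirror the ADVERTISED_* definitions in linux/ethtool.h and are redefined locally only to keep the example self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local, assumed copies of the ethtool advertising bits. */
enum {
	ADV_10_HALF  = 1 << 0,
	ADV_10_FULL  = 1 << 1,
	ADV_100_HALF = 1 << 2,
	ADV_100_FULL = 1 << 3,
};

/* Map a speed in Mbit/s plus duplex to a single advertising bit.
 * Returns 0 for speeds this sketch does not cover. */
static uint32_t speed_duplex_to_adv(int speed_mbps, bool full_duplex)
{
	switch (speed_mbps) {
	case 10:
		return full_duplex ? ADV_10_FULL : ADV_10_HALF;
	case 100:
		/* The bug class fixed above: returning the 10 Mbit
		 * half-duplex bit here instead of the 100 Mbit one. */
		return full_duplex ? ADV_100_FULL : ADV_100_HALF;
	default:
		return 0;
	}
}
```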
11 years ago  alx: treat flow control correctly in alx_set_pauseparam()
Johannes Berg [Sat, 29 Jun 2013 17:23:13 +0000 (19:23 +0200)]
alx: treat flow control correctly in alx_set_pauseparam()

Even when alx_setup_speed_duplex() is called, we still
need to call alx_cfg_mac_flowcontrol() and set hw->flowctrl
if flow control changed.

This was a bug I accidentally introduced while simplifying
the original driver.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  net/mlx4_core: Add HW enforcement to VF link state
Rony Efraim [Thu, 27 Jun 2013 16:05:22 +0000 (19:05 +0300)]
net/mlx4_core: Add HW enforcement to VF link state

When the firmware supports the UPDATE_QP command, if the VF link is disabled,
block all QPs opened by the VF, by programming the UPDATE_QP command to drop
all RX & TX traffic to/from these QPs. Operates only in VST mode.

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  net/mlx4_core: Dynamic VST to VST vlan/qos changes
Jack Morgenstein [Thu, 27 Jun 2013 16:05:21 +0000 (19:05 +0300)]
net/mlx4_core: Dynamic VST to VST vlan/qos changes

Within VST mode, enable modifying the vlan and/or qos
for a VF without requiring unbind/rebind.

This requires firmware which supports the UPDATE_QP command.
(If the command is not available, we fall back to requiring
unbind/bind to activate these changes).

To avoid race conditions with modify-qp on QPs that are affected
by update-qp, this operation is performed on the comm_wq.

If the update operation succeeds for all the necessary QPs, a
vlan_unregister is performed for the abandoned vlan id.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Mon, 1 Jul 2013 00:35:13 +0000 (17:35 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
The following batch contains Netfilter/IPVS updates for net-next,
they are:

* Enforce policy on several nfnetlink subsystems, from Daniel
  Borkmann.

* Use xt_socket to match the third packet (to perform simplistic
  socket-based stateful filtering), from Eric Dumazet.

* Avoid large timeouts for TCP flows picked up from the middle,
  from Florian Westphal.

* Exclude IPVS from struct net if IPVS is disabled and remove
  an unnecessarily included header file, from JunweiZhang.

* Release SCTP connection immediately under load, to mimic current
  TCP behaviour, from Julian Anastasov.

* Replace and enhance SCTP state machine, from Julian Anastasov.

* Add tweak to reduce sync traffic in the presence of persistence,
  also from Julian Anastasov.

* Add a tweak so that the IPVS SH scheduler does not reject connections
  directed to a server but chooses a new one instead, from Alexander
  Frolkin.

* Add support for sloppy TCP and SCTP modes, which create state
  information on any packet, not only initial handshake packets,
  from Alexander Frolkin.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  netfilter: nf_queue: add NFQA_SKB_CSUM_NOTVERIFIED info flag
Florian Westphal [Sat, 29 Jun 2013 12:15:47 +0000 (14:15 +0200)]
netfilter: nf_queue: add NFQA_SKB_CSUM_NOTVERIFIED info flag

The common case is that TCP/IP checksums have already been
verified, e.g. by hardware (rx checksum offload), or conntrack.

Userspace can use this flag to determine when the checksum
has not been validated yet.

If the flag is set, this doesn't necessarily mean that the packet has
an invalid checksum, e.g. if the NIC doesn't support rx checksum.

Userspace that successfully enabled the NFQA_CFG_F_GSO queue feature flag can
infer that IP/TCP checksum has already been validated if either the
SKB_INFO attribute is not present or the NFQA_SKB_CSUM_NOTVERIFIED
flag is unset.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
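
The inference rule described in the commit above can be written down as a small predicate. This is a sketch under assumptions: csum_known_valid() is a hypothetical helper, and DEMO_SKB_CSUM_NOTVERIFIED uses an assumed bit position that should be checked against linux/netfilter/nfnetlink_queue.h before use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position for NFQA_SKB_CSUM_NOTVERIFIED; illustration only. */
#define DEMO_SKB_CSUM_NOTVERIFIED (1u << 2)

/*
 * gso_enabled: userspace successfully enabled the GSO queue feature flag.
 * skb_info:    value of the SKB_INFO attribute, or NULL if the attribute
 *              is not present in the message.
 *
 * Returns true only when the checksum is known to have been validated.
 * A false result means "unknown", not "invalid".
 */
static bool csum_known_valid(bool gso_enabled, const uint32_t *skb_info)
{
	if (!gso_enabled)
		return false;	/* nothing can be inferred */
	if (skb_info == NULL)
		return true;	/* attribute absent: already validated */
	return (*skb_info & DEMO_SKB_CSUM_NOTVERIFIED) == 0;
}
```

When the predicate returns false, a consumer that cares about integrity would verify the checksum itself rather than drop the packet.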
11 years ago  bonding: combine pr_debugs in bond_set_dev_addr into one
Nikolay Aleksandrov [Sat, 29 Jun 2013 11:16:59 +0000 (13:16 +0200)]
bonding: combine pr_debugs in bond_set_dev_addr into one

Combine the multiple pr_debugs in bond_set_dev_addr into one pr_debug.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next
David S. Miller [Sat, 29 Jun 2013 05:13:14 +0000 (22:13 -0700)]
Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next

Marc Kleine-Budde says:

====================
this is a pull-request for net-next/master. It consists of three
patches by Fabio Estevam and me, which convert the flexcan transceiver
switching to DT[1] and a patch by Sachin Kamat, which cleans up the
at91_can driver a bit.

[1] These patches touch arch/arm/mach-imx, so I collected Acked-bys
from Shawn Guo and Sascha Hauer.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  ssb/trivial: replace numeric with standard PM state macros
Yijing Wang [Thu, 27 Jun 2013 13:00:11 +0000 (21:00 +0800)]
ssb/trivial: replace numeric with standard PM state macros

Use the standard PM state macros PCI_Dx instead of the numeric values 0/1/2.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  net/trivial: replace numeric with standard PM state macros
Yijing Wang [Thu, 27 Jun 2013 12:53:42 +0000 (20:53 +0800)]
net/trivial: replace numeric with standard PM state macros

Use the standard PM state macros PCI_Dx instead of the numeric values 0/1/2.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  nlmon: fix comparison in nlmon_is_valid_mtu
Daniel Borkmann [Thu, 27 Jun 2013 11:44:26 +0000 (13:44 +0200)]
nlmon: fix comparison in nlmon_is_valid_mtu

This patch fixes the following warning introduced in e4fc408e0e99
("packet: nlmon: virtual netlink monitoring device for packet
sockets") reported by Dan Carpenter:

warning: "drivers/net/nlmon.c:31 nlmon_is_valid_mtu()
 warn: always true condition '(new_mtu <= ((~0 >> 1))) =>
      (s32min-s32max <= s32max)'"

Thus, we should simply remove the test against INT_MAX. In addition,
we need to explicitly cast the sizeof() result to int: otherwise the
comparison is promoted to unsigned long, so negative values are treated
as valid instead of invalid. While at it, this also adds a comment about
Netlink and MTUs.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
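
The type-promotion problem described in the nlmon commit above can be reproduced in userspace. In this sketch, struct demo_hdr is a hypothetical 16-byte stand-in for struct nlmsghdr, and both function names are made up for illustration; the first function spells out the conversion that the compiler applies silently when an int is compared against sizeof().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte header standing in for struct nlmsghdr. */
struct demo_hdr {
	uint32_t len;
	uint16_t type;
	uint16_t flags;
	uint32_t seq;
	uint32_t pid;
};

/* Unsigned comparison: new_mtu is converted to the unsigned type of
 * sizeof(), so a negative value wraps to a huge positive number and
 * passes the check. The explicit (size_t) cast makes the implicit
 * conversion visible. */
static bool mtu_ok_unsigned(int new_mtu)
{
	return (size_t)new_mtu >= sizeof(struct demo_hdr);
}

/* Signed comparison: casting sizeof() to int keeps both operands
 * signed, so negative values are rejected. */
static bool mtu_ok_signed(int new_mtu)
{
	return new_mtu >= (int)sizeof(struct demo_hdr);
}
```

With these definitions, an MTU of -1 is accepted by the unsigned variant and rejected by the signed one.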
11 years ago  drivers: net: cpsw: add newline after MACID log
Daniel Mack [Thu, 27 Jun 2013 09:40:47 +0000 (11:40 +0200)]
drivers: net: cpsw: add newline after MACID log

Cosmetic patch to add a newline after logging the device's MACID.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  pch_gbe: use managed functions pcim_* and devm_*
Andy Shevchenko [Fri, 28 Jun 2013 11:02:54 +0000 (14:02 +0300)]
pch_gbe: use managed functions pcim_* and devm_*

This makes the error handling much simpler than open-coding everything and
in addition makes the probe function smaller and tidier.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  pch_gbe: convert pr_* to netdev_*
Andy Shevchenko [Fri, 28 Jun 2013 11:02:53 +0000 (14:02 +0300)]
pch_gbe: convert pr_* to netdev_*

We may use nice macros to prefix our messages with the proper device name.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  pch_gbe: remove inline keyword for exported functions
Andy Shevchenko [Fri, 28 Jun 2013 11:02:52 +0000 (14:02 +0300)]
pch_gbe: remove inline keyword for exported functions

There is not much sense in marking functions inline when they are going to be
used in other compilation units.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  usbnet: ax88179_178a: add .reset_resume hook
David Chang [Thu, 27 Jun 2013 09:16:43 +0000 (17:16 +0800)]
usbnet: ax88179_178a: add .reset_resume hook

I tested with the AX88179 USB dongle: without the .reset_resume hook,
after an S3/S4 resume you have to enable the network interface or reload the
driver module manually, otherwise the network interface does not work.

Signed-off-by: David Chang <dchang@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  usbnet: ax88179_178a: Correct a typo in description
David Chang [Thu, 27 Jun 2013 09:16:42 +0000 (17:16 +0800)]
usbnet: ax88179_178a: Correct a typo in description

Correct a typo in the description of driver_info; it should be Gigabit.

Signed-off-by: David Chang <dchang@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  ipv4: use next hop exceptions also for input routes
Timo Teräs [Thu, 27 Jun 2013 07:27:05 +0000 (10:27 +0300)]
ipv4: use next hop exceptions also for input routes

Commit d2d68ba9 (ipv4: Cache input routes in fib_info nexthops)
assumed that "locally destined, and routed packets, never trigger
PMTU events or redirects that will be processed by us".

However, it seems that tunnel devices do trigger PMTU events in certain
cases. At least ip_gre, ip6_gre, sit, and ipip do use the inner flow's
skb_dst(skb)->ops->update_pmtu to propagate mtu information from the
outer flows. These can cause the inner flow mtu to be decreased. If
next hop exceptions are not consulted for pmtu, IP fragmentation will
not be done properly for these routes.

It also seems that we always need the PMTU information for the
netfilter TCPMSS clamp-to-pmtu feature to work properly.

So for the time being, cache separate copies of input routes for
each next hop exception.

Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  ipv6: resend MLD report if a link-local address completes DAD
Hannes Frederic Sowa [Wed, 26 Jun 2013 22:07:01 +0000 (00:07 +0200)]
ipv6: resend MLD report if a link-local address completes DAD

RFC3590/RFC3810 specify that we should resend MLD reports as soon as a
valid link-local address is available.

We now use the valid_ll_addr_cnt to check if it is necessary to resend
a new report.

Changes since Flavio Leitner's version:
a) adapt for valid_ll_addr_cnt
b) resend first reports directly in the path and just arm the timer for
   mc_qrv-1 resends.

Reported-by: Flavio Leitner <fleitner@redhat.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Stevens <dlstevens@us.ibm.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  ipv6: introduce per-interface counter for dad-completed ipv6 addresses
Hannes Frederic Sowa [Wed, 26 Jun 2013 22:06:56 +0000 (00:06 +0200)]
ipv6: introduce per-interface counter for dad-completed ipv6 addresses

To reduce the number of unnecessary router solicitations, MLDv2 and IGMPv3
messages we need to track the number of valid (as in non-optimistic,
no-dad-failed and non-tentative) link-local addresses. Therefore, this
patch implements a valid_ll_addr_cnt in struct inet6_dev.

We now only emit router solicitations if the first link-local address
finishes duplicate address detection.

The changes for MLDv2 and IGMPv3 are in a follow-up patch.

While there, also simplify one if statement (one minor nit I made in one
of my previous patches):

if (!...)
	do();
else
	return;

<<into>>

if (...)
	return;
do();

Cc: Flavio Leitner <fbl@redhat.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: David Stevens <dlstevens@us.ibm.com>
Suggested-by: David Stevens <dlstevens@us.ibm.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
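
The counting scheme described in the commit above can be sketched in a few lines. Everything here is hypothetical: struct demo_idev is a trimmed-down stand-in for struct inet6_dev, and the two helper names are invented; only the field name valid_ll_addr_cnt comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical, minimal stand-in for struct inet6_dev. */
struct demo_idev {
	int valid_ll_addr_cnt;	/* link-local addresses that passed DAD */
};

/* Called when a link-local address finishes duplicate address detection.
 * Returns true only for the first valid address, which is the one moment
 * a router solicitation needs to be sent. */
static bool demo_ll_addr_dad_completed(struct demo_idev *idev)
{
	idev->valid_ll_addr_cnt++;
	return idev->valid_ll_addr_cnt == 1;
}

/* Called when a valid link-local address goes away. */
static void demo_ll_addr_removed(struct demo_idev *idev)
{
	if (idev->valid_ll_addr_cnt > 0)
		idev->valid_ll_addr_cnt--;
}
```

A second link-local address completing DAD leaves the counter above one, so no additional solicitation is triggered until the count has dropped back to zero.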
11 years ago  Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Fri, 28 Jun 2013 17:18:21 +0000 (13:18 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem

Conflicts:
net/wireless/nl80211.c

11 years ago  bonding: when cloning a MAC use NET_ADDR_STOLEN
nikolay@redhat.com [Wed, 26 Jun 2013 15:13:39 +0000 (17:13 +0200)]
bonding: when cloning a MAC use NET_ADDR_STOLEN

A simple semantic change: when a slave's MAC is cloned by the bond
master, set addr_assign_type to NET_ADDR_STOLEN instead of
NET_ADDR_SET. Also use bond_set_dev_addr() in BOND_FOM_ACTIVE mode
to change the bond's MAC address because the assign_type has to be
set properly.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  bonding: remove unnecessary dev_addr_from_first member
nikolay@redhat.com [Wed, 26 Jun 2013 15:13:38 +0000 (17:13 +0200)]
bonding: remove unnecessary dev_addr_from_first member

In struct bonding there's a member called dev_addr_from_first which is
used to denote when the bond dev should clone the first slave's MAC
address but since we have netdev's addr_assign_type variable that is not
necessary. We clone the first slave's MAC each time we have a random MAC
set to the bond device. This has the nice side-effect of also fixing an
inconsistency - when the MAC address of the bond dev is set after its
creation, but prior to having slaves, it's not kept and the first slave's
MAC is cloned. The only way to keep the MAC was to create the bond device
with the MAC address set (e.g. through ip link). In all cases if the
bond device is left without any slaves - its MAC gets reset to a random
one as before.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  bonding: remove unnecessary setup_by_slave member
nikolay@redhat.com [Wed, 26 Jun 2013 15:13:37 +0000 (17:13 +0200)]
bonding: remove unnecessary setup_by_slave member

We have a member called setup_by_slave in struct bonding to denote if the
bond dev has different type than ARPHRD_ETHER, but that is already denoted
in bond's netdev type variable if it was setup by the slave, so use that
instead of the member.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  netlink: fix splat in skb_clone with large messages
Pablo Neira [Fri, 28 Jun 2013 01:04:23 +0000 (03:04 +0200)]
netlink: fix splat in skb_clone with large messages

Since c05cdb1 (netlink: allow large data transfers from user-space),
netlink splats if it invokes skb_clone on large netlink skbs since:

* skb_shared_info was not correctly initialized.
* skb->destructor is not set in the cloned skb.

This was spotted by trinity:

[  894.990671] BUG: unable to handle kernel paging request at ffffc9000047b001
[  894.991034] IP: [<ffffffff81a212c4>] skb_clone+0x24/0xc0
[...]
[  894.991034] Call Trace:
[  894.991034]  [<ffffffff81ad299a>] nl_fib_input+0x6a/0x240
[  894.991034]  [<ffffffff81c3b7e6>] ? _raw_read_unlock+0x26/0x40
[  894.991034]  [<ffffffff81a5f189>] netlink_unicast+0x169/0x1e0
[  894.991034]  [<ffffffff81a601e1>] netlink_sendmsg+0x251/0x3d0

Fix it by:

1) introducing a new netlink_skb_clone function that is used in nl_fib_input,
   that sets our special skb->destructor in the cloned skb. Moreover, handle
   the release of the large cloned skb head area in the destructor path.

2) not allowing large skbuffs in the netlink broadcast path. I cannot find
   any reasonable use of the large data transfer using netlink in that path,
   moreover this helps to skip extra skb_clone handling.

I found two more netlink clients that are cloning the skbs, but they are
not in the sendmsg path. Therefore, the sole client cloning that I found
seems to be the fib frontend.

Thanks to Eric Dumazet for helping to address this issue.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  sit: add support of x-netns
Nicolas Dichtel [Wed, 26 Jun 2013 14:11:28 +0000 (16:11 +0200)]
sit: add support of x-netns

This patch allows switching the netns when a packet is encapsulated or
decapsulated. In other words, the encapsulated packet is received in one netns,
where the lookup is done to find the tunnel. Once the tunnel is found, the
packet is decapsulated and injected into the corresponding interface, which
resides in another netns.

When one of the two netns is removed, the tunnel is destroyed.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  dev: introduce skb_scrub_packet()
Nicolas Dichtel [Wed, 26 Jun 2013 14:11:27 +0000 (16:11 +0200)]
dev: introduce skb_scrub_packet()

The goal of this new function is to perform all needed cleanup before sending
an skb into another netns.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  ath10k: minimally handle new channel width enumeration values
John W. Linville [Thu, 27 Jun 2013 17:50:09 +0000 (13:50 -0400)]
ath10k: minimally handle new channel width enumeration values

  CC      drivers/net/wireless/ath/ath10k/mac.o
drivers/net/wireless/ath/ath10k/mac.c: In function ‘chan_to_phymode’:
drivers/net/wireless/ath/ath10k/mac.c:229:3: warning: enumeration value ‘NL80211_CHAN_WIDTH_5’ not handled in switch [-Wswitch]
drivers/net/wireless/ath/ath10k/mac.c:229:3: warning: enumeration value ‘NL80211_CHAN_WIDTH_10’ not handled in switch [-Wswitch]
drivers/net/wireless/ath/ath10k/mac.c:247:3: warning: enumeration value ‘NL80211_CHAN_WIDTH_5’ not handled in switch [-Wswitch]
drivers/net/wireless/ath/ath10k/mac.c:247:3: warning: enumeration value ‘NL80211_CHAN_WIDTH_10’ not handled in switch [-Wswitch]

Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  ath9k_htc: ifdef out IFTYPE_MESH advertisement
Thomas Pedersen [Wed, 26 Jun 2013 22:06:58 +0000 (15:06 -0700)]
ath9k_htc: ifdef out IFTYPE_MESH advertisement

This is needed so the interface combination can still be
validated when CONFIG_MAC80211_MESH is not enabled.
Otherwise wiphy registration fails.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  brcmfmac: remove code and comment for older kernel support
Arend van Spriel [Wed, 26 Jun 2013 12:20:22 +0000 (14:20 +0200)]
brcmfmac: remove code and comment for older kernel support

In the code of the receive path some code was dealing with how
things were done in older kernels. Not really needed for an
upstream driver.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  brcmfmac: reduce firmware-signalling locking scope in rx path
Arend van Spriel [Wed, 26 Jun 2013 12:20:21 +0000 (14:20 +0200)]
brcmfmac: reduce firmware-signalling locking scope in rx path

In the receive path a spinlock is taken upon parsing the TLV signal
header. This moves the locking to the TLV handling functions where
it protects the data structures.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  brcmfmac: cleanup debug messages in brcmf_fws_hdrpush()
Arend van Spriel [Wed, 26 Jun 2013 12:20:20 +0000 (14:20 +0200)]
brcmfmac: cleanup debug messages in brcmf_fws_hdrpush()

Trivial cleanup of debug messages.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  brcmfmac: tag packet in the netdev transmit callback
Arend van Spriel [Wed, 26 Jun 2013 12:35:10 +0000 (14:35 +0200)]
brcmfmac: tag packet in the netdev transmit callback

Transmit packets need to be tagged in order to receive a tx status
feedback from the firmware. Determine the tag in the netdev transmit
callback instead of determining the tag just before transfer to the
device. This reduces the number of exception flows and hence makes
the driver code simpler.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  brcmfmac: add broken scatter-gather DMA support
Franky Lin [Wed, 26 Jun 2013 12:20:18 +0000 (14:20 +0200)]
brcmfmac: add broken scatter-gather DMA support

The DMA engines of some old SDIO host controllers require block size alignment for
data length of each scatterlist item. This patch introduces an intermediate
buffer list to support this kind of platform. It decreases the throughput
because of an extra memcpy in critical data path. So don't turn this on unless
it's necessary.

Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Franky Lin <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  brcmfmac: use unified dongle address preparation function
Franky Lin [Wed, 26 Jun 2013 12:20:17 +0000 (14:20 +0200)]
brcmfmac: use unified dongle address preparation function

Introduce a unified dongle backplane address preparation function
brcmf_sdio_addrprep to replace duplicate address prep code.

Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Franky Lin <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  brcmfmac: remove SDIO_REQ_ASYNC flag
Franky Lin [Wed, 26 Jun 2013 12:20:16 +0000 (14:20 +0200)]
brcmfmac: remove SDIO_REQ_ASYNC flag

Remove SDIO_REQ_ASYNC from brcmfmac since it is not being used.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Franky Lin <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  brcmfmac: remove (ab)use of NL80211_NUM_ACS
Arend van Spriel [Wed, 26 Jun 2013 12:20:15 +0000 (14:20 +0200)]
brcmfmac: remove (ab)use of NL80211_NUM_ACS

The driver used NL80211_NUM_ACS to indicate the BCMC fifo, which
has the same value now, but relying on that is a bad idea.

Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  brcmfmac: simplify transmit path
Arend van Spriel [Wed, 26 Jun 2013 12:20:14 +0000 (14:20 +0200)]
brcmfmac: simplify transmit path

When getting a transmit packet from the networking layer, simply
enqueue the packet unconditionally and have it handled by the dequeue
worker. The transfer of the packet to the bus-specific driver part
is now done from one context.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  bcma: add support for BCM43142
Rafał Miłecki [Wed, 26 Jun 2013 08:02:11 +0000 (10:02 +0200)]
bcma: add support for BCM43142

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  b43: replace B43_BCMA_EXTRA with modparam allhwsupport
Rafał Miłecki [Wed, 26 Jun 2013 07:55:54 +0000 (09:55 +0200)]
b43: replace B43_BCMA_EXTRA with modparam allhwsupport

This allows enabling support for extra hardware with just a module
param, without kernel/module recompilation.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago  ath10k: leave MMIC generation to the HW
Michal Kazior [Wed, 26 Jun 2013 06:57:24 +0000 (08:57 +0200)]
ath10k: leave MMIC generation to the HW

Apparently HW doesn't require us to generate MMIC
for TKIP suite.

Each frame was 8 bytes longer than it should be
and some APs would drop frames that exceed 1520
bytes of 802.11 payload. This could be observed
during throughput tests or fragmented IP traffic.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago ath10k: fix 5ghz channel definitions
Michal Kazior [Wed, 26 Jun 2013 06:54:54 +0000 (08:54 +0200)]
ath10k: fix 5ghz channel definitions

Nonsense channel flags were being set.

Although it doesn't seem this was visible to the
user the patch makes sure that channel
availability won't be crippled in the future if
ath_common behaviour changes.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago ath10k: fix MSI-X setup failpath
Michal Kazior [Wed, 26 Jun 2013 06:50:50 +0000 (08:50 +0200)]
ath10k: fix MSI-X setup failpath

Irqs were not freed up correctly upon msi-x setup
failure.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago rt2x00: rt2800lib: fix default TX power check for RT55xx
Gabor Juhos [Tue, 25 Jun 2013 20:57:29 +0000 (22:57 +0200)]
rt2x00: rt2800lib: fix default TX power check for RT55xx

The code writes the default_power2 value into the TX field
of the RFCSR50 register, however the condition in the if
statement uses default_power1. Due to this, wrong TX power
value might be written into the register.

Use the correct value in the condition to fix the issue.

Compile tested only.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Cc: stable@vger.kernel.org # 3.10
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago ath9k: Add mix tx gain table for AR9462 2.0
Sujith Manoharan [Tue, 25 Jun 2013 06:59:23 +0000 (12:29 +0530)]
ath9k: Add mix tx gain table for AR9462 2.0

Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago rt2x00: rt2800lib: turn on tertiary PAs/LNAs for 3T/3R devices
Gabor Juhos [Mon, 24 Jun 2013 21:03:24 +0000 (23:03 +0200)]
rt2x00: rt2800lib: turn on tertiary PAs/LNAs for 3T/3R devices

The 3T/3R devices use the tertiary PAs/LNAs,
however those are never turned on. Fix the code to
turn those on for such devices.

Also modify the code to use switch statements to
improve readability.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago rt2x00: rt2800lib: turn on secondary PAs/LNAs for 3T/3R devices
Gabor Juhos [Mon, 24 Jun 2013 21:03:23 +0000 (23:03 +0200)]
rt2x00: rt2800lib: turn on secondary PAs/LNAs for 3T/3R devices

The secondary PAs/LNAs are turned on only for 2T/2R
devices, however these are used for 3T/3R devices as
well. Always turn those on if the device uses more
than one tx/rx chain.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago rt2x00: rt2800: increase EEPROM_SIZE to 512 bytes
Gabor Juhos [Mon, 24 Jun 2013 21:03:22 +0000 (23:03 +0200)]
rt2x00: rt2800: increase EEPROM_SIZE to 512 bytes

Ralink 3T chipsets use a different EEPROM
layout than the others. The EEPROM on these devices
contains more data than the others, which does not fit
into the 272 bytes that the rt2800 driver currently uses.

The Ralink reference driver defines EEPROM_SIZE to
512/1024 bytes for PCI/USB devices respectively.

Increase the EEPROM_SIZE constant to 512 bytes, in
order to make room for EEPROM data of 3T devices.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Acked-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago can: at91_can: Use of_match_ptr()
Sachin Kamat [Wed, 12 Jun 2013 11:30:39 +0000 (17:00 +0530)]
can: at91_can: Use of_match_ptr()

of_match_ptr() eliminates having an #ifdef returning NULL for the case
when OF is disabled.
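
The effect can be sketched in plain C outside the kernel. The macro below
has the same shape as the kernel's of_match_ptr(); everything else is an
assumption made for illustration: CONFIG_OF is treated as an ordinary
preprocessor symbol, struct of_device_id is cut down to one field, and
example_dt_ids and match_table() are hypothetical names, not code from the
at91_can driver.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct of_device_id. */
struct of_device_id {
	const char *compatible;
};

/* Same shape as the kernel macro: pass the table through when OF
 * support is built in, otherwise collapse the reference to NULL. */
#ifdef CONFIG_OF
#define of_match_ptr(_ptr) (_ptr)
#else
#define of_match_ptr(_ptr) NULL
#endif

#ifdef CONFIG_OF
/* Hypothetical match table; it only exists when OF is enabled. */
static const struct of_device_id example_dt_ids[] = {
	{ "vendor,example-can" },
	{ NULL },
};
#endif

/* Hypothetical helper: the value a driver would store in
 * .of_match_table. No #ifdef is needed at the point of use, because
 * the macro drops the reference to example_dt_ids when OF is off. */
static const struct of_device_id *match_table(void)
{
	return of_match_ptr(example_dt_ids);
}
```

Built without -DCONFIG_OF, match_table() returns NULL and the table is never
referenced; built with it, the table pointer is passed through unchanged.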

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 years ago ARM: imx: flexcan: Remove platform file
Fabio Estevam [Tue, 11 Jun 2013 02:12:58 +0000 (23:12 -0300)]
ARM: imx: flexcan: Remove platform file

As there are no more users of the flexcan platform file, let's remove it.

Cc: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 years ago can: flexcan: Use a regulator to control the CAN transceiver
Fabio Estevam [Tue, 11 Jun 2013 02:12:57 +0000 (23:12 -0300)]
can: flexcan: Use a regulator to control the CAN transceiver

Instead of using a GPIO to turn on/off the CAN transceiver, it is better to
use a regulator as some systems may use a PMIC to power the CAN transceiver.

Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 years ago ARM: imx: prepare for removal of flexcan_platform_data
Marc Kleine-Budde [Tue, 11 Jun 2013 02:12:56 +0000 (23:12 -0300)]
ARM: imx: prepare for removal of flexcan_platform_data

As there are no imx in-tree users of flexcan_platform_data, this patch removes
the possibility to register a flexcan device with platform data.

The functionality to switch CAN transceivers on/off is added to DT via
regulators in a later patch.

Compile time tested with imx_v4_v5_defconfig and imx_v6_v7_defconfig.

Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 years ago Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi...
John W. Linville [Thu, 27 Jun 2013 00:00:11 +0000 (20:00 -0400)]
Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next

11 years ago fec: Add support for reading RMON registers
Chris Healy [Wed, 26 Jun 2013 06:18:52 +0000 (23:18 -0700)]
fec: Add support for reading RMON registers

Add ethtool operation to read RMON registers.

Tested against net-next on i.MX28.

v2: make conditional on #ifndef CONFIG_M5272

Signed-off-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago ipv6: rearm router solicitaion timer when setting new tokenized address
Hannes Frederic Sowa [Wed, 26 Jun 2013 01:41:49 +0000 (03:41 +0200)]
ipv6: rearm router solicitaion timer when setting new tokenized address

When a new tokenized address gets installed we send out just one
router solicitation. We should send out `rtr_solicits' in case one router
advertisement got lost.

So, rearm the timer as we do in addrconf_dad_complete.

Cc: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago sit: fix 4in4 + IPsec scenario
Nicolas Dichtel [Wed, 26 Jun 2013 15:40:33 +0000 (17:40 +0200)]
sit: fix 4in4 + IPsec scenario

Since commit 32b8a8e59c9c "sit: add IPv4 over IPv4 support",
tunnel->parms.iph.protocol is 0 when both 4in4 and 6in4 are set up, but
xfrm_lookup() is called only when proto is != 0, thus we need to pass the real
value.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec...
David S. Miller [Wed, 26 Jun 2013 20:23:13 +0000 (13:23 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
Just one patch this time.

1) Drop packets when the matching SA is in larval state and add a
   statistic counter for that. From Fan Du.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville [Wed, 26 Jun 2013 16:01:42 +0000 (12:01 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

11 years ago netns: exclude ipvs from struct net when IPVS disabled
JunweiZhang [Wed, 26 Jun 2013 08:40:06 +0000 (16:40 +0800)]
netns: exclude ipvs from struct net when IPVS disabled

No real problem is fixed; this just saves a few bytes in the
net_namespace structure.

Signed-off-by: JunweiZhang <junwei.zhang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
11 years ago kernel: remove unnecessary head file
JunweiZhang [Wed, 26 Jun 2013 08:40:05 +0000 (16:40 +0800)]
kernel: remove unnecessary head file

ip_vs.h is not necessary for sysctl_binary.c.

This prepares for the next patch and avoids a compile issue.

Signed-off-by: JunweiZhang <junwei.zhang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
11 years ago ipvs: add sync_persist_mode flag
Julian Anastasov [Mon, 24 Jun 2013 19:44:41 +0000 (22:44 +0300)]
ipvs: add sync_persist_mode flag

Add sync_persist_mode flag to reduce sync traffic
by syncing only persistent templates.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
11 years ago ipvs: SH fallback and L4 hashing
Alexander Frolkin [Wed, 19 Jun 2013 09:54:25 +0000 (10:54 +0100)]
ipvs: SH fallback and L4 hashing

By default the SH scheduler rejects connections that are hashed onto a
realserver of weight 0.  This patch adds a flag to make SH choose a
different realserver in this case, instead of rejecting the connection.

The patch also adds a flag to make SH include the source port (TCP, UDP,
SCTP) in the hash as well as the source address.  This basically allows
for deterministic round-robin load balancing (i.e., where any director
in a cluster of directors with identical config will send the same
packet the same way).

The flags are service flags (IP_VS_SVC_F_SCHED*) so that these options
can be set per service.  They are set using a new option to ipvsadm.
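
The two behaviours can be sketched in userspace C. This is a minimal
illustration under stated assumptions, not the ip_vs_sh implementation:
the table size, the multiplicative hash, the linear probe used for the
fallback, and the names sh_hash(), sh_pick(), use_port and fallback are all
hypothetical stand-ins for the scheduler code and the service flags.

```c
#include <stddef.h>
#include <stdint.h>

#define SH_TAB_SIZE 256	/* hypothetical bucket count */

struct realserver {
	int weight;	/* weight 0 means "do not schedule here" */
};

/* Hash the source address, optionally mixing in the source port.
 * Returns a bucket index in the range 0..SH_TAB_SIZE-1. */
static unsigned int sh_hash(uint32_t saddr, uint16_t sport, int use_port)
{
	uint32_t key = saddr;

	if (use_port)
		key ^= (uint32_t)sport;
	/* Multiplicative hash; the top 8 bits select one of 256 buckets. */
	return (uint32_t)(key * UINT32_C(2654435761)) >> 24;
}

/* Pick a realserver. Without fallback, a weight-0 (or empty) bucket
 * rejects the connection. With fallback, probe the following buckets
 * until a usable server is found. */
static const struct realserver *
sh_pick(const struct realserver *const *tab, uint32_t saddr,
	uint16_t sport, int use_port, int fallback)
{
	unsigned int h = sh_hash(saddr, sport, use_port);
	unsigned int i;

	for (i = 0; i < SH_TAB_SIZE; i++) {
		const struct realserver *rs = tab[(h + i) % SH_TAB_SIZE];

		if (rs && rs->weight > 0)
			return rs;
		if (!fallback)
			return NULL;
	}
	return NULL;
}
```

Because the result depends only on the packet's source address (and port)
and on the table contents, two directors holding identical tables make the
same choice for the same packet, which is the property the commit message
describes.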

Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
11 years ago ipvs: drop SCTP connections depending on state
Julian Anastasov [Tue, 18 Jun 2013 07:08:08 +0000 (10:08 +0300)]
ipvs: drop SCTP connections depending on state

Drop SCTP connections under load (dropentry context) depending
on the protocol state, just like for TCP: INIT conns are
dropped immediately, established are dropped randomly while
connections in progress or shutdown are skipped.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
11 years ago ipvs: replace the SCTP state machine
Julian Anastasov [Tue, 18 Jun 2013 07:08:07 +0000 (10:08 +0300)]
ipvs: replace the SCTP state machine

Convert the SCTP state table so that it is more readable.
Change the states to follow the diagram in RFC 2960
and add more states suitable for a middle box. Note that this
change in states is incompatible in a sync setup where some
systems include this change and others do not.

With this change we also have proper transitions in INPUT-ONLY
mode (DR/TUN) where we see packets only from client. Now
we should not switch to 10-second CLOSED state at a time
when we should stay in ESTABLISHED state.

The state names are short because we have 16-char space
in ipvsadm and an 11-char limit for the connection list format.
This is a consequence of the TCP implementation, where the longest
state name is ESTABLISHED.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
11 years ago ipvs: sloppy TCP and SCTP
Alexander Frolkin [Thu, 13 Jun 2013 07:56:15 +0000 (08:56 +0100)]
ipvs: sloppy TCP and SCTP

This adds support for sloppy TCP and SCTP modes to IPVS.

When enabled (sysctls net.ipv4.vs.sloppy_tcp and
net.ipv4.vs.sloppy_sctp), this allows IPVS to create connection state on any
packet, not just a TCP SYN (or SCTP INIT).

This allows connections to fail over from one IPVS director to another
mid-flight.

Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
11 years ago ipvs: provide iph to schedulers
Julian Anastasov [Sun, 16 Jun 2013 06:09:36 +0000 (09:09 +0300)]
ipvs: provide iph to schedulers

Before now the schedulers needed access only to IP
addresses and it was easy to get them from skb by
using ip_vs_fill_iph_addr_only.

New changes for the SH scheduler will need the protocol
and ports, which are difficult to get from skb for the
IPv6 case. As we have all the data in the iph structure,
provide the iph to schedulers to avoid the same slow lookups.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
11 years ago arc_emac: fix compile-time errors & warnings on PPC64
Alexey Brodkin [Wed, 26 Jun 2013 07:49:26 +0000 (11:49 +0400)]
arc_emac: fix compile-time errors & warnings on PPC64

As reported by "kbuild test robot" there were some errors and warnings
when attempting to build the kernel with "make ARCH=powerpc allmodconfig".

This patch addresses both errors and warnings.
Below is a list of introduced changes:
1. Fix compile-time errors (misspellings in "dma_unmap_single") on PPC.
2. Use DMA address instead of "skb->data" as a pointer to data buffer.
This fixed warnings on pointer to int conversion on 64-bit systems.
3. Re-implemented initial allocation of Rx buffers in "arc_emac_open" in
the same way they're re-allocated during operation (receiving packets).
So once again DMA address could be used instead of "skb->data".
4. Explicitly use EMAC_BUFFER_SIZE for Rx buffers allocation.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: netdev@vger.kernel.org
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: Joe Perches <joe@perches.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Mischa Jonker <mjonker@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: linux-kernel@vger.kernel.org
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Florian Fainelli <florian@openwrt.org>
Cc: David Laight <david.laight@aculab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago bonding: add an option to fail when any of arp_ip_target is inaccessible
Veaceslav Falico [Mon, 24 Jun 2013 09:49:34 +0000 (11:49 +0200)]
bonding: add an option to fail when any of arp_ip_target is inaccessible

Currently, we fail only when all of the ips in arp_ip_target are gone.
However, in some situations we might need to fail if even one host from
arp_ip_target becomes unavailable.

All situations, obviously, rely on the idea that we need *completely*
functional network, with all interfaces/addresses working correctly.

One real world example might be:
vlans on top of a bond (hybrid port). If bond and vlans have ips assigned
and we have their peers monitored via arp_ip_target - in case of switch
misconfiguration (trunk/access port), slave driver malfunction or
tagged/untagged traffic dropped on the way - we will be able to switch
to another slave.

Any other configuration that needs access to all arp_ip_targets
benefits in the same way.

This patch adds this possibility by adding a new parameter -
arp_all_targets (both as a module parameter and as a sysfs knob). It can be
set to:

0 or any (the default) - which works exactly as it's working now -
the slave is up if any of the arp_ip_targets are up.

1 or all - the slave is up if all of the arp_ip_targets are up.

This parameter can be changed on the fly (via sysfs), and requires the mode
to be active-backup and arp_validate to be enabled (it obeys the
arp_validate config on which slaves to validate).

Internally it's done through:

1) Add target_last_arp_rx[BOND_MAX_ARP_TARGETS] array to slave struct. It's
   an array of jiffies, meaning that slave->target_last_arp_rx[i] is the
   last time we've received arp from bond->params.arp_targets[i] on this
   slave.

2) If we successfully validate an arp from bond->params.arp_targets[i] in
   bond_validate_arp() - update the slave->target_last_arp_rx[i] with the
   current jiffies value.

3) When getting slave's last_rx via slave_last_rx(), we return the oldest
   time when we've received an arp from any address in
   bond->params.arp_targets[].

If the value of arp_all_targets == 0 - we still work the same way as
before.
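
Step 3 above can be sketched in userspace C as follows. This is a minimal
illustration, not the bonding driver's code: struct slave_sketch,
slave_last_rx_sketch() and before() are hypothetical names, the value 16
for BOND_MAX_ARP_TARGETS is assumed, and arp_validate handling is left out
entirely.

```c
#include <stddef.h>

#define BOND_MAX_ARP_TARGETS 16	/* assumed value for this sketch */

/* Wrap-safe "a is earlier than b", the same idea as the kernel's
 * time_before() for jiffies counters that may overflow. */
static int before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

struct slave_sketch {
	unsigned long last_arp_rx;	/* newest validated ARP, any target */
	unsigned long target_last_arp_rx[BOND_MAX_ARP_TARGETS];
};

/* With arp_all_targets == 0 behave as before and report the newest
 * ARP. With arp_all_targets == 1 report the OLDEST per-target
 * timestamp, so that a single silent target makes the slave look
 * stale. */
static unsigned long slave_last_rx_sketch(const struct slave_sketch *s,
					  size_t n_targets,
					  int arp_all_targets)
{
	unsigned long oldest;
	size_t i;

	if (!arp_all_targets || n_targets == 0)
		return s->last_arp_rx;

	oldest = s->target_last_arp_rx[0];
	for (i = 1; i < n_targets; i++)
		if (before(s->target_last_arp_rx[i], oldest))
			oldest = s->target_last_arp_rx[i];
	return oldest;
}
```

Returning the minimum is what turns "any target answered recently" into
"every target answered recently": the caller compares one timestamp
against its timeout either way.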

Also, update the documentation to reflect the new parameter.

v3->v4:
Kill the forgotten rtnl_unlock(), rephrase the documentation part to be
more clear, don't fail setting arp_all_targets if arp_validate is not set -
it has no effect anyway but can be easier to set up. Also, print a warning
if the last arp_ip_target is removed while the arp_interval is on, but not
the arp_validate.

v2->v3:
Use _bh spinlock, remove useless rtnl_lock() and use jiffies for new
arp_ip_target last arp, instead of slave_last_rx(). On bond_enslave(),
use the same initialization value for target_last_arp_rx[] as is used
for the default last_arp_rx, to avoid useless interface flaps.

Also, instead of failing to remove the last arp_ip_target just print a
warning - otherwise it might break existing scripts.

v1->v2:
Correctly handle adding/removing hosts in arp_ip_target - we need to
shift/initialize all slave's target_last_arp_rx. Also, don't fail module
loading on arp_all_targets misconfiguration, just disable it, and some
minor style fixes.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago bonding: doc: some details on backup slave arp validation
Veaceslav Falico [Mon, 24 Jun 2013 09:49:33 +0000 (11:49 +0200)]
bonding: doc: some details on backup slave arp validation

Add some details to bonding documentation on how backup slave arp
validation works.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago bonding: don't trust arp requests unless active slave really works
Veaceslav Falico [Mon, 24 Jun 2013 09:49:32 +0000 (11:49 +0200)]
bonding: don't trust arp requests unless active slave really works

Currently, if we receive any arp packet on a backup slave in active-backup
mode with arp_validate enabled, we assume that it's an arp request, swap
source/target ip and try to validate it. This optimization gives us
virtually no downtime in the most common situation (active and backup
slaves are in the same broadcast domain and the active slave failed).

However, if we can't reach the arp_ip_target(s), we end up in an endless
loop of reselecting slaves, because we receive our arp requests, sent by
the active slave, and think that backup slaves are up, thus selecting them
as active and, again, sending arp requests, which fool our backup slaves.

Fix this by not validating the swapped arp packets if the current active
slave didn't receive any arp reply after it was selected as active. This
way we will only accept arp requests if we know that the current active
slave can actually reach arp_ip_target.

v3->v4:
Obey the 80-column limit and make checkpatch.pl happy, per Sergei's suggestion.

v1->v3:
No change.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago bonding: don't validate arp if we don't have to
Veaceslav Falico [Mon, 24 Jun 2013 09:49:31 +0000 (11:49 +0200)]
bonding: don't validate arp if we don't have to

Currently, we validate all the incoming arps if arp_validate is not 0.
However, we don't have to validate backup slaves if arp_validate == active
and vice versa, so return early in bond_arp_rcv() in these cases.

It works correctly now because we verify arp_validate in slave_last_rx(),
however we're just doing useless work in bond_arp_rcv().

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago bonding: don't add duplicate targets to arp_ip_target
Veaceslav Falico [Mon, 24 Jun 2013 09:49:30 +0000 (11:49 +0200)]
bonding: don't add duplicate targets to arp_ip_target

Print a warning and skip them.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago bonding: add helper function bond_get_targets_ip(targets, ip)
Veaceslav Falico [Mon, 24 Jun 2013 09:49:29 +0000 (11:49 +0200)]
bonding: add helper function bond_get_targets_ip(targets, ip)

Add function bond_get_targets_ip(targets, ip) which searches through
targets array of ips (arp_targets) and returns the position of first
match. If ip == 0, returns the first free slot. On failure to find the
ip or free slot, return -1.

Use it to verify if the arp we've received is valid and in sysfs.
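
The lookup described above can be sketched in userspace C. This is an
illustration only: get_targets_ip_sketch() is a hypothetical name, addresses
are plain uint32_t values rather than __be32, BOND_MAX_ARP_TARGETS is assumed
to be 16, and the early exit assumes that used slots are packed at the front
of the array with 0 marking a free slot.

```c
#include <stdint.h>

#define BOND_MAX_ARP_TARGETS 16	/* assumed value for this sketch */

/* Return the index of the first entry equal to ip, or -1 if there is
 * none. Because free slots hold 0, calling this with ip == 0 yields
 * the first free slot. */
static int get_targets_ip_sketch(const uint32_t *targets, uint32_t ip)
{
	int i;

	for (i = 0; i < BOND_MAX_ARP_TARGETS; i++) {
		if (targets[i] == ip)
			return i;
		/* Assumes a packed table: nothing follows a free slot. */
		if (targets[i] == 0)
			break;
	}
	return -1;
}
```

One function then serves three callers: validating the source of a received
arp, rejecting a duplicate before adding a target, and finding where a new
target should be stored.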

v1->v2:
Fix "[2/6] bonding: add helper function bond_get_targets_ip(targets, ip)",
per Nikolay's advice, to verify if source ip != 0.0.0.0, otherwise we might
update 'null' arp_ip_targets' last_rx. Also, address style.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: davinci_mdio: gaurd the DT code with IS_ENABLED(CONFIG_OF)
Lad, Prabhakar [Tue, 25 Jun 2013 15:54:53 +0000 (21:24 +0530)]
net: davinci_mdio: gaurd the DT code with IS_ENABLED(CONFIG_OF)

Guard the davinci_mdio_of_mtable table and davinci_mdio_probe_dt()
with CONFIG_OF.

Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: davinci_emac: simplify the OF parser code
Lad, Prabhakar [Tue, 25 Jun 2013 15:54:52 +0000 (21:24 +0530)]
net: davinci_emac: simplify the OF parser code

This patch cleans up the OF parser code, removes unnecessary checks
on of_property_read_*() and guards davinci_emac_of_match table with
CONFIG_OF.

Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: davinci: emac: Convert to devm_* api
Lad, Prabhakar [Tue, 25 Jun 2013 15:54:51 +0000 (21:24 +0530)]
net: davinci: emac: Convert to devm_* api

Use devm_ioremap_resource instead of devm_request_mem_region()/devm_ioremap()
and devm_request_irq() instead of request_irq().

This ensures more consistent error values and simplifies error paths.

Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago doc: fix some syntax errors in netlink mmap sample code
Cong Wang [Mon, 24 Jun 2013 11:46:54 +0000 (19:46 +0800)]
doc: fix some syntax errors in netlink mmap sample code

Cc: Patrick McHardy <kaber@trash.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago macvtap: Perform GSO on forwarding path.
Vlad Yasevich [Tue, 25 Jun 2013 20:04:22 +0000 (16:04 -0400)]
macvtap: Perform GSO on forwarding path.

When macvtap forwards skb to its tap, it needs to check
if GSO needs to be performed.  This is sometimes necessary
when the HW device performed GRO, but the guest reading
from the tap does not support it (ex: Windows 7).

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago macvtap: Let TUNSETOFFLOAD actually controll offload features.
Vlad Yasevich [Tue, 25 Jun 2013 20:04:21 +0000 (16:04 -0400)]
macvtap: Let TUNSETOFFLOAD actually controll offload features.

When the user issues the TUNSETOFFLOAD ioctl, macvtap does not do
anything other than verify arguments.  This patch adds
functionality to allow users to actually control offload features.
NETIF_F_GSO and NETIF_F_GRO are always on, but the rest of the
features can be controlled.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago macvtap: Consistently use rcu functions
Vlad Yasevich [Tue, 25 Jun 2013 20:04:20 +0000 (16:04 -0400)]
macvtap: Consistently use rcu functions

Currently macvtap uses rcu_bh functions in its
user-facing functions macvtap_get_user() and macvtap_put_user().
However, its packet handlers use normal rcu as the rcu_read_lock()
is taken in netif_receive_skb().  We can safely discontinue
the usage of rcu with bh disabled.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago macvtap: Convert to using rtnl lock
Vlad Yasevich [Tue, 25 Jun 2013 20:04:19 +0000 (16:04 -0400)]
macvtap: Convert to using rtnl lock

Macvtap uses a private lock to protect the relationship between
macvtap_queue and macvlan_dev.  The private lock is not needed
since the relationship is managed by user via open(), release(),
and dellink() calls.  dellink() already happens under rtnl, so
we can safely convert open() and release(), and use it in ioctl()
as well.

Suggested by Eric Dumazet.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: poll/select low latency socket support
Eliezer Tamir [Mon, 24 Jun 2013 07:28:03 +0000 (10:28 +0300)]
net: poll/select low latency socket support

select/poll busy-poll support.

Split the sysctl value into two separate ones, one for read and one for poll.
Update Documentation/sysctl/net.txt accordingly.

Add a new poll flag POLL_LL. When this flag is set, sock_poll will call
sk_poll_ll if possible. sock_poll sets this flag in its return value
to indicate to select/poll when a socket that can busy poll is found.

When poll/select have nothing to report, call the low-level
sock_poll again until we are out of time or we find something.

Once the system call finds something, it stops setting POLL_LL, so it can
return the result to the user ASAP.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago ethernet/arc/arc_emac - Add new driver
Alexey Brodkin [Mon, 24 Jun 2013 05:54:27 +0000 (09:54 +0400)]
ethernet/arc/arc_emac - Add new driver

Driver for non-standard on-chip ethernet device ARC EMAC 10/100,
instantiated in some legacy ARC (Synopsys) FPGA Boards such as
ARCAngel4/ML50x.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: Joe Perches <joe@perches.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Mischa Jonker <mjonker@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Florian Fainelli <florian@openwrt.org>
Cc: David Laight <david.laight@aculab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: sctp: simplify sctp_get_port
Daniel Borkmann [Tue, 25 Jun 2013 16:17:30 +0000 (18:17 +0200)]
net: sctp: simplify sctp_get_port

No need to have an extra ret variable when we directly can return
the value of sctp_get_port_local().

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: sctp: decouple cleaning some socket data from endpoint
Daniel Borkmann [Tue, 25 Jun 2013 16:17:29 +0000 (18:17 +0200)]
net: sctp: decouple cleaning some socket data from endpoint

Rather than having the endpoint clean the garbage from the
socket, use a sk_destruct handler sctp_destruct_sock() that does
the job when there are no more references on the socket.
At least do this for our crypto transform through crypto_free_hash()
that is allocated when in listening state.

Also, perform sctp_put_port() only when sk is valid. At a later
point in time we can still determine if there's an option of
placing this into sk_prot->unhash() or sctp_endpoint_free() without
any races. For now, leave it in sctp_endpoint_destroy() though.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: sctp: minor: sctp_seq_dump_local_addrs add missing newline
Daniel Borkmann [Tue, 25 Jun 2013 16:17:28 +0000 (18:17 +0200)]
net: sctp: minor: sctp_seq_dump_local_addrs add missing newline

The trailing newline was missing from the WARN() message; add it.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: sctp: migrate cookie life from timeval to ktime
Daniel Borkmann [Tue, 25 Jun 2013 16:17:27 +0000 (18:17 +0200)]
net: sctp: migrate cookie life from timeval to ktime

Currently, SCTP code defines its own timeval functions (since timeval
is rarely used inside the kernel by others), namely tv_lt() and
TIMEVAL_ADD() macros, that operate on SCTP cookie expiration.

We might as well remove all those, and operate directly on ktime
structures for a couple of reasons: ktime is available on all archs;
complexity of ktime calculations depending on the arch is less than
(reduces to simple arithmetic operations on archs with
BITS_PER_LONG == 64 or CONFIG_KTIME_SCALAR) or equal to timeval
functions (other archs); code becomes more readable; macros can be
thrown out.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago ktime: add ms_to_ktime() and ktime_add_ms() helpers
Daniel Borkmann [Tue, 25 Jun 2013 16:17:26 +0000 (18:17 +0200)]
ktime: add ms_to_ktime() and ktime_add_ms() helpers

Add two ktime helper functions that i) convert a given msec value to
a ktime structure and ii) add a msec value to a ktime structure.
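
What the two helpers compute can be sketched in userspace C. This is an
assumption-laden illustration rather than the kernel code: ktime is modelled
as a plain signed 64-bit nanosecond count, and ktime_sketch_t,
ms_to_ktime_sketch() and ktime_add_ms_sketch() are hypothetical names.

```c
#include <stdint.h>

#define NSEC_PER_MSEC INT64_C(1000000)

/* Stand-in for ktime_t: a scalar count of nanoseconds. */
typedef struct {
	int64_t tv64;
} ktime_sketch_t;

/* i) Convert a millisecond value to the nanosecond-based type. */
static ktime_sketch_t ms_to_ktime_sketch(uint64_t ms)
{
	ktime_sketch_t kt = { (int64_t)ms * NSEC_PER_MSEC };

	return kt;
}

/* ii) Add a millisecond value to an existing time. */
static ktime_sketch_t ktime_add_ms_sketch(ktime_sketch_t kt, uint64_t msec)
{
	kt.tv64 += (int64_t)msec * NSEC_PER_MSEC;
	return kt;
}
```

With a scalar representation both helpers reduce to one multiplication and,
for the second, one addition.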

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: sctp: remove TEST_FRAME ifdef
Daniel Borkmann [Tue, 25 Jun 2013 16:17:25 +0000 (18:17 +0200)]
net: sctp: remove TEST_FRAME ifdef

We neither ship a test_frame.h, nor will this be compatible with
the 2.5 out-of-tree lksctp kernel test suite anyway. So remove this
artefact.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net/mlx4_core: Fail device init if num_vfs is negative
Jack Morgenstein [Tue, 25 Jun 2013 09:09:38 +0000 (12:09 +0300)]
net/mlx4_core: Fail device init if num_vfs is negative

Should not allow negative num_vfs

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.com>
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>