git.karo-electronics.de Git - karo-tx-linux.git/log

vlan: Calling vlan_hwaccel_do_receive() is always valid.

It is now acceptable to receive vlan tagged packets at any time,
even if CONFIG_VLAN_8021Q is not set. This means that calling
vlan_hwaccel_do_receive() should not result in BUG() but rather just
behave as if there were no vlan devices configured.

Reported-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
net/core/dev.c

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6

Merge branch 'for-patrick' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6

tproxy: use the interface primary IP address as a default value for --on-ip

The REDIRECT target and the older TProxy versions used the primary address
of the incoming interface as the default value of the --on-ip parameter.
This was unintentionally changed during the initial TProxy submission and
caused confusion among users.

Since IPv6 has no notion of primary address, we just select the first address
on the list: this way the socket lookup finds wildcard bound sockets
properly and we cannot really do better without the user telling us the
IPv6 address of the proxy.

This is implemented for both IPv4 and IPv6.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

tproxy: added IPv6 support to the socket match

The ICMP extraction bits were contributed by Harry Mason.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

cxgb3: function namespace cleanup

Make local functions static. Remove functions that are
defined and never used. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tproxy: added IPv6 support to the TPROXY target

This requires a new revision as the old target structure was
IPv4 specific.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

tproxy: added IPv6 socket lookup function to nf_tproxy_core

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

be2net: Changes to use only priority codes allowed by f/w

Changes to use one of the priority codes allowed by CNA f/w for NIC traffic
from host. The driver gets the bit map of the priority codes allowed for
host traffic through a asynchronous message from the f/w that the driver
subscribes to.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

tproxy: added tproxy sockopt interface in the IPV6 layer

Support for IPV6_RECVORIGDSTADDR sockopt for UDP sockets were contributed by
Harry Mason.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

tproxy: added udp6_lib_lookup function

Just like with IPv4, we need access to the UDP hash table to look up local
sockets, but instead of exporting the global udp_table, export a lookup
function.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

tproxy: added const specifiers to udp lookup functions

The parameters for various UDP lookup functions were non-const, even though
they could be const. TProxy has some const references and instead of
downcasting it, I added const specifiers along the path.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

tproxy: split off ipv6 defragmentation to a separate module

Like with IPv4, TProxy needs IPv6 defragmentation but does not
require connection tracking. Since defragmentation was coupled
with conntrack, I split off the two, creating an nf_defrag_ipv6 module,
similar to the already existing nf_defrag_ipv4.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

l2tp: small cleanup

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nf_nat: restrict ICMP translation for embedded header

Skip ICMP translation of embedded protocol header
if NAT bits are not set. Needed for IPVS to see the original
embedded addresses because for IPVS traffic the IPS_SRC_NAT_BIT
and IPS_DST_NAT_BIT bits are not set. It happens when IPVS performs
DNAT for client packets after using nf_conntrack_alter_reply
to expect replies from real server.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

can: mcp251x: fix generation of error frames

The function "mcp251x_error_skb" is used to generate error frames.
They are identified by the "CAN_ERR_FLAG" in can_id. The function
overwrites the can_id so that the frames show up as normal frames instead
of error frames.

This patch fixes the problem by or'ing the can_id instead of overwriting it.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Tested-by: Jargalan Nermunkh <jargalan.nermunkh@criticallink.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set

Commit d3cd15657516141adce387810be8cb377baf020e introduced a bug, the
interrupt handler would loop endlessly if the CANINTF_MERRF bit is set,
because it's not cleared.

This patch fixes the problem by masking out the CANINTF_MERRF and all other
non interesting bits.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

can-raw: add msg_flags to distinguish local traffic

CAN has no addressing scheme. It is currently impossible for userspace
to tell is a received CAN frame comes from another process on the local
host, or from a remote CAN device.

This patch add support for userspace applications to distinguish between
'own', 'local' and 'remote' CAN traffic. The distinction is made by returning
flags in msg->msg_flags in the call to recvmsg().

The added documentation explains the introduced flags.

Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9p: client code cleanup

Make p9_client_version static since only used in one file.
Remove p9_client_auth because it is defined but never used.
Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rds: make local functions/variables static

The RDS protocol has lots of functions that should be
declared static. rds_message_get/add_version_extension is
removed since it defined but never used.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

napi: unexport napi_reuse_skb

The function napi_reuse_skb is only used inside core.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: delete needless memset from bearer enabling.

Eliminates zeroing out of the new bearer structure at the start of
activation, since it is already in that state.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net/ax88796.c: Return error code in failure

In this code, 0 is returned on failure, even though other
failures return -ENOMEM or other similar values.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@a@
identifier alloc;
identifier ret;
constant C;
expression x;
@@

x = alloc(...);
if (x == NULL) { <+... \(ret = -C; \| return -C; \) ...+> }

@@
identifier f, a.alloc;
expression ret;
expression x,e1,e2,e3;
@@

ret = 0
... when != ret = e1
*x = alloc(...)
... when != ret = e2
if (x == NULL) { ... when != ret = e3
return ret;
}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

b44: fix resume, request_irq after hw reset

On resume, call request_irq() after resetting the hardware rather than
before. It's a shared interrupt so the handler could be called
immediately if another device on the same irq interrupts (and will be
called immediately if CONFIG_DEBUG_SHIRQ=y), but unless the hardware is
reinitialised with b44_init_hw() the read of the interrupt status
register will hang the system.

Signed-off-by: James Hogan <james@albanarts.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: function namespace cleanup (v3)

Make functions only used in one file local.
Remove lots of dead code, relating to unsupported functions
in mainline driver like RSS, IPv6, and TCP offload.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

CAPI: Silence lockdep warning on get_capi_appl_by_nr usage

As long as we hold capi_controller_lock, we can safely access
capi_applications without RCU protection as no one can modify the
application list underneath us. Introduce an RCU-free
__get_capi_appl_by_nr for this purpose. This silences lockdep warnings
on suspicious rcu_dereference usage.

Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Forward reserved group addresses if !STP

Make all frames sent to reserved group MAC addresses (01:80:c2:00:00:00 to
01:80:c2:00:00:0f) be forwarded if STP is disabled. This enables
forwarding EAPOL frames, among other things.

Signed-off-by: Benjamin Poirier <benjamin.poirier@polymtl.ca>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/neighbour: cancel_delayed_work() + flush_scheduled_work() -> cancel_delayed_work_sync()

flush_scheduled_work() is going away. Prepare for it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert d88dca79d3852a3623f606f781e013d61486828a

TIPC needs to have its endianess issues fixed. Unfortunately, the format of a
subscriber message is passed in directly from user space, so requiring this
message to be in network byte order breaks user space ABI. Revert this change
until such time as we can determine how to do this in a backwards compatible
manner.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert c6537d6742985da1fbf12ae26cde6a096fd35b5c

Backout the tipc changes to the flags int he subscription message. These
changees, while reasonable on the surface, interefere with user space ABI
compatibility which is a no-no. This was part of the changes to fix the
endianess issues in the TIPC protocol, which would be really nice to do but we
need to do so in a way that is backwards compatible with user space.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tproxy: fix hash locking issue when using port redirection in __inet_inherit_port()

When __inet_inherit_port() is called on a tproxy connection the wrong locks are
held for the inet_bind_bucket it is added to. __inet_inherit_port() made an
implicit assumption that the listener's port number (and thus its bind bucket).
Unfortunately, if you're using the TPROXY target to redirect skbs to a
transparent proxy that assumption is not true anymore and things break.

This patch adds code to __inet_inherit_port() so that it can handle this case
by looking up or creating a new bind bucket for the child socket and updates
callers of __inet_inherit_port() to gracefully handle __inet_inherit_port()
failing.

Reported by and original patch from Stephen Buck <stephen.buck@exinda.com>.
See http://marc.info/?t=128169268200001&r=1&w=2 for the original discussion.

Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

net/core: Allow tagged VLAN packets to flow through VETH devices.

When there are VLANs on a VETH device, the packets being transmitted
through the VETH device may be 4 bytes bigger than MTU. A check
in dev_forward_skb did not take this into account and so dropped
these packets.

This patch is needed at least as far back as 2.6.34.7 and should
be considered for -stable.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

phy/marvell: fix 88e1121 support

Commit c477d0447db08068a497e7beb892b2b2a7bff64b added support for RGMII
rx/tx delays except that it ends up clearing rx/tx delays bit for modes
differents that RGMII*ID. Due to this, ethernet is not working anymore
on my guruplug server +. This patch is fixing that.

Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: add a refcnt when turning on/off FCoE offload capability

The FCoE offload is enabled/disabled per adapter, but upper FCoE protocol
stack could have multiple FCoE instances created on the same physical network
interface, e.g., FCoE on multiple VLAN interfaces on the same physical
network interface. In this case we want to turn on FCoE offload at the first
request from ndo_fcoe_enable() but only turn off FCoE offload at the very last
call to ndo_fcoe_disable(). This is fixed by adding a refcnt in the per adapter
FCoE structure and tear down FCoE offload when refcnt decrements to zero.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: fix stats handling

Current ixgbe stats have following problems :

- Not 64 bit safe (on 32bit arches)

- Not safe in ixgbe_clean_rx_irq() :
All cpus dirty a common location (netdev->stats.rx_bytes &
netdev->stats.rx_packets) without proper synchronization.
This slow down a bit multiqueue operations, and possibly miss some
updates.

Fixes :

Implement ndo_get_stats64() method to provide accurate 64bit rx|tx
bytes/packets counters, using 64bit safe infrastructure.

ixgbe_get_ethtool_stats() also use this infrastructure to provide 64bit
safe counters.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: update copyright info

Update copyright notice

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Advance version number

Advance version number and update copyright info

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Adding mii-tool support

Adding mii-tool support for some distribution only have mii-tool
installed by default.

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Prevent possible read re-order error

Adding read memory barrier in between flag reading and data reading of
receive descriptors. This prevents the data being read before hardware
complete writing informations.

Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Add comment in jme_set_settings

Explains what `fdc` variable is for.

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Fix PHY power-off error

Adding phy_on in opposition to phy_off.

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Cc: <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4()

Also, inline this function as the lookup_type is always a literal
and inlining removes branches performed at runtime.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple

Without tproxy redirections an incoming SYN kicks out conflicting
TIME_WAIT sockets, in order to handle clients that reuse ports
within the TIME_WAIT period.

The same mechanism didn't work in case TProxy is involved in finding
the proper socket, as the time_wait processing code looked up the
listening socket assuming that the listener addr/port matches those
of the established connection.

This is not the case with TProxy as the listener addr/port is possibly
changed with the tproxy rule.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

bonding: cleanup: remove braces from single block statements

checkpatch.pl cleanup : Remove braces from single statement
blocks.

Signed-off-by: Bandan Das <bandan.das@stratus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: cleanup : add space around operators

checkpatch.pl cleanup: Added spaces around operators at various places.
Also fixed some c99 style comments that I came across.

Signed-off-by: Bandan Das <bandan.das@stratus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Drivers: atm: Makefile: replace the use of <module>-objs with <module>-y

Changed <module>-objs to <module>-y in Makefile.

Signed-off-by: Tracey Dent <tdent48227@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

smsc95xx: generate random MAC address once, not every ifup

The smsc95xx driver currently generates a new random MAC address
every time the interface is brought up. This makes it impossible to
override using the standard `ifconfig hw ether` approach.

Past patches tried to make the MAC address a module parameter or
base it off the die ID, but it seems to me much simpler (and
hopefully less controversial) to stick with the current random
generation scheme, but allow the user to change the address.

This patch does exactly that - it moves the random address
generation from smsc95xx_reset() into smsc95xx_bind(), so that it is
done once on module load, not on every ifup. The user can then
override this using the standard mechanisms.

Applies against 2.6.35 and linux-2.6 head.

Signed-off-by: Bernard Blackham <b-omap@largestprime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2: Increase max rx ring size from 1K to 2K

A number of customers are reporting packet loss under certain workloads
(e.g. heavy bursts of small packets) with flow control disabled. A larger
rx ring helps to prevent these losses.

No change in default rx ring size and memory consumption.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Acked-by: John Feeney <jfeeney@redhat.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net_sched: remove the unused parameter of qdisc_create_dflt()

The first parameter dev isn't in use in qdisc_create_dflt().

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: make release_and_destroy static

Only used in main file.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xfrm: make xfrm_bundle_ok local

Only used in one place.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rtnetlink: remove rtnl_kill_links

The function rtnl_kill_links is defined but never used.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xfrm6: make xfrm6_tunnel_free_spi local

Function only defined and used in one file.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: make local functions static

Make routines that are only used in one file static.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vmxnet3: make bit twiddle routines inline

Gcc doesn't usually handle inline across compilation units, and the
functions don't have to be global in scope. Move the set/reset flag
functions int the existing vmxnet3 header.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Shreyas Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: make bond_resend_igmp_join_requests_delayed static

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: make functions static

Make local functions and variable static. Do some rearrangement
of the string table stuff to put it where it gets used.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: make br_parse_ip_options static

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

socket: localize functions

A couple of functions in socket.c are only used there and
should be localized.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen: make local function static.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fib: introduce fib_alias_accessed() helper

Perf tools session at NFWS 2010 pointed out a false sharing on struct
fib_alias that can be avoided pretty easily, if we set FA_S_ACCESSED bit
only if needed (ie : not already set)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: fix missing spinlock init

Under network load, doing :

tc qdisc del dev eth0 root

triggers :

[  167.193087] BUG: spinlock bad magic on CPU#3, udpflood/4928
[  167.193139]  lock: c15bc324, .magic: 00000000, .owner:
<none>/-1, .owner_cpu: -1
[  167.193193] Pid: 4928, comm: udpflood Not tainted
2.6.36-rc7-11417-g215340c-dirty #323
[  167.193245] Call Trace:
[  167.193292]  [<c13abaa0>] ? printk+0x18/0x20
[  167.193342]  [<c11afb53>] spin_bug+0xa3/0xf0
[  167.193389]  [<c11afcdd>] do_raw_spin_lock+0x7d/0x160
[  167.193440]  [<c1313d4e>] ? __dev_xmit_skb+0x27e/0x2b0
[  167.193496]  [<c107382b>] ? trace_hardirqs_on+0xb/0x10
[  167.193545]  [<c13ae99a>] _raw_spin_lock+0x3a/0x40
[  167.193593]  [<c1313d4e>] ? __dev_xmit_skb+0x27e/0x2b0
[  167.193641]  [<c1313d4e>] __dev_xmit_skb+0x27e/0x2b0

commit 79640a4ca695 (add additional lock to qdisc to increase
throughput) forgot to initialize  noop_qdisc and noqueue_qdisc busylock

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipvs: provide address family for debugging

As skb->protocol is not valid in LOCAL_OUT add
parameter for address family in packet debugging functions.
Even if ports are not present in AH and ESP change them to
use ip_vs_tcpudp_debug_packet to show at least valid addresses
as before. This patch removes the last user of skb->protocol
in IPVS.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: inherit forwarding method in backup

Connections in backup server should inherit the
forwarding method from real server. It is a way to fix a
problem where the forwarding method in backup connection
is damaged by logical OR operation with the real server's
connection flags. And the change is needed for setups
where the backup server uses different forwarding method
for the same real servers.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: changes for local client

This patch deals with local client processing.

Prefer LOCAL_OUT hook for scheduling connections from
local clients. LOCAL_IN is still supported if the packets are
not marked as processed in LOCAL_OUT. The idea to process
requests in LOCAL_OUT is to alter conntrack reply before
it is confirmed at POST_ROUTING. If the local requests are
processed in LOCAL_IN the conntrack can not be updated
and matching by state is impossible.

Add the following handlers:

- ip_vs_reply[46] at LOCAL_IN:99 to process replies from
remote real servers to local clients. Now when both
replies from remote real servers (ip_vs_reply*) and
local real servers (ip_vs_local_reply*) are handled
it is safe to remove the conn_out_get call from ip_vs_in
because it does not support related ICMP packets.

- ip_vs_local_request[46] at LOCAL_OUT:-98 to process
requests from local client

Handling in LOCAL_OUT causes some changes:

- as skb->dev, skb->protocol and skb->pkt_type are not defined
in LOCAL_OUT make sure we set skb->dev before calling icmpv6_send,
prefer skb_dst(skb) for struct net and remove the skb->protocol
checks from TUN transmitters.

[ horms@verge.net.au: removed trailing whitespace ]
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: changes for local real server

This patch deals with local real servers:

- Add support for DNAT to local address (different real server port).
It needs ip_vs_out hook in LOCAL_OUT for both families because
skb->protocol is not set for locally generated packets and can not
be used to set 'af'.

- Skip packets in ip_vs_in marked with skb->ipvs_property because
ip_vs_out processing can be executed in LOCAL_OUT but we still
have the conn_out_get check in ip_vs_in.

- Ignore packets with inet->nodefrag from local stack

- Require skb_dst(skb) != NULL because we use it to get struct net

- Add support for changing the route to local IPv4 stack after DNAT
depending on the source address type. Local client sets output
route and the remote client sets input route. It looks like
IPv6 does not need such rerouting because the replies use
addresses from initial incoming header, not from skb route.

- All transmitters now have strict checks for the destination
address type: redirect from non-local address to local real
server requires NAT method, local address can not be used as
source address when talking to remote real server.

- Now LOCALNODE is not set explicitly as forwarding
method in real server to allow the connections to provide
correct forwarding method to the backup server. Not sure if
this breaks tools that expect to see 'Local' real server type.
If needed, this can be supported with new flag IP_VS_DEST_F_LOCAL.
Now it should be possible connections in backup that lost
their fwmark information during sync to be forwarded properly
to their daddr, even if it is local address in the backup server.
By this way backup could be used as real server for DR or TUN,
for NAT there are some restrictions because tuple collisions
in conntracks can create problems for the traffic.

- Call ip_vs_dst_reset when destination is updated in case
some real server IP type is changed between local and remote.

[ horms@verge.net.au: removed trailing whitespace ]
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: move ip_route_me_harder for ICMP

Currently, ip_route_me_harder after ip_vs_out_icmp
is called even if packet is not related to IPVS connection.
Move it into handle_response_icmp. Also, force rerouting
if sending to local client because IPv4 stack uses addresses
from the route.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: create ip_vs_defrag_user

Create new function ip_vs_defrag_user to return correct
IP_DEFRAG_xxx user depending on the hooknum. It will be needed
when we add handlers in LOCAL_OUT.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: fix CHECKSUM_PARTIAL for TUN method

The recent change in IP_VS_XMIT_TUNNEL to set
CHECKSUM_NONE is not correct. After adding IPIP header
skb->csum becomes invalid but the CHECKSUM_PARTIAL
case must be supported. So, use skb_forward_csum() which is
most suitable for us to allow local clients to send IPIP
to remote real server.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: stop ICMP from FORWARD to local

Delivering locally ICMP from FORWARD hook is not supported.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: do not schedule conns from real servers

This patch is needed to avoid scheduling of
packets from local real server when we add ip_vs_in
in LOCAL_OUT hook to support local client.

Currently, when ip_vs_in can not find existing
connection it tries to create new one by calling ip_vs_schedule.

The default indication from ip_vs_schedule was if
connection was scheduled to real server. If real server is
not available we try to use the bypass forwarding method
or to send ICMP error. But in some cases we do not want to use
the bypass feature. So, add flag 'ignored' to indicate if
the scheduler ignores this packet.

Make sure we do not create new connections from replies.
We can hit this problem for persistent services and local real
server when ip_vs_in is added to LOCAL_OUT hook to handle
local clients.

Also, make sure ip_vs_schedule ignores SYN packets
for Active FTP DATA from local real server. The FTP DATA
connection should be created on SYN+ACK from client to assign
correct connection daddr.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: switch to notrack mode

Change skb->ipvs_property semantic. This is preparation
to support ip_vs_out processing in LOCAL_OUT. ipvs_property=1
will be used to avoid expensive lookups for traffic sent by
transmitters. Now when conntrack support is not used we call
ip_vs_notrack method to avoid problems in OUTPUT and
POST_ROUTING hooks instead of exiting POST_ROUTING as before.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: optimize checksums for apps

Avoid full checksum calculation for apps that can provide
info whether csum was broken after payload mangling. For now only
ip_vs_ftp mangles payload and it updates the csum, so the full
recalculation is avoided for all packets.

Add CHECKSUM_UNNECESSARY for snat_handler (TCP and UDP).
It is needed to support SNAT from local address for the case
when csum is fully recalculated.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

ipvs: fix CHECKSUM_PARTIAL for TCP, UDP

Fix CHECKSUM_PARTIAL handling. Tested for IPv4 TCP,
UDP not tested because it needs network card with HW CSUM support.
May be fixes problem where IPVS can not be used in virtual boxes.
Problem appears with DNAT to local address when the local stack
sends reply in CHECKSUM_PARTIAL mode.

Fix tcp_dnat_handler and udp_dnat_handler to provide
vaddr and daddr in right order (old and new IP) when calling
tcp_partial_csum_update/udp_partial_csum_update (CHECKSUM_PARTIAL).

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

r8169: print errors when dma mapping fail

If dma mapping fail we are dropping packages or fail to open device.
But exact reason of drop/fail stays unknow for a user, so print errors.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: (re)init phy on resume

Fix switching device to low-speed mode after resume reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=502974

Reported-and-tested-by: Laurentiu Badea <bugzilla-redhat@wotevah.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: changing mtu clean up

Since we do not change rx buffer size any longer, we can
clean up rtl8169_change_mtu and in consequence rtl8169_down.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: do not account fragments as packets

Only increase tx_{packets,dropped} statistics when transmit or drop
full skb, not just fragment.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: use pointer to struct device as local variable

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: replace PCI_DMA_{TO,FROM}DEVICE to DMA_{TO,FROM}_DEVICE

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: init rx ring cleanup

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: check dma mapping failures

Check possible dma mapping errors and do clean up if it happens.

Fix overwrap bug in rtl8169_tx_clear on the way.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Update bnx2x to use new vlan accleration.

Make the bnx2x driver use the new vlan accleration model.

Signed-off-by: Hao Zheng <hzheng@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
CC: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: Update ixgbe to use new vlan accleration.

Make the ixgbe driver use the new vlan accleration model.

Signed-off-by: Jesse Gross <jesse@nicira.com>
CC: Peter Waskiewicz <peter.p.waskiewicz.jr@intel.com>
CC: Emil Tantilov <emil.s.tantilov@intel.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2: Update bnx2 to use new vlan accleration.

Make the bnx2 driver use the new vlan accleration model.

Signed-off-by: Jesse Gross <jesse@nicira.com>
CC: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Add support for TX vlan offload.

If some of the underlying devices support it, enable vlan offload on
transmit for bridge devices. This allows senders to take advantage of the
hardware support, similar to other forms of acceleration.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethtool: Add support for vlan accleration.

Now that vlan acceleration is handled consistently regardless of usage,
it is possible to enable and disable it at will. This adds support for
Ethtool operations that change the offloading status for debugging
purposes, similar to other forms of hardware acceleration.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: Centralize handling of hardware acceleration.

Currently each driver that is capable of vlan hardware acceleration
must be aware of the vlan groups that are configured and then pass
the stripped tag to a specialized receive function. This is

different from other types of hardware offload in that it places a
significant amount of knowledge in the driver itself rather keeping
it in the networking core.

This makes vlan offloading function more similarly to other forms
of offloading (such as checksum offloading or TSO) by doing the
following:
* On receive, stripped vlans are passed directly to the network
core, without attempting to check for vlan groups or reconstructing
the header if no group
* vlans are made less special by folding the logic into the main
receive routines
* On transmit, the device layer will add the vlan header in software
if the hardware doesn't support it, instead of spreading that logic
out in upper layers, such as bonding.

There are a number of advantages to this:
* Fixes all bugs with drivers incorrectly dropping vlan headers at once.
* Avoids having to disable VLAN acceleration when in promiscuous mode
(good for bridging since it always puts devices in promiscuous mode).
* Keeps VLAN tag separate until given to ultimate consumer, which
avoids needing to do header reconstruction as in tg3 unless absolutely
necessary.
* Consolidates common code in core networking.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: Avoid hash table lookup to find group.

A struct net_device always maps to zero or one vlan groups and we
always know the device when we are looking up a group. We currently
do a hash table lookup on the device to find the group but it is
much simpler to just store a pointer.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: Enable software emulation for vlan accleration.

Currently users of hardware vlan accleration need to know whether
the device supports it before generating packets. However, vlan
acceleration will soon be available in a more flexible manner so
knowing ahead of time becomes much more difficult. This adds
a software fallback path for vlan packets on devices without the
necessary offloading support, similar to other types of hardware
accleration.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: Don't check for vlan group before vlan_tx_tag_present.

Many (but not all) drivers check to see whether there is a vlan
group configured before using a tag stored in the skb. There's
not much point in this check since it just throws away data that
should only be present in the expected circumstances. However,
it will soon be legal and expected to get a vlan tag when no
vlan group is configured, so remove this check from all drivers
to avoid dropping the tags.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: Rename VLAN_GROUP_ARRAY_LEN to VLAN_N_VID.

VLAN_GROUP_ARRAY_LEN is simply the number of possible vlan VIDs.
Since vlan groups will soon be more of an implementation detail
for vlan devices, rename the constant to be descriptive of its
actual purpose.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ebtables: Allow filtering of hardware accelerated vlan frames.

An upcoming commit will allow packets with hardware vlan acceleration
information to be passed though more parts of the network stack, including
packets trunked through the bridge. This adds support for matching and
filtering those packets through ebtables.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: Fix log message

Fix a log message

Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: Change min MTU

Change min MTU to 68.

Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: Replace firmware devcmd CMD_ENABLE with CMD_ENABLE_WAIT

Replace no wait CMD_ENABLE firmware devcmd with CMD_ENABLE_WAIT

Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: Make firmware cognizant of the user set mac address

Let the firmware know about the mac address set by the user using ndo_set_mac_address

Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: Add support for multiple hardware receive queues

Add support for multiple hardware receive queues. The ingress traffic is hashed into one of the receive queues based on IP or TCP or both headers. The max no. of receive queues supported is 8.

Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>