git.karo-electronics.de Git - linux-beck.git/log

]> git.karo-electronics.de Git - linux-beck.git/log

projects / linux-beck.git / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Matan Barak [Thu, 11 Dec 2014 08:57:56 +0000 (10:57 +0200)]

net/mlx4: Add mlx4_bitmap zone allocator

The zone allocator is a mechanism which manages a few mlx4_bitmaps.

When allocating a resource, the user indicates the desired zone of
which this resource will be allocated from. If possible, the resource
will be allocated from this zone. Otherwise, the resource will be
allocated from a less-than, equal-to, higher-than priority zone,
according to the desired zone's properties with that respective
allocation order.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Dotan Barak [Thu, 11 Dec 2014 08:57:55 +0000 (10:57 +0200)]

net/mlx4: Add a check if there are too many reserved QPs

The number of reserved QPs is affected both from the firmware and
from the driver's requirements. This patch adds a check that
validates that this number is indeed feasable.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eugenia Emantayev [Thu, 11 Dec 2014 08:57:54 +0000 (10:57 +0200)]

net/mlx4: Change QP allocation scheme

When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields
in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset.

The current Ethernet driver code reserves a Tx QP range with 256b alignment.

This is wrong because if there are more than 64 Tx QPs in use,
QPNs >= base + 65 will have bits 6/7 set.

This problem is not specific for the Ethernet driver, any entity that
tries to reserve more than 64 BF-enabled QPs should fail. Also, using
ranges is not necessary here and is wasteful.

The new mechanism introduced here will support reservation for
"Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs
(when hypervisors support WC in VMs). The flow we use is:

1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation,
and request "BF enabled QPs" if BF is supported for the function

2. In the ALLOC_RES FW command, change param1 to:
a. param1[23:0] - number of QPs
b. param1[31-24] - flags controlling QPs reservation

Bit 31 refers to Eth blueflame supported QPs. Those QPs must have
bits 6 and 7 unset in order to be used in Ethernet.

Bits 24-30 of the flags are currently reserved.

When a function tries to allocate a QP, it states the required attributes
for this QP. Those attributes are considered "best-effort". If an attribute,
such as Ethernet BF enabled QP, is a must-have attribute, the function has
to check that attribute is supported before trying to do the allocation.

In a lower layer of the code, mlx4_qp_reserve_range masks out the bits
which are unsupported. If SRIOV is used, the PF validates those attributes
and masks out unsupported attributes as well. In order to notify VFs which
attributes are supported, the VF uses QUERY_FUNC_CAP command. This command's
mailbox is filled by the PF, which notifies which QP allocation attributes
it supports.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Matan Barak [Thu, 11 Dec 2014 08:57:53 +0000 (10:57 +0200)]

net/mlx4_core: Use tasklet for user-space CQ completion events

Previously, we've fired all our completion callbacks straight from our ISR.

Some of those callbacks were lightweight (for example, mlx4_en's and
IPoIB napi callbacks), but some of them did more work (for example,
the user-space RDMA stack uverbs' completion handler). Besides that,
doing more than the minimal work in ISR is generally considered wrong,
it could even lead to a hard lockup of the system. Since when a lot
of completion events are generated by the hardware, the loop over those
events could be so long, that we'll get into a hard lockup by the system
watchdog.

In order to avoid that, add a new way of invoking completion events
callbacks. In the interrupt itself, we add the CQs which receive completion
event to a per-EQ list and schedule a tasklet. In the tasklet context
we loop over all the CQs in the list and invoke the user callback.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Or Gerlitz [Thu, 11 Dec 2014 08:57:52 +0000 (10:57 +0200)]

net/mlx4_core: Mask out host side virtualization features for guests

When VFs (guests in this context) issue the QUERY_DEV_CAP command, they
need not be told that host side virtualization features such as VST, FSM
(MAC anti-spoofing) and running > 80 VFs are supported by the device.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Or Gerlitz [Thu, 11 Dec 2014 08:57:51 +0000 (10:57 +0200)]

net/mlx4_en: Set csum level for encapsulated packets

This was dropped by mistake for the napi_gro_frags flow, fix that.

Fixes: dd65beac48a5 ('net/mlx4_en: Extend usage of napi_gro_frags')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Sriharsha Basavapatna [Thu, 11 Dec 2014 08:24:47 +0000 (03:24 -0500)]

be2net: Export tunnel offloads only when a VxLAN tunnel is created

The encapsulated offload flags shouldn't be unconditionally exported
to the stack. The stack expects offloading to work across all tunnel
types when those flags are set. This would break other tunnels (like
GRE) since be2net currently supports tunnel offload for VxLAN only.

Also, with VxLANs Skyhawk-R can offload only 1 UDP dport. If more
than 1 UDP port is added, we should disable offloads in that case too.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Kevin Hao [Thu, 11 Dec 2014 06:08:41 +0000 (14:08 +0800)]

gianfar: Fix dma check map error when DMA_API_DEBUG is enabled

We need to use dma_mapping_error() to check the dma address returned
by dma_map_single/page(). Otherwise we would get warning like this:
  WARNING: at lib/dma-debug.c:1140
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc2-next-20141029 #196
  task: c0834300 ti: effe6000 task.ti: c0874000
  NIP: c02b2c98 LR: c02b2c98 CTR: c030abc4
  REGS: effe7d70 TRAP: 0700   Not tainted  (3.18.0-rc2-next-20141029)
  MSR: 00021000 <CE,ME>  CR: 22044022  XER: 20000000

  GPR00: c02b2c98 effe7e20 c0834300 00000098 00021000 00000000 c030b898 00000003
  GPR08: 00000001 00000000 00000001 749eec9d 22044022 1001abe0 00000020 ef278678
  GPR16: ef278670 ef278668 ef278660 070a8040 c087f99c c08cdc60 00029000 c0840d44
  GPR24: c08be6e8 c0840000 effe7e78 ef041340 00000600 ef114e10 00000000 c08be6e0
  NIP [c02b2c98] check_unmap+0x51c/0x9e4
  LR [c02b2c98] check_unmap+0x51c/0x9e4
  Call Trace:
  [effe7e20] [c02b2c98] check_unmap+0x51c/0x9e4 (unreliable)
  [effe7e70] [c02b31d8] debug_dma_unmap_page+0x78/0x8c
  [effe7ed0] [c03d1640] gfar_clean_rx_ring+0x208/0x488
  [effe7f40] [c03d1a9c] gfar_poll_rx_sq+0x3c/0xa8
  [effe7f60] [c04f8714] net_rx_action+0xc0/0x178
  [effe7f90] [c00435a0] __do_softirq+0x100/0x1fc
  [effe7fe0] [c0043958] irq_exit+0xa4/0xc8
  [effe7ff0] [c000d14c] call_do_irq+0x24/0x3c
  [c0875e90] [c00048a0] do_IRQ+0x8c/0xf8
  [c0875eb0] [c000ed10] ret_from_except+0x0/0x18

For TX, we need to unmap the pages which has already been mapped and
free the skb before return.

For RX, move the dma mapping and error check to gfar_new_skb(). We
would reuse the original skb in the rx ring when either allocating
skb failure or dma mapping error.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hariprasad Shenai [Thu, 11 Dec 2014 05:41:43 +0000 (11:11 +0530)]

cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call

Remove use of calls into t4_fw_hello() with MASTER_MUST, which results in
FW_HELLO_CMD_MASTERFORCE being set. The firmware doesn't support this and of
course any existing PF Drivers will totally go for a toss.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Thu, 11 Dec 2014 04:37:06 +0000 (23:37 -0500)]

Merge branch 'fec-next'

Fugang Duan says:

====================
net: fec: driver code clean and bug fix

The patch serial include code clean and bug fix:
Patch#1: avoid dummy operation during suspend/resume test.
Patch#2: bug fix for i.MX6SX SOC that clean all interrupt events during MAC initial process.
Patch#3: before phy device link status is up, only enable MDIO bus interrupt.

V2:
- Modify the comment form from David's suggestion.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Nimrod Andy [Thu, 11 Dec 2014 01:20:33 +0000 (09:20 +0800)]

net: fec: only enable mdio interrupt before phy device link up

Before phy device link up, we only enable FEC mdio interrupt, which
is more reasonable.

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Nimrod Andy [Thu, 11 Dec 2014 01:20:32 +0000 (09:20 +0800)]

net: fec: clear all interrupt events to support i.MX6SX

For i.MX6SX FEC controller, there have interrupt mask and event
field extension. To support all SOCs FEC, we clear all interrupt
events during MAVC initial process.

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Nimrod Andy [Thu, 11 Dec 2014 01:20:31 +0000 (09:20 +0800)]

net: fec: reset fep link status in suspend function

On some i.MX6 serial boards, phy power and refrence clock are supplied
or controlled by SOC. When do suspend/resume test, the power and clock
are disabled, so phy device link down.

For current driver, fep->link is still up status, which cause extra operation
like below code. To avoid the dumy operation, we set fep->link to down when
phy device is real down.
...
if (fep->link) {
napi_disable(&fep->napi);
netif_tx_lock_bh(ndev);
fec_stop(ndev);
netif_tx_unlock_bh(ndev);
napi_enable(&fep->napi);
fep->link = phy_dev->link;
status_change = 1;
}
...

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Alexei Starovoitov [Thu, 11 Dec 2014 04:14:55 +0000 (20:14 -0800)]

net: sock: fix access via invalid file descriptor

0day robot reported the following crash:
[ 21.233581] BUG: unable to handle kernel NULL pointer dereference at 0000000000000007
[ 21.234709] IP: [<ffffffff8156ebda>] sk_attach_bpf+0x39/0xc2

It's due to bpf_prog_get() returning ERR_PTR.
Check it properly.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gu Zheng [Thu, 11 Dec 2014 03:22:04 +0000 (11:22 +0800)]

net: introduce helper macro for_each_cmsghdr

Introduce helper macro for_each_cmsghdr as a wrapper of the enumerating
cmsghdr from msghdr, just cleanup.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Stephen Rothwell [Wed, 10 Dec 2014 08:48:02 +0000 (19:48 +1100)]

cxgb4/cxgb4vf: global named must be unique

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Wed, 10 Dec 2014 20:48:20 +0000 (15:48 -0500)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/amd/xgbe/xgbe-desc.c
drivers/net/ethernet/renesas/sh_eth.c

Overlapping changes in both conflict cases.

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Joe Perches [Wed, 10 Dec 2014 18:28:58 +0000 (10:28 -0800)]

irda: Convert function pointer arrays and uses to const

Making things const is a good thing.

(x86-64 defconfig with all irda)
$ size net/irda/built-in.o*
   text    data     bss     dec     hex filename
109276    1868     244 111388   1b31c net/irda/built-in.o.new
108828    2316     244 111388   1b31c net/irda/built-in.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Joe Perches [Wed, 10 Dec 2014 17:55:50 +0000 (09:55 -0800)]

llc: Make llc_sap_action_t function pointer arrays const

It's better when function pointer arrays aren't modifiable.

Net change:

$ size net/llc/built-in.o.*
   text    data     bss     dec     hex filename
  61193   12758    1344   75295   1261f net/llc/built-in.o.new
  47113   27030    1344   75487   126df net/llc/built-in.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Joe Perches [Wed, 10 Dec 2014 17:43:57 +0000 (09:43 -0800)]

llc: Make llc_conn_ev_qfyr_t function pointer arrays const

It's better when function pointer arrays aren't modifiable.

Net change from original:

$ size net/llc/built-in.o.*
   text    data     bss     dec     hex filename
  61065   12886    1344   75295   1261f net/llc/built-in.o.new
  47113   27030    1344   75487   126df net/llc/built-in.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Joe Perches [Wed, 10 Dec 2014 16:18:50 +0000 (08:18 -0800)]

llc: Make function pointer arrays const

It's better when function pointer arrays aren't modifiable.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Wed, 10 Dec 2014 20:17:52 +0000 (15:17 -0500)]

Merge branch 'kill_arch_fast_hash'

Daniel Borkmann says:

====================
Kill arch_fast_hash

Due to the size of changes I have based this against net-next,
also given 3.18 is already out. I've split this into 3 parts,
the first two to remove existing users (so they can optionally
go to stable) and the last one to kill the remaining library bits.

Let me know if there are any issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Daniel Borkmann [Wed, 10 Dec 2014 15:33:12 +0000 (16:33 +0100)]

net, lib: kill arch_fast_hash library bits

As there are now no remaining users of arch_fast_hash(), lets kill
it entirely.

This basically reverts commit 71ae8aac3e19 ("lib: introduce arch
optimized hash library") and follow-up work, that is f.e., commit
237217546d44 ("lib: hash: follow-up fixups for arch hash"),
commit e3fec2f74f7f ("lib: Add missing arch generic-y entries for
asm-generic/hash.h") and last but not least commit 6a02652df511
("perf tools: Fix include for non x86 architectures").

Cc: Francesco Fusco <fusco@ntop.org>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Daniel Borkmann [Wed, 10 Dec 2014 15:33:11 +0000 (16:33 +0100)]

net: replace remaining users of arch_fast_hash with jhash

This patch effectively reverts commit 500f80872645 ("net: ovs: use CRC32
accelerated flow hash if available"), and other remaining arch_fast_hash()
users such as from nfsd via commit 6282cd565553 ("NFSD: Don't hand out
delegations for 30 seconds after recalling them.") where it has been used
as a hash function for bloom filtering.

While we think that these users are actually not much of concern, it has
been requested to remove the arch_fast_hash() library bits that arose
from [1] entirely as per recent discussion [2]. The main argument is that
using it as a hash may introduce bias due to its linearity (see avalanche
criterion) and thus makes it less clear (though we tried to document that)
when this security/performance trade-off is actually acceptable for a
general purpose library function.

Lets therefore avoid any further confusion on this matter and remove it to
prevent any future accidental misuse of it. For the time being, this is
going to make hashing of flow keys a bit more expensive in the ovs case,
but future work could reevaluate a different hashing discipline.

[1] https://patchwork.ozlabs.org/patch/299369/
[2] https://patchwork.ozlabs.org/patch/418756/

Cc: Neil Brown <neilb@suse.de>
Cc: Francesco Fusco <fusco@ntop.org>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Daniel Borkmann [Wed, 10 Dec 2014 15:33:10 +0000 (16:33 +0100)]

netlink: use jhash as hashfn for rhashtable

For netlink, we shouldn't be using arch_fast_hash() as a hashing
discipline, but rather jhash() instead.

Since netlink sockets can be opened by any user, a local attacker
would be able to easily create collisions with the DPDK-derived
arch_fast_hash(), which trades off performance for security by
using crc32 CPU instructions on x86_64.

While it might have a legimite use case in other places, it should
be avoided in netlink context, though. As rhashtable's API is very
flexible, we could later on still decide on other hashing disciplines,
if legitimate.

Reference: http://thread.gmane.org/gmane.linux.kernel/1844123
Fixes: e341694e3eb5 ("netlink: Convert netlink_lookup() to use RCU protected hash table")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Wed, 10 Dec 2014 20:06:14 +0000 (15:06 -0500)]

Merge branch 'isdn-next'

Tilman Schmidt says:

====================
ISDN patches for net-next

Here's a series of patches for the Gigaset ISDN driver and one for
the ISDN CAPI subsystem. Please merge as appropriate.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tilman Schmidt [Wed, 10 Dec 2014 12:41:55 +0000 (13:41 +0100)]

isdn/capi: correct argument types of command_2_index

Utility function command_2_index is always called with arguments of
type u8. Adapt its declaration accordingly.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tilman Schmidt [Wed, 10 Dec 2014 12:41:55 +0000 (13:41 +0100)]

isdn/gigaset: enable Kernel CAPI support by default

Kernel CAPI has been the recommended ISDN subsystem for the Gigaset
driver since kernel release 2.6.34.2. It provides full backwards
compatibility to the old I4L subsystem thanks to the capidrv module.
I4L has been marked as deprecated for more than seven years.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tilman Schmidt [Wed, 10 Dec 2014 12:41:55 +0000 (13:41 +0100)]

isdn/gigaset: elliminate unnecessary argument from send_cb()

No need to pass a member of the cardstate structure as a separate
argument if the entire structure is already passed.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tilman Schmidt [Wed, 10 Dec 2014 12:41:55 +0000 (13:41 +0100)]

isdn/gigaset: clarify gigaset_modem_fill control structure

Replace the flag-controlled retry loop by explicit goto statements
in the error branches to make the control structure easier to
understand.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tilman Schmidt [Wed, 10 Dec 2014 12:41:55 +0000 (13:41 +0100)]

isdn/gigaset: drop duplicate declaration

Function gigaset_skb_sent was declared twice, identically, in gigaset.h.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Richard Alpe [Wed, 10 Dec 2014 08:46:54 +0000 (09:46 +0100)]

tipc: fix broadcast wakeup contention after congestion

commit 908344cdda80 ("tipc: fix bug in multicast congestion handling")
introduced a race in the broadcast link wakeup functionality.

This patch eliminates this broadcast link wakeup race caused by
operation on the wakeup list without proper locking. If this race
hit and corrupted the list all subsequent wakeup messages would be
lost, resulting in a considerable memory leak.

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Govindarajulu Varadarajan [Wed, 10 Dec 2014 08:10:23 +0000 (13:40 +0530)]

enic: add support for set/get rss hash key

This patch adds support for setting/getting rss hash key using ethtool.

v2:
respin patch to support RSS hash function changes.

Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Wed, 10 Dec 2014 18:32:02 +0000 (13:32 -0500)]

Merge branch 'napi_page_frags'

Alexander Duyck says:

====================
net: Alloc NAPI page frags from their own pool

This patch series implements a means of allocating page fragments without
the need for the local_irq_save/restore in __netdev_alloc_frag. By doing
this I am able to decrease packet processing time by 11ns per packet in my
test environment.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Alexander Duyck [Wed, 10 Dec 2014 03:41:17 +0000 (19:41 -0800)]

ethernet/broadcom: Use napi_alloc_skb instead of netdev_alloc_skb_ip_align

This patch replaces the calls to netdev_alloc_skb_ip_align in the
copybreak paths.

Cc: Gary Zambrano <zambrano@broadcom.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Alexander Duyck [Wed, 10 Dec 2014 03:41:09 +0000 (19:41 -0800)]

ethernet/realtek: use napi_alloc_skb instead of netdev_alloc_skb_ip_align

This replaces most of the calls to netdev_alloc_skb_ip_align in the Realtek
drivers. The one instance I didn't replace in 8139cp.c is because it was
called as a part of init and as such is not always accessed from the
softirq context.

Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Alexander Duyck [Wed, 10 Dec 2014 03:41:03 +0000 (19:41 -0800)]

cxgb: Use napi_alloc_skb instead of netdev_alloc_skb_ip_align

In order to use napi_alloc_skb I needed to pass a pointer to struct adapter
instead of struct pci_dev. This allowed me to access &adapter->napi.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Alexander Duyck [Wed, 10 Dec 2014 03:40:56 +0000 (19:40 -0800)]

ethernet/intel: Use napi_alloc_skb

This change replaces calls to netdev_alloc_skb_ip_align with
napi_alloc_skb. The advantage of napi_alloc_skb is currently the fact that
the page allocation doesn't make use of any irq disable calls.

There are few spots where I couldn't replace the calls as the buffer
allocation routine is called as a part of init which is outside of the
softirq context.

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Alexander Duyck [Wed, 10 Dec 2014 03:40:49 +0000 (19:40 -0800)]

net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb

This change pulls the core functionality out of __netdev_alloc_skb and
places them in a new function named __alloc_rx_skb.  The reason for doing
this is to make these bits accessible to a new function __napi_alloc_skb.
In addition __alloc_rx_skb now has a new flags value that is used to
determine which page frag pool to allocate from.  If the SKB_ALLOC_NAPI
flag is set then the NAPI pool is used.  The advantage of this is that we
do not have to use local_irq_save/restore when accessing the NAPI pool from
NAPI context.

In my test setup I saw at least 11ns of savings using the napi_alloc_skb
function versus the netdev_alloc_skb function, most of this being due to
the fact that we didn't have to call local_irq_save/restore.

The main use case for napi_alloc_skb would be for things such as copybreak
or page fragment based receive paths where an skb is allocated after the
data has been received instead of before.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Alexander Duyck [Wed, 10 Dec 2014 03:40:42 +0000 (19:40 -0800)]

net: Split netdev_alloc_frag into __alloc_page_frag and add __napi_alloc_frag

This patch splits the netdev_alloc_frag function up so that it can be used
on one of two page frag pools instead of being fixed on the
netdev_alloc_cache.  By doing this we can add a NAPI specific function
__napi_alloc_frag that accesses a pool that is only used from softirq
context.  The advantage to this is that we do not need to call
local_irq_save/restore which can be a significant savings.

I also took the opportunity to refactor the core bits that were placed in
__alloc_page_frag.  First I updated the allocation to do either a 32K
allocation or an order 0 page.  This is based on the changes in commmit
d9b2938aa where it was found that latencies could be reduced in case of
failures.  Then I also rewrote the logic to work from the end of the page to
the start.  By doing this the size value doesn't have to be used unless we
have run out of space for page fragments.  Finally I cleaned up the atomic
bits so that we just do an atomic_sub_and_test and if that returns true then
we set the page->_count via an atomic_set.  This way we can remove the extra
conditional for the atomic_read since it would have led to an atomic_inc in
the case of success anyway.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Wed, 10 Dec 2014 18:17:23 +0000 (13:17 -0500)]

Merge branch 'for-davem-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

More iov_iter work for the networking from Al Viro.

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Flavio Leitner [Wed, 10 Dec 2014 00:41:48 +0000 (22:41 -0200)]

dummy: use MODULE_VERSION

Use MODULE_VERSION() now that dummy driver has a version.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jiri Pirko [Tue, 9 Dec 2014 21:23:29 +0000 (22:23 +0100)]

net: sched: cls: use nla_nest_cancel instead of nlmsg_trim

To cancel nesting, this function is more convenient.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Valdis.Kletnieks@vt.edu [Tue, 9 Dec 2014 21:15:50 +0000 (16:15 -0500)]

net: fix suspicious rcu_dereference_check in net/sched/sch_fq_codel.c

commit 46e5da40ae (net: qdisc: use rcu prefix and silence
sparse warnings) triggers a spurious warning:

net/sched/sch_fq_codel.c:97 suspicious rcu_dereference_check() usage!

The code should be using the _bh variant of rcu_dereference.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Lendacky, Thomas [Tue, 9 Dec 2014 20:54:08 +0000 (14:54 -0600)]

amd-xgbe: Use disable_irq_nosync when in IRQ context

The disable_irq_nosync function, not the disable_irq function, must be
used to disable the DMA channel interrupt from within the interrupt
service routine. Change the disable_irq call to disable_irq_nosync.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David Vrabel [Tue, 9 Dec 2014 18:43:28 +0000 (18:43 +0000)]

xen-netfront: use correct linear area after linearizing an skb

Commit 97a6d1bb2b658ac85ed88205ccd1ab809899884d (xen-netfront: Fix
handling packets on compound pages with skb_linearize) attempted to
fix a problem where an skb that would have required too many slots
would be dropped causing TCP connections to stall.

However, it filled in the first slot using the original buffer and not
the new one and would use the wrong offset and grant access to the
wrong page.

Netback would notice the malformed request and stop all traffic on the
VIF, reporting:

vif vif-3-0 vif3.0: txreq.offset: 85e, size: 4002, end: 6144
vif vif-3-0 vif3.0: fatal error; disabling device

Reported-by: Anthony Wright <anthony@overnetdata.com>
Tested-by: Anthony Wright <anthony@overnetdata.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eric Dumazet [Tue, 9 Dec 2014 17:56:08 +0000 (09:56 -0800)]

tcp: fix more NULL deref after prequeue changes

When I cooked commit c3658e8d0f1 ("tcp: fix possible NULL dereference in
tcp_vX_send_reset()") I missed other spots we could deref a NULL
skb_dst(skb)

Again, if a socket is provided, we do not need skb_dst() to get a
pointer to network namespace : sock_net(sk) is good enough.

Reported-by: Dann Frazier <dann.frazier@canonical.com>
Bisected-by: Dann Frazier <dann.frazier@canonical.com>
Tested-by: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: ca777eff51f7 ("tcp: remove dst refcount false sharing for prequeue mode")
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jan Beulich [Tue, 9 Dec 2014 11:47:04 +0000 (11:47 +0000)]

netback: don't store invalid vif pointer

When xenvif_alloc() fails, it returns a non-NULL error indicator. To
avoid eventual races, we shouldn't store that into struct backend_info
as readers of it only check for NULL.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Nimrod Andy [Tue, 9 Dec 2014 10:46:56 +0000 (18:46 +0800)]

net: fec: avoid kernal crash by NULL pointer when no phy connection

On i.MX6SX sabreauto board, when there have no phy daughter board connection,
there have kernel crash by NULL pointer:

fec 2188000.ethernet eth0: could not attach to PHY
Unable to handle kernel NULL pointer dereference at virtual address 00000220
pgd = 80004000
[00000220] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.24-01042-g27eaeea-dirty #405
task: d8078000 ti: d8076000 task.ti: d8076000
PC is at mutex_lock+0x10/0x54
LR is at phy_start+0x14/0x68
pc : [<806ad4e4>]    lr : [<803b0f90>]    psr: 60000113
sp : d8077d80  ip : 00000000  fp : d83cc000
r10: 0000100c  r9 : d83cc800  r8 : 00000000
r7 : d83bcd0c  r6 : 00000200  r5 : 00000220  r4 : 00000220
r3 : 00000000  r2 : 00000000  r1 : d83bcd90  r0 : 00000220
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5387d  Table: 8000404a  DAC: 00000015
Process swapper/0 (pid: 1, stack limit = 0xd8076240)
Stack: (0xd8077d80 to 0xd8078000)
7d80: 00000000 803b0f90 00000001 00000000 d83bc800 803be034 00000007 805c3fb4
7da0: 00000003 80d4e0bc 805efcb8 fffffff1 fffffff0 00000000 00000000 d8077dfc
7dc0: 0000000d 80d6ce80 80d126b0 800499c8 d83bc800 d83bc800 806f0f40 d83bc82c
7de0: 00000000 00000000 80d6ce80 80d126b0 0000016b 80540250 d8076008 d83bc800
7e00: 0000016b d83bc800 00001003 00000001 00001002 805404d4 d83bc800 00000120
7e20: 00001002 00001002 00000000 805405d4 d83bc800 00000001 80d126c0 00001002
7e40: 80dbc5dc 80d02024 00000000 806ae360 00000002 d6128420 d6127198 12400000
7e60: 00000000 00000000 00000002 d61271e8 00000000 12400000 d801674c 800e49f0
7e80: d6127198 d6124e58 00000000 80238848 d61271c4 00000000 00000001 d8016700
7ea0: 80dd2e00 80d752c0 80d752c0 80cfdaec 0000010c 80239430 806c2e90 d800f080
7ec0: d800f380 804e46b4 ffffffbc 80d15cb0 00000007 80d752c0 80d752c0 80d01e94
7ee0: 0000010c d8076030 00000000 800088cc 80dbaba4 80bd411c d80a6f00 806b1e04
7f00: 00000000 00000000 00000000 80125b84 00000000 80d2c56c 60000113 00000001
7f20: ef7ff9df 806c80cc 0000010c 80043f5c 80c95eb8 00000007 ef7ffa1d 00000007
7f40: 80d2c55c 80d15cb0 00000007 80d752c0 80d752c0 80ccc50c 0000010c 80d0a114
7f60: 80d0a10c 80cccc04 00000007 00000007 80ccc50c 806ae410 00000000 8004cb84
7f80: 80d17bc0 00000000 806a4bd4 00000000 00000000 00000000 00000000 00000000
7fa0: 00000000 806a4bdc 00000000 8000e5f8 00000000 00000000 00000000 00000000
7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 1e79a7bb e5337f77
[<806ad4e4>] (mutex_lock) from [<803b0f90>] (phy_start+0x14/0x68)
[<803b0f90>] (phy_start) from [<803be034>] (fec_enet_open+0x448/0x5dc)
[<803be034>] (fec_enet_open) from [<80540250>] (__dev_open+0xa8/0x110)
[<80540250>] (__dev_open) from [<805404d4>] (__dev_change_flags+0x88/0x170)
[<805404d4>] (__dev_change_flags) from [<805405d4>] (dev_change_flags+0x18/0x48)
[<805405d4>] (dev_change_flags) from [<80d02024>] (ip_auto_config+0x190/0xf94)
[<80d02024>] (ip_auto_config) from [<800088cc>] (do_one_initcall+0xe8/0x144)
[<800088cc>] (do_one_initcall) from [<80cccc04>] (kernel_init_freeable+0x104/0x1c8)
[<80cccc04>] (kernel_init_freeable) from [<806a4bdc>] (kernel_init+0x8/0xec)
[<806a4bdc>] (kernel_init) from [<8000e5f8>] (ret_from_fork+0x14/0x3c)
Code: e92d4010 e3a03000 e1a04000 ee073fba (e1903f9f)

Add phydev check to fix the issue.

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Ying Xue [Tue, 9 Dec 2014 07:17:56 +0000 (15:17 +0800)]

tipc: avoid double lock 'spin_lock:&seq->lock'

The commit fb9962f3cefe ("tipc: ensure all name sequences are properly
protected with its lock") involves below errors:

net/tipc/name_table.c:980 tipc_purge_publications() error: double lock 'spin_lock:&seq->lock'

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Florian Fainelli [Mon, 8 Dec 2014 23:59:18 +0000 (15:59 -0800)]

net: systemport: allow changing MAC address

Hook a ndo_set_mac_address callback, update the internal Ethernet MAC in
the netdevice structure, and finally write that address down to the
UniMAC registers. If the interface is down, and most likely clock gated,
we do not update the registers but just the local copy, such that next
ndo_open() call will effectively write down the address.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 9 Dec 2014 23:24:56 +0000 (18:24 -0500)]

Merge branch 'bridge_mode'

Roopa Prabhu says:

====================
remove bridge mode BRIDGE_MODE_SWDEV

BRIDGE_MODE_SWDEV was introduced to indicate switchdev offloads
for bridging from user space (In other words to call into the hw switch
port driver directly). But user can use existing BRIDGE_FLAGS_SELF
to call into the hw switch port driver today. swdev mode is not required
anymore. So, this patch removes it.

v4 - v5
    incorporate comments
    - Define BRIDGE_MODE_UNDEF to handle cases where mode is not defined
    - reverse the order of patches
    - include patch comments in all patches
====================

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Roopa Prabhu [Mon, 8 Dec 2014 22:04:21 +0000 (14:04 -0800)]

bridge: remove mode BRIDGE_MODE_SWDEV

This patch removes bridge mode swdev.
Users can use BRIDGE_FLAGS_SELF to indicate swdev offload
if needed.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Roopa Prabhu [Mon, 8 Dec 2014 22:04:20 +0000 (14:04 -0800)]

rocker: remove swdev mode

Remove use of 'swdev' mode in rocker. rocker dev offloads
can use the BRIDGE_FLAGS_SELF to indicate offload to hardware.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Roopa Prabhu [Mon, 8 Dec 2014 22:04:19 +0000 (14:04 -0800)]

bridge: new mode flag to indicate mode 'undefined'

This patch adds mode BRIDGE_MODE_UNDEF for cases where mode is not needed.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 9 Dec 2014 23:12:03 +0000 (18:12 -0500)]

Merge tag 'master-2014-12-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next

John W. Linville says:

====================
pull request: wireless-next 2014-12-08

Please pull this last batch of pending wireless updates for the 3.19 tree...

For the wireless bits, Johannes says:

"This time I have Felix's no-status rate control work, which will allow
drivers to work better with rate control even if they don't have perfect
status reporting. In addition to this, a small hwsim fix from Patrik,
one of the regulatory patches from Arik, and a number of cleanups and
fixes I did myself.

Of note is a patch where I disable CFG80211_WEXT so that compatibility
is no longer selectable - this is intended as a wake-up call for anyone
who's still using it, and is still easily worked around (it's a one-line
patch) before we fully remove the code as well in the future."

For the Bluetooth bits, Johan says:

"Here's one more bluetooth-next pull request for 3.19:

- Minor cleanups for ieee802154 & mac802154
- Fix for the kernel warning with !TASK_RUNNING reported by Kirill A.
   Shutemov
- Support for another ath3k device
- Fix for tracking link key based security level
- Device tree bindings for btmrvl + a state update fix
- Fix for wrong ACL flags on LE links"

And...

"In addition to the previous one this contains two more cleanups to
mac802154 as well as support for some new HCI features from the
Bluetooth 4.2 specification.

From the original request:

'Here's what should be the last bluetooth-next pull request for 3.19.
It's rather large but the majority of it is the Low Energy Secure
Connections feature that's part of the Bluetooth 4.2 specification. The
specification went public only this week so we couldn't publish the
corresponding code before that. The code itself can nevertheless be
considered fairly mature as it's been in development for over 6 months
and gone through several interoperability test events.

Besides LE SC the pull request contains an important fix for command
complete events for mgmt sockets which also fixes some leaks of hci_conn
objects when powering off or unplugging Bluetooth adapters.

A smaller feature that's part of the pull request is service discovery
support. This is like normal device discovery except that devices not
matching specific UUIDs or strong enough RSSI are filtered out.

Other changes that the pull request contains are firmware dump support
to the btmrvl driver, firmware download support for Broadcom BCM20702A0
variants, as well as some coding style cleanups in 6lowpan &
ieee802154/mac802154 code.'"

For the NFC bits, Samuel says:

"With this one we get:

- NFC digital improvements for DEP support: Chaining, NACK and ATN
  support added.

- NCI improvements: Support for p2p target, SE IO operand addition,
  SE operands extensions to support proprietary implementations, and
  a few fixes.

- NFC HCI improvements: OPEN_PIPE and NOTIFY_ALL_CLEARED support,
  and SE IO operand addition.

- A bunch of minor improvements and fixes for STMicro st21nfcb and
  st21nfca"

For the iwlwifi bits, Emmanuel says:

"Major works are CSA and TDLS. On top of that I have a new
firmware API for scan and a few rate control improvements.
Johannes find a few tricks to improve our CPU utilization
and adds support for a new spin of 7265 called 7265D.
Along with this a few random things that don't stand out."

And...

"I deprecate here -8.ucode since -9 has been published long ago.
Along with that I have a new activity, we have now better
a infrastructure for firmware debugging. This will allow to
have configurable probes insides the firmware.
Luca continues his work on NetDetect, this feature is now
complete. All the rest is minor fixes here and there."

For the Atheros bits, Kalle says:

"Only ath10k changes this time and no major changes. Most visible are:

o new debugfs interface for runtime firmware debugging (Yanbo)

o fix shared WEP (Sujith)

o don't rebuild whenever kernel version changes (Johannes)

o lots of refactoring to make it easier to add new hw support (Michal)

There's also smaller fixes and improvements with no point of listing
here."

In addition, there are a few last minute updates to ath5k,
ath9k, brcmfmac, brcmsmac, mwifiex, rt2x00, rtlwifi, and wil6210.
Also included is a pull of the wireless tree to pick-up the fixes
originally included in "pull request: wireless 2014-12-03"...

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mitsuhiro Kimura [Mon, 8 Dec 2014 10:46:21 +0000 (19:46 +0900)]

sh_eth: Remove redundant alignment adjustment

PTR_ALIGN macro after skb_reserve is redundant, because skb_reserve
function adjusts the alignment of skb->data.

Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mitsuhiro Kimura [Tue, 9 Dec 2014 12:23:42 +0000 (21:23 +0900)]

sh_eth: Optimization for RX excess judgement

Both of 'boguscnt' and 'quota' have nearly meaning as the condition of
the reception loop.
In order to cut down redundant processing, this patch changes excess
judgement.

Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Li RongQing [Mon, 8 Dec 2014 01:42:55 +0000 (09:42 +0800)]

net: avoid to call skb_queue_len again

the queue length of sd->input_pkt_queue has been put into qlen,
and impossible to change, since hold the lock

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 9 Dec 2014 22:01:21 +0000 (17:01 -0500)]

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2014-12-09

This series contains updates to i40e and i40evf.

Jeff (me) provides a single patch to convert a macro to a static inline
function based on feedback from Joe Perches on a previous patch.

Shannon provides the remaining twelve patches against i40e.  Almost all
of Shannon's patches cleanup/fix NVM issues varying in range from
adding more detail to debug messages, to removing dead code, to fixing
NVM state transitions after an error.  Change the handy decoder interface
for admin queue return code to help catch and properly report the condition
as a useful errno rather than returning a misleading '0'.  Added a range
check to avoid any possible array index-out-of-bound issues.

v2:
- fixed up patch 05 in the series to use the ARRAY_SIZE() macro as suggested
   by Sergei Shtylyov
- fix up patch 13 to remove unnecessary parens in the return statement
   as suggested by Sergei Shtylyov
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 9 Dec 2014 21:49:00 +0000 (16:49 -0500)]

Merge tag 'linux-can-next-for-3.19-20141207' of git://gitorious.org/linux-can/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2014-12-07

this is a pull request of 8 patches for net-next/master.

Andri Yngvason contributes 4 patches in which the CAN state change
handling is consolidated and unified among the sja1000, mscan and
flexcan driver. The three patches by Jeremiah Mahler fix spelling
mistakes and eliminate the banner[] variable in various parts. And a
patch by me that switches on sparse endianess checking by default.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 9 Dec 2014 21:40:20 +0000 (16:40 -0500)]

Merge tag 'linux-can-fixes-for-3.18-20141207' of git://gitorious.org/linux-can/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2014-12-07

this is a pull request of three patches by Stephane Grosjean which fix several
bugs in the peak_usb CAN drivers.

Please queue, if possible for 3.18, if it's too late these patches takes the
slow lane via net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eric Dumazet [Sun, 7 Dec 2014 20:22:18 +0000 (12:22 -0800)]

tcp: refine TSO autosizing

Commit 95bd09eb2750 ("tcp: TSO packets automatic sizing") tried to
control TSO size, but did this at the wrong place (sendmsg() time)

At sendmsg() time, we might have a pessimistic view of flow rate,
and we end up building very small skbs (with 2 MSS per skb).

This is bad because :

- It sends small TSO packets even in Slow Start where rate quickly
   increases.
- It tends to make socket write queue very big, increasing tcp_ack()
   processing time, but also increasing memory needs, not necessarily
   accounted for, as fast clones overhead is currently ignored.
- Lower GRO efficiency and more ACK packets.

Servers with a lot of small lived connections suffer from this.

Lets instead fill skbs as much as possible (64KB of payload), but split
them at xmit time, when we have a precise idea of the flow rate.
skb split is actually quite efficient.

Patch looks bigger than necessary, because TCP Small Queue decision now
has to take place after the eventual split.

As Neal suggested, introduce a new tcp_tso_autosize() helper, so that
tcp_tso_should_defer() can be synchronized on same goal.

Rename tp->xmit_size_goal_segs to tp->gso_segs, as this variable
contains number of mss that we can put in GSO packet, and is not
related to the autosizing goal anymore.

Tested:

40 ms rtt link

nstat >/dev/null
netperf -H remote -l -2000000 -- -s 1000000
nstat | egrep "IpInReceives|IpOutRequests|TcpOutSegs|IpExtOutOctets"

Before patch :

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/s

87380 2000000 2000000    0.36         44.22
IpInReceives                    600                0.0
IpOutRequests                   599                0.0
TcpOutSegs                      1397               0.0
IpExtOutOctets                  2033249            0.0

After patch :

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380 2000000 2000000    0.36       44.27
IpInReceives                    221                0.0
IpOutRequests                   232                0.0
TcpOutSegs                      1397               0.0
IpExtOutOctets                  2013953            0.0

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Al Viro [Tue, 25 Nov 2014 00:45:05 +0000 (19:45 -0500)]

bury memcpy_toiovec()

no users left

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Mon, 24 Nov 2014 23:29:54 +0000 (18:29 -0500)]

skb_copy_datagram_iovec() can die

no callers other than itself.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Mon, 24 Nov 2014 22:48:04 +0000 (17:48 -0500)]

ppp_read(): switch to skb_copy_datagram_iter()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Mon, 24 Nov 2014 23:17:55 +0000 (18:17 -0500)]

switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives

... making both non-draining.  That means that tcp_recvmsg() becomes
non-draining.  And _that_ would break iscsit_do_rx_data() unless we
a) make sure tcp_recvmsg() is uniformly non-draining (it is)
b) make sure it copes with arbitrary (including shifted)
iov_iter (it does, all it uses is iov_iter primitives)
c) make iscsit_do_rx_data() initialize ->msg_iter only once.

Fortunately, (c) is doable with minimal work and we are rid of one
the two places where kernel send/recvmsg users would be unhappy with
non-draining behaviour.

Actually, that makes all but one of ->recvmsg() instances iov_iter-clean.
The exception is skcipher_recvmsg() and it also isn't hard to convert
to primitives (iov_iter_get_pages() is needed there).  That'll wait
a bit - there's some interplay with ->sendmsg() path for that one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Mon, 24 Nov 2014 22:07:38 +0000 (17:07 -0500)]

first fruits - kill l2cap ->memcpy_fromiovec()

Just use copy_from_iter(). That's what this method is trying to do
in all cases, in a very convoluted fashion.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Mon, 24 Nov 2014 15:42:55 +0000 (10:42 -0500)]

put iov_iter into msghdr

Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Tue, 25 Nov 2014 00:32:50 +0000 (19:32 -0500)]

vmci: propagate msghdr all way down to __qp_memcpy_from_queue()

... and switch it to memcpy_to_msg()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Mon, 24 Nov 2014 21:44:09 +0000 (16:44 -0500)]

switch l2cap ->memcpy_fromiovec() to msghdr

it'll die soon enough - now that kvec-backed iov_iter works regardless
of set_fs(), both instances will become copy_from_iter() as soon as
we introduce ->msg_iter...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Mon, 24 Nov 2014 18:26:06 +0000 (13:26 -0500)]

switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Mon, 24 Nov 2014 18:23:40 +0000 (13:23 -0500)]

ip_generic_getfrag, udplite_getfrag: switch to passing msghdr

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Mon, 24 Nov 2014 17:10:46 +0000 (12:10 -0500)]

ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt"

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Mon, 24 Nov 2014 15:52:29 +0000 (10:52 -0500)]

raw.c: stick msghdr into raw_frag_vec

we'll want access to ->msg_iter

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Tue, 9 Dec 2014 21:27:52 +0000 (16:27 -0500)]

Merge branch 'iov_iter' into for-davem-2

commit | commitdiff | tree

Julia Lawall [Sun, 7 Dec 2014 19:20:56 +0000 (20:20 +0100)]

chelsio: fix misspelling of current function in string

Replace a misspelled function name by %s and then __func__.

This was done using Coccinelle, including the use of Levenshtein distance,
as proposed by Rasmus Villemoes.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Julia Lawall [Sun, 7 Dec 2014 19:20:54 +0000 (20:20 +0100)]

hp100: fix misspelling of current function in string

Replace a misspelled function name by %s and then __func__.

This was done using Coccinelle, including the use of Levenshtein distance,
as proposed by Rasmus Villemoes.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Julia Lawall [Sun, 7 Dec 2014 19:20:52 +0000 (20:20 +0100)]

uli526x: fix misspelling of current function in string

Replace a misspelled function name by %s and then __func__.

This was done using Coccinelle, including the use of Levenshtein distance,
as proposed by Rasmus Villemoes.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Julia Lawall [Sun, 7 Dec 2014 19:20:47 +0000 (20:20 +0100)]

isdn: fix misspelling of current function in string

Replace a misspelled function name by %s and then __func__.

In the first case, the print is just dropped, because kmalloc itself does
enough error reporting.

This was done using Coccinelle, including the use of Levenshtein distance,
as proposed by Rasmus Villemoes.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Julia Lawall [Sun, 7 Dec 2014 19:20:46 +0000 (20:20 +0100)]

dmfe: fix misspelling of current function in string

The function name contains cleanup, not clean.

This was done using Coccinelle, including the use of Levenshtein distance,
as proposed by Rasmus Villemoes.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mahesh Bandewar [Sat, 6 Dec 2014 23:53:46 +0000 (15:53 -0800)]

macvlan: play well with ipvlan device

If device is already used as an ipvlan port then refuse to
use it as a macvlan port at early stage of port creation.

thost1:~# ip link add link eth0 ipvl0 type ipvlan
thost1:~# echo $?
0
thost1:~# ip link add link eth0 mvl0 type macvlan
RTNETLINK answers: Device or resource busy
thost1:~# echo $?
2
thost1:~#

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mahesh Bandewar [Sat, 6 Dec 2014 23:53:33 +0000 (15:53 -0800)]

ipvlan: move the device check function into netdevice.h

Move the port check [ipvlan_dev_master()] and device check
[ipvlan_dev_slave()] functions to netdevice.h and rename them
netif_is_ipvlan_port() and netif_is_ipvlan() resp. to be
consistent with macvlan api naming.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mahesh Bandewar [Sat, 6 Dec 2014 23:53:19 +0000 (15:53 -0800)]

ipvlan: play well with macvlan device

If a device is already a macvlan port then refuse to use it as
an ipvlan port in the early stage of port creation.

thost1:~# ip link add link eth0 mvl0 type macvlan
thost1:~# echo $?
0
thost1:~# ip link add link eth0 ipvl0 type ipvlan
RTNETLINK answers: Device or resource busy
thost1:~# echo $?
2
thost1:~#

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mahesh Bandewar [Sat, 6 Dec 2014 23:53:04 +0000 (15:53 -0800)]

netdevice: Add a function to check macvlan port

Similar to a check for macvlan device, netif_is_macvlan(), add
another function to check if a device is used as macvlan port.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hannes Frederic Sowa [Sat, 6 Dec 2014 18:19:42 +0000 (19:19 +0100)]

dst: no need to take reference on DST_NOCACHE dsts

Since commit f8864972126899 ("ipv4: fix dst race in sk_dst_get()")
DST_NOCACHE dst_entries get freed by RCU. So there is no need to get a
reference on them when we are in rcu protected sections.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Flavio Leitner [Sat, 6 Dec 2014 00:13:24 +0000 (22:13 -0200)]

dummy: add support for ethtool get_drvinfo

The command 'ethtool -i' is useful to find details
about the interface like the device driver being used.
This was missing for dummy driver.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Joe Stringer [Fri, 5 Dec 2014 19:35:46 +0000 (11:35 -0800)]

bnx2x: Implement ndo_gso_check()

Use vxlan_gso_check() to advertise offload support for this NIC.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Rami Rosen [Fri, 5 Dec 2014 17:35:43 +0000 (19:35 +0200)]

Documentation (ixgbe.txt): use a decimal address.

This patch fixes the erronous usage of an hexadecimal address in the
example, by replacing it with a decimal address.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jiri Benc [Fri, 5 Dec 2014 16:24:28 +0000 (17:24 +0100)]

openvswitch: set correct protocol on route lookup

Respect what the caller passed to ovs_tunnel_get_egress_info.

Fixes: 8f0aad6f35f7e ("openvswitch: Extend packet attribute for egress tunnel info")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Michal Kubeček [Fri, 5 Dec 2014 16:05:49 +0000 (17:05 +0100)]

macvlan: allow setting LRO independently of lower device

Since commit fbe168ba91f7 ("net: generic dev_disable_lro() stacked
device handling"), dev_disable_lro() zeroes NETIF_F_LRO feature flag
first for a macvlan device and then for its lower device. As an attempt
to set NETIF_F_LRO to zero is ignored, dev_disable_lro() issues a
warning and taints kernel.

Allowing NETIF_F_LRO to be set independently of the lower device
consists of three parts:

  - add the flag to hw_features to allow toggling it
  - allow setting it to 0 even if lower device has the flag set
  - add the flag to MACVLAN_FEATURES to restore copying from lower
    device on macvlan creation

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jeff Kirsher [Tue, 9 Dec 2014 10:31:16 +0000 (02:31 -0800)]

i40e/i40evf: Convert macro to static inline

Inline functions are preferred over macros when they can be used
interchangeably.

CC: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit | commitdiff | tree

Shannon Nelson [Thu, 13 Nov 2014 08:23:23 +0000 (08:23 +0000)]

i40e: add to NVM update debug message

Add a little more state context to an NVM update debug message.

Change-ID: I512160259052bcdbe5bdf1adf403ab2bf7984970
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit | commitdiff | tree

Shannon Nelson [Thu, 13 Nov 2014 08:23:22 +0000 (08:23 +0000)]

i40e: check for AQ timeout in aq_rc decode

Decoding the AQ return code is great except when the AQ send timed out
and there's no return code set. This changes the handy decoder
interface to help catch and properly report the condition as a useful
errno rather than returning a misleading '0'.

Change-ID: I07a1f94f921606da49ffac7837bcdc37cd8222eb
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit | commitdiff | tree

Shannon Nelson [Thu, 13 Nov 2014 08:23:21 +0000 (08:23 +0000)]

i40e: poll on NVM semaphore only if not other error

Only poll on the NVM semaphore if there's time left on a previous
reservation. Also, add a little more info to debug messages.

Change-ID: I2439bf870b95a28b810dcb5cca1c06440463cf8a
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit | commitdiff | tree

Shannon Nelson [Thu, 13 Nov 2014 08:23:20 +0000 (08:23 +0000)]

i40e: fix up NVM update sm error handling

The state transitions after an error were not managed well, so
these changes get us back to the INIT state or don't transition
out of the INIT state after most errors.

Change-ID: I90aa0e4e348dc4f58cbcdce9c5d4b7fd35981c6c
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Michal Kosiarz <michal.kosiarz@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit | commitdiff | tree

Shannon Nelson [Thu, 13 Nov 2014 08:23:19 +0000 (08:23 +0000)]

i40e: set max limit for access polling

Don't bother trying to set a smaller timeout on the polling,
just simplify the code and always use the max limit. Also,
rename a variable for clarity and fix a comment.

Change-ID: I0300c3562ccc4fd5fa3088f8ae52db0c1eb33af5
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Michal Kosiarz <michal.kosiarz@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit | commitdiff | tree

Shannon Nelson [Thu, 13 Nov 2014 08:23:18 +0000 (08:23 +0000)]

i40e: remove unused nvm_semaphore_wait

The nvm_semaphore_wait field is set but never used, so let's
just get rid of it.

Change-ID: I2107bd29b69f99b1a61d7591d087429527c9d8fa
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Michal Kosiarz <michal.kosiarz@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit | commitdiff | tree

Shannon Nelson [Thu, 13 Nov 2014 08:23:17 +0000 (08:23 +0000)]

i40e: init NVM update state on adminq init

The adminq init is run after the EMPR that is triggered by the
NVM update. The final write command will cause the reset and
will want to wait for the ARQ event that signals the end of the
update, but the reset precludes the event being sent. The state
is probably already at INIT, but we set it so here anyway, and
clear the release_on_done flag as well.

Change-ID: Ie9d724a39e71f988741abc3d51b4cb198c7e0272
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Michal Kosiarz <michal.kosiarz@intel.com>
Acked-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit | commitdiff | tree

Shannon Nelson [Thu, 13 Nov 2014 08:23:16 +0000 (08:23 +0000)]

i40e: add range check to i40e_aq_rc_to_posix

Just to be sure, add a range check to avoid any possible
array index-out-of-bound issues.

CC: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Change-ID: I9323bee6732c2a47599816e1d6c6b3a1f8dcbf54
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Michal Kosiarz <michal.kosiarz@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Beck SC1x5 Kernel