git.karo-electronics.de Git - karo-tx-linux.git/log

]> git.karo-electronics.de Git - karo-tx-linux.git/log

Linus Torvalds [Tue, 25 Apr 2017 21:07:24 +0000 (14:07 -0700)]

Merge tag 'arc-4.11-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fix from Vineet Gupta:
"Last minute fixes for ARC:

   - build error in Mellanox nps platform

   - addressing lack of saving FPU regs in releavnt configs"

* tag 'arc-4.11-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARCv2: entry: save Accumulator register pair (r58:59) if present
  ARC: [plat-eznps] Fix build error

commit | commitdiff | tree

Eric Dumazet [Tue, 25 Apr 2017 18:36:52 +0000 (11:36 -0700)]

net: move xdp_prog field in RX cache lines

(struct net_device, xdp_prog) field should be moved in RX cache lines,
reducing latencies when a single packet is received on idle host,
since netif_elide_gro() needs it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Sara Sharon [Sun, 8 Jan 2017 14:45:46 +0000 (16:45 +0200)]

iwlwifi: adjust NVM parsing APIs for new a000 method

In a000 devices we will get all nvm data from the firmware,
and can save most of the parsing.
Export two APIs that op mode will still use.
Adjust API of init_sbands to be independent of NVM file structure
so it can be used by op mode as well.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Johannes Berg [Mon, 6 Mar 2017 14:49:00 +0000 (15:49 +0100)]

iwlwifi: pcie: apply no-reclaim logic only to group 0

When applying no-reclaim logic to commands other than the group
zero for legacy commands, commands such as 0x1c (TX_CMD in group
0) can't be used in any other group. Fix that by applying this
logic only for group 0 - it's not and should never be needed for
any other groups.

Reported-by: Sharon Dvir <sharon.dvir@intel.com>
Reported-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sara Sharon [Sun, 5 Mar 2017 16:35:02 +0000 (18:35 +0200)]

iwlwifi: mvm: memset binding before setting values

The changes in commit 9415af7f306b ("iwlwifi: mvm: support new binding
API") assigned values that were later memset to 0. Move the memset
earlier.

Fixes: 9415af7f306b ("iwlwifi: mvm: support new binding API")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sara Sharon [Sun, 5 Mar 2017 09:38:58 +0000 (11:38 +0200)]

iwlwifi: rename wait_for_tx_queues_empty

Rename current wait_tx_queue_empty to wait_tx_queues_empty since
it waits for multiple queues (up to 32).
Next patch will add a wait for single TX queue which is needed for
gen2 to be scalable for 512.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sara Sharon [Thu, 23 Feb 2017 12:19:45 +0000 (14:19 +0200)]

iwlwifi: move to 512 queues

Avoid using the old define since it will enlarge necessary
structs for previous HW.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sara Sharon [Mon, 6 Feb 2017 17:09:32 +0000 (19:09 +0200)]

iwlwifi: mvm: support station type API

Support change to ADD_STA API to support station types.
Each station is assigned its type.
This simplifies FW handling of the broadcast and multicast
stations:
* broadcast station is identified by its type and not the mac
  address.
* multicast queue is no longer treated differently. The opening
  and closing of it is done by referring to its station.
  There is no need to specify it in the MAC command.
* When disabling TX to all station driver can disable the traffic
  on multicast station, so FW doesn't have to do it.
Change is backward compatible.
Change the order of adding and removing the stations according to
FW requirements.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sara Sharon [Thu, 23 Feb 2017 11:15:07 +0000 (13:15 +0200)]

iwlwifi: mvm: remove references to queue_info in new TX path

Most of the fields aren't needed in new TX path.
Enlarging the struct to 512 queues will consume a lot of memory.
Remove all references to the struct in the new TX path.
Move mac80211 queue mapping outside, since it will be needed per
queue for TVQM mode.
Add warning in paths that shouldn't be hit.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Liad Kaufman [Tue, 28 Feb 2017 15:15:47 +0000 (17:15 +0200)]

iwlwifi: gen2: support nmi triggering from host

For gen2 there is a new register.

Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Johannes Berg [Tue, 28 Feb 2017 21:45:25 +0000 (22:45 +0100)]

iwlwifi: remove module loading failure message

When CONFIG_DEBUG_TEST_DRIVER_REMOVE is set, iwlwifi crashes
when the opmode module cannot be loaded, due to completing
the completion before using drv->dev, which can then already
be freed.

Fix this by removing the (fairly useless) message. Moving the
completion later causes a deadlock instead, so that's not an
option.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Johannes Berg [Tue, 28 Feb 2017 15:44:16 +0000 (16:44 +0100)]

iwlwifi: don't leak memory on allocation failure

If we fail to allocate the small chunk of memory for the
pieces of the firmware file, we leak the whole firmware
image instead...

Since the allocation failure is really unlikely, just bail
out at that point instead.

Remove the error message at the label since we now (and
actually have been) use it for various reasons.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Johannes Berg [Tue, 28 Feb 2017 15:42:10 +0000 (16:42 +0100)]

iwlwifi: pcie: remove superfluous trans->dev assignment

This struct member is already assigned in the previous
call to iwl_trans_alloc(), so assigning the same value
again is superfluous - remove it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sara Sharon [Wed, 22 Feb 2017 10:08:19 +0000 (12:08 +0200)]

iwlwifi: mvm: use defines instead of variables for shared dwell times

Most of the dwells are constant across different scan types.
Use defines instead of depending on scan type.
This is needed as preparation to having different scan type per
band.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sara Sharon [Thu, 23 Feb 2017 20:42:01 +0000 (22:42 +0200)]

iwlwifi: mvm: remove color definition

We use the full station field as sta_id, and never
use the color.
FW are about to clean the color out, so those defines
are incorrect now (and were redundant before).

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sara Sharon [Wed, 22 Feb 2017 17:34:17 +0000 (19:34 +0200)]

iwlwifi: mvm: move internally to use bigger INVALID_TXQ

We can't use IEEE80211_INVAL_HW_QUEUE to mark a queue as
invalid since 255 will be a valid value for a TVQM queue
index.
Use IWL_MVM_INVALID_QUEUE instead for accessing txq_id.
reserved_queue can stay a u8 since reserved_queue is not
used when TVQM is enabled.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Johannes Berg [Thu, 13 Apr 2017 13:50:27 +0000 (15:50 +0200)]

mac80211: rewrite monitor mode delivery logic

The monitor mode delivery logic makes it hard to add any
kind of filtering in an efficient way, because the monitor
SKB is created first and then passed to all interfaces.

Rewrite the logic to create the monitor SKB the first time
it's actually needed, and then keep delivering it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

commit | commitdiff | tree

Vasanthakumar Thiagarajan [Tue, 18 Apr 2017 12:09:19 +0000 (17:39 +0530)]

cfg80211: Fix dfs state propagation for non-DFS center channel

When part of a bigger bandwidth (160 MHz) channel falls in DFS
channel range it is possible that the center frequency may not
necessarily be a radar channel. Remove the sanity check on channel
flag for IEEE80211_CHAN_RADAR in regulatory_propagate_dfs_state(),
this should fix the dfs state propagation for non-DFS center freq
which has DFS channels in it's bandwidth, should also fix unnecessary
WARN_ON() spam in regulatory_propagate_dfs_state().

Fixes: 8976672736d6 ("cfg80211: Share Channel DFS state across wiphys of same DFS domain")
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qti.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

commit | commitdiff | tree

Sara Sharon [Wed, 22 Feb 2017 17:35:10 +0000 (19:35 +0200)]

iwlwifi: mvm: map cab_queue to different txq_id

cab_queue can now get bigger than u8, since in TVQM we will support
512 queues..
Support it by maintaining internal mapping between the actual number
and mac80211 queue (IWL_MVM_DQA_GCAST_QUEUE).
For pre-a000 the internal queue will be the same as the mac80211
queue.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Haim Dreyfuss [Thu, 2 Feb 2017 12:49:50 +0000 (14:49 +0200)]

iwlwifi: mvm: Ignore wifi mcc update in the driver while associated

Wifi mcc (mobile country code) update is forbidden while associated.
Currently, FW prevents these updates and the driver is unaware to
this logic. From now on, the FW sends every wifi mcc update to the
driver. The driver in his turn needs to decide whether to
ignore it or not, depends on the association state.

Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sara Sharon [Wed, 22 Feb 2017 17:40:55 +0000 (19:40 +0200)]

iwlwifi: mvm: don't reserve queue in TVQM mode

The reserved queue is never used, save the trouble.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Liad Kaufman [Wed, 22 Feb 2017 12:39:10 +0000 (14:39 +0200)]

iwlwifi: pcie: support debug applying on a000 hw

Allow configuring debug destination on a000 HW.

Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Johannes Berg [Fri, 24 Feb 2017 11:02:22 +0000 (12:02 +0100)]

iwlwifi: mvm: avoid variable shadowing

Remove an extra variable 'queue' that already exists.
Also, since there are no code paths that use 'queue'
without intializing it, remove the unnecessary zero
initialization.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Dor Shaish [Mon, 20 Feb 2017 14:05:57 +0000 (16:05 +0200)]

iwlwifi: mvm: freeze 7265D and 3168 on API version 29

iwl7265D and iwl3168 are frozen on API version 29.
Set the MAX API allowed level to 29 from now on.

Signed-off-by: Dor Shaish <dor.shaish@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Alexander Potapenko [Tue, 25 Apr 2017 16:51:46 +0000 (18:51 +0200)]

net/packet: check length in getsockopt() called with PACKET_HDRLEN

In the case getsockopt() is called with PACKET_HDRLEN and optlen < 4
|val| remains uninitialized and the syscall may behave differently
depending on its value, and even copy garbage to userspace on certain
architectures. To fix this we now return -EINVAL if optlen is too small.

This bug has been detected with KMSAN.

Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David Ahern [Tue, 25 Apr 2017 16:17:29 +0000 (09:17 -0700)]

net: ipv6: regenerate host route if moved to gc list

Taking down the loopback device wreaks havoc on IPv6 routing. By
extension, taking down a VRF device wreaks havoc on its table.

Dmitry and Andrey both reported heap out-of-bounds reports in the IPv6
FIB code while running syzkaller fuzzer. The root cause is a dead dst
that is on the garbage list gets reinserted into the IPv6 FIB. While on
the gc (or perhaps when it gets added to the gc list) the dst->next is
set to an IPv4 dst. A subsequent walk of the ipv6 tables causes the
out-of-bounds access.

Andrey's reproducer was the key to getting to the bottom of this.

With IPv6, host routes for an address have the dst->dev set to the
loopback device. When the 'lo' device is taken down, rt6_ifdown initiates
a walk of the fib evicting routes with the 'lo' device which means all
host routes are removed. That process moves the dst which is attached to
an inet6_ifaddr to the gc list and marks it as dead.

The recent change to keep global IPv6 addresses added a new function,
fixup_permanent_addr, that is called on admin up. That function restarts
dad for an inet6_ifaddr and when it completes the host route attached
to it is inserted into the fib. Since the route was marked dead and
moved to the gc list, re-inserting the route causes the reported
out-of-bounds accesses. If the device with the address is taken down
or the address is removed, the WARN_ON in fib6_del is triggered.

All of those faults are fixed by regenerating the host route if the
existing one has been moved to the gc list, something that can be
determined by checking if the rt6i_ref counter is 0.

Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Xin Long [Tue, 25 Apr 2017 14:58:37 +0000 (22:58 +0800)]

bridge: move bridge multicast cleanup to ndo_uninit

During removing a bridge device, if the bridge is still up, a new mdb entry
still can be added in br_multicast_add_group() after all mdb entries are
removed in br_multicast_dev_del(). Like the path:

  mld_ifc_timer_expire ->
    mld_sendpack -> ...
      br_multicast_rcv ->
        br_multicast_add_group

The new mp's timer will be set up. If the timer expires after the bridge
is freed, it may cause use-after-free panic in br_multicast_group_expired.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [<ffffffffa07ed2c8>] br_multicast_group_expired+0x28/0xb0 [bridge]
Call Trace:
<IRQ>
[<ffffffff81094536>] call_timer_fn+0x36/0x110
[<ffffffffa07ed2a0>] ? br_mdb_free+0x30/0x30 [bridge]
[<ffffffff81096967>] run_timer_softirq+0x237/0x340
[<ffffffff8108dcbf>] __do_softirq+0xef/0x280
[<ffffffff8169889c>] call_softirq+0x1c/0x30
[<ffffffff8102c275>] do_softirq+0x65/0xa0
[<ffffffff8108e055>] irq_exit+0x115/0x120
[<ffffffff81699515>] smp_apic_timer_interrupt+0x45/0x60
[<ffffffff81697a5d>] apic_timer_interrupt+0x6d/0x80

Nikolay also found it would cause a memory leak - the mdb hash is
reallocated and not freed due to the mdb rehash.

unreferenced object 0xffff8800540ba800 (size 2048):
  backtrace:
    [<ffffffff816e2287>] kmemleak_alloc+0x67/0xc0
    [<ffffffff81260bea>] __kmalloc+0x1ba/0x3e0
    [<ffffffffa05c60ee>] br_mdb_rehash+0x5e/0x340 [bridge]
    [<ffffffffa05c74af>] br_multicast_new_group+0x43f/0x6e0 [bridge]
    [<ffffffffa05c7aa3>] br_multicast_add_group+0x203/0x260 [bridge]
    [<ffffffffa05ca4b5>] br_multicast_rcv+0x945/0x11d0 [bridge]
    [<ffffffffa05b6b10>] br_dev_xmit+0x180/0x470 [bridge]
    [<ffffffff815c781b>] dev_hard_start_xmit+0xbb/0x3d0
    [<ffffffff815c8743>] __dev_queue_xmit+0xb13/0xc10
    [<ffffffff815c8850>] dev_queue_xmit+0x10/0x20
    [<ffffffffa02f8d7a>] ip6_finish_output2+0x5ca/0xac0 [ipv6]
    [<ffffffffa02fbfc6>] ip6_finish_output+0x126/0x2c0 [ipv6]
    [<ffffffffa02fc245>] ip6_output+0xe5/0x390 [ipv6]
    [<ffffffffa032b92c>] NF_HOOK.constprop.44+0x6c/0x240 [ipv6]
    [<ffffffffa032bd16>] mld_sendpack+0x216/0x3e0 [ipv6]
    [<ffffffffa032d5eb>] mld_ifc_timer_expire+0x18b/0x2b0 [ipv6]

This could happen when ip link remove a bridge or destroy a netns with a
bridge device inside.

With Nikolay's suggestion, this patch is to clean up bridge multicast in
ndo_uninit after bridge dev is shutdown, instead of br_dev_delete, so
that netif_running check in br_multicast_add_group can avoid this issue.

v1->v2:
  - fix this issue by moving br_multicast_dev_del to ndo_uninit, instead
    of calling dev_close in br_dev_delete.

(NOTE: Depends upon b6fe0440c637 ("bridge: implement missing ndo_uninit()"))

Fixes: e10177abf842 ("bridge: multicast: fix handling of temp and perm entries")
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Sabrina Dubroca [Tue, 25 Apr 2017 13:56:50 +0000 (15:56 +0200)]

ipv6: fix source routing

Commit a149e7c7ce81 ("ipv6: sr: add support for SRH injection through
setsockopt") introduced handling of IPV6_SRCRT_TYPE_4, but at the same
time restricted it to only IPV6_SRCRT_TYPE_0 and
IPV6_SRCRT_TYPE_4. Previously, ipv6_push_exthdr() and fl6_update_dst()
would also handle other values (ie STRICT and TYPE_2).

Restore previous source routing behavior, by handling IPV6_SRCRT_STRICT
and IPV6_SRCRT_TYPE_2 the same way as IPV6_SRCRT_TYPE_0 in
ipv6_push_exthdr() and fl6_update_dst().

Fixes: a149e7c7ce81 ("ipv6: sr: add support for SRH injection through setsockopt")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Wei Yongjun [Tue, 25 Apr 2017 11:36:50 +0000 (11:36 +0000)]

drivers: net: xgene-v2: Fix error return code in xge_mdio_config()

Fix to return error code -ENODEV from the no PHY found error
handling case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 18 Apr 2017 19:36:58 +0000 (15:36 -0400)]

net: Generic XDP

This provides a generic SKB based non-optimized XDP path which is used
if either the driver lacks a specific XDP implementation, or the user
requests it via a new IFLA_XDP_FLAGS value named XDP_FLAGS_SKB_MODE.

It is arguable that perhaps I should have required something like
this as part of the initial XDP feature merge.

I believe this is critical for two reasons:

1) Accessibility.  More people can play with XDP with less
   dependencies.  Yes I know we have XDP support in virtio_net, but
   that just creates another depedency for learning how to use this
   facility.

   I wrote this to make life easier for the XDP newbies.

2) As a model for what the expected semantics are.  If there is a pure
   generic core implementation, it serves as a semantic example for
   driver folks adding XDP support.

One thing I have not tried to address here is the issue of
XDP_PACKET_HEADROOM, thanks to Daniel for spotting that.  It seems
incredibly expensive to do a skb_cow(skb, XDP_PACKET_HEADROOM) or
whatever even if the XDP program doesn't try to push headers at all.
I think we really need the verifier to somehow propagate whether
certain XDP helpers are used or not.

v5:
- Handle both negative and positive offset after running prog
- Fix mac length in XDP_TX case (Alexei)
- Use rcu_dereference_protected() in free_netdev (kbuild test robot)

v4:
- Fix MAC header adjustmnet before calling prog (David Ahern)
- Disable LRO when generic XDP is installed (Michael Chan)
- Bypass qdisc et al. on XDP_TX and record the event (Alexei)
- Do not perform generic XDP on reinjected packets (DaveM)

v3:
- Make sure XDP program sees packet at MAC header, push back MAC
   header if we do XDP_TX.  (Alexei)
- Elide GRO when generic XDP is in use.  (Alexei)
- Add XDP_FLAG_SKB_MODE flag which the user can use to request generic
   XDP even if the driver has an XDP implementation.  (Alexei)
- Report whether SKB mode is in use in rtnl_xdp_fill() via XDP_FLAGS
   attribute.  (Daniel)

v2:
- Add some "fall through" comments in switch statements based
   upon feedback from Andrew Lunn
- Use RCU for generic xdp_prog, thanks to Johannes Berg.

Tested-by: Andy Gospodarek <andy@greyhouse.net>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Wei Yongjun [Tue, 25 Apr 2017 07:07:18 +0000 (07:07 +0000)]

qed: fix invalid use of sizeof in qed_alloc_qm_data()

sizeof() when applied to a pointer typed expression gives the
size of the pointer, not that of the pointed data.

Fixes: b5a9ee7cf3be ("qed: Revise QM configuration")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

sudarsana.kalluru@cavium.com [Tue, 25 Apr 2017 03:59:10 +0000 (20:59 -0700)]

qed: Fix error in the dcbx app meta data initialization.

DCBX app_data array is initialized with the incorrect values for
personality field. This would prevent offloaded protocols from
honoring the PFC.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Teng Qin [Tue, 25 Apr 2017 02:00:37 +0000 (19:00 -0700)]

bpf: map_get_next_key to return first key on NULL

When iterating through a map, we need to find a key that does not exist
in the map so map_get_next_key will give us the first key of the map.
This often requires a lot of guessing in production systems.

This patch makes map_get_next_key return the first key when the key
pointer in the parameter is NULL.

Signed-off-by: Teng Qin <qinteng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

stephen hemminger [Tue, 25 Apr 2017 01:33:38 +0000 (18:33 -0700)]

netvsc: fix calculation of available send sections

My change (introduced in 4.11) to use find_first_clear_bit
incorrectly assumed that the size argument was words, not bits.
The effect was only a small limited number of the available send
sections were being actually used. This can cause performance loss
with some workloads.

Since map_words is now used only during initialization, it can
be on stack instead of in per-device data.

Fixes: b58a185801da ("netvsc: simplify get next send section")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mike Maloney [Tue, 25 Apr 2017 01:29:11 +0000 (21:29 -0400)]

selftests/net: Fix broken test case in psock_fanout

The error return falue form sock_fanout_open is -1, not zero. One test
case was checking for 0 instead of -1.

Tested: Built and tested in clean client.
Signed-off-by: Mike Maloney <maloney@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Ivan Khoronzhuk [Mon, 24 Apr 2017 20:54:06 +0000 (23:54 +0300)]

net: ethernet: ti: netcp_core: remove unused compl queue mapping

This code is unused and probably was unintentionally left while
moving completion queue mapping in submit function.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Andreas Kemnade [Mon, 24 Apr 2017 19:18:39 +0000 (21:18 +0200)]

net: hso: fix module unloading

keep tty driver until usb driver is unregistered
rmmod hso
produces traces like this without that:

[40261.645904] usb 2-2: new high-speed USB device number 2 using ehci-omap
[40261.854644] usb 2-2: New USB device found, idVendor=0af0, idProduct=8800
[40261.862609] usb 2-2: New USB device strings: Mfr=3, Product=2, SerialNumber=0
[40261.872772] usb 2-2: Product: Globetrotter HSUPA Modem
[40261.880279] usb 2-2: Manufacturer: Option N.V.
[40262.021270] hso 2-2:1.5: Not our interface
[40265.556945] hso: unloaded
[40265.559875] usbcore: deregistering interface driver hso
[40265.595947] Unable to handle kernel NULL pointer dereference at virtual address 00000033
[40265.604522] pgd = ecb14000
[40265.611877] [00000033] *pgd=00000000
[40265.617034] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[40265.622650] Modules linked in: hso(-) bnep bluetooth ipv6 arc4 twl4030_madc_hwmon wl18xx wlcore mac80211 cfg80211 snd_soc_simple_card snd_soc_simple_card_utils snd_soc_omap_twl4030 snd_soc_gtm601 generic_adc_battery extcon_gpio omap3_isp videobuf2_dma_contig videobuf2_memops wlcore_sdio videobuf2_v4l2 videobuf2_core ov9650 bmp280_i2c v4l2_common bmp280 bmg160_i2c bmg160_core at24 nvmem_core videodev bmc150_accel_i2c bmc150_magn_i2c media bmc150_accel_core tsc2007 bmc150_magn leds_tca6507 bno055 snd_soc_omap_mcbsp industrialio_triggered_buffer snd_soc_omap kfifo_buf snd_pcm_dmaengine gpio_twl4030 snd_soc_twl4030 twl4030_vibra twl4030_madc wwan_on_off ehci_omap pwm_bl pwm_omap_dmtimer panel_tpo_td028ttec1 encoder_opa362 connector_analog_tv omapdrm drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect
[40265.698211]  sysimgblt fb_sys_fops cfbcopyarea drm omapdss usb_f_ecm g_ether usb_f_rndis u_ether libcomposite configfs omap2430 phy_twl4030_usb musb_hdrc twl4030_charger industrialio w2sg0004 twl4030_pwrbutton bq27xxx_battery w1_bq27000 omap_hdq [last unloaded: hso]
[40265.723175] CPU: 0 PID: 2701 Comm: rmmod Not tainted 4.11.0-rc6-letux+ #6
[40265.730346] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[40265.736938] task: ecb81100 task.stack: ecb82000
[40265.741729] PC is at cdev_del+0xc/0x2c
[40265.745666] LR is at tty_unregister_device+0x40/0x50
[40265.750915] pc : [<c027472c>]    lr : [<c04b3ecc>]    psr: 600b0113
sp : ecb83ea8  ip : eca4f898  fp : 00000000
[40265.763000] r10: 00000000  r9 : 00000000  r8 : 00000001
[40265.768493] r7 : eca4f800  r6 : 00000003  r5 : 00000000  r4 : ffffffff
[40265.775360] r3 : c1458d54  r2 : 00000000  r1 : 00000004  r0 : ffffffff
[40265.782257] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[40265.789764] Control: 10c5387d  Table: acb14019  DAC: 00000051
[40265.795806] Process rmmod (pid: 2701, stack limit = 0xecb82218)
[40265.802062] Stack: (0xecb83ea8 to 0xecb84000)
[40265.806640] 3ea0:                   ec9e8100 c04b3ecc bf737378 ed5b7c00 00000003 bf7327ec
[40265.815277] 3ec0: eca4f800 00000000 ec9fd800 eca4f800 bf737070 bf7328bc eca4f820 c05a9a04
[40265.823883] 3ee0: eca4f820 00000000 00000001 eca4f820 ec9fd870 bf737070 eca4f854 ec9fd8a4
[40265.832519] 3f00: ecb82000 00000000 00000000 c04e6960 eca4f820 bf737070 bf737048 00000081
[40265.841125] 3f20: c01071e4 c04e6a60 ecb81100 bf737070 bf737070 c04e5d94 bf737020 c05a8f88
[40265.849731] 3f40: bf737100 00000800 7f5fa254 00000081 c01071e4 c01c4afc 00000000 006f7368
[40265.858367] 3f60: ecb815f4 00000000 c0cac9c4 c01071e4 ecb82000 00000000 00000000 c01512f4
[40265.866973] 3f80: ed5b3200 c01071e4 7f5fa220 7f5fa220 bea78ec9 0010711c 7f5fa220 7f5fa220
[40265.875579] 3fa0: bea78ec9 c0107040 7f5fa220 7f5fa220 7f5fa254 00000800 dd35b800 dd35b800
[40265.884216] 3fc0: 7f5fa220 7f5fa220 bea78ec9 00000081 bea78dcc 00000000 bea78bd8 00000000
[40265.892822] 3fe0: b6f70521 bea78b6c 7f5dd613 b6f70526 80070030 7f5fa254 ffffffff ffffffff
[40265.901458] [<c027472c>] (cdev_del) from [<c04b3ecc>] (tty_unregister_device+0x40/0x50)
[40265.909942] [<c04b3ecc>] (tty_unregister_device) from [<bf7327ec>] (hso_free_interface+0x80/0x144 [hso])
[40265.919982] [<bf7327ec>] (hso_free_interface [hso]) from [<bf7328bc>] (hso_disconnect+0xc/0x18 [hso])
[40265.929718] [<bf7328bc>] (hso_disconnect [hso]) from [<c05a9a04>] (usb_unbind_interface+0x84/0x200)
[40265.939239] [<c05a9a04>] (usb_unbind_interface) from [<c04e6960>] (device_release_driver_internal+0x138/0x1cc)
[40265.949798] [<c04e6960>] (device_release_driver_internal) from [<c04e6a60>] (driver_detach+0x60/0x6c)
[40265.959503] [<c04e6a60>] (driver_detach) from [<c04e5d94>] (bus_remove_driver+0x64/0x8c)
[40265.968017] [<c04e5d94>] (bus_remove_driver) from [<c05a8f88>] (usb_deregister+0x5c/0xb8)
[40265.976654] [<c05a8f88>] (usb_deregister) from [<c01c4afc>] (SyS_delete_module+0x160/0x1dc)
[40265.985443] [<c01c4afc>] (SyS_delete_module) from [<c0107040>] (ret_fast_syscall+0x0/0x1c)
[40265.994171] Code: c1458d54 e59f3020 e92d4010 e1a04000 (e5941034)
[40266.016693] ---[ end trace 9d5ac43c7e41075c ]---

Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 25 Apr 2017 15:49:33 +0000 (11:49 -0400)]

Merge branch 'qed-vf-tunnel'

Manish Chopra says:

====================
qed/qede: VF tunnelling support

With this series VFs can run vxlan/geneve/gre tunnels over it.
Please consider applying this series to "net-next"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Chopra, Manish [Mon, 24 Apr 2017 17:00:49 +0000 (10:00 -0700)]

qed - VF tunnelling support [VXLAN/GENEVE/GRE]

This patch adds hardware channel APIs support between
VF and PF for tunnelling configuration for the VFs.
According to that configuration VFs can run VXLAN/GENEVE/GRE
tunnels over it with tunnel features offloaded.

Using these APIs VF can also request for UDP ports configuration
to the PF, although PF and it's child VFs share the same port.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Chopra, Manish [Mon, 24 Apr 2017 17:00:48 +0000 (10:00 -0700)]

qed/qede: Add UDP ports in bulletin board

This patch adds support for UDP ports in bulletin board
to notify UDP ports change to the VFs

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Chopra, Manish [Mon, 24 Apr 2017 17:00:47 +0000 (10:00 -0700)]

qede: Configure UDP ports in local context.

This patch configures UDP ports locally instead of
configuring them in deferred context which would be
helpful in synchronizing UDP ports configuration for VFs
which will be enabled in further patches.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Chopra, Manish [Mon, 24 Apr 2017 17:00:46 +0000 (10:00 -0700)]

qede: Disable tunnel offloads for non offloaded UDP ports

This patch disables tunnel offloads via ndo_features_check()
if given UDP port is not offloaded to hardware. This in turn
allows to run multiple tunnel interfaces using different UDP ports.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Chopra, Manish [Mon, 24 Apr 2017 17:00:45 +0000 (10:00 -0700)]

qed/qede: Enable tunnel offloads based on hw configuration

This patch enables tunnel feature offloads based on hw configuration
at initialization time instead of enabling them always.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Chopra, Manish [Mon, 24 Apr 2017 17:00:44 +0000 (10:00 -0700)]

qed: refactor tunnelling - API/Structs

This patch changes the tunnel APIs to use per tunnel
info instead of using bitmasks for all tunnels and also
uses single struct to hold the data to prepare multiple
variant of tunnel configuration ramrods to be sent to the hardware.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Parthasarathy Bhuvaragan [Mon, 24 Apr 2017 13:00:43 +0000 (15:00 +0200)]

tipc: fix socket flow control accounting error at tipc_recv_stream

Until now in tipc_recv_stream(), we update the received
unacknowledged bytes based on a stack variable and not based on the
actual message size.
If the user buffer passed at tipc_recv_stream() is smaller than the
received skb, the size variable in stack differs from the actual
message size in the skb. This leads to a flow control accounting
error causing permanent congestion.

In this commit, we fix this accounting error by always using the
size of the incoming message.

Fixes: 10724cc7bb78 ("tipc: redesign connection-level flow control")
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Parthasarathy Bhuvaragan [Mon, 24 Apr 2017 13:00:42 +0000 (15:00 +0200)]

tipc: fix socket flow control accounting error at tipc_send_stream

Until now in tipc_send_stream(), we return -1 when the socket
encounters link congestion even if the socket had successfully
sent partial data. This is incorrect as the application resends
the same the partial data leading to data corruption at
receiver's end.

In this commit, we return the partially sent bytes as the return
value at link congestion.

Fixes: 10724cc7bb78 ("tipc: redesign connection-level flow control")
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Takashi Iwai [Tue, 25 Apr 2017 15:43:56 +0000 (17:43 +0200)]

Merge tag 'asoc-fix-v4.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v4.11

A few last minute fixes for v4.11, the STI fix is relatively large but
driver specific and has been cooking in -next for a little while now:

- A fix from Takashi for some suspend/resume related crashes in the
   Intel drivers.
- A fix from Mousumi Jana for issues with incorrectly created
   enumeration controls generated from topology files which could cause
   problems for userspace.
- Fixes from Arnaud Pouliquen for some crashes due to races with the
   interrupt handler in the STI driver.

commit | commitdiff | tree

Paolo Abeni [Mon, 24 Apr 2017 12:18:28 +0000 (14:18 +0200)]

ipv6: move stub initialization after ipv6 setup completion

The ipv6 stub pointer is currently initialized before the ipv6
routing subsystem: a 3rd party can access and use such stub
before the routing data is ready.
Moreover, such pointer is not cleared in case of initialization
error, possibly leading to dangling pointers usage.

This change addresses the above moving the stub initialization
at the end of ipv6 init code.

Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 25 Apr 2017 15:41:57 +0000 (11:41 -0400)]

Merge branch 'l2tpeth-info'

Guillaume Nault says:

====================
l2tp: add informations about l2tpeth interfaces in /sys

Patch #1 lets userspace retrieve the naming scheme of an l2tpeth
interface, using /sys/class/net/<iface>/name_assign_type.

Patch #2 adds the DEVTYPE field in /sys/class/net/<iface>/uevent so
that userspace can reliably know if a device is an l2tpeth interface.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Guillaume Nault [Mon, 24 Apr 2017 12:16:07 +0000 (14:16 +0200)]

l2tp: define "l2tpeth" device type

Export type of l2tpeth interfaces to userspace
(/sys/class/net/<iface>/uevent).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Acked-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Guillaume Nault [Mon, 24 Apr 2017 12:16:06 +0000 (14:16 +0200)]

l2tp: set name_assign_type for devices created by l2tp_eth.c

Export naming scheme used when creating l2tpeth interfaces
(/sys/class/net/<iface>/name_assign_type). This let userspace know if
the device's name has been generated automatically or defined manually.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Acked-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Pan Bian [Mon, 24 Apr 2017 10:29:16 +0000 (18:29 +0800)]

team: fix memory leaks

In functions team_nl_send_port_list_get() and
team_nl_send_options_get(), pointer skb keeps the return value of
nlmsg_new(). When the call to genlmsg_put() fails, the memory is not
freed(). This will result in memory leak bugs.

Fixes: 9b00cf2d1024 ("team: implement multipart netlink messages for options transfers")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jamal Hadi Salim [Sun, 23 Apr 2017 17:17:28 +0000 (13:17 -0400)]

net sched actions: Complete the JUMPX opcode

per discussion at netconf/netdev:
When we have an action that is capable of branching (example a policer),
we can achieve a continuation of the action graph by programming a
"continue" where we find an exact replica of the same filter rule with a lower
priority and the remainder of the action graph. When you have 100s of thousands
of filters which require such a feature it gets very inefficient to do two
lookups.

This patch completes a leftover feature of action codes. Its time has come.

Example below where a user labels packets with a different skbmark on ingress
of a port depending on whether they have/not exceeded the configured rate.
This mark is then used to make further decisions on some egress port.

#rate control, very low so we can easily see the effect
sudo $TC actions add action police rate 1kbit burst 90k \
conform-exceed pipe/jump 2 index 10
# skbedit index 11 will be used if the user conforms
sudo $TC actions add action skbedit mark 11 ok index 11
# skbedit index 12 will be used if the user does not conform
sudo $TC actions add action skbedit mark 12 ok index 12

#lets bind the user ..
sudo $TC filter add dev $ETH parent ffff: protocol ip prio 8 u32 \
match ip dst 127.0.0.8/32 flowid 1:10 \
action police index 10 \
action skbedit index 11 \
action skbedit index 12

#run a ping -f and see what happens..
#
jhs@foobar:~$ sudo $TC -s filter ls dev $ETH parent ffff: protocol ip
filter pref 8 u32
filter pref 8 u32 fh 800: ht divisor 1
filter pref 8 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10  (rule hit 2800 success 1005)
  match 7f000008/ffffffff at 16 (success 1005 )
action order 1:  police 0xa rate 1Kbit burst 23440b mtu 2Kb action pipe/jump 2 overhead 0b
ref 2 bind 1 installed 207 sec used 122 sec
Action statistics:
Sent 84420 bytes 1005 pkt (dropped 0, overlimits 721 requeues 0)
backlog 0b 0p requeues 0

action order 2:  skbedit mark 11 pass
index 11 ref 2 bind 1 installed 204 sec used 122 sec
Action statistics:
Sent 60564 bytes 721 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

action order 3:  skbedit mark 12 pass
index 12 ref 2 bind 1 installed 201 sec used 122 sec
Action statistics:
Sent 23856 bytes 284 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

Not bad, about 28% non-conforming packets..

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mark Brown [Tue, 25 Apr 2017 15:25:07 +0000 (16:25 +0100)]

Merge remote-tracking branches 'asoc/fix/intel', 'asoc/fix/topology' and 'asoc/fix/sti' into asoc-linus

commit | commitdiff | tree

David S. Miller [Tue, 25 Apr 2017 15:20:30 +0000 (11:20 -0400)]

Merge tag 'linux-can-fixes-for-4.11-20170425' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2017-04-25

this is a pull request of three patches for net/master.

There are two patches by Stephane Grosjean for that add a new variant to the
PCAN-Chip USB driver. The other patch is by Maksim Salau, which swtiches the
memory for USB transfers from heap to stack.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Bert Kenward [Tue, 25 Apr 2017 12:44:54 +0000 (13:44 +0100)]

sfc: tx ring can only have 2048 entries for all EF10 NICs

Fixes: dd248f1bc65b ("sfc: Add PCI ID for Solarflare 8000 series 10/40G NIC")
Reported-by: Patrick Talbert <ptalbert@redhat.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Takashi Iwai [Mon, 24 Apr 2017 12:09:55 +0000 (14:09 +0200)]

ASoC: intel: Fix PM and non-atomic crash in bytcr drivers

The FE setups of Intel SST bytcr_rt5640 and bytcr_rt5651 drivers carry
the ignore_suspend flag, and this prevents the suspend/resume working
properly while the stream is running, since SST core code has the
check of the running streams and returns -EBUSY.  Drop these
superfluous flags for fixing the behavior.

Also, the bytcr_rt5640 driver lacks of nonatomic flag in some FE
definitions, which leads to the kernel Oops at suspend/resume like:

  BUG: scheduling while atomic: systemd-sleep/3144/0x00000003
  Call Trace:
   dump_stack+0x5c/0x7a
   __schedule_bug+0x55/0x70
   __schedule+0x63c/0x8c0
   schedule+0x3d/0x90
   schedule_timeout+0x16b/0x320
   ? del_timer_sync+0x50/0x50
   ? sst_wait_timeout+0xa9/0x170 [snd_intel_sst_core]
   ? sst_wait_timeout+0xa9/0x170 [snd_intel_sst_core]
   ? remove_wait_queue+0x60/0x60
   ? sst_prepare_and_post_msg+0x275/0x960 [snd_intel_sst_core]
   ? sst_pause_stream+0x9b/0x110 [snd_intel_sst_core]
   ....

This patch addresses these appropriately, too.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: <stable@vger.kernel.org> # v4.1+

commit | commitdiff | tree

David S. Miller [Tue, 25 Apr 2017 14:52:47 +0000 (10:52 -0400)]

Merge tag 'linux-can-next-for-4.12-20170425' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2017-04-25

this is a pull request of 21 patches for net-next/master.

There are 4 patches by Stephane Grosjean for the PEAK PCAN-PCIe FD
CAN-FD boards. The next 7 patches are by Mario Huettel, which add
support for M_CAN IP version >= v3.1.x to the m_can driver. A patch by
Remigiusz Kołłątaj adds support for the Microchip CAN BUS Analyzer. 8
patches by Oliver Hartkopp complete the initial CAN network namespace
support. Wei Yongjun's patch for the ti_hecc driver fixes the return
value check in the probe function.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Florian Westphal [Thu, 20 Apr 2017 16:08:15 +0000 (18:08 +0200)]

ipvlan: use pernet operations and restrict l3s hooks to master netns

commit 4fbae7d83c98c30efc ("ipvlan: Introduce l3s mode") added
registration of netfilter hooks via nf_register_hooks().

This API provides the illusion of 'global' netfilter hooks by placing the
hooks in all current and future network namespaces.

In case of ipvlan the hook appears to be only needed in the namespace
that contains the ipvlan master device (i.e., usually init_net), so
placing them in all namespaces is not needed.

This switches ipvlan driver to pernet operations, and then only registers
hooks in namespaces where a ipvlan master device is set to l3s mode.

Extra care has to be taken when the master device is moved to another
namespace, as we might have to 'move' the netfilter hooks too.

This is done by storing the namespace the ipvlan port was created in.
On REGISTER event, do (un)register operations in the old/new namespaces.

This will also allow removal of the nf_register_hooks() in a future patch.

Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Herbert Xu [Thu, 20 Apr 2017 12:55:12 +0000 (20:55 +0800)]

macvlan: Fix device ref leak when purging bc_queue

When a parent macvlan device is destroyed we end up purging its
broadcast queue without dropping the device reference count on
the packet source device. This causes the source device to linger.

This patch drops that reference count.

Fixes: 260916dfb48c ("macvlan: Fix potential use-after free for...")
Reported-by: Joe Ghalam <Joe.Ghalam@dell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Roman Spychała [Thu, 20 Apr 2017 10:04:10 +0000 (12:04 +0200)]

usb: plusb: Add support for PL-27A1

This patch adds support for the PL-27A1 by adding the appropriate
USB ID's. This chip is used in the goobay Active USB 3.0 Data Link
and Unitek Y-3501 cables.

Signed-off-by: Roman Spychała <roed@onet.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Sharon Dvir [Tue, 21 Feb 2017 09:12:12 +0000 (11:12 +0200)]

iwlwifi: mvm: handle possible BIOS bug

In iwl_mvm_sar_get_ewrd_table() In case of a BIOS bug, n_profiles
might be 0 thus we need to return an error value. Found by Klocwork.

Signed-off-by: Sharon Dvir <sharon.dvir@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Mordechai Goodstein [Sun, 12 Feb 2017 10:20:52 +0000 (12:20 +0200)]

iwlwifi: mvm: scan: avoid "big" prints

Delete the scanned channel results.
No need in it we get it any way when logging.
The print only clogs up the ftrace print buffer.

Signed-off-by: Mordechai Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sharon Dvir [Tue, 21 Feb 2017 08:41:31 +0000 (10:41 +0200)]

iwlwifi: mvm: check if returned value is NULL

While freeing inactive queue, check mvmsta to be valid before
dereferencing it. Found by Klocwork.

Signed-off-by: Sharon Dvir <sharon.dvir@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Johannes Berg [Mon, 20 Feb 2017 16:47:04 +0000 (17:47 +0100)]

iwlwifi: mvm: make iwl_run_unified_mvm_ucode() static

There's no need to have iwl_run_unified_mvm_ucode() be exposed
to other parts of the code since the logic to pick it over the
normal code in iwl_run_init_mvm_ucode() can just be done in
that function itself.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Sara Sharon [Wed, 11 Jan 2017 09:58:38 +0000 (11:58 +0200)]

iwlwifi: mvm: support new rate flags

Rates were changed to adapt to HE. Change is backward compatible.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit | commitdiff | tree

Maksim Salau [Sun, 23 Apr 2017 17:31:40 +0000 (20:31 +0300)]

net: can: usb: gs_usb: Fix buffer on stack

Allocate buffers on HEAP instead of STACK for local structures
that are to be sent using usb_control_msg().

Signed-off-by: Maksim Salau <maksim.salau@gmail.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v4.8
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Stephane Grosjean [Mon, 27 Mar 2017 12:36:11 +0000 (14:36 +0200)]

can: usb: Kconfig: Add PCAN-USB X6 device in help text

This patch adds a text line in the help section of the CAN_PEAK_USB
config item describing the support of the PCAN-USB X6 adapter, which is
already included in the Kernel since 4.9.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Stephane Grosjean [Mon, 27 Mar 2017 12:36:10 +0000 (14:36 +0200)]

can: usb: Add support of PCAN-Chip USB stamp module

This patch adds the support of the PCAN-Chip USB, a stamp module for
customer hardware designs, which communicates via USB 2.0 with the
hardware. The integrated CAN controller supports the protocols CAN 2.0 A/B
as well as CAN FD. The physical CAN connection is determined by external
wiring. The Stamp module with its single-sided mounting and plated
half-holes is suitable for automatic assembly.

Note that the chip is equipped with the same logic than the PCAN-USB FD.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Wei Yongjun [Tue, 25 Apr 2017 06:44:05 +0000 (06:44 +0000)]

can: ti_hecc: fix return value check in ti_hecc_probe()

In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().

Fixes: dabf54dd1c63 ("can: ti_hecc: Convert TI HECC driver to DT only driver")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Oliver Hartkopp [Tue, 25 Apr 2017 06:19:45 +0000 (08:19 +0200)]

can: enable module auto loading for virtual CAN interfaces

Autoload the vcan module when a vcan instance is to be created by
'ip link add type vcan'

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Oliver Hartkopp [Tue, 25 Apr 2017 06:19:44 +0000 (08:19 +0200)]

can: add Virtual CAN Tunnel driver (vxcan)

Similar to the virtual ethernet driver veth, vxcan implements a
local CAN traffic tunnel between two virtual CAN network devices.
See Kconfig entry for details.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Oliver Hartkopp [Tue, 25 Apr 2017 06:19:43 +0000 (08:19 +0200)]

can: network namespace support for CAN gateway

The CAN gateway was not implemented as per-net in the initial network
namespace support by Mario Kicherer (8e8cda6d737d).
This patch enables the CAN gateway to be used in different namespaces.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Oliver Hartkopp [Tue, 25 Apr 2017 06:19:42 +0000 (08:19 +0200)]

can: network namespace support for CAN_BCM protocol

The CAN_BCM protocol and its procfs entries were not implemented as per-net
in the initial network namespace support by Mario Kicherer (8e8cda6d737d).
This patch adds the missing per-net functionality for the CAN BCM.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Oliver Hartkopp [Tue, 25 Apr 2017 06:19:41 +0000 (08:19 +0200)]

can: complete initial namespace support

The statistics and its proc output was not implemented as per-net in the
initial network namespace support by Mario Kicherer (8e8cda6d737d).
This patch adds the missing per-net statistics for the CAN subsystem.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Oliver Hartkopp [Tue, 25 Apr 2017 06:19:40 +0000 (08:19 +0200)]

can: remove obsolete definitions

can_rx_alldev_list is a per-net data structure now. Remove it's definition
here and can_rx_dev_list too.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Oliver Hartkopp [Tue, 25 Apr 2017 06:19:39 +0000 (08:19 +0200)]

can: remove obsolete pernet_operations definitions

The namespace support for the CAN subsystem does not need any additional
memory. So when ".size = 0" there's no extra memory allocated by the system.
And therefore ".id" is obsolete too.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Oliver Hartkopp [Tue, 25 Apr 2017 06:19:38 +0000 (08:19 +0200)]

can: fix memory leak in initial namespace support

The can_rx_alldev_list is a per-net data structure now and allocated in
can_pernet_init(). Make sure the memory is free'd in can_pernet_exit() too.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Remigiusz Kołłątaj [Fri, 14 Apr 2017 18:32:28 +0000 (20:32 +0200)]

can: mcba_usb: Add support for Microchip CAN BUS Analyzer

SocketCAN driver for Microchip CAN BUS Analyzer
(http://www.microchip.com/development-tools/)

Changes in v4:
- possible memory leak fixed in mcba_usb_write_bulk_callback
- LED support added
- failure handling in mcba_usb_probe improved
- C99 initializers for structs on stack

Changes in v3:
- improved/simplified CAN ID conversion
- functions for transmission of skb and cmd separated
- fixed/improved netif_stop_queue handling
- style/cosmetic corrections

Changes in v2:
- Termination handling reimplemented to fit new netlink API
(IFLA_CAN_TERMINATION)
- Bitrate handling reimplemented to fit new netlink API
(IFLA_CAN_BITRATE)
- CAN ID conversion refactored (changed from macro to inline functions)
- CAN DLC handling using get_can_dlc()
- Endianness handling for can_speed introduced
- Debugging removed
- Redundant error prints removed
- Style/cosmetic corrections (i.e. macro names, redefs, inits etc.)

Signed-off-by: Remigiusz Kołłątaj <remigiusz.kollataj@mobica.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Mario Huettel [Sat, 8 Apr 2017 12:10:15 +0000 (14:10 +0200)]

can: m_can: Enable TX FIFO Handling for M_CAN IP version >= v3.1.x

* Added defines for TX Event FIFO Element
* Adapted ndo_start_xmit function.
  For versions >= v3.1.x it uses the TX FIFO to optimize the data
  throughput. It stores the echo skb at the same index as in the
  M_CAN's TX FIFO. The frame's message marker is set to this index.
  This message marker is received in the TX Event FIFO after
  the message was successfully transmitted. It is used to echo the
  correct echo skb back to the network stack.
* Added m_can_echo_tx_event function. It reads all received
  message markers in the TX Event FIFO and loops back the
  corresponding echo skbs.
* ISR checks for new TX Event Entry interrupt for version >= 3.1.x.

Signed-off-by: Mario Huettel <mario.huettel@gmx.net>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Mario Huettel [Sat, 8 Apr 2017 12:10:14 +0000 (14:10 +0200)]

can: m_can: Configuration for TX and TX event FIFOs

* TX/TX Event FIFO sizes are configured for version >= v3.1.x

Signed-off-by: Mario Huettel <mario.huettel@gmx.net>
Tested-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Mario Huettel [Sat, 8 Apr 2017 12:10:13 +0000 (14:10 +0200)]

can: m_can: Enable M_CAN version dependent initialization

This patch adapts the initialization of the M_CAN. So it can be used
with all versions >= 3.0.x.

Changes:
* Added version element to m_can_priv structure to hold M_CAN version.
* Renamed bittiming structs for version 3.0.x
* Added new bittiming structs for version >= 3.1.x
* Function alloc_m_can_dev takes 2 new arguments. The TX FIFO size and the
  base address of the module.
* Chip configuration for CAN_CTRLMODE_LOOPBACK is changed: Enabled
  CCCR_MON bit. In combination with TEST_LBCK it activates the internal
  loopback mode. Leaving CCCR_MON '0' results in external loopback mode.
* Clocks are temporarily enabled by platform_propbe function in order to
  allow read access to the Core Release register and the Control Register.
  Registers are used to detect M_CAN version and optional Non-ISO Feature.

Initialization of M_CAN for version >= 3.1.x:
* TX FIFO of M_CAN is used to transmit frames. The driver does not need to
  stop the tx queue after each frame sent.
* Initialization of TX Event FIFO is added.
* NON-ISO is fixed for all M_CAN versions < 3.2.x. Version 3.2.x _can_ have
  the NISO (Non-ISO) bit which can switch the mode of the M_CAN to Non-ISO
  mode. This bit does not have to be writeable. Therefore it is checked.
  If it is writable Non-ISO support is added to the controllers supported
  CAN modes.

New Functions:
* Function to check the Core Release version. The read value determines the
  behaviour of the driver.
* Function to check if the NISO bit for version >= 3.2.x is implemented.

Signed-off-by: Mario Huettel <mario.huettel@gmx.net>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Mario Huettel [Sat, 8 Apr 2017 12:10:12 +0000 (14:10 +0200)]

can: m_can: Updated register defines to newest version

* Updated register defines to newest M_CAN version (v3.2.1).
* Changed defines in the whole code.

Signed-off-by: Mario Huettel <mario.huettel@gmx.net>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Mario Huettel [Sat, 8 Apr 2017 12:10:11 +0000 (14:10 +0200)]

can: m_can: Removed virtual address from print

The virtual address of the device was printed. I removed it because it
leaks internal information.

Signed-off-by: Mario Huettel <mario.huettel@gmx.net>
Tested-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Mario Huettel [Sat, 8 Apr 2017 12:10:10 +0000 (14:10 +0200)]

can: m_can: Removed initialization of FIFO water marks

FIFO water marks disabled because the driver doesn't handle water mark
events.

Signed-off-by: Mario Huettel <mario.huettel@gmx.net>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Mario Huettel [Sat, 8 Apr 2017 12:10:09 +0000 (14:10 +0200)]

can: m_can: Disabled Interrupt Line 1

* Disabled interrupt line 1. The driver didn't use it.

Signed-off-by: Mario Huettel <mario.huettel@gmx.net>
Tested-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Stephane Grosjean [Thu, 19 Jan 2017 15:31:07 +0000 (16:31 +0100)]

can: peak: add support for PEAK PCAN-PCIe FD CAN-FD boards

This patch adds the support of the PCAN-PCI Express FD boards made
by PEAK-System, for computers using the PCI Express slot.

The PCAN-PCI Express FD has one or two CAN FD channels, depending
on the model. A galvanic isolation of the CAN ports protects
the electronics of the card and the respective computer against
disturbances of up to 500 Volts. The PCAN-PCI Express FD can be operated
with ambient temperatures in a range of -40 to +85 °C.

Such boards run an extented version of the CAN-FD IP running into USB
CAN-FD interfaces from PEAK-System, so this patch adds several new commands
and their corresponding data types to the PEAK CAN-FD common definitions
header file too.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Stephane Grosjean [Thu, 19 Jan 2017 15:31:06 +0000 (16:31 +0100)]

can: peak: move header file to new can common subdir

The CAN-FD IP from PEAK-System runs into several kinds of PC CAN-FD
interfaces. Up to now, only the USB CAN-FD adapters were supported by
the Kernel. In order to prepare the adding of some new non-USB CAN-FD
interfaces, this patch moves - and rename - the IP definitions file
from its private (usb) sub-directory into a - newly created - CAN specific
one.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Stephane Grosjean [Thu, 19 Jan 2017 15:31:05 +0000 (16:31 +0100)]

can: peak: fix usage of const qualifier in pointers args

Fixes the usage of the const qualifier in the memory pointer arguments
of the declared inline functions. By changing the line containing "const",
this patch also changes the name of the arg into a more usual one.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Stephane Grosjean [Thu, 19 Jan 2017 15:31:04 +0000 (16:31 +0100)]

can: peak: fix usage of usb specific data type

This patch fixes the wrong usage of a specific USB data type into a common
header file. This common header file is intended to define the common data
types and values that define access to the PEAK-System CAN-FD IP, whatever
the PC interface is.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

David S. Miller [Tue, 25 Apr 2017 03:55:20 +0000 (23:55 -0400)]

Merge branch 'virtio-net-tx-napi'

Willem de Bruijn says:

====================
virtio-net tx napi

Add napi for virtio-net transmit completion processing.

Changes:
  v2 -> v3:
    - convert __netif_tx_trylock to __netif_tx_lock on tx napi poll
          ensure that the handler always cleans, to avoid deadlock
    - unconditionally clean in start_xmit
          avoid adding an unnecessary "if (use_napi)" branch
    - remove virtqueue_disable_cb in patch 5/5
          a noop in the common event_idx based loop
    - document affinity_hint_set constraint

  v1 -> v2:
    - disable by default
    - disable unless affinity_hint_set
          because cache misses add up to a third higher cycle cost,
  e.g., in TCP_RR tests. This is not limited to the patch
  that enables tx completion cleaning in rx napi.
    - use trylock to avoid contention between tx and rx napi
    - keep interrupts masked during xmit_more (new patch 5/5)
          this improves cycles especially for multi UDP_STREAM, which
  does not benefit from cleaning tx completions on rx napi.
    - move free_old_xmit_skbs (new patch 3/5)
          to avoid forward declaration

    not changed:
    - deduplicate virnet_poll_tx and virtnet_poll_txclean
          they look similar, but have differ too much to make it
  worthwhile.
    - delay netif_wake_subqueue for more than 2 + MAX_SKB_FRAGS
          evaluated, but made no difference
    - patch 1/5

  RFC -> v1:
    - dropped vhost interrupt moderation patch:
          not needed and likely expensive at light load
    - remove tx napi weight
        - always clean all tx completions
        - use boolean to toggle tx-napi, instead
    - only clean tx in rx if tx-napi is enabled
        - then clean tx before rx
    - fix: add missing braces in virtnet_freeze_down
    - testing: add 4KB TCP_RR + UDP test results

Based on previous patchsets by Jason Wang:

  [RFC V7 PATCH 0/7] enable tx interrupts for virtio-net
  http://lkml.iu.edu/hypermail/linux/kernel/1505.3/00245.html

Before commit b0c39dbdc204 ("virtio_net: don't free buffers in xmit
ring") the virtio-net driver would free transmitted packets on
transmission of new packets in ndo_start_xmit and, to catch the edge
case when no new packet is sent, also in a timer at 10HZ.

A timer can cause long stalls. VIRTIO_F_NOTIFY_ON_EMPTY avoids stalls
due to low free descriptor count. It does not address a stalls due to
low socket SO_SNDBUF. Increasing timer frequency decreases that stall
time, but increases interrupt rate and, thus, cycle count.

Currently, with no timer, packets are freed only at ndo_start_xmit.
Latency of consume_skb is now unbounded. To avoid a deadlock if a sock
reaches SO_SNDBUF, packets are orphaned on tx. This breaks TCP small
queues.

Reenable TCP small queues by removing the orphan. Instead of using a
timer, convert the driver to regular tx napi. This does not have the
unresolved stall issue and does not have any frequency to tune.

By keeping interrupts enabled by default, napi increases tx
interrupt rate. VIRTIO_F_EVENT_IDX avoids sending an interrupt if
one is already unacknowledged, so makes this more feasible today.
Combine that with an optimization that brings interrupt rate
back in line with the existing version for most workloads:

Tx completion cleaning on rx interrupts elides most explicit tx
interrupts by relying on the fact that many rx interrupts fire.

Tested by running {1, 10, 100} {TCP, UDP} STREAM, RR, 4K_RR benchmarks
from a guest to a server on the host, on an x86_64 Haswell. The guest
runs 4 vCPUs pinned to 4 cores. vhost and the test server are
pinned to a core each.

All results are the median of 5 runs, with variance well < 10%.
Used neper (github.com/google/neper) as test process.

Napi increases single stream throughput, but increases cycle cost.
The optimizations bring this down. The previous patchset saw a
regression with UDP_STREAM, which does not benefit from cleaning tx
interrupts in rx napi. This regression is now gone for 10x, 100x.
Remaining difference is higher 1x TCP_STREAM, lower 1x UDP_STREAM.

The latest results are with process, rx napi and tx napi affine to
the same core. All numbers are lower than the previous patchset.

             upstream     napi
TCP_STREAM:
1x:
  Mbps          27816    39805
  Gcycles         274      285

10x:
  Mbps          42947    42531
  Gcycles         300      296

100x:
  Mbps          31830    28042
  Gcycles         279      269

TCP_RR Latency (us):
1x:
  p50              21       21
  p99              27       27
  Gcycles         180      167

10x:
  p50              40       39
  p99              52       52
  Gcycles         214      211

100x:
  p50             281      241
  p99             411      337
  Gcycles         218      226

TCP_RR 4K:
1x:
  p50              28       29
  p99              34       36
  Gcycles         177      167

10x:
  p50              70       71
  p99              85      134
  Gcycles         213      214

100x:
  p50             442      611
  p99             802      785
  Gcycles         237      216

UDP_STREAM:
1x:
  Mbps          29468    26800
  Gcycles         284      293

10x:
  Mbps          29891    29978
  Gcycles         285      312

100x:
  Mbps          30269    30304
  Gcycles         318      316

UDP_RR:
1x:
  p50              19       19
  p99              23       23
  Gcycles         180      173

10x:
  p50              35       40
  p99              54       64
  Gcycles         245      237

100x:
  p50             234      286
  p99             484      473
  Gcycles         224      214

Note that GSO is enabled, so 4K RR still translates to one packet
per request.

Lower throughput at 100x vs 10x can be (at least in part)
explained by looking at bytes per packet sent (nstat). It likely
also explains the lower throughput of 1x for some variants.

upstream:

N=1   bytes/pkt=16581
N=10  bytes/pkt=61513
N=100 bytes/pkt=51558

at_rx:

N=1   bytes/pkt=65204
N=10  bytes/pkt=65148
N=100 bytes/pkt=56840
====================

Acked-by: Michael S. Tsirkin <mst@redhat.com>

commit | commitdiff | tree

Willem de Bruijn [Mon, 24 Apr 2017 17:49:30 +0000 (13:49 -0400)]

virtio-net: keep tx interrupts disabled unless kick

Tx napi mode increases the rate of transmit interrupts. Suppress some
by masking interrupts while more packets are expected. The interrupts
will be reenabled before the last packet is sent.

This optimization reduces the througput drop with tx napi for
unidirectional flows such as UDP_STREAM that do not benefit from
cleaning tx completions in the the receive napi handler.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Willem de Bruijn [Mon, 24 Apr 2017 17:49:29 +0000 (13:49 -0400)]

virtio-net: clean tx descriptors from rx napi

Amortize the cost of virtual interrupts by doing both rx and tx work
on reception of a receive interrupt if tx napi is enabled. With
VIRTIO_F_EVENT_IDX, this suppresses most explicit tx completion
interrupts for bidirectional workloads.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Willem de Bruijn [Mon, 24 Apr 2017 17:49:28 +0000 (13:49 -0400)]

virtio-net: move free_old_xmit_skbs

An upcoming patch will call free_old_xmit_skbs indirectly from
virtnet_poll. Move the function above this to avoid having to
introduce a forward declaration.

This is a pure move: no code changes.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Willem de Bruijn [Mon, 24 Apr 2017 17:49:27 +0000 (13:49 -0400)]

virtio-net: transmit napi

Convert virtio-net to a standard napi tx completion path. This enables
better TCP pacing using TCP small queues and increases single stream
throughput.

The virtio-net driver currently cleans tx descriptors on transmission
of new packets in ndo_start_xmit. Latency depends on new traffic, so
is unbounded. To avoid deadlock when a socket reaches its snd limit,
packets are orphaned on tranmission. This breaks socket backpressure,
including TSQ.

Napi increases the number of interrupts generated compared to the
current model, which keeps interrupts disabled as long as the ring
has enough free descriptors. Keep tx napi optional and disabled for
now. Follow-on patches will reduce the interrupt cost.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Willem de Bruijn [Mon, 24 Apr 2017 17:49:26 +0000 (13:49 -0400)]

virtio-net: napi helper functions

Prepare virtio-net for tx napi by converting existing napi code to
use helper functions. This also deduplicates some logic.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 25 Apr 2017 02:42:34 +0000 (19:42 -0700)]

sparc64: Improve 64-bit constant loading in eBPF JIT.

Doing a full 64-bit decomposition is really stupid especially for
simple values like 0 and -1.

But if we are going to optimize this, go all the way and try for all 2
and 3 instruction sequences not requiring a temporary register as
well.

First we do the easy cases where it's a zero or sign extended 32-bit
number (sethi+or, sethi+xor, respectively).

Then we try to find a range of set bits we can load simply then shift
up into place, in various ways.

Then we try negating the constant and see if we can do a simple
sequence using that with a xor at the end.  (f.e. the range of set
bits can't be loaded simply, but for the negated value it can)

The final optimized strategy involves 4 instructions sequences not
needing a temporary register.

Otherwise we sadly fully decompose using a temp..

Example, from ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 = 0x0000ffffffff0000:

0000000000000000 <foo>:
   0:   9d e3 bf 50     save  %sp, -176, %sp
   4:   01 00 00 00     nop
   8:   90 10 00 18     mov  %i0, %o0
   c:   13 3f ff ff     sethi  %hi(0xfffffc00), %o1
  10:   92 12 63 ff     or  %o1, 0x3ff, %o1     ! ffffffff <foo+0xffffffff>
  14:   93 2a 70 10     sllx  %o1, 0x10, %o1
  18:   15 3f ff ff     sethi  %hi(0xfffffc00), %o2
  1c:   94 12 a3 ff     or  %o2, 0x3ff, %o2     ! ffffffff <foo+0xffffffff>
  20:   95 2a b0 10     sllx  %o2, 0x10, %o2
  24:   92 1a 60 00     xor  %o1, 0, %o1
  28:   12 e2 40 8a     cxbe  %o1, %o2, 38 <foo+0x38>
  2c:   9a 10 20 02     mov  2, %o5
  30:   10 60 00 03     b,pn   %xcc, 3c <foo+0x3c>
  34:   01 00 00 00     nop
  38:   9a 10 20 01     mov  1, %o5     ! 1 <foo+0x1>
  3c:   81 c7 e0 08     ret
  40:   91 eb 40 00     restore  %o5, %g0, %o0

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Mon, 24 Apr 2017 22:56:21 +0000 (15:56 -0700)]

sparc64: Support cbcond instructions in eBPF JIT.

cbcond combines a compare with a branch into a single instruction.

The limitations are:

1) Only newer chips support it

2) For immediate compares we are limited to 5-bit signed immediate
values

3) The branch displacement is limited to 10-bit signed

4) We cannot use it for JSET

Also, cbcond (unlike all other sparc control transfers) lacks a delay
slot.

Currently we don't have a useful instruction we can push into the
delay slot of normal branches. So using cbcond pretty much always
increases code density, and is therefore a win.

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Mon, 24 Apr 2017 22:28:57 +0000 (18:28 -0400)]

Merge branch 'dsa-b53-58xx-fixes'

Florian Fainelli says:

====================
net: dsa: b53: BCM58xx devices fixes

This patch series contains fixes for the 58xx devices (Broadcom Northstar
Plus), which were identified thanks to the help of Eric Anholt.
====================

Tested-by: Eric Anholt <eric@anholt.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Florian Fainelli [Mon, 24 Apr 2017 21:27:23 +0000 (14:27 -0700)]

net: dsa: b53: Fix CPU port for 58xx devices

The 58xx devices (Northstar Plus) do actually have their CPU port wired
at port 8, it was unfortunately set to port 5 (B53_CPU_PORT_25) which is
incorrect, since that is the second possible management port.

Fixes: 991a36bb4645 ("net: dsa: b53: Add support for BCM585xx/586xx/88312 integrated switch")
Reported-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Linux kernel for KaRo TX COM modules

RSS Atom