The reusable fairness queueing implementation (fq.h) lacks the memory
usage limit that the fq_codel qdisc has. This means that small
devices (e.g. WiFi routers) can run out of memory when flooded with a
large number of packets. This ports the memory limit feature from
fq_codel to fq.h.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Implement add/rm_nan_func functions and handle NAN function
termination notifications. Handle instance_id allocation for
NAN functions and implement the reconfig flow.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Implement nan_change_conf callback which allows to change current
NAN configuration (master preference and dual band operation).
Store the current NAN configuration in sdata, so it can be used
both to provide the driver the updated configuration with changes
and also it will be used in hw reconfig flows in next patches.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
cfg80211: Provide an API to report NAN function termination
Provide a function that reports NAN DE function termination. The function
may be terminated due to one of the following reasons: user request,
ttl expiration or failure.
If the NAN instance is tied to the owner, the notification will be
sent to the socket that started the NAN interface only
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
cfg80211: provide a function to report a match for NAN
Provide a function the driver can call to report a match.
This will send the event to the user space.
If the NAN instance is tied to the owner, the notifications will be
sent to the socket that started the NAN interface only.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
cfg80211: allow the user space to change current NAN configuration
Some NAN configuration paramaters may change during the operation of
the NAN device. For example, a user may want to update master preference
value when the device gets plugged/unplugged to the power.
Add API that allows to do so.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A NAN function can be either publish, subscribe or follow
up. Make all the necessary verifications and just pass the
request to the driver.
Allow the user space application that starts NAN to
forbid any other socket to add or remove functions.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This allows user space to start/stop NAN interface.
A NAN interface is like P2P device in a few aspects: it
doesn't have a netdev associated to it.
Add the new interface type and prevent operations that
can't be executed on NAN interface like scan.
Define several attributes that may be configured by user space
when starting NAN functionality (master preference and dual
band operation)
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
David Spinadel [Thu, 22 Sep 2016 20:16:50 +0000 (23:16 +0300)]
cfg80211: Add support for static WEP in the driver
Add support for drivers that implement static WEP internally, i.e.
expose connection keys to the driver in connect flow and don't
upload the keys after the connection.
Signed-off-by: David Spinadel <david.spinadel@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
mac80211: Move ieee802111_tx_dequeue() to later in tx.c
The TXQ path restructure requires ieee80211_tx_dequeue() to call TX
handlers and parts of the xmit_fast path. Move the function to later in
tx.c in preparation for this.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Wed, 28 Sep 2016 21:44:57 +0000 (23:44 +0200)]
cfg80211: wext: really don't store non-WEP keys
Jouni reported that during (repeated) wext_pmf test runs (from the
wpa_supplicant hwsim test suite) the kernel crashes. The reason is
that after the key is set, the wext code still unnecessarily stores
it into the key cache. Despite smatch pointing out an overflow, I
failed to identify the possibility for this in the code and missed
it during development of the earlier patch series.
In order to fix this, simply check that we never store anything but
WEP keys into the cache, adding a comment as to why that's enough.
Also, since the cache is still allocated early even if it won't be
used in many cases, add a comment explaining why - otherwise we'd
have to roll back key settings to the driver in case of allocation
failures, which is far more difficult.
Johannes Berg [Mon, 19 Sep 2016 07:44:44 +0000 (09:44 +0200)]
cfg80211: add checks for beacon rate, extend to mesh
The previous commit added support for specifying the beacon rate
for AP mode. Add features checks to this, and extend it to also
support the rate configuration for mesh networks. For IBSS it's
not as simple due to joining etc., so that's not yet supported.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
David S. Miller [Mon, 19 Sep 2016 02:29:08 +0000 (22:29 -0400)]
Merge tag 'mac80211-next-for-davem-2016-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
This time we have various things - all across the board:
* MU-MIMO sniffer support in mac80211
* a create_singlethread_workqueue() cleanup
* interface dump filtering that was documented but not implemented
* support for the new radiotap timestamp field
* send delBA in two unexpected conditions (as required by the spec)
* connect keys cleanups - allow only WEP with index 0-3
* per-station aggregation limit to work around broken APs
* debugfs improvement for the integrated codel algorithm
and various other small improvements and cleanups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Fri, 16 Sep 2016 09:43:38 +0000 (10:43 +0100)]
net: r6040: add in missing white space in error message text
A couple of dev_err messages span two lines and the literal
string is missing a white space between words. Add the white
space and join the two lines into one.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: FLorian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of using flow stats per NUMA node, use it per CPU. When using
megaflows, the stats lock can be a bottleneck in scalability.
On a E5-2690 12-core system, usual throughput went from ~4Mpps to
~15Mpps when forwarding between two 40GbE ports with a single flow
configured on the datapath.
This has been tested on a system with possible CPUs 0-7,16-23. After
module removal, there were no corruption on the slab cache.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Cc: pravin shelar <pshelar@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
openvswitch: fix flow stats accounting when node 0 is not possible
On a system with only node 1 as possible, all statistics is going to be
accounted on node 0 as it will have a single writer.
However, when getting and clearing the statistics, node 0 is not going
to be considered, as it's not a possible node.
Tested that statistics are not zero on a system with only node 1
possible. Also compile-tested with CONFIG_NUMA off.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Sep 2016 02:02:40 +0000 (22:02 -0400)]
Merge branch 'sctp-transmit-errs'
Xin Long says:
====================
sctp: fix the transmit err process
This patchset is to improve the transmit err process and also fix some
issues.
After this patchset, once the chunks are enqueued successfully, even
if the chunks fail to send out, no matter because of nodst or nomem,
no err retruns back to users any more. Instead, they are taken care
of by retransmit.
v1->v2:
- add more details to the changelog in patch 1/6
- add Fixes: tag in patch 2/6, 3/6
- also revert 69b5777f2e57 in patch 3/6
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:23 +0000 (02:04 +0800)]
sctp: not return ENOMEM err back in sctp_packet_transmit
As David and Marcelo's suggestion, ENOMEM err shouldn't return back to
user in transmit path. Instead, sctp's retransmit would take care of
the chunks that fail to send because of ENOMEM.
This patch is only to do some release job when alloc_skb fails, not to
return ENOMEM back any more.
Besides, it also cleans up sctp_packet_transmit's err path, and fixes
some issues in err path:
- It didn't free the head skb in nomem: path.
- No need to check nskb in no_route: path.
- It should goto err: path if alloc_skb fails for head.
- Not all the NOMEMs should free nskb.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:21 +0000 (02:04 +0800)]
sctp: save transmit error to sk_err in sctp_outq_flush
Every time when sctp calls sctp_outq_flush, it sends out the chunks of
control queue, retransmit queue and data queue. Even if some trunks are
failed to transmit, it still has to flush all the transports, as it's
the only chance to clean that transmit_list.
So the latest transmit error here should be returned back. This transmit
error is an internal error of sctp stack.
I checked all the places where it uses the transmit error (the return
value of sctp_outq_flush), most of them are actually just save it to
sk_err.
Except for sctp_assoc/endpoint_bh_rcv, they will drop the chunk if
it's failed to send a REPLY, which is actually incorrect, as we can't
be sure the error that sctp_outq_flush returns is from sending that
REPLY.
So it's meaningless for sctp_outq_flush to return error back.
This patch is to save transmit error to sk_err in sctp_outq_flush, the
new error can update the old value. Eventually, sctp_wait_for_* would
check for it.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:20 +0000 (02:04 +0800)]
sctp: free msg->chunks when sctp_primitive_SEND return err
Last patch "sctp: do not return the transmit err back to sctp_sendmsg"
made sctp_primitive_SEND return err only when asoc state is unavailable.
In this case, chunks are not enqueued, they have no chance to be freed if
we don't take care of them later.
This Patch is actually to revert commit 1cd4d5c4326a ("sctp: remove the
unused sctp_datamsg_free()"), commit 69b5777f2e57 ("sctp: hold the chunks
only after the chunk is enqueued in outq") and commit 8b570dc9f7b6 ("sctp:
only drop the reference on the datamsg after sending a msg"), to use
sctp_datamsg_free to free the chunks of current msg.
Fixes: 8b570dc9f7b6 ("sctp: only drop the reference on the datamsg after sending a msg") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:19 +0000 (02:04 +0800)]
sctp: do not return the transmit err back to sctp_sendmsg
Once a chunk is enqueued successfully, sctp queues can take care of it.
Even if it is failed to transmit (like because of nomem), it should be
put into retransmit queue.
If sctp report this error to users, it confuses them, they may resend
that msg, but actually in kernel sctp stack is in charge of retransmit
it already.
Besides, this error probably is not from the failure of transmitting
current msg, but transmitting or retransmitting another msg's chunks,
as sctp_outq_flush just tries to send out all transports' chunks.
This patch is to make sctp_cmd_send_msg return avoid, and not return the
transmit err back to sctp_sendmsg
Fixes: 8b570dc9f7b6 ("sctp: only drop the reference on the datamsg after sending a msg") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:18 +0000 (02:04 +0800)]
sctp: remove the unnecessary state check in sctp_outq_tail
Data Chunks are only sent by sctp_primitive_SEND, in which sctp checks
the asoc's state through statetable before calling sctp_outq_tail. So
there's no need to check the asoc's state again in sctp_outq_tail.
Besides, sctp_do_sm is protected by lock_sock, even if sending msg is
interrupted by timer events, the event's processes still need to acquire
lock_sock first. It means no others CMDs can be enqueue into side effect
list before CMD_SEND_MSG to change asoc->state, so it's safe to remove it.
This patch is to remove redundant asoc->state check from sctp_outq_tail.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
samples/bpf: add comprehensive ipip, ipip6, ip6ip6 test
the test creates 3 namespaces with veth connected via bridge.
First two namespaces simulate two different hosts with the same
IPv4 and IPv6 addresses configured on the tunnel interface and they
communicate with outside world via standard tunnels.
Third namespace creates collect_md tunnel that is driven by BPF
program which selects different remote host (either first or
second namespace) based on tcp dest port number while tcp dst
ip is the same.
This scenario is rough approximation of load balancer use case.
The tests check both traditional tunnel configuration and collect_md mode.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to gre, vxlan, geneve tunnels allow IPIP6 and IP6IP6 tunnels
to operate in 'collect metadata' mode.
Unlike ipv4 code here it's possible to reuse ip6_tnl_xmit() function
for both collect_md and traditional tunnels.
bpf_skb_[gs]et_tunnel_key() helpers and ovs (in the future) are the users.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to gre, vxlan, geneve tunnels allow IPIP tunnels to
operate in 'collect metadata' mode.
bpf_skb_[gs]et_tunnel_key() helpers can make use of it right away.
ovs can use it as well in the future (once appropriate ovs-vport
abstractions and user apis are added).
Note that just like in other tunnels we cannot cache the dst,
since tunnel_info metadata can be different for every packet.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Thu, 15 Sep 2016 20:23:26 +0000 (22:23 +0200)]
l2tp: constify net_device_ops structures
Check for net_device_ops structures that are only stored in the netdev_ops
field of a net_device structure. This field is declared const, so
net_device_ops structures that have this property can be declared as const
also.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
Julia Lawall [Thu, 15 Sep 2016 20:23:25 +0000 (22:23 +0200)]
dwc_eth_qos: constify net_device_ops structures
Check for net_device_ops structures that are only stored in the netdev_ops
field of a net_device structure. This field is declared const, so
net_device_ops structures that have this property can be declared as const
also.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct net_device_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct net_device_ops i = { ... };
// </smpl>
The result of size on this file before the change is:
text data bss dec hex filename
21623 1316 40 22979 59c3
drivers/net/ethernet/synopsys/dwc_eth_qos.o
and after the change it is:
text data bss dec hex filename
22199 724 40 22963 59b3
drivers/net/ethernet/synopsys/dwc_eth_qos.o
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Thu, 15 Sep 2016 20:23:24 +0000 (22:23 +0200)]
hisilicon: constify net_device_ops structures
Check for net_device_ops structures that are only stored in the netdev_ops
field of a net_device structure. This field is declared const, so
net_device_ops structures that have this property can be declared as const
also.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
Eric Dumazet [Thu, 15 Sep 2016 16:33:02 +0000 (09:33 -0700)]
tcp: prepare skbs for better sack shifting
With large BDP TCP flows and lossy networks, it is very important
to keep a low number of skbs in the write queue.
RACK and SACK processing can perform a linear scan of it.
We should avoid putting any payload in skb->head, so that SACK
shifting can be done if needed.
With this patch, we allow to pack ~0.5 MB per skb instead of
the 64KB initially cooked at tcp_sendmsg() time.
This gives a reduction of number of skbs in write queue by eight.
tcp_rack_detect_loss() likes this.
We still allow payload in skb->head for first skb put in the queue,
to not impact RPC workloads.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 17 Sep 2016 13:53:29 +0000 (09:53 -0400)]
Merge tag 'wireless-drivers-next-for-davem-2016-09-15' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 4.9
Major changes:
iwlwifi
* preparation for new a000 HW continues
* some DQA improvements
* add support for GMAC
* add support for 9460, 9270 and 9170 series
mwifiex
* support random MAC address for scanning
* add HT aggregation support for adhoc mode
* add custom regulatory domain support
* add manufacturing mode support via nl80211 testmode interface
bcma
* support BCM53573 series of wireless SoCs
bitfield.h
* add FIELD_PREP() and FIELD_GET() macros
mt7601u
* convert to use the new bitfield.h macros
brcmfmac
* add support for bcm4339 chip with modalias sdio:c00v02D0d4339
ath10k
* add nl80211 testmode support for 10.4 firmware
* hide kernel addresses from logs using %pK format specifier
* implement NAPI support
* enable peer stats by default
ath9k
* use ieee80211_tx_status_noskb where possible
wil6210
* extract firmware capabilities from the firmware file
ath6kl
* enable firmware crash dumps on the AR6004
ath-current is also merged to fix a conflict in ath10k.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 17 Sep 2016 13:51:48 +0000 (09:51 -0400)]
Merge branch 'mlx5e-order-0'
Tariq Toukan says:
====================
mlx5e Order-0 pages for Striding RQ
In this series, we refactor our Striding RQ receive-flow to always use
fragmented WQEs (Work Queue Elements) using order-0 pages, omitting the
flow that allocates and splits high-order pages which would fragment
and deplete high-order pages in the system.
The first patch gives a slight degradation, but opens the opportunity
to using a simple page-cache mechanism of a fair size.
The page-cache, implemented in patch 3, not only closes the performance
gap but even gives a gain.
In patch 2 we re-organize the code to better manage the calls for
alloc/de-alloc pages in the RX flow.
Series generated against net-next commit: bed806cb266e "Merge branch 'mlxsw-ethtool'"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx5e: Implement RX mapped page cache for page recycle
Instead of reallocating and mapping pages for RX data-path,
recycle already used pages in a per ring cache.
Performance tests:
The following results were measured on a freshly booted system,
giving optimal baseline performance, as high-order pages are yet to
be fragmented and depleted.
We ran pktgen single-stream benchmarks, with iptables-raw-drop:
Single stride, 64 bytes:
* 4,739,057 - baseline
* 4,749,550 - order0 no cache
* 4,786,899 - order0 with cache
1% gain
Larger packets, no page cross, 1024 bytes:
* 3,982,361 - baseline
* 3,845,682 - order0 no cache
* 4,127,852 - order0 with cache
3.7% gain
Larger packets, every 3rd packet crosses a page, 1500 bytes:
* 3,731,189 - baseline
* 3,579,414 - order0 no cache
* 3,931,708 - order0 with cache
5.4% gain
Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Manage the allocation and deallocation of mapped RX pages only
through dedicated API functions.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx5e: Single flow order-0 pages for Striding RQ
To improve the memory consumption scheme, we omit the flow that
demands and splits high-order pages in Striding RQ, and stay
with a single Striding RQ flow that uses order-0 pages.
Moving to fragmented memory allows the use of larger MPWQEs,
which reduces the number of UMR posts and filler CQEs.
Moving to a single flow allows several optimizations that improve
performance, especially in production servers where we would
anyway fallback to order-0 allocations:
- inline functions that were called via function pointers.
- improve the UMR post process.
This patch alone is expected to give a slight performance reduction.
However, the new memory scheme gives the possibility to use a page-cache
of a fair size, that doesn't inflate the memory footprint, which will
dramatically fix the reduction and even give a performance gain.
Performance tests:
The following results were measured on a freshly booted system,
giving optimal baseline performance, as high-order pages are yet to
be fragmented and depleted.
We ran pktgen single-stream benchmarks, with iptables-raw-drop:
Single stride, 64 bytes:
* 4,739,057 - baseline
* 4,749,550 - this patch
no reduction
Larger packets, no page cross, 1024 bytes:
* 3,982,361 - baseline
* 3,845,682 - this patch
3.5% reduction
Larger packets, every 3rd packet crosses a page, 1500 bytes:
* 3,731,189 - baseline
* 3,579,414 - this patch
4% reduction
Fixes: 461017cb006a ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)") Fixes: bc77b240b3c5 ("net/mlx5e: Add fragmented memory support for RX multi packet WQE") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Sat, 17 Sep 2016 06:26:01 +0000 (07:26 +0100)]
rxrpc: Make IPv6 support conditional on CONFIG_IPV6
Add CONFIG_AF_RXRPC_IPV6 and make the IPv6 support code conditional on it.
This is then made conditional on CONFIG_IPV6.
Without this, the following can be seen:
net/built-in.o: In function `rxrpc_init_peer':
>> peer_object.c:(.text+0x18c3c8): undefined reference to `ip6_route_output_flags'
Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
cfg80211: add helper to find an IE that matches a byte-array
There are a few places where an IE that matches not only the EID, but
also other bytes inside the element, needs to be found. To simplify
that and reduce the amount of similar code, implement a new helper
function to match the EID and an extra array of bytes.
Additionally, simplify cfg80211_find_vendor_ie() by using the new
match function.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE
In 46fa38e84b65 ("mac80211: allow software PS-Poll/U-APSD with
AP_LINK_PS"), Johannes allowed to use mac80211's code for handling
stations that go to PS or send PS-Poll / uAPSD trigger frames for
devices that enable RSS.
This means that mac80211 doesn't look at frames anymore but rather
relies on a notification that will come from the device when a PS
transition occurs or when a PS-Poll / trigger frame is detected by
the device.
iwlwifi will need this capability but still needs mac80211 to take
care of the TIM IE. Today, if a driver sets AP_LINK_PS, mac80211
will not update the TIM IE. Change mac80211 to check existence of
the set_tim driver callback rather than using AP_LINK_PS to decide
if the driver handles the TIM IE internally or not.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
[reword commit message a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
David S. Miller [Fri, 16 Sep 2016 08:31:56 +0000 (04:31 -0400)]
Merge branch 'QCA8K'
John Crispin says:
====================
net-next: dsa: add QCA8K support
This series is based on the AR8xxx series posted by Matthieu Olivari in may
2015. The following changes were made since then
* fixed the nitpicks from the previous review
* updated to latest API
* turned it into an mdio device
* added callbacks for fdb, bridge offloading, stp, eee, port status
* fixed several minor issues to the port setup and arp learning
* changed the namespacing as this driver to qca8k
The driver has so far only been tested on qca8337/N. It should work on other QCA
switches such as the qca8327 with minor changes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Crispin [Thu, 15 Sep 2016 14:26:41 +0000 (16:26 +0200)]
net-next: dsa: add new driver for qca8xxx family
This patch contains initial support for the QCA8337 switch. It
will detect a QCA8337 switch, if present and declared in the DT.
Each port will be represented through a standalone net_device interface,
as for other DSA switches. CPU can communicate with any of the ports by
setting an IP@ on ethN interface. Most of the extra callbacks of the DSA
subsystem are already supported, such as bridge offloading, stp, fdb.
Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
John Crispin [Thu, 15 Sep 2016 14:26:40 +0000 (16:26 +0200)]
net-next: dsa: add Qualcomm tag RX/TX handler
Add support for the 2-bytes Qualcomm tag that gigabit switches such as
the QCA8337/N might insert when receiving packets, or that we need
to insert while targeting specific switch ports. The tag is inserted
directly behind the ethernet header.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
John Crispin [Thu, 15 Sep 2016 14:26:39 +0000 (16:26 +0200)]
Documentation: devicetree: add qca8k binding
Add device-tree binding for ar8xxx switch families.
Cc: devicetree@vger.kernel.org Signed-off-by: John Crispin <john@phrozen.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: b53: Remove unused including <linux/version.h>
Remove including <linux/version.h> that don't need it.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/dsa/bcm_sf2.c:963:19: warning:
symbol 'bcm_sf2_io_ops' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 9f5afeae5152 ("tcp: use an RB tree for ooo receive queue") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Yuchung Cheng <ycheng@google.com> Cc: Yaogong Wang <wygivan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 16 Sep 2016 06:23:06 +0000 (02:23 -0400)]
Merge branch 'mediatek-reset-flow'
Sean Wang says:
====================
mediatek: add enhancement into the existing reset flow
Current driver only resets DMA used by descriptor rings which
can't guarantee it can recover all various kinds of fatal
errors, so the patch
1) tries to reset the underlying hardware resource from scratch on
Mediatek SoC required for ethernet running.
2) refactors code in order to the reusability of existing code.
3) considers handling for race condition between the reset flow and
callbacks registered into core driver called about hardware accessing.
4) introduces power domain usage to hardware setup which leads to have
cleanly and completely restore to the state as the initial.
Changes since v1:
- fix the build error with module built causing undefined symbol for
pinctrl_bind_pins, so using pinctrl_select_state instead accomplishes
the pin mux setup during the reset process.
====================
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sean Wang [Wed, 14 Sep 2016 15:13:20 +0000 (23:13 +0800)]
net: ethernet: mediatek: add more resets for internal ethernet circuit block
struct mtk_eth has already contained struct regmap ethsys pointer
to the address range of the internal circuit reset, so we reuse it
to reset more internal blocks on ethernet hardware such as packet
processing engine (PPE) and frame engine (FE) instead of rstc which
deals with FE only.
Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sean Wang [Wed, 14 Sep 2016 15:13:19 +0000 (23:13 +0800)]
net: ethernet: mediatek: add the whole ethernet reset into the reset process
1) original driver only resets DMA used by descriptor rings
which can't guarantee it can recover all various kinds of fatal
errors, so the patch tries to reset the underlying hardware
resource from scratch on Mediatek SoC required for ethernet
running, including power, pin mux control, clock and internal
circuits on the ethernet in order to restore into the initial
state which the rebooted machine gives.
2) add state variable inside structure mtk_eth to help distinguish
mtk_hw_init is called between the initialization during boot time
or re-initialization during the reset process.
3) add ge_mode variable inside structure mtk_mac for restoring
the interface mode of the current setup for the target MAC.
4) remove __init attribute from mtk_hw_init definition
Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sean Wang [Wed, 14 Sep 2016 15:13:18 +0000 (23:13 +0800)]
net: ethernet: mediatek: add controlling power domain the ethernet belongs to
introduce power domain control which the digital circuit of
the ethernet belongs to inside the flow of hardware initialization
and deinitialization which helps the entire ethernet hardware block
could restart cleanly and completely as being back to the initial
state when the whole machine reboot.
Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This cleans up the error path inside mtk_hw_init call, causing it able
to exit appropriately when something fails and also includes refactoring
mtk_cleanup call to make the partial logic reusable on the error path.
Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sean Wang [Wed, 14 Sep 2016 15:13:15 +0000 (23:13 +0800)]
net: ethernet: mediatek: refactoring mtk_hw_init to be reused
the existing mtk_hw_init includes hardware and software
initialization inside so that it is slightly hard to reuse
them for the process of the reset recovery, so some splitting
is made here for keeping hardware initializing relevant thing
and the else such as IRQ registration and MDIO initialization
what are all about to the interface of core driver moved to the
other proper place because they have no needs to register IRQ and
re-initialize structure again during the reset process.
Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 16 Sep 2016 05:57:19 +0000 (01:57 -0400)]
Merge tag 'rxrpc-rewrite-20160913-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Support IPv6
Here is a set of patches that add IPv6 support. They need to be applied on
top of the just-posted miscellaneous fix patches. They are:
(1) Make autobinding of an unconnected socket work when sendmsg() is
called to initiate a client call.
(2) Don't specify the protocol when creating the client socket, but rather
take the default instead.
(3) Use rxrpc_extract_addr_from_skb() in a couple of places that were
doing the same thing manually. This allows the IPv6 address
extraction to be done in fewer places.
(4) Add IPv6 support. With this, calls can be made to IPv6 servers from
userspace AF_RXRPC programs; AFS, however, can't use IPv6 yet as the
RPC calls need to be upgradeable.
====================
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 16 Sep 2016 05:52:20 +0000 (01:52 -0400)]
Merge tag 'rxrpc-rewrite-20160913-1' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Miscellaneous fixes
Here's a set of miscellaneous fix patches. There are a couple of points of
note:
(1) There is one non-fix patch that adjusts the call ref tracking
tracepoint to make kernel API-held refs on calls more obvious. This
is a prerequisite for the patch that fixes prealloc refcounting.
(2) The final patch alters how jumbo packets that partially exceed the
receive window are handled. Previously, space was being left in the
Rx buffer for them, but this significantly hurts performance as the Rx
window can't be increased to match the OpenAFS Tx window size.
Instead, the excess subpackets are discarded and an EXCEEDS_WINDOW ACK
is generated for the first. To avoid the problem of someone trying to
run the kernel out of space by feeding the kernel a series of
overlapping maximal jumbo packets, we stop allowing jumbo packets on a
call if we encounter more than three jumbo packets with duplicate or
excessive subpackets.
====================
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
openvswitch: avoid deferred execution of recirc actions
The ovs kernel data path currently defers the execution of all
recirc actions until stack utilization is at a minimum.
This is too limiting for some packet forwarding scenarios due to
the small size of the deferred action FIFO (10 entries). For
example, broadcast traffic sent out more than 10 ports with
recirculation results in packet drops when the deferred action
FIFO becomes full, as reported here:
Since the current recursion depth is available (it is already tracked
by the exec_actions_level pcpu variable), we can use it to determine
whether to execute recirculation actions immediately (safe when
recursion depth is low) or defer execution until more stack space is
available.
With this change, the deferred action fifo size becomes a non-issue
for currently failing scenarios because it is no longer used when
there are three or fewer recursions through ovs_execute_actions().
Suggested-by: Pravin Shelar <pshelar@ovn.org> Signed-off-by: Lance Richardson <lrichard@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Thu, 15 Sep 2016 12:28:23 +0000 (15:28 +0300)]
net/sched: cls_flower: Remove an unused field from the filter key structure
Commit c3f8324188fa "net: Add full IPv6 addresses to flow_keys" added an
unused instance of struct flow_dissector_key_addrs into struct fl_flow_key,
remove it.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Hadar Hen Zion <hadarh@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Thu, 15 Sep 2016 12:28:22 +0000 (15:28 +0300)]
net/sched: cls_flower: Support masking for matching on tcp/udp ports
Add the definitions for src/dst udp/tcp port masks and use
them when setting && dumping the relevant keys.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 9ee7b683ea63 we moved the enablement of msi interrupts earlier in
alx_init_intr. If there is an error in alx_alloc_rings, __alx_open returns
with an error but msi (or msi-x) interrupts stays enabled. Add a new error
label to disable msi (or msi-x) interrupts.
Fixes: 9ee7b683ea63 ("alx: refactor msi enablement and disablement") Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This action is intended to be an upgrade from a usability perspective
from pedit (as well as operational debugability).
Compare this:
sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action pedit munge offset -14 u8 set 0x02 \
munge offset -13 u8 set 0x15 \
munge offset -12 u8 set 0x15 \
munge offset -11 u8 set 0x15 \
munge offset -10 u16 set 0x1515 \
pipe
to:
sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbmod dmac 02:15:15:15:15:15
Also try to do a MAC address swap with pedit or worse
try to debug a policy with destination mac, source mac and
etherype. Then make few rules out of those and you'll get my point.
In the future common use cases on pedit can be migrated to this action
(as an example different fields in ip v4/6, transports like tcp/udp/sctp
etc). For this first cut, this allows modifying basic ethernet header.
The most important ethernet use case at the moment is when redirecting or
mirroring packets to a remote machine. The dst mac address needs a re-write
so that it doesnt get dropped or confuse an interconnecting (learning) switch
or dropped by a target machine (which looks at the dst mac). And at times
when flipping back the packet a swap of the MAC addresses is needed.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 12 Sep 2016 21:38:43 +0000 (23:38 +0200)]
bpf: use skb_at_tc_ingress helper in tcf_bpf
We have a small skb_at_tc_ingress() helper for testing for ingress, so
make use of it. cls_bpf already uses it and so should act_bpf.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 12 Sep 2016 21:38:42 +0000 (23:38 +0200)]
bpf: drop unnecessary test in cls_bpf_classify and tcf_bpf
The skb_mac_header_was_set() test in cls_bpf's and act_bpf's fast-path is
actually unnecessary and can be removed altogether. This was added by
commit a166151cbe33 ("bpf: fix bpf helpers to use skb->mac_header relative
offsets"), which was later on improved by 3431205e0397 ("bpf: make programs
see skb->data == L2 for ingress and egress"). We're always guaranteed to
have valid mac header at the time we invoke cls_bpf_classify() or tcf_bpf().
Reason is that since 6d1ccff62780 ("net: reset mac header in dev_start_xmit()")
we do skb_reset_mac_header() in __dev_queue_xmit() before we could call
into sch_handle_egress() or any subsequent enqueue. sch_handle_ingress()
always sees a valid mac header as well (things like skb_reset_mac_len()
would badly fail otherwise). Thus, drop the unnecessary test in classifier
and action case.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Remove rcu_read_lock protection from tunnel_key_dump and use
rtnl_dereference, dump operation is protected by rtnl lock.
Also, remove rcu_read_lock from tunnel_key_release and use
rcu_dereference_protected.
Both operations are running exclusively and a writer couldn't modify
t->params while those functions are executed.
Fixes: 54d94fd89d90 ('net/sched: Introduce act_tunnel_key') Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Sep 2016 12:04:57 +0000 (13:04 +0100)]
test_bpf: fix the dummy skb after dissector changes
Commit d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan
info from skb->vlan_tci") made flow dissector look at vlan_proto
when vlan is present. Since test_bpf sets skb->vlan_tci to ~0
(including VLAN_TAG_PRESENT) we have to populate skb->vlan_proto.
Fixes false negative on test #24:
test_bpf: #24 LD_PAYLOAD_OFF jited:0 175 ret 0 != 42 FAIL (1 times)
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
mac80211: allow driver to handle packet-loss mechanism
Based on consecutive msdu failures, mac80211 triggers CQM packet-loss
mechanism. Drivers like ath10k that have its own connection monitoring
algorithm, offloaded to firmware for triggering station kickout. In case
of station kickout, driver will report low ack status by mac80211 API
(ieee80211_report_low_ack).
This flag will enable the driver to completely rely on firmware events
for station kickout and bypass mac80211 packet loss mechanism.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
No drivers implement this, relying either on the recursive
directory removal to remove their debugfs, or not having any
to start with. Remove the dead driver callback.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Wed, 14 Sep 2016 08:00:23 +0000 (10:00 +0200)]
mac80211: remove pointless chanctx NULL check
If chanctx is derived as container_of() from a non-NULL pointer,
it can't ever be NULL. Since we checked conf before, that's true
here, so remove the useless NULL check.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Wed, 14 Sep 2016 07:59:21 +0000 (09:59 +0200)]
nl80211: always check nla_put* return values
A few instances were found where we didn't check them, add the
missing checks even though they'll probably never trigger as
the message should be large enough here.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>