git.karo-electronics.de Git - linux-beck.git/log

wil6210: remove BACK RX and TX workers

WMI synchronous handling has changed and WMI calls that provide
a buffer for the reply are completed in the WMI interrupt context.
This allows sending the RX and TX BACK commands from the WMI event
handler without the need for the worker thread.
This is a better approach as it can decrease the handshake time
in the connect flow and prevent race conditions in case of fast
disconnects. An example for such a race is handling of wil_back_rx_handle
during a disconnect event, as wil_back_rx_handle is not protected by
the wil mutex and a disconnect can be handled after sta->status is
verified as connected.

Signed-off-by: Maya Erez <qca_merez@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: fix firmware assert in monitor mode

commit 166de3f1895d ("ath10k: remove supported chain mask") had revealed
an issue on monitor mode. Configuring NSS upon monitor interface
creation is causing target assert in all qca9888x and qca6174 firmware.
Firmware assert issue can be reproduced by below sequence even after
reverting commit 166de3f1895d ("ath10k: remove supported chain mask").

ip link set wlan0 down
iw wlan0 set type monitor
iw phy0 set antenna 7
ip link set wlan0 up

This issue is originally reported on qca9888 with 10.1 firmware.

Fixes: 5572a95b4b ("ath10k: apply chainmask settings to vdev on creation")
Cc: stable@vger.kernel.org
Reported-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: change htt tx desc/qcache peer limit config

The number of HTT Tx descriptors and qcache peer
limit aren't hw-specific. In fact they are
firmware specific and should not be placed in
hw_params.

The QCA4019 limits were submitted with the peer
flow control firmware only and to my understanding
there's no non-peer-flow-ctrl QCA4019 firmware.

However QCA99X0 is planned to run firmware
supporting the feature as well. Therefore this
patch enables QCA99X0 to use 2500 tx descriptors
whenever possible instead of just 1424.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: fix HTT Tx CE ring size

QCA4019 can queue up to 2500 frames at a time.
This means it requires roughly 5000 entires on the
ring to work properly. Otherwise random tx failure
may occur.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: implement push-pull tx

The current/old tx path design was that host, at
its own leisure, pushed tx frames to the device.
For HTT there was ~1000-1400 msdu queue depth.

After reaching that limit the driver would request
mac80211 to stop queues. There was little control
over what packets got in there as far as
DA/RA was considered so it was rather easy to
starve per-station traffic flows.

With MU-MIMO this became a significant problem
because the queue depth was insufficient to buffer
frames from multiple clients (which could have
different signal quality and capabilities) in an
efficient fashion.

Hence the new tx path in 10.4 was introduced: a
pull-push mode.

Firmware and host can share tx queue state via
DMA. The state is logically a 2 dimensional array
addressed via peer_id+tid pair. Each entry is a
counter (either number of bytes or packets. Host
keeps it updated and firmware uses it for
scheduling Tx pull requests to host.

This allows MU-MIMO to become a lot more effective
with 10+ clients.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: keep track of queue depth per txq

This will be necessary for later.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: store txq in skb_cb

This will be necessary for later.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: implement updating shared htt txq state

Firmware 10.4.3 onwards can support a pull-push Tx
model where it shares a Tx queue state with the
host.

The host updates the DMA region it pointed to
during HTT setup whenever number of software
queued from (on host) changes. Based on this
information firmware issues fetch requests to the
host telling the host how many frames from a list
of given stations/tids should be submitted to the
firmware.

The code won't be called because not all
appropriate HTT events are processed yet.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: implement wake_tx_queue

This implements very basic support for software
queueing. It also contains some knobs that will be
patched later.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: add new htt message generation/parsing logic

This merely adds some parsing, generation and
sanity checks with placeholders for real
code/functionality to be added later.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: add fast peer_map lookup

The pull-push functionality of 10.4 will be based
on peer_id and tid. These will need to be mapped,
eventually, to ieee80211_txq to be used with
ieee80211_tx_dequeue().

Iterating over existing stations every time
peer_id needs to be mapped to a station would be
inefficient wrt CPU time.

The new firmware, which will be the only user of
the code flow-wise, will guarantee to use low
peer_ids first so despite peer_map's apparent huge
size d-cache thrashing should not be a problem.

Older firmware hot paths will effectively not use
peer_map.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: maintain peer_id for each sta and vif

The 10.4.3 firmware with congestion control
guarantees that each peer has only a single
peer_id mapping.

The 1:1 mapping isn't the case for older firmwares
(e.g. 10.4.1, 10.2, 10.1) but it should not
matter. This 1:1 mapping is going to be only used
by future code which inherently (flow-wise) is for
10.4.3.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: refactor tx pending management

Tx pending counter logic assumed that the sk_buff
is already known and hence was performed in HTT
functions themselves.

However, for the sake of future wake_tx_queue()
usage the driver must be able to tell whether it
can submit more frames to firmware before it
dequeues frame from ieee80211_txq (and thus long
before HTT Tx functions are called) because once a
frame is dequeued it cannot be requeud back to
mac80211.

This prepares the driver for future changes.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: unify txpath decision

Some future changes will need to determine final
tx method early on. Prepare the code.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: refactor tx code

This prepares the code for future reuse with
ieee80211_txq and wake_tx_queue() in mind.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

Merge 'net-next/master'

Needed by the upcoming merge of iwlwifi-next-for-kalle-2016-03-02 tag.

Merge ath-next from ath.git

ath.git patches for 4.6. Major changes:

ath10k

* dt: add bindings for ipq4019 wifi block
* start adding support for qca4019 chip

ath9k

* add device ID for Toshiba WLM-20U2/GN-1080
* allow more than one interface on DFS channels

Merge branch 'reset_mac_header'

Zhang Shengju says:

====================
use reset to set header pointers

This patch series replace set function with reset when offset is zero.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

wireless: use reset to set mac header

Since offset is zero, it's not necessary to use set function. Reset
function is straightforward, and will remove the unnecessary add
operation in set function.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mac80211: use reset to set header pointer

Since offset is zero, it's not necessary to use set function. Reset
function is straightforward, and will remove the unnecessary add
operation in set function.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mac80211_hwsim: use reset to set mac header

Since offset is zero, it's not necessary to use set function. Reset
function is straightforward, and will remove the unnecessary add
operation in set function.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: use reset to set header pointers

Since offset is zero, it's not necessary to use set function. Reset
function is straightforward, and will remove the unnecessary add operation
in set function.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'rxrpc-rewrite-20160304' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
RxRPC: Rewrite part 1

Here's the first set of patches from my RxRPC rewrite, aimed at net-next.
These do some clean ups and bug fixes.  Most of the changes are small, but
there are a couple of bigger changes:

(*) Convert call flag numbers and event numbers into enums.  Then rename
     the event numbers to all have _EV_ in their name to stop confusion.
     Fix one instance of an event bit being used instead of a flag bit.

(*) A copy of the Rx protocol header is kept in the sk_buff private data.
     Keep this in host byte order rather than network byte order as it
     makes more sense.  A number of other fields then get converted into
     host byte order too.

     Conversion between host and network byte order is then done at the
     packet reception/generation stage.

This is based on net-next/master
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'DIV_ROUND_UP-uapi'

Nicolas Dichtel says:

====================
uapi: consolidate DIV_ROUND_UP definition

The inital goal was to consolidate ethtool.h uapi header. But I took the
opportunity to remove all duplicate definitions of DIV_ROUND_UP.

v3: add patch #2 and #3

v2: split the patch
define DIV_ROUND_UP in uapi
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ethtool.h: define INT_MAX for userland

INT_MAX needs limits.h in userland.
When ethtool.h is included by a userland app, we got the following error:

.../usr/include/linux/ethtool.h: In function 'ethtool_validate_speed':
.../usr/include/linux/ethtool.h:1471:18: error: 'INT_MAX' undeclared (first use in this function)
return speed <= INT_MAX || speed == SPEED_UNKNOWN
^

Fixes: e02564ee334a ("ethtool: make validate_speed accept all speeds between 0 and INT_MAX")
CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drm/vmwgfx: remove userland definition of DIV_ROUND_UP

Let's use __KERNEL_DIV_ROUND_UP, which is defined in uapi/linux/kernel.h.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4i: don't redefine DIV_ROUND_UP

let's use the common definition to avoid the following warning during the
compilation:

drivers/scsi/cxgbi/cxgb4i/cxgb4i.c:161:0: warning: "DIV_ROUND_UP" redefined
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
^
In file included from include/linux/list.h:8:0,
from include/linux/module.h:9,
from drivers/scsi/cxgbi/cxgb4i/cxgb4i.c:16:
include/linux/kernel.h:67:0: note: this is the location of the previous definition
#define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
^

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

uapi: define DIV_ROUND_UP for userland

DIV_ROUND_UP is defined in linux/kernel.h only for the kernel.
When ethtool.h is included by a userland app, we got the following error:

include/linux/ethtool.h:1218:8: error: variably modified 'queue_mask' at file scope
__u32 queue_mask[DIV_ROUND_UP(MAX_NUM_QUEUE, 32)];
^

Let's add a common definition in uapi and use it everywhere.

Fixes: ac2c7ad0e5d6 ("net/ethtool: introduce a new ioctl for per queue setting")
CC: Kan Liang <kan.liang@intel.com>
Suggested-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: remove Rolf Neugebauer as co-maintainer

Rolf is no longer in his previous role at Netronome and as far as I know no
longer working on the NFP driver. Thus it does not seem appropriate for him
to be a co-maintainer anymore.

Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rxrpc: Don't try to map ICMP to error as the lower layer already did that

In the ICMP message processing code, don't try to map ICMP codes to UNIX
error codes as the caller (IPv4/IPv6) already did that for us (ee_errno).

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Clear the unused part of a sockaddr_rxrpc for memcmp() use

Clear the unused part of a sockaddr_rxrpc structs so that memcmp() can be
used to compare them.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: rxkad: Casts are needed when comparing be32 values

Forced casts are needed to avoid sparse warning when directly comparing
be32 values.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: rxkad: The version number in the response should be net byte order

The version number rxkad places in the response should be network byte
order.

Whilst we're at it, rearrange the code to be more readable.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Use ACCESS_ONCE() when accessing circular buffer pointers

Use ACCESS_ONCE() when accessing the other-end pointer into a circular
buffer as it's possible the other-end pointer might change whilst we're
doing this, and if we access it twice, we might get some weird things
happening.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Adjust some whitespace and comments

Remove some excess whitespace, insert some missing spaces and adjust a
couple of comments.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Be more selective about the types of received packets we accept

Currently, received RxRPC packets outside the range 1-13 are rejected.
There are, however, holes in the range that should also be rejected - plus
at least one type we don't yet support - so reject these also.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Fix defined range for /proc/sys/net/rxrpc/rx_mtu

The upper bound of the defined range for rx_mtu is being set in the same
member as the lower bound (extra1) rather than the correct place (extra2).
I'm not entirely sure why this compiles.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: The protocol family should be set to PF_RXRPC not PF_UNIX

Fix the protocol family set in the proto_ops for rxrpc to be PF_RXRPC not
PF_UNIX.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Keep the skb private record of the Rx header in host byte order

Currently, a copy of the Rx packet header is copied into the the sk_buff
private data so that we can advance the pointer into the buffer,
potentially discarding the original. At the moment, this copy is held in
network byte order, but this means we're doing a lot of unnecessary
translations.

The reasons it was done this way are that we need the values in network
byte order occasionally and we can use the copy, slightly modified, as part
of an iov array when sending an ack or an abort packet.

However, it seems more reasonable on review that it would be better kept in
host byte order and that we make up a new header when we want to send
another packet.

To this end, rename the original header struct to rxrpc_wire_header (with
BE fields) and institute a variant called rxrpc_host_header that has host
order fields. Change the struct in the sk_buff private data into an
rxrpc_host_header and translate the values when filling it in.

This further allows us to keep values kept in various structures in host
byte order rather than network byte order and allows removal of some fields
that are byteswapped duplicates.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Rename call events to begin RXRPC_CALL_EV_

Rename call event names to begin RXRPC_CALL_EV_ to distinguish them from the
flags.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Convert call flag and event numbers into enums

Convert call flag and event numbers into enums and move their definitions
outside of the struct.

Also move the call state enum outside of the struct and add an extra
element to count the number of states.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Fix a case where a call event bit is being used as a flag bit

Fix a case where RXRPC_CALL_RELEASE (an event) is being used to specify a
flag bit. RXRPC_CALL_RELEASED should be used instead.

Signed-off-by: David Howells <dhowells@redhat.com>

net: sched: use pfifo_fast for non real queues

Some devices declare a high number of TX queues, then set a much
lower real_num_tx_queues

This cause setups using fq_codel, sfq or fq as the default qdisc to consume
more memory than really needed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv6: Fix refcnt on host routes

Andrew and Ying Huang's test robot both reported usage count problems that
trace back to the 'keep address on ifdown' patch.

>From Andrew:
We execute CRIU test on linux-next. On the current linux-next kernel
they hangs on creating a network namespace.

The kernel log contains many massages like this:
[ 1036.122108] unregister_netdevice: waiting for lo to become free.
Usage count = 2
[ 1046.165156] unregister_netdevice: waiting for lo to become free.
Usage count = 2
[ 1056.210287] unregister_netdevice: waiting for lo to become free.
Usage count = 2

I tried to revert this patch and the bug disappeared.

Here is a set of commands to reproduce this bug:

[root@linux-next-test linux-next]# uname -a
Linux linux-next-test 4.5.0-rc6-next-20160301+ #3 SMP Wed Mar 2
17:32:18 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

[root@linux-next-test ~]# unshare -n
[root@linux-next-test ~]# ip link set up dev lo
[root@linux-next-test ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
[root@linux-next-test ~]# logout
[root@linux-next-test ~]# unshare -n

-----

The problem is a change made to RTM_DELADDR case in __ipv6_ifa_notify that
was added in an early version of the offending patch and is no longer
needed.

Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Cc: Andrey Wagin <avagin@gmail.com>
Cc: Ying Huang <ying.huang@linux.intel.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ezchip: adapt driver to little endian architecture

Since ezchip network driver is written with big endian EZChip platform it
is necessary to add support for little endian architecture.

The first issue is that the order of the bits in a bit field is
implementation specific. So all the bit fields are removed.
Named constants are used to access necessary fields.

And the second one is that network byte order is big endian.
For example, data on ethernet is transmitted with most-significant
octet (byte) first. So in case of little endian architecture
it is important to swap data byte order when we read it from
register. In case of unaligned access we can use "get_unaligned_be32"
and in other case we can use function "ioread32_rep" which reads all
data from register and works either with little endian or big endian
architecture.

And then when we are going to write data to register we need to restore
byte order using the function "put_unaligned_be32" in case of
unaligned access and in other case "iowrite32_rep".

The last little fix is a space between type and pointer to observe
coding style.

Signed-off-by: Lada Trimasova <ltrimas@synopsys.com>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Tal Zilcer <talz@ezchip.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth, ravb: Use ARCH_RENESAS

Make use of ARCH_RENESAS in place of ARCH_SHMOBILE.

This is part of an ongoing process to migrate from ARCH_SHMOBILE to
ARCH_RENESAS the motivation for which being that RENESAS seems to be a more
appropriate name than SHMOBILE for the majority of Renesas ARM based SoCs.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mellanox: add DEVLINK dependencies

The new NET_DEVLINK infrastructure can be a loadable module, but the drivers
using it might be built-in, which causes link errors like:

drivers/net/built-in.o: In function `mlx4_load_one':
:(.text+0x2fbfda): undefined reference to `devlink_port_register'
:(.text+0x2fc084): undefined reference to `devlink_port_unregister'
drivers/net/built-in.o: In function `mlxsw_sx_port_remove':
:(.text+0x33a03a): undefined reference to `devlink_port_type_clear'
:(.text+0x33a04e): undefined reference to `devlink_port_unregister'

There are multiple ways to avoid this:

a) add 'depends on NET_DEVLINK || !NET_DEVLINK' dependencies
   for each user
b) use 'select NET_DEVLINK' from each driver that uses it
   and hide the symbol in Kconfig.
c) make NET_DEVLINK a 'bool' option so we don't have to
   list it as a dependency, and rely on the APIs to be
   stubbed out when it is disabled
d) use IS_REACHABLE() rather than IS_ENABLED() to check for
   NET_DEVLINK in include/net/devlink.h

This implements a variation of approach a) by adding an
intermediate symbol that drivers can depend on, and changes
the three drivers using it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 09d4d087cd48 ("mlx4: Implement devlink interface")
Fixes: c4745500e988 ("mlxsw: Implement devlink interface")
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2016-03-01

Here's our main set of Bluetooth & 802.15.4 patches for the 4.6 kernel.

- New Bluetooth HCI driver for Intel/AG6xx controllers
- New Broadcom ACPI IDs
- LED trigger support for indicating Bluetooth powered state
- Various fixes in mac802154, 6lowpan and related drivers
- New USB IDs for AR3012 Bluetooth controllers

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: relax setup_tc ndo op handle restriction

I added this check in setup_tc to multiple drivers,

if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)

Unfortunately restricting to TC_H_ROOT like this breaks the old
instantiation of mqprio to setup a hardware qdisc. This patch
relaxes the test to only check the type to make it equivalent
to the check before I broke it. With this the old instantiation
continues to work.

A good smoke test is to setup mqprio with,

# tc qdisc add dev eth4 root mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 0@0 1@1 2@2 3@3 4@4 5@5 6@6 7@7

Fixes: e4c6734eaab9 ("net: rework ndo tc op to consume additional qdisc handle paramete")
Reported-by: Singh Krishneil <krishneil.k.singh@intel.com>
Reported-by: Jake Keller <jacob.e.keller@intel.com>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ath9k: clear bb filter calibration power threshold

JP WiFi certification for bandwidth of channel 14 failed, the OBW
is lower than the requirement. Clear the bb filter calibration power
threshold to increase OBW(+2). The fix only for qca9531 chip now.

Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath9k: reduce stack usage in ar9003_aic_cal_post_process

In some configurations, this function uses more than the warning limit
of 1024 bytes:

drivers/net/wireless/ath/ath9k/ar9003_aic.c: In function 'ar9003_aic_cal_post_process':
drivers/net/wireless/ath/ath9k/ar9003_aic.c:434:1: error: the frame size of 1040 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

It turns out that there are two large arrays on the stack here, but
almost all the data in them is never used outside of the loop in
which it gets written, so we can replace the array with a single
instance.

The .valid flag is used later, so I'm replacing the array of structures
with an array of bools. An obvious follow-up optimization would be
to replace it with a bitmask and set_bit()/find_first_bit()/
find_last_bit()/... operations. However, I have not tested this patch,
so I sticked to the simpler transformation that does the job of
reducing the stack usage to a harmless level.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath9k: make NF load complete quickly and reliably

Make NF load complete quickly and reliably. NF load execution
is delayed by HW to end of frame if frame Rx or Tx is ongoing.
Increasing timeout to max frame duration. If NF cal is ongoing
before NF load, stop it before load, and restart it afterwards.

Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: fix sanity check on enabling btcoex via debugfs

First check for the device state before enabling / disabling
btcoex, also return a proper error value. Enabling / disabling
btcoex ideally does a f/w + ath10k_core_restart so the checks
that are applicable for 'simulate_fw_crash' shall be applicable
for this as well

Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: reduce number of peers to support peer stats feature

To enable per peer stats feature we are reducing the number of peers.
Firmware has introduced tx stats feature. We have memory limitation in
firmware to add these additional bytes.

These are the new variables introduced in the firmware.
======== =======================
Variable Bytes required/per rate
======== =======================
TX success packets 1
TX failed packets 1
Retry packets 1
Success bytes 2
TX failed bytes 2
Retry bytes 2
Tx duration 4
Rate 1
Bw and AMPDU flags 1
Total 16 (because of allocation in word pattern)

Firmware sends these tx_stats in pktlog.
If we consider 4 feedbacks at a time, Frimware need about ~1K memory for coding
and 8192 bytes required / per rate [ 4*16*128(peers)].
To accommodate this firmware needs to reduce 10 peers.

This fixes a firmware crash with firmware-5.bin_10.2.4.70.22-2.

Signed-off-by: Anilkumar Kolli <akolli@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: process htt rx indication as batch mode

On multicore systems, it is possible that txrx tasket can run
in parallel with pci tasklet (i.e smp affinity of ath10k irq is
assigned to multiple CPUs). Feeding and consuming from the same
rx completion list leads to txrx tasklet runs for longer period.
Prevent this by processing a snapshot of rx queue by moving list
into temporary list. Consecutive received frames will be processed
in next batch.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: reduce rx_lock contention for htt rx indication

Received frame indications are queued into a skb list and latest
processed by txrx tasklet. This skb queue is protected by htt rx lock.
Since the entire rx processing till delivering frame to mac80211 and
replenish tasks are processed under rx_lock protection, there might be
some delay in queuing newly received rx frame into that list on
multicore systems. Optimize this by using skb list lock while accessing
rx completion queue instead of htt rx lock.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: fix erroneous return value

The ath10k_pci_hif_exchange_bmi_msg() function may return the positive
value EIO instead of -EIO in case of error.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: add hw_rev to trace events to support pktlog

pktlog data is different between firmware variants (eg. 10.2 vs 10.4). To
have a unified user space script to decode pktlog trace events generated,
it is desirable to know which firmware variant has provided the events and
thereby decode the pktlogs appropriately. Hardware revision (hw_rev) helps
to determine the firmware variant sending these trace events. So add hw_rev
to trace events.

Signed-off-by: Ashok Raj Nagarajan <arnagara@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: fix pktlog in QCA99X0

Currently, we are providing wrong payload data of pktlog to trace points.
Data we receive from FW through copy engine 8 contains pktlog data alone.
We don't need to parse anything in driver before handing it to trace
points.

Fixes: afb0bf7f530b ("ath10k: add support for pktlog in QCA99X0")
Signed-off-by: Ashok Raj Nagarajan <arnagara@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ath10k: fix pointless update of peer stats list

We periodically receive f/w stats event for updating
the rx duration and there is no reason to keep on appending
the f/w stats peer list, as this gets completely cleaned up when
the user polls for f/w stats {pdev, vdev, peer stats}. Only don't
print the warning message in the case PEER_STATS service is enabled

Fixes: 856e7c3 ("ath10k: add debugfs support for Per STA total rx duration")
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

ethernet/atl1c: remove left over dead code

Left over from c24588afc536a35c924d014f13b669b20ccf8553
("atl1c: using fixed TXQ configuration for l2cb and l1c")

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/ipv4: remove left over dead code

8cc785f6f429c2a3fb81745dc142cbd72a462c4a ("net: ipv4: make the ping
/proc code AF-independent") removed the code using it, but renamed this
variable instead of removing it.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/rtnetlink: remove dead code

3b766cd832328fcb87db3507e7b98cf42f21689d ("net/core: Add reading VF
statistics through the PF netdevice") added that variable but it's never
been used.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'cxgb4-next'

Hariprasad Shenai says:

====================
cxgb4/cxgb4vf: Cleanup and minor fixes

This series sets FBMIN to 64 bytes for Chelsio's T6 series of adapters,
check to replenish fl is revised, some code cleanup in cxgb4vf sge
initialization code and removes dead code.

This patch series has been created against net-next tree and includes
patches on cxgb4 and cxgb4vf driver.

We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.
====================

cxgb4vf: Remove dead functions collect_netdev_[um]c_list_addrs

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4vf: Remove redundant adapter ready check during probe

Function t4vf_wait_dev_ready() is already called in t4vf_prep_adapter(),
no need to call it again in adap_init0().

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4vf: Make sge init code more readable

Adds a new function t4vf_fl_pkt_align() and use the same in SGE
initialization code to find out freelist packet alignment

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4/cxgb4vf: For T6 adapter, set FBMIN to 64 bytes

T4 and T5 hardware will not coalesce Free List PCI-E Fetch Requests if
the Host Driver provides more Free List Pointers than the Fetch Burst
Minimum value. So if we set FBMIN to 64 bytes and the Host Driver
supplies 128 bytes of Free List Pointer data, the hardware will issue two
64-byte PCI-E Fetch Requests rather than a single coallesced 128-byte
Fetch Request. T6 fixes this. So, for T4/T5 we set the FBMIN value to 128

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4/cxgb4vf: Use fl capacity to check if fl needs to be replenished

Use freelist capacity instead of freelist size while checking, if
freelist needs to be refilled

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'stmmac-optimizations'

Alexandre TORGUE says:

====================
stmmac: enhance driver performances and update the version

According to Giuseppe, I send the v3 series.

This is a subset of patches to rework the driver in order to improve its
performances and make it more robust under stress conditions.

All patches have been ported on STi mainstream kernel branch and
tested on ARM STiH4xx platforms and newer ones.

This series also updates the driver version and prepares it
to include further development to support new chips.

In detail, these patches are:

o to rework and improve the internal DMA bus settings

  Fine tuning is mandatory on some platforms for both
  performance and stability issues.

o to rework and optimize the descriptor management.

  This will help a lot on performance side and preparing
  the inclusion on the GMAC4.x.

o to add a set of optimizations for both xmit and rx functions.

  These will help a lot on performance side and making the driver
  more robust in case of low memory conditions and under some
  stress test, performed for example on IP-STB.

Below some throughput figures obtained on some boxes before and after
the patches.

                       nuttcp (mbps)       iperf (Mbps)
------------------------------------------------------------------
                      tcp     udp          tcp      udp
                   tx   rx   tx  rx      tx   rx   tx  rx
                    ------------------------------------------
   old             680   800 480  506    760  800   600  700
   new             830   880 540  630    840  880   700   800

V2: - rx_copybreak is now managed by using ethtool.
V3: - improve comments on PCIe detailing that there are no regressions
    - rework some APIs to properly define some params as bool as expected
    - rework the formula to get the element inside the ring. Comparing V2,
patches 4 and 13 have been merged because the same formula have been
used. After this rework, no evident benefit has been noticed in terms
of performances so the table above is still valid. Disassembling the
code for SH4 and ARM, with the new formula just an instr is saved
(depending on compiler flags) and this gives us not so relevanti gain,
for example, on SH4 where some instr are executed in the same pipeline
stage.
Ring sizes are now fixed and maybe they can be reworked to be tuned
w/o using stmmaceth= cmdline option. Indeed, nobody change these sizes
and indeed the numbers selected by default respect the budget and
avoid to pass invalid setup. These are the best driver default sizes
for ring and chain.

====================

stmmac: update version to Oct_2015

This patch just updates the driver to the version fully
tested on STi platforms. This version is Oct_2015.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: tune rx copy via threshold.

There is a threshold now used to also limit the skb allocation
when use zero-copy. This is to avoid that there are incoherence
in the ring due to a failure on skb allocation under very
aggressive testing and under low memory conditions.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: do not perform zero-copy for rx frames

This patch is to allow this driver to copy tiny frames during the reception
process. This is giving more stability while stressing the driver on STi
embedded systems.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: fix phy init when attached to a phy

phy_bus_name can be NULL when "fixed-link" property isn't used.
Then, since "stmmac: do not poll phy handler when attach a switch",
phy_bus_name ptr needs to be checked before strcmp is called.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: do not poll phy handler when attach a switch

This patch avoids to call the stmmac_adjust_link when
the driver is connected to a switch by using the FIXED_PHY
support. Prior this patch the phydev->irq was set as PHY_POLL
so periodically the phy handler was invoked spending useless
time because the link cannot actually change.
Note that the stmmac_adjust_link will be called just one
time and this guarantees that the ST glue logic will be
setup according to the mode and speed fixed.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: first frame prep at the end of xmit routine

This patch is to fill the first descriptor just before granting
the DMA engine so at the end of the xmit.
The patch takes care about the algorithm adopted to mitigate the
interrupts, then it fixes the last segment in case of no fragments.
Moreover, this new implementation does not pass any "ter" field when
prepare the descriptors because this is not necessary.
The patch also details the memory barrier in the xmit.

As final results, this patch guarantees the same performances
but fixing a case if small datagram are sent. In fact, this
kind of test is impacted if no coalesce is done.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: set dirty index out of the loop

The dirty index can be updated out of the loop where all the
tx resources are claimed. This will help on performances too.
Also a useless debug printk has been removed from the main loop.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: optimize tx clean function

This patch "inline" get_tx_owner and get_ls routines. It Results in a
unique read to tdes0, instead of three, to check TX_OWN and LS bits,
and other status bits.

It helps improve driver TX path by removing two uncached read/writes
inside TX clean loop for enhanced descriptors but not for normal ones
because the des1 must be read in any case.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: optimize tx desc management

This patch is to optimize the way to manage the TDES inside the
xmit function. When prepare the frame, some settings (e.g. OWN
bit) can be merged. This has been reworked to improve the tx
performances.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: merge get_rx_owner into rx_status routine.

The RDES0 register can be read several times while doing RX of a
packet.
This patch slightly improves RX path performance by reading rdes0
once for two operation: check rx owner, get rx status bits.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: add is_jumbo field to dma data

Optimize tx_clean by avoiding a des3 read in stmmac_clean_desc3().

In ring mode, TX, des3 seems only used when xmit a jumbo frame.
In case of normal descriptors, it may also be used for time
stamping.
Clean it in the above two case, without reading it.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: add last_segment field to dma data

last_segment field is read twice from dma descriptors in stmmac_clean().
Add last_segment to dma data so that this flag is from priv
structure in cache instead of memory.
It avoids reading twice from memory for each loop in stmmac_clean().

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: add length field to dma data

Currently, the code pulls out the length field when
unmapping a buffer directly from the descriptor. This will result
in an uncached read to a dma_alloc_coherent() region. There is no
need to do this, so this patch simply puts the value directly into
a data structure which will hit the cache.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: review RX/TX ring management

This patch is to rework the ring management now optimized.
The indexes into the ring buffer are always incremented, and
the entry is accessed via doing a modulo to find the "real"
position in the ring.
It is inefficient, modulo is an expensive operation.

The formula [(entry + 1) & (size - 1)] is now adopted on
a ring that is power-of-2 in size.
Then, the number of elements cannot be set by command line but
it is fixed.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: change descriptor layout

This patch completely changes the descriptor layout to improve
the whole performances due to the single read usage of the
descriptors in critical paths.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: rework DMA bus setting and introduce new platform AXI structure

This patch restructures the DMA bus settings and this is done
by introducing a new platform structure used for programming
the AXI Bus Mode Register inside the DMA module.
This structure can be populated from device-tree as documented in the
binding txt file.

After initializing the DMA, the AXI register can be optionally tuned
for platform drivers based.
This patch also reworks some parameters to make coherent the DMA
configuration now that AXI register is introduced.
For example, the burst_len is managed by using the mentioned axi
support above; so the snps,burst-len parameter has been removed.
It makes sense to provide the AAL parameter from DT to Address-Aligned
Beats inside the Register0 and review the PBL settings when initialize
the engine.

For PCI glue, rebuilding the story of this setting, it
was added to align a configuration so not for fixing some
known problem. No issue raised after this patch.
It is safe to use the default burst length instead of
tuning it to the maximum value

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: share reset function between dwmac100 and dwmac1000

This patch is to share the same reset procedure between dwmac100 and
dwmac1000 chips.
This will also help on enhancing the driver and support new chips.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'rds-support-FRMR-and-cleanups'

Santosh Shilimkar says:

====================
RDS: Major clean-up with couple of new features for 4.6

v3:
Re-generated the same series by omitting "-D" option from git format-patch
command. Since first patch has file removals, git apply/am can't deal
with it when formated with '-D' option.

v2:
Dropped module parameter from [PATCH 11/13] as suggested by David Miller

Series is generated against net-next but also applies against Linus's tip
cleanly. Entire patchset is available at below git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux.git for_4.6/net-next/rds_v2

The diff-stat looks bit scary since almost ~4K lines of code is
getting removed. Brief summary of the series:

- Drop the stale iWARP support:
RDS iWarp support code has become stale and non testable for
sometime. As discussed and agreed earlier on list, am dropping
its support for good. If new iWarp user(s) shows up in future,
the plan is to adapt existing IB RDMA with special sink case.
- RDS gets SO_TIMESTAMP support
- Long due RDS maintainer entry gets updated
- Some RDS IB code refactoring towards new FastReg Memory registration (FRMR)
- Lastly the initial support for FRMR

RDS IB RDMA performance with FRMR is not yet as good as FMR and I do have
some patches in progress to address that. But they are not ready for 4.6
so I left them out of this series.

Also am keeping eye on new CQ API adaptations like other ULPs doing and
will try to adapt RDS for the same most likely in 4.7+ timeframe.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: IB: Support Fastreg MR (FRMR) memory registration mode

Fastreg MR(FRMR) is another method with which one can
register memory to HCA. Some of the newer HCAs supports only fastreg
mr mode, so we need to add support for it to have RDS functional
on them.

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: IB: allocate extra space on queues for FRMR support

Fastreg MR(FRMR) memory registration and invalidation makes use
of work request and completion queues for its operation. Patch
allocates extra queue space towards these operation(s).

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: IB: add Fastreg MR (FRMR) detection support

Discovere Fast Memmory Registration support using IB device
IB_DEVICE_MEM_MGT_EXTENSIONS. Certain HCA might support just FRMR
or FMR or both FMR and FRWR. In case both mr type are supported,
default FMR is used.

Default MR is still kept as FMR against what everyone else
is following. Default will be changed to FRMR once the
RDS performance with FRMR is comparable with FMR. The
work is in progress for the same.

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: IB: add mr reused stats

Add MR reuse statistics to RDS IB transport.

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: IB: handle the RDMA CM time wait event

Drop the RDS connection on RDMA_CM_EVENT_TIMEWAIT_EXIT so that
it can reconnect and resume.

While testing fastreg, this error happened in couple of tests but
was getting un-noticed.

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: IB: add connection info to ibmr

Preperatory patch for FRMR support. From connection info,
we can retrieve cm_id which contains qp handled needed for
work request posting.

We also need to drop the RDS connection on QP error states
where connection handle becomes useful.

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: IB: move FMR code to its own file

No functional change.

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: IB: create struct rds_ib_fmr

Keep fmr related filed in its own struct. Fastreg MR structure
will be added to the union.

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: IB: Re-organise ibmr code

No functional changes. This is in preperation towards adding
fastreg memory resgitration support.

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: IB: Remove the RDS_IB_SEND_OP dependency

This helps to combine asynchronous fastreg MR completion handler
with send completion handler.

No functional change.

Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

MAINTAINERS: update RDS entry

Acked-by: Chien Yen <chien.yen@oracle.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: Add support for SO_TIMESTAMP for incoming messages

The SO_TIMESTAMP generates time stamp for each incoming RDS messages
User app can enable it by using SO_TIMESTAMP setsocketopt() at
SOL_SOCKET level. CMSG data of cmsg type SO_TIMESTAMP contains the
time stamp in struct timeval format.

Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>