git.karo-electronics.de Git - linux-beck.git/log

tcp: track data delivery rate for a TCP connection

This patch generates data delivery rate (throughput) samples on a
per-ACK basis. These rate samples can be used by congestion control
modules, and specifically will be used by TCP BBR in later patches in
this series.

Key state:

tp->delivered: Tracks the total number of data packets (original or not)
       delivered so far. This is an already-existing field.

tp->delivered_mstamp: the last time tp->delivered was updated.

Algorithm:

A rate sample is calculated as (d1 - d0)/(t1 - t0) on a per-ACK basis:

  d1: the current tp->delivered after processing the ACK
  t1: the current time after processing the ACK

  d0: the prior tp->delivered when the acked skb was transmitted
  t0: the prior tp->delivered_mstamp when the acked skb was transmitted

When an skb is transmitted, we snapshot d0 and t0 in its control
block in tcp_rate_skb_sent().

When an ACK arrives, it may SACK and ACK some skbs. For each SACKed
or ACKed skb, tcp_rate_skb_delivered() updates the rate_sample struct
to reflect the latest (d0, t0).

Finally, tcp_rate_gen() generates a rate sample by storing
(d1 - d0) in rs->delivered and (t1 - t0) in rs->interval_us.

One caveat: if an skb was sent with no packets in flight, then
tp->delivered_mstamp may be either invalid (if the connection is
starting) or outdated (if the connection was idle). In that case,
we'll re-stamp tp->delivered_mstamp.

At first glance it seems t0 should always be the time when an skb was
transmitted, but actually this could over-estimate the rate due to
phase mismatch between transmit and ACK events. To track the delivery
rate, we ensure that if packets are in flight then t0 and and t1 are
times at which packets were marked delivered.

If the initial and final RTTs are different then one may be corrupted
by some sort of noise. The noise we see most often is sending gaps
caused by delayed, compressed, or stretched acks. This either affects
both RTTs equally or artificially reduces the final RTT. We approach
this by recording the info we need to compute the initial RTT
(duration of the "send phase" of the window) when we recorded the
associated inflight. Then, for a filter to avoid bandwidth
overestimates, we generalize the per-sample bandwidth computation
from:

    bw = delivered / ack_phase_rtt

to the following:

    bw = delivered / max(send_phase_rtt, ack_phase_rtt)

In large-scale experiments, this filtering approach incorporating
send_phase_rtt is effective at avoiding bandwidth overestimates due to
ACK compression or stretched ACKs.

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: count packets marked lost for a TCP connection

Count the number of packets that a TCP connection marks lost.

Congestion control modules can use this loss rate information for more
intelligent decisions about how fast to send.

Specifically, this is used in TCP BBR policer detection. BBR uses a
high packet loss rate as one signal in its policer detection and
policer bandwidth estimation algorithm.

The BBR policer detection algorithm cannot simply track retransmits,
because a retransmit can be (and often is) an indicator of packets
lost long, long ago. This is particularly true in a long CA_Loss
period that repairs the initial massive losses when a policer kicks
in.

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: switch back to proper tcp_skb_cb size check in tcp_init()

Revert to the tcp_skb_cb size check that tcp_init() had before commit
b4772ef879a8 ("net: use common macro for assering skb->cb[] available
size in protocol families"). As related commit 744d5a3e9fe2 ("net:
move skb->dropcount to skb->cb[]") explains, the
sock_skb_cb_check_size() mechanism was added to ensure that there is
space for dropcount, "for protocol families using it". But TCP is not
a protocol using dropcount, so tcp_init() doesn't need to provision
space for dropcount in the skb->cb[], and thus we can revert to the
older form of the tcp_skb_cb size check. Doing so allows TCP to use 4
more bytes of the skb->cb[] space.

Fixes: b4772ef879a8 ("net: use common macro for assering skb->cb[] available size in protocol families")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net_sched: sch_fq: add low_rate_threshold parameter

This commit adds to the fq module a low_rate_threshold parameter to
insert a delay after all packets if the socket requests a pacing rate
below the threshold.

This helps achieve more precise control of the sending rate with
low-rate paths, especially policers. The basic issue is that if a
congestion control module detects a policer at a certain rate, it may
want fq to be able to shape to that policed rate. That way the sender
can avoid policer drops by having the packets arrive at the policer at
or just under the policed rate.

The default threshold of 550Kbps was chosen analytically so that for
policers or links at 500Kbps or 512Kbps fq would very likely invoke
this mechanism, even if the pacing rate was briefly slightly above the
available bandwidth. This value was then empirically validated with
two years of production testing on YouTube video servers.

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: use windowed min filter library for TCP min_rtt estimation

Refactor the TCP min_rtt code to reuse the new win_minmax library in
lib/win_minmax.c to simplify the TCP code.

This is a pure refactor: the functionality is exactly the same. We
just moved the windowed min code to make TCP easier to read and
maintain, and to allow other parts of the kernel to use the windowed
min/max filter code.

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

lib/win_minmax: windowed min or max estimator

This commit introduces a generic library to estimate either the min or
max value of a time-varying variable over a recent time window. This
is code originally from Kathleen Nichols. The current form of the code
is from Van Jacobson.

A single struct minmax_sample will track the estimated windowed-max
value of the series if you call minmax_running_max() or the estimated
windowed-min value of the series if you call minmax_running_min().

Nearly equivalent code is already in place for minimum RTT estimation
in the TCP stack. This commit extracts that code and generalizes it to
handle both min and max. Moving the code here reduces the footprint
and complexity of the TCP code base and makes the filter generally
available for other parts of the codebase, including an upcoming TCP
congestion control module.

This library works well for time series where the measurements are
smoothly increasing or decreasing.

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: cdg: rename struct minmax in tcp_cdg.c to avoid a naming conflict

The upcoming change "lib/win_minmax: windowed min or max estimator"
introduces a struct called minmax, which is then included in
include/linux/tcp.h in the upcoming change "tcp: use windowed min
filter library for TCP min_rtt estimation". This would create a
compilation error for tcp_cdg.c, which defines its own minmax
struct. To avoid this naming conflict (and potentially others in the
future), this commit renames the version used in tcp_cdg.c to
cdg_minmax.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Acked-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: mediatek: enhance with avoiding superfluous assignment inside mtk_get_ethtool_stats

data_src is unchanged inside the loop, so this patch moves
the assignment to outside the loop to avoid unnecessarily
assignment

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mv88e6xxx: handle multiple ports in ATU

An address can be loaded in the ATU with multiple ports, for instance
when adding multiple ports to a Multicast group with "bridge mdb".

The current code doesn't allow that. Add an helper to get a single entry
from the ATU, then set or clear the requested port, before loading the
entry back in the ATU.

Note that the required _mv88e6xxx_atu_getnext function is defined below
mv88e6xxx_port_db_load_purge, so forward-declare it for the moment. The
ATU code will be isolated in future patches.

Fixes: 83dabd1fa84c ("net: dsa: mv88e6xxx: make switchdev DB ops generic")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net sched actions: fix GETing actions

With the batch changes that translated transient actions into
a temporary list lost in the translation was the fact that
tcf_action_destroy() will eventually delete the action from
the permanent location if the refcount is zero.

Example of what broke:
...add a gact action to drop
sudo $TC actions add action drop index 10
...now retrieve it, looks good
sudo $TC actions get action gact index 10
...retrieve it again and find it is gone!
sudo $TC actions get action gact index 10

Fixes: 22dc13c837c3 ("net_sched: convert tcf_exts from list to pointer array"),
Fixes: 824a7e8863b3 ("net_sched: remove an unnecessary list_del()")
Fixes: f07fed82ad79 ("net_sched: remove the leftover cleanup_a()")
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bpf-direct-packet-access-improvements'

Daniel Borkmann says:

====================
BPF direct packet access improvements

This set adds write support to the currently available read support
for {cls,act}_bpf programs. First one is a fix for affected commit
sitting in net-next and prerequisite for the second one, last patch
adds a number of test cases against the verifier. For details, please
see individual patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: add test cases for direct packet access

Add couple of test cases for direct write and the negative size issue, and
also adjust the direct packet access test4 since it asserts that writes are
not possible, but since we've just added support for writes, we need to
invert the verdict to ACCEPT, of course. Summary: 133 PASSED, 0 FAILED.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: direct packet write and access for helpers for clsact progs

This work implements direct packet access for helpers and direct packet
write in a similar fashion as already available for XDP types via commits
4acf6c0b84c9 ("bpf: enable direct packet data write for xdp progs") and
6841de8b0d03 ("bpf: allow helpers access the packet directly"), and as a
complementary feature to the already available direct packet read for tc
(cls/act) programs.

For enabling this, we need to introduce two helpers, bpf_skb_pull_data()
and bpf_csum_update(). The first is generally needed for both, read and
write, because they would otherwise only be limited to the current linear
skb head. Usually, when the data_end test fails, programs just bail out,
or, in the direct read case, use bpf_skb_load_bytes() as an alternative
to overcome this limitation. If such data sits in non-linear parts, we
can just pull them in once with the new helper, retest and eventually
access them.

At the same time, this also makes sure the skb is uncloned, which is, of
course, a necessary condition for direct write. As this needs to be an
invariant for the write part only, the verifier detects writes and adds
a prologue that is calling bpf_skb_pull_data() to effectively unclone the
skb from the very beginning in case it is indeed cloned. The heuristic
makes use of a similar trick that was done in 233577a22089 ("net: filter:
constify detection of pkt_type_offset"). This comes at zero cost for other
programs that do not use the direct write feature. Should a program use
this feature only sparsely and has read access for the most parts with,
for example, drop return codes, then such write action can be delegated
to a tail called program for mitigating this cost of potential uncloning
to a late point in time where it would have been paid similarly with the
bpf_skb_store_bytes() as well. Advantage of direct write is that the
writes are inlined whereas the helper cannot make any length assumptions
and thus needs to generate a call to memcpy() also for small sizes, as well
as cost of helper call itself with sanity checks are avoided. Plus, when
direct read is already used, we don't need to cache or perform rechecks
on the data boundaries (due to verifier invalidating previous checks for
helpers that change skb->data), so more complex programs using rewrites
can benefit from switching to direct read plus write.

For direct packet access to helpers, we save the otherwise needed copy into
a temp struct sitting on stack memory when use-case allows. Both facilities
are enabled via may_access_direct_pkt_data() in verifier. For now, we limit
this to map helpers and csum_diff, and can successively enable other helpers
where we find it makes sense. Helpers that definitely cannot be allowed for
this are those part of bpf_helper_changes_skb_data() since they can change
underlying data, and those that write into memory as this could happen for
packet typed args when still cloned. bpf_csum_update() helper accommodates
for the fact that we need to fixup checksum_complete when using direct write
instead of bpf_skb_store_bytes(), meaning the programs can use available
helpers like bpf_csum_diff(), and implement csum_add(), csum_sub(),
csum_block_add(), csum_block_sub() equivalents in eBPF together with the
new helper. A usage example will be provided for iproute2's examples/bpf/
directory.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf, verifier: enforce larger zero range for pkt on overloading stack buffs

Current contract for the following two helper argument types is:

  * ARG_CONST_STACK_SIZE: passed argument pair must be (ptr, >0).
  * ARG_CONST_STACK_SIZE_OR_ZERO: passed argument pair can be either
    (NULL, 0) or (ptr, >0).

With 6841de8b0d03 ("bpf: allow helpers access the packet directly"), we can
pass also raw packet data to helpers, so depending on the argument type
being PTR_TO_PACKET, we now either assert memory via check_packet_access()
or check_stack_boundary(). As a result, the tests in check_packet_access()
currently allow more than intended with regards to reg->imm.

Back in 969bf05eb3ce ("bpf: direct packet access"), check_packet_access()
was fine to ignore size argument since in check_mem_access() size was
bpf_size_to_bytes() derived and prior to the call to check_packet_access()
guaranteed to be larger than zero.

However, for the above two argument types, it currently means, we can have
a <= 0 size and thus breaking current guarantees for helpers. Enforce a
check for size <= 0 and bail out if so.

check_stack_boundary() doesn't have such an issue since it already tests
for access_size <= 0 and bails out, resp. access_size == 0 in case of NULL
pointer passed when allowed.

Fixes: 6841de8b0d03 ("bpf: allow helpers access the packet directly")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipvlan: Fix dependency issue

kbuild-build-bot reported that if NETFILTER is not selected, the
build fails pointing to netfilter symbols.

Fixes: 4fbae7d83c98 ("ipvlan: Introduce l3s mode")
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: avoid resetting flow key while installing new flow.

since commit commit db74a3335e0f6 ("openvswitch: use percpu
flow stats") flow alloc resets flow-key. So there is no need
to reset the flow-key again if OVS is using newly allocated
flow-key.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: Fix Frame-size larger than 1024 bytes warning.

There is no need to declare separate key on stack,
we can just use sw_flow->key to store the key directly.

This commit fixes following warning:

net/openvswitch/datapath.c: In function ‘ovs_flow_cmd_new’:
net/openvswitch/datapath.c:1080:1: warning: the frame size of 1040 bytes
is larger than 1024 bytes [-Wframe-larger-than=]

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2016-09-19

Here's the main bluetooth-next pull request for the 4.9 kernel.

- Added new messages for monitor sockets for better mgmt tracing
- Added local name and appearance support in scan response
- Added new Qualcomm WCNSS SMD based HCI driver
- Minor fixes & cleanup to 802.15.4 code
- New USB ID to btusb driver
- Added Marvell support to HCI UART driver
- Add combined LED trigger for controller power
- Other minor fixes here and there

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

6pack: fix buffer length mishandling

Dmitry Vyukov wrote:
> different runs). Looking at code, the following looks suspicious -- we
> limit copy by 512 bytes, but use the original count which can be
> larger than 512:
>
> static void sixpack_receive_buf(struct tty_struct *tty,
>     const unsigned char *cp, char *fp, int count)
> {
>     unsigned char buf[512];
>     ....
>     memcpy(buf, cp, count < sizeof(buf) ? count : sizeof(buf));
>     ....
>     sixpack_decode(sp, buf, count1);

With the sane tty locking we now have I believe the following is safe as
we consume the bytes and move them into the decoded buffer before
returning.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: add missed recycle opportunity for XDP_TX on TX failure

Correct drop handling for XDP_TX on TX failure, were recently added in
commit 95357907ae73 ("mlx4: fix XDP_TX is acting like XDP_PASS on TX
ring full").

The change missed an opportunity for recycling the RX page, instead of
going through the page allocator, like the regular XDP_DROP action does.
This patch cease the opportunity, by going through the XDP_DROP case.

Fixes: 95357907ae73 ("mlx4: fix XDP_TX is acting like XDP_PASS on TX ring full")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dsa-set_addr-optional'

John Crispin says:

====================
net-next: dsa: set_addr should be optional

The Marvell driver is the only one that actually sets the switches HW
address. All other drivers have an empty stub. fix this by making the
callback optional.
====================

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net-next: dsa: qca8k: remove empty set_addr() stub

The set_addr() callback is now optional. Remove the empty stub that qca8k
has.

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net-next: dsa: b53: remove empty set_addr() stub

The set_addr() callback is now optional. Remove the empty stub that b53
has.

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net-next: dsa: make the set_addr() operation optional

Only 1 of the 3 drivers currently has a set_addr() operation. Make the
set_addr() callback optional to reduce the amount of empty stubs inside
the drivers.

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net-next: dsa: fix duplicate invocation of set_addr()

commit 83c0afaec7b730b ("net: dsa: Add new binding implementation")
has a duplicate invocation of the set_addr() operation callback. Remove one
of them.

Signed-off-by: John Crispin <john@phrozen.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'rhashtable-dups'

Herbert Xu says:

====================
rhashtable: rhashtable with duplicate objects

v3 fixes a bug in the remove path that causes the element count
to decrease when it shouldn't, leading to a gigantic hash table
when it underflows.

v2 contains a reworked insertion slowpath to ensure that the
spinlock for the table we're inserting into is taken.

This series contains two patches.  The first adds the rhlist
interface and the second converts mac80211 to use it.  If this
works out I'll then proceed to convert the other insecure_elasticity
users over to this.

I've tested the rhlist code with test_rhashtable but I haven't
tested the mac80211 conversion.  So please give it a go and see
if it still works.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

mac80211: Use rhltable instead of rhashtable

mac80211 currently uses rhashtable with insecure_elasticity set
to true. The latter is because of duplicate objects. What's
more, mac80211 walks the rhashtable chains by hand which is broken
as rhashtable may contain multiple tables due to resizing or
rehashing.

This patch fixes it by converting it to the newly added rhltable
interface which is designed for use with duplicate objects.

With rhltable a lookup returns a list of objects instead of a
single one. This is then fed into the existing for_each_sta_info
macro.

This patch also deletes the sta_addr_hash function since rhashtable
defaults to jhash.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Add rhlist interface

The insecure_elasticity setting is an ugly wart brought out by
users who need to insert duplicate objects (that is, distinct
objects with identical keys) into the same table.

In fact, those users have a much bigger problem.  Once those
duplicate objects are inserted, they don't have an interface to
find them (unless you count the walker interface which walks
over the entire table).

Some users have resorted to doing a manual walk over the hash
table which is of course broken because they don't handle the
potential existence of multiple hash tables.  The result is that
they will break sporadically when they encounter a hash table
resize/rehash.

This patch provides a way out for those users, at the expense
of an extra pointer per object.  Essentially each object is now
a list of objects carrying the same key.  The hash table will
only see the lists so nothing changes as far as rhashtable is
concerned.

To use this new interface, you need to insert a struct rhlist_head
into your objects instead of struct rhash_head.  While the hash
table is unchanged, for type-safety you'll need to use struct
rhltable instead of struct rhashtable.  All the existing interfaces
have been duplicated for rhlist, including the hash table walker.

One missing feature is nulls marking because AFAIK the only potential
user of it does not need duplicate objects.  Should anyone need
this it shouldn't be too hard to add.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

xen-netfront: avoid packet loss when ethernet header crosses page boundary

Small packet loss is reported on complex multi host network configurations
including tunnels, NAT, ... My investigation led me to the following check
in netback which drops packets:

        if (unlikely(txreq.size < ETH_HLEN)) {
                netdev_err(queue->vif->dev,
                           "Bad packet size: %d\n", txreq.size);
                xenvif_tx_err(queue, &txreq, extra_count, idx);
                break;
        }

But this check itself is legitimate. SKBs consist of a linear part (which
has to have the ethernet header) and (optionally) a number of frags.
Netfront transmits the head of the linear part up to the page boundary
as the first request and all the rest becomes frags so when we're
reconstructing the SKB in netback we can't distinguish between original
frags and the 'tail' of the linear part. The first SKB needs to be at
least ETH_HLEN size. So in case we have an SKB with its linear part
starting too close to the page boundary the packet is lost.

I see two ways to fix the issue:
- Change the 'wire' protocol between netfront and netback to start keeping
  the original SKB structure. We'll have to add a flag indicating the fact
  that the particular request is a part of the original linear part and not
  a frag. We'll need to know the length of the linear part to pre-allocate
  memory.
- Avoid transmitting SKBs with linear parts starting too close to the page
  boundary. That seems preferable short-term and shouldn't bring
  significant performance degradation as such packets are rare. That's what
  this patch is trying to achieve with skb_copy().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: Add MAC-IF driver for Microsemi PHYs.

All the review comments updated and resending for review.

This is MAC interface feature.
Microsemi PHY can support RGMII, RMII or GMII/MII interface between MAC and PHY.
MAC-IF function program the right value based on Device tree configuration.

Tested on Beaglebone Black with VSC 8531 PHY.

Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Fix sparse warnings

drivers/net/ethernet/mellanox/mlxsw//spectrum.c:251:28: warning: symbol
'mlxsw_sp_span_entry_find' was not declared. Should it be static?
drivers/net/ethernet/mellanox/mlxsw//spectrum.c:265:28: warning: symbol
'mlxsw_sp_span_entry_get' was not declared. Should it be static?
drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56: warning: mixing
different enum types
drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56:     int enum
mlxsw_sp_span_type  versus
drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56:     int enum
mlxsw_reg_mpar_i_e
...
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32: warning:
mixing different enum types
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32:     int
enum mlxsw_reg_sbxx_dir  versus
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32:     int
enum devlink_sb_pool_type
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39: warning:
mixing different enum types
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39:     int
enum mlxsw_reg_sbpr_mode  versus
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39:     int
enum devlink_sb_threshold_type
...
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54: warning:
mixing different enum types
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54:     int
enum mlxsw_sp_l3proto  versus
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54:     int
enum mlxsw_reg_ralxx_protocol
...
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1749:6: warning:
symbol 'mlxsw_sp_fib_entry_put' was not declared. Should it be static?

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: Change the RX LAG hash function from XOR to CRC

Change the RX hash function from XOR to CRC in order to have better
distribution of the traffic.

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Fix build error for kernesl without RTC-LIB

bnxt_hwrm_fw_set_time() now returns -EOPNOTSUPP when built for kernel
without RTC_LIB. Setting the firmware time is not critical to the
successful completion of the firmware update process.

Signed-off-by: Rob Swindell <Rob.Swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net sched: stylistic cleanups

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net sched actions police: peg drop stats for conforming traffic

setting conforming action to drop is a valid policy.
When it is set we need to at least see the stats indicating it
for debugging.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net sched ife action: Introduce skb tcindex metadata encap decap

Sample use case of how this is encoded:
user space via tuntap (or a connected VM/Machine/container)
encodes the tcindex TLV.

Sample use case of decoding:
IFE action decodes it and the skb->tc_index is then used to classify.
So something like this for encoded ICMP packets:

.. first decode then reclassify... skb->tcindex will be set
sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xbeef \
u32 match u32 0 0 flowid 1:1 \
action ife decode reclassify

...next match the decode icmp packet...
sudo $TC filter add dev $ETH parent ffff: prio 4 protocol ip \
u32 match ip protocol 1 0xff flowid 1:1 \
action continue

... last classify it using the tcindex classifier and do someaction..
sudo $TC filter add dev $ETH parent ffff: prio 5 protocol ip \
handle 0x11 tcindex classid 1:1 \
action blah..

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net sched ife action: add 16 bit helpers

encoder and checker for 16 bits metadata

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: clean function declarations in eswitch.c up

We get 2 warnings when building kernel with W=1:
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:463:5: warning: no previous prototype for 'esw_offloads_init' [-Wmissing-prototypes]
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:521:6: warning: no previous prototype for 'esw_offloads_cleanup' [-Wmissing-prototypes]

In fact, both functions are declared in
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c,but should be
declared in a header file, thus can be recognized in other file.

So this patch moves the declarations into
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h

Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: mark symbols static where possible

We get 4 warnings when building kernel with W=1:
drivers/net/ethernet/emulex/benet/be_main.c:4368:6: warning: no previous prototype for 'be_calculate_pf_pool_rss_tables' [-Wmissing-prototypes]
drivers/net/ethernet/emulex/benet/be_cmds.c:4385:5: warning: no previous prototype for 'be_get_nic_pf_num_list' [-Wmissing-prototypes]
drivers/net/ethernet/emulex/benet/be_cmds.c:4537:6: warning: no previous prototype for 'be_reset_nic_desc' [-Wmissing-prototypes]
drivers/net/ethernet/emulex/benet/be_cmds.c:4910:5: warning: no previous prototype for '__be_cmd_set_logical_link_config' [-Wmissing-prototypes]

In fact, these functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
so this patch marks these functions with 'static'.

Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

phy: mark lan88xx_suspend() static

We get 1 warning when building kernel with W=1:
drivers/net/phy/microchip.c:58:5: warning: no previous prototype for 'lan88xx_suspend' [-Wmissing-prototypes]

In fact, this function is only used in the file in which it is
declared and don't need a declaration, but can be made static.
so this patch marks this function with 'static'.

Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: broadcom: bcmgenet: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: broadcom: bcm63xx: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: broadcom: bcm63xx: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: broadcom: b44: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: broadcom: b44: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bnxt_en-next'

Michael Chan says:

====================
bnxt: update for net-next.

Misc. changes and minor bug fixes for net-next. Please review.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Fixed the VF link status after a link state change

The VF link state can be changed via the 'ip link set' cmd.
Currently, the new link state does not take effect immediately.

The fix is for the PF to send a link change async event to the
designated VF after a VF link state change. This async event will
trigger the VF to update the link status.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Support for "ethtool -r" command

Restart autoneg if autoneg is enabled.

Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Pad TX packets below 52 bytes.

The hardware has a limitation that it won't pass host to BMC loopback
packets below 52-bytes.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Call firmware to approve the random VF MAC address.

After generating the random MAC address for VF, call the firmware to
approve it. This step serves 2 purposes. Some hypervisor (e.g. ESX)
wants to approve the MAC address. 2nd, the call will setup the
proper forwarding database in the internal switch.

We need to unlock the hwrm_cmd_lock mutex before calling bnxt_approve_mac().
We can do that because we are at the end of the function and all the
previous firmware response data has been copied.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Re-arrange bnxt_hwrm_func_qcaps().

Re-arrange the code so that the generation of the random MAC address for
the VF is at the end of the function. The next patch will add one more step
to call bnxt_approve_mac() to get the firmware to approve the random MAC
address.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Fix ethtool -l|-L inconsistent channel counts.

The existing code is inconsistent in reporting and accepting the combined
channel count.  bnxt_get_channels() reports maximum combined as the
maximum rx count.  bnxt_set_channels() accepts combined count that
cannot be bigger than max rx or max tx.

For example, if max rx = 2 and max tx = 1, we report max supported
combined to be 2.  But if the user tries to set combined to 2, it will
fail because 2 is bigger than max tx which is 1.

Fix the code to be consistent.  Max allowed combined = max(max_rx, max_tx).
We will accept a combined channel count <= max(max_rx, max_tx).

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Added support for Secure Firmware Update

Using Ethtool flashdev command, entire NVM package (*.pkg) files
may now be staged into the "update" area of the NVM and subsequently
verified and installed by the firmware using the newly introduced
command: NVM_INSTALL_UPDATE.

We also introduce use of the new firmware command FW_SET_TIME so that the
NVM-resident package installation log contains valid time-stamps.

Signed-off-by: Rob Swindell <Rob.Swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Update to firmware interface spec 1.5.1.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Simplify PCI device names and add additinal PCI IDs.

Remove "Single-port/Dual-port" from the device names.  Dual-port devices
will appear as 2 separate devices, so no need to call each a dual-port
device.  Use a more generic name for VF devices belonging to the same
chip fanmily.  Add some remaining NPAR device IDs.

Signed-off-by: David Christensen <david.christensen@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Use RSS flags defined in the bnxt_hsi.h file.

And remove redundant definitions of the same flags.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gso: Support partial splitting at the frag_list pointer

Since commit 8a29111c7 ("net: gro: allow to build full sized skb")
gro may build buffers with a frag_list. This can hurt forwarding
because most NICs can't offload such packets, they need to be
segmented in software. This patch splits buffers with a frag_list
at the frag_list pointer into buffers that can be TSO offloaded.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Bluetooth: Set appearance only for LE capable controllers

Setting appearance on controllers without LE support will result
in No Supported error.

Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Fix missing ext info event when setting appearance

This patch adds missing event when setting appearance, just like
in the set local name command.

Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Add supported data types to ext info changed event

This patch adds EIR data to extended info changed event.

Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Add appearance to Read Ext Controller Info command

If LE is enabled appearance is added to EIR data.

Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Factor appending EIR to separate helper

This will also be used for Extended Information Event handling.

Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Refactor read_ext_controller_info handler

There is no need to allocate heap for reply only to copy stack data to
it. This also fix rp memory leak and missing hdev unlock if kmalloc
failed.

Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: hci_uart: Add Marvell support

This patch introduces support for Marvell Bluetooth controller over
UART (8897 for now). In order to send the final firmware at full speed,
a helper firmware is firstly sent. Firmware download is driven by the
controller which sends request firmware packets (including expected
size).

This driver is a global rework of the one proposed by
Amitkumar Karwar <akarwar@marvell.com>.

Signed-off-by: Loic Poulain <loic.poulain@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: hci_uart: Add Nokia Protocol identifier

Will be used by hci_nokia extra protocol.

Signed-off-by: Loic Poulain <loic.poulain@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: hci_bcm: Change protocol name

Use full name instead of abbreviation.

Signed-off-by: Loic Poulain <loic.poulain@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Increment management interface revision

Increment the mgmt revision due to the recently added
Read Extended Controller Information and Set Appearance commands.

Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Fix advertising instance validity check for flags

Flags are not allowed in Scan Response.

Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Unify advertising instance flags check

This unifies max length and TLV validity checks.

Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Remove unused parameter from tlv_data_is_valid function

hdev parameter is not used in function.

Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Add support for appearance in scan rsp

This patch enables prepending appearance value to scan response data.
It also adds support for setting appearance value through mgmt command.
If currently advertised instance has apperance flag set it is expired
immediately.

Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Add support for local name in scan rsp

This patch enables appending local name to scan response data. If
currently advertised instance has name flag set it is expired
immediately.

Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: btrtl: Add RTL8822BE Bluetooth device

The RTL8822BE is a new Realtek wifi and BT device. Support for the BT
part is hereby added.

As this device is similar to most of the other Realtek BT devices, the
changes are minimal. The main difference is that the 8822BE needs a
configuration file for enabling and disabling features. Thus code is
added to select and load this configuration file. Although not needed
at the moment, hooks are added for the other devices that might need
such configuration files.

One additional change is to the routine that tests that the project
ID contained in the firmware matches the hardware. As the project IDs
are not sequential, continuing to use the position in the array as the
expected value of the ID would require adding extra unused entries in
the table, and any subsequant rearrangment of the array would break the
code. To fix these problems, the array elements now contain both the
hardware ID and the expected value for the project ID.

Signed-off-by: 陆朱伟 <alex_lu@realsil.com.cn>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Fix not registering BR/EDR SMP channel with force_bredr flag

If force_bredr is set SMP BR/EDR channel should also be for non-SC
capable controllers. Since hcidev flag is persistent wrt power toggle
it can be already set when calling smp_register(). This resulted in
SMP BR/EDR channel not being registered even if HCI_FORCE_BREDR_SMP
flag was set.

This also fix NULL pointer dereference when trying to disable
force_bredr after power cycle.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000388
IP: [<ffffffffc0493ad8>] smp_del_chan+0x18/0x80 [bluetooth]

Call Trace:
[<ffffffffc04950ca>] force_bredr_smp_write+0xba/0x100 [bluetooth]
[<ffffffff8133be14>] full_proxy_write+0x54/0x90
[<ffffffff81245967>] __vfs_write+0x37/0x160
[<ffffffff813617f7>] ? selinux_file_permission+0xd7/0x110
[<ffffffff81356fbd>] ? security_file_permission+0x3d/0xc0
[<ffffffff810eb5b2>] ? percpu_down_read+0x12/0x50
[<ffffffff812462a5>] vfs_write+0xb5/0x1a0
[<ffffffff812476f5>] SyS_write+0x55/0xc0
[<ffffffff817eb872>] entry_SYSCALL_64_fastpath+0x1a/0xa4
Code: 48 8b 45 f0 eb c1 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
      44 00 00 f6 05 c6 3b 02 00 04 55 48 89 e5 41 54 53 49 89 fc 75
      4b
      <49> 8b 9c 24 88 03 00 00 48 85 db 74 31 49 c7 84 24 88 03 00 00
RIP  [<ffffffffc0493ad8>] smp_del_chan+0x18/0x80 [bluetooth]
RSP <ffff8802aee3bd90>
CR2: 0000000000000388

Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Use kzalloc instead of kmalloc/memset

Use kzalloc rather than kmalloc followed by memset with 0.

Generated by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Increase the subsystem minor version number

While the subsystem version information are purely informational,
increase the minor number due to the addition of user channel and
management control monitoring suppport. It is helpful for debugging
purposes to see the version numbers change.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Fix reason code used for rejecting SCO connections

A comment in the code states that SCO connection should be rejected
with the proper error value between 0xd-0xf. The code uses
HCI_ERROR_REMOTE_LOW_RESOURCES which is 0x14.

This led to following error:
< HCI Command: Reject Synchronous Co.. (0x01|0x002a) plen 7
        Address: 34:51:C9:EF:02:CA (Apple, Inc.)
        Reason: Remote Device Terminated due to Low Resources (0x14)
> HCI Event: Command Status (0x0f) plen 4
      Reject Synchronous Connection Request (0x01|0x002a) ncmd 1
        Status: Invalid HCI Command Parameters (0x12)

Instead make use of HCI_ERROR_REJ_LIMITED_RESOURCES which is 0xd.

Signed-off-by: Frédéric Dalleau <frederic.dalleau@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: btqca: remove null checks on edl->data as it is an array

edl->data is an array of __u8 so the null check is unneccessary,
so remove it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Fix wrong New Settings event when closing HCI User Channel

When closing HCI User Channel, the New Settings event was send out to
inform about changed settings. However such event is wrong since the
exclusive HCI User Channel access is active until the Index Added event
has been sent.

@ USER Close: test
@ MGMT Event: New Settings (0x0006) plen 4
        Current settings: 0x00000ad0
          Bondable
          Secure Simple Pairing
          BR/EDR
          Low Energy
          Secure Connections
= Close Index: 00:14:EF:22:04:12
@ MGMT Event: Index Added (0x0004) plen 0

Calling __mgmt_power_off from hci_dev_do_close requires an extra check
for an active HCI User Channel.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Send control open and close messages for HCI user channels

When opening and closing HCI user channel, send monitoring messages to
be able to trace its behavior.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

fakelb: fix schedule while atomic

This patch changes the spinlock to mutex for the available fakelb phy
list. When holding the spinlock the ieee802154_unregister_hw is called
which holding the rtnl_mutex, in that case we get a "BUG: sleeping function
called from invalid context" error. We simple change the spinlock to
mutex which allows to hold the rtnl lock there.

Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Append local name and CoD to Extended Controller Info

This adds device class, complete local name and short local name
to EIR data in Extended Controller Info as specified in docs.

Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Add framework for Extended Controller Information

This command is used to retrieve the current state and basic
information of a controller. It is typically used right after
getting the response to the Read Controller Index List command
or an Index Added event (or its extended counterparts).

When any of the values in the EIR_Data field changes, the event
Extended Controller Information Changed will be used to inform
clients about the updated information.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>

Bluetooth: btusb: Mark CW6622 devices to have broken link key commands

Conwise CW6622 seems to have a problem with the stored link key
commands so just mark it as broken.

< HCI Command: Read Local Supported Features (0x04|0x0003) plen 0
> HCI Event: Command Complete (0x0e) plen 12
    Read Local Supported Features (0x04|0x0003) ncmd 1
    status 0x00
    Features: 0xff 0x3e 0x85 0x38 0x18 0x18 0x00 0x00
< HCI Command: Read Local Version Information (0x04|0x0001) plen 0
> HCI Event: Command Complete (0x0e) plen 12
    Read Local Version Information (0x04|0x0001) ncmd 1
    status 0x00
    HCI Version: 2.0 (0x3) HCI Revision: 0x1f4
    LMP Version: 2.0 (0x3) LMP Subversion: 0x1f4
    Manufacturer: CONWISE Technology Corporation Ltd (66)
...
< HCI Command: Read Local Supported Commands (0x04|0x0002) plen 0
> HCI Event: Command Complete (0x0e) plen 68
    Read Local Supported Commands (0x04|0x0002) ncmd 1
    status 0x00
    Commands: 7fffef03cedfffffffffff1ff20ff8ff3f
...
< HCI Command: Read Stored Link Key (0x03|0x000d) plen 7
    bdaddr 00:00:00:00:00:00 all 1
> HCI Event: Command Complete (0x0e) plen 8
    Read Stored Link Key (0x03|0x000d) ncmd 1
    status 0x11 max 0 num 0
    Error: Unsupported Feature or Parameter Value

Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Remove deprecated create_singlethread_workqueue

The workqueue "workqueue" queues multiple work items viz &qca->ws_awake_rx
&qca->ws_rx_vote_off, &qca->ws_awake_device, &qca->ws_tx_vote_off which
require strict execution ordering. Hence, an ordered dedicated workqueue
has been used to replace the deprecated create_singlethread_workqueue
instance.

WQ_MEM_RECLAIM has not been set since the driver is not being used on a
memory reclaim path.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Handle HCI raw socket transition from unbound to bound

In case an unbound HCI raw socket is later on bound, ensure that the
monitor notification messages indicate a close and re-open. None of
the userspace tools use the socket this, but it is actually possible
to use an ioctl on an unbound socket and then later bind it.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Send control open and close messages for HCI raw sockets

When opening and closing HCI raw sockets their main usage is for legacy
userspace. To track interaction with the modern mgmt interface, send
open and close monitoring messages for these action.

The HCI raw sockets is special since it supports unbound ioctl operation
and for that special case delay the notification message until at least
one ioctl has been executed. The difference between a bound and unbound
socket will be detailed by the fact the HCI index is present or not.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Add extra channel checks for control open/close messages

The control open and close monitoring events require special channel
checks to ensure messages are only send when the right events happen.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Assign the channel early when binding HCI sockets

Assignment of the hci_pi(sk)->channel should be done early when binding
the HCI socket. This avoids confusion with the RAW channel that is used
for legacy access.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Send control open and close only when cookie is present

Only when the cookie has been assigned, then send the open and close
monitor messages. Also if the socket is bound to a device, then include
the index into the message.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Use numbers for subsystem version string

Instead of keeping a version string around, use version and revision
numbers and then stringify them for use as module parameter.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Introduce helper functions for socket cookie handling

Instead of manually allocating cookie information each time, use helper
functions for generating and releasing cookies.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: add WCNSS dependency for HCI driver

The newly added bluetooth driver is based on the soc-specific support,
but lacks the obvious compile-time dependency on that:

drivers/bluetooth/btqcomsmd.o: In function `btqcomsmd_probe':
btqcomsmd.c:(.text.btqcomsmd_probe+0x40): undefined reference to `qcom_wcnss_open_channel'
btqcomsmd.c:(.text.btqcomsmd_probe+0x5c): undefined reference to `qcom_wcnss_open_channel'
Makefile:969: recipe for target 'vmlinux' failed

Fixes: 90c107dc8b2c ("Bluetooth: Introduce Qualcomm WCNSS SMD based HCI driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Use command status event for Set IO Capability errors

In case of failure, the Set IO Capability command is suppose to return
command status and not command complete.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Fix wrong Get Clock Information return parameters

The address information of the Get Clock Information return parameters
is copying from a different memory location. It uses &cmd->param while
it actually needs to be cmd->param.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Use individual flags for certain management events

Instead of hiding everything behind a general managment events flag,
introduce indivdual flags that allow fine control over which events are
send to a given management channel.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: mgmt: Fix sending redundant event for Advertising Instance

When an Advertising Instance is removed, the Advertising Removed event
shouldn't be sent to the same socket that issued the Remove
Advertising command (it gets a command complete event instead). The
mgmt_advertising_removed() function already has a parameter for
skipping a specific socket, but there was no code to propagate the
right value to this parameter. This patch fixes the issue by making
sure the intermediate hci_req_clear_adv_instance() function gets the
socket pointer.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Add support for sending MGMT commands and events to monitor

This adds support for tracing all management commands and events via the
monitor interface.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Add support for sending MGMT open and close to monitor

This sends new notifications to the monitor support whenever a
management channel has been opened or closed. This allows tracing of
control channels really easily.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>

Bluetooth: Introduce helper to pack mgmt version information

The mgmt version information will be also needed for the control
changell tracing feature. This provides a helper to pack them.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>