git.karo-electronics.de Git - linux-beck.git/log

vlan: Pass ethtool get_ts_info queries to real device.

Commit a6111d3c "vlan: Pass SIOC[SG]HWTSTAMP ioctls to real device"
intended to enable hardware time stamping on VLAN interfaces, but
passing SIOCSHWTSTAMP is only half of the story. This patch adds
the second half, by letting user space find out the time stamping
capabilities of the device backing a VLAN interface.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2014-11-20

This series contains updates to ixgbevf, i40e and i40evf.

Emil updates ixgbevf with much of the work that Alex Duyck did while at
Intel.  First updates the driver to clear the status bits on allocation
instead of in the cleanup routine, this way we can leave the recieve
descriptor rings as a read only memory block until we actually have
buffers to give back to the hardware.  Clean up ixgbevf_clean_rx_irq()
by creating ixgbevf_process_skb_field() to merge several similar
operations into this new function.  Cleanup temporary variables within
the receive hot-path and reducing the scope of variables that do not
need to exist outside the main loop.  Save on stack space by just
storing our updated values back in next_to_clean instead of using
a stack variable, which also collapses the size the function.  Improve
performace on IOMMU enabled systems and reduce cache misses by changing
the basic receive patch for ixgbevf so that instead of receiving the
data into an skb, it is received into a double buffered page.  Add
netpoll support by creating ixgbevf_netpoll(), which is a callback for
.ndo_poll_controller to allow for the VF interface to be used with
netconsole.

Mitch provides several cleanups and trivial fixes for i40e and i40evf.
First is a fix the overloading of the msg_size field in the
arq_event_info struct by splitting the field into two and renaming to
indicate the actual function of each field.  Updates code comments
to match the actual function.  Cleanup several checkpatch.pl warnings
by adding or removing blank lines, aligning function parameters, and
correcting over-long lines (which makes the code more readable).

Shannon provides a patch for i40e to write the extra bits that will
turn off the ITR wait for the interrupt, since we want the SW INT to
go off as soon as possible.

v2: updated patch 07 based on feedback from Alex Duyck by
- adding pfmemalloc check to a new function for reusable page
- moved atomic_inc outside of #if/else in ixgbevf_add_rx_frag()
- reverted the removal of the API check in ixgbevf_change_mtu()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'amd-xgbe-next'

Tom Lendacky says:

====================
amd-xgbe: AMD XGBE driver updates 2014-11-20

The following series of patches includes functional updates to the
driver as well as some trivial changes.

- Add a read memory barrier in the Tx and Rx path after checking the
descriptor ownership bit
- Wait for the Tx engine to stop/suspend before issuing a stop command
- Implement a smatch tool suggestion to simplify an if statement
- Separate out Tx and Rx ring data fields into their own structures
- Add BQL support
- Remove an unused variable
- Change Tx coalescing support to operate on packet basis instead of
a descriptor basis
- Add support for the skb->xmit_more flag

This patch series is based on net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

amd-xgbe: Add support for the skb->xmit_more flag

Add support to delay telling the hardware about data that is ready to
be transmitted if the skb->xmit_more flag is set.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

amd-xgbe: Perform Tx coalescing on a packet basis

The current form of Tx coalescing works on a descriptor basis instead
of on a packet basis and doesn't take into account TSO packets. Update
the Tx coalescing support to work on a packet basis, taking into
account the number of packets associated with a TSO transmit. Also,
only activate the Tx timer if a timer value is set.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

amd-xgbe: Remove unused variable

The tso_header variable in the xgbe_tx_ring_data structure is not used,
remove it.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

amd-xgbe: Add BQL support

Call the appropriate BQL functions to track the number of bytes queued
during Tx processing and to track the number of packets and bytes
that have been transmitted during Tx complete processing.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

amd-xgbe: Separate Tx/Rx ring data fields into new structs

Move the Tx and Rx related fields within the xgbe_ring_data struct into
their own structs in order to more easily see what fields are used for
each operation.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

amd-xgbe: Incorporate Smatch coding suggestion

The Smatch tool indicated that one of the if statements in xgbe-dev.c
could be rewritten to remove a redundant check for the 'err' variable
in an if statement.

Change the statement as suggested and add a comment to help clarify.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

amd-xgbe: Tx engine must not be active before stopping it

If the Tx engine is told to stop while it is actively processing Tx
descriptors it is possible that the Tx descriptor(s) will not be closed
out properly. When the Tx engine is restarted this could result in the
driver being stuck on the improperly closed descriptor.

Update the driver to wait for the Tx engine to be in a stopped or
suspended state before issuing the stop command.

This has not been an issue to date, but it's a good safe-guard to have.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

amd-xgbe: Add a read memory barrier to Tx/Rx path

Add a read memory barrier to the Tx and Rx paths where the ownership
bit is checked to be sure that all descriptor fields are read after
having read the ownership bit for the descriptor.

This has not been an issue to date, but it's a good safe-guard to have.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: USB: Deletion of unnecessary checks before the function call "kfree"

The kfree() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Xilinx: Deletion of unnecessary checks before two function calls

The functions kfree() and of_node_put() test whether their argument is NULL
and then return immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

IBM-EMAC: Deletion of unnecessary checks before the function call "of_dev_put"

The of_dev_put() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: mlx4_en_set_settings() always fails when autoneg is set

Fix ethtool set settings to not check AUTONEG_ENABLE

mlx4_en_set_settings should not check if cmd->autoneg == AUTONEG_ENABLE,
cmd->autoneg can be enabled by default and this check will fail other settings requests.
mlx4_en driver doesn't support changing autoneg value, but shouldn't fail the request
in case cmd->autoneg was set.

Fixes: d48b3ab ("net/mlx4_en: Use PTYS register to set ethtool settings (Speed)")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: atm: eni: Add pci_dma_mapping_error() call

Added a pci_dma_mapping_error() call to check for mapping errors before
further using the dma handle. In case of error, control goes to a new label
where the incoming skb is freed. Unchecked dma handles were found using
Coccinelle:

@rule1@
expression e1;
identifier x;
@@

*x = pci_map_single(...);
... when != pci_dma_mapping_error(e1,x)

Signed-off-by: Tina Johnson <tinajohnson.1234@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tipc-next'

Richard Alpe says:

====================
tipc: new netlink API

v3
The old API is not removed.

The new API is separated from the old because of a bug in the old
tipc-config utility using it. When adding commands to the existing
genl_ops struct the get-family response message grows to a point where
it overflows the small receive buffer in tipc-config, subsequently
breaking the tool. Hence the two genl_family and genl_ops structs.

The new headers are placed in a new file called tipc_netlink.h rather
than added to tipc_config.h as they where in previous versions of this
patchset.
/v3

v2
Redesigned "socket list command" to address David Millers comments in
net-next v1 of this patchset.

Simply put the problem is that we can have an arbitrary amount of
sockets with an arbitrary amount of associated publications. In the
previous patchset this was solved by nesting as many publications as
possible into a socket. If all didn't fit it sent the same socket again
with the remaining publications. As David Miller pointed out this makes
each message malformed as the receiver cannot by the data itself know if
it has received a complete set or not. This was flagged outside of the
data and the client did the reassembly.

o socket 1
  o publ 1
  o publ 2
o socket 1
  o publ 3
  o publ 4

In this patchset this is divided into socket listing and publication
listing to avoid having nested data of arbitrary size.

TIPC_NL_SOCK_GET now dumps all sockets with any nested connection
information. However, it no longer include publication information,
only a HAS_PUBL flag to indicate whether the socket has publications or
not. To compliment this there is a new command TIPC_NL_PUBL_GET which
takes a socket as argument and dumps all associated publications.

This means that on "top-level" the data is always complete. In the case
of "tipc socket list" (new tipc-config -p) it first queries all sockets
with TIPC_NL_SOCK_GET and if the socket is published it fetches the
publications using TIPC_NL_PUBL_GET. This is slow for large amount of
sockets with a low publication count (worst case). However, the
integrity is preserved and there is no malformed messages.
/v2

This is a new netlink API for TIPC. It's intended to replace the
existing ASCII API. It utilizes many of the standard netlink
functionalities in the kernel, such as attribute nesting and
input polices.

There are a couple of reasons for this rewrite. The main and most
easily justifiable is that the existing API doesn't scale.  Meaning
that a TIPC cluster with a larger amount of nodes, publications or
ports will rapidly exceed what the exiting API can handle. Resulting
in truncated or corrupt responses. In addition to this, the existing
ASCII API rarely uses "standard" kernel functions and has several
tipc specific functions for sanity checking and string formating.

The new API utilizes standard function for pushing data to socket
buffers and netlink attribute nesting to logically group data.
The new API can handle an arbitrary amount of data for things that
are likely to scale up as the TIPC usage and/or cluster size
increases.

A new user-space tool has been developed to work with this new API.
It is called "tipc" and is part of the "tipc-utils" package that
comes with many Linux distributions.  The new "tipc" tool utilizes
standard functions from libnl to format, send, receive and process
messages. The tool has borrowed design philosophies from git and the
ip tool. Making the syntax resemble that of ip whiles its strong
modularity resembles that of git.

The existing tool for managing TIPC, "tipc-config" remains in the
package, but when built for kernels that has this new API it is
replaced by a script-based wrapper that maps the old syntax to the
new tool. This way, backwards compatibility is mostly preserved.

MORE ABOUT THE CODE

The main challenge here is to handle the case where the data is of
arbitrary size. This was largely neglected in the old API design.
For example when there is a lot of sockets that has a large amount of
associated publications. In this specific case we can't assume that
all ports nor for that matter all the publications can fit inside a
single netlink message. Sending everything in one batch isn't an
option as we need to yield for the socket layer to cope.

This is solved by using the standard netlink callback for dumping
data and releasing the locks when the netlink message is full. The
dumping mechanism gets us back and we keep a reference (logical) to
where we where when the message became full. This means that we are
not "atomic", what is retrieved by user-space isn't a snapshot at a
certain time but rather a continuously updated data set. In the case
where we can't find our way back i.e. our logical reference are gone
we set a standard flag (NLM_F_DUMP_INTR) to tell user-space that the
dump was interrupted.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add name table dump to new netlink api

Add TIPC_NL_NAME_TABLE_GET command to the new tipc netlink API.

This command supports dumping the name table of all nodes.

Netlink logical layout of name table response message:
-> name table
    -> publication
        -> type
        -> lower
        -> upper
        -> scope
        -> node
        -> ref
        -> key

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add net set to new netlink api

Add TIPC_NL_NET_SET command to the new tipc netlink API.

This command can set the network id and network (tipc) address.

Netlink logical layout of network set message:
-> net
[ -> id ]
[ -> address ]

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add net dump to new netlink api

Add TIPC_NL_NET_GET command to the new tipc netlink API.

This command dumps the network id of the node.

Netlink logical layout of returned network data:
-> net
-> id

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add node get/dump to new netlink api

Add TIPC_NL_NODE_GET to the new tipc netlink API.

This command can dump the address and node status of all nodes in the
tipc cluster.

Netlink logical layout of returned node/address data:
-> node
-> address
-> up flag

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add media set to new netlink api

Add TIPC_NL_MEDIA_SET command to the new tipc netlink API.

This command can set one or more link properties for a particular
media.

Netlink logical layout of bearer set message:
-> media
    -> name
    -> link properties
        [ -> tolerance ]
        [ -> priority ]
        [ -> window ]

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add media get/dump to new netlink api

Add TIPC_NL_MEDIA_GET command to the new tipc netlink API.

This command supports dumping all information about all defined
media as well as getting all information about a specific media.

The information about a media includes name and link properties.

Netlink logical layout of media get response message:
-> media
    -> name
    -> link properties
        -> tolerance
        -> priority
        -> window

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add link stat reset to new netlink api

Add TIPC_NL_LINK_RESET_STATS command to the new netlink API.

This command resets the link statistics for a particular link.

Netlink logical layout of link reset message:
-> link
-> name

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add link set to new netlink api

Add TIPC_NL_LINK_SET to the new tipc netlink API.

This command can set one or more link properties for a particular
link.

Netlink logical layout of link set message:
-> link
    -> name
    -> properties
        [ -> tolerance ]
        [ -> priority ]
        [ -> window ]

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add link get/dump to new netlink api

Add TIPC_NL_LINK_GET command to the new tipc netlink API.

This command supports dumping all information about all links
(including the broadcast link) or getting all information about a
specific link (not the broadcast link).

The information about a link includes name, transmission info,
properties and link statistics.

As the tipc broadcast link is special we unfortunately have to treat
it specially. It is a deliberate decision not to abstract the
broadcast link on this (API) level.

Netlink logical layout of link response message:
    -> port
        -> name
        -> MTU
        -> RX
        -> TX
        -> up flag
        -> active flag
        -> properties
           -> priority
           -> tolerance
           -> window
        -> statistics
            -> rx_info
            -> rx_fragments
            -> rx_fragmented
            -> rx_bundles
            -> rx_bundled
            -> tx_info
            -> tx_fragments
            -> tx_fragmented
            -> tx_bundles
            -> tx_bundled
            -> msg_prof_tot
            -> msg_len_cnt
            -> msg_len_tot
            -> msg_len_p0
            -> msg_len_p1
            -> msg_len_p2
            -> msg_len_p3
            -> msg_len_p4
            -> msg_len_p5
            -> msg_len_p6
            -> rx_states
            -> rx_probes
            -> rx_nacks
            -> rx_deferred
            -> tx_states
            -> tx_probes
            -> tx_nacks
            -> tx_acks
            -> retransmitted
            -> duplicates
            -> link_congs
            -> max_queue
            -> avg_queue

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add publication dump to new netlink api

Add TIPC_NL_PUBL_GET command to the new tipc netlink API.

This command supports dumping of all publications for a specific
socket.

Netlink logical layout of request message:
    -> socket
        -> reference

Netlink logical layout of response message:
    -> publication
        -> type
        -> lower
        -> upper

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add sock dump to new netlink api

Add TIPC_NL_SOCK_GET command to the new tipc netlink API.

This command supports dumping of all available sockets with their
associated connection or publication(s). It could be extended to reply
with a single socket if the NLM_F_DUMP isn't set.

The information about a socket includes reference, address, connection
information / publication information.

Netlink logical layout of response message:
-> socket
    -> reference
    -> address
    [
    -> connection
        -> node
        -> socket
        [
        -> connected flag
        -> type
        -> instance
        ]
    ]
    [
    -> publication flag
    ]

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add bearer set to new netlink api

Add TIPC_NL_BEARER_SET command to the new tipc netlink API.

This command can set one or more link properties for a particular
bearer.

Netlink logical layout of bearer set message:
-> bearer
    -> name
    -> link properties
        [ -> tolerance ]
        [ -> priority ]
        [ -> window ]

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add bearer get/dump to new netlink api

Add TIPC_NL_BEARER_GET command to the new tipc netlink API.

This command supports dumping all data about all bearers or getting
all information about a specific bearer.

The information about a bearer includes name, link priorities and
domain.

Netlink logical layout of bearer get message:
-> bearer
    -> name

Netlink logical layout of returned bearer information:
-> bearer
    -> name
    -> link properties
        -> priority
        -> tolerance
        -> window
    -> domain

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add bearer disable/enable to new netlink api

A new netlink API for tipc that can disable or enable a tipc bearer.

The new API is separated from the old API because of a bug in the
user space client (tipc-config). The problem is that older versions
of tipc-config has a very low receive limit and adding commands to
the legacy genl_opts struct causes the ctrl_getfamily() response
message to grow, subsequently breaking the tool.

The new API utilizes netlink policies for input validation. Where the
top-level netlink attributes are tipc-logical entities, like bearer.
The top level entities then contain nested attributes. In this case
a name, nested link properties and a domain.

Netlink commands implemented in this patch:
TIPC_NL_BEARER_ENABLE
TIPC_NL_BEARER_DISABLE

Netlink logical layout of bearer enable message:
-> bearer
    -> name
    [ -> domain ]
    [
    -> properties
        -> priority
    ]

Netlink logical layout of bearer disable message:
-> bearer
    -> name

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

macvtap: advance iov iterator when needed in macvtap_put_user()

When mergeable buffer is used, vnet_hdr_sz is greater than sizeof struct
virtio_net_hdr. So we need advance the iov iterators in this case.

Fixes 6c36d2e26cda1ad3e2c4b90dd843825fc62fe5b4 ("macvtap: Use iovec iterators")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: don't duplicate kvfree()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx5: don't duplicate kvfree()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'r8152-next'

Hayes Wang says:

====================
r8152: adjust rx functions

v3:
For patch #1, remove unnecessary initialization for ret and
unnecessary blank line in r8152_submit_rx().

v2:
For patch #1, set actual_length to 0 before adding the rx to the
list, when a error occurs.

For patch #2, change the flow. Stop submitting the rx if a error
occurs, and add the remaining rx to the list for submitting later.

v1:
Adjust some flows and codes which are relative to r8152_submit_rx()
and rtl_start_rx().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: adjust rtl_start_rx

If there is a error for r8152_submit_rx(), add the remaining rx
buffers to the list. Then the remaining rx buffers could be
submitted later.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: adjust r8152_submit_rx

The behavior of handling the returned status from r8152_submit_rx()
is almost same, so let r8152_submit_rx() deal with the error
directly. This could avoid the duplicate code.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sctp: keep owned chunk in destructor_arg instead of skb->cb

It's just silly to hold the skb destructor argument around inside
skb->cb[] as we currently do in SCTP.

Nowadays, we're sort of cheating on data accounting in the sense
that due to commit 4c3a5bdae293 ("sctp: Don't charge for data in
sndbuf again when transmitting packet"), we orphan the skb already
in the SCTP output path, i.e. giving back charged data memory, and
use a different destructor only to make sure the sk doesn't vanish
on skb destruction time. Thus, cb[] is still valid here as we
operate within the SCTP layer. (It's generally actually a big
candidate for future rework, imho.)

However, storing the destructor in the cb[] can easily cause issues
should an non sctp_packet_set_owner_w()'ed skb ever escape the SCTP
layer, since cb[] may get overwritten by lower layers and thus can
corrupt the chunk pointer. There are no such issues at present,
but lets keep the chunk in destructor_arg, as this is the actual
purpose for it.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: log RX buffer allocation and RX/TX dma failures

To help troubleshoot heavy memory pressure conditions, add a bunch of
statistics counter to log RX buffer allocation and RX/TX DMA mapping
failures. These are reported like any other counters through the ethtool
stats interface.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: systemport: log RX buffer allocation and RX/TX DMA failures

To help troubleshoot heavy memory pressure conditions, add a bunch of
statistics counter to log RX buffer allocation and RX/TX DMA mapping
failures. These are reported like any other counters through the ethtool
stats interface.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

packet: make packet_snd fail on len smaller than l2 header

When sending packets out with PF_PACKET, SOCK_RAW, ensure that the
packet is at least as long as the device's expected link layer header.
This check already exists in tpacket_snd, but not in packet_snd.
Also rate limit the warning in tpacket_snd.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'vlan_action'

Jiri Pirko says:

====================
sched: introduce vlan action

Please see the individual patches for info
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

sched: introduce vlan action

This tc action allows to work with vlan tagged skbs. Two supported
sub-actions are header pop and header push.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: move vlan pop/push functions into common code

So it can be used from out of openvswitch code.
Did couple of cosmetic changes on the way, namely variable naming and
adding support for 8021AD proto.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: move make_writable helper into common code

note that skb_make_writable already exists in net/netfilter/core.c
but does something slightly different.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: introduce __vlan_insert_tag helper which does not free skb

There's a need for helper which inserts vlan tag but does not free the
skb in case of an error.

Suggested-by: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: introduce *vlan_hwaccel_push_inside helpers

Use them to push skb->vlan_tci into the payload and avoid code
duplication.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: rename __vlan_put_tag to vlan_insert_tag_set_proto

Name fits better. Plus there's going to be introduced
__vlan_insert_tag later on.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: kill vlan_put_tag helper

Since both tx and rx paths work with skb->vlan_tci, there's no need for
this function anymore. Switch users directly to __vlan_hwaccel_put_tag.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: make __vlan_hwaccel_put_tag return void

Always returns the same skb it gets, so change to void.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: actions: use skb_postpull_rcsum when possible

Replace duplicated code by calling skb_postpull_rcsum

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp_eth: allow to set a specific mac address

Signed-off-by: Alexander Couzens <lynxis@fe80.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'phy_device_type'

Johan Hovold says:

====================
net: phy: add device-type abstraction

This series adds device and device-type abstractions to the micrel
driver, and enables support for RMII-reference clock selection for
KSZ8081 and KSZ8091 devices.

While adding support for more features for the Micrel PHYs mentioned
above, it became apparent that the configuration space is much too large
and that adding type-specific callbacks will simply not scale. Instead I
added a driver_data field to struct phy_device, which can be used to
store static device type data that can be parsed and acted on in
generic driver callbacks. This allows a lot of duplicated code to be
removed, and should make it much easier to add new features or deal with
device-type quirks in the future.

The series has been tested on a dual KSZ8081 setup. Further testing on
other Micrel PHYs would be much appreciated.

The recent commit a95a18afe4c8 ("phy/micrel: KSZ8031RNL RMII clock
reconfiguration bug") currently prevents KSZ8031 PHYs from using the
generic config-init. Bruno, who is the author of that patch, has agreed
to test this series and some follow-up diagnostic patches to determine
how best to incorporate these devices as well. I intend to send a
follow-up patch that removes the custom 8031 config-init and documents
this quirk, but the current series can be applied meanwhile.

These patches are against net-next which contains some already merged
prerequisite patches to the driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: micrel: add copyright entry

Add myself to the list of copyright holders.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: micrel: refactor interrupt config

Add generic interrupt-config callback and store interrupt-level bitmask
in type data for PHY types not using bit 9.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt/bindings: add clock-select function property to micrel phy binding

Add "micrel,rmii-reference-clock-select-25-mhz" to Micrel ethernet PHY
binding documentation.

This property is needed to properly describe some revisions of Micrel
PHYs which has the function of this configuration bit inverted so that
setting it enables 25 MHz rather than 50 MHz clock mode.

Note that a clock reference ("rmii-ref") is still needed to actually
select either mode.

Cc: devicetree@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt/bindings: reformat micrel eth-phy documentation

Reduce indentation of Micrel PHY binding documentations somewhat.

Also fix "reference input clock" typo while at it.

Cc: devicetree@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: micrel: add support for clock-mode select to KSZ8081/KSZ8091

Micrel KSZ8081 and KSZ8091 PHYs have the RMII Reference Clock Select
bit, which is used to select 25 or 50 MHz clock mode.

Note that on some revisions of the PHY (e.g. KSZ8081RND) the function of
this bit is inverted so that setting it enables 25 rather than 50 MHz
mode. Add a new device-tree property
"micrel,rmii-reference-clock-select-25-mhz" to describe this.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: micrel: add generic clock-mode-select support

Add generic RMII-Reference-Clock-Select support.

Several Micrel PHY have an RMII-Reference-Clock-Select bit to select
25 MHz or 50 MHz clock mode. Recently, support for configuring this
through device tree for KSZ8021 and KSZ8031 was added.

Generalise this support so that it can be configured for other PHY types
as well.

Note that some PHY revisions (of the same type) has this bit inverted.
This should be either configurable through a new device-tree property,
or preferably, determined based on PHY ID if possible.

Also note that this removes support for setting 25 MHz mode from board
files which was also added by the above mentioned commit 45f56cb82e45
("net/phy: micrel: Add clock support for KSZ8021/KSZ8031").

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: micrel: add has-broadcast-disable flag to type data

Add has_broadcast_disable flag to type-data and generic config_init.

This allows us to remove the ksz8081 config_init callback.

Note that ksz8021_config_init is kept for now due to a95a18afe4c8
("phy/micrel: KSZ8031RNL RMII clock reconfiguration bug").

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: micrel: parse of nodes at probe

Parse the "micrel,led-mode" property at probe, rather than at config_init
time in the led-setup helper itself.

Note that the bogus parent->of_node bit is removed.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: micrel: add device-type abstraction

Add structured device-type information and support for generic led-mode
setup to the generic config_init callback.

This is a first step in ultimately getting rid of device-type specific
callbacks.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: add static data field to struct phy_driver

Add static driver-data field to struct phy_driver, which can be used to
store structured device-type information.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

sky2: use new netdev_rss_key_fill() helper

Switch to a random RSS key rather than a fixed one.
Using netdev_rss_key_fill helper also ensures that all ports share
a common key.

See also commit 960fb622f85180f36d3aff82af53e2be3db2f888.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: support skb->xmit_more

Check and update posted_index only when skb->xmit_more is 0 or tx queue is full.

v2:
use txq_map instead of skb_get_queue_mapping(skb)

Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mISDN: Deletion of unnecessary checks before the function call "vfree"

The vfree() function performs also input parameter validation. Thus the test
around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

i40e: trigger SW INT with no ITR wait

Since we want the SW INT to go off as soon as possible, write the
extra bits that will turn off the ITR wait for the interrupt.

Change-ID: I6d5382ba60840fa32abb7dea17c839eb4b5f68f7
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40evf: remove unnecessary else

Since the if part of this statement contains a break, there's no reason
for the else. Clean up the code and make it more obvious that the delay
happens each time through the loop.

Change-ID: I9292eaf7dd687688bdc401b8bd8d1d14f6944460
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40evf: make comparisons consistent

Most of the null-checking in this driver is of the style if (!foo),
except these few. Make these checks consistent with the rest of the
code.

Change-ID: I991924f34072fa607a1b626a8b3f1fa5195d43e9
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40evf: make checkpatch happy

This patch is the result of running checkpatch on the i40evf driver with
the --strict option. The vast majority of changes are adding/removing
blank lines, aligning function parameters, and correcting over-long
lines.

The only possible functional change is changing the flags member of the
adapter structure to be non-volatile. However, according to the kernel
documentation, this is not necessary and the volatile should be removed.

Change-ID: Ie8c6414800924f529bef831e8845292b970fe2ed
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40evf: update header comments

No code changes. Update comments to match actual function declarations.

Change-ID: Ib830d2f154ee917a104955c0914267fc98f3d2c8
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: don't overload fields

Overloading the msg_size field in the arq_event_info struct is just a
bad idea. It leads to repeated bugs when the structure is used in a
loop, since the input value (buffer size) is overwritten by the output
value (actual message length).

Fix this by splitting the field into two and renaming to indicate the
actual function of each field.

Since the arq_event struct has now changed, we need to change the drivers
to support this. Note that we no longer need to initialize the buffer size
each time we go through a loop as this value is no longer destroyed by
arq processing.

In the process, we also fix a bug in i40evf_verify_api_ver where the
buffer size was not correctly reinitialized each time through the loop.

Change-ID: Ic7f9633cdd6f871f93e698dfb095e29c696f5581
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Ashish Shah <ashish.n.shah@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: add netpoll support

This patch adds ixgbevf_netpoll() a callback for .ndo_poll_controller to
allow for the VF interface to be used with netconsole.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: compare total_rx_packets and budget in ixgbevf_clean_rx_irq

total_rx_packets is the number of packets we had cleaned, and budget is
the total number of packets that we could clean per poll. Instead of
altering both of these values we can save ourselves one write to memory by
just comparing total_rx_packets to the budget and as long as we are less
than budget we continue cleaning.

Also change the do{}while logic to while{} in order to avoid processing
packets when budget is 0.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: Change receive model to use double buffered page based receives

This patch changes the basic receive path for ixgbevf so that instead of
receiving the data into an skb it is received into a double buffered page.
The main change is that the receives will be done in pages only and then
pull the header out of the page and copy it into the sk_buff data.

This has the advantages of reduced cache misses and improved performance on
IOMMU enabled systems.

v2:
- added pfmemalloc check to a new function for reusable page
- moved atomic_inc outside of #if/else in ixgbevf_add_rx_frag()
- reverted the removal of the api check in ixgbevf_change_mtu()

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: Update Rx next to clean in real time

Since the next_to_clean value is only accessed by the Rx interrupt handler
we can save on stack space by just storing our updated values back in
next_to_clean instead of using the stack variable i. This should help to
reduce stack space and we can further collapse the size of the function.

Also removed non_eop_descs counter as it was never shown in the stats.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: reorder main loop in ixgbe_clean_rx_irq to allow for do/while/continue

This change allows us to go from a loop based on the descriptor to one
primarily based on the budget. The advantage to this is that we can avoid
carrying too many values from one iteration to the next.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: Cleanup variable usage, improve stack performance

This change is meant to help cleanup the usage of temporary variables
within the Rx hot-path by removing unnecessary variables and reducing
the scope of variables that do not need to exist outside the main loop.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: Combine the logic for post Rx processing into single function

This patch cleans up ixgbevf_clean_rx_irq() by merging several similar
operations into a new function - ixgbevf_process_skb_fields().

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: Test Rx status bits directly out of the descriptor

Instead of keeping a local copy of the status bits from the descriptor
we can just read them directly - this is accomplished with the addition
of ixgbevf_test_staterr().

In addition instead of doing a byteswap on the status bits value, we
can byteswap the constant values we are testing since that can be done
at compile time which should help to improve performance on big-endian
systems.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: Update ixgbevf_alloc_rx_buffers to handle clearing of status bits

Instead of clearing the status bits in the cleanup it makes more sense to
just clear the status bits on allocation. This way we can leave the Rx
descriptor rings as a read only memory block until we actually have buffers
to give back to the hardware.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Merge branch 'bonding_4ad'

Xie Jianhua says:

====================
bonding: Introduce 4 AD link speed

The speed field of AD Port Key was based on bitmask, it supported 5
kinds of link speed at most, as there were only 5 bits in the speed
field of the AD Port Key. This patches series change the speed type
(AD_LINK_SPEED_BITMASK) from bitmask to enum type in order to enhance
speed type from 5 to 32, and then introduce 4 AD link speed to fix
agg_bandwidth.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: Introduce 4 AD link speed to fix agg_bandwidth

This patch adds [2.5|20|40|56] Gbps enum definition, and fixes
aggregated bandwidth calculation based on above slave links.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: change AD_LINK_SPEED_BITMASK to enum to suport more speed

Port Key was determined as 16 bits according to the link speed,
duplex and user key (which is yet not supported).  In the old
speed field, 5 bits are for speed [1|10|100|1000|10000]Mbps as
below:
--------------------------------------------------------------
Port key :| User key        | Speed         |       Duplex|
--------------------------------------------------------------
    16                  6               1               0
This patch keeps the old layout, but changes AD_LINK_SPEED_BITMASK
from bit type to an enum type.  In this way, the speed field can
expand speed type from 5 to 32.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bury skb_copy_to_page()

no callers since 3.0

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fold verify_iovec() into copy_msghdr_from_user()

... and do the same on the compat side of things.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

{compat_,}verify_iovec(): switch to generic copying of iovecs

use {compat_,}rw_copy_check_uvector(). As the result, we are
guaranteed that all iovecs seen in ->msg_iov by ->sendmsg()
and ->recvmsg() will pass access_ok().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

separate kernel- and userland-side msghdr

Kernel-side struct msghdr is (currently) using the same layout as
userland one, but it's not a one-to-one copy - even without considering
32bit compat issues, we have msg_iov, msg_name and msg_control copied
to kernel[1].  It's fairly localized, so we get away with a few functions
where that knowledge is needed (and we could shrink that set even
more).  Pretty much everything deals with the kernel-side variant and
the few places that want userland one just use a bunch of force-casts
to paper over the differences.

The thing is, kernel-side definition of struct msghdr is *not* exposed
in include/uapi - libc doesn't see it, etc.  So we can add struct user_msghdr,
with proper annotations and let the few places that ever deal with those
beasts use it for userland pointers.  Saner typechecking aside, that will
allow to change the layout of kernel-side msghdr - e.g. replace
msg_iov/msg_iovlen there with struct iov_iter, getting rid of the need
to modify the iovec as we copy data to/from it, etc.

We could introduce kernel_msghdr instead, but that would create much more
noise - the absolute majority of the instances would need to have the
type switched to kernel_msghdr and definition of struct msghdr in
include/linux/socket.h is not going to be seen by userland anyway.

This commit just introduces user_msghdr and switches the few places that
are dealing with userland-side msghdr to it.

[1] actually, it's even trickier than that - we copy msg_control for
sendmsg, but keep the userland address on recvmsg.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

bpf: fix arraymap NULL deref and missing overflow and zero size checks

- fix NULL pointer dereference:
kernel/bpf/arraymap.c:41 array_map_alloc() error: potential null dereference 'array'. (kzalloc returns null)
kernel/bpf/arraymap.c:41 array_map_alloc() error: we previously assumed 'array' could be null (see line 40)

- integer overflow check was missing in arraymap
(hashmap checks for overflow via kmalloc_array())

- arraymap can round_up(value_size, 8) to zero. check was missing.

- hashmap was missing zero size check as well, since roundup_pow_of_two() can
truncate into zero

- found a typo in the arraymap comment and unnecessary empty line

Fix all of these issues and make both overflow checks explicit U32 in size.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netlink: Deletion of an unnecessary check before the function call "__module_get"

The __module_get() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: pktgen: Deletion of an unnecessary check before the function call "proc_remove"

The proc_remove() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

usbnet: rtl8150: remove unused variable

remove unused variable

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'stmmac-next'

Giuseppe Cavallaro says:

====================
stmmac: update driver documentation

Recently many changes have been done inside the driver
so this patch updates the driver's doc for example reviewing
information for the rx and tx processes that are managed
by napi method, adding new information for missing glue-logic files
etc.
Also this reviews and fixes what is reported when run kernel-doc script.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: review driver when run kernel-doc

When run ./scripts/kernel-doc several warnings are reported
so this patch fix them.
Also it reviews many comments and adds new ones.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: document common header file

This patch adds some useful comments inside the common header
file to provide information about the APIs exposed by the driver.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: update driver documentation

Recently many changes have been done inside the driver
so this patch updates the driver's doc for example reviewing
information for the rx and tx processes that are managed
by napi method, adding new information for missing glue-logic files
etc.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: make connect() mem charging friendly

While working on sk_forward_alloc problems reported by Denys
Fedoryshchenko, we found that tcp connect() (and fastopen) do not call
sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so
sk_forward_alloc is negative while connect is in progress.

We can fix this by calling regular sk_stream_alloc_skb() both for the
SYN packet (in tcp_connect()) and the syn_data packet in
tcp_send_syn_data()

Then, tcp_send_syn_data() can avoid copying syn_data as we simply
can manipulate syn_data->cb[] to remove SYN flag (and increment seq)

Instead of open coding memcpy_fromiovecend(), simply use this helper.

This leaves in socket write queue clean fast clone skbs.

This was tested against our fastopen packetdrill tests.

Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: return NET_XMIT_DROP for dropped packets

After commit 5d097109257c03a71845729f8db6b5770c4bbedc
("tun: only queue packets on device"), NETDEV_TX_OK was returned for
dropped packets. This will confuse pktgen since dropped packets were
counted as sent ones.

Fixing this by returning NET_XMIT_DROP to let pktgen count it as error
packet.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

icmp: Remove some spurious dropped packet profile hits from the ICMP path

If icmp_rcv() has successfully processed the incoming ICMP datagram, we
should use consume_skb() rather than kfree_skb() because a hit on the likes
of perf -e skb:kfree_skb is not called-for.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>