git.karo-electronics.de Git - linux-beck.git/log

drivers/net: Remove alloc_etherdev error messages

alloc_etherdev has a generic OOM/unable to alloc message.
Remove the duplicative messages after alloc_etherdev calls.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net: Remove unnecessary k.alloc/v.alloc OOM messages

alloc failures use dump_stack so emitting an additional
out-of-memory message is an unnecessary duplication.

Remove the allocation failure messages.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: md5: use sock_kmalloc() to limit md5 keys

There is no limit on number of MD5 keys an application can attach to a
tcp socket.

This patch adds a per tcp socket limit based
on /proc/sys/net/core/optmem_max

With current default optmem_max values, this allows about 150 keys on
64bit arches, and 88 keys on 32bit arches.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: md5: rcu conversion

In order to be able to support proper RST messages for TCP MD5 flows, we
need to allow access to MD5 keys without locking listener socket.

This conversion is a nice cleanup, and shrinks size of timewait sockets
by 80 bytes.

IPv6 code reuses generic code found in IPv4 instead of duplicating it.

Control path uses GFP_KERNEL allocations instead of GFP_ATOMIC.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Shawn Lu <shawn.lu@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: md5: remove obsolete md5_add() method

We no longer use md5_add() method from struct tcp_sock_af_ops

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: spinlock redux.

rtl8169_get_regs operates under RTNL and rtl task mutex whereas
rtl_set_rx_mode is either called under RTNL or rtl task mutex protection.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>

r8169: avoid a useless work scheduling.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Suggested-by: Michał Mirosław <mirqus@gmail.com>
Cc: Hayes Wang <hayeswang@realtek.com>

r8169: move task enable boolean to bitfield.

Simpler, more consistent, with negligible cost in non-critical paths.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Suggested-by: Michał Mirosław <mirqus@gmail.com>
Cc: Hayes Wang <hayeswang@realtek.com>

r8169: bh locking redux and task scheduling.

- atomic bit operations are globally visible
- pending status is always cleared before execution
- scheduled works are either idempotent or only required to happen once
after a series of originating events, say link events for instance

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Suggested-by: Michał Mirosław <mirqus@gmail.com>
Cc: Hayes Wang <hayeswang@realtek.com>

r8169: fix early queue wake-up.

With infinite gratitude to Eric Dumazet for allowing me to identify
the error.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hayes Wang <hayeswang@realtek.com>

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next

net: Deinline __nlmsg_put and genlmsg_put. -7k code on i386 defconfig.

text data bss dec hex filename
8455963 532732 1810804 10799499 a4c98b vmlinux.o.before
8448899 532732 1810804 10792435 a4adf3 vmlinux.o

This change also removes commented-out copy of __nlmsg_put
which was last touched in 2005 with "Enable once all users
have been converted" comment on top.

Changes in v2: rediffed against net-next.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: fix RFC5722 comment

RFC5722 Section 4 was amended by Errata 3089

Our implementation did the right thing anyway...

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Allow ipv6 proxies and arp proxies be shown with iproute2

Add ability to return neighbour proxies list to caller if
it sent full ndmsg structure and has NTF_PROXY flag set.

Before this patch (and before iproute2 patches):
$ ip neigh add proxy 2001::1 dev eth0
$ ip -6 neigh show
$

After it and with applied iproute2 patches:
$ ip neigh add proxy 2001::1 dev eth0
$ ip -6 neigh show
2001::1 dev eth0 proxy
$

Compatibility with old versions of iproute2 is not broken,
kernel checks for incoming structure size and properly
works if old structure is came.

[v2]
* changed comments style.
* removed useless line with continue and curly bracket.
* changed incoming message size check from equal to more or
equal.

CC: davem@davemloft.net
CC: kuznet@ms2.inr.ac.ru
CC: netdev@vger.kernel.org
CC: xemul@parallels.com
Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net: strip unused module code from sun3_82586.c

This code is clearly unused, since it has a #error right
in it. Given the vintage of sun3 hardware, it is probably
safe to assume that there is little interest in adding new
functionality to the driver now, so just delete the unused
block of code.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Sam Creasey <sammy@sammy.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net: fix up stale paths from driver reorg

The reorganization of the driver layout in drivers/net
left behind some stale paths in comments and in Kconfig
help text. Bring them up to date. No actual change to
any code takes place here.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'davem-next.r8169' of git://violet.fr.zoreil.com/romieu/linux

sfc: Use a more sensible cast in efx_rx_buf_offset()

This function returns the page offset of the buffer, which can be
calculated based on either its DMA address or its virtual address.  It
used to use the virtual address and we would cast that to unsigned
long, as anything smaller would result in a compiler warning.  Now
that it's using the DMA address we should use unsigned int, matching
the return type.  It is also unnecessary to use __force.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: MTD: Leave the DEBUG macro alone

<linux/mtd/mtd.h> no longer defines DEBUG so we do not need to
un-define it here.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next

ipv6: Eliminate dst_get_neighbour_noref() usage in ip6_forward().

It's only used to get at neigh->primary_key, which in this context is
always going to be the same as rt->rt6i_gateway.

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Remove neigh argument from ndisc_send_redirect()

Instead, compute it as-needed inside of that function using
dst_neigh_lookup().

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: fib: Convert fib6_age() to dst_neigh_lookup().

In this specific situation we know we are dealing with a gatewayed route
and therefore rt6i_gateway is not going to be in6addr_any even in future
interpretations.

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: ndisc: Convert to dst_neigh_lookup()

Now all code paths grab a local reference to the neigh, so if neigh
is not NULL we unconditionally release it at the end. The old logic
would only release if we didn't have a non-NULL 'rt'.

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: ip_gre: Convert to dst_neigh_lookup()

The conversion is very similar to that made to ipv6's SIT code.

Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: remove work from irq handler.

The irq handler was a mess.

See 7ab87ff4c770eed71e3777936299292739fcd0fe ("via-rhine: move work from
irq handler to softirq and beyond") for similar changes. One can notice:
- all non-napi tasks are explicitely scheduled trough a single work queue.
- hiding software tx queue start behind the rtl_hw_start method is mildly
  natural. Move it in the caller where needed.
- as can be seen from the heavy use of bh disabling locks, the driver is
  not safe for irq context messages with netconsole. It is still quite
  usable for general messaging though. Tested ok with concurrent registers
  dump (ethtool -d) + background traffic + "echo t > /proc/sysrq-trigger".

Tested with old PCI chipset, PCIe 8168 and 810x:
- XID 0c900800 RTL8168evl/8111evl
- XID 18000000 RTL8168b/8111b
- XID 98000000 RTL8169sc/8110sc
- XID 083000c0 RTL8168d/8111d
- XID 081000c0 RTL8168d/8111d
- XID 00b00000 RTL8105e
- XID 04a00000 RTL8102e

As a side note, the comments in f11a377b3f4e897d11f0e8d1fc688667e2f19708
("r8169: avoid losing MSI interrupts") does not seem completely clear: if
I hack the driver further to stop acking the irq link event bit, MSI
interrupts keep being delivered (RTL8168b/8111b, XID 18000000).

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>

r8169: missing barriers.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>

r8169: irq mask helpers.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>

r8169: factor out IntrMask writes.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>

r8169: stop delaying workqueue.

Though motivated by the move of the driver to a single work queue of
sequential events and removal of hard irq processing, it looks safe as
a standalone change.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>

r8169: remove rtl8169_reinit_task.

I see no good reason to keep both rtl8169_reinit_task and rtl8169_reset_task:
- rtl8169_reinit_task adds a software failure point which does relate to
  any hardware state
- they handle hardware the same. Remember that rtl8169_reinit_task was
  introduced in the 8169 only era to handle PCI errors way before the 8168
  asked for pll and firmware ops and compare :

      rtl8169_reinit_task     |    rtl8169_reset_task
  ----------------------------+--------------------------
  rtl8169_wait_for_quiescence | rtl8169_hw_reset
  rtl8169_update_counters     | rtl8169_wait_for_quiescence
  rtl8169_hw_reset            | rtl_hw_start
  rtl8169_rx_missed           | rtl8169_check_link_status
  rtl_pll_power_down          |
  rtl_request_firmware        |
  rtl8169_init_phy            |
  rtl_pll_power_up            |
  rtl_hw_start                |
  rtl8169_check_link_status   |

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>

r8169: remove hardcoded PCIe registers accesses.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>

e1000e: update copyright year

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: split lib.c into three more-appropriate files

The generic lib.c file contains code relative to the various MACs, NVM and
Manageability supported by the driver.  This patch splits the file into
three which are specific to those areas similar to how the PHY-specific
code is in phy.c and code specific to the 80003es2lan, 8257x, and ichX
MAC families are in their own files.  The generic code that is applicable
to all MAC/PHY parts supported by the driver remains in netdev.c, param.c
and ethtool.c files.  No change in functionality, just moving code
around for ease of maintenance, with some whitespace and other checkpatch
cleanups.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: call er16flash() instead of __er16flash()

__er16flash() is not meant to be called directly.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: increase version number

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: convert final strncpy() to strlcpy()

Convert the last instances of strncpy() to the preferred strlcpy().

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: concatenate long debug strings which span multiple lines

To ease searching for debug message strings, concatenate strings that span
multiple lines even if the resulting line exceeds 80 columns; these will
not cause checkpatch warnings.

Also, add '\n' and remove unnecessary '\r' from a few debug strings.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: conditionally restart autoneg on 82577/8/9 when setting LPLU state

When setting the Low Power Link Up (LPLU, a.k.a. reverse auto-negotiation)
on 82577/8278/82579, do not restart auto-negotiation if reset of the Phy is
blocked by the Manageability Engine.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: increase Rx PBA to prevent dropping received packets on 82566/82567

During bi-directional stress on some 82566/82567 devices, some received
packets were dropped. Increasing the Receive Packet Buffer Allocation
resolves this.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: ICHx/PCHx LOMs should use LPLU setting in NVM when going to Sx

When going to Sx with an ICHx/PCH device, the default Low Power Link Up
(LPLU, a.k.a. reverse auto-negotiation) behavior should be whatever is set
in the NVM. However, the function e1000_suspend_workarounds_ich8lan()
called when going to Sx always enabled LPLU in all power states.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: update workaround for 82579 intermittently disabled during S0->Sx

The workaround which toggles the LANPHYPC (LAN PHY Power Control) value bit
to force the MAC-Phy interconnect into PCIe mode from SMBus mode during
driver load and resume should always be done except if PHY resets are
blocked by the Manageability Engine (ME). Previously, the toggle was done
only if PHY resets are blocked and the ME was disabled.

The rest of the patch is just indentation changes as a consequence of the
updated workaround.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: disable Early Receive DMA on ICH LOMs

Internal stress testing with jumbo frames shows the reliability of ICH9 and
ICH10D devices is improved in certain corner cases by disabling the Early
Receive feature. To reduce the performance impact caused by disabling this
feature, the packet buffer sizes and relevant flow control settings are
modified accordingly.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

sfc: Replace efx_rx_buffer::is_page and other booleans with a flags field

Replace checksummed and discard booleans from efx_handle_rx_event()
with a bitmask, added to the flags field.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Move the end of the non-GRO RX path into its own function

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Make all MAC statistics consistently 64 bits wide

Currently we use type u64 for byte counts, which can very quickly
exceed 2^32, and unsigned long for packet counts, which do not. But
it can still take only 20-something minutes to send or receive 2^32
packets, and not all tools properly handle overflow even if they
sample more often than this.

The MAC statistics are all updated synchronously, so it costs very
little to make them all 64-bit regardless of native word size.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Rename implementation of ndo_set_rx_mode

Rename efx_set_multicast_list() to efx_set_rx_mode(), in line
with the operation name net_device_ops::ndo_set_rx_mode.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Remove redundant 'rc' variable, always set to 0

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Minor formatting fixes

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Use existing local variables instead of repeated indirect lookups

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Remove remnants of on-load self-test

The out-of-tree version of the sfc driver used to run a self-test on
each device before registering it. Although this was never included
in-tree, some functions have checks for this special case which is not
really possible.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Remove obsolete function efx_dev_name()

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Update the description of SFC_MTD

SFC4000 boards also have an EEPROM exposed as MTD.
The boot configuration is accessed through MTD.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Add hwmon driver for boards using SFC9000-family controllers

The SFC9000-family controllers have firmware to manage all board
peripherals including temperature, heat sink continuity and voltage
sensors. The firmware reports sensor alarms, which we log, and
will shut down the board if necessary.

Some users may want to monitor their boards more closely, so add an
hwmon driver that exposes all sensors reported by the firmware. Move
efx_mcdi_sensor_event() into the new file so it can share the array of
sensor labels with the hwmon driver.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Clean up test interrupt handling

Interrupts are normally generated by the event queues, moderated by
timers.  However, they may also be triggered by detection of a 'fatal'
error condition (e.g. memory parity error) or by the host writing to
certain CSR fields as part of a self-test.

The IRQ level/index used for these on Falcon rev B0 and Siena is set
by the KER_INT_LEVE_SEL field and cached by the driver in
efx_nic::fatal_irq_level.  Since this value is also relevant to
self-tests rename the field to just 'irq_level'.

Avoid unnecessary cache traffic by using a per-channel 'last_irq_cpu'
field and only writing to the per-controller field when the interrupt
matches efx_nic::irq_level.  Remove the volatile qualifier and use
ACCESS_ONCE in the places we read these fields.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

Partly revert "sfc: Handle serious errors in exactly one interrupt handler"

This reverts commit 6369545945b90daa1a73fca174da9194c398417c in
drivers/net/ethernet/sfc/falcon.c.

Unlike the INT_ISR0 register on later controller revisions, the
NET_IVEC_INT_Q bits written to memory are only ever set for
interrupting event queues, not for any other interrupt sources.

By definition there can only be one legacy interrupt handler per
function, so there is no need to worry about detecting a fatal
interrupt more than once.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Remove dependence on NAPI polling in efx_test_eventq_irq()

We cannot safely assume that the NAPI handler will complete within the
20 ms that we allow for the event self-test. The handler may be
deferred for longer than this, particularly on realtime kernels.

Instead, check whether either an event has been handled or (as in the
old failure path) whether an interrupt has been received and an event
has been delivered but not yet handled. Use napi_disable() to
synchronize with the NAPI handler before checking, since it will
clear events before updating eventq_read_ptr.

Remove the test result chan.N.eventq.poll, since it is not an error
if the NAPI handler does not run during the test.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Correct interrupt timer quantum for Siena (normal and turbo mode)

We currently assume that the timer quantum for Siena is 5 us, the same
as for Falcon. This is not correct; timer ticks are generated on a
rota which takes a minimum of 768 cycles (each event delivery or other
timer change will delay it by 3 cycles). The timer quantum should be
6.144 or 3.072 us depending on whether turbo mode is active.

Replace EFX_IRQ_MOD_RESOLUTION with a timer_quantum_ns field in struct
efx_nic, initialised by the efx_nic_type::probe function.

While we're at it, replace EFX_IRQ_MOD_MAX with a timer_period_max
field in struct efx_nic_type.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Support extraction of CAPABILITIES from GET_BOARD_CFG response.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Consistently test DEBUG macro, not EFX_ENABLE_DEBUG

The netif_dbg() macro is defined in <linux/netdevice.h>. If the DEBUG
macro is defined, it logs a message at 'debug' level, otherwise it
does nothing.

In net_driver.h we define DEBUG if EFX_ENABLE_DEBUG is defined, but
this is too late for those source files that already got a
definition of netif_dbg() by including <linux/netdevice.h>

Get rid of EFX_ENABLE_DEBUG, and only define and test DEBUG.

In mtd.c, we do not use DEBUG as a condition flag but are forced to
use the DEBUG macro-function from <linux/mtd/mtd.h>. Undefine DEBUG
before including it.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Remove efx_nic_type::push_multicast_hash operation

Both implementations of efx_nic_type::reconfigure_mac operation
push the multicast hash filter to the hardware. It is therefore
redundant to call efx_nic_type::push_multicast_hash as well.

efx_mcdi_mac_reconfigure() also uses this operation, but the
implementation for Siena just uses MCDI anyway. Merge that into
efx_mcdi_mac_reconfigure().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Merge efx_mcdi_mac_check_fault() and efx_mcdi_get_mac_faults()

The latter is only called by the former, which is a very short
wrapper. Further, gcc 4.5 may currently wrongly warn that the
'faults' variable may be used uninitialised.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Merge efx_mac_operations into efx_nic_type

No NICs need to switch efx_mac_operations at run-time, and the MAC
operations are fairly closely bound to NIC types.

Move efx_mac_operations::reconfigure to efx_nic_type::reconfigure_mac
and efx_mac_operations::check_fault fo efx_nic_type::check_mac_fault.
Change callers to call through efx->type or directly if the NIC type
is known.

Remove efx_mac_operations::update_stats. The implementations for
Falcon used to fetch MAC statistics synchronously and this was used by
efx_register_netdev() to clear statistics after running self-tests.
However, it now only converts statistics that have already been
fetched (and that only for Falcon), and the call from
efx_register_netdev() has no effect.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Hold efx_nic::stats_lock while reading efx_nic::mac_stats

efx_nic::stats_lock is used to serialise stats updates, but each
reader was dropping it before it finished reading efx_nic::mac_stats.

If there were concurrent stats reads using procfs, or one using procfs
and one using ethtool, an update could race with a read. On a 32-bit
system, the reader could see word-tearing of 64-bit stats (32 bits of
the old value and 32 bits of the new).

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Use new names for MC shared memory layout constants

These are defined alongside the firmware protocol in mcdi_pcol.h.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Make handling of MC reboot more reliable

When the MC reboots, either as part of a firmware upgrade or due to a
bug, it attempts to complete (with an error) any requests that were
outstanding before the reboot.  Since there is an inherent race
condition in checking this, it will also write to a status word in
shared memory.

If we look at each of these separately, we may detect each reboot
twice, resulting in a spurious command failure after a firmware
upgrade or frustrating recovery from a firmware bug.  Instead, if a
request completion indicates a reboot, we must poll and clear the
status word.

This bug was previously masked by use of an incorrect address for the
status word.  Fix that, using the definition now included in
mcdi_pcol.h.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Remove fallback for invalid permanent MAC address

By the time we look at the MAC address in efx_probe_port(), either the
driver or the firmware has already validated the board configuration.
The possibility of having an invalid MAC address just isn't worth
considering. It certainly isn't worth having a compile-time option
for this.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

ipv6: Use ipv6_addr_any()

Suggested by YOSHIFUJI Hideaki.

Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: Need to include vmalloc.h

Otherwise (on sparc64):

drivers/net/ethernet/intel/e1000e/ethtool.c:657:3: error: implicit declaration of function 'vmalloc' [-Werror=implicit-function-declaration]

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: sit: Convert to dst_neigh_lookup()

The only semantic difference is that we now hold a reference to the
neighbour and thus have to release it.

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4/ipv6: Prepare for new route gateway semantics.

In the future the ipv4/ipv6 route gateway will take on two types
of values:

1) INADDR_ANY/IN6ADDR_ANY, for local network routes, and in this case
the neighbour must be obtained using the destination address in
ipv4/ipv6 header as the lookup key.

2) Everything else, the actual nexthop route address.

So if the gateway is not inaddr-any we use it, otherwise we must use
the packet's destination address.

Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: add LINUX_MIB_TCPRETRANSFAIL counter

It might be useful to get a counter of failed tcp_retransmit_skb()
calls.

Reported-by: Satoru Moriya <satoru.moriya@hds.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: allocate more headroom in incoming skbs

Allocation of 64 bytes in skb headroom is not enough if we have to pull
ethernet + ipv6 + tcp headers, and/or extra tunneling header.

Its currently not noticed because netdev_alloc_skb_ip_align(64) give us
more room, thanks to power-of-two kmalloc() roundups.

Make sure we ask for 128 bytes so that side effects of upcoming patches
from Ian Campbell dont decrease benet rx performance, because of extra
skb head reallocations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Vasundhara Volam <vasundhara.volam@emulex.com>
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Update version to 1.72.0 and copyrights

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Recoverable and unrecoverable error statistics

Add statistics for tracking parity errors from which we successfully
recovered and those which were deemed unrecoverable.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Recovery flow bug fixes

1. Sample mcp pulse and mcp sequence in nic load instead of in init_one
as they may change by the time we want to use them.

2. Allow cnic to access device during nic load (by adding a new "LOADING" state
to recovery flow). This prevents the unnecessary cnic timeout which resulted
by cnic attempting to access because nic is loading, but being blocked because
of the Recovery state.

3. Issue 'fake' driver load command to mcp when last driver unloads to prevent
mcp from taking ownership. When recovery is complete unload fake driver to
allow mcp to initialize the hardware before first driver loads.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Track active PFs with bitmap

The recovery register (to which a hardware lock has been added in previous
patch) is used amongst other things to track the active PFs. The old
implementation which used a per path counter is not viable in a virtualized
environment where a pf may increment the counter and then have the kernel
crash around it preventing the counter from ever reaching zero.
In the new implementation the scenario described will result in the PF timing
out against the mcp, which will clear the PF's bit in the bitmask allowing
recovery process to proceed.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Lock PF-common resources

Use hardware locks to protect resources common to several Physical Functions. In
a virtualized environment the RTNL lock only protects a PF's driver against
the PFs sharing it's VMs with regard to device resources. Other PFs may reside
in other VMs under other OSs, and are not subject to the lock. Such resources
which were previously protected implicitly by the RTNL lock must now be
protected explicitly with dedicated HW locks.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Loaded Firmware Version Validation

In a virtualized environment it is possible for a loading driver to discover
that Firmware is already loaded to the device, and that this FW does not match
its own. This can happen for example if different Physical Functions are
Assigned to different VMs in which different driver versions are loaded. The
code in this patch ensures that only drivers with matching FW are loaded over
the device, and that in the case described above where the Firmware version
doesn't match the driver load is aborted.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Function Level Reset Final Cleanup

1. Fix bug where return value is ignored
2. Improve printouts
3. Fix typos

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Obtain Bus Device Function from register

BDF was obtained from kernel but since in virtualized environment
(e.g. physical device assigment in KVM) the function number may
not be the real one, the info must be obtained from the device.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Removing indirect register access

In virtualized environments indirect access to the device may not be supported
(depending on the Hypervisor type). Indirect device access was used since in
some harware contexts (i.e. certain chipset and BIOS) every access the driver
makes across the pci is followed by a BIOS initiated Zero Length Read to the
same address. When accessing widebus registers this zero length read corrupts
the serialization of the read/write sequence resulting with errors. To avoid
this problem widebus registers are always accessed via the DMAE or the indirect
interface. However, the 57712x and 578xx devices intercept the zero length read
and so using the indirect interface with these devices is not necessary. Since
PDA is only supported for 57712x and 578xx the indirect access to device was
restricted to 57710 and 57711x.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Support Queue Per Cos in 5771xx devices

Enable the use of up to three hardware queues for transmission. The queues
are always dequed round robin (i.e. strict priority, PFC and ETS are not
supported). This does allow the allocation of a seperate HW queue for low
volume, high priority traffic which will be serviced more promptly.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: 82574/82583 Tx hang workaround

On 82574/82583, there is a hardware bug which might cause a Tx hang when
the internal buffer is full. Setting this bit enables a hardware fix to
work around the issue.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: use hardware default values for Transmit Control register

This code snippet is simply writing default values to the register which is
unnecessary since the values are programmed into the register by default.
There is a special case for 80003es2lan needing the Retransmit on Late
Collision bit set but that is also done in e1000_init_hw_80003es2lan().

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: use default settings for Tx Inter Packet Gap timer

Use the default hardware values for TIPG except for 80003es2lan(*). The
code that is removed in this patch is either unnecessarily writing the TIPG
register with the hardware default values for some devices (82571/2/3/4) or
writing the wrong value for others (ICH/PCH LOMs). The only change in
functionality is setting the correct default TIPG for the latter devices.

(*) The correct value for 80003es2lan is already set properly in
e1000_init_hw_80003es2lan() and e1000_cfg_kmrn_{10_100|1000}_80003es2lan(),
and the unused flag FLAG_TIPG_MEDIUM_FOR_80003ESLAN is removed.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: 82579: workaround for link drop issue

When connected to certain switches, the 82579 PHY might drop link
unexpectedly. Work around the issue by setting the Mean Square Error
higher than the hardware default.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: always set transmit descriptor control registers the same

The hardware erratum workaround where the TXDCTL register must be the same
setting for both queues should always be done.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: default IntMode based on kernel config & available hardware support

Based on a patch from Prabhakar Kushwaha <prabhakar@freescale.com>, set
appropriate default interrupt mode dependent on whether CONFIG_PCI_MSI
is enabled in the kernel configuration and if the hardware supports
MSI-X. Set the module parameter log message accordingly.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Cc: Jin Qing <b24347@freescale.com>
Cc: Prabhakar Kushwaha <prabhakar@freescale.com>
Cc: Jin Qing <b24347@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: re-factor ethtool get/set ring parameter

Make it more like how igb does it, with some additional error checking.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: pass pointer to ring struct instead of adapter struct

For ring-specific functions, pass a pointer to the ring struct instead of a
pointer to the adapter struct.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: convert head, tail and itr_register offsets to __iomem pointers

The Tx/Rx head and tail registers and itr_register are always at known
addresses based on the __iomem address at which the PCI region (from BAR 0)
is mapped and known offsets within the region for each of these registers.
Store and use the full address rather than just the region offset to reduce
unnecessary address calculations. Also, change current u8 __iomem pointers
to void __iomem pointers.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: re-enable alternate MAC address for all devices which support it

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: add Receive Packet Steering (RPS) support

Enable RPS by default.  Disallow jumbo frames when both receive checksum
and receive hashing are enabled because the hardware cannot do both IP
payload checksum (enabled when receive checksum is enabled when using
packet split which is used for jumbo frames) and provide RSS hash at the
same time.

v2: added ethtool command to query flow hashing behavior per Ben Hutchings
    and changed the type of rsskey to cleanup the setting of the register
    array and avoid unnecessary casts (as pointed out by Joe Perches).
    The long error messages are not changed since there is nothing in
    the kernel ./Documentation that suggests the preferred method for
    dealing with long messages other than to never break strings; leaving
    them as-is for now.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: cleanup Rx checksum offload code

1) cleanup whitespace in e1000_rx_checksum() function header comment
2) do not check hardware checksum when Rx checksum is disabled
3) reduce duplicated calls to le16_to_cpu() by just using it within
e1000_rx_checksum() instead of in each call to the function

v2: use swab16 instead of le16_to_cpu & htons and corrected type for the
passed-in csum

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

infiniband: nes: Convert nes_addr_resolve_neigh() over to dst_neigh_lookup().

Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>

infiniband: cxgb4: Convert import_ep() over to dst_neigh_lookup().

Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

infiniband: Convert dst_fetch_ha() over to dst_neigh_lookup().

Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>

net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices

Some WWAN LTE/3G devices based on chipsets from Qualcomm provide
near standard CDC ECM interfaces in addition to the usual serial
interfaces.   The Huawei E392/E398 are examples of such devices.

These typically cannot be fully configured using AT commands
over a serial interface.  It is necessary to speak the proprietary
Qualcomm MSM Interface (QMI) protocol to the device to enable the
ethernet proxy functionality.

The devices embed the QMI protocol in CDC on the control interface,
using standard CDC commands and notifications. The do not otherwise
use CDC commands for the ethernet function.  This driver does
therefore not need access to any other aspects of the control
interface than the descriptors attached to it.

Another driver, cdc-wdm, will provide userspace access to the
QMI protocol independently of this driver.  To facilitate this,
this driver avoids binding to the control interface, and uses
only the associated data interface after parsing the common CDC
functional descriptors on the control interface.

You will want both the cdc-wdm and option drivers as companions to
this driver, to have full access to all interfaces and protocols
exported by the device.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver

This driver adds support for Xilinx 10/100/1000 AXI Ethernet.

It can be used, for instance, on Xilinx boards with a Microblaze
architecture like the ML605.

The patch is against the latest net-next tree and checkpatch clean.

Signed-off-by: Ariane Keller <ariane.keller@tik.ee.ethz.ch>
Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>