git.karo-electronics.de Git - linux-beck.git/log

bnx2x: Support for managing RX indirection table

Support fetching and retrieving RX indirection table via ethtool.

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6

Merge branch 'fec' of git://git.pengutronix.de/git/ukl/linux-2.6

sfc: Add TX queues for high-priority traffic

Implement the ndo_setup_tc() operation with 2 traffic classes.

Current Solarstorm controllers do not implement TX queue priority, but
they do allow queues to be 'paced' with an enforced delay between
packets. Paced and unpaced queues are scheduled in round-robin within
two separate hardware bins (paced queues with a large delay may be
placed into a third bin temporarily, but we won't use that). If there
are queues in both bins, the TX scheduler will alternate between them.

If we make high-priority queues unpaced and best-effort queues paced,
and high-priority queues are mostly empty, a single high-priority queue
can then instantly take 50% of the packet rate regardless of how many
of the best-effort queues have descriptors outstanding.

We do not actually want an enforced delay between packets on best-
effort queues, so we set the pace value to a reserved value that
actually results in a delay of 0.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Distinguish queue lookup from test for queue existence

efx_channel_get_{rx,tx}_queue() currently return NULL if the channel
isn't used for traffic in that direction. In most cases this is a
bug, but some callers rely on it as an existence test.

Add existence test functions efx_channel_has_{rx_queue,tx_queues}()
and use them as appropriate.

Change efx_channel_get_{rx,tx}_queue() to assert that the requested
queue exists.

Remove now-redundant initialisation from efx_set_channels().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

sfc: Move TX queue core queue mapping into tx.c

efx_hard_start_xmit() needs to implement a mapping which is the
inverse of tx_queue::core_txq. Move the initialisation of
tx_queue::core_txq next to efx_hard_start_xmit() to make the
connection more obvious.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

net: Adjust TX queue kobjects if number of queues changes during unregister

If the root qdisc for a net device is mqprio, and the driver's
ndo_setup_tc() operation dynamically adds and remvoes TX queues,
netif_set_real_num_tx_queues() will be called during device
unregistration to remove the extra TX queues when the qdisc is
destroyed. Currently this causes the corresponding kobjects
to be leaked, and the device's reference count never drops to 0.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

net/fec: enable flow control and length check on enet-mac

Also optimize not to reread the value written to FEC_R_CNTRL.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: postpone unsetting driver data until the hardware is stopped

Reported-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: provide device for dma functions and matching sizes for map and unmap

This fixes warnings when CONFIG_DMA_API_DEBUG=y:

NULL NULL: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x000000004781a020] [size=64 bytes]
net eth0: DMA-API: device driver frees DMA memory with different size [device address=0x000000004781a020] [map size=2048 bytes] [unmap size=64 bytes]

Moreover pass the platform device to dma_{,un}map_single which makes
more sense because the logical network device doesn't know anything
about dma.

Passing the platform device was a suggestion by Lothar Waßmann.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: reorder functions a bit allows removing forward declarations

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: some whitespace cleanup

A few of these were found and reported by Lothar Waßmann.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: consistenly name struct net_device pointers "ndev"

A variable named "dev" usually (usually subjective) points to a struct
device.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: add phy_stop to fec_enet_close

This undoes the effects of phy_start in fec_enet_open.

Reported-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: consolidate all i.MX options to CONFIG_ARM

Moreover stop listing all i.MX platforms featuring a FEC, and use
the platform's config symbol that selects registration of a fec device
instead. This might make it easier to add new platforms.

Set default = y for ARMs having a fec to reduce defconfig sizes.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: put the ioremap cookie immediately into a void __iomem pointer

Saving it first into struct net_device->base_addr (which is an unsigned
long) is pointless and only needs to use more casts than necessary.

Reported-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: no need to memzero private data

alloc_etherdev internally uses kzalloc, so the private data is already
zerod out.

Reported-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: no need to check for validity of ndev in suspend and resume

dev_set_drvdata is called unconditionally in the probe function and so
it cannot be NULL.

Reported-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: don't free an irq that failed to be requested

Reported-by: Lothar Waßmann <LW@KARO-elektronics.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: release mem_region requested in probe in error path and remove

Reported-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

net/fec: no need to cast arguments for memcpy

memcpy takes a const void * as 2nd argument. So the argument is
converted automatically to void * anyhow.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

ipv4: Cache learned redirect information in inetpeer.

Note that we do not generate the redirect netevent any longer,
because we don't create a new cached route.

Instead, once the new neighbour is bound to the cached route,
we emit a neigh update event instead.

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: Cache learned PMTU information in inetpeer.

The general idea is that if we learn new PMTU information, we
bump the peer genid.

This triggers the dst_ops->check() code to validate and if
necessary propagate the new PMTU value into the metrics.

Learned PMTU information self-expires.

This means that it is not necessary to kill a cached route
entry just because the PMTU information is too old.

As a consequence:

1) When the path appears unreachable (dst_ops->link_failure
   or dst_ops->negative_advice) we unwind the PMTU state if
   it is out of date, instead of killing the cached route.

   A redirected route will still be invalidated in these
   situations.

2) rt_check_expire(), rt_worker_func(), et al. are no longer
   necessary at all.

Signed-off-by: David S. Miller <davem@davemloft.net>

phy/micrel: add ability to support 50MHz RMII clock on KZS8051RNL

Platform code can now set the MICREL_PHY_50MHZ_CLK bit of dev_flags in a fixup
routine (registered with phy_register_fixup_for_uid()), to make the KZS8051RNL
PHY work with 50MHz RMII reference clock.

Cc: David J. Choi <david.choi@micrel.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

ROSE: AX25: finding routes simplification

With previous patch, rose_get_neigh() routine
investigates the full list of neighbor nodes
until it finds or not an already connected node whether
it is called locally or through a level 3 transit frame.
If no routes are opened through an adjacent connected node
then a classical connect request is attempted.

Then there is no more reason for an extra loop such
as the one removed by this patch.

Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

ROSE: rose AX25 packet routing improvement

FPAC AX25 packet application is using Linux kernel ROSE
routing skills in order to connect or send packets to remote stations
knowing their ROSE address via a network of interconnected nodes.

Each FPAC node has a ROSE routing table that Linux ROSE module is
looking at each time a ROSE frame is relayed by the node or when
a connect request to a neighbor node is received.

A previous patch improved the system time response by looking at
already established routes each time the system was looking for a
route to relay a frame. If a neighbor node routing the destination
address was already connected, then the frame would be sent
through him. If not, a connection request would be issued.

The present patch extends the same routing capability to a connect
request asked by a user locally connected into an FPAC node.
Without this patch, a connect request was not well handled unless it
was directed to an immediate connected neighbor of the local node.

Implemented at a number of ROSE FPAC node stations, the present patch
improved dramatically FPAC ROSE routing time response and efficiency.

Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: fix rcu lock imbalance in fib_select_default()

Commit 0c838ff1ade7 (ipv4: Consolidate all default route selection
implementations.) forgot to remove one rcu_read_unlock() from
fib_select_default().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sch_mqprio: Always set num_tc to 0 in mqprio_destroy()

All the cleanup code in mqprio_destroy() is currently conditional on
priv->qdiscs being non-null, but that condition should only apply to
the per-queue qdisc cleanup. We should always set the number of
traffic classes back to 0 here.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

can: c_can: Added support for Bosch C_CAN controller

Bosch C_CAN controller is a full-CAN implementation which is compliant
to CAN protocol version 2.0 part A and B. Bosch C_CAN user manual can be
obtained from:

http://www.semiconductors.bosch.de/media/en/pdf/ipmodules_1/c_can/users_manual_c_can.pdf

This patch adds the support for this controller.
The following are the design choices made while writing the controller
driver:
1. Interface Register set IF1 has be used only in the current design.
2. Out of the 32 Message objects available, 16 are kept aside for RX
purposes and the rest for TX purposes.
3. NAPI implementation is such that both the TX and RX paths function
in polling mode.

Signed-off-by: Bhupesh Sharma <bhupesh.sharma@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Advance driver version

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Don't show UDP Checksum error if HW misjudged

Some JMicron Chip treat 0 as error checksum for UDP packets.
Which should be "No checksum needed".

Reported-by: Adam Swift <Adam.Swift@omnitude.net>
Confirmed-by: "Aries Lee" <arieslee@jmicron.com>
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Refill receive unicase MAC addr after resume

The value of the register which holds receive Unicast MAC Address
sometimes get messed-up after resume.
This patch refill it before enabling the hardware filter.

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Safer MAC processor reset sequence

Adding control to clk_rx, and makes the control of clk_{rx|tx|tcp}
with safer sequence.
This sequence is provided by JMicron.

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Fix hardware action of full-duplex

Clear Transmit Timer/Retry setting while full-duplex.

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Rename phyfifo function for easier understand

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Fix bit typo of JMC250A2 workaround

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: PHY Power control for new chip

After main chip rev 5, the hardware support more power saving
control registers.
Some Non-Linux drivers might turn off the phy power with new
interfaces, this patch makes it possible for Linux to turn it
on again.

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Extract main and sub chip revision

Get the main and sub chip revision for later workaround use.

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: implement [add/del]_slave ops

add possibility to addif/delif via rtnetlink

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bond: implement [add/del]_slave ops

allow enslaving/releasing using netlink interface

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rtnetlink: implement setting of master device

This patch allows userspace to enslave/release slave devices via netlink
interface using IFLA_MASTER. This introduces generic way to add/remove
underling devices.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

tlan: Fix bugs introduced by the last tlan cleanup patch

Fix two bugs introduced by the commit
c659c38b2796578638548b77ef626d93609ec8ac ("tlan: Code cleanup:
checkpatch.pl is relatively happy now.") In that change,
TLAN_CSTAT_READY was considered as a bit mask containing a single bit
set while it was actually had two set instead.

Many thanks to Dan Carpenter for finding the mistake.

Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: make dev->master general

dev->master is now tightly connected to bonding driver. This patch makes
this pointer more general and ready to be used by others.

- netdev_set_master() - bond specifics moved to new function
netdev_set_bond_master()
- introduced netif_is_bond_slave() to check if device is a bonding slave

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: remove the unnecessary dance around skb_bond_should_drop

No need to check (master) twice and to drive in and out the header file.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

phy: Remove unneeded depends on PHYLIB

Remove unneeded depends on PHYLIB. The config selection is already in
an if PHYLIB / endif block.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

network: Allow af_packet to transmit +4 bytes for VLAN packets.

This allows user-space to send a '1500' MTU VLAN packet on a
1500 MTU ethernet frame. The extra 4 bytes of a VLAN header is
not usually charged against the MTU when other parts of the
network stack is transmitting vlans...

Signed-off-by: Ben Greear <greearb@candelatech.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'batman-adv/next' of git://git.open-mesh.org/ecsv/linux-merge

be2net: restrict WOL to PFs only.

WOL is not supported for Vrtual Functions.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: detect a UE even when a interface is down.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: gracefully handle situations when UE is detected

Avoid accessing the hardware when UE is detected.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix be_suspend/resume/shutdown

> call pci msix disable in be_suspend
> call pci msix enable in be_resume
> stop worker thread in be_suspend
> start worker thread in be_resume
> stop worker thread in be_shutdown

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: pass proper hdr_size while flashing redboot.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix broken priority setting when vlan tagging is enabled.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Allow VFs to call be_cmd_reset_function.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: pass domain numbers for pmac_add/del functions

be_cmd_pmac_add/del functions need to pass domain number to the firmware.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: For the VF MAC, use the OUI from current MAC address

Currently we are always using the Emulex OUI for a VF MAC address
while generating MAC for a VF. Use OUI from current MAC instead.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Cleanup the VF interface handles

The PF needs to cleanup all the interface handles that it created for the VFs.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: call be_vf_eth_addr_config() after register_netdev

This is to avoid the completion processing for be_vf_eth_addr_config
to consume the link status notification before netdev_register.
Otherwise this causes the PF miss its first link status update.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Initialize and cleanup sriov resources only if pci_enable_sriov has succeeded.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Use domain id when be_cmd_if_destroy is called.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: endianness fix in be_cmd_set_qos().

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: While configuring QOS for VF, pass proper domain id

While configuring QOS for VFs, the VF number should be translated
to domain number correctly.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net-next-2.6

batman-adv: Disallow originator addressing within mesh layer

For a host in the mesh network, the batman layer should be transparent.
However, we had one exception, data packets within the mesh network
which have the same destination as a originator are being routed to
that node, although there is no host that node's bat0 interface and
therefore gets dropped anyway. This commit removes this exception.

Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>

batman-adv: Remove duplicate types.h inclusions

types.h is included by main.h, which is included at the beginning of any
other c-file anyway. Therefore this commit removes those duplicate
inclussions.

Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>

batman-adv: Split combined variable declarations

Multiple variable declarations in a single statements over multiple lines can
be split into multiple variable declarations without changing the actual
behavior.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>

atl1: Do not use legacy PCI power management

The atl1 driver uses the legacy PCI power management, so it has to
do some PCI-specific things in its ->suspend() and ->resume()
callbacks, which isn't necessary and should better be done by the PCI
subsystem-level power management code.

Convert atl1 to the new PCI power management framework and make it
let the PCI subsystem take care of all the PCI-specific aspects of
device handling during system power transitions.

Tested-by: Thomas Fjellstrom <thomas@fjellstrom.ca>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

atl1c: Do not call device_init_wakeup() in atl1c_probe()

The atl1c driver shouldn't call device_init_wakeup() in its probe
routine with the second argument equal to 1, because for PCI devices
the wakeup capability setting is initialized as appropriate by the
PCI subsystem. Remove the potentially harmful call.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

tg3: Avoid setting power.can_wakeup for devices that cannot wake up

The tg3 driver uses device_init_wakeup() in such a way that the
device's power.can_wakeup flag may be set even though the PCI
subsystem cleared it before, in which case the device cannot wake
up the system from sleep states. Modify the driver to only change
the power.can_wakeup flag if the device is not capable of generating
wakeup signals.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: Adding 100MB FULL support in ethtool

Current driver does not show 100MB support in ethtool.
Adding support for the same.

Signed-off-by: Atita Shirwaikar <atita.shirwaikar@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: cleanup ixgbe_init_mbx_params_pf namespace issue

The function ixgbe_init_mbx_params_pf isn't used unless CONFIG_PCI_IOV
is defined. This is causing namespace warnings. So I wrapped its
definition in CONFIG_PCI_IOV too.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: cleanup namespace complaint by removing little used function

We had a support function that just walked a few pointers to get
from the ixgbe_hw struct to the netdev pointer. This was causing
a namespace warning so I removed it and just reference the pointers
directly.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: fix namespace issue with ixgbe_dcb_txq_to_tc

We didn't need the prototype and it was causing namespace complaints so
I made it static.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: DCB, use hardware independent routines

This consolidates hardware specifics to ixgbe_dcb.c this simplifies
code that was previously branching based on hardware type.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: DCB, remove RESET bit it is no longer needed

This removes the RESET bit previously used to force a device
reset when DCB bandwidth configurations were changed. This can
now be done dynamically without a reset so the bit is no longer
needed. The only remaining operations that force a device reset
are DCB enable/disable and FCoE application priority changes.
DCB enable/disable is a hardware requirement.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: DCB, do not reset on CEE pg changes

The 82599 and 82598 devices do not require hardware resets to
configure CEE pg settings. This patch changes DCB configuration
to set the CEE pg values directly from the dcbnl ops routine.

This reduces the number of resets seen on the wire and allows
LLDP to reach a steady state faster.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: DCB, implement 802.1Qaz routines

Implements 802.1Qaz support for ixgbe driver. Additionally,
this adds IEEE_8021QAZ_TSA_{} defines to dcbnl.h this is to
avoid having to use cryptic numeric codes for the TSA type.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: DCB, abstract out dcb_config from DCB hardware configuration

Currently the routines that configure the HW for DCB require a
ixgbe_dcb_config structure. This structure was designed to support
the CEE standard and does not match the IEEE standard well.

This patch changes the HW routines in ixgbe_dcb_8259x.{ch} to use
raw pfc and bandwidth values. This requires some parsing of the DCB
configuration but makes the HW routines independent of the data
structure that contains the DCB configuration.

The primary advantage to doing this is we can do HW setup directly
from the 802.1Qaz ops without having to arbitrarily encapsulate this
data into the CEE structure.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: DCB, remove round robin mode on 82598 devices

Remove round robin configuration code for 82598 parts it
is not settable and is always false.

If we need/want this in the future we can add it back properly.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: DCB, only reprogram HW if the FCoE priority is changed

If the FCoE priority is not changing do not set the RESET and
APP_UPCHG bits. This causes unneeded HW resets and which can
cause unneeded LLDP frames and negotiations.

The current check is not sufficient because the FCoE priority
can change twice during a negotiation which results in the
bits being set. This occurs when the switch changes the
priority or when the link is reset with switches that do not
include the APP priority until after PFC has been negotiated.

This results in set_app being called with the local APP
priority. Then the negotiation completes and set_app
is called again with the peer APP priority. The check
fails so the device is reset and the above occurs again
resulting in an endless loop of resets.

By only resetting the device if the APP priority has really
changed we short circuit the loop.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

igb: Enable PF side of SR-IOV support for i350 devices

This patch adds full support for SR-IOV by enabling the PF side.
VF side has already been committed.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: return appropriate errors for 'ethtool -r'

...when invoked while interface is not up or when auto-negotiation is
disabled as done by other drivers.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: use correct pointer when memcpy'ing a 2-dimensional array

*e1000_gstrings_test is not the same size as e1000_gstrings_test.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: replace unbounded sprintf with snprintf

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

tg3: Expand 5719 workaround

As a precautionary measure, expand the fix submitted in commit
4d163b75e979833979cc401ae433cb1d7743d57e entitled "tg3: Fix 5719 A0 tx
completion bug" to apply to all 5719 revisions.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

batman-adv: Use successive sequence numbers for fragments

The two fragments of an unicast packet must have successive sequence numbers to
allow the receiver side to detect matching fragments and merge them again. The
current implementation doesn't provide that property because a sequence of two
atomic_inc_return may be interleaved with another sequence which also changes
the variable.

The access to the fragment sequence number pool has either to be protected by
correct locking or it has to reserve two sequence numbers in a single fetch.
The latter one can easily be done by increasing the value of the last used
sequence number by 2 in a single step. The generated window of two currently
unused sequence numbers can now be scattered across the two fragments.

Reported-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>

inet: Create a mechanism for upward inetpeer propagation into routes.

If we didn't have a routing cache, we would not be able to properly
propagate certain kinds of dynamic path attributes, for example
PMTU information and redirects.

The reason is that if we didn't have a routing cache, then there would
be no way to lookup all of the active cached routes hanging off of
sockets, tunnels, IPSEC bundles, etc.

Consider the case where we created a cached route, but no inetpeer
entry existed and also we were not asked to pre-COW the route metrics
and therefore did not force the creation a new inetpeer entry.

If we later get a PMTU message, or a redirect, and store this
information in a new inetpeer entry, there is no way to teach that
cached route about the newly existing inetpeer entry.

The facilities implemented here handle this problem.

First we create a generation ID.  When we create a cached route of any
kind, we remember the generation ID at the time of attachment.  Any
time we force-create an inetpeer entry in response to new path
information, we bump that generation ID.

The dst_ops->check() callback is where the knowledge of this event
is propagated.  If the global generation ID does not equal the one
stored in the cached route, and the cached route has not attached
to an inetpeer yet, we look it up and attach if one is found.  Now
that we've updated the cached route's information, we update the
route's generation ID too.

This clears the way for implementing PMTU and redirects directly in
the inetpeer cache.  There is absolutely no need to consult cached
route information in order to maintain this information.

At this point nothing bumps the inetpeer genids, that comes in the
later changes which handle PMTUs and redirects using inetpeers.

Signed-off-by: David S. Miller <davem@davemloft.net>

inetpeer: Add redirect and PMTU discovery cached info.

Validity of the cached PMTU information is indicated by it's
expiration value being non-zero, just as per dst->expires.

The scheme we will use is that we will remember the pre-ICMP value
held in the metrics or route entry, and then at expiration time
we will restore that value.

In this way PMTU expiration does not kill off the cached route as is
done currently.

Redirect information is permanent, or at least until another redirect
is received.

Signed-off-by: David S. Miller <davem@davemloft.net>

inetpeer: Abstract address representation further.

Future changes will add caching information, and some of
these new elements will be addresses.

Since the family is implicit via the ->daddr.family member,
replicating the family in ever address we store is entirely
redundant.

Signed-off-by: David S. Miller <davem@davemloft.net>

net: rename group sysfs entry to netdev_group

commit a512b92 adds sysfs entry for net device group, but
before this commit, tun also uses group sysfs, so after this
commit checkin, kernel warns like this:
sysfs: cannot create duplicate filename '/devices/virtual/net/vnet0/group'

Since tun has used this for years, rename sysfs under tun might
break existing userspace, so rename group sysfs entry for net device
group is a better choice.

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/e1000e/netdev.c

pch_can: fix module reload issue with MSI

Currently, in case reload pch_can,
pch_can not to be able to catch interrupt.

The cause is bus-master is not set in pch_can.
Thus, add enabling bus-master processing.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_can: fix rmmod issue

Currently, when rmmod pch_can, kernel failure occurs.
The cause is pci_iounmap executed before pch_can_reset.
Thus pci_iounmap moves after pch_can_reset.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_can: fix 800k comms issue

Currently, 800k comms fails since prop_seg set zero.
(EG20T PCH CAN register of prop_seg must be set more than 1)
To prevent prop_seg set to zero, change tseg2_min 1 to 2.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Kill NETEVENT_PMTU_UPDATE.

Nobody actually does anything in response to the event,
so just kill it off.

Signed-off-by: David S. Miller <davem@davemloft.net>

net: Remove bogus barrier() in dst_allfrag().

I simply missed this one when modifying the other dst
metric interfaces earlier.

Signed-off-by: David S. Miller <davem@davemloft.net>

net: Fix lockdep regression caused by initializing netdev queues too early.

In commit aa9421041128abb4d269ee1dc502ff65fb3b7d69 ("net: init ingress
queue") we moved the allocation and lock initialization of the queues
into alloc_netdev_mq() since register_netdevice() is way too late.

The problem is that dev->type is not setup until the setup()
callback is invoked by alloc_netdev_mq(), and the dev->type is
what determines the lockdep class to use for the locks in the
queues.

Fix this by doing the queue allocation after the setup() callback
runs.

This is safe because the setup() callback is not allowed to make any
state changes that need to be undone on error (memory allocations,
etc.). It may, however, make state changes that are undone by
free_netdev() (such as netif_napi_add(), which is done by the
ipoib driver's setup routine).

The previous code also leaked a reference to the &init_net namespace
object on RX/TX queue allocation failures.

Signed-off-by: David S. Miller <davem@davemloft.net>

net/caif: Fix dangling list pointer in freed object on error.

rtnl_link_ops->setup(), and the "setup" callback passed to alloc_netdev*(),
cannot make state changes which need to be undone on failure. There is
no cleanup mechanism available at this point.

So we have to add the caif private instance to the global list once we
are sure that register_netdev() has succedded in ->newlink().

Otherwise, if register_netdev() fails, the caller will invoke free_netdev()
and we will have a reference to freed up memory on the chnl_net_list.

Signed-off-by: David S. Miller <davem@davemloft.net>

ipsec: allow to align IPv4 AH on 32 bits

The Linux IPv4 AH stack aligns the AH header on a 64 bit boundary
(like in IPv6). This is not RFC compliant (see RFC4302, Section
3.3.3.2.1), it should be aligned on 32 bits.

For most of the authentication algorithms, the ICV size is 96 bits.
The AH header alignment on 32 or 64 bits gives the same results.

However for SHA-256-128 for instance, the wrong 64 bit alignment results
in adding useless padding in IPv4 AH, which is forbidden by the RFC.

To avoid breaking backward compatibility, we use a new flag
(XFRM_STATE_ALIGN4) do change original behavior.

Initial patch from Dang Hongwu <hongwu.dang@6wind.com> and
Christophe Gouault <christophe.gouault@6wind.com>.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>