Tom Herbert [Wed, 16 Feb 2011 10:27:02 +0000 (10:27 +0000)]
bnx2x: Support for managing RX indirection table
Support fetching and retrieving RX indirection table via ethtool.
Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Mon, 10 Jan 2011 21:18:20 +0000 (21:18 +0000)]
sfc: Add TX queues for high-priority traffic
Implement the ndo_setup_tc() operation with 2 traffic classes.
Current Solarstorm controllers do not implement TX queue priority, but
they do allow queues to be 'paced' with an enforced delay between
packets. Paced and unpaced queues are scheduled in round-robin within
two separate hardware bins (paced queues with a large delay may be
placed into a third bin temporarily, but we won't use that). If there
are queues in both bins, the TX scheduler will alternate between them.
If we make high-priority queues unpaced and best-effort queues paced,
and high-priority queues are mostly empty, a single high-priority queue
can then instantly take 50% of the packet rate regardless of how many
of the best-effort queues have descriptors outstanding.
We do not actually want an enforced delay between packets on best-
effort queues, so we set the pace value to a reserved value that
actually results in a delay of 0.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Mon, 7 Feb 2011 23:04:38 +0000 (23:04 +0000)]
sfc: Distinguish queue lookup from test for queue existence
efx_channel_get_{rx,tx}_queue() currently return NULL if the channel
isn't used for traffic in that direction. In most cases this is a
bug, but some callers rely on it as an existence test.
Add existence test functions efx_channel_has_{rx_queue,tx_queues}()
and use them as appropriate.
Change efx_channel_get_{rx,tx}_queue() to assert that the requested
queue exists.
Remove now-redundant initialisation from efx_set_channels().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Wed, 12 Jan 2011 18:39:40 +0000 (18:39 +0000)]
sfc: Move TX queue core queue mapping into tx.c
efx_hard_start_xmit() needs to implement a mapping which is the
inverse of tx_queue::core_txq. Move the initialisation of
tx_queue::core_txq next to efx_hard_start_xmit() to make the
connection more obvious.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 15 Feb 2011 19:39:21 +0000 (19:39 +0000)]
net: Adjust TX queue kobjects if number of queues changes during unregister
If the root qdisc for a net device is mqprio, and the driver's
ndo_setup_tc() operation dynamically adds and remvoes TX queues,
netif_set_real_num_tx_queues() will be called during device
unregistration to remove the extra TX queues when the qdisc is
destroyed. Currently this causes the corresponding kobjects
to be leaked, and the device's reference count never drops to 0.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Uwe Kleine-König [Mon, 17 Jan 2011 08:52:18 +0000 (09:52 +0100)]
net/fec: consolidate all i.MX options to CONFIG_ARM
Moreover stop listing all i.MX platforms featuring a FEC, and use
the platform's config symbol that selects registration of a fec device
instead. This might make it easier to add new platforms.
Set default = y for ARMs having a fec to reduce defconfig sizes.
David S. Miller [Thu, 10 Feb 2011 04:42:07 +0000 (20:42 -0800)]
ipv4: Cache learned PMTU information in inetpeer.
The general idea is that if we learn new PMTU information, we
bump the peer genid.
This triggers the dst_ops->check() code to validate and if
necessary propagate the new PMTU value into the metrics.
Learned PMTU information self-expires.
This means that it is not necessary to kill a cached route
entry just because the PMTU information is too old.
As a consequence:
1) When the path appears unreachable (dst_ops->link_failure
or dst_ops->negative_advice) we unwind the PMTU state if
it is out of date, instead of killing the cached route.
A redirected route will still be invalidated in these
situations.
2) rt_check_expire(), rt_worker_func(), et al. are no longer
necessary at all.
Signed-off-by: David S. Miller <davem@davemloft.net>
Baruch Siach [Mon, 14 Feb 2011 02:05:33 +0000 (02:05 +0000)]
phy/micrel: add ability to support 50MHz RMII clock on KZS8051RNL
Platform code can now set the MICREL_PHY_50MHZ_CLK bit of dev_flags in a fixup
routine (registered with phy_register_fixup_for_uid()), to make the KZS8051RNL
PHY work with 50MHz RMII reference clock.
Cc: David J. Choi <david.choi@micrel.com> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
Bernard Pidoux [Mon, 14 Feb 2011 21:33:49 +0000 (13:33 -0800)]
ROSE: AX25: finding routes simplification
With previous patch, rose_get_neigh() routine
investigates the full list of neighbor nodes
until it finds or not an already connected node whether
it is called locally or through a level 3 transit frame.
If no routes are opened through an adjacent connected node
then a classical connect request is attempted.
Then there is no more reason for an extra loop such
as the one removed by this patch.
Signed-off-by: Bernard Pidoux <f6bvp@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Bernard Pidoux [Mon, 14 Feb 2011 21:31:09 +0000 (13:31 -0800)]
ROSE: rose AX25 packet routing improvement
FPAC AX25 packet application is using Linux kernel ROSE
routing skills in order to connect or send packets to remote stations
knowing their ROSE address via a network of interconnected nodes.
Each FPAC node has a ROSE routing table that Linux ROSE module is
looking at each time a ROSE frame is relayed by the node or when
a connect request to a neighbor node is received.
A previous patch improved the system time response by looking at
already established routes each time the system was looking for a
route to relay a frame. If a neighbor node routing the destination
address was already connected, then the frame would be sent
through him. If not, a connection request would be issued.
The present patch extends the same routing capability to a connect
request asked by a user locally connected into an FPAC node.
Without this patch, a connect request was not well handled unless it
was directed to an immediate connected neighbor of the local node.
Implemented at a number of ROSE FPAC node stations, the present patch
improved dramatically FPAC ROSE routing time response and efficiency.
Signed-off-by: Bernard Pidoux <f6bvp@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Mon, 14 Feb 2011 19:02:23 +0000 (19:02 +0000)]
sch_mqprio: Always set num_tc to 0 in mqprio_destroy()
All the cleanup code in mqprio_destroy() is currently conditional on
priv->qdiscs being non-null, but that condition should only apply to
the per-queue qdisc cleanup. We should always set the number of
traffic classes back to 0 here.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Bhupesh Sharma [Mon, 14 Feb 2011 06:51:44 +0000 (22:51 -0800)]
can: c_can: Added support for Bosch C_CAN controller
Bosch C_CAN controller is a full-CAN implementation which is compliant
to CAN protocol version 2.0 part A and B. Bosch C_CAN user manual can be
obtained from:
This patch adds the support for this controller.
The following are the design choices made while writing the controller
driver:
1. Interface Register set IF1 has be used only in the current design.
2. Out of the 32 Message objects available, 16 are kept aside for RX
purposes and the rest for TX purposes.
3. NAPI implementation is such that both the TX and RX paths function
in polling mode.
Signed-off-by: Bhupesh Sharma <bhupesh.sharma@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Guo-Fu Tseng [Sun, 13 Feb 2011 19:27:17 +0000 (19:27 +0000)]
jme: Don't show UDP Checksum error if HW misjudged
Some JMicron Chip treat 0 as error checksum for UDP packets.
Which should be "No checksum needed".
Reported-by: Adam Swift <Adam.Swift@omnitude.net> Confirmed-by: "Aries Lee" <arieslee@jmicron.com> Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Guo-Fu Tseng [Sun, 13 Feb 2011 18:27:40 +0000 (18:27 +0000)]
jme: Refill receive unicase MAC addr after resume
The value of the register which holds receive Unicast MAC Address
sometimes get messed-up after resume.
This patch refill it before enabling the hardware filter.
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Guo-Fu Tseng [Sun, 13 Feb 2011 18:27:35 +0000 (18:27 +0000)]
jme: PHY Power control for new chip
After main chip rev 5, the hardware support more power saving
control registers.
Some Non-Linux drivers might turn off the phy power with new
interfaces, this patch makes it possible for Linux to turn it
on again.
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sun, 13 Feb 2011 10:15:37 +0000 (10:15 +0000)]
rtnetlink: implement setting of master device
This patch allows userspace to enslave/release slave devices via netlink
interface using IFLA_MASTER. This introduces generic way to add/remove
underling devices.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Sakari Ailus [Wed, 9 Feb 2011 10:25:06 +0000 (10:25 +0000)]
tlan: Fix bugs introduced by the last tlan cleanup patch
Fix two bugs introduced by the commit c659c38b2796578638548b77ef626d93609ec8ac ("tlan: Code cleanup:
checkpatch.pl is relatively happy now.") In that change,
TLAN_CSTAT_READY was considered as a bit mask containing a single bit
set while it was actually had two set instead.
Many thanks to Dan Carpenter for finding the mistake.
Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 12 Feb 2011 06:48:36 +0000 (06:48 +0000)]
net: make dev->master general
dev->master is now tightly connected to bonding driver. This patch makes
this pointer more general and ready to be used by others.
- netdev_set_master() - bond specifics moved to new function
netdev_set_bond_master()
- introduced netif_is_bond_slave() to check if device is a bonding slave
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 12 Feb 2011 00:46:06 +0000 (00:46 +0000)]
net: remove the unnecessary dance around skb_bond_should_drop
No need to check (master) twice and to drive in and out the header file.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
hartleys [Fri, 11 Feb 2011 12:14:06 +0000 (12:14 +0000)]
phy: Remove unneeded depends on PHYLIB
Remove unneeded depends on PHYLIB. The config selection is already in
an if PHYLIB / endif block.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Greear [Fri, 11 Feb 2011 09:35:18 +0000 (09:35 +0000)]
network: Allow af_packet to transmit +4 bytes for VLAN packets.
This allows user-space to send a '1500' MTU VLAN packet on a
1500 MTU ethernet frame. The extra 4 bytes of a VLAN header is
not usually charged against the MTU when other parts of the
network stack is transmitting vlans...
Signed-off-by: Ben Greear <greearb@candelatech.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Fri, 11 Feb 2011 13:35:41 +0000 (13:35 +0000)]
be2net: call be_vf_eth_addr_config() after register_netdev
This is to avoid the completion processing for be_vf_eth_addr_config
to consume the link status notification before netdev_register.
Otherwise this causes the PF miss its first link status update.
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Lüssing [Mon, 7 Feb 2011 00:14:40 +0000 (00:14 +0000)]
batman-adv: Disallow originator addressing within mesh layer
For a host in the mesh network, the batman layer should be transparent.
However, we had one exception, data packets within the mesh network
which have the same destination as a originator are being routed to
that node, although there is no host that node's bat0 interface and
therefore gets dropped anyway. This commit removes this exception.
Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org>
Linus Lüssing [Sun, 6 Feb 2011 23:08:37 +0000 (23:08 +0000)]
batman-adv: Remove duplicate types.h inclusions
types.h is included by main.h, which is included at the beginning of any
other c-file anyway. Therefore this commit removes those duplicate
inclussions.
Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org>
Marek Lindner [Tue, 8 Feb 2011 12:43:54 +0000 (12:43 +0000)]
batman-adv: Split combined variable declarations
Multiple variable declarations in a single statements over multiple lines can
be split into multiple variable declarations without changing the actual
behavior.
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Sven Eckelmann <sven@narfation.org>
The atl1 driver uses the legacy PCI power management, so it has to
do some PCI-specific things in its ->suspend() and ->resume()
callbacks, which isn't necessary and should better be done by the PCI
subsystem-level power management code.
Convert atl1 to the new PCI power management framework and make it
let the PCI subsystem take care of all the PCI-specific aspects of
device handling during system power transitions.
Tested-by: Thomas Fjellstrom <thomas@fjellstrom.ca> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
atl1c: Do not call device_init_wakeup() in atl1c_probe()
The atl1c driver shouldn't call device_init_wakeup() in its probe
routine with the second argument equal to 1, because for PCI devices
the wakeup capability setting is initialized as appropriate by the
PCI subsystem. Remove the potentially harmful call.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
tg3: Avoid setting power.can_wakeup for devices that cannot wake up
The tg3 driver uses device_init_wakeup() in such a way that the
device's power.can_wakeup flag may be set even though the PCI
subsystem cleared it before, in which case the device cannot wake
up the system from sleep states. Modify the driver to only change
the power.can_wakeup flag if the device is not capable of generating
wakeup signals.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Current driver does not show 100MB support in ethtool.
Adding support for the same.
Signed-off-by: Atita Shirwaikar <atita.shirwaikar@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The function ixgbe_init_mbx_params_pf isn't used unless CONFIG_PCI_IOV
is defined. This is causing namespace warnings. So I wrapped its
definition in CONFIG_PCI_IOV too.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Fri, 28 Jan 2011 02:28:31 +0000 (02:28 +0000)]
ixgbe: cleanup namespace complaint by removing little used function
We had a support function that just walked a few pointers to get
from the ixgbe_hw struct to the netdev pointer. This was causing
a namespace warning so I removed it and just reference the pointers
directly.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Fri, 28 Jan 2011 02:28:26 +0000 (02:28 +0000)]
ixgbe: fix namespace issue with ixgbe_dcb_txq_to_tc
We didn't need the prototype and it was causing namespace complaints so
I made it static.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Wed, 5 Jan 2011 04:48:45 +0000 (04:48 +0000)]
ixgbe: DCB, use hardware independent routines
This consolidates hardware specifics to ixgbe_dcb.c this simplifies
code that was previously branching based on hardware type.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Fri, 7 Jan 2011 15:30:46 +0000 (15:30 +0000)]
ixgbe: DCB, remove RESET bit it is no longer needed
This removes the RESET bit previously used to force a device
reset when DCB bandwidth configurations were changed. This can
now be done dynamically without a reset so the bit is no longer
needed. The only remaining operations that force a device reset
are DCB enable/disable and FCoE application priority changes.
DCB enable/disable is a hardware requirement.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Sat, 22 Jan 2011 06:07:05 +0000 (06:07 +0000)]
ixgbe: DCB, do not reset on CEE pg changes
The 82599 and 82598 devices do not require hardware resets to
configure CEE pg settings. This patch changes DCB configuration
to set the CEE pg values directly from the dcbnl ops routine.
This reduces the number of resets seen on the wire and allows
LLDP to reach a steady state faster.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Thu, 10 Feb 2011 14:40:01 +0000 (14:40 +0000)]
ixgbe: DCB, implement 802.1Qaz routines
Implements 802.1Qaz support for ixgbe driver. Additionally,
this adds IEEE_8021QAZ_TSA_{} defines to dcbnl.h this is to
avoid having to use cryptic numeric codes for the TSA type.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Wed, 5 Jan 2011 04:47:43 +0000 (04:47 +0000)]
ixgbe: DCB, abstract out dcb_config from DCB hardware configuration
Currently the routines that configure the HW for DCB require a
ixgbe_dcb_config structure. This structure was designed to support
the CEE standard and does not match the IEEE standard well.
This patch changes the HW routines in ixgbe_dcb_8259x.{ch} to use
raw pfc and bandwidth values. This requires some parsing of the DCB
configuration but makes the HW routines independent of the data
structure that contains the DCB configuration.
The primary advantage to doing this is we can do HW setup directly
from the 802.1Qaz ops without having to arbitrarily encapsulate this
data into the CEE structure.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Wed, 5 Jan 2011 04:47:38 +0000 (04:47 +0000)]
ixgbe: DCB, remove round robin mode on 82598 devices
Remove round robin configuration code for 82598 parts it
is not settable and is always false.
If we need/want this in the future we can add it back properly.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Tue, 1 Feb 2011 02:10:20 +0000 (02:10 +0000)]
ixgbe: DCB, only reprogram HW if the FCoE priority is changed
If the FCoE priority is not changing do not set the RESET and
APP_UPCHG bits. This causes unneeded HW resets and which can
cause unneeded LLDP frames and negotiations.
The current check is not sufficient because the FCoE priority
can change twice during a negotiation which results in the
bits being set. This occurs when the switch changes the
priority or when the link is reset with switches that do not
include the APP priority until after PFC has been negotiated.
This results in set_app being called with the local APP
priority. Then the negotiation completes and set_app
is called again with the peer APP priority. The check
fails so the device is reset and the above occurs again
resulting in an endless loop of resets.
By only resetting the device if the APP priority has really
changed we short circuit the loop.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 20 Jan 2011 06:58:07 +0000 (06:58 +0000)]
e1000e: return appropriate errors for 'ethtool -r'
...when invoked while interface is not up or when auto-negotiation is
disabled as done by other drivers.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 19 Jan 2011 04:23:39 +0000 (04:23 +0000)]
e1000e: use correct pointer when memcpy'ing a 2-dimensional array
*e1000_gstrings_test is not the same size as e1000_gstrings_test.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 19 Jan 2011 04:20:59 +0000 (04:20 +0000)]
e1000e: replace unbounded sprintf with snprintf
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matt Carlson [Fri, 11 Feb 2011 04:06:46 +0000 (20:06 -0800)]
tg3: Expand 5719 workaround
As a precautionary measure, expand the fix submitted in commit 4d163b75e979833979cc401ae433cb1d7743d57e entitled "tg3: Fix 5719 A0 tx
completion bug" to apply to all 5719 revisions.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sven Eckelmann [Thu, 10 Feb 2011 14:33:56 +0000 (14:33 +0000)]
batman-adv: Use successive sequence numbers for fragments
The two fragments of an unicast packet must have successive sequence numbers to
allow the receiver side to detect matching fragments and merge them again. The
current implementation doesn't provide that property because a sequence of two
atomic_inc_return may be interleaved with another sequence which also changes
the variable.
The access to the fragment sequence number pool has either to be protected by
correct locking or it has to reserve two sequence numbers in a single fetch.
The latter one can easily be done by increasing the value of the last used
sequence number by 2 in a single step. The generated window of two currently
unused sequence numbers can now be scattered across the two fragments.
Reported-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Sven Eckelmann <sven@narfation.org>
David S. Miller [Tue, 8 Feb 2011 04:38:06 +0000 (20:38 -0800)]
inet: Create a mechanism for upward inetpeer propagation into routes.
If we didn't have a routing cache, we would not be able to properly
propagate certain kinds of dynamic path attributes, for example
PMTU information and redirects.
The reason is that if we didn't have a routing cache, then there would
be no way to lookup all of the active cached routes hanging off of
sockets, tunnels, IPSEC bundles, etc.
Consider the case where we created a cached route, but no inetpeer
entry existed and also we were not asked to pre-COW the route metrics
and therefore did not force the creation a new inetpeer entry.
If we later get a PMTU message, or a redirect, and store this
information in a new inetpeer entry, there is no way to teach that
cached route about the newly existing inetpeer entry.
The facilities implemented here handle this problem.
First we create a generation ID. When we create a cached route of any
kind, we remember the generation ID at the time of attachment. Any
time we force-create an inetpeer entry in response to new path
information, we bump that generation ID.
The dst_ops->check() callback is where the knowledge of this event
is propagated. If the global generation ID does not equal the one
stored in the cached route, and the cached route has not attached
to an inetpeer yet, we look it up and attach if one is found. Now
that we've updated the cached route's information, we update the
route's generation ID too.
This clears the way for implementing PMTU and redirects directly in
the inetpeer cache. There is absolutely no need to consult cached
route information in order to maintain this information.
At this point nothing bumps the inetpeer genids, that comes in the
later changes which handle PMTUs and redirects using inetpeers.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 9 Feb 2011 23:36:47 +0000 (15:36 -0800)]
inetpeer: Add redirect and PMTU discovery cached info.
Validity of the cached PMTU information is indicated by it's
expiration value being non-zero, just as per dst->expires.
The scheme we will use is that we will remember the pre-ICMP value
held in the metrics or route entry, and then at expiration time
we will restore that value.
In this way PMTU expiration does not kill off the cached route as is
done currently.
Redirect information is permanent, or at least until another redirect
is received.
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiaotian Feng [Thu, 10 Feb 2011 03:16:15 +0000 (19:16 -0800)]
net: rename group sysfs entry to netdev_group
commit a512b92 adds sysfs entry for net device group, but
before this commit, tun also uses group sysfs, so after this
commit checkin, kernel warns like this:
sysfs: cannot create duplicate filename '/devices/virtual/net/vnet0/group'
Since tun has used this for years, rename sysfs under tun might
break existing userspace, so rename group sysfs entry for net device
group is a better choice.
Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tomoya [Mon, 7 Feb 2011 23:29:02 +0000 (23:29 +0000)]
pch_can: fix rmmod issue
Currently, when rmmod pch_can, kernel failure occurs.
The cause is pci_iounmap executed before pch_can_reset.
Thus pci_iounmap moves after pch_can_reset.
Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tomoya [Mon, 7 Feb 2011 23:29:01 +0000 (23:29 +0000)]
pch_can: fix 800k comms issue
Currently, 800k comms fails since prop_seg set zero.
(EG20T PCH CAN register of prop_seg must be set more than 1)
To prevent prop_seg set to zero, change tseg2_min 1 to 2.
Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Feb 2011 23:02:50 +0000 (15:02 -0800)]
net: Fix lockdep regression caused by initializing netdev queues too early.
In commit aa9421041128abb4d269ee1dc502ff65fb3b7d69 ("net: init ingress
queue") we moved the allocation and lock initialization of the queues
into alloc_netdev_mq() since register_netdevice() is way too late.
The problem is that dev->type is not setup until the setup()
callback is invoked by alloc_netdev_mq(), and the dev->type is
what determines the lockdep class to use for the locks in the
queues.
Fix this by doing the queue allocation after the setup() callback
runs.
This is safe because the setup() callback is not allowed to make any
state changes that need to be undone on error (memory allocations,
etc.). It may, however, make state changes that are undone by
free_netdev() (such as netif_napi_add(), which is done by the
ipoib driver's setup routine).
The previous code also leaked a reference to the &init_net namespace
object on RX/TX queue allocation failures.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Feb 2011 22:31:31 +0000 (14:31 -0800)]
net/caif: Fix dangling list pointer in freed object on error.
rtnl_link_ops->setup(), and the "setup" callback passed to alloc_netdev*(),
cannot make state changes which need to be undone on failure. There is
no cleanup mechanism available at this point.
So we have to add the caif private instance to the global list once we
are sure that register_netdev() has succedded in ->newlink().
Otherwise, if register_netdev() fails, the caller will invoke free_netdev()
and we will have a reference to freed up memory on the chnl_net_list.
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Wed, 2 Feb 2011 06:29:02 +0000 (06:29 +0000)]
ipsec: allow to align IPv4 AH on 32 bits
The Linux IPv4 AH stack aligns the AH header on a 64 bit boundary
(like in IPv6). This is not RFC compliant (see RFC4302, Section
3.3.3.2.1), it should be aligned on 32 bits.
For most of the authentication algorithms, the ICV size is 96 bits.
The AH header alignment on 32 or 64 bits gives the same results.
However for SHA-256-128 for instance, the wrong 64 bit alignment results
in adding useless padding in IPv4 AH, which is forbidden by the RFC.
To avoid breaking backward compatibility, we use a new flag
(XFRM_STATE_ALIGN4) do change original behavior.
Initial patch from Dang Hongwu <hongwu.dang@6wind.com> and
Christophe Gouault <christophe.gouault@6wind.com>.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>