Matt Carlson [Thu, 14 Oct 2010 10:37:41 +0000 (10:37 +0000)]
tg3: Add EEE support
This patch adds Energy Efficient Ethernet (EEE) support for the 5718
device ID and the 57765 B0 asic revision.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Thu, 14 Oct 2010 10:37:40 +0000 (10:37 +0000)]
tg3: Add clause 45 register accessor methods
This patch adds clause 45 register access methods. They will be used in
the following patch.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Thu, 14 Oct 2010 10:37:39 +0000 (10:37 +0000)]
tg3: Disable unused transmit rings
This patch allows the driver to disable the additional transmit rings
available on the 5717 and 5719 devices. This is not strictly necessary,
but is done anyways for correctness.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Thu, 14 Oct 2010 10:37:38 +0000 (10:37 +0000)]
tg3: Add support for selfboot format 1 v6
5718 B0 and 5719 devices will use a new selfboot firmware format. This
patch adds code to detect the new format so that bootcode versions get
reported correctly.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 14 Oct 2010 20:53:04 +0000 (20:53 +0000)]
fib_hash: embed initial hash table in fn_zone
While looking for false sharing problems, I noticed
sizeof(struct fn_zone) was small (28 bytes) and possibly sharing a cache
line with an often written kernel structure.
Most of the time, fn_zone uses its initial hash table of 16 slots.
We can avoid the false sharing problem by embedding this initial hash
table in fn_zone itself, so that sizeof(fn_zone) > L1_CACHE_BYTES
We did a similar optimization in commit a6501e080c (Reduce memory needs
and speedup lookups)
Add a fz_revorder field to speedup fn_hash() a bit.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 14 Oct 2010 05:56:18 +0000 (05:56 +0000)]
netns: reorder fields in struct net
In a network bench, I noticed an unfortunate false sharing between
'loopback_dev' and 'count' fields in "struct net".
'count' is written each time a socket is created or destroyed, while
loopback_dev might be often read in routing code.
Move loopback_dev in a read mostly section of "struct net"
Note: struct netns_xfrm is cache line aligned on SMP.
(It contains a "struct dst_ops")
Move it at the end to avoid holes, and reduce sizeof(struct net) by 128
bytes on ia32.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Thu, 14 Oct 2010 01:52:09 +0000 (01:52 +0000)]
tcp: use correct counters in CA_CWR state too
As CWR is stronger than CA_Disorder state, we can miscount
SACK/Reno failure into other timeouts. Not a bad problem as
it can happen only due to ECN, FRTO detecting spurious RTO
or xmit error which are the only callers of tcp_enter_cwr.
And even then losses and RTO must still follow thereafter
to actually end up into the relevant code paths.
Compile tested.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Thu, 14 Oct 2010 01:42:30 +0000 (01:42 +0000)]
tcp: sack lost marking fixes
When only fast rexmit should be done, tcp_mark_head_lost marks
L too far. Also, sacked_upto below 1 is perfectly valid number,
the packets == 0 then needs to be trapped elsewhere.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
stmmac: remove ifdef NETIF_F_TSO from stmmac_ethtool.c
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Reported-by: Armando Visconti <armando.visconti@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Harvey Harrison [Wed, 13 Oct 2010 18:59:13 +0000 (18:59 +0000)]
niu: introduce temp variables to avoid sparse warnings when swapping in-situ
Suppress a large block of warnings like:
drivers/net/niu.c:7094:38: warning: incorrect type in assignment (different base types)
drivers/net/niu.c:7094:38: expected restricted __be32 [usertype] ip4src
drivers/net/niu.c:7094:38: got unsigned long long
drivers/net/niu.c:7104:17: warning: cast from restricted __be32
...
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Wed, 13 Oct 2010 15:18:59 +0000 (15:18 +0000)]
net: move MII outside of NET_ETHERNET, fix kconfig warning
We have USB, PCMCIA, and gigabit ethernet drivers that select
MII even though NET_ETHERNET is not enabled, so make MII not
be dependent on NET_ETHERNET. It is still dependent on NET
and NETDEVICES.
Fixes kconfig unmet dependency warning (shortened, was very long string):
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Jeff Garzik <jgarzik@pobox.com> [2006-NOV-30] Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Gustavo F. Padovan <padovan@profusion.mobi> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Gustavo F. Padovan <padovan@profusion.mobi> Signed-off-by: David S. Miller <davem@davemloft.net>
Do some cleanups of TIPC based on make namespacecheck
1. Don't export unused symbols
2. Eliminate dead code
3. Make functions and variables local
4. Rename buf_acquire to tipc_buf_acquire since it is used in several files
Compile tested only.
This make break out of tree kernel modules that depend on TIPC routines.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
françois romieu [Wed, 13 Oct 2010 09:26:05 +0000 (09:26 +0000)]
via-velocity: forced 1000 Mbps mode support.
Full duplex only. Half duplex 1000 Mbps is not supported.
Signed-off-by: David Lv <DavidLv@viatech.com.cn> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Seguier Regis <rseguier@e-teleport.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Denis Kirjanov [Wed, 13 Oct 2010 00:56:09 +0000 (00:56 +0000)]
sundance: Add initial ethtool stats support
Add ethtool stats support.
Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 12 Oct 2010 23:36:19 +0000 (23:36 +0000)]
pch_gbe: fix if condition in set_settings()
There were no curly braces in this if condition so it always enabled full
duplex.
And ecmd->speed is an unsigned short so it is never equal to -1. The
effect is that mii_ethtool_sset() fails with -EINVAL and an error is
printed to dmesg.
Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Harvey Harrison [Tue, 12 Oct 2010 22:20:34 +0000 (22:20 +0000)]
dnet: mark methods static and annotate for correct endianness
Their doesn't appear to be bugs with the endianness handling here, just get the
annotations right to keep sparse happy.
Suppresses the following sparse warnings:
drivers/net/dnet.c:30:5: warning: symbol 'dnet_readw_mac' was not declared. Should it be static?
drivers/net/dnet.c:49:6: warning: symbol 'dnet_writew_mac' was not declared. Should it be static?
drivers/net/dnet.c:364:5: warning: symbol 'dnet_phy_marvell_fixup' was not declared. Should it be static?
drivers/net/dnet.c:66:13: warning: incorrect type in assignment (different base types)
drivers/net/dnet.c:66:13: expected unsigned short [unsigned] [usertype] tmp
drivers/net/dnet.c:66:13: got restricted __be16 [usertype] <noident>
drivers/net/dnet.c:68:13: warning: incorrect type in assignment (different base types)
drivers/net/dnet.c:68:13: expected unsigned short [unsigned] [usertype] tmp
drivers/net/dnet.c:68:13: got restricted __be16 [usertype] <noident>
drivers/net/dnet.c:70:13: warning: incorrect type in assignment (different base types)
drivers/net/dnet.c:70:13: expected unsigned short [unsigned] [usertype] tmp
drivers/net/dnet.c:70:13: got restricted __be16 [usertype] <noident>
drivers/net/dnet.c:92:27: warning: cast to restricted __be16
drivers/net/dnet.c:94:33: warning: cast to restricted __be16
drivers/net/dnet.c:96:33: warning: cast to restricted __be16
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Harvey Harrison [Tue, 12 Oct 2010 21:52:26 +0000 (21:52 +0000)]
cxgb4vf: make single bit signed bitfields unsigned
Single bit signed bitfields don't make a lot of sense, noticed by sparse:
drivers/net/cxgb4vf/t4vf_common.h:135:31: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:136:36: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:137:36: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:138:36: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:139:36: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:140:31: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:141:31: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:142:35: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:143:35: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:154:27: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:155:26: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:156:27: error: dubious one-bit signed bitfield
drivers/net/cxgb4vf/t4vf_common.h:157:26: error: dubious one-bit signed bitfield
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 11 Oct 2010 19:05:25 +0000 (19:05 +0000)]
net: allocate skbs on local node
commit b30973f877 (node-aware skb allocation) spread a wrong habit of
allocating net drivers skbs on a given memory node : The one closest to
the NIC hardware. This is wrong because as soon as we try to scale
network stack, we need to use many cpus to handle traffic and hit
slub/slab management on cross-node allocations/frees when these cpus
have to alloc/free skbs bound to a central node.
skb allocated in RX path are ephemeral, they have a very short
lifetime : Extra cost to maintain NUMA affinity is too expensive. What
appeared as a nice idea four years ago is in fact a bad one.
In 2010, NIC hardwares are multiqueue, or we use RPS to spread the load,
and two 10Gb NIC might deliver more than 28 million packets per second,
needing all the available cpus.
Cost of cross-node handling in network and vm stacks outperforms the
small benefit hardware had when doing its DMA transfert in its 'local'
memory node at RX time. Even trying to differentiate the two allocations
done for one skb (the sk_buff on local node, the data part on NIC
hardware node) is not enough to bring good performance.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 11 Oct 2010 11:17:47 +0000 (11:17 +0000)]
r8169: use 50% less ram for RX ring
Using standard skb allocations in r8169 leads to order-3 allocations (if
PAGE_SIZE=4096), because NIC needs 16383 bytes, and skb overhead makes
this bigger than 16384 -> 32768 bytes per "skb"
Using kmalloc() permits to reduce memory requirements of one r8169 nic
by 4Mbytes. (256 frames * 16Kbytes). This is fine since a hardware bug
requires us to copy incoming frames, so we build real skb when doing
this copy.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Fri, 15 Oct 2010 16:27:38 +0000 (09:27 -0700)]
ixgbe: DCB: remove DCB check config
Remove a DCB check config from DCB configuration we
continue to configure DCB even if it fails so don't
even bother to check. Plus user space (lldpad) checks
this before programming the hw anyways.
Worse case is we program some values into the hw that
don't make total sense resulting in incorrect bandwidth
allocation.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Carolyn Wyborny [Tue, 12 Oct 2010 22:27:02 +0000 (22:27 +0000)]
igb: add check for fiber/serdes devices to igb_set_spd_dplx;
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Emil Tantilov [Tue, 12 Oct 2010 22:20:59 +0000 (22:20 +0000)]
ixgbe: declare functions as static
Following patch fixes warnings reported by `make namespacecheck`
Reported by Stephen Hemminger
CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Emil Tantilov [Tue, 12 Oct 2010 22:20:34 +0000 (22:20 +0000)]
ixgbe: remove unused functions
Remove functions that are declared, but not used in the driver.
This patch fixes warnings reported by `make namespacecheck`
Reported by Stephen Hemminger
CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 13 Oct 2010 14:06:50 +0000 (14:06 +0000)]
cnic: Decouple uio close from cnic shutdown
During cnic shutdown, the original driver code requires userspace to
close the uio device within a few seconds. This doesn't always happen
as the userapp may be hung or otherwise take a long time to close. The
system may crash when this happens.
We fix the problem by decoupling the uio structures from the cnic
structures during cnic shutdown. We do not unregister the uio device
until the cnic driver is unloaded. This eliminates the unreliable wait
loop for uio to close.
All uio structures are kept in a linked list. If the device is shutdown
and later brought back up again, the uio strcture will be found in the
linked list and coupled back to the cnic structures.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 13 Oct 2010 14:06:47 +0000 (14:06 +0000)]
cnic: Defer iscsi connection cleanup
The bnx2x devices require a 2 second quiet time before sending the last
RAMROD command to destroy a connection. This sleep wait adds up to a
long delay when iscsid is serially destroying maultiple connections.
Create a workqueue to perform the final connection cleanup in the
background to speed up the process. This significantly speeds up the
process as the wait time can be done in parallel for multiple connections.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Tue, 12 Oct 2010 20:14:43 +0000 (20:14 +0000)]
Phonet: 'connect' socket implementation for Pipe controller
Based on suggestion by Rémi Denis-Courmont to implement 'connect'
for Pipe controller logic, this patch implements 'connect' socket
call for the Pipe controller logic.
The patch does following:-
- Removes setsockopts for PNPIPE_CREATE and PNPIPE_DESTROY
- Adds setsockopt for setting the Pipe handle value
- Implements connect socket call
- Updates the Pipe controller logic
User-space should now follow below sequence with Pipe controller:-
-socket
-bind
-setsockopt for PNPIPE_PIPE_HANDLE
-connect
-setsockopt for PNPIPE_ENCAP_IP
-setsockopt for PNPIPE_ENABLE
GPRS/3G data has been tested working fine with this.
Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com> Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Tue, 12 Oct 2010 14:25:58 +0000 (14:25 +0000)]
tipc: clean out all instances of #if 0'd unused code
Remove all instances of legacy, or as yet to be implemented code
that is currently living within an #if 0 ... #endif block.
In the rare instance that some of it be needed in the future,
it can still be dragged out of history, but there is no need
for it to sit in mainline.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 11 Oct 2010 10:22:12 +0000 (10:22 +0000)]
net: percpu net_device refcount
We tried very hard to remove all possible dev_hold()/dev_put() pairs in
network stack, using RCU conversions.
There is still an unavoidable device refcount change for every dst we
create/destroy, and this can slow down some workloads (routers or some
app servers, mmap af_packet)
We can switch to a percpu refcount implementation, now dynamic per_cpu
infrastructure is mature. On a 64 cpus machine, this consumes 256 bytes
per device.
On x86, dev_hold(dev) code :
before
lock incl 0x280(%ebx)
after:
movl 0x260(%ebx),%eax
incl fs:(%eax)
Dmitry Kravkov [Tue, 12 Oct 2010 09:02:21 +0000 (09:02 +0000)]
bnx2x: Fixing a typo: added a missing RSS enablement
Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 11 Oct 2010 18:41:13 +0000 (20:41 +0200)]
dccp: schedule an Ack when receiving timestamps
This schedules an Ack when receiving a timestamp, exploiting the
existing inet_csk_schedule_ack() function, saving one case in the
`dccp_ack_pending()' function.
Ivo Calado [Mon, 11 Oct 2010 18:40:04 +0000 (20:40 +0200)]
dccp: generalise data-loss condition
This patch generalises the task of determining data loss from RFC 4340, 7.7.1.
Let S_A, S_B be sequence numbers such that S_B is "after" S_A, and let
N_B be the NDP count of packet S_B. Then, using modulo-2^48 arithmetic,
D = S_B - S_A - 1 is an upper bound of the number of lost data packets,
D - N_B is an approximation of the number of lost data packets
(there are cases where this is not exact).
The patch implements this as
dccp_loss_count(S_A, S_B, N_B) := max(S_B - S_A - 1 - N_B, 0)
Gerrit Renker [Mon, 11 Oct 2010 18:37:38 +0000 (20:37 +0200)]
dccp: remove unused argument in CCID tx function
This removes the argument `more' from ccid_hc_tx_packet_sent, since it was
nowhere used in the entire code.
(Btw, this argument was not even used in the original KAME code where the
function initially came from; compare the variable moreToSend in the
freebsd61-dccp-kame-28.08.2006.patch kept by Emmanuel Lochin.)
Gerrit Renker [Mon, 11 Oct 2010 18:36:33 +0000 (20:36 +0200)]
dccp: merge now-reduced connect_init() function
After moving the assignment of GAR/ISS from dccp_connect_init() to
dccp_transmit_skb(), the former function becomes very small, so that
a merger with dccp_connect() suggests itself.
Gerrit Renker [Mon, 11 Oct 2010 18:35:40 +0000 (20:35 +0200)]
dccp: fix the adjustments to AWL and SWL
This fixes a problem and a potential loophole with regard to seqno/ackno
validity: currently the initial adjustments to AWL/SWL are only performed
once at the begin of the connection, during the handshake.
Since the Sequence Window feature is always greater than Wmin=32 (7.5.2),
it is however necessary to perform these adjustments at least for the first
W/W' (variables as per 7.5.1) packets in the lifetime of a connection.
This requirement is complicated by the fact that W/W' can change at any time
during the lifetime of a connection.
Therefore it is better to perform that safety check each time SWL/AWL are
updated, as implemented by the patch.
A second problem solved by this patch is that the remote/local Sequence Window
feature values (which set the bounds for AWL/SWL/SWH) are undefined until the
feature negotiation has completed.
During the initial handshake we have more stringent sequence number protection;
the changes added by this patch effect that {A,S}W{L,H} are within the correct
bounds at the instant that feature negotiation completes (since the SeqWin
feature activation handlers call dccp_update_gsr/gss()).
Michael Chan [Mon, 11 Oct 2010 23:12:28 +0000 (16:12 -0700)]
bnx2: Enable AER on PCIE devices only
To prevent unnecessary error message. pci_save_state() is also moved to
the end of ->probe() so that all PCI config, including AER state, will be
saved.
Update version to 2.0.18.
Signed-off-by: Michael Chan <mchan@broadcom.com> Reviewed-by: Benjamin Li <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 11 Oct 2010 23:12:00 +0000 (16:12 -0700)]
bnx2: Update firmware to 6.0.x.
- Improved flow control and simplified interface
- Use hardware RSS indirection table instead of the slower firmware-
based table
- Lower latency interrupt on 5709
Signed-off-by: Michael Chan <mchan@broadcom.com> Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 8 Oct 2010 06:37:34 +0000 (06:37 +0000)]
net dst: use a percpu_counter to track entries
struct dst_ops tracks number of allocated dst in an atomic_t field,
subject to high cache line contention in stress workload.
Switch to a percpu_counter, to reduce number of time we need to dirty a
central location. Place it on a separate cache line to avoid dirtying
read only fields.
Kees Cook [Mon, 11 Oct 2010 19:23:25 +0000 (12:23 -0700)]
net: clear heap allocations for privileged ethtool actions
Several other ethtool functions leave heap uncleared (potentially) by
drivers. Some interfaces appear safe (eeprom, etc), in that the sizes
are well controlled. In some situations (e.g. unchecked error conditions),
the heap will remain unchanged in areas before copying back to userspace.
Note that these are less of an issue since these all require CAP_NET_ADMIN.
Cc: stable@kernel.org Signed-off-by: Kees Cook <kees.cook@canonical.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Slaby [Sun, 10 Oct 2010 23:26:56 +0000 (23:26 +0000)]
NET: pch, fix use after free
Stanse found that pch_gbe_xmit_frame uses skb after it is freed. Fix
that.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Masayuki Ohtake <masa-korg@dsn.okisemi.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Slaby [Sun, 10 Oct 2010 23:26:57 +0000 (23:26 +0000)]
ATM: iphase, remove sleep-inside-atomic
Stanse found that ia_init_one locks a spinlock and inside of that it
calls ia_start which calls:
* request_irq
* tx_init which does kmalloc(GFP_KERNEL)
Both of them can thus sleep and result in a deadlock. I don't see a
reason to have a per-device spinlock there which is used only there
and inited right before the lock location. So remove it completely.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Slaby [Sun, 10 Oct 2010 22:46:34 +0000 (22:46 +0000)]
ATM: mpc, fix use after free
Stanse found that mpc_push frees skb and then it dereferences it. It
is a typo, new_skb should be dereferenced there.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Slaby [Sun, 10 Oct 2010 21:50:44 +0000 (21:50 +0000)]
ATM: solos-pci, remove use after free
Stanse found we do in console_show:
kfree_skb(skb);
return skb->len;
which is not good. Fix that by remembering the len and use it in the
function instead.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Chas Williams <chas@cmf.nrl.navy.mil> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 11 Oct 2010 16:16:57 +0000 (09:16 -0700)]
neigh: speedup neigh_hh_init()
When a new dst is used to send a frame, neigh_resolve_output() tries to
associate an struct hh_cache to this dst, calling neigh_hh_init() with
the neigh rwlock write locked.
Most of the time, hh_cache is already known and linked into neighbour,
so we find it and increment its refcount.
This patch changes the logic so that we call neigh_hh_init() with
neighbour lock read locked only, so that fast path can be run in
parallel by concurrent cpus.
This brings part of the speedup we got with commit c7d4426a98a5f
(introduce DST_NOCACHE flag) for non cached dsts, even for cached ones,
removing one of the contention point that routers hit on multiqueue
enabled machines.
Further improvements would need to use a seqlock instead of an rwlock to
protect neigh->ha[], to not dirty neigh too often and remove two atomic
ops.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Oskar Schirmer [Thu, 7 Oct 2010 02:30:30 +0000 (02:30 +0000)]
net/fec: carrier off initially to avoid root mount failure
with hardware slow in negotiation, the system did freeze
while trying to mount root on nfs at boot time.
the link state has not been initialised so network stack
tried to start transmission right away. this caused instant
retries, as the driver solely stated business upon link down,
rendering the system unusable.
notify carrier off initially to prevent transmission until
phylib will report link up.
Signed-off-by: Oskar Schirmer <oskar@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Kaiser [Thu, 7 Oct 2010 13:14:50 +0000 (13:14 +0000)]
ehea: simplify conditional
Simplify: ((a && b) || (!a && !b)) => (a == b)
Signed-off-by: Nicolas Kaiser <nikai@nikai.net> Acked-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Use DMA API as PCI equivalents will be deprecated. This change also
allow to allocate with GFP_KERNEL where possible.
Tested-by: Neal Becker <ndbecker2@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
r8169: allocate with GFP_KERNEL flag when able to sleep
We have fedora bug report where driver fail to initialize after
suspend/resume because of memory allocation errors:
https://bugzilla.redhat.com/show_bug.cgi?id=629158
To fix use GFP_KERNEL allocation where possible.
Tested-by: Neal Becker <ndbecker2@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Thu, 7 Oct 2010 10:09:10 +0000 (10:09 +0000)]
net: Fix rxq ref counting
The rx->count reference is used to track reference counts to the
number of rx-queue kobjects created for the device. This patch
eliminates initialization of the counter in netif_alloc_rx_queues
and instead increments the counter each time a kobject is created.
This is now symmetric with the decrement that is done when an object is
released.
Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
There are a bunch of issues that need to be fixed, including:
- GFP_KERNEL allocations from atomic context
(and GFP_ATOMIC in process context),
- abuse of the setsockopt() call convention,
- unprotected/unlocked static variables...
IMHO, we will need to alter the userspace ABI when we fix it. So mark
the configuration option as EXPERIMENTAL for the time being (or should
it be BROKEN instead?).
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
...which does not follow the usual socket option pattern. This patch
merges all three "options" into a single gettable&settable option,
before Linux 2.6.37 gets out:
int status;
socklen_t len = sizeof(status);
/* enable pipe */
status = 1;
setsockopt(fd, SOL_PNPIPE, PNPIPE_ENABLE, &status, sizeof(status));
/* disable pipe */
status = 0;
setsockopt(fd, SOL_PNPIPE, PNPIPE_ENABLE, &status, sizeof(status));
/* get status */
getsockopt(fd, SOL_PNPIPE, PNPIPE_ENABLE, &status, &len);
This also fixes the error code from EFAULT to ENOTCONN.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Cc: Kumar Sanghvi <kumar.sanghvi@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Phonet: advise against enabling the pipe controller
As it currently is, the new code path is not compatible with existing
Nokia modems. This would break existing userspace for Nokia modem, such
as the existing oFono ISI driver.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sritej Velaga [Thu, 7 Oct 2010 23:46:10 +0000 (23:46 +0000)]
qlcnic: change all P3 references to P3P
This patch just rename all P3 #define to P3P.
Signed-off-by: Sritej Velaga <sritej.velaga@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rajesh Borundia [Thu, 7 Oct 2010 23:46:09 +0000 (23:46 +0000)]
qlcnic: fix promiscous mode for VF
o Allow promiscous mode setting for VF's depending upon the configuration.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sritej Velaga [Thu, 7 Oct 2010 23:46:08 +0000 (23:46 +0000)]
qlcnic: fix board description
Remove "Flex-10" from board description.
Signed-off-by: Sritej Velaga <sritej.velaga@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Put device in quiescent mode during internal loopback test.
Before running test, set state to NEED_QUISCENT. After getting
ack from all function, change state to QUISCENT and perform test.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently fw recovery usage global workqueue.
As same workqueue used by kernel for ethtool and etc., supporting
quiescent mode is not possible, without driver private workqueue.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ipv4: Remove leftover rcu_read_unlock calls from __mkroute_output()
Commit "fib: RCU conversion of fib_lookup()" removed rcu_read_lock() from
__mkroute_output but left a couple of calls to rcu_read_unlock() in there.
This causes lockdep to complain that the rcu_read_unlock() call in
__ip_route_output_key causes a lock inbalance and quickly crashes the
kernel. The below fixes this for me.
Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Thu, 7 Oct 2010 10:03:48 +0000 (10:03 +0000)]
net: clear heap allocation for ETHTOOL_GRXCLSRLALL
Calling ETHTOOL_GRXCLSRLALL with a large rule_cnt will allocate kernel
heap without clearing it. For the one driver (niu) that implements it,
it will leave the unused portion of heap unchanged and copy the full
contents back to userspace.
Signed-off-by: Kees Cook <kees.cook@canonical.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Fri, 8 Oct 2010 17:36:10 +0000 (10:36 -0700)]
sfc: Don't try to set filters with search depths we know won't work
The filter engine will time-out and ignore filters beyond
200-something hops. We also need to avoid infinite loops in
efx_filter_search() when the table is full.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 8 Oct 2010 17:21:22 +0000 (10:21 -0700)]
isdn: strcpy() => strlcpy()
setup.phone and setup.eazmsn are 32 character buffers.
rcvmsg.msg_data.byte_array is a 48 character buffer.
sc_adapter[card]->channel[rcvmsg.phy_link_no - 1].dn is 50 chars.
The rcvmsg struct comes from the memcpy_fromio() in receivemessage().
I guess that means it's data off the wire. I'm not very familiar with
this code but I don't see any reason to assume these strings are NULL
terminated.
Also it's weird that "dn" in a 50 character buffer but we only seem to
use 32 characters. In drivers/isdn/sc/scioc.h, "dn" is only a 49
character buffer. So potentially there is still an issue there.
The important thing for now is to prevent the memory corruption.
Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following commit removed DISABLE_REGWRITE_BUFFER ops. The unnecessary
REGWRITE_BUFFER_FLUSH was not removed properly which is causing failure on
hw reset.
Author: Felix Fietkau <nbd@openwrt.org>
Date: Tue Oct 5 12:03:42 2010 +0200
ath9k_hw: clean up register write buffering
Signed-off-by: Rajkumar Manoharan <rmanoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>