Eric Dumazet [Tue, 22 Jun 2010 17:22:17 +0000 (10:22 -0700)]
net: Introduce u64_stats_sync infrastructure
To properly implement 64bits network statistics on 32bit or 64bit hosts,
we provide one new type and four methods, to ease conversions.
Stats producer should use following template granted it already got an
exclusive access to counters (include/linux/u64_stats_sync.h contains
some documentation about details)
John Fastabend [Wed, 16 Jun 2010 14:18:12 +0000 (14:18 +0000)]
net: consolidate netif_needs_gso() checks
netif_needs_gso() is checked twice in the TX path once,
before submitting the skb to the qdisc and once after
it is dequeued from the qdisc just before calling
ndo_hard_start(). This opens a window for a user to
change the gso/tso or tx checksum settings that can
cause netif_needs_gso to be true in one check and false
in the other.
Specifically, changing TX checksum setting may cause
the warning in skb_gso_segment() to be triggered if
the checksum is calculated earlier.
This consolidates the netif_needs_gso() calls so that
the stack only checks if gso is needed in
dev_hard_start_xmit().
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Thu, 17 Jun 2010 18:59:48 +0000 (18:59 +0000)]
e1000e: disable gig speed when in S0->Sx transition
Most of this workaround is necessary for all ICHx/PCH parts so one of
the two MAC-type checks can be removed.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Thu, 17 Jun 2010 18:59:27 +0000 (18:59 +0000)]
e1000e: packet split should not be used with early receive
Originally it was thought there were issues with ICHx/PCH parts with packet
split when jumbo frames were enabled but in fact it is really only when
early-receive is enabled (via ERT register) on these parts. Use packet
split with jumbos but only when early-receive is not enabled.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Thu, 17 Jun 2010 18:59:06 +0000 (18:59 +0000)]
e1000e: do not touch PHY page 800 registers when link speed is 1000Mbps
The PHY on 82577/82578 has issues when the registers on page 800 are
accessed when in gigabit mode. Do not clear the Wakeup Control register
when resetting the part since it is on page 800 (and will be cleared on
reset anyway).
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Thu, 17 Jun 2010 18:58:43 +0000 (18:58 +0000)]
e1000e: avoid polling h/w registers during link negotiation
Avoid touching hardware registers when possible, otherwise link negotiation
can get messed up when user-level scripts are rapidly polling the driver to
see if/when link is up. Use the saved link state information instead when
possible.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sjur Braendeland [Thu, 17 Jun 2010 06:55:41 +0000 (06:55 +0000)]
caif: Add debug connection type for CAIF.
Added new CAIF protocol type CAIFPROTO_DEBUG for accessing
CAIF debug on the ST Ericsson modems.
There are two debug servers on the modem, one for radio related
debug (CAIF_RADIO_DEBUG_SERVICE) and the other for
communication/application related debug (CAIF_COM_DEBUG_SERVICE).
The debug connection can contain trace debug printouts or
interactive debug used for debugging and test.
Debug connections can be of type STREAM or SEQPACKET.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Wed, 16 Jun 2010 13:28:34 +0000 (13:28 +0000)]
e1000e: update driver version number
Also separate out an _EXTRAVERSION similar to the core kernel.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Wed, 16 Jun 2010 13:28:11 +0000 (13:28 +0000)]
e1000e: update copyright information
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Wed, 16 Jun 2010 13:27:49 +0000 (13:27 +0000)]
e1000e: enable support for EEE on 82579
This patch enables IEEE802.3az (a.k.a. Energy Efficient Ethernet) on the
new 82579 LOMs. An optional module parameter is provided to disable the
feature if desired.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Wed, 16 Jun 2010 13:27:28 +0000 (13:27 +0000)]
e1000e: initial support for 82579 LOMs
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Wed, 16 Jun 2010 13:27:05 +0000 (13:27 +0000)]
e1000e: fix check for manageability on ICHx/PCH
Do not check for all the bits in E1000_FWSM_MODE_MASK when checking for
manageability on 82577/82578; only check if iAMT is enabled. Both of the
manageability checks (for 82577/82578 and ICHx) must check the firmware
valid bit too since the other bits are only valid when the latter is set.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Wed, 16 Jun 2010 13:26:41 +0000 (13:26 +0000)]
e1000e: separate out PHY statistics register updates
The 82577/82578 parts have half-duplex statistics in PHY registers. These
need only be read when in half-duplex and should all be read at once rather
than one at a time to prevent excessive cycles of acquiring/releasing the
PHY semaphore.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Wed, 16 Jun 2010 13:26:17 +0000 (13:26 +0000)]
e1000e: cleanup e1000_sw_lcd_config_ich8lan()
Do not acquire and release the PHY unnecessarily for parts that return
from this workaround without actually accessing the PHY registers.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Wed, 16 Jun 2010 13:25:55 +0000 (13:25 +0000)]
e1000e: cleanup ethtool loopback setup code
Refactor the loopback setup code to first handle the only 10/100 PHY
supported by the driver if applicable and then handle the 1Gig PHYs in a
switch statement for PHY-specific setups.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4: dynamically determine flash size and FW image location
Handle the larger flash memories on newer boards:
- get the size and number of sectors by probing the flash
- writes and erases can take longer, adjust the timeouts for these operations
- the FW image can be at different locations depending on flash size,
find its location dynamically as well.
Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
For certain set of register, base window addresses are not defined.
In such cases window should not set.
Return with error for such cases to avoid NMI.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
There is a race between netif_stop_queue and netif_stopped_queue
check. So check once again if buffers are available to avoid race.
With above logic we can also get rid of tx lock in process_cmd_ring.
Signed-off-by: Rajesh K Borundia <rajesh.borundia@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
schacko [Thu, 17 Jun 2010 02:56:40 +0000 (02:56 +0000)]
qlcnic: seperate interrupt for TX
Earlier all poll routine can process rx and tx, But now
one poll routine to process rx + tx and other for rx only.
Last msix vector will be used for separate tx interrupt.
o This is supported from fw version 4.4.2.
o Bump version 5.0.5
Signed-off-by: Sony Chacko <schacko@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sritej Velaga [Thu, 17 Jun 2010 02:56:39 +0000 (02:56 +0000)]
qlcnic: change driver description
o Remove extra printing of mac address
o This driver also supports NIC only Qlogic adapters.
Signed-off-by: Sritej Velaga <sritej.velaga@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
During device soft reset, don't halt every device block.
Access to some blocks is required during recovery.
Signed-off-by: Sucheta Chakraborty <sucheta@dut4145.unminc.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
af_unix: Allow connecting to sockets in other network namespaces.
Remove the restriction that only allows connecting to a unix domain
socket identified by unix path that is in the same network namespace.
Crossing network namespaces is always tricky and we did not support
this at first, because of a strict policy of don't mix the namespaces.
Later after Pavel proposed this we did not support this because no one
had performed the audit to make certain using unix domain sockets
across namespaces is safe.
What fundamentally makes connecting to af_unix sockets in other
namespaces is safe is that you have to have the proper permissions on
the unix domain socket inode that lives in the filesystem. If you
want strict isolation you just don't create inodes where unfriendlys
can get at them, or with permissions that allow unfriendlys to open
them. All nicely handled for us by the mount namespace and other
standard file system facilities.
I looked through unix domain sockets and they are a very controlled
environment so none of the work that goes on in dev_forward_skb to
make crossing namespaces safe appears needed, we are not loosing
controll of the skb and so do not need to set up the skb to look like
it is comming in fresh from the outside world. Further the fields in
struct unix_skb_parms should not have any problems crossing network
namespaces.
Now that we handle SCM_CREDENTIALS in a way that gives useable values
across namespaces. There does not appear to be any operational
problems with encouraging the use of unix domain sockets across
containers either.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
af_unix: Allow credentials to work across user and pid namespaces.
In unix_skb_parms store pointers to struct pid and struct cred instead
of raw uid, gid, and pid values, then translate the credentials on
reception into values that are meaningful in the receiving processes
namespaces.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
scm: Capture the full credentials of the scm sender.
Start capturing not only the userspace pid, uid and gid values of the
sending process but also the struct pid and struct cred of the sending
process as well.
This is in preparation for properly supporting SCM_CREDENTIALS for
sockets that have different uid and/or pid namespaces at the different
ends.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: David S. Miller <davem@davemloft.net>
af_netlink: Add needed scm_destroy after scm_send.
scm_send occasionally allocates state in the scm_cookie, so I have
modified netlink_sendmsg to guarantee that when scm_send succeeds
scm_destory will be called to free that state.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
af_unix: Allow SO_PEERCRED to work across namespaces.
Use struct pid and struct cred to store the peer credentials on struct
sock. This gives enough information to convert the peer credential
information to a value relative to whatever namespace the socket is in
at the time.
This removes nasty surprises when using SO_PEERCRED on socket
connetions where the processes on either side are in different pid and
user namespaces.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
To keep the coming code clear and to allow both the sock
code and the scm code to share the logic introduce a
fuction to translate from struct cred to struct ucred.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
user_ns: Introduce user_nsmap_uid and user_ns_map_gid.
Define what happens when a we view a uid from one user_namespace
in another user_namepece.
- If the user namespaces are the same no mapping is necessary.
- For most cases of difference use overflowuid and overflowgid,
the uid and gid currently used for 16bit apis when we have a 32bit uid
that does fit in 16bits. Effectively the situation is the same,
we want to return a uid or gid that is not assigned to any user.
- For the case when we happen to be mapping the uid or gid of the
creator of the target user namespace use uid 0 and gid as confusing
that user with root is not a problem.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reorder the fields in scm_cookie so they pack better on 64bit.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
qlcnic: Fix a bug in setting up NIC partitioning mode
The driver was not detecting the presence of NIC partitioning capability of the
firmware properly. Now, it checks the eswitch set bit in the FW capabilities
register and accordingly sets the driver mode as NPAR capable or not.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Wed, 16 Jun 2010 21:42:15 +0000 (14:42 -0700)]
syncookies: check decoded options against sysctl settings
Discard the ACK if we find options that do not match current sysctl
settings.
Previously it was possible to create a connection with sack, wscale,
etc. enabled even if the feature was disabled via sysctl.
Also remove an unneeded call to tcp_sack_reset() in
cookie_check_timestamp: Both call sites (cookie_v4_check,
cookie_v6_check) zero "struct tcp_options_received", hand it to
tcp_parse_options() (which does not change tcp_opt->num_sacks/dsack)
and then call cookie_check_timestamp().
Even if num_sacks/dsacks were changed, the structure is allocated on
the stack and after cookie_check_timestamp returns only a few selected
members are copied to the inet_request_sock.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruno Randolf [Wed, 16 Jun 2010 10:12:17 +0000 (19:12 +0900)]
ath5k: review and add comments for descriptors
I carefully reviewed desh.h against the HAL sources. Added comments and made
differences between 5210, 5211 and 5212 more clear by adding _521x to the
defines which are specific to that chipset. Renamed some defines. No functional
changes.
Signed-off-by: Bruno Randolf <br1@einfach.org> Acked-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bruno Randolf [Wed, 16 Jun 2010 10:12:01 +0000 (19:12 +0900)]
ath5k: use direct function calls for descriptors when possible
Use direct function calls for ath5k_hw_setup_rx_desc() and
ath5k_hw_setup_mrr_tx_desc() instead of a function pointer which always pointed
to the same function in the case of ath5k_hw_setup_rx_desc() and which is
easily unified in the case of ath5k_hw_setup_mrr_tx_desc().
Also simplify the initialization function for the remaining function pointers.
Signed-off-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bruno Randolf [Wed, 16 Jun 2010 10:11:56 +0000 (19:11 +0900)]
ath5k: move checks and stats into new function
Create a new function ath5k_receive_frame_ok() which checks for errors, updates
error statistics and tells us if we want to further "receive" this frame or
not. This way we can avoid a goto and have a cleaner separation between buffer
handling and other things.
Signed-off-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bruno Randolf [Wed, 16 Jun 2010 10:11:51 +0000 (19:11 +0900)]
ath5k: split descriptor handling and frame receive
Move frame reception into it's own function to have a clearer separation
between buffer and descriptor handling and things that are done when we
actually receive a frame.
Signed-off-by: Bruno Randolf <br1@einfach.org> Acked-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
BugLink: http://bugs.launchpad.net/bugs/21367
Enable LED by default and update the MODULE_PARM_DESC. The original
reason for defaulting to disabled was documented in 2005 and noted, "The
LED code has been reported to hang some systems when running ifconfig
and is therefore disabled by default." This no longer appears
applicable and users have been requesting this be enabled for several
years.
Signed-off-by: TJ <ubuntu@tjworld.net> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Leann Ogasawara [Tue, 15 Jun 2010 21:01:51 +0000 (14:01 -0700)]
p54usb: Comment out duplicate Medion MD40900 device id
The Medion MD40900 device id [0x0cde, 0x0006] is defined twice.
Comment out the duplicate.
Originally-by: Ben Collins <ben.collins@ubuntu.com> Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Eric Dumazet [Wed, 16 Jun 2010 04:52:13 +0000 (04:52 +0000)]
inetpeer: restore small inet_peer structures
Addition of rcu_head to struct inet_peer added 16bytes on 64bit arches.
Thats a bit unfortunate, since old size was exactly 64 bytes.
This can be solved, using an union between this rcu_head an four fields,
that are normally used only when a refcount is taken on inet_peer.
rcu_head is used only when refcnt=-1, right before structure freeing.
Add a inet_peer_refcheck() function to check this assertion for a while.
We can bring back SLAB_HWCACHE_ALIGN qualifier in kmem cache creation.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 16 Jun 2010 01:16:43 +0000 (18:16 -0700)]
net: NET_SKB_PAD should depend on L1_CACHE_BYTES
In old kernels, NET_SKB_PAD was defined to 16.
Then commit d6301d3dd1c2 (net: Increase default NET_SKB_PAD to 32), and
commit 18e8c134f4e9 (net: Increase NET_SKB_PAD to 64 bytes) increased it
to 64.
While first patch was governed by network stack needs, second was more
driven by performance issues on current hardware. Real intent was to
align data on a cache line boundary.
So use max(32, L1_CACHE_BYTES) instead of 64, to be more generic.
Remove microblaze and powerpc own NET_SKB_PAD definitions.
Thanks to Alexander Duyck and David Miller for their comments.
Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Sun, 13 Jun 2010 11:29:39 +0000 (11:29 +0000)]
ipv6: syncookies: do not skip ->iif initialization
When syncookies are in effect, req->iif is left uninitialized.
In case of e.g. link-local addresses the route lookup then fails
and no syn-ack is sent.
Rearrange things so ->iif is also initialized in the syncookie case.
want_cookie can only be true when the isn was zero, thus move the want_cookie
check into the "!isn" branch.
Cc: Glenn Griffin <ggriffin.kernel@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Sonic Zhang [Fri, 11 Jun 2010 09:44:31 +0000 (17:44 +0800)]
netdev:bfin_mac: reclaim and free tx skb as soon as possible after transfer
SKBs hold onto resources that can't be held indefinitely, such as TCP
socket references and netfilter conntrack state. So if a packet is left
in TX ring for a long time, there might be a TCP socket that cannot be
closed and freed up.
Current blackfin EMAC driver always reclaim and free used tx skbs in future
transfers. The problem is that future transfer may not come as soon as
possible. This patch start a timer after transfer to reclaim and free skb.
There is nearly no performance drop with this patch.
TX interrupt is not enabled because of a strange behavior of the Blackfin EMAC.
If EMAC TX transfer control is turned on, endless TX interrupts are triggered
no matter if TX DMA is enabled or not. Since DMA walks down the ring automatically,
TX transfer control can't be turned off in the middle. The only way is to disable
TX interrupt completely.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 15 Jun 2010 08:23:14 +0000 (08:23 +0000)]
inetpeer: RCU conversion
inetpeer currently uses an AVL tree protected by an rwlock.
It's possible to make most lookups use RCU
1) Add a struct rcu_head to struct inet_peer
2) add a lookup_rcu_bh() helper to perform lockless and opportunistic
lookup. This is a normal function, not a macro like lookup().
3) Add a limit to number of links followed by lookup_rcu_bh(). This is
needed in case we fall in a loop.
4) add an smp_wmb() in link_to_pool() right before node insert.
5) make unlink_from_pool() use atomic_cmpxchg() to make sure it can take
last reference to an inet_peer, since lockless readers could increase
refcount, even while we hold peers.lock.
6) Delay struct inet_peer freeing after rcu grace period so that
lookup_rcu_bh() cannot crash.
7) inet_getpeer() first attempts lockless lookup.
Note this lookup can fail even if target is in AVL tree, but a
concurrent writer can let tree in a non correct form.
If this attemps fails, lock is taken a regular lookup is performed
again.
8) convert peers.lock from rwlock to a spinlock
9) Remove SLAB_HWCACHE_ALIGN when peer_cachep is created, because
rcu_head adds 16 bytes on 64bit arches, doubling effective size (64 ->
128 bytes)
In a future patch, this is probably possible to revert this part, if rcu
field is put in an union to share space with rid, ip_id_count, tcp_ts &
tcp_ts_stamp. These fields being manipulated only with refcnt > 0.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 15 Jun 2010 08:57:03 +0000 (08:57 +0000)]
cnic: Fix cnic_cm_abort() error handling.
Fix the code that handles the error case when cnic_cm_abort() cannot
proceed normally. We cannot just set the csk->state and we must
go through cnic_ready_to_close() to handle all the conditions. We
also add error return code in cnic_cm_abort().
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Eddie Wai <waie@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 15 Jun 2010 08:57:02 +0000 (08:57 +0000)]
cnic: Refactor and fix cnic_ready_to_close().
Combine RESET_RECEIVED and RESET_COMP logic and fix race condition
between these 2 events and cnic_cm_close(). In particular, we need
to (test_and_clear_bit(SK_F_OFFLD_COMPLETE, &csk->flags)) before we
update csk->state.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Eddie Wai <waie@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Tue, 15 Jun 2010 09:25:48 +0000 (09:25 +0000)]
ixgbe: update set_rx_mode to fix issues w/ macvlan
This change corrects issues where macvlan was not correctly triggering
promiscuous mode on ixgbe due to the filters not being correctly set. It
also corrects the fact that VF rar filters were being overwritten when the
PF was reset.
CC: Shirley Ma <xma@us.ibm.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ath9k_hw: avoid setting cwmin/cwmax to 0 for IBSS for AR9003
IBSS requires the cwmin and cwmax to be respected when
we reset the txqueues on AR9003 otherwise the distribution
of beacons will be balanced towards the AR9003 card first
preventing equal contention for air time for other peers
on the IBSS.
Without this IBSS will work but only the AR9003 card will be
be issuing beacons on the IBSS.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Mon, 14 Jun 2010 20:14:19 +0000 (22:14 +0200)]
rt2x00: Synchronize WCID initialization with legacy driver
Legacy rt2870 driver handles WCID differently then we expected,
the BSSID and Cipher value are 3 bit values, while the 4th bit
should be set elsewhere in an extended field.
After this, rt2800usb reports frames have been decrypted
successfully, indicating that the Hardware decryption now is
working correctly.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Mon, 14 Jun 2010 20:13:56 +0000 (22:13 +0200)]
rt2x00: Enable HW crypto by default
Hardware cryptography seems to be working
on a 11G network with WPA/WPA2 cryptography
enabled. WEP still needs to be tested...
Signed-of-by: Ivo van Doorn <IvDoorn@gmail.com> Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Mon, 14 Jun 2010 20:13:37 +0000 (22:13 +0200)]
rt2x00: Limit TX done looping to number of TX ring entries
Similar to rt2800pci, remove the check for duplicate
register reading, and instead limit the for-loop to
the maximum number of TX entries inside a queue.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Mon, 14 Jun 2010 20:13:15 +0000 (22:13 +0200)]
rt2x00: Update author rt2800lib
rt2800lib has been under development of the rt2x00 project,
so add it to the author string for the module information.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Mon, 14 Jun 2010 20:12:54 +0000 (22:12 +0200)]
rt2x00: Enable fallback rates for rt61pci and rt73usb
Explicitly enable the usage of fallback rates for
the transmission of frames with rt61pci and rt73usb hardware.
Note that for txdone reporting, only rt61pci is capable of
reporting the fallback rates, for USB it is not possible
to determine the number of retries. However the device will
use the fallback rates, so it might still help in the performance.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Helmut Schaa [Mon, 14 Jun 2010 20:12:26 +0000 (22:12 +0200)]
rt2x00: Fix tx status reporting when falling back to the lowest rate
In some corner cases the reported tx rates/retries didn't match the really
used ones.
The hardware lowers the tx rate on each consecutive retry by 1 (but won't
fall back from MCS to legacy rates) _until_ it reaches the lowest one.
In case the frame wasn't sent succesful the number of retries is 7 and if
a rate index <7 was used the previous code reported negative rate indexes
which were then ignored by the rate control algorithm and mac80211.
Instead, report the remaining number of retries to have happened with
the lowest rate (index 0). This should give the rate control algorithm
slightly more accurate information about the used tx rates/retries.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Helmut Schaa [Mon, 14 Jun 2010 20:12:01 +0000 (22:12 +0200)]
rt2x00: provide mac80211 a suitable max_rates value
Set up max_rates and max_rate_tries with suitable values even if we do not
support the whole functionality.
As rt2800 has a global fallback table we cannot specify more then one tx rate
per frame but since the hw will try several different rates (based on the
fallback table) we should still initialize max_rates to the maximum number of
rates we are going to try. Otherwise mac80211 will truncate our reported tx
rates and the rc algortihm will end up with incorrect data choosing unsuitable
rates for tx.
This improves throughput on rt2800 devices considerable.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>