Allow master and backup servers to use many threads
for sync traffic. Add sysctl var "sync_ports" to define the
number of threads. Every thread will use single UDP port,
thread 0 will use the default port 8848 while last thread
will use port 8848+sync_ports-1.
The sync traffic for connections is scheduled to many
master threads based on the cp address but one connection is
always assigned to same thread to avoid reordering of the
sync messages.
Remove ip_vs_sync_switch_mode because this check
for sync mode change is still risky. Instead, check for mode
change under sync_buff_lock.
Make sure the backup socks do not block on reading.
Special thanks to Aleksey Chudov for helping in all tests.
Add two new sysctl vars to control the sync rate with the
main idea to reduce the rate for connection templates because
currently it depends on the packet rate for controlled connections.
This mechanism should be useful also for normal connections
with high traffic.
sync_refresh_period: in seconds, difference in reported connection
timer that triggers new sync message. It can be used to
avoid sync messages for the specified period (or half of
the connection timeout if it is lower) if connection state
is not changed from last sync.
sync_retries: integer, 0..3, defines sync retries with period of
sync_refresh_period/8. Useful to protect against loss of
sync messages.
Allow sysctl_sync_threshold to be used with
sysctl_sync_period=0, so that only single sync message is sent
if sync_refresh_period is also 0.
Add new field "sync_endtime" in connection structure to
hold the reported time when connection expires. The 2 lowest
bits will represent the retry count.
As the sysctl_sync_period now can be 0 use ACCESS_ONCE to
avoid division by zero.
Special thanks to Aleksey Chudov for being patient with me,
for his extensive reports and helping in all tests.
High rate of sync messages in master can lead to
overflowing the socket buffer and dropping the messages.
Fixed sleep of 1 second without wakeup events is not suitable
for loaded masters,
Use delayed_work to schedule sending for queued messages
and limit the delay to IPVS_SYNC_SEND_DELAY (20ms). This will
reduce the rate of wakeups but to avoid sending long bursts we
wakeup the master thread after IPVS_SYNC_WAKEUP_RATE (8) messages.
Add hard limit for the queued messages before sending
by using "sync_qlen_max" sysctl var. It defaults to 1/32 of
the memory pages but actually represents number of messages.
It will protect us from allocating large parts of memory
when the sending rate is lower than the queuing rate.
As suggested by Pablo, add new sysctl var
"sync_sock_size" to configure the SNDBUF (master) or
RCVBUF (slave) socket limit. Default value is 0 (preserve
system defaults).
Change the master thread to detect and block on
SNDBUF overflow, so that we do not drop messages when
the socket limit is low but the sync_qlen_max limit is
not reached. On ENOBUFS or other errors just drop the
messages.
Change master thread to enter TASK_INTERRUPTIBLE
state early, so that we do not miss wakeups due to messages or
kthread_should_stop event.
Thanks to Pablo Neira Ayuso for his valuable feedback!
Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
ipvs: always update some of the flags bits in backup
As the goal is to mirror the inactconns/activeconns
counters in the backup server, make sure the cp->flags are
updated even if cp is still not bound to dest. If cp->flags
are not updated ip_vs_bind_dest will rely only on the initial
flags when updating the counters. To avoid mistakes and
complicated checks for protocol state rely only on the
IP_VS_CONN_F_INACTIVE bit when updating the counters.
ipvs: fix ip_vs_try_bind_dest to rebind app and transmitter
Initially, when the synced connection is created we
use the forwarding method provided by master but once we
bind to destination it can be changed. As result, we must
update the application and the transmitter.
As ip_vs_try_bind_dest is called always for connections
that require dest binding, there is no need to validate the
cp and dest pointers.
Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
ipvs: remove check for IP_VS_CONN_F_SYNC from ip_vs_bind_dest
As the IP_VS_CONN_F_INACTIVE bit is properly set
in cp->flags for all kind of connections we do not need to
add special checks for synced connections when updating
the activeconns/inactconns counters for first time. Now
logic will look just like in ip_vs_unbind_dest.
Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
ipvs: ignore IP_VS_CONN_F_NOOUTPUT in backup server
As IP_VS_CONN_F_NOOUTPUT is derived from the
forwarding method we should get it from conn_flags just
like we do it for IP_VS_CONN_F_FWD_MASK bits when binding
to real server.
Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
if net.bridge.bridge-nf-filter-vlan-tagged sysctl is enabled, bridge
netfilter removes the vlan header temporarily and then feeds the packet
to ip(6)tables.
When the new "bridge-nf-pass-vlan-input-device" sysctl is on
(default off), then bridge netfilter will also set the
in-interface to the vlan interface; if such an interface exists.
This is needed to make iptables REDIRECT target work with
"vlan-on-top-of-bridge" setups and to allow use of "iptables -i" to
match the vlan device name.
Also update Documentation with current brnf default settings.
Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Dumazet [Wed, 18 Apr 2012 04:36:40 +0000 (06:36 +0200)]
netfilter: nf_conntrack: use this_cpu_inc()
this_cpu_inc() is IRQ safe and faster than
local_bh_disable()/__this_cpu_inc()/local_bh_enable(), at least on x86.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Christoph Lameter <cl@linux.com> Cc: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[ Note: flows that already got a helper will keep using it even
if automatic helper assignment has been disabled ]
Once this behaviour has been disabled, you have to explicitly
use the iptables CT target to attach helper to flows.
There are good reasons to stop supporting automatic helper
assignment, for further information, please read:
http://www.netfilter.org/news.html#2012-04-03
This patch also adds one message to inform that automatic helper
assignment is deprecated and it will be removed soon (this is
spotted only once, with the first flow that gets a helper attached
to make it as less annoying as possible).
Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch allows the GPIO/LED settings to be configured by the
EEPROM if present, and only sets the default values (LED outputs
for link/activity) when an EEPROM is not detected.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Only a write is necessary to clear the interrupt status, and we
don't use the value from the preceding read operation. This
patch eliminates the unnecessary read.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Resolved the iwlwifi conflict with mainline using 3-way diff posted
by John Linville and Stephen Rothwell. In 'net' we added a bug
fix to make iwlwifi report a more accurate skb->truesize but this
conflicted with RX path changes that happened meanwhile in net-next.
In e1000e a conflict arose in the validation code for settings of
adapter->itr. 'net-next' had more sophisticated logic so that
logic was used.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 May 2012 03:05:13 +0000 (23:05 -0400)]
Merge branch 'vhost-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Michael S. Tsirkin says:
--------------------
There are mostly bugfixes here.
I hope to merge some more patches by 3.5, in particular
vlan support fixes are waiting for Eric's ack,
and a version of tracepoint patch might be
ready in time, but let's merge what's ready so it's testable.
This includes a ton of zerocopy fixes by Jason -
good stuff but too intrusive for 3.4 and zerocopy is experimental
anyway.
virtio supported delayed interrupt for a while now
so adding support to the virtio tool made sense
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 3 May 2012 22:37:45 +0000 (22:37 +0000)]
net: IP_MULTICAST_IF setsockopt now recognizes struct mreq
Until now, struct mreq has not been recognized and it was worked with
as with struct in_addr. That means imr_multiaddr was copied to
imr_address. So do recognize struct mreq here and copy that correctly.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Daney [Wed, 2 May 2012 15:16:38 +0000 (15:16 +0000)]
netdev/of/phy: Add MDIO bus multiplexer support.
This patch adds a somewhat generic framework for MDIO bus
multiplexers. It is modeled on the I2C multiplexer.
The multiplexer is needed if there are multiple PHYs with the same
address connected to the same MDIO bus adepter, or if there is
insufficient electrical drive capability for all the connected PHY
devices.
Conceptually it could look something like this:
------------------
| Control Signal |
--------+---------
|
--------------- --------+------
| MDIO MASTER |---| Multiplexer |
--------------- --+-------+----
| |
C C
h h
i i
l l
d d
| |
--------- A B ---------
| | | | | |
| PHY@1 +-------+ +---+ PHY@1 |
| | | | | |
--------- | | ---------
--------- | | ---------
| | | | | |
| PHY@2 +-------+ +---+ PHY@2 |
| | | |
--------- ---------
This framework configures the bus topology from device tree data. The
mechanics of switching the multiplexer is left to device specific
drivers.
The follow-on patch contains a multiplexer driven by GPIO lines.
Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Daney [Wed, 2 May 2012 15:16:37 +0000 (15:16 +0000)]
netdev/of/phy: New function: of_mdio_find_bus().
Add of_mdio_find_bus() which allows an mii_bus to be located given its
associated the device tree node.
This is needed by the follow-on patch to add a driver for MDIO bus
multiplexers.
The of_mdiobus_register() function is modified so that the device tree
node is recorded in the mii_bus. Then we can find it again by
iterating over all mdio_bus_class devices.
Because the OF device tree has now become an integral part of the
kernel, this can live in mdio_bus.c (which contains the needed
mdio_bus_class structure) instead of of_mdio.c.
Signed-off-by: David Daney <david.daney@cavium.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
isdn/capi: elliminate capincci_find() in non-middleware case
If Kernel CAPI is compiled without CONFIG_ISDN_CAPI_MIDDLEWARE,
the structure retrieved via capincci_find() is never actually
used, so don't compile that function in that case.
Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
Various functions in the Gigaset driver were using different
conventions for the meaning of their int return values.
Align them to the usual negative error numbers convention.
Inspired-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
Functions clear_at_state and free_strings did the same thing;
drop one of them, keeping the more descriptive name.
Drop a redundant call.
Rename function dealloc_at_states to dealloc_temp_at_states
to clarify its purpose.
Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
isdn/gigaset: improve error handling querying firmware version
An out-of-place "OK" response to the "AT+GMR" (get firmware version)
command turns out to be, more often than not, a delayed response to
a previous command rather than an actual error, so continue waiting
for the version number in that case.
Signed-off-by: Tilman Schmidt <tilman@imap.cc> CC: stable <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
If DISCONNECT_B3_IND was synthesized because of a DISCONNECT_REQ
with existing logical connections, the connection state wasn't
updated accordingly. Also the emitted DISCONNECT_B3_IND message
wasn't included in the debug log as requested.
This patch fixes both of these issues.
Signed-off-by: Tilman Schmidt <tilman@imap.cc> CC: stable <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a global ratelimit for CAPI message dumps to protect
against possible log flood.
Drop the ratelimit for ignored messages which is now covered by the
global one.
Signed-off-by: Tilman Schmidt <tilman@imap.cc> CC: stable <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Mon, 7 May 2012 13:39:06 +0000 (15:39 +0200)]
net: compare_ether_addr[_64bits]() has no ordering
Neither compare_ether_addr() nor compare_ether_addr_64bits()
(as it can fall back to the former) have comparison semantics
like memcmp() where the sign of the return value indicates sort
order. We had a bug in the wireless code due to a blind memcmp
replacement because of this.
A cursory look suggests that the wireless bug was the only one
due to this semantic difference.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Fri, 4 May 2012 14:26:56 +0000 (14:26 +0000)]
skb: Add inline helper for getting the skb end offset from head
With the recent changes for how we compute the skb truesize it occurs to me
we are probably going to have a lot of calls to skb_end_pointer -
skb->head. Instead of running all over the place doing that it would make
more sense to just make it a separate inline skb_end_offset(skb) that way
we can return the correct value without having gcc having to do all the
optimization to cancel out skb->head - skb->head.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Fri, 4 May 2012 14:26:51 +0000 (14:26 +0000)]
skb: Drop "fastpath" variable for skb_cloned check in pskb_expand_head
Since there is now only one spot that actually uses "fastpath" there isn't
much point in carrying it. Instead we can just use a check for skb_cloned
to verify if we can perform the fast-path free for the head or not.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Fri, 4 May 2012 14:26:46 +0000 (14:26 +0000)]
skb: Drop bad code from pskb_expand_head
The fast-path for pskb_expand_head contains a check where the size plus the
unaligned size of skb_shared_info is compared against the size of the data
buffer. This code path has two issues. First is the fact that after the
recent changes by Eric Dumazet to __alloc_skb and build_skb the shared info
is always placed in the optimal spot for a buffer size making this check
unnecessary. The second issue is the fact that the check doesn't take into
account the aligned size of shared info. As a result the code burns cycles
doing a memcpy with nothing actually being shifted.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
cdc_ether: Ignore bogus union descriptor for RNDIS devices
Some RNDIS devices include a bogus CDC Union descriptor pointing
to non-existing interfaces. The RNDIS code is already prepared
to handle devices without a CDC Union descriptor by hardwiring
the driver to use interfaces 0 and 1, which is correct for the
devices with the bogus descriptor as well. So we can reuse the
existing workaround.
Cc: Markus Kolb <linux-201011@tower-net.de> Cc: Iker Salmón San Millán <shaola@esdebian.org> Cc: Jonathan Nieder <jrnieder@gmail.com> Cc: Oliver Neukum <oliver@neukum.org> Cc: 655387@bugs.debian.org Cc: stable@vger.kernel.org Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Sun, 6 May 2012 07:05:57 +0000 (07:05 +0000)]
bnx2x: bug fix when loading after SAN boot
This is a bug fix for an "interface fails to load" issue.
The issue occurs when bnx2x driver loads after UNDI driver was previously
loaded over the chip. In such a scenario the UNDI driver is loaded and operates
in the pre-boot kernel, within its own specific host memory address range.
When the pre-boot stage is complete, the real kernel is loaded, in a new and
distinct host memory address range. The transition from pre-boot stage to boot
is asynchronous from UNDI point of view.
A race condition occurs when UNDI driver triggers a DMAE transaction to valid
host addresses in the pre-boot stage, when control is diverted to the real
kernel. This results in access to illegal addresses by our HW as the addresses
which were valid in the preboot stage are no longer considered valid.
Specifically, the 'was_error' bit in the pci glue of our device is set. This
causes all following pci transactions from chip to host to timeout (in
accordance to the pci spec).
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Mon, 23 Apr 2012 22:27:28 +0000 (22:27 +0000)]
ixgbe: dcb: IEEE PFC stats and reset logic incorrect
PFC stats are only tabulated when PFC is enabled. However in IEEE
mode the ieee_pfc pfc_tc bits were not checked and the calculation
was aborted.
This results in statistics not being reported through ethtool and
possible a false Tx hang occurring when receiving pause frames.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 4 May 2012 08:52:03 +0000 (08:52 +0000)]
e1000e: increase version number
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Richard Alpe [Fri, 20 Apr 2012 15:24:50 +0000 (15:24 +0000)]
e1000e: clear REQ and GNT in EECD (82571 && 82572)
Clear the REQ and GNT bit in the eeprom control register (EECD).
This is required if the eeprom is to be accessed with auto read
EERD register.
After a cold reset this doesn't matter but if PBIST MAC test was
executed before booting, the register was left in a dirty state
(the 2 bits where set), which caused the read operation to time out
and returning 0.
Reported-by: Aleksandar Igic <aleksandar.igic@dektech.com.au> Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 20 Mar 2012 03:47:41 +0000 (03:47 +0000)]
e1000e: enable forced master/slave on 82577
Like other supported (igp) PHYs, the driver needs to be able to force the
master/slave mode on 82577. Since the code is the same as what already
exists in the code flow for igp PHYs, move it to a new function to be
called for both flows.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Eric Dumazet [Fri, 4 May 2012 05:14:02 +0000 (05:14 +0000)]
tcp: be more strict before accepting ECN negociation
It appears some networks play bad games with the two bits reserved for
ECN. This can trigger false congestion notifications and very slow
transferts.
Since RFC 3168 (6.1.1) forbids SYN packets to carry CT bits, we can
disable TCP ECN negociation if it happens we receive mangled CT bits in
the SYN packet.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Perry Lorier <perryl@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Wilmer van der Gaast <wilmer@google.com> Cc: Ankur Jain <jankur@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Dave Täht <dave.taht@bufferbloat.net> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Fri, 4 May 2012 04:15:35 +0000 (04:15 +0000)]
mISDN: Help to identify the card
With multiple cards is hard to figure out which port caused trouble
int the layer2 routines (e.g. got a timeout).
Now we have the informations in the log output.
Signed-off-by: Karsten Keil <kkeil@linux-pingi.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Fri, 4 May 2012 04:15:34 +0000 (04:15 +0000)]
mISDN: Layer1 statemachine fix
The timer3 and the activation delay timer need to be independent.
If timer3 fires do not reqest power up we have to send only INFO 0.
Now layer1 pass TBR3 again.
Signed-off-by: Karsten Keil <kkeil@linux-pingi.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Tei manager reports current layer 1 state on creation.
On state change it reports it to the socket interface.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu> Signed-off-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Silence sparse warnings shown below:
...
drivers/net/ethernet/intel/e1000/e1000_main.c:3435:17: warning:
cast to restricted __le64
drivers/net/ethernet/intel/e1000/e1000_main.c:3435:17: warning:
cast to restricted __le64
...
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Mon, 23 Apr 2012 12:22:39 +0000 (12:22 +0000)]
igb, ixgbe: netdev_tx_reset_queue incorrectly called from tx init path
igb and ixgbe incorrectly call netdev_tx_reset_queue() from
i{gb|xgbe}_clean_tx_ring() this sort of works in most cases except
when the number of real tx queues changes. When the number of real
tx queues changes netdev_tx_reset_queue() only gets called on the
new number of queues so when we reduce the number of queues we risk
triggering the watchdog timer and repeated device resets.
So this is not only a cosmetic issue but causes real bugs. For
example enabling/disabling DCB or FCoE in ixgbe will trigger this.
CC: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: John Bishop <johnx.bishop@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Thu, 19 Apr 2012 17:48:48 +0000 (17:48 +0000)]
ixgbe: Update link flow control to correctly handle multiple packet buffer DCB
This change updates the link flow control configuration so that we
correctly set the link flow control settings for DCB. Previously we would
have to call the fc_enable call 8 times, once for each packet buffer. If
we move that logic into the fc_enable call itself we can avoid multiple
unnecessary register writes.
This change also corrects an issue in which we were only shifting the water
marks for 82599 parts by 6 instead of 10. This was resulting in us only
using 1/16 of the packet buffer when flow control was enabled.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Thu, 19 Apr 2012 17:49:56 +0000 (17:49 +0000)]
ixgbe: Reorder link flow control functions in ixgbe_common.c
We can avoid many of the forward declarations found in ixgbe_common.c by
just reordering things so this patch does that to help cleanup the code.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Fri, 6 Apr 2012 04:24:50 +0000 (04:24 +0000)]
ixgbe: Use __free_pages instead of put_page to release pages
This change replaces the calls to put_page with calls to __free_page.
Since the FCoE code is able to access order 1 pages I thought it would be a
good idea to change things over to using __free_pages since that is the
preferred approach for freeing pages.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 28 Mar 2012 08:03:48 +0000 (08:03 +0000)]
ixgbe: Make ixgbe_fc_autoneg return void and always set current_mode
This change makes it so that ixgbe_fc_autoneg is a void and always sets the
current_mode. Previously if the link was down we would return an error,
however there is no harm in simply treating a link down case as a case in
which autoneg simply failed. This allows us to rely on the return value of
the ixgbe_fc_enable call now since there should be no cases where it
returns an error that would normally be ignored.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 28 Mar 2012 08:03:43 +0000 (08:03 +0000)]
ixgbe: Reorder the ring to q_vector mapping to improve performance
This change reorders the mapping of rings to q_vectors in the case that the
number of rings exceeds the number of q_vectors. Previously we would
allocate the first R/N queues to the first q_vector where R is the number
of rings and N is the number of q_vectors. Instead of doing this we can do
a better job of interleaving the rings to the CPUs by assigning every Nth
ring to the q_vector.
The below tables illustrate this change for the R = 16 N = 4 case.
Before patch After patch
q_vector: 0 1 2 3 0 1 2 3
Rings: 0 4 8 12 0 1 2 3
1 5 9 13 4 5 6 7
3 6 10 14 8 9 10 11
4 7 11 15 12 13 14 15
This should improve the performance for both DCB or ATR when the number of
rings exceeds the number of q_vectors allocated by the adapter.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 28 Mar 2012 08:03:32 +0000 (08:03 +0000)]
ixgbe: Track instances of buffer available but no DMA resources present
This change makes it so that we can track instances of where a packet was
dropped due to a packet being received when there are no DMA buffers
available in the ring.
For some reason this was only being enabled with RSC, however it makes
more sense to always have this feature on so that we can track any cases
where we might drop a buffer due to an Rx ring being full.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 19 Apr 2012 03:21:47 +0000 (03:21 +0000)]
e1000e: initial support for i217
i217 is the next-generation LOM that will be available on systems with the
Lynx Point Platform Controller Hub (PCH) chipset from Intel. This patch
provides the initial support for the device.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Wed, 25 Apr 2012 04:45:57 +0000 (04:45 +0000)]
e1000e: Update driver version number
Version bump to 1.11.3-k.
Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Garrett [Thu, 3 May 2012 20:50:46 +0000 (16:50 -0400)]
efivars: Improve variable validation
Ben Hutchings pointed out that the validation in efivars was inadequate -
most obviously, an entry with size 0 would server as a DoS against the
kernel. Improve this based on his suggestions.
Signed-off-by: Matthew Garrett <mjg@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 4 May 2012 00:16:52 +0000 (17:16 -0700)]
Merge tag 'tag/upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
Pull libata fixes from Jeff Garzik:
1) Fix regression that could cause a misdiagnosis, which in turn could
lead to an erroneous 3.0 Gbps -> 1.5 downshift, particularly when hotplug
and suspend/resume is involved.
2) Fix a regression that led to ata%d controller ids being numbered one
larger than in <= 3.4-rc3 (oh, the horror!). Controller ids should now be
as expected.
3) add some DT, PCI id's
4) ata/pata_arasan_cf: minor cpp fixing/cleaning
* tag 'tag/upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ata: ahci_platform: Add synopsys ahci controller in DT's compatible list
ata/pata_arasan_cf: Move arasan_cf_pm_ops out of #ifdef, #endif macros
libata: init ata_print_id to 0
ahci: Detect Marvell 88SE9172 SATA controller
libata: skip old error history when counting probe trials
Linus Torvalds [Fri, 4 May 2012 00:15:47 +0000 (17:15 -0700)]
Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux
Pull i2c embedded fixes from Wolfram Sang:
"Here are some typical i2c driver bugfixes for 3.4. Missed clock
handling, improper timeout fixes, hardware wrokarounds... All
patches have been in linux-next for a few days, too."
* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
i2c: mxs: disable QUEUE when sending is done
i2c: mxs: handle spurious interrupt
i2c-eg20t: Modify MODULE_AUTHOR's email address
i2c-eg20t: change timeout value 50msec to 1000msec
i2c: tegra: Add delay before resetting the controller after NACK
i2c: pnx: Disable clk in suspend
Linus Torvalds [Fri, 4 May 2012 00:14:55 +0000 (17:14 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Just some regression fixes from Ben along with a variable that gcc
failed to spot is uninitialised."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
nouveau: initialise has_optimus variable.
drm/nv10/gpio: fix thinko in mask for gpio lines 2-9
nvc0/fb: shut up PMFB interrupt after the first occurrence
drm/nouveau/hdmi: use correct hdmi regs for nvaa/nvac
drm/nouveau/bios: fix regression on some nv4x board
1) Transfer padding was wrong for full-speed USB in ASIX driver, fix
from Ingo van Lil.
2) Propagate the negative packet offset fix into the PowerPC BPF JIT.
From Jan Seiffert.
3) dl2k driver's private ioctls were letting unprivileged tasks make
MII writes and other ugly bits like that. Fix from Jeff Mahoney.
4) Fix TX VLAN and RX packet drops in ucc_geth, from Joakim Tjernlund.
5) OOPS and network namespace fixes in IPVS from Hans Schillstrom and
Julian Anastasov.
6) Fix races and sleeping in locked context bugs in drop_monitor, from
Neil Horman.
7) Fix link status indication in smsc95xx driver, from Paolo Pisati.
8) Fix bridge netfilter OOPS, from Peter Huang.
9) L2TP sendmsg can return on error conditions with the socket lock
held, oops. Fix from Sasha Levin.
10) udp_diag should return meaningful values for socket memory usage,
from Shan Wei.
11) Eric Dumazet is so awesome he gets his own section:
Socket memory cgroup code (I never should have applied those
patches, grumble...) made erroneous changes to
sk_sockets_allocated_read_positive(). It was changed to
use percpu_counter_sum_positive (which requires BH disabling)
instead of percpu_counter_read_positive (which does not).
Revert back to avoid crashes and lockdep warnings.
Adjust the default tcp_adv_win_scale and tcp_rmem[2] values
to fix throughput regressions. This is necessary as a result
of our more precise skb->truesize tracking.
Fix SKB leak in netem packet scheduler.
12) New device IDs for various bluetooth devices, from Manoj Iyer,
AceLan Kao, and Steven Harms.
13) Fix command completion race in ipw2200, from Stanislav Yakovlev.
14) Fix rtlwifi oops on unload, from Larry Finger.
15) Fix hard_mtu when adjusting hard_header_len in smsc95xx driver.
From Stephane Fillod.
16) ehea driver registers it's IRQ before all the necessary state is
setup, resulting in crashes. Fix from Thadeu Lima de Souza
Cascardo.
17) Fix PHY connection failures in davinci_emac driver, from Anatolij
Gustschin.
18) Missing break; in switch statement in bluetooth's
hci_cmd_complete_evt(). Fix from Szymon Janc.
19) Fix queue programming in iwlwifi, from Johannes Berg.
20) Interrupt throttling defaults not being actually programmed into the
hardware, fix from Jeff Kirsher and Ying Cai.
21) TLAN driver SKB encoding in descriptor busted on 64-bit, fix from
Benjamin Poirier.
22) Fix blind status block RX producer pointer deref in TG3 driver, from
Matt Carlson.
23) Promisc and multicast are busted on ehea, fixes from Thadeu Lima de
Souza Cascardo.
24) Fix crashes in 6lowpan, from Alexander Smirnov.
25) tcp_complete_cwr() needs to be careful to not rewind the CWND to
ssthresh if ssthresh has the "infinite" value. Fix from Yuchung
Cheng.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
sungem: Fix WakeOnLan
tcp: change tcp_adv_win_scale and tcp_rmem[2]
net: l2tp: unlock socket lock before returning from l2tp_ip_sendmsg
drop_monitor: prevent init path from scheduling on the wrong cpu
usbnet: fix failure handling in usbnet_probe
usbnet: fix leak of transfer buffer of dev->interrupt
ucc_geth: Add 16 bytes to max TX frame for VLANs
net: ucc_geth, increase no. of HW RX descriptors
netem: fix possible skb leak
sky2: fix receive length error in mixed non-VLAN/VLAN traffic
sky2: propogate rx hash when packet is copied
net: fix two typos in skbuff.h
cxgb3: Don't call cxgb_vlan_mode until q locks are initialized
ixgbe: fix calling skb_put on nonlinear skb assertion bug
ixgbe: Fix a memory leak in IEEE DCB
igbvf: fix the bug when initializing the igbvf
smsc75xx: enable mac to detect speed/duplex from phy
smsc75xx: declare smsc75xx's MII as GMII capable
smsc75xx: fix phy interrupt acknowledge
smsc75xx: fix phy init reset loop
...
Linus Torvalds [Fri, 4 May 2012 00:08:58 +0000 (17:08 -0700)]
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"Fix OOPS seen in coretemp driver if the CPU core ID is too large"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (coretemp) Increase CPU core limit
hwmon: (coretemp) fix oops on cpu unplug
The idea here seems to be to get a 44bit DMA mask working and if this
fails it should fallback to a 32bit DMA mask. The dma_mask variable is
assigned once to 44bit and never updated. pci_set_dma_mask() and
pci_set_consistent_dma_mask() are both implemented as functions so there
is no evil macro which might update dma_mask. Looking at the assembly, I
see a call to dma_set_mask() followed by dma_supported() and then a jump
passed the second dma_set_mask(). The only way to get to second
dma_set_mask() call is by an error code in the first one.
So I hereby remove the check since it looks superfluous. Please ignore
the path if there is black magic involved.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
When comparing the dmesg between 3.4-rc3 and 3.4-rc4 I found the
following differences:
-ata1: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff100 irq 47
-ata2: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff180 irq 47
-ata3: DUMMY
+ata2: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff100 irq 47
+ata3: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff180 irq 47
ata4: DUMMY
ata5: DUMMY
-ata6: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff380 irq 47
+ata6: DUMMY
+ata7: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff380 irq 47
The change of numbering comes from commit 85d6725b7c0d7e3f ("libata:
make ata_print_id atomic") that changed lines like
ap->print_id = ata_print_id++;
to
ap->print_id = atomic_inc_return(&ata_print_id);
As the latter behaves like ++ata_print_id, we must initialize
it to zero to start the numbering from one.
Signed-off-by: Tero Roponen <tero.roponen@gmail.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Lin Ming [Thu, 3 May 2012 14:15:07 +0000 (22:15 +0800)]
libata: skip old error history when counting probe trials
Commit d902747("[libata] Add ATA transport class") introduced
ATA_EFLAG_OLD_ER to mark entries in the error ring as cleared.
But ata_count_probe_trials_cb() didn't check this flag and it still
counts the old error history. So wrong probe trials count is returned
and it causes problem, for example, SATA link speed is slowed down from
3.0Gbps to 1.5Gbps.
Fix it by checking ATA_EFLAG_OLD_ER in ata_count_probe_trials_cb().
Cc: stable <stable@vger.kernel.org> # 2.6.37+ Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Alexander Duyck [Thu, 3 May 2012 01:09:42 +0000 (01:09 +0000)]
skb: Add skb_head_is_locked helper function
This patch adds support for a skb_head_is_locked helper function. It is
meant to be used any time we are considering transferring the head from
skb->head to a paged frag. If the head is locked it means we cannot remove
the head from the skb so it must be copied or we must take the skb as a
whole.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 2 May 2012 23:33:21 +0000 (23:33 +0000)]
net: Fix truesize accounting in skb_gro_receive()
GRO is very optimistic in skb truesize estimates, only taking into
account the used part of fragments.
Be conservative, and use more precise computation, so that bloated GRO
skbs can be collapsed eventually.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 24 Mar 2012 04:29:46 +0000 (00:29 -0400)]
iwlwifi: fix skb truesize underestimation
By default, iwlwifi uses order-1 pages (8 KB) to store incoming frames,
but doesnt say so in skb->truesize.
This makes very possible to exhaust kernel memory since these skb evade
normal socket memory accounting.
As struct ieee80211_hdr is going to be pulled before calling IP stack,
there is no need to use dev_alloc_skb() to reserve NET_SKB_PAD bytes.
alloc_skb() is ok in this driver, allowing more tailroom.
Pull beginning of frame in skb header, in the hope we can reuse order-1
pages in the driver immediately for small frames and reduce their
truesize to the minimum (linear skbs)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
The BIT_APP_UPCHG bit is no longer set when ixgbe_dcbnl_set_all() is
called. This results in the FCoE app user priority never getting set
and the driver will not configure the tx_rings correctly for FCoE
packets which use the SAN MTU and FCoE offloads.
We resolve this regression by fixing ixgbe_copy_dcb_cfg() to also
check for FCoE application changes. Additionally, we can drop the
IEEE variants of get_dcb_app() because this path is never called
with the IEEE mode enabled.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Sat, 17 Mar 2012 05:51:52 +0000 (05:51 +0000)]
ixgbe: fix race condition with shutdown
It was possible for shutdown to pull the rug out from other driver entry
points. Now we just grab the rtnl lock before taking everything apart.
Thanks to Hariharan for noticing this tight race condition.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Cc: Hariharan Nagarajan <hanagara@cisco.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Tue, 17 Apr 2012 04:29:34 +0000 (04:29 +0000)]
ixgbevf: Make sure jumbo frames are set correctly after PF reset
If the Physical Function (PF) resets after the VF has set jumbo
frame MTU then the VF jumbo frame is overwritten. Make sure the
VF driver always requests proper MTU size after reset
synchronization.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Chris Boot [Tue, 24 Apr 2012 07:24:58 +0000 (07:24 +0000)]
e1000e: Remove special case for 82573/82574 ASPM L1 disablement
For the 82573, ASPM L1 gets disabled wholesale so this special-case code
is not required. For the 82574 the previous patch does the same as for
the 82573, disabling L1 on the adapter. Thus, this code is no longer
required and can be removed.
Signed-off-by: Chris Boot <bootc@bootc.net> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Chris Boot [Tue, 24 Apr 2012 07:24:52 +0000 (07:24 +0000)]
e1000e: Disable ASPM L1 on 82574
ASPM on the 82574 causes trouble. Currently the driver disables L0s for
this NIC but only disables L1 if the MTU is >1500. This patch simply
causes L1 to be disabled regardless of the MTU setting.
Signed-off-by: Chris Boot <bootc@bootc.net> Cc: "Wyborny, Carolyn" <carolyn.wyborny@intel.com> Cc: Nix <nix@esperi.org.uk> Link: https://lkml.org/lkml/2012/3/19/362 Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Wed, 25 Apr 2012 08:01:05 +0000 (08:01 +0000)]
e1000e: Driver workaround for IPv6 Header Extension Erratum.
Previously, IPv6 extension header parsing was disabled for all devices
supported by e1000e when using packet split mode. However, as per a
silicon errata, only certain devices need this restriction and will need
to disable IPv6 extension header parsing for all modes.
Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Wed, 25 Apr 2012 07:25:18 +0000 (07:25 +0000)]
e1000e: Resolve intermittent negotiation issue on 82574/82583.
For 82574 and 82583 devices, resolve an intermittent link issue where
the link negotiates to 100Mbps rather than 1Gbps when powering off the
PHY and powering on the PHY after several seconds.
Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 14 Apr 2012 04:21:52 +0000 (04:21 +0000)]
e1000e: cleanup long [read|write]_reg_locked PHY ops function pointers
Calling the locked versions of the read/write PHY ops function pointers
often produces excessively long lines. Shorten these as is done with
the non-locked versions of the PHY register read/write functions.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 20 Mar 2012 03:48:08 +0000 (03:48 +0000)]
e1000e: suggest a possible workaround to a device hang on 82577/8
There is a known issue in the 82577 and 82578 device that can cause a hang
in the device hardware during traffic stress; the current workaround in the
driver is to disable transmit flow control by default. If the user enables
transmit flow control and the device hang occurs, provide a message in the
syslog suggesting to re-enable the workaround.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 2 May 2012 21:19:14 +0000 (21:19 +0000)]
ixgbe: Fix use after free on module remove
While testing the TCP changes I had to fix an issue in order to be able to
load and unload the module.
The recent patch that added thermal sensor support added a use after free
bug on module unload with an 82598 adapter in the system. To resolve the
issue I have updated the code so that when we free the info_kobj we set it
back to NULL.
I suspect there are likely other bugs present, but I will leave that for
another patch that can undergo more testing.
I am submitting this directly to net-next since this fixes a fairly serious
bug that will lock up the ixgbe module until the system is rebooted.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 2 May 2012 21:19:09 +0000 (21:19 +0000)]
tcp: move stats merge to the end of tcp_try_coalesce
This change cleans up the last bits of tcp_try_coalesce so that we only
need one goto which jumps to the end of the function. The idea is to make
the code more readable by putting things in a linear order so that we start
execution at the top of the function, and end it at the bottom.
I also made a slight tweak to the code for handling frags when we are a
clone. Instead of making it an if (clone) loop else nr_frags = 0 I changed
the logic so that if (!clone) we just set the number of frags to 0 which
disables the for loop anyway.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 2 May 2012 21:19:04 +0000 (21:19 +0000)]
tcp: Move code related to head frag in tcp_try_coalesce
This change reorders the code related to the use of an skb->head_frag so it
is placed before we check the rest of the frags. This allows the code to
read more linearly instead of like some sort of loop.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>