The 802 standard allows pause frames to be either unicast or multicast.
Switches seem to send unicast frames, but on a direct link, other boards send
multicast pause. Unless the filter bit is set, these pause frames get
dropped.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Different chipsets have different amount of ram buffer (some have none),
so need to make sure that driver does proper setup for all cases from 0 on
to 48K, in units of 1K.
This is a backport of the code from 2.6.19 or later
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Kirill Korotaev [Mon, 26 Feb 2007 00:48:36 +0000 (01:48 +0100)]
fix ext3 block bitmap leakage
This patch fixes ext3 block bitmap leakage,
which leads to the following fsck messages on
_healthy_ filesystem:
Block bitmap differences: -64159 -73707
All kernels up to 2.6.17 have this bug.
Found by
Vasily Averin <vvs@sw.ru> and Andrey Savochkin <saw@sawoct.com>
Test case triggered the issue was created by
Dmitry Monakhov <dmonakhov@sw.ru>
Signed-Off-By: Kirill Korotaev <dev@openvz.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Ilpo Järvinen [Mon, 26 Feb 2007 00:36:47 +0000 (01:36 +0100)]
[TCP]: Prevent pseudo garbage in SYN's advertized window
TCP may advertize up to 16-bits window in SYN packets (no window
scaling allowed). At the same time, TCP may have rcv_wnd
(32-bits) that does not fit to 16-bits without window scaling
resulting in pseudo garbage into advertized window from the
low-order bits of rcv_wnd. This can happen at least when
mss <= (1<<wscale) (see tcp_select_initial_window). This patch
fixes the handling of SYN advertized windows (compile tested
only).
In worst case (which is unlikely to occur though), the receiver
advertized window could be just couple of bytes. I'm not sure
that such situation would be handled very well at all by the
receiver!? Fortunately, the situation normalizes after the
first non-SYN ACK is received because it has the correct,
scaled window.
Alternatively, tcp_select_initial_window could be changed to
prevent too large rcv_wnd in the first place.
[ tcp_make_synack() has the same bug, and I've added a fix for
that to this patch -DaveM ]
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jean Delvare [Sun, 25 Feb 2007 23:55:22 +0000 (00:55 +0100)]
hwmon: Add support for the Winbond W83687THF
Add support for the Winbond W83687THF chip to the w83627hf hardware
monitoring driver. This new chip is almost similar to the already
supported W83627THF chip, except for VID and a few other minor
changes.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Charles Spirakis [Sun, 25 Feb 2007 23:50:40 +0000 (00:50 +0100)]
w83791d: Documentation update
The alarm bits and the beep enable bits are in different positions in
the hardware. Document the problem and leave it to the user-space code
to handle the situation. When this driver is updated to the standardized
sysfs alarm/beep methodology, this won't be a problem.
This is a documentation only change.
Signed-off by: Charles Spirakis <bezaur@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Charles Spirakis [Sun, 25 Feb 2007 23:49:39 +0000 (00:49 +0100)]
HWMON: w83791d: New hardware monitoring driver for the Winbond W83791D
Add support for the w83791d sensor chip. The w83791d hardware is
somewhere between the w83781d and the w83792d and this driver code
is derived from the code that supports those chips.
Signed-off-by: Charles Spirakis <bezaur@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jean Delvare [Sun, 25 Feb 2007 23:46:17 +0000 (00:46 +0100)]
hwmon: New PC87427 hardware monitoring driver
This is a new hardware monitoring driver for the National Semiconductor
PC87427 Super-I/O chip. It only supports fan speed monitoring for now,
while the chip can do much more.
Thanks to Amir Habibi at Candelis for setting up a test system, and to
Michael Kress for testing several iterations of this driver.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Martin Devera [Sun, 25 Feb 2007 23:40:16 +0000 (00:40 +0100)]
I2C: i2c-piix4: Add Broadcom HT-1000 support
Add Broadcom HT-1000 south bridge's PCI ID to i2c-piix driver. Note
that at least on Supermicro H8SSL it uses non-standard SMBHSTCFG = 3
and standard values like 0 or 9 causes hangup.
Signed-off-by: Martin Devera <devik@cdi.cz> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Marcel Siegert [Sun, 25 Feb 2007 23:38:10 +0000 (00:38 +0100)]
V4L/DVB: Dvbdev: fix illegal re-usage of fileoperations struct
Arjan van de Ven <arjan@infradead.org> reported an illegal re-usage of
the fileoperations struct if more than one dvb device (e.g. frontend) is
present.
This patch fixes this issue.
It allocates a new fileoperations struct each time a device is
registered and copies the default template fileops.
Signed-off-by: Marcel Siegert <mws@linuxtv.org> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Dan Streetman [Thu, 22 Feb 2007 20:11:26 +0000 (21:11 +0100)]
USB: add ZyXEL vendor/product ID to rtl8150 driver
I just got a "ZyXEL Prestige USB Adapter" that is actually RTL8150
adapter. Here is the relevant /proc/bus/usb/devices output (after
adding the vendor/product IDs to the driver):
This patch adds the ZyXEL vendor ID to the rtl8150.c driver. The
device has absolutely no identifying marks on the outside for model
type, just a serial number, and I can't find anything on ZyXEL's
website, so I called the product ID PRODUCT_ID_PRESTIGE to match the
product string.
Signed-off-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
This driver uses port 0 to handle receives on both ports. So
the netif_poll_disable call in dev_close would end up stopping the
second port on dual port cards.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Alexey Dobriyan [Wed, 21 Feb 2007 00:43:24 +0000 (01:43 +0100)]
[ATM] ambassador, firestream: "-1 >>" is implementation defined
6.5.7(5): The result of E1 >> E2 is E1 right-shifted E2 bit positions.
...
If E1 has a signed type and a negative value, the resulting value
is implementation defined.
So, cast -1 to unsigned type to make result well-defined.
[ Modified to use ~0U based upon recommendation from Al Viro. -DaveM ]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
David Howells [Wed, 21 Feb 2007 00:20:05 +0000 (01:20 +0100)]
Keys: Fix key serial number collision handling (CVE-2007-0006)
Fix the key serial number collision avoidance code in key_alloc_serial().
This didn't use to be so much of a problem as the key serial numbers were
allocated from a simple incremental counter, and it would have to go through
two billion keys before it could possibly encounter a collision. However, n
that random numbers are used instead, collisions are much more likely.
This is fixed by finding a hole in the rbtree where the next unused serial
number ought to be and using that by going almost back to the top of the
insertion routine and redoing the insertion with the new serial number rathe
than trying to be clever and attempting to work out the insertion point
pointer directly.
This fixes kernel Bugzilla #7727.
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
The problem here is that the void cast causes return types to not be
promoted, and for ops such as listxattr which expect more than 32 bits of
return value, the 32-bit -EIO is interpreted as a large positive 64-bit
number, i.e. 0x00000000fffffffa instead of 0xfffffffa.
This goes particularly badly when the return value is taken as a number of
bytes to copy into, say, a user's buffer for example...
I originally had coded up the fix by creating a return_EIO_<TYPE> macro
for each return type, like this:
but Al felt that it was probably better to create an EIO-returner for each
actual op signature. Since so few ops share a signature, I just went ahead
& created an EIO function for each individual file & inode op that returns
a value.
Adrian Bunk:
backported to 2.6.16
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Greg Banks [Tue, 20 Feb 2007 23:14:36 +0000 (00:14 +0100)]
Fix a free-wrong-pointer bug in nfs/acl server (CVE-2007-0772)
Due to type confusion, when an nfsacl verison 2 'ACCESS' request
finishes and tries to clean up, it calls fh_put on entiredly the
wrong thing and this can cause an oops.
Jeff Dike [Wed, 14 Feb 2007 19:37:44 +0000 (20:37 +0100)]
uml: fix signal frame alignment
Use the same signal frame alignment calculations as the underlying
architecture. x86_64 appeared to do this, but the "- 8" was really
subtracting 8 * sizeof(struct rt_sigframe) rather than 8 bytes.
UML/i386 might have been OK, but I changed the calculation to match
i386 just to be sure.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Al Viro [Wed, 14 Feb 2007 12:58:42 +0000 (13:58 +0100)]
[TCP]: struct tcp_sack_block annotations
Some of the instances of tcp_sack_block are host-endian, some - net-endian.
Define struct tcp_sack_block_wire identical to struct tcp_sack_block
with u32 replaced with __be32; annotate uses of tcp_sack_block replacing
net-endian ones with tcp_sack_block_wire. Change is obviously safe since
for cc(1) __be32 is typedefed to u32.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jiri Bohac [Wed, 14 Feb 2007 08:40:31 +0000 (09:40 +0100)]
[IPX]: Fix NULL pointer dereference on ipx unload
Fixes a null pointer dereference when unloading the ipx module.
On initialization of the ipx module, registering certain packet
types can fail. When this happens, unloading the module later
dereferences NULL pointers. This patch fixes that. Please apply.
Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Herbert Xu [Wed, 14 Feb 2007 08:39:09 +0000 (09:39 +0100)]
[NETFILTER]: Clear GSO bits for TCP reset packet
The TCP reset packet is copied from the original. This
includes all the GSO bits which do not apply to the new
packet. So we should clear those bits.
Spotted by Patrick McHardy.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
I encountered a kernel panic with my test program, which is a very
simple IPv6 client-server program.
The server side sets IPV6_RECVPKTINFO on a listening socket, and the
client side just sends a message to the server. Then the kernel panic
occurs on the server. (If you need the test program, please let me
know. I can provide it.)
This problem happens because a skb is forcibly freed in
tcp_rcv_state_process().
When a socket in listening state(TCP_LISTEN) receives a syn packet,
then tcp_v6_conn_request() will be called from
tcp_rcv_state_process(). If the tcp_v6_conn_request() successfully
returns, the skb would be discarded by __kfree_skb().
However, in case of a listening socket which was already set
IPV6_RECVPKTINFO, an address of the skb will be stored in
treq->pktopts and a ref count of the skb will be incremented in
tcp_v6_conn_request(). But, even if the skb is still in use, the skb
will be freed. Then someone still using the freed skb will cause the
kernel panic.
I suggest to use kfree_skb() instead of __kfree_skb().
Signed-off-by: Masayuki Nakagawa <nakagawa.msy@ncos.nec.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Baruch Even [Wed, 14 Feb 2007 08:29:14 +0000 (09:29 +0100)]
TCP: Fix sorting of SACK blocks.
The sorting of SACK blocks actually munges them rather than sort,
causing the TCP stack to ignore some SACK information and breaking the
assumption of ordered SACK blocks after sorting.
The sort takes the data from a second buffer which isn't moved causing
subsequent data moves to occur from the wrong location. The fix is to
use a temporary buffer as a normal sort does.
Signed-off-By: Baruch Even <baruch@ev-en.org> Signed-off-by: David S. Miller <davem@davemloft.net>
DECNET: Handle a failure in neigh_parms_alloc (take 2)
While enhancing the neighbour code to handle multiple network
namespaces I noticed that decnet is assuming neigh_parms_alloc
will allways succeed, which is clearly wrong. So handle the
failure.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Hugh Dickins [Tue, 13 Feb 2007 12:10:20 +0000 (13:10 +0100)]
fix umask when noACL kernel meets extN tuned for ACLs
Fix insecure default behaviour reported by Tigran Aivazian: if an
ext2 or ext3 filesystem is tuned to mount with "acl", but mounted by
a kernel built without ACL support, then umask was ignored when creating
inodes - though root or user has umask 022, touch creates files as 0666,
and mkdir creates directories as 0777.
This appears to have worked right until 2.6.11, when a fix to the default
mode on symlinks (always 0777) assumed VFS applies umask: which it does,
unless the mount is marked for ACLs; but ext[23] set MS_POSIXACL in
s_flags according to s_mount_opt set according to def_mount_opts.
We could revert to the 2.6.10 ext[23]_init_acl (adding an S_ISLNK test);
but other filesystems only set MS_POSIXACL when ACLs are configured. We
could fix this at another level; but it seems most robust to avoid setting
the s_mount_opt flag in the first place (at the expense of more ifdefs).
Likewise don't set the XATTR_USER flag when built without XATTR support.
Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
reiserfs: avoid tail packing if an inode was ever mmapped
This patch fixes a confusion reiserfs has for a long time.
On release file operation reiserfs used to try to pack file data stored in
last incomplete page of some files into metadata blocks. After packing the
page got cleared with clear_page_dirty. It did not take into account that
the page may be mmaped into other process's address space. Recent
replacement for clear_page_dirty cancel_dirty_page found the confusion with
sanity check that page has to be not mapped.
The patch fixes the confusion by making reiserfs avoid tail packing if an
inode was ever mmapped. reiserfs_mmap and reiserfs_file_release are
serialized with mutex in reiserfs specific inode. reiserfs_mmap locks the
mutex and sets a bit in reiserfs specific inode flags.
reiserfs_file_release checks the bit having the mutex locked. If bit is
set - tail packing is avoided. This eliminates a possibility that mmapped
page gets cancel_page_dirty-ed.
Signed-off-by: Vladimir Saveliev <vs@namesys.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Neil Brown [Tue, 30 Jan 2007 23:53:52 +0000 (00:53 +0100)]
Make 'repair' actually work for raid1.
When 'repair' finds a block that is different one the various
parts of the mirror. it is meant to write a chosen good version
to the others. However it currently writes out the original data
to each. The memcpy to make all the data the same is missing.
Also correct a test so that 'repair' causes a repair, rather than
anything other then 'repair'.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
James Bottomley [Sat, 27 Jan 2007 23:54:39 +0000 (00:54 +0100)]
[SCSI] arcmsr: fix up sysfs values
The sysfs files in arcmsr are non-standard in that they aren't simple
filename value pairs, the values actually contain preceeding text which
would have to be parsed. The idea of sysfs files is that the file name
is the description and the contents is a simple value.
Fix up arcmsr to conform to this standard.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Andrew Morton [Sat, 27 Jan 2007 23:53:31 +0000 (00:53 +0100)]
[SCSI] areca sysfs fix
Remove sysfs_remove_bin_file() return-value checking from the areca driver.
There's nothing a driver can do if sysfs file removal fails, so we'll soon be
changing sysfs_remove_bin_file() to internally print a diagnostic and to
return void.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Marcel Holtmann [Thu, 25 Jan 2007 19:54:35 +0000 (20:54 +0100)]
[Bluetooth] Fix deadlock in the L2CAP layer
The Bluetooth L2CAP layer has 2 locks that are used in softirq context,
(one spinlock and one rwlock, where the softirq usage is readlock) but
where not all usages of the lock were _bh safe. The patch below corrects
this.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Marcel Holtmann [Thu, 25 Jan 2007 19:34:48 +0000 (20:34 +0100)]
[Bluetooth] Fix compat ioctl for BNEP, CMTP and HIDP
There exists no attempt do deal with the fact that a structure with
a uint32_t followed by a pointer is going to be different for 32-bit
and 64-bit userspace. Any 32-bit process trying to use it will be
failing with -EFAULT if it's lucky; suffering from having data dumped
at a random address if it's not.
Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Marcel Holtmann [Thu, 25 Jan 2007 18:40:43 +0000 (19:40 +0100)]
[Bluetooth] Fix uninitialized return value for RFCOMM sendmsg()
When calling send() with a zero length parameter on a RFCOMM socket
it returns a positive value. In this rare case the variable err is
used uninitialized and unfortunately its value is returned.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jan Andersson [Wed, 24 Jan 2007 23:10:10 +0000 (00:10 +0100)]
sparc32: add offset in pci_map_sg()
Add sg->offset to sg->dvma_address in pci_map_sg() on sparc32. Without the
offset, transfers to buffers that do not begin on a page boundary will not
work as expected.
Signed-off-by: Jan Andersson <jan.andersson@ieee.org> Acked-By: David Miller <davem@davemloft.net>
Eric Sesterhenn [Wed, 24 Jan 2007 23:05:10 +0000 (00:05 +0100)]
V4L/DVB: Missing statement in drivers/media/dvb/frontends/cx22700.c
Stumbled over this because of coverity (id #492),
seems like we are missing a return statement here and fail
to do proper bounds checking. If this assumption is false
we should at least change the identation to make it clear
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Hugh Dickins [Tue, 23 Jan 2007 15:46:22 +0000 (16:46 +0100)]
read_zero_pagealigned() locking fix
Ramiro Voicu hits the BUG_ON(!pte_none(*pte)) in zeromap_pte_range: kernel
bugzilla 7645. Right: read_zero_pagealigned uses down_read of mmap_sem,
but another thread's racing read of /dev/zero, or a normal fault, can
easily set that pte again, in between zap_page_range and zeromap_page_range
getting there. It's been wrong ever since 2.4.3.
The simple fix is to use down_write instead, but that would serialize reads
of /dev/zero more than at present: perhaps some app would be badly
affected. So instead let zeromap_page_range return the error instead of
BUG_ON, and read_zero_pagealigned break to the slower clear_user loop in
that case - there's no need to optimize for it.
Use -EEXIST for when a pte is found: BUG_ON in mmap_zero (the other user of
zeromap_page_range), though it really isn't interesting there. And since
mmap_zero wants -EAGAIN for out-of-memory, the zeromaps better return that
than -ENOMEM.
Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Alan Cox [Mon, 22 Jan 2007 19:39:00 +0000 (20:39 +0100)]
atiixp: hang fix
When the old IDE layer calls into methods in the driver during error
handling it is essentially random whether ide_lock is already held. This
causes a deadlock in the atiixp driver which also uses ide_lock internally
for locking.
Switch to a private lock instead.
[akpm@osl.org: cleanup] Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jens Axboe [Mon, 22 Jan 2007 19:34:31 +0000 (20:34 +0100)]
cdrom: set default timeout to 7 seconds
It's a known fact that Windows times out commands after 7 seconds, so
drives generally try and respond if they can before that happens. We
default to 5 seconds, which sometimes is a bit too short.
Jeremy Higdon reported here:
http://lkml.org/lkml/2007/1/1/145
that his drive takes longer than 5 seconds for a "read track
information" command, later confirming that it is about 6.7 seconds.
So just do the sane thing and change the default command timeout to 7
seconds to avoid other surprises.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jes Sorensen [Mon, 22 Jan 2007 19:20:21 +0000 (20:20 +0100)]
[SCSI] qla1280 command timeout
Original patch from Ian Dall in bugzilla. Set command timeout as
specified by the SCSI layer rather than hardcode it to 30 seconds. I
have received a couple of reports of people hitting this one with
various tape configurations and the patch looks obviously correct.
From http://bugzilla.kernel.org/show_bug.cgi?id=6275
Ian Dall <ian@beware.dropbear.id.au>:
The command sent to the card was using a 30second timeout regardless of the
timeout requested in the scsi command passed down from higher levels.
Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
James Bursa [Sat, 20 Jan 2007 21:58:51 +0000 (22:58 +0100)]
adfs: fix filename handling
Fix filenames on adfs discs being terminated at the first character greater
than 128 (adfs filenames are Latin 1). I saw this problem when using a
loopback adfs image on a 2.6.17-rc5 x86_64 machine, and the patch fixed it
there.
Include connector config in the s390 arch Kconfig to get support for
connectors.
This also fixes the following Kconfig warning:
fs/Kconfig:1728:warning: 'select' used by config symbol 'CIFS_UPCALL' refer to undefined symbol 'CONNECTOR'
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Patrick McHardy [Sat, 20 Jan 2007 21:18:30 +0000 (22:18 +0100)]
NETFILTER: NAT: fix NOTRACK checksum handling
The whole idea with the NOTRACK netfilter target is that
you can force the netfilter code to avoid connection
tracking, and all costs assosciated with it, by making
traffic match a NOTRACK rule.
But this is totally broken by the fact that we do a checksum
calculation over the packet before we do the NOTRACK bypass
check, which is very expensive. People setup NOTRACK rules
explicitly to avoid all of these kinds of costs.
This patch from Patrick, already in Linus's tree, fixes the
bug.
Move the check for ip_conntrack_untracked before the call to
skb_checksum_help to fix NOTRACK excemptions from NAT. Pre-2.6.19
NAT code breaks TSO by invalidating hardware checksums for every
packet, even if explicitly excluded from NAT through NOTRACK.
2.6.19 includes a fix that makes NAT and TSO live in harmony,
but the performance degradation caused by this deserves making
at least the workaround work properly in -stable.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>