git.karo-electronics.de Git - karo-tx-linux.git/log

[NETFILTER]: Clear GSO bits for TCP reset packet

The TCP reset packet is copied from the original. This
includes all the GSO bits which do not apply to the new
packet. So we should clear those bits.

Spotted by Patrick McHardy.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[TCP]: Don't apply FIN exception to full TSO segments.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[ATM]: Fix for crash in adummy_init()

This was reported by Ingo Molnar here,

http://lkml.org/lkml/2006/12/18/119

The problem is that adummy_init() depends on atm_init() , but adummy_init()
is called first.

So I put atm_init() into subsys_initcall which seems appropriate, and it
will still get module_init() if it becomes a module.

Interesting to note that you could crash your system here if you just load
the modules in the wrong order.

Signed-off-by: Daniel Walker <dwalker@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

TCP: skb is unexpectedly freed.

I encountered a kernel panic with my test program, which is a very
simple IPv6 client-server program.

The server side sets IPV6_RECVPKTINFO on a listening socket, and the
client side just sends a message to the server.  Then the kernel panic
occurs on the server.  (If you need the test program, please let me
know. I can provide it.)

This problem happens because a skb is forcibly freed in
tcp_rcv_state_process().

When a socket in listening state(TCP_LISTEN) receives a syn packet,
then tcp_v6_conn_request() will be called from
tcp_rcv_state_process().  If the tcp_v6_conn_request() successfully
returns, the skb would be discarded by __kfree_skb().

However, in case of a listening socket which was already set
IPV6_RECVPKTINFO, an address of the skb will be stored in
treq->pktopts and a ref count of the skb will be incremented in
tcp_v6_conn_request().  But, even if the skb is still in use, the skb
will be freed.  Then someone still using the freed skb will cause the
kernel panic.

I suggest to use kfree_skb() instead of __kfree_skb().

Signed-off-by: Masayuki Nakagawa <nakagawa.msy@ncos.nec.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

TCP: Fix sorting of SACK blocks.

The sorting of SACK blocks actually munges them rather than sort,
causing the TCP stack to ignore some SACK information and breaking the
assumption of ordered SACK blocks after sorting.

The sort takes the data from a second buffer which isn't moved causing
subsequent data moves to occur from the wrong location. The fix is to
use a temporary buffer as a normal sort does.

Signed-off-By: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

AF_PACKET: Check device down state before hard header callbacks.

If the device is down, invoking the device hard header callbacks
is not legal, so check it early.

Based upon a shaper OOPS report from Frederik Deweerdt.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

SPARC32: Fix over-optimization by GCC near ip_fast_csum.

In some cases such as:
iph->check = 0;
iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
GCC may optimize out the previous store.

Observed as a failure of NFS over udp (bad checksums on ip fragments)
when compiled with GCC 3.4.2.

Signed-off-by: Bob Breuer <breuerr@mc.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

DECNET: Handle a failure in neigh_parms_alloc (take 2)

While enhancing the neighbour code to handle multiple network
namespaces I noticed that decnet is assuming neigh_parms_alloc
will allways succeed, which is clearly wrong. So handle the
failure.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Fix up CIFS for "test_clear_page_dirty()" removal

This also adds he required page "writeback" flag handling, that cifs
hasn't been doing and that the page dirty flag changes made obvious.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Acked-by: Steve French <smfltc@us.ibm.com>

fix umask when noACL kernel meets extN tuned for ACLs

Fix insecure default behaviour reported by Tigran Aivazian: if an
ext2 or ext3 filesystem is tuned to mount with "acl", but mounted by
a kernel built without ACL support, then umask was ignored when creating
inodes - though root or user has umask 022, touch creates files as 0666,
and mkdir creates directories as 0777.

This appears to have worked right until 2.6.11, when a fix to the default
mode on symlinks (always 0777) assumed VFS applies umask: which it does,
unless the mount is marked for ACLs; but ext[23] set MS_POSIXACL in
s_flags according to s_mount_opt set according to def_mount_opts.

We could revert to the 2.6.10 ext[23]_init_acl (adding an S_ISLNK test);
but other filesystems only set MS_POSIXACL when ACLs are configured. We
could fix this at another level; but it seems most robust to avoid setting
the s_mount_opt flag in the first place (at the expense of more ifdefs).

Likewise don't set the XATTR_USER flag when built without XATTR support.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Linux 2.6.16.40

Linux 2.6.16.40-rc1

reiserfs: avoid tail packing if an inode was ever mmapped

This patch fixes a confusion reiserfs has for a long time.

On release file operation reiserfs used to try to pack file data stored in
last incomplete page of some files into metadata blocks.  After packing the
page got cleared with clear_page_dirty.  It did not take into account that
the page may be mmaped into other process's address space.  Recent
replacement for clear_page_dirty cancel_dirty_page found the confusion with
sanity check that page has to be not mapped.

The patch fixes the confusion by making reiserfs avoid tail packing if an
inode was ever mmapped.  reiserfs_mmap and reiserfs_file_release are
serialized with mutex in reiserfs specific inode.  reiserfs_mmap locks the
mutex and sets a bit in reiserfs specific inode flags.
reiserfs_file_release checks the bit having the mutex locked.  If bit is
set - tail packing is avoided.  This eliminates a possibility that mmapped
page gets cancel_page_dirty-ed.

Signed-off-by: Vladimir Saveliev <vs@namesys.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[libata] use kmap_atomic(KM_IRQ0) in SCSI simulator

We are inside spin_lock_irqsave().  quoth akpm's debug facility:

[  231.948000] SCSI device sda: 195371568 512-byte hdwr sectors (100030 MB)
[  232.232000] ata1.00: configured for UDMA/33
[  232.404000] WARNING (1) at arch/i386/mm/highmem.c:47 kmap_atomic()
[  232.404000]  [<c01162e6>] kmap_atomic+0xa9/0x1ab
[  232.404000]  [<c0242c81>] ata_scsi_rbuf_get+0x1c/0x30
[  232.404000]  [<c0242caf>] ata_scsi_rbuf_fill+0x1a/0x87
[  232.404000]  [<c0243ab2>] ata_scsiop_mode_sense+0x0/0x309
[  232.404000]  [<c01729d5>] end_bio_bh_io_sync+0x0/0x37
[  232.404000]  [<c02311c6>] scsi_done+0x0/0x16
[  232.404000]  [<c02311c6>] scsi_done+0x0/0x16
[  232.404000]  [<c0242dcc>] ata_scsi_simulate+0xb0/0x13f
[...]

Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

ia64: add pci_get_legacy_ide_irq()

Add pci_get_legacy_ide_irq() identical to the one used by i386/x86_64.
Fixes amd74xx driver build on ia64 (bugzilla bug #6644).

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

hwmon: Update Rudolf Marek's e-mail address

The Silicon Hill club is not what it used to be.

Signed-off-by: Rudolf Marek <r.marek@assembler.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

k8temp: Documentation update

Update the documentation for the k8temp driver.

Signed-off-by: Rudolf Marek <r.marek@assembler.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

k8temp: Add documentation

Signed-off-by: Rudolf Marek <r.marek@sh.cvut.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Make 'repair' actually work for raid1.

When 'repair' finds a block that is different one the various
parts of the mirror. it is meant to write a chosen good version
to the others. However it currently writes out the original data
to each. The memcpy to make all the data the same is missing.

Also correct a test so that 'repair' causes a repair, rather than
anything other then 'repair'.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

hwmon: New driver k8temp

Add support for the temperature sensor(s) found in AMD K8 CPUs.

Signed-off-by: Rudolf Marek <r.marek@sh.cvut.cz>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[SCSI] arcmsr: fix up sysfs values

The sysfs files in arcmsr are non-standard in that they aren't simple
filename value pairs, the values actually contain preceeding text which
would have to be parsed. The idea of sysfs files is that the file name
is the description and the contents is a simple value.

Fix up arcmsr to conform to this standard.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[SCSI] areca sysfs fix

Remove sysfs_remove_bin_file() return-value checking from the areca driver.

There's nothing a driver can do if sysfs file removal fails, so we'll soon be
changing sysfs_remove_bin_file() to internally print a diagnostic and to
return void.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[SCSI] arcmsr: initial driver, version 1.20.00.13

arcmsr is a driver for the Areca Raid controller, a host based RAID
subsystem that speaks SCSI at the firmware level.

This patch is quite a clean up over the initial submission with
contributions from:

Randy Dunlap <rdunlap@xenotime.net>
Christoph Hellwig <hch@lst.de>
Matthew Wilcox <matthew@wil.cx>
Adrian Bunk <bunk@stusta.de>

Signed-off-by: Erich Chen <erich@areca.com.tw>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Linux 2.6.16.39

Linux 2.6.16.39-rc1

[Bluetooth] Fix deadlock in the L2CAP layer

The Bluetooth L2CAP layer has 2 locks that are used in softirq context,
(one spinlock and one rwlock, where the softirq usage is readlock) but
where not all usages of the lock were _bh safe. The patch below corrects
this.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[Bluetooth] Add locking for bt_proto array manipulation

The bt_proto array needs to be protected by some kind of locking to
prevent a race condition between bt_sock_create and bt_sock_register.

And in addition all calls to sk_alloc need to be made GFP_ATOMIC now.

Signed-off-by: Masatake YAMATO <jet@gyve.org>
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[Bluetooth] Fix compat ioctl for BNEP, CMTP and HIDP

There exists no attempt do deal with the fact that a structure with
a uint32_t followed by a pointer is going to be different for 32-bit
and 64-bit userspace. Any 32-bit process trying to use it will be
failing with -EFAULT if it's lucky; suffering from having data dumped
at a random address if it's not.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[Bluetooth] Handle command complete event for exit periodic inquiry

The command complete event of the exit periodic inquiry command must
clear the HCI_INQUIRY flag and finish the HCI request.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[Bluetooth] Return EINPROGRESS for non-blocking socket calls

In case of non-blocking socket calls we should return EINPROGRESS
and not EAGAIN.

Signed-off-by: Ulisses Furquim <ulissesf@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

kbuild: explicitly turn off gcc stack-protector

Ubuntu has enabled -fstack-protector per default in gcc
breaking kernel build. Explicit turn it off for now.

Backported based on several patches by Sam Ravnborg <sam@ravnborg.org>.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

[Bluetooth] Fix uninitialized return value for RFCOMM sendmsg()

When calling send() with a zero length parameter on a RFCOMM socket
it returns a positive value. In this rare case the variable err is
used uninitialized and unfortunately its value is returned.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[Bluetooth] More checks if DLC is still attached to the TTY

If the DLC device is no longer attached to the TTY device, then return
errors or default values for various callbacks of the TTY layer.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

BLUETOOTH: Fix unaligned access in hci_send_to_sock.

The "u16 *" derefs of skb->data need to be wrapped inside of
a get_unaligned().

Thanks to Gustavo Zacarias for the bug report.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[Bluetooth] Check if DLC is still attached to the TTY

If the DLC device is no longer attached to the TTY device, then it
makes no sense to go through with changing the termios settings.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

sparc32: add offset in pci_map_sg()

Add sg->offset to sg->dvma_address in pci_map_sg() on sparc32. Without the
offset, transfers to buffers that do not begin on a page boundary will not
work as expected.

Signed-off-by: Jan Andersson <jan.andersson@ieee.org>
Acked-By: David Miller <davem@davemloft.net>

V4L/DVB: Missing statement in drivers/media/dvb/frontends/cx22700.c

Stumbled over this because of coverity (id #492),
seems like we are missing a return statement here and fail
to do proper bounds checking. If this assumption is false
we should at least change the identation to make it clear

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

V4L/DVB: Flexcop-usb: fix debug printk

.. fix debug printk. Why, oh why, one would want to do
(u16 & 0xff) << 8
and print it with %02x format?

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

V4L/DVB: Fix uninitialised variable in dvb_frontend_swzigzag

Spotted by coverity/Adrian Bunk.

Signed-off-by: Andrew de Quincey <adq_dvb@lidskialf.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[Bluetooth] Let BT_HIDP depend on INPUT

This patch lets BT_HIDP depend on instead of select INPUT. This fixes
the following warning during an s390 build:

net/bluetooth/hidp/Kconfig:4:warning: 'select' used by config symbol
'BT_HIDP' refer to undefined symbol 'INPUT'

A dependency on INPUT also implies !S390 (and therefore makes the
explicit dependency obsolete) since INPUT is not available on s390.

The practical difference should be nearly zero, since INPUT is always
set to y unless EMBEDDED=y (or S390=y).

Signed-off-by: Adrian Bunk <bunk@stusta.de>

i386: fix CPU hotplug with 2GB VMSPLIT

In VMSPLIT mode, kernel PGD might have more entries than user space

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

read_zero_pagealigned() locking fix

Ramiro Voicu hits the BUG_ON(!pte_none(*pte)) in zeromap_pte_range: kernel
bugzilla 7645.  Right: read_zero_pagealigned uses down_read of mmap_sem,
but another thread's racing read of /dev/zero, or a normal fault, can
easily set that pte again, in between zap_page_range and zeromap_page_range
getting there.  It's been wrong ever since 2.4.3.

The simple fix is to use down_write instead, but that would serialize reads
of /dev/zero more than at present: perhaps some app would be badly
affected.  So instead let zeromap_page_range return the error instead of
BUG_ON, and read_zero_pagealigned break to the slower clear_user loop in
that case - there's no need to optimize for it.

Use -EEXIST for when a pte is found: BUG_ON in mmap_zero (the other user of
zeromap_page_range), though it really isn't interesting there.  And since
mmap_zero wants -EAGAIN for out-of-memory, the zeromaps better return that
than -ENOMEM.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

atiixp: hang fix

When the old IDE layer calls into methods in the driver during error
handling it is essentially random whether ide_lock is already held. This
causes a deadlock in the atiixp driver which also uses ide_lock internally
for locking.

Switch to a private lock instead.

[akpm@osl.org: cleanup]
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

cdrom: set default timeout to 7 seconds

It's a known fact that Windows times out commands after 7 seconds, so
drives generally try and respond if they can before that happens. We
default to 5 seconds, which sometimes is a bit too short.

Jeremy Higdon reported here:

http://lkml.org/lkml/2007/1/1/145

that his drive takes longer than 5 seconds for a "read track
information" command, later confirming that it is about 6.7 seconds.

So just do the sane thing and change the default command timeout to 7
seconds to avoid other surprises.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[SCSI] qla1280 bus reset typo

Fix typo in check of return value of qla1280_bus_reset() which would
result in an adapter reset in addition to the bus reset.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[SCSI] qla1280 command timeout

Original patch from Ian Dall in bugzilla. Set command timeout as
specified by the SCSI layer rather than hardcode it to 30 seconds. I
have received a couple of reports of people hitting this one with
various tape configurations and the patch looks obviously correct.

From http://bugzilla.kernel.org/show_bug.cgi?id=6275

Ian Dall <ian@beware.dropbear.id.au>:

The command sent to the card was using a 30second timeout regardless of the
timeout requested in the scsi command passed down from higher levels.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

adfs: fix filename handling

Fix filenames on adfs discs being terminated at the first character greater
than 128 (adfs filenames are Latin 1). I saw this problem when using a
loopback adfs image on a 2.6.17-rc5 x86_64 machine, and the patch fixed it
there.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

s390: connector support

Include connector config in the s390 arch Kconfig to get support for
connectors.

This also fixes the following Kconfig warning:
fs/Kconfig:1728:warning: 'select' used by config symbol 'CIFS_UPCALL' refer to undefined symbol 'CONNECTOR'

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

NETFILTER: arp_tables: missing unregistration on module unload

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

NETFILTER: NAT: fix NOTRACK checksum handling

The whole idea with the NOTRACK netfilter target is that
you can force the netfilter code to avoid connection
tracking, and all costs assosciated with it, by making
traffic match a NOTRACK rule.

But this is totally broken by the fact that we do a checksum
calculation over the packet before we do the NOTRACK bypass
check, which is very expensive. People setup NOTRACK rules
explicitly to avoid all of these kinds of costs.

This patch from Patrick, already in Linus's tree, fixes the
bug.

Move the check for ip_conntrack_untracked before the call to
skb_checksum_help to fix NOTRACK excemptions from NAT. Pre-2.6.19
NAT code breaks TSO by invalidating hardware checksums for every
packet, even if explicitly excluded from NAT through NOTRACK.

2.6.19 includes a fix that makes NAT and TSO live in harmony,
but the performance degradation caused by this deserves making
at least the workaround work properly in -stable.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

mm: fix bug in set_page_dirty_buffers

This was triggered, but not the fault of, the dirty page accounting
patches. Suitable for -stable as well, after it goes upstream.

Unable to handle kernel NULL pointer dereference at virtual address 0000004c
EIP is at _spin_lock+0x12/0x66
Call Trace:
[<401766e7>] __set_page_dirty_buffers+0x15/0xc0
[<401401e7>] set_page_dirty+0x2c/0x51
[<40140db2>] set_page_dirty_balance+0xb/0x3b
[<40145d29>] __do_fault+0x1d8/0x279
[<40147059>] __handle_mm_fault+0x125/0x951
[<401133f1>] do_page_fault+0x440/0x59f
[<4034d0c1>] error_code+0x39/0x40
[<08048a33>] 0x8048a33
=======================

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Linux 2.6.16.38

Linux 2.6.16.38-rc2

[IPV6] Fix joining all-node multicast group.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

UML: fix the MODE_TT compilation

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Linux 2.6.16.38-rc1

x86_64: re-add a newline to RESTORE_CONTEXT

RESTORE_CONTEXT lost a newline:
http://www.mail-archive.com/kgdb-bugreport@lists.sourceforge.net/msg00559.html

Reported by Steven M. Christey.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

ALSA: snd_rtctimer: handle RTC interrupts with a tasklet

The calls to rtc_control() from inside the interrupt handler can
deadlock the RTC code, so move our interrupt handling code to a tasklet.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Acked-By: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

ALSA: emu10k1: Fix outl() in snd_emu10k1_resume_regs()

The emu10k1 driver saves the A_IOCFG and HCFG register on suspend and restores
it on resumes. Unfortunately, this doesn't work as the arguments to outl() are
reversed.

Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

ALSA: Fix initiailization of user-space controls

Fix an assertion when accessing a user-defined control due to lack of
initialization (appears only when CONFIG_SND_DEBUg is enabled).

ALSA sound/core/control.c:660: BUG? (info->access == 0)

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

skip data conversion in compat_sys_mount when data_page is NULL

OpenVZ Linux kernel team has found a problem with mounting in compat mode.

Simple command "mount -t smbfs ..." on Fedora Core 5 distro in 32-bit mode
leads to oops:

Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
[<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290
PGD 34d48067 PUD 34d03067 PMD 0
Oops: 0000 [1] SMP
CPU: 0
Modules linked in: iptable_nat simfs smbfs ip_nat ip_conntrack vzdquota
parport_pc lp parport 8021q bridge llc vznetdev vzmon nfs lockd sunrpc vzdev
iptable_filter af_packet xt_length ipt_ttl xt_tcpmss ipt_TCPMSS
iptable_mangle xt_limit ipt_tos ipt_REJECT ip_tables x_tables thermal
processor fan button battery asus_acpi ac uhci_hcd ehci_hcd usbcore i2c_i801
i2c_core e100 mii floppy ide_cd cdrom
Pid: 14656, comm: mount
RIP: 0060:[<ffffffff802bc7c6>]  [<ffffffff802bc7c6>]
compat_sys_mount+0xd6/0x290
RSP: 0000:ffff810034d31f38  EFLAGS: 00010292
RAX: 000000000000002c RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff810034c86bc0 RSI: 0000000000000096 RDI: ffffffff8061fc90
RBP: ffff810034d31f78 R08: 0000000000000000 R09: 000000000000000d
R10: ffff810034d31e58 R11: 0000000000000001 R12: ffff810039dc3000
R13: 000000000805ea48 R14: 0000000000000000 R15: 00000000c0ed0000
FS:  0000000000000000(0000) GS:ffffffff80749000(0033) knlGS:00000000b7d556b0
CS:  0060 DS: 007b ES: 007b CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000034d43000 CR4: 00000000000006e0
Process mount (pid: 14656, veid=300, threadinfo ffff810034d30000, task
ffff810034c86bc0)
Stack:  0000000000000000 ffff810034dd0000 ffff810034e4a000 000000000805ea48
0000000000000000 0000000000000000 0000000000000000 0000000000000000
000000000805ea48 ffffffff8021e64e 0000000000000000 0000000000000000
Call Trace:
[<ffffffff8021e64e>] ia32_sysret+0x0/0xa

Code: 83 3b 06 0f 85 41 01 00 00 0f b7 43 0c 89 43 14 0f b7 43 0a
RIP  [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290
RSP <ffff810034d31f38>
CR2: 0000000000000000

The problem is that data_page pointer can be NULL, so we should skip data
conversion in this case.

Signed-off-by: Andrey Mirkin <amirkin@openvz.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

rtc: lockdep fix/workaround

BUG: warning at kernel/lockdep.c:1816/trace_hardirqs_on() (Not tainted)
[<c04051ee>] show_trace_log_lvl+0x58/0x171
[<c0405802>] show_trace+0xd/0x10
[<c040591b>] dump_stack+0x19/0x1b
[<c043abee>] trace_hardirqs_on+0xa2/0x11e
[<c06143c3>] _spin_unlock_irq+0x22/0x26
[<c0541540>] rtc_get_rtc_time+0x32/0x176
[<c0419ba4>] hpet_rtc_interrupt+0x92/0x14d
[<c0450f94>] handle_IRQ_event+0x20/0x4d
[<c0451055>] __do_IRQ+0x94/0xef
[<c040678d>] do_IRQ+0x9e/0xbd
[<c0404a49>] common_interrupt+0x25/0x2c
DWARF2 unwinder stuck at common_interrupt+0x25/0x2c

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

ebtables: check struct type before computing gap

Check struct type before dereferencing fields in ebt_entry.
Failure to check can cause oops.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

i2c-mv64xxx: Fix random oops at boot

I have a Marvell board which has the same i2c hw block than mv64xxx, so
I'm trying to use i2c-mv64xxx driver.

But I get the following random oops at boot:

Unable to handle kernel NULL pointer dereference at virtual address 00000002
Backtrace:
[<c0397e4c>] (mv64xxx_i2c_intr+0x0/0x2b8) from [<c02879c4>] (__do_irq+0x4c/0x8c)
[<c0287978>] (__do_irq+0x0/0x8c) from [<c0287c0c>] (do_level_IRQ+0x68/0xc0)
r8 = C0501E08  r7 = 00000005  r6 = C0501E08  r5 = 00000005
r4 = C048BB78
[<c0287ba4>] (do_level_IRQ+0x0/0xc0) from [<c02885f8>] (asm_do_IRQ+0x50/0x134)
r6 = C0449C78  r5 = F1020000  r4 = FFFFFFFF
[<c02885a8>] (asm_do_IRQ+0x0/0x134) from [<c02869c4>] (__irq_svc+0x24/0x100)
r8 = C1CAC400  r7 = 00000005  r6 = 00000002  r5 = F1020000
r4 = FFFFFFFF
[<c0287efc>] (setup_irq+0x0/0x124) from [<c02880d0>] (request_irq+0xb0/0xd0)
r7 = C041B2AC  r6 = C0397E4C  r5 = 00000000  r4 = 00000005
[<c0288020>] (request_irq+0x0/0xd0) from [<c03985f4>] (mv64xxx_i2c_probe+0x148/0x244)
[<c03984ac>] (mv64xxx_i2c_probe+0x0/0x244) from [<c038bedc>] (platform_drv_probe+0x20/0x24)

The oops is caused by a spurious interrupt that occurs when request_irq
is called. mv64xxx_i2c_fsm() tries to read drv_data->msg, which is NULL.

I noticed that hardware init is done after requesting irq. Thus any
pending irq from previous hardware usage may cause this.

Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

V4L: cx88: Fix leadtek_eeprom tagging

reference to .init.text: from .text between 'cx88_card_setup'
(at offset 0x68c) and 'cx88_risc_field'
Caused by leadtek_eeprom() being declared __devinit and called from
a non-devinit context.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

corrupted cramfs filesystems cause kernel oops (CVE-2006-5823)

Steve Grubb's fzfuzzer tool (http://people.redhat.com/sgrubb/files/
fsfuzzer-0.6.tar.gz) generates corrupt Cramfs filesystems which cause
Cramfs to kernel oops in cramfs_uncompress_block().  The cause of the oops
is an unchecked corrupted block length field read by cramfs_readpage().

This patch adds a sanity check to cramfs_readpage() which checks that the
block length field is sensible.  The (PAGE_CACHE_SIZE << 1) size check is
intentional, even though the uncompressed data is not going to be larger
than PAGE_CACHE_SIZE, gzip sometimes generates compressed data larger than
the original source data.  Mkcramfs checks that the compressed size is
always less than or equal to PAGE_CACHE_SIZE << 1.  Of course Cramfs could
use the original uncompressed data in this case, but it doesn't.

Signed-off-by: Phillip Lougher <phillip@lougher.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

handle ext3 directory corruption better (CVE-2006-6053)

I've been using Steve Grubb's purely evil "fsfuzzer" tool, at
http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz

Basically it makes a filesystem, splats some random bits over it, then
tries to mount it and do some simple filesystem actions.

At best, the filesystem catches the corruption gracefully.  At worst,
things spin out of control.

As you might guess, we found a couple places in ext3 where things spin out
of control :)

First, we had a corrupted directory that was never checked for
consistency...  it was corrupt, and pointed to another bad "entry" of
length 0.  The for() loop looped forever, since the length of
ext3_next_entry(de) was 0, and we kept looking at the same pointer over and
over and over and over...  I modeled this check and subsequent action on
what is done for other directory types in ext3_readdir...

(adding this check adds some computational expense; I am testing a followup
patch to reduce the number of times we check and re-check these directory
entries, in all cases.  Thanks for the idea, Andreas).

Next we had a root directory inode which had a corrupted size, claimed to
be > 200M on a 4M filesystem.  There was only really 1 block in the
directory, but because the size was so large, readdir kept coming back for
more, spewing thousands of printk's along the way.

Per Andreas' suggestion, if we're in this read error condition and we're
trying to read an offset which is greater than i_blocks worth of bytes,
stop trying, and break out of the loop.

With these two changes fsfuzz test survives quite well on ext3.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

ext2: skip pages past number of blocks in ext2_find_entry (CVE-2006-6054)

This one was pointed out on the MOKB site:
http://kernelfun.blogspot.com/2006/11/mokb-09-11-2006-linux-26x-ext2checkpage.html

If a directory's i_size is corrupted, ext2_find_entry() will keep processing
pages until the i_size is reached, even if there are no more blocks associated
with the directory inode. This patch puts in some minimal sanity-checking
so that we don't keep checking pages (and issuing errors) if we know there
can be no more data to read, based on the block count of the directory inode.

This is somewhat similar in approach to the ext3 patch I sent earlier this
year.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

hfs_fill_super returns success even if no root inode (CVE-2006-6056)

http://kernelfun.blogspot.com/2006/11/mokb-14-11-2006-linux-26x-selinux.html

mount that image...
fs: filesystem was not cleanly unmounted, running fsck.hfs is recommended.  mounting read-only.
hfs: get root inode failed.
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000018
printing eip
...
EIP is at superblock_doinit+0x21/0x767
...
[] selinux_sb_kern_mount+0xc/0x4b
[] vfs_kern_mount+0x99/0xf6
[] do_kern_mount+0x2d/0x3e
[] do_mount+0x5fa/0x66d
[] sys_mount+0x77/0xae
[] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb

hfs_fill_super() returns success even if
  root_inode = hfs_iget(sb, &fd.search_key->cat, &rec);
or
  sb->s_root = d_alloc_root(root_inode);

fails.  This superblock finds its way to superblock_doinit() which does:

        struct dentry *root = sb->s_root;
        struct inode *inode = root->d_inode;

and boom.  Need to make sure the error cases return an error, I think.

[akpm@osdl.org: return -ENOMEM on oom]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

USB_RTL8150 must select MII to avoid link errors.

Stolen from a patch by Randy Dunlap.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

Fix for shmem_truncate_range() BUG_ON()

Ran into BUG() while doing madvise(REMOVE) testing. If we are punching a
hole into shared memory segment using madvise(REMOVE) and the entire hole
is below the indirect blocks, we hit following assert.

BUG_ON(limit <= SHMEM_NR_DIRECT);

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Forwarded-by: Jordan Neumeyer
Signed-off-by: Adrian Bunk <bunk@stusta.de>

TCP: Fix and simplify microsecond rtt sampling

This changes the microsecond RTT sampling so that samples are taken in
the same way that RTT samples are taken for the RTO calculator: on the
last segment acknowledged, and only when the segment hasn't been
retransmitted.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

uml: fix processor selection

Makes UML compile on any possible processor choice. The two problems were:

*) x86 code, when 386 is selected, checks at runtime boot_cpuflags, which we
   not have.
*) 3Dnow support for memcpy() et al. does not compile currently and fixing t
   is not trivial, so simply disable it; with this change, if one selects MK
   UML compiles (while it did not).
Merged upstream.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

rio: typo in bitwise AND expression.

The line:

hp->Mode &= !RIO_PCI_INT_ENABLE;

is obviously wrong as RIO_PCI_INT_ENABLE=0x04 and is used as a bitmask
2 lines before. Getting no IRQ would not disable RIO_PCI_INT_ENABLE
but rather RIO_PCI_BOOT_FROM_RAM which equals 0x01.

Obvious fix is to change ! for ~.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

drm: allow detection of new VIA chipsets

Update pci ids.

Signed-off-by: Chuck Short <zulcss@gmail.com>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

drm: Add the P4VM800PRO PCI ID.

Signed-off-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

i2c-i801: SMBus patch for Intel ICH9

This updated patch adds the Intel ICH9 LPC and SMBus Controller DID's.

Signed-off-by: Jason Gaston <jason.d.gaston@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

PCI: irq: irq and pci_ids patch for Intel ICH9

This updated patch adds the Intel ICH9 LPC and SMBus Controller DID's.

Signed-off-by: Jason Gaston <jason.d.gaston@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

i2c-viapro: Add support for the VT8237A and VT8251

Documentation update included. Compile tested.

Signed-off-by: Rudolf Marek <r.marek@sh.cvut.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

SPI/MTD: mtd_dataflash oops prevention

Return a fault code if the Dataflash driver runs into a "no device present"
error when the MISO line has a pulldown (it currently expects a pullup), so
that rmmod won't oops.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[IPV4/IPV6]: Fix inet{,6} device initialization order.

It is important that we only assign dev->ip{,6}_ptr
only after all portions of the inet{,6} are setup.

Otherwise we can receive packets before the multicast
spinlocks et al. are initialized.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[SOUND] Sparc CS4231: Use 64 for period_bytes_min

This matches what the ISA cs4231 driver uses.

Tested by Georg Chini.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[SOUND] Sparc CS4231: Fix IRQ return value and initialization.

SBUS: Change IRQ-handler return value from 0 to IRQ_HANDLED and
fix some initialisation problems.

Change period_bytes_min from 4096 to 256 to allow driver to work with
low latency (VOIP) applications. Hope this does not break EBUS.

Signed-off-by: Georg Chini <georg.chini@triaton-webhosting.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

USB: Fix alignment of buffer passed down to ->hub_control()

Implementations assume the buffer is at least 4 byte aligned.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

fix the UML compilation

Based on patches from Linus' tree.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

[SUNKBD]: Fix sunkbd_enable(sunkbd, 0); obvious.

"sunkbd_enable(sunkbd, 0);" has no effect. Adding "sunkbd->enabled =
enable" in sunkbd_enable (obvious)

Signed-off-by: Fabrice Knevez <nuxdoors@cegetel.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

ibmtr section fixes

WARNING: drivers/net/tokenring/ibmtr.o - Section mismatch: reference to .init.data:ibmtr_mem_base from .text between 'ibmtr_probe1' (at offset 0x6e6) and 'ibmtr_probe_card'
WARNING: drivers/net/tokenring/ibmtr.o - Section mismatch: reference to .init.data:ibmtr_mem_base from .text between 'ibmtr_probe1' (at offset 0x74a) and 'ibmtr_probe_card'
WARNING: drivers/net/tokenring/ibmtr.o - Section mismatch: reference to .init.data:ibmtr_mem_base from .text between 'ibmtr_probe1' (at offset 0x7fd) and 'ibmtr_probe_card'

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

x86_64: Don't leak NT bit into next task (CVE-2006-5755)

SYSENTER can cause a NT to be set which might cause crashes on the IRET
in the next task.

Following similar i386 patch from Linus.

Backport to 2.6.16 by Chuck Ebbert <76306.1226@compuserve.com>
[Changed 'set_debugreg' to the older 'set_debug' in setup64.c
and added raw_local_save_flags() from 2.6.19 to system.h]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

x86_64: fix ia32 syscall count

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Bluetooth: Add packet size checks for CAPI messages (CVE-2006-6106)

With malformed packets it might be possible to overwrite internal
CMTP and CAPI data structures. This patch adds additional length
checks to prevent these kinds of remote attacks.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

grow_buffers() infinite loop fix (CVE-2006-5757/CVE-2006-6060)

If grow_buffers() is for some reason passed a block number which wants to li
outside the maximum-addressable pagecache range (PAGE_SIZE * 4G bytes) then
will accidentally truncate `index' and will then instnatiate a page at the
wrong pagecache offset. This causes __getblk_slow() to go into an infinite
loop.

This can happen with corrupted disks, or with software errors elsewhere.

Detect that, and handle it.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

i386: save/restore eflags in context switch (CVE-2006-5173)

(And reset it on new thread creation)

It turns out that eflags is important to save and restore not just
because of iopl, but due to the magic bits like the NT bit, which we
don't want leaking between different threads.

Backported to 2.6.16 by Chuck Ebbert <76306.1226@compuserve.com>
[Backport consisted of removing the CFI annotations.]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Call init_timer() for ISDN PPP CCP reset state timer (CVE-2006-5749)

The function isdn_ppp_ccp_reset_alloc_state() sets ->timer.function
and ->timer.data and later on calls add_timer() with no init_timer()
ever done.

Noted by Al Viro.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Fix incorrect user space access locking in mincore() (CVE-2006-4814)

Doug Chapman noticed that mincore() will doa "copy_to_user()" of the
result while holding the mmap semaphore for reading, which is a big
no-no. While a recursive read-lock on a semaphore in the case of a page
fault happens to work, we don't actually allow them due to deadlock
schenarios with writers due to fairness issues.

Doug and Marcel sent in a patch to fix it, but I decided to just rewrite
the mess instead - not just fixing the locking problem, but making the
code smaller and (imho) much easier to understand.

Also included are two fixes for the original patch including one
by Oleg Nesterov.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

fuse: fix hang on SMP

Fuse didn't always call i_size_write() with i_mutex held which caused
rare hangs on SMP/32bit. This bug has been present since fuse-2.2,
well before being merged into mainline.

The simplest solution is to protect i_size_write() with the
per-connection spinlock. Using i_mutex for this purpose would require
some restructuring of the code and I'm not even sure it's always safe
to acquire i_mutex in all places i_size needs to be set.

Since most of vmtruncate is already duplicated for other reasons,
duplicate the remaining part as well, making all i_size_write() calls
internal to fuse.

Using i_size_write() was unnecessary in fuse_init_inode(), since this
function is only called on a newly created locked inode.

Reported by a few people over the years, but special thanks to Dana
Henriksen who was persistent enough in helping me debug it.

Adrian Bunk:
Backported to 2.6.16.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

[PKTGEN]: Fix module load/unload races.

Adrian Bunk:
Backported to 2.6.16.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

i2c: fix broken ds1337 initialization

On a custom board with ds1337 RTC I found that upgrade from 2.6.15 to
2.6.18 broke RTC support.

The main problem are changes to ds1337_init_client().
When a ds1337 recognizes a problem (e.g. power or clock failure) bit 7
in status register is set. This has to be reset by writing 0 to status
register. But since there are only 16 byte written to the chip and the
first byte is interpreted as an address, the status register (which is
the 16th) is never written.
The other problem is, that initializing all registers to zero is not
valid for day, date and month register. Funny enough this is checked by
ds1337_detect(), which depends on this values not being zero. So then
treated by ds1337_init_client() the ds1337 is not detected anymore,
whereas the failure bit in the status register is still set.

Broken by commit f9e8957937ebf60d22732a5ca9130f48a7603f60 (2.6.16-rc1,
2006-01-06). This fix is in Linus' tree since 2.6.20-rc1 (commit
763d9c046a2e511ec090a8986d3f85edf7448e7e).

Signed-off-by: Dirk Stieler <stieler@gdsys.de>
Signed-off-by: Dirk Eibach <eibach@gdsys.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

NET_SCHED: Fix fallout from dev->qdisc RCU change

The move of qdisc destruction to a rcu callback broke locking in the
entire qdisc layer by invalidating previously valid assumptions about
the context in which changes to the qdisc tree occur.

The two assumptions were:

- since changes only happen in process context, read_lock doesn't need
  bottem half protection. Now invalid since destruction of inner qdiscs,
  classifiers, actions and estimators happens in the RCU callback unless
  they're manually deleted, resulting in dead-locks when read_lock in
  process context is interrupted by write_lock_bh in bottem half context.

- since changes only happen under the RTNL, no additional locking is
  necessary for data not used during packet processing (f.e. u32_list).
  Again, since destruction now happens in the RCU callback, this assumption
  is not valid anymore, causing races while using this data, which can
  result in corruption or use-after-free.

Instead of "fixing" this by disabling bottem halfs everywhere and adding
new locks/refcounting, this patch makes these assumptions valid again by
moving destruction back to process context. Since only the dev->qdisc
pointer is protected by RCU, but ->enqueue and the qdisc tree are still
protected by dev->qdisc_lock, destruction of the tree can be performed
immediately and only the final free needs to happen in the rcu callback
to make sure dev_queue_xmit doesn't access already freed memory.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Linux 2.6.16.37

Linux 2.6.16.37-rc1