Neil Brown [Fri, 15 Dec 2006 00:48:58 +0000 (01:48 +0100)]
md: Fix md grow/size code to correctly find the maximum available space
An md array can be asked to change the amount of each device that it is using,
and in particular can be asked to use the maximum available space. This
currently only works if the first device is not larger than the rest. As
'size' gets changed and so 'fit' becomes wrong. So check if a 'fit' is
required early and don't corrupt it.
Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Zachary Amsden [Fri, 15 Dec 2006 00:38:04 +0000 (01:38 +0100)]
softirq: remove BUG_ONs which can incorrectly trigger
It is possible to have tasklets get scheduled before softirqd has had a chance
to spawn on all CPUs. This is totally harmless; after success during action
CPU_UP_PREPARE, action CPU_ONLINE will be called, which immediately wakes
softirqd on the appropriate CPU to process the already pending tasklets. So
there is no danger of having a missed wakeup for any tasklets that were
already pending.
In particular, i386 is affected by this during startup, and is visible when
using a very large initrd; during the time it takes for the initrd to be
decompressed, a timer IRQ can come in and schedule RCU callbacks. It is also
possible that resending of a hardware IRQ via a softirq triggers the same bug.
Because of different timing conditions, this shows up in all emulators and
virtual machines tested, including Xen, VMware, Virtual PC, and Qemu. It is
also possible to trigger on native hardware with a large enough initrd,
although I don't have a reliable case demonstrating that.
Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Christophe Saout [Fri, 15 Dec 2006 00:21:59 +0000 (01:21 +0100)]
dm crypt: Fix data corruption with dm-crypt over RAID5
Fix corruption issue with dm-crypt on top of software raid5. Cancelled
readahead bio's that report no error, just have BIO_UPTODATE cleared
were reported as successful reads to the higher layers (and leaving
random content in the buffer cache). Already fixed in 2.6.19.
Signed-off-by: Christophe Saout <christophe@saout.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Christophe Saout [Fri, 15 Dec 2006 00:20:35 +0000 (01:20 +0100)]
Fix SUNRPC wakeup/execute race condition
The sunrpc scheduler contains a race condition that can let an RPC
task end up being neither running nor on any wait queue. The race takes
place between rpc_make_runnable (called from rpc_wake_up_task) and
__rpc_execute under the following condition:
First __rpc_execute calls tk_action which puts the task on some wait
queue. The task is dequeued by another process before __rpc_execute
continues its execution. While executing rpc_make_runnable exactly after
setting the task `running' bit and before clearing the `queued' bit
__rpc_execute picks up execution, clears `running' and subsequently
both functions fall through, both under the false assumption somebody
else took the job.
Swapping rpc_test_and_set_running with rpc_clear_queued in
rpc_make_runnable fixes that hole. This introduces another possible
race condition that can be handled by checking for `queued' after
setting the `running' bit.
Bug noticed on a 4-way x86_64 system under XEN with an NFSv4 server
on the same physical machine, apparently one of the few ways to hit
this race condition at all.
Mark McLoughlin [Thu, 14 Dec 2006 22:09:07 +0000 (23:09 +0100)]
dm snapshot: fix metadata writing when suspending
When suspending a device-mapper device, dm_suspend() sleeps until all
necessary I/O is completed. This state is triggered by a callback from
persistent_commit(). But some I/O can still be issued *after* the callback
(to prepare the next metadata area for use if the current one is full). This
patch delays the callback until after that I/O is complete.
Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Daniel Kobras [Thu, 14 Dec 2006 22:08:32 +0000 (23:08 +0100)]
dm: Fix deadlock under high i/o load in raid1 setup.
On an nForce4-equipped machine with two SATA disk in raid1 setup using dmraid,
we experienced frequent deadlock of the system under high i/o load. 'cat
/dev/zero > ~/zero' was the most reliable way to reproduce them: Randomly
after a few GB, 'cp' would be left in 'D' state along with kjournald and
kmirrord. The functions cp and kjournald were blocked in did vary, but
kmirrord's wchan always pointed to 'mempool_alloc()'. We've seen this pattern
on 2.6.15 and 2.6.17 kernels. http://lkml.org/lkml/2005/4/20/142 indicates
that this problem has been around even before.
So much for the facts, here's my interpretation: mempool_alloc() first tries
to atomically allocate the requested memory, or falls back to hand out
preallocated chunks from the mempool. If both fail, it puts the calling
process (kmirrord in this case) on a private waitqueue until somebody refills
the pool. Where the only 'somebody' is kmirrord itself, so we have a
deadlock.
I worked around this problem by falling back to a (blocking) kmalloc when
before kmirrord would have ended up on the waitqueue. This defeats part of
the benefits of using the mempool, but at least keeps the system running. And
it could be done with a two-line change. Note that mempool_alloc() clears the
GFP_NOIO flag internally, and only uses it to decide whether to wait or return
an error if immediate allocation fails, so the attached patch doesn't change
behaviour in the non-deadlocking case. Path is against current git
(2.6.18-rc4), but should apply to earlier versions as well. I've tested on
2.6.15, where this patch makes the difference between random lockup and a
stable system.
Signed-off-by: Daniel Kobras <kobras@linux.de> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Neil Brown [Thu, 14 Dec 2006 22:07:54 +0000 (23:07 +0100)]
dm: mirror sector offset fix
The device-mapper core does not perform any remapping of bios before passing
them to the targets. If a particular mapping begins part-way into a device,
targets obtain the sector relative to the start of the mapping by subtracting
ti->begin.
The dm-raid1 target didn't do this everywhere: this patch fixes it, taking
care to subtract ti->begin exactly once for each bio.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jeff Mahoney [Thu, 14 Dec 2006 22:07:12 +0000 (23:07 +0100)]
dm: add module ref counting
The reference counting on dm-mod is zero if no mapped devices are open. This
is incorrect, and can lead to an oops if the module is unloaded while mapped
devices exist.
This patch claims a reference to the module whenever a device is created, and
drops it again when the device is freed.
Devices must be removed before dm-mod is unloaded.
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Persistent snapshots currently store a private copy of the chunk size.
Userspace also supplies the chunk size when loading a snapshot. Ensure
consistency by only storing the chunk_size in one place instead of two.
Currently the two sizes will differ if the chunk size supplied by userspace
does not match the chunk size an existing snapshot actually uses. Amongst
other problems, this causes an incorrect 'percentage full' to be reported.
The patch ensures consistency by only storing the chunk_size in one place,
removing it from struct pstore. Some initialisation is delayed until the
correct chunk_size is known. If read_header() discovers that the wrong chun
size was supplied, the 'area' buffer (which the header already got read into
is reinitialised to the correct size.
Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Michal Miroslaw [Thu, 14 Dec 2006 22:02:33 +0000 (23:02 +0100)]
dm: BUG/OOPS fix
Fix BUG I tripped on while testing failover and multipathing.
BUG shows up on error path in multipath_ctr() when parse_priority_group()
fails after returning at least once without error. The fix is to
initialize m->ti early - just after alloc()ing it.
Joerg Ahrens [Thu, 14 Dec 2006 21:29:54 +0000 (22:29 +0100)]
xirc2ps_cs: Cannot reset card in atomic context
I am using a Xircom CEM33 pcmcia NIC which has occasional hardware problems.
If the netdev watchdog detects a transmit timeout, do_reset is called which
msleeps - this is illegal in atomic context.
This patch schedules the timeout handling as a workqueue item.
Alexey Kuznetsov [Thu, 14 Dec 2006 21:28:51 +0000 (22:28 +0100)]
[IPV4]: severe locking bug in fib_semantics.c
Found in 2.4 by Yixin Pan <yxpan@hotmail.com>.
> When I read fib_semantics.c of Linux-2.4.32, write_lock(&fib_info_lock) =
> is used in fib_release_info() instead of write_lock_bh(&fib_info_lock). =
> Is the following case possible: a BH interrupts fib_release_info() while =
> holding the write lock, and calls ip_check_fib_default() which calls =
> read_lock(&fib_info_lock), and spin forever.
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Hans Verkuil [Thu, 14 Dec 2006 21:26:40 +0000 (22:26 +0100)]
V4L: Fix broken TUNER_LG_NTSC_TAPE radio support
The TUNER_LG_NTSC_TAPE is identical in all respects to the
TUNER_PHILIPS_FM1236_MK3. So use the params struct for the Philips
tuner.
Also add this LG_NTSC_TAPE tuner to the switches where radio specific
parameters are set so it behaves like a TUNER_PHILIPS_FM1236_MK3. This
change fixes the radio support for this tuner (the wrong bandswitch byte
was used).
Thanks to Andy Walls <cwalls@radix.net> for finding this bug.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Michael Krufky [Thu, 14 Dec 2006 21:25:04 +0000 (22:25 +0100)]
DVB: lgdt330x: fix signal / lock status detection bug
In some cases when using VSB, the AGC status register has been known to
falsely report "no signal" when in fact there is a carrier lock. The
datasheet labels these status flags as QAM only, yet the lgdt330x
module is using these flags for both QAM and VSB.
This patch allows for the carrier recovery lock status register to be
tested, even if the agc signal status register falsely reports no signal.
Thanks to jcrews from #linuxtv in irc, for initially reporting this bug.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Chuck Ebbert [Sat, 9 Dec 2006 15:21:59 +0000 (16:21 +0100)]
binfmt_elf: fix checks for bad address
Fix check for bad address; use macro instead of open-coding two checks.
Taken from RHEL4 kernel update.
From: Ernie Petrides <petrides@redhat.com>
For background, the BAD_ADDR() macro should return TRUE if the address is
TASK_SIZE, because that's the lowest address that is *not* valid for
user-space mappings. The macro was correct in binfmt_aout.c but was wrong
for the "equal to" case in binfmt_elf.c. There were two in-line validations
of user-space addresses in binfmt_elf.c, which have been appropriately
converted to use the corrected BAD_ADDR() macro in the patch you posted
yesterday. Note that the size checks against TASK_SIZE are okay as coded.
The additional changes that I propose are below. These are in the error
paths for bad ELF entry addresses once load_elf_binary() has already
committed to exec'ing the new image (following the tearing down of the
task's original address space).
The 1st hunk deals with the interp-side of the outer "if". There were two
problems here. The printk() should be removed because this path can be
triggered at will by a bogus interpreter image created and used by a
malicious user. Further, the error code should not be ENOEXEC, because that
causes the loop in search_binary_handler() to continue trying other exec
handlers (twice, in fact). But it's too late for this to work correctly,
because the user address space has already been torn down, and an exec()
failure cannot be returned to the user code because the code no longer
exists. The only recovery is to force a SIGSEGV, but it's best to terminate
the search loop immediately. I somewhat arbitrarily chose EINVAL as a
fallback error code, but any error returned by load_elf_interp() will
override that (but this value will never be seen by user-space).
The 2nd hunk deals with the non-interp-side of the outer "if". There were
two problems here as well. The SIGSEGV needs to be forced, because a prior
sigaction() syscall might have set the associated disposition to SIG_IGN.
And the ENOEXEC should be changed to EINVAL as described above.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Patrick McHardy [Sat, 9 Dec 2006 15:14:39 +0000 (16:14 +0100)]
[XFRM]: Use output device disable_xfrm for forwarded packets
Currently the behaviour of disable_xfrm is inconsistent between
locally generated and forwarded packets. For locally generated
packets disable_xfrm disables the policy lookup if it is set on
the output device, for forwarded traffic however it looks at the
input device. This makes it impossible to disable xfrm on all
devices but a dummy device and use normal routing to direct
traffic to that device.
Always use the output device when checking disable_xfrm.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Daniel Ritz [Wed, 6 Dec 2006 19:19:36 +0000 (20:19 +0100)]
PCI: fix ICH6 quirks
- add the ICH6(R) LPC to the ICH6 ACPI quirks. currently only the ICH6-M
is handled. [ PCI_DEVICE_ID_INTEL_ICH6_1 is the ICH6-M LPC, ICH6_0 is
the ICH6(R) ]
Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Bjorn Helgaas [Wed, 6 Dec 2006 19:17:30 +0000 (20:17 +0100)]
PCI: quirk to disable e100 interrupt if RESET failed to
Without this quirk, e100 can be pulling on a shared
interrupt line when another device (eg. USB) loads,
causing the interrupt to scream and get disabled.
http://bugzilla.kernel.org/show_bug.cgi?id=5918
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Linus Torvalds [Wed, 6 Dec 2006 19:16:59 +0000 (20:16 +0100)]
Add PIIX4 APCI quirk for the 440MX chipset too
This is confirmed to fix a hang due to PCI resource conflicts with
setting up the Cardbus bridge on old laptops with the 440MX chipsets.
Original report by Alessio Sangalli, lspci debugging help by Pekka
Enberg, and trial patch suggested by Daniel Ritz:
"From the docs available i would _guess_ this thing is really similar
to the 82443BX/82371AB combination. at least the SMBus base address
register is hidden at the very same place (32bit at 0x90 in function
3 of the "south" brigde)"
The dang thing is largely undocumented, but the patch was corroborated
by Asit Mallick:
"I am trying to find the register information. 440MX is an integration of
440BX north-bridge without AGP and PIIX4E (82371EB). PIIX4 quirk
should cover the ACPI and SMBus related I/O registers."
and verified to fix the problem by Alessio.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Brice Goglin [Wed, 6 Dec 2006 19:15:55 +0000 (20:15 +0100)]
PCI: nVidia quirk to make AER PCI-E extended capability visible
The nVidia CK804 PCI-E chipset supports the AER extended capability
but sometimes fails to link it (with some BIOS or after a warm reboot).
It makes the AER cap invisible to pci_find_ext_capability().
The patch adds a quirk to set the missing bit that controls the
linking of the capability.
By the way, it removes the corresponding code in the myri10ge driver.
Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
pci_ids.h: correct naming of 1022:7450 (AMD 8131 Bridge)
The naming of the constant defined for PCI ID 1022:7450 does not seem
to match the information at http://pciids.sourceforge.net/:
http://pci-ids.ucw.cz/iii/?i=1022
There 1022:7450 is listed as "AMD-8131 PCI-X Bridge" while 1022:7451
is listed as "AMD-8131 PCI-X IOAPIC". Yet, the current definition for
0x7450 is PCI_DEVICE_ID_AMD_8131_APIC. It seems to me like that name
should map to 0x7451, while a name like PCI_DEVICE_ID_AMD_8131_BRIDGE
should map to 0x7450.
Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Ralf Baechle [Wed, 6 Dec 2006 17:49:53 +0000 (18:49 +0100)]
Fix mempolicy.h build error
<linux/mempolicy.h> uses struct mm_struct and relies on a definition or
declaration somehow magically being dragged in which may result in a
build:
CC mm/mempolicy.o
In file included from mm/mempolicy.c:69:
include/linux/mempolicy.h:150: warning: 'struct mm_struct' declared inside parameter list
include/linux/mempolicy.h:150: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/mempolicy.h:174: warning: 'struct mm_struct' declared inside parameter list
mm/mempolicy.c:673: error: conflicting types for 'do_migrate_pages'
include/linux/mempolicy.h:174: error: previous declaration of 'do_migrate_pages' was here
mm/mempolicy.c:1696: error: conflicting types for 'mpol_rebind_mm'
include/linux/mempolicy.h:150: error: previous declaration of 'mpol_rebind_mm' was here
make[1]: *** [mm/mempolicy.o] Error 1
make: *** [mm] Error 2
$
Including <linux/sched.h> is a step into direction of include hell so
fixed by adding a forward declaration of struct mm_struct instead.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Chris Wright [Mon, 4 Dec 2006 18:44:59 +0000 (19:44 +0100)]
bridge: fix possible overflow in get_fdb_entries (CVE-2006-5751)
Make sure to properly clamp maxnum to avoid overflow (CVE-2006-5751).
Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Trond Myklebust [Mon, 4 Dec 2006 18:43:11 +0000 (19:43 +0100)]
fcntl(F_SETSIG) fix
fcntl(F_SETSIG) no longer works on leases because
lease_release_private_callback() gets called as the lease is copied in
order to initialise it.
The problem is that lease_alloc() performs an unnecessary initialisation,
which sets the lease_manager_ops. Avoid the problem by allocating the
target lease structure using locks_alloc_lock().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
alim15x3.c: M5229 (rev c8) support for DMA cd-writer
Configuration bits are not set properly for DMA on some chipset revisions.
It has already been corrected for M5229 (rev c7) but not for M5229 (rev
c8). This leads to the bug described at
http://bugzilla.kernel.org/show_bug.cgi?id=5786 (lost interrupt + ide bus
hangs).
Signed-off-by: Michael De Backer <micdb@skynet.be> Signed-off-by: Adrian Bunk <bunk@stusta.de>
MX300 does not have an EXTRA_BTN - it is a simple wheel mouse with
an additional task-switcher button, which is reported as side button
(and not task button).
Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Al Viro [Mon, 4 Dec 2006 12:12:43 +0000 (13:12 +0100)]
[EBTABLES]: Deal with the worst-case behaviour in loop checks.
No need to revisit a chain we'd already finished with during
the check for current hook. It's either instant loop (which
we'd just detected) or a duplicate work.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Patrick McHardy [Mon, 4 Dec 2006 11:46:48 +0000 (12:46 +0100)]
[NET_SCHED]: policer: restore compatibility with old iproute binaries
The tc actions increased the size of struct tc_police, which broke
compatibility with old iproute binaries since both the act_police
and the old NET_CLS_POLICE code check for an exact size match.
Since the new members are not even used, the simple fix is to also
accept the size of the old structure. Dumping is not affected since
old userspace will receive a bigger structure, which is handled fine.
Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
[IPV6]: Fix address/interface handling in UDP and DCCP, according to the scoping architecture.
TCP and RAW do not have this issue. Closes Bug #7432.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Josh Triplett [Wed, 29 Nov 2006 13:26:18 +0000 (14:26 +0100)]
freevxfs: Add missing lock_kernel() to vxfs_readdir
Commit 7b2fd697427e73c81d5fa659efd91bd07d303b0e in the historical GIT tree
stopped calling the readdir member of a file_operations struct with the big
kernel lock held, and fixed up all the readdir functions to do their own
locking. However, that change added calls to unlock_kernel() in
vxfs_readdir, but no call to lock_kernel(). Fix this by adding a call to
lock_kernel().
Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jeremy Higdon [Wed, 29 Nov 2006 13:22:11 +0000 (14:22 +0100)]
sgiioc4: Disable module unload
This patch removes a module_exit function that sgiioc4 should not have had.
It seems that the IDE layer doesn't support submodule unloading. sgiioc4
was the only driver in drivers/ide/pci that had an exit function.
After an unload, the devices would stay around and the next attempt to
reference would crash...
Signed-off-by: Jeremy Higdon <jeremy@sgi.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Vasily Tarasov [Wed, 29 Nov 2006 13:04:14 +0000 (14:04 +0100)]
block layer: elv_iosched_show should get elv_list_lock
elv_iosched_show function iterates other elv_list,
hence elv_list_lock should be got.
Also the question is: in elv_iosched_show, elv_iosched_store
q->elevator->elevator_type construction is used without locking q->queue_lock.
Is it expected?..
Nathan Lynch [Wed, 29 Nov 2006 11:17:37 +0000 (12:17 +0100)]
nvidiafb: fix unreachable code in nv10GetConfig
Fix binary/logical operator typo which leads to unreachable code. Noticed
while looking at other issues; I don't have the relevant hardware to test
this.
Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Pierre Ossman [Wed, 29 Nov 2006 11:10:52 +0000 (12:10 +0100)]
MMC: Always use a sector size of 512 bytes
Both MMC and SD specifications specify (although a bit unclearly in the MMC
case) that a sector size of 512 bytes must always be supported by the card.
Cards can report larger "native" size than this, and cards >= 2 GB even
must do so. Most other readers use 512 bytes even for these cards. We should
do the same to be compatible.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Herbert Xu [Wed, 29 Nov 2006 11:06:04 +0000 (12:06 +0100)]
SCTP: Always linearise packet on input
I was looking at a RHEL5 bug report involving Xen and SCTP
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212550).
It turns out that SCTP wasn't written to handle skb fragments at
all. The absence of any calls to skb_may_pull is testament to
that.
It just so happens that Xen creates fragmented packets more often
than other scenarios (header & data split when going from domU to
dom0). That's what caused this bug to show up.
Until someone has the time sits down and audits the entire net/sctp
directory, here is a conservative and safe solution that simply
linearises all packets on input.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Al Viro [Wed, 29 Nov 2006 10:40:22 +0000 (11:40 +0100)]
add forgotten ->b_data in memcpy() call in ext3/resize.c (oopsable)
sbi->s_group_desc is an array of pointers to buffer_head. memcpy() of
buffer size from address of buffer_head is a bad idea - it will generate
junk in any case, may oops if buffer_head is close to the end of slab
page and next page is not mapped and isn't what was intended there.
IOW, ->b_data is missing in that call. Fortunately, result doesn't go
into the primary on-disk data structures, so only backup ones get crap
written to them; that had allowed this bug to remain unnoticed until
now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Olaf Kirch [Wed, 29 Nov 2006 09:59:22 +0000 (10:59 +0100)]
[UDP]: Make udp_encap_rcv use pskb_may_pull
Make udp_encap_rcv use pskb_may_pull
IPsec with NAT-T breaks on some notebooks using the latest e1000 chipset,
when header split is enabled. When receiving sufficiently large packets, the
driver puts everything up to and including the UDP header into the header
portion of the skb, and the rest goes into the paged part. udp_encap_rcv
forgets to use pskb_may_pull, and fails to decapsulate it. Instead, it
passes it up it to the IKE daemon.
Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Alan Stern [Sat, 25 Nov 2006 01:47:52 +0000 (02:47 +0100)]
USB: UHCI: Increase port-reset completion delay for HP controllers
This patch (as657) increases the port-reset completion delay in uhci-hcd
for HP's embedded controllers. Unlike other UHCI controllers, the HP
chips can take as long as 250 us to carry out the processing associated
with finishing a port reset.
This fixes Novell bug #148761.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
The hptiop just got merged with a horrible amount of really bad ioctl
code that is against the standards for new scsi drivers. This patch
backs it out (and fixes a small bug where scsi_add_host is called to
early). We can re-add proper APIs once we agree on them.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Updates:
- don't bypass SYNCHRONIZE_CACHE command
- return SCSI_MLQUEUE_HOST_BUSY when no free request slots
- move scsi_remove_host() to the begin of hpt_remove(), or it will
not work after resources being released.
Signed-off-by: HighPoint Linux Team <linux@highpoint-tech.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
HighPoint RocketRAID 3220/3320 series 8 channel PCI-X SATA RAID Host
Adapters.
Fixes from original submission:
Merge Andrew Morton's patches:
- Provide locking for global list
- Fix debug printks
- uninline function with multiple callsites
- coding style fixups
- remove unneeded casts of void*
- kfree(NULL) is legal
- Don't "succeed" if register_chrdev() failed - otherwise we'll later
unregister a not-registered chrdev.
- Don't return from hptiop_do_ioctl() with the spinlock held.
- uninline __hpt_do_ioctl()
Update for Arjan van de Ven's comments:
- put all asm/ includes after the linux/ ones
- replace mdelay with msleep
- add pci posting flush
- do not set pci command reqister in map_pci_bar
- do not try merging sg elements in hptiop_buildsgl()
- remove unused outstandingcommands member from hba structure
- remove unimplemented hptiop_abort() handler
- remove typedef u32 hpt_id_t
Other updates:
- fix endianess
Signed-off-by: HighPoint Linux Team <linux@highpoint-tech.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Hidetoshi Seto [Fri, 24 Nov 2006 02:11:19 +0000 (03:11 +0100)]
sysfs: remove duplicated dput in sysfs_update_file
Following function can drops d_count twice against one reference
by lookup_one_len.
<SOURCE>
/**
* sysfs_update_file - update the modified timestamp on an object attribute.
* @kobj: object we're acting for.
* @attr: attribute descriptor.
*/
int sysfs_update_file(struct kobject * kobj, const struct attribute * attr)
{
struct dentry * dir = kobj->dentry;
struct dentry * victim;
int res = -ENOENT;
mutex_lock(&dir->d_inode->i_mutex);
victim = lookup_one_len(attr->name, dir, strlen(attr->name));
if (!IS_ERR(victim)) {
/* make sure dentry is really there */
if (victim->d_inode &&
(victim->d_parent->d_inode == dir->d_inode)) {
victim->d_inode->i_mtime = CURRENT_TIME;
fsnotify_modify(victim);
/**
* Drop reference from initial sysfs_get_dentry().
*/
dput(victim);
res = 0;
} else
d_drop(victim);
/**
* Drop the reference acquired from sysfs_get_dentry() above.
*/
dput(victim);
}
mutex_unlock(&dir->d_inode->i_mutex);
return res;
}
</SOURCE>
PCI-hotplug (drivers/pci/hotplug/pci_hotplug_core.c) is only user of
this function. I confirmed that dentry of /sys/bus/pci/slots/XXX/*
have negative d_count value.
This patch removes unnecessary dput().
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Kirill Korotaev [Fri, 24 Nov 2006 02:08:27 +0000 (03:08 +0100)]
fix sys_getppid oopses on debug kernel
sys_getppid() optimization can access a freed memory. On kernels with
DEBUG_SLAB turned ON, this results in Oops. As Dave Hansen noted, this
optimization is also unsafe for memory hotplug.
So this patch always takes the lock to be safe.
Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Al Viro [Fri, 24 Nov 2006 02:03:34 +0000 (03:03 +0100)]
[IPX]: Annotate and fix IPX checksum
Calculation of IPX checksum got buggered about 2.4.0. The old variant
mangled the packet; that got fixed, but calculation itself got buggered.
Restored the correct logics, fixed a subtle breakage we used to have even
back then: if the sum is 0 mod 0xffff, we want to return 0, not 0xffff.
The latter has special meaning for IPX (cheksum disabled). Observation
(and obvious fix) nicked from history of FreeBSD ipx_cksum.c...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Need to check some more cases in IPX receive. If the skb is purely
fragments, the IPX header needs to be extracted. The function
pskb_may_pull() may in theory invalidate all the pointers in the skb,
so references to ipx header must be refreshed.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
This patch will linearize and check there is enough data.
It handles the pprop case as well as avoiding a whole audit of
the routing code.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>