Linus Torvalds [Thu, 4 Jan 2007 00:44:45 +0000 (01:44 +0100)]
Fix incorrect user space access locking in mincore() (CVE-2006-4814)
Doug Chapman noticed that mincore() will doa "copy_to_user()" of the
result while holding the mmap semaphore for reading, which is a big
no-no. While a recursive read-lock on a semaphore in the case of a page
fault happens to work, we don't actually allow them due to deadlock
schenarios with writers due to fairness issues.
Doug and Marcel sent in a patch to fix it, but I decided to just rewrite
the mess instead - not just fixing the locking problem, but making the
code smaller and (imho) much easier to understand.
Also included are two fixes for the original patch including one
by Oleg Nesterov.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Miklos Szeredi [Thu, 4 Jan 2007 00:14:06 +0000 (01:14 +0100)]
fuse: fix hang on SMP
Fuse didn't always call i_size_write() with i_mutex held which caused
rare hangs on SMP/32bit. This bug has been present since fuse-2.2,
well before being merged into mainline.
The simplest solution is to protect i_size_write() with the
per-connection spinlock. Using i_mutex for this purpose would require
some restructuring of the code and I'm not even sure it's always safe
to acquire i_mutex in all places i_size needs to be set.
Since most of vmtruncate is already duplicated for other reasons,
duplicate the remaining part as well, making all i_size_write() calls
internal to fuse.
Using i_size_write() was unnecessary in fuse_init_inode(), since this
function is only called on a newly created locked inode.
Reported by a few people over the years, but special thanks to Dana
Henriksen who was persistent enough in helping me debug it.
Adrian Bunk:
Backported to 2.6.16.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Dirk Eibach [Wed, 3 Jan 2007 23:42:01 +0000 (00:42 +0100)]
i2c: fix broken ds1337 initialization
On a custom board with ds1337 RTC I found that upgrade from 2.6.15 to
2.6.18 broke RTC support.
The main problem are changes to ds1337_init_client().
When a ds1337 recognizes a problem (e.g. power or clock failure) bit 7
in status register is set. This has to be reset by writing 0 to status
register. But since there are only 16 byte written to the chip and the
first byte is interpreted as an address, the status register (which is
the 16th) is never written.
The other problem is, that initializing all registers to zero is not
valid for day, date and month register. Funny enough this is checked by
ds1337_detect(), which depends on this values not being zero. So then
treated by ds1337_init_client() the ds1337 is not detected anymore,
whereas the failure bit in the status register is still set.
Patrick McHardy [Wed, 3 Jan 2007 23:38:10 +0000 (00:38 +0100)]
NET_SCHED: Fix fallout from dev->qdisc RCU change
The move of qdisc destruction to a rcu callback broke locking in the
entire qdisc layer by invalidating previously valid assumptions about
the context in which changes to the qdisc tree occur.
The two assumptions were:
- since changes only happen in process context, read_lock doesn't need
bottem half protection. Now invalid since destruction of inner qdiscs,
classifiers, actions and estimators happens in the RCU callback unless
they're manually deleted, resulting in dead-locks when read_lock in
process context is interrupted by write_lock_bh in bottem half context.
- since changes only happen under the RTNL, no additional locking is
necessary for data not used during packet processing (f.e. u32_list).
Again, since destruction now happens in the RCU callback, this assumption
is not valid anymore, causing races while using this data, which can
result in corruption or use-after-free.
Instead of "fixing" this by disabling bottem halfs everywhere and adding
new locks/refcounting, this patch makes these assumptions valid again by
moving destruction back to process context. Since only the dev->qdisc
pointer is protected by RCU, but ->enqueue and the qdisc tree are still
protected by dev->qdisc_lock, destruction of the tree can be performed
immediately and only the final free needs to happen in the rcu callback
to make sure dev_queue_xmit doesn't access already freed memory.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Trond Myklebust [Mon, 18 Dec 2006 19:55:04 +0000 (20:55 +0100)]
NFS: nfs_lookup - don't hash dentry when optimising away the lookup
If the open intents tell us that a given lookup is going to result in a,
exclusive create, we currently optimize away the lookup call itself. The
reason is that the lookup would not be atomic with the create RPC call, so
why do it in the first place?
A problem occurs, however, if the VFS aborts the exclusive create operation
after the lookup, but before the call to create the file/directory: in this
case we will end up with a hashed negative dentry in the dcache that has
never been looked up.
Fix this by only actually hashing the dentry once the create operation has
been successfully completed.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Brian King [Mon, 18 Dec 2006 19:53:26 +0000 (20:53 +0100)]
[SCSI] DAC960: PCI id table fixup
The PCI ID table in the DAC960 driver conflicts with some devices
that use the ipr driver. All ipr adapters that use this chip
have an IBM subvendor ID and all DAC960 adapters that use this
chip have a Mylex subvendor id.
Signed-off-by: Brian King <brking@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
bridge-netfilter: don't overwrite memory outside of skb
The bridge netfilter code needs to check for space at the
front of the skb before overwriting; otherwise if skb from
device doesn't have headroom, then it will cause random
memory corruption.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jeff Garzik [Sun, 17 Dec 2006 23:07:43 +0000 (00:07 +0100)]
ISDN: fix drivers, by handling errors thrown by ->readstat()
This is a particularly ugly on-failure bug, possibly security, since the
lack of error handling here is covering up another class of bug: failure to
handle copy_to_user() return values.
The I4L API function ->readstat() returns an integer, and by looking at
several existing driver implementations, it is clear that a negative return
value was meant to indicate an error.
Given that several drivers already return a negative value indicating an
errno-style error, the current code would blindly accept that [negative]
value as a valid amount of bytes read. Obvious damage ensues.
Correcting ->readstat() handling to properly notice errors fixes the
existing code to work correctly on error, and enables future patches to
more easily indicate errors during operation.
Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Francois Romieu [Sun, 17 Dec 2006 21:14:09 +0000 (22:14 +0100)]
r8169: tweak the PCI data parity error recovery
The 8110SB based n2100 board signals a lot of what ought to be
PCI data parity errors durint operation of the 8169 as target.
Experiment proved that the driver can ignore the error and
process the packet as if nothing had happened.
Let's add an ad-hoc knob to enable users to fix their system while
avoiding the risks of a wholesale change.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Arjan van de Ven [Sun, 17 Dec 2006 20:49:57 +0000 (21:49 +0100)]
x86-64: Mark rdtsc as sync only for netburst, not for core2
On the Core2 cpus, the rdtsc instruction is not serializing (as defined
in the architecture reference since rdtsc exists) and due to the deep
speculation of these cores, it's possible that you can observe time go
backwards between cores due to this speculation. Since the kernel
already deals with this with the SYNC_RDTSC flag, the solution is
simple, only assume that the instruction is serializing on family 15...
The price one pays for this is a slightly slower gettimeofday (by a
dozen or two cycles), but that increase is quite small to pay for a
really-going-forward tsc counter.
Backport by Chris Wright.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Robin Holt [Sun, 17 Dec 2006 20:34:56 +0000 (21:34 +0100)]
IA64: bte_unaligned_copy() transfers one extra cache line.
When called to do a transfer that has a start offset within the cache
line which is uneven between source and destination and a length which
terminates the source of the copy exactly on a cache line, one extra
line gets copied into a temporary buffer. This is normally not an issue
since the buffer is a kernel buffer and only the requested information
gets copied into the user buffer.
The problem arises when the source ends at the very last physical page
of memory. That last cache line does not exist and results in the SHUB
chip raising an MCA.
Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Tejun Heo [Sun, 17 Dec 2006 20:32:00 +0000 (21:32 +0100)]
scsi: clear garbage after CDBs on SG_IO
ATAPI devices transfer fixed number of bytes for CDBs (12 or 16). Some
ATAPI devices choke when shorter CDB is used and the left bytes contain
garbage. Block SG_IO cleared left bytes but SCSI SG_IO didn't. This patch
makes SCSI SG_IO clear it and simplify CDB clearing in block SG_IO.
Linus Torvalds [Fri, 15 Dec 2006 00:56:30 +0000 (01:56 +0100)]
AGP: Allocate AGP pages with GFP_DMA32 by default
Not all graphic page remappers support physical addresses over the 4GB
mark for remapping, so while some do (the AMD64 GART always did, and I
just fixed the i965 to do so properly), we're safest off just forcing
GFP_DMA32 allocations to make sure graphics pages get allocated in the
low 32-bit address space by default.
AGP sub-drivers that really care, and can do better, could just choose
to implement their own allocator (or we could add another "64-bit safe"
default allocator for their use), but quite frankly, you're not likely
to care in practice.
So for now, this trivial change means that we won't be allocating pages
that we can't map correctly by mistake on x86-64.
[ On traditional 32-bit x86, this could never happen, because GFP_KERNEL
would never allocate any highmem memory anyway ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Neil Brown [Fri, 15 Dec 2006 00:48:58 +0000 (01:48 +0100)]
md: Fix md grow/size code to correctly find the maximum available space
An md array can be asked to change the amount of each device that it is using,
and in particular can be asked to use the maximum available space. This
currently only works if the first device is not larger than the rest. As
'size' gets changed and so 'fit' becomes wrong. So check if a 'fit' is
required early and don't corrupt it.
Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Zachary Amsden [Fri, 15 Dec 2006 00:38:04 +0000 (01:38 +0100)]
softirq: remove BUG_ONs which can incorrectly trigger
It is possible to have tasklets get scheduled before softirqd has had a chance
to spawn on all CPUs. This is totally harmless; after success during action
CPU_UP_PREPARE, action CPU_ONLINE will be called, which immediately wakes
softirqd on the appropriate CPU to process the already pending tasklets. So
there is no danger of having a missed wakeup for any tasklets that were
already pending.
In particular, i386 is affected by this during startup, and is visible when
using a very large initrd; during the time it takes for the initrd to be
decompressed, a timer IRQ can come in and schedule RCU callbacks. It is also
possible that resending of a hardware IRQ via a softirq triggers the same bug.
Because of different timing conditions, this shows up in all emulators and
virtual machines tested, including Xen, VMware, Virtual PC, and Qemu. It is
also possible to trigger on native hardware with a large enough initrd,
although I don't have a reliable case demonstrating that.
Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Christophe Saout [Fri, 15 Dec 2006 00:21:59 +0000 (01:21 +0100)]
dm crypt: Fix data corruption with dm-crypt over RAID5
Fix corruption issue with dm-crypt on top of software raid5. Cancelled
readahead bio's that report no error, just have BIO_UPTODATE cleared
were reported as successful reads to the higher layers (and leaving
random content in the buffer cache). Already fixed in 2.6.19.
Signed-off-by: Christophe Saout <christophe@saout.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Christophe Saout [Fri, 15 Dec 2006 00:20:35 +0000 (01:20 +0100)]
Fix SUNRPC wakeup/execute race condition
The sunrpc scheduler contains a race condition that can let an RPC
task end up being neither running nor on any wait queue. The race takes
place between rpc_make_runnable (called from rpc_wake_up_task) and
__rpc_execute under the following condition:
First __rpc_execute calls tk_action which puts the task on some wait
queue. The task is dequeued by another process before __rpc_execute
continues its execution. While executing rpc_make_runnable exactly after
setting the task `running' bit and before clearing the `queued' bit
__rpc_execute picks up execution, clears `running' and subsequently
both functions fall through, both under the false assumption somebody
else took the job.
Swapping rpc_test_and_set_running with rpc_clear_queued in
rpc_make_runnable fixes that hole. This introduces another possible
race condition that can be handled by checking for `queued' after
setting the `running' bit.
Bug noticed on a 4-way x86_64 system under XEN with an NFSv4 server
on the same physical machine, apparently one of the few ways to hit
this race condition at all.
Mark McLoughlin [Thu, 14 Dec 2006 22:09:07 +0000 (23:09 +0100)]
dm snapshot: fix metadata writing when suspending
When suspending a device-mapper device, dm_suspend() sleeps until all
necessary I/O is completed. This state is triggered by a callback from
persistent_commit(). But some I/O can still be issued *after* the callback
(to prepare the next metadata area for use if the current one is full). This
patch delays the callback until after that I/O is complete.
Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Daniel Kobras [Thu, 14 Dec 2006 22:08:32 +0000 (23:08 +0100)]
dm: Fix deadlock under high i/o load in raid1 setup.
On an nForce4-equipped machine with two SATA disk in raid1 setup using dmraid,
we experienced frequent deadlock of the system under high i/o load. 'cat
/dev/zero > ~/zero' was the most reliable way to reproduce them: Randomly
after a few GB, 'cp' would be left in 'D' state along with kjournald and
kmirrord. The functions cp and kjournald were blocked in did vary, but
kmirrord's wchan always pointed to 'mempool_alloc()'. We've seen this pattern
on 2.6.15 and 2.6.17 kernels. http://lkml.org/lkml/2005/4/20/142 indicates
that this problem has been around even before.
So much for the facts, here's my interpretation: mempool_alloc() first tries
to atomically allocate the requested memory, or falls back to hand out
preallocated chunks from the mempool. If both fail, it puts the calling
process (kmirrord in this case) on a private waitqueue until somebody refills
the pool. Where the only 'somebody' is kmirrord itself, so we have a
deadlock.
I worked around this problem by falling back to a (blocking) kmalloc when
before kmirrord would have ended up on the waitqueue. This defeats part of
the benefits of using the mempool, but at least keeps the system running. And
it could be done with a two-line change. Note that mempool_alloc() clears the
GFP_NOIO flag internally, and only uses it to decide whether to wait or return
an error if immediate allocation fails, so the attached patch doesn't change
behaviour in the non-deadlocking case. Path is against current git
(2.6.18-rc4), but should apply to earlier versions as well. I've tested on
2.6.15, where this patch makes the difference between random lockup and a
stable system.
Signed-off-by: Daniel Kobras <kobras@linux.de> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Neil Brown [Thu, 14 Dec 2006 22:07:54 +0000 (23:07 +0100)]
dm: mirror sector offset fix
The device-mapper core does not perform any remapping of bios before passing
them to the targets. If a particular mapping begins part-way into a device,
targets obtain the sector relative to the start of the mapping by subtracting
ti->begin.
The dm-raid1 target didn't do this everywhere: this patch fixes it, taking
care to subtract ti->begin exactly once for each bio.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jeff Mahoney [Thu, 14 Dec 2006 22:07:12 +0000 (23:07 +0100)]
dm: add module ref counting
The reference counting on dm-mod is zero if no mapped devices are open. This
is incorrect, and can lead to an oops if the module is unloaded while mapped
devices exist.
This patch claims a reference to the module whenever a device is created, and
drops it again when the device is freed.
Devices must be removed before dm-mod is unloaded.
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Persistent snapshots currently store a private copy of the chunk size.
Userspace also supplies the chunk size when loading a snapshot. Ensure
consistency by only storing the chunk_size in one place instead of two.
Currently the two sizes will differ if the chunk size supplied by userspace
does not match the chunk size an existing snapshot actually uses. Amongst
other problems, this causes an incorrect 'percentage full' to be reported.
The patch ensures consistency by only storing the chunk_size in one place,
removing it from struct pstore. Some initialisation is delayed until the
correct chunk_size is known. If read_header() discovers that the wrong chun
size was supplied, the 'area' buffer (which the header already got read into
is reinitialised to the correct size.
Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Michal Miroslaw [Thu, 14 Dec 2006 22:02:33 +0000 (23:02 +0100)]
dm: BUG/OOPS fix
Fix BUG I tripped on while testing failover and multipathing.
BUG shows up on error path in multipath_ctr() when parse_priority_group()
fails after returning at least once without error. The fix is to
initialize m->ti early - just after alloc()ing it.
Joerg Ahrens [Thu, 14 Dec 2006 21:29:54 +0000 (22:29 +0100)]
xirc2ps_cs: Cannot reset card in atomic context
I am using a Xircom CEM33 pcmcia NIC which has occasional hardware problems.
If the netdev watchdog detects a transmit timeout, do_reset is called which
msleeps - this is illegal in atomic context.
This patch schedules the timeout handling as a workqueue item.
Alexey Kuznetsov [Thu, 14 Dec 2006 21:28:51 +0000 (22:28 +0100)]
[IPV4]: severe locking bug in fib_semantics.c
Found in 2.4 by Yixin Pan <yxpan@hotmail.com>.
> When I read fib_semantics.c of Linux-2.4.32, write_lock(&fib_info_lock) =
> is used in fib_release_info() instead of write_lock_bh(&fib_info_lock). =
> Is the following case possible: a BH interrupts fib_release_info() while =
> holding the write lock, and calls ip_check_fib_default() which calls =
> read_lock(&fib_info_lock), and spin forever.
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Hans Verkuil [Thu, 14 Dec 2006 21:26:40 +0000 (22:26 +0100)]
V4L: Fix broken TUNER_LG_NTSC_TAPE radio support
The TUNER_LG_NTSC_TAPE is identical in all respects to the
TUNER_PHILIPS_FM1236_MK3. So use the params struct for the Philips
tuner.
Also add this LG_NTSC_TAPE tuner to the switches where radio specific
parameters are set so it behaves like a TUNER_PHILIPS_FM1236_MK3. This
change fixes the radio support for this tuner (the wrong bandswitch byte
was used).
Thanks to Andy Walls <cwalls@radix.net> for finding this bug.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Michael Krufky [Thu, 14 Dec 2006 21:25:04 +0000 (22:25 +0100)]
DVB: lgdt330x: fix signal / lock status detection bug
In some cases when using VSB, the AGC status register has been known to
falsely report "no signal" when in fact there is a carrier lock. The
datasheet labels these status flags as QAM only, yet the lgdt330x
module is using these flags for both QAM and VSB.
This patch allows for the carrier recovery lock status register to be
tested, even if the agc signal status register falsely reports no signal.
Thanks to jcrews from #linuxtv in irc, for initially reporting this bug.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Chuck Ebbert [Sat, 9 Dec 2006 15:21:59 +0000 (16:21 +0100)]
binfmt_elf: fix checks for bad address
Fix check for bad address; use macro instead of open-coding two checks.
Taken from RHEL4 kernel update.
From: Ernie Petrides <petrides@redhat.com>
For background, the BAD_ADDR() macro should return TRUE if the address is
TASK_SIZE, because that's the lowest address that is *not* valid for
user-space mappings. The macro was correct in binfmt_aout.c but was wrong
for the "equal to" case in binfmt_elf.c. There were two in-line validations
of user-space addresses in binfmt_elf.c, which have been appropriately
converted to use the corrected BAD_ADDR() macro in the patch you posted
yesterday. Note that the size checks against TASK_SIZE are okay as coded.
The additional changes that I propose are below. These are in the error
paths for bad ELF entry addresses once load_elf_binary() has already
committed to exec'ing the new image (following the tearing down of the
task's original address space).
The 1st hunk deals with the interp-side of the outer "if". There were two
problems here. The printk() should be removed because this path can be
triggered at will by a bogus interpreter image created and used by a
malicious user. Further, the error code should not be ENOEXEC, because that
causes the loop in search_binary_handler() to continue trying other exec
handlers (twice, in fact). But it's too late for this to work correctly,
because the user address space has already been torn down, and an exec()
failure cannot be returned to the user code because the code no longer
exists. The only recovery is to force a SIGSEGV, but it's best to terminate
the search loop immediately. I somewhat arbitrarily chose EINVAL as a
fallback error code, but any error returned by load_elf_interp() will
override that (but this value will never be seen by user-space).
The 2nd hunk deals with the non-interp-side of the outer "if". There were
two problems here as well. The SIGSEGV needs to be forced, because a prior
sigaction() syscall might have set the associated disposition to SIG_IGN.
And the ENOEXEC should be changed to EINVAL as described above.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Patrick McHardy [Sat, 9 Dec 2006 15:14:39 +0000 (16:14 +0100)]
[XFRM]: Use output device disable_xfrm for forwarded packets
Currently the behaviour of disable_xfrm is inconsistent between
locally generated and forwarded packets. For locally generated
packets disable_xfrm disables the policy lookup if it is set on
the output device, for forwarded traffic however it looks at the
input device. This makes it impossible to disable xfrm on all
devices but a dummy device and use normal routing to direct
traffic to that device.
Always use the output device when checking disable_xfrm.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Daniel Ritz [Wed, 6 Dec 2006 19:19:36 +0000 (20:19 +0100)]
PCI: fix ICH6 quirks
- add the ICH6(R) LPC to the ICH6 ACPI quirks. currently only the ICH6-M
is handled. [ PCI_DEVICE_ID_INTEL_ICH6_1 is the ICH6-M LPC, ICH6_0 is
the ICH6(R) ]
Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Bjorn Helgaas [Wed, 6 Dec 2006 19:17:30 +0000 (20:17 +0100)]
PCI: quirk to disable e100 interrupt if RESET failed to
Without this quirk, e100 can be pulling on a shared
interrupt line when another device (eg. USB) loads,
causing the interrupt to scream and get disabled.
http://bugzilla.kernel.org/show_bug.cgi?id=5918
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Linus Torvalds [Wed, 6 Dec 2006 19:16:59 +0000 (20:16 +0100)]
Add PIIX4 APCI quirk for the 440MX chipset too
This is confirmed to fix a hang due to PCI resource conflicts with
setting up the Cardbus bridge on old laptops with the 440MX chipsets.
Original report by Alessio Sangalli, lspci debugging help by Pekka
Enberg, and trial patch suggested by Daniel Ritz:
"From the docs available i would _guess_ this thing is really similar
to the 82443BX/82371AB combination. at least the SMBus base address
register is hidden at the very same place (32bit at 0x90 in function
3 of the "south" brigde)"
The dang thing is largely undocumented, but the patch was corroborated
by Asit Mallick:
"I am trying to find the register information. 440MX is an integration of
440BX north-bridge without AGP and PIIX4E (82371EB). PIIX4 quirk
should cover the ACPI and SMBus related I/O registers."
and verified to fix the problem by Alessio.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Brice Goglin [Wed, 6 Dec 2006 19:15:55 +0000 (20:15 +0100)]
PCI: nVidia quirk to make AER PCI-E extended capability visible
The nVidia CK804 PCI-E chipset supports the AER extended capability
but sometimes fails to link it (with some BIOS or after a warm reboot).
It makes the AER cap invisible to pci_find_ext_capability().
The patch adds a quirk to set the missing bit that controls the
linking of the capability.
By the way, it removes the corresponding code in the myri10ge driver.
Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
pci_ids.h: correct naming of 1022:7450 (AMD 8131 Bridge)
The naming of the constant defined for PCI ID 1022:7450 does not seem
to match the information at http://pciids.sourceforge.net/:
http://pci-ids.ucw.cz/iii/?i=1022
There 1022:7450 is listed as "AMD-8131 PCI-X Bridge" while 1022:7451
is listed as "AMD-8131 PCI-X IOAPIC". Yet, the current definition for
0x7450 is PCI_DEVICE_ID_AMD_8131_APIC. It seems to me like that name
should map to 0x7451, while a name like PCI_DEVICE_ID_AMD_8131_BRIDGE
should map to 0x7450.
Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Ralf Baechle [Wed, 6 Dec 2006 17:49:53 +0000 (18:49 +0100)]
Fix mempolicy.h build error
<linux/mempolicy.h> uses struct mm_struct and relies on a definition or
declaration somehow magically being dragged in which may result in a
build:
CC mm/mempolicy.o
In file included from mm/mempolicy.c:69:
include/linux/mempolicy.h:150: warning: 'struct mm_struct' declared inside parameter list
include/linux/mempolicy.h:150: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/mempolicy.h:174: warning: 'struct mm_struct' declared inside parameter list
mm/mempolicy.c:673: error: conflicting types for 'do_migrate_pages'
include/linux/mempolicy.h:174: error: previous declaration of 'do_migrate_pages' was here
mm/mempolicy.c:1696: error: conflicting types for 'mpol_rebind_mm'
include/linux/mempolicy.h:150: error: previous declaration of 'mpol_rebind_mm' was here
make[1]: *** [mm/mempolicy.o] Error 1
make: *** [mm] Error 2
$
Including <linux/sched.h> is a step into direction of include hell so
fixed by adding a forward declaration of struct mm_struct instead.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Chris Wright [Mon, 4 Dec 2006 18:44:59 +0000 (19:44 +0100)]
bridge: fix possible overflow in get_fdb_entries (CVE-2006-5751)
Make sure to properly clamp maxnum to avoid overflow (CVE-2006-5751).
Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Trond Myklebust [Mon, 4 Dec 2006 18:43:11 +0000 (19:43 +0100)]
fcntl(F_SETSIG) fix
fcntl(F_SETSIG) no longer works on leases because
lease_release_private_callback() gets called as the lease is copied in
order to initialise it.
The problem is that lease_alloc() performs an unnecessary initialisation,
which sets the lease_manager_ops. Avoid the problem by allocating the
target lease structure using locks_alloc_lock().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
alim15x3.c: M5229 (rev c8) support for DMA cd-writer
Configuration bits are not set properly for DMA on some chipset revisions.
It has already been corrected for M5229 (rev c7) but not for M5229 (rev
c8). This leads to the bug described at
http://bugzilla.kernel.org/show_bug.cgi?id=5786 (lost interrupt + ide bus
hangs).
Signed-off-by: Michael De Backer <micdb@skynet.be> Signed-off-by: Adrian Bunk <bunk@stusta.de>
MX300 does not have an EXTRA_BTN - it is a simple wheel mouse with
an additional task-switcher button, which is reported as side button
(and not task button).
Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Al Viro [Mon, 4 Dec 2006 12:12:43 +0000 (13:12 +0100)]
[EBTABLES]: Deal with the worst-case behaviour in loop checks.
No need to revisit a chain we'd already finished with during
the check for current hook. It's either instant loop (which
we'd just detected) or a duplicate work.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Patrick McHardy [Mon, 4 Dec 2006 11:46:48 +0000 (12:46 +0100)]
[NET_SCHED]: policer: restore compatibility with old iproute binaries
The tc actions increased the size of struct tc_police, which broke
compatibility with old iproute binaries since both the act_police
and the old NET_CLS_POLICE code check for an exact size match.
Since the new members are not even used, the simple fix is to also
accept the size of the old structure. Dumping is not affected since
old userspace will receive a bigger structure, which is handled fine.
Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
[IPV6]: Fix address/interface handling in UDP and DCCP, according to the scoping architecture.
TCP and RAW do not have this issue. Closes Bug #7432.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Josh Triplett [Wed, 29 Nov 2006 13:26:18 +0000 (14:26 +0100)]
freevxfs: Add missing lock_kernel() to vxfs_readdir
Commit 7b2fd697427e73c81d5fa659efd91bd07d303b0e in the historical GIT tree
stopped calling the readdir member of a file_operations struct with the big
kernel lock held, and fixed up all the readdir functions to do their own
locking. However, that change added calls to unlock_kernel() in
vxfs_readdir, but no call to lock_kernel(). Fix this by adding a call to
lock_kernel().
Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jeremy Higdon [Wed, 29 Nov 2006 13:22:11 +0000 (14:22 +0100)]
sgiioc4: Disable module unload
This patch removes a module_exit function that sgiioc4 should not have had.
It seems that the IDE layer doesn't support submodule unloading. sgiioc4
was the only driver in drivers/ide/pci that had an exit function.
After an unload, the devices would stay around and the next attempt to
reference would crash...
Signed-off-by: Jeremy Higdon <jeremy@sgi.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Vasily Tarasov [Wed, 29 Nov 2006 13:04:14 +0000 (14:04 +0100)]
block layer: elv_iosched_show should get elv_list_lock
elv_iosched_show function iterates other elv_list,
hence elv_list_lock should be got.
Also the question is: in elv_iosched_show, elv_iosched_store
q->elevator->elevator_type construction is used without locking q->queue_lock.
Is it expected?..
Nathan Lynch [Wed, 29 Nov 2006 11:17:37 +0000 (12:17 +0100)]
nvidiafb: fix unreachable code in nv10GetConfig
Fix binary/logical operator typo which leads to unreachable code. Noticed
while looking at other issues; I don't have the relevant hardware to test
this.
Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Pierre Ossman [Wed, 29 Nov 2006 11:10:52 +0000 (12:10 +0100)]
MMC: Always use a sector size of 512 bytes
Both MMC and SD specifications specify (although a bit unclearly in the MMC
case) that a sector size of 512 bytes must always be supported by the card.
Cards can report larger "native" size than this, and cards >= 2 GB even
must do so. Most other readers use 512 bytes even for these cards. We should
do the same to be compatible.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Herbert Xu [Wed, 29 Nov 2006 11:06:04 +0000 (12:06 +0100)]
SCTP: Always linearise packet on input
I was looking at a RHEL5 bug report involving Xen and SCTP
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212550).
It turns out that SCTP wasn't written to handle skb fragments at
all. The absence of any calls to skb_may_pull is testament to
that.
It just so happens that Xen creates fragmented packets more often
than other scenarios (header & data split when going from domU to
dom0). That's what caused this bug to show up.
Until someone has the time sits down and audits the entire net/sctp
directory, here is a conservative and safe solution that simply
linearises all packets on input.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Al Viro [Wed, 29 Nov 2006 10:40:22 +0000 (11:40 +0100)]
add forgotten ->b_data in memcpy() call in ext3/resize.c (oopsable)
sbi->s_group_desc is an array of pointers to buffer_head. memcpy() of
buffer size from address of buffer_head is a bad idea - it will generate
junk in any case, may oops if buffer_head is close to the end of slab
page and next page is not mapped and isn't what was intended there.
IOW, ->b_data is missing in that call. Fortunately, result doesn't go
into the primary on-disk data structures, so only backup ones get crap
written to them; that had allowed this bug to remain unnoticed until
now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>