Ken Chen [Thu, 8 Feb 2007 22:20:27 +0000 (14:20 -0800)]
hugetlb: preserve hugetlb pte dirty state
__unmap_hugepage_range() is buggy in that it does not preserve the dirty
state of huge_pte when unmapping a hugepage range. This causes data
corruption if the sysadmin uses drop_caches. For example, an application
creates a hugetlb file, modifies pages, then unmaps it. While the hugetlb
file is still alive, along comes the sysadmin doing an "echo 3 >
/proc/sys/vm/drop_caches".
drop_pagecache_sb() will happily free all pages that aren't marked dirty if
there is no active mapping. Later, when the application maps the hugetlb
file again, all the data is gone, with catastrophic results for the
application.
Not only that, the internal resv_huge_pages count will also get all messed
up. Fix it up by marking page dirty appropriately.
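The failure mode can be modelled in a few lines of userspace C. This is
only a sketch: toy_page, toy_pte, toy_unmap() and toy_drop_caches() are
hypothetical stand-ins invented for illustration, not kernel interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a page-cache page and a huge PTE. */
struct toy_page {
    bool dirty;   /* page-level dirty flag, the one cache dropping looks at */
    bool mapped;  /* true while a mapping still references the page */
    bool cached;  /* true while the page is still in the page cache */
};

struct toy_pte {
    struct toy_page *page;
    bool dirty;   /* set when the mapping has been written through */
};

/*
 * Unmap one entry. With preserve_dirty set (the fixed behaviour), the dirty
 * bit held in the PTE is transferred to the page before the PTE is cleared.
 */
static void toy_unmap(struct toy_pte *pte, bool preserve_dirty)
{
    if (preserve_dirty && pte->dirty)
        pte->page->dirty = true;
    pte->page->mapped = false;
    pte->page = NULL;
    pte->dirty = false;
}

/* Model of dropping caches: only clean, unmapped pages are freed. */
static void toy_drop_caches(struct toy_page *page)
{
    if (!page->dirty && !page->mapped)
        page->cached = false;
}
```

With preserve_dirty false, a written page looks clean after the unmap and
is thrown away by toy_drop_caches(); with it true, the page survives.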
Signed-off-by: Ken Chen <kenchen@google.com> Cc: "Nish Aravamudan" <nish.aravamudan@gmail.com> Cc: Adam Litke <agl@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Acked-by: William Irwin <bill.irwin@oracle.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As MacBook/MacBook Pro machines also have to live with a single mouse button,
the following patch just enables the Macintosh device drivers menu in Kconfig
and adds the macintosh dir to the obj-* list to make macbook* users happy (who
have been using exactly that for months).
Signed-off-by: Soeren Sonnenburg <kernel@nn7.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is a fix for a regression introduced around 2.6.16.
The patch named ufs-directory-and-page-cache-from-blocks-to-pages.patch, in
addition to converting from the block to the page cache mechanism, added new
checks of directory integrity, one of them being that a directory entry does
not cross directory chunks.
But some kinds of UFS, OpenStep UFS and Apple UFS (these look like the same
filesystem), have a different directory chunk size than the common UFSes
(BSD and Solaris UFS).
So this patch adds the ability to work with a variable directory chunk size,
and sets it to the right size for ufstype=openstep.
Magnus Damm [Tue, 6 Feb 2007 00:20:09 +0000 (16:20 -0800)]
kexec: Fix CONFIG_SMP=n compilation V2 (ia64)
Kexec support for 2.6.20 on ia64 does not build properly using a config
made up by CONFIG_SMP=n and CONFIG_HOTPLUG_CPU=n:
CC arch/ia64/kernel/machine_kexec.o
arch/ia64/kernel/machine_kexec.c: In function `machine_shutdown':
arch/ia64/kernel/machine_kexec.c:77: warning: implicit declaration of function `cpu_down'
AS arch/ia64/kernel/relocate_kernel.o
CC arch/ia64/kernel/crash.o
arch/ia64/kernel/crash.c: In function `kdump_cpu_freeze':
arch/ia64/kernel/crash.c:139: warning: implicit declaration of function `ia64_jump_to_sal'
arch/ia64/kernel/crash.c:139: error: `sal_boot_rendez_state' undeclared (first use in this function)
arch/ia64/kernel/crash.c:139: error: (Each undeclared identifier is reported only once
arch/ia64/kernel/crash.c:139: error: for each function it appears in.)
arch/ia64/kernel/crash.c: At top level:
arch/ia64/kernel/crash.c:84: warning: 'kdump_wait_cpu_freeze' defined but not used
make[1]: *** [arch/ia64/kernel/crash.o] Error 1
make: *** [arch/ia64/kernel] Error 2
Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Acked-by: Simon Horman <horms@verge.net.au> Acked-by: Jay Lan <jlan@sgi.com> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The commit was buggy in multiple ways:
- the conversion to ilog2() was incorrect to begin with
- it tested the wrong #defines, so on all architectures but FRV you'd
never see the bug except for constant arguments.
- the new "get_order()" macro used its arguments multiple times, and
didn't even parenthesize them properly
- despite the comments, it was not true that you could use it for
constant initializers, since not all architectures even use the
generic page.h header file.
All of the problems are individually fixable, but it all boils down to:
better just revert it, and re-do it from scratch.
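To make the multiple-evaluation and parenthesization complaints concrete,
here is a small sketch. The two TOY_BAD_* macros are made up for this
example and are not the reverted code; toy_get_order() is a function-style
order calculation assuming 4 KiB pages (TOY_PAGE_SHIFT is an assumption).

```c
#define TOY_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Counts how often the argument expression below is evaluated. */
static int toy_calls;

static unsigned long toy_next_size(void)
{
    toy_calls++;
    return 8192;
}

/* Flawed on purpose: the argument is evaluated more than once. */
#define TOY_BAD_ROUND_UP(x) (((x) & 4095) ? ((x) | 4095) + 1 : (x))

/* Flawed on purpose: the argument is not parenthesized. */
#define TOY_BAD_DOUBLE(x) (x * 2)

/*
 * Function form: the argument is evaluated exactly once, and operator
 * precedence inside the caller's expression cannot leak in.
 */
static inline int toy_get_order(unsigned long size)
{
    int order = -1;

    size = (size - 1) >> (TOY_PAGE_SHIFT - 1);
    do {
        size >>= 1;
        order++;
    } while (size);
    return order;
}
```

Calling toy_get_order(toy_next_size()) bumps toy_calls once, while
TOY_BAD_ROUND_UP(toy_next_size()) bumps it twice, and TOY_BAD_DOUBLE(1 + 2)
expands to (1 + 2 * 2).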
Cc: David Howells <dhowells@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Andrew Morton <akpm@osdl.org> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Renninger [Thu, 22 Feb 2007 12:52:40 +0000 (13:52 +0100)]
Backport of psmouse suspend/shutdown cleanups
This patch works back to 2.6.17 (earlier kernels seem to
need up/down operations on mutex/semaphore).
psmouse - properly reset mouse on shutdown/suspend
Some people report that they need psmouse module unloaded
for suspend to ram/disk to work properly. Let's make port
cleanup behave the same way as driver unload.
This fixes "bad state" problem on various HP laptops, such
as nx7400.
Ingo Molnar [Thu, 1 Mar 2007 23:58:51 +0000 (18:58 -0500)]
sched: fix SMT scheduler bug
The SMT scheduler incorrectly skips kernel threads even if they are
runnable (but they are preempted by a higher-prio user-space task which got
SMT-delayed by an even higher-priority task running on a sibling CPU).
Fix this for now by only doing the SMT-nice optimization if the
to-be-delayed task is the only runnable task. (This should cover most of
the real-life cases anyway.)
This bug has been in the SMT scheduler since 2.6.17 or so, but has only
been noticed now by the active check in the dynticks code.
Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
tty_io: fix race in master pty close/slave pty close path
This patch fixes a possible race that leads to double freeing an idr index.
When the master begins to close, release_dev() is called and then
pty_close() is called:
if (tty->driver->close)
tty->driver->close(tty, filp);
This is done without holding any locks other than the BKL. Inside pty_close(),
being a master close, the devpts entry will be removed:
#ifdef CONFIG_UNIX98_PTYS
if (tty->driver == ptm_driver)
devpts_pty_kill(tty->index);
#endif
But devpts_pty_kill() will call get_node() that may sleep while waiting for
&devpts_root->d_inode->i_sem. When this happens and the slave is being
opened, tty_open() just found the driver and index:
This part of the code is already protected under tty_mutex. The problem is
that the slave open already got an index. Then init_dev() is called and
blocks waiting for the same &devpts_root->d_inode->i_sem.
When the master close resumes, it removes the devpts entry, and the
relation between idr index and the tty is gone. The master then sleeps
waiting for the tty_mutex on release_dev().
Slave open resumes and finds no tty for that index. As a result, a NULL tty
is returned and init_dev() doesn't flow to fast_track:
/* check whether we're reopening an existing tty */
if (driver->flags & TTY_DRIVER_DEVPTS_MEM) {
tty = devpts_get_tty(idx);
if (tty && driver->subtype == PTY_TYPE_MASTER)
tty = tty->link;
} else {
tty = driver->ttys[idx];
}
if (tty) goto fast_track;
The result is that a new tty will be created and init_dev() returns
successfully. After returning, tty_mutex is dropped and master close may resume.
Master close finds it's the only use and both sides are closing, then releases
the tty and the index. At this point, the idr index is free, but slave still
has it.
Slave open then calls pty_open() and finds that tty->link->count is 0,
because there's no master and returns error. Then tty_open() calls
release_dev() which executes without any warning, as it was a case of last
slave close when the master is already closed (master->count == 0,
slave->count == 1). The tty is then released with the already released idr
index.
This normally would only issue a warning on idr_remove() but in case of a
customer's critical application, it's never too simple:
thread1: opens master, gets index X
thread1: begin closing master
thread2: begin opening slave with index X
thread1: finishes closing master, index X released
thread3: opens master, gets index X, just released
thread2: fails opening slave, releases index X <----
thread4: opens master, gets index X, init_dev() then finds an already in use
and healthy tty and fails
If no more indexes are released, ptmx_open() will keep failing, as the
first free index available is X, and it will make init_dev() fail because
you're trying to "reopen a master" which isn't valid.
The patch notices when this race happens and makes init_dev() fail
immediately. The init_dev() function is called with tty_mutex held, so it's
safe to continue with tty till the end of function because release_dev()
won't make any further changes without grabbing the tty_mutex.
Without the patch, on some machines it's easy to get idr warnings
like this one:
idr_remove called for id=15 which is not allocated.
[<c02555b9>] idr_remove+0x139/0x170
[<c02a1b62>] release_mem+0x182/0x230
[<c02a28e7>] release_dev+0x4b7/0x700
[<c02a0ea7>] tty_ldisc_enable+0x27/0x30
[<c02a1e64>] init_dev+0x254/0x580
[<c02a0d64>] check_tty_count+0x14/0xb0
[<c02a4f05>] tty_open+0x1c5/0x340
[<c02a4d40>] tty_open+0x0/0x340
[<c017388f>] chrdev_open+0xaf/0x180
[<c017c2ac>] open_namei+0x8c/0x760
[<c01737e0>] chrdev_open+0x0/0x180
[<c0167bc9>] __dentry_open+0xc9/0x210
[<c0167e2c>] do_filp_open+0x5c/0x70
[<c0167a91>] get_unused_fd+0x61/0xd0
[<c0167e93>] do_sys_open+0x53/0x100
[<c0167f97>] sys_open+0x27/0x30
[<c010303b>] syscall_call+0x7/0xb
using this test application available at:
http://www.ruivo.org/~aris/pty_sodomizer.c
Signed-off-by: Aristeu Sergio Rozanski Filho <aris@ruivo.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Neil Brown [Fri, 9 Mar 2007 18:50:27 +0000 (10:50 -0800)]
export blk_recount_segments
On Monday February 12, marcm@liquid-nexus.net wrote:
> >
> > Thanks for the quick response Neil unfortunately the kernel doesn't build with
> > this patch due to a missing symbol:
> >
> > WARNING: "blk_recount_segments" [drivers/md/raid456.ko] undefined!
> >
> > Is that in another file that needs patching or within raid5.c?
Yes. I keep forgetting about that bit. Sorry.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Reading /proc/net/anycast6 when there is no anycast address
on an interface results in an ever-increasing inet6_dev reference
count, as well as a reference to the netdevice you can't get rid of.
From: David Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Michal Wrobel [Tue, 27 Feb 2007 19:12:45 +0000 (11:12 -0800)]
Don't add anycast reference to device multiple times
[IPV6]: anycast refcnt fix
This patch fixes a bug in the Linux IPv6 stack which caused an anycast
address to be added to a device before DAD had completed. This led to an
incorrect reference count, which resulted in an infinite wait for
unregister_netdevice completion on interface removal.
Signed-off-by: Michal Wrobel <xmxwx@asn.pl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[XFRM_TUNNEL]: Reload header pointer after pskb_may_pull/pskb_expand_head
Please consider applying, this was found on your latest
net-2.6 tree while playing around with that ip_hdr() + turn
skb->nh/h/mac pointers as offsets on 64 bits idea :-)
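The rule in the subject line is the usual one for any buffer that can move
when it grows. A userspace sketch, with realloc() standing in for the
calls named in the subject; toy_buf, toy_expand() and
toy_read_after_expand() are hypothetical names, not the xfrm code:

```c
#include <stdlib.h>
#include <string.h>

struct toy_buf {
    unsigned char *data;
    size_t len;
};

/*
 * Grow the buffer. The storage may move, so any pointer computed from
 * buf->data before this call must be treated as stale afterwards.
 */
static int toy_expand(struct toy_buf *buf, size_t new_len)
{
    unsigned char *p;

    if (new_len < buf->len)
        return -1;              /* this sketch only grows */
    p = realloc(buf->data, new_len);
    if (!p)
        return -1;              /* old buffer is still valid */
    memset(p + buf->len, 0, new_len - buf->len);
    buf->data = p;
    buf->len = new_len;
    return 0;
}

/* Read one byte at `offset`, deriving the pointer only after the expand. */
static int toy_read_after_expand(struct toy_buf *buf, size_t offset,
                                 size_t new_len, unsigned char *out)
{
    const unsigned char *hdr;

    if (toy_expand(buf, new_len) < 0 || offset >= buf->len)
        return -1;
    hdr = buf->data + offset;   /* reloaded after the buffer may have moved */
    *out = *hdr;
    return 0;
}
```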
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Miller [Tue, 27 Feb 2007 19:01:38 +0000 (11:01 -0800)]
Fix interrupt probing on E450 sparc64 systems
[SPARC64]: Fix PCI interrupts on E450 et al.
When the PCI controller OBP node lacks an interrupt-map
and interrupt-map-mask property, we need to form the
INO by hand. The PCI swizzle logic was not doing that
properly.
This was a regression added by the of_device code.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jiri Kosina [Thu, 1 Mar 2007 11:02:52 +0000 (12:02 +0100)]
HID: fix possible double-free on error path in hid parser
Freeing of device->collection is properly done in hid_free_device() (as
this function is supposed to free all the device resources and could be
called from transport specific code, e.g. usb_hid_configure()).
Remove all kfree() calls preceding the hid_free_device() call.
Livio Soares [Thu, 22 Feb 2007 05:13:17 +0000 (16:13 +1100)]
POWERPC: Fix performance monitor exception
To the issue: at some point during 2.6.20 development, Paul Mackerras
introduced the "lazy IRQ disabling" patch (very cool work, BTW).
In that patch, the performance monitor unit exception was marked as
"maskable", in the sense that if interrupts were soft-disabled, that
exception could be ignored. This broke my PowerPC profiling code.
The symptom that I see is that a varying number of interrupts
(from 0 to $n$, typically closer to 0) get delivered, when, in
reality, it should always be very close to $n$.
The issue stems from the way masking is being done. Masking in
this fashion seems to work well with the decrementer and external
interrupts, because they are raised again until "really" handled.
For the PMU, however, this does not apply (at least on my Xserve
machine with a 970FX processor). If the PMU exception is not handled,
it will _not_ be re-raised (at least on my machine). The documentation
states that the PMXE bit in MMCR0 is set to 0 when the PMU exception
is raised. However, software must re-set the bit to re-enable PMU
exceptions. If the exception is ignored (as currently) not only is
that interrupt lost, but because software does not re-set PMXE, the
PMU registers are "frozen" forever.
[This patch means that performance monitor exceptions are taken and
handled even if irqs are off, as long as some other interrupt hasn't
come along and caused interrupts to be hard-disabled. In this sense
the PMU exception becomes like an NMI. The oprofile code for most
powerpc processors does nothing that is unsafe in an NMI context, but
the Cell oprofile code does a spin_lock_irqsave. However, that turns
out to be OK because Cell doesn't actually use the performance
monitor exception; performance monitor interrupts come in as a
regular interrupt on Cell, so will be disabled when irqs are off.
-- paulus.]
This restores the previous behaviour for these devices by ensuring that when
the voltage is changed, only one write to set the voltage is performed.
It may be that both writes are needed if the voltage is being changed between
two non-zero values or that it's safe to ensure that only one write is done
if the hardware only supports one voltage; I don't know whether either is the
case nor can I test since I have only the one SD reader (1524:0550), and it
supports just the one voltage.
Signed-off-by: Darren Salt <linux@youmustbejoking.demon.co.uk> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jeff Dike [Thu, 22 Feb 2007 16:48:38 +0000 (11:48 -0500)]
UML - Fix 2.6.20 hang
A previous cleanup misused need_poll, which had a fairly broken
interface. It implemented a growable array, changing the used
elements count itself, but leaving it up to the caller to fill in the
actual elements, including the entire array if the array had to be
reallocated. This worked because the previous users were switching
between two such structures, and the elements were copied from the
inactive array to the active array after making sure the active array
had enough room.
maybe_sigio_broken was made to use need_poll, but it was operating on
a single array, so when the buffer was reallocated, the previous
contents were lost.
This patch makes need_poll implement more sane semantics. It merely
assures that the array is of the proper size and that the contents are
preserved. It is up to the caller to adjust the used elements count
and to ensure that the proper elements are present.
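A minimal userspace sketch of those semantics, assuming realloc() as the
allocator (the real code's allocation helpers are not shown here);
toy_pollfds and toy_need_poll() are hypothetical names:

```c
#include <stdlib.h>

struct toy_pollfds {
    int *fds;   /* element storage */
    int size;   /* number of allocated slots */
    int used;   /* number of slots in use; owned by the caller */
};

/*
 * Make sure at least n slots exist. Existing contents are preserved and
 * `used` is never touched. Returns 0 on success, -1 on allocation failure
 * (in which case the old array is still intact).
 */
static int toy_need_poll(struct toy_pollfds *p, int n)
{
    int *fresh;

    if (n <= p->size)
        return 0;
    fresh = realloc(p->fds, (size_t)n * sizeof(*fresh));
    if (!fresh)
        return -1;
    p->fds = fresh;
    p->size = n;
    return 0;
}
```

A caller operating on a single array can now grow it without losing what
it already stored, then fill in the new slots and bump `used` itself.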
This manifested itself as a hang in 2.6.20 as the uninitialized buffer
convinced UML that one of its own file descriptors didn't support
SIGIO and needed to be watched by poll in a separate thread. The
result was an interrupt flood as control traffic over this descriptor
sparked interrupts, which resulted in more control traffic, ad nauseam.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Hugh Dickins [Fri, 23 Feb 2007 21:53:49 +0000 (21:53 +0000)]
fix umask when noACL kernel meets extN tuned for ACLs
Fix insecure default behaviour reported by Tigran Aivazian: if an ext2
or ext3 or ext4 filesystem is tuned to mount with "acl", but mounted by
a kernel built without ACL support, then umask was ignored when creating
inodes - though root or user has umask 022, touch creates files as 0666,
and mkdir creates directories as 0777.
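For reference, the arithmetic behind those numbers, as a tiny sketch
(toy_apply_umask() is a hypothetical helper, not a kernel function):

```c
/* Mode a new inode ends up with when the umask is honoured. */
static unsigned int toy_apply_umask(unsigned int requested,
                                    unsigned int umask_bits)
{
    return requested & ~umask_bits;
}
```

With umask 022, touch's 0666 should become 0644 and mkdir's 0777 should
become 0755; the bug left both at their requested values.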
This appears to have worked right until 2.6.11, when a fix to the default
mode on symlinks (always 0777) assumed VFS applies umask: which it does,
unless the mount is marked for ACLs; but ext[234] set MS_POSIXACL in
s_flags according to s_mount_opt set according to def_mount_opts.
We could revert to the 2.6.10 ext[234]_init_acl (adding an S_ISLNK test);
but other filesystems only set MS_POSIXACL when ACLs are configured. We
could fix this at another level; but it seems most robust to avoid setting
the s_mount_opt flag in the first place (at the expense of more ifdefs).
Likewise don't set the XATTR_USER flag when built without XATTR support.
Tejun Heo [Sat, 24 Feb 2007 13:30:36 +0000 (22:30 +0900)]
sata_sil: ignore and clear spurious IRQs while executing commands by polling
sata_sil used to trigger HSM error if IRQ occurs during polling
command. This didn't matter because polling wasn't used in sata_sil.
However, as of 2.6.20, all IDENTIFYs are performed by polling and
device detection sometimes fails due to spurious IRQ. This patch
makes sata_sil ignore and clear spurious IRQ while executing commands
by polling.
This fixes bug#7996 and IMHO should also be included in -stable.
Stefan Seyfried [Sat, 24 Feb 2007 22:06:43 +0000 (23:06 +0100)]
swsusp: Fix possible oops in userland interface
Fix the Oops occurring when SNAPSHOT_PMOPS or SNAPSHOT_S2RAM ioctl is called on
a system without pm_ops defined (eg. a non-ACPI kernel on x86 PC).
Signed-off-by: Stefan Seyfried <seife@suse.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Gleixner [Thu, 22 Feb 2007 00:33:29 +0000 (01:33 +0100)]
Fix posix-cpu-timer breakage caused by stale p->last_ran value
Problem description at:
http://bugzilla.kernel.org/show_bug.cgi?id=8048
Commit b18ec80396834497933d77b81ec0918519f4e2a7
[PATCH] sched: improve migration accuracy
optimized the scheduler time calculations, but broke posix-cpu-timers.
The problem is that the p->last_ran value is not updated after a context
switch. So a subsequent call to current_sched_time() calculates with a
stale p->last_ran value, i.e. it accounts the full time during which the
task was scheduled away.
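A toy model of the effect, under the assumption that the running slice is
measured as "now minus last_ran". toy_task and the toy_* helpers are
hypothetical and much simpler than the scheduler's real bookkeeping:

```c
struct toy_task {
    unsigned long long sum_exec;  /* CPU time accounted so far */
    unsigned long long last_ran;  /* timestamp the current slice is measured from */
};

/* Task leaves the CPU: fold the finished slice into the total. */
static void toy_switch_out(struct toy_task *t, unsigned long long now)
{
    t->sum_exec += now - t->last_ran;
    t->last_ran = now;
}

/* Task gets the CPU back. refresh_last_ran models the fixed behaviour. */
static void toy_switch_in(struct toy_task *t, unsigned long long now,
                          int refresh_last_ran)
{
    if (refresh_last_ran)
        t->last_ran = now;
}

/* CPU time consumed so far, including the slice still in progress. */
static unsigned long long toy_current_sched_time(const struct toy_task *t,
                                                 unsigned long long now)
{
    return t->sum_exec + (now - t->last_ran);
}
```

If a task runs from 0 to 10, sleeps until 100 and is sampled at 105, the
stale timestamp yields 105 units of CPU time instead of 15.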
Michael Krufky [Sat, 3 Mar 2007 14:36:15 +0000 (09:36 -0500)]
V4L: cx88-blackbird: allow usage of 376836 and 262144 sized firmware images
This updates the cx88-blackbird driver to be able to use the new cx23416
firmware image released by Hauppauge Computer Works, while retaining
compatibility with the older firmware images.
cx2341x firmware can be downloaded at: http://dl.ivtvdriver.org/ivtv/firmware/
Hans Verkuil [Thu, 15 Feb 2007 06:40:34 +0000 (03:40 -0300)]
V4L: fix cx25840 firmware loading
Due to changes in the i2c handling in 2.6.20 this cx25840 bug surfaced,
causing the firmware load to fail for the ivtv driver. The correct
sequence is to first attach the i2c client, then use the client's
device to load the firmware.
Michael Krufky [Sat, 3 Mar 2007 14:36:09 +0000 (09:36 -0500)]
DVB: digitv: open nxt6000 i2c_gate for TDED4 tuner handling
dvb-pll normally opens the i2c gate before attempting to communicate with
the pll, but the code for this device is not using dvb-pll. This should
be cleaned up in the future, but for now, just open the i2c gate at the
appropriate place in order to fix this driver bug.
Rework the cx23416 firmware loader so that it no longer requires the
firmware size to be a multiple of 8KB. Until recently all cx2341x
firmware images were exactly 256KB, but newer firmware is larger than
that and also appears to have arbitrary size. We still must check
against a multiple of 4 bytes (because the cx23416 itself uses a 32
bit word size).
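The two size rules can be compared with a short sketch. Both helpers are
hypothetical, and rejecting a zero-length image is an assumption added
for the example:

```c
#include <stdbool.h>
#include <stddef.h>

/* Old rule as described above: size must be a multiple of 8 KB. */
static bool toy_fw_size_ok_old(size_t size)
{
    return size != 0 && size % 8192 == 0;
}

/* New rule: any size that is a whole number of 32-bit words. */
static bool toy_fw_size_ok_new(size_t size)
{
    return size != 0 && size % 4 == 0;
}
```

A 262144-byte (256KB) image passes both checks. The 376836-byte size named
in the cx88-blackbird entry of this log passes only the new one, since it
is 46 * 8192 + 4 bytes.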
This fix is already in the upstream driver source and has proven
itself there; this is a backport for the 2.6.20.y kernel series.
Mike Isely [Sat, 3 Mar 2007 14:35:54 +0000 (09:35 -0500)]
V4L: pvrusb2: Fix video corruption on stream start
This introduces some extra cx23416 commands when streaming is
started. The addition of these commands fixes random sporadic video
corruption that can take place when the video stream is temporarily
disrupted through loss of signal (e.g. changing the channel in the RF
tuner).
This fix is already in the upstream driver source and has proven
itself there; this is a backport for the 2.6.20.y kernel series.
Marcel Siegert [Sat, 3 Mar 2007 14:35:48 +0000 (09:35 -0500)]
dvbdev: fix illegal re-usage of fileoperations struct
Arjan van de Ven <arjan@infradead.org> reported an illegal re-usage of
the fileoperations struct if more than one dvb device (e.g. frontend) is
present.
This patch fixes this issue.
It allocates a new fileoperations struct each time a device is
registered and copies the default template fileops.
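The allocate-and-copy pattern looks roughly like this in userspace C.
toy_fops, its `owner` field and toy_register_device() are hypothetical;
the point is only that each device gets a private copy of the template:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down file operations table. */
struct toy_fops {
    const void *owner;      /* per-device field that must not be shared */
    int (*open)(void);
};

static int toy_generic_open(void)
{
    return 0;
}

/* Shared template; never modified after initialization. */
static const struct toy_fops toy_template_fops = { NULL, toy_generic_open };

/* Give each registered device its own copy of the template. */
static struct toy_fops *toy_register_device(const void *owner)
{
    struct toy_fops *fops = malloc(sizeof(*fops));

    if (!fops)
        return NULL;
    memcpy(fops, &toy_template_fops, sizeof(*fops));
    fops->owner = owner;    /* per-device change stays private */
    return fops;
}
```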
NeilBrown [Tue, 20 Feb 2007 06:34:47 +0000 (17:34 +1100)]
md: Fix raid10 recovery problem.
There are two errors that can lead to recovery problems with raid10
when used in 'far' mode (not the default).
Due to a '>' instead of '>=' the wrong block is located which would
result in garbage being written to some random location, quite
possibly outside the range of the device, causing the newly
reconstructed device to fail.
The device size calculation had some rounding errors (it didn't round
when it should) and so recovery would go a few blocks too far which
would again cause a write to a random block address and probably
a device error.
The code for working with device sizes was fairly confused and spread
out, so this has been tidied up a bit.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Stefano Brivio [Sat, 17 Feb 2007 17:43:14 +0000 (18:43 +0100)]
bcm43xx: fix for 4309
BCM4309 devices aren't working properly as A PHYs aren't supported yet, but
we probe 802.11a cores anyway. This fixes it, while still allowing for A PHY code
to be developed in the future.
Jan Beulich [Sat, 17 Feb 2007 12:33:31 +0000 (13:33 +0100)]
i386: Fix broken CONFIG_COMPAT_VDSO on i386
After updating several machines to 2.6.20, I can no longer boot the single
one of them that supports the NX bit and is configured as a 32-bit system.
My understanding is that the VDSO changes in 2.6.20-rc7 were not fully
cooked, in that with that config option enabled VDSO_SYM(x) now equals
x, meaning that an address in the fixmap area is now being passed to
apps via AT_SYSINFO. However, the page is mapped with PAGE_READONLY
rather than PAGE_READONLY_EXEC.
I'm not certain whether having app code go through the fixmap area is
intended, but in case it is here is the simple patch that makes things work
again.
Don't mark pause frames as errors. This problem caused the transmitter not
to pause and would effectively take out a gigabit switch because
it can't handle overrun.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ilpo Järvinen [Tue, 13 Feb 2007 20:42:11 +0000 (12:42 -0800)]
Prevent pseudo garbage in SYN's advertized window
TCP may advertize up to 16-bits window in SYN packets (no window
scaling allowed). At the same time, TCP may have rcv_wnd
(32-bits) that does not fit to 16-bits without window scaling
resulting in pseudo garbage into advertized window from the
low-order bits of rcv_wnd. This can happen at least when
mss <= (1<<wscale) (see tcp_select_initial_window). This patch
fixes the handling of SYN advertized windows (compile tested
only).
In the worst case (which is unlikely to occur though), the receiver
advertized window could be just a couple of bytes. I'm not sure
that such a situation would be handled very well at all by the
receiver!? Fortunately, the situation normalizes after the
first non-SYN ACK is received because it has the correct,
scaled window.
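The difference between truncating and clamping can be sketched as
follows. Both helpers are hypothetical illustrations of the idea, not
the patched kernel functions:

```c
#include <stdint.h>

/* What plain truncation does: only the low-order 16 bits survive. */
static uint16_t toy_syn_window_truncated(uint32_t rcv_wnd)
{
    return (uint16_t)rcv_wnd;
}

/* Clamp instead: advertise the largest window the 16-bit field can hold. */
static uint16_t toy_syn_window(uint32_t rcv_wnd)
{
    return (uint16_t)(rcv_wnd > 65535U ? 65535U : rcv_wnd);
}
```

For rcv_wnd = 65538, truncation advertises a 2-byte window, which is the
"couple of bytes" worst case, while clamping advertises 65535.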
Alternatively, tcp_select_initial_window could be changed to
prevent too large rcv_wnd in the first place.
[ tcp_make_synack() has the same bug, and I've added a fix for
that to this patch -DaveM ]
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jiri Bohac [Wed, 14 Feb 2007 02:19:47 +0000 (18:19 -0800)]
Fix IPX module unload
[IPX]: Fix NULL pointer dereference on ipx unload
Fixes a null pointer dereference when unloading the ipx module.
On initialization of the ipx module, registering certain packet
types can fail. When this happens, unloading the module later
dereferences NULL pointers. This patch fixes that. Please apply.
Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Herbert Xu [Wed, 14 Feb 2007 02:12:38 +0000 (18:12 -0800)]
Clear TCP segmentation offload state in ipt_REJECT
[NETFILTER]: Clear GSO bits for TCP reset packet
The TCP reset packet is copied from the original. This
includes all the GSO bits which do not apply to the new
packet. So we should clear those bits.
Spotted by Patrick McHardy.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Mon, 26 Feb 2007 22:16:06 +0000 (17:16 -0500)]
UHCI: fix port resume problem
This patch (as863) fixes a problem encountered sometimes when resuming
a port on a UHCI controller. The hardware may turn off the
Resume-Detect bit before turning off the Suspend bit, leading usbcore
to think that the port is still suspended and the resume has failed.
The patch makes uhci_finish_suspend() wait until both bits are safely
off.
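The waiting logic amounts to polling until both bits read back as zero. A
sketch against a simulated port register follows; the bit values, the
toy_port script and the helper names are all illustrative assumptions,
and the delay a real driver would put between reads is omitted:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PORT_SUSPEND 0x1000u  /* illustrative bit position */
#define TOY_PORT_RESUME  0x0040u  /* illustrative bit position */

/* Simulated port: each read returns the next value of a fixed script. */
struct toy_port {
    const uint16_t *script;
    int len;
    int pos;
};

static uint16_t toy_port_read(struct toy_port *p)
{
    uint16_t v = p->script[p->pos];

    if (p->pos < p->len - 1)
        p->pos++;
    return v;
}

/* Poll until BOTH bits are clear, or give up after max_tries reads. */
static bool toy_wait_resume_done(struct toy_port *p, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        uint16_t status = toy_port_read(p);

        if (!(status & (TOY_PORT_SUSPEND | TOY_PORT_RESUME)))
            return true;
    }
    return false;
}
```

With a script where the resume bit drops first and the suspend bit one
read later, the loop keeps going past the intermediate state instead of
reporting a failed resume.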
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
NeilBrown [Tue, 6 Mar 2007 06:11:33 +0000 (17:11 +1100)]
Fix recently introduced problem with shutting down a busy NFS server.
When the last thread of nfsd exits, it shuts down all related sockets.
It currently uses svc_close_socket to do this, but that only is
immediately effective if the socket is not SK_BUSY.
If the socket is busy - i.e. if a request has arrived that has not yet
been processed - svc_close_socket is not effective and the shutdown
process spins.
So create a new svc_force_close_socket which clears the SK_BUSY flag if it
is set and then calls svc_close_socket.
Also change some open-coded loops in svc_destroy to use
list_for_each_entry_safe.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
NeilBrown [Tue, 6 Mar 2007 06:11:29 +0000 (17:11 +1100)]
Avoid using nfsd process pools on SMP machines.
process-pools have real benefits for NUMA, but on SMP
machines they only work if network interface interrupts
go to all CPUs (via round-robin or multiple nics). This is
not always the case, so disable the pools in this case until
a better solution is developed.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Tue, 13 Feb 2007 19:53:06 +0000 (14:53 -0500)]
EHCI: turn off remote wakeup during shutdown
This patch (as850b) disables remote wakeup (and everything else!) on
all EHCI ports when the shutdown() method is called. If remote wakeup
is left active then some systems will reboot instead of powering off.
This fixes Bugzilla #7828.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
IPV6: HASHTABLES: Use appropriate seed for calculating ehash index.
Tetsuo Handa <handat@pm.nttdata.co.jp> told me that connect(2) with TCPv6
socket almost always took a few minutes to return when we did not have any
ports available in the range of net.ipv4.ip_local_port_range.
The reason was that we used an incorrect seed for calculating the hash
index when checking established sockets in __inet6_check_established().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Woodhouse [Mon, 12 Feb 2007 23:26:22 +0000 (09:56 +1030)]
MTD: Fatal regression in drivers/mtd/redboot.c in 2.6.20
[MTD] Fix regression in RedBoot partition scanning
This fixes a regression introduced by the attempt to handle RedBoot FIS
tables which are smaller than an eraseblock, in commit 0b47d654089c5ce3f2ea26a4485db9bcead1e515
It moves the recalculation of the number of slots in the table to the
correct place, and improves the heuristic for when we think we need to
byte-swap what we read from the flash.
Signed-off-by: David Woodhouse <dwmw2@infradead.org> Cc: Rod Whitby <rod@whitby.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kconfig: FAULT_INJECTION can be selected only if LOCKDEP is enabled.
There is no prompt for STACKTRACE, so it is enabled only when 'select'ed.
FAULT_INJECTION depends on it, while LOCKDEP selects it. So FAULT_INJECTION
becomes visible in Kconfig only when LOCKDEP is enabled.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Julien BLACHE [Sun, 11 Feb 2007 17:27:09 +0000 (18:27 +0100)]
USB HID: Fix USB vendor and product IDs endianness for USB HID devices
The USB vendor and product IDs are not byteswapped appropriately, and
thus come out in the wrong endianness when fetched through the evdev
using ioctl() on big endian platforms.
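The IDs are little-endian on the wire, so they have to be decoded rather
than copied. In the kernel that is le16_to_cpu(); a portable userspace
equivalent, with toy_le16_to_host() as a hypothetical name, is:

```c
#include <stdint.h>

/*
 * Decode a 16-bit little-endian field from its two wire bytes. The result
 * is the same on little- and big-endian hosts.
 */
static uint16_t toy_le16_to_host(const unsigned char bytes[2])
{
    return (uint16_t)(bytes[0] | (bytes[1] << 8));
}
```

Wire bytes ac 05 decode to 0x05ac regardless of host byte order, whereas
copying the raw 16-bit value on a big-endian machine would give 0xac05.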
Stefan Richter [Fri, 9 Feb 2007 23:44:44 +0000 (00:44 +0100)]
ieee1394: fix host device registering when nodemgr disabled
Since my commit 8252bbb1363b7fe963a3eb6f8a36da619a6f5a65 in 2.6.20-rc1,
host devices have a dummy driver attached. Alas the driver was not
registered before use if ieee1394 was loaded with disable_nodemgr=1.
This resulted in non-functional FireWire drivers or kernel lockup.
http://bugzilla.kernel.org/show_bug.cgi?id=7942
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Moore [Fri, 9 Feb 2007 23:41:28 +0000 (00:41 +0100)]
ieee1394: video1394: DMA fix
This together with the phys_to_virt fix in lib/swiotlb.c::swiotlb_sync_sg
fixes video1394 DMA on machines with DMA bounce buffers, especially Intel
x86-64 machines with > 3GB RAM.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: David Moore <dcm@acm.org> Tested-by: Nicolas Turro <Nicolas.Turro@inrialpes.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
CC arch/ppc/kernel/ppc_ksyms.o
arch/ppc/kernel/ppc_ksyms.c:275: error: '__mtdcr' undeclared here (not in a function)
arch/ppc/kernel/ppc_ksyms.c:275: warning: type defaults to 'int' in declaration of '__mtdcr'
arch/ppc/kernel/ppc_ksyms.c:276: error: '__mfdcr' undeclared here (not in a function)
arch/ppc/kernel/ppc_ksyms.c:276: warning: type defaults to 'int' in declaration of '__mfdcr'
make[1]: *** [arch/ppc/kernel/ppc_ksyms.o] Error 1
This is due to the EXPORT_SYMBOL for __mtdcr/__mfdcr not having the proper CONFIG protection.
Signed-off-by: Rojhalat Ibrahim <imr@rtschenk.de> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Neil Brown [Wed, 7 Feb 2007 22:28:28 +0000 (09:28 +1100)]
md: Avoid possible BUG_ON in md bitmap handling.
md/bitmap tracks how many active write requests are pending on blocks
associated with each bit in the bitmap, so that it knows when it can
clear the bit (when count hits zero).
The counter has 14 bits of space, so if there are ever more than 16383,
we cannot cope.
Currently the code just calls BUG_ON as "all" drivers have request queue
limits much smaller than this.
However it seems that some don't. Apparently some multipath configurations
can allow more than 16383 concurrent write requests.
So, in this unlikely situation, instead of calling BUG_ON we now wait
for the count to drop down a bit. This requires a new wait_queue_head,
some waiting code, and a wakeup call.
Tested by limiting the counter to 20 instead of 16383 (writes go a lot slower
in that case...).
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
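The wait-instead-of-BUG_ON idea described above can be sketched in user-space Python as follows. This is only a model under stated assumptions: the class and method names are hypothetical, and a threading.Condition stands in for the wait_queue_head and wakeup call.

```python
import threading

COUNTER_MAX = (1 << 14) - 1   # 14 bits of counter space, i.e. 16383


class WriteCounter:
    """Toy model of a per-bit pending-write counter that waits when full."""

    def __init__(self, limit=COUNTER_MAX):
        self.limit = limit
        self.count = 0
        self.cond = threading.Condition()   # stands in for the wait queue

    def start_write(self):
        with self.cond:
            # Instead of treating saturation as a fatal error, wait for
            # the count to drop below the limit.
            while self.count >= self.limit:
                self.cond.wait()
            self.count += 1

    def end_write(self):
        with self.cond:
            self.count -= 1
            self.cond.notify()              # the wakeup call
            return self.count == 0          # True when the bit could be cleared
```

Passing a small limit such as 20 mirrors the way the patch was tested: writers simply queue up behind the counter.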
Alexey Dobriyan [Wed, 7 Feb 2007 05:58:27 +0000 (21:58 -0800)]
Fix allocation failure handling in multicast
[IPV4/IPV6] multicast: Check add_grhead() return value
add_grhead() allocates memory with GFP_ATOMIC, and in at least two places the
skb it returns is passed to skb_put() without checking for failure.
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Daniel Walker [Wed, 7 Feb 2007 05:56:37 +0000 (21:56 -0800)]
Fix ATM initcall ordering.
[ATM]: Fix for crash in adummy_init()
This was reported by Ingo Molnar here,
http://lkml.org/lkml/2006/12/18/119
The problem is that adummy_init() depends on atm_init() , but adummy_init()
is called first.
So I put atm_init() into subsys_initcall which seems appropriate, and it
will still get module_init() if it becomes a module.
Interesting to note that you could crash your system here if you just load
the modules in the wrong order.
Signed-off-by: Daniel Walker <dwalker@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
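The ordering problem can be modeled with a short Python sketch. It is not kernel code: the level numbers are an assumption based on the usual initcall levels (subsys_initcall at level 4, built-in module_init at level 6), and the helper names are hypothetical.

```python
SUBSYS_LEVEL = 4   # assumed level of subsys_initcall()
DEVICE_LEVEL = 6   # assumed level of module_init() for built-in code


class InitOrderError(Exception):
    pass


def make_calls(atm_level):
    """Build the two initcalls, listed with adummy first as in the report."""
    state = {"atm_ready": False}

    def atm_init():
        state["atm_ready"] = True

    def adummy_init():
        if not state["atm_ready"]:
            raise InitOrderError("adummy_init() ran before atm_init()")

    return [(DEVICE_LEVEL, adummy_init), (atm_level, atm_init)]


def run_initcalls(calls):
    # Lower levels run first; sorted() is stable, so calls at the same
    # level keep their listed order.
    for _, fn in sorted(calls, key=lambda c: c[0]):
        fn()
```

With atm_init() at the same level as adummy_init() the listed order decides and the model fails; moving it to the earlier level makes the order irrelevant.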
Neil Brown [Tue, 6 Feb 2007 23:26:56 +0000 (10:26 +1100)]
Fix various bugs with aligned reads in RAID5.
It is possible for raid5 to be sent a bio that is too big
for an underlying device. So if it is a READ that we
pass straight down to a device, it will fail and confuse
RAID5.
So in 'chunk_aligned_read' we check that the bio fits within the
parameters for the target device and if it doesn't fit, fall back
on reading through the stripe cache and making lots of one-page
requests.
Note that this is the earliest time we can check against the device
because earlier we don't have a lock on the device, so it could change
underneath us.
Also, the code for handling a retry through the cache when a read
fails had not been tested and was badly broken. This patch fixes that
code.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Takashi Iwai [Tue, 6 Feb 2007 18:15:26 +0000 (19:15 +0100)]
hda-intel - Don't try to probe invalid codecs
[ALSA] hda-intel - Don't try to probe invalid codecs
Limit the max number of codecs detected by HD-intel (and compatible)
controllers to 3. Some hardware reports extra bits as if codecs were
connected, and the driver gets confused and tries to probe nonexistent codecs.
Takashi Iwai [Tue, 6 Feb 2007 18:13:31 +0000 (19:13 +0100)]
usbaudio - Fix Oops with unconventional sample rates
The patch fixes the memory corruption caused by the support for
unconventional sample rates. It also avoids overly restrictive constraints
if any of the USB descriptors contain continuous rates.
Alan Stern [Mon, 5 Feb 2007 14:56:15 +0000 (09:56 -0500)]
USB: fix concurrent buffer access in the hub driver
This patch (as849) fixes a bug in the USB hub driver. A single
pre-allocated buffer is used for all port status reads, but nothing
guarantees exclusive use of the buffer. A mutex is added to provide
this guarantee.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
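A minimal Python sketch of the locking pattern, assuming a toy hub object: the names Hub, status_buf, status_mutex and fake_read_hw are hypothetical and only model the idea of one shared buffer guarded by a mutex.

```python
import threading
import time


class Hub:
    """Toy model of a hub with one pre-allocated status buffer."""

    def __init__(self):
        self.status_buf = bytearray(4)        # shared by every port status read
        self.status_mutex = threading.Lock()  # hypothetical name for the added mutex

    def port_status(self, port, read_hw):
        # Hold the mutex from the hardware read until the data is copied
        # out, so no other reader can overwrite the buffer in between.
        with self.status_mutex:
            read_hw(port, self.status_buf)
            return bytes(self.status_buf)


def fake_read_hw(port, buf):
    """Stand-in for the status transfer: fills the buffer one byte at a time."""
    for i in range(len(buf)):
        buf[i] = port
        time.sleep(0)   # give other threads a chance to run mid-fill
```

Without the lock, two concurrent readers could each return a buffer containing a mix of both ports' bytes.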
David Moore [Sun, 4 Feb 2007 18:39:40 +0000 (13:39 -0500)]
Missing critical phys_to_virt in lib/swiotlb.c
Adds missing call to phys_to_virt() in the
lib/swiotlb.c:swiotlb_sync_sg() function. Without this change, a kernel
panic will always occur whenever a SWIOTLB bounce buffer from a
scatter-gather list gets synced. Affected are especially Intel x86_64
machines with more than about 3 GB RAM.
Signed-off-by: David Moore <dcm@acm.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dave Jones [Sun, 4 Feb 2007 17:18:50 +0000 (12:18 -0500)]
AGP: intel-agp bugfix
On Sun, Feb 04, 2007 at 04:51:38PM +0100, Eric Piel wrote:
> Hello,
>
> I've got a regression in 2.6.20-rc7 (-rc6 was fine) due to commit
> 4b95320fc4d21b0ff2f8604305dd6c851aff6096 ([AGPGART] intel_agp: restore
> graphics device's pci space early in resume).
I think the key to this failure is the last line here ..
> agpgart-intel 0000:00:00.0: resuming
> PM: Writing back config space on device 0000:00:02.0 at offset f (was 10b, writing 0)
> PM: Writing back config space on device 0000:00:02.0 at offset d (was dc, writing 0)
> PM: Writing back config space on device 0000:00:02.0 at offset b (was 10161025, writing 0)
> PM: Writing back config space on device 0000:00:02.0 at offset 5 (was f4000000, writing 0)
> PM: Writing back config space on device 0000:00:02.0 at offset 4 (was f8000008, writing 0)
> PM: Writing back config space on device 0000:00:02.0 at offset 2 (was 3000011, writing 0)
> PM: Writing back config space on device 0000:00:02.0 at offset 1 (was 2b00007, writing 0)
> PM: Writing back config space on device 0000:00:02.0 at offset 0 (was 11328086, writing 0)
> agpgart: Unable to remap memory.
This then blows up the next access to intel_i810_private.registers, which happens to
be intel_i810_insert_entries.
Either we need .suspend methods which unmap these regions, or we need
to skip trying to map them a second time on resume.
There's an ugly patch below which does the latter. Give it a try?
The intel-agp suspend/resume code has really grown into something
of a monster, and could use some refactoring in a big way.
Dave
From: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Tejun Heo [Mon, 5 Feb 2007 12:47:13 +0000 (21:47 +0900)]
ide: fix drive side 80c cable check
eighty_ninty_three() had the word 93 validity check but not the 80c bit
test itself (bit 12). This increases the chance of incorrect wire
detection, especially because host side cable detection is often
unreliable and we sometimes depend solely on drive side cable
detection. Fix it.
David Howells [Fri, 9 Feb 2007 14:30:37 +0000 (09:30 -0500)]
Keys: Fix key serial number collision handling
Fix the key serial number collision avoidance code in key_alloc_serial().
This didn't use to be so much of a problem as the key serial numbers were
allocated from a simple incremental counter, and it would have to go through
two billion keys before it could possibly encounter a collision. However, now
that random numbers are used instead, collisions are much more likely.
This is fixed by finding a hole in the rbtree where the next unused serial
number ought to be and using that by going almost back to the top of the
insertion routine and redoing the insertion with the new serial number rather
than trying to be clever and attempting to work out the insertion point
pointer directly.
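The hole-finding step can be sketched in Python, with a sorted list standing in for the rbtree. This is a simplified model, not the kernel routine: the function name is hypothetical, and wrap-around of the serial number space is deliberately left out.

```python
import bisect


def alloc_serial(used, candidate):
    """Insert a serial into the sorted list `used`, skipping past collisions.

    If `candidate` is already taken, walk forward to the next unused
    serial and insert that instead. Returns the serial actually used.
    """
    serial = candidate
    i = bisect.bisect_left(used, serial)
    # Each consecutive occupied serial pushes the candidate up by one
    # until a hole is found.
    while i < len(used) and used[i] == serial:
        serial += 1
        i += 1
    # Redo the insertion with the final serial rather than reusing `i`.
    bisect.insort(used, serial)
    return serial
```

For example, with serials 5, 6, 7 and 10 in use, a colliding candidate of 6 ends up as 8, the first hole after the run of used numbers.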
NeilBrown [Wed, 7 Feb 2007 00:10:26 +0000 (11:10 +1100)]
knfsd: Fix a race in closing NFSd connections.
If you lose this race, it can iput a socket inode twice and you
get a BUG in fs/inode.c
When I added the option for user-space to close a socket,
I added some cruft to svc_delete_socket so that I could call
that function when closing a socket per user-space request.
This was the wrong thing to do. I should have just set SK_CLOSE
and let normal mechanisms do the work.
Not only wrong, but buggy. The locking is all wrong and it opened
up a race whereby a socket could be closed twice.
So this patch:
Introduces svc_close_socket which sets SK_CLOSE then either leaves
the close up to a thread, or calls svc_delete_socket if it can
get SK_BUSY.
Adds a bias to sk_busy which is removed when SK_DEAD is set.
This avoids races around shutting down the socket.
Changes several 'spin_lock' to 'spin_lock_bh' where the _bh
was missing.
Dan Williams [Tue, 13 Feb 2007 21:07:27 +0000 (16:07 -0500)]
prism54: correct assignment of DOT1XENABLE in WE-19 codepaths
Correct assignment of DOT1XENABLE in WE-19 codepaths.
RX_UNENCRYPTED_EAPOL = 1 really means setting DOT1XENABLE _off_, and
vice versa. The original WE-19 patch erroneously reversed that. This
patch fixes association with unencrypted and WEP networks when using
wpa_supplicant.
It also adds two missing break statements that, left out, could result
in incorrect card configuration.
Applies to (I think) 2.6.19 and later.
Signed-off-by: Dan Williams <dcbw@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Atsushi Nemoto [Sat, 3 Feb 2007 14:16:36 +0000 (23:16 +0900)]
rtc-pcf8563: detect polarity of century bit automatically
The usage of the century bit was inverted in 2.6.19 to follow the PCF8563's
description, but that did not match the usage suggested by the RTC8564's
datasheet. In any case, what MO_C=1 means can vary from platform to platform.
This patch detects its polarity in the get_datetime routine. The default value of
c_polarity is 0 (MO_C=1 means 19xx) so that this patch does not change
current behavior even if get_datetime was not called before set_datetime.
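One way to model the polarity detection is the following Python sketch. It is an illustration under stated assumptions, not the driver code: the function names are hypothetical, and the sketch assumes the clock is somewhere in the years 1970 to 2069 so that a two-digit year can be mapped to a century.

```python
def decode_year(two_digit_year, mo_c):
    """Return (tm_year, c_polarity) from the year register and the MO_C bit.

    tm_year counts years since 1900. The 1970..2069 window is an
    assumption of this sketch.
    """
    tm_year = two_digit_year if two_digit_year >= 70 else two_digit_year + 100
    if mo_c:
        c_polarity = 1 if tm_year >= 100 else 0   # MO_C=1 seen in a 20xx year
    else:
        c_polarity = 1 if tm_year < 100 else 0    # MO_C=0 seen in a 19xx year
    return tm_year, c_polarity


def century_bit(tm_year, c_polarity=0):
    """Value of MO_C to write for tm_year under the given polarity."""
    if c_polarity:
        return tm_year >= 100
    return tm_year < 100      # default polarity 0: MO_C=1 means 19xx
```

Because the default polarity is 0, century_bit() gives the pre-patch result when decode_year() has never been called.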
x86_64: fix 2.6.18 regression - PTRACE_OLDSETOPTIONS should be accepted
Also PTRACE_OLDSETOPTIONS should be accepted, as done by kernel/ptrace.c and
forced by binary compatibility. UML/32bit breaks because of this - since it is wise
enough to use PTRACE_OLDSETOPTIONS to be binary compatible with 2.4 host
kernels.
Mark Fasheh [Tue, 6 Mar 2007 00:34:11 +0000 (16:34 -0800)]
ocfs2: ocfs2_link() journal credits update
Commit 592282cf2eaa33409c6511ddd3f3ecaa57daeaaa fixed some missing directory
c/mtime updates in part by introducing a dinode update in ocfs2_add_entry().
Unfortunately, ocfs2_link() (which didn't update the directory inode before)
is now missing a single journal credit. Fix this by doubling the number of
inode updates expected during hard link creation.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Greg Banks [Mon, 19 Feb 2007 23:12:34 +0000 (10:12 +1100)]
[PATCH] Fix a free-wrong-pointer bug in nfs/acl server (CVE-2007-0772)
Due to type confusion, when an nfsacl version 2 'ACCESS' request
finishes and tries to clean up, it calls fh_put on entirely the
wrong thing and this can cause an oops.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Frédéric Riss [Tue, 30 Jan 2007 20:41:17 +0000 (21:41 +0100)]
[PATCH] EFI x86: pass firmware call parameters on the stack
When calling into the EFI firmware, the parameters need to be passed on
the stack. The recent change to use -mregparm=3 breaks x86 EFI support.
This patch is needed to allow the new Intel-based Macs to suspend to ram
(efi.get_time is called during the suspend phase).
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crash
[SCSI] st: A MTIOCTOP/MTWEOF within the early warning will cause the file number to be incorrect
[SCSI] qla4xxx: bug fixes
[SCSI] Fix scsi_add_device() for async scanning
John Keller [Sat, 3 Feb 2007 09:14:02 +0000 (01:14 -0800)]
[PATCH] Altix: more ACPI PRT support
The SN Altix platform does not conform to the IOSAPIC IRQ routing model.
Add code in acpi_unregister_gsi() to check if (acpi_irq_model ==
ACPI_IRQ_MODEL_PLATFORM) and return.
Due to an oversight, this code was not added previously when
similar code was added to acpi_register_gsi().
Signed-off-by: John Keller <jpk@sgi.com> Acked-by: Len Brown <lenb@kernel.org> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Sat, 3 Feb 2007 09:14:01 +0000 (01:14 -0800)]
[PATCH] revert blockdev direct io back to 2.6.19 version
Andrew Vasquez is reporting as-iosched oopses and a 65% throughput
slowdown due to the recent special-casing of direct-io against
blockdevs. We don't know why either of these things are occurring.
The patch minimally reverts us back to the 2.6.19 code for a 2.6.20
release.
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Sat, 3 Feb 2007 09:13:55 +0000 (01:13 -0800)]
[PATCH] alpha: fix epoll syscall enumerations
We went and named them __NR_sys_foo instead of __NR_foo.
It may be too late to change this, but we can at least add the proper names
now.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Magnus Damm [Sat, 3 Feb 2007 09:13:48 +0000 (01:13 -0800)]
[PATCH] kexec: Avoid migration of already disabled irqs (ia64)
This patch fixes up ia64 kexec support for HP rx2620 hardware. It does
this by skipping migration of already disabled irqs. This is most likely a
problem on other ia64 platforms as well, but I've only been able to
reproduce it on one machine so far.
The full story is that, without this patch, handle_bad_irq() gets invoked
before the new kernel is started. This seems to happen when fixup_irqs()
calls generic_handle_irq() on already migrated (and disabled) irqs. So by
avoiding migration of disabled irqs we stay away from handle_bad_irq().
The code has been tested on three different ia64 machines, all with good
results. It is possible to trigger the same bug by offlining a processor
using echo 0 > /sys/devices/system/cpu/cpuX/online.
More detailed information is available in the following mail thread:
http://lists.osdl.org/pipermail/fastboot/2007-January/thread.html#5774
Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Acked-by: Simon Horman <horms@verge.net.au> Acked-by: Zou, Nanhai <nanhai.zou@intel.com> Acked-by: Jay Lan <jlan@sgi.com> Acked-by: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2
flush_workqueue() is not allowed to be called in the softirq context.
However, aio_complete() called from an I/O interrupt can potentially call
put_ioctx with the last ref count on the ioctx and trigger the bug. It is
simply incorrect to perform ioctx freeing from aio_complete.
The bug is trigger-able from a race between io_destroy() and aio_complete().
A possible scenario:
The real problem is that the condition check of ctx->reqs_active in
wait_for_all_aios() is incorrect, in that access to reqs_active is not
properly protected by a spin lock.
This patch adds that protective spin lock, and at the same time removes
all duplicate ref counting for each kiocb as reqs_active is already used
as a ref count for each active ioctx. This also ensures that the buggy call
to flush_workqueue() in softirq context is eliminated.
Signed-off-by: "Ken Chen" <kenchen@google.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Suparna Bhattacharya <suparna@in.ibm.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: <stable@kernel.org> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>