Despite all the care lately in making the powermac sleep/wakeup as
robust as possible, there is still a nasty related to the use of cpufreq
on PMU based machines. Unfortunately, it affects paulus old powerbook
so I have to fix it :)
We didn't manage to understand what is precisely going on, it leads to
memory corruption and might have to do with RAM not beeing properly
refreshed when a cpufreq transition is done right before the sleep.
The best workaround (and less intrusive at this point) we could come up
with is included in this patch. We basically do _not_ force a switch to
high speed on suspend anymore (that is what is causing the problem) on
those machines. We still force a speed switch on wakeup (since we don't
know what speed we are coming back from sleep at, and that seems to work
fine).
Since, during this short interval, the actual CPU speed might be
incorrect, we also hack around by multiplying loops_per_jiffy by 2 (max
speed factor on those machines) during early wakeup stage to make sure
udelay's during that time aren't too short.
For after 2.6.12, we'll change udelay implementation to use the CPU
timebase (which is always constant) instead like we do on ppc64 and thus
get rid of all those problems.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] iseries_veth: Supress spurious WARN_ON() at module unload
My patch from a few weeks back (now in mainline), called "Cleanup skbs to
prevent unregister_netdevice() hanging", can cause our TX timeout code to
fire on machines with lots of VLANs (because it takes > 2 seconds between
when we stop the queues and when we're finished stopping the connections).
When that happens the TX timeout code freaks out and does a WARN_ON()
because as far as it's concerned there shouldn't be a TX timeout happening,
which is fair enough.
I have a "proper" fix for this, which is to a) do refcounting on
connections and b) implement a proper ack timer so we don't keep unacked
skbs lying around for ever. But for 2.6.12 I propose just supressing the
WARN_ON(). Users will still see the "NETDEV WATCHDOG" warning, but that's
not nearly as bad as a WARN_ON() which users interpret as an Oops.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Narendra Sankar [Fri, 6 May 2005 19:00:05 +0000 (12:00 -0700)]
[PATCH] PCI: MSI functionality broken on Serverworks GC chipset
MSI functionality is broken on the GC_LE x86 chipset that Serverworks
developed and that is being used in various platforms today. Broadcom is
going to push out to the kernel MSI enabled Gigabit drivers (in the very
near future), and we would like to make sure that MSI does not get
enabled on any platforms using the GC_LE chipset (device id 0x17).
Following the AMD 8131 example, I am including a patch to disable MSI
functionality when a GCNB_LE is detected. Please let me know if there
are any issues with this. This is a permanent fix for this chipset, as
the hardware will not be updated.
[IA64] Fix race condition in the rt_sigprocmask fastcall
current->blocked will be set to the value of current->thread_info->flags if the
cmpxchg to update thread_info->flags fails. For performance reasons the store into
current->blocked was placed in the cmpxchg loop. However, the cmpxchg overwrites the
register holding the value to be stored. In the rare case of a retry the value of
thread_info->flags will be written into current->blocked.
The fix is to use another register so that the register containing the current->blocked
value is not overwritten.
Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Ian Abbott [Thu, 2 Jun 2005 09:34:11 +0000 (10:34 +0100)]
[PATCH] USB: ftdi_sio: avoid losing received data in tty-ldisc
ftdi_sio: Avoid losing bytes at tty-ldisc.
This patch was originally developed by Daniel Smertnig. I
(Ian Abbott) made a few changes. It has been tested by both
Daniel and I, at least for raw, non-canonical receive data
processing.
Here is Daniel's original description of the patch:
===
During a project in which I was using a FTDI 232BM to
transmit data at relative high speeds (625kBit/s), I
noticed a problem where data was lost even if flow
control was enabled: The FTDI-Driver receives 512 Bytes
of data over USB at a time, which consists of 8 64-Byte
packets. Subtracting the 2 bytes of status information
included in each packet this gives 496 "real" data
bytes per read.
This data is passed (indirectly, via the flip buffers)
to the tty line discipline which takes care of
throttling when there the free buffer space reaches
TTY_THRESHOLD_THROTTLE (128). Because the FTDI driver
processes up to 496 bytes at a time, throttling won't
happen in time and the line discipline will discard the
remaining bytes.
To avoid this the patch passes data in 62-byte blocks
to the tty layer and checks the available space in the
ldisc-buffers. If there isn't enough free space,
processing the rest of the data is delayed using a
workqueue.
Note: The original problem should be easily
reproducible with a userspace program which does slow &
small reads.
===
Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Daniel Smertnig <daniel.smertnig@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Pete Zaitcev [Mon, 6 Jun 2005 20:54:59 +0000 (13:54 -0700)]
[PATCH] USB: fix ub issues
This smoothes two imperfections:
- Increase number of LUNs per device from 4 to 9. The best solution
would be to remove this limit altogether, but that has to wait until
the time when more than 26 hosts are allowed.
- Replace mdelay with msleep in a probing routine.
Signed-off-by: Pete Zaitcev <zaitcev@yahoo.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Albert Lee [Mon, 6 Jun 2005 07:56:03 +0000 (15:56 +0800)]
[PATCH] sg traverse fix for __atapi_pio_bytes()
Problem:
Incorrect md5sum when using ATAPI PIO mode to verify a distro CD.
Root cause: sg traverse problem.
In __atapi_pio_bytes(), if qc->cursg++ is increased and "goto
next_page" is executed, then sg is not updated to the new qc->cursg
and the old sg is overwritten with the new data.
Changes:
- Replace "goto next_page" with "goto next_sg" to make sg updated.
Paul Mackerras [Wed, 8 Jun 2005 11:59:15 +0000 (21:59 +1000)]
[PATCH] ppc64: Fix PER_LINUX32 behaviour
This patch fixes some bugs in the ppc64 PER_LINUX32 implementation,
noted by Juergen Kreileder:
* uname(2) doesn't respect PER_LINUX32, it returns 'ppc64' instead of 'ppc'
* Child processes of a PER_LINUX32 process don't inherit PER_LINUX32
Along the way I took the opportunity to move things around so that
sys_ppc32.c only has 32-bit syscall emulation functions and to remove
the obsolete "fakeppc" command line option.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
READA errors failing with EWOULDBLOCK/EAGAIN do not constitute a valid
reason for failing the path; this lead to erratic errors on DM multipath
devices. This error can be safely propagated upwards without failing the
path.
Acked-by: Kevin Corry <kevcorry@us.ibm.com> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Lars Marowsky-Bree <lmb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Peter Chubb [Wed, 8 Jun 2005 22:50:20 +0000 (15:50 -0700)]
[PATCH] ia64: fix floating-point preemption problem
There've been reports of problems with CONFIG_PREEMPT=y and the high
floating point partition. This is caused by the possibility of preemption
and rescheduling on a different processor while saving or restioirng the
high partition.
The only places where the FPU state is touched are in ptrace, in
switch_to(), and where handling a floating-point exception. In switch_to()
preemption is off. So it's only in trap.c and ptrace.c that we need to
prevent preemption.
Here is a patch that adds commentary to make the conditions clear, and adds
appropriate preempt_{en,dis}able() calls to make it so. In trap.c I use
preempt_enable_no_resched(), as we're about to return to user space where
the preemption flag will be checked anyway.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove spurious MSR_SE reset during kprobe processing.
single_step_exception() already does it for us. Reset it to be safe when
executing the fault_handler.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Keith Owens [Wed, 8 Jun 2005 22:49:07 +0000 (15:49 -0700)]
[PATCH] Stop arch/i386/kernel/vsyscall-note.o being rebuilt every time
arch/i386/kernel/vsyscall-note.o is not listed as a target so its .cmd file
is neither considered as a target nor is it read on the next build. This
causes vsyscall-note.o to be rebuilt every time that you run make, which
causes vmlinux to be rebuilt every time.
Signed-off-by: Keith Owens <kaos@ocs.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Wed, 8 Jun 2005 22:48:27 +0000 (15:48 -0700)]
[PATCH] uml: clean up error path
This cleans an error path which used to leak file descriptors by returning
without trying to tidy up.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Wed, 8 Jun 2005 22:48:13 +0000 (15:48 -0700)]
[PATCH] uml: fix strace -f
It turns out that we need to check for pending signals when a newly forked
process is run for the first time. With strace -f, strace needs to know about
the forked process before it gets going. If it doesn't, then it ptraces some
bogus values into its registers, and the process segfaults. So, I added calls
to interrupt_end, which does that, plus checks for reschedules. There
shouldn't be any of those, but x86 does the same thing, so I'm copying that
behavior to be safe.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Wed, 8 Jun 2005 22:48:01 +0000 (15:48 -0700)]
[PATCH] uml: compile fixes for gcc 4
This is a bunch of compile fixes provoked by building UML with gcc 4. There
are a bunch of signedness mismatches, a couple of uninitialized references,
and a botched C99 structure initialization which had somehow gone unnoticed.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Wed, 8 Jun 2005 22:47:50 +0000 (15:47 -0700)]
[PATCH] uml: make the emulated iomem driver work on 2.6
This makes the minimal fixes needed to make the UML iomem driver work in 2.6.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Thomas Graf [Wed, 8 Jun 2005 22:10:48 +0000 (15:10 -0700)]
[PKT_SCHED]: Allow socket attributes to be matched on via meta ematch
Adds meta collectors for all socket attributes that make sense
to be filtered upon. Some of them are only useful for debugging
but having them doesn't hurt.
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Changing the sysctl net.core.dev_weight has no effect because the weight
of the backlog devices is set during initialization and never changed.
This patch propagates any changes to the global value affected by sysctl
to the per-cpu devices. It is done every time the packet handler
function is run.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 8 Jun 2005 21:13:14 +0000 (14:13 -0700)]
[TG3]: Fix 5700/5701 DMA corruption on Apple G4.
Fix 5700/5701 DMA write corruption on Apple G4 by detecting the Apple
UniNorth PCI 1.5 chipset and adjusting the DMA write boundary to 16. DMA
test fails to detect the problem with this chipset.
Thanks to Manuel Perez Ayala for reporting the problem and helping to
debug it.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Keith Owens [Sat, 28 May 2005 06:09:00 +0000 (23:09 -0700)]
[IA64] Extract correct break number for break.b
break.b does not store the break number in cr.iim, instead it stores 0,
which makes all break.b instructions look like BUG(). Extract the
break number from the instruction itself.
Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Tony Luck [Wed, 8 Jun 2005 19:12:48 +0000 (12:12 -0700)]
[IA64] Update comment to describe modes set in default control register.
Christian Hildner pointed out that the comment did not match what the
code does in cpu_init() when we set up the default control register.
Patch based on suggestions from Ken Chen.
Keith Owens [Mon, 6 Jun 2005 09:04:00 +0000 (02:04 -0700)]
[IA64] Module gp must point to valid memory
Some bits of the kernel assume that gp always points to valid memory,
in particular PHYSICAL_MODE_ENTER() assumes that both gp and sp are
valid virtual addresses with associated physical pages. The IA64
module loader puts gp well past the end of the module, with no physical
backing. Offsets on gp are still valid, but physical mode addressing
breaks for modules. Ensure that gp always falls within the module
body. Also ensure that gp is 8 byte aligned.
Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Kprobes was eating the hardware instruction and data address
breakpoint exceptions. This patch fixes it; kprobes doesn't use those
exceptions at all and should ignore them.
Signed-off-by: Ananth N Mavinakayanahalli <amavin@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Olaf Hering [Wed, 8 Jun 2005 05:12:00 +0000 (15:12 +1000)]
[PATCH] ppc64: print negative numbers correctly in boot wrapper
if num has a value of -1, accessing the digits[] array will fail and the
format string will be printed in funny way, or not at all. This happens if
one prints negative numbers.
Just change the code to match lib/vsprintf.c
asm/div64.h cant be used because u64 maps to u32 for this build.
Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
Russell King [Wed, 8 Jun 2005 14:28:24 +0000 (15:28 +0100)]
[PATCH] ARM: Fix Xscale copy_page implementation
The ARM copypage changes in 2.6.12-rc4-git1 removed the preempt locking
from the copypage functions which broke the XScale implementation.
This patch fixes the locking on XScale and removes the now unneeded
minicache code.
Signed-off-by: Russell King <rmk@arm.linux.org.uk> Checked-by: Richard Purdie
Trond Myklebust [Tue, 7 Jun 2005 22:37:01 +0000 (18:37 -0400)]
[PATCH] NFS: Fix lookup intent handling
We should never apply a lookup intent to anything other than the last
path component in an open(), create() or access() call.
Introduce the helper nfs_lookup_check_intent() which always returns
zero if LOOKUP_CONTINUE or LOOKUP_PARENT are set, and returns the
intent flags if we're on the last component of the lookup.
By doing so, we fix a bug in open(O_EXCL), where we may end up
optimizing away a real lookup of the parent directory.
Problem noticed by Linda Dunaphant <linda.dunaphant@ccur.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Matthew Dobson [Tue, 7 Jun 2005 20:22:05 +0000 (13:22 -0700)]
[PATCH] send_IPI_mask_sequence() warning fix
In file included from arch/i386/kernel/smp.c:235:
include/asm-i386/mach-numaq/mach_ipi.h:4: warning: `send_IPI_mask_sequence'
declared inline after its definition
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Mosberger [Mon, 4 Apr 2005 20:29:43 +0000 (13:29 -0700)]
[PATCH] Replace check_bridge_mode() with (bridge->mode & AGSTAT_MODE_3_0).
[AGPGART] Replace check_bridge_mode() with (bridge->mode & AGSTAT_MODE_3_0).
As mentioned earlier, the current check_bridge_mode() code assumes
that AGP bridges are PCI devices. This isn't always true. Definitely
not for HP zx1 chipset and the same seems to be the case for SGI's AGP
bridge.
The patch below fixes the problem by picking up the AGP_MODE_3_0 bit
from bridge->mode. I feel like I may be missing something, since I
can't see any reason why check_bridge_mode() wasn't doing that in the
first place. According to the AGP 3.0 specs, the AGP_MODE_3_0 bit is
determined during the hardware reset and cannot be changed, so it
seems to me it should be safe to pick it up from bridge->mode.
With the patch applied, I can definitely use AGP acceleration both
with AGP 2.0 and AGP 3.0 (one with an Nvidia card, the other with an
ATI FireGL card).
Unless someone spots a problem, please apply this patch so 3d
acceleration can work on zx1 boxes again.
This makes AGP work again on machines with an AGP bridge that isn't a
PCI device.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com> Signed-off-by: Dave Jones <davej@redhat.com>
Keir Fraser [Wed, 30 Mar 2005 21:17:04 +0000 (13:17 -0800)]
[PATCH] AGP fix for Xen VMM
When Linux is running on the Xen virtual machine monitor, physical
addresses are virtualised and cannot be directly referenced by the AGP
GART. This patch fixes the GART driver for Xen by adding a layer of
abstraction between physical addresses and 'GART addresses'.
Architecture-specific functions are also defined for allocating and freeing
the GATT. Xen requires this to ensure that table really is contiguous from
the point of view of the GART.
These extra interface functions are defined as 'no-ops' for all existing
architectures that use the GART driver.
Signed-off-by: Keir Fraser <keir@xensource.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Dave Jones <davej@redhat.com>
David Mosberger [Mon, 6 Jun 2005 22:50:09 +0000 (15:50 -0700)]
[PATCH] Include <linux/config.h> before testing CONFIG_ACPI
I'm not sure why this issue is suddenly showing, but without this
patchlet, the zx1 config won't compile anymore (e.g., to see the
compilation-error, look for "***" in [1]).
Michael Chan [Mon, 6 Jun 2005 22:16:20 +0000 (15:16 -0700)]
[TG3] Fix link failure in 5701
On some 5701 devices with older bootcode, the LED configuration bits in
SRAM may be invalid with value zero. The fix is to check for invalid
bits (0) and default to PHY 1 mode. Incorrect LED mode will lead to
error in programming the PHY.
Thanks to Grant Grundler for debugging the problem.
>From Grant:
| In May, 2004, tg3 v3.4 changed how MAC_LED_CTRL (0x40c) was getting
| programmed and how to determine what to program into LED_CTRL. The new
| code trusted NIC_SRAM_DATA_CFG (0x00000b58) to indicate what to write
| to LED_CTRL and MII EXT_CTRL registers. On "IOX Core Lan", SRAM was
| saying MODE_MAC (0x0) and that doesn't work.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yoshinori Sato [Mon, 6 Jun 2005 21:46:32 +0000 (14:46 -0700)]
[PATCH] binfmt_flat mmap flag fix
Make sure that binfmt_flat passes the correct flags into do_mmap(). nommu's
validate_mmap_request() will simple return -EINVAL if we try and pass it a
flags value of zero.
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 6 Jun 2005 20:36:14 +0000 (13:36 -0700)]
[PATCH] namei fixes (19/19)
__do_follow_link() passes potentially worng vfsmount to touch_atime(). It
matters only in (currently impossible) case of symlink mounted on something,
but it's trivial to fix and that actually makes more sense.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 6 Jun 2005 20:36:12 +0000 (13:36 -0700)]
[PATCH] namei fixes (16/19)
Conditional mntput() moved into __do_follow_link(). There it collapses with
unconditional mntget() on the same sucker, closing another too-early-mntput()
race.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 6 Jun 2005 20:36:11 +0000 (13:36 -0700)]
[PATCH] namei fixes (15/19)
Getting rid of sloppy logics:
a) in do_follow_link() we have the wrong vfsmount dropped if our symlink
had been mounted on something. Currently it worls only because we never
get such situation (modulo filesystem playing dirty tricks on us). And
it obfuscates already convoluted logics...
b) same goes for open_namei().
c) in __link_path_walk() we have another "it should never happen" sloppiness -
out_dput: there does double-free on underlying vfsmount and leaks the covering
one if we hit it just after crossing a mountpoint. Again, wrong vfsmount
getting dropped.
d) another too-early-mntput() race - in do_follow_mount() we need to postpone
conditional mntput(path->mnt) until after dput(path->dentry). Again, this one
happens only in it-currently-never-happens-unless-some-fs-plays-dirty
scenario...
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 6 Jun 2005 20:36:08 +0000 (13:36 -0700)]
[PATCH] namei fixes (13/19)
In open_namei() exit_dput: we have mntput() done in the wrong order -
if nd->mnt != path.mnt we end up doing
mntput(nd->mnt);
nd->mnt = path.mnt;
dput(nd->dentry);
mntput(nd->mnt);
which drops nd->dentry too late. Fixed by having path.mnt go first.
That allows to switch O_NOFOLLOW under if (__follow_mount(...)) back
to exit_dput, while we are at it.
Fix for early-mntput() race + equivalent transformation.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 6 Jun 2005 20:36:06 +0000 (13:36 -0700)]
[PATCH] namei fixes (10/19)
In open_namei(), __follow_down() loop turned into __follow_mount().
Instead of
if we are on a mountpoint dentry
if O_NOFOLLOW checks fail
drop path.dentry
drop nd
return
do equivalent of follow_mount(&path.mnt, &path.dentry)
nd->mnt = path.mnt
we do
if __follow_mount(path) had, indeed, traversed mountpoint
/* now both nd->mnt and path.mnt are pinned down */
if O_NOFOLLOW checks fail
drop path.dentry
drop path.mnt
drop nd
return
mntput(nd->mnt)
nd->mnt = path.mnt
Now __follow_down() can be folded into follow_down() - no other callers left.
We need to reorder dput()/mntput() there - same problem as in follow_mount().
Equivalent transformation + fix for a bug in O_NOFOLLOW handling - we used to
get -ELOOP if we had the same fs mounted on /foo and /bar, had something bound
on /bar/baz and tried to open /foo/baz with O_NOFOLLOW. And fix of
too-early-mntput() race in follow_down()
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 6 Jun 2005 20:36:05 +0000 (13:36 -0700)]
[PATCH] namei fixes (9/19)
New helper: __follow_mount(struct path *path). Same as follow_mount(), except
that we do *not* do mntput() after the first lookup_mnt().
IOW, original path->mnt stays pinned down. We also take care to do dput()
before mntput() in the loop body (follow_mount() also needs that reordering,
but that will be done later in the series).
The following are equivalent, assuming that path.mnt == x:
(1)
follow_mount(&path.mnt, &path.dentry)
(2)
__follow_mount(&path);
if (path->mnt != x)
mntput(x);
(3)
if (__follow_mount(&path))
mntput(x);
Callers of follow_mount() in __link_path_walk() converted to (2).
Equivalent transformation + fix for too-late-mntput() race in __follow_mount()
loop.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 6 Jun 2005 20:36:04 +0000 (13:36 -0700)]
[PATCH] namei fixes (8/19)
In open_namei() we never use path.mnt or path.dentry after exit: or ok:.
Assignment of path.dentry in case of LAST_BIND is dead code and only
obfuscates already convoluted function; assignment of path.mnt after
__do_follow_link() can be moved down to the place where we set path.dentry.
Obviously equivalent transformations, just to clean the air a bit in that
region.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Equivalent transformations - the reason why we have that mntget() is that
__do_follow_link() can drop a reference to nd->mnt and that's what holds
path->mnt. So that call can happen at any point prior to __do_follow_link()
touching nd->mnt. The rest is obvious.
NOTE: current tree relies on symlinks *never* being mounted on anything. It's
not hard to get rid of that assumption (actually, that will come for free
later in the series). For now we are just not making the situation worse than
it is.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 6 Jun 2005 20:36:01 +0000 (13:36 -0700)]
[PATCH] namei fixes (5/19)
fix for too early mntput() in open_namei() - we pin path.mnt down for the
duration of __do_follow_link(). Otherwise we could get the fs where our
symlink lived unmounted while we were in __do_follow_link(). That would end
up with dentry of symlink staying pinned down through the fs shutdown.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 6 Jun 2005 20:36:01 +0000 (13:36 -0700)]
[PATCH] namei fixes (4/19)
path.mnt in open_namei() set to mirror nd->mnt.
nd->mnt is set in 3 places in that function - path_lookup() in the beginning,
__follow_down() loop after do_last: and __do_follow_link() call after
do_link:.
We set path.mnt to nd->mnt after path_lookup() and __do_follow_link(). In
__follow_down() loop we use &path.mnt instead of &nd->mnt and set nd->mnt to
path.mnt immediately after that loop.
Obviously equivalent transformation.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 6 Jun 2005 20:35:58 +0000 (13:35 -0700)]
[PATCH] namei fixes
OK, here comes a patch series that hopefully should close all
too-early-mntput() races in fs/namei.c. Entire area is convoluted as hell, so
I'm splitting that series into _very_ small chunks.
Patches alread in the tree close only (very wide) races in following symlinks
(see "busy inodes after umount" thread some time ago). Unfortunately, quite a
few narrower races of the same nature were not closed. Hopefully this should
take care of all of them.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kumar Gala [Mon, 6 Jun 2005 20:35:57 +0000 (13:35 -0700)]
[PATCH] ppc32: Fix incorrect CPU_FTR fixup usage for unified caches
Runtime feature support for unified caches was testing a userland feature
flag (PPC_FEATURE_UNIFIED_CACHE) instead of a cpu feature flag
(CPU_FTR_SPLIT_ID_CACHE). Luckily the current defined bit mask for cpu
features and userland features do not overlap so this only causes an issue
on machines with a unified cache, which is extremely rare on PPC today.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>