Ram Pai [Mon, 7 Nov 2005 22:21:20 +0000 (17:21 -0500)]
[PATCH] unbindable mounts
An unbindable mount does not forward or receive propagation. In addition,
an unbindable mount cannot be bind mounted. The semantics are as follows.
Bind semantics:
It is invalid to bind mount an unbindable mount.
Move semantics:
It is invalid to move an unbindable mount under a shared mount.
Clone-namespace semantics:
If a mount is unbindable in the parent namespace, the corresponding
cloned mount in the child namespace becomes unbindable too. Note:
there is a subtle difference: unbindable mounts cannot be bind mounted,
but they can be cloned during clone-namespace.
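The three rules above can be sketched as a small userspace model. This is only an illustration of the stated semantics, not the kernel code; the enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical propagation types; the names are illustrative only. */
enum prop_type { PROP_PRIVATE, PROP_SHARED, PROP_SLAVE, PROP_UNBINDABLE };

/* Bind: an unbindable mount may not be the source of a bind mount. */
static bool may_bind(enum prop_type src)
{
    return src != PROP_UNBINDABLE;
}

/* Move: an unbindable mount may not be moved under a shared mount. */
static bool may_move(enum prop_type src, enum prop_type dest_parent)
{
    return !(src == PROP_UNBINDABLE && dest_parent == PROP_SHARED);
}

/*
 * Clone-namespace: the copy in the child namespace keeps the type of the
 * original, so an unbindable mount is still cloned and stays unbindable.
 */
static enum prop_type clone_ns_type(enum prop_type src)
{
    return src;
}
```

Note that may_bind() rejects the unbindable mount while clone_ns_type() accepts it, which is the subtle difference the note points out.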
Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ram Pai [Mon, 7 Nov 2005 22:21:01 +0000 (17:21 -0500)]
[PATCH] handling of slave mounts
This makes bind, rbind, move, clone namespace and umount operations
aware of the semantics of slave mounts (see Documentation/sharedsubtree.txt
in the last patch of the series for a detailed description).
Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ram Pai [Mon, 7 Nov 2005 22:20:48 +0000 (17:20 -0500)]
[PATCH] introduce slave mounts
A slave mount always has a master mount from which it receives
mount/umount events. Unlike with a shared mount, event propagation does not
flow from the slave mount to the master.
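A toy model of this one-way flow, assuming a fixed-size slave table and an event counter per mount (struct toy_mount and both functions are hypothetical, not kernel names):

```c
#include <stddef.h>

#define TOY_MAX_SLAVES 4

/* Hypothetical toy mount: counts the mount/umount events it has seen. */
struct toy_mount {
    struct toy_mount *master;                 /* NULL unless this is a slave */
    struct toy_mount *slaves[TOY_MAX_SLAVES]; /* mounts that receive from us */
    size_t nr_slaves;
    int events;
};

/* Make 'slave' receive events from 'master'. Returns 0, or -1 if full. */
static int toy_set_slave(struct toy_mount *slave, struct toy_mount *master)
{
    if (master->nr_slaves >= TOY_MAX_SLAVES)
        return -1;
    master->slaves[master->nr_slaves++] = slave;
    slave->master = master;
    return 0;
}

/* Deliver an event to m and forward it to its slaves, never to its master. */
static void toy_mount_event(struct toy_mount *m)
{
    m->events++;
    for (size_t i = 0; i < m->nr_slaves; i++)
        toy_mount_event(m->slaves[i]);
}
```

An event raised on the master reaches the slave, while an event raised on the slave stays local.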
Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ram Pai [Mon, 7 Nov 2005 22:20:03 +0000 (17:20 -0500)]
[PATCH] shared mounts handling: move
Implement handling of mount --move in the presence of shared mounts (see
Documentation/sharedsubtree.txt at the end of the patch series for a detailed
description).
Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ram Pai [Mon, 7 Nov 2005 22:19:33 +0000 (17:19 -0500)]
[PATCH] introduce shared mounts
This creates shared mounts. When a shared mount is bind-mounted to some
mountpoint, the two mounts propagate mount/umount events to each other.
All the shared mounts that propagate events to each other belong to the
same peer-group.
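A peer-group can be pictured as a ring of mounts in which an event on any member reaches every member. The sketch below is a userspace illustration under that assumption; struct toy_peer and its helpers are hypothetical names.

```c
/* Hypothetical peer: a member of a circular singly linked peer-group. */
struct toy_peer {
    struct toy_peer *next;   /* next member of the same peer-group */
    int events;              /* mount/umount events seen so far */
};

/* A freshly shared mount starts as the only member of its own group. */
static void toy_peer_init(struct toy_peer *p)
{
    p->next = p;
    p->events = 0;
}

/* Bind-mounting a shared mount adds the new mount to the same group. */
static void toy_peer_join(struct toy_peer *member, struct toy_peer *newcomer)
{
    newcomer->next = member->next;
    member->next = newcomer;
}

/* An event on any peer is propagated once to every peer in the ring. */
static void toy_peer_event(struct toy_peer *origin)
{
    struct toy_peer *p = origin;

    do {
        p->events++;
        p = p->next;
    } while (p != origin);
}
```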
Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ram Pai [Mon, 7 Nov 2005 22:17:22 +0000 (17:17 -0500)]
[PATCH] mount expiry fixes
- clean up the ugliness in may_umount_tree()
- fix a bug in do_loopback(): after cloning a tree, do_loopback()
unlinks only the topmost mount of the cloned tree, leaving the child
mounts behind on their corresponding expiry lists.
Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ram Pai [Mon, 7 Nov 2005 22:17:04 +0000 (17:17 -0500)]
[PATCH] umount_tree() locking change
umount is done under the protection of the namespace semaphore. This
can lead to interesting deadlocks when the last reference to a mount is
released, if filesystem code is in a sufficiently nasty state.
This collects all the to-be-released mounts and releases them after
releasing the namespace semaphore. That both reduces the time we are
holding the namespace semaphore and makes things more robust.
Idea proposed by Al Viro.
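The collect-then-release pattern described above might look roughly like this in userspace. It is a sketch only: a pthread mutex stands in for the namespace semaphore, and every toy_ name is hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

struct toy_mnt {
    struct toy_mnt *next;   /* link within whichever list holds the mount */
    int released;           /* set once the final release has run */
};

static pthread_mutex_t toy_ns_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_mnt *toy_ns_mounts;   /* protected by toy_ns_lock */

static void toy_add_mount(struct toy_mnt *m)
{
    pthread_mutex_lock(&toy_ns_lock);
    m->next = toy_ns_mounts;
    toy_ns_mounts = m;
    pthread_mutex_unlock(&toy_ns_lock);
}

/* Stand-in for dropping the last reference; may be slow or take locks. */
static void toy_release(struct toy_mnt *m)
{
    m->released = 1;
}

/* Detach all mounts under the lock, release them only after unlocking. */
static size_t toy_umount_all(void)
{
    struct toy_mnt *kill_list, *m;
    size_t n = 0;

    pthread_mutex_lock(&toy_ns_lock);
    kill_list = toy_ns_mounts;   /* collect everything while locked */
    toy_ns_mounts = NULL;
    pthread_mutex_unlock(&toy_ns_lock);

    while ((m = kill_list) != NULL) {   /* lock is no longer held here */
        kill_list = m->next;
        m->next = NULL;
        toy_release(m);
        n++;
    }
    return n;
}
```

Because toy_release() runs without the lock, a release path that needs the same lock (or sleeps) cannot deadlock against the caller.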
Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 7 Nov 2005 22:15:34 +0000 (17:15 -0500)]
[PATCH] allow callers of seq_open to do allocation themselves
Allow caller of seq_open() to kmalloc() seq_file + whatever else they
want and set ->private_data to it. seq_open() will then abstain from
doing allocation itself.
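In outline, the open helper only allocates when ->private_data is still unset. The following userspace sketch illustrates that idea with malloc() in place of kmalloc(); the toy_ types are hypothetical stand-ins, not the kernel structures.

```c
#include <stdlib.h>
#include <string.h>

struct toy_seq_file { const void *ops; };
struct toy_file { void *private_data; };

/* Returns 0 on success, -1 if the helper had to allocate and failed. */
static int toy_seq_open(struct toy_file *file, const void *ops)
{
    struct toy_seq_file *p = file->private_data;

    if (!p) {                       /* caller did not preallocate */
        p = malloc(sizeof(*p));
        if (!p)
            return -1;
        file->private_data = p;
    }
    memset(p, 0, sizeof(*p));       /* only the seq_file part is reset */
    p->ops = ops;
    return 0;
}

/*
 * A caller that wants extra state embeds the seq_file as the first member,
 * allocates the whole thing and stores it in ->private_data before opening.
 */
struct toy_mounts_state {
    struct toy_seq_file seq;        /* must be first */
    int cursor;                     /* hypothetical caller-private field */
};
```

Keeping the embedded structure first means the same pointer is valid both as the seq_file and as the caller's larger object.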
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 7 Nov 2005 22:15:04 +0000 (17:15 -0500)]
[PATCH] cleanups and bug fix in do_loopback()
- check_mnt() on the source of binding should've been unconditional
from the very beginning. My fault - as far as I could trace it,
that's an old thinko made back in 2001. Kudos to Miklos for spotting
it...
Fixed.
- code cleaned up.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 7 Nov 2005 22:13:39 +0000 (17:13 -0500)]
[PATCH] saner handling of auto_acct_off() and DQUOT_OFF() in umount
The way we currently deal with quota and process accounting that might
keep vfsmount busy at umount time is inherently broken; we try to turn
them off just in case (not quite correctly, at that) and
a) pray umount doesn't fail (otherwise they'll stay turned off)
b) pray nobody does anything funny just as we turn quota off
Moreover, LSM provides hooks for doing the same sort of broken logic.
The proper way to deal with that is to introduce the second kind of
reference to vfsmount. Semantics:
- when the last normal reference is dropped, all special ones are
converted to normal ones and if there had been any, cleanup is done.
- normal reference can be cloned into a special one
- special reference can be converted to normal one; that's a no-op if
we'd already passed the point of no return (i.e. mntput() had
converted special references to normal and started cleanup).
The way it works: e.g. starting process accounting converts the vfsmount
reference pinned by the opened file into a special one and turns it back
to normal when it gets shut down; acct_auto_close() is done when no
normal references are left. That way it does *not* obstruct umount(2)
and it silently gets turned off when the last normal reference to
vfsmount is gone. Which is exactly what we want...
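The reference semantics listed above can be modelled with two counters. This is a simplified userspace sketch, not the kernel implementation: there is no locking, the shutdown hook is a stand-in for acct_auto_close(), and all toy_ names are hypothetical.

```c
struct toy_vfsmount {
    int normal;      /* normal references */
    int special;     /* special references, e.g. held by accounting */
    int hooks_run;   /* how often the shutdown hook was invoked */
    int destroyed;   /* set when the final cleanup happens */
};

/* A normal reference can be cloned into a special one. */
static void toy_mnt_pin(struct toy_vfsmount *m)
{
    m->special++;
}

/* Special -> normal; a no-op once toy_mntput() has converted them all. */
static void toy_mnt_unpin(struct toy_vfsmount *m)
{
    if (m->special > 0) {
        m->special--;
        m->normal++;
    }
}

static void toy_mntput(struct toy_vfsmount *m);

/* Stand-in hook: each former special holder drops its now-normal ref. */
static void toy_shutdown_special_users(struct toy_vfsmount *m, int holders)
{
    m->hooks_run++;
    while (holders-- > 0)
        toy_mntput(m);
}

/* Drop a normal reference. */
static void toy_mntput(struct toy_vfsmount *m)
{
    if (--m->normal > 0)
        return;
    if (m->special == 0) {
        m->destroyed = 1;          /* nothing special left: final cleanup */
        return;
    }
    /* Last normal ref gone: convert special refs, keep one temporary ref. */
    int holders = m->special;
    m->normal = holders + 1;
    m->special = 0;
    toy_shutdown_special_users(m, holders);
    toy_mntput(m);                 /* drop the temporary reference */
}
```

In this model a special reference never keeps the mount alive on its own, so it does not obstruct umount; it only guarantees that the holder is shut down before the final cleanup.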
The same should be done by an LSM module that holds some internal
references to vfsmount and wants to shut them down on umount - it should
make them special and security_sb_umount_close() will be called exactly
when the last normal reference to vfsmount is gone.
Quota handling is even simpler - we don't use normal file IO anymore, so
there's no need to hold vfsmounts at all. DQUOT_OFF() is done from
deactivate_super(), where it really belongs.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] ppc64: Fix the lazy icache/dcache code for non-RAM pages
For some stupid reason I can't explain (brown paper bag is at hand), I
removed the pfn_valid() check in the code that does the icache/dcache
coherency on POWER4 and later. That causes us to eventually try to
access a non-existent struct page when hashing in IO pages.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Mike Kravetz [Mon, 7 Nov 2005 21:48:59 +0000 (13:48 -0800)]
[PATCH] Memory Add Fixes for ppc64
On Tue, Nov 08, 2005 at 08:12:56AM +1100, Benjamin Herrenschmidt wrote:
> Yes, the MAX_ORDER should be different indeed. But can Kconfig do that ?
> That is have the default value be different based on a Kconfig option ?
> I don't see that ... We may have to do things differently here...
This seems to be done in other parts of the Kconfig file. Using those
as an example, this should keep the MAX_ORDER block size at 16MB.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
It turns out the section ordering of the linker script is different on
ppc32 and ppc64, so just count data as _edata - _sdata which should work
on both.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[PATCH] ppc64: Thermal control for SMU based machines
This adds a new thermal control framework for PowerMac, along with the
implementation for PowerMac8,1, PowerMac8,2 (iMac G5 rev 1 and 2), and
PowerMac9,1 (latest single CPU desktop). In the future, I expect to move
the older G5 thermal control to the new framework as well.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[PATCH] ppc64: Update g5_defconfig for ARCH=powerpc
This patch updates g5_defconfig for ARCH=powerpc in order to add the SMU
support & thermal drivers to it, the pmac sound driver (works on some
G5s) and replaces rivafb with nvidiafb which works better for the cards
found in G5 based machines.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds the ability to the SMU driver to recover missing
calibration partitions from the SMU chip itself. It also adds a
dynamic mechanism to /proc/device-tree so that new properties are visible
to userland.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
David Woodhouse [Wed, 2 Nov 2005 22:34:20 +0000 (22:34 +0000)]
[PATCH] powerpc: Fix ppc32 initrd
OK, the Fedora ppc32 and ppc64 kernels should both be arch/powerpc by
tomorrow. They're booting on G5, POWER5, and my powerbook. I'll test
pmac SMP and Pegasos later -- but pmac smp is known broken in arch/ppc
anyway, and I'll live with a potential Pegasos regression for now; it
wasn't supported officially in FC4 either.
I needed to fix ppc32 initrd -- we were never setting initrd_start.
Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Implement a compat_ioctl handler in the driver instead of having table
entries in sparc64 ioctl32.c (I plan to get rid of the arch ioctl32.c
file eventually).
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Hugh Dickins [Mon, 7 Nov 2005 22:12:08 +0000 (14:12 -0800)]
[SPARC64] mm: simpler tlb_flush_mmu
Minor simplification to the sparc64 tlb_flush_mmu: if tlb_remove_page
sets need_flush only after handling the tlb_fast_mode case, then
tlb_flush_mmu need not consider whether it's tlb_fast_mode.
Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Nov 2005 22:10:10 +0000 (14:10 -0800)]
[SPARC64]: Kill off dummy_tick_ops.
It only serves to generate false-positive buildcheck warnings.
Just set it initially to tick_operations which uses the v9
%tick register which every sparc64 processor has.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Nov 2005 22:09:58 +0000 (14:09 -0800)]
[SPARC64] mm: Do not flush TLB mm in tlb_finish_mmu()
It isn't needed any longer, as noted by Hugh Dickins.
We still need the flush routines, due to the one remaining
call site in hugetlb_prefault_arch_hook(). That can be
eliminated at some later point, however.
Signed-off-by: David S. Miller <davem@davemloft.net>
Hugh Dickins [Mon, 7 Nov 2005 22:09:01 +0000 (14:09 -0800)]
[SPARC64] mm: context switch ptlock
sparc64 is unique among architectures in taking the page_table_lock in
its context switch (well, cris does too, but erroneously, and it's not
yet SMP anyway).
This seems to be a private affair between switch_mm and activate_mm,
using page_table_lock as a per-mm lock, without any relation to its uses
elsewhere. That's fine, but comment it as such; and unlock sooner in
switch_mm, more like in activate_mm (preemption is disabled here).
There is a block of "if (0)"ed code in smp_flush_tlb_pending which would
have liked to rely on the page_table_lock, in switch_mm and elsewhere;
but its comment explains how dup_mmap's flush_tlb_mm defeated it. And
though that could have been changed at any time over the past few years,
now the chance vanishes as we push the page_table_lock downwards, and
perhaps split it per page table page. Just delete that block of code.
Which leaves the mysterious spin_unlock_wait(&oldmm->page_table_lock)
in kernel/fork.c copy_mm. Textual analysis (supported by Nick Piggin)
suggests that the comment was written by DaveM, and that it relates to
the defeated approach in the sparc64 smp_flush_tlb_pending. Just delete
this block too.
Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hugh Dickins [Mon, 7 Nov 2005 22:08:46 +0000 (14:08 -0800)]
[SPARC64] mm: don't re-evaluate *ptep
sparc64 prom_callback and new_setup_frame32 each operate on a user page
table without holding a lock, and no doubt they've good reason. But I'd
feel more confident if they were to do a "pte = *ptep" and then operate
on pte, rather than re-evaluating *ptep.
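The "pte = *ptep" idiom amounts to taking one snapshot of the entry and deciding everything from that copy. A minimal sketch, assuming a made-up entry layout (the toy_ type, the present bit and the shift are hypothetical, not sparc64 definitions):

```c
#include <stdint.h>

typedef uint64_t toy_pte_t;

#define TOY_PTE_PRESENT   0x1u
#define TOY_PTE_PFN_SHIFT 12

/* Returns the page frame number, or -1 if the entry is not present. */
static int64_t toy_pte_to_pfn(const volatile toy_pte_t *ptep)
{
    toy_pte_t pte = *ptep;   /* read the entry exactly once */

    /* Both the check and the result use the same snapshot. */
    if (!(pte & TOY_PTE_PRESENT))
        return -1;
    return (int64_t)(pte >> TOY_PTE_PFN_SHIFT);
}
```

If the code dereferenced ptep twice instead, a concurrent update between the present check and the shift could make the two reads disagree.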
Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ARM] 3121/1: unconditionally use XCB=101 on ixp2000
Patch from Lennert Buytenhek
Since we have to use XCB=101 instead of XCB=000 on the ixp2400 to
prevent it from regularly falling over, and since we have to deal with
manual write buffer flushing because of that, we might as well use
XCB=101 on all ixp2000 platforms since it's faster than XCB=000.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[ARM] 3118/1: fix and reenable nwfpe extended precision emulation for big-endian
Patch from Lennert Buytenhek
nwfpe extended precision emulation used to be broken on big-endian
and was therefore disabled. This patch fixes nwfpe so that it copies
extended precision floats to/from userspace in the proper word order
(similar to patch #2046, see the description of that patch for an
explanation) and reenables the Kconfig option.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The routine that nwfpe uses for converting floats/doubles to
extended precision fails to zero two bytes of kernel stack. This
is not immediately obvious, as the floatx80 structure has 16 bits
of implicit padding (by design). These two bytes are copied to
userspace when an stfe is emulated, causing a possible info leak.
Make the padding explicit and zero it out in the relevant places.
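The fix can be illustrated with a structure whose padding is a named field that every constructor clears. The layout below is a hypothetical one chosen so that no implicit padding remains on common ABIs; it is not the nwfpe definition.

```c
#include <stdint.h>

/* Hypothetical layout: 16-bit sign/exponent plus a 64-bit significand. */
struct toy_floatx80 {
    uint16_t high;      /* sign and exponent */
    uint16_t padding;   /* explicit, so it can be zeroed by name */
    uint32_t low[2];    /* significand: low[0] = upper 32 bits */
};

/* Build a value with every byte defined, including the padding. */
static struct toy_floatx80 toy_make_floatx80(uint16_t sign_exp, uint64_t sig)
{
    struct toy_floatx80 z;

    z.high = sign_exp;
    z.padding = 0;      /* without this, stale stack bytes would remain */
    z.low[0] = (uint32_t)(sig >> 32);
    z.low[1] = (uint32_t)sig;
    return z;
}
```

With the padding named and cleared, copying the whole structure out byte for byte no longer exposes uninitialized stack contents.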
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Atsushi Nemoto [Wed, 2 Nov 2005 16:01:15 +0000 (01:01 +0900)]
Use rtc_lock to protect RTC operations
Many RTC routines were not protected against each other, so there are
potential races, for example, ntp-update against /dev/rtc. This patch
fixes them using rtc_lock.
Ralf Baechle [Mon, 31 Oct 2005 23:34:52 +0000 (23:34 +0000)]
VPE loader janitoring
o Switch to dynamic major
o Remove duplicate SHN_MIPS_SCOMMON definition
o Coding style: remove typedefs.
o Coding style: reorder to avoid the need for forward declarations
o Use kzalloc.