[XFS] There are a few problems with the new
xfs_bmap_search_multi_extents() wrapper function that I introduced in mod
xfs-linux:xfs-kern:207393a. The function was added as a wrapper around
xfs_bmap_do_search_extents() to avoid breaking the top-of-tree CXFS
interface. The idea of the function was basically to extract the target
extent buffer (if muli- level extent allocation mode), then call
xfs_bmap_do_search_extents() with either a pointer to the first extent in
the target buffer or a pointer to the first extent in the file, depending
on which extent mode was being used. However, in addition to locating the
target extent record for block bno, xfs_bmap_do_search_extents() also sets
four parameters needed by the caller: *lastx, *eofp, *gotp, *prevp.
Passing only the target extent buffer to xfs_bmap_do_search_extents()
causes *eofp to be set incorrectly if the extent is at the end of the
target list but there are actually more extents in the next er_extbuf.
Likewise, if the extent is the first one in the buffer but NOT the first
in the file, *prevp is incorrectly set to NULL. Adding the needed
functionality to xfs_bmap_search_multi_extents() to re-set any incorrectly
set fields is redundant and makes the call to xfs_bmap_do_search_extents()
not make much sense when multi-level extent allocation mode is being used.
This mod basically extracts the two functional components from
xfs_bmap_do_search_extents(), with the intent of obsoleting/removing
xfs_bmap_do_search_extents() after the CXFS mult-level in-core extent
changes are checked in. The two components are: 1) The binary search to
locate the target extent record, and 2) Setting the four parameters needed
by the caller (*lastx, *eofp, *gotp, *prevp). Component 1: I created a
new function in xfs_inode.c called xfs_iext_bno_to_ext(), which executes
the binary search to find the target extent record.
xfs_bmap_search_multi_extents() has been modified to call
xfs_iext_bno_to_ext() rather than xfs_bmap_do_search_extents(). Component
2: The parameter setting functionality has been added to
xfs_bmap_search_multi_extents(), eliminating the need for
xfs_bmap_do_search_extents(). These changes make the removal of
xfs_bmap_do_search_extents() trival once the CXFS changes are in place.
They also allow us to maintain the current XFS interface, using the new
search function introduced in mod xfs-linux:xfs-kern:207393a.
Nathan Scott [Tue, 14 Mar 2006 02:33:36 +0000 (13:33 +1100)]
[XFS] Dynamically allocate vattr in places it makes sense to do so, to
reduce stack use. Also re-use vattr in some places so that multiple
copies are not held on-stack.
[XFS] 929045 567344 This mod introduces multi-level in-core file extent
functionality, building upon the new layout introduced in mod
xfs-linux:xfs-kern:207390a. The new multi-level extent allocations are
only required for heavily fragmented files, so the old-style linear extent
list is used on files until the extents reach a pre-determined size of 4k.
4k buffers are used because this is the system page size on Linux i386 and
systems with larger page sizes don't seem to gain much, if anything, by
using their native page size as the extent buffer size. Also, using 4k
extent buffers everywhere provides a consistent interface for CXFS across
different platforms. The 4k extent buffers are managed by an indirection
array (xfs_ext_irec_t) which is basically just a pointer array with a bit
of extra information to keep track of the number of extents in each buffer
as well as the extent offset of each buffer. Major changes include: -
Add multi-level in-core file extent functionality to the xfs_iext_
subroutines introduced in mod: xfs-linux:xfs-kern:207390a - Introduce 13
new subroutines which add functionality for multi-level in-core file
extents: xfs_iext_add_indirect_multi()
xfs_iext_remove_indirect() xfs_iext_realloc_indirect()
xfs_iext_indirect_to_direct() xfs_iext_bno_to_irec()
xfs_iext_idx_to_irec() xfs_iext_irec_init()
xfs_iext_irec_new() xfs_iext_irec_remove()
xfs_iext_irec_compact() xfs_iext_irec_compact_pages()
xfs_iext_irec_compact_full() xfs_iext_irec_update_extoffs()
[XFS] 929045 567344 This mod re-organizes some of the in-core file extent
code to prepare for an upcoming mod which will introduce multi-level
in-core extent allocations. Although the in-core extent management is
using a new code path in this mod, the functionality remains the same.
Major changes include: - Introduce 10 new subroutines which re-orgainze
the existing code but do NOT change functionality:
xfs_iext_get_ext() xfs_iext_insert() xfs_iext_add()
xfs_iext_remove() xfs_iext_remove_inline()
xfs_iext_remove_direct() xfs_iext_realloc_direct()
xfs_iext_direct_to_inline() xfs_iext_inline_to_direct()
xfs_iext_destroy() - Remove 2 subroutines (functionality moved to new
subroutines above): xfs_iext_realloc() -replaced by xfs_iext_add()
and xfs_iext_remove() xfs_bmap_insert_exlist() - replaced by
xfs_iext_insert() xfs_bmap_delete_exlist() - replaced by
xfs_iext_remove() - Replace all hard-coded (indexed) extent assignments
with a call to xfs_iext_get_ext() - Replace all extent record pointer
arithmetic (ep++, ep--, base + lastx,..) with calls to
xfs_iext_get_ext() - Update comments to remove the idea of a single
"extent list" and introduce "extent record" terminology instead
David Chinner [Tue, 14 Mar 2006 02:29:16 +0000 (13:29 +1100)]
[XFS] using a spinlock per cpu for superblock counter exclusion results in
a preēmpt counter overflow at 256p and above. Change the exclusion
mechanism to use atomic bit operations and busy wait loops to emulate the
spin lock exclusion mechanism but without the preempt count issues.
David Chinner [Tue, 14 Mar 2006 02:23:52 +0000 (13:23 +1100)]
[XFS] Add support for hotplug CPUs to the per-CPU superblock counters by
registering a notifier callback that listens to CPU up/down events to
modify the counters appropriately.
Nathan Scott [Tue, 14 Mar 2006 02:20:13 +0000 (13:20 +1100)]
[XFS] When compiling with gcc 4.0 and CONFIG_SMP unset, there are many
warnings along the lines: xfs_linux.h:103:5: warning: "CONFIG_SMP" is not
defined.
David Chinner [Tue, 14 Mar 2006 02:13:09 +0000 (13:13 +1100)]
[XFS] On machines with more than 8 cpus, when running parallel I/O
threads, the incore superblock lock becomes the limiting factor for
buffered write throughput. Make the contended fields in the incore
superblock use per-cpu counters so that there is no global lock to limit
scalability.
Nathan Scott [Tue, 14 Mar 2006 02:05:30 +0000 (13:05 +1100)]
[XFS] XFS propagates MS_NOATIME through two levels internally but doesn't
actually use it. Kill this dead code. Signed-off-by: Christoph Hellwig
<hch@lst.de>
David Chinner [Tue, 14 Mar 2006 02:02:13 +0000 (13:02 +1100)]
[XFS] find_exported_dentry(). XFS does not need to use this symbol as it
is provided by a vector through the superblock export operations when the
filesystem is exported by NFS. The fix is to call that vector instead of
using the exported symbol directly.
Herbert Xu [Mon, 13 Mar 2006 22:26:12 +0000 (14:26 -0800)]
[TCP]: Fix zero port problem in IPv6
When we link a socket into the hash table, we need to make sure that we
set the num/port fields so that it shows us with a non-zero port value
in proc/netlink and on the wire. This code and comment is copied over
from the IPv4 stack as is.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Andi Kleen [Sun, 12 Mar 2006 22:52:59 +0000 (23:52 +0100)]
[PATCH] x86-64: Fix up handling of non canonical user RIPs
EM64T CPUs have somewhat weird error reporting for non canonical RIPs in
SYSRET.
We can't handle any exceptions there because the exception handler would
end up running on the user stack which is unsafe.
To avoid problems any code that might end up with a user touched pt_regs
should return using int_ret_from_syscall. int_ret_from_syscall ends up
using IRET, which allows safe exceptions.
Linus Torvalds [Sun, 12 Mar 2006 22:56:02 +0000 (14:56 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] iwmmxt thread state alignment
[ARM] 3350/1: Enable 1-wire on ARM
[ARM] 3356/1: Workaround for the ARM1136 I-cache invalidation problem
[ARM] 3355/1: NSLU2: remove propmt depends
[ARM] 3354/1: NAS100d: fix power led handling
[ARM] Fix muldi3.S
Russell King [Sun, 12 Mar 2006 22:36:06 +0000 (22:36 +0000)]
[ARM] iwmmxt thread state alignment
This patch removes the reliance of iwmmxt on hand coded alignments.
Since thread_info is always 8K aligned, specifying that fpstate is
8-byte aligned achieves the same effect without needing to resort
to hand coded alignments.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Gregor Maier [Sun, 12 Mar 2006 02:51:25 +0000 (18:51 -0800)]
[NETFILTER]: Fix wrong option spelling in Makefile for CONFIG_BRIDGE_EBT_ULOG
Signed-off-by: Gregor Maier <gregor@net.in.tum.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Brian Haley [Sun, 12 Mar 2006 02:50:14 +0000 (18:50 -0800)]
[IPV6]: fix ipv6_saddr_score struct element
The scope element in the ipv6_saddr_score struct used in
ipv6_dev_get_saddr() is an unsigned integer, but __ipv6_addr_src_scope()
returns a signed integer (and can return -1).
Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Wed, 8 Mar 2006 08:06:33 +0000 (00:06 -0800)]
[PATCH] de620: fix section mismatch warning
In latest -mm de620 gave following warning:
WARNING: drivers/net/de620.o - Section mismatch: reference to \
.init.text:de620_probe from .text between 'init_module' (at offset \
0x1682) and 'cleanup_module'
init_module() call de620_probe() which is declared __init.
Fix is to declare init_module() __init too.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jon Mason [Fri, 10 Mar 2006 21:12:10 +0000 (15:12 -0600)]
[PATCH] dl2k: DMA freeing error
This patch fixes an error in the dl2k driver's DMA mapping/unmapping.
The adapter uses the upper 16bits of the DMA address for the buffer
size. However, this is not masked off when referencing the DMA
address, and can lead to errors by trying to free a DMA address out of
range.
Thanks,
Jon
Signed-off-by: Jon Mason <jdmason@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
David S. Miller [Sat, 11 Mar 2006 02:08:09 +0000 (18:08 -0800)]
[PATCH] Wrong return value corrupts free object in e1000 driver
For some reason, E1000's ->hard_start_xmit() routine returns -EFAULT
instead of one of the NETDEV_TX_* error codes. In fact, it frees up
the SKB before returning this. This makes the queueing layer think
the packet should be requeued and subsequently we corrupt a freed
object.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jeff Garzik <jeff@garzik.org>
The patch '[PATCH] RCU signal handling' [1] added an export for
__put_task_struct_cb, a put_task_struct helper newly introduced in that
patch. But the put_task_struct couldn't be used modular previously as
__put_task_struct wasn't exported. There are not callers of it in modular
code, and it shouldn't be exported because we don't want drivers to hold
references to task_structs.
This patch removes the export and folds __put_task_struct into
__put_task_struct_cb as there's no other caller.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Stephen Smalley [Sat, 11 Mar 2006 11:27:16 +0000 (03:27 -0800)]
[PATCH] selinux: tracer SID fix
Fix SELinux to not reset the tracer SID when the child is already being
traced, since selinux_ptrace is also called by proc for access checking
outside of the context of a ptrace attach.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Arjan van de Ven [Sat, 11 Mar 2006 11:27:15 +0000 (03:27 -0800)]
[PATCH] edac: disable a few sysfs files to avoid them becoming an ABI
Disable (via ugly #if 0's) the 3 sysfs files that I think by now we all
agree are very much wrong. These files shouldn't become part of the ABI by
the 2.6.16 release, so I rather have this minimal patch merged to disable
them for now, the real fix can then come during the 2.6.17 devel window.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Badari Pulavarty [Sat, 11 Mar 2006 11:27:14 +0000 (03:27 -0800)]
[PATCH] ext3: fix nobh mode for chattr +j inodes
One can do "chattr +j" on a file to change its journalling mode. Fix
writeback mode with "nobh" handling for it.
Even though, we mount ext3 filesystem in writeback mode with "nobh" option,
some one can do "chattr +j" on a single file to force it to do journalled
mode. In order to do journaling, ext3_block_truncate_page() need to
fallback to default case of creating buffers and adding them to transaction
etc.
Kirill Korotaev [Sat, 11 Mar 2006 11:27:13 +0000 (03:27 -0800)]
[PATCH] ext3: ext3_symlink should use GFP_NOFS allocations inside
This patch fixes illegal __GFP_FS allocation inside ext3 transaction in
ext3_symlink(). Such allocation may re-enter ext3 code from
try_to_free_pages. But JBD/ext3 code keeps a pointer to current journal
handle in task_struct and, hence, is not reentrable.
This bug led to "Assertion failure in journal_dirty_metadata()" messages.
Dmitry Torokhov [Sat, 11 Mar 2006 05:23:38 +0000 (00:23 -0500)]
[PATCH] Input: psmouse - disable autoresync
Automatic resynchronization in psmouse driver causes problems on some
hardware so disable it by default for now. People with KVM switches
that require resync can still enable it via module parameter or sysfs
attribute.
Jan Beulich [Wed, 22 Feb 2006 12:29:04 +0000 (13:29 +0100)]
[PATCH] kbuild: version.h should depend on .kernelrelease
Rebuilding a previously built tree while using make's -j option from
time to time results in the version.h check running at the same time as
the updating of .kernelrelease, resulting in UTS_RELEASE remaining an
empty string (and as a side effect causing the entire kernel to be
rebuilt).
Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Catalin Marinas [Fri, 10 Mar 2006 22:26:47 +0000 (22:26 +0000)]
[ARM] 3356/1: Workaround for the ARM1136 I-cache invalidation problem
Patch from Catalin Marinas
ARM1136 erratum 371025 (category 2) specifies that, under rare
conditions, an invalidate I-cache by MVA (line or range) operation can
fail to invalidate a cache line. The recommended workaround is to
either invalidate the entire I-cache or invalidate the range by
set/way rather than MVA.
Note that for a 16K cache size, invalidating a 4K page by set/way is
equivalent to invalidating the entire I-cache.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>