David Chinner [Tue, 14 Mar 2006 02:23:52 +0000 (13:23 +1100)]
[XFS] Add support for hotplug CPUs to the per-CPU superblock counters by
registering a notifier callback that listens to CPU up/down events to
modify the counters appropriately.
Nathan Scott [Tue, 14 Mar 2006 02:20:13 +0000 (13:20 +1100)]
[XFS] When compiling with gcc 4.0 and CONFIG_SMP unset, there are many
warnings along the lines: xfs_linux.h:103:5: warning: "CONFIG_SMP" is not
defined.
David Chinner [Tue, 14 Mar 2006 02:13:09 +0000 (13:13 +1100)]
[XFS] On machines with more than 8 cpus, when running parallel I/O
threads, the incore superblock lock becomes the limiting factor for
buffered write throughput. Make the contended fields in the incore
superblock use per-cpu counters so that there is no global lock to limit
scalability.
Nathan Scott [Tue, 14 Mar 2006 02:05:30 +0000 (13:05 +1100)]
[XFS] XFS propagates MS_NOATIME through two levels internally but doesn't
actually use it. Kill this dead code. Signed-off-by: Christoph Hellwig
<hch@lst.de>
David Chinner [Tue, 14 Mar 2006 02:02:13 +0000 (13:02 +1100)]
[XFS] find_exported_dentry(). XFS does not need to use this symbol as it
is provided by a vector through the superblock export operations when the
filesystem is exported by NFS. The fix is to call that vector instead of
using the exported symbol directly.
Herbert Xu [Mon, 13 Mar 2006 22:26:12 +0000 (14:26 -0800)]
[TCP]: Fix zero port problem in IPv6
When we link a socket into the hash table, we need to make sure that we
set the num/port fields so that it shows us with a non-zero port value
in proc/netlink and on the wire. This code and comment is copied over
from the IPv4 stack as is.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Andi Kleen [Sun, 12 Mar 2006 22:52:59 +0000 (23:52 +0100)]
[PATCH] x86-64: Fix up handling of non canonical user RIPs
EM64T CPUs have somewhat weird error reporting for non canonical RIPs in
SYSRET.
We can't handle any exceptions there because the exception handler would
end up running on the user stack which is unsafe.
To avoid problems any code that might end up with a user touched pt_regs
should return using int_ret_from_syscall. int_ret_from_syscall ends up
using IRET, which allows safe exceptions.
Linus Torvalds [Sun, 12 Mar 2006 22:56:02 +0000 (14:56 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] iwmmxt thread state alignment
[ARM] 3350/1: Enable 1-wire on ARM
[ARM] 3356/1: Workaround for the ARM1136 I-cache invalidation problem
[ARM] 3355/1: NSLU2: remove propmt depends
[ARM] 3354/1: NAS100d: fix power led handling
[ARM] Fix muldi3.S
Russell King [Sun, 12 Mar 2006 22:36:06 +0000 (22:36 +0000)]
[ARM] iwmmxt thread state alignment
This patch removes the reliance of iwmmxt on hand coded alignments.
Since thread_info is always 8K aligned, specifying that fpstate is
8-byte aligned achieves the same effect without needing to resort
to hand coded alignments.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Gregor Maier [Sun, 12 Mar 2006 02:51:25 +0000 (18:51 -0800)]
[NETFILTER]: Fix wrong option spelling in Makefile for CONFIG_BRIDGE_EBT_ULOG
Signed-off-by: Gregor Maier <gregor@net.in.tum.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Brian Haley [Sun, 12 Mar 2006 02:50:14 +0000 (18:50 -0800)]
[IPV6]: fix ipv6_saddr_score struct element
The scope element in the ipv6_saddr_score struct used in
ipv6_dev_get_saddr() is an unsigned integer, but __ipv6_addr_src_scope()
returns a signed integer (and can return -1).
Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Wed, 8 Mar 2006 08:06:33 +0000 (00:06 -0800)]
[PATCH] de620: fix section mismatch warning
In latest -mm de620 gave following warning:
WARNING: drivers/net/de620.o - Section mismatch: reference to \
.init.text:de620_probe from .text between 'init_module' (at offset \
0x1682) and 'cleanup_module'
init_module() call de620_probe() which is declared __init.
Fix is to declare init_module() __init too.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jon Mason [Fri, 10 Mar 2006 21:12:10 +0000 (15:12 -0600)]
[PATCH] dl2k: DMA freeing error
This patch fixes an error in the dl2k driver's DMA mapping/unmapping.
The adapter uses the upper 16bits of the DMA address for the buffer
size. However, this is not masked off when referencing the DMA
address, and can lead to errors by trying to free a DMA address out of
range.
Thanks,
Jon
Signed-off-by: Jon Mason <jdmason@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
David S. Miller [Sat, 11 Mar 2006 02:08:09 +0000 (18:08 -0800)]
[PATCH] Wrong return value corrupts free object in e1000 driver
For some reason, E1000's ->hard_start_xmit() routine returns -EFAULT
instead of one of the NETDEV_TX_* error codes. In fact, it frees up
the SKB before returning this. This makes the queueing layer think
the packet should be requeued and subsequently we corrupt a freed
object.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jeff Garzik <jeff@garzik.org>
The patch '[PATCH] RCU signal handling' [1] added an export for
__put_task_struct_cb, a put_task_struct helper newly introduced in that
patch. But the put_task_struct couldn't be used modular previously as
__put_task_struct wasn't exported. There are not callers of it in modular
code, and it shouldn't be exported because we don't want drivers to hold
references to task_structs.
This patch removes the export and folds __put_task_struct into
__put_task_struct_cb as there's no other caller.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Stephen Smalley [Sat, 11 Mar 2006 11:27:16 +0000 (03:27 -0800)]
[PATCH] selinux: tracer SID fix
Fix SELinux to not reset the tracer SID when the child is already being
traced, since selinux_ptrace is also called by proc for access checking
outside of the context of a ptrace attach.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Arjan van de Ven [Sat, 11 Mar 2006 11:27:15 +0000 (03:27 -0800)]
[PATCH] edac: disable a few sysfs files to avoid them becoming an ABI
Disable (via ugly #if 0's) the 3 sysfs files that I think by now we all
agree are very much wrong. These files shouldn't become part of the ABI by
the 2.6.16 release, so I rather have this minimal patch merged to disable
them for now, the real fix can then come during the 2.6.17 devel window.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Badari Pulavarty [Sat, 11 Mar 2006 11:27:14 +0000 (03:27 -0800)]
[PATCH] ext3: fix nobh mode for chattr +j inodes
One can do "chattr +j" on a file to change its journalling mode. Fix
writeback mode with "nobh" handling for it.
Even though, we mount ext3 filesystem in writeback mode with "nobh" option,
some one can do "chattr +j" on a single file to force it to do journalled
mode. In order to do journaling, ext3_block_truncate_page() need to
fallback to default case of creating buffers and adding them to transaction
etc.
Kirill Korotaev [Sat, 11 Mar 2006 11:27:13 +0000 (03:27 -0800)]
[PATCH] ext3: ext3_symlink should use GFP_NOFS allocations inside
This patch fixes illegal __GFP_FS allocation inside ext3 transaction in
ext3_symlink(). Such allocation may re-enter ext3 code from
try_to_free_pages. But JBD/ext3 code keeps a pointer to current journal
handle in task_struct and, hence, is not reentrable.
This bug led to "Assertion failure in journal_dirty_metadata()" messages.
Dmitry Torokhov [Sat, 11 Mar 2006 05:23:38 +0000 (00:23 -0500)]
[PATCH] Input: psmouse - disable autoresync
Automatic resynchronization in psmouse driver causes problems on some
hardware so disable it by default for now. People with KVM switches
that require resync can still enable it via module parameter or sysfs
attribute.
Jan Beulich [Wed, 22 Feb 2006 12:29:04 +0000 (13:29 +0100)]
[PATCH] kbuild: version.h should depend on .kernelrelease
Rebuilding a previously built tree while using make's -j option from
time to time results in the version.h check running at the same time as
the updating of .kernelrelease, resulting in UTS_RELEASE remaining an
empty string (and as a side effect causing the entire kernel to be
rebuilt).
Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Catalin Marinas [Fri, 10 Mar 2006 22:26:47 +0000 (22:26 +0000)]
[ARM] 3356/1: Workaround for the ARM1136 I-cache invalidation problem
Patch from Catalin Marinas
ARM1136 erratum 371025 (category 2) specifies that, under rare
conditions, an invalidate I-cache by MVA (line or range) operation can
fail to invalidate a cache line. The recommended workaround is to
either invalidate the entire I-cache or invalidate the range by
set/way rather than MVA.
Note that for a 16K cache size, invalidating a 4K page by set/way is
equivalent to invalidating the entire I-cache.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[PATCH] slab: Node rotor for freeing alien caches and remote per cpu pages.
The cache reaper currently tries to free all alien caches and all remote
per cpu pages in each pass of cache_reap. For a machines with large number
of nodes (such as Altix) this may lead to sporadic delays of around ~10ms.
Interrupts are disabled while reclaiming creating unacceptable delays.
This patch changes that behavior by adding a per cpu reap_node variable.
Instead of attempting to free all caches, we free only one alien cache and
the per cpu pages from one remote node. That reduces the time spend in
cache_reap. However, doing so will lengthen the time it takes to
completely drain all remote per cpu pagesets and all alien caches. The
time needed will grow with the number of nodes in the system. All caches
are drained when they overflow their respective capacity. So the drawback
here is only that a bit of memory may be wasted for awhile longer.
Details:
1. Rename drain_remote_pages to drain_node_pages to allow the specification
of the node to drain of pcp pages.
2. Add additional functions init_reap_node, next_reap_node for NUMA
that manage a per cpu reap_node counter.
3. Add a reap_alien function that reaps only from the current reap_node.
For us this seems to be a critical issue. Holdoffs of an average of ~7ms
cause some HPC benchmarks to slow down significantly. F.e. NAS parallel
slows down dramatically. NAS parallel has a 12-16 seconds runtime w/o rotor
compared to 5.8 secs with the rotor patches. It gets down to 5.05 secs with
the additional interrupt holdoff reductions.
Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently the code tries up to spin_retry times to grab a lock using the cs
instruction. The cs instruction has exclusive access to a memory region
and therefore invalidates the appropiate cache line of all other cpus. If
there is contention on a lock this leads to cache line trashing. This can
be avoided if we first check wether a cs instruction is likely to succeed
before the instruction gets actually executed.
Signed-off-by: Christian Ehrhardt <ehrhardt@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] vmscan: no zone_reclaim if PF_MALLOC is set
If the process has already set PF_MALLOC and is already using
current->reclaim_state then do not try to reclaim memory from the zone.
This is set by kswapd and/or synchrononous global reclaim which will not
take it lightly if we zap the reclaim_state.
Signed-off-by: Christoph Lameter <clameter@sig.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adrian Bunk [Fri, 10 Mar 2006 01:33:45 +0000 (17:33 -0800)]
[PATCH] xtensa must set RWSEM_GENERIC_SPINLOCK=y
/usr/src/ctest/git/kernel/mm/rmap.c: In function `page_referenced_one':
/usr/src/ctest/git/kernel/mm/rmap.c:354: warning: implicit declaration of function `rwsem_is_locked'
Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Doug Warzecha [Fri, 10 Mar 2006 01:33:35 +0000 (17:33 -0800)]
[PATCH] dcdbas: dcdbas_pdev referenced after platform_device_unregister on exit
smi_data_buf_free() references dcdbas_pdev when calling
dma_free_coherent(). In dcdbas_exit(), smi_data_buf_free() is called after
platform_device_unregister(dcdbas_pdev).
This patch moves platform_device_unregister(dcdbas_pdev) after
smi_data_buf_free() in dcdbas_exit().
Hugh Dickins [Fri, 10 Mar 2006 01:33:34 +0000 (17:33 -0800)]
[PATCH] page_add_file_rmap(): remove BUG_ON()s
Remove two early-development BUG_ONs from page_add_file_rmap.
The pfn_valid test (originally useful for checking that nobody passed an
artificial struct page) comes too late, since we already have the struct
page.
The PageAnon test (useful when anon was first distinguished from file rmap)
prevents ->nopage implementations from reusing ->mapping, which would
otherwise be available.
Ralf Baechle [Wed, 8 Mar 2006 14:13:04 +0000 (14:13 +0000)]
[MIPS] Discard .exit.text at runtime.
At times gcc will place bits of __exit functions into .rodata. If
compiled into the kernle itself we used to discard .exit.text - but
not the bits left in .rodata. While harmless this did at times result
in a large number of warnings. So until gcc fixes this, discard
.exit.text at runtime.
Ralf Baechle [Thu, 2 Mar 2006 16:50:12 +0000 (16:50 +0000)]
[MIPS] Threaten removal of code for NEC DDB5074 and DDB5476 evaluation boards.
What: Support for NEC DDB5074 and DDB5476 evaluation boards.
When: June 2006
Why: Board specific code doesn't build anymore since ~2.6.0 and no
users have complained indicating there is no more need for these
boards. This should really be considered a last call.
Who: Ralf Baechle <ralf@linux-mips.org>
Andi Kleen [Thu, 9 Mar 2006 01:57:25 +0000 (17:57 -0800)]
[PATCH] i386: port ATI timer fix from x86_64 to i386 II
ATI chipsets tend to generate double timer interrupts for the local APIC
timer when both the 8254 and the IO-APIC timer pins are enabled. This is
because they route it to both and the result is anded together and the CPU
ends up processing it twice.
This patch changes check_timer to disable the 8254 routing for interrupt 0.
I think it would be safe on all chipsets actually (i tested it on a couple
and it worked everywhere) and Windows seems to do it in a similar way, but
to be conservative this patch only enables this mode on ATI (and adds
options to enable/disable too)
Ported over from a similar x86-64 change.
I reused the ACPI earlyquirk infrastructure for the ATI bridge check, but
tweaked it a bit to work even without ACPI.
Inspired by a patch from Chuck Ebbert, but redone.
Randy Dunlap [Thu, 9 Mar 2006 00:46:08 +0000 (16:46 -0800)]
[NET] compat ifconf: fix limits
A recent change to compat. dev_ifconf() in fs/compat_ioctl.c
causes ifconf data to be truncated 1 entry too early when copying it
to userspace. The correct amount of data (length) is returned,
but the final entry is empty (zero, not filled in).
The for-loop 'i' check should use <= to allow the final struct
ifreq32 to be copied. I also used the ifconf-corruption program
in kernel bugzilla #4746 to make sure that this change does not
re-introduce the corruption.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>