arch/s390/appldata/appldata_base.c has some confusing debugging code left
over to allow compiling it as a module. In practice, it cannot be configured
as module and there is no need to keep that code.
Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The clock synchronization of the ETR code requires an smp_call_function
to synchronize all cpus. Calling smp_call_function from a tasklet is
illegal. Replace the tasklet with a job on the global workqueue.
ETR work is rare and can be postponed to a be done by a kernel thread.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The page_test_and_clear_dirty primitive really consists of two
operations, page_test_dirty and the page_clear_dirty. The combination
of the two is not an atomic operation, so it makes more sense to have
two separate operations instead of one.
In addition to the improved readability of the s390 version of
SetPageUptodate, it now avoids the page_test_dirty operation which is
an insert-storage-key-extended (iske) instruction which is an expensive
operation.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Generate uevents for all cpus if cpu capability changes. This can
happen e.g. because the cpus are overheating. The cpu capability can
be read via /sys/devices/system/cpu/cpuN/capability.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Ahmed S. Darwish [Fri, 27 Apr 2007 14:01:50 +0000 (16:01 +0200)]
[S390] ctc: kmalloc->kzalloc/casting cleanups.
A patch for the CTC / ESCON network driver. Switch from kmalloc to kzalloc
when appropriate, remove some unnecessary kmalloc casts too. Since I have no
s390 machine, I didn't compile it but I examined it carefully.
Signed-off-by: "Ahmed S. Darwish" <darwish.07@gmail.com> Cc: Frank Pavlic <fpavlic@de.ibm.com> Cc: Ursula Braun <braunu@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Fri, 27 Apr 2007 14:01:49 +0000 (16:01 +0200)]
[S390] zfcpdump support.
s390 machines provide hardware support for creating Linux dumps on SCSI
disks. For creating a dump a special purpose dump Linux is used. The first
32 MB of memory are saved by the hardware before the dump Linux is
booted. Via an SCLP interface, the saved memory can be accessed from
Linux. This patch exports memory and registers of the crashed Linux to
userspace via a debugfs file. For more information refer to
Documentation/s390/zfcpdump.txt, which is included in this patch.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[S390] dasd: Add sysfs attribute status and generate uevents.
This patch adds a sysfs-attribute 'status' to make the DASD device-status
accessible from user-space. In addition, the DASD driver generates an
uevent(CHANGE) for the ccw-device on each device-status change.
This enables user-space applications (e.g. udev) to do related processing.
Recent cvs versions of gcc have support for an improved stack overflow
checking that calculates the size of the guard size for each function.
If the compiler accepts -mstack-size without -mstack-guard then the
new stack check is available. We always want to use the new stack
checker.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Section mismatch: reference to .init.text:con3270_consetup from .data
between 'con3270' (at offset 0x45c8) and 'con3270_fn'
Section mismatch: reference to .init.text:con3215_consetup from .data
between 'con3215' (at offset 0x4678) and
'raw3215_ccw_driver'
Since there is no difference between a non present console setup
function and one that returns only 0 remove them.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Simplify the signal_return function that checks for the two special
system calls sigreturn and rt_sigreturn. No need to do a page table
walk, a call to copy_from_user while disabled page faults will work
as well.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The minor fault path has grown a lot in terms of cycles. In particular
the kprobes hook is very costly. Optimize the path to save a couple of
cycles. If kprobes is enabled more than 300 cycles can be avoided if
kprobes_running() is false.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Generic bug implementation for s390. Will increase the value of the
console output on BUG() statements since registers r0-r5,r14 will
not be clobbered by a printk() call that was previously done before
the illegal instruction of BUG() was hit.
Also implements an architecture specific WARN_ON(). Output of that
could be increased but requires common code change.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
This patch adds two improvements to the oops output. First it adds an
additional line after the PSW which decodes the different fields of it.
Second a disassembler is added that decodes the instructions surrounding
the faulting PSW. The output of a test oops now looks like this:
Remove system call glue for sys_clone, sys_fork, sys_vfork, sys_execve,
sys_sigreturn, sys_rt_sigreturn and sys_sigaltstack. Call do_execve from
kernel_execve directly, move pt_regs to the right place and branch to
sysc_return to start the user space program. This removes the last
in-kernel system call.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
We used to unregister ccw devices not directly from the I/O
subchannel remove function in order to avoid lifelocks on the
css bus semaphore. This semaphore is gone, and there is no reason
to not unregister the ccw device directly (it is even better since
it is more in keeping with the goal of immediate disconnect).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We allocage two pages when channel path measurements are enabled
via cm_enable. We must not forget to free them again when
channel path measurements are disabled again.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[S390] cio: replace subchannel evaluation queue with bitmap
Use a bitmap for indicating which subchannels require evaluation
instead of allocating memory for each evaluation request. This
approach reduces memory consumption during recovery in case of
massive evaluation request occurrence and removes the need for
memory allocation failure handling.
Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Stefan Bader [Fri, 27 Apr 2007 14:01:33 +0000 (16:01 +0200)]
[S390] cio: Re-start path verification after aborting internal I/O.
Path verification triggered by changes to the available CHPIDs will be
interrupted by another change but not re-started. This results in an
invalid path mask.
To solve this make sure to completely re-start path verification when
changing the available paths.
Signed-off-by: Stefan Bader <shbader@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a new attribute to the channel-path sysfs directory through which
channel-path configure operations can be triggered. Also listen for
hardware events requesting channel-path configure operations and
process them accordingly.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[PARPORT] SUNBPP: Fix OOPS when debugging is enabled.
[SPARC] openprom: Switch to ref counting PCI API
Andrew Morton [Wed, 25 Apr 2007 20:01:21 +0000 (13:01 -0700)]
packet: fix error handling
The packet driver is assuming (reasonably) that the (undocumented)
request.errors is an errno. But it is in fact some mysterious bitfield. When
things go wrong we return weird positive numbers to the VFS as pointers and it
goes oops.
Thanks to William Heimbigner for reporting and diagnosis.
(It doesn't oops, but this driver still doesn't work for William)
Cc: William Heimbigner <icxcnika@mar.tar.cc> Cc: Peter Osterlund <petero2@telia.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
which resulted in infinite recursion and stack overflow.
The bug is present in all kernel versions since the feature appeared.
The patch also makes some minimal cleanup:
1. Return something consistent (-ENOENT) when fib table is missing
2. Do not crash when queue is empty (does not happen, but yet)
3. Put result of lookup
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
There's a really rare and obscure bug in CFQ, that causes a crash in
cfq_dispatch_insert() due to rq == NULL. One example of the resulting
oops is seen here:
http://lkml.org/lkml/2007/4/15/41
Neil correctly diagnosed the situation for how this can happen: if two
concurrent requests with the exact same sector number (due to direct IO
or aliasing between MD and the raw device access), the alias handling
will add the request to the sortlist, but next_rq remains NULL.
Read the more complete analysis at:
http://lkml.org/lkml/2007/4/25/57
This looks like it requires md to trigger, even though it should
potentially be possible to due with O_DIRECT (at least if you edit the
kernel and doctor some of the unplug calls).
The fix is to move the ->next_rq update to when we add a request to the
rbtree. Then we remove the possibility for a request to exist in the
rbtree code, but not have ->next_rq correctly updated.
A security issue is emerging. Disallow Routing Header Type 0 by default
as we have been doing for IPv4.
Note: We allow RH2 by default because it is harmless.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Tue, 24 Apr 2007 16:51:03 +0000 (12:51 -0400)]
drivers/net/hamradio/baycom_ser_fdx build fix
sparc64:
drivers/net/hamradio/baycom_ser_fdx.c: In function `ser12_open':
drivers/net/hamradio/baycom_ser_fdx.c:417: error: `NR_IRQS' undeclared (first us
e in this function)
drivers/net/hamradio/baycom_ser_fdx.c:417: error: (Each undeclared identifier is
reported only once
drivers/net/hamradio/baycom_ser_fdx.c:417: error: for each function it appears i
n.)
Cc: Folkert van Heusden <folkert@vanheusden.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Dan Williams [Tue, 24 Apr 2007 14:20:06 +0000 (10:20 -0400)]
usb-net/pegasus: fix pegasus carrier detection
Broken by 4a1728a28a193aa388900714bbb1f375e08a6d8e which switched the
return semantics of read_mii_word() but didn't fix usage of
read_mii_word() to conform to the new semantics.
Setting carrier to off based on the NO_CARRIER flag is also incorrect as
that flag only triggers on TX failure and therefore isn't correct when
no frames are being transmitted. Since there is already a 2*HZ MII
carrier check going on, defer to that.
Add a TRUST_LINK_STATUS feature flag for adapters where the LINK_STATUS
flag is actually correct, and use that rather than the NO_CARRIER flag.
Signed-off-by: Dan Williams <dcbw@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Neil Horman [Fri, 20 Apr 2007 13:54:58 +0000 (09:54 -0400)]
sis900: Allocate rx replacement buffer before rx operation
The sis900 driver appears to have a bug in which the receive routine
passes the skbuff holding the received frame to the network stack before
refilling the buffer in the rx ring. If a new skbuff cannot be allocated, the
driver simply leaves a hole in the rx ring, which causes the driver to stop
receiving frames and become non-recoverable without an rmmod/insmod according to
reporters. This patch reverses that order, attempting to allocate a replacement
buffer first, and receiving the new frame only if one can be allocated. If no
skbuff can be allocated, the current skbuf in the rx ring is recycled, dropping
the current frame, but keeping the NIC operational.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
The following patch fixes a kernel bug in depca_platform_probe().
We don't use a dynamic pointer for pldev->dev.platform_data, so it seems
that the correct way to proceed if platform_device_add(pldev) fails is
to explicitly set the pldev->dev.platform_data pointer to NULL, before
calling the platform_device_put(pldev), or it will be kfree'ed by
platform_device_release().
8250: fix possible deadlock between serial8250_handle_port() and serial8250_interrupt()
Commit 40b36daa introduced possibility that serial8250_backup_timeout() ->
serial8250_handle_port() locks port.lock without disabling irqs, thus
allowing deadlock against interrupt handler (port.lock is acquired in
serial8250_interrupt()).
Spotted by lockdep.
Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Mahoney [Mon, 23 Apr 2007 21:41:17 +0000 (14:41 -0700)]
reiserfs: fix xattr root locking/refcount bug
The listxattr() and getxattr() operations are only protected by a read
lock. As a result, if either of these operations run in parallel, a race
condition exists where the xattr_root will end up being cached twice, which
results in the leaking of a reference and a BUG() on umount.
This patch refactors get_xa_root(), __get_xa_root(), and create_xa_root(),
into one get_xa_root() function that takes the appropriate locking around
the entire critical section.
Reported, diagnosed and tested by Andrea Righi <a.righi@cineca.it>
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Andrea Righi <a.righi@cineca.it> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Edward Shishkin <edward@namesys.com> Cc: Alex Zarochentsev <zam@namesys.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The commit 34f5a39899f3f3e815da64f48ddb72942d86c366 restricted reading
of the tainted value. The attached patch changes this back to a
write-only check and restores the read behaviour of older versions.
Andrew Morton [Mon, 23 Apr 2007 21:41:13 +0000 (14:41 -0700)]
acpi-thermal: fix mod_timer() interval
Use relative time, not absolute. Discovered by Jung-Ik (John) Lee
<jilee@google.com>.
Cc: Jung-Ik (John) Lee <jilee@google.com> Acked-by: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
v9fs_insert uses v9fs_fid_lookup (which also locks the fid) to get the
primary fid associated with the dentry and destroys the v9fs_fid struct
after removing the file. If another process called v9fs_fid_lookup on the
same dentry, it may wait undefinitely for the fid's lock (as the struct is
freed).
This patch changes v9fs_remove to use a cloned fid, so the primary fid is
not locked and freed.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@hera.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stefan Richter [Mon, 23 Apr 2007 21:41:10 +0000 (14:41 -0700)]
ieee1394: update MAINTAINERS database
- update Ben's address
- replace Ben's contact by mine as raw1394's 2nd contact
- eth1394's and pcilynx's maintenance doesn't really differ from that
of other parts of the stack like video1394
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Ben Collins <ben.collins@ubuntu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NR_FILE_PAGES must be accounted for depending on the zone that the page
belongs to. If we replace the page in the radix tree then we may have to
shift the count to another zone.
Suggested-by: Ethan Solomita <solo@google.com> Eventually-typed-in-by: Christoph Lameter <clameter@sgi.com> Cc: Martin Bligh <mbligh@mbligh.org> Cc: <stable@kernel.org> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Taskstats fix the structure members alignment issue
We broke the the alignment of members of taskstats to the 8 byte boundary
with the CSA patches. In the current kernel, the taskstats structure is
not suitable for use by 32 bit applications in a 64 bit kernel.
On x86_64
Offsets of taskstats' members (64 bit kernel, 64 bit application)
This is one way to solve the problem without re-arranging structure members
is to pack the structure. The patch adds an __attribute__((aligned(8))) to
the taskstats structure members so that 32 bit applications using taskstats
can work with a 64 bit kernel.
Using __attribute__((packed)) would break the 64 bit alignment of members.
The fix was tested on x86_64. After the fix, we got
Offsets of taskstats' members (64 bit kernel, 64 bit application)
fix OOM killing processes wrongly thought MPOL_BIND
I only have CONFIG_NUMA=y for build testing: surprised when trying a memhog
to see lots of other processes killed with "No available memory
(MPOL_BIND)". memhog is killed correctly once we initialize nodemask in
constrained_alloc().
Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Christoph Lameter <clameter@sgi.com> Acked-by: William Irwin <bill.irwin@oracle.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix possible NULL pointer access in 8250 serial driver
I encountered the following kernel panic. The cause of this problem was
NULL pointer access in check_modem_status() in 8250.c. I confirmed this
problem is fixed by the attached patch, but I don't know this is the
correct fix.
Fix the possible NULL pointer access in check_modem_status() in 8250.c. The
check_modem_status() would access 'info' member of uart_port structure, but it
is not initialized before uart_open() is called. The check_modem_status() can
be called through /proc/tty/driver/serial before uart_open() is called.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Taku Izumi <izumi2005@soft.fujitsu.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Tue, 24 Apr 2007 04:36:13 +0000 (21:36 -0700)]
oom: kill all threads that share mm with killed task
oom_kill_task() calls __oom_kill_task() to OOM kill a selected task.
When finding other threads that share an mm with that task, we need to
kill those individual threads and not the same one.
noreplacement is dangerous on modern systems because it will not replace the
context switch FNSAVE with SSE aware FXSAVE. But other places in the kernel still assume
SSE and do FXSAVE and the CPU will then access FXSAVE information with
FNSAVE and cause corruption.
Easiest way to avoid this is to remove the option. It was mostly for paranoia
reasons anyways and alternative()s have been stable for some time.
Thanks to Jeremy F. for reporting and helping debug it.
Joachim Deguara [Tue, 24 Apr 2007 11:05:36 +0000 (13:05 +0200)]
[PATCH] x86-64: make GART PTEs uncacheable
This patches fixes the silent data corruption problems being seen using the
GART iommu where 4kB of data where incorrect (seen mostly on Nvidia CK804
systems). This fix, to mark the memory regin the GART PTEs reside on as
uncacheable, also brings the code in line with the AGP specification.
Signed-off-by: Joachim Deguara <joachim.deguara@amd.com> Signed-off-by: Andi Kleen <ak@suse.de>
Patrick McHardy [Tue, 24 Apr 2007 05:39:02 +0000 (22:39 -0700)]
[XFRM]: beet: fix pseudo header length value
draft-nikander-esp-beet-mode-07.txt is not entirely clear on how the length
value of the pseudo header should be calculated, it states "The Header Length
field contains the length of the pseudo header, IPv4 options, and padding in
8 octets units.", but also states "Length in octets (Header Len + 1) * 8".
draft-nikander-esp-beet-mode-08-pre1.txt [1] clarifies this, the header length
should not include the first 8 byte.
This change affects backwards compatibility, but option encapsulation didn't
work until very recently anyway.
Change to defer congestion control initialization.
If setsockopt() was used to change TCP_CONGESTION before
connection is established, then protocols that use sequence numbers
to keep track of one RTT interval (vegas, illinois, ...) get confused.
Change the init hook to be called after handshake.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6:
ide/Kconfig: add missing range check for IDE_MAX_HWIFS
hpt366: fix kernel oops with HPT302N
ide/pci/delkin_cb.c: add new PCI ID
Fix a regression due to the patch "NFS: disconnect before retrying NFSv4
requests over TCP"
The assumption made in xprt_transmit() that the condition
"req->rq_bytes_sent == 0 and request is on the receive list"
should imply that we're dealing with a retransmission is false.
Firstly, it may simply happen that the socket send queue was full
at the time the request was initially sent through xprt_transmit().
Secondly, doing this for each request that was retransmitted implies
that we disconnect and reconnect for _every_ request that happened to
be retransmitted irrespective of whether or not a disconnection has
already occurred.
Fix is to move this logic into the call_status request timeout handler.
NFS: Fix the 'desynchronized value of nfs_i.ncommit' error
Redirtying a request that is already marked for commit will screw up the
accounting for NR_UNSTABLE_NFS as well as nfs_i.ncommit.
Ensure that all requests on the commit queue are labelled with the
PG_NEED_COMMIT flag, and avoid moving them onto the dirty list inside
nfs_page_mark_flush().
Also inline nfs_mark_request_dirty() into nfs_page_mark_flush() for
atomicity reasons. Avoid dropping the spinlock until we're done marking the
request in the radix tree and have added it to the ->dirty list.