Linus Torvalds [Sun, 17 Dec 2006 00:01:50 +0000 (16:01 -0800)]
Fix up mm/mincore.c error value cases
Hugh Dickins correctly points out that mincore() is actually _supposed_
to fail on an unmapped hole in the user address space, rather than
return valid ("empty") information about the hole. This just simplifies
the problem further (I had been misled by our previous confusing and
complicated way of doing mincore()).
Also, in the unlikely situation that we can't allocate a temporary
kernel buffer, we should actually return EAGAIN, not ENOMEM, to keep the
"unmapped hole" and "allocation failure" error cases separate.
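The hole case can be observed from user space. The following is a minimal sketch, assuming a Linux system with glibc; the helper name mincore_errno_on_hole() is made up for illustration and is not part of the patch. It maps one page, unmaps it again, and reports the errno that mincore() returns for the resulting hole.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes mincore() and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an unmapped hole and query it.
 * Returns the errno reported by mincore(), 0 if the call unexpectedly
 * succeeded, or -1 if the setup itself failed.
 */
static int mincore_errno_on_hole(void)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char vec[1];
    void *addr;

    if (page <= 0)
        return -1;

    addr = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return -1;
    if (munmap(addr, (size_t)page) != 0)
        return -1;

    /* The range is now a hole, so the call is expected to fail. */
    if (mincore(addr, (size_t)page, vec) == 0)
        return 0;
    return errno;
}
```

With the error cases kept separate as described above, a caller that sees ENOMEM can treat it as "bad range", while EAGAIN would suggest retrying.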
Finally, add a comment about our stupid historical lack of support for
anonymous mappings. I'll fix that if somebody reminds me after 2.6.20
is out.
Linus Torvalds [Sat, 16 Dec 2006 17:54:23 +0000 (09:54 -0800)]
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[PATCH] pata_via: Cable detect error
[PATCH] Fix help text for CONFIG_ATA_PIIX
[PATCH] initializer entry defined twice in pata_rz1000
[PATCH] ata: fix platform_device_register_simple() error check
[PATCH] ahci: do not mangle saved HOST_CAP while resetting controller
[PATCH] libata: don't initialize sg in ata_exec_internal() if DMA_NONE (take #2)
[libata] sata_svw: Disable ATAPI DMA on current boards (errata workaround)
[libata] use kmap_atomic(KM_IRQ0) in SCSI simulator
[PATCH] ata_piix: use piix_host_stop() in ich_pata_ops
[PATCH] ata_piix: IDE mode SATA patch for Intel ICH9
Linus Torvalds [Sat, 16 Dec 2006 17:53:50 +0000 (09:53 -0800)]
Make workqueue bit operations work on "atomic_long_t"
On architectures where the atomicity of the bit operations is handled by
external means (ie a separate spinlock to protect concurrent accesses),
just doing a direct assignment on the workqueue data field (as done by
commit 4594bf159f1962cec3b727954b7c598b07e2e737) can cause the
assignment to be lost due to lack of serialization with the bitops on
the same word.
So we need to serialize the assignment with the locks on those
architectures (notably older ARM chips, PA-RISC and sparc32).
So rather than using an "unsigned long", let's use "atomic_long_t",
which already has a safe assignment operation (atomic_long_set()) on
such architectures.
This requires that the atomic operations use the same atomicity locks as
the bit operations do, but that is largely the case anyway. Sparc32
will probably need fixing.
Architectures (including modern ARM with LL/SC) that implement sane
atomic operations for SMP won't see any of this matter.
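As a rough user-space analogue of the idea (not the kernel code), the sketch below keeps a pointer plus flag bits in one C11 atomic word and performs both the bit operation and the assignment through the same atomic API. All demo_* names and the flag layout are assumptions for illustration; atomic_store() stands in for atomic_long_set(), and the caller is assumed to already own the pending flag when it stores the owner.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_PENDING   1UL   /* hypothetical flag bit */
#define DEMO_FLAG_MASK 3UL   /* low two bits are reserved for flags */

struct demo_work {
    atomic_ulong data;       /* owner pointer bits | flag bits */
};

/* Bit operation on the word; returns nonzero if the flag was already set. */
static int demo_test_and_set_pending(struct demo_work *w)
{
    return (atomic_fetch_or(&w->data, DEMO_PENDING) & DEMO_PENDING) != 0;
}

/* Assignment goes through the same atomic layer as the bit operation. */
static void demo_set_owner(struct demo_work *w, void *owner)
{
    unsigned long flags = atomic_load(&w->data) & DEMO_FLAG_MASK;

    atomic_store(&w->data, (unsigned long)(uintptr_t)owner | flags);
}

static void *demo_get_owner(struct demo_work *w)
{
    return (void *)(uintptr_t)(atomic_load(&w->data) & ~DEMO_FLAG_MASK);
}

static unsigned long demo_owner;   /* aligned object used as a fake owner */

/* Returns 1 when both the owner pointer and the pending flag survive. */
static int demo_roundtrip(void)
{
    struct demo_work w;

    atomic_init(&w.data, 0UL);
    if (demo_test_and_set_pending(&w))
        return 0;
    demo_set_owner(&w, &demo_owner);
    return demo_get_owner(&w) == (void *)&demo_owner &&
           (atomic_load(&w.data) & DEMO_PENDING) != 0;
}
```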
Cc: Russell King <rmk+lkml@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: David Miller <davem@davemloft.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Linux Arch Maintainers <linux-arch@vger.kernel.org> Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sat, 16 Dec 2006 17:44:32 +0000 (09:44 -0800)]
Fix incorrect user space access locking in mincore()
Doug Chapman noticed that mincore() will do a "copy_to_user()" of the
result while holding the mmap semaphore for reading, which is a big
no-no. While a recursive read-lock on a semaphore in the case of a page
fault happens to work, we don't actually allow them due to deadlock
scenarios with writers due to fairness issues.
Doug and Marcel sent in a patch to fix it, but I decided to just rewrite
the mess instead - not just fixing the locking problem, but making the
code smaller and (imho) much easier to understand.
Alan [Sat, 16 Dec 2006 14:32:21 +0000 (14:32 +0000)]
[PATCH] pata_via: Cable detect error
The UDMA66 VIA hardware has no controller side cable detect bits we can
use. This patch minimally fixes the problem by reporting unknown in this
case and using drive side detection.
The old drivers/ide code does some additional tricks but those aren't
appropriate now we are in -rc.
Without this update UDMA66 VIA controllers run slowly. They don't fail so
it's a borderline call whether this is -rc material or not.
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Alan [Sat, 16 Dec 2006 12:54:29 +0000 (12:54 +0000)]
[PATCH] Fix help text for CONFIG_ATA_PIIX
> Thanks for clarifying Bill, and sorry Alan. ata_piix does indeed work
> correctly. The help text is a bit confusing:
>
> config ATA_PIIX
> tristate "Intel PIIX/ICH SATA support"
> depends on PCI
> help
> This option enables support for ICH5/6/7/8 Serial ATA.
> If PATA support was enabled previously, this enables
> support for select Intel PIIX/ICH PATA host controllers.
New help text
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Ira Snyder [Fri, 15 Dec 2006 21:08:52 +0000 (13:08 -0800)]
[PATCH] initializer entry defined twice in pata_rz1000
This removes the extra definition of the .error_handler member
in the pata_rz1000 driver.
Signed-off-by: Ira W. Snyder <kernel@irasnyder.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
The return value of platform_device_register_simple() should be checked
by IS_ERR().
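For readers unfamiliar with the convention, here is a user-space sketch of the error-pointer idiom that IS_ERR() checks. The demo_* helpers only imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() and demo_register() is a hypothetical stand-in for a registration call; the point is that a failed call returns a non-NULL pointer, so a NULL test would miss the error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer lies in the top DEMO_MAX_ERRNO addresses. */
static int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

static int demo_device;   /* stands in for a successfully registered device */

/* Hypothetical registration call: error pointer on failure, never NULL. */
static void *demo_register(int fail)
{
    return fail ? demo_err_ptr(-ENOMEM) : (void *)&demo_device;
}

/* Caller checks with demo_is_err(), not against NULL. */
static int demo_probe(int fail)
{
    void *dev = demo_register(fail);

    if (demo_is_err(dev))
        return (int)demo_ptr_err(dev);
    return 0;
}
```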
Cc: Jeff Garzik <jgarzik@pobox.com> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Tue, 12 Dec 2006 11:17:32 +0000 (20:17 +0900)]
[PATCH] ahci: do not mangle saved HOST_CAP while resetting controller
Do not mangle HOST_CAP while resetting controller. The code is
there for a historical reason. The mangling breaks controller feature
detection and 0 PORTS_IMPL workaround code.
Tejun Heo [Mon, 11 Dec 2006 17:15:31 +0000 (02:15 +0900)]
[PATCH] libata: don't initialize sg in ata_exec_internal() if DMA_NONE (take #2)
Calling sg_init_one() with NULL buf causes oops on certain
configurations. Don't initialize sg in ata_exec_internal() if
DMA_NONE and make the function complain if @buf is NULL when dma_dir
isn't DMA_NONE. While at it, fix comment.
The problem was discovered and the initial patch was submitted by Arnd
Bergmann.
Jeff Garzik [Thu, 14 Dec 2006 22:04:33 +0000 (17:04 -0500)]
[libata] sata_svw: Disable ATAPI DMA on current boards (errata workaround)
Current Broadcom/Serverworks SATA boards (including Apple K2 SATA)
have problems with ATAPI DMA, so it is disabled. ATAPI PIO, ATA PIO,
and ATA DMA continue to work just fine.
Acked-by: Anantha Subramanyam <ananth@broadcom.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Mon, 11 Dec 2006 13:26:25 +0000 (22:26 +0900)]
[PATCH] ata_piix: use piix_host_stop() in ich_pata_ops
piix_init_one() allocates host private data which should be freed by
piix_host_stop(). ich_pata_ops wasn't converted to piix_host_stop()
while merging, leaking 4 bytes on driver detach. Fix it.
This was spotted using Kmemleak by Catalin Marinas.
Jason Gaston [Thu, 7 Dec 2006 16:57:32 +0000 (08:57 -0800)]
[PATCH] ata_piix: IDE mode SATA patch for Intel ICH9
This updated patch adds the Intel ICH9 IDE mode SATA controller DID's.
Signed-off-by: Jason Gaston <jason.d.gaston@intel.com> Acked-by: Tejun Heo <htejun@gmail.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Roland Dreier [Sat, 16 Dec 2006 04:55:28 +0000 (20:55 -0800)]
IB/mthca: Use DEFINE_MUTEX() instead of mutex_init()
mthca_device_mutex() can be initialized automatically with
DEFINE_MUTEX() rather than explicitly calling mutex_init(). This
saves a bit of text and shrinks the source by a line, so we may as
well do it....
Linus Torvalds [Fri, 15 Dec 2006 22:13:51 +0000 (14:13 -0800)]
Fix "delayed_work_pending()" macro expansion
Nobody uses it, but it was still wrong. Using the macro argument name
'work' meant that when we used 'work' as a member name, that would also
get replaced by the macro argument.
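The pitfall can be reproduced outside the kernel. This is a simplified sketch with made-up demo_* types and a plain bit test instead of test_bit(); it shows the broken shape in a comment and a fixed macro whose parameter name no longer collides with the member name.

```c
#include <assert.h>

struct demo_work { unsigned long management; };
struct demo_delayed_work { struct demo_work work; };

/*
 * Broken shape: the parameter is named "work", so the member name after
 * "->" is substituted as well. DEMO_PENDING_BROKEN(&d) would expand to
 * ((&d)->&d.management & 1UL), which does not compile.
 *
 *   #define DEMO_PENDING_BROKEN(work) ((work)->work.management & 1UL)
 */

/* Fixed shape: the parameter name differs from every member name used. */
#define DEMO_PENDING(w) (((w)->work.management & 1UL) != 0)

/* Returns 1 when the macro reports a set bit and a clear bit correctly. */
static int demo_pending_check(void)
{
    struct demo_delayed_work busy = { .work = { .management = 1UL } };
    struct demo_delayed_work idle = { .work = { .management = 0UL } };

    return DEMO_PENDING(&busy) && !DEMO_PENDING(&idle);
}
```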
Roland Dreier [Fri, 15 Dec 2006 22:01:49 +0000 (14:01 -0800)]
IB/srp: Fix FMR mapping for 32-bit kernels and addresses above 4G
struct srp_device.fmr_page_mask was unsigned long, which means that
the top part of addresses above 4G was being chopped off on 32-bit
architectures. Of course nothing good happens when data from SRP
targets is DMAed to the wrong place.
Fix this by changing fmr_page_mask to u64, to match the addresses
actually used by IB devices.
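The truncation is easy to demonstrate with fixed-width types. In this sketch uint32_t plays the role of unsigned long on a 32-bit kernel; the 4 KiB page size and the demo_* names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u

/* 32-bit mask: zero-extended before the AND, so the top bits are lost. */
static uint64_t demo_mask_narrow(uint64_t dma_addr)
{
    uint32_t page_mask = ~(DEMO_PAGE_SIZE - 1u);   /* 0xfffff000 */

    return dma_addr & page_mask;
}

/* 64-bit mask: only the in-page offset bits are cleared. */
static uint64_t demo_mask_wide(uint64_t dma_addr)
{
    uint64_t page_mask = ~((uint64_t)DEMO_PAGE_SIZE - 1u);

    return dma_addr & page_mask;
}
```

For an address just above 4G such as 0x123456789, the narrow mask yields 0x23456000 while the wide mask yields 0x123456000.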
Thanks to Brian Cain <Brian.Cain@ge.com> and David McMillen
<davem@systemfabricworks.com> for help diagnosing the bug and testing
the fix.
Roland Dreier [Fri, 15 Dec 2006 21:57:26 +0000 (13:57 -0800)]
IB: Fix ib_dma_alloc_coherent() wrapper
The ib_dma_alloc_coherent() wrapper uses a u64* for the dma_handle
parameter, unlike dma_alloc_coherent, which uses dma_addr_t*. This
means that we need a temporary variable to handle the case when
ib_dma_alloc_coherent() just falls through directly to
dma_alloc_coherent() on architectures where sizeof u64 != sizeof
dma_addr_t.
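A sketch of the temporary-variable pattern, in user space and with hypothetical names: demo_dma_addr_t is assumed to be 32 bits wide, demo_alloc_coherent() stands in for the underlying allocator, and the bus address it reports is made up. Passing the caller's uint64_t pointer straight through would write only part of the handle; the temporary lets the value be widened explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t demo_dma_addr_t;   /* assumed 32-bit bus address type */

/* Stand-in allocator: reports the handle through a 32-bit pointer. */
static void *demo_alloc_coherent(size_t size, demo_dma_addr_t *handle)
{
    void *cpu = malloc(size);

    *handle = cpu ? 0x1000u : 0u;   /* made-up bus address */
    return cpu;
}

/* Wrapper whose callers expect a 64-bit handle. */
static void *demo_ib_alloc_coherent(size_t size, uint64_t *handle)
{
    demo_dma_addr_t tmp;            /* correctly sized temporary */
    void *cpu = demo_alloc_coherent(size, &tmp);

    *handle = tmp;                  /* widen by value: all 64 bits defined */
    return cpu;
}

/* Returns 1 if a poisoned 64-bit handle comes back fully overwritten. */
static int demo_handle_is_clean(void)
{
    uint64_t handle = 0xdeadbeefdeadbeefULL;
    void *cpu = demo_ib_alloc_coherent(64, &handle);
    int ok = cpu != NULL && handle == 0x1000u;

    free(cpu);
    return ok;
}
```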
Linus Torvalds [Fri, 15 Dec 2006 16:43:13 +0000 (08:43 -0800)]
Remove stack unwinder for now
It has caused more problems than it ever really solved, and is
apparently not getting cleaned up and fixed. We can put it back when
it's stable and isn't likely to make warning or bug events worse.
In the meantime, enable frame pointers for more readable stack traces.
Stefan Bader [Fri, 15 Dec 2006 16:18:30 +0000 (17:18 +0100)]
[S390] cio: css_register_subchannel race.
Asynchronous probe can release memory of a subchannel before
css_get_ssd_info is called. To fix this call css_get_ssd_info
before registering with driver core.
Signed-off-by: Stefan Bader <shbader@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Fri, 15 Dec 2006 16:18:27 +0000 (17:18 +0100)]
[S390] Save prefix register for dump on panic
The dump tools expect that the saved prefix register points to the
lowcore of the dump cpu. Since we set the prefix register to 0 during
reipl/dump, we have to save the original prefix register. Before we
start the dump program, we copy the original prefix register to the
designated location in the lowcore.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Fri, 15 Dec 2006 16:18:22 +0000 (17:18 +0100)]
[S390] Fix reboot hang on LPARs
Reboot hangs on LPARs without diag308 support. The reason for this is,
that before the reboot is done, the channel subsystem is shut down.
During the reset on each possible subchannel a "store subchannel" is
done. This operation can end in a program check interruption, if the
specified subchannel set is not implemented by the hardware. During
the reset, currently we do not have a program check handler, which
leads to the described kernel bug. We now install a new program check
handler for the reboot code to fix this problem.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ursula Braun [Fri, 15 Dec 2006 16:18:14 +0000 (17:18 +0100)]
[S390] Hipersocket multicast queue: make sure outbound handler is called
A HiperSocket multicast queue works asynchronously. When sending
buffers, the buffer state change from PRIMED to EMPTY may happen
delayed. Reschedule the checking for changes in the outbound queue,
if there are still PRIMED buffers.
Signed-off-by: Ursula Braun <braunu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[SCTP]: Add support for SCTP_CONTEXT socket option.
Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[SCTP]: Handle address add/delete events in a more efficient way.
Currently in SCTP, we maintain a local address list by rebuilding the whole
list from the device list whenever we get an address add/delete event.
This patch fixes it by only adding/deleting the address for which we
receive the event.
Also removed the sctp_local_addr_lock() which is no longer needed as we
now use list_for_each_safe() to traverse this list. This fixes the bugs
in sctp_copy_laddrs_xxx() routines where we do copy_to_user() while
holding this lock.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Brian Haley [Wed, 13 Dec 2006 01:09:49 +0000 (17:09 -0800)]
[IPV6]: Fix IPV6_UNICAST_HOPS getsockopt().
> Relevant standard (RFC 3493) notes:
>
> The IPV6_UNICAST_HOPS option may be used with getsockopt() to
> determine the hop limit value that the system will use for subsequent
> unicast packets sent via that socket.
>
> I don't reckon -1 could be the hop limit value.
-1 means un-initialized.
> IMHO, the value from
> case 1 (if socket is connected to some destination), otherwise case 2
> (if bound to a scope interface) or ultimately the default hop limit
> ought to be returned instead, as it will be most often correct, while
> the current behavior is always wrong, unless setsockopt() has been used
> first. I don't know if some people may think doing a route lookup in
> getsockopt might be overly expensive, but at least the two other cases
> should be ok, particularly the last one.
The following patch seems to work for me, but this code has behaved this
way for a while, so don't know if it will break any existing apps.
Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ian McDonald [Tue, 12 Dec 2006 02:47:59 +0000 (00:47 -0200)]
[DCCP] ccid3: return value in ccid3_hc_rx_calc_first_li
In a recent patch we introduced invalid return codes which will result in the
opposite of what is intended (i.e. send more packets in face of peculiar
network conditions).
This fixes it by returning ~0 which means not calculated as per
dccp_li_hist_calc_i_mean.
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Al Viro [Tue, 12 Dec 2006 08:29:52 +0000 (00:29 -0800)]
[NETFILTER]: {ip,ip6,arp}_tables: fix exponential worst-case search for loops
If we come to a node we'd already marked as seen and it's not a part of path
(i.e. we don't have a loop right there), we already know that it isn't a
part of any loop, so we don't need to revisit it.
That speeds the things up if some chain is referred to from several places
and kills O(exp(table size)) worst-case behaviour (without sleeping,
at that, so if you manage to self-LART that way, you are SOL for a long
time)...
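The idea can be sketched as a three-state depth-first search. This is not the netfilter code (which walks rule tables without recursion); it is a small recursive illustration on a hypothetical adjacency-list graph, with all demo_* names invented here. A node that has been fully explored is marked done and never walked again, which is what removes the exponential worst case.

```c
#include <assert.h>
#include <stddef.h>

enum demo_state { DEMO_UNSEEN, DEMO_ON_PATH, DEMO_DONE };

#define DEMO_MAX_NODES 8
#define DEMO_MAX_EDGES 4

struct demo_graph {
    int nedges[DEMO_MAX_NODES];
    int edge[DEMO_MAX_NODES][DEMO_MAX_EDGES];
};

static int demo_visit(const struct demo_graph *g, int node,
                      enum demo_state *state)
{
    if (state[node] == DEMO_ON_PATH)
        return 1;                   /* back on the current path: a loop */
    if (state[node] == DEMO_DONE)
        return 0;                   /* seen before, known to be loop-free */

    state[node] = DEMO_ON_PATH;
    for (int i = 0; i < g->nedges[node]; i++)
        if (demo_visit(g, g->edge[node][i], state))
            return 1;
    state[node] = DEMO_DONE;
    return 0;
}

static int demo_has_loop(const struct demo_graph *g, int start)
{
    enum demo_state state[DEMO_MAX_NODES] = { DEMO_UNSEEN };

    return demo_visit(g, start, state);
}

/* 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3: node 3 is reached twice, but no loop. */
static int demo_diamond_has_loop(void)
{
    struct demo_graph g = {
        .nedges = { 2, 1, 1, 0 },
        .edge = { { 1, 2 }, { 3 }, { 3 } },
    };

    return demo_has_loop(&g, 0);
}

/* 0 -> 1 -> 2 -> 0: a genuine loop. */
static int demo_cycle_has_loop(void)
{
    struct demo_graph g = {
        .nedges = { 1, 1, 1 },
        .edge = { { 1 }, { 2 }, { 0 } },
    };

    return demo_has_loop(&g, 0);
}
```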
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Tue, 12 Dec 2006 08:29:02 +0000 (00:29 -0800)]
[NETFILTER]: x_tables: add missing try to load conntrack from match/targets
CLUSTERIP, CONNMARK, CONNSECMARK, and connbytes need ip_conntrack or
layer 3 protocol module of nf_conntrack.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Tue, 12 Dec 2006 08:28:40 +0000 (00:28 -0800)]
[NETFILTER]: x_tables: error if ip_conntrack is asked to handle IPv6 packets
To do that, this makes nf_ct_l3proto_try_module_{get,put} compatible
functions. As a result we can remove the surrounding '#ifdef's and the direct calls to
need_conntrack().
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Tue, 12 Dec 2006 08:28:09 +0000 (00:28 -0800)]
[NETFILTER]: nf_nat: fix NF_NAT dependency
NF_NAT depends on NF_CONNTRACK_IPV4, not NF_CONNTRACK.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 13 Dec 2006 23:58:32 +0000 (15:58 -0800)]
Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 4017/1: [Jornada7xx] - Updating Jornada720.c
[ARM] 3992/1: i.MX/MX1 CPU Frequency scaling support
[ARM] Provide a method to alter the control register
[ARM] 4016/1: prefetch macro is wrong wrt gcc's "delete-null-pointer-checks"
[ARM] Remove empty fixup function
[ARM] 4014/1: include drivers/hid/Kconfig
[ARM] 4013/1: clocksource driver for netx
[ARM] 4012/1: Clocksource for pxa
[ARM] Clean up ioremap code
[ARM] Unuse another Linux PTE bit
[ARM] Clean up KERNEL_RAM_ADDR
[ARM] Add sys_*at syscalls
[ARM] 4004/1: S3C24XX: UDC remove implict addition of VA to regs
[ARM] Formalise the ARMv6 processor name string
[ARM] Handle HWCAP_VFP in VFP support code
[ARM] 4011/1: AT91SAM9260: Fix compilation with NAND driver
[ARM] 4010/1: AT91SAM9260-EK board: Prepare for MACB Ethernet support
Scott Wood [Mon, 4 Dec 2006 22:57:19 +0000 (14:57 -0800)]
Driver core: Make platform_device_add_data accept a const pointer
platform_device_add_data() makes a copy of the data that is given to it,
and thus the parameter can be const. This removes a warning when data
from get_property() on powerpc is handed to platform_device_add_data(),
as get_property() returns a const pointer.
Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix file and directory removal in debugfs. Add inotify support for file removal.
The following scenario :
create dir a
create dir a/b
cd a/b (some process goes in cwd a/b)
rmdir a/b
rmdir a
fails due to the fact that "a" appears to be non empty. It is because
the "b" dentry is not deleted from "a" and still in use. The same
problem happens if "b" is a file. d_delete is nice enough to know when
it needs to unhash and free the dentry if nothing else is using it or,
if someone is using it, to remove it from the hash queues and wait for
it to be deleted when it has no users.
The nice side-effect of this fix is that it calls the file removal
notification.
DebugFS : more file/directory creation error handling
Correct dentry count to handle creation errors.
This patch puts a dput at the file creation instead of the file removal :
lookup_one_len already returns a dentry with reference count of 1. Then,
the dget() in simple_mknod increments it when the dentry is associated
with a file. In a scenario where simple_create or simple_mkdir returns
an error, this would lead to an unwanted increment of the reference
counter, therefore making file removal impossible.
* HP Jornada 720 uses epson 1356 chip for graphics. This chip is compatible with s1d13xxxfb driver.
* HP Jornada 720 uses a Microprocessor Control Unit to talk to various
hardware. We add it as a platform device in jornada720_init()
* We provide pm_suspend() to avoid unresolved symbols in apm.o. We are
unable to truly suspend now, hence the stub.
* Speaker/microphone enabling got removed because it will be placed in the alsa driver.
Signed-off-by: Filip Zyzniewski <(address hidden)> Signed-off-by: Kristoffer Ericson <(address hidden)> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Wed, 13 Dec 2006 18:33:53 +0000 (18:33 +0000)]
[ARM] Provide a method to alter the control register
i.MX needs to tweak the control register to support CPU frequency
scaling. Rather than have folk blindly try and change the control
register by writing to it and then wondering why it doesn't work,
provide a method (which is safe for UP only, and therefore only
available for UP) to achieve this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Nicolas Pitre [Wed, 13 Dec 2006 17:39:26 +0000 (18:39 +0100)]
[ARM] 4016/1: prefetch macro is wrong wrt gcc's "delete-null-pointer-checks" optimization
The gcc manual says:
|`-fdelete-null-pointer-checks'
| Use global dataflow analysis to identify and eliminate useless
| checks for null pointers. The compiler assumes that dereferencing
| a null pointer would have halted the program. If a pointer is
| checked after it has already been dereferenced, it cannot be null.
| Enabled at levels `-O2', `-O3', `-Os'.
Because the constraint to the inline asm used in the prefetch() macro is
a memory operand, gcc assumes that the asm code does dereference the
pointer and the delete-null-pointer-checks optimization kicks in.
Inspection of generated assembly for the above example shows that bar()
is indeed called unconditionally without any test on the value of x.
Of course in the prefetch case there is no real dereference and it
cannot be assumed that a null pointer would have been caught at that
point. This causes kernel oopses with constructs like
hlist_for_each_entry() where the list's 'next' content is prefetched
before the pointer is tested against NULL, and only when gcc feels like
applying this optimization which doesn't happen all the time with more
complex code.
It appears that the way to prevent the delete-null-pointer-checks
optimization from occurring in this case is to make prefetch() into a static
inline function instead of a macro. At least this is what is done on
x86_64 where a similar inline asm memory operand is used (I presume they
would have seen the same problem if it didn't work) and resulting code
for the above example confirms that.
An alternative would consist of replacing the memory operand by a
register operand containing the pointer, and use the addressing mode
explicitly in the asm template. But that would be less optimal than an
offsettable memory reference.
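The example the message refers to is not reproduced in this log. The sketch below is only a plausible user-space shape of the pattern, assuming GCC or Clang: demo_foo() and demo_bar() are hypothetical names, and __builtin_prefetch() stands in for the ARM inline asm, so it does not reproduce the miscompilation itself. It shows the static inline form of the prefetch helper and the "prefetch first, test for NULL afterwards" ordering whose NULL check has to survive.

```c
#include <assert.h>
#include <stddef.h>

/* Static inline form; __builtin_prefetch is a stand-in for the real asm. */
static inline void demo_prefetch(const void *ptr)
{
    __builtin_prefetch(ptr);
}

static int demo_bar_calls;
static const int demo_value = 7;

static void demo_bar(const int *x)
{
    (void)x;
    demo_bar_calls++;
}

/* Prefetch, then test: the check must not be optimized away. */
static void demo_foo(const int *x)
{
    demo_prefetch(x);
    if (x)
        demo_bar(x);
}

/* Returns how many times demo_bar() ran for one demo_foo() call. */
static int demo_calls_for(const int *x)
{
    demo_bar_calls = 0;
    demo_foo(x);
    return demo_bar_calls;
}
```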
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Ralf Baechle [Tue, 12 Dec 2006 17:14:57 +0000 (17:14 +0000)]
[PATCH] Optimize D-cache alias handling on fork
Virtually indexed, physically tagged cache architectures can get away
without cache flushing when forking. This patch adds a new cache
flushing function flush_cache_dup_mm(struct mm_struct *) which for the
moment I've implemented to do the same thing on all architectures
except on MIPS where it's a no-op.
Atsushi Nemoto [Tue, 12 Dec 2006 17:14:56 +0000 (17:14 +0000)]
[PATCH] MIPS: Fix COW D-cache aliasing on fork
Provide a custom copy_user_highpage() to deal with aliasing issues on
MIPS. It uses kmap_coherent() to map a user page for the kernel with the same
color. Rewrite copy_to_user_page() and copy_from_user_page() with the
new interfaces to avoid extra cache flushing.
The main part of this patch was originally written by Ralf Baechle;
Atsushi Nemoto did the debugging.
Atsushi Nemoto [Tue, 12 Dec 2006 17:14:55 +0000 (17:14 +0000)]
[PATCH] Pass vma argument to copy_user_highpage().
To allow a more effective copy_user_highpage() on certain architectures,
a vma argument is added to the function and cow_user_page() allowing
the implementation of these functions to check for the VM_EXEC bit.
The main part of this patch was originally written by Ralf Baechle;
Atsushi Nemoto did the debugging.
Atsushi Nemoto [Tue, 12 Dec 2006 17:14:54 +0000 (17:14 +0000)]
[PATCH] Fix COW D-cache aliasing on fork
Problem:
1. There is a process containing two threads (T1 and T2). The
thread T1 calls fork(). Then the dup_mmap() function is called in T1's context.
static inline int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
...
flush_cache_mm(current->mm);
... /* A */
(write-protect all Copy-On-Write pages)
... /* B */
flush_tlb_mm(current->mm);
...
2. When preemption happens between A and B (or on SMP kernel), the
thread T2 can run and modify data on COW pages without page fault
(modified data will stay in cache).
3. Some time after fork() completed, the thread T2 may cause a page
fault by write-protect on a COW page.
4. Then data of the COW page will be copied to newly allocated
physical page (copy_cow_page()). It reads data via kernel mapping.
The kernel mapping can have a different 'color' from the user space
mapping of the thread T2 (dcache aliasing). Therefore
copy_cow_page() will copy stale data. Then the modified data in
cache will be lost.
To let architecture code deal with this problem, allow it to
override copy_user_highpage() by defining
__HAVE_ARCH_COPY_USER_HIGHPAGE in <asm/page.h>.
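To make the 'color' notion in step 4 concrete, here is a small sketch. The geometry is assumed, not taken from the patch: 4 KiB pages and a 16 KiB cache way, which gives four colors. Two virtual mappings of the same physical page can hold separate cache lines only if their colors differ.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12u                                /* assumed 4 KiB pages */
#define DEMO_WAY_SIZE   (16u * 1024u)                      /* assumed 16 KiB way */
#define DEMO_COLORS     (DEMO_WAY_SIZE >> DEMO_PAGE_SHIFT) /* 4 colors */

/* Cache color: the index bits of the address that lie above the page offset. */
static unsigned demo_color(uintptr_t vaddr)
{
    return (unsigned)((vaddr >> DEMO_PAGE_SHIFT) & (DEMO_COLORS - 1u));
}

/* Nonzero if two mappings of one physical page could alias in the D-cache. */
static int demo_aliases(uintptr_t a, uintptr_t b)
{
    return demo_color(a) != demo_color(b);
}
```

An architecture-specific copy_user_highpage() can avoid the stale read by copying through a mapping whose color matches the user mapping.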
The main part of this patch was originally written by Ralf Baechle;
Atsushi Nemoto did the debugging.
Russell King [Wed, 13 Dec 2006 14:45:46 +0000 (14:45 +0000)]
[PATCH] Add support for Korenix 16C950-based PCI cards
This adds initial support to 8250-pci for the Korenix Jetcard PCI serial
cards. The JC12xx cards are standard RS232-based serial cards utilising
the Oxford 16C950 device.
The JC14xx are RS422/RS485-based cards, but in order for these to be
supported natively, we will need additional tweaks to the 8250 layers so
we can specify some values for the 950's registers. Hence, these two
entries are commented out.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Wed, 13 Dec 2006 17:15:34 +0000 (09:15 -0800)]
Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block:
[PATCH] Fixup cciss error handling
[PATCH] Allow as-iosched to be unloaded
[PATCH 2/2] cciss: remove calls to pci_disable_device
[PATCH 1/2] cciss: map out more memory for config table
[PATCH] Propagate down request sync flag
Resolve trivial whitespace conflict in drivers/block/cciss.c manually.
Linus Torvalds [Wed, 13 Dec 2006 17:13:19 +0000 (09:13 -0800)]
Merge branch 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
hwmon: Add MAINTAINERS entry for new ams driver
hwmon: New AMS hardware monitoring driver
hwmon/w83793: Add documentation and maintainer
hwmon: New Winbond W83793 hardware monitoring driver
hwmon: Update Rudolf Marek's e-mail address
hwmon/f71805f: Fix the device address decoding
hwmon/f71805f: Always create all fan inputs
hwmon/f71805f: Add support for the Fintek F71872F/FG chip
hwmon: New PC87427 hardware monitoring driver
hwmon/it87: Remove the SMBus interface support
hwmon/hdaps: Update the list of supported devices
hwmon/hdaps: Move the DMI detection data to .data
hwmon/pc87360: Autodetect the VRM version
hwmon/f71805f: Document the fan control features
hwmon/f71805f: Add support for "speed mode" fan speed control
hwmon/f71805f: Support DC fan speed control mode
hwmon/f71805f: Let the user adjust the PWM base frequency
hwmon/f71805f: Add manual fan speed control
hwmon/f71805f: Store the fan control registers
Robert P. J. Day [Wed, 13 Dec 2006 08:35:56 +0000 (00:35 -0800)]
[PATCH] getting rid of all casts of k[cmz]alloc() calls
Run this:
#!/bin/sh
for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do
echo "De-casting $f..."
perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" $f
done
And then go through and reinstate those cases where code is casting pointers
to non-pointers.
And then drop a few hunks which conflicted with outstanding work.
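The two cases the script has to tell apart look roughly like this. The sketch is user-space C with malloc() standing in for the k[cmz]alloc() family, and the demo_* names are invented for illustration: assigning the result to a pointer needs no cast, while converting it to a non-pointer type does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_item { int id; };

/* No cast: void * converts implicitly to any object pointer type in C. */
static struct demo_item *demo_alloc_item(int id)
{
    struct demo_item *item = malloc(sizeof(*item));

    if (item)
        item->id = id;
    return item;
}

/* Pointer to non-pointer: this cast has to stay. */
static uintptr_t demo_alloc_as_integer(size_t size)
{
    return (uintptr_t)malloc(size);
}

/* Returns 1 if both allocation styles produce usable memory. */
static int demo_decast_check(void)
{
    struct demo_item *item = demo_alloc_item(42);
    uintptr_t raw = demo_alloc_as_integer(16);
    int ok = item != NULL && item->id == 42 && raw != 0;

    free(item);
    free((void *)raw);
    return ok;
}
```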
Cc: Russell King <rmk@arm.linux.org.uk>, Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Greg KH <greg@kroah.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Paul Fulghum <paulkf@microgate.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Karsten Keil <kkeil@suse.de> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Ian Kent <raven@themaw.net> Cc: Steven French <sfrench@us.ibm.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Jaroslav Kysela <perex@suse.cz> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sergei Shtylyov [Wed, 13 Dec 2006 08:35:53 +0000 (00:35 -0800)]
[PATCH] HPT37x: read f_CNT saved by BIOS from port
The undocumented register BIOS uses for saving f_CNT seems to only be
mapped to I/O space while all the other HPT3xx regs are dual-mapped. Looks
like another HighPoint's dirty trick. With this patch, the deadly kernel
oops on the cards having the modern HighPoint BIOSes is now at last gone!
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sergei Shtylyov [Wed, 13 Dec 2006 08:35:52 +0000 (00:35 -0800)]
[PATCH] ide: HPT3xx: fix PCI clock detection
Use the f_CNT value saved by the HighPoint BIOS if available as reading it
directly would give us a wrong PCI frequency after DPLL has already been
calibrated by BIOS.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sergei Shtylyov [Wed, 13 Dec 2006 08:35:51 +0000 (00:35 -0800)]
[PATCH] ide: fix the case of multiple HPT3xx chips present
init_chipset_hpt366() modifies some fields of the ide_pci_device_t structure
depending on the chip's revision, so pass it a copy of the structure to avoid
issues when multiple different chips are present.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sergei Shtylyov [Wed, 13 Dec 2006 08:35:50 +0000 (00:35 -0800)]
[PATCH] ide: fix HPT3xx hotswap support
Fix the broken hotswap code: on HPT37x it caused RESET- to glitch when
tristating the bus (the MISC control 3/6 and soft control 2 need to be written
to in a certain order), and for HPT36x the obsolete HDIO_TRISTATE_HWIF
ioctl() handler was called instead which treated the state argument wrong.
Also, get rid of the soft control reg. 1 write to enable IDE interrupt --
this is done in init_hpt37x() already...
Has been tested on HPT370 and 371N.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sergei Shtylyov [Wed, 13 Dec 2006 08:35:49 +0000 (00:35 -0800)]
[PATCH] ide: optimize HPT37x timing tables
Save some space on the timing tables by introducing the separate transfer mode
table in which the mode lookup is done to get the index into the timing table
itself. Get rid of the rest of the obsolete/duplicate tables and use one set
of tables for the whole HPT37x chip family like the HighPoint open-source
drivers do. Document the different timing register layout for the HPT36x
chip family (this is my guesswork based on the timing values).
Has been tested and works fine on HPT370/302/371N.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sergei Shtylyov [Wed, 13 Dec 2006 08:35:49 +0000 (00:35 -0800)]
[PATCH] ide: fix HPT37x timing tables
Fix/remove bad/unused timing tables: HPT370/A 66 MHz tables weren't really
needed (the chips are not UltraATA/133 capable and shouldn't support 66 MHz
PCI) and had many modes over- and underclocked, HPT372 33 MHz table was in
fact for 66 MHz and 50 MHz table missed UltraDMA mode 6, HPT374 33 MHz table
was really for 50 MHz... (Actually, HPT370/A 33 MHz tables also have issues,
e.g. HPT370 has PIO modes 0/1 overclocked.)
There's also no need for the separate HPT374 tables because HPT372 timings
should be the same (and those tables have UltraDMA mode 6 which HPT374 supports
depending on HPT374_ALLOW_ATA133_6 #define)...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sergei Shtylyov [Wed, 13 Dec 2006 08:35:47 +0000 (00:35 -0800)]
[PATCH] ide: HPT3xxN clocking fixes
Fix serious problems with the HPT372N clock turnaround code:
- the wrong ports were written to when called for the secondary channel;
- it didn't serialize access to the channels;
- turnaround shouldn't be done on 66 MHz PCI;
- caching the clock mode per-channel caused it to get out of sync with the
actual register value.
Additionally, avoid calibrating PLL twice (for each channel) as the second try
results in a wrong PCI frequency and thus in the wrong timings.
Make the driver deal with HPT302N and HPT371N correctly -- the clocking and
(seemingly) the need for clock turnaround are the same as for HPT372N. HPT371/N
chips have only one, secondary channel, so avoid touching their "pure virtual"
primary channel, and disable it if the BIOS hasn't done this already.
Also, while at it, disable UltraATA/133 for HPT372 by default -- the 50 MHz DPLL
clock doesn't allow for this speed anyway. And remove the traces of the former
bad patch that wasn't even applicable to this version of the driver.
Has been tested on HPT370/371N, unfortunately I don't have immediate access
to the other chips...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric Dumazet [Wed, 13 Dec 2006 08:35:45 +0000 (00:35 -0800)]
[PATCH] Optimize calc_load()
calc_load() is called by the timer interrupt to update avenrun[]. It currently
calls nr_active() at each timer tick (HZ times per second), while the update of
avenrun[] is done only once every 5 seconds (LOAD_FREQ = 5*HZ ticks).
nr_active() is quite expensive on SMP machines, since it has to sum up
nr_running and nr_uninterruptible of all online CPUs, pulling in foreign
dirty cache lines.
This patch is an optimization of calc_load() so that nr_active() is called
only if we need it.
The use of unlikely() is welcome since the condition is true only once every
5*HZ ticks.
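A rough userspace sketch of the idea, not the actual kernel patch: the HZ value, the single avenrun_1min slot, the nr_active() stub and its call counter are all stand-ins chosen for illustration. The point is only that the expensive sum moves inside the branch that is taken once per LOAD_FREQ ticks.

```c
#define HZ        250           /* assumed tick rate, illustration only */
#define LOAD_FREQ (5 * HZ)      /* ticks between avenrun[] updates */
#define FSHIFT    11            /* bits of fixed-point precision */
#define FIXED_1   (1UL << FSHIFT)
#define EXP_1     1884UL        /* 1-minute decay factor in fixed point */

#if defined(__GNUC__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

static unsigned long avenrun_1min;     /* stand-in for avenrun[0] */
static unsigned long nr_active_calls;  /* counts the expensive calls */

/* Stub for the expensive cross-CPU sum; pretends two tasks are active. */
static unsigned long nr_active(void)
{
	nr_active_calls++;
	return 2;
}

static void calc_load(unsigned long ticks)
{
	static long count = LOAD_FREQ;

	count -= (long)ticks;
	if (unlikely(count < 0)) {
		/* Only now pay for nr_active(), once per update period. */
		unsigned long active = nr_active() * FIXED_1;

		do {
			avenrun_1min = (avenrun_1min * EXP_1 +
					active * (FIXED_1 - EXP_1)) >> FSHIFT;
			count += LOAD_FREQ;
		} while (count < 0);
	}
}
```

Calling calc_load(1) on every tick, the stub runs only when the countdown goes negative, instead of HZ times per second.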
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown [Wed, 13 Dec 2006 08:35:45 +0000 (00:35 -0800)]
[PATCH] knfsd: Fix up some bit-rot in exp_export
The nfsservctl system call isn't used by recent nfs-utils releases for
exporting filesystems, and consequently the code that it uses - exp_export -
has suffered some bitrot.
In particular:
- some newly added fields in 'struct svc_export' are not being initialised
properly.
- the return value is now always -ENOMEM ...
This patch fixes both these problems.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
J.Bruce Fields [Wed, 13 Dec 2006 08:35:43 +0000 (00:35 -0800)]
[PATCH] knfsd: nfsd4: simplify filehandle check
Kill another big "if" clause.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
J.Bruce Fields [Wed, 13 Dec 2006 08:35:39 +0000 (00:35 -0800)]
[PATCH] knfsd: nfsd4: simplify migration op check
I'm not too fond of these big if conditions. Replace them by checks of a flag
in the operation descriptor. To my eye this makes the code a bit more
self-documenting, and makes the complicated part of the code (proc_compound) a
little more compact.
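As a minimal sketch of the flag-in-descriptor approach (not the nfsd code itself): the struct layout, the ALLOWED_WITHOUT_FH flag name, the two sample ops and the error codes below are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical flag: the op may run before a current filehandle is set. */
#define ALLOWED_WITHOUT_FH 0x1

#define ERR_NOFILEHANDLE (-1)   /* illustrative error codes */
#define ERR_BAD_OP       (-2)

struct compound_state {
	int have_fh;            /* nonzero once a filehandle is set */
};

typedef int (*op_func)(struct compound_state *);

/* Per-operation descriptor: the handler plus the properties we test for. */
struct op_desc {
	op_func func;
	unsigned int flags;
	const char *name;
};

enum { OP_PUTROOTFH, OP_GETATTR, OP_MAX };

static int do_putrootfh(struct compound_state *cs)
{
	cs->have_fh = 1;
	return 0;
}

static int do_getattr(struct compound_state *cs)
{
	(void)cs;
	return 0;
}

static const struct op_desc ops[OP_MAX] = {
	[OP_PUTROOTFH] = { do_putrootfh, ALLOWED_WITHOUT_FH, "PUTROOTFH" },
	[OP_GETATTR]   = { do_getattr,   0,                  "GETATTR"   },
};

/* One flag test replaces an "if" that lists every exempt op number. */
static int run_op(struct compound_state *cs, int opnum)
{
	const struct op_desc *op;

	if (opnum < 0 || opnum >= OP_MAX)
		return ERR_BAD_OP;
	op = &ops[opnum];
	if (!cs->have_fh && !(op->flags & ALLOWED_WITHOUT_FH))
		return ERR_NOFILEHANDLE;
	return op->func(cs);
}
```

Adding an exempt operation then means setting a flag in its table entry rather than extending a condition in the compound loop.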
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
J.Bruce Fields [Wed, 13 Dec 2006 08:35:38 +0000 (00:35 -0800)]
[PATCH] knfsd: nfsd4: reorganize compound ops
Define an op descriptor struct, use it to simplify nfsd4_proc_compound().
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
J.Bruce Fields [Wed, 13 Dec 2006 08:35:31 +0000 (00:35 -0800)]
[PATCH] knfsd: nfsd4: make verify and nverify wrappers
Make wrappers for verify and nverify, for consistency with other ops.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>