Andy Whitcroft [Wed, 22 Aug 2007 21:01:26 +0000 (14:01 -0700)]
synchronous lumpy reclaim: wait for page writeback when directly reclaiming contiguous areas
Lumpy reclaim works by selecting a lead page from the LRU list and then
selecting pages for reclaim from the order-aligned area of pages. In the
situation were all pages in that region are inactive and not referenced by any
process over time, it works well.
In the situation where there is even light load on the system, the pages may
not free quickly. Out of a area of 1024 pages, maybe only 950 of them are
freed when the allocation attempt occurs because lumpy reclaim returned early.
This patch alters the behaviour of direct reclaim for large contiguous
blocks.
The first attempt to call shrink_page_list() is asynchronous but if it fails,
the pages are submitted a second time and the calling process waits for the IO
to complete. This may stall allocators waiting for contiguous memory but that
should be expected behaviour for high-order users. It is preferable behaviour
to potentially queueing unnecessary areas for IO. Note that kswapd will not
stall in this fashion.
[apw@shadowen.org: update to version 2]
[apw@shadowen.org: update to version 3] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Whitcroft [Wed, 22 Aug 2007 21:01:25 +0000 (14:01 -0700)]
synchronous lumpy reclaim: ensure we count pages transitioning inactive via clear_active_flags
As pointed out by Mel when reclaim is applied at higher orders a significant
amount of IO may be started. As this takes finite time to drain reclaim will
consider more areas than ultimatly needed to satisfy the request. This leads
to more reclaim than strictly required and reduced success rates.
I was able to confirm Mel's test results on systems locally. These show that
even under light load the success rates drop off far more than expected.
Testing with a modified version of his patch (which follows) I was able to
allocate almost all of ZONE_MOVABLE with a near idle system. I ran 5 test
passes sequentially following system boot (the system has 29 hugepages in
ZONE_MOVABLE):
2.6.23-rc1 11 8 6 7 7
sync_lumpy 28 28 29 29 26
These show that although hugely better than the near 0% success normally
expected we can only allocate about a 1/4 of the zone. Using synchronous
reclaim for these allocations we get close to 100% as expected.
I have also run our standard high order tests and these show no regressions in
allocation success rates at rest, and some significant improvements under
load.
This patch:
We are transitioning pages from active to inactive in clear_active_flags,
those need counting as PGDEACTIVATE vm events.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Neuling [Wed, 22 Aug 2007 23:31:42 +0000 (09:31 +1000)]
Documentation: fix getdelays.c printf bug
Commit b663a79c191508f27cd885224b592a878c0ba0f6 ("taskstats: add
context-switch counters") incorrectly removed a comma from a printf
statement. This causes corruption in the output printing or a seg
fault.
Andrew Morton [Wed, 22 Aug 2007 21:01:20 +0000 (14:01 -0700)]
free_irq(): fix DEBUG_SHIRQ handling
If we're going to run the handler from free_irq() then we must do it with
local irq's disabled. Otherwise lockdep complains that the handler is taking
irq-safe spinlocks in a non-irq-safe fashion.
Cc: Ingo Molnar <mingo@elte.hu> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add blacklisting capability to serial_pci to avoid misdetection of serial ports
The serial_pci driver tries to guess serial ports on unknown devices based
on the PCI class (modem or serial). On certain softmodems (AC'97 modems)
this can lead to the recognition of non-existing serial ports.
This patch adds a blacklist of PCI IDs that are to be ignored by the driver.
[akpm@linux-foundation.org: cleanups] Signed-off-by: Christian Schmidt <schmidt@digadd.de> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Russell King <rmk+lkml@arm.linux.org.uk> Cc: Yinghai Lu <yinghai.lu@sun.com> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Corey Minyard [Wed, 22 Aug 2007 21:01:18 +0000 (14:01 -0700)]
Serial 8250: handle saving the clear-on-read bits from the LSR and MSR
Reading the LSR clears the break, parity, frame error, and overrun bits in
the 8250 chip, but these are not being saved in all places that read the
LSR. Same goes for the MSR delta bits. Save the LSR bits off whenever the
lsr is read so they can be handled later in the receive routine. Save the
MSR bits to be handled in the modem status routine.
Also, clear the stored bits and clear the interrupt registers before
enabling interrupts, to avoid handling old values of the stored bits in the
interrupt routines.
[akpm@linux-foundation.org: clean up pre-existing code] Signed-off-by: Corey Minyard <minyard@acm.org> Cc: Russell King <rmk+lkml@arm.linux.org.uk> Cc: Yinghai Lu <yinghai.lu@sun.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Niels de Vos [Wed, 22 Aug 2007 21:01:14 +0000 (14:01 -0700)]
serial: add support for ITE 887x chips
Add support for the it887x-chips (PCI) manufactured by ITE.
Signed-off-by: Niels de Vos <niels.devos@wincor-nixdorf.com> Cc: Russell King <rmk@arm.linux.org.uk> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Woodhouse [Wed, 22 Aug 2007 21:01:11 +0000 (14:01 -0700)]
serial: don't optimise away baud rate changes when BOTHER is used
The uart_set_termios() function will bail out early without bothering to
touch the hardware, if it decides that nothing "relevant" has changed.
Unfortunately, its idea of "relevant" doesn't include c_[io]speed. So if
the baud rate bits are BOTHER and you just change the speed, the change
gets optimised away.
This patch makes it ignore the old Bfoo bits in c_cflag and just check
whether c_ispeed and c_ospeed have changed. Those integers are always set
appropriately for us by set_termios().
Signed-off-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Alan Cox <alan@redhat.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lee Schermerhorn [Wed, 22 Aug 2007 21:01:06 +0000 (14:01 -0700)]
Document Linux Memory Policy
I couldn't find any memory policy documentation in the Documentation
directory, so here is my attempt to document it.
There's lots more that could be written about the internal design--including
data structures, functions, etc. However, if you agree that this is better
that the nothing that exists now, perhaps it could be merged. This will
provide a baseline for updates to document the many policy patches that are
currently being worked.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@suse.de> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Acked-by: Rob Landley <rob@landley.net> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Yoder [Wed, 22 Aug 2007 21:01:04 +0000 (14:01 -0700)]
tpmdd maintainers
Fix up the maintainers info in the tpm drivers. Kylene will be out for
some time, so copying the sourceforge list is the best way to get some
attention.
Cc: Marcel Selhorst <tpm@selhorst.net> Cc: Kylene Jo Hall <kjhall@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Whitcroft [Wed, 22 Aug 2007 21:01:03 +0000 (14:01 -0700)]
sparsemem: ensure we initialise the node mapping for SPARSEMEM_STATIC
Booting SPARSEMEM on NUMA systems trips a BUG in page_alloc.c:
Initializing HighMem for node 0 (00038000:00100000)
Initializing HighMem for node 1 (00100000:001ffe00)
------------[ cut here ]------------
kernel BUG at /home/apw/git/linux-2.6/mm/page_alloc.c:456!
[...]
This occurs because the section to node id mapping is not being
setup correctly during init under SPARSEMEM_STATIC, leading to an
attempt to free pages from all nodes into the zones on node 0.
When the zone_table[] was removed in the following commit, a new
section to node mapping table was introduced:
That conversion inadvertantly only initialised the node mapping in
SPARSEMEM_EXTREME. Ensure we initialise the node mapping in
SPARSEMEM_STATIC.
[akpm@linux-foundation.org: make the stubs static inline] Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Wed, 22 Aug 2007 21:01:02 +0000 (14:01 -0700)]
eCryptfs: fix lookup error for special files
When ecryptfs_lookup() is called against special files, eCryptfs generates
the following errors because it tries to treat them like regular eCryptfs
files.
Error opening lower file for lower_dentry [0xffff810233a6f150], lower_mnt [0xffff810235bb4c80], and flags [0x8000]
Error opening lower_file to read header region
Error attempting to read the [user.ecryptfs] xattr from the lower file; return value = [-95]
Valid metadata not found in header region or xattr region; treating file as unencrypted
For instance, the problem can be reproduced by the steps below.
# mkdir /root/crypt /mnt/crypt
# mount -t ecryptfs /root/crypt /mnt/crypt
# mknod /mnt/crypt/c0 c 0 0
# umount /mnt/crypt
# mount -t ecryptfs /root/crypt /mnt/crypt
# ls -l /mnt/crypt
This patch fixes it by adding a check similar to directories and
symlinks.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] Add support for 1533 bridge to alim1535_wdt
[WATCHDOG] Add a 00-INDEX file to Documentation/watchdog/
[WATCHDOG] Eurotechwdt.c - clean-up comments
Linus Torvalds [Wed, 22 Aug 2007 18:13:22 +0000 (11:13 -0700)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC32]: Revert f642b263800e6e57c377d630be6d2a999683b579.
[SPARC64]: Need to clobber global reg vars in switch_to().
Linus Torvalds [Wed, 22 Aug 2007 18:13:00 +0000 (11:13 -0700)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[IRDA] irda_nl_get_mode: always results in failure
[PPP]: Fix output buffer size in ppp_decompress_frame().
[IRDA]: Avoid a label defined but not used warning in irda_init()
[IPV6]: Fix kernel panic while send SCTP data with IP fragments
[SNAP]: Check packet length before reading
[DCCP]: Allocation in atomic context
Linus Torvalds [Wed, 22 Aug 2007 18:12:08 +0000 (11:12 -0700)]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] Change atomic_read/set to inline functions with barrier semantics.
[S390] kprobes: fix instruction length calculation
[S390] hypfs: inode corruption due to missing locking
[S390] disassembler: fix b2 opcodes like srst, bsg, and others
[S390] vmur: fix reference counting for vmur device structure
[S390] vmur: fix diag14 exceptions with addresses > 2GB.
[S390] qdio: Refresh buffer states for IQDIO Asynchronous output queue
[S390] qdio: fix EQBS handling on CCQ96
[S390] cio: change confusing message in cmf.
[S390] cio: dont forget to set last slot to NULL in ccw_uevent().
Zachary Amsden [Wed, 22 Aug 2007 01:30:36 +0000 (18:30 -0700)]
Fix lazy mode vmalloc synchronization for paravirt
Touching vmalloc memory in the middle of a lazy mode update can generate
a kernel PDE update, which must be flushed immediately. The fix is to
leave lazy mode when doing a vmalloc sync.
Heiko Carstens [Wed, 22 Aug 2007 11:51:45 +0000 (13:51 +0200)]
[S390] Change atomic_read/set to inline functions with barrier semantics.
After doing some tests this seems to be the best variant for s390 and
should be correct as well. With gcc 4.2.1 we get the following kernel
image sizes using the default configuration:
atomic_t type volatile, atomic_read/set defines 5311824 bytes
atomic_t type int, atomic_read/set defines 5270864 bytes
atomic_t type int, atomic_read/set inline asm 5279056 bytes
atomic_t type int, atomic_read/set inline barrier 5270864 bytes
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Placing a kprobe on "bc" instruction (s390/s390x) can cause an oops.
The instruction length is encoded into the first two bits of the s390
instruction. Kprobe is incorrectly computing the instruction length.
The instruction length is used for determining what type of "fix-up" is
needed for conditional branch instruction. The problem can bee seen by
placing a kprobe on a "bc" instruction that will not branch. The
results is that Kprobe incorrectly computes the new instruction
pointer (psw.addr) after single stepping the instruction. The problem
is corrected with this patch.
Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Michael Holzheu [Wed, 22 Aug 2007 11:51:43 +0000 (13:51 +0200)]
[S390] hypfs: inode corruption due to missing locking
hypfs removes the whole hypfs directory tree and creates a new one, when a
process triggers an update by writing to the "update" attribute. When removing
and creating files, it is necessary to lock the inode of the parent directory
where the files live. Currently hypfs does not lock the parent inode, which
can lead to inode corruption. This patch:
* Introduces correct locking
* Fixes i_nlink reference counting for inodes, when creating directories
* Adds info printk, when hypfs filesystem has been mounted
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[S390] disassembler: fix b2 opcodes like srst, bsg, and others
The instruction table for b2 opcodes was missing an opfrag value
for the cpya instruction. All instructions specified after cpya
were not considered by the disassembler. The fix is simple and
obvious - add the opfrag field to the cpya instruction.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Michael Holzheu [Wed, 22 Aug 2007 11:51:41 +0000 (13:51 +0200)]
[S390] vmur: fix reference counting for vmur device structure
When a vmur device is removed due to a detach of the device, currently the
ur device structure is freed. Unfortunately it can happen, that there is
still a user of the device structure, when the character device is open
during the detach process. To fix this, reference counting for the vmur
structure is introduced.
In addition to that, the online, offline, probe and remove functions are
serialized now using a global mutex.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Michael Holzheu [Wed, 22 Aug 2007 11:51:40 +0000 (13:51 +0200)]
[S390] vmur: fix diag14 exceptions with addresses > 2GB.
There are several s390 diagnose calls, which must be executed below the
2GB memory boundary. In order to enforce this, those diagnoses must be
compiled into the kernel. Currently diag 14 can be called within the
vmur kernel module from addresses above 2GB. This leads to specification
exceptions. This patch moves diag10, diag14 and diag210 into the new
diag.c file.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Klaus D. Wacker [Wed, 22 Aug 2007 11:51:39 +0000 (13:51 +0200)]
[S390] qdio: Refresh buffer states for IQDIO Asynchronous output queue
Hipersocket Multicast queue works asynchronously. When sending buffers,
the buffer state change may happen delayed. The tasklet for checking
changes in the outbound queue excluded IQDIO async queues from this
process. This created either a hang situation when the queue ran full,
or presented a hang situation a interface close time.
The tasklet processing is changed to include IQDIO async queues when
requesting buffer state refresh.
Signed-off-by: Klaus D. Wacker <kdwacker@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Klaus D. Wacker [Wed, 22 Aug 2007 11:51:38 +0000 (13:51 +0200)]
[S390] qdio: fix EQBS handling on CCQ96
QDIO returned from EQBS instruction in any case after return code
CCQ=96 was issued regardless whether buffer states for at least one
buffer were extracted or not.
This caused FCP devices to hang when running under z/VM and having
QIOASSASIST=ON and having high I/O rates.
In order to fix this qdio return code processing of EQBS instruction
after CCQ=96 is changed that buffers are returned and if no buffers
where extracted the instruction is repeated at once.
Signed-off-by: Klaus D. Wacker <kdwacker@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cornelia Huck [Wed, 22 Aug 2007 11:51:37 +0000 (13:51 +0200)]
[S390] cio: change confusing message in cmf.
cmf currently prints a message that more than 4096 channels are not
allowed in basic mode - however, this can only be enforced if cmf was
a module (which is no longer possible). It makes much more sense to
not check the specified number of channels and just print a message if
the block for basic mode could not be allocated (which may happen for
any number of specified channels).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Andy Whitcroft [Wed, 22 Aug 2007 04:23:39 +0000 (21:23 -0700)]
[IRDA] irda_nl_get_mode: always results in failure
It seems an extraneous trailing ';' has slipped in to the error handling for a
name registration failure causing the error path to trigger unconditionally.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[PPP]: Fix output buffer size in ppp_decompress_frame().
This patch addresses the issue with "osize too small" errors in mppe
encryption. The patch fixes the issue with wrong output buffer size
being passed to ppp decompression routine.
--------------------
As pointed out by Suresh Mahalingam, the issue addressed by
ppp-fix-osize-too-small-errors-when-decoding patch is not fully resolved yet.
The size of allocated output buffer is correct, however it size passed to
ppp->rcomp->decompress in ppp_generic.c if wrong. The patch fixes that.
--------------------
Signed-off-by: Konstantin Sharlaimov <konstantin.sharlaimov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Wed, 22 Aug 2007 03:59:08 +0000 (20:59 -0700)]
[IPV6]: Fix kernel panic while send SCTP data with IP fragments
If ICMP6 message with "Packet Too Big" is received after send SCTP DATA,
kernel panic will occur when SCTP DATA is send again.
This is because of a bad dest address when call to skb_copy_bits().
The messages sequence is like this:
Endpoint A Endpoint B
<------- SCTP DATA (size=1432)
ICMP6 message ------->
(Packet Too Big pmtu=1280)
<------- Resend SCTP DATA (size=1432)
------------kernel panic---------------
@@ -761,7 +762,7 @@ slow_path:
/*
* Copy a block of the IP datagram.
*/
- if (skb_copy_bits(skb, ptr, frag->h.raw, len))
+ if (skb_copy_bits(skb, ptr, skb_transport_header(skb),
len))
BUG();
left -= len;
====================
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
The problem was that GFP_KERNEL was used; fixed by using gfp_any().
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Robin Getz [Wed, 22 Aug 2007 03:14:58 +0000 (23:14 -0400)]
fix - ensure we don't use bootconsoles after init has been released
Gerd Hoffmann pointed out that my patch from yesterday can lead
to a null pointer dereference if the kernel is booted with no
console, and no earlyprintk defined. This fixes that issue.
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kumar Gala [Wed, 22 Aug 2007 00:15:31 +0000 (19:15 -0500)]
[POWERPC] Fix PCI Device ID for MPC8544/8533 processors
The initial user manuals for MPC8544/8533 had some issues with properly
documenting the device IDs for MPC8544/8533. These processors are almost
identical and both show up on the reference boards.
Fix up the quirks for PCIe support to handle MPC8533/E.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
There are special PHY settings available on Yukon EC-U chip that
should not get cleared. This should solve mysterious errors on some
motherboards (like Gigabyte DS-3).
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Martin Bachem [Tue, 21 Aug 2007 12:26:21 +0000 (14:26 +0200)]
hisax: update hfc_usb driver
This fixes handling of USB ISO completion error -EXDEV and includes
several other changes to current CVS version at isdn4linux.de (changes
in debug flags, style of code remarks, etc)
Linus Torvalds [Tue, 21 Aug 2007 06:38:44 +0000 (23:38 -0700)]
Revert "USB: EHCI cpufreq fix"
This reverts commit 196705c9bbc03540429b0f7cf9ee35c2f928a534. It was
reported to cause a regression by Daniel Exner, and Arjan van de Ven
points out that we actually already have infrastructure in place for
setting limits on acceptable DMA latency that would be the much more
correct fix for the problem with some Broadcom EHCI controllers.
Fixed up trivial conflicts due to the changes to support big-endian host
controller descriptors in drivers/usb/host/{ehci-sched.c,ehci.h}.
Zach Brown [Tue, 21 Aug 2007 00:12:01 +0000 (17:12 -0700)]
dio: zero struct dio with kzalloc instead of manually
This patch uses kzalloc to zero all of struct dio rather than manually
trying to track which fields we rely on being zero. It passed aio+dio
stress testing and some bug regression testing on ext3.
This patch was introduced by Linus in the conversation that lead up to
Badari's minimal fix to manually zero .map_bh.b_state in commit:
I was unable to measure a stable difference in the number of cpu cycles
spent in blockdev_direct_IO() when pushing aio+dio 256K reads at
~340MB/s.
So the resulting intent of the patch isn't a performance gain but to
avoid exposing ourselves to the risk of finding another field like
.map_bh.b_state where we rely on zeroing but don't enforce it in the
code.
Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 21 Aug 2007 05:48:24 +0000 (22:48 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (6028): Turn an unnecessary mdelay() into msleep().
V4L/DVB (6027): Get rid of an ill-behaved msleep in i2c write
V4L/DVB (6026): Avoid powering up the camera on resume
V4L/DVB (6016): get_dvb_firmware: update script for new location of tda10046 firmware
V4L/DVB (5991): dvb-pll: Set minimum and maximum frequency properly
V4L/DVB (5969): ivtv: report ivtv version in status log
V4L/DVB (5967): ivtv: fix VIDIOC_S_FBUF:new OSD values where never set
V4L/DVB (5968): videodev2.h: remove superfluous FBUF GLOBAL_INV_ALPHA support
David Woodhouse [Mon, 20 Aug 2007 10:05:29 +0000 (11:05 +0100)]
JFFS2 locking regression fix.
Commit a491486a2087ac3dfc00efb4f838c8d684afaf54 introduced a locking
problem in JFFS2 -- we up() the alloc_sem when we weren't previously
holding it. This leads to all kinds of fun behaviour later.
There was a _reason_ for the
if (1 /* alternative path needs testing */ ||
which the above-mentioned commit removed :)
Discovered and debugged by Giulio Fedel <giulio.fedel@andorsystems.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This ensures that a bootconsole is unregistered if it is not replaced.
The current implementation spews garbage out the bootconsole in this case,
since the bootconsole structure is normally in the init section, and is
freed, but still used.
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ide-disk: workaround for buggy HPA support on ST340823A (take 3)
This disk reports total number of sectors instead of maximum sector address
in response to READ_NATIVE_MAX_ADDRESS command and also happily accepts
SET_MAX_ADDRESS command with the bogus value. This results in +1 sector
capacity being used and errors on attempts to use the last sector.
Programming DMA mode may destroy current PIO mode setting so if
CONFIG_HPT34X_AUTODMA=n (the default case) make ide_tune_dma() fail
early by disabling all host DMA masks and re-tune PIO mode.
This fix doesn't help with the driver being broken but is needed
for some other changes.
ide: add cable detection for early UDMA66 devices (take 3)
* Move ide_in_drive_list() from ide-dma.c to ide-iops.c.
* Add ivb_list[] table for listening early UDMA66 devices which don't conform
to ATA4 standard wrt cable detection (bit14 is zero, only bit13 is valid)
and use only device side cable detection for them since host side cable
detection may be unreliable.
* Add model "QUANTUM FIREBALLlct10 05" with firwmare "A03.0900" to the list
(from Craig's bugreport).
v2:
* Improve kernel message basing on suggestion from Sergei.
v3:
* Don't print kernel message when no device side cable detection is done,
plus some minor fixes. (Noticed by Sergei)
* Add DMA blacklist checking (->ide_dma_on check probably can go now).
* Add ->atapi_dma flag checking and remove no longer needed
ns87415_ide_dma_check() from ns87415 host driver.
* Remove now needless __ide_dma_check() wrapper and symbol export.
* Check drive->autodma instead of hwif->autodma (there should be no changes in
behavior as all users of config_drive_for_dma() set both ->autodma flags).
ide: fix hidden dependencies on CONFIG_IDE_GENERIC
Some host drivers depend on CONFIG_IDE_GENERIC to do the probing but their
config options lack explicit dependencies on IDE_GENERIC. In the long-term
these host drivers should be fixed to do the probing themselves but for now
fix them by making their config options select CONFIG_IDE_GENERIC.
Tejun Heo [Mon, 20 Aug 2007 20:42:53 +0000 (22:42 +0200)]
ide: make CONFIG_IDE_GENERIC default to N
These days, CONFIG_IDE_GENERIC causes more confusion and
misconfiguration than it helps. Especially so because libata is
linked after the generic driver. Default to N.
Jonathan Corbet [Fri, 17 Aug 2007 04:02:33 +0000 (01:02 -0300)]
V4L/DVB (6027): Get rid of an ill-behaved msleep in i2c write
Configuring the OLPC camera requires something over 150 register
writes. Unfortunately, querying the CAFE i2c controller too
soon after a write causes the hardware to flake. The problem had
been "solved" with an msleep() call, but, between the number of
registers and how msleep() behaves, that resulted in a 3-second
delay on camera initialization. Instead, we hand-code a wait for
the completion interrupt which avoids reading the status registers.
Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Al Viro [Sun, 19 Aug 2007 03:51:26 +0000 (04:51 +0100)]
missing return in bridge sysfs code
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Avi Kivity [Sun, 19 Aug 2007 12:57:26 +0000 (15:57 +0300)]
KVM: Avoid calling smp_call_function_single() with interrupts disabled
When taking a cpu down, we need to hardware_disable() it.
Unfortunately, the CPU_DYING notifier is called with interrupts
disabled, which means we can't use smp_call_function_single().
Fortunately, the CPU_DYING notifier is always called on the dying cpu,
so we don't need to use the function at all and can simply call
hardware_disable() directly.
Tested-by: Paolo Ornati <ornati@fastwebnet.it> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christian Heim [Sun, 19 Aug 2007 11:07:59 +0000 (13:07 +0200)]
Remove double inclusion of linux/capability.h
Remove the second inclusion of linux/capability.h, which has been
introduced with "[PATCH] move capable() to capability.h" (commit c59ede7b78db329949d9cdcd7064e22d357560ef)
Signed-off-by: Christian Heim <phreak@gentoo.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Wright [Sat, 18 Aug 2007 21:31:41 +0000 (14:31 -0700)]
x86: properly initialize temp insn buffer for paravirt patching
With commit ab144f5ec64c42218a555ec1dbde6b60cf2982d6 the patching code
now collects the complete new instruction stream into a temp buffer
before finally patching in the new insns. In some cases the paravirt
patchers will choose to leave the patch site unpatched (length mismatch,
clobbers mismatch, etc).
This causes the new patching code to copy an uninitialized temp buffer,
i.e. garbage, to the callsite. Simply make sure to always initialize
the buffer with the original instruction stream. A better fix is to
audit all the patchers and return proper length so that apply_paravirt()
can skip copies when we leave the patch site untouched.
Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andi Kleen [Wed, 15 Aug 2007 00:40:36 +0000 (02:40 +0200)]
x86_64: Change PMDS invocation to single macro
Very old binutils (2.12.90...) seem to have trouble with newlines
in assembler macro invocation. They put them into the resulting
argument expansion. In this case this lead to a parse error because
a .rept expression ended up spread over multiple lines. Change the PMDS()
invocation to a single line.
Daniel Gollub [Wed, 15 Aug 2007 00:40:35 +0000 (02:40 +0200)]
x86_64: Fix to keep watchdog disabled by default for i386/x86_64
Fixed wrong expression which enabled watchdogs even if nmi_watchdog kernel
parameter wasn't set. This regression got slightly introduced with commit b7471c6da94d30d3deadc55986cc38d1ff57f9ca.
Introduced NMI_DISABLED (-1) which allows to switch the value of NMI_DEFAULT
without breaking the APIC NMI watchdog code (again).
Fixes:
https://bugzilla.novell.com/show_bug.cgi?id=298084
http://bugzilla.kernel.org/show_bug.cgi?id=7839
And likely some more nmi_watchdog=0 related issues.
Andi Kleen [Wed, 15 Aug 2007 00:40:34 +0000 (02:40 +0200)]
x86_64: Fail dma_alloc_coherent on dma less devices
This should fix an oops with PCMCIA PATA devices
http://bugzilla.kernel.org/show_bug.cgi?id=8424
This is not a full fix for the problem, but probably
still the right thing to do.
[ I'm almost certain it's *not* the right thing to do, but it avoids an
oops, and I want comments from others on what the right thing would
actually be.. I suspect we should just remove the use of dma_mask
entirely in this function, and just use coherent_dma_mask. - Linus ]
Timo Jantunen [Tue, 14 Aug 2007 18:56:57 +0000 (21:56 +0300)]
fix random hang in forcedeth driver when using netconsole
If the forcedeth driver receives too much work in an interrupt, it
assumes it has a broken hardware with stuck IRQ. It works around the
problem by disabling interrupts on the nic but makes a printk while
holding device spinlog - which isn't smart thing to do if you have
netconsole on the same nic.
This patch moves the printk's out of the spinlock protected area.
Without this patch the machine hangs hard. With this patch everything
still works even when there is significant increase on CPU usage while
using the nic.
Signed-off-by: Timo Jantunen <jeti@iki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Satyam Sharma [Thu, 16 Aug 2007 00:39:25 +0000 (06:09 +0530)]
i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
Use cpu_relax() in the busy loops, as atomic_read() doesn't automatically
imply volatility for i386 and x86_64. x86_64 doesn't have this issue because
it open-codes the while loop in smpboot.c:smp_callin() itself that already
uses cpu_relax().
For i386, however, smpboot.c:smp_callin() calls wait_for_init_deassert()
which is buggy for mach-default and mach-es7000 cases.
[ I test-built a kernel -- smp_callin() itself got inlined in its only
callsite, smpboot.c:start_secondary() -- and the relevant piece of
code disassembles to the following:
0xc1019704 <start_secondary+12>: mov 0xc144c4c8,%eax
0xc1019709 <start_secondary+17>: test %eax,%eax
0xc101970b <start_secondary+19>: je 0xc1019709 <start_secondary+17>
init_deasserted (at 0xc144c4c8) gets fetched into %eax only once and
then we loop over the test of the stale value in the register only,
so these look like real bugs to me. With the fix below, this becomes:
Linus Torvalds [Sat, 18 Aug 2007 16:38:56 +0000 (09:38 -0700)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
Cross-compilation between e.g. i386 -> 64bit could break -> work around it
[IA64] Enable early console for Ski simulator
[IA64] forbid ptrace changes psr.ri to 3
[IA64] Failure to grow RBS
[IA64] Fix processor_get_freq
[IA64] SGI Altix : fix a force_interrupt bug on altix
[IA64] Update arch/ia64/configs/* s/SLAB/SLUB/
[IA64] get back PT_IA_64_UNWIND program header
[IA64] need NOTES in vmlinux.lds.S
[IA64] make unwinder stop at last frame of the bootloader
[IA64] Clean up CPE handler registration
[IA64] Include Kconfig.preempt
[IA64] SN2 needs platform specific irq_to_vector() function.
[IA64] Use atomic64_read to read an atomic64_t.
[IA64] disable irq's and check need_resched before safe_halt
Linus Torvalds [Sat, 18 Aug 2007 16:38:09 +0000 (09:38 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx4: Incorrect semicolon after if statement
mlx4_core: Wait 1 second after reset before accessing device
IPoIB: Fix leak in ipoib_transport_dev_init() error path
IB/mlx4: Fix opcode returned in RDMA read completion
IB/srp: Add OUI for new Cisco targets
IB/srp: Wrap OUI checking for workarounds in helper functions
RDMA/cxgb3: Always call low level send function via cxgb3_ofld_send()
IB: Move the macro IB_UMEM_MAX_PAGE_CHUNK() to umem.c
IB: Include <linux/list.h> and <linux/rwsem.h> from <rdma/ib_verbs.h>
IB: Include <linux/list.h> from <rdma/ib_mad.h>
IB/mad: Fix address handle leak in mad_rmpp
IB/mad: agent_send_response() should be void
IB/mad: Fix memory leak in switch handling in ib_mad_recv_done_handler()
IB/mad: Fix error path if response alloc fails in ib_mad_recv_done_handler()
IB/sa: Don't need to check for default P_Key twice
IB/core: Ignore membership bit in ib_find_pkey()
Linus Torvalds [Sat, 18 Aug 2007 16:34:28 +0000 (09:34 -0700)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[MATH-EMU]: Fix underflow exception reporting.
[SPARC64]: Create a HWCAP_SPARC_N2 and report it to userspace on Niagara-2.
[SPARC64]: SMP trampoline needs to avoid %tick_cmpr on sun4v too.
[SPARC64]: Do not touch %tick_cmpr on sun4v cpus.
[SPARC64]: Niagara-2 optimized copies.
[SPARC64]: Allow userspace to get at the machine description.
[SPARC32]: Remove superfluous 'kernel_end' alignment on sun4c.
[SPARC32]: Fix bogus ramdisk image location check.
[SPARC32]: Remove iommu from struct sbus_bus and use archdata like sparc64.
Linus Torvalds [Sat, 18 Aug 2007 16:33:25 +0000 (09:33 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix invalid semicolon after if statement
[POWERPC] ps3: Fix no storage devices found
[POWERPC] Fix for assembler -g
[POWERPC] Fix small race in 44x tlbie function
[POWERPC] Remove unused code causing a compile warning
[POWERPC] cell: Fix errno for modular spufs_create with invalid neighbour
Linus Torvalds [Sat, 18 Aug 2007 16:31:05 +0000 (09:31 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup:
[x86 setup] edd.c: make sure MBR signatures actually get reported
[x86 setup] Don't use EDD to get the MBR signature
[x86 setup] The current display page is returned in %bh, not %bl
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] Check return code on failed alloc
[CIFS] Update CIFS project web site
[CIFS] Fix hang in find_writable_file
Marcel Holtmann [Fri, 17 Aug 2007 19:47:58 +0000 (21:47 +0200)]
Reset current->pdeath_signal on SUID binary execution
This fixes a vulnerability in the "parent process death signal"
implementation discoverd by Wojciech Purczynski of COSEINC PTE Ltd.
and iSEC Security Research.