Len Brown [Sun, 12 Aug 2007 04:13:02 +0000 (00:13 -0400)]
ACPI: thermal: add DMI hooks to handle AOpen's broken Award BIOS
Use DMI to:
1. enable polling (BIOS thermal events are broken)
2. disable active trip points (BIOS fan control is broken)
3. disable passive trip point (BIOS hard-codes it too low)
The actual temperature reading does work,
and with the aid of polling, the critical
trip point should work too.
Len Brown [Sun, 12 Aug 2007 04:12:54 +0000 (00:12 -0400)]
ACPI: thermal: create "thermal.act=" to disable or override active trip point
thermal.act=-1 disables all active trip points
in all ACPI thermal zones.
thermal.act=C, where C > 0, overrides all lowest temperature
active trip points in all thermal zones to C degrees Celsius.
Raising this trip-point may allow you to keep your system silent
up to a higher temperature. However, it will not allow you to
raise the lowest temperature trip point above the next higher
trip point (if there is one). Lowering this trip point may
kick in the fan sooner.
Note that overriding this trip-point will disable any BIOS attempts
to implement hysteresis around the lowest temperature trip point.
This may result in the fan starting and stopping frequently
if temperature frequently crosses C.
WARNING: raising trip points above the manufacturer's defaults
may cause the system to run at higher temperature and shorten
its life.
Len Brown [Sun, 12 Aug 2007 04:12:44 +0000 (00:12 -0400)]
ACPI: thermal: create "thermal.nocrt" to disable critical actions
thermal.nocrt=1 disables actions on _CRT and _HOT
ACPI thermal zone trip-points. They will be marked
as <disabled> in /proc/acpi/thermal_zone/*/trip_points.
There are two cases where this option is used:
1. Debugging a hot system crossing valid trip point.
If your system fan is spinning at full speed,
be sure that the vent is not clogged with dust.
Many laptops have very fine thermal fins that are easily blocked.
Check that the processor fan-sink is properly seated,
has the proper thermal grease, and is really spinning.
Check for fan related options in BIOS SETUP.
Sometimes there is a performance vs quiet option.
Defaults are generally the most conservative.
If your fan is not spinning, yet /proc/acpi/fan/
has files in it, please file a Linux/ACPI bug.
WARNING: you risk shortening the lifetime of your
hardware if you use this parameter on a hot system.
Note that this refers to all system components,
including the disk drive.
2. Working around a cool system crossing critical
trip point due to erroneous temperature reading.
Try again with CONFIG_HWMON=n
There is known potential for conflict between the
the hwmon sub-system and the ACPI BIOS.
If this fixes it, notify lm-sensors@lm-sensors.org
and linux-acpi@vger.kernel.org
Otherwise, file a Linux/ACPI bug, or notify
just linux-acpi@vger.kernel.org.
Len Brown [Sun, 12 Aug 2007 04:12:35 +0000 (00:12 -0400)]
ACPI: thermal: create "thermal.psv=" to override passive trip points
"thermal.psv=-1" disables passive trip points
for all ACPI thermal zones.
"thermal.psv=C", where 'C' is degrees Celsius,
overrides all existing passive trip points
for all ACPI thermal zones.
thermal.psv is checked at module load time,
and in response to trip-point change events.
Note that if the system does not deliver thermal zone
temperature change events near the new trip-point,
then it will not be noticed. To force your custom
trip point to be noticed, you may need to enable polling:
eg. thermal.tzp=3000 invokes polling every 5 minutes.
Note that once passive thermal throttling is invoked,
it has its own internal Thermal Sampling Period (_TSP),
that is unrelated to _TZP.
WARNING: disabling or raising a thermal trip point
may result in increased running temperature and
shorter hardware lifetime on some systems.
Len Brown [Sun, 12 Aug 2007 04:12:26 +0000 (00:12 -0400)]
ACPI: thermal: expose "thermal.tzp=" to set global polling frequency
Thermal Zone Polling frequency (_TZP) is an optional ACPI object
recommending the rate that the OS should poll the associated thermal zone.
If _TZP is 0, no polling should be used.
If _TZP is non-zero, then the platform recommends that
the OS poll the thermal zone at the specified rate.
The minimum period is 30 seconds.
The maximum period is 5 minutes.
(note _TZP and thermal.tzp units are in deci-seconds,
so _TZP = 300 corresponds to 30 seconds)
If _TZP is not present, ACPI 3.0b recommends that the
thermal zone be polled at an "OS provided default frequency".
However, common industry practice is:
1. The BIOS never specifies any _TZP
2. High volume OS's from this century never poll any thermal zones
Ie. The OS depends on the platform's ability to
provoke thermal events when necessary, and
the "OS provided default frequency" is "never":-)
There is a proposal that ACPI 4.0 be updated to reflect
common industry practice -- ie. no _TZP, no polling.
The Linux kernel already follows this practice --
thermal zones are not polled unless _TZP is present and non-zero.
But thermal zone polling is useful as a workaround for systems
which have ACPI thermal control, but have an issue preventing
thermal events. Indeed, some Linux distributions still
set a non-zero thermal polling frequency for this reason.
But rather than ask the user to write a polling frequency
into all the /proc/acpi/thermal_zone/*/polling_frequency
files, here we simply document and expose the already
existing module parameter to do the same at system level,
to simplify debugging those broken platforms.
Note that thermal.tzp is a module-load time parameter only.
ACPI: thinkpad-acpi: fix sysfs paths in documentation
The documentation used "thinkpad-acpi" to refer to the directories in
sysfs, while it should have been using "thinkpad_acpi". Thanks to Hugh
Dickins for the error report.
I wish I could just call the module and everything else by the proper
name with the "-", instead of using these ugly translations to "_".
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Sat, 11 Aug 2007 23:18:58 +0000 (16:18 -0700)]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] monwriter: Serialization bug for multithreaded applications.
[S390] vmur: diag14 only works with buffers below 2GB
[S390] vmur: add "top of queue" sanity check for reader open
[S390] vmur: reject open on z/VM reader files with status HOLD
[S390] vmur: use DECLARE_COMPLETION_ONSTACK to keep lockdep happy
[S390] vmur: allocate single record buffers instead of one big data buffer
[S390] remove DEFAULT_MIGRATION_COST
[S390] qdio: make sure data structures are correctly aligned.
[S390] hypfs: implement show_options
[S390] cio: avoid memory leak on error in css_alloc_subchannel().
Linus Torvalds [Sat, 11 Aug 2007 23:09:49 +0000 (16:09 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix size check for hugetlbfs
[POWERPC] Fix initialization and usage of dma_mask
[POWERPC] Fix more section mismatches in head_64.S
[POWERPC] Revert "[POWERPC] Add 'mdio' to bus scan id list for platforms with QE UEC"
[POWERPC] PS3: Update ps3_defconfig
[POWERPC] PS3: Remove text saying PS3 support is incomplete
[POWERPC] PS3: Fix storage probe logic
[POWERPC] cell: Move SPU affinity init to spu_management_of_ops
[POWERPC] Fix potential duplicate entry in SLB shadow buffer
Linus Torvalds [Sat, 11 Aug 2007 23:01:34 +0000 (16:01 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
ocfs2: set non-default s_time_gran during mount
ocfs2: Retry sendpage() if it returns EAGAIN
ocfs2: Fix rename/extend race
[2.6 patch] ocfs2_insert_extent(): remove dead code
ocfs2: Fix max offset calculations
ocfs2: check ia_size limits in setattr
ocfs2: Fix some casting errors related to file writes
ocfs2: use s_maxbytes directly in ocfs2_change_file_space()
ocfs2: Restrict inode changes in ocfs2_update_inode_atime()
Linus Torvalds [Sat, 11 Aug 2007 23:01:06 +0000 (16:01 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
BLOCK: Hide the contents of linux/bio.h if CONFIG_BLOCK=n
sysace: HDIO_GETGEO has it's own method for ages
drivers/block/cpqarray.c: better error handling and kmalloc + memset conversion to k[cz]alloc
drivers/block/cciss.c: kmalloc + memset conversion to kzalloc
Clean up duplicate includes in drivers/block/
Fix remap handling by blktrace
[PATCH] remove mm/filemap.c:file_send_actor()
Chuck Ebbert [Fri, 10 Aug 2007 20:31:11 +0000 (22:31 +0200)]
i386: Fix double fault handler
The new percpu code has apparently broken the doublefault handler
when CONFIG_DEBUG_SPINLOCK is set. Doublefault is handled by
a hardware task, making the check
fault because it uses the FS register to access the percpu data
for current, and that register is zero in the new TSS. (The trace
I saw was on 2.6.20 where it was GS, but it looks like this will
still happen with FS on 2.6.22.)
Initializing FS in the doublefault_tss should fix it.
AK: Also fix broken ptr_ok() and turn printks into KERN_EMERG
AK: And add a PANIC prefix to make clear the system will hang
AK: (e.g. x86-64 will recover)
Andi Kleen [Fri, 10 Aug 2007 20:31:08 +0000 (22:31 +0200)]
i386: Add warning in Documentation that zero-page is not a stable ABI
Some people writing boot loaders seem to falsely belief the 32bit zero page is a
stable interface for out of tree code like the real mode boot protocol. Add a comment
clarifying that is not true.
Andi Kleen [Fri, 10 Aug 2007 20:31:07 +0000 (22:31 +0200)]
i386: Use global flag to disable broken local apic timer on AMD CPUs.
The Averatec 2370 and some other Turion laptop BIOS seems to program the
ENABLE_C1E MSR inconsistently between cores. This confuses the lapic
use heuristics because when C1E is enabled anywhere it seems to affect
the complete chip.
Use a global flag instead of a per cpu flag to handle this.
If any CPU has C1E enabled disabled lapic use.
Zachary Amsden [Fri, 10 Aug 2007 20:31:05 +0000 (22:31 +0200)]
x86_64: Early segment setup for VT
VT is very picky about when it can enter execution.
Get all segments setup and get LDT and TR into valid state to allow
VT execution under VMware and KVM (untested).
This makes the boot decompression run under VT, which makes it several
orders of magnitude faster on 64-bit Intel hardware.
Before, I was seeing times up to a minute or more to decompress a 1.3MB kernel
on a very fast box.
Andi Kleen [Fri, 10 Aug 2007 20:31:03 +0000 (22:31 +0200)]
i386: Make patching more robust, fix paravirt issue
Commit 19d36ccdc34f5ed444f8a6af0cbfdb6790eb1177 "x86: Fix alternatives
and kprobes to remap write-protected kernel text" uses code which is
being patched for patching.
In particular, paravirt_ops does patching in two stages: first it
calls paravirt_ops.patch, then it fills any remaining instructions
with nop_out(). nop_out calls text_poke() which calls
lookup_address() which calls pgd_val() (aka paravirt_ops.pgd_val):
that call site is one of the places we patch.
If we always do patching as one single call to text_poke(), we only
need make sure we're not patching the memcpy in text_poke itself.
This means the prototype to paravirt_ops.patch needs to change, to
marshal the new code into a buffer rather than patching in place as it
does now. It also means all patching goes through text_poke(), which
is known to be safe (apply_alternatives is also changed to make a
single patch).
AK: fix compilation on x86-64 (bad rusty!)
AK: fix boot on x86-64 (sigh)
AK: merged with other patches
Andi Kleen [Fri, 10 Aug 2007 20:31:02 +0000 (22:31 +0200)]
x86: Disable CLFLUSH support again
It turns out CLFLUSH support is still not complete; we
flush the wrong pages. Again disable it for the release.
Noticed by Jan Beulich who then also noticed a stupid typo later.
Current code assumed that devices were directly connected to a Calgary
bridge, as it tried to get the iommu table directly from the parent bus
controller.
When we have another bridge between the Calgary/CalIOC2 bridge and the
device we should look upwards until we get to the top (Calgary/CalIOC2
bridge), where the iommu table resides.
dean gaudet [Fri, 10 Aug 2007 20:30:59 +0000 (22:30 +0200)]
x86: Work around mmio config space quirk on AMD Fam10h
Some broken devices have been discovered to require %al/%ax/%eax registers
for MMIO config space accesses. Modify mmconfig.c to use these registers
explicitly (rather than modify the global readb/writeb/etc inlines).
AK: also changed i386 to always use eax
AK: moved change to extended space probing to different patch
AK: reworked with inlines according to Linus' requirements.
AK: improve comments.
Greg Ungerer [Fri, 10 Aug 2007 20:01:20 +0000 (13:01 -0700)]
changing include/asm-generic/pgtable.h for non-mmu
There are some parts of include/asm-generic/pgtable.h that are relevant to
the non-mmu architectures. To make it easier to include this from them I
would like to ifdef the relevant parts.
Without this there is a handful of functions that are referenced in here
that are not defined on many non-mmu architectures. They could be defined
out of course, as an alternative approach.
Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Muli Ben-Yehuda [Fri, 10 Aug 2007 20:01:19 +0000 (13:01 -0700)]
finish i386 and x86-64 sysdata conversion
This patch finishes the i386 and x86-64 ->sysdata conversion and hopefully
also fixes Riku's and Andy's observed bugs. It is based on Yinghai Lu's
and Andy Whitcroft's patches (thanks!) with some changes:
- introduce pci_scan_bus_with_sysdata() and use it instead of
pci_scan_bus() where appropriate. pci_scan_bus_with_sysdata() will
allocate the sysdata structure and then call pci_scan_bus().
- always allocate pci_sysdata dynamically. The whole point of this
sysdata work is to make it easy to do root-bus specific things
(e.g., support PCI domains and IOMMU's). I dislike using a default
struct pci_sysdata in some places and a dynamically allocated
pci_sysdata elsewhere - the potential for someone indavertantly
changing the default structure is too high.
- this patch only makes the minimal changes necessary, i.e., the NUMA node is
always initialized to -1. Patches to do the right thing with regards
to the NUMA node can build on top of this (either add a 'node'
parameter to pci_scan_bus_with_sysdata() or just update the node
when it becomes known).
The patch was compile tested with various configurations (e.g., NUMAQ,
VISWS) and run-time tested on i386 and x86-64. Unfortunately none of my
machines exhibited the bugs so caveat emptor.
Andy, could you please see if this fixes the NUMA issues you've seen?
Riku, does this fix "pci=noacpi" on your laptop?
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andi Kleen <ak@suse.de> Cc: Chuck Ebbert <cebbert@redhat.com> Cc: <riku.seppala@kymp.net> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jay Estabrook [Fri, 10 Aug 2007 20:01:12 +0000 (13:01 -0700)]
alpha: -Werror fixes for sys_titan.c
This code corrects the usage of the request_irq() routine.
Signed-off-by: Jay Estabrook <jay.estabrook@hp.com> Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Chubb [Fri, 10 Aug 2007 20:01:10 +0000 (13:01 -0700)]
fix compilation with gcc 4.2
gcc-4.2 is a lot more picky about its symbol handling. EXPORT_SYMBOL no
longer works on symbols that are undefined or defined with static scope.
For example, with CONFIG_PROFILE off, I see:
kernel/profile.c:206: error: __ksymtab_profile_event_unregister causes a section type conflict
kernel/profile.c:205: error: __ksymtab_profile_event_register causes a section type conflict
This patch moves the EXPORTs inside the #ifdef CONFIG_PROFILE, so we
only try to export symbols that are defined.
Also, in kernel/kprobes.c there's an EXPORT_SYMBOL_GPL() for
jprobes_return, which if CONFIG_JPROBES is undefined is a static
inline and gives the same error.
And in drivers/acpi/resources/rsxface.c, there's an
ACPI_EXPORT_SYMBOPL() for a static symbol. If it's static, it's not
accessible from outside the compilation unit, so should bot be exported.
These three changes allow building a zx1_defconfig kernel with gcc 4.2
on IA64.
[akpm@linux-foundation.org: export jpobe_return properly] Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Brownell [Fri, 10 Aug 2007 20:01:09 +0000 (13:01 -0700)]
spidev warning fix
Git rid of "warning: passing arg 2 of `access_ok' makes pointer from integer
without a cast" reported on SH ... most architectures use macros in that
test, SH uses inlined functions.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Josh Triplett [Fri, 10 Aug 2007 20:01:07 +0000 (13:01 -0700)]
RCU: Remove prototype for nonexistent function synchronize_idle()
synchronize_idle() sounds like an interesting function, but we don't
actually have it, so don't prototype it. Introduced in commit 9b06e818985d139fd9e82c28297f7744e1b484e1, in 2005.
Signed-off-by: Josh Triplett <josh@kernel.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Fri, 10 Aug 2007 20:01:06 +0000 (13:01 -0700)]
mtdchar build fix
sh:
drivers/mtd/mtdchar.c: In function `mtd_mmap':
drivers/mtd/mtdchar.c:817: error: dereferencing pointer to incomplete type
drivers/mtd/mtdchar.c:817: error: `VM_SHARED' undeclared (first use in this function)
drivers/mtd/mtdchar.c:817: error: (Each undeclared identifier is reported only once
Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The problem can be triggered when tcflush() is called when data are being
pushed to the line discipline driver by flush_to_ldisc().
flush_to_ldisc() releases tty->buf.lock when calling the line discipline
receive_buf function. At that poing tty_buffer_flush() kicks in and sets both
tty->buf.head and tty->buf.tail to NULL. When flush_to_ldisc() finishes, it
restores tty->buf.head but doesn't touch tty->buf.tail. This corrups the
buffer queue, and the next call to tty_buffer_request_room() will allocate a
new buffer and overwrite tty->buf.head. The previous buffer is then lost
forever without being released.
(Thanks to Laurent for the above text, for finding, disgnosing and reporting
the bug)
- Use tty->flags bits for the flush status.
- Wait for the flag to clear again before returning
- Fix the doc error noted
- Fix flush of empty queue leaving stale flushpending
[akpm@linux-foundation.org: cleanup] Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: Paul Fulghum <paulkf@microgate.com> Cc: Laurent Pinchart <laurentp@cse-semaphore.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesper Juhl [Fri, 10 Aug 2007 20:01:04 +0000 (13:01 -0700)]
Documentation: sysrq, description of 'h' slightly inaccurate
In Documentation/sysrq.txt, the description of 'h' says that any key not
listed *above* will generate help. That's obviously not true since all the
keys listed below 'h' will do what they are described to do, not display help.
So change the text so that it says that any key not listed in the table will
generate help, which is what really happens.
Andy Whitcroft [Fri, 10 Aug 2007 20:01:03 +0000 (13:01 -0700)]
update checkpatch.pl to version 0.09
This version brings a number of new checks, and a number of bug
fixes. Of note:
- checks for spacing on round and square bracket combinations
- loosening of the single statement brace checks, to allow
them when they contain comments or where other blocks in a
compound statement have them.
- parks the multple declaration support
- allows architecture defines in architecture specific headers
Andy Whitcroft (21):
Version: 0.09
loosen single statement brace checks
fix up multiple declaration to avoid function arguments
add some function space parenthesis check exceptions
handle EXPORT_'s with parentheses in their names
clean up some warnings in multi-line macro bracketing support
park the multiple declaration checks
make block brace checks count comments as a statement
__volatile__ and __extension__ are not functions
allow architecture specific defined within architecture includes
check spacing on square brackets
check spacing on parentheses
ensure we apply checks to the part before start comment
check #ifdef conditional spacing
handle __init_refok and __must_check
add noinline to inline checks
prevent email addresses from tripping spacing checks
handle typed initialiser spacing
handle line contination as end of line
add bool to the type matcher
refine EXPORT_SYMBOL checks to handle pointers
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While specs says: "The SPI baud rate generator clock source (either
system clock or system clock divided by 16, depending on DIV16 bit) is
divided by 4 * ([PM] + 1), a range from 4 to 64.".
Thus " - 1" is missing in the spi_mpc83xx's formula.
Why nobody noticed that bug? Probably because sysclk usually less then
user expects, e.g. you expect 200 MHz, but real clock is 198 MHz,
and integer rounding helps when this formula is used.
Suppose it's SPI in QE, SYSCLK at 198 MHz, thus SPIBRG at 99MHz, 25 MHz
requested.
PM = (99MHz / ( 25 MHz * 4 )), PM == 0, output SPICLK will be 24.75 MHz
At lower frequencies this bug is more noticeable, though.
And this bug shows itself in all its beauty if SYSCLK is equal or a bit
more than you expect (200 MHz SYSCLK, 100 MHz SPIBRG):
PM = (100MHz / ( 25 MHz * 4 )), PM == 1, output SPICLK will be 12.625 MHz!
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anton Vorontsov [Fri, 10 Aug 2007 20:01:01 +0000 (13:01 -0700)]
spi_mpc83xx: in "QE mode", use sysclk/2
For MPC8349E input to the SPI Baud Rate Generator is SYSCLK, but it's
SYSCLK/2 for MPC8323E (SPI in QE). Fix this, and remove confusion by
renaming the mpc83xx_spi->sysclk member as mpc83xx_spi->spibrg.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hibernation: do not try to mark invalid PFNs as nosave
On some systems some PFNs reported by the early initialization code as
'nosave' may be invalid. If we try to set the corresponding bits in the
hibernation bitmap, BUG_ON() in memory_bm_find_bit() will be triggered and
the system won't be able to boot (cf.
https://bugzilla.novell.com/show_bug.cgi?id=296242).
Prevent this from happening by verifying if the 'nosave' PFNs are valid in
mark_nosave_pages().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Fri, 10 Aug 2007 20:00:56 +0000 (13:00 -0700)]
eCryptfs: fix error handling in ecryptfs_init
ecryptfs_init() exits without doing any cleanup jobs if
ecryptfs_init_messaging() fails. In that case, eCryptfs leaves
sysfs entries, leaks memory, and causes an invalid page fault.
This patch fixes the problem.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gabriel C [Fri, 10 Aug 2007 20:00:56 +0000 (13:00 -0700)]
linux-audit list is subscribers-only
Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Fri, 10 Aug 2007 20:00:51 +0000 (13:00 -0700)]
eCryptfs: fix lookup error for special files
When ecryptfs_lookup() is called against special files, eCryptfs generates
the following errors because it tries to treat them like regular eCryptfs
files.
Error opening lower file for lower_dentry [0xffff810233a6f150], lower_mnt [0xffff810235bb4c80], and flags
[0x8000]
Error opening lower_file to read header region
Error attempting to read the [user.ecryptfs] xattr from the lower file; return value = [-95]
Valid metadata not found in header region or xattr region; treating file as unencrypted
For instance, the problem can be reproduced by the steps below.
# mkdir /root/crypt /mnt/crypt
# mount -t ecryptfs /root/crypt /mnt/crypt
# mknod /mnt/crypt/c0 c 0 0
# umount /mnt/crypt
# mount -t ecryptfs /root/crypt /mnt/crypt
# ls -l /mnt/crypt
This patch fixes it by adding a check similar to directories and
symlinks.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul A. Clarke [Fri, 10 Aug 2007 20:00:49 +0000 (13:00 -0700)]
matroxfb: rectify jitter (G450/G550)
This builds upon my previous attempts to resolve some jitter problems seen
with the Matrox G450 and G550 -based cards, including odd disparities observed
between x86 and Power -based machines in a somewhat less hackish way (removing
the hacked ifdefs).
Apparently, preference should be given to use the DVI PLL when frequencies
permit, the Standard PLL otherwise. The max pixel clock for the panellink
interface is extracted from the PInS information on the card and used as a
limit to determine which PLL to use.
Signed-off-by: Paul A. Clarke <pc@us.ibm.com> Acked-by: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian McMenamin [Fri, 10 Aug 2007 20:00:48 +0000 (13:00 -0700)]
pvr2fb: Consolidated cleanup of pvr2fb.c
- better handling of the pvr2 registers based on more up to date information.
Testing shows that it seems to work pretty well at 16bpp, 24bpp and 32bpp -
including proper rendering of the boot logo at all levels (previously this was
a bit broken even at 16bpp) and giving white against black text. Really
detailed testing (eg with X11) requires support for the maple bus - which
isn't (currently - next project assuming this is okay) available, but I have
no reason to think this is broken.
Signed-off by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported by: Adrian McMenamin <adrianmcmenamin@gmail.com>
This driver will oops when the pseudo_palette[] is written as u32 but not when
written as u16. When written as u32, it corrupts the adjacent 'mmio_base'
field of struct pvr2fb_par. Fix by using framebuffer_alloc()/release() to
allocate struct fb_info and struct pvr2fb_par, and create the pseudo_palette[]
as part of struct pvr2fb_par.
Helge Deller [Fri, 10 Aug 2007 20:00:45 +0000 (13:00 -0700)]
stifb: detect cards in double buffer mode more reliably
Visualize-EG, Graffiti and A4450A graphics cards on PARISC can
be configured in double-buffer and standard mode, but the stifb
driver supports standard mode only.
This patch detects double-buffered cards more reliable.
It is a real bugfix for a very nasty problem for all parisc users which have
wrongly configured their graphic card. The problem: The stifb graphics driver
will not detect that the card is wrongly configured and then nevertheless just
enables the graphics mode, which it shouldn't. In the end, the user will see
no further updates / boot messages on the screen.
We had documented this problem already on our FAQ
(http://parisc-linux.org/faq/index.html#viseg "Why do I get corrupted graphics
with my Vis-EG/Graffiti/A4450A card?") but people still run into this problem.
So having this fix in as early as possible can help us.
Badari Pulavarty [Fri, 10 Aug 2007 20:00:44 +0000 (13:00 -0700)]
direct-io: fix error-path crashes
Need to initialize map_bh.b_state to zero. Otherwise, in case of a faulty
user-buffer its possible to go into dio_zero_block() and submit a page by
mistake - since it checks for buffer_new().
Robin Holt [Fri, 10 Aug 2007 20:00:43 +0000 (13:00 -0700)]
x86_64: fix HPET init race
I have had four seperate system lockups attributable to this exact problem
in two days of testing. Instead of trying to handle all the weird end
cases and wrap, how about changing it to look for exactly what we appear
to want.
The following patch removes a couple races in setup_APIC_timer. One occurs
when the HPET advances the COUNTER past the T0_CMP value between the time
the T0_CMP was originally read and when COUNTER is read. This results in
a delay waiting for the counter to wrap. The other results from the counter
wrapping.
This change takes a snapshot of T0_CMP at the beginning of the loop and
simply loops until T0_CMP has changed (a tick has happened).
<later>
I have one small concern about the patch. I am not sure it meets the intent
as well as it should. I think we are trying to match APIC timer interrupts up
with the hpet counter increment. The event which appears to be disturbing
this loop in our test environment is the NMI watchdog. What we believe has
been happening with the existing code is the setup_APIC_timer loop has read
the CMP value, and the NMI watchdog code fires for the first time. This
results in a series of icache miss slowdowns and by the time we get back to
things it has wrapped.
I think this code is trying to get the CMP as close to the counter value as
possible. If that is the intent, maybe we should really be testing against a
"window" around the CMP. Something like COUNTER = CMP+/2. It appears COUNTER
should get advanced every 89nSec (IIRC). The above seems like an unreasonably
small window, but may be necessary. Without documentation, I am not sure of
the original intent with this code.
In summary, this code fixes my boot hangs, but since I am not certain of the
intent of the existing code, I am not certain this has not introduced new bugs
or unexpected behaviors.
The way this driver tries to implement HDIO_GETGEO it'll never be called.
Then again on ppc it probably will never be called anyway because it's
utterly pointless.
Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
drivers/block/cpqarray.c: better error handling and kmalloc + memset conversion to k[cz]alloc
This patch removes some redundant casts, does the kmalloc + memset to
k[cz]alloc conversion and it changes the error path to use goto (to avoid code
duplication).
This patch provides more information concerning REMAP operations on block
IOs. The additional information provides clearer details at the user level,
and supports post-processing analysis in btt.
o Adds in partition remaps on the same device.
o Fixed up the remap information in DM to be in the right order
o Sent up mapped-from and mapped-to device information
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Michael Holzheu [Fri, 10 Aug 2007 12:32:34 +0000 (14:32 +0200)]
[S390] vmur: diag14 only works with buffers below 2GB
If memory buffers above 2GB are used, diagnose 14 raises a specification
exception. This fix ensures that buffer allocation is done below the 2GB
boundary.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Fri, 10 Aug 2007 12:32:33 +0000 (14:32 +0200)]
[S390] vmur: add "top of queue" sanity check for reader open
If the z/VM reader is already open, it can happen that after opening the
Linux reader device, not the topmost file is processed. According the
semantics of the Linux z/VM unit record device driver, always the topmost
file has to be processed. With this fix an error is returned if that is
not the case.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Fri, 10 Aug 2007 12:32:32 +0000 (14:32 +0200)]
[S390] vmur: reject open on z/VM reader files with status HOLD
If a reader file with HOLD status is at the top of the reader queue, currently
all read requests will return data of the second file in the queue. But the
semantics of vmur is that always the topmost file is read. With this fix
-EPERM is returned on open, if the topmost reader file is in HOLD status.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Fri, 10 Aug 2007 12:32:30 +0000 (14:32 +0200)]
[S390] vmur: allocate single record buffers instead of one big data buffer
vmur allocates one contiguous kernel buffer to copy user data when creating
ccw programs for punch or printer. If big block sizes are used, under memory
pressure it can happen, that we do not get memory in one chunk. Now we
allocate memory for each single record to avoid high order allocations.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Fri, 10 Aug 2007 12:32:28 +0000 (14:32 +0200)]
[S390] qdio: make sure data structures are correctly aligned.
The slsb structure contained at the beginning of the qdio_q structure
must start on a 256 byte boundary. To make sure this is the case even
if slab debugging is turned on create an own slab cache for qdio_q
structures.
Besides that don't use the slab allocator to allocate whole pages. Use
the page allocator instead.
Also fix a few memory leaks in error handling code.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
My "slices" address space management code that was added in the 2.6.22
implementation of get_unmapped_area() doesn't properly check that the
size is a multiple of the requested page size. This allows userland to
create VMAs that aren't a multiple of the huge page size with hugetlbfs
(since hugetlbfs entirely relies on get_unmapped_area() to do that
checking) which leads to a kernel BUG() when such areas are torn down.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Fix initialization and usage of dma_mask
powerpc has a couple of bugs in the usage of dma_masks that tend to
break when drivers explicitly try to set a 32-bit mask for example.
First, the code that generates the pci devices from the OF device-tree
doesn't initialize the mask properly, then our implementation of
set_dma_mask() was trying to validate the -previous- mask value, not the
one passed in as an argument.
This fixes these problems.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
That commit was a mistake from the start; I added mdio type to the
bus scan list early on in my ucc_geth migrate to phylib development,
which is just pure wrong (the ucc_geth_mii driver creates the mii
bus and the PHY layer handles PHY enumeration without translation).
Fix the PS3 storage probe logic to properly find device regions on cold
startup.
o Change the storage probe event mask from notify_device_ready
to notify_region_update.
o Improve the storage probe error handling.
o Change ps3_storage_wait_for_device() to use a temporary variable to hold
the buffer address.
Andre Detsch [Sat, 4 Aug 2007 01:53:46 +0000 (18:53 -0700)]
[POWERPC] cell: Move SPU affinity init to spu_management_of_ops
This patch moves affinity initialization code from spu_base.c to a
new spu_management_of_ops function (init_affinity), which is empty
in the case of PS3. This fixes a linking problem that was happening
when compiling for PS3.
Also, some small code style changes were made.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com> Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Fri, 10 Aug 2007 11:04:07 +0000 (21:04 +1000)]
[POWERPC] Fix potential duplicate entry in SLB shadow buffer
We were getting a duplicate entry in the SLB shadow buffer in
slb_flush_and_rebolt() if the kernel stack was in the same segment
as PAGE_OFFSET, which on POWER6 causes the hypervisor to terminate
the partition with an error. This fixes it.
Also we were not creating an SLB entry (or an SLB shadow buffer
entry) for the kernel stack on secondary CPUs when starting the
CPU. This isn't a major problem, since an appropriate entry will
be created on demand, but this fixes that also for consistency.
Jesper Juhl [Wed, 8 Aug 2007 23:31:30 +0000 (16:31 -0700)]
SLUB: Fix format specifier in Documentation/vm/slabinfo.c
There's a little problem in Documentation/vm/slabinfo.c
The code is using "%d" in a printf() call to print an 'unsigned long'.
This patch corrects it to use "%lu" instead.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Christoph Lameter <clameter@sgi.com>
The dynamic dma kmalloc creation can run into trouble if a
GFP_ATOMIC allocation is the first one performed for a certain size
of dma kmalloc slab.
- Move the adding of the slab to sysfs into a workqueue
(sysfs does GFP_KERNEL allocations)
- Do not call kmem_cache_destroy() (uses slub_lock)
- Only acquire the slub_lock once and--if we cannot wait--do a trylock.
This introduces a slight risk of the first kmalloc(x, GFP_DMA|GFP_ATOMIC)
for a range of sizes failing due to another process holding the slub_lock.
However, we only need to acquire the spinlock once in order to establish
each power of two DMA kmalloc cache. The possible conflict is with the
slub_lock taken during slab management actions (create / remove slab cache).
It is rather typical that a driver will first fill its buffers using
GFP_KERNEL allocations which will wait until the slub_lock can be acquired.
Drivers will also create its slab caches first outside of an atomic
context before starting to use atomic kmalloc from an interrupt context.
If there are any failures then they will occur early after boot or when
loading of multiple drivers concurrently. Drivers can already accomodate
failures of GFP_ATOMIC for other reasons. Retries will then create the slab.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
SLUB: Remove checks for MAX_PARTIAL from kmem_cache_shrink
The MAX_PARTIAL checks were supposed to be an optimization. However, slab
shrinking is a manually triggered process either through running slabinfo
or by the kernel calling kmem_cache_shrink.
If one really wants to shrink a slab then all operations should be done
regardless of the size of the partial list. This also fixes an issue that
could surface if the number of partial slabs was initially above MAX_PARTIAL
in kmem_cache_shrink and later drops below MAX_PARTIAL through the
elimination of empty slabs on the partial list (rare). In that case a few
slabs may be left off the partial list (and only be put back when they
are empty).
Signed-off-by: Christoph Lameter <clameter@sgi.com>