Putting the first kernel in EFI virtual mode causes a problem: when the
second kernel comes up it tries to initialize EFI again, and once EFI
has been put in virtual mode that is no longer possible.
Actually, EFI is not necessary for kdump. We can boot the second kernel
with the "noefi" parameter, but the boot will mostly fail because the
second kernel cannot find the RSDP.
To handle this, introduce the "acpi_rsdp=" kernel parameter, so that
kexec-tools can pass "noefi acpi_rsdp=X" to the second kernel to make
kdump work. The physical address of the RSDP can be obtained from
sysfs (/sys/firmware/efi/systab).
Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com> Reviewed-by: WANG Cong <amwang@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Len Brown <len.brown@intel.com>
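As a rough illustration of the user-space side described above, the sketch below scans systab-style text for the RSDP address and formats the extra arguments. It assumes the file holds one NAME=0xADDRESS entry per line with ACPI20 and/or ACPI entries; find_rsdp() and build_kdump_args() are hypothetical helper names, not kexec-tools functions.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan systab-style text ("NAME=0xADDR" per line, an assumed layout)
 * for an RSDP entry. ACPI20 is preferred over ACPI when both exist.
 * Returns 0 if neither entry is present.
 */
unsigned long long find_rsdp(const char *systab)
{
    unsigned long long acpi = 0, acpi20 = 0;
    const char *line = systab;

    while (line && *line) {
        if (strncmp(line, "ACPI20=", 7) == 0)
            acpi20 = strtoull(line + 7, NULL, 16);
        else if (strncmp(line, "ACPI=", 5) == 0)
            acpi = strtoull(line + 5, NULL, 16);
        line = strchr(line, '\n');   /* advance to the next line */
        if (line)
            line++;
    }
    return acpi20 ? acpi20 : acpi;
}

/* Format the command-line fragment for the second kernel. */
int build_kdump_args(const char *systab, char *buf, size_t len)
{
    unsigned long long rsdp = find_rsdp(systab);

    if (!rsdp)
        return -1;               /* no RSDP entry found */
    snprintf(buf, len, "noefi acpi_rsdp=0x%llx", rsdp);
    return 0;
}
```

A caller would read /sys/firmware/efi/systab into a buffer, pass it to build_kdump_args(), and append the result to the second kernel's command line.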
Vasiliy Kulikov [Sat, 25 Jun 2011 17:07:52 +0000 (21:07 +0400)]
ACPI: constify ops structs
Structs battery_file, acpi_dock_ops, file_operations,
thermal_cooling_device_ops, thermal_zone_device_ops, kernel_param_ops
are not changed at runtime. It is safe to make them const.
register_hotplug_dock_device() was altered to take a const "ops"
argument to respect acpi_dock_ops' constness.
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
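The pattern can be sketched in plain C as follows. The struct and function names are hypothetical stand-ins, not the kernel's definitions; the point is only that a const ops table needs a registration function that accepts a pointer-to-const.

```c
#include <stddef.h>

/* Hypothetical ops table, standing in for a struct like acpi_dock_ops. */
struct dock_ops_sketch {
    int (*handler)(int event);
};

static int on_event(int event)
{
    return event + 1;
}

/* const: the table is never written after initialization. */
static const struct dock_ops_sketch sample_ops = {
    .handler = on_event,
};

static const struct dock_ops_sketch *registered;

/* Taking a pointer-to-const lets callers pass const tables without a cast. */
int register_ops(const struct dock_ops_sketch *ops)
{
    if (!ops || !ops->handler)
        return -1;
    registered = ops;
    return 0;
}

int dispatch(int event)
{
    return registered ? registered->handler(event) : -1;
}
```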
The problem is that a device triggers boot interrupts (due to threaded
interrupt handling and masking of the IO-APIC), which are forwarded
to the PIRQ line of the device. These interrupts are not handled on the PIRQ
line because the interrupt handler is not present there.
This should have already been fixed by CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS.
However, some parts of the quirk got lost in the ACPI merge. This is a resend of
the patch proposed in 2009.
See http://lkml.org/lkml/2009/9/7/192
Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Len Brown <len.brown@intel.com>
clocksource: convert ARM 32-bit up counting clocksources
broke the build for ixp4xx and made big endian operation impossible.
This commit restores the original behaviour.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
[ Thomas says that we might want to have generic BE accessor functions
to the MMIO clock source, but that hasn't happened yet, so in the
meantime this seems to be the short-term fix for the particular
problem - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Axel Lin [Sun, 10 Jul 2011 07:45:07 +0000 (15:45 +0800)]
gpio: wm831x: add a missing break in wm831x_gpio_dbg_show
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
hwmon: (adm1275) Fix coefficients per datasheet revision B
hwmon: (pmbus) Use long variables for register to data conversions
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
fix loop checks in d_materialise_unique()
Fix ->d_lock locking order in unlazy_walk()
Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu
* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu:
rcu: Prevent RCU callbacks from executing before scheduler initialized
Peter Zijlstra [Mon, 11 Jul 2011 14:28:50 +0000 (16:28 +0200)]
sched: Fix 32bit race
Commit 3fe1698b7fe0 ("sched: Deal with non-atomic min_vruntime reads
on 32bit") forgot to initialize min_vruntime_copy which could lead to
an infinite while loop in task_waking_fair() under some circumstances
(early boot, lucky timing).
[ This bug was also reported by others who blamed it on the RCU
initialization problems ]
Reported-and-tested-by: Bruno Wolff III <bruno@wolff.to> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
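A single-threaded sketch of the value-plus-copy read pattern may make the failure mode easier to see. The struct and helpers are hypothetical, memory barriers are omitted, and the retry loop is bounded (unlike the loop described above) so that the mismatch can be observed rather than hanging.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two runqueue fields named in the commit. */
struct rq_sketch {
    uint64_t min_vruntime;
    uint64_t min_vruntime_copy;
};

/* Writer: store the value, then its copy (barriers omitted here). */
void rq_set(struct rq_sketch *rq, uint64_t v)
{
    rq->min_vruntime = v;
    rq->min_vruntime_copy = v;
}

/*
 * Reader: retry until value and copy agree. max_tries bounds the loop
 * for this sketch only. Returns 0 on success, or -1 if the two fields
 * never matched.
 */
int rq_read(const struct rq_sketch *rq, uint64_t *out, int max_tries)
{
    while (max_tries-- > 0) {
        uint64_t copy = rq->min_vruntime_copy;
        uint64_t val = rq->min_vruntime;

        if (val == copy) {
            *out = val;
            return 0;
        }
    }
    return -1;
}
```

If the copy is left uninitialized while the value is set, rq_read() never succeeds until a writer runs, which mirrors the hang described in the commit message.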
hwmon: (adm1275) Fix coefficients per datasheet revision B
Coefficients to convert chip register values to voltage/current have been
slightly changed in revision B of the chip datasheet. Update driver coefficients
to match the coefficients in the datasheet.
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Acked-by: Jean Delvare <khali@linux-fr.org>
Al Viro [Wed, 13 Jul 2011 01:42:24 +0000 (21:42 -0400)]
fix loop checks in d_materialise_unique()
Both __d_unalias() and __d_materialise_dentry() need loop prevention.
Grab rename_lock in caller, check for loops there...
As a side benefit, we have dentry_lock_for_move() called only under
rename_lock, which seriously reduces deadlock potential of the
execrable "locking order" used for ->d_lock.
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
GFS2: Resolve inode eviction and ail list interaction bug
GFS2: Fix race during filesystem mount
GFS2: force a log flush when invalidating the rindex glock
GFS2: Resolve inode eviction and ail list interaction bug
This patch contains a few misc fixes which resolve a recently
reported issue. This patch has been a real team effort and has
received a lot of testing.
The first issue is that the ail lock needs to be held over a few
more operations. The lock that's added into gfs2_releasepage() may
possibly be a candidate for replacing with RCU at some future
point, but at this stage we've gone for the obvious fix.
The second issue is that gfs2_write_inode() can end up calling
a glock recursively when called from gfs2_evict_inode() via the
syncing code, so it needs a guard added.
The third issue is that we either need to not truncate the metadata
pages of inodes which have zero link count, but which we cannot
deallocate due to them still being in use by other nodes, or we need
to ensure that those pages have all made it through the journal and
ail lists first. This patch takes the former approach, but the
latter has also been tested and there is nothing to choose between
them performance-wise. So again, we could revise that decision
in the future.
Also, the inode eviction process is now better documented.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Tested-by: Bob Peterson <rpeterso@redhat.com> Tested-by: Abhijith Das <adas@redhat.com> Reported-by: Barry J. Marson <bmarson@redhat.com> Reported-by: David Teigland <teigland@redhat.com>
Lan Tianyu [Thu, 30 Jun 2011 03:34:12 +0000 (11:34 +0800)]
ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()
Use battery->lock in sysfs_remove_battery() to make
checking, removing, and clearing bat.dev atomic.
This is necessary because sysfs_remove_battery() may
be invoked concurrently from different paths.
https://bugzilla.kernel.org/show_bug.cgi?id=35642
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Lan Tianyu [Thu, 30 Jun 2011 03:33:58 +0000 (11:33 +0800)]
ACPI / Battery: Add the check before refresh sysfs in the battery_notify()
Commit 25be5821 added a sysfs refresh when the system resumes from
suspend, but it did not check that the battery exists. This causes
battery sysfs files to be added when the battery does not exist.
This patch adds the check before refreshing.
https://bugzilla.kernel.org/show_bug.cgi?id=35642
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Lan Tianyu [Thu, 30 Jun 2011 03:33:40 +0000 (11:33 +0800)]
ACPI / Battery: Add the hibernation process in the battery_notify()
Commit 25be58215 added a PM notifier to refresh sysfs in order
to deal with the unit change of the Battery Present Rate. But it only
considered the suspend situation. According to bug 28192, the problem
also happens during hibernation.
https://bugzilla.kernel.org/show_bug.cgi?id=28192
This patch adds hibernation handling and fixes the bug.
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Lan Tianyu [Thu, 30 Jun 2011 03:33:27 +0000 (11:33 +0800)]
ACPI / Battery: Rename acpi_battery_quirks2 with acpi_battery_quirks
This patch is cosmetic only, and makes no functional change.
Since acpi_battery_quirks has been deleted, rename
acpi_battery_quirks2 to acpi_battery_quirks to clean up the code.
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Lan Tianyu [Thu, 30 Jun 2011 03:33:12 +0000 (11:33 +0800)]
ACPI / Battery: Change 16-bit signed negative battery current into correct value
This patch is for some machines which report the battery current
as a 16-bit signed negative value when charging. This is caused
by a DSDT bug. Commit bc76f90b8a5cf4aceedf210d08d5e8292f820cec
resolved the problem for Acer laptops, but some other machines
have the same problem.
https://bugzilla.kernel.org/show_bug.cgi?id=33722
Since a current above 32A is implausible on laptops, whether on AC
or on battery, this patch checks the current and, when it is
negative as an s16, takes its absolute value as the current and
prints a message.
Remove the Acer quirk, as this workaround handles Acer too.
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
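The check can be sketched as below. This is a minimal model, not the driver code: fix_rate_now() is a hypothetical name, and restricting the test to the range 0x8000..0xFFFF is an assumption made here so that the arithmetic stays fully defined in portable C.

```c
/*
 * If rate_now holds the bit pattern of a negative 16-bit value,
 * return the magnitude of that value; otherwise return it unchanged.
 * *fixed is set to 1 when an adjustment was made, so the caller can
 * print a one-time message.
 */
int fix_rate_now(int rate_now, int *fixed)
{
    *fixed = 0;
    /* 0x8000..0xFFFF are the patterns of negative s16 values. */
    if (rate_now >= 0x8000 && rate_now <= 0xFFFF) {
        *fixed = 1;
        return 0x10000 - rate_now;   /* absolute value of the s16 */
    }
    return rate_now;
}
```

For example, a reported value of 0xFFF6 is -10 when read as an s16, so the corrected current is 10.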
Lin Ming [Wed, 1 Jun 2011 15:54:02 +0000 (23:54 +0800)]
ACPI: Fixes device power states array overflow
Commit 28c2103 added new state ACPI_STATE_D3_COLD, so the device power
states array must be expanded by one also.
v2: Use ACPI_D_STATE_COUNT instead of number 5 for the array size.
Reported-by: Dan Carpenter <error27@gmail.com> Suggested-by: Oldřich Jedlička <oldium.pro@seznam.cz> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
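The v2 change follows a common C idiom: size the array from a count symbol rather than a literal, so that adding a state cannot leave the array one element short. A minimal sketch with hypothetical names (not the ACPI headers' definitions):

```c
/* Hypothetical state numbering; the count is derived, not typed in. */
enum d_state_sketch { D0, D1, D2, D3_HOT, D3_COLD, D_STATE_COUNT };

struct power_state_sketch {
    int valid;
    int latency;
};

struct device_power_sketch {
    /* Grows automatically when a new state is added to the enum. */
    struct power_state_sketch states[D_STATE_COUNT];
};

/* Bounds check against the same symbol that sizes the array. */
int state_in_range(int state)
{
    return state >= 0 && state < D_STATE_COUNT;
}
```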
ACPICA: Do not repair _TSS return package if _PSS is present
We can only sort the _TSS return package if there is no _PSS
in the same scope. This is because if _PSS is present, the ACPI
specification dictates that the _TSS Power Dissipation field is
to be ignored, and therefore some BIOSs leave garbage values in
the _TSS Power field(s). In this case, it is best to just return
the _TSS package as-is.
Reported-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
memory_failure() is the entry point for HWPoison memory error
recovery. It must be called in process context. But hardware memory
errors are commonly notified via MCE or NMI, so some delayed
execution mechanism must be used. In the MCE handler, a work queue +
ring buffer mechanism is used.
In addition to MCE, APEI (ACPI Platform Error Interface) GHES
(Generic Hardware Error Source) can now be used to report memory
errors too. To add support for APEI GHES memory recovery, a mechanism
similar to that of MCE is implemented. memory_failure_queue() is the
new entry point that can be called in IRQ context. The next step is
to make the MCE handler use this interface too.
Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
ACPI, APEI, GHES, Error records content based throttle
printk is used by GHES to report hardware errors. A ratelimit is
enforced on the printk to avoid flooding the kernel log with hardware
error reports, because there may be thousands or even millions of
corrected hardware errors while the system is running.
Currently, a simple scheme is used: the total number of hardware
error reports is ratelimited. This may cause some issues in practice.
For example, suppose two kinds of hardware errors occur in the
system. One is a corrected memory error; because the faulty memory
address is accessed frequently, there may be hundreds of error reports
per second. The other is a corrected PCIe AER error, which is
reported once per second. Because they share one ratelimit control
structure, it is highly likely that only the memory error is reported.
To avoid this issue, the patch implements a throttle algorithm based
on error record content. After the first successful report, all
identical error records are throttled for some time, to give other
kinds of error records the opportunity to be reported.
In the above example, the memory errors will be throttled for some
time after being printed. Then the PCIe AER error will be printed
successfully.
Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
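One way to picture content-based throttling is a small cache of recently reported records, each with a timestamp. The sketch below is an illustration under assumptions, not the GHES implementation: the slot count, the window length, the string-valued records (assumed shorter than REC_MAX) and all names are hypothetical.

```c
#include <string.h>

#define CACHE_SLOTS    4
#define THROTTLE_TICKS 10   /* hypothetical throttle window */
#define REC_MAX        32

struct slot {
    char rec[REC_MAX];
    unsigned long stamp;
    int used;
};

struct throttle {
    struct slot slots[CACHE_SLOTS];
};

/* Returns 1 if the record should be reported at time now, 0 if throttled. */
int should_report(struct throttle *t, const char *rec, unsigned long now)
{
    int i, victim = 0;

    for (i = 0; i < CACHE_SLOTS; i++) {
        struct slot *s = &t->slots[i];

        if (s->used && strncmp(s->rec, rec, REC_MAX) == 0) {
            if (now - s->stamp < THROTTLE_TICKS)
                return 0;        /* same content seen recently */
            s->stamp = now;      /* window expired: report again */
            return 1;
        }
    }

    /* Not cached: take a free slot, else evict the oldest entry. */
    for (i = 0; i < CACHE_SLOTS; i++) {
        if (!t->slots[i].used) {
            victim = i;
            break;
        }
        if (t->slots[i].stamp < t->slots[victim].stamp)
            victim = i;
    }
    strncpy(t->slots[victim].rec, rec, REC_MAX - 1);
    t->slots[victim].rec[REC_MAX - 1] = '\0';
    t->slots[victim].stamp = now;
    t->slots[victim].used = 1;
    return 1;
}
```

With this scheme a burst of identical memory error records is reported once per window, while a differing PCIe record arriving in the same window is still reported.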
ACPI, APEI, GHES, printk support for recoverable error via NMI
Some APEI GHES recoverable errors are reported via NMI, but printk is
not safe in NMI context.
To solve the issue, a lock-less memory allocator is used to allocate
memory in the NMI handler, the error record is saved into the
allocated memory, and the error record is put onto a lock-less list.
An irq_work is then used to defer the operation from NMI context to
IRQ context. The irq_work IRQ handler removes nodes from the
lock-less list, printks the error record, does some further
processing including the recovery operation, and then frees the memory.
Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
This version of the gen_pool memory allocator supports lockless
operation.
This makes it safe to use in NMI handlers and other special
unblockable contexts that could otherwise deadlock on locks. This is
implemented by using atomic operations and retries on any conflicts.
The disadvantage is that there may be livelocks in extreme cases. For
better scalability, one gen_pool allocator can be used for each CPU.
The lockless operation only works if there is enough memory available.
If new memory is added to the pool, a lock still has to be taken. So
any user relying on locklessness has to ensure that sufficient memory
is preallocated.
The basic atomic operation of this allocator is cmpxchg on long. On
architectures that don't have an NMI-safe cmpxchg implementation, the
allocator can NOT be used in an NMI handler. So code that uses the
allocator in an NMI handler should depend on
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG.
Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
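The "atomic operations and retries" idea can be sketched in user space with C11 atomics. This is a much simplified model, not the gen_pool code: the pool is a single bitmap word with one bit per chunk, and struct pool, pool_alloc() and pool_free() are hypothetical names.

```c
#include <stdatomic.h>

#define POOL_CHUNKS 32   /* one bit per chunk, one word for the sketch */

struct pool {
    atomic_ulong bitmap;   /* set bit = chunk in use */
};

/* Claim one free chunk; returns its index, or -1 if the pool is full. */
int pool_alloc(struct pool *p)
{
    unsigned long old = atomic_load(&p->bitmap);

    for (;;) {
        int bit = 0;

        while (bit < POOL_CHUNKS && (old & (1UL << bit)))
            bit++;
        if (bit == POOL_CHUNKS)
            return -1;
        /* On failure, old is reloaded with the current value; retry. */
        if (atomic_compare_exchange_weak(&p->bitmap, &old,
                                         old | (1UL << bit)))
            return bit;
    }
}

/* Release a chunk by clearing its bit atomically. */
void pool_free(struct pool *p, int bit)
{
    atomic_fetch_and(&p->bitmap, ~(1UL << bit));
}
```

No lock is taken on either path, so a conflicting update only costs a retry. That retry loop is also where the livelock risk mentioned above comes from.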
Cmpxchg is used to implement adding a new entry to the list, deleting
all entries from the list, deleting the first entry of the list and
some other operations.
Because this is a singly linked list, the tail cannot be accessed in O(1).
If there are multiple producers and multiple consumers, llist_add can
be used in producers and llist_del_all can be used in consumers. They
can work simultaneously without a lock. But llist_del_first cannot be
used here, because llist_del_first depends on list->first->next not
changing as long as list->first is unchanged during its operation, and
a llist_del_first, llist_add, llist_add (or llist_del_all, llist_add,
llist_add) sequence in another consumer may violate that.
If there are multiple producers and one consumer, llist_add can be
used in producers and llist_del_all or llist_del_first can be used in
the consumer.
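This can be summarized as follows:

           |   add    | del_first |  del_all
 add       |    -     |     -     |     -
 del_first |          |     L     |     L
 del_all   |          |           |     -

Where "-" means that no lock is needed, while "L" means that a lock is
needed.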
The list entries deleted via llist_del_all can be traversed with a
traversing function such as llist_for_each. But the list entries
cannot be traversed safely before they are deleted from the list. The
order of deleted entries is from the newest to the oldest added one.
If you want to traverse from the oldest to the newest, you must
reverse the order yourself before traversing.
The basic atomic operation of this list is cmpxchg on long. On
architectures that don't have an NMI-safe cmpxchg implementation, the
list can NOT be used in an NMI handler. So code that uses the list in
an NMI handler should depend on CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG.
Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
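The add and delete-all operations, and the newest-first ordering, can be modeled in user space with C11 atomics. This is a sketch with hypothetical names (l_add, l_del_all, l_reverse), not the kernel's llist code, and it uses an atomic exchange for delete-all where the text above mentions cmpxchg.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
    struct node *next;
    int val;
};

struct lhead {
    _Atomic(struct node *) first;
};

/* Push n at the head; retry if another producer changed first. */
void l_add(struct lhead *h, struct node *n)
{
    struct node *old = atomic_load(&h->first);

    do {
        n->next = old;   /* old is refreshed on each failed attempt */
    } while (!atomic_compare_exchange_weak(&h->first, &old, n));
}

/* Detach the whole list in one atomic step; returns the old head. */
struct node *l_del_all(struct lhead *h)
{
    return atomic_exchange(&h->first, (struct node *)NULL);
}

/* Entries come back newest first; reverse for oldest-first traversal. */
struct node *l_reverse(struct node *n)
{
    struct node *prev = NULL;

    while (n) {
        struct node *next = n->next;

        n->next = prev;
        prev = n;
        n = next;
    }
    return prev;
}
```

Once detached, the returned chain is private to the consumer and can be walked or reversed without further atomics.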
cmpxchg() is widely used by lockless code, including NMI-safe lockless
code. But on some architectures the cmpxchg() implementation is not
NMI-safe; on these architectures the lockless code may need a
spin_trylock_irqsave() based implementation.
This patch adds a Kconfig option, ARCH_HAVE_NMI_SAFE_CMPXCHG, so that
NMI-safe lockless code can depend on it or provide a different
implementation according to it.
On many architectures, cmpxchg is only NMI-safe for several specific
operand sizes. So ARCH_HAVE_NMI_SAFE_CMPXCHG, as defined in this patch,
only guarantees that cmpxchg is NMI-safe for sizeof(unsigned long).
Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> CC: Richard Henderson <rth@twiddle.net> CC: Mikael Starvik <starvik@axis.com> CC: David Howells <dhowells@redhat.com> CC: Yoshinori Sato <ysato@users.sourceforge.jp> CC: Tony Luck <tony.luck@intel.com> CC: Hirokazu Takata <takata@linux-m32r.org> CC: Geert Uytterhoeven <geert@linux-m68k.org> CC: Michal Simek <monstr@monstr.eu> CC: Ralf Baechle <ralf@linux-mips.org> CC: Kyle McMartin <kyle@mcmartin.ca> CC: Martin Schwidefsky <schwidefsky@de.ibm.com> CC: Chen Liqin <liqin.chen@sunplusct.com> CC: "David S. Miller" <davem@davemloft.net> CC: Ingo Molnar <mingo@redhat.com> CC: Chris Zankel <chris@zankel.net> Signed-off-by: Len Brown <len.brown@intel.com>
APEI firmware first mode must be turned on explicitly on some
machines, otherwise there may be no GHES hardware error record for
hardware error notification. The APEI bit in the generic _OSC call can
be used to do that, but on some machines a special WHEA _OSC call must
be used. This patch adds support for that WHEA _OSC call.
Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
ACPI, APEI, Add APEI bit support in generic _OSC call
In APEI firmware first mode, a hardware error is first reported by
hardware to firmware, then firmware reports the error to Linux in a
GHES error record via POLL/SCI/IRQ/NMI etc.
This may result in some issues if the OS has no full APEI support. So
some firmware implementations work in a backward-compatible mode by
default, where firmware only notifies the OS in the old-fashioned way,
without a GHES record. For example, for a fatal hardware error, only
an NMI is signaled, with no GHES record.
To gain full APEI power on these machines, the APEI bit in the generic
_OSC call can be specified to tell firmware that Linux has full APEI
support. This patch adds the APEI bit support in the generic _OSC call.
Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
ACPI, APEI, GHES, Prevent GHES to be built as module
GHES (Generic Hardware Error Source) is used to process hardware error
notification in firmware first mode. But because firmware first mode
can be turned on but cannot be turned off, it is unreasonable to
unload the GHES module with firmware first mode turned on. To avoid
confusion, this patch allows GHES to be enabled or disabled at
configuration time, but not to be built as a module and unloaded at
run time.
Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
Some actions in the APEI ERST and EINJ tables are optional. For example,
the ACPI_EINJ_BEGIN_OPERATION action is used to do some preparation for
error injection, and firmware may choose to do nothing here. Other
actions are mandatory. For example, firmware must provide an
ACPI_EINJ_GET_ERROR_TYPE implementation.
The original implementation treats all actions as optional (that is,
they can have no instructions), which may cause issues if firmware does
not provide some mandatory actions. To fix this, this patch adds
apei_exec_run_optional, which should be used for optional actions.
The original apei_exec_run should be used for mandatory actions.
Cc: Thomas Renninger <trenn@novell.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
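The split between mandatory and optional actions can be sketched as two thin wrappers around one helper. Everything here is a hypothetical model: actions are plain function pointers instead of firmware instruction entries, and the -ENOENT return for a missing mandatory action is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>

#define NR_ACTIONS 4

typedef int (*action_fn)(void);

/* Hypothetical context: a NULL entry means firmware provided nothing. */
struct exec_ctx {
    action_fn actions[NR_ACTIONS];
};

int sample_action(void)
{
    return 0;
}

static int run(struct exec_ctx *ctx, int action, int optional)
{
    if (action < 0 || action >= NR_ACTIONS)
        return -EINVAL;
    if (!ctx->actions[action])
        return optional ? 0 : -ENOENT;   /* missing mandatory action */
    return ctx->actions[action]();
}

/* Mandatory action: a missing implementation is an error. */
int exec_run(struct exec_ctx *ctx, int action)
{
    return run(ctx, action, 0);
}

/* Optional action: a missing implementation counts as success. */
int exec_run_optional(struct exec_ctx *ctx, int action)
{
    return run(ctx, action, 1);
}
```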
ACPI, APEI, GHES, Do not ratelimit fatal error printk before panic
printk is used by GHES to report hardware errors. Normally, the
printk is ratelimited to avoid flooding the kernel log with hardware
error reports, because there may be thousands or even millions of
corrected hardware errors while the system is running.
That is different for fatal hardware errors: because the system will
panic as soon as possible, there will be no more than several error
records. These error records are valuable for system fault
diagnosis, so they should not be ratelimited.
Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Chen Gong [Wed, 13 Jul 2011 05:14:14 +0000 (13:14 +0800)]
ACPI, APEI, ERST, Fix erst-dbg long record reading issue
When debugging the ERST table with erst-dbg, an error record in the
ERST table that is too long (>4K) cannot be read out. This patch
increases the buffer size to 16K to ensure such error records can be
read from the ERST table.
Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
ACPI, APEI, HEST, Detect duplicated hardware error source ID
The firmware on some machines reports duplicated hardware error
source IDs in HEST. This is considered a firmware bug. To provide
a better warning message, this patch adds detection of duplicated
hardware error source IDs and a corresponding printk.
This patch fixes #37412 on kernel bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=37412
Reported-by: marconifabio@ubuntu-it.org Signed-off-by: Huang Ying <ying.huang@intel.com> Tested-by: Mathias <janedo.spam@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Mon, 6 Jun 2011 05:06:57 +0000 (01:06 -0400)]
acpi-cpufreq: remove unreliable device.get() code
cpufreq offers the optional driver.get() entry point
for drivers to export instantaneous frequency in
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq.
25% of the acpi-cpufreq driver is involved in supporting
that optional feature, but on modern processors, it
is not reliable.
So here we delete this optional feature from acpi-cpufreq.
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq
will go away on acpi-cpufreq systems, but note that
/sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
will still be present to indicate the most recent request.
(and yes, powertop still works:-)
The most common reason that driver.get() is not reliable
is that modern processors autonomously change frequency
without OS instruction. This means that a PERF_STATUS
reading is possibly inaccurate by the time the next
instruction executes.
Average frequency over an interval is more useful
than instantaneous frequency on modern hardware.
acpi-cpufreq supplies average frequency via
the driver->getavg() entry, which is what
the ondemand governor uses.
Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
SUNRPC: Fix use of static variable in rpcb_getport_async
NFSv4.1: update nfs4_fattr_bitmap_maxsz
SUNRPC: Fix a race between work-queue and rpc_killall_tasks
pnfs: write: Set mds_offset in the generic layer - it is needed by all LDs
Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms/evergreen: emit SQ_LDS_RESOURCE_MGMT for blits
agp/intel: Fix typo in G4x_GMCH_SIZE_VT_2M
drm/radeon/kms: fix typo in read_disabled vbios code
drm/radeon/kms: use correct BUS_CNTL reg on rs600
drm/radeon/kms: fix backend map typo on juniper
drm/radeon/kms: fix regression in hotplug
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
slip: fix wrong SLIP6 ifdef-endif placing
natsemi: fix another dma-debug report
sctp: ABORT if receive, reassmbly, or reodering queue is not empty while closing socket
net: Fix default in docs for tcp_orphan_retries.
hso: fix a use after free condition
net/natsemi: Fix module parameter permissions
XFRM: Fix memory leak in xfrm_state_update
sctp: Enforce retransmission limit during shutdown
mac80211: fix TKIP replay vulnerability
mac80211: fix ie memory allocation for scheduled scans
ssb: fix init regression of hostmode PCI core
rtlwifi: rtl8192cu: Add new USB ID for Netgear WNA1000M
ath9k: Fix tx throughput drops for AR9003 chips with AES encryption
carl9170: add NEC WL300NU-AG usbid
cfg80211: fix deadlock with rfkill/sched_scan by adding new mutex
ath5k: fix incorrect use of drvdata in PCI suspend/resume code
ath5k: fix incorrect use of drvdata in sysfs code
Bluetooth: Fix memory leak under page timeouts
Bluetooth: Fix regression with incoming L2CAP connections
Bluetooth: Fix hidp disconnect deadlocks and lost wakeup
...
Philip Rakity [Thu, 7 Jul 2011 16:04:55 +0000 (09:04 -0700)]
mmc: core: Bus width testing needs to handle suspend/resume
On reading the ext_csd for the first time (in 1 bit mode), save the
ext_csd information needed for bus width compare.
On every pass we make re-reading the ext_csd, compare the data
against the saved ext_csd data.
This fixes a regression introduced in 3.0-rc1 by 08ee80cc397ac1a3
("mmc: core: eMMC bus width may not work on all platforms"), which
incorrectly assumed we would be re-reading the ext_csd at resume-
time.
Signed-off-by: Philip Rakity <prakity@marvell.com> Tested-by: Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by: Chris Ball <cjb@laptop.org>
ACPI: Fix lockdep false positives in acpi_power_off()
All ACPICA locks are allocated by the same function,
acpi_os_create_lock(), with the help of a local variable called
"lock". Thus, when lockdep is enabled, it uses "lock" as the
name of all those locks and regards them as instances of the same
lock, which causes it to report possible locking problems with them
when there aren't any.
To work around this problem, define acpi_os_create_lock() as a macro
and make it pass its argument to spin_lock_init(), so that lockdep
uses it as the name of the new lock. Define this macro in a
Linux-specific file, to minimize the resulting modifications of
the OS-independent ACPICA parts.
This change is based on an earlier patch from Andrea Righi and it
addresses a regression from 2.6.39 tracked as
https://bugzilla.kernel.org/show_bug.cgi?id=38152
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Andrea Righi <andrea@betterlinux.com> Reviewed-by: Florian Mickler <florian@mickler.org> Signed-off-by: Len Brown <len.brown@intel.com>
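The naming effect comes from preprocessor stringification, which can be shown without lockdep. The sketch below is a model with hypothetical names: sketch_lock_init() stands in for an initializer that records a name, and only the name is modeled, not lockdep's class tracking.

```c
struct sketch_lock {
    const char *name;
    int locked;
};

void sketch_lock_init(struct sketch_lock *l, const char *name)
{
    l->name = name;
    l->locked = 0;
}

/* Records the text of the argument expression as the lock's name. */
#define SKETCH_LOCK_INIT(l) sketch_lock_init((l), #l)

/* Function wrapper: every lock is named after the parameter "lock". */
void create_lock_fn(struct sketch_lock *lock)
{
    SKETCH_LOCK_INIT(lock);
}

/* Macro wrapper: the name is the caller's own expression. */
#define create_lock_macro(l) SKETCH_LOCK_INIT(l)
```

With the function wrapper, two different locks both end up named "lock". With the macro wrapper, each lock carries the expression written at its call site.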
Paul E. McKenney [Sun, 10 Jul 2011 22:57:35 +0000 (15:57 -0700)]
rcu: Prevent RCU callbacks from executing before scheduler initialized
Under some rare but real combinations of configuration parameters, RCU
callbacks are posted during early boot that use kernel facilities that
are not yet initialized. Therefore, when these callbacks are invoked,
hard hangs and crashes ensue. This commit therefore prevents RCU
callbacks from being invoked until after the scheduler is fully up and
running, as in after multiple tasks have been spawned.
It might well turn out that a better approach is to identify the specific
RCU callbacks that are causing this problem, but that discussion will
wait until such time as someone really needs an RCU callback to be invoked
(as opposed to merely registered) during early boot.
Reported-by: julie Sullivan <kernelmail.jms@gmail.com> Reported-by: RKK <kulkarni.ravi4@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: julie Sullivan <kernelmail.jms@gmail.com> Tested-by: RKK <kulkarni.ravi4@gmail.com>
Ryusuke Konishi [Sun, 19 Jun 2011 07:56:29 +0000 (16:56 +0900)]
nilfs2: remove resize from unsupported features list
The resize feature has been supported since commit 4e33f9eab07e, but
this was not reflected in the list of unsupported features in the
nilfs2.txt file. This updates the list to fix the discrepancy.
Chris Wilson [Tue, 12 Jul 2011 22:38:18 +0000 (23:38 +0100)]
agp/intel: Fix typo in G4x_GMCH_SIZE_VT_2M
Konstantin Belousov found an error in the define of G4x_GMCH_SIZE_VT_2M
relative to the GMCH specs, and confirmed that indeed one of his users
with a Q45 reports 0xb not 0xc for a 2/2MiB GATT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Konstantin Belousov <kostikbel@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
Al Viro [Wed, 13 Jul 2011 01:40:23 +0000 (21:40 -0400)]
Fix ->d_lock locking order in unlazy_walk()
Make sure that child is still a child of parent before nested locking
of child->d_lock in unlazy_walk(); otherwise we are risking a violation
of locking order and deadlocks.
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/mm: Fix memory_block_size_bytes() for non-pseries
mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header
Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6
* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6:
drm/i915/ringbuffer: Idling requires waiting for the ring to be empty
Revert "drm/i915: enable rc6 by default"
drm/i915: Clean up i915_driver_load failure path
drm/i915: Enable GPU reset on Ivybridge.
drm/i915/dp: manage sink power state if possible
drm/i915/dp: consolidate AUX retry code
drm/i915/dp: remove DPMS mode tracking from DP
drm/i915/dp: try to read receiver capabilities 3 times when detecting
drm/i915/dp: read more receiver capability bits on hotplug
drm/i915/dp: use DP DPCD defines when looking at DPCD values
drm/i915/dp: retry link status read 3 times on failure
Ben Greear [Tue, 12 Jul 2011 17:27:55 +0000 (10:27 -0700)]
SUNRPC: Fix use of static variable in rpcb_getport_async
Because struct rpcbind_args *map was declared static, if two
threads entered this method at the same time, the values
assigned to map could be sent to two different tasks.
This could cause all sorts of problems, including use-after-free
and double-free of memory.
Fix this by removing the static declaration so that the map
pointer is on the stack.
Signed-off-by: Ben Greear <greearb@candelatech.com> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chris Wilson [Tue, 12 Jul 2011 17:03:29 +0000 (18:03 +0100)]
drm/i915/ringbuffer: Idling requires waiting for the ring to be empty
...which is measured by the size and not the amount of space remaining.
Waiting upon size-8 did one of two things. In the common case with more
than 8 bytes available to write into the ring, it would return
immediately. Otherwise, it would time out given the impossible condition
of waiting for more space than is available in the ring, leading to
warnings such as:
[drm:intel_cleanup_ring_buffer] *ERROR* failed to quiesce render ring
whilst cleaning up: -16
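The distinction between "empty" and "some space free" can be sketched with a toy ring. This is an illustrative model only: struct ring, its field names and the 8-byte reserved gap (RING_GAP) are assumptions made for the example, not the driver's definitions. It shows that an empty ring has exactly size - 8 bytes free, that a wait for fewer bytes than are already free is satisfied at once even while the ring is busy, and that a wait for more than size - 8 bytes can never be satisfied.

```c
#include <stdbool.h>

/* Hypothetical ring model; names are illustrative, not the i915 structs. */
struct ring {
    int size;   /* total ring size in bytes */
    int head;   /* offset the consumer has reached */
    int tail;   /* offset the producer will write next */
};

/* Assumed gap kept between tail and head so a full ring is
 * distinguishable from an empty one. */
#define RING_GAP 8

/* Bytes currently free for the producer. */
int ring_space(const struct ring *r)
{
    int space = r->head - (r->tail + RING_GAP);

    if (space < 0)
        space += r->size;
    return space;
}

/* The ring is idle only when everything written has been consumed,
 * i.e. free space has reached its maximum of size - RING_GAP. */
bool ring_is_idle(const struct ring *r)
{
    return ring_space(r) == r->size - RING_GAP;
}

/* Would a wait for n free bytes return immediately? */
bool wait_satisfied(const struct ring *r, int n)
{
    return ring_space(r) >= n;
}
```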
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Keith Packard <keithp@keithp.com>
Keith Packard [Sun, 10 Jul 2011 20:12:17 +0000 (13:12 -0700)]
drm/i915: Clean up i915_driver_load failure path
i915_driver_load adds a write-combining MTRR region for the GTT
aperture to improve memory speeds through the aperture. If
i915_driver_load fails after this, it would not have cleaned up the
MTRR. This shouldn't cause any problems, except for consuming an MTRR
register. Still, it's best to clean up completely in the failure path,
which is easily done by calling mtrr_del if the mtrr was successfully
allocated.
i915_driver_load calls i915_gem_load, which registers
i915_gem_inactive_shrink. If i915_driver_load fails after calling
i915_gem_load, the shrinker will be left registered. When called, it
will access freed memory and crash. The fix is to unregister the shrinker in the
failure path using code duplicated from i915_driver_unload.
i915_driver_load also has some incorrect gotos in the error cleanup
paths:
* After failing to initialize the GTT (which cannot happen, btw,
intel_gtt_get returns a fixed (non-NULL) value), it tries to
free the uninitialized WC IO mapping. Fixed this by changing the
target from out_iomapfree to out_rmmap.
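The cleanup rules described above follow the usual goto-unwind pattern: each failure jumps to the label that undoes exactly what has been set up so far, and nothing more. A minimal sketch of that pattern follows. It is not the driver code: struct dev_state, enum fail_point, the MTRR register number and the label layout are all hypothetical, and real resources are replaced by flags so the unwinding can be checked.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bookkeeping standing in for real driver resources. */
struct dev_state {
    bool mmio_mapped;
    bool wc_iomap;             /* write-combining IO mapping */
    int  mtrr;                 /* < 0 means no MTRR was allocated */
    bool shrinker_registered;
};

/* Where the simulated load should fail. */
enum fail_point { FAIL_NONE, FAIL_GTT, FAIL_GEM, FAIL_LATE };

int driver_load(struct dev_state *s, enum fail_point fail)
{
    int ret = -ENOMEM;

    s->mmio_mapped = false;
    s->wc_iomap = false;
    s->mtrr = -1;
    s->shrinker_registered = false;

    s->mmio_mapped = true;
    if (fail == FAIL_GTT)
        goto out_rmmap;        /* WC mapping does not exist yet: skip freeing it */

    s->wc_iomap = true;
    s->mtrr = 3;               /* pretend the allocation returned register 3 */

    if (fail == FAIL_GEM)
        goto out_mtrrfree;     /* shrinker was never registered */
    s->shrinker_registered = true;

    if (fail == FAIL_LATE)
        goto out_gem_unload;
    return 0;

out_gem_unload:
    s->shrinker_registered = false;   /* never leave it pointing at freed state */
out_mtrrfree:
    if (s->mtrr >= 0)                 /* release only if allocation succeeded */
        s->mtrr = -1;
    s->wc_iomap = false;
out_rmmap:
    s->mmio_mapped = false;
    return ret;
}
```

Labels are ordered in reverse of setup, so falling through from any label releases everything acquired before the failure point.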
Signed-off-by: Keith Packard <keithp@keithp.com> Tested-by: Lin Ming <ming.m.lin@intel.com>
hwmon: (pmbus) Use long variables for register to data conversions
Using integer variable types for register to data conversions can cause
overflows especially for power calculations, which are in microwatt.
Use long variables instead.
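A short arithmetic sketch of why power values are the risky case: a 32-bit signed integer tops out at 2147483647, so anything above roughly 2147 W no longer fits once expressed in microwatts. The helper names below are hypothetical, and the sketch uses int64_t rather than long so that the width is explicit on every platform; it illustrates the overflow, not the driver's conversion code.

```c
#include <stdint.h>

/* Convert whole watts to microwatts using 64-bit arithmetic so the
 * multiplication itself cannot overflow for any 32-bit input. */
int64_t watts_to_microwatts(int32_t watts)
{
    return (int64_t)watts * 1000000;
}

/* Nonzero when the value would not fit in a 32-bit signed integer. */
int overflows_int32(int64_t v)
{
    return v > INT32_MAX || v < INT32_MIN;
}
```

For example, 2000 W still fits in 32 bits as microwatts, while 3000 W does not.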
Michal Marek [Tue, 12 Jul 2011 09:54:48 +0000 (11:54 +0200)]
kbuild: Do not write to builddir in modules_install
Let depmod.sh create a temporary directory in /tmp instead of writing to
the build directory as root. The mktemp utility should be available on
any recent system (and there is already scripts/gen_initramfs_list.sh
relying on it).
Reported-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Michal Marek <mmarek@suse.cz>
There is a potential race during filesystem mounting which has recently
been reported. It occurs when the userland gfs_controld is able to
process requests fast enough that it tries to use the sysfs interface
before the lock module is properly initialised. This is a pretty
unusual case as normally the lock module initialisation is very quick
compared with gfs_controld.
This patch adds an interruptible completion which is used to ensure that
userland will wait for the initialisation of the lock module to
complete.
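The idea of a completion can be sketched in user space with a mutex and a condition variable. This is a rough analogue under stated assumptions, not the GFS2 patch: the struct and function names are invented for the example, POSIX threads stand in for the kernel primitives, and the interruptible aspect (returning early on a signal) is not modelled.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space analogue of a kernel completion. */
struct init_completion {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            done;
};

void init_completion_setup(struct init_completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Called once by the initialisation path when setup has finished. */
void init_completion_signal_all(struct init_completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_broadcast(&c->cond);   /* wake every waiter */
    pthread_mutex_unlock(&c->lock);
}

/* Called by each request handler before it touches the module state.
 * Returns immediately if initialisation already finished. */
void init_completion_wait(struct init_completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)                    /* loop guards against spurious wakeups */
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}
```

A handler that arrives early blocks in init_completion_wait until the initialisation path calls init_completion_signal_all; one that arrives late passes straight through.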
There are other potential solutions to this problem, but this is the
quickest at this stage and has been tested both with and without
mount.gfs2 present in the system.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reported-by: David Booher <dbooher@adams.net>
GFS2: force a log flush when invalidating the rindex glock
Right now, there is nothing that forces the log to get flushed when a node
drops its rindex glock so that another node can grow the filesystem. If the
log doesn't get flushed, GFS2 can corrupt the sd_log_le_rg list in the
following way.
A node puts an rgd on the list in rg_lo_add(), and then the rindex glock is
dropped so the other node can grow the filesystem. When the node reacquires the
rindex glock, that rgd gets deleted in clear_rgrpdi() before ever being
removed from the list by gfs2_log_flush().
This code simply forces a log flush when the rindex glock is invalidated,
solving the problem.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
powerpc/mm: Fix memory_block_size_bytes() for non-pseries
Just compiling pseries in the kernel causes it to override
memory_block_size_bytes() regardless of what the runtime
platform is.
This cleans up the implementation of that function, fixing
a bug or two while at it, so that it's harmless (and potentially
useful) for other platforms. Without this, bugs in that code
would trigger a WARN_ON() in drivers/base/memory.c when
booting some different platforms.
If/when we have another platform supporting memory hotplug we
might want to either move that out to a generic place or
make it a ppc_md. callback.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header
The macro MIN_MEMORY_BLOCK_SIZE is currently defined twice in two .c
files, and I need it in a third one to fix a powerpc bug, so let's
first move it into a header.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Ingo Molnar <mingo@elte.hu>
Documentation/Changes: remove some really obsolete text
That file harkens back to the days of the big 2.4 -> 2.6 version jump,
and was based even then on older versions. Some of it is just obsolete,
and Jesper Juhl points out that it talks about kernel versions 2.6 and
should be updated to 3.0.
Remove some obsolete text, and rephrase some of the rest so that it is not 2.6-specific.
Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
[media] msp3400: fill in v4l2_tuner based on vt->type field
[media] tuner-core.c: don't change type field in g_tuner or g_frequency
[media] cx18/ivtv: fix g_tuner support
[media] tuner-core: power up tuner when called with s_power(1)
[media] v4l2-ioctl.c: check for valid tuner type in S_HW_FREQ_SEEK
[media] tuner-core: simplify the standard fixup
[media] tuner-core/v4l2-subdev: document that the type field has to be filled in
[media] v4l2-subdev.h: remove unused s_mode tuner op
[media] feature-removal-schedule: change in how radio device nodes are handled
[media] bttv: fix s_tuner for radio
[media] pvrusb2: fix g/s_tuner support
[media] v4l2-ioctl.c: prefill tuner type for g_frequency and g/s_tuner
[media] tuner-core: fix tuner_resume: use t->mode instead of t->type
[media] tuner-core: fix s_std and s_tuner