git.karo-electronics.de Git - karo-tx-linux.git/log

Merge branch 'acpi-cpufreq' into test

Merge branch 'apei' into test

Merge branch 'battery' into test

Merge branch 'bz-24492' into test

Merge branch 'misc' into test

Merge branch 'acpica' into test

Merge branches 'd3cold', 'bugzilla-37412' and 'bugzilla-38152' into release

ACPI: fix 80 char overflow

Trivial fix for 80 char line overflow in drivers/acpi/pci_root.c

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()

Use battery->lock in sysfs_remove_battery() to make
checking, removing, and clearing bat.dev atomic.
This is necessary because sysfs_remove_battery() may
be invoked concurrently from different paths.

https://bugzilla.kernel.org/show_bug.cgi?id=35642

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI / Battery: Add the check before refresh sysfs in the battery_notify()

In the commit 25be5821, add the refresh sysfs when system resumes
from suspending. But it didn't check that the battery exists. This
will cause battery sysfs files added when the battery doesn't exist.
This patch add the check before refreshing.

https://bugzilla.kernel.org/show_bug.cgi?id=35642

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI / Battery: Add the hibernation process in the battery_notify()

The Commit 25be58215 has added a PM notifier to refresh the sys in order
to deal with the unit change of the Battery Present Rate. But it just
consided the suspend situation. The problem also will happen during the
hibernation according the bug 28192.

https://bugzilla.kernel.org/show_bug.cgi?id=28192

This patch adds the hibernation process and fix the bug.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI / Battery: Rename acpi_battery_quirks2 with acpi_battery_quirks

This patch is cosmetic only, and makes no functional change.
Since the acpi_battery_quirks has been deleted, rename
acpi_battery_quirks2 with acpi_battery_quirks to clean the code.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI / Battery: Change 16-bit signed negative battery current into correct value

This patch is for some machines which report the battery current
as a 16-bit signed negative when it is charging. This is caused
by DSDT bug. The commit bc76f90b8a5cf4aceedf210d08d5e8292f820cec
has resolved the problem for Acer laptops. But some other machines
also have such problem.

https://bugzilla.kernel.org/show_bug.cgi?id=33722

Since it is improper that the current is above 32A on laptops
whether on AC or on battery, this patch is to check the current and
take its absolute value as current and producing a message when it
is negative in s16.

Remove Acer quirk, as this workaround handles Acer too.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI / Battery: Add the power unit macro

This patch is cosmetic only, and makes no functional change.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI / SBS: Correct the value of power_now and power_avg in the sysfs

https://bugzilla.kernel.org/show_bug.cgi?id=24492

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI / SBS: Add getting state operation in the acpi_sbs_battery_get_property()

https://bugzilla.kernel.org/show_bug.cgi?id=24492

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI: Fixes device power states array overflow

Commit 28c2103 added new state ACPI_STATE_D3_COLD, so the device power
states array must be expanded by one also.

v2: Use ACPI_D_STATE_COUNT instead of number 5 for the array size.

Reported-by: Dan Carpenter <error27@gmail.com>
Suggested-by: Oldřich Jedlička <oldium.pro@seznam.cz>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPICA: Update to version 20110623

Version 20110623.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPICA: Do not repair _TSS return package if _PSS is present

We can only sort the _TSS return package if there is no _PSS
in the same scope. This is because if _PSS is present, the ACPI
specification dictates that the _TSS Power Dissipation field is
to be ignored, and therefore some BIOSs leave garbage values in
the _TSS Power field(s). In this case, it is best to just return
the _TSS package as-is.

Reported-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPICA: Add option to disable method return value validation and repair

Runtime option can be used to disable return value repair if this
is causing a problem on a particular machine.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPICA: Update to version 20110527

Version 20110527.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPICA: Add missing _TDL to list of predefined names

Affects both iASL and core ACPICA.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPICA: Load operator: re-instate most restrictions on incoming table signature

Now, only allow "SSDT" "OEM", and a null signature.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, GHES: Add hardware memory error recovery support

memory_failure_queue() is called when recoverable memory errors are
notified by firmware to do the recovery work.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

HWPoison: add memory_failure_queue()

memory_failure() is the entry point for HWPoison memory error
recovery.  It must be called in process context.  But commonly
hardware memory errors are notified via MCE or NMI, so some delayed
execution mechanism must be used.  In MCE handler, a work queue + ring
buffer mechanism is used.

In addition to MCE, now APEI (ACPI Platform Error Interface) GHES
(Generic Hardware Error Source) can be used to report memory errors
too.  To add support to APEI GHES memory recovery, a mechanism similar
to that of MCE is implemented.  memory_failure_queue() is the new
entry point that can be called in IRQ context.  The next step is to
make MCE handler uses this interface too.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, GHES, Error records content based throttle

printk is used by GHES to report hardware errors.  Ratelimit is
enforced on the printk to avoid too many hardware error reports in
kernel log.  Because there may be thousands or even millions of
corrected hardware errors during system running.

Currently, a simple scheme is used.  That is, the total number of
hardware error reporting is ratelimited.  This may cause some issues
in practice.

For example, there are two kinds of hardware errors occurred in
system.  One is corrected memory error, because the fault memory
address is accessed frequently, there may be hundreds error report
per-second.  The other is corrected PCIe AER error, it will be
reported once per-second.  Because they share one ratelimit control
structure, it is highly possible that only memory error is reported.

To avoid the above issue, an error record content based throttle
algorithm is implemented in the patch.  Where after the first
successful reporting, all error records that are same are throttled for
some time, to let other kinds of error records have the opportunity to
be reported.

In above example, the memory errors will be throttled for some time,
after being printked.  Then the PCIe AER error will be printked
successfully.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, GHES, printk support for recoverable error via NMI

Some APEI GHES recoverable errors are reported via NMI, but printk is
not safe in NMI context.

To solve the issue, a lock-less memory allocator is used to allocate
memory in NMI handler, save the error record into the allocated
memory, put the error record into a lock-less list. On the other
hand, an irq_work is used to delay the operation from NMI context to
IRQ context. The irq_work IRQ handler will remove nodes from
lock-less list, printk the error record and do some further processing
include recovery operation, then free the memory.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

lib, Make gen_pool memory allocator lockless

This version of the gen_pool memory allocator supports lockless
operation.

This makes it safe to use in NMI handlers and other special
unblockable contexts that could otherwise deadlock on locks.  This is
implemented by using atomic operations and retries on any conflicts.
The disadvantage is that there may be livelocks in extreme cases.  For
better scalability, one gen_pool allocator can be used for each CPU.

The lockless operation only works if there is enough memory available.
If new memory is added to the pool a lock has to be still taken.  So
any user relying on locklessness has to ensure that sufficient memory
is preallocated.

The basic atomic operation of this allocator is cmpxchg on long.  On
architectures that don't have NMI-safe cmpxchg implementation, the
allocator can NOT be used in NMI handler.  So code uses the allocator
in NMI handler should depend on CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>

lib, Add lock-less NULL terminated single list

Cmpxchg is used to implement adding new entry to the list, deleting
all entries from the list, deleting first entry of the list and some
other operations.

Because this is a single list, so the tail can not be accessed in O(1).

If there are multiple producers and multiple consumers, llist_add can
be used in producers and llist_del_all can be used in consumers.  They
can work simultaneously without lock.  But llist_del_first can not be
used here.  Because llist_del_first depends on list->first->next does
not changed if list->first is not changed during its operation, but
llist_del_first, llist_add, llist_add (or llist_del_all, llist_add,
llist_add) sequence in another consumer may violate that.

If there are multiple producers and one consumer, llist_add can be
used in producers and llist_del_all or llist_del_first can be used in
the consumer.

This can be summarized as follow:

           |   add    | del_first |  del_all
add       |    -     |     -     |     -
del_first |          |     L     |     L
del_all   |          |           |     -

Where "-" stands for no lock is needed, while "L" stands for lock is
needed.

The list entries deleted via llist_del_all can be traversed with
traversing function such as llist_for_each etc.  But the list entries
can not be traversed safely before deleted from the list.  The order
of deleted entries is from the newest to the oldest added one.  If you
want to traverse from the oldest to the newest, you must reverse the
order by yourself before traversing.

The basic atomic operation of this list is cmpxchg on long.  On
architectures that don't have NMI-safe cmpxchg implementation, the
list can NOT be used in NMI handler.  So code uses the list in NMI
handler should depend on CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>

Add Kconfig option ARCH_HAVE_NMI_SAFE_CMPXCHG

cmpxchg() is widely used by lockless code, including NMI-safe lockless
code. But on some architectures, the cmpxchg() implementation is not
NMI-safe, on these architectures the lockless code may need a
spin_trylock_irqsave() based implementation.

This patch adds a Kconfig option: ARCH_HAVE_NMI_SAFE_CMPXCHG, so that
NMI-safe lockless code can depend on it or provide different
implementation according to it.

On many architectures, cmpxchg is only NMI-safe for several specific
operand sizes. So, ARCH_HAVE_NMI_SAFE_CMPXCHG define in this patch
only guarantees cmpxchg is NMI-safe for sizeof(unsigned long).

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Mikael Starvik <starvik@axis.com>
CC: David Howells <dhowells@redhat.com>
CC: Yoshinori Sato <ysato@users.sourceforge.jp>
CC: Tony Luck <tony.luck@intel.com>
CC: Hirokazu Takata <takata@linux-m32r.org>
CC: Geert Uytterhoeven <geert@linux-m68k.org>
CC: Michal Simek <monstr@monstr.eu>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: Kyle McMartin <kyle@mcmartin.ca>
CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: Chen Liqin <liqin.chen@sunplusct.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ingo Molnar <mingo@redhat.com>
CC: Chris Zankel <chris@zankel.net>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, Add WHEA _OSC support

APEI firmware first mode must be turned on explicitly on some
machines, otherwise there may be no GHES hardware error record for
hardware error notification. APEI bit in generic _OSC call can be
used to do that, but on some machine, a special WHEA _OSC call must be
used. This patch adds the support to that WHEA _OSC call.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, Add APEI bit support in generic _OSC call

In APEI firmware first mode, hardware error is reported by hardware to
firmware firstly, then firmware reports the error to Linux in a GHES
error record via POLL/SCI/IRQ/NMI etc.

This may result in some issues if OS has no full APEI support.  So
some firmware implementation will work in a back-compatible mode by
default.  Where firmware will only notify OS in old-fashion, without
GHES record.  For example, for a fatal hardware error, only NMI is
signaled, no GHES record.

To gain full APEI power on these machines, APEI bit in generic _OSC
call can be specified to tell firmware that Linux has full APEI
support.  This patch adds the APEI bit support in generic _OSC call.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, GHES, Support disable GHES at boot time

Some machine may have broken firmware so that GHES and firmware first
mode should be disabled. This patch adds support to that.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, GHES, Prevent GHES to be built as module

GHES (Generic Hardware Error Source) is used to process hardware error
notification in firmware first mode. But because firmware first mode
can be turned on but can not be turned off, it is unreasonable to
unload the GHES module with firmware first mode turned on. To avoid
confusion, this patch makes GHES can be enabled/disabled in
configuration time, but not built as module and unloaded at run time.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, Use apei_exec_run_optional in APEI EINJ and ERST

This patch changes APEI EINJ and ERST to use apei_exec_run for
mandatory actions, and apei_exec_run_optional for optional actions.

Cc: Thomas Renninger <trenn@novell.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, Add apei_exec_run_optional

Some actions in APEI ERST and EINJ tables are optional, for example,
ACPI_EINJ_BEGIN_OPERATION action is used to do some preparation for
error injection, and firmware may choose to do nothing here. While
some other actions are mandatory, for example, firmware must provide
ACPI_EINJ_GET_ERROR_TYPE implementation.

Original implementation treats all actions as optional (that is, can
have no instructions), that may cause issue if firmware does not
provide some mandatory actions. To fix this, this patch adds
apei_exec_run_optional, which should be used for optional actions.
The original apei_exec_run should be used for mandatory actions.

Cc: Thomas Renninger <trenn@novell.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, GHES, Do not ratelimit fatal error printk before panic

printk is used by GHES to report hardware errors.  Normally, the
printk will be ratelimited to avoid too many hardware error reports in
kernel log.  Because there may be thousands or even millions of
corrected hardware errors during system running.

That is different for fatal hardware error, because system will go
panic as soon as possible, there will be no more than several error
records.  And these error records are valuable for system fault
diagnosis, so they should not be ratelimited.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, ERST, Fix erst-dbg long record reading issue

When we debug ERST table with erst-dbg, if the error record in ERST
table is too long(>4K), it can't be read out. So this patch increases
the buffer size to 16K to ensure such error records can be read from
ERST table.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, ERST, Prevent erst_dbg from loading if ERST is disabled

erst_dbg module can not work when ERST is disabled. So disable module
loading to provide clearer information to user.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI, APEI, HEST, Detect duplicated hardware error source ID

The firmware on some machine will report duplicated hardware error
source ID in HEST. This is considered a firmware bug. To provide
better warning message, this patch adds duplicated hardware error
source ID detecting and corresponding printk.

This patch fixes #37412 on kernel bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=37412

Reported-by: marconifabio@ubuntu-it.org
Signed-off-by: Huang Ying <ying.huang@intel.com>
Tested-by: Mathias <janedo.spam@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>

acpi-cpufreq: remove unreliable device.get() code

cpufreq offers the optional driver.get() entry point
for drivers to export instantaneous frequency in
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq.

25% of the acpi-cpufreq driver is involved in supporting
that optional feature, but on modern processors, it
is not reliable.

So here we delete this optional feature from acpi-cpufreq.
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq
will go away on acpi-cpufreq systems, but note that
/sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
will still be presnet to indicate the most recent request.

(and yes, powertop still works:-)

The most common reason that driver.get() is not reliable
is that modern processors autonomously change frequency
without OS instruction. This means that reading
PERF_STATUS is possibly in-accurate as soon as the
instruction after it is read.

Average frequency over an interval is more useful
than instantaneous frequency on modern hardware.
acpi-cpufreq supplies average frequency via
the the driver->getavg() entry, which is what
the ondemand governor uses.

Signed-off-by: Len Brown <len.brown@intel.com>

ACPI: Fix lockdep false positives in acpi_power_off()

All ACPICA locks are allocated by the same function,
acpi_os_create_lock(), with the help of a local variable called
"lock". Thus, when lockdep is enabled, it uses "lock" as the
name of all those locks and regards them as instances of the same
lock, which causes it to report possible locking problems with them
when there aren't any.

To work around this problem, define acpi_os_create_lock() as a macro
and make it pass its argument to spin_lock_init(), so that lockdep
uses it as the name of the new lock. Define this macron in a
Linux-specific file, to minimize the resulting modifications of
the OS-independent ACPICA parts.

This change is based on an earlier patch from Andrea Righi and it
addresses a regression from 2.6.39 tracked as
https://bugzilla.kernel.org/show_bug.cgi?id=38152

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Tested-by: Andrea Righi <andrea@betterlinux.com>
Reviewed-by: Florian Mickler <florian@mickler.org>
Signed-off-by: Len Brown <len.brown@intel.com>

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/mm: Fix memory_block_size_bytes() for non-pseries
mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
  pcmcia: pxa2xx/vpac270: free gpios on exist rather than requesting
  ARM: pxa/raumfeld: fix device name for codec ak4104
  ARM: pxa/raumfeld: display initialisation fixes
  ARM: pxa/raumfeld: adapt to upcoming hardware change
  ARM: pxa: fix gpio_to_chip() clash with gpiolib namespace
  genirq: replace irq_gc_ack() with {set,clr}_bit variants (fwd)
  arm: mach-vt8500: add forgotten irq_data conversion
  ARM: pxa168: correct nand pmu setting
  ARM: pxa910: correct nand pmu setting
  ARM: pxa: fix PGSR register address calculation

Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6

* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6:
  drm/i915/ringbuffer: Idling requires waiting for the ring to be empty
  Revert "drm/i915: enable rc6 by default"
  drm/i915: Clean up i915_driver_load failure path
  drm/i915: Enable GPU reset on Ivybridge.
  drm/i915/dp: manage sink power state if possible
  drm/i915/dp: consolidate AUX retry code
  drm/i915/dp: remove DPMS mode tracking from DP
  drm/i915/dp: try to read receiver capabilities 3 times when detecting
  drm/i915/dp: read more receiver capability bits on hotplug
  drm/i915/dp: use DP DPCD defines when looking at DPCD values
  drm/i915/dp: retry link status read 3 times on failure

drm/i915/ringbuffer: Idling requires waiting for the ring to be empty

...which is measured by the size and not the amount of space remaining.

Waiting upon size-8, did one of two things. In the common case with more
than 8 bytes available to write into the ring, it would return
immediately. Otherwise, it would timeout given the impossible condition
of waiting for more space than is available in the ring, leading to
warnings such as:

[drm:intel_cleanup_ring_buffer] *ERROR* failed to quiesce render ring
whilst cleaning up: -16

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Keith Packard <keithp@keithp.com>

Revert "drm/i915: enable rc6 by default"

This reverts commit a51f7a66fb5e4af5ec4286baef940d06594b59d2.

We still have a few Ironlake and Sandybridge machines which fail when
RC6 is enabled. Better luck next release?

Signed-off-by: Keith Packard <keithp@keithp.com>

drm/i915: Clean up i915_driver_load failure path

i915_driver_load adds a write-combining MTRR region for the GTT
aperture to improve memory speeds through the aperture. If
i915_driver_load fails after this, it would not have cleaned up the
MTRR. This shouldn't cause any problems, except for consuming an MTRR
register. Still, it's best to clean up completely in the failure path,
which is easily done by calling mtrr_del if the mtrr was successfully
allocated.

i915_driver_load calls i915_gem_load which register
i915_gem_inactive_shrink. If i915_driver_load fails after calling
i915_gem_load, the shrinker will be left registered. When called, it
will access freed memory and crash. The fix is to unregister the shrinker in the
failure path using code duplicated from i915_driver_unload.

i915_driver_load also has some incorrect gotos in the error cleanup
paths:

* After failing to initialize the GTT (which cannot happen, btw,
   intel_gtt_get returns a fixed (non-NULL) value), it tries to
   free the uninitialized WC IO mapping. Fixed this by changing the
   target from out_iomapfree to out_rmmap

Signed-off-by: Keith Packard <keithp@keithp.com>
Tested-by: Lin Ming <ming.m.lin@intel.com>

powerpc/mm: Fix memory_block_size_bytes() for non-pseries

Just compiling pseries in the kernel causes it to override
memory_block_size_bytes() regardless of what is the runtime
platform.

This cleans up the implementation of that function, fixing
a bug or two while at it, so that it's harmless (and potentially
useful) for other platforms. Without this, bugs in that code
would trigger a WARN_ON() in drivers/base/memory.c when
booting some different platforms.

If/when we have another platform supporting memory hotplug we
might want to either move that out to a generic place or
make it a ppc_md. callback.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header

The macro MIN_MEMORY_BLOCK_SIZE is currently defined twice in two .c
files, and I need it in a third one to fix a powerpc bug, so let's
first move it into a header

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Ingo Molnar <mingo@elte.hu>

Linux 3.0-rc7

Documentation/Changes: remove some really obsolete text

That file harkens back to the days of the big 2.4 -> 2.6 version jump,
and was based even then on older versions. Some of it is just obsolete,
and Jesper Juhl points out that it talks about kernel versions 2.6 and
should be updated to 3.0.

Remove some obsolete text, and re-phrase some other to not be 2.6-specific.

Reported-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  [media] msp3400: fill in v4l2_tuner based on vt->type field
  [media] tuner-core.c: don't change type field in g_tuner or g_frequency
  [media] cx18/ivtv: fix g_tuner support
  [media] tuner-core: power up tuner when called with s_power(1)
  [media] v4l2-ioctl.c: check for valid tuner type in S_HW_FREQ_SEEK
  [media] tuner-core: simplify the standard fixup
  [media] tuner-core/v4l2-subdev: document that the type field has to be filled in
  [media] v4l2-subdev.h: remove unused s_mode tuner op
  [media] feature-removal-schedule: change in how radio device nodes are handled
  [media] bttv: fix s_tuner for radio
  [media] pvrusb2: fix g/s_tuner support
  [media] v4l2-ioctl.c: prefill tuner type for g_frequency and g/s_tuner
  [media] tuner-core: fix tuner_resume: use t->mode instead of t->type
  [media] tuner-core: fix s_std and s_tuner

Merge branch 'fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ycmiao/pxa-linux-2.6 into fixes

Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6

* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM: Reintroduce dropped call to check_wakeup_irqs

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: drop spinlock before calling cifs_put_tlink
  cifs: fix expand_dfs_referral
  cifs: move bdi_setup_and_register outside of CONFIG_CIFS_DFS_UPCALL
  cifs: factor smb_vol allocation out of cifs_setup_volume_info
  cifs: have cifs_cleanup_volume_info not take a double pointer
  cifs: fix build_unc_path_to_root to account for a prefixpath
  cifs: remove bogus call to cifs_cleanup_volume_info

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] fix cpumask memory leak in acpi-cpufreq on cpu hotplug.

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86:
  hp-wmi: fix use after free
  dell-laptop - using buffer without mutex_lock
  Revert: "dell-laptop: Toggle the unsupported hardware killswitch"
  platform-drivers-x86: set backlight type to BACKLIGHT_PLATFORM
  thinkpad-acpi: handle HKEY 0x4010, 0x4011 events
  drivers/platform/x86: Fix memory leak
  thinkpad-acpi: handle some new HKEY 0x60xx events
  acer-wmi: fix bitwise bug when set device state
  acer-wmi: Only update rfkill status for associated hotkey events

Merge branch 'movieboard' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6

* 'movieboard' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: ohci: do not bind to Pinnacle cards, avert panic

ath5k: Add missing breaks in switch/case

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Documentation/spinlocks.txt: Remove reference to sti()/cli()

Since we removed sti()/cli() and related, how about removing it from
Documentation/spinlocks.txt?

Signed-off-by: Muthukumar R <muthur@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cifs: drop spinlock before calling cifs_put_tlink

...as that function can sleep.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

hp-wmi: fix use after free

[  191.310008] WARNING: kmemcheck: Caught 32-bit read from freed memory (f0d25f14)
[  191.310011] c056d2f088000000105fd2f00000000050415353040000000000000000000000
[  191.310020]  i i i i f f f f f f f f f f f f f f f f f f f f f f f f f f f f
[  191.310027]                                          ^
[  191.310029]
[  191.310032] Pid: 737, comm: modprobe Not tainted 3.0.0-rc5+ #268 Hewlett-Packard HP Compaq 6005 Pro SFF PC/3047h
[  191.310036] EIP: 0060:[<f80b3104>] EFLAGS: 00010286 CPU: 0
[  191.310039] EIP is at hp_wmi_perform_query+0x104/0x150 [hp_wmi]
[  191.310041] EAX: f0d25601 EBX: f0d25f00 ECX: 000121cf EDX: 000121ce
[  191.310043] ESI: f0d25f10 EDI: f0f97ea8 EBP: f0f97ec4 ESP: c173f34c
[  191.310045]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  191.310046] CR0: 8005003b CR2: f540c000 CR3: 30f30000 CR4: 000006d0
[  191.310048] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  191.310050] DR6: ffff4ff0 DR7: 00000400
[  191.310051]  [<f80b317b>] hp_wmi_dock_state+0x2b/0x40 [hp_wmi]
[  191.310054]  [<f80b6093>] hp_wmi_init+0x93/0x1a8 [hp_wmi]
[  191.310057]  [<c10011f0>] do_one_initcall+0x30/0x170
[  191.310061]  [<c107ab9f>] sys_init_module+0xef/0x1a60
[  191.310064]  [<c149f998>] sysenter_do_call+0x12/0x28
[  191.310067]  [<ffffffff>] 0xffffffff

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>

dell-laptop - using buffer without mutex_lock

Using buffer->output[1] without mutex_lock()

Signed-off-by: Jose Alonso <joalonsof@gmail.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>

Revert: "dell-laptop: Toggle the unsupported hardware killswitch"

This reverts commit a3d77411e8b2ad661958c1fbee65beb476ec6d70,

as it causes a mess in the wireless rfkill status on some models.
It is probably a bad idea to toggle the rfkill for all dell models
without the respect to the claim that it is hardware-controlled.

Cc: stable@kernel.org
Signed-off-by: Keng-Yu Lin <kengyu@canonical.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>

PM: Reintroduce dropped call to check_wakeup_irqs

Patch 2e711c04dbbf7a7732a3f7073b1fc285d12b369d
(PM: Remove sysdev suspend, resume and shutdown operations)
deleted sysdev_suspend(), which was being relied on to call
check_wakeup_irqs() in suspend. If check_wakeup_irqs() is not
called, wake interrupts that are pending when suspend is
entered may be lost. It also breaks IRQCHIP_MASK_ON_SUSPEND,
which is handled in check_wakeup_irqs().

This patch adds a call to check_wakeup_irqs() in syscore_suspend(),
similar to what was deleted in sysdev_suspend().

Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

pcmcia: pxa2xx/vpac270: free gpios on exist rather than requesting

Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk>
Acked-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>

ARM: pxa/raumfeld: fix device name for codec ak4104

In commit f0fba2ad (ASoC: multi-component - ASoC Multi-Component
Support), the name of the ak4104 codec driver was changed without
amending the platform code which uses it as well.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>

ARM: pxa/raumfeld: display initialisation fixes

The display requires some milliseconds between GPIO_TFT_VA_EN
and GPIO_DISPLAY_ENABLE. Reorder initialisation to comply with
the display spec.

Also tune timings for better compliance with the spec.

Signed-off-by: Sven Neumann <s.neumann@raumfeld.com>
Acked-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>

ARM: pxa/raumfeld: adapt to upcoming hardware change

The backlight control is going to change back to PWM in the
upcoming Raumfeld Controller hardware revision.

Signed-off-by: Sven Neumann <s.neumann@raumfeld.com>
Acked-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>

ARM: pxa: fix gpio_to_chip() clash with gpiolib namespace

The PXA platform code has a static inline helper called
gpio_to_chip which clashes with the gpiolib namespace if we
try to expose the function with the same name from gpiolib,
and it's still confusing even if we don't do that. So rename
it to gpio_to_pxachip().

Reported-by: H Hartley Sweeten <hartleys@visionengravers.com>
Cc: Eric Miao <eric.miao@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>

[CPUFREQ] fix cpumask memory leak in acpi-cpufreq on cpu hotplug.

I came across a memory leak during a cyclic cpu-online-offline test.

Signed-off-by: Yu Luming <luming.yu@intel.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>

Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
  hwmon: (pmbus) Improve auto-detection of temperature status register
  hwmon: (lm95241) Fix negative temperature results
  hwmon: (lm95241) Fix chip detection code

hwmon: (pmbus) Improve auto-detection of temperature status register

It is possible that a PMBus device supports the READ_TEMPERATURE2 and/or
READ_TEMPERATURE3 registers but does not support READ_TEMPERATURE1.
Improve temperature status register detection to address this condition.

Reported-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org # 2.6.39+

hwmon: (lm95241) Fix negative temperature results

Negative temperatures were returned in degrees C instead of milli-Degrees C.
Also, negative temperatures were reported for remote temperature sensors even
if the chip was configured for positive-only results.

Fix by detecting temperature modes, and by treating negative temperatures
similar to positive temperatures, with appropriate sign extension.

Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org # 2.6.30+

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: hda - Fix a copmile warning
  ASoC: ak4642: fixup snd_soc_update_bits mask for PW_MGMT2
  ALSA: hda - Change all ADCs for dual-adc switching mode for Realtek
  ASoC: Manage WM8731 ACTIVE bit as a supply widget
  ASoC: Don't set invalid name string to snd_card->driver field
  ASoC: Ensure we delay long enough for WM8994 FLL to lock when starting
  ASoC: Tegra: I2S: Ensure clock is enabled when writing regs
  ASoC: Fix Blackfin I2S _pointer() implementation return in bounds values
  ASoC: tlv320aic3x: Do soft reset to codec when going to bias off state
  ASoC: tlv320aic3x: Don't sync first two registers from register cache
  audio: tlv320aic26: fix PLL register configuration

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: conditional resource-reallocation through kernel parameter pci=realloc

Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm

* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
  ARM: 6994/1: smp_twd: Fix typo in 'twd_timer_rate' printing
  ARM: 6987/1: l2x0: fix disabling function to avoid deadlock
  ARM: 6966/1: ep93xx: fix inverted RTS/DTR signals on uart1
  ARM: 6980/1: mmci: use StartBitErr to detect bad connections
  ARM: 6979/1: mach-vt8500: add forgotten irq_data conversion
  ARM: move memory layout sanity checking before meminfo initialization
  ARM: 6990/1: MAINTAINERS: add entry for ARM PMU profiling and debugging
  ARM: 6989/1: perf: do not start the PMU when no events are present
  ARM: dmabounce: fix map_single() error return value

Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms: clean up multiple crtc handling for evergreen+ (v2)

firewire: ohci: do not bind to Pinnacle cards, avert panic

When firewire-ohci is bound to a Pinnacle MovieBoard, eventually a
"Register access failure" is logged and an interrupt storm or a kernel
panic happens. https://bugzilla.kernel.org/show_bug.cgi?id=36622

Until this is sorted out (if that is going to succeed at all), let's
just prevent firewire-ohci from touching these devices.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: <stable@kernel.org>

cifs: fix expand_dfs_referral

Regression introduced in commit 724d9f1cfba.

Prior to that, expand_dfs_referral would regenerate the mount data string
and then call cifs_parse_mount_options to re-parse it (klunky, but it
worked). The above commit moved cifs_parse_mount_options out of cifs_mount,
so the re-parsing of the new mount options no longer occurred. Fix it by
making expand_dfs_referral re-parse the mount options.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

cifs: move bdi_setup_and_register outside of CONFIG_CIFS_DFS_UPCALL

This needs to be done regardless of whether that KConfig option is set
or not.

Reported-by: Sven-Haegar Koch <haegar@sdinet.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

Merge branch 'fix/asoc' into for-linus

ALSA: hda - Fix a copmile warning

It's harmless but annyoing.
sound/pci/hda/patch_realtek.c: In function ‘alc_cap_getput_caller’:
sound/pci/hda/patch_realtek.c:2722:9: warning: ‘err’ may be used uninitialized in this function

Signed-off-by: Takashi Iwai <tiwai@suse.de>

Merge branch 'for-3.0' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound-2.6 into fix/asoc

Merge branch 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung

* 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: S3C2440: fix section mismatch on mini2440
  ARM: S3C24XX: drop return codes in void function of dma.c
  ARM: S3C24XX: don't use uninitialized variable in dma.c
  ARM: EXYNOS4: Set appropriate I2C device variant
  ARM: S5PC100: Fix for compilation error
  spi/s3c64xx: Bug fix for SPI with different FIFO level
  ARM: SAMSUNG: Add tx_st_done variable
  ARM: EXYNOS4: Address a section mismatch w/ suspend issue.
  ARM: S5P: Fix bug on init of PWMTimers for HRTimer
  ARM: SAMSUNG: header file revised to prevent declaring duplicated
  ARM: EXYNOS4: fix improper gpio configuration
  ARM: EXYNOS4: Fix card detection for sdhci 0 and 2

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
  regulator: max8997: Fix setting inappropriate value for ramp_delay variable
  regulator: db8500-prcmu: small fixes
  regulator: max8997: remove dependency on platform_data pointer
  regulator: MAX8997: Fix for divide by zero error
  regulator: max8952 - fix wrong gpio valid check

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  btrfs: fix oops when doing space balance
  Btrfs: don't panic if we get an error while balancing V2
  btrfs: add missing options displayed in mount output

MAINTAINERS: update Bjorn Helgaas's email address

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/leds/leds-pca9532.c: change driver name to be unique

This driver handles the variants pca9530-pca9533, so it chose the name
"pca953x".  However, there is a gpio driver which decided on the same
name.  As a result, those two can't be loaded at the same time.  Add a
subsystem prefix to make the driver name unique.  Device matching will not
suffer, because both are I2C drivers which match using a
i2c_device_id-table which is not altered.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Jean Delvare <khali@linux-fr.org>
Cc: Richard Purdie <richard.purdie@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/nommu.c: fix remap_pfn_range()

remap_pfn_range() means map physical address pfn<<PAGE_SHIFT to user addr.

For nommu arch it's implemented by vma->vm_start = pfn << PAGE_SHIFT which
is wrong acroding the original meaning of this function. And some driver
developer using remap_pfn_range() with correct parameter will get
unexpected result because vm_start is changed. It should be implementd
like addr = pfn << PAGE_SHIFT but which is meanless on nommu arch, this
patch just make it simply return.

Parameter name and setting of vma->vm_flags also be fixed.

Signed-off-by: Bob Liu <lliubbo@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Bob Liu <lliubbo@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

w1: ds1wm: add a reset recovery parameter

This fixes a regression in 3.0 reported by Paul Parsons regarding the
removal of the msleep(1) in the ds1wm_reset() function:

: The linux-3.0-rc4 DS1WM 1-wire driver is logging "bus error, retrying"
: error messages on an HP iPAQ hx4700 PDA (XScale-PXA270):
:
: <snip>
: Driver for 1-wire Dallas network protocol.
: DS1WM w1 busmaster driver - (c) 2004 Szabolcs Gyurko
: 1-Wire driver for the DS2760 battery monitor  chip  - (c) 2004-2005, Szabolcs Gyurko
: ds1wm ds1wm: pass: 1 bus error, retrying
: ds1wm ds1wm: pass: 2 bus error, retrying
: ds1wm ds1wm: pass: 3 bus error, retrying
: ds1wm ds1wm: pass: 4 bus error, retrying
: ds1wm ds1wm: pass: 5 bus error, retrying
: ...
:
: The visible result is that the battery charging LED is erratic; sometimes
: it works, mostly it doesn't.
:
: The linux-2.6.39 DS1WM 1-wire driver worked OK.  I haven't tried 3.0-rc1,
: 3.0-rc2, or 3.0-rc3.

This sleep should not be required on normal circuitry provided the
pull-ups on the bus are correctly adapted to the slaves.  Unfortunately,
this is not always the case.  The sleep is restored but as a parameter to
the probe function in the pdata.

[akpm@linux-foundation.org: coding-style fixes]
Reported-by: Paul Parsons <lost.distance@yahoo.com>
Tested-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Jean-François Dagenais <dagenaisj@sonatest.com>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: fix numa scan information update to be triggered by memory event

commit 889976dbcb12 ("memcg: reclaim memory from nodes in round-robin
order") adds an numa node round-robin for memcg. But the information is
updated once per 10sec.

This patch changes the update trigger from jiffies to memcg's event count.
After this patch, numa scan information will be updated when we see 1024
events of pagein/pageout under a memcg.

[akpm@linux-foundation.org: attempt to repair code layout]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ying Han <yinghan@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: fix reclaimable lru check in memcg

Now, in mem_cgroup_hierarchical_reclaim(), mem_cgroup_local_usage() is
used for checking whether the memcg contains reclaimable pages or not.  If
no pages in it, the routine skips it.

But, mem_cgroup_local_usage() contains Unevictable pages and cannot handle
"noswap" condition correctly.  This doesn't work on a swapless system.

This patch adds test_mem_cgroup_reclaimable() and replaces
mem_cgroup_local_usage().  test_mem_cgroup_reclaimable() see LRU counter
and returns correct answer to the caller.  And this new function has
"noswap" argument and can see only FILE LRU if necessary.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix kerneldoc layout]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ying Han <yinghan@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: __tlb_remove_page() check the correct batch

__tlb_remove_page() switches to a new batch page, but still checks space
in the old batch. This check always fails, and causes a forced tlb flush.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: vmscan: only read new_classzone_idx from pgdat when reclaiming successfully

During allocator-intensive workloads, kswapd will be woken frequently
causing free memory to oscillate between the high and min watermark.  This
is expected behaviour.  Unfortunately, if the highest zone is small, a
problem occurs.

When balance_pgdat() returns, it may be at a lower classzone_idx than it
started because the highest zone was unreclaimable.  Before checking if it
should go to sleep though, it checks pgdat->classzone_idx which when there
is no other activity will be MAX_NR_ZONES-1.  It interprets this as it has
been woken up while reclaiming, skips scheduling and reclaims again.  As
there is no useful reclaim work to do, it enters into a loop of shrinking
slab consuming loads of CPU until the highest zone becomes reclaimable for
a long period of time.

There are two problems here.  1) If the returned classzone or order is
lower, it'll continue reclaiming without scheduling.  2) if the highest
zone was marked unreclaimable but balance_pgdat() returns immediately at
DEF_PRIORITY, the new lower classzone is not communicated back to kswapd()
for sleeping.

This patch does two things that are related.  If the end_zone is
unreclaimable, this information is communicated back.  Second, if the
classzone or order was reduced due to failing to reclaim, new information
is not read from pgdat and instead an attempt is made to go to sleep.  Due
to this, it is also necessary that pgdat->classzone_idx be initialised
each time to pgdat->nr_zones - 1 to avoid re-reads being interpreted as
wakeups.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Andrew Lutomirski <luto@mit.edu>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: vmscan: evaluate the watermarks against the correct classzone

When deciding if kswapd is sleeping prematurely, the classzone is taken
into account but this is different to what balance_pgdat() and the
allocator are doing. Specifically, the DMA zone will be checked based on
the classzone used when waking kswapd which could be for a GFP_KERNEL or
GFP_HIGHMEM request. The lowmem reserve limit kicks in, the watermark is
not met and kswapd thinks it's sleeping prematurely keeping kswapd awake in
error.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Andrew Lutomirski <luto@mit.edu>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: vmscan: do not apply pressure to slab if we are not applying pressure to zone

During allocator-intensive workloads, kswapd will be woken frequently
causing free memory to oscillate between the high and min watermark.  This
is expected behaviour.

When kswapd applies pressure to zones during node balancing, it checks if
the zone is above a high+balance_gap threshold.  If it is, it does not
apply pressure but it unconditionally shrinks slab on a global basis which
is excessive.  In the event kswapd is being kept awake due to a high small
unreclaimable zone, it skips zone shrinking but still calls shrink_slab().

Once pressure has been applied, the check for zone being unreclaimable is
being made before the check is made if all_unreclaimable should be set.
This miss of unreclaimable can cause has_under_min_watermark_zone to be
set due to an unreclaimable zone preventing kswapd backing off on
congestion_wait().

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Andrew Lutomirski <luto@mit.edu>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: vmscan: correct check for kswapd sleeping in sleeping_prematurely

During allocator-intensive workloads, kswapd will be woken frequently
causing free memory to oscillate between the high and min watermark.  This
is expected behaviour.  Unfortunately, if the highest zone is small, a
problem occurs.

This seems to happen most with recent sandybridge laptops but it's
probably a co-incidence as some of these laptops just happen to have a
small Normal zone.  The reproduction case is almost always during copying
large files that kswapd pegs at 100% CPU until the file is deleted or
cache is dropped.

The problem is mostly down to sleeping_prematurely() keeping kswapd awake
when the highest zone is small and unreclaimable and compounded by the
fact we shrink slabs even when not shrinking zones causing a lot of time
to be spent in shrinkers and a lot of memory to be reclaimed.

Patch 1 corrects sleeping_prematurely to check the zones matching
the classzone_idx instead of all zones.

Patch 2 avoids shrinking slab when we are not shrinking a zone.

Patch 3 notes that sleeping_prematurely is checking lower zones against
a high classzone which is not what allocators or balance_pgdat()
is doing leading to an artifical belief that kswapd should be
still awake.

Patch 4 notes that when balance_pgdat() gives up on a high zone that the
decision is not communicated to sleeping_prematurely()

This problem affects 2.6.38.8 for certain and is expected to affect 2.6.39
and 3.0-rc4 as well.  If accepted, they need to go to -stable to be picked
up by distros and this series is against 3.0-rc4.  I've cc'd people that
reported similar problems recently to see if they still suffer from the
problem and if this fixes it.

This patch: correct the check for kswapd sleeping in sleeping_prematurely()

During allocator-intensive workloads, kswapd will be woken frequently
causing free memory to oscillate between the high and min watermark.  This
is expected behaviour.

A problem occurs if the highest zone is small.  balance_pgdat() only
considers unreclaimable zones when priority is DEF_PRIORITY but
sleeping_prematurely considers all zones.  It's possible for this sequence
to occur

  1. kswapd wakes up and enters balance_pgdat()
  2. At DEF_PRIORITY, marks highest zone unreclaimable
  3. At DEF_PRIORITY-1, ignores highest zone setting end_zone
  4. At DEF_PRIORITY-1, calls shrink_slab freeing memory from
        highest zone, clearing all_unreclaimable. Highest zone
        is still unbalanced
  5. kswapd returns and calls sleeping_prematurely
  6. sleeping_prematurely looks at *all* zones, not just the ones
     being considered by balance_pgdat. The highest small zone
     has all_unreclaimable cleared but the zone is not
     balanced. all_zones_ok is false so kswapd stays awake

This patch corrects the behaviour of sleeping_prematurely to check the
zones balance_pgdat() checked.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Andrew Lutomirski <luto@mit.edu>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hwmon: (lm95241) Fix chip detection code

The LM95241 driver accepts every chip ID equal to or larger than 0xA4 as its
own, and other chips such as LM95245 use chip IDs in the accepted ID range.
This results in false chip detection.

Fix problem by accepting only the known LM95241 chip ID.

Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org # 2.6.30+