git.karo-electronics.de Git - karo-tx-linux.git/log

context_tracking: New context tracking susbsystem

Create a new subsystem that probes on kernel boundaries
to keep track of the transitions between level contexts
with two basic initial contexts: user or kernel.

This is an abstraction of some RCU code that use such tracking
to implement its userspace extended quiescent state.

We need to pull this up from RCU into this new level of indirection
because this tracking is also going to be used to implement an "on
demand" generic virtual cputime accounting. A necessary step to
shutdown the tick while still accounting the cputime.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
[ paulmck: fix whitespace error and email address. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

sched: Mark RCU reader in sched_show_task()

When sched_show_task() is invoked from try_to_freeze_tasks(), there is
no RCU read-side critical section, resulting in the following splat:

[  125.780730] ===============================
[  125.780766] [ INFO: suspicious RCU usage. ]
[  125.780804] 3.7.0-rc3+ #988 Not tainted
[  125.780838] -------------------------------
[  125.780875] /home/rafael/src/linux/kernel/sched/core.c:4497 suspicious rcu_dereference_check() usage!
[  125.780946]
[  125.780946] other info that might help us debug this:
[  125.780946]
[  125.781031]
[  125.781031] rcu_scheduler_active = 1, debug_locks = 0
[  125.781087] 4 locks held by s2ram/4211:
[  125.781120]  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e2acf>] sysfs_write_file+0x3f/0x160
[  125.781233]  #1:  (s_active#94){.+.+.+}, at: [<ffffffff811e2b58>] sysfs_write_file+0xc8/0x160
[  125.781339]  #2:  (pm_mutex){+.+.+.}, at: [<ffffffff81090a81>] pm_suspend+0x81/0x230
[  125.781439]  #3:  (tasklist_lock){.?.?..}, at: [<ffffffff8108feed>] try_to_freeze_tasks+0x2cd/0x3f0
[  125.781543]
[  125.781543] stack backtrace:
[  125.781584] Pid: 4211, comm: s2ram Not tainted 3.7.0-rc3+ #988
[  125.781632] Call Trace:
[  125.781662]  [<ffffffff810a3c73>] lockdep_rcu_suspicious+0x103/0x140
[  125.781719]  [<ffffffff8107cf21>] sched_show_task+0x121/0x180
[  125.781770]  [<ffffffff8108ffb4>] try_to_freeze_tasks+0x394/0x3f0
[  125.781823]  [<ffffffff810903b5>] freeze_kernel_threads+0x25/0x80
[  125.781876]  [<ffffffff81090b65>] pm_suspend+0x165/0x230
[  125.781924]  [<ffffffff8108fa29>] state_store+0x99/0x100
[  125.781975]  [<ffffffff812f5867>] kobj_attr_store+0x17/0x20
[  125.782038]  [<ffffffff811e2b71>] sysfs_write_file+0xe1/0x160
[  125.782091]  [<ffffffff811667a6>] vfs_write+0xc6/0x180
[  125.782138]  [<ffffffff81166ada>] sys_write+0x5a/0xa0
[  125.782185]  [<ffffffff812ff6ae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  125.782242]  [<ffffffff81669dd2>] system_call_fastpath+0x16/0x1b

This commit therefore adds the needed RCU read-side critical section.

Reported-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Separate accounting of callbacks from callback-free CPUs

Currently, callback invocations from callback-free CPUs are accounted to
the CPU that registered the callback, but using the same field that is
used for normal callbacks.  This makes it impossible to determine from
debugfs output whether callbacks are in fact being diverted.  This commit
therefore adds a separate ->n_nocbs_invoked field in the rcu_data structure
in which diverted callback invocations are counted.  RCU's debugfs tracing
still displays normal callback invocations using ci=, but displayed
diverted callbacks with nci=.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Add callback-free CPUs

RCU callback execution can add significant OS jitter and also can
degrade both scheduling latency and, in asymmetric multiprocessors,
energy efficiency.  This commit therefore adds the ability for selected
CPUs ("rcu_nocbs=" boot parameter) to have their callbacks offloaded
to kthreads.  If the "rcu_nocb_poll" boot parameter is also specified,
these kthreads will do polling, removing the need for the offloaded
CPUs to do wakeups.  At least one CPU must be doing normal callback
processing: currently CPU 0 cannot be selected as a no-CBs CPU.
In addition, attempts to offline the last normal-CBs CPU will fail.

This feature was inspired by Jim Houston's and Joe Korty's JRCU, and
this commit includes fixes to problems located by Fengguang Wu's
kbuild test robot.

[ paulmck: Added gfp.h include file as suggested by Fengguang Wu. ]

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Merge branches 'urgent.2012.10.27a', 'doc.2012.11.16a', 'fixes.2012.11.13a', 'srcu.2012.10.27a', 'stall.2012.11.13a', 'tracing.2012.11.08a' and 'idle.2012.10.24a' into HEAD

urgent.2012.10.27a: Fix for RCU user-mode transition (already in -tip).

doc.2012.11.08a: Documentation updates, most notably codifying the
memory-barrier guarantees inherent to grace periods.

fixes.2012.11.13a: Miscellaneous fixes.

srcu.2012.10.27a: Allow statically allocated and initialized srcu_struct
structures (courtesy of Lai Jiangshan).

stall.2012.11.13a: Add more diagnostic information to RCU CPU stall
warnings, also decrease from 60 seconds to 21 seconds.

hotplug.2012.11.08a: Minor updates to CPU hotplug handling.

tracing.2012.11.08a: Improved debugfs tracing, courtesy of Michael Wang.

idle.2012.10.24a: Updates to RCU idle/adaptive-idle handling, including
a boot parameter that maps normal grace periods to expedited.

Resolved conflict in kernel/rcutree.c due to side-by-side change.

rcu: Add documentation for the new rcuexp debugfs trace file

This commit adds the documentation of the rcuexp debugfs trace file
that records statistics for expedited grace periods.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Update documentation for TREE_RCU debugfs tracing

This commit updates the tracing documentation to reflect the new
format that has per-RCU-flavor directories.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Reduce default RCU CPU stall warning timeout

The RCU CPU stall warning timeout has defaulted to 60 seconds for
some years, with almost no false positives. This commit therefore
reduces the default to 21 seconds, slightly shorter than the new
soft-lockup timeout.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Fix TINY_RCU rcu_is_cpu_rrupt_from_idle check

The rcu_is_cpu_rrupt_from_idle() needs to allow for one interrupt level
from the idle loop, but TINY_RCU checks for a call from the idle loop
itself. This commit fixes this issue.

Reported-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Clarify memory-ordering properties of grace-period primitives

This commit explicitly states the memory-ordering properties of the
RCU grace-period primitives. Although these properties were in some
sense implied by the fundmental property of RCU ("a grace period must
wait for all pre-existing RCU read-side critical sections to complete"),
stating it explicitly will be a great labor-saving device.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>

rcu: Add new rcutorture module parameters to start/end test messages

Several new rcutorture module parameters have been added, but are not
printed to the console at the beginning and end of tests, which makes
it difficult to reproduce a prior test. This commit therefore adds
these new module parameters to the list printed at the beginning and
the end of the tests.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Remove list_for_each_continue_rcu()

The list_for_each_continue_rcu() macro is no longer used, so this commit
removes it. The list_for_each_entry_continue_rcu() macro should be
used instead.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Fix batch-limit size problem

Commit 29c00b4a1d9e27 (rcu: Add event-tracing for RCU callback
invocation) added a regression in rcu_do_batch()

Under stress, RCU is supposed to allow to process all items in queue,
instead of a batch of 10 items (blimit), but an integer overflow makes
the effective limit being 1. So, unless there is frequent idle periods
(during which RCU ignores batch limits), RCU can be forced into a
state where it cannot keep up with the callback-generation rate,
eventually resulting in OOM.

This commit therefore converts a few variables in rcu_do_batch() from
int to long to fix this problem, along with the module parameters
controlling the batch limits.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # 3.2 +

rcu: Add tracing for synchronize_sched_expedited()

This commit adds a per-RCU-flavor "rcuexp" file that dumps out
statistics for synchonize_sched_expedited().

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Remove old debugfs interfaces and also RCU flavor name

This commit removes the old debugfs interfaces, so that the new
directory-per-RCU-flavor versions remain. Because the RCU flavor is
given by the directory name, there is no need to print it out, so remove
the name from the printout.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: split 'rcuhier' to each flavor

This patch add new 'rcuhier' to each flavor's folder, now we could use:
'cat /debugfs/rcu/rsp/rcuhier'
to get the selected rsp info.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: split 'rcugp' to each flavor

This patch add new 'rcugp' to each flavor's folder, now we could use:
'cat /debugfs/rcu/rsp/rcugp'
to get the selected rsp info.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: split 'rcuboost' to each flavor

This patch add new 'rcuboost' to each flavor's folder, now we could use:
'cat /debugfs/rcu/rsp/rcuboost'
to get the selected rsp info.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: split 'rcubarrier' to each flavor

This patch add new 'rcubarrier' to each flavor's folder, now we could use:
'cat /debugfs/rcu/rsp/rcubarrier'
to get the selected rsp info.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Fix tracing formatting

The rcu_state structure's ->completed field is unsigned long, so this
commit adjusts show_one_rcugp()'s printf() format to suit. Also add
the required ACCESS_ONCE() directives while we are in this function.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Remove the interface "rcudata.csv"

This patch removes the interface "rcudata.csv" since it is apparently
not used.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Replace the old interface with the new one

This patch removed the old RCU debugfs interface and replaced it with
the new one.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Optimize the 'rcu_pending' for RCU trace

This patch implements the new 'rcu_pending' interface under each rsp
directory, by using the 'CPU units sequence reading', thus avoiding loss
of tracing data.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Optimize the 'rcudata.csv' for RCU trace

This patch implements the new 'rcudata.csv' interface under each rsp
directory, by using the 'CPU units sequence reading', thus avoiding loss
of tracing data.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Optimize the 'rcudata' for RCU trace

This patch implements the new 'rcudata' interface under each rsp
directory, by using the 'CPU units sequence reading', thus avoiding loss
of tracing data.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Fundamental facility for 'CPU units sequence reading'

This patch add the fundamental facility used by the following patches, so we
can implement the 'CPU units sequence reading' later.

This helps us avoid losing data when there are too many CPUs and too
small of a buffer, since this new approach allows userspace to read out
the data one CPU at a time. Thus, if the buffer is not large enough,
userspace will get whatever CPUs fit, and can then issue another read
for the remainder of the data.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Create directory for each flavor of rcu

This patch will create subdirectory according to each flavor of rcu, the new
structure will be:

/debugfs/rcu/ -> rsp_0
-> rsp_1
-> ...

So we can go to '/debugfs/rcu/rsp_0' and get the cpu info of rsp_0 there.
The flavors of RCU are currently rcu_bh, rcu_preempt, and rcu_sched.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Instrument synchronize_rcu_expedited() for debugfs tracing

This commit adds the counters to rcu_state and updates them in
synchronize_rcu_expedited() to provide the data needed for debugfs
tracing.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Move synchronize_sched_expedited() state to rcu_state

Tracing (debugfs) of expedited RCU primitives is required, which in turn
requires that the relevant data be located where the tracing code can find
it, not in its current static global variables in kernel/rcutree.c.
This commit therefore moves sync_sched_expedited_started and
sync_sched_expedited_done to the rcu_state structure, as fields
->expedited_start and ->expedited_done, respectively.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Avoid counter wrap in synchronize_sched_expedited()

There is a counter scheme similar to ticket locking that
synchronize_sched_expedited() uses to service multiple concurrent
callers with the same expedited grace period.  Upon entry, a
sync_sched_expedited_started variable is atomically incremented,
and upon completion of a expedited grace period a separate
sync_sched_expedited_done variable is atomically incremented.

However, if a synchronize_sched_expedited() is delayed while
in try_stop_cpus(), concurrent invocations will increment the
sync_sched_expedited_started counter, which will eventually overflow.
If the original synchronize_sched_expedited() resumes execution just
as the counter overflows, a concurrent invocation could incorrectly
conclude that an expedited grace period elapsed in zero time, which
would be bad.  One could rely on counter size to prevent this from
happening in practice, but the goal is to formally validate this
code, so it needs to be fixed anyway.

This commit therefore checks the gap between the two counters before
incrementing sync_sched_expedited_started, and if the gap is too
large, does a normal grace period instead.  Overflow is thus only
possible if there are more than about 3.5 billion threads on 32-bit
systems, which can be excluded until such time as task_struct fits
into a single byte and 4G/4G patches are accepted into mainline.
It is also easy to encode this limitation into mechanical theorem
provers.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Rename ->onofflock to ->orphan_lock

The ->onofflock field in the rcu_state structure at one time synchronized
CPU-hotplug operations for RCU. However, its scope has decreased over time
so that it now only protects the lists of orphaned RCU callbacks. This
commit therefore renames it to ->orphan_lock to reflect its current use.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Document alternative RCU/reference-count algorithms

The approach for mixing RCU and reference counting listed in the RCU
documentation only describes one possible approach.  This approach can
result in failure on the read side, which is nice if you want fresh data,
but not so good if you want simple code.  This commit therefore adds
two additional approaches that feature unconditional reference-count
acquisition by RCU readers.  These approaches are very similar to that
used in the security code.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Update docs to include kfree_rcu()

Mention kfree_rcu() in the call_rcu() section. Additionally fix the
example code for list replacement that used the wrong structure element.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Fix unrecovered RCU user mode in syscall_trace_leave()

On x86-64 syscall exit, 3 non exclusive events may happen
looping in the following order:

1) Check if we need resched for user preemption, if so call
schedule_user()

2) Check if we have pending signals, if so call do_notify_resume()

3) Check if we do syscall tracing, if so call syscall_trace_leave()

However syscall_trace_leave() has been written assuming it directly
follows the syscall and forget about the above possible 1st and 2nd
steps.

Now schedule_user() and do_notify_resume() exit in RCU user mode
because they have most chances to resume userspace immediately and
this avoids an rcu_user_enter() call in the syscall fast path.

So by the time we call syscall_trace_leave(), we may well be in RCU
user mode. To fix this up, simply call rcu_user_exit() in the beginning
of this function.

This fixes some reported RCU uses in extended quiescent state.

Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcutorture: Use DEFINE_STATIC_SRCU()

Use DEFINE_STATIC_SRCU() to simplify the rcutorture.c SRCU test code.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

srcu: Add DEFINE_SRCU()

In old days, we had two different API sets for dynamic-allocated per-CPU
data and DEFINE_PER_CPU()-defined per_cpu data, and because SRCU used
dynamic-allocated per-CPU data, its srcu_struct structures cannot be
declared statically. This commit therefore introduces DEFINE_SRCU()
and DEFINE_STATIC_SRCU() to allow statically declared SRCU structures,
using the new static per-CPU interfaces.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Updated for __DELAYED_WORK_INITIALIZER() added argument,
fixed whitespace issue. ]

rcu: Wordsmith help text for RCU_USER_QS kernel parameter

This commit adds a "try" missing from the end of the first paragraph
of the RCU_USER_QS help text.

[ paulmck: Also fix up the last paragraph a bit. ]

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Dump number of callbacks in stall warning messages

In theory, if a grace period manages to get started despite there being
no callbacks on any of the CPUs, all CPUs could go into dyntick-idle
mode, so that the grace period would never end. This commit updates
the RCU CPU stall warning messages to detect this condition by summing
up the number of callbacks on all CPUs.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Add grace-period information to RCU CPU stall warnings

This commit causes the last grace period started and completed to be
printed on RCU CPU stall warning messages in order to aid diagnosis.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Print remote CPU's stacks in stall warnings

The RCU CPU stall warnings rely on trigger_all_cpu_backtrace() to
do NMI-based dump of the stack traces of all CPUs. Unfortunately, a
number of architectures do not implement trigger_all_cpu_backtrace(), in
which case RCU falls back to just dumping the stack of the running CPU.
This is unhelpful in the case where the running CPU has detected that
some other CPU has stalled.

This commit therefore makes the running CPU dump the stacks of the
tasks running on the stalled CPUs.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

srcu: Export process_srcu()

Because process_srcu() will be used in DEFINE_SRCU(), which is a macro
that could be expanded pretty much anywhere, it can no longer be static.
Note that process_srcu() is still internal to srcu.h.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

srcu: Credit Lai Jiangshan with SRCU rewrite

Lai Jiangshan rewrote SRCU, so this commit ensures that he gets his
proper share of blame^Wcredit.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Fix precedence error in cpu_needs_another_gp()

The fix introduced by a10d206e (rcu: Fix day-one dyntick-idle
stall-warning bug) has a C-language precedence error. It turns out
that this error is harmless in that the same result is computed for all
inputs, but the code is nevertheless a potential source of confusion.
This commit therefore introduces parentheses in order to force the
execution of the code to reflect the intent.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Add a module parameter to force use of expedited RCU primitives

There have been some embedded applications that would benefit from
use of expedited grace-period primitives. In some ways, this is
similar to synchronize_net() doing either a normal or an expedited
grace period depending on lock state, but with control outside of
the kernel.

This commit therefore adds rcu_expedited boot and sysfs parameters
that cause the kernel to substitute expedited primitives for the
normal grace-period primitives.

[ paulmck: Add trace/event/rcu.h to kernel/srcu.c to avoid build error.
Get rid of infinite loop through contention path.]

Signed-off-by: Antti P Miettinen <amiettinen@nvidia.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Update RCU_FAST_NO_HZ help text

The RCU_FAST_NO_HZ help text included a warning about overhead on large
systems, but that issue has since been resolved. The main remaining
issue with RCU_FAST_NO_HZ is increased real-time latency. This commit
therefore updates the help text accordingly.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Remove rcu_switch()

It's only there to call rcu_user_hooks_switch(). Let's
just call rcu_user_hooks_switch() directly, we don't need this
function in the middle.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Make rcutorture give diagnostics if CPU offline fails

This commit causes rcutorture to print the errno if cpu_down() fails
when the rcutorture "verbose" module parameter is specified.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Fix comment about _rcu_barrier()/orphanage exclusion

In the old days, _rcu_barrier() acquired ->onofflock to exclude
rcu_send_cbs_to_orphanage(), which allowed the latter to avoid memory
barriers in callback handling. However, _rcu_barrier() recently started
doing get_online_cpus() to lock out CPU-hotplug operations entirely, which
means that the comment in rcu_send_cbs_to_orphanage() that talks about
->onofflock is now obsolete. This commit therefore fixes the comment.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Correct the name of a reference in list of RCU papers

Trying to go through the history of RCU (not for the weak
minded) led me to search for a non-existent paper.

Correct it to the actual reference

Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Documentation: Fix memory-barriers.txt example

This commit fixes a broken example of overlapping stores in the
Documentation/memory-barriers.txt file.

Reported-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

rcu: Accelerate callbacks for CPU initiating a grace period

Because grace-period initialization is carried out by a separate
kthread, it might happen on a different CPU than the one that
had the callback needing a grace period -- which is where the
callback acceleration needs to happen.

Fortunately, rcu_start_gp() holds the root rcu_node structure's
->lock, which prevents a new grace period from starting. This
allows this function to safely determine that a grace period has
not yet started, which in turn allows it to fully accelerate any
callbacks that it has pending. This commit adds this acceleration.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Linux 3.7-rc2

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64

Pull arm64 fixes from Catalin Marinas:
"Main changes:
   - AArch64 Linux compilation fixes following 3.7-rc1 changes
     (MODULES_USE_ELF_RELA, update_vsyscall() prototype)
   - Unnecessary register setting in start_thread() (thanks to Al Viro)
   - ptrace fixes"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
  arm64: fix alignment padding in assembly code
  arm64: ptrace: use HW_BREAKPOINT_EMPTY type for disabled breakpoints
  arm64: ptrace: make structure padding explicit for debug registers
  arm64: No need to set the x0-x2 registers in start_thread()
  arm64: Ignore memory blocks below PHYS_OFFSET
  arm64: Fix the update_vsyscall() prototype
  arm64: Select MODULES_USE_ELF_RELA
  arm64: Remove duplicate inclusion of mmu_context.h in smp.c

arm64: fix alignment padding in assembly code

An interesting effect of using the generic version of linkage.h
is that the padding is defined in terms of x86 NOPs, which can have
even more interesting effects when the assembly code looks like this:

ENTRY(func1)
mov x0, xzr
ENDPROC(func1)
// fall through
ENTRY(func2)
mov x0, #1
ret
ENDPROC(func2)

Admittedly, the code is not very nice. But having code from another
architecture doesn't look completely sane either.

The fix is to add arm64's version of linkage.h, which causes the insertion
of proper AArch64 NOPs.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

use clamp_t in UNAME26 fix

The min/max call needed to have explicit types on some architectures
(e.g. mn10300). Use clamp_t instead to avoid the warning:

kernel/sys.c: In function 'override_release':
kernel/sys.c:1287:10: warning: comparison of distinct pointer types lacks a cast [enabled by default]

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Assorted small fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf python: Properly link with libtraceevent
  perf hists browser: Add back callchain folding symbol
  perf tools: Fix build on sparc.
  perf python: Link with libtraceevent
  perf python: Initialize 'page_size' variable
  tools lib traceevent: Fix missed freeing of subargs in free_arg() in filter
  lib tools traceevent: Add back pevent assignment in __pevent_parse_format()
  perf hists browser: Fix off-by-two bug on the first column
  perf tools: Remove warnings on JIT samples for srcline sort key
  perf tools: Fix segfault when using srcline sort key
  perf: Require exclude_guest to use PEBS - kernel side enforcement
  perf tool: Precise mode requires exclude_guest

perf python: Properly link with libtraceevent

Namhyung Kim reported that the build fails with:

  GEN python/perf.so
  gcc: error: python_ext_build/tmp//../../libtraceevent.a: No such file or directory
  error: command 'gcc' failed with exit status 1
  cp: cannot stat `python_ext_build/lib/perf.so': No such file or directory
  make: *** [python/perf.so] Error 1

We need to propagate the TE_PATH variable to the setup.py file.

Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/tip-8umiPbm4sxpknKivbjgykhut@git.kernel.org
[ Fixed superfluous variable build error. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

* The python binding needs to link with libtraceevent and to initialize
  the 'page_size' variable so that mmaping works again.

* The callchain folding character that appears on the TUI just before
  the overhead had disappeared due to recent changes, add it back.

* Intel PEBS in VT-x context uses the DS address as a guest linear address,
  even though its programmed by the host as a host linear address. This either
  results in guest memory corruption and or the hardware faulting and 'crashing'
  the virtual machine.  Therefore we have to disable PEBS on VT-x enter and
  re-enable on VT-x exit, enforcing a strict exclude_guest.

  Kernel side enforcement fix by Peter Zijlstra, tooling side fix by David Ahern.

* Fix build on sparc due to UAPI, fix from David Miller.

* Fixes for the srclike sort key for unresolved symbols and when processing
  samples in JITted code, where we don't have an ELF file, just an special
  symbol table, fixes from Namhyung Kim.

* Fix some leaks in libtraceevent, from Steven Rostedt.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM soc fixes from Olof Johansson:
"A set of fixes and some minor cleanups for -rc2:

   - A series from Arnd that fixes warnings in drivers and other code
     included by ARM defconfigs.  Most have been acked by corresponding
     maintainers (and seem quite hard to argue not picking up anyway in
     the few exception cases).
   - A few misc patches from the list for integrator/vt8500/i.MX
   - A batch of fixes to OMAP platforms, fixing:
     - boot problems on beaglebone,
     - regression fixes for local timers
     - clockdomain locking fixes
     - a few boot/sparse warnings
   - For Tegra:
     - Clock rate calculation overflow fix
     - Revert a change that removed timer clocks and a fix for symbol
       name clashes
   - For Renesas:
     - IO accessor / annotation cleanups to remove warnings
   - For Kirkwood/Dove/mvebu:
     - Fixes for device trees for Dove (some minor cleanups, some fixes)
     - Fixes for the mvebu gpio driver
     - Fix build problem for Feroceon due to missing ifdefs
     - Fix lsxl DTS files"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits)
  ARM: kirkwood: fix buttons on lsxl boards
  ARM: kirkwood: fix LEDs names for lsxl boards
  ARM: Kirkwood: fix disabling CACHE_FEROCEON_L2
  gpio: mvebu: Add missing breaks in mvebu_gpio_irq_set_type
  ARM: dove: Add crypto engine to DT
  ARM: dove: Remove watchdog from DT
  ARM: dove: Restructure SoC device tree descriptor
  ARM: dove: Fix clock names of sata and gbe
  ARM: dove: Fix tauros2 device tree init
  ARM: dove: Add pcie clock support
  ARM: OMAP2+: Allow kernel to boot even if GPMC fails to reserve memory
  ARM: OMAP: clockdomain: Fix locking on _clkdm_clk_hwmod_enable / disable
  ARM: s3c: mark s3c2440_clk_add as __init_refok
  spi/s3c64xx: use correct dma_transfer_direction type
  ARM: OMAP4: devices: fixup OMAP4 DMIC platform device error message
  ARM: OMAP2+: clock data: Add dev-id for the omap-gpmc dummy fck
  ARM: OMAP: resolve sparse warning concerning debug_card_init()
  ARM: OMAP4: Fix twd_local_timer_register regression
  ARM: tegra: add tegra_timer clock
  ARM: tegra: rename tegra system timer
  ...

MODSIGN: Move the magic string to the end of a module and eliminate the search

Emit the magic string that indicates a module has a signature after the
signature data instead of before it.  This allows module_sig_check() to
be made simpler and faster by the elimination of the search for the
magic string.  Instead we just need to do a single memcmp().

This works because at the end of the signature data there is the
fixed-length signature information block.  This block then falls
immediately prior to the magic number.

From the contents of the information block, it is trivial to calculate
the size of the signature data and thus the size of the actual module
data.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'kirkwood_fixes_for_v3.7' of git://git.infradead.org/users/jcooper/linux into fixes

From Jason Cooper:
- improve #ifdef logic to prevent linker errors with CACHE_FEROCEON_L2
- lsxl board dts fixes

* tag 'kirkwood_fixes_for_v3.7' of git://git.infradead.org/users/jcooper/linux:
  ARM: kirkwood: fix buttons on lsxl boards
  ARM: kirkwood: fix LEDs names for lsxl boards
  ARM: Kirkwood: fix disabling CACHE_FEROCEON_L2

MODSIGN: Cleanup .gitignore

The module build process no longer creates intermediate files for module
signing, so remove them from .gitignore.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MODSIGN: perlify sign-file and merge in x509keyid

Turn sign-file into perl and merge in x509keyid. The latter doesn't
need to be a separate script as it doesn't actually need to work out the
SHA1 sum of the X.509 certificate itself, since it can get that from the
X.509 certificate.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'testing/driver-warnings' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc into fixes

A collection of warning fixes on non-ARM code from Arnd Bergmann:

* 'testing/driver-warnings' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: s3c: mark s3c2440_clk_add as __init_refok
  spi/s3c64xx: use correct dma_transfer_direction type
  pcmcia: sharpsl: don't discard sharpsl_pcmcia_ops
  USB: EHCI: mark ehci_orion_conf_mbus_windows __devinit
  mm/slob: use min_t() to compare ARCH_SLAB_MINALIGN
  SCSI: ARM: make fas216_dumpinfo function conditional
  SCSI: ARM: ncr5380/oak uses no interrupts

hold task->mempolicy while numa_maps scans.

  /proc/<pid>/numa_maps scans vma and show mempolicy under
  mmap_sem. It sometimes accesses task->mempolicy which can
  be freed without mmap_sem and numa_maps can show some
  garbage while scanning.

This patch tries to take reference count of task->mempolicy at reading
numa_maps before calling get_vma_policy(). By this, task->mempolicy
will not be freed until numa_maps reaches its end.

V2->v3
  -  updated comments to be more verbose.
  -  removed task_lock() in numa_maps code.
V1->V2
  -  access task->mempolicy only once and remember it.  Becase kernel/exit.c
     can overwrite it.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull miscellaneous x86 fixes from Peter Anvin:
"The biggest ones are fixing suspend/resume breakage on 32 bits, and an
  interrim fix for mapping over holes that allows AMD kit with more than
  1 TB.

  A final solution for the latter is in the works, but involves some
  fairly invasive changes that will probably mean it will only be
  appropriate for 3.8."

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, MCE: Remove bios_cmci_threshold sysfs attribute
  x86, amd, mce: Avoid NULL pointer reference on CPU northbridge lookup
  x86: Exclude E820_RESERVED regions and memory holes above 4 GB from direct mapping.
  x86/cache_info: Use ARRAY_SIZE() in amd_l3_attrs()
  x86/reboot: Remove quirk entry for SBC FITPC
  x86, suspend: Correct the restore of CR4, EFER; skip computing EFLAGS.ID

Merge branch 'akpm' (Fixes from Andrew)

Merge misc fixes from Andrew Morton:
"Seven fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (7 patches)
  lib/dma-debug.c: fix __hash_bucket_find()
  mm: compaction: correct the nr_strict va isolated check for CMA
  firmware/memmap: avoid type conflicts with the generic memmap_init()
  pidns: remove recursion from free_pid_ns()
  drivers/video/backlight/lm3639_bl.c: return proper error in lm3639_bled_mode_store() error paths
  kernel/sys.c: fix stack memory content leak via UNAME26
  linux/coredump.h needs asm/siginfo.h

lib/dma-debug.c: fix __hash_bucket_find()

If there is only one match, the unique matched entry should be returned.

Without the fix, the upcoming dma debug interfaces ("dma-debug: new
interfaces to debug dma mapping errors") can't work reliably because
only device and dma_addr are passed to dma_mapping_error().

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Shuah Khan <shuah.khan@hp.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: compaction: correct the nr_strict va isolated check for CMA

Thierry reported that the "iron out" patch for isolate_freepages_block()
had problems due to the strict check being too strict with "mm:
compaction: Iron out isolate_freepages_block() and
isolate_freepages_range() -fix1". It's possible that more pages than
necessary are isolated but the check still fails and I missed that this
fix was not picked up before RC1. This same problem has been identified
in 3.7-RC1 by Tony Prisk and should be addressed by the following patch.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Tony Prisk <linux@prisktech.co.nz>
Reported-by: Thierry Reding <thierry.reding@avionic-design.de>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Richard Davies <richard@arachsys.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

firmware/memmap: avoid type conflicts with the generic memmap_init()

Fix this build error:

drivers/firmware/memmap.c:240:19: error: conflicting types for 'memmap_init'
arch/ia64/include/asm/pgtable.h:565:17: note: previous declaration of 'memmap_init' was here

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Bernhard Walle <bwalle@suse.de>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pidns: remove recursion from free_pid_ns()

free_pid_ns() operates in a recursive fashion:

free_pid_ns(parent)
  put_pid_ns(parent)
    kref_put(&ns->kref, free_pid_ns);
      free_pid_ns

thus if there was a huge nesting of namespaces the userspace may trigger
avalanche calling of free_pid_ns leading to kernel stack exhausting and a
panic eventually.

This patch turns the recursion into an iterative loop.

Based on a patch by Andrew Vagin.

[akpm@linux-foundation.org: export put_pid_ns() to modules]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/video/backlight/lm3639_bl.c: return proper error in lm3639_bled_mode_store() error paths

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/sys.c: fix stack memory content leak via UNAME26

Calling uname() with the UNAME26 personality set allows a leak of kernel
stack contents. This fixes it by defensively calculating the length of
copy_to_user() call, making the len argument unsigned, and initializing
the stack buffer to zero (now technically unneeded, but hey, overkill).

CVE-2012-0957

Reported-by: PaX Team <pageexec@freemail.hu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

linux/coredump.h needs asm/siginfo.h

Commit 5ab1c309b344 ("coredump: pass siginfo_t* to do_coredump() and
below, not merely signr") added siginfo_t to linux/coredump.h but forgot
to include asm/siginfo.h.  This breaks the build for UML/i386.  (And any
other arch where asm/siginfo.h is not magically preincluded...)

  In file included from arch/x86/um/elfcore.c:2:0: include/linux/coredump.h:15:25: error: unknown type name 'siginfo_t'
  make[1]: *** [arch/x86/um/elfcore.o] Error 1

Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: "Jonathan M. Foote" <jmfoote@cert.org>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

remap_file_pages: correctly handle the case of a NULL vm_ops pointer

In commit 0b173bc4daa8 ("mm: kill vma flag VM_CAN_NONLINEAR") we
replaced the VM_CAN_NONLINEAR test with checking whether the mapping has
a '->remap_pages()' vm operation, but there is no guarantee that there
it even has a vm_ops pointer at all.

Add the appropriate test for NULL vm_ops.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'xtensa-next-20121018' of git://github.com/czankel/xtensa-linux

Pull Xtensa patchset from Chris Zankel:
"These are all limited to the xtensa subtree and include some important
  changes (adding long missing system calls for newer libc versions and
  other fixes) and the UAPI changes"

* tag 'xtensa-next-20121018' of git://github.com/czankel/xtensa-linux:
  xtensa: add missing system calls to the syscall table
  xtensa: minor compiler warning fix
  xtensa: Use Kbuild infrastructure to handle asm-generic headers
  UAPI: (Scripted) Disintegrate arch/xtensa/include/asm
  xtensa: fix unaligned usermode access
  xtensa: reorganize SR referencing
  xtensa: fix boot parameters parsing
  xtensa: fix missing return in do_page_fault for SIGBUS case
  xtensa: copy_thread with CLONE_VM must not copy live parent AR windows
  xtensa: fix memmove(), bcopy(), and memcpy().
  xtensa: ISS: fix rs_put_char
  xtensa: ISS: fix specific simcalls

kbuild: Fix module signature generation

Rusty had clearly not actually tested his module signing changes that I
(trustingly) applied as commit e2a666d52b48 ("kbuild: sign the modules
at install time"). That commit had multiple bugs:

- using "${#VARIABLE}" to get the number of characters in a shell
   variable may look clever, but it's locale-dependent: it returns the
   number of *characters*, not bytes. And we do need bytes.

   So don't use "${#..}" expansion, do the stupid "wc -c" thing instead
   (where "c" stands for "bytes", not "characters", despite the letter.

- Rusty had confused "siglen" and "signerlen", and his conversion
   didn't set "signerlen" at all, and incorrectly set "siglen" to the
   size of the signer, not the size of the signature.

End result: the modified sign-file script did create something that
superficially *looked* like a signature, but didn't actually work at
all, and would fail the signature check. Oops.

Tssk, tssk, Rusty.

But Rusty was definitely right that this whole thing should be rewritten
in perl by somebody who has the perl-fu to do so.  That is not me,
though - I'm just doing an emergency fix for the shell script.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

xen: Fix annoying compile-time warning

Commit cb6b6df111e4 ("xen/pv-on-hvm kexec: add quirk for Xen 3.4 and
shutdown watches.") added the xen_strict_xenbus_quirk() function with an
old K&R-style declaration without proper typing, causing gcc to rightly
complain:

drivers/xen/xenbus/xenbus_xs.c:628:13: warning: function declaration isn’t a prototype [-Wstrict-prototypes]

because we really don't live in caves using stone-age tools any more,
and the kernel has always used properly typed ANSI C function
declarations.

So if a function doesn't take arguments, we tell the compiler so
explicitly by adding the proper "void" in the prototype.

I'm sure there are tons of other examples of this kind of stuff in the
tree, but this is the one that hits my workstation config, so..

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:
"Drop some leftover dependencies on CONFIG_EXPERIMENTAL, and add
  support for Intel Atom CE4110/4150/4170."

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (coretemp) Add support for Atom CE4110/4150/4170
  Documentation/hwmon: remove CONFIG_EXPERIMENTAL
  hwmon: (pmbus) remove CONFIG_EXPERIMENTAL

Merge tag 'tty-3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull TTY fixes from Greg Kroah-Hartman:
"Here are some tty and serial driver fixes for your 3.7-rc1 tree.

  Again, the UABI header file fixes, and a number of build and runtime
  serial driver bugfixes that solve problems people have been reporting
  (the staging driver is a tty driver, hence the fixes coming in through
  this tree.)

  All of these have been in the linux-next tree for a while.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'tty-3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  staging: dgrp: check return value of alloc_tty_driver
  staging: dgrp: check for NULL pointer in (un)register_proc_table
  serial/8250_hp300: Missing 8250 register interface conversion bits
  UAPI: (Scripted) Disintegrate include/linux/hsi
  tty: serial: sccnxp: Fix bug with unterminated platform_id list
  staging: serial: dgrp: Add missing #include <linux/uaccess.h>
  serial: sccnxp: Allows the driver to be compiled as a module
  tty: Fix bogus "callbacks suppressed" messages
  net, TTY: initialize tty->driver_data before usage

Merge tag 'usb-3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg Kroah-Hartman:
"Here are the USB patches against your 3.7-rc1 tree.

  There are the usual UABI header file movements, and we finally are now
  able to remove the dbg() macro that is over 15 years old (that had to
  wait for after some other trees got merged into yours during the big
  3.7-rc1 merge window.)

  Other than that, nothing major, just a number of bugfixes and new
  device ids.  It turns out that almost all of the usb-serial drivers
  had bugs in how they were handling their internal data, leaking
  memory, hence all of those fixups.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'usb-3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (42 commits)
  USB: option: add more ZTE devices
  USB: option: blacklist net interface on ZTE devices
  usb: host: xhci: New system added for Compliance Mode Patch on SN65LVPE502CP
  USB: io_ti: fix sysfs-attribute creation
  USB: iuu_phoenix: fix sysfs-attribute creation
  USB: spcp8x5: fix port-data memory leak
  USB: ssu100: fix port-data memory leak
  USB: ti_usb_3410_5052: fix port-data memory leak
  USB: oti6858: fix port-data memory leak
  USB: iuu_phoenix: fix port-data memory leak
  USB: kl5kusb105: fix port-data memory leak
  USB: io_ti: fix port-data memory leak
  USB: keyspan_pda: fix port-data memory leak
  USB: f81232: fix port-data memory leak
  USB: io_edgeport: fix port-data memory leak
  USB: kobil_sct: fix port-data memory leak
  USB: cypress_m8: fix port-data memory leak
  usb: acm: fix the computation of the number of data bits
  usb: Missing dma_mask in ehci-vt8500.c when probed from device-tree
  usb: Missing dma_mask in uhci-platform.c when probed from device-tree
  ...

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel

Pull hexagon updates from Richard Kuo:
"It includes the Hexagon UAPI changes from David Howells and some CR
  marking changes for the transition from Code Aurora to Linux
  Foundation."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel:
  Hexagon: Copyright marking changes
  UAPI: (Scripted) Disintegrate arch/hexagon/include/asm

Merge tag 'parisc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6

Pull PARISC changes from James Bottomley:
"This is a couple of high code motion patches (all within arch/parisc)
  I'd like to apply at -rc1 to avoid conflicts with anything else.  One
  moves us on to the generated instead of included asm file model and
  the other is a pull request from David Howells for UAPI
  disintegration.

Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'parisc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
  UAPI: (Scripted) Disintegrate arch/parisc/include/asm
  [PARISC] asm: redo generic includes

MAINTAINERS: Add Rafael's address to ACPI maintainers

Since I will be maintaining ACPI together with Len from now on, add my
address to the ACPI maintainers list in the MAINTAINERS file (this is
the address to send patches to).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-3.7' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfixes from J Bruce Fields.

* 'for-3.7' of git://linux-nfs.org/~bfields/linux:
SUNRPC: Prevent kernel stack corruption on long values of flush
NLM: nlm_lookup_file() may return NLMv4-specific error codes

USB: ehci-fsl: Return valid error in ehci_fsl_setup_phy

ehci_fsl_setup_phy is supposed to return an int, but had a void return
value in the case of controller_ver being invalid.

Introduced by commit 3735ba8db8e6 ("powerpc/usb: fix bug of CPU hang
when missing USB PHY clock"), which missed one return.

Signed-off-by: Ben Collins <ben.c@servergy.com>
Cc: Shengzhou Liu <Shengzhou.Liu@freescale.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

xtensa: add missing system calls to the syscall table

Add the following system calls to the syscall table:

fallocate
sendmmsg
umount2
syncfs
epoll_create1
inotify_init1
signalfd4
dup3
pipe2
timerfd_create
timerfd_settime
timerfd_gettime
eventfd2
preadv
pwritev
fanotify_init
fanotify_mark
process_vm_readv
process_vm_writev
name_to_handle_at
open_by_handle_at
sync_file_range
perf_event_open
rt_tgsigqueueinfo
clock_adjtime
prlimit64
kcmp

Note that we have to use the 'sys_sync_file_range2' version, so that
the 64-bit arguments are aligned correctly to the argument registers.

Signed-off-by: Chris Zankel <chris@zankel.net>

Merge tag 'disintegrate-parisc-20121016' into for-linus

UAPI Disintegration 2012-10-16

Signed-off-by: James Bottomley <JBottomley@Parallels.com>

xtensa: minor compiler warning fix

Fix two compiler warnings complaining about truncating a value on
a 64-bit host, and about declaring an unused variable that is only
used for a specific configuration.

Signed-off-by: Chris Zankel <chris@zankel.net>

kbuild: sign the modules at install time

Linus deleted the old code and put signing on the install command,
I fixed it to extract the keyid and signer-name within sign-file
and cleaned up that script now it always signs in-place.

Some enthusiast should convert sign-key to perl and pull
x509keyid into it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge commit '5bc66170dc486556a1e36fd384463536573f4b82' into x86/urgent

From Borislav Petkov <bp@amd64.org>:

Below is a RAS fix which reverts the addition of a sysfs attribute
which we agreed is not needed, post-factum. And this should go in now
because that sysfs attribute is going to end up in 3.7 otherwise and
thus exposed to userspace; removing it then would be a lot harder.

This is done as a merge rather than a simple patch/cherry-pick since
the baseline for this patch was not in the previous x86/urgent.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

x86, MCE: Remove bios_cmci_threshold sysfs attribute

450cc201038f3 ("x86/mce: Provide boot argument to honour bios-set CMCI
threshold") added the bios_cmci_threshold sysfs attribute which was
supposed to communicate to userspace tools that BIOS CMCI threshold has
been honoured.

However, this info is not of any importance to userspace - it should
rather get the actual error count it has been thresholded already from
MCi_STATUS[38:52].

So drop this before it becomes a used interface (good thing we caught
this early in 3.7-rc1, right after the merge window closed).

Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/20121017105940.GA14590@x1.osrc.amd.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
"Media fixes for:
   - one Kconfig fix patch;
   - one patch fixing DocBook breakage due to the drivers/media UAPI
     changes;
   - the remaining UAPI media changes (DVB API).

  I'm aware that is is a little late for the UAPI renames for the DVB
  API, but IMHO, it is better to merge it for 3.7, due to two reasons:

   1) There is a major rename at 3.7 (not only uapi changes, but also
      the entire media drivers were reorganized on 3.7, in order to
      simplify the Kconfig logic, and easy drivers selection, especially
      for hybrid devices).  By confining all those renames there at 3.7
      it will cause all the harm at for media developers on just one
      shot.  Stable backports upstream and at distros will likely
      welcome it as well, as they won't need to check what changed on
      3.7 and what was postponed for on 3.8.

   2) The V4L2 DocBook Makefile creates a cross-reference between the
      media API headers and the specs.  This helps us _a_lot_ to be sure
      that all API improvements are properly documented.  Every time a
      header changes from one place to another, DocBook/media/Makefile
      needs to be patched.  Currently, the DocBook breakage patch
      depends on the DVB UAPI."

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] Kconfig: Fix dependencies for driver autoselect options
  DocBook/media/Makefile: Fix build due to uapi breakage
  UAPI: (Scripted) Disintegrate include/linux/dvb

Hexagon: Copyright marking changes

Code Aurora Forum (CAF) is becoming a part of Linux Foundation Labs.

Signed-off-by: Richard Kuo <rkuo@codeaurora.org>

UAPI: (Scripted) Disintegrate arch/hexagon/include/asm

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>

crypto: aesni - fix XTS mode on x86-32, add wrapper function for asmlinkage aesni_enc()

Calling convention for internal functions and 'asmlinkage' functions is
different on x86-32. Therefore do not directly cast aesni_enc as XTS tweak
function, but use wrapper function in between. Fixes crash with "XTS +
aesni_intel + x86-32" combination.

Cc: stable@vger.kernel.org
Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs, xattr: fix bug when removing a name not in xattr list

Commit 38f38657444d ("xattr: extract simple_xattr code from tmpfs") moved
some code from tmpfs but introduced a subtle bug along the way.

If the name passed to simple_xattr_remove() does not exist in the list of
xattrs, then it is possible to call kfree(new_xattr) when new_xattr is
actually initialized to itself on the stack via uninitialized_var().

This causes a BUG() since the memory was not allocated via the slab
allocator and was not bypassed through to the page allocator because it
was too large.

Initialize the local variable to NULL so the kfree() never takes place.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

arm64: ptrace: use HW_BREAKPOINT_EMPTY type for disabled breakpoints

If a debugger tries to zero a hardware debug control register, the
kernel will try to infer both the type and length of the breakpoint
in order to sanity-check against the requested regset type. This will
fail because the encoding will appear as a zero-length breakpoint.

This patch changes the control register setting so that disabled
breakpoints are treated as HW_BREAKPOINT_EMPTY and no further
sanity-checking is required.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: ptrace: make structure padding explicit for debug registers

The user_hwdebug_state structure contains implicit padding to conform to
the alignment requirements of the AArch64 ABI (namely that aggregates
must be aligned to their most aligned member).

This patch fixes the ptrace functions operating on struct
user_hwdebug_state so that the padding is handled correctly.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: No need to set the x0-x2 registers in start_thread()

For historical reasons, ARM used to set r0-r2 in start_thread() to the
first values on the user stack when starting a new user application. The
same logic has been inherited in AArch64. The x0 register is overridden
by the sys_execve() return value so it's always zero on success. The x1
and x2 registers are ignored by AArch64 and EABI AArch32 applications,
so we can safely remove the register setting for both native and compat
user space.

This also fixes a potential fault with the kernel accessing user space
stack directly.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Al Viro <viro@zeniv.linux.org.uk>