git.karo-electronics.de Git - linux-beck.git/log

tracing/filters: use ring_buffer_discard_commit() in filter_check_discard()

This patch changes filter_check_discard() to make use of the new
ring_buffer_discard_commit() function and modifies the current users to
call the old commit function in the non-discard case.

It also introduces a version of filter_check_discard() that uses the
global trace buffer (filter_current_check_discard()) for those cases.

v2 changes:

- fix compile error noticed by Ingo Molnar

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
LKML-Reference: <1239178554.10295.36.camel@tropicana>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing/infrastructure: separate event tracer from event support

Add a new config option, CONFIG_EVENT_TRACING that gets selected
when CONFIG_TRACING is selected and adds everything needed by the stuff
in trace_export - basically all the event tracing support needed by e.g.
bprint, minus the actual events, which are only included if
CONFIG_EVENT_TRACER is selected.

So CONFIG_EVENT_TRACER can be used to turn on or off the generated events
(what I think of as the 'event tracer'), while CONFIG_EVENT_TRACING turns
on or off the base event tracing support used by both the event tracer and
the other things such as bprint that can't be configured out.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
LKML-Reference: <1239178441.10295.34.camel@tropicana>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing/filters: use ring_buffer_discard_commit for discarded events

The ring_buffer_discard_commit makes better usage of the ring_buffer
when an event has been discarded. It tries to remove it completely if
possible.

This patch converts the trace event filtering to use
ring_buffer_discard_commit instead of the ring_buffer_event_discard.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ring-buffer: add ring_buffer_discard_commit

The ring_buffer_discard_commit is similar to ring_buffer_event_discard
but it can only be done on an event that has yet to be committed.
Unpredictable results can happen otherwise.

The main difference between ring_buffer_discard_commit and
ring_buffer_event_discard is that ring_buffer_discard_commit will try
to free the data in the ring buffer if nothing has added data
after the reserved event. If something did, then it acts almost the
same as ring_buffer_event_discard followed by a
ring_buffer_unlock_commit.

Note, either ring_buffer_discard_commit or ring_buffer_unlock_commit
can be called on an event, not both.

This commit also exports both discard functions to be usable by
GPL modules.
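
The difference described above can be illustrated with a toy userspace
model. This is only a sketch and not the kernel ring buffer: the toy_*
names, the flat "tail" offset and the state enum are all hypothetical
and exist purely to show when discarded space can be reclaimed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not a kernel API. */
enum toy_state { TOY_RESERVED, TOY_COMMITTED, TOY_DISCARDED };

struct toy_event {
	size_t start;		/* offset of the event in the buffer */
	size_t len;		/* bytes reserved for the event */
	enum toy_state state;
};

struct toy_buffer {
	size_t tail;		/* offset of the next free byte */
};

/* Reserve len bytes at the tail and hand back a descriptor. */
struct toy_event toy_reserve(struct toy_buffer *b, size_t len)
{
	struct toy_event ev = { .start = b->tail, .len = len,
				.state = TOY_RESERVED };

	b->tail += len;
	return ev;
}

/* Normal path: the event stays in the buffer and becomes visible. */
void toy_commit(struct toy_event *ev)
{
	ev->state = TOY_COMMITTED;
}

/*
 * Discard path: if nothing was reserved after this event, rewind the
 * tail so the space is reused. Otherwise the event can only be marked
 * as discarded and keeps occupying its slot. Returns true when the
 * space was reclaimed.
 */
bool toy_discard_commit(struct toy_buffer *b, struct toy_event *ev)
{
	ev->state = TOY_DISCARDED;
	if (ev->start + ev->len == b->tail) {
		b->tail = ev->start;
		return true;
	}
	return false;
}
```

In this model an event gets exactly one of toy_commit() or
toy_discard_commit(), matching the "not both" rule above.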

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing/filters: add TRACE_EVENT_FORMAT_NOFILTER event macro

Frederic Weisbecker suggested that the trace_special event shouldn't be
filterable; this patch adds a TRACE_EVENT_FORMAT_NOFILTER event macro
that allows an event format to be exported without having a filter
attached, and removes filtering from the trace_special event.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing/filters: add run-time field descriptions to TRACE_EVENT_FORMAT events

This patch adds run-time field descriptions to all the event formats
exported using TRACE_EVENT_FORMAT. It also hooks up all the tracers
that use them (i.e. the tracers in the 'ftrace subsystem') so they can
also have their output filtered by the event-filtering mechanism.

When I was testing this, there were a couple of things that fooled me
into thinking the filters weren't working, when actually they were -
I'll mention them here so others don't make the same mistakes (and file
bug reports). ;-)

One is that some of the tracers trace multiple events e.g. the
sched_switch tracer uses the context_switch and wakeup events, and if
you don't set filters on all of the traced events, the unfiltered output
from the events without filters on them can make it look like the
filtering as a whole isn't working properly, when actually it is doing
what it was asked to do - it just wasn't asked to do the right thing.

The other is that for the really high-volume tracers e.g. the function
tracer, the volume of filtered events can be so high that it pushes the
unfiltered events out of the ring buffer before they can be read so e.g.
cat'ing the trace file repeatedly shows either no output, or once in
a while some output that isn't there the next time you read the
trace, which isn't what you normally expect when reading the trace file.
If you read from the trace_pipe file though, you can catch them before
they disappear.

Changes from v1:

As suggested by Frederic Weisbecker:

- get rid of externs in functions
- added unlikely() to filter_check_discard()

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

blktrace: fix output of BLK_TC_PC events

BLK_TC_PC events should be treated differently from BLK_TC_FS events.

Before this patch:

# echo 1 > /sys/block/sda/sda1/trace/enable
# echo pc > /sys/block/sda/sda1/trace/act_mask
# echo blk > /debugfs/tracing/current_tracer
# (generate some BLK_TC_PC events)
# cat trace
        bash-2184  [000]  1774.275413:   8,7    I   N [bash]
        bash-2184  [000]  1774.275435:   8,7    D   N [bash]
        bash-2184  [000]  1774.275540:   8,7    I   R [bash]
        bash-2184  [000]  1774.275547:   8,7    D   R [bash]
ksoftirqd/0-4     [000]  1774.275580:   8,7    C   N 0 [0]
        bash-2184  [000]  1774.275648:   8,7    I   R [bash]
        bash-2184  [000]  1774.275653:   8,7    D   R [bash]
ksoftirqd/0-4     [000]  1774.275682:   8,7    C   N 0 [0]
        bash-2184  [000]  1774.275739:   8,7    I   R [bash]
        bash-2184  [000]  1774.275744:   8,7    D   R [bash]
ksoftirqd/0-4     [000]  1774.275771:   8,7    C   N 0 [0]
        bash-2184  [000]  1774.275804:   8,7    I   R [bash]
        bash-2184  [000]  1774.275808:   8,7    D   R [bash]
ksoftirqd/0-4     [000]  1774.275836:   8,7    C   N 0 [0]

After this patch:

# cat trace
        bash-2263  [000]   366.782149:   8,7    I   N 0 (00 ..) [bash]
        bash-2263  [000]   366.782323:   8,7    D   N 0 (00 ..) [bash]
        bash-2263  [000]   366.782557:   8,7    I   R 8 (25 00 ..) [bash]
        bash-2263  [000]   366.782560:   8,7    D   R 8 (25 00 ..) [bash]
ksoftirqd/0-4     [000]   366.782582:   8,7    C   N (25 00 ..) [0]
        bash-2263  [000]   366.782648:   8,7    I   R 8 (5a 00 3f 00) [bash]
        bash-2263  [000]   366.782650:   8,7    D   R 8 (5a 00 3f 00) [bash]
ksoftirqd/0-4     [000]   366.782669:   8,7    C   N (5a 00 3f 00) [0]
        bash-2263  [000]   366.782710:   8,7    I   R 8 (5a 00 08 00) [bash]
        bash-2263  [000]   366.782713:   8,7    D   R 8 (5a 00 08 00) [bash]
ksoftirqd/0-4     [000]   366.782730:   8,7    C   N (5a 00 08 00) [0]
        bash-2263  [000]   366.783375:   8,7    I   R 36 (5a 00 08 00) [bash]
        bash-2263  [000]   366.783379:   8,7    D   R 36 (5a 00 08 00) [bash]
ksoftirqd/0-4     [000]   366.783404:   8,7    C   N (5a 00 08 00) [0]

This is what we do with PC events in user-space blktrace.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <49D32387.9040106@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

blktrace: fix output of unknown events

Not all events are pc (packet command) events. An event is a pc
event only if it has BLK_TC_PC bit set.
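
A minimal sketch of that test, assuming the category bits sit in the
upper half of the action word. The bit positions below are written from
memory of blktrace_api.h and should be treated as assumptions; the
toy_is_pc_event() helper is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; check blktrace_api.h before relying on them. */
#define BLK_TC_FS	(1u << 8)	/* filesystem requests */
#define BLK_TC_PC	(1u << 9)	/* packet command requests */
#define BLK_TC_SHIFT	16
#define BLK_TC_ACT(act)	((uint32_t)(act) << BLK_TC_SHIFT)

/* Hypothetical helper: an event is a pc event only if the bit is set. */
bool toy_is_pc_event(uint32_t action)
{
	return (action & BLK_TC_ACT(BLK_TC_PC)) != 0;
}
```

An action word carrying only BLK_TC_FS, or no category bit at all, is
not a pc event under this check.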

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <49D3236D.3090705@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing, kmemtrace: Make kmem tracepoints use TRACE_EVENT macro

TRACE_EVENT is a more generic way to define tracepoints.
Doing so adds these new capabilities to this tracepoint:

  - zero-copy and per-cpu splice() tracing
  - binary tracing without printf overhead
  - structured logging records exposed under /debug/tracing/events
  - trace events embedded in function tracer output and other plugins
  - user-defined, per tracepoint filter expressions

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49DEE6DA.80600@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing, kmemtrace: Separate include/trace/kmemtrace.h to kmemtrace part and tracepoint part

Impact: refactor code for future changes

The current kmemtrace.h serves both as the header file for kmemtrace and
as the definition file for the kmem tracepoints.

A tracepoint definition file may be used by other code, and should contain
only tracepoint definitions.

We can separate include/trace/kmemtrace.h into 2 files:

include/linux/kmemtrace.h: header file for kmemtrace
include/trace/kmem.h: definition of kmem tracepoints

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49DEE68A.5040902@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing: Document the event tracing system

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1239479479-2603-3-git-send-email-tytso@mit.edu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing: Add documentation for the power tracer

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <1239479479-2603-4-git-send-email-tytso@mit.edu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing, net, skb tracepoint: make skb tracepoint use the TRACE_EVENT() macro

TRACE_EVENT is a more generic way to define a tracepoint.
Doing so adds these new capabilities to this tracepoint:

  - zero-copy and per-cpu splice() tracing
  - binary tracing without printf overhead
  - structured logging records exposed under /debug/tracing/events
  - trace events embedded in function tracer output and other plugins
  - user-defined, per tracepoint filter expressions

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49DD90D2.5020604@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86, function-graph: only save return values on x86_64

Impact: speed up

The return to handler portion of the function graph tracer should only
need to save the return values. The caller already saved off the
registers that the callee can modify. The returning function already
saved the registers it modified. When we call our own trace function
it too will save the registers that the callee must restore.

There's no reason to save off anything more than the registers used
to return the values.

Note, I did a complete kernel build with this modification and the
function graph tracer running on x86_64.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing/lockdep: report the time waited for a lock

While trying to optimize the new lock on reiserfs to replace
the bkl, I find the lock tracing very useful though it lacks
something important for performance (and latency) instrumentation:
the time a task waits for a lock.

That's what this patch implements:

  bash-4816  [000]   202.652815: lock_contended: lock_contended: &sb->s_type->i_mutex_key
  bash-4816  [000]   202.652819: lock_acquired: &rq->lock (0.000 us)
<...>-4787  [000]   202.652825: lock_acquired: &rq->lock (0.000 us)
<...>-4787  [000]   202.652829: lock_acquired: &rq->lock (0.000 us)
  bash-4816  [000]   202.652833: lock_acquired: &sb->s_type->i_mutex_key (16.005 us)

As shown above, the lock_acquired entry is followed by the time
spent waiting for the lock. Usually, a lock_contended entry
is followed by a nearby lock_acquired entry with a non-zero time waited.
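
The bookkeeping behind such a wait time can be sketched as follows.
This is a userspace illustration under my own assumptions, not the code
from this patch: the toy_* names and the nanosecond timestamps are
hypothetical.

```c
#include <stdint.h>

/* Hypothetical per-lock state; 0 means "no contention recorded". */
struct toy_lock_stat {
	uint64_t contended_ns;
};

/* Called when a task starts waiting for the lock. */
void toy_lock_contended(struct toy_lock_stat *s, uint64_t now_ns)
{
	s->contended_ns = now_ns;
}

/*
 * Called when the lock is finally taken. Returns the time waited, or 0
 * if the lock was acquired without contention.
 */
uint64_t toy_lock_acquired(struct toy_lock_stat *s, uint64_t now_ns)
{
	uint64_t waited = 0;

	if (s->contended_ns) {
		waited = now_ns - s->contended_ns;
		s->contended_ns = 0;	/* reset for the next acquisition */
	}
	return waited;
}
```

An uncontended acquisition reports 0, which corresponds to the
"(0.000 us)" entries in the trace above.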

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1238975373-15739-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Merge branch 'tracing/urgent' into tracing/core

Merge reason: pick up both v2.6.30-rc1 [which includes tracing/urgent fixes]
and pick up the current lineup of tracing/urgent fixes as well

Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing: fix splice return too large

I got these from strace:

splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 16384
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192

I wanted to splice_read 4096 bytes, but it returned 8192 or more.

This is because the return value of tracing_buffers_splice_read()
does not include the "zero out any left over data" bytes,
whereas tracing_buffers_read() does include them. Make the two
consistent.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49D46674.9030804@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing: update file->f_pos when splice(2) it

Impact: Cleanup

These two lines:

	if (unlikely(*ppos))
		return -ESPIPE;

in tracing_buffers_splice_read() are not needed: the VFS layer
has already disabled seek(2).

We remove these two lines, and then we can update file->f_pos.

tracing_buffers_read() already updates file->f_pos; this fix
makes tracing_buffers_splice_read() update file->f_pos too.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49D46670.4010503@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing: allocate page when needed

Impact: Cleanup

Sometimes we open trace_pipe_raw but never read(2) it and only
splice(2) it, so the page is never used.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49D4666B.4010608@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing: disable seeking for trace_pipe_raw

Impact: disable pread()

We set tracing_buffers_fops.llseek to no_llseek,
but we can still perform pread() to read this file.

That is not expected.

This fix uses nonseekable_open() to disable it.

tracing_buffers_fops.llseek is still set to no_llseek;
this marks the file as a "non-seekable device" and is used by
sys_splice(). See also do_splice() or the manual page of splice(2):

ERRORS
       EINVAL Target file system doesn't support  splicing;
              neither  of the descriptors refers to a pipe;
              or offset given for non-seekable device.
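
For comparison, here is what userspace sees on an object that is
already non-seekable. A pipe is used as a stand-in because
trace_pipe_raw needs a tracing-enabled kernel; the helper name is
hypothetical and this is only a sketch of the expected errno, not a
test of the patched file.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: try pread() on the read end of a pipe.
 * Returns the errno reported by pread(), 0 if pread() did not fail,
 * or -1 if the pipe could not be created.
 */
int toy_pread_errno_on_pipe(void)
{
	int fds[2];
	char buf[4];
	int err = 0;

	if (pipe(fds) != 0)
		return -1;

	/* Put some data in the pipe, then attempt a positional read. */
	if (write(fds[1], "abc", 3) == 3 &&
	    pread(fds[0], buf, sizeof(buf), 0) < 0)
		err = errno;

	close(fds[0]);
	close(fds[1]);
	return err;
}
```

On a pipe this reports ESPIPE, which is the behaviour one would expect
from trace_pipe_raw once pread() is disabled.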

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49D46668.8030806@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

MN10300: Kill MN10300's own profiling Kconfig

Kill MN10300's own profiling Kconfig as this is superfluous given that the
profiling options have moved to init/Kconfig and arch/Kconfig. Not only is
this now superfluous, but the dependencies are not correct.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

FRV: Use <asm-generic/pgtable.h> in NOMMU mode

asm-frv/pgtable.h could just #include <asm-generic/pgtable.h> in NOMMU mode
rather than #defining macros for lazy MMU and CPU stuff.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

keys: Handle there being no fallback destination keyring for request_key()

When request_key() is called, without there being any standard process
keyrings on which to fall back if a destination keyring is not specified, an
oops is liable to occur when construct_alloc_key() calls down_write() on
dest_keyring's semaphore.

Due to function inlining this may be seen as an oops in down_write() as called
from request_key_and_link().

This situation crops up during boot, where request_key() is called from within
the kernel (such as in CIFS mounts) where nobody is actually logged in, and so
PAM has not had a chance to create a session keyring and user keyrings to act
as the fallback.

To fix this, make construct_alloc_key() not attempt to cache a key if there is
no fallback keyring and no destination keyring is given specifically.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

afs: BUG to BUG_ON changes

Signed-off-by: Stoyan Gaydarov <stoyboyker@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: cpu_debug remove execute permission
  x86: smarten /proc/interrupts output for new counters
  x86: DMI match for the Dell DXP061 as it needs BIOS reboot
  x86: make 64 bit to use default_inquire_remote_apic
  x86, setup: un-resequence mode setting for VGA 80x34 and 80x60 modes
  x86, intel-iommu: fix X2APIC && !ACPI build failure

Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tracing: consolidate documents
  blktrace: pass the right pointer to kfree()
  tracing/syscalls: use a dedicated file header
  tracing: append a comma to INIT_FTRACE_GRAPH

Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: do not count frozen tasks toward load
  sched: refresh MAINTAINERS entry
  sched: Print sched_group::__cpu_power in sched_domain_debug
  cpuacct: add per-cgroup utime/stime statistics
  posixtimers, sched: Fix posix clock monotonicity
  sched_rt: don't allocate cpumask in fastpath
  cpuacct: make cpuacct hierarchy walk in cpuacct_charge() safe when rcupreempt is used -v2

Merge branches 'core-fixes-for-linus', 'irq-fixes-for-linus' and 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  printk: fix wrong format string iter for printk
  futex: comment requeue key reference semantics

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  irq: fix cpumask memory leak on offstack cpumask kernels

* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  posix-timers: fix RLIMIT_CPU && setitimer(CPUCLOCK_PROF)
  posix-timers: fix RLIMIT_CPU && fork()
  timers: add missing kernel-doc

MN10300: Convert obsolete no_irq_type to no_irq_chip

Convert the last remaining users to no_irq_chip.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm

* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
  dm kcopyd: fix callback race
  dm kcopyd: prepare for callback race fix
  dm: implement basic barrier support
  dm: remove dm_request loop
  dm: rework queueing and suspension
  dm: simplify dm_request loop
  dm: split DMF_BLOCK_IO flag into two
  dm: rearrange dm_wq_work
  dm: remove limited barrier support
  dm: add integrity support

module: try_then_request_module must wait

Since the whole point of try_then_request_module is to retry
the operation after a module has been loaded, we must wait for
the module to fully load.

Otherwise all sorts of things start breaking, e.g., you won't
be able to read your encrypted disks on the first attempt.
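
Why the wait matters can be shown with a toy model. Nothing below is
kernel code: the toy_* names are hypothetical and the "wait" flag
merely stands in for whether the loader blocks until the module has
finished initialising.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical module state. */
struct toy_module {
	bool load_started;
	bool load_finished;
};

/* The lookup only succeeds once the module is fully loaded. */
const char *toy_lookup(const struct toy_module *m)
{
	return m->load_finished ? "cipher" : NULL;
}

/* With wait == false the load is still in flight when we return. */
void toy_request_module(struct toy_module *m, bool wait)
{
	m->load_started = true;
	if (wait)
		m->load_finished = true;	/* stands in for blocking */
}

/* Try, load the module on failure, then try once more. */
const char *toy_try_then_request(struct toy_module *m, bool wait)
{
	const char *found = toy_lookup(m);

	if (!found) {
		toy_request_module(m, wait);
		found = toy_lookup(m);
	}
	return found;
}
```

Without the wait, the retry runs before the module is usable and the
first attempt fails, which is the breakage described above.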

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Tested-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sched: do not count frozen tasks toward load

Freezing tasks via the cgroup freezer causes the load average to climb
because the freezer's current implementation puts frozen tasks in
uninterruptible sleep (D state).

Some applications which perform job-scheduling functions consult the
load average when making decisions.  If a cgroup is frozen, the load
average does not provide a useful measure of the system's utilization
to such applications.  This is especially inconvenient if the job
scheduler employs the cgroup freezer as a mechanism for preempting low
priority jobs.  Contrast this with using SIGSTOP for the same purpose:
the stopped tasks do not count toward system load.

Change task_contributes_to_load() to return false if the task is
frozen.  This results in /proc/loadavg behavior that better meets
users' expectations.
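
The changed predicate can be sketched in userspace like this. The flag
values and the toy_task structure are hypothetical stand-ins chosen for
illustration; only the shape of the condition is meant to match the
description above.

```c
#include <stdbool.h>

/* Hypothetical flag values, not the kernel's definitions. */
#define TOY_TASK_UNINTERRUPTIBLE	0x2u
#define TOY_PF_FROZEN			0x1u

struct toy_task {
	unsigned int state;	/* scheduler state bits */
	unsigned int flags;	/* per-process flags */
};

/* A task in D state counts toward load unless it is frozen. */
bool toy_task_contributes_to_load(const struct toy_task *t)
{
	return (t->state & TOY_TASK_UNINTERRUPTIBLE) != 0 &&
	       (t->flags & TOY_PF_FROZEN) == 0;
}
```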

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Nigel Cunningham <nigel@tuxonice.net>
Tested-by: Nigel Cunningham <nigel@tuxonice.net>
Cc: <stable@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: linux-pm@lists.linux-foundation.org
Cc: Matt Helsley <matthltc@us.ibm.com>
LKML-Reference: <20090408194512.47a99b95@manatee.lan>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing: consolidate documents

Move kmemtrace.txt, tracepoints.txt, ftrace.txt and mmiotrace.txt to
the new trace/ directory.

I didn't find any references to those documents in either source
files or documents, so no extra work needs to be done.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Pekka Paalanen <pq@iki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
LKML-Reference: <49DD6E2B.6090200@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: cpu_debug remove execute permission

It seems these files got execute permissions by mistake, so remove them.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
LKML-Reference: <1239211186.9037.2.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

blktrace: pass the right pointer to kfree()

Impact: fix kfree crash with non-standard act_mask string

If passing a string with leading white spaces to strstrip(),
the returned ptr != the original ptr.
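
The pitfall can be reproduced in userspace. toy_strstrip() below is
only an approximation of the kernel helper, and toy_parse_mask() is a
hypothetical caller; malloc()/free() stand in for kmalloc()/kfree().

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Approximation of strstrip(): trims trailing whitespace in place and
 * returns a pointer to the first non-whitespace character.
 */
char *toy_strstrip(char *s)
{
	size_t len = strlen(s);

	while (len > 0 && isspace((unsigned char)s[len - 1]))
		s[--len] = '\0';
	while (*s && isspace((unsigned char)*s))
		s++;
	return s;
}

/*
 * Hypothetical caller. Returns 1 if the stripped pointer differs from
 * the allocation, 0 if it does not, -1 on allocation failure.
 */
int toy_parse_mask(const char *input)
{
	char *buf = malloc(strlen(input) + 1);
	char *s;
	int moved;

	if (!buf)
		return -1;
	strcpy(buf, input);

	s = toy_strstrip(buf);
	moved = (s != buf);

	free(buf);	/* free the original pointer, never s */
	return moved;
}
```

With leading whitespace the stripped pointer lands inside the
allocation, so handing it to the free routine is the bug.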

This bug was introduced by me.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <49DD694C.8020902@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing/syscalls: use a dedicated file header

Impact: fix build warnings and possible compat misbehavior on IA64

Building a kernel on ia64 might trigger these ugly build warnings:

CC      arch/ia64/ia32/sys_ia32.o
In file included from arch/ia64/ia32/sys_ia32.c:55:
arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
In file included from include/linux/elf.h:7,
                 from include/linux/module.h:14,
                 from include/linux/ftrace.h:8,
                 from include/linux/syscalls.h:68,
                 from arch/ia64/ia32/sys_ia32.c:18:
arch/ia64/include/asm/elf.h:19:1: warning: this is the location of the previous definition
[...]

sys_ia32.c includes linux/syscalls.h which in turn includes linux/ftrace.h
to import the syscalls tracing prototypes.

But including ftrace.h can pull in too many things for a low level file,
especially on ia64 where the ia32 private headers conflict with higher
level headers.

Now we isolate the syscall tracing headers in their own lightweight file.

Reported-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jason Baron <jbaron@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Michael Davidson <md@google.com>
LKML-Reference: <20090408184058.GB6017@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  work_on_cpu(): rewrite it to create a kernel thread on demand
  kthread: move sched-realeted initialization from kthreadd context
  kthread: Don't looking for a task in create_kthread() #2

Merge git://git.infradead.org/battery-2.6

* git://git.infradead.org/battery-2.6:
  pda_power: Add optional OTG transceiver and voltage regulator support
  pcf50633_charger: Remove unused mbc_set_status function
  pcf50633_charger: Enable periodic charging restart

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  cap_prctl: don't set error to 0 at 'no_change'

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  igb: remove sysfs entry that was used to set the number of vfs
  igbvf: add new driver to support 82576 virtual functions
  drivers/net/eql.c: Fix a dev leakage.
  niu: Fix unused variable warning.
  r6040: set MODULE_VERSION
  bnx2: Don't use reserved names
  FEC driver: add missing #endif
  niu: Fix error handling
  mv643xx_eth: don't reset the rx coal timer on interface up
  smsc911x: correct debugging message on mii read timeout
  ethoc: fix library build errors
  netfilter: ctnetlink: fix regression in expectation handling
  netfilter: fix selection of "LED" target in netfilter
  netfilter: ip6tables regression fix

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc: Hook up sys_preadv and sys_pwritev
  sparc64: add_node_ranges() must be __init
  serial: sunsu: sunsu_kbd_ms_init needs to be __devinit
  sparc: Fix section mismatch warnings in cs4231 sound driver.
  sparc64: Fix section mismatch warnings in PCI controller drivers.
  sparc64: Fix section mismatch warnings in power driver.
  sparc64: get_cells() can't be marked __init

Merge branch 'ext3-latency-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

* 'ext3-latency-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext3: Try to avoid starting a transaction in writepage for data=writepage
  block_write_full_page: switch synchronous writes to use WRITE_SYNC_PLUG

work_on_cpu(): rewrite it to create a kernel thread on demand

Impact: circular locking bugfix

The various implementations and proposed implementations of work_on_cpu()
are vulnerable to various deadlocks because they all used queues of some
form.

Unrelated pieces of kernel code thus gained dependencies wherein if one
work_on_cpu() caller holds a lock which some other work_on_cpu() callback
also takes, the kernel could rarely deadlock.

Fix this by creating a short-lived kernel thread for each work_on_cpu()
invocation.

This is not terribly fast, but the only current caller of work_on_cpu() is
pci_call_probe().

It would be nice to find some other way of doing the node-local
allocations in the PCI probe code so that we can zap work_on_cpu()
altogether. The code there is rather nasty. I can't think of anything
simple at this time...

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

kthread: move sched-realeted initialization from kthreadd context

kthreadd is the single thread which implements the "create" request; move
sched_setscheduler/etc from create_kthread() to kthread_create() to
improve the scalability.

We should be careful with sched_setscheduler(), use the _nocheck helper.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Vitaliy Gusev <vgusev@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

kthread: Don't looking for a task in create_kthread() #2

Remove the unnecessary find_task_by_pid_ns(). kthread() can just
use "current" to get the same result.

Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

dm kcopyd: fix callback race

If the thread calling dm_kcopyd_copy is delayed due to scheduling inside
split_job/segment_complete and the subjobs complete before the loop in
split_job completes, the kcopyd callback could be invoked from the
thread that called dm_kcopyd_copy instead of the kcopyd workqueue.

dm_kcopyd_copy -> split_job -> segment_complete -> job->fn()

Snapshots depend on the fact that callbacks are called from the single-threaded
kcopyd workqueue and expect that there is no racing between individual
callbacks. The racing between callbacks can lead to corruption of exception
store and it can also mean that exception store callbacks are called twice
for the same exception - a likely reason for crashes reported inside
pending_complete() / remove_exception().

This patch fixes two problems:

1. job->fn being called from the thread that submitted the job (see above).

- Fix: hand over the completion callback to the kcopyd thread.

2. job->fn(read_err, write_err, job->context); in segment_complete
reports the error of the last subjob, not the union of all errors.

- Fix: pass job->write_err to the callback to report all error bits
(it is done already in run_complete_job)
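
The second problem can be illustrated with a small model of error
accumulation. The toy_job structure and function are hypothetical; the
sketch only shows why the callback should receive the accumulated bits
rather than the result of the last sub-job.

```c
/* Hypothetical job state shared by all sub-jobs. */
struct toy_job {
	int read_err;			/* set if any sub-job failed to read */
	unsigned long write_err;	/* union of all write error bits */
};

/* Fold the result of one completed sub-job into the parent job. */
void toy_segment_complete(struct toy_job *job, int read_err,
			  unsigned long write_err)
{
	if (read_err)
		job->read_err = 1;
	if (write_err)
		job->write_err |= write_err;
}
```

If the first sub-job fails with bit 0 set and the last one succeeds,
the last sub-job alone reports 0, while job->write_err still carries
the failure.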

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm kcopyd: prepare for callback race fix

Use a variable in segment_complete() to point to the dm_kcopyd_client
struct and only release job->pages in run_complete_job() if any are
defined. These changes are needed by the next patch.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: implement basic barrier support

Barriers are submitted to a worker thread that issues them in-order.

The thread is modified so that when it sees a barrier request it waits
for all pending IO before the request then submits the barrier and
waits for it. (We must wait, otherwise it could be intermixed with
following requests.)

Errors from the barrier request are recorded in a per-device barrier_error
variable. There may be only one barrier request in progress at once.

For now, the barrier request is converted to a non-barrier request when
sending it to the underlying device.

This patch guarantees correct barrier behavior if the underlying device
doesn't perform write-back caching. The same requirement existed before
barriers were supported in dm.

Bottom layer barrier support (sending barriers by target drivers) and
handling devices with write-back caches will be done in further patches.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: remove dm_request loop

Remove queue_io return value and a loop in dm_request.

IO may be submitted to a worker thread with queue_io(). queue_io() sets
DMF_QUEUE_IO_TO_THREAD so that all further IO is queued for the thread. When
the thread finishes its work, it clears DMF_QUEUE_IO_TO_THREAD and from this
point on, requests are submitted from dm_request again. This will be used
for processing barriers.

Remove the loop in dm_request. queue_io() can submit I/Os to the worker thread
even if DMF_QUEUE_IO_TO_THREAD was not set.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: rework queueing and suspension

Rework shutting down on suspend and document the associated rules.

Drop write lock in __split_and_process_bio to allow more processing
concurrency.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: simplify dm_request loop

Refactor the code in dm_request().

Require the new DMF_BLOCK_IO_FOR_SUSPEND flag on readahead bios we will
discard so we don't drop such bios while processing a barrier.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: split DMF_BLOCK_IO flag into two

Split the DMF_BLOCK_IO flag into two.

DMF_BLOCK_IO_FOR_SUSPEND is set when I/O must be blocked while suspending a
device. DMF_QUEUE_IO_TO_THREAD is set when I/O must be queued to a
worker thread for later processing.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: rearrange dm_wq_work

Refactor dm_wq_work() to make a later patch more readable.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: remove limited barrier support

Prepare for full barrier implementation: first remove the restricted support.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: add integrity support

This patch provides support for data integrity passthrough in the device
mapper.

- If one or more component devices support integrity, an integrity
profile is preallocated for the DM device.

- If all component devices have compatible profiles, the DM device is
flagged as capable.

- Handle integrity metadata when splitting and cloning bios.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

cap_prctl: don't set error to 0 at 'no_change'

One-liner: capsh --print is broken without this patch.

In certain cases, cap_prctl returns error > 0 for success. However,
the 'no_change' label was always setting error to 0. As a result,
for example, 'prctl(PR_CAPBSET_READ, N)' would always return 0.
It should return 1 if a process has N in its bounding set (as
by default it does).

I'm keeping the no_change label even though it's now functionally
the same as 'error'.
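
The expected userspace behaviour can be checked with a small Linux-only sketch. cap_in_bounding_set() is a hypothetical helper name; PR_CAPBSET_READ and CAP_CHOWN are the constants from <sys/prctl.h> and <linux/capability.h>.

```c
#include <assert.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/*
 * Hypothetical helper: returns 1 if capability `cap` is in the calling
 * thread's bounding set, 0 if it is not, and -1 (errno set) on error.
 */
static int cap_in_bounding_set(int cap)
{
    return prctl(PR_CAPBSET_READ, (unsigned long)cap, 0UL, 0UL, 0UL);
}
```

On a kernel with the bug described above, cap_in_bounding_set(CAP_CHOWN) would report 0 even for a process that still has CAP_CHOWN in its bounding set.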

Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>

igb: remove sysfs entry that was used to set the number of vfs

This patch removes the sysfs entry num_vfs which was added to support
enabling pci virtual functions for 82576.

To prevent VFs from loading automatically a module parameter "max_vfs" was
added so that the number of VFs per PF can be limited. This is especially
useful when 4 or more 82576 ports are on the system, because loading all
VFs would otherwise result in 8 interfaces per physical port.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

igbvf: add new driver to support 82576 virtual functions

This adds an igbvf driver to handle virtual functions provided by the
igb driver when SR-IOV has been enabled. A virtual function is a
lightweight pci-e function that supports a single queue and shares
resources with the 82576 physical function contained within the igb
driver.

To spawn virtual functions from the igb driver all that is needed is to
enable CONFIG_PCI_IOV and have an 82576 Ethernet adapter on a system that
supports SR-IOV in the BIOS. The virtual functions will appear after the
interface is loaded.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net/eql.c: Fix a dev leakage.

After dev_get_by_name(), we should follow a dev_put().
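
The get/put pairing can be sketched in userspace as follows. This is only an illustration of reference counting, not the eql code; struct netdev_sketch, get_by_name_sketch(), put_sketch() and query_sketch() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical refcounted device; the registry itself holds one reference. */
struct netdev_sketch {
    const char *name;
    int refcnt;
};

static struct netdev_sketch eth0 = { "eth0", 1 };

/* Look up a device by name and take a reference, like dev_get_by_name(). */
static struct netdev_sketch *get_by_name_sketch(const char *name)
{
    if (strcmp(name, eth0.name) != 0)
        return NULL;
    eth0.refcnt++;
    return &eth0;
}

/* Drop a reference taken by get_by_name_sketch(), like dev_put(). */
static void put_sketch(struct netdev_sketch *dev)
{
    assert(dev->refcnt > 0);
    dev->refcnt--;
}

/* Every successful lookup is balanced by a put before returning. */
static int query_sketch(const char *name)
{
    struct netdev_sketch *dev = get_by_name_sketch(name);

    if (!dev)
        return -1;
    /* ... inspect dev here ... */
    put_sketch(dev);
    return 0;
}
```

Without the put_sketch() call the count would grow on every query, and the device could never be released.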

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

niu: Fix unused variable warning.

Don't strain gcc's tiny mind.

Signed-off-by: David S. Miller <davem@davemloft.net>

r6040: set MODULE_VERSION

This patch sets MODULE_VERSION in order to help users track
changes to this module.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2: Don't use reserved names

The mips identifier is reserved by gcc on mips platforms. Don't use it
in the code.

Signed-off-by: Bastian Blank <waldi@debian.org>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

FEC driver: add missing #endif

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

niu: Fix error handling

platform_device_register_simple() returns ERR_PTR(), not NULL, if an error
occurs.
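
Why a NULL check misses the failure can be shown with a userspace sketch of the error-pointer convention. The three helpers below imitate the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); register_sketch() and probe_sketch() are hypothetical and are not the niu code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error pointers occupy the top MAX_ERRNO addresses. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_device;

/* Hypothetical registration call that fails with an encoded errno. */
static void *register_sketch(int fail)
{
    if (fail)
        return ERR_PTR(-ENOMEM);
    return &dummy_device;
}

static int probe_sketch(int fail)
{
    void *dev = register_sketch(fail);

    if (IS_ERR(dev)) /* "if (!dev)" would never be true here */
        return (int)PTR_ERR(dev);
    return 0;
}
```

The failing call returns a non-NULL pointer, so code that tests only for NULL would go on to use an invalid pointer.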

Found by smatch (http://repo.or.cz/w/smatch.git). Compile tested.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mv643xx_eth: don't reset the rx coal timer on interface up

Move SDMA configuration from interface up to port probe, to prevent
overwriting the receive coalescing timer value on interface up.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

smsc911x: correct debugging message on mii read timeout

The warning printed when a mii READ times out currently says "Timed out
waiting for MII write to finish". This patch corrects this.

Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethoc: fix library build errors

ethoc indirectly uses crc32_le() and bitrev32(), so select
those library functions to be built.

drivers/built-in.o: In function `ethoc_set_multicast_list':
ethoc.c:(.text+0x6226f): undefined reference to `crc32_le'
ethoc.c:(.text+0x62276): undefined reference to `bitrev32'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] wire up preadv/pwritev system calls

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  x86 ACPI: Add support for Always Running APIC timer
  ACPI x86: Make aperf/mperf MSR access in acpi_cpufreq read_only
  ACPI x86: Cleanup acpi_cpufreq structures related to aperf/mperf
  ACPICA: delete check for AML access to port 0x81-83
  ACPI: WMI: use .notify method instead of installing handler directly
  sony-laptop: use .notify method instead of installing handler directly
  panasonic-laptop: use .notify method instead of installing handler directly
  fujitsu-laptop: use .notify method instead of installing hotkey handler directly
  fujitsu-laptop: use .notify method instead of installing handler directly
  ACPI: video: use .notify method instead of installing handler directly
  ACPI: thermal: use .notify method instead of installing handler directly
  ACPI battery: fix async boot oops
  ACPI: delete acpi_device.g_list
  NULL noise: drivers/platform/x86/panasonic-laptop.c
  ACPI: cpufreq: remove dupilcated #include
  ACPI: Adjust Kelvin offset to match local implementation
  ACPI: convert acpi_device_lock spinlock to mutex

Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5446/1: ohci-at91: Limit vbus_pin assignment to the size of the array
  [ARM] 5445/1: AT91: Remove flexible array from USBH platform data
  [ARM] 5447/1: Add SZ_32K
  [ARM] omap: fix omap1 clock usecount decrement bug
  [ARM] pxa: register AC97 controller devices
  [ARM] pxa/csb701: do not register devices on non-csb726 boads
  [ARM] pxa/colibri: get rid of set_irq_type()
  [ARM] pxa/colibri: provide MAC address from ATAG_SERIAL
  [ARM] pxa/cm-x2xx: fix ucb1400 not being registered
  [ARM] pxa: Add support for suspend on PalmTX, T5 and LD
  [ARM] pxa: PalmTE2 support for battery, UDC, IrDA and backlight
  [ARM] pxa: Palm Tungsten E2 basic support
  [ARM] pxa/em-x270: add libertas device registration
  [ARM] pxa/magician: Enable bq24022 regulator for gpio_vbus and pda_power

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
  mmc_spi: support for non-byte-aligned cards
  omap_hsmmc: Do not expect cmd/data to be non-null when CC/TC occurs
  mmc: Fix compile for omap_hsmmc.c
  mmc_spi: convert timeout handling to jiffies and avoid busy waiting
  mmc_spi: do not check CID and CSD blocks with CRC16
  omap_hsmmc: Flush posted write to IRQ
  New mail address for Pierre Ossman
  imxmmc: move RSSR BLR
  imxmmc: init-exit rework
  mmc: Accept EXT_CSD rev 1.3 since it is backwards compatible with 1.2

tty: MAX3100

Thou shalt remember to use 'git add' or errors shall be visited on your
downloads and there shall be wrath from on list and much gnashing of teeth.

Thou shalt remember to use git status or there shall be catcalls and much
embarrassment shall come to pass.

Signed-off-by: Alan "I'm hiding" Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[IA64] wire up preadv/pwritev system calls

Gerd Hoffmann added these to Linux. Let ia64 use them.

Signed-off-by: Tony Luck <tony.luck@intel.com>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6

[ARM] 5446/1: ohci-at91: Limit vbus_pin assignment to the size of the array

Currently, the vbus_pin assignment loop is limited by the value of the "ports"
variable in the platform data. Now that the vbus_pin array is no longer
flexible, we can use its actual size.
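
A minimal userspace sketch of bounding the loop by the array's own size rather than by a field in the data. struct pdata_sketch and count_valid_pins() are hypothetical; ARRAY_SIZE is redefined locally in the usual way.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical platform data with a fixed-size pin array. */
struct pdata_sketch {
    unsigned char ports;
    int vbus_pin[2];
};

/* Count configured pins; the bound is the array size, not p->ports. */
static size_t count_valid_pins(const struct pdata_sketch *p)
{
    size_t i, n = 0;

    assert(p != NULL);
    for (i = 0; i < ARRAY_SIZE(p->vbus_pin); i++) {
        if (p->vbus_pin[i] <= 0)
            continue; /* slot not configured */
        n++;
    }
    return n;
}
```

Even if ports holds a value larger than the array, the loop cannot index past vbus_pin.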

Signed-off-by: Justin Waters <justin.waters@timesys.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

[ARM] 5445/1: AT91: Remove flexible array from USBH platform data

The flexible array in the USBH platform data is not safe to copy.  The
compiler will not allocate any extra memory for the non-init platform
data structure (in the *_devices.c files) since it isn't given any
defaults at compile time.  When the probe function attempts to address
that array, it will actually attempt to access data in an adjacent
structure.

Since there are currently no (known) implementations of the at91 USBH
IP with more than 2 vbus pins, I am capping the value at 2.  If somebody
tries to assign more, then the compiler will produce a warning.
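
The copying problem can be seen from the structure sizes alone. In this sketch both struct names are hypothetical and the field types are an assumption; the point is only that sizeof() does not cover a flexible array member, so a structure copy carries none of its elements.

```c
#include <assert.h>
#include <stddef.h>

/* Flexible array member: contributes no storage to sizeof(). */
struct usbh_flexible_sketch {
    unsigned char ports;
    unsigned char vbus_pin[];
};

/* Fixed array capped at 2 entries: the storage is part of the struct. */
struct usbh_fixed_sketch {
    unsigned char ports;
    unsigned char vbus_pin[2];
};

static size_t flexible_copy_size(void)
{
    return sizeof(struct usbh_flexible_sketch);
}

static size_t fixed_copy_size(void)
{
    return sizeof(struct usbh_fixed_sketch);
}

/* Plain structure assignment copies the pins of the fixed variant. */
static struct usbh_fixed_sketch copy_fixed(const struct usbh_fixed_sketch *src)
{
    assert(src != NULL);
    return *src;
}
```

With the fixed variant, an initializer that lists more than two pins is diagnosed by the compiler, which matches the behaviour described above.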

Signed-off-by: Justin Waters <justin.waters@timesys.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

[ARM] 5447/1: Add SZ_32K

This adds a SZ_32K define to the available sizes. I need it for
upcoming platform support.

Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

mmc_spi: support for non-byte-aligned cards

A very large subset of SD cards in the market send their
responses and data non-byte-aligned. So add logic to the
mmc spi driver to handle this mess.

Signed-off-by: Wolfgang Muees <wolfgang.mues@auerswald.de>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>

omap_hsmmc: Do not expect cmd/data to be non-null when CC/TC occurs

With a spurious interrupt, cmd can be null even when we have CC
set in irq status.

Fixes: NB#106295 - prevent potential kernel crash in the MMC driver
Signed-off-by: Jarkko Lavinen <jarkko.lavinen@nokia.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>

mmc: Fix compile for omap_hsmmc.c

This fixes the issue noted by Russell King:

drivers/mmc/host/omap_hsmmc.c: In function 'mmc_omap_xfer_done':
drivers/mmc/host/omap_hsmmc.c:301: error: implicit declaration of function 'mmc_omap_fclk_lazy_disable'

This got broken by 4a694dc915c9a223044ce21fc0d99e63facd1d64.

Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>

mmc_spi: convert timeout handling to jiffies and avoid busy waiting

SD/MMC card timeouts can be very long, so avoid busy-waiting and
let the scheduler run instead. Calculate all timeouts in jiffies,
because this tells us when to involve the scheduler.
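
The general idea, polling against a deadline while giving up the CPU between polls, can be sketched in userspace. This is not the mmc_spi code: it uses clock_gettime() and sched_yield() in place of jiffies and the kernel scheduler, and now_ms(), wait_ready_sketch() and countdown_ready() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <time.h>

/* Monotonic time in milliseconds. */
static long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000LL + ts.tv_nsec / 1000000;
}

/*
 * Poll ready(ctx) until it returns nonzero or timeout_ms elapses.
 * Returns 0 on success and -1 on timeout. The CPU is yielded between
 * polls instead of spinning.
 */
static int wait_ready_sketch(int (*ready)(void *), void *ctx,
                             long long timeout_ms)
{
    long long start = now_ms();

    assert(ready != NULL);
    for (;;) {
        if (ready(ctx))
            return 0;
        if (now_ms() - start >= timeout_ms)
            return -1;
        sched_yield(); /* let other runnable tasks make progress */
    }
}

/* Example condition: becomes ready after *ctx polls. */
static int countdown_ready(void *ctx)
{
    int *remaining = ctx;

    return --(*remaining) <= 0;
}
```

Keeping the deadline in one time unit throughout makes the elapsed-time comparison a single subtraction.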

Signed-off-by: Wolfgang Muees <wolfgang.mues@auerswald.de>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>

mmc_spi: do not check CID and CSD blocks with CRC16

Some cards are not able to calculate a valid CRC16 value
for CID and CSD reads (CRC for 512 byte data blocks is OK).
By moving the CRC enable after the read of CID and CSD, these
cards can be used. This patch was tested with a faulty 8 GByte
takeMS Class 6 SDHC card. This patch was suggested by
Pierre Ossman.

Signed-off-by: Wolfgang Muees <wolfgang.mues@auerswald.de>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>

omap_hsmmc: Flush posted write to IRQ

Spurious IRQs seen on MMC after 2.6.29. Flush posted write in IRQ
handler.

The interrupt line is released by clearing the error status bits
in the MMCHS_STAT register, which must occur before the interrupt
handler returns to avoid unwanted irqs. Hence the need to flush
the posted write.

Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Acked-by: Tony Lindgen <tony@atomide.com>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>

New mail address for Pierre Ossman

Signed-off-by: Pierre Ossman <pierre@ossman.eu>

imxmmc: move RSSR BLR

DMA request source (RSSR) needs to be set only once (in probe).
DMA burst length (BLR) needs to be set only in set_ios().

This cleans up imxmci_setup_data() and should make it a little
bit faster :)

Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>

imxmmc: init-exit rework

Add __init __exit for appropriate probe and remove functions.
Convert to platform_driver_probe().

Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>

mmc: Accept EXT_CSD rev 1.3 since it is backwards compatible with 1.2

Signed-off-by: Jarkko Lavinen <jarkko.lavinen@nokia.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>

ptrace: some checkpatch fixes

This fixes all the checkpatch --file complaints about kernel/ptrace.c
and also removes an unused #include. I've verified that there are no
changes to the compiled code on x86_64.

Signed-off-by: Roland McGrath <roland@redhat.com>
[ Removed the parts that just split a line - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nommu: fix typo vma->pg_off to vma->vm_pgoff

6260a4b0521a41189b2c2a8119096c1e21dbdf2c ("/proc/pid/maps: don't show
pgoff of pure ANON VMAs") had a typo.

fs/proc/task_nommu.c:138: error: 'struct vm_area_struct' has no member named 'pg_off'
distcc[21484] ERROR: compile fs/proc/task_nommu.c on sprygo/32 failed

Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

befs: fix build on parisc

fs/befs/super.c:85: error: 'PAGE_SIZE' undeclared

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ext3: Try to avoid starting a transaction in writepage for data=writeback

This does the same as commit 9e80d407736161d9b8b0c5a0d44f786e44c322ea
(avoid starting a transaction when no block allocation is needed)
but for data=writeback mode of ext3. We also cleanup the data=ordered
case a bit to stick to coding style...

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

block_write_full_page: switch synchronous writes to use WRITE_SYNC_PLUG

Now that we have a distinction between WRITE_SYNC and WRITE_SYNC_PLUG,
use WRITE_SYNC_PLUG in __block_write_full_page() to avoid unplugging
the block device I/O queue between each page that gets flushed out.

Otherwise, when we run sync() or fsync() and we need to write out a
large number of pages, the block device queue will get unplugged
for every page that is flushed out, which will be a pretty
serious performance regression caused by commit a64c8610.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

x86: smarten /proc/interrupts output for new counters

/proc/interrupts in the tip tree now has new counters:

  PLT: Platform interrupts

The output format change made by commit:

  commit 7a81d9a7da03d2f27840d659f97ef140d032f609
  x86: smarten /proc/interrupts output

should be applied to these new counters too.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49C98DEA.8060208@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Merge commit 'v2.6.30-rc1' into x86/urgent

Merge reason: fix to be queued up depends on upstream facilities

Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: DMI match for the Dell DXP061 as it needs BIOS reboot

Closes http://bugzilla.kernel.org/show_bug.cgi?12901

Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
LKML-Reference: <20090326204524.4454.8776.stgit@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

posix-timers: fix RLIMIT_CPU && setitimer(CPUCLOCK_PROF)

update_rlimit_cpu() tries to optimize out set_process_cpu_timer() when
we already have a CPUCLOCK_PROF timer that should expire first. But it
uses cputime_lt() instead of cputime_gt().

Test case:

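#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

int main(void)
{
    /* Profiling timer that expires far later than the CPU limit. */
    struct itimerval it = {
        .it_value = { .tv_sec = 1000 },
    };

    assert(!setitimer(ITIMER_PROF, &it, NULL));

    /* One second of CPU time, soft and hard. */
    struct rlimit rl = {
        .rlim_cur = 1,
        .rlim_max = 1,
    };

    assert(!setrlimit(RLIMIT_CPU, &rl));

    /* Burn CPU until the limit kills us. */
    for (;;)
        ;

    return 0;
}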

Without this patch, the task is not killed as RLIMIT_CPU demands.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Peter Lojkin <ia6432@inbox.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: stable@kernel.org
LKML-Reference: <20090327000610.GA10108@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

posix-timers: fix RLIMIT_CPU && fork()

See http://bugzilla.kernel.org/show_bug.cgi?id=12911

copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost" because
posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0, and thus
fastpath_timer_check() returns false unless we have other expired cpu timers.

Change copy_signal() to set cputime_expires.prof_exp if we have RLIMIT_CPU.
Also, set cputimer.running = 1 in that case. This is not strictly necessary,
but imho makes sense.

Reported-by: Peter Lojkin <ia6432@inbox.ru>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Peter Lojkin <ia6432@inbox.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: stable@kernel.org
LKML-Reference: <20090327000607.GA10104@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: make 64 bit to use default_inquire_remote_apic

Impact: restore old behavior

for flat and phys_flat

Signed-off-by: Yinhai Lu <yinghai@kernel.org>
LKML-Reference: <49DCBBF1.8080903@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: refresh MAINTAINERS entry

Peter has become a co-maintainer of the scheduler during the last year,
and Robert has become inactive - update the MAINTAINERS entry.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Merge commit 'v2.6.30-rc1' into sched/urgent

Merge reason: update to latest upstream to queue up fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>