Where there are lots of errors related to python methods receiving
'char *' for things like file open mode, which break the build, also
disable strict aliasing and fixup some other warnings. Now builds on
both 32-bit and 64-bit fedora systems.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
%td is for ptrdiff_t, avoiding this warning on 32-bit:
cc1: warnings being treated as errors
builtin-probe.c: In function ‘opt_set_filter’:
builtin-probe.c:176:4: error: format ‘%ld’ expects type ‘long int’, but
argument 3 has type ‘int’
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf evlist: Store pointer to the cpu and thread maps
So that we don't have to pass it around to the several methods that
needs it, simplifying usage.
There is one case where we don't have the thread/cpu map in advance,
which is in the parsing routines used by top, stat, record, that we have
to wait till all options are parsed to know if a cpu or thread list was
passed to then create those maps.
For that case consolidate the cpu and thread map creation via
perf_evlist__create_maps() out of the code in top and record, while also
providing a perf_evlist__set_maps() for cases where multiple evlists
share maps or for when maps that represent CPU sockets, for instance,
get crafted out of topology information or subsets of threads in a
particular application are to be monitored, providing more granularity
in specifying which cpus and threads to monitor.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
They were on evsel.c because they came from refactoring existing evsel
methods, so, to make reviewing the changes easier, I kept it there, now
its a plain move.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
First clarifying that this kind of binding is not a replacement or an
equivalent to the 'perf script' way of using python with perf.
The 'perf script' way is to process events and look at a given script
for some python function that matches the events to pass each event for
processing.
This is a python module, i.e. everything is driven from the python
script, that merely uses "import perf" or "from perf import".
perf script is focused on tracepoints, this binding is focused on profiling as
an initial target. More work is needed to make available tracepoint specific
variables as event variables accessible via this binding.
There is one example of such usage model, in
tools/perf/python/twatch.py, a tool to watch "cycles" events together
with task (fork, exit) and comm perf events.
For now, due to me not being able to grok how python distutils cope with
building C extensions outside the sources dir the install target just
builds it, I'm using it as:
The first 8 mmap events in this 8 way machine are a mistery that is still being
investigated.
More of the tools/perf/util/ APIs will be exposed via this python binding as
the need arises. For now the focus is on creating events and processing them,
symbol resolution is an obvious next step, with tracepoint variables as a close
second step.
Cc: Clark Williams <williams@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf evlist: Support non overwrite mode in perf_evlist__read_on_cpu
I.e. stash the overwrite mode in struct perf_evlist and act accordingly
in perf_evlist__read_on_cpu, not checking for overwrites and touching
the tail after consuming one event, like perf record does, for instance.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf events: Account PERF_RECORD_LOST events in event__process
Right now this function is only used by perf top, that uses PROT_READ
only, i.e. overwrite mode, so no PERF_RECORD_LOST events are generated,
but don't forget those events.
The patch that moved this out of perf top was made so that this routine
could be used by 'perf probe' in the uprobes patchset, so perhaps there
they need to check for LOST events and warn the user, as will be done in
the following patches that will switch 'perf top' to non overwrite mode
(mmap with PROT_READ|PROT_WRITE).
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As we open the mmap with (PROT_READ | PROT_WRITE), signalling the kernel
with perf_mmap__write_tail() when consuming data, so the kernel will not
overwrite.
Suggested-by: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Han Pingtian [Mon, 24 Jan 2011 23:39:00 +0000 (07:39 +0800)]
perf test: Fix return values checking
Fixing some cut'n'paste mistakes.
LKML-Reference: <20110124233900.GA3443@epc900.nay.redhat.com> Signed-off-by: Han Pingtian <phan@redhat.com>
[ committer note: I had already removed the CPU_ALLOC calls ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masami Hiramatsu [Thu, 20 Jan 2011 14:15:39 +0000 (23:15 +0900)]
perf probe: Add variable filter support
Add filters support for available variable list.
Default filter is "!__k???tab_*&!__crc_*" for filtering out
automatically generated symbols.
The format of filter rule is "[!]GLOBPATTERN", so you can use wild
cards. If the filter rule starts with '!', matched variables are filter
out.
e.g.:
# perf probe -V schedule --externs --filter=cpu*
Available variables at schedule
@<schedule+0>
cpumask_var_t cpu_callout_mask
cpumask_var_t cpu_core_map
cpumask_var_t cpu_isolated_map
cpumask_var_t cpu_sibling_map
int cpu_number
long unsigned int* cpu_bit_bitmap
...
Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Chase Douglas <chase.douglas@canonical.com> Cc: Franck Bui-Huu <fbuihuu@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20110120141539.25915.43401.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
[ committer note: Removed the elf.h include as it was fixed up in e80711c] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Out of the {con,des}structor, as in interpreted language bindings we will
need to go back from the wrapper object to the real thing. In that case
using container_of will save us to have an extra pointer in the perf_evsel
struct.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To avoid linking more stuff in the python binding I'm working on, future
csets will make the sample type be taken from the evsel itself, but for
that we need to first have one file per cpu and per sample_type, not a
single perf.data file.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Franck Bui-Huu [Sun, 16 Jan 2011 16:14:45 +0000 (17:14 +0100)]
perf record: auto detect when stdout is a pipe
This patch gives the ability to 'perf record' to detect when its stdout
has been redirected to a pipe. There's now no more need to add '-o -'
switch in this case.
However '-o <path>' option has always precedence, that is if specified
and stdout has been connected via a pipe then the output will go into
the specified output.
The callchains are fed with an array of a fixed size.
As a result we iterate over each callchains three times:
- 1st to resolve symbols
- 2nd to filter out context boundaries
- 3rd for the insertion into the tree
This also involves some pairs of memory allocation/deallocation
everytime we insert a callchain, for the filtered out array of
addresses and for the array of symbols that comes along.
Instead, feed the callchains through a linked list with persistent
allocations. It brings several pros like:
- Merge the 1st and 2nd iterations in one. That was possible before
but in a way that would involve allocating an array slightly taller
than necessary because we don't know in advance the number of context
boundaries to filter out.
- Much lesser allocations/deallocations. The linked list keeps
persistent empty entries for the next usages and is extendable at
will.
- Makes it easier for multiple sources of callchains to feed a
stacktrace together. This is deemed to pave the way for cfi based
callchains wherein traditional frame pointer based kernel
stacktraces will precede cfi based user ones, producing an overall
callchain which size is hardly predictable. This requirement
makes the static array obsolete and makes a linked list based
iterator a much more flexible fit.
Basic testing on a big perf file containing callchains (~ 176 MB)
has shown a throughput gain of about 11% with perf report.
Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294977121-5700-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This test will generate random numbers of calls to some getpid syscalls,
then establish an mmap for a group of events that are created to monitor
these syscalls.
It will receive the events, using mmap, use its PERF_SAMPLE_ID generated
sample.id field to map back to its respective perf_evsel instance.
Then it checks if the number of syscalls reported as perf events by the
kernel corresponds to the number of syscalls made.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is more stuff that can go to the perf_ev{sel,list} layer, like
detecting if sample_id_all is available, etc, but lets try using this in
'perf test' first.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adopting the new model used in 'perf record', where we don't have a map
per thread per cpu, instead we have an mmap per cpu, established on the
first fd for that cpu and ask the kernel using the
PERF_EVENT_IOC_SET_OUTPUT ioctl to send events for the other fds on that
cpu for the one with the mmap.
The methods moved from perf_evsel to perf_evlist, but for easing review
they were modified in place, in evsel.c, the next patch will move the
migrated methods to evlist.c.
With this 'perf top' now uses the same mmap model used by 'perf record'
and the next patches will make 'perf record' use these new routines,
establishing a common codebase for both tools.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that it handles group_fd and inherit we can use it, sharing it with
stat.
Next step: 'perf record' should use, then move the mmap_array out of
->priv and into perf_evsel, with top and record sharing this, and at the
same time, write a 'perf test' stress test.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[acme@localhost linux]$ make O=~acme/git/build/perf -C tools/perf
make: Entering directory `/home/acme/git/linux/tools/perf'
Makefile:526: No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev
Makefile:582: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
CC /home/acme/git/build/perf/builtin-annotate.o
In file included from builtin-annotate.c:23:
util/parse-events.h:26: warning: declaration of 'evsel_list' shadows a global declaration
util/parse-events.h:12: warning: shadowed declaration is here
make: *** [/home/acme/git/build/perf/builtin-annotate.o] Error 1
make: Leaving directory `/home/acme/git/linux/tools/perf'
[acme@localhost linux]$ gcc --version | head -1
gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11)
[acme@localhost linux]$
Fix it by renaming the parameter to evlist.
Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need the definiton for __always_inline in bitops.h to fix the build
on distros where it isn't available or compiler.h doesn't get included
indirectly.
One of the fixes needed to build perf on RHEL4 systems, for instance.
Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using %L[uxd] has issues in some architectures, like on ppc64. Fix it
by making our 64 bit integers typedefs of stdint.h types and using
PRI[ux]64 like, for instance, git does.
Reported by Denis Kirjanov that provided a patch for one case, I went
and changed all cases.
Reported-by: Denis Kirjanov <dkirjanov@kernel.org> Tested-by: Denis Kirjanov <dkirjanov@kernel.org>
LKML-Reference: <20110120093246.GA8031@hera.kernel.org> Cc: Denis Kirjanov <dkirjanov@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pingtian Han <phan@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Where we don't have CPU_ALLOC & friends. As the tools are being used in older
distros where the only allowed change are to replace the kernel, like RHEL4 and
5.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In theory, almost every user of task->child->perf_event_ctxp[]
is wrong. find_get_context() can install the new context at any
moment, we need read_barrier_depends().
dbe08d82ce3967ccdf459f7951d02589cf967300 "perf: Fix
find_get_context() vs perf_event_exit_task() race" added
rcu_dereference() into perf_event_exit_task_context() to make
the precedent, but this makes __rcu_dereference_check() unhappy.
Use rcu_dereference_raw() to shut up the warning.
Han Pingtian [Thu, 20 Jan 2011 11:47:07 +0000 (19:47 +0800)]
perf test: Use cpu_map->[cpu] when setting affinity
When some of CPUs are offline:
# cat /sys/devices/system/cpu/online
0,6-31
perf test will fail on #3 testcase:
3: detect open syscall event on all cpus:
--- start ---
perf_evsel__read_on_cpu: expected to intercept 111 calls on cpu 0, got 681
perf_evsel__read_on_cpu: expected to intercept 112 calls on cpu 1, got 117
perf_evsel__read_on_cpu: expected to intercept 113 calls on cpu 2, got 118
perf_evsel__read_on_cpu: expected to intercept 114 calls on cpu 3, got 119
perf_evsel__read_on_cpu: expected to intercept 115 calls on cpu 4, got 120
perf_evsel__read_on_cpu: expected to intercept 116 calls on cpu 5, got 121
perf_evsel__read_on_cpu: expected to intercept 117 calls on cpu 6, got 122
perf_evsel__read_on_cpu: expected to intercept 118 calls on cpu 7, got 123
perf_evsel__read_on_cpu: expected to intercept 119 calls on cpu 8, got 124
perf_evsel__read_on_cpu: expected to intercept 120 calls on cpu 9, got 125
perf_evsel__read_on_cpu: expected to intercept 121 calls on cpu 10, got 126
....
This patch try to use 'cpus->map[cpu]' when setting cpu affinity, and
will check the return code of sched_setaffinity()
LKML-Reference: <20110120114707.GA11781@hpt.nay.redhat.com> Signed-off-by: Han Pingtian <phan@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In ARM's Thumb mode the bottom bit of the symbol address is set to mark
the function as Thumb; the instructions are in reality 2 or 4 byte on 2
byte alignments, and when the +1 address is used in annotate it causes
objdump to disassemble invalid instructions.
The patch removes that bottom bit during symbol loading.
Many thinks to Dave Martin for comments on an initial version of the
patch.
(For reference this corresponds to this bug
https://bugs.launchpad.net/linux-linaro/+bug/677547 )
Cc: Ingo Molnar <mingo@elte.hu> Cc: Dave Martin <dave.martin@linaro.org>
LKML-Reference: <20110121163922.GA31398@davesworkthinkpad> Signed-off-by: Dr. David Alan Gilbert <david.gilbert@linaro.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
But because the deadlock would be cpuhotplug (cpu-event) vs fork
(task-event) it cannot, in fact, happen. We can annotate this by giving the
perf_event_context used for the cpuctx a different lock class from those
used by tasks.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
perf_event_init_task() should clear child->perf_event_ctxp[]
before anything else. Otherwise, if
perf_event_init_context(perf_hw_context) fails,
perf_event_free_task() can free perf_event_ctxp[perf_sw_context]
copied from parent->perf_event_ctxp[] by dup_task_struct().
Also move the initialization of perf_event_mutex and
perf_event_list from perf_event_init_context() to
perf_event_init_context().
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Roland McGrath <roland@redhat.com>
LKML-Reference: <20110119182228.GC12183@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Oleg Nesterov [Wed, 19 Jan 2011 18:22:07 +0000 (19:22 +0100)]
perf: Fix find_get_context() vs perf_event_exit_task() race
find_get_context() must not install the new perf_event_context
if the task has already passed perf_event_exit_task().
If nothing else, this means the memory leak. Initially
ctx->refcount == 2, it is supposed that
perf_event_exit_task_context() should participate and do the
necessary put_ctx().
find_lively_task_by_vpid() checks PF_EXITING but this buys
nothing, by the time we call find_get_context() this task can be
already dead. To the point, cmpxchg() can succeed when the task
has already done the last schedule().
Change find_get_context() to populate task->perf_event_ctxp[]
under task->perf_event_mutex, this way we can trust PF_EXITING
because perf_event_exit_task() takes the same mutex.
Also, change perf_event_exit_task_context() to use
rcu_dereference(). Probably this is not strictly needed, but
with or without this change find_get_context() can race with
setup_new_exec()->perf_event_exit_task(), rcu_dereference()
looks better.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Roland McGrath <roland@redhat.com>
LKML-Reference: <20110119182207.GB12183@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Tue, 18 Jan 2011 22:29:37 +0000 (14:29 -0800)]
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf: Validate cpu early in perf_event_alloc()
perf: Find_get_context: fix the per-cpu-counter check
perf: Fix contexted inheritance
Linus Torvalds [Tue, 18 Jan 2011 22:29:21 +0000 (14:29 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Clear irqstack thread_info
x86: Make relocatable kernel work with new binutils
Linus Torvalds [Tue, 18 Jan 2011 22:28:48 +0000 (14:28 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (26 commits)
MIPS: Malta: enable Cirrus FB console
MIPS: add CONFIG_VIRTUALIZATION for virtio support
MIPS: Implement __read_mostly
MIPS: ath79: add common WMAC device for AR913X based boards
MIPS: ath79: Add initial support for the Atheros AP81 reference board
MIPS: ath79: add common SPI controller device
SPI: Add SPI controller driver for the Atheros AR71XX/AR724X/AR913X SoCs
MIPS: ath79: add common GPIO buttons device
MIPS: ath79: add common watchdog device
MIPS: ath79: add common GPIO LEDs device
MIPS: ath79: add initial support for the Atheros PB44 reference board
MIPS: ath79: utilize the MIPS multi-machine support
MIPS: ath79: add GPIOLIB support
MIPS: Add initial support for the Atheros AR71XX/AR724X/AR931X SoCs
MIPS: jump label: Add MIPS support.
MIPS: Use WARN() in uasm for better diagnostics.
MIPS: Optimize TLB handlers for Octeon CPUs
MIPS: Add LDX and LWX instructions to uasm.
MIPS: Use BBIT instructions in TLB handlers
MIPS: Declare uasm bbit0 and bbit1 functions.
...
Oleg Nesterov [Tue, 18 Jan 2011 16:10:08 +0000 (17:10 +0100)]
perf: Find_get_context: fix the per-cpu-counter check
If task == NULL, find_get_context() should always check that cpu
is correct.
Afaics, the bug was introduced by 38a81da2 "perf events: Clean
up pid passing", but even before that commit "&& cpu != -1" was
not exactly right, -ESRCH from find_task_by_vpid() is not
accurate.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Roland McGrath <roland@redhat.com> Cc: gregkh@suse.de Cc: stable@kernel.org
LKML-Reference: <20110118161008.GB693@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Aurelien Jarno [Tue, 18 Jan 2011 11:20:45 +0000 (12:20 +0100)]
MIPS: Malta: enable Cirrus FB console
While most users of a physical Malta board are using the serial port
as the console, a lot of QEMU users would prefer to interact with a
graphical console. Enable the Cirrus FB support in the Malta default
configuration to make that possible. Note that the default console will
still be the serial port, users have to pass "console=tty0" to the
kernel to use the Cirrus FB.
David Daney [Thu, 14 Oct 2010 19:36:49 +0000 (12:36 -0700)]
MIPS: Implement __read_mostly
Just do what everyone else is doing by placing __read_mostly things in
the .data.read_mostly section.
mips_io_port_base can not be read-only (const) and writable
(__read_mostly) at the same time. One of them has to go, so I chose
to eliminate the __read_mostly. It will still get stuck in a portion
of memory that is not adjacent to things that are written, and thus
not be on a dirty cache line, for whatever that is worth.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/1702/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Gabor Juhos [Tue, 4 Jan 2011 20:28:29 +0000 (21:28 +0100)]
MIPS: ath79: add common WMAC device for AR913X based boards
Add common platform_device and helper code to make the registration
of the built-in wireless MAC easier on the Atheros AR9130/AR9132
based boards. Also register the WMAC device on the AR81 board.
Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Cc: linux-mips@linux-mips.org Cc: Imre Kaloz <kaloz@openwrt.org>, Cc: Luis R. Rodriguez <lrodriguez@atheros.com> Cc: Cliff Holden <Cliff.Holden@Atheros.com> Cc: Kathy Giori <Kathy.Giori@Atheros.com>
Patchwork: https://patchwork.linux-mips.org/patch/1962/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Gabor Juhos [Tue, 4 Jan 2011 20:28:23 +0000 (21:28 +0100)]
MIPS: ath79: add common SPI controller device
Several boards are using the built-in SPI controller of the
AR71XX/AR724X/AR913X SoCs. This patch adds common platform_device
and helper code to register it. Additionally, the patch registers
the SPI bus on the PB44 board.
Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Cc: linux-mips@linux-mips.org Cc: Imre Kaloz <kaloz@openwrt.org> Cc: Luis R. Rodriguez <lrodriguez@atheros.com> Cc: Cliff Holden <Cliff.Holden@Atheros.com> Cc: Kathy Giori <Kathy.Giori@Atheros.com>
Patchwork: https://patchwork.linux-mips.org/patch/1956/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The current patch contains minimal support only, but the resulting
kernel can boot into user-space with using of an initramfs image on
various boards which are using these SoCs. Support for more built-in
devices and individual boards will be implemented in further patches.
Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: Imre Kaloz <kaloz@openwrt.org> Cc: linux-mips@linux-mips.org Cc: Luis R. Rodriguez <lrodriguez@atheros.com> Cc: Cliff Holden <Cliff.Holden@Atheros.com> Cc: Kathy Giori <Kathy.Giori@Atheros.com>
Patchwork: https://patchwork.linux-mips.org/patch/1947/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Tue, 28 Dec 2010 21:26:23 +0000 (13:26 -0800)]
MIPS: jump label: Add MIPS support.
In order not to be left behind, we add jump label support for MIPS.
Tested on 64-bit big endian (Octeon), and 32-bit little endian
(malta/qemu).
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@redhat.com>
Patchwork: https://patchwork.linux-mips.org/patch/1923/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Tue, 28 Dec 2010 02:18:29 +0000 (18:18 -0800)]
MIPS: Use WARN() in uasm for better diagnostics.
On the off chance that uasm ever warns about overflow, there is no way
to know what the offending instruction is.
Change the printks to WARNs, so we can get a nice stack trace. It has
the added benefit of being much more noticeable than the short single
line warning message, so is less likely to be ignored.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/1905/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Mon, 20 Dec 2010 23:54:50 +0000 (15:54 -0800)]
MIPS: Use BBIT instructions in TLB handlers
If the CPU supports BBIT0 and BBIT1, use them in TLB handlers as they
are more efficient than an AND followed by an branch and then
restoring the clobbered register.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/1873/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Gabor Juhos [Tue, 23 Nov 2010 15:06:25 +0000 (16:06 +0100)]
MIPS: Add generic support for multiple machines within a single kernel
This patch adds a generic solution to support multiple machines based on
a given SoC within a single kernel image. It is implemented already for
several other architectures but MIPS has no generic support for that yet.
[Ralf: This competes with DT but DT is a much more complex solution and this
code has been used by OpenWRT for a long time so for now DT is a bad reason
to stop the merge but longer term this should be migrated to DT.]