Namhyung Kim [Wed, 17 Jul 2013 08:08:15 +0000 (17:08 +0900)]
perf symbols: Do not apply symfs for an absolute vmlinux path
If an user gives both of --symfs and --vmlinux option, the vmlinux will
be searched under the symfs directory. This is somewhat confusing since
vmlinux often lives in kernel build directory or somewhere other than
user space binaries.
So it'd be better not adding symfs prefix for a vmlinux if it has an
absolute pathname.
Reported-by: Kwanghyun Yoo <ykh815.yoo@lge.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1374048495-3643-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Robert Richter [Tue, 16 Jul 2013 14:50:36 +0000 (16:50 +0200)]
perf tools: Fix 'make tools/perf'
Boris just raised another variant of building perf tools which is
broken:
$ make tools/perf
...
LINK /home/robert/cx/linux/tools/perf/perf
gcc: error: ../linux/tools/lib/lk/liblk.a: No such file or directory
The variant wasn't considered by:
107de37 perf tools: Fix build errors with O and DESTDIR make vars set
There are other variant of building perf too:
$ make -C tools perf
$ make -C tools/perf
Plus variants with O= and DESTDIR set.
This patch fixes the above and was tested with the following:
$ make O=... DESTDIR=... tools/perf
$ make O=... DESTDIR=... -C tools/ perf
$ make O=... DESTDIR=... -C tools/perf
$ make tools/perf
$ make -C tools/ perf
$ make -C tools/perf
Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Robert Richter <robert.richter@linaro.org> Signed-off-by: Robert Richter <rric@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-tip-commits@vger.kernel.org Link: http://lkml.kernel.org/r/20130716145036.GH8731@rric.localhost Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 11 Jul 2013 15:28:31 +0000 (17:28 +0200)]
perf tools: Remove event types from perf data file
Removing event types data storing/reading to/from perf data file as it's
no longer needed. The only user of this data 'perf timechart' was
switched to use tracepoints handler callbacks.
The event_types offset and size stay in the perf data file header but
are ignored from now on.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1373556513-3000-4-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 11 Jul 2013 15:28:30 +0000 (17:28 +0200)]
perf timechart: Remove event types framework only user
The only user of the event types data is 'perf timechart' command and
uses this info to identify proper tracepoints based on its name.
Switching this code to use tracepoint callbacks handlers same as another
commands like builtin-{kmem,lock,sched}.c using the
perf_session__set_tracepoints_handlers function.
This way we get rid of the only event types user and can remove them
completely in next patches.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Renninger <trenn@suse.de> Link: http://lkml.kernel.org/r/1373556513-3000-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Sat, 1 Dec 2012 21:00:00 +0000 (22:00 +0100)]
perf diff: Change diff command to work over multiple data files
Adding diff command the flexibility to specify multiple data files on
input. If not input file is given the standard behaviour stands and diff
inspects 'perf.data' and 'perf.data.old' files.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-8j3xer54ltvs76t0fh01gcvu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Greg Price [Fri, 7 Dec 2012 05:48:05 +0000 (21:48 -0800)]
perf report/top: Add option to collapse undesired parts of call graph
For example, in an application with an expensive function implemented
with deeply nested recursive calls, the default call-graph presentation
is dominated by the different callchains within that function. By
ignoring these callees, we can collect the callchains leading into the
function and compactly identify what to blame for expensive calls.
For example, in this report the callers of garbage_collect() are
scattered across the tree:
Adrian Hunter [Thu, 4 Jul 2013 13:20:27 +0000 (16:20 +0300)]
perf tools: Validate perf event header size
The 'size' variable includes the header so must be at least
'sizeof(struct perf_event_header)'. Error out immediately if that is
not the case. Also don't byte-swap the header until it is actually
"fetched" from the mmap region.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1372944040-32690-9-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Thu, 4 Jul 2013 13:20:21 +0000 (16:20 +0300)]
perf tools: Fix missing tool parameter
The 'inject' command expects to get a reference to 'struct perf_inject'
from its 'tool' member. For that to work, 'tool' needs to be a
parameter of all tool callbacks. Make it so.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1372944040-32690-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 09:22:17 +0000 (18:22 +0900)]
perf gtk/hists: Set rules hint for the hist browser
The 'rules' means that every second line of the tree view has a shaded
background, which makes it easier to see which cell belongs to which
row in the tree view. It can be useful for a tree view that has a lot
of rows.
Reviewed-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1370337737-30812-7-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 09:22:16 +0000 (18:22 +0900)]
perf gtk/hists: Add a double-click handler for callchains
If callchain is displayed, add "row-activated" signal handler for
handling double-click or pressing ENTER key action.
Reviewed-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1370337737-30812-6-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 09:22:15 +0000 (18:22 +0900)]
perf gtk/hists: Make column headers resizable
Sometimes it's annoying to see when some symbols have very wierd long
names. So it might be a good idea to make column size changable.
Reviewed-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1370337737-30812-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 09:22:14 +0000 (18:22 +0900)]
perf gtk/hists: Display callchain overhead also
Display callchain percent value in the overhead column.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1370337737-30812-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 09:22:13 +0000 (18:22 +0900)]
perf gtk/hists: Add support for callchains
Display callchain information in the symbol column. It's only enabled
when recorded with -g and has symbol sort key.
Reviewed-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1370337737-30812-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 09:22:12 +0000 (18:22 +0900)]
perf gtk/hists: Use GtkTreeStore instead of GtkListStore
The GtkTreeStore can save items in a tree-like way. This is a
preparation for supporting callgraphs in the hist browser.
Reviewed-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1370337737-30812-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 05:46:19 +0000 (14:46 +0900)]
perf sched: Move struct perf_sched definition out of cmd_sched()
For some reason it consumed quite amount of compile time when declared
as local variable, and it disappeared when moved out of the function.
Moving other variables/tables didn't help.
On my system this single-file-change build time reduced from 11s to 3s.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1370324779-16921-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 05:20:29 +0000 (14:20 +0900)]
perf util: Rename read_*() functions in trace-event-info.c
It's confusing to have same name for two difference functions which does
something opposite way. Since what they do in this file is read *AND*
writing some of tracing metadata files, rename them to record_*() looks
better to me.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1370323231-14022-15-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 05:20:26 +0000 (14:20 +0900)]
perf util: Parse header_page to get proper long size
The header_page file describes the format of the ring buffer page
which is used by ftrace (not perf). And size of "commit" field (I
guess it's older name was 'size') represents the real size of long
type used for kernel. So update the pevent's long size.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1370323231-14022-12-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 05:20:20 +0000 (14:20 +0900)]
tools lib traceevent: Add page_size field to pevent
The page size of traced system can be different than current system's
because the recorded data file might be analyzed in a different machine.
In this case we should use original page size of traced system when
accessing the data file, so this information needs to be saved.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1370323231-14022-6-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 05:20:19 +0000 (14:20 +0900)]
tools lib traceevent: Add trace_seq_reset()
Sometimes it'd be useful if existing trace_seq can be reused. But
currently it's impossible since there's no API to reset the trace_seq.
Let's add trace_seq_reset() for this case.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1370323231-14022-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 4 Jun 2013 05:20:18 +0000 (14:20 +0900)]
tools lib traceevent: Add const qualifier to string arguments
If pevent_register_event_handler() received a string literal as
@sys_name or @event_name parameter, it emitted a warning about const
qualifier removal. Since they're not modified in the function we can
make it have const qualifier.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1370323231-14022-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Tue, 2 Jul 2013 19:27:25 +0000 (13:27 -0600)]
perf parse events: Demystify memory allocations
List heads are currently allocated way down the function chain in
__add_event and add_tracepoint and then freed when the scanner code
calls parse_events_update_lists.
Be more explicit with where memory is allocated and who should free it. With
this patch the list_head is allocated in the scanner code and freed when the
scanner code calls parse_events_update_lists.
Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1372793245-4136-7-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Tue, 2 Jul 2013 19:27:21 +0000 (13:27 -0600)]
perf evlist: Fix use of uninitialized variable
Fixes valgrind complaint:
==1870== Syscall param write(buf) points to uninitialised byte(s)
==1870== at 0x4E3F5B0: __write_nocancel (in /lib64/libpthread-2.14.90.so)
==1870== by 0x449D7C: perf_evlist__start_workload (evlist.c:846)
==1870== by 0x427BC1: cmd_record (builtin-record.c:561)
==1870== by 0x419D72: run_builtin (perf.c:319)
==1870== by 0x4195F2: main (perf.c:376)
==1870== Address 0x7feffcdd7 is on thread 1's stack
Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1372793245-4136-3-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Runzhen Wang [Fri, 28 Jun 2013 08:14:57 +0000 (16:14 +0800)]
perf tools: Make Power7 events available for perf
Power7 supports over 530 different perf events but only a small subset
of these can be specified by name, for the remaining events, we must
specify them by their raw code:
perf stat -e r2003c <application>
This patch makes all the POWER7 events available in sysfs. So we can
instead specify these as:
perf stat -e 'cpu/PM_CMPLU_STALL_DFU/' <application>
where PM_CMPLU_STALL_DFU is the r2003c in previous example.
Before this patch is applied, the size of power7-pmu.o is:
$ size arch/powerpc/perf/power7-pmu.o
text data bss dec hex filename
3073 2720 0 5793 16a1 arch/powerpc/perf/power7-pmu.o
and after the patch is applied, it is:
$ size arch/powerpc/perf/power7-pmu.o
text data bss dec hex filename
15950 31112 0 47062 b7d6 arch/powerpc/perf/power7-pmu.o
For the run time overhead, I use two scripts, one is "event_name.sh",
which contains 50 event names, it looks like:
# ./perf record -e 'cpu/PM_CMPLU_STALL_DFU/' -e ..... /bin/sleep 1
the other one is named "event_code.sh" which use corresponding events
raw
code instead of events names, it looks like:
# ./perf record -e r2003c -e ...... /bin/sleep 1
below is the result.
Using events name:
[root@localhost perf]# time ./event_name.sh
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (~102 samples) ]
real 0m1.192s
user 0m0.028s
sys 0m0.106s
Using events raw code:
[root@localhost perf]# time ./event_code.sh
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data (~112 samples) ]
real 0m1.198s
user 0m0.028s
sys 0m0.105s
Signed-off-by: Runzhen Wang <runzhen@linux.vnet.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Cc: icycoder@gmail.com Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Runzhen Wang <runzhew@clemson.edu> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1372407297-6996-3-git-send-email-runzhen@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Tue, 25 Jun 2013 11:54:13 +0000 (13:54 +0200)]
perf report: Fix perf_session__delete removal
There's no point of having out_delete label with perf_session__delete
call within __cmd_report function, because it's called at the end of the
cmd_report function.
The speed up due to commenting out the perf_session__delete at the end
does not seem relevant anymore. Measured speedup for ~1GB data file with
222466 FORKS events is around 0.5%.
Namhyung Kim [Wed, 26 Jun 2013 07:14:05 +0000 (16:14 +0900)]
perf util: Use evsel->name to get tracepoint_paths
Most tracepoint events already have their system and event name in
->name field so that searching whole event tracing directory for each
evsel to match given id is suboptimal.
Factor out this routine into tracepoint_name_to_path(). In case of en
invalid name, it'll try to find path using id again.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1372230862-15861-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Robert Richter [Tue, 11 Jun 2013 15:29:18 +0000 (17:29 +0200)]
perf tools: Use default include path notation for libtraceevent headers
Header files of libtraceevent or no longer local headers. Thus, use
default path notation for them. Also removing extra traceevent include
path and instead handle this similar to liblk.
Signed-off-by: Robert Richter <robert.richter@linaro.org> Signed-off-by: Robert Richter <rric@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Robert Richter <rric@kernel.org> Link: http://lkml.kernel.org/r/1370964558-8599-1-git-send-email-rric@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Peter Zijlstra [Fri, 12 Jul 2013 09:08:33 +0000 (11:08 +0200)]
perf: Fix perf_lock_task_context() vs RCU
Jiri managed to trigger this warning:
[] ======================================================
[] [ INFO: possible circular locking dependency detected ]
[] 3.10.0+ #228 Tainted: G W
[] -------------------------------------------------------
[] p/6613 is trying to acquire lock:
[] (rcu_node_0){..-...}, at: [<ffffffff810ca797>] rcu_read_unlock_special+0xa7/0x250
[]
[] but task is already holding lock:
[] (&ctx->lock){-.-...}, at: [<ffffffff810f2879>] perf_lock_task_context+0xd9/0x2c0
[]
[] which lock already depends on the new lock.
[]
[] the existing dependency chain (in reverse order) is:
[]
[] -> #4 (&ctx->lock){-.-...}:
[] -> #3 (&rq->lock){-.-.-.}:
[] -> #2 (&p->pi_lock){-.-.-.}:
[] -> #1 (&rnp->nocb_gp_wq[1]){......}:
[] -> #0 (rcu_node_0){..-...}:
Paul was quick to explain that due to preemptible RCU we cannot call
rcu_read_unlock() while holding scheduler (or nested) locks when part
of the read side critical section was preemptible.
Therefore solve it by making the entire RCU read side non-preemptible.
Also pull out the retry from under the non-preempt to play nice with RT.
Reported-by: Jiri Olsa <jolsa@redhat.com> Helped-out-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jiri Olsa [Tue, 9 Jul 2013 15:44:11 +0000 (17:44 +0200)]
perf: Remove WARN_ON_ONCE() check in __perf_event_enable() for valid scenario
The '!ctx->is_active' check has a valid scenario, so
there's no need for the warning.
The reason is that there's a time window between the
'ctx->is_active' check in the perf_event_enable() function
and the __perf_event_enable() function having:
- IRQs on
- ctx->lock unlocked
where the task could be killed and 'ctx' deactivated by
perf_event_exit_task(), ending up with the warning below.
So remove the WARN_ON_ONCE() check and add comments to
explain it all.
This addresses the following warning reported by Vince Weaver:
Jiri Olsa [Tue, 9 Jul 2013 15:44:10 +0000 (17:44 +0200)]
perf: Clone child context from parent context pmu
Currently when the child context for inherited events is
created, it's based on the pmu object of the first event
of the parent context.
This is wrong for the following scenario:
- HW context having HW and SW event
- HW event got removed (closed)
- SW event stays in HW context as the only event
and its pmu is used to clone the child context
The issue starts when the cpu context object is touched
based on the pmu context object (__get_cpu_context). In
this case the HW context will work with SW cpu context
ending up with following WARN below.
Fixing this by using parent context pmu object to clone
from child context.
Addresses the following warning reported by Vince Weaver:
765532c8 (perf script: Finish the rename from trace to script,
2010-12-23) made a mistake during find-and-replace replacing
"../../../util/trace-event.h" with "../../../util/script-event.h", a
non-existent file. Fix this include.
Mike Frysinger [Thu, 9 May 2013 04:17:44 +0000 (00:17 -0400)]
perf tools: Fix -ldw/-lelf link test when static linking
Since libelf sometimes uses libpthread, we have to list that after -lelf
when someone tries to build statically. Else things go boom:
Makefile:479: *** No libelf.h/libelf found, please install \
libelf-dev/elfutils-libelf-devel. Stop.
Similarly, the -ldw test fails as it often uses -lz:
Makefile:462: No libdw.h found or old libdw.h found or elfutils is older \
than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev
And if we add debugging to try-cc, we see:
+ echo '#include <dwarf.h>
attempts to aid the user by tapping into an existing error message,
as described in the commit message:
... Also fix an issue where _get_attempt was called with only
one argument. This prevented the error message from printing
the name of the variable that can be used to fix the problem.
However, The "missing" argument was in fact missing on purpose; it's
absence is a signal that the error message should be skipped, because
the failure would be due to the default value, not any user-supplied
value. This can be seen in how `_ge_attempt' uses `gea_err' (in the
config/utilities.mak file):
_ge_attempt = $(if $(get-executable),$(get-executable),$(_gea_warn)$(call _gea_err,$(2)))
_gea_warn = $(warning The path '$(1)' is not executable.)
_gea_err = $(if $(1),$(error Please set '$(1)' appropriately))
That is, because the argument is no longer missing, the value `$(1)'
(associated with `_gea_err') always evaluates to true, thus always
triggering the error condition that is meant to be reserved for
only the case when a user explicitly supplies an invalid value.
Concretely, the result is a regression in the Makefile's configuration
of python support; rather than gracefully disable support when the
relevant executables cannot be found according to default values, the
build process halts in error as though the user explicitly supplied
the values.
This new commit simply reverts the offending one-line change.
Waiman Long [Thu, 9 May 2013 14:42:48 +0000 (10:42 -0400)]
perf symbols: Fix vdso list searching
When "perf record" was used on a large machine with a lot of CPUs, the
perf post-processing time (the time after the workload was done until
the perf command itself exited) could take a lot of minutes and even
hours depending on how large the resulting perf.data file was.
While running AIM7 1500-user high_systime workload on a 80-core x86-64
system with a 3.9 kernel (with only the -s -a options used), the
workload itself took about 2 minutes to run and the perf.data file had a
size of 1108.746 MB. However, the post-processing step took more than 10
minutes.
With a gprof-profiled perf binary, the time spent by perf was as
follows:
It was found that the vdso__dso_findnew() function failed to locate
VDSO__MAP_NAME ("[vdso]") in the dso list and have to insert a new
entry at the end for 192156 times. This problem is due to the fact that
there are 2 types of name in the dso entry - short name and long name.
The initial dso__new() adds "[vdso]" to both the short and long names.
After that, vdso__dso_findnew() modifies the long name to something
like /tmp/perf-vdso.so-NoXkDj. The dsos__find() function only compares
the long name. As a result, the same vdso entry is duplicated many
time in the dso list. This bug increases memory consumption as well
as slows the symbol processing time to a crawl.
To resolve this problem, the dsos__find() function interface was
modified to enable searching either the long name or the short
name. The vdso__dso_findnew() will now search only the short name
while the other call sites search for the long name as before.
With this change, the cpu time of perf was reduced from 848.38s to
15.77s and dsos__find() only accounted for 0.06% of the total time.
0.06 15.73 0.01 192151 0.00 0.00 dsos__find
Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: "Chandramouleeswaran, Aswin" <aswin@hp.com> Cc: "Norton, Scott J" <scott.norton@hp.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1368110568-64714-1-git-send-email-Waiman.Long@hp.com
[ replaced TRUE/FALSE with stdbool.h equivalents, fixing builds where
those macros are not present (NO_LIBPYTHON=1 NO_LIBPERL=1), fix from Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Thu, 4 Jul 2013 13:20:34 +0000 (16:20 +0300)]
perf evsel: Fix missing increment in sample parsing
The final sample format bit used to be PERF_SAMPLE_STACK_USER which
neglected to do a final increment of the array pointer. The result is
that the following parsing might start at the wrong place.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1372944040-32690-16-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Tue, 2 Jul 2013 19:27:20 +0000 (13:27 -0600)]
perf evsel: Fix count parameter to read call in event_format__new
per realloc above the length of the buffer is alloc_size, not BUFSIZ.
Adjust length per size as done for buf start.
Addresses some valgrind complaints:
==1870== Syscall param read(buf) points to unaddressable byte(s)
==1870== at 0x4E3F610: __read_nocancel (in /lib64/libpthread-2.14.90.so)
==1870== by 0x44AEE1: event_format__new (unistd.h:45)
==1870== by 0x44B025: perf_evsel__newtp (evsel.c:158)
==1870== by 0x451919: add_tracepoint_event (parse-events.c:395)
==1870== by 0x479815: parse_events_parse (parse-events.y:292)
==1870== by 0x45463A: parse_events_option (parse-events.c:861)
==1870== by 0x44FEE4: get_value (parse-options.c:113)
==1870== by 0x450767: parse_options_step (parse-options.c:192)
==1870== by 0x450C40: parse_options (parse-options.c:422)
==1870== by 0x42735F: cmd_record (builtin-record.c:918)
==1870== by 0x419D72: run_builtin (perf.c:319)
==1870== by 0x4195F2: main (perf.c:376)
==1870== Address 0xcffebf0 is 0 bytes after a block of size 8,192 alloc'd
==1870== at 0x4C2A62F: malloc (vg_replace_malloc.c:270)
==1870== by 0x4C2A7A3: realloc (vg_replace_malloc.c:662)
==1870== by 0x44AF07: event_format__new (evsel.c:121)
==1870== by 0x44B025: perf_evsel__newtp (evsel.c:158)
==1870== by 0x451919: add_tracepoint_event (parse-events.c:395)
==1870== by 0x479815: parse_events_parse (parse-events.y:292)
==1870== by 0x45463A: parse_events_option (parse-events.c:861)
==1870== by 0x44FEE4: get_value (parse-options.c:113)
==1870== by 0x450767: parse_options_step (parse-options.c:192)
==1870== by 0x450C40: parse_options (parse-options.c:422)
==1870== by 0x42735F: cmd_record (builtin-record.c:918)
==1870== by 0x419D72: run_builtin (perf.c:319)
Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1372793245-4136-2-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Runzhen Wang [Fri, 28 Jun 2013 08:14:56 +0000 (16:14 +0800)]
perf tools: fix a typo of a Power7 event name
In the Power7 PMU guide:
https://www.power.org/documentation/commonly-used-metrics-for-performance-analysis/
PM_BRU_MPRED is referred to as PM_BR_MPRED.
It fixed the typo by changing the name of the event in kernel and
documentation accordingly.
This patch changes the ABI, there are some reasons I think it's ok:
- It is relatively new interface, specific to the Power7 platform.
- No tools that we know of actually use this interface at this point
(none are listed near the interface).
- Users of this interface (eg oprofile users migrating to perf)
would be more used to the "PM_BR_MPRED" rather than "PM_BRU_MPRED".
- These are in the ABI/testing at this point rather than ABI/stable,
so hoping we have some wiggle room.
Signed-off-by: Runzhen Wang <runzhen@linux.vnet.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Cc: icycoder@gmail.com Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Runzhen Wang <runzhew@clemson.edu> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1372407297-6996-2-git-send-email-runzhen@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When perf tries to start a workload, it relies on a pipe which the
workload was blocked for reading. After closing the pipe on the parent,
the workload (child) can start the actual work via exec().
However, if another process was forked after creating a workload, this
mechanism cannot work since the other process (child) also inherits the
pipe, so that closing the pipe in parent cannot unblock the workload.
Fix it by using explicit write call can then closing it.
For similar reason, the pipe fd on parent should be marked as CLOEXEC so
that it can be closed after another child exec'ed.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1372230862-15861-13-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 5 Jun 2013 11:37:21 +0000 (13:37 +0200)]
perf record: Remove -f/--force option
It no longer have any affect on the processing and is marked as obsolete
anyway.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-tvwyspiqr4getzfib2lw06ty@git.kernel.org Link: http://lkml.kernel.org/r/1372307120-737-1-git-send-email-namhyung@kernel.org
[ combined patch removing the -f usage in various sub-commands, such as 'perf sched', etc, by Namhyung Kim ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf stat: Avoid sending SIGTERM to random processes
This patch fixes a problem with perf stat whereby on termination it may
send a SIGTERM signal to random processes on systems with high PID
recycling. I got some actual bug reports on this.
There is race between the SIGCHLD and sig_atexit() handlers. This patch
addresses this problem by clearing child_pid in the SIGCHLD handler.
Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130604154426.GA2928@quad Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>