Adrian Hunter [Mon, 9 Dec 2013 13:18:40 +0000 (15:18 +0200)]
perf record: Fix display of incorrect mmap pages
'mmap_pages' is 'unsigned int' not 'int' e.g.
perf record -m2147483648 uname
Permission error mapping pages.
Consider increasing /proc/sys/kernel/perf_event_mlock_kb,
or try again with a smaller value of -m/--mmap_pages.
(current value: -2147483648)
Fixed:
perf record -m2147483648 uname
Permission error mapping pages.
Consider increasing /proc/sys/kernel/perf_event_mlock_kb,
or try again with a smaller value of -m/--mmap_pages.
(current value: 2147483648)
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1386595120-22978-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 6 Dec 2013 07:42:57 +0000 (09:42 +0200)]
perf script: Add an option to print the source line number
Add field 'srcline' that displays the source file name and line number
associated with the sample ip. The information displayed is the same as
from addr2line.
Adrian Hunter [Fri, 6 Dec 2013 07:42:56 +0000 (09:42 +0200)]
perf script: Fix symoff printing in callchains
The address being used to calculate the offset was the memory address
but the address needed is the address mapped to the dso. i.e. the 'addr'
member of 'struct addr_location'
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1386315778-11633-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Steven Rostedt [Tue, 19 Nov 2013 02:38:20 +0000 (21:38 -0500)]
tools lib traceevent: Report better error message on bad function args
When Jiri Olsa was writing a function callback for
scsi_trace_parse_cdb(), he thought that the traceevent library had a
bug in it because he was getting this error:
Error: expected ')' but read ','
Error: expected ')' but read ','
Error: expected ')' but read ','
Error: expected ')' but read ','
But in truth, he didn't have the write number of arguments for the
function callback, and the error was the library detecting the
discrepancy. A better error message would have prevented the confusion:
Error: function 'scsi_trace_parse_cdb()' only expects 2 arguments but event scsi_dispatch_cmd_timeout has more
Error: function 'scsi_trace_parse_cdb()' only expects 2 arguments but event scsi_dispatch_cmd_start has more
Error: function 'scsi_trace_parse_cdb()' only expects 2 arguments but event scsi_dispatch_cmd_error has more
Error: function 'scsi_trace_parse_cdb()' only expects 2 arguments but event scsi_dispatch_cmd_done has more
Or
Error: function 'scsi_trace_parse_cdb()' expects 4 arguments but event scsi_dispatch_cmd_timeout only uses 3
Error: function 'scsi_trace_parse_cdb()' expects 4 arguments but event scsi_dispatch_cmd_start only uses 3
Error: function 'scsi_trace_parse_cdb()' expects 4 arguments but event scsi_dispatch_cmd_error only uses 3
Error: function 'scsi_trace_parse_cdb()' expects 4 arguments but event scsi_dispatch_cmd_done only uses 3
David Ahern [Thu, 5 Dec 2013 02:41:39 +0000 (19:41 -0700)]
perf trace: Add support for syscalls vs raw_syscalls
Older kernels (e.g., RHEL6) do system call tracing via
syscalls:sys_{enter,exit} rather than raw_syscalls. Update perf-trace to
detect lack of raw_syscalls support and try syscalls.
Jiri Olsa [Tue, 3 Dec 2013 13:09:35 +0000 (14:09 +0100)]
tools lib traceevent: Add cfg80211 plugin
Adding cfg80211 plugin.
This plugin adds handler for __le16_to_cpup function
t properly parse following tracepoint events:
cfg80211:cfg80211_tx_mlme_mgmt
cfg80211:cfg80211_rx_mlme_mgmt
cfg80211:cfg80211_rx_unprot_mlme_mgmt
The diff of 'perf script' output generated by old and new code:
(data was generated by 'perf record -e 'cfg80211:*' -a')
Jiri Olsa [Tue, 3 Dec 2013 13:09:32 +0000 (14:09 +0100)]
tools lib traceevent: Add function plugin
Backporting function plugin.
Backported from Steven Rostedt's trace-cmd repo (HEAD 0f2c2fb):
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git
This plugin adds function and parent function fields resolving for
ftrace:function tracepoint event.
The diff of 'perf script' output generated by old and new code:
(data was generated by 'perf record -e ftrace:function ls')
--- script.function.old
+++ script.function.new
- ls 10781 [001] 32667.291379: ftrace:function: ffffffff811adb80 <-- ffffffff811afc48
- ls 10781 [001] 32667.291379: ftrace:function: ffffffff811b35d0 <-- ffffffff811adb9b
- ls 10781 [001] 32667.291380: ftrace:function: ffffffff811b3520 <-- ffffffff811b35e8
- ls 10781 [001] 32667.291380: ftrace:function: ffffffff811b2720 <-- ffffffff811b3549
- ls 10781 [001] 32667.291381: ftrace:function: ffffffff81297e10 <-- ffffffff811b356c
- ls 10781 [001] 32667.291381: ftrace:function: ffffffff81298f40 <-- ffffffff81297e2c
- ls 10781 [001] 32667.291382: ftrace:function: ffffffff81076160 <-- ffffffff811afbf0
- ls 10781 [001] 32667.291383: ftrace:function: ffffffff811c3eb0 <-- ffffffff811afbfc
- ls 10781 [001] 32667.291383: ftrace:function: ffffffff8164e100 <-- ffffffff811c3ed8
- ls 10781 [001] 32667.291384: ftrace:function: ffffffff811a5d10 <-- ffffffff811c3f53
- ls 10781 [001] 32667.291384: ftrace:function: ffffffff811e8e70 <-- ffffffff811a5d58
- ls 10781 [001] 32667.291385: ftrace:function: ffffffff811f38e0 <-- ffffffff811a5d63
- ls 10781 [001] 32667.291385: ftrace:function: ffffffff811a9ff0 <-- ffffffff811a5d6b
- ls 10781 [001] 32667.291386: ftrace:function: ffffffff811a9fa0 <-- ffffffff811aa015
- ls 10781 [001] 32667.291387: ftrace:function: ffffffff810851c0 <-- ffffffff811aa053
- ls 10781 [001] 32667.291387: ftrace:function: ffffffff81090e00 <-- ffffffff81085211
+ ls 10781 [001] 32667.291379: ftrace:function: would_dump <-- setup_new_exec
+ ls 10781 [001] 32667.291379: ftrace:function: inode_permission <-- would_dump
+ ls 10781 [001] 32667.291380: ftrace:function: __inode_permission <-- inode_permission
+ ls 10781 [001] 32667.291380: ftrace:function: generic_permission <-- __inode_permission
+ ls 10781 [001] 32667.291381: ftrace:function: security_inode_permission <-- __inode_permission
+ ls 10781 [001] 32667.291381: ftrace:function: cap_inode_permission <-- security_inode_permission
+ ls 10781 [001] 32667.291382: ftrace:function: flush_signal_handlers <-- setup_new_exec
+ ls 10781 [001] 32667.291383: ftrace:function: do_close_on_exec <-- setup_new_exec
+ ls 10781 [001] 32667.291383: ftrace:function: _raw_spin_lock <-- do_close_on_exec
+ ls 10781 [001] 32667.291384: ftrace:function: filp_close <-- do_close_on_exec
+ ls 10781 [001] 32667.291384: ftrace:function: dnotify_flush <-- filp_close
+ ls 10781 [001] 32667.291385: ftrace:function: locks_remove_posix <-- filp_close
+ ls 10781 [001] 32667.291385: ftrace:function: fput <-- filp_close
+ ls 10781 [001] 32667.291386: ftrace:function: file_sb_list_del <-- fput
+ ls 10781 [001] 32667.291387: ftrace:function: task_work_add <-- fput
+ ls 10781 [001] 32667.291387: ftrace:function: kick_process <-- task_work_add
Removing options support as it's not backported yet.
Currently this plugin supports 2 options:
'parent' to display parent function
'indent' to show function call indents
Enabling both of them by default.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1386076182-14484-19-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Tue, 3 Dec 2013 13:09:29 +0000 (14:09 +0100)]
tools lib traceevent: Add kvm plugin
Backporting kvm plugin.
Backported from Steven Rostedt's trace-cmd repo (HEAD 0f2c2fb):
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git
This plugin adds field resolving functions for following
tracepoint events:
kvm:kvm_exit
kvm:kvm_emulate_insn
kvm:kvm_nested_vmexit
kvm:kvm_nested_vmexit_inject
kvmmmu:kvm_mmu_get_page
kvmmmu:kvm_mmu_sync_page
kvmmmu:kvm_mmu_unsync_page
kvmmmu:kvm_mmu_zap_page
kvmmmu:kvm_mmu_prepare_zap_page
The diff of 'perf script' output generated by old and new code:
(data was generated by 'perf record -e 'kvm:*,kvmmmu:*' -a')
Note:
- kvm_mmu_zap_page is replaced by kvm_mmu_prepare_zap_page
in current kernel, keeping it for backward compatibility
- some of the tracepoints keep the same output even with
the plugin handling: kvm:kvm_exit, kvm:kvm_emulate_insn
- the 'kvmmmu:fast_page_fault' is still broken because of
missing is_writable_pte function and is fixed in another patch
- ommited following tracepoints from backport because
the output was buggy
kvm:kvm_nested_vmexit
kvm:kvm_nested_vmexit_inject
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1386076182-14484-16-git-send-email-jolsa@redhat.com Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Tue, 3 Dec 2013 13:09:26 +0000 (14:09 +0100)]
tools lib traceevent: Add jbd2 plugin
Backporting jbd2 plugin.
Backported from Steven Rostedt's trace-cmd repo (HEAD 0f2c2fb):
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git
This plugin adds field resolving functions for following tracepoint
events:
jbd2:jbd2_checkpoint_stats
jbd2:jbd2_run_stats
The diff of 'perf script' output generated by old and new code:
(data was generated by 'perf record -e 'jbd2:jbd2_run_stats,jbd2:jbd2_checkpoint_stats' -a')
Jiri Olsa [Tue, 3 Dec 2013 13:09:25 +0000 (14:09 +0100)]
perf tools: Overload pr_stat traceevent print function
The traceevent lib uses pr_stat to display all standard info. It's
defined as __weak. Overloading it with perf version plugged into perf
output system logic.
Displaying the pr_stat stuff under '-v' option.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1386076182-14484-12-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Tue, 3 Dec 2013 13:09:24 +0000 (14:09 +0100)]
perf tools: Add trace-event global object for tracepoint interface
In order to get the proper plugins processing we need to use full
trace-event interface when creating tracepoint events. So far we were
using shortcut to get the parsed format.
Moving current 'event_format__new' function into trace-event object as
'trace_event__tp_format'.
This function uses properly initialized global trace-event object,
ensuring proper plugins processing.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1386076182-14484-11-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Tue, 3 Dec 2013 13:09:20 +0000 (14:09 +0100)]
tools lib traceevent: Harmonize the install messages in lib-traceevent
Removing the 'to ...' part out of the install message, because it does
not fit to the rest of the build messages we use.
Before:
INSTALL plugin_hrtimer.so to /home/jolsa/libexec/perf-core/traceevent/plugins
INSTALL plugin_jbd2.so to /home/jolsa/libexec/perf-core/traceevent/plugins
INSTALL plugin_kmem.so to /home/jolsa/libexec/perf-core/traceevent/plugins
INSTALL plugin_kvm.so to /home/jolsa/libexec/perf-core/traceevent/plugins
INSTALL plugin_mac80211.so to /home/jolsa/libexec/perf-core/traceevent/plugins
INSTALL plugin_sched_switch.so to /home/jolsa/libexec/perf-core/traceevent/plugins
INSTALL plugin_function.so to /home/jolsa/libexec/perf-core/traceevent/plugins
INSTALL plugin_xen.so to /home/jolsa/libexec/perf-core/traceevent/plugins
INSTALL plugin_scsi.so to /home/jolsa/libexec/perf-core/traceevent/plugins
Jiri Olsa [Tue, 3 Dec 2013 13:09:19 +0000 (14:09 +0100)]
tools lib traceevent: Change pevent_parse_format to include pevent handle
Changing the pevent_parse_format interface to include the pevent handle.
The goal is to always use pevent object when dealing with traceevent
library. The reason is that we might need additional processing (like
plugins), which is not possible otherwise.
Patches follow to make this happen completely.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1386076182-14484-6-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Wed, 4 Dec 2013 14:16:36 +0000 (16:16 +0200)]
perf script: Do not call perf_event__preprocess_sample() twice)
The perf_event__preprocess_sample() function is called in
process_sample_event(). Instead of calling it again in
perf_evsel__print_ip(), pass through the resultant addr_location.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/529F3944.9050007@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Wed, 4 Dec 2013 14:23:01 +0000 (16:23 +0200)]
perf symbols: Fix random fd closing with no libelf
When built without libelf, perf tools was failing to initialize a file
descriptor, but nevertheless closing it. That sometimes resulted in the
output being truncated because the stdout file descriptor got closed.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1386166981-30197-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Dongsheng Yang [Wed, 4 Dec 2013 22:56:42 +0000 (17:56 -0500)]
perf kvm: Add more detail about buildid-list in man page
As the buildid is read from /sys/kernel/notes, then if we use perf kvm
buildid-list with a perf data file captured by perf kvm record with
--guestkallsyms and --guestmodules, there is no result in output.
This patch add a explanation about it and add a limit of using perf kvm
buildid-list.
Dongsheng Yang [Wed, 4 Dec 2013 22:56:39 +0000 (17:56 -0500)]
perf tools: Remove condition in machine__get_kernel_start_addr.
In machine__get_kernel_start_addr, the code, which is using
machine->root_dir to build filename, works for both host and guests
initialized from guestmount, as root_dir is set to "" for the host
machine in the machine__init() function.
So this patch remove the branch for machine__is_host.
Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/0a81645dd0b384a12cb4f962cf193ef8c3ce2010.1386197481.git.yangds.fnst@cn.fujitsu.com
[ Clarified changeset mentioning root_dir setup in machine__init() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Tue, 3 Dec 2013 13:09:15 +0000 (14:09 +0100)]
perf tools: Remove stackprotector feature check
We use -fstack-protector-all option to enable stack protecting for all
available functions. There's no reason for enabling -Wstack-protector to
get warning for unprotected functions.
Removing stackprotector feature check which was used to enable the
-Wstack-protector option.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1386076182-14484-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Tue, 3 Dec 2013 07:23:10 +0000 (09:23 +0200)]
perf tools: Do not disable source line lookup just because of 1 failure
Looking up an ip's source file name and line number does not succeed
always. Current logic disables the lookup for a dso entirely on any
failure. Change it so that disabling never happens if there has ever
been a successful lookup for that dso but disable if the first 123
lookups fail.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1386055390-13757-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Tue, 3 Dec 2013 07:23:07 +0000 (09:23 +0200)]
perf symbols: Retain bfd reference to lookup source line numbers
Closng and re-opening for every lookup when using libbfd to lookup
source file name and line number is very very slow. Instead keep the
reference on struct dso.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1386055390-13757-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ingo Molnar [Wed, 4 Dec 2013 09:17:17 +0000 (10:17 +0100)]
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/tools improvements and fixes from Arnaldo Carvalho de Melo:
* Honour -m option in 'trace', the tool was offering the option to
set the mmap size, but wasn't using it when doing the actual mmap
on the events file descriptors, fix from Jiri Olsa.
* Correct the message in feature-libnuma checking, swowing the right
devel package names for various distros, from Dongsheng Yang.
* Polish 'readn' function and introduce its counterpart, 'writen', from
Jiri Olsa.
* Start moving timechart state from global variables to a 'perf_tool' derived
'timechart' struct.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
To avoid having all those global variables and to use the interface to
event processing that is based on passing a 'perf_tool' struct that
should be embedded in a per tool specific struct passed to all the
sample processing callbacks.
There are some more globals to move, next patches will do it.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <stfomichev@yandex-team.ru> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-0iah65pq796ezbk5u1lzwy1k@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since b000c8065a92 "tracing: Remove the extra 4 bytes of padding in
events" removed padding bytes, perf timechart got out of sync with the
kernel's trace_entry structure.
Convert perf timechart to use dynamic fields offsets (via
perf_evsel__intval) not relying on a hardcoded copy of fields layout
from the kernel.
Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Chia-I Wu <olvaffe@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20131127104459.GB3309@stfomichev-desktop Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Tue, 26 Nov 2013 13:19:24 +0000 (15:19 +0200)]
perf symbols: Fix not finding kcore in buildid cache
The logic was not looking in the buildid cache for kcore if the host
kernel buildid did not match the recorded kernel buildid.
This affects the non-live case i.e. the kernel has changed and we are
looking at a special copy of kcore that we placed in the buildid cache
(using "perf buildid-cache -v -k /proc/kcore") when the data was
recorded.
After this fix kernel symbols get resolved/annotated correctly.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1385471964-4037-1-git-send-email-adrian.hunter@intel.com
[ Added further explanation extracted from conversation between Ingo & Adrian on lkml ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Wed, 20 Nov 2013 04:07:37 +0000 (21:07 -0700)]
perf script: Print callchains and symbols if they exist
The intent of perf-script is to dump the events and information in the
file. H/W, S/W and raw events all dump callchains if they are present;
might as well make that the default for tracepoints too.
v2: Only add options for sym, dso and ip if callchains are present
David Ahern [Mon, 18 Nov 2013 20:32:44 +0000 (13:32 -0700)]
perf evsel: Skip ignored symbols while printing callchain
Allows a command to have a symbol_filter controlled by the user to skip
certain functions in a backtrace. One example is to allow the user to
reduce repeating patterns like:
do_select core_sys_select sys_select
to just sys_select when dumping callchains, consuming less real estate
on the screen while still conveying the essential message - the process
is in a select call.
This option is leveraged by the upcoming timehist command.
Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1384806771-2945-2-git-send-email-dsahern@gmail.com
[ Checked if al.sym is NULL before touching al.sym->ignored, as noted by Adrian Hunter ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add -g flag to `perf timechart record` which saves callchain info in the
perf.data.
When generating SVG, add backtrace information to the figure details, so
now it's possible to see which code path woke up the task and why some
task went to sleep.
Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383323151-19810-8-git-send-email-stfomichev@yandex-team.ru Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf timechart: Add support for -P and -T in timechart recording
If we don't want either power or task events we may use -T or -P with
the `perf timechart record` command to filter out events while recording
to keep perf.data small.
Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383323151-19810-7-git-send-email-stfomichev@yandex-team.ru Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Mon, 18 Nov 2013 09:55:57 +0000 (11:55 +0200)]
perf record: Default -t option to no inheritance
The change to per-cpu mmaps causes the -p, -t and -u options now to have
inheritance enabled by default. Change that back to no inheritance but
for the -t option only.
Requested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1384768557-23331-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Mon, 18 Nov 2013 09:55:55 +0000 (11:55 +0200)]
perf tools: Allow '--inherit' as the negation of '--no-inherit'
Long options can be negated by prefixing them with 'no-'. However
options that already start with 'no-', such as '--no-inherit' result in
ugly double 'no's.
Avoid that by accepting that the removal of 'no-' also negates the long
option.
Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1384768557-23331-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 18 Nov 2013 05:34:52 +0000 (14:34 +0900)]
perf script: Move evname print code to process_event()
The print_sample_start() will be reused by other printing routine for
internal events like COMM, FORK and EXIT from next patch. And because
they're not tied to a specific event, move the evname print code to its
caller.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1384752894-10974-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
compgen is a bash-builtin; factor out the invocations into a separate
function to give us a chance to override it with a zsh equivalent in
future patches.
Define the variables cur, words, cword, and prev outside the main
completion function so that we have a chance to override it when we
introduce zsh support.
Steven Rostedt [Tue, 19 Nov 2013 23:29:37 +0000 (18:29 -0500)]
tools lib traceevent: Use helper trace-seq in print functions like kernel does
Jiri Olsa reported that his plugin for scsi was chopping off part of the
output. Investigating this, I found that Jiri used the same functions as
what is in the kernel, which adds the following:
trace_seq_putc(p, 0);
This adds a '\0' to the output string. The reason this works in the
kernel is that the "p" that is passed to the function helper is a
temporary trace_seq. But in the libtraceevent library, it's the pointer
to the trace_seq used to output. By adding the '\0', it truncates the
line and nothing added after that will be printed.
We can solve this in two ways. One is to have the helper functions for
the library not add the unnecessary '\0'. The other is to change the
library to also use a helper trace_seq structure that gets copied to the
main trace_seq just like the kernel does.
The latter allows the helper functions in the plugins to be the same as
the kernel, which is the better solution.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20131119182937.401668e3@gandalf.local.home Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stephane Eranian [Tue, 12 Nov 2013 16:58:51 +0000 (17:58 +0100)]
perf/x86: Add RAPL hrtimer support
The RAPL PMU counters do not interrupt on overflow.
Therefore, the kernel needs to poll the counters
to avoid missing an overflow. This patch adds
the hrtimer code to do this.
The timer interval is calculated at boot time
based on the power unit used by the HW.
There is one hrtimer per-cpu to handle the case
of multiple simultaneous use across cores on
the same package + hotplug CPU.
Thanks to Maria Dimakopoulou for her contributions
to this patch especially on the math aspects.
Stephane Eranian [Tue, 12 Nov 2013 16:58:50 +0000 (17:58 +0100)]
perf/x86: Add Intel RAPL PMU support
This patch adds a new uncore PMU to expose the Intel
RAPL energy consumption counters. Up to 3 counters,
each counting a particular RAPL event are exposed.
The RAPL counters are available on Intel SandyBridge,
IvyBridge, Haswell. The server skus add a 3rd counter.
The following events are available and exposed in sysfs:
- power/energy-cores: power consumption of all cores on socket
- power/energy-pkg: power consumption of all cores + LLc cache
- power/energy-dram: power consumption of DRAM (servers only)
For each event both the unit (Joules) and scale (2^-32 J)
is exposed in sysfs for use by perf stat and other tools.
The files are:
The RAPL PMU is uncore by nature and is implemented such
that it only works in system-wide mode. Measuring only
one CPU per socket is sufficient. The /sys/devices/power/cpumask
file can be used by tools to figure out which CPUs to monitor
by default. For instance, on a 2-socket system, 2 CPUs
(one on each socket) will be shown.
All the counters measure in the same unit (exposed via sysfs).
The perf_events API exposes all RAPL counters as 64-bit integers
counting in unit of 1/2^32 Joules (about 0.23 nJ). User level tools
must convert the counts by multiplying them by 2^-32 to obtain
Joules. The reason for this is that the kernel avoids
doing floating point math whenever possible because it is
expensive (user floating-point state must be saved). The method
used avoids kernel floating-point usage. There is no loss of
precision. Thanks to PeterZ for suggesting this approach.
To convert the raw count in Watt:
W = C * 2.3 / (1e10 * time)
or ldexp(C, -32).
RAPL PMU is a new standalone PMU which registers with the
perf_event core subsystem. The PMU type (attr->type) is
dynamically allocated and is available from /sys/device/power/type.
Sampling is not supported by the RAPL PMU. There is no
privilege level filtering either.
This patch modifies the pmu event alias code to check
for the presence of the .unit and .scale files to load
the corresponding values. They are then used by perf stat
transparently:
# perf stat -a -e power/energy-pkg/,power/energy-cores/,cycles -I 1000 sleep 1000
# time counts unit events
1.000214717 3.07 Joules power/energy-pkg/ [100.00%]
1.000214717 0.53 Joules power/energy-cores/
1.00021471712965028 cycles [100.00%]
2.000749289 3.01 Joules power/energy-pkg/
2.000749289 0.52 Joules power/energy-cores/
2.00074928915817043 cycles
When the event does not have an explicit unit exported by
the kernel, nothing is printed. In csv output mode, there
will be an empty field.
Special thanks to Jiri for providing the supporting code
in the parser to trigger reading of the scale and unit files.
Stephane Eranian [Tue, 12 Nov 2013 16:58:48 +0000 (17:58 +0100)]
perf: Add active_entry list head to struct perf_event
This patch adds a new field to the struct perf_event.
It is intended to be used to chain events which are
active (enabled). It helps in the hardware layer
for PMUs which do not have actual counter restrictions, i.e.,
free running read-only counters. Active events are chained
as opposed to being tracked via the counter they use.
To save space we use a union with hlist_entry as both
are mutually exclusive (suggested by Jiri Olsa).
Oleg Nesterov [Sat, 9 Nov 2013 17:44:19 +0000 (18:44 +0100)]
uprobes/powerpc: Kill arch_uprobe->ainsn
powerpc has both arch_uprobe->insn and arch_uprobe->ainsn to
make the generic code happy. This is no longer needed after
the previous change, powerpc can just use "u32 insn".
Oleg Nesterov [Sat, 9 Nov 2013 16:58:54 +0000 (17:58 +0100)]
uprobes: Don't assume that arch_uprobe->insn/ixol is u8[MAX_UINSN_BYTES]
arch_uprobe should be opaque as much as possible to the generic
code, but currently it assumes that insn/ixol must be u8[] of the
known size. Remove this unnecessary dependency, we can use "&" and
and sizeof() with the same effect.
uprobe_task->vaddr is a bit strange. The generic code uses it only
to pass the additional argument to arch_uprobe_pre_xol(), and since
it is always equal to instruction_pointer() this looks even more
strange.
And both utask->vaddr and and utask->autask have the same scope,
they only have the meaning when the task executes the probed insn
out-of-line, so it is safe to reuse both in UTASK_RUNNING state.
This all means that logically ->vaddr belongs to arch_uprobe_task
and we should probably move it there, arch_uprobe_pre_xol() can
record instruction_pointer() itself.
OTOH, it is also used by uprobe_copy_process() and dup_xol_work()
for another purpose, this doesn't look clean and doesn't allow to
move this member into arch_uprobe_task.
This patch adds the union with 2 anonymous structs into uprobe_task.
The first struct is autask + vaddr, this way we "almost" move vaddr
into autask.
The second struct has 2 new members for uprobe_copy_process() paths:
->dup_xol_addr which can be used instead ->vaddr, and ->dup_xol_work
which can be used to avoid kmalloc() and simplify the code.
Note that this union will likely have another member(s), we need
something like "private_data_for_handlers" so that the tracing
handlers could use it to communicate with call_fetch() methods.