For ARM, when tracing with tracepoint events, the IP and cpsr are set
to 0, preventing the perf code parsing the callchain and resolving the
symbols correctly.
./perf record -e sched:sched_switch -g --call-graph dwarf ls
[ perf record: Captured and wrote 0.006 MB perf.data ]
./perf report -f
Samples: 5 of event 'sched:sched_switch', Event count (approx.): 5
Children Self Command Shared Object Symbol
100.00% 100.00% ls [unknown] [.]
00000000
The fix is to implement perf_arch_fetch_caller_regs for ARM, which fills
several necessary registers used for callchain unwinding, including pc,sp,
fp and cpsr.
With this patch, callchain can be parsed correctly as :
.....
- 100.00% 100.00% ls [kernel.kallsyms] [k] __sched_text_start
+ __sched_text_start
+ 20.00% 0.00% ls libc-2.18.so [.] _dl_addr
+ 20.00% 0.00% ls libc-2.18.so [.] write
.....
Jean Pihet found this in ARM and come up with a patch:
http://thread.gmane.org/gmane.linux.kernel/
1734283/focus=
1734280
This patch rewrite Jean's patch in C.
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
#define perf_misc_flags(regs) perf_misc_flags(regs)
#endif
+#define perf_arch_fetch_caller_regs(regs, __ip) { \
+ (regs)->ARM_pc = (__ip); \
+ (regs)->ARM_fp = (unsigned long) __builtin_frame_address(0); \
+ (regs)->ARM_sp = current_stack_pointer; \
+ (regs)->ARM_cpsr = SVC_MODE; \
+}
+
#endif /* __ARM_PERF_EVENT_H__ */