perf core: Allow setting up max frame stack depth via sysctl
The default remains 127, which is good for most cases, and not even hit
most of the time, but then for some cases, as reported by Brendan, 1024+
deep frames are appearing on the radar for things like groovy, ruby.
And in some workloads putting a _lower_ cap on this may make sense. One
that is per event still needs to be put in place tho.
Because we only allocate the callchain percpu data structures when there
is a user, which allows for changing the max easily, its just a matter
of having no callchain users at that point.
Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-z6erjg35d1gekevwujoa0223@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduced in commit 4babf2c5efb7 ("x86: wire up preadv2 and pwritev2").
This will make 'perf trace' aware of them.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vojoylgce2cetsy36446s5ny@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ravi Bangoria [Tue, 26 Apr 2016 14:25:40 +0000 (19:55 +0530)]
perf probe: Fix module probe issue if no dwarf support
Perf is not able to register probe in kernel module when dwarf supprt
is not there(and so it goes for symtab). Perf passes full path of
module where only module name is required which is causing the problem.
This patch fixes this issue.
Before applying patch:
$ dpkg -s libdw-dev
dpkg-query: package 'libdw-dev' is not installed and no information is...
$ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
Added new event:
probe:kprobe_init (on kprobe_init in /linux/samples/kprobes/kprobe_example.ko)
$ sudo ./perf record -a -e probe:kprobe_init
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.105 MB perf.data (2 samples) ]
Ravi Bangoria [Tue, 26 Apr 2016 14:25:41 +0000 (19:55 +0530)]
perf probe: Fix offline module name missmatch issue
Perf can add a probe on kernel module which has not been loaded yet.
The current implementation finds the module name from path. But if the
filename is different from the actual module name then perf fails to
register a probe while loading module because of mismatch in the names.
For example, samples/kobject/kobject-example.ko is loaded as
kobject_example.
Before applying patch:
$ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
probe:foo_show (on foo_show in kobject-example)
$ lsmod
Module Size Used by
kobject_example 16384 0
Generate read to /sys/kernel/kobject_example/foo while recording data
with below command
$ sudo ./perf record -e probe:foo_show -a
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.093 MB perf.data ]
$./perf report --stdio -F overhead,comm,dso,sym
Error:
The perf.data.old file has no samples!
After applying patch:
$ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
probe:foo_show (on foo_show in kobject_example)
$ lsmod
Module Size Used by
kobject_example 16384 0
Generate read to /sys/kernel/kobject_example/foo while recording data
with below command
$ sudo ./perf record -e probe:foo_show -a
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.097 MB perf.data (8 samples) ]
perf trace: Read thread's COMM from /proc when not set
We get notifications for threads that gets created while we're tracing,
but for preexisting threads we may end not having synthesized them, like
when tracing a 'perf trace' session that will use '--pid' to trace some
other thread.
And besides we should probably stop synthesizing those records and
instead read thread information in a lazy way, i.e. just when we need,
like done in this patch:
Now the 'pid_t' argument in 'perf_event_open' gets translated to a COMM:
perf trace: Do not beautify the 'pid' parameter as a simple integer
Leave it alone so that it ends up assigned to SCA_PID via its type,
'pid_t', that will look up the pid on the machine thread rb_tree and
possibly find its COMM.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-r7dujgmhtxxfajuunpt1bkuo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf trace: Move perf_flags beautifier to tools/perf/trace/beauty/
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-8r3gmymyn3r0ynt4yuzspp9g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf tools: Add lsdir() helper to read a directory
As a utility function, add lsdir() which reads given directory and store
entry name into a strlist. lsdir accepts a filter function so that user
can filter out unneeded entries.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160426090242.11891.79014.stgit@devbox
[ Do not use the 'dirname' it is used in some distros ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Wang Nan [Tue, 26 Apr 2016 02:28:54 +0000 (02:28 +0000)]
perf evlist: Enforce ring buffer reading
Don't read broken data after 'head' pointer.
Following commits will feed perf_evlist__mmap_read() with some 'head'
pointers not maintained by kernel. If 'head' pointer breaks an event, we
should avoid reading from the broken event. This can happen in backward
ring buffer.
For example:
old head
| |
V V
+---+------+----------+----+-----+--+
|..E|D....D|C........C|B..B|A....|E.|
+---+------+----------+----+-----+--+
'old' pointer points to the beginning of 'A' and trying read from it,
but 'A' has been overwritten. In this case, don't try to read from 'A',
simply return NULL.
Colin Ian King [Sun, 24 Apr 2016 18:56:43 +0000 (19:56 +0100)]
perf intel-pt: Fix off-by-one comparison on maximum code
The check for the maximum code is off-by-one; the current comparison of
a code that is INTEL_PT_ERR_MAX will cause the strlcpy to perform an out
of bounds array access on the intel_pt_err_msgs array.
Fix this with a >= comparison.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461524203-10224-1-git-send-email-colin.king@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Given that the 'val' parameter is ignored for FUTEX_LOCK_PI, get rid of
the bogus deadlock detection flag in the wrapper code and avoid the
extra argument, making it resemble its unlock counterpart. And if
nothing else, we already only pass 0 anyway.
Colin Ian King [Sat, 23 Apr 2016 13:45:54 +0000 (14:45 +0100)]
perf tests: Replace assignment with comparison on assert check
The current assert check is checking an assignment, which will always be
true. Instead, the assert should be checking if scale is equal to 0.122
Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461419154-16918-1-git-send-email-colin.king@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
While trying to use --call-graph lbr in 'perf trace', since we only are
interested in the callchain for userspace, up to the callchain, I found
that 'perf evlist' is not decoding the branch_sample_type field, fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-hozai7974u0ulgx13k96fcaw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wuji34xx003kr88nmqt6jkgf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-shj0fazntmskhjild5i6x73l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chris Phlipot [Wed, 20 Apr 2016 02:32:11 +0000 (19:32 -0700)]
perf script: Fix segfault when printing callchains
This fixes a bug caused by an unitialized callchain cursor. The crash
frist appeared in:
6f736735e30f ("perf evsel: Require that callchains be resolved before
calling fprintf_{sym,callchain}")
The callchain cursor is a struct that contains pointers, that when
uninitialized will cause unpredictable behavior (usually a crash)
when trying to append to the callchain.
The existing implementation has the following issues:
1. The callchain cursor used is not initialized, resulting in
unpredictable behavior when used.
2. The cursor is declared on the stack. Even if it is properly initalized,
the implmentation will leak memory when the function returns,
since all the references to the callchain_nodes allocated by
callchain_cursor_append will be lost when the cursor goes out of
scope.
3. Storing the cursor on the stack is inefficient. Even if memory is
properly freed when it goes out of scope, a performance penalty
will be incurred due to reallocation of callchain nodes.
callchain_cursor_append is designed to avoid these reallocations
when an existing cursor is reused.
This patch fixes the crash by replacing cursor_callchain with a reference
to the global callchain_cursor which also resolves all 3 issues mentioned
above.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 6f736735e30f ("perf evsel: Require that callchains be resolved before calling fprintf_{sym,callchain}") Link: http://lkml.kernel.org/r/1461119531-2529-1-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-8h6ssirw5z15qyhy2lwd6f89@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf trace: Extract evsel contructor from perf_evlist__add_pgfault
Prep work for next patches, where we'll need access to the created
evsels, to possibly configure callchains.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-2pcgsgnkgellhlcao4aub8tu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
write_buildid() increments 'name_len' with intention to take into
account trailing zero byte. However, 'name_len' was already incremented
in machine__write_buildid_table() before. So this leads to
out-of-bounds read in do_write():
$ ./perf record sleep 0
[ perf record: Woken up 1 times to write data ]
=================================================================
==15899==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000099fc92 at pc 0x7f1aa9c7eab5 bp 0x7fff940f84d0 sp 0x7fff940f7c78
READ of size 19 at 0x00000099fc92 thread T0
#0 0x7f1aa9c7eab4 (/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/libasan.so.2+0x44ab4)
#1 0x649c5b in do_write util/header.c:67
#2 0x649c5b in write_padded util/header.c:82
#3 0x57e8bc in write_buildid util/build-id.c:239
#4 0x57e8bc in machine__write_buildid_table util/build-id.c:278
...
0x00000099fc92 is located 0 bytes to the right of global variable '*.LC99' defined in 'util/symbol.c' (0x99fc80) of size 18
'*.LC99' is ascii string '[kernel.kallsyms]'
...
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461053847-5633-1-git-send-email-aryabinin@virtuozzo.com
[ Remove the off-by one at the origin, to keep len(s) == strlen(s) assumption ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-core-for-mingo-20160419' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
Build fixes:
- Fix 'perf trace' build when DWARF unwind isn't available (Arnaldo Carvalho de Melo)
- Remove x86 references from arch-neutral Build, fixing it in !x86 arches,
reported as breaking the build for powerpc64le in linux-next (Arnaldo Carvalho de Melo)
Infrastructure changes:
- Do memset() variable 'st' using the correct size in the jit code (Colin Ian King)
- Use callchain_param more thoroughly when checking how callchains were
configured, eventually will be the only way to look for callchain parameters
(Arnaldo Carvalho de Melo)
- Fix some issues in the 'perf test kallsyms' entry (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add Skylake client support for RAPL domains. In addition to RAPL domains
in Broadwell clients, it has support for platform domain (aka PSys). The
PSys domain controls the entire SoC instead of just a CPU package. Unlike
package domain, PSys support requires more than just processor level
implementation. The other parts in the system need additional HW level
signaling, which OEMs need to support. When not supported, the energy
counter register in PSys domain returns 0.
Also corrected error in comment for GPU counter, which previously was
DRAM counter.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com
[ Cnverted to model_match stuff. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Cc: hpa@zytor.com Cc: jacob.jun.pan@linux.intel.com Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/1460930581-29748-2-git-send-email-srinivas.pandruvada@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
Wang Nan [Tue, 5 Apr 2016 14:11:18 +0000 (14:11 +0000)]
perf/core: Add ::write_backward attribute to perf event
This patch introduces 'write_backward' bit to perf_event_attr, which
controls the direction of a ring buffer. After set, the corresponding
ring buffer is written from end to beginning. This feature is design to
support reading from overwritable ring buffer.
Ring buffer can be created by mapping a perf event fd. Kernel puts event
records into ring buffer, user tooling like perf fetch them from
address returned by mmap(). To prevent racing between kernel and tooling,
they communicate to each other through 'head' and 'tail' pointers.
Kernel maintains 'head' pointer, points it to the next free area (tail
of the last record). Tooling maintains 'tail' pointer, points it to the
tail of last consumed record (record has already been fetched). Kernel
determines the available space in a ring buffer using these two
pointers to avoid overwrite unfetched records.
By mapping without 'PROT_WRITE', an overwritable ring buffer is created.
Different from normal ring buffer, tooling is unable to maintain 'tail'
pointer because writing is forbidden. Therefore, for this type of ring
buffers, kernel overwrite old records unconditionally, works like flight
recorder. This feature would be useful if reading from overwritable ring
buffer were as easy as reading from normal ring buffer. However,
there's an obscure problem.
The following figure demonstrates a full overwritable ring buffer. In
this figure, the 'head' pointer points to the end of last record, and a
long record 'E' is pending. For a normal ring buffer, a 'tail' pointer
would have pointed to position (X), so kernel knows there's no more
space in the ring buffer. However, for an overwritable ring buffer,
kernel ignore the 'tail' pointer.
(X) head
. |
. V
+------+-------+----------+------+---+
|A....A|B.....B|C........C|D....D| |
+------+-------+----------+------+---+
Record 'A' is overwritten by event 'E':
head
|
V
+--+---+-------+----------+------+---+
|.E|..A|B.....B|C........C|D....D|E..|
+--+---+-------+----------+------+---+
Now tooling decides to read from this ring buffer. However, none of these
two natural positions, 'head' and the start of this ring buffer, are
pointing to the head of a record. Even the full ring buffer can be
accessed by tooling, it is unable to find a position to start decoding.
The first attempt tries to solve this problem AFAIK can be found from
[1]. It makes kernel to maintain 'tail' pointer: updates it when ring
buffer is half full. However, this approach introduces overhead to
fast path. Test result shows a 1% overhead [2]. In addition, this method
utilizes no more tham 50% records.
Another attempt can be found from [3], which allows putting the size of
an event at the end of each record. This approach allows tooling to find
records in a backward manner from 'head' pointer by reading size of a
record from its tail. However, because of alignment requirement, it
needs 8 bytes to record the size of a record, which is a huge waste. Its
performance is also not good, because more data need to be written.
This approach also introduces some extra branch instructions to fast
path.
'write_backward' is a better solution to this problem.
Following figure demonstrates the state of the overwritable ring buffer
when 'write_backward' is set before overwriting:
head
|
V
+---+------+----------+-------+------+
| |D....D|C........C|B.....B|A....A|
+---+------+----------+-------+------+
and after overwriting:
head
|
V
+---+------+----------+-------+---+--+
|..E|D....D|C........C|B.....B|A..|E.|
+---+------+----------+-------+---+--+
In each situation, 'head' points to the beginning of the newest record.
From this record, tooling can iterate over the full ring buffer and fetch
records one by one.
The only limitation that needs to be considered is back-to-back reading.
Due to the non-deterministic of user programs, it is impossible to ensure
the ring buffer keeps stable during reading. Consider an extreme situation:
tooling is scheduled out after reading record 'D', then a burst of events
come, eat up the whole ring buffer (one or multiple rounds). When the
tooling process comes back, reading after 'D' is incorrect now.
To prevent this problem, we need to find a way to ensure the ring buffer
is stable during reading. ioctl(PERF_EVENT_IOC_PAUSE_OUTPUT) is
suggested because its overhead is lower than
ioctl(PERF_EVENT_IOC_ENABLE).
By carefully verifying 'header' pointer, reader can avoid pausing the
ring-buffer. For example:
/* A union of all possible events */
union perf_event event;
p = head = perf_mmap__read_head();
while (true) {
/* copy header of next event */
fetch(&event.header, p, sizeof(event.header));
/* read 'head' pointer */
head = perf_mmap__read_head();
/* check overwritten: is the header good? */
if (!verify(sizeof(event.header), p, head))
break;
/* copy the whole event */
fetch(&event, p, event.header.size);
/* read 'head' pointer again */
head = perf_mmap__read_head();
/* is the whole event good? */
if (!verify(event.header.size, p, head))
break;
p += event.header.size;
}
However, the overhead is high because:
a) In-place decoding is not safe.
Copying-verifying-decoding is required.
b) Fetching 'head' pointer requires additional synchronization.
(From Alexei Starovoitov:
Even when this trick works, pause is needed for more than stability of
reading. When we collect the events into overwrite buffer we're waiting
for some other trigger (like all cpu utilization spike or just one cpu
running and all others are idle) and when it happens the buffer has
valuable info from the past. At this point new events are no longer
interesting and buffer should be paused, events read and unpaused until
next trigger comes.)
This patch utilizes event's default overflow_handler introduced
previously. perf_event_output_backward() is created as the default
overflow handler for backward ring buffers. To avoid extra overhead to
fast path, original perf_event_output() becomes __perf_event_output()
and marked '__always_inline'. In theory, there's no extra overhead
introduced to fast path.
Performance testing:
Calling 3000000 times of 'close(-1)', use gettimeofday() to check
duration. Use 'perf record -o /dev/null -e raw_syscalls:*' to capture
system calls. In ns.
Testing environment:
CPU : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Kernel : v4.5.0
MEAN STDVAR
BASE 800214.950 2853.083
PRE1 2253846.700 9997.014
PRE2 2257495.540 8516.293
POST 2250896.100 8933.921
Where 'BASE' is pure performance without capturing. 'PRE1' is test
result of pure 'v4.5.0' kernel. 'PRE2' is test result before this
patch. 'POST' is test result after this patch. See [4] for the detailed
experimental setup.
Considering the stdvar, this patch doesn't introduce performance
overhead to the fast path.
perf test: Add missing verbose output explaining the reason for failure
One of the branches leading to an error had no debug message emitted,
fix it, the new lines are:
# perf test -v kallsyms
<SNIP>
0xffffffff81001000: diff name v: xen_hypercall_set_trap_table k: hypercall_page
0xffffffff810691f0: diff name v: try_to_free_pud_page k: try_to_free_pmd_page
<SNIP>
0xffffffff8150bb20: diff name v: wakeup_expire_count_show.part.5 k: wakeup_active_count_show.part.7
0xffffffff816bc7f0: diff name v: phys_switch_id_show.part.11 k: phys_port_name_show.part.12
0xffffffff817bbb90: diff name v: __do_softirq k: __softirqentry_text_start
<SNIP>
This in turn exercises another bug, still under investigation, because those
aliases _are_ in kallsyms, with the same name...
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ab414dcda8fa ("perf test: Fixup aliases checking in the 'vmlinux matches kallsyms' test") Link: http://lkml.kernel.org/n/tip-5fhea7a54a54gsmagu9obpr4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
# perf test -v 1
1: vmlinux symtab matches kallsyms :
--- start ---
test child forked, pid 7058
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.6.0-rc1+/build/vmlinux for symbols
0xffffffff81076870: diff end addr for aesni_gcm_dec v: 0xffffffff810791f2 k: 0xffffffff81076902
0xffffffff81079200: diff end addr for aesni_gcm_enc v: 0xffffffff8107bb03 k: 0xffffffff81079292
0xffffffff8107e8d0: diff end addr for aesni_gcm_enc_avx_gen2 v: 0xffffffff81083e76 k: 0xffffffff8107e943
0xffffffff81083e80: diff end addr for aesni_gcm_dec_avx_gen2 v: 0xffffffff81089611 k: 0xffffffff81083ef3
0xffffffff81089990: diff end addr for aesni_gcm_enc_avx_gen4 v: 0xffffffff8108e7c4 k: 0xffffffff81089a03
0xffffffff8108e7d0: diff end addr for aesni_gcm_dec_avx_gen4 v: 0xffffffff810937ef k: 0xffffffff8108e843
Maps only in vmlinux: ffffffff81d5e000-ffffffff81ec3ac8115e000 [kernel].init.text ffffffff81ec3ac8-ffffffffa000000012c3ac8 [kernel].exit.text
Maps in vmlinux with a different name in kallsyms:
Maps only in kallsyms:
test child finished with -1
---- end ----
vmlinux symtab matches kallsyms: FAILED!
#
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 8e0cf965f95e ("perf symbols: Add support for reading from /proc/kcore") Link: http://lkml.kernel.org/n/tip-n6vrwt9t89w8k769y349govx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf symbols: Allow loading kallsyms without considering kcore files
Before the support for using /proc/kcore was introduced, the kallsyms
routines used /proc/modules and the first 'perf test' entry expected
finding maps for each module in the system, which is not the case with
the kcore code. Provide a way to ignore kcore files so that the test can
have its expectations met.
Improving the test to cover kcore files as well needs to be done.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ek5urnu103dlhfk4l6pcw041@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf build: Remove x86 references from arch-neutral Build
It will already be dealt with generating the syscalltbl.c file in the
x86 arch specific Build files, namely via 'archheaders'.
This fixes the build on !x86 arches, as reported for powerpcle
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 1b700c997500 ("perf tools: Build syscall table .c header from kernel's syscall_64.tbl") Link: http://lkml.kernel.org/r/20160415212831.GT9056@kernel.org
[ Removed the syscalltbl.o altogether, as per Jiri's suggestion ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Colin Ian King [Mon, 18 Apr 2016 23:07:18 +0000 (00:07 +0100)]
perf jit: memset() variable 'st' using the correct size
The current code is memsetting the 'struct stat' variable 'st' with the size of
'stat' (which turns out to be 1 byte) rather than the size of variable 'sz'.
Committer notes:
sizeof(function) isn't valid, the result depends on the compiler used, with
gcc, enabling pedantic warnings we get:
int main(void)
{
printf("sizeof(stat)=%zd, stat=%p\n", sizeof(stat), stat);
return 0;
}
$ readelf -sW sizeof_function | grep -w stat
49: 0000000000400630 16 FUNC WEAK HIDDEN 13 stat
$ cc -pedantic sizeof_function.c -o sizeof_function
sizeof_function.c: In function ‘main’:
sizeof_function.c:8:46: warning: invalid application of ‘sizeof’ to a function type [-Wpointer-arith]
printf("sizeof(stat)=%zd, stat=%p\n", sizeof(stat), stat);
^
$ ./sizeof_function
sizeof(stat)=1, stat=0x400630
$
Standard C, section 6.5.3.4:
"The sizeof operator shall not be applied to an expression that has function
type or an incomplete type, to the parenthesized name of such a type,
or to an expression that designates a bit-field member."
The current instructions for setting up an Ubuntu system for using the
export-to-postgresql.py script are incorrect.
The instructions in the script have been updated to work on newer
versions of ubuntu.
-Add missing dependencies to apt-get command:
python-pyside.qtsql, libqt4-sql-psql
-Add '-s' option to createuser command to force the user to be a
superuser since the command doesn't prompt as indicated in the
current instructions.
perf top: Use callchain_param.enabled instead of symbol_conf.use_callchain
One more step in the direction of using just callchain_param for
callchain parameters.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-3b1o9kb2dc94zldz0klckti6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf hists browser: Fold two consecutive symbol_conf.use_callchain ifs
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-u701i6qpecgm9jiat52i8l98@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-silwqjc2t25ls42dsvg28pp5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf report: Use callchain_param.enabled instead of tool specific knob
We have callchain_param.enabled, so no need to have something just for
'perf report' to do the same thing.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wbeisubpualwogwi5u8utnt1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf callchain: Set callchain_param.enabled when parsing --call-graph
Trying to move in the direction of using callchain_param for all
callchain parameters, eventually ditching them from symbol_conf.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-kixllia6r26mz45ng056zq7z@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf script: Check sample->callchain before using it
Found by code inspection, while looking at thread__resolve_callchain()
callsites, one had it, the other didn't.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-6r8i2afd3523thuuaxl39yhk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf evsel: Add missign class prefix to has_branch_stack method
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-5i07ivw1yjsweb7gztr255jd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'dm-4.6-fix-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mike Snitzer:
"Fix for earlier 4.6-rc4 stable@ commit that introduced improper use of
write lock in cmd_read_lock() -- due to cut-n-paste gone awry (and
sparse didn't catch it)"
* tag 'dm-4.6-fix-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache metadata: fix cmd_read_lock() acquiring write lock
Commit 9567366fefdd ("dm cache metadata: fix READ_LOCK macros and
cleanup WRITE_LOCK macros") uses down_write() instead of down_read() in
cmd_read_lock(), yet up_read() is used to release the lock in
READ_UNLOCK(). Fix it.
Fixes: 9567366fefdd ("dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros") Cc: stable@vger.kernel.org Signed-off-by: Ahmed Samy <f.fallen45@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Merge tag 'char-misc-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are some small char/misc driver fixes for 4.6-rc4. Full details
are in the shortlog, nothing major here.
These have all been in linux-next for a while with no reported issues"
* tag 'char-misc-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
lkdtm: do not leak free page on kmalloc failure
lkdtm: fix memory leak of base
lkdtm: fix memory leak of val
extcon: palmas: Drop stray IRQF_EARLY_RESUME flag
Merge tag 'driver-core-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull misc fixes from Greg KH:
"Here are three small fixes for 4.6-rc4.
Two fix up some lz4 issues with big endian systems, and the remaining
one resolves a minor debugfs issue that was reported.
All have been in linux-next with no reported issues"
* tag 'driver-core-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
lib: lz4: cleanup unaligned access efficiency detection
lib: lz4: fixed zram with lz4 on big endian machines
debugfs: Make automount point inodes permanently empty
Merge tag 'usb-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
"Here are some small USB fixes for 4.6-rc4.
Mostly xhci fixes for reported issues, a UAS bug that has hit a number
of people, including stable tree users, and a few other minor things.
All have been in linux-next for a while with no reported issues"
* tag 'usb-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: hcd: out of bounds access in for_each_companion
USB: uas: Add a new NO_REPORT_LUNS quirk
USB: uas: Limit qdepth at the scsi-host level
doc: usb: Fix typo in gadget_multi documentation
usb: host: xhci-plat: Make enum xhci_plat_type start at a non zero value
xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers
usb: xhci: fix wild pointers in xhci_mem_cleanup
usb: host: xhci-plat: fix cannot work if R-Car Gen2/3 run on above 4GB phys
usb: host: xhci: add a new quirk XHCI_NO_64BIT_SUPPORT
xhci: resume USB 3 roothub first
usb: xhci: applying XHCI_PME_STUCK_QUIRK to Intel BXT B0 host
cdc-acm: fix crash if flushed with nothing buffered
Merge tag 'dmaengine-fix-4.6-rc4' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"This time we have some odd fixes in hsu, edma, omap and xilinx.
Usual fixes and nothing special"
* tag 'dmaengine-fix-4.6-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: dw: fix master selection
dmaengine: edma: special case slot limit workaround
dmaengine: edma: Remove dynamic TPTC power management feature
dmaengine: vdma: don't crash when bad channel is requested
dmaengine: omap-dma: Do not suppress interrupts for memcpy
dmaengine: omap-dma: Fix polled channel completion detection and handling
dmaengine: hsu: correct use of channel status register
dmaengine: hsu: correct residue calculation of active descriptor
dmaengine: hsu: set HSU_CH_MTSR to memory width
perf trace: Fix build when DWARF unwind isn't available
The variable is initialized and then conditionally set to a different
value, but not used when DWARF unwinding is not available, bummer, write
1000 times: "Run make -C tools/perf build-test"...
builtin-trace.c: In function ‘cmd_trace’:
builtin-trace.c:3112:6: error: variable ‘max_stack_user_set’ set but not
used [-Werror=unused-but-set-variable]
bool max_stack_user_set = true;
^
cc1: all warnings being treated as err
Fix it by marking it as __maybe_unused.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 056149932602 ("perf trace: Make --(min,max}-stack imply "--call-graph dwarf"") Link: http://lkml.kernel.org/n/tip-85r40c5hhv6jnmph77l1hgsr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-core-for-mingo-20160415' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements from Arnaldo Carvalho de Melo:
User visible changes:
- Wire the callchain unwinding "max-stack" now to 'perf script --max-stack',
allowing to limit the depth of callchains, possibly reducing processing
time (Arnaldo Carvalho de Melo)
- Ditto for 'perf trace --max-stack' (Arnaldo Carvalho de Melo)
- Introduce a --min-stack filter for 'perf trace', to show syscalls that
had a userspace callchain leading to it at least min-stack deep (Arnaldo Carvalho de Melo)
- Make 'perf trace' work with multiple threads and the --duration filter,
i.e. do not print the start of an interrupted syscall followed by ...
to print interrupts from other threads, as we need to wait the sys_exit
syscall tracepoint to calculate the duration, duh. (Arnaldo Carvalho de Melo)
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few fixes for the current series. This contains:
- Two fixes for NVMe:
One fixes a reset race that can be triggered by repeated
insert/removal of the module.
The other fixes an issue on some platforms, where we get probe
timeouts since legacy interrupts isn't working. This used not to
be a problem since we had the worker thread poll for completions,
but since that was killed off, it means those poor souls can't
successfully probe their NVMe device. Use a proper IRQ check and
probe (msi-x -> msi ->legacy), like most other drivers to work
around this. Both from Keith.
- A loop corruption issue with offset in iters, from Ming Lei.
- A fix for not having the partition stat per cpu ref count
initialized before sending out the KOBJ_ADD, which could cause user
space to access the counter prior to initialization. Also from
Ming Lei.
- A fix for using the wrong congestion state, from Kaixu Xia"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: loop: fix filesystem corruption in case of aio/dio
NVMe: Always use MSI/MSI-x interrupts
NVMe: Fix reset/remove race
writeback: fix the wrong congested state variable definition
block: partition: initialize percpuref before sending out KOBJ_ADD
Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Ross Zwisler:
"Two fixes:
- Fix memcpy_from_pmem() to fallback to memcpy() for architectures
where CONFIG_ARCH_HAS_PMEM_API=n.
- Add a comment explaining why we write data twice when clearing
poison in pmem_do_bvec().
This has passed a boot test on an X86_32 config, which was the
architecture where issue #1 above was first noticed"
Dan Williams adds:
"We're giving this multi-maintainer setup a shot, so expect libnvdimm
pull requests from either Ross or I going forward"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm, pmem: clarify the write+clear_poison+write flow
pmem: fix BUG() error in pmem.h:48 on X86_32
Merge tag 'mmc-v4.6-rc3' of git://git.linaro.org/people/ulf.hansson/mmc
Pull MMC fixes from Ulf Hansson:
"Here are a couple of mmc fixes intended for v4.6 rc4.
Regarding the fix for the regression about mmcblk device indexes. The
approach taken to solve the problem seems to be good enough. There
were some discussions around the solution, but it seems like people
were happy about it in the end.
MMC core:
- Restore similar old behaviour when assigning mmcblk device indexes
MMC host:
- tegra: Disable UHS-I modes for Tegra124 to fix regression"
* tag 'mmc-v4.6-rc3' of git://git.linaro.org/people/ulf.hansson/mmc:
mmc: tegra: Disable UHS-I modes for Tegra124
mmc: block: Use the mmc host device index as the mmcblk device index
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"This contains fixes for exynos, amdgpu, radeon, i915 and qxl.
It also contains some fixes to the core drm edid parser.
qxl:
- fix for a cursor hotspot issue
radeon:
- some MST fixes that I've been running locally and make my monitor a
bit happier
exynos:
- fix some regressions and build fixes
amdgpu:
- a couple of small fixes
i915:
- two DP MST fixes and a couple of other regression fixes
Nothing too out of the ordinary or surprising at this point"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/exynos: Use VIDEO_SAMSUNG_S5P_G2D=n as G2D Kconfig dependency
drm/exynos: fix a warning message
drm/exynos: mic: fix an error code
drm/exynos: fimd: fix broken dp_clock control
drm/exynos: build fbdev code conditionally
drm/exynos: fix adjusted_mode pointer in exynos_plane_mode_set
drm/exynos: fix error handling in exynos_drm_subdrv_open
drm/amd/amdgpu: fix irq domain remove for tonga ih
drm/i915: fix deadlock on lid open
drm/radeon: use helper for mst connector dpms.
drm/radeon/mst: port some MST setup code from DAL.
drm/amdgpu: add invisible pin size statistic
drm/edid: Fix DMT 1024x768@43Hz (interlaced) timings
drm/i915: Exit cherryview_irq_handler() after one pass
drm/i915: Call intel_dp_mst_resume() before resuming displays
drm/i915: Fix race condition in intel_dp_destroy_mst_connector()
drm/edid: Fix parsing of EDID 1.4 Established Timings III descriptor
drm/edid: Fix EDID Established Timings I and II
drm/qxl: fix cursor position with non-zero hotspot
Merge branch 'parisc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc ftrace fixes from Helge Deller:
"This is (most likely) the last pull request for v4.6 for the parisc
architecture.
It fixes the FTRACE feature for parisc, which is horribly broken since
quite some time and doesn't even compile. This patch just fixes the
bare minimum (it actually removes more lines than it adds), so that
the function tracer works again on 32- and 64bit kernels.
I've queued up additional patches on top of this patch which e.g. add
the syscall tracer, but those have to wait for the merge window for
v4.7."
* 'parisc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix ftrace function tracer
Dan Williams [Fri, 15 Apr 2016 02:40:47 +0000 (19:40 -0700)]
libnvdimm, pmem: clarify the write+clear_poison+write flow
The ACPI specification does not specify the state of data after a clear
poison operation. Potential future libnvdimm bus implementations for
other architectures also might not specify or disagree on the state of
data after clear poison. Clarify why we write twice.
Reported-by: Jeff Moyer <jmoyer@redhat.com> Reported-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
perf trace: Bump --mmap-pages when --call-graph is used by the root user
To reduce the chances we'll overflow the mmap buffer, manual fine tuning
trumps this.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wxygbxmp1v9mng1ea28wet02@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the user doesn't set --mmap-pages, perf_evlist__mmap() will do it
by reading the maximum possible for a non-root user from the
/proc/sys/kernel/perf_event_mlock_kb file.
Expose that function so that 'perf trace' can, for root users, to bump
mmap-pages to a higher value for root, based on the contents of this
proc file.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xay69plylwibpb3l4isrpl1k@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf trace: Make --(min,max}-stack imply "--call-graph dwarf"
If one uses:
# perf trace --min-stack 16
Then it implicitly means that callgraphs should be enabled, and the best
option in terms of widespread availability is "dwarf".
Further work needed to choose a better alternative, LBR, in capable
systems.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xtjmnpkyk42npekxz3kynzmx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record: Export record_opts based callchain parsing helper
To be able to call it outside option parsing, like when setting a
default --call-graph parameter in 'perf trace' when just --min-stack is
used.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xay69plylwibpb3l4isrpl1k@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-jncuxju9fibq2rl6olhqwjw6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf trace: Do not print interrupted syscalls when using --duration
With multiple threads, e.g. a system wide trace session, and one syscall is
midway in a thread and another thread starts another syscall we must print the
start of the interrupted syscall followed by ..., but that can't be done that
way when we use the --duration filter, as we have to wait for the syscall exit
to calculate the duration and decide if it should be filtered, so we have to
disable the interrupted logic and only print at syscall exit, duh.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-d2nay6kjax5ro991c9kelvi5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ming Lei [Fri, 15 Apr 2016 10:51:28 +0000 (18:51 +0800)]
block: loop: fix filesystem corruption in case of aio/dio
Starting from commit e36f620428(block: split bios to max possible length),
block core starts to split bio in the middle of bvec.
Unfortunately loop dio/aio doesn't consider this situation, and
always treat 'iter.iov_offset' as zero. Then filesystem corruption
is observed.
This patch figures out the offset of the base bvevc via
'bio->bi_iter.bi_bvec_done' and fixes the issue by passing the offset
to iov iterator.
Fixes: e36f6204288088f (block: split bios to max possible length) Cc: Keith Busch <keith.busch@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org (4.5) Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes: a binutils fix, an lguest fix, an mcelog fix and a missing
documentation fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Avoid using object after free in genpool
lguest, x86/entry/32: Fix handling of guest syscalls using interrupt gates
x86/build: Build compressed x86 kernels as PIE
x86/mm/pkeys: Add missing Documentation
Merge branch 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull mm gup cleanup from Ingo Molnar:
"This removes the ugly get-user-pages API hack, now that all upstream
code has been migrated to it"
("ugly" is putting it mildly. But it worked.. - Linus)
* 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mm/gup: Remove the macro overload API migration helpers from the get_user*() APIs
Merge tag 'dm-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- fix a 4.6-rc1 bio-based DM 'struct dm_target_io' leak in an error
path
- stable@ fix for DM cache metadata's READ_LOCK macros that were
incorrectly returning error if the block manager was in read-only
mode; also cleanup multi-statement macros to use do {} while(0)
* tag 'dm-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros
dm: fix dm_target_io leak if clone_bio() returns an error
Merge tag 'sound-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"We've had a very calm development cycle, so far. Here are the few
fixes for HD-audio and USB-audio, all of which are small and easy"
* tag 'sound-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix inconsistent monitor_present state until repoll
ALSA: hda - Fix regression of monitor_present flag in eld proc file
ALSA: usb-audio: Skip volume controls triggers hangup on Dell USB Dock
ALSA: hda/realtek - Enable the ALC292 dock fixup on the Thinkpad T460s
ALSA: sscape: Use correct format identifier for size_t
ALSA: usb-audio: Add a quirk for Plantronics BT300
ALSA: usb-audio: Add a sample rate quirk for Phoenix Audio TMX320
ALSA: hda - Bind with i915 only when Intel graphics is present
Merge branch 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox fixes from Jussi Brar:
"Misc fixes:
mailbox-test driver:
- prevent memory leak and another cosmetic change
mailbox:
- change the returned error code
Xgene driver:
- return -ENOMEM instead of PTR_ERR for failed devm_kzalloc"
* 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: Stop using ENOSYS for anything other than unimplemented syscalls
mailbox: mailbox-test: Prevent memory leak
mailbox: mailbox-test: Use more consistent format for calling copy_from_user()
mailbox: xgene-slimpro: Fix wrong test for devm_kzalloc
Merge tag 'for-linus-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs/fscrypto fixes from Jaegeuk Kim:
"In addition to f2fs/fscrypto fixes, I've added one patch which
prevents RCU mode lookup in d_revalidate, as Al mentioned.
These patches fix f2fs and fscrypto based on -rc3 bug fixes in ext4
crypto, which have not yet been fully propagated as follows.
- use of dget_parent and file_dentry to avoid crashes
- disallow RCU-mode lookup in d_invalidate
- disallow -ENOMEM in the core data encryption path"
* tag 'for-linus-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
ext4/fscrypto: avoid RCU lookup in d_revalidate
fscrypto: don't let data integrity writebacks fail with ENOMEM
f2fs: use dget_parent and file_dentry in f2fs_file_open
fscrypto: use dget_parent() in fscrypt_d_revalidate()
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes an NFS regression caused by the skcipher/hash conversion in
sunrpc. It also fixes a build problem in certain configurations with
bcm63xx"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: bcm63xx - fix device tree compilation
sunrpc: Fix skcipher/shash conversion
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull keys bugfixes from James Morris:
"Two bugfixes for Keys related code"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
ASN.1: fix open failure check on headername
assoc_array: don't call compare_object() on a node
perf evsel: Move fprintf methods to separate source file
They still use functions that would drag more stuff to the python
binding, where these fprintf methods are not used, so separate it.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xfp0mgq3hh3px61di6ixi1jk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Similar to the one in the other tools (report, script, top).
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-lh7kk5a5t3erwxw31ah0cgar@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Works just like with 'perf report'. In some cases we may want to have
more than 127 entries, the default maximum.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-mqkz2p5ok2978gztb0vsnocc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf tools: Remove addr_location argument to sample__fprintf_callchain
Not used at all, nuke it.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-jf2w8ce8nl3wso3vuodg5jci@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf evsel: Require that callchains be resolved before calling fprintf_{sym,callchain}
This way the print routine merely does printing, not requiring access to
the resolving machinery, which helps disentangling the object files and
easing creating subsets with a limited functionality set.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-2ti2jbra8fypdfawwwm3aee3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf symbols: Move fprintf routines to separate object file
To disentangle symbol printing from all the code related to symbol
tables, resolution of addresses to symbols, etc.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-eik9g3hbtdc7ddv57f1d4v3p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Mike Snitzer [Tue, 12 Apr 2016 16:14:46 +0000 (12:14 -0400)]
dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros
The READ_LOCK macro was incorrectly returning -EINVAL if
dm_bm_is_read_only() was true -- it will always be true once the cache
metadata transitions to read-only by dm_cache_metadata_set_read_only().
Wrap READ_LOCK and WRITE_LOCK multi-statement macros in do {} while(0).
Also, all accesses of the 'cmd' argument passed to these related macros
are now encapsulated in parenthesis.
A follow-up patch can be developed to eliminate the use of macros in
favor of pure C code. Avoiding that now given that this needs to apply
to stable@.
Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Fixes: d14fcf3dd79 ("dm cache: make sure every metadata function checks fail_io") Cc: stable@vger.kernel.org