Peter Zijlstra [Wed, 10 Jun 2009 13:03:06 +0000 (15:03 +0200)]
perf_counter tools: Small frequency related fixes
Create the counter in a disabled state and only enable it after we
mmap() the buffer, this allows us to see the first few samples (and
observe the frequency ramp).
Furthermore, print the period in the verbose report.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yong Wang [Wed, 10 Jun 2009 09:06:12 +0000 (17:06 +0800)]
perf_counter/x86: Fix the model number of Intel Core2 processors
Fix the model number of Intel Core2 processors according to the
documentation: Intel Processor Identification with the CPUID
Instruction: http://www.intel.com/support/processors/sb/cs-009861.htm
Signed-off-by: Yong Wang <yong.y.wang@intel.com> Also-Reported-by: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20090610090612.GA26580@ywang-moblin2.bj.intel.com>
[ Added two more model numbers suggested by Arnd Bergmann ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pekka Enberg [Mon, 8 Jun 2009 18:12:48 +0000 (21:12 +0300)]
perf report: Add support for profiling JIT generated code
This patch adds support for profiling JIT generated code to 'perf
report'. A JIT compiler is required to generate a "/tmp/perf-$PID.map"
symbols map that is parsed when looking and displaying symbols.
Thanks to Peter Zijlstra for his help with this patch!
Peter Zijlstra [Mon, 8 Jun 2009 18:11:57 +0000 (21:11 +0300)]
perf_counter: Add mmap event hooks to mprotect()
Some JIT compilers allocate memory for generated code with
posix_memalign() + mprotect() so we need to hook into mprotect()
to make sure 'perf' is aware that we're executing code in
anonymous memory.
[ penberg@cs.helsinki.fi: move the hook to sys_mprotect() ] Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <Pine.LNX.4.64.0906082111030.12407@melkki.cs.Helsinki.FI> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 7 Jun 2009 15:39:02 +0000 (17:39 +0200)]
perf record: Fall back to cpu-clock-ticks if no PMU
On architectures/CPUs without PMU support but with perfcounters
enabled 'perf record' currently fails because it cannot create a
cycle based hw-perfcounter.
Fall back to the cpu-clock-tick sw-perfcounter in this case, which
is hrtimer based and will always work (as long as perfcounters
are enabled).
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 7 Jun 2009 15:31:52 +0000 (17:31 +0200)]
perf top: Fall back to cpu-clock-tick hrtimer sampling if no cycle counter available
On architectures/CPUs without PMU support but with perfcounters
enabled 'perf top' currently fails because it cannot create a
cycle based hw-perfcounter.
Fall back to the cpu-clock-tick sw-perfcounter in this case, which
is hrtimer based and will always work (as long as perfcounters
is enabled).
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
perf top: Wait for a minimal set of events before reading first snapshot
The first snapshot reading often occur before any events have
been read in the mapped perfcounter files.
Just wait until we have at least one event before starting the
snapshot, or the delay before the first set of entries to be
displayed may be long in case of low refresh rate.
Note: we could also use a semaphore to wait before
"print_entries" number of eveents is reached, but again this
value is tunable and we can't ensure we will even reach it.
Also we could base on a default mimimum set of entries for the
first refresh, say 15, but again, the minimal sample is
tunable, and we could end up displaying nothing until we have a
minimal default set of events, which can take some time in case
of high samples filters.
Hence this simple solution which partially covers the default
case.
[ Impact: fix display artifacts in perf top ]
Signed-off-by: Frederic Weisbecker <fweisbeec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1244322643-6447-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 6 Jun 2009 19:25:29 +0000 (21:25 +0200)]
perf annotate: Fix command line help text
Arjan noticed this bug in the perf annotate help output:
-s, --symbol <file> symbol to annotate
that should be <symbol> instead.
Reported-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
perf_counter tools: Initialize a stack variable before use
the "perf report" utility crashed in some circumstances
because the "sym" stack variable was not initialized before used
(as also proven by valgrind).
With this fix both the crash goes away and valgrind no longer complains.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 6 Jun 2009 19:17:03 +0000 (21:17 +0200)]
perf annotate: Automatically pick up vmlinux in the local directory
Right now kernel debug info does not get resolved by default, because
we dont know where to look for the vmlinux.
The -k option can be used for that - but if no option is given, pick
up vmlinux files in the current directory - in case a kernel hacker
runs profiling from the source directory that the kernel was built in.
The real solution would be to embedd the location (and perhaps the
date/timestamp) of the vmlinux file in /proc/kallsyms, so that
tools can pick it up automatically.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 6 Jun 2009 19:04:17 +0000 (21:04 +0200)]
perf_counter tools: Fix error condition in parse_aliases()
gcc warned about this bug:
util/parse-events.c: In function ‘parse_generic_hw_symbols’:
util/parse-events.c:175: warning: comparison is always false due to limited range of data type
util/parse-events.c:182: warning: comparison is always false due to limited range of data type
util/parse-events.c:190: warning: comparison is always false due to limited range of data type
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 6 Jun 2009 18:33:43 +0000 (20:33 +0200)]
perf_counter tools: Move from Documentation/perf_counter/ to tools/perf/
Several people have suggested that 'perf' has become a full-fledged
tool that should be moved out of Documentation/. Move it to the
(new) tools/ directory.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Extend generic event enumeration with the PERF_TYPE_HW_CACHE
method.
This is a 3-dimensional space:
{ L1-D, L1-I, L2, ITLB, DTLB, BPU } x
{ load, store, prefetch } x
{ accesses, misses }
User-space passes in the 3 coordinates and the kernel provides
a counter. (if the hardware supports that type and if the
combination makes sense.)
Combinations that make no sense produce a -EINVAL.
Combinations that are not supported by the hardware produce -ENOTSUP.
Extend the tools to deal with this, and rewrite the event symbol
parsing code with various popular aliases for the units and
access methods above. So 'l1-cache-miss' and 'l1d-read-ops' are
both valid aliases.
( x86 is supported for now, with the Nehalem event table filled in,
and with Core2 and Atom having placeholder tables. )
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yong Wang [Fri, 5 Jun 2009 03:37:35 +0000 (11:37 +0800)]
perf_counter tools: Fix incorrect printf formats
Otherwise the code does not compile on 32-bit boxes.
builtin-report.c: In function 'map__fprintf':
builtin-report.c:240: error: format '%lx' expects type 'long unsigned int', but argument 3 has type 'uint64_t'
builtin-report.c:240: error: format '%lx' expects type 'long unsigned int', but argument 4 has type 'uint64_t'
builtin-report.c:240: error: format '%lx' expects type 'long unsigned int', but argument 5 has type 'uint64_t'
Signed-off-by: Yong Wang <yong.y.wang@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20090605033735.GA20451@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul Mackerras [Fri, 5 Jun 2009 02:36:28 +0000 (12:36 +1000)]
perf_counter: Fix lockup with interrupting counters
Commit 8e3747c1 ("perf_counter: Change data head from u32 to u64")
changed the type of 'head' in struct perf_mmap_data from atomic_t
to atomic_long_t, but missed converting one use of atomic_read on
it to atomic_long_read. The effect of using atomic_read rather than
atomic_long_read on powerpc (and other big-endian architectures) is
that we get the high half of the 64-bit quantity, resulting in the
cmpxchg retry loop in perf_output_begin spinning forever as soon as
data->head becomes non-zero. On little-endian architectures such as
x86 we would get the low half, resulting in a lockup once data->head
becomes greater than 4G.
This fixes it by using atomic_long_read rather than atomic_read.
[ Impact: fix perfcounter lockup on PowerPC / big-endian systems ]
Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18984.33964.21541.743096@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Thu, 4 Jun 2009 22:23:39 +0000 (15:23 -0700)]
Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: ignore EDID with really tiny modes.
drm: don't associate _DRM_DRIVER maps with a master
drm/i915: intel_lvds.c fix section mismatch
drm: Hook up DPMS property handling in drm_crtc.c. Add drm_helper_connector_dpms.
drm: set permissions on edid file to 0444
drm: add newlines to text sysfs files
drm/radeon: fix ring free alignment calculations
drm: fix irq naming for kms drivers.
Salman Qazi [Thu, 4 Jun 2009 22:20:39 +0000 (15:20 -0700)]
drivers/char/mem.c: avoid OOM lockup during large reads from /dev/zero
While running 20 parallel instances of dd as follows:
#!/bin/bash
for i in `seq 1 20`; do
dd if=/dev/zero of=/export/hda3/dd_$i bs=1073741824 count=1 &
done
wait
on a 16G machine, we noticed that rather than just killing the processes,
the entire kernel went down. Stracing dd reveals that it first does an
mmap2, which makes 1GB worth of zero page mappings. Then it performs a
read on those pages from /dev/zero, and finally it performs a write.
The machine died during the reads. Looking at the code, it was noticed
that /dev/zero's read operation had been changed by 557ed1fa2620dc119adb86b34c614e152a629a80 ("remove ZERO_PAGE") from giving
zero page mappings to actually zeroing the page.
The zeroing of the pages causes physical pages to be allocated to the
process. But, when the process exhausts all the memory that it can, the
kernel cannot kill it, as it is still in the kernel mode allocating more
memory. Consequently, the kernel eventually crashes.
To fix this, I propose that when a fatal signal is pending during
/dev/zero read operation, we simply return and let the user process die.
Signed-off-by: Salman Qazi <sqazi@google.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Modified error return and comment trivially. - Linus] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix warnings for return values that we don't care about:
util/quote.c:222: attention : ignoring return value of ‘fwrite’, declared with attribute warn_unused_result
util/quote.c:235: attention : ignoring return value of ‘fwrite’, declared with attribute warn_unused_result
util/quote.c: In function ‘write_name_quotedpfx’:
util/quote.c:290: attention : ignoring return value of ‘fwrite’, declared with attribute warn_unused_result
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1244146558-8635-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
perf_counter: Sleep before refresh using poll in perf top
perf top is refreshed every delay_secs the thread runs in such
loop:
while (sleep(delay_secs)) {
print_sym_table();
}
At the end of print_sym_table(), poll is used without sleep delay
to check if we have something from stdin.
It means that this check is done only every delay_secs, which can
be higher that 2 secs if the user defined a custom refresh rate.
We can drop sleep() here and directly use poll to wait between
refresh periods, so that the reaction after the user stops perf top
after typing "Enter" is immediate and doesn't suffer from the
delay_secs latency.
Nb: poll doesn't add any overhead that can parasite perf top measures
since it sleeps the entire timeout here.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1244141284-7507-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rusty Russell [Wed, 3 Jun 2009 05:22:24 +0000 (14:52 +0930)]
lguest: fix 'unhandled trap 13' with CONFIG_CC_STACKPROTECTOR
We don't set up the canary; let's disable stack protector on boot.c so
we can get into lguest_init, then set it up. As a side effect,
switch_to_new_gdt() sets up %fs for us properly too.
Eric Anholt [Thu, 4 Jun 2009 11:18:14 +0000 (11:18 +0000)]
drm/i915: Remove a bad BUG_ON in the fence management code.
This could be triggered by a gtt mapping fault on 965 that decides to
remove the fence from another object that happens to be active currently.
Since the other object doesn't rely on the fence reg for its execution, we
don't wait for it to finish. We'll soon be not waiting on 915 most of the
time as well, so just drop the BUG_ON.
Paul Mackerras [Wed, 3 Jun 2009 23:49:59 +0000 (09:49 +1000)]
perf_counter: powerpc: Use new identifier names in powerpc-specific code
Commit b23f3325 ("perf_counter: Rename various fields") fixed up
most of the uses of the renamed fields, but missed one instance
of "record_type" in powerpc-specific code which needs to be changed
to "sample_type", and a "PERF_RECORD_ADDR" in the same statement that
needs to be changed to "PERF_SAMPLE_ADDR", causing compilation
errors on powerpc. This fixes it.
Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18983.3111.770392.800486@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> FYI, i just got this crash (segfault) in perf report after
> collecting a long profile from Xorg:
>
> Starting program: /home/mingo/tip/Documentation/perf_counter/perf report
> [Thread debugging using libthread_db enabled]
> Detaching after fork from child process 20008.
> [New Thread 0x7f92fd62a6f0 (LWP 20005)]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000041031a in __rb_erase_color (node=0x142c090, parent=0x0,
> root=0x881918)
> at util/rbtree.c:143
> 143 if (parent->rb_left == node)
Ben Skeggs [Tue, 26 May 2009 00:35:52 +0000 (10:35 +1000)]
drm: don't associate _DRM_DRIVER maps with a master
A driver will use the _DRM_DRIVER map flag to indicate that it wants
to be responsible for removing the map itself, bypassing the DRM's
automagic cleanup code.
Since the multi-master changes this has been broken, resulting in some
drivers having their registers unmapped before it's finished with them.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Keith Packard [Sun, 31 May 2009 03:42:28 +0000 (20:42 -0700)]
drm: Hook up DPMS property handling in drm_crtc.c. Add drm_helper_connector_dpms.
Making the drm_crtc.c code recognize the DPMS property and invoke the
connector->dpms function doesn't remove any capability from the driver while
reducing code duplication.
That just highlighted the problem with the existing DPMS functions which
could turn off the connector, but failed to turn off any relevant crtcs. The
new drm_helper_connector_dpms function manages all of that, using the
drm_helper-specific crtc and encoder dpms functions, automatically computing
the appropriate DPMS level for each object in the system.
This fixes the current troubles in the i915 driver which left PLLs, pipes
and planes running while in DPMS_OFF mode or even while they were unused.
Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 3 Jun 2009 21:08:13 +0000 (07:08 +1000)]
drm/radeon: fix ring free alignment calculations
fd.o bz#21849
We were aligning to +16 dwords, instead of to the next 16dword
boundary in the ring. Fix the calculation to go to the next 16dword
boundary when space checking.
Ingo Molnar [Wed, 3 Jun 2009 21:29:14 +0000 (23:29 +0200)]
perf report: Handle all known event types
We have munmap, throttle/unthrottle and period events as well,
process them - otherwise they are considered broke events and
we mis-parse the next few events.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 3 Jun 2009 12:01:36 +0000 (14:01 +0200)]
perf_counter: Fix race in counter initialization
We need the PID namespace and counter ID available when the
counter overflows and we need to generate a sample event.
[ Impact: fix kernel crash with high-frequency sampling ]
Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission>
[ fixed a further crash and cleaned up the initialization a bit ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 3 Jun 2009 10:37:36 +0000 (12:37 +0200)]
perf report: Fix comm sorting
Since we can (and do) change comm strings during the collection
phase, we cannot actually sort on them to build the histogram.
Therefore add an (optional) third sorting phase to collapse the
histrogram.
Comm sorting now builds the histrogram on threads and then in
the collapse phase collects all threads with the same comm.
This collapsed histogram is then reversed and sorted on events.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>