Ingo Molnar [Fri, 12 Oct 2012 10:13:10 +0000 (12:13 +0200)]
sched/numa: Describe the NUMA scheduling problem formally
This is probably a first: a formal description of a complex high-level
computing problem, within the kernel source.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Rik van Riel <riel@redhat.com> Link: http://lkml.kernel.org/n/tip-mmnlpupoetcatimvjEld16Pb@git.kernel.org
[ Next step: generate the kernel source from such formal descriptions and retire to a tropical island! ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Tue, 9 Oct 2012 11:46:22 +0000 (13:46 +0200)]
sched/numa: Add fault driven placement policy
As per the problem/design document Documentation/scheduler/numa-problem.txt
implement 3ac & 4.
A pure 3a was found too unstable, I did briefly try 3bc
but found no significant improvement. We could add a NUMA_FREQ knob
if people want to play further -- but for now implement the simplest
form.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/tip-gzh3q3dzud3lvu6zmf7do0wu@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Tue, 9 Oct 2012 11:19:50 +0000 (13:19 +0200)]
sched/numa: Remove small mode
Now that the 'big' mode is more or less working, remove the small
mode. There's still the periodic scan cost I don't like, but Rik's
suggestion of adapting the scan period (growing it for stable tasks)
should take care of most of that (not implemented yet).
For now, start by removing 'small' mode to clear out/simplify things.
Peter Zijlstra [Fri, 5 Oct 2012 13:57:54 +0000 (15:57 +0200)]
sched/numa: Fix NUMA_PULL_BIAS
PULL_BIAS is broken, the intent was to only attempt a small bias when
the machine was otherwise balanced, hence the out_balanced label.
The problem is that a number of branches that decide we're completely
unbalanced also jump to the out_balanced label, causing undue numa
bias. The end result was that an unbalanced system of 8:1 would try
and move the 1 task towards the 8.
Peter Zijlstra [Fri, 5 Oct 2012 13:57:54 +0000 (15:57 +0200)]
sched/numa: Introduce alternative wakeup bias
Rename NUMA_BIAS to NUMA_TTWU_BIAS to clarify what it does.
Also, disable it by default, as it seems too aggressive. Also provide an
alternative to play with: instead of altering the prev cpu, alter
the waking cpu; maybe that's less aggressive.
Rik van Riel [Tue, 9 Oct 2012 13:31:59 +0000 (15:31 +0200)]
mm: Only flush the TLB when clearing an accessible pte
If ptep_clear_flush() is called to clear a page table entry that is
not accessible by the CPU anyway, e.g. a _PAGE_PROTNONE page table entry,
there is no need to flush the TLB on remote CPUs.
Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/tip-vm3rkzevahelwhejx5uwm8ex@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rik van Riel [Tue, 9 Oct 2012 13:31:12 +0000 (15:31 +0200)]
x86/mm: Introduce pte_accessible()
We need pte_present to return true for _PAGE_PROTNONE pages, to indicate that
the pte is associated with a page.
However, for TLB flushing purposes, we would like to know whether the pte
points to an actually accessible page. This allows us to skip remote TLB
flushes for pages that are not actually accessible.
Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-66p11te4uj23gevgh4j987ip@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
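The distinction drawn in the two entries above can be sketched as follows. This is only an illustration in Python, not the kernel code: the flag bit values are placeholders (real values are architecture-specific), and needs_remote_tlb_flush() is a hypothetical helper name used to show the decision.

```python
# Illustrative flag bits only; the real values are architecture-specific.
PAGE_PRESENT = 1 << 0
PAGE_PROTNONE = 1 << 8


def pte_present(pte_flags: int) -> bool:
    """True when the entry is associated with a page, PROTNONE entries included."""
    return bool(pte_flags & (PAGE_PRESENT | PAGE_PROTNONE))


def pte_accessible(pte_flags: int) -> bool:
    """True only when the CPU can actually access the page through this entry."""
    return bool(pte_flags & PAGE_PRESENT)


def needs_remote_tlb_flush(pte_flags: int) -> bool:
    """Hypothetical helper: only an accessible entry can be cached in a TLB."""
    return pte_accessible(pte_flags)
```

A PROTNONE-only entry is therefore "present" but not "accessible", so clearing it would not need a remote flush in this model.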
Michal Hocko [Wed, 10 Oct 2012 06:21:09 +0000 (11:51 +0530)]
nohz: Fix idle ticks in cpu summary line of /proc/stat
Git commit 09a1d34f8535ecf9 "nohz: Make idle/iowait counter update
conditional" introduced a bug in regard to cpu hotplug. The effect is
that the number of idle ticks in the cpu summary line in /proc/stat is
still counting ticks for offline cpus.
Reproduction is easy, just start a workload that keeps all cpus busy,
switch off one or more cpus and then watch the idle field in top.
On a dual-core with one cpu 100% busy and one offline cpu you will get
something like this:
The problem is that an offline cpu still has ts->idle_active == 1.
To fix this we should make sure that the cpu is online when calling
get_cpu_idle_time_us and get_cpu_iowait_time_us.
[Srivatsa: Rebased to current mainline]
Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20121010061820.8999.57245.stgit@srivatsabhat.in.ibm.com Cc: deepthi@linux.vnet.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
timers: Fix endless looping between cascade() and internal_add_timer()
Adding two (or more) timers with large values for "expires" (they have
to reside within tv5 in the same list) leads to endless looping
between cascade() and internal_add_timer() in case CONFIG_BASE_SMALL
is one and jiffies are crossing the value 1 << 18. The bug was
introduced between 2.6.11 and 2.6.12 (and survived for quite some
time).
This patch ensures that when cascade() is called timers within tv5 are
not added endlessly to their own list again, instead they are added to
the next lower tv level tv4 (as expected).
Irina Tirdea [Mon, 8 Oct 2012 06:43:28 +0000 (09:43 +0300)]
Documentation: add documentation on compiling for Android
Add documentation for cross-compiling on Android including:
- instructions on how to set the Android NDK environment
- how to cross-compile perf for Android
- how to install on an Android device/emulator, set the runtime
environment and run it
Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1349678613-7045-4-git-send-email-irina.tirdea@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Irina Tirdea [Mon, 8 Oct 2012 06:43:27 +0000 (09:43 +0300)]
perf tools: Update Makefile for Android
For cross-compiling on Android, some specific changes are needed in
the Makefile.
Update the Makefile to support cross-compiling for Android.
The original idea for this was sent by Bernhard Rosenkraenzer in
https://lkml.org/lkml/2012/8/23/316, but this is a rewrite.
Changes:
- support bionic in addition to glibc
- remove rt and pthread libraries that do not exist in Android
- use $(CFLAGS) when detecting initial compiler flags. This is needed
when setting CFLAGS as an argument of make (e.g. for setting --sysroot).
- include perf's local directory when building for Android to be able to find
relative paths if using --sysroot (e.g.: ../../include/linux/perf_event.h)
Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Cc: Bernhard Rosenkraenzer <Bernhard.Rosenkranzer@linaro.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1349678613-7045-3-git-send-email-irina.tirdea@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
on_exit() is only available in new versions of glibc.
It is not implemented in Bionic and will lead to linking errors when
compiling for Android.
Implement a wrapper for on_exit using atexit.
The implementation for on_exit is the one sent by Bernhard Rosenkraenzer in
https://lkml.org/lkml/2012/8/23/316. The configuration part from the Makefile
is different than the one from the original patch.
Signed-off-by: Bernhard Rosenkraenzer <Bernhard.Rosenkranzer@linaro.org> Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Irina Tirdea <irina.tirdea@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1349678613-7045-2-git-send-email-irina.tirdea@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
rcu: Grace-period initialization excludes only RCU notifier
Kirill noted the following deadlock cycle on shutdown involving padata:
> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> poweroff.
>
> It guess it happens because of race for cpu_hotplug.lock:
>
> CPU A                                 CPU B
> disable_nonboot_cpus()
> _cpu_down()
> cpu_hotplug_begin()
>  mutex_lock(&cpu_hotplug.lock);
> __cpu_notify()
> padata_cpu_callback()
> __padata_remove_cpu()
> padata_replace()
> synchronize_rcu()
>                                       rcu_gp_kthread()
>                                       get_online_cpus();
>                                       mutex_lock(&cpu_hotplug.lock);
It would of course be good to eliminate grace-period delays from
CPU-hotplug notifiers, but that is a separate issue. Deadlock is
not an appropriate diagnostic for excessive CPU-hotplug latency.
Fortunately, grace-period initialization does not actually need to
exclude all of the CPU-hotplug operation, but rather only RCU's own
CPU_UP_PREPARE and CPU_DEAD CPU-hotplug notifiers. This commit therefore
introduces a new per-rcu_state onoff_mutex that provides the required
concurrency control in place of the get_online_cpus() that was previously
in rcu_gp_init().
Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Kirill A. Shutemov <kirill@shutemov.name>
perf machine: Carve up event processing specific from perf_tool
The perf_tool vtable expects methods that receive perf_tool and
perf_sample entries, but tools not interested in doing any special
processing on non PERF_RECORD_SAMPLE events, like 'perf top', and
those not using perf_session, like 'perf trace', were calling
perf_event__process and passing tool and sample parameters that were just
not used.
Provide 'machine' methods for this purpose and make the perf_event
ones use them.
Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ot9cc6mt025o8kbngzckcrx9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf event: No need to create a thread when handling PERF_RECORD_EXIT
When we were processing a PERF_RECORD_EXIT event we first used
machine__findnew_thread for both the thread exiting and for its parent,
only to use just the thread struct associated with the one exiting, and
to just delete it.
If it existed, i.e. was not created at this very moment in
machine__findnew_thread, it will be moved to the machine->dead_threads
linked list, because we may have hist_entries pointing to it, but if it
was created just to be deleted, it will just sit there with no
references at all.
Use the new machine__find_thread() method so that if it is not there, we
don't create it.
As a bonus the parent thread will also not be created at this point.
Create process_fork() and process_exit() helpers to use this and make
the builtins use it instead of the generic process_task(), ditched by
this patch.
Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-z7n2y98ebjyrvmytaope4vdl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Fri, 5 Oct 2012 14:44:47 +0000 (16:44 +0200)]
perf diff: Display empty space for non paired samples
Currently in the 'Baseline' and 'Period Base' columns zero values are
displayed in case no pair is found for the sample. This might be
confusing, so display empty space instead.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-9-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Fri, 5 Oct 2012 14:44:44 +0000 (16:44 +0200)]
perf diff: Add -p option to display period values for hist entries
Adding -p option to show period values for both compared hist entries.
Showing hist column PERF_HPP__PERIOD and newly added hist column
PERF_HPP__PERIOD_BASELINE.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-6-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Fri, 5 Oct 2012 14:44:43 +0000 (16:44 +0200)]
perf diff: Add weighted diff computation way to compare hist entries
Adding 'wdiff' as a new computation method for comparing hist entries.
If specified, the 'Weighted diff' column is displayed with value 'd'
computed as:
d = B->period * WEIGHT-A - A->period * WEIGHT-B
- A/B being the matching hist entries from the first/second file specified
(or perf.data/perf.data.old) respectively.
- period being the hist entry period value
- WEIGHT-A/WEIGHT-B being user-supplied weights in the '-c' option
behind the ':' separator, like '-c wdiff:1,2'.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-5-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
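As a worked illustration of the weighted diff formula in the entry above, here is a minimal Python sketch. The function and parameter names are made up for this example and are not taken from the perf sources.

```python
def weighted_diff(period_a: int, period_b: int, weight_a: int, weight_b: int) -> int:
    """d = B->period * WEIGHT-A - A->period * WEIGHT-B, as stated in the commit message."""
    return period_b * weight_a - period_a * weight_b
```

For example, with A->period = 100, B->period = 300 and '-c wdiff:1,2', the value is 300 * 1 - 100 * 2 = 100.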
Jiri Olsa [Fri, 5 Oct 2012 14:44:42 +0000 (16:44 +0200)]
perf diff: Add option to sort entries based on diff computation
Adding support to sort hist entries based on the outcome of the selected
computation. It's now possible to specify '+' as the first character of
the '-c' option value to request such a sort.
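The '-c' value syntax described in this log (an optional leading '+', a method name, and optional weights after ':') could be handled roughly as below. This is a hypothetical Python sketch, not perf's parser: the function names are invented here, and both the combined grammar and the descending sort order are assumptions.

```python
def parse_compute_option(value: str):
    """Split a '-c' value into (sort requested, method name, weights)."""
    sort = value.startswith("+")
    if sort:
        value = value[1:]
    method, _, args = value.partition(":")
    weights = tuple(int(w) for w in args.split(",")) if args else ()
    return sort, method, weights


def order_entries(pairs, compute, sort):
    """Compute a value per (A, B) period pair; optionally sort by that value.

    Descending order is an assumption made for this sketch.
    """
    values = [(compute(a, b), a, b) for a, b in pairs]
    if sort:
        values.sort(key=lambda item: item[0], reverse=True)
    return values
```

With this sketch, parse_compute_option("+ratio") requests a sort on the ratio value, while parse_compute_option("wdiff:1,2") selects the weighted diff without sorting.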
Jiri Olsa [Fri, 5 Oct 2012 14:44:41 +0000 (16:44 +0200)]
perf diff: Add ratio computation way to compare hist entries
Adding -c option to select the computation method, with the current 'Delta'
computation as default. The currently possible values of this option are:
'delta' and 'ratio'.
Adding 'ratio' as a new computation method for comparing hist entries. If
specified, the 'Ratio' column is displayed with value 'r' computed as:
r = A->period / B->period
with:
- A/B being matching hist entry from first/second file specified
(or perf.data/perf.data.old) respectively.
- period being the hist entry period value
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
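The ratio formula in the entry above, written out as a small Python sketch. The function name is invented for illustration, and returning None when B->period is zero is an assumption of this sketch; the commit message does not say how that case is handled.

```python
from typing import Optional


def period_ratio(period_a: int, period_b: int) -> Optional[float]:
    """r = A->period / B->period, as stated in the commit message."""
    if period_b == 0:
        # Assumption for this sketch: no ratio is reported without a usable B period.
        return None
    return period_a / period_b
```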
Namhyung Kim [Fri, 5 Oct 2012 05:02:16 +0000 (14:02 +0900)]
perf trace: Add support for tracing workload given by command line
Now perf trace is able to trace a specified workload by forking it, like
perf record does. It also finishes tracing if the workload quits or
gets SIGINT.
Namhyung Kim [Fri, 5 Oct 2012 05:02:14 +0000 (14:02 +0900)]
perf trace: Explicitly enable system-wide mode if no option is given
When no target cpu/user/task option is given, perf trace will do its job
system wide for all online cpus. Make it explicit to reduce possible
confusion when reading code.
Peter Zijlstra [Tue, 2 Oct 2012 13:41:23 +0000 (15:41 +0200)]
perf: Fix perf_cgroup_switch for sw-events
Jiri reported that he could trigger the WARN_ON_ONCE() in
perf_cgroup_switch() using sw-events. This is because sw-events share
a cpuctx with multiple PMUs.
Use the ->unique_pmu pointer to limit the pmu iteration to unique
cpuctx instances.
Peter Zijlstra [Tue, 2 Oct 2012 13:38:52 +0000 (15:38 +0200)]
perf: Clarify perf_cpu_context::active_pmu usage by renaming it to ::unique_pmu
Stephane thought the perf_cpu_context::active_pmu name confusing and
suggested using 'unique_pmu' instead.
This pointer is a pointer to a 'random' pmu sharing the cpuctx
instance; therefore, by limiting a for_each_pmu loop to those pmus where
cpuctx->unique_pmu matches the pmu, we get a loop over unique cpuctx
instances.
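The iteration trick described above can be modelled in a few lines of Python. This is a toy model with invented class names, not the kernel data structures: several pmus share one context, the context remembers one of them as unique_pmu, and the loop skips every pmu that is not the remembered one.

```python
class CpuCtx:
    """Toy stand-in for a per-cpu context shared by several pmus."""

    def __init__(self, name):
        self.name = name
        self.unique_pmu = None


class Pmu:
    """Toy pmu; the first pmu attached to a context becomes its unique_pmu."""

    def __init__(self, name, cpuctx):
        self.name = name
        self.cpuctx = cpuctx
        if cpuctx.unique_pmu is None:
            cpuctx.unique_pmu = self


def unique_contexts(pmus):
    """Visit each shared context exactly once while looping over all pmus."""
    visited = []
    for pmu in pmus:
        if pmu.cpuctx.unique_pmu is not pmu:
            continue  # another pmu already represents this context
        visited.append(pmu.cpuctx)
    return visited
```

Three software pmus sharing one context plus one hardware pmu with its own context yield two visits, not four.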
sched: Update sched_domains_numa_masks[][] when new cpus are onlined
Once the array sched_domains_numa_masks[][] is defined, it is never updated.
When a new cpu on a new node is onlined, the corresponding member in
sched_domains_numa_masks[][] is not initialized, and all the masks are 0.
As a result, build_overlap_sched_groups() will initialize a NULL
sched_group for the new cpu on the new node, which will lead to a kernel panic:
sched: Ensure 'sched_domains_numa_levels' is safe to use in other functions
We should temporarily reset 'sched_domains_numa_levels' to 0 after
it is reset to 'level' in sched_init_numa(). If it fails to allocate
memory for the array sched_domains_numa_masks[][], the array will contain
fewer than 'level' members. This could be dangerous when we use it to
iterate over the array sched_domains_numa_masks[][] in other functions.
This patch sets sched_domains_numa_levels to 0 before initializing
the array sched_domains_numa_masks[][], and resets it to 'level' when
sched_domains_numa_masks[][] is fully initialized.
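The ordering described above (publish the count only after the array is complete) can be illustrated with a short Python sketch. The class, the method names and the allocate callback are hypothetical and only model the idea; they are not the kernel's sched_init_numa().

```python
class NumaMasks:
    """Toy model: 'levels' is only non-zero once every mask has been set up."""

    def __init__(self):
        self.levels = 0
        self.masks = []

    def init(self, level, allocate):
        """allocate(i) returns a mask, or None to simulate an allocation failure."""
        self.levels = 0  # readers see an empty array while we initialize
        self.masks = []
        for i in range(level):
            mask = allocate(i)
            if mask is None:
                return False  # levels stays 0, so nobody walks a partial array
            self.masks.append(mask)
        self.levels = level  # publish the count only when fully initialized
        return True

    def iter_masks(self):
        """Other code iterates using 'levels' as the bound."""
        return [self.masks[i] for i in range(self.levels)]
```

If allocation fails halfway, iter_masks() returns an empty list instead of indexing past the partially built array.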
When we stop the tick in idle, we save the current jiffies value
in ts->idle_jiffies. This snapshot is subtracted from the later
value of jiffies when the tick is restarted and the resulting
delta is accounted as idle cputime. This is how we handle the
idle cputime accounting without the tick.
But sometimes we need to schedule the next tick to some time in
the future instead of completely stopping it. In this case, a
tick may happen before we restart the periodic behaviour and
from that tick we account one jiffy to idle cputime as usual, but
we also increment the ts->idle_jiffies snapshot by one so that
when we compute the delta to account, we subtract the one jiffy
we just accounted.
To prepare for stopping the tick outside idle, we introduced a
check that prevents fixing up ts->idle_jiffies if we
are not running the idle task. But we use idle_cpu() for that,
and this is a problem if we run the tick while another CPU
remotely enqueues a ttwu to our runqueue:
    CPU 0:                            CPU 1:
    tick_sched_timer() {              ttwu_queue_remote()
        if (idle_cpu(CPU 0))
            ts->idle_jiffies++;
    }
Here, idle_cpu() notes that &rq->wake_list is not empty and
hence won't consider the CPU as idle. As a result,
ts->idle_jiffies won't be incremented. But this is wrong because
we actually account the current jiffy to idle cputime. And that
jiffy won't get subtracted from the nohz time delta. So in the
end, this jiffy is accounted twice.
Fix this by replacing idle_cpu(smp_processor_id()) with
is_idle_task(current). This way the jiffy is subtracted
correctly even if a ttwu operation is enqueued on the CPU.
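The double accounting described above can be reproduced with a toy model. The Python sketch below is a simplification under stated assumptions: time is counted in whole jiffies, each tick during the idle period accounts exactly one jiffy, and the names (idle_time_accounted, wakeup_pending_at, fixed) are invented for this illustration.

```python
def idle_time_accounted(stop_jiffy, restart_jiffy, tick_jiffies, wakeup_pending_at, fixed):
    """Return the idle jiffies accounted between stopping and restarting the tick.

    tick_jiffies:      jiffies at which a tick fires while the idle task runs
    wakeup_pending_at: jiffies at which a remote wakeup is queued on this CPU
    fixed:             True models the is_idle_task(current) check,
                       False models the old idle_cpu() check
    """
    idle_jiffies = stop_jiffy  # snapshot taken when the tick is stopped
    accounted = 0
    for j in tick_jiffies:
        accounted += 1  # the tick accounts the current jiffy as idle
        if fixed:
            looks_idle = True  # we are running the idle task
        else:
            looks_idle = j not in wakeup_pending_at  # a pending wakeup hides idleness
        if looks_idle:
            idle_jiffies += 1  # keep this jiffy out of the later delta
    accounted += restart_jiffy - idle_jiffies  # delta accounted at restart
    return accounted
```

With the tick stopped at jiffy 100, restarted at 110, and one tick at 105 that races with a remote wakeup, the old check accounts 11 idle jiffies for a 10 jiffy period, while the fixed check accounts 10.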
Jiri Olsa [Thu, 4 Oct 2012 12:49:40 +0000 (21:49 +0900)]
perf diff: Removing the total_period argument from output code
The total_period is available in struct hists data via the 'struct
hist_entry::hists' pointer. There's no need to carry it through the
output code path.
Removing 'struct perf_hpp::total_period' pointer, because it's no longer
needed.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1349354994-17853-7-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 4 Oct 2012 12:49:39 +0000 (21:49 +0900)]
perf tool: Add hpp interface to enable/disable hpp column
Adding the perf_hpp__column_enable function to enable/disable hists columns
and removing diff command specific stuff ('need_pair' and
'show_displacement') from hpp code.
The diff command now enables/disables columns separately according to
the user arguments. This will be helpful in future patches where more
columns are added into diff output.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1349354994-17853-6-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 4 Oct 2012 12:49:37 +0000 (21:49 +0900)]
perf hists: Separate overhead and baseline columns
Currently the overhead and baseline columns are handled within a single
function and the distinction is made by the 'baseline hists' pointer passed
by 'struct perf_hpp::ptr'.
Since hists pointer is now part of each hist_entry, it's possible to
locate paired hists pointer directly from the passed struct hist_entry
pointer.
Also separating those 2 columns makes the code more obvious.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1349354994-17853-4-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 4 Oct 2012 12:49:36 +0000 (21:49 +0900)]
perf diff: Refactor diff displacement position info
Moving the position calculation into the diff command, so the position
is prepared inside struct hist_entry data and there's no need to compute
it in the output display path.
Removing 'displacement' from struct perf_hpp as it is no longer needed.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1349354994-17853-3-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Thu, 4 Oct 2012 05:23:54 +0000 (14:23 +0900)]
perf tools: Complete tracepoint event names
Currently tracepoint events cannot be completed because they contain a
colon (:) character. The colon is considered a word separator when
bash completion is done (the variable COMP_WORDBREAKS contains a colon), so
a word being completed that contains a colon can be a problem.
Recent versions of bash completion provide the -n switch to
_get_comp_words_by_ref and the __ltrim_colon_completions function in order
to resolve this issue. Copy the latter in case it does not exist.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349328234-16995-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf/x86: Add support for Intel Xeon-Phi Knights Corner PMU
The following patch adds perf_event support for the Xeon-Phi
PMU, as documented in the "Intel Xeon Phi Coprocessor (codename:
Knights Corner) Performance Monitoring Units" manual.
Even though it is a co-processor, a Phi runs a full Linux
environment and can support performance counters.
This is just barebones support; it does not add support for
interesting new features such as the SPFLT instruction that
allows starting/stopping events without entering the kernel.
The PMU internally is just like that of an original Pentium, but
a "P6-like" MSR interface is provided. The interface is
different enough from a real P6 that it's not easy (or
practical) to re-use the code in perf_event_p6.c.
Acked-by: Lawrence F Meadows <lawrence.f.meadows@intel.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: eranian@gmail.com Cc: Lawrence F <lawrence.f.meadows@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1209261405320.8398@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
David Hooper [Tue, 2 Oct 2012 14:26:53 +0000 (15:26 +0100)]
x86/reboot: Remove quirk entry for SBC FITPC
Remove the quirk for the SBC FITPC. It seems to have been
required when the default was kbd reboot, but is no longer required
now that the default is acpi reboot. Furthermore, BIOS reboot no
longer works for this board as of 2.6.39 or any of the 3.x
kernels.
Namhyung Kim [Mon, 1 Oct 2012 16:32:51 +0000 (01:32 +0900)]
perf tools: Convert to BACKTRACE_SUPPORT
For building perf without stack backtrace debug, we can set
NO_BACKTRACE=1 as an argument of make. It then defines the NO_BACKTRACE
macro for C code to do the proper handling. However, it is usually used
with negative semantics (e.g. #ifndef), so we saw double negations which
can be misleading. Convert it to a positive form to make it more
readable and add a _SUPPORT suffix for consistency.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Irina Tirdea <irina.tirdea@gmail.com> Cc: Irina Tirdea <irina.tirdea@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349109171-1942-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>