Andi Kleen [Thu, 24 Jan 2013 15:10:29 +0000 (16:10 +0100)]
perf tools: Add support for weight v7 (modified)
perf record has a new option -W that enables weightened sampling.
Add sorting support in top/report for the average weight per sample and the
total weight sum. This allows to both compare relative cost per event
and the total cost over the measurement period.
Add the necessary glue to perf report, record and the library.
v2: Merge with new hist refactoring.
v3: Fix manpage. Remove value check.
Rename global_weight to weight and weight to local_weight.
v4: Readd sort keys to manpage
v5: Move weight to end
v6: Move weight to template
v7: Rename weight key.
Original patch from Andi modified by Stephane Eranian <eranian@google.com>
to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.
Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed and the hists_link perf test entry ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stephane Eranian [Thu, 24 Jan 2013 15:10:39 +0000 (16:10 +0100)]
perf: Add PERF_RECORD_MISC_MMAP_DATA to RECORD_MMAP
Type of mapping was lost and made it hard for a tool
to distinguish code vs. data mmaps. Perf has the ability
to distinguish the two.
Use a bit in the header->misc bitmask to keep track of
the mmap type. If PERF_RECORD_MISC_MMAP_DATA is set then
the mapping is not executable (!VM_EXEC). If not set, then
the mapping is executable.
Stephane Eranian [Thu, 24 Jan 2013 15:10:34 +0000 (16:10 +0100)]
perf/x86: Add support for PEBS Precise Store
This patch adds support for PEBS Precise Store
which is available on Intel Sandy Bridge and
Ivy Bridge processors.
To use Precise store, the proper PEBS event
must be used: mem_trans_retired:precise_stores.
For the perf tool, the generic mem-stores event
exported via sysfs can be used directly.
Stephane Eranian [Thu, 24 Jan 2013 15:10:32 +0000 (16:10 +0100)]
perf/x86: Add memory profiling via PEBS Load Latency
This patch adds support for memory profiling using the
PEBS Load Latency facility.
Load accesses are sampled by HW and the instruction
address, data address, load latency, data source, tlb,
locked information can be saved in the sampling buffer
if using the PERF_SAMPLE_COST (for latency),
PERF_SAMPLE_ADDR, PERF_SAMPLE_DATA_SRC types.
To enable PEBS Load Latency, users have to use the
model specific event:
- on NHM/WSM: MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD
- on SNB/IVB: MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD
To make things easier, this patch also exports a generic
alias via sysfs: mem-loads. It export the right event
encoding based on the host CPU and can be used directly
by the perf tool.
Loosely based on Intel's Lin Ming patch posted on LKML
in July 2011.
Stephane Eranian [Thu, 24 Jan 2013 15:10:31 +0000 (16:10 +0100)]
perf: Add generic memory sampling interface
This patch adds PERF_SAMPLE_DATA_SRC.
PERF_SAMPLE_DATA_SRC collects the data source, i.e., where
did the data associated with the sampled instruction
come from. Information is stored in a perf_mem_data_src
structure. It contains opcode, mem level, tlb, snoop,
lock information, subject to availability in hardware.
Andi Kleen [Thu, 24 Jan 2013 15:10:28 +0000 (16:10 +0100)]
perf/core: Add weighted samples
For some events it's useful to weight sample with a hardware
provided number. This expresses how expensive the action the
sample represent was. This allows the profiler to scale
the samples to be more informative to the programmer.
There is already the period which is used similarly, but it
means something different, so I chose to not overload it.
Instead a new sample type for WEIGHT is added.
Can be used for multiple things. Initially it is used for TSX
abort costs and profiling by memory latencies (so to make
expensive load appear higher up in the histograms). The concept
is quite generic and can be extended to many other kinds of
events or architectures, as long as the hardware provides
suitable auxillary values. In principle it could be also used
for software tracepoints.
This adds the generic glue. A new optional sample format for a
64-bit weight value.
Stephane Eranian [Thu, 24 Jan 2013 15:10:27 +0000 (16:10 +0100)]
perf/x86: Add flags to event constraints
This patch adds a flags field to each event constraint.
It can be used to store event specific features which can
then later be used by scheduling code or low-level x86 code.
The flags are propagated into event->hw.flags during the
get_event_constraint() call. They are cleared during the
put_event_constraint() call.
This mechanism is going to be used by the PEBS-LL patches.
It avoids defining yet another table to hold event specific
information.
Stephane Eranian [Thu, 24 Jan 2013 15:10:26 +0000 (16:10 +0100)]
perf/x86: Improve sysfs event mapping with event string
This patch extends Jiri's changes to make generic
events mapping visible via sysfs. The patch extends
the mechanism to non-generic events by allowing
the mappings to be hardcoded in strings.
This mechanism will be used by the PEBS-LL patch
later on.
Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: ak@linux.intel.com Cc: acme@redhat.com Cc: jolsa@redhat.com Cc: namhyung.kim@lge.com Link: http://lkml.kernel.org/r/1359040242-8269-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ fixed up conflict with 2663960 "perf: Make EVENT_ATTR global" ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stephane Eranian [Thu, 14 Feb 2013 12:57:29 +0000 (13:57 +0100)]
perf stat: Add per-core aggregation
This patch adds the --per-core option to perf stat.
This option is used to aggregate system-wide counts
on a per physical core basis. On processors with
hyperthreading, this means counts of all HT threads
running on a physical core are aggregated.
This mode is useful to find imblance between physical
cores running an uniform workload. Cores are identified
by socket: S0-C1, means physical core 1 on socket 0. Note
that cores are identified using their physical core id,
thus their numbering may not be continuous.
Per core aggregation can be combined with interval printing:
Namhyung Kim [Thu, 21 Mar 2013 07:18:52 +0000 (16:18 +0900)]
perf tools: Cleanup calc_data_size logic
It's for calculating whole trace data size during reading. However
relation functions are called only in this file, no need to
conditionalize it with tricky +1 offset and rename the variable to
more meaningful name like trace_data_size.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1363850332-25297-10-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Thu, 21 Mar 2013 07:18:48 +0000 (16:18 +0900)]
perf tools: Handle failure case in trace_report()
If pevent allocation in read_trace_init() fails, trace_report() will
return -1 and *ppevent is set to NULL. Its callers should check this
case and handle it properly.
This is also a preparation for the removal of *die() calls.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1363850332-25297-6-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 18 Mar 2013 02:41:47 +0000 (11:41 +0900)]
perf tests: Add a test case for checking sw clock event frequency
This test case checks frequency conversion of hrtimer-based software
clock events (cpu-clock, task-clock) have valid (non-1) periods.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1363574507-18808-2-git-send-email-namhyung@kernel.org
[ committer note: Moved .sample_freq to outside named init block to cope with some gcc versions ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stephane Eranian [Sun, 17 Mar 2013 13:49:57 +0000 (14:49 +0100)]
perf/x86: Add SNB/SNB-EP scheduling constraints for cycle_activity event
Add scheduling constraints for SNB/SNB-EP CYCLE_ACTIVITY event
as defined by SDM Jan 2013 edition. The STALLS umasks are
combinations with the NO_DISPATCH umask.
Masami Hiramatsu [Thu, 14 Mar 2013 11:52:43 +0000 (20:52 +0900)]
kprobes/x86: Check Interrupt Flag modifier when registering probe
Currently kprobes check whether the copied instruction modifies
IF (interrupt flag) on each probe hit. This results not only in
introducing overhead but also involving
inat_get_opcode_attribute into the kprobes hot path, and it can
cause an infinite recursive call (and kernel panic in the end).
Actually, since the copied instruction itself can never be modified
on the buffer, it is needless to analyze the instruction on every
probe hit.
To fix this issue, we check it only once when registering probe
and store the result on ainsn->if_modifier.
Reported-by: Timo Juhani Lindfors <timo.lindfors@iki.fi> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: yrl.pp-manager.tt@hitachi.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: David S. Miller <davem@davemloft.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20130314115242.19690.33573.stgit@mhiramat-M0-7522 Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Mon, 18 Mar 2013 09:00:56 +0000 (10:00 +0100)]
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
. perf probe: Fix segfault due to testing the wrong pointer for NULL,
from Ananth N Mavinakayanahalli.
. libtraceevent: Remove hard coded include to /usr/local/include in
Makefile, which causes cross builds to include host header files,
fix from Jack Mitchell.
. perf record: Use the right target interface for synthesizing
threads when --cpu/-C option is used, fix from Jiri Olsa.
. Check if -DFORTIFY_SOURCE=2 is allowed, as gcc 4.7.2 defines
it and then the build is broken when it is redefined in perf,
fix from Marcin Slusarz.
. Fix build with NO_NEWT=1, that can happen explicitely or when
the newt-devel package is not installed, from Michael Ellerman.
. perf/POWER7: Create a sysfs format entry for Power7 events, missing
patch from a patchseries already merged, from Sukadev Bhattiprolu.
. Fix LIBNUMA build with glibc 2.12 and older, from Vinson Lee.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
Namhyung Kim [Fri, 15 Mar 2013 07:27:13 +0000 (16:27 +0900)]
perf: Generate EXIT event only once per task context
perf_event_task_event() iterates pmu list and generate events
for each eligible pmu context. But if task_event has task_ctx
like in EXIT it'll generate events even though the pmu doesn't
have an eligible one. Fix it by moving the code to proper
places.
Before this patch:
$ perf record -n true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.006 MB perf.data (~248 samples) ]
$ perf report -D | tail
Aggregated stats:
TOTAL events: 73
MMAP events: 67
COMM events: 2
EXIT events: 4
cycles stats:
TOTAL events: 73
MMAP events: 67
COMM events: 2
EXIT events: 4
After this patch:
$ perf report -D | tail
Aggregated stats:
TOTAL events: 70
MMAP events: 67
COMM events: 2
EXIT events: 1
cycles stats:
TOTAL events: 70
MMAP events: 67
COMM events: 2
EXIT events: 1
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1363332433-7637-1-git-send-email-namhyung@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Namhyung Kim [Mon, 18 Mar 2013 02:41:46 +0000 (11:41 +0900)]
perf: Reset hwc->last_period on sw clock events
When cpu/task clock events are initialized, their sampling
frequencies are converted to have a fixed value. However it
missed to update the hwc->last_period which was set to 1 for
initial sampling frequency calibration.
Because this hwc->last_period value is used as a period in
perf_swevent_ hrtime(), every recorded sample will have an
incorrected period of 1.
$ perf record -e task-clock noploop 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.158 MB perf.data (~6919 samples) ]
The following patch causes 'perf stat --repeat 0' to be interpreted as
'forever', displaying the stats for every run.
We act as if a single run was asked, and reset the stats in each
iteration. In this mode SIGINT is passed to perf to be able to stop the
loop with Ctrl+C.
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20130301180227.GA24385@ks398093.ip-192-95-24.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Sun, 10 Mar 2013 18:41:10 +0000 (19:41 +0100)]
perf tests: Test breakpoint overflow signal handler
Adding automated test for breakpoint event signal handler checking if
it's executed properly.
The test is related to the proper handling of the RF EFLAGS bit on
x86_64, but it's generic for all archs.
First we check the signal handler is properly called and that the
following debug exception return to user space wouldn't trigger
recursive breakpoint.
This is related to x86_64 RF EFLAGS bit being managed in a wrong way.
Second we check that we can set breakpoint in signal handler, which is
not possible on x86_64 if the signal handler is executed with RF EFLAG
set.
This test is inpired by overflow tests done by Vince Weaver.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1362940871-24486-6-git-send-email-jolsa@redhat.com
[ committer note: s/pr_err/pr_debug/g i.e. print just OK or FAILED in non verbose mode ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 5 Mar 2013 05:53:28 +0000 (14:53 +0900)]
perf annotate: Support event group view for --print-line
Dynamically allocate source_line_percent according to a number of group
members and save nr_pcnt to the struct source_line. This way we can
handle multiple events in a general manner.
However since the size of struct source_line is not fixed anymore,
iterating whole source_line should care about its size.
The perf_evsel__is_group_event function is for checking whether given
evsel needs event group view support or not. Please note that it's
different to the existing perf_evsel__is_group_leader() which checks
only the given evsel is a leader or a standalone (i.e. non-group) event
regardless of event group feature.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362462812-30885-7-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 5 Mar 2013 05:53:24 +0000 (14:53 +0900)]
perf annotate: Cleanup disasm__calc_percent()
The loop end condition is calculated from next disasm_line or the symbol
size if it's the last disasm_line. But it doesn't need to be calculated
at every iteration. Moving it out of the function can simplify code a
bit. Also the src_line doesn't need to be checked in every time.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362462812-30885-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 5 Mar 2013 05:53:22 +0000 (14:53 +0900)]
perf annotate: Add a comment on the symbol__parse_objdump_line()
The symbol__parse_objdump_line() parses result of the objdump run but
it's hard to follow if one doesn't know the output format of the
objdump. Add a head comment on the function to help her.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362462812-30885-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 11 Mar 2013 07:43:12 +0000 (16:43 +0900)]
perf evlist: Remove cpus and threads arguments from perf_evlist__new()
It's almost always used with NULL for both arguments. Get rid of the
arguments from the signature and use perf_evlist__set_maps() if needed.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362987798-24969-1-git-send-email-namhyung@kernel.org
[ committer note: replaced spaces with tabs in some of the affected lines ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Borislav Petkov [Sun, 3 Mar 2013 19:25:33 +0000 (20:25 +0100)]
tools lib lk: Fix _FORTIFY_SOURCE builds
Jiri Olsa triggers the following build error:
SUBDIR ../lib/lk/
CC debugfs.o
In file included from /usr/include/errno.h:29:0,
from debugfs.c:1:
/usr/include/features.h:314:4: error: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Werror=cpp]
This is because enabling buffer overflow checks through _FORTIFY_SOURCE
require compiler optimizations to be enabled too. However, those are
not. Enable them by simply copying the perf optimization level. It can
be expanded later if we want to support debug builds, etc.
Borislav Petkov [Wed, 20 Feb 2013 15:32:30 +0000 (16:32 +0100)]
perf tools: Introduce tools/lib/lk library
This introduces the tools/lib/lk library, that will gradually have the
routines that now are used in tools/perf/ and other tools and that can
be shared.
Start by carving out debugfs routines for general use.
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1361374353-30385-5-git-send-email-bp@alien8.de
[ committer note: Add tools/lib/lk/ to perf's MANIFEST so that its tarballs continue to build ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Borislav Petkov [Wed, 20 Feb 2013 15:32:29 +0000 (16:32 +0100)]
perf tools: Correct Makefile.include
It looks at O= and adjusts the $(OUTPUT) variable based on what the
output directory will be. However, when O is defined but empty, it
wrongly becomes the user's $HOME dir which is not what we want. So check
it is not empty before working with it further.
Borislav Petkov [Wed, 20 Feb 2013 15:32:28 +0000 (16:32 +0100)]
perf tools: Honor parallel jobs
We need to hand down parallel build options like the internal make
--jobserver-fds one so that parallel builds can also happen when
building perf from the toplevel directory.
Cody P Schafer [Thu, 14 Mar 2013 22:27:51 +0000 (15:27 -0700)]
perf tools: Fix build on non-glibc systems due to libio.h absence
Including libio.h causes build failures on uClibc systems (which lack
libio.h).
It appears that libio.h was only included to pull in a definition for
NULL, so it has been replaced by stddef.h.
On powerpc, libio.h was conditionally included, but could be removed
completely as it is unneeded. Also, the included of stdlib.h was changed
to stddef.h (as again, only NULL is needed).
tracing: Prevent buffer overwrite disabled for latency tracers
The latency tracers require the buffers to be in overwrite mode,
otherwise they get screwed up. Force the buffers to stay in overwrite
mode when latency tracers are enabled.
Added a flag_changed() method to the tracer structure to allow
the tracers to see what flags are being changed, and also be able
to prevent the change from happing.
Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
tracing: Keep overwrite in sync between regular and snapshot buffers
Changing the overwrite mode for the ring buffer via the trace
option only sets the normal buffer. But the snapshot buffer could
swap with it, and then the snapshot would be in non overwrite mode
and the normal buffer would be in overwrite mode, even though the
option flag states otherwise.
Keep the two buffers overwrite modes in sync.
Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
tracing: Protect tracer flags with trace_types_lock
Seems that the tracer flags have never been protected from
synchronous writes. Luckily, admins don't usually modify the
tracing flags via two different tasks. But if scripts were to
be used to modify them, then they could get corrupted.
Move the trace_types_lock that protects against tracers changing
to also protect the flags being set.
Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Vinson Lee [Wed, 13 Mar 2013 22:34:24 +0000 (15:34 -0700)]
perf tools: Fix LIBNUMA build with glibc 2.12 and older.
The tokens MADV_HUGEPAGE and MADV_NOHUGEPAGE are not available with
glibc 2.12 and older. Define these tokens if they are not already
defined.
This patch fixes these build errors with older versions of glibc.
CC bench/numa.o
bench/numa.c: In function ‘alloc_data’:
bench/numa.c:334: error: ‘MADV_HUGEPAGE’ undeclared (first use in this function)
bench/numa.c:334: error: (Each undeclared identifier is reported only once
bench/numa.c:334: error: for each function it appears in.)
bench/numa.c:341: error: ‘MADV_NOHUGEPAGE’ undeclared (first use in this function)
make: *** [bench/numa.o] Error 1
Signed-off-by: Vinson Lee <vlee@twitter.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Irina Tirdea <irina.tirdea@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1363214064-4671-2-git-send-email-vlee@twitter.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
anish kumar [Tue, 12 Mar 2013 18:44:08 +0000 (14:44 -0400)]
watchdog: Add comments to explain the watchdog_disabled variable
The watchdog_disabled flag is a bit cryptic. However it's
usefulness is multifold. Uses are:
1. Check if smpboot_register_percpu_thread function passed.
2. Makes sure that user enables and disables the watchdog in
sequence i.e. enable watchdog->disable watchdog->enable watchdog
Unlike enable watchdog->enable watchdog which is wrong.
Linus Torvalds [Wed, 13 Mar 2013 22:47:50 +0000 (15:47 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace bugfixes from Eric Biederman:
"This tree includes a partial revert for "fs: Limit sys_mount to only
request filesystem modules." When I added the new style module aliases
to the filesystems I deleted the old ones. A bad move. It turns out
that distributions like Arch linux use module aliases when
constructing ramdisks. Which meant ultimately that an ext3 filesystem
mounted with ext4 would not result in the ext4 module being put into
the ramdisk.
The other change in this tree adds a handful of filesystem module
alias I simply failed to add the first time. Which inconvinienced a
few folks using cifs.
I don't want to inconvinience folks any longer than I have to so here
are these trivial fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
fs: Readd the fs module aliases.
fs: Limit sys_mount to only request filesystem modules. (Part 3)
Tejun Heo [Wed, 13 Mar 2013 21:59:49 +0000 (14:59 -0700)]
idr: idr_alloc() shouldn't trigger lowmem warning when preloaded
GFP_NOIO is often used for idr_alloc() inside preloaded section as the
allocation mask doesn't really matter. If the idr tree needs to be
expanded, idr_alloc() first tries to allocate using the specified
allocation mask and if it fails falls back to the preloaded buffer. This
order prevent non-preloading idr_alloc() users from taking advantage of
preloading ones by using preload buffer without filling it shifting the
burden of allocation to the preload users.
Unfortunately, this allowed/expected-to-fail kmem_cache allocation ends up
generating spurious slab lowmem warning before succeeding the request from
the preload buffer.
This patch makes idr_layer_alloc() add __GFP_NOWARN to the first
kmem_cache attempt and try kmem_cache again w/o __GFP_NOWARN after
allocation from preload_buffer fails so that lowmem warning is generated
if not suppressed by the original @gfp_mask.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: David Teigland <teigland@redhat.com> Tested-by: David Teigland <teigland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Wed, 13 Mar 2013 21:59:48 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in M32R's asm/stat.h
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).
However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.
The definition of struct stat64 in M32R's asm/stat.h is wrong in this way.
Note that userspace will likely interpret the field order incorrectly as
the big-endian variant on little-endian machines - depending on header
inclusion order.
[!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
be better to fix the ordering of st_blocks and __pad4 in struct stat64.
Signed-off-by: David Howells <dhowells@redhat.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Wed, 13 Mar 2013 21:59:47 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in linux/raid/md_p.h
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).
However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.
The definition of struct mdp_superblock_s in linux/raid/md_p.h is wrong in
this way. Note that userspace will likely interpret the ordering of the
fields incorrectly as the big-endian variant on a little-endian machines -
depending on header inclusion order.
[!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
be better to fix the ordering of events_hi, events_lo, cp_events_hi and
cp_events_lo in struct mdp_superblock_s / typedef mdp_super_t.
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Wed, 13 Mar 2013 21:59:46 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in linux/acct.h
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).
However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.
The definition of ACCT_BYTEORDER in linux/acct.h is wrong in this way.
Note that userspace will likely interpret this incorrectly as the
big-endian variant on little-endian machines - depending on header
inclusion order.
[!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
be better to fix the value of ACCT_BYTEORDER.
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>