Steven Rostedt [Mon, 12 May 2008 19:20:50 +0000 (21:20 +0200)]
ftrace: do not profile lib/string.o
Most archs define the string and memory compare functions in assembly.
Some do not. But these functions may be used in some archs at early
boot up.
Since most archs define this code in assembly and they are not usually
traced, there's no need to trace them when they are not defined in
assembly.
This patch removes the -pg from the CFLAGS for lib/string.o.
This prevents the string functions use in either vdso or early bootup
from crashing the system.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:50 +0000 (21:20 +0200)]
ftrace: remove address of function names
PowerPC is very fragile when it comes to use of function names
and function addresses. ftrace needs to either use all function
addresses or function names (i.e. my_func as suppose to &my_func).
This patch chooses to use the names and not the addresses, and
makes ftrace consistent.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:49 +0000 (21:20 +0200)]
ftrace: add trace_function api for other tracers to use
A new check was added in the ftrace function that wont trace if the CPU
trace buffer is disabled. Unfortunately, other tracers used ftrace() to
write to the buffer after they disabled it. The new disable check makes
these calls into a nop.
This patch changes the __ftrace that is called without the check into a
new api for the other tracers to use, called "trace_function". The other
tracers use this interface instead when the trace CPU buffer is already
disabled.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:48 +0000 (21:20 +0200)]
ftrace: disable tracing on failure
Since ftrace touches practically every function. If we detect any
anomaly, we want to fully disable ftrace. This patch adds code
to try shutdown ftrace as much as possible without doing any more
harm is something is detected not quite correct.
This only kills ftrace, this patch does have checks for other parts of
the tracer (irqsoff, wakeup, etc.).
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:48 +0000 (21:20 +0200)]
ftrace - fix dynamic ftrace memory leak
The ftrace dynamic function update allocates a record to store the
instruction pointers that are being modified. If the modified
instruction pointer fails to update, then the record is marked as
failed and nothing more is done.
Worse, if the modification fails, but the record ip function is still
called, it will allocate a new record and try again. In just a matter
of time, will this cause a serious memory leak and crash the system.
This patch plugs this memory leak. When a record fails, it is
included back into the pool of records to be used. Now a record may
fail over and over again, but the number of allocated records will
not increase.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:46 +0000 (21:20 +0200)]
ftrace: user run time file reading
This patch creates a file called trace_pipe in the tracing
debug directory. This file is a consumer of the trace buffers.
This means that reads of this file consumes the entries from
the trace buffers so that they will not be read a second time,
as contrast to the static buffers latency_trace and trace.
Reading from the trace_pipe will remove the entries from trace
and latency_trace too.
The advantage that trace_pipe has is that it can record live
traces. It will block when there is nothing in the buffer,
and read the entries as they are entered. An EOF happens when
tracing is disabled (tracing_enabled = 0).
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:46 +0000 (21:20 +0200)]
ftrace: add a buffer for output
Later patches will need to print the same things as the seq output
does. But those outputs will not use the seq utility. This patch
adds a buffer to the iterator, that can be used by either the
seq utility or other output.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:45 +0000 (21:20 +0200)]
ftrace: change buffers to producer consumer
This patch changes the way the CPU trace buffers are handled.
Instead of always starting from the trace page head, the logic
is changed to a producer consumer logic. This allows for the
buffers to be drained while they are alive.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:45 +0000 (21:20 +0200)]
ftrace: reset selftests
The tests may leave stuff in the buffers. This resets the buffers
after each test is run. If a test fails, it does not reset the
buffer to avoid touching a buffer that is corrupted.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:45 +0000 (21:20 +0200)]
ftrace: startup tester on dynamic tracing.
This patch adds a startup self test on dynamic code modification
and filters. The test filters on a specific function, makes sure that
no other function is traced, exectutes the function, then makes sure that
the function is traced.
This patch also fixes a slight bug with the ftrace selftest, where
tracer_enabled was not being set.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:44 +0000 (21:20 +0200)]
ftrace: irqs off smp_processor_id() fix
The irqsoff function tracer did a __get_cpu_var to determine
if it should trace the function or not. The problem is that
__get_cpu_var can preempt between getting the CPU and reading
the cpu variable. This means that the cpu variable that is
being read is not from the cpu being run on.
At worst, this can give a false positive, where we trace the
function when we should not. It will never give a false negative
since we only want to trace when interrupts are disabled
and we never preempt when they are.
This fix adds a check after reading the irq flags to only
trace if the interrupts are actually disabled. It also changes
the reading of the cpu variable to use a raw_smp_processor_id
since we now don't care if we preempt. We still catch that fact.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:44 +0000 (21:20 +0200)]
ftrace: debug smp_processor_id, use notrace preempt disable
The debug smp_processor_id caused a recursive fault in debugging
the irqsoff tracer. The tracer used a smp_processor_id in the
ftrace callback, and this function called preempt_disable which
also is traced. This caused a recursive fault (stack overload).
Since using smp_processor_id without debugging on does not cause
faults with the tracer (even when the tracer is wrong), the
debug version should not cause a system reboot.
This changes the debug_smp_processor_id to use the notrace versions
of preempt_disable and enable.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:43 +0000 (21:20 +0200)]
ftrace: convert single large buffer into single pages.
Allocating large buffers for the tracer may fail easily.
This patch converts the buffer from a large ordered allocation
to single pages. It uses the struct page LRU field to link the
pages together.
Later patches may also implement dynamic increasing and decreasing
of the trace buffers.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:43 +0000 (21:20 +0200)]
ftrace: add filter select functions to trace
This patch adds two files to the debugfs system:
/debugfs/tracing/available_filter_functions
and
/debugfs/tracing/set_ftrace_filter
The available_filter_functions lists all functions that has been
recorded by the ftraced that has called the ftrace_record_ip function.
This is to allow users to see what functions have been converted
to nops and can be enabled for tracing.
To enable functions, simply echo the names (whitespace delimited)
into set_ftrace_filter. Simple wildcards are also allowed.
Steven Rostedt [Mon, 12 May 2008 19:20:43 +0000 (21:20 +0200)]
ftrace: use dynamic patching for updating mcount calls
This patch replaces the indirect call to the mcount function
pointer with a direct call that will be patched by the
dynamic ftrace routines.
On boot up, the mcount function calls the ftace_stub function.
When the dynamic ftrace code is initialized, the ftrace_stub
is replaced with a call to the ftrace_record_ip, which records
the instruction pointers of the locations that call it.
Later, the ftraced daemon will call kstop_machine and patch all
the locations to nops.
When a ftrace is enabled, the original calls to mcount will now
be set top call ftrace_caller, which will do a direct call
to the registered ftrace function. This direct call is also patched
when the function that should be called is updated.
All patching is performed by a kstop_machine routine to prevent any
type of race conditions that is associated with modifying code
on the fly.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:43 +0000 (21:20 +0200)]
ftrace: add ftrace_enabled sysctl to disable mcount function
This patch adds back the sysctl ftrace_enabled. This time it is
defaulted to on, if DYNAMIC_FTRACE is configured. When ftrace_enabled
is disabled, the ftrace function is set to the stub return.
If DYNAMIC_FTRACE is also configured, on ftrace_enabled = 0,
the registered ftrace functions will all be set to jmps, but no more
new calls to ftrace recording (used to find the ftrace calling sites)
will be called.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:42 +0000 (21:20 +0200)]
ftrace: dynamic enabling/disabling of function calls
This patch adds a feature to dynamically replace the ftrace code
with the jmps to allow a kernel with ftrace configured to run
as fast as it can without it configured.
The way this works, is on bootup (if ftrace is enabled), a ftrace
function is registered to record the instruction pointer of all
places that call the function.
Later, if there's still any code to patch, a kthread is awoken
(rate limited to at most once a second) that performs a stop_machine,
and replaces all the code that was called with a jmp over the call
to ftrace. It only replaces what was found the previous time. Typically
the system reaches equilibrium quickly after bootup and there's no code
patching needed at all.
e.g.
call ftrace /* 5 bytes */
is replaced with
jmp 3f /* jmp is 2 bytes and we jump 3 forward */
3:
When we want to enable ftrace for function tracing, the IP recording
is removed, and stop_machine is called again to replace all the locations
of that were recorded back to the call of ftrace. When it is disabled,
we replace the code back to the jmp.
Allocation is done by the kthread. If the ftrace recording function is
called, and we don't have any record slots available, then we simply
skip that call. Once a second a new page (if needed) is allocated for
recording new ftrace function calls. A large batch is allocated at
boot up to get most of the calls there.
Because we do this via stop_machine, we don't have to worry about another
CPU executing a ftrace call as we modify it. But we do need to worry
about NMI's so all functions that might be called via nmi must be
annotated with notrace_nmi. When this code is configured in, the NMI code
will not call notrace.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:42 +0000 (21:20 +0200)]
ftrace: trace preempt off critical timings
Add preempt off timings. A lot of kernel core code is taken from the RT patch
latency trace that was written by Ingo Molnar.
This adds "preemptoff" and "preemptirqsoff" to /debugfs/tracing/available_tracers
Now instead of just tracing irqs off, preemption off can be selected
to be recorded.
When this is selected, it shares the same files as irqs off timings.
One can either trace preemption off, irqs off, or one or the other off.
By echoing "preemptoff" into /debugfs/tracing/current_tracer, recording
of preempt off only is performed. "irqsoff" will only record the time
irqs are disabled, but "preemptirqsoff" will take the total time irqs
or preemption are disabled. Runtime switching of these options is now
supported by simpling echoing in the appropriate trace name into
/debugfs/tracing/current_tracer.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:42 +0000 (21:20 +0200)]
ftrace: trace irq disabled critical timings
This patch adds latency tracing for critical timings
(how long interrupts are disabled for).
"irqsoff" is added to /debugfs/tracing/available_tracers
Note:
tracing_max_latency
also holds the max latency for irqsoff (in usecs).
(default to large number so one must start latency tracing)
tracing_thresh
threshold (in usecs) to always print out if irqs off
is detected to be longer than stated here.
If irq_thresh is non-zero, then max_irq_latency
is ignored.
Here's an example of a trace with ftrace_enabled = 0
Steven Rostedt [Mon, 12 May 2008 19:20:42 +0000 (21:20 +0200)]
ftrace: tracer for scheduler wakeup latency
This patch adds the tracer that tracks the wakeup latency of the
highest priority waking task.
"wakeup" is added to /debugfs/tracing/available_tracers
Also added to /debugfs/tracing
tracing_max_latency
holds the current max latency for the wakeup
wakeup_thresh
if set to other than zero, a log will be recorded
for every wakeup that takes longer than the number
entered in here (usecs for all counters)
(deletes previous trace)
Steven Rostedt [Mon, 12 May 2008 19:20:42 +0000 (21:20 +0200)]
ftrace: function tracer
This is a simple trace that uses the ftrace infrastructure. It is
designed to be fast and small, and easy to use. It is useful to
record things that happen over a very short period of time, and
not to analyze the system in general.
Updates:
available_tracers
"function" is added to this file.
current_tracer
To enable the function tracer:
echo function > /debugfs/tracing/current_tracer
To disable the tracer:
echo disable > /debugfs/tracing/current_tracer
The output of the function_trace file is as follows
Steven Rostedt [Mon, 12 May 2008 19:20:42 +0000 (21:20 +0200)]
ftrace: latency tracer infrastructure
This patch adds the latency tracer infrastructure. This patch
does not add anything that will select and turn it on, but will
be used by later patches.
If it were to be compiled, it would add the following files
to the debugfs:
The root tracing directory:
/debugfs/tracing/
This patch also adds the following files:
available_tracers
list of available tracers. Currently no tracers are
available. Looking into this file only shows
"none" which is used to unregister all tracers.
current_tracer
The trace that is currently active. Empty on start up.
To switch to a tracer simply echo one of the tracers that
are listed in available_tracers:
example: (used with later patches)
echo function > /debugfs/tracing/current_tracer
To disable the tracer:
echo disable > /debugfs/tracing/current_tracer
tracing_enabled
echoing "1" into this file starts the ftrace function tracing
(if sysctl kernel.ftrace_enabled=1)
echoing "0" turns it off.
latency_trace
This file is readonly and holds the result of the trace.
trace
This file outputs a easier to read version of the trace.
iter_ctrl
Controls the way the output of traces look.
So far there's two controls:
echoing in "symonly" will only show the kallsyms variables
without the addresses (if kallsyms was configured)
echoing in "verbose" will change the output to show
a lot more data, but not very easy to understand by
humans.
echoing in "nosymonly" turns off symonly.
echoing in "noverbose" turns off verbose.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ftrace: add basic support for gcc profiler instrumentation
If CONFIG_FTRACE is selected and /proc/sys/kernel/ftrace_enabled is
set to a non-zero value the ftrace routine will be called everytime
we enter a kernel function that is not marked with the "notrace"
attribute.
The ftrace routine will then call a registered function if a function
happens to be registered.
[ This code has been highly hacked by Steven Rostedt and Ingo Molnar,
so don't blame Arnaldo for all of this ;-) ]
Update:
It is now possible to register more than one ftrace function.
If only one ftrace function is registered, that will be the
function that ftrace calls directly. If more than one function
is registered, then ftrace will call a function that will loop
through the functions to call.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ftrace: annotate core code that should not be traced
Mark with "notrace" functions in core code that should not be
traced. The "notrace" attribute will prevent gcc from adding
a call to ftrace on the annotated funtions.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:41 +0000 (21:20 +0200)]
x86: add notrace annotations to vsyscall.
Add the notrace annotations to the vsyscall functions - there we are
not in kernel context yet, so the tracer function cannot (and must not)
be called.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 12 May 2008 19:20:41 +0000 (21:20 +0200)]
tracing: add notrace to linkage.h
notrace signals that a function should not be traced. Most of the
time this is used by tracers to annotate code that cannot be
traced - it's in a volatile state (such as in user vdso context
or NMI context) or it's in the tracer internals.
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:41 +0000 (21:20 +0200)]
ftrace: add preempt_enable/disable notrace macros
The tracer may need to call preempt_enable and disable functions
for time keeping and such. The trace gets ugly when we see these
functions show up for all traces. To make the output cleaner
this patch adds preempt_enable_notrace and preempt_disable_notrace
to be used by tracer (and debugging) functions.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Mon, 12 May 2008 19:20:41 +0000 (21:20 +0200)]
ftrace: make the task state char-string visible to all
The tracer wants to be able to convert the state number
into a user visible character. This patch pulls that conversion
string out the scheduler into the header. This way if it were to
ever change, other parts of the kernel will know.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
net: The world is not perfect patch.
tcp: Make prior_ssthresh a u32
xfrm_user: Remove zero length key checks.
net/ipv4/arp.c: Use common hex_asc helpers
cassini: Only use chip checksum for ipv4 packets.
tcp: TCP connection times out if ICMP frag needed is delayed
netfilter: Move linux/types.h inclusions outside of #ifdef __KERNEL__
af_key: Fix selector family initialization.
libertas: Fix ethtool statistics
mac80211: fix NULL pointer dereference in ieee80211_compatible_rates
mac80211: don't claim iwspy support
orinoco_cs: add ID for SpeedStream wireless adapters
hostap_cs: add ID for Conceptronic CON11CPro
rtl8187: resource leak in error case
ath5k: Fix loop variable initializations
David S. Miller [Thu, 22 May 2008 01:14:28 +0000 (18:14 -0700)]
sparc64: Fix kernel thread stack termination.
Because of the silly way I set up the initial stack for
new kernel threads, there is a loop at the top of the
stack.
To fix this, properly add another stack frame that is copied
from the parent and terminate it in the child by setting
the frame pointer in that frame to zero.
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Thu, 22 May 2008 00:47:54 +0000 (17:47 -0700)]
net: The world is not perfect patch.
Unless there will be any objection here, I suggest consider the
following patch which simply removes the code for the
-DI_WISH_WORLD_WERE_PERFECT in the three methods which use it.
The compilation errors we get when using -DI_WISH_WORLD_WERE_PERFECT
show that this code was not built and not used for really a long time.
Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Thu, 22 May 2008 00:40:05 +0000 (17:40 -0700)]
tcp: Make prior_ssthresh a u32
If previous window was above representable values of u16,
strange things will happen if undo with the truncated value
is called for. Alternatively, this could be fixed by some
max trickery but that would limit undoing high-speed undos.
Adds 16-bit hole but there isn't anything to fill it with.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
and here to print HW addresses, the hex cases are not significant.
Thanks to Harvey Harrison to introduce the hex_asc_hi/hex_asc_lo helpers.
Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Wed, 21 May 2008 05:32:11 +0000 (06:32 +0100)]
provide out-of-line strcat() for m68k
Whether we sidestep it in init/main.c or not, such situations
will arise again; compiler does generate calls of strcat()
on optimizations, so we really ought to have an out-of-line
version...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>