Paul Mackerras [Tue, 30 Jun 2009 06:07:19 +0000 (16:07 +1000)]
perf_counter: Provide a way to enable counters on exec
This provides a way to mark a counter to be enabled on the next
exec. This is useful for measuring the total activity of a
program without including overhead from the process that
launches it.
This also changes the perf stat command to use this new
facility.
Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <19017.43927.838745.689203@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul Mackerras [Mon, 29 Jun 2009 11:13:21 +0000 (21:13 +1000)]
perf_counter tools: Reduce perf stat measurement overhead/skew
Vince Weaver reported a 'perf stat' measurement overhead in the
count of retired instructions, which can amount to a +6000
instructions inflated count in the reported count.
At present, perf stat creates its counters on the perf process. Thus
the counters count the fork and various other activity in both the
parent and child, such as the resolver overhead for resolving PLT
entries for any libc functions that haven't been called before, such
as execvp.
This reduces the overhead by creating the counters on the child process
after the fork, using a couple of pipes to synchronize so that the
child process waits until the parent has created the counters before
doing the exec. To eliminate the PLT resolution overhead on calling
execvp, this does a dummy execvp first which will always fail.
With this, the overhead of executing a program goes down from over
4800 instructions to about 90 instructions on powerpc (32-bit).
This was measured with a statically-linked program written in
assembler which only does the 3 instructions needed to call _exit(0).
Ingo Molnar [Mon, 29 Jun 2009 19:50:54 +0000 (21:50 +0200)]
perf stat: Use percentages for scaling output
Peter expressed a strong preference for percentage based
display of scaled values - so revert to that from the
recently introduced multiplication-factor unit.
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jaswinder Singh Rajput <jaswinder@kernel.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 27 Jun 2009 04:10:30 +0000 (06:10 +0200)]
perf stat: Add -n/--null option to run without counters
Allow a no-counters run. This can be useful to measure just
elapsed wall-clock time - or to assess the raw overhead of perf
stat itself, without running any counters.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
perf report: Print sorted callchains per histogram entries
Use the newly created callchains radix tree to gather the chains stats
from the recorded events and then print the callchains for all of them,
sorted by hits, using the "-c" parameter with perf report.
perf_counter tools: Prepare a small callchain framework
We plan to display the callchains depending on some user-configurable
parameters.
To gather the callchains stats from the recorded stream in a fast way,
this patch introduces an ad hoc radix tree adapted for callchains and also
a rbtree to sort these callchains once we have gathered every events
from the stream.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1246026481-8314-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Building latest perfcounter fails on the following error:
builtin-record.c: In function ‘create_counter’:
builtin-record.c:451: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result
make: *** [builtin-record.o] Erreur 1
Just check if we successfully read the perf file descriptor.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1245961287-5327-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Thu, 25 Jun 2009 09:27:12 +0000 (11:27 +0200)]
perf_counter: Rework the sample ABI
The PERF_EVENT_READ implementation made me realize we don't
actually need the sample_type int the output sample, since
we already have that in the perf_counter_attr information.
Therefore, remove the PERF_EVENT_MISC_OVERFLOW bit and the
event->type overloading, and imply put counter overflow
samples in a PERF_EVENT_SAMPLE type.
This also fixes the issue that event->type was only 32-bit
and sample_type had 64 usable bits.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 24 Jun 2009 19:11:59 +0000 (21:11 +0200)]
perf_counter: Implement more accurate per task statistics
With the introduction of PERF_EVENT_READ we have the
possibility to provide accurate counter values for
individual tasks in a task hierarchy.
However, due to the lazy context switching used for similar
counter contexts our current per task counts are way off.
In order to maintain some of the lazy switch benefits we
don't disable it out-right, but simply iterate the active
counters and flip the values between the contexts.
This only reads the counters but does not need to reprogram
the full PMU.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Tue, 23 Jun 2009 18:13:11 +0000 (20:13 +0200)]
perf_counter: Add PERF_EVENT_READ
Provide a read() like event which can be used to log the
counter value at specific sites such as child->parent
folding on exit.
In order to be useful, we log the counter parent ID, not the
actual counter ID, since userspace can only relate parent
IDs to perf_counter_attr constructs.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Thu, 25 Jun 2009 15:05:54 +0000 (17:05 +0200)]
perf_counter tools: Rework the file format
Create a structured file format that includes the full
perf_counter_attr and all its relevant counter IDs so that
the reporting program has full information.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Performance counter stats for 'ls -lR /usr/include/':
248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%) 1001433 L1-data-Cache-Load-Misses (scaled from 23.34%)
153691 L1-data-Cache-Store-Referencees (scaled from 23.34%)
423248 L1-data-Cache-Prefetch-Referencees (scaled from 23.33%)
302138 L1-data-Cache-Prefetch-Misses (scaled from 23.25%) 251217546 L1-instruction-Cache-Load-Referencees (scaled from 23.25%) 5757005 L1-instruction-Cache-Load-Misses (scaled from 23.23%)
93435 L1-instruction-Cache-Prefetch-Referencees (scaled from 23.24%) 6496073 L2-Cache-Load-Referencees (scaled from 23.32%)
609485 L2-Cache-Load-Misses (scaled from 23.45%) 6876991 L2-Cache-Store-Referencees (scaled from 23.71%) 248922840 Data-TLB-Cache-Load-Referencees (scaled from 23.94%) 5828386 Data-TLB-Cache-Load-Misses (scaled from 24.17%) 257613506 Instruction-TLB-Cache-Load-Referencees (scaled from 24.20%)
6833 Instruction-TLB-Cache-Load-Misses (scaled from 23.88%) 109043606 Branch-Cache-Load-Referencees (scaled from 23.64%) 5552296 Branch-Cache-Load-Misses (scaled from 23.42%)
Peformance counter stats for 'ls -lR /usr/include/':
266590464 L1-d$-loads (scaled from 23.03%) 1222273 L1-d$-load-misses (scaled from 23.58%)
146204 L1-d$-stores (scaled from 23.83%)
406344 L1-d$-prefetches (scaled from 24.09%)
283748 L1-d$-prefetch-misses (scaled from 24.10%) 249650965 L1-i$-loads (scaled from 23.80%) 3353961 L1-i$-load-misses (scaled from 23.82%)
104599 L1-i$-prefetches (scaled from 23.68%) 4836405 LLC-loads (scaled from 23.67%)
498214 LLC-load-misses (scaled from 23.66%) 4953994 LLC-stores (scaled from 23.64%) 243354097 dTLB-loads (scaled from 23.77%) 6468584 dTLB-load-misses (scaled from 23.74%) 249719549 iTLB-loads (scaled from 23.25%)
5060 iTLB-load-misses (scaled from 23.00%) 112343016 branch-loads (scaled from 22.76%) 5528876 branch-load-misses (scaled from 22.54%)
Johannes Weiner [Wed, 24 Jun 2009 19:08:36 +0000 (21:08 +0200)]
perf record: Fix filemap pathname parsing in /proc/pid/maps
Looking backward for the first space from the end of a line in
/proc/pid/maps does not find the start of the pathname of the mapped
file if it contains a space.
Since the only slashes we have in this file occur in the (absolute!)
pathname column of file mappings, looking for the first slash in a
line is a safe method to find the name.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Stefani Seibold <stefani@seibold.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090624190835.GA25548@cmpxchg.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 24 Jun 2009 17:54:29 +0000 (19:54 +0200)]
perf_counter tools: Add CREDITS file for Git contributors
Much of perf's libraries comes from the Git project. I noticed
that the files (in tools/perf/util/*.[ch] and elsewhere) are
quite spartan wrt. credits, so lets add a CREDITS file that
includes an (incomplete!) list of main contributors.
Thanks guys, these libraries are really useful. Special thanks
go to Johannes Schindelin and Junio C Hamano for coming up with
this list.
List-Composed-By: Johannes Schindelin <Johannes.Schindelin@gmx.de> Cc: Junio C Hamano <gitster@pobox.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yong Wang [Wed, 24 Jun 2009 02:13:24 +0000 (10:13 +0800)]
perf_counter, x86: Set global control MSR correctly
Previous code made an assumption that the power on value of global
control MSR has enabled all fixed and general purpose counters properly.
However, this is not the case for certain Intel processors, such as
Atom - and it might also be firmware dependent.
Each enable bit in IA32_PERF_GLOBAL_CTRL is AND'ed with the
enable bits for all privilege levels in the respective IA32_PERFEVTSELx
or IA32_PERF_FIXED_CTR_CTRL MSRs to start/stop the counting of
respective counters. Counting is enabled if the AND'ed results is true;
counting is disabled when the result is false.
The end result is that all fixed counters are always disabled on Atom
processors because the assumption is just invalid.
Fix this by not initializing the ctrl-mask out of the global MSR,
but setting it to perf_counter_mask.
Reported-by: Stephane Eranian <eranian@googlemail.com> Signed-off-by: Yong Wang <yong.y.wang@intel.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090624021324.GA2788@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Mon, 22 Jun 2009 11:58:35 +0000 (13:58 +0200)]
perf_counter: Optimize perf_counter_alloc()'s inherit case
We don't need to add usage counts for swcounter and attr usage
models for inherited counters since the parent counter will
always have one, which suffices to generate the needed output.
This avoids up to 3 global atomic increments per inherited
counter.
LKML-Reference: <new-submission> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Martin Schwidefsky reported "perf report" symbol resolution
problems on S390.
Since we only report MMAP, not MUNMAP, we have to deal with
overlapping maps.
We used to simply throw out the old map on the assumption whole
maps got unmapped. This obviously doesn't deal with partial
unmaps. However it appears some dynamic linkers do fancy
partial unmaps (s390), so do something more elaborate and
truncate the old maps, only removing them when they've been
fully covered.
This resolves (part of) the S390 symbol resolution problems.
Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Tested-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 21 Jun 2009 11:58:51 +0000 (13:58 +0200)]
perf_counter tools: Fix vmlinux fallback when running on a different kernel
Lucas De Marchi reported that perf report and perf annotate
displays mismatching profile if a perf.data is analyzed on
an older kernel - even if the correct vmlinux is specified
via the -k option.
The reason is the fallback path in util/symbol.c:dso__load_kernel():
int dso__load_kernel(struct dso *self, const char *vmlinux,
symbol_filter_t filter, int verbose)
{
int err = -1;
if (vmlinux)
err = dso__load_vmlinux(self, vmlinux, filter, verbose);
if (err)
err = dso__load_kallsyms(self, filter, verbose);
return err;
}
dso__load_vmlinux() returns negative on error, but on success it
returns the number of symbols loaded - which confuses the function
to load the kallsyms.
This is normally harmless, as reporting is usually performed on the
same kernel that is analyzed - but if there's a mismatch then we
load the wrong kallsyms and create a non-sensical symbol tree.
The fix is to only fall back to kallsyms on errors.
Reported-by: Lucas De Marchi <lucas.de.marchi@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2499484 L1-data-Cache-Load-Referencees (scaled from 3.97%)
70347 L1-data-Cache-Load-Misses (scaled from 7.30%)
9360 L1-data-Cache-Store-Referencees (scaled from 8.64%)
32804 L1-data-Cache-Prefetch-Referencees (scaled from 17.72%)
7693 L1-data-Cache-Prefetch-Misses (scaled from 22.97%) 2180945 L1-instruction-Cache-Load-Referencees (scaled from 28.48%)
14518 L1-instruction-Cache-Load-Misses (scaled from 35.00%)
2405 L1-instruction-Cache-Prefetch-Referencees (scaled from 34.89%)
71387 L2-Cache-Load-Referencees (scaled from 34.94%)
18732 L2-Cache-Load-Misses (scaled from 34.92%)
79918 L2-Cache-Store-Referencees (scaled from 36.02%) 1295294 Data-TLB-Cache-Load-Referencees (scaled from 35.99%)
30896 Data-TLB-Cache-Load-Misses (scaled from 33.36%) 1222030 Instruction-TLB-Cache-Load-Referencees (scaled from 29.46%)
357 Instruction-TLB-Cache-Load-Misses (scaled from 20.46%)
530888 Branch-Cache-Load-Referencees (scaled from 11.48%)
8638 Branch-Cache-Load-Misses (scaled from 5.09%)
Johannes Weiner [Fri, 19 Jun 2009 17:30:56 +0000 (19:30 +0200)]
mm: page_alloc: clear PG_locked before checking flags on free
da456f1 "page allocator: do not disable interrupts in free_page_mlock()" moved
the PG_mlocked clearing after the flag sanity checking which makes mlocked
pages always trigger 'bad page'. Fix this by clearing the bit up front.
Reported--and-debugged-by: Peter Chubb <peter.chubb@nicta.com.au> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Tested-by: Maxim Levitsky <maximlevitsky@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 20 Jun 2009 22:40:00 +0000 (15:40 -0700)]
x86, 64-bit: Clean up user address masking
The discussion about using "access_ok()" in get_user_pages_fast() (see
commit 7f8189068726492950bf1a2dcfd9b51314560abf: "x86: don't use
'access_ok()' as a range check in get_user_pages_fast()" for details and
end result), made us notice that x86-64 was really being very sloppy
about virtual address checking.
So be way more careful and straightforward about masking x86-64 virtual
addresses:
- All the VIRTUAL_MASK* variants now cover half of the address
space, it's not like we can use the full mask on a signed
integer, and the larger mask just invites mistakes when
applying it to either half of the 48-bit address space.
- /proc/kcore's kc_offset_to_vaddr() becomes a lot more
obvious when it transforms a file offset into a
(kernel-half) virtual address.
- Unify/simplify the 32-bit and 64-bit USER_DS definition to
be based on TASK_SIZE_MAX.
This cleanup and more careful/obvious user virtual address checking also
uncovered a buglet in the x86-64 implementation of strnlen_user(): it
would do an "access_ok()" check on the whole potential area, even if the
string itself was much shorter, and thus return an error even for valid
strings. Our sloppy checking had hidden this.
So this fixes 'strnlen_user()' to do this properly, the same way we
already handled user strings in 'strncpy_from_user()'. Namely by just
checking the first byte, and then relying on fault handling for the
rest. That always works, since we impose a guard page that cannot be
mapped at the end of the user space address space (and even if we
didn't, we'd have the address space hole).
Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 20 Jun 2009 18:30:01 +0000 (11:30 -0700)]
Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq, irq.h: Fix kernel-doc warnings
genirq: fix comment to say IRQ_WAKE_THREAD
Linus Torvalds [Sat, 20 Jun 2009 18:29:32 +0000 (11:29 -0700)]
Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (49 commits)
perfcounter: Handle some IO return values
perf_counter: Push perf_sample_data through the swcounter code
perf_counter tools: Define and use our own u64, s64 etc. definitions
perf_counter: Close race in perf_lock_task_context()
perf_counter, x86: Improve interactions with fast-gup
perf_counter: Simplify and fix task migration counting
perf_counter tools: Add a data file header
perf_counter: Update userspace callchain sampling uses
perf_counter: Make callchain samples extensible
perf report: Filter to parent set by default
perf_counter tools: Handle lost events
perf_counter: Add event overlow handling
fs: Provide empty .set_page_dirty() aop for anon inodes
perf_counter: tools: Makefile tweaks for 64-bit powerpc
perf_counter: powerpc: Add processor back-end for MPC7450 family
perf_counter: powerpc: Make powerpc perf_counter code safe for 32-bit kernels
perf_counter: powerpc: Change how processor-specific back-ends get selected
perf_counter: powerpc: Use unsigned long for register and constraint values
perf_counter: powerpc: Enable use of software counters on 32-bit powerpc
perf_counter tools: Add and use isprint()
...
Linus Torvalds [Sat, 20 Jun 2009 17:57:40 +0000 (10:57 -0700)]
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix out of scope variable access in sched_slice()
sched: Hide runqueues from direct refer at source code level
sched: Remove unneeded __ref tag
sched, x86: Fix cpufreq + sched_clock() TSC scaling
Linus Torvalds [Sat, 20 Jun 2009 17:56:46 +0000 (10:56 -0700)]
Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
tracing/urgent: warn in case of ftrace_start_up inbalance
tracing/urgent: fix unbalanced ftrace_start_up
function-graph: add stack frame test
function-graph: disable when both x86_32 and optimize for size are configured
ring-buffer: have benchmark test print to trace buffer
ring-buffer: do not grab locks in nmi
ring-buffer: add locks around rb_per_cpu_empty
ring-buffer: check for less than two in size allocation
ring-buffer: remove useless compile check for buffer_page size
ring-buffer: remove useless warn on check
ring-buffer: use BUF_PAGE_HDR_SIZE in calculating index
tracing: update sample event documentation
tracing/filters: fix race between filter setting and module unload
tracing/filters: free filter_string in destroy_preds()
ring-buffer: use commit counters for commit pointer accounting
ring-buffer: remove unused variable
ring-buffer: have benchmark test handle discarded events
ring-buffer: prevent adding write in discarded area
tracing/filters: strloc should be unsigned short
tracing/filters: operand can be negative
...
Fix up kmemcheck-induced conflict in kernel/trace/ring_buffer.c manually
Linus Torvalds [Sat, 20 Jun 2009 17:49:48 +0000 (10:49 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (45 commits)
x86, mce: fix error path in mce_create_device()
x86: use zalloc_cpumask_var for mce_dev_initialized
x86: fix duplicated sysfs attribute
x86: de-assembler-ize asm/desc.h
i386: fix/simplify espfix stack switching, move it into assembly
i386: fix return to 16-bit stack from NMI handler
x86, ioapic: Don't call disconnect_bsp_APIC if no APIC present
x86: Remove duplicated #include's
x86: msr.h linux/types.h is only required for __KERNEL__
x86: nmi: Add Intel processor 0x6f4 to NMI perfctr1 workaround
x86, mce: mce_intel.c needs <asm/apic.h>
x86: apic/io_apic.c: dmar_msi_type should be static
x86, io_apic.c: Work around compiler warning
x86: mce: Don't touch THERMAL_APIC_VECTOR if no active APIC present
x86: mce: Handle banks == 0 case in K7 quirk
x86, boot: use .code16gcc instead of .code16
x86: correct the conversion of EFI memory types
x86: cap iomem_resource to addressable physical memory
x86, mce: rename _64.c files which are no longer 64-bit-specific
x86, mce: mce.h cleanup
...
Manually fix up trivial conflict in arch/x86/mm/fault.c
Linus Torvalds [Sat, 20 Jun 2009 17:17:02 +0000 (10:17 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (35 commits)
Input: add driver for Synaptics I2C touchpad
Input: synaptics - add support for reporting x/y resolution
Input: ALPS - handle touchpoints buttons correctly
Input: gpio-keys - change timer to workqueue
Input: ads7846 - pin change interrupt support
Input: add support for touchscreen on W90P910 ARM platform
Input: appletouch - improve finger detection
Input: wacom - clear Intuos4 wheel data when finger leaves proximity
Input: ucb1400 - move static function from header into core
Input: add driver for EETI touchpanels
Input: ads7846 - more detailed model name in sysfs
Input: ads7846 - support swapping x and y axes
Input: ati_remote2 - use non-atomic bitops
Input: introduce lm8323 keypad driver
Input: psmouse - ESD workaround fix for OLPC XO touchpad
Input: tsc2007 - make sure platform provides get_pendown_state()
Input: uinput - flush all pending ff effects before destroying device
Input: simplify name handling for certain input handles
Input: serio - do not use deprecated dev.power.power_state
Input: wacom - add support for Intuos4 tablets
...
Linus Torvalds [Sat, 20 Jun 2009 17:15:30 +0000 (10:15 -0700)]
Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (24 commits)
agp/intel: Make intel_i965_mask_memory use dma_addr_t for physical addresses
agp: add user mapping support to ATI AGP bridge.
drm/i915: enable GEM on PAE.
drm/radeon: fix unused variables warning
agp: switch AGP to use page array instead of unsigned long array
agpgart: detected ALi M???? chipset with M1621
drm/radeon: command stream checker for r3xx-r5xx hardware
drm/radeon: Fully initialize LVDS info also when we can't get it from the ROM.
radeon: Fix CP byte order on big endian architectures with KMS.
agp/uninorth: Handle user memory types.
drm/ttm: Add some powerpc cache flush code.
radeon: Enable modesetting on non-x86.
drm/radeon: Respect AGP cant_use_aperture flag.
drm: EDID endianness fixes.
drm/radeon: this VRAM vs aperture test is wrong, just remove it.
drm/ttm: fix an error path to exit function correctly
drm: Apply "Memory fragmentation from lost alignment blocks"
ttm: Return -ERESTART when a signal interrupts bo eviction.
drm: Remove memory debugging infrastructure.
drm/i915: Clear fence register on tiling stride change.
...
Linus Torvalds [Sat, 20 Jun 2009 16:52:27 +0000 (09:52 -0700)]
x86: don't use 'access_ok()' as a range check in get_user_pages_fast()
It's really not right to use 'access_ok()', since that is meant for the
normal "get_user()" and "copy_from/to_user()" accesses, which are done
through the TLB, rather than through the page tables.
Why? access_ok() does both too few, and too many checks. Too many,
because it is meant for regular kernel accesses that will not honor the
'user' bit in the page tables, and because it honors the USER_DS vs
KERNEL_DS distinction that we shouldn't care about in GUP. And too few,
because it doesn't do the 'canonical' check on the address on x86-64,
since the TLB will do that for us.
So instead of using a function that isn't meant for this, and does
something else and much more complicated, just do the real rules: we
don't want the range to overflow, and on x86-64, we want it to be a
canonical low address (on 32-bit, all addresses are canonical).
Acked-by: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Simek [Sat, 20 Jun 2009 12:24:01 +0000 (14:24 +0200)]
microblaze: Add missing symbols for CONSTRUCTORS support
Commit b99b87f70c7785ab1e253c6220f4b0b57ce3a7f7 add CONSTRUCTOR
support to Linux but Microblaze not defined KERNEL_CTORS symbols
which are used with that patch.
This patch fixed it.
Arnd Bergmann [Thu, 18 Jun 2009 17:55:26 +0000 (19:55 +0200)]
microblaze: remove init_mm
Alexey removed the definition for init_mm from all architectures
but forgot microblaze, which was only recently added.
This fixes the microblaze build by dropping it there as well.
Randy Dunlap [Thu, 18 Jun 2009 00:36:15 +0000 (17:36 -0700)]
kernel-doc: ignore kmemcheck_bitfield_begin/end
Teach kernel-doc to ignore kmemcheck_bitfield_{begin,end} sugar
so that it won't generate warnings like this:
Warning(include/net/sock.h:297): No description found for parameter 'kmemcheck_bitfield_begin(flags)'
Warning(include/net/sock.h:297): No description found for parameter 'kmemcheck_bitfield_end(flags)'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Amerigo Wang [Fri, 19 Jun 2009 07:06:54 +0000 (03:06 -0400)]
kbuild: fix build error during make htmldocs
Fix the following build error when do 'make htmldocs':
DOCPROC Documentation/DocBook/debugobjects.xml
exec /scripts/kernel-doc: No such file or directory
exec /scripts/kernel-doc: No such file or directory
Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: WANG Cong <amwang@redhat.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Building perfcounter tools raises the following warnings:
builtin-record.c: In function ‘atexit_header’:
builtin-record.c:464: erreur: ignoring return value of ‘pwrite’, declared with attribute warn_unused_result
builtin-record.c: In function ‘__cmd_record’:
builtin-record.c:503: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result
builtin-report.c: In function ‘__cmd_report’:
builtin-report.c:1403: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result
This patch handles these IO return values.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1245456100-5477-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rainer Weikusat [Thu, 18 Jun 2009 15:04:00 +0000 (17:04 +0200)]
ide-cd: prevent null pointer deref via cdrom_newpc_intr
With 2.6.30, the error handling code in cdrom_newpc_intr was changed
to deal with partial request failures by normally completing the 'good'
parts of a request and only 'error' the last (and presumably,
incompletely transferred) bio associated with a particular
request. In order to do this, ide_complete_rq is called over
ide_cd_error_cmd() to partially complete the rq. The block layer
does partial completion only for requests with bio's and if the
rq doesn't have one (eg 'GPCMD_READ_DISC_INFO') the request is
completed as a whole and the drive->hwif->rq pointer set to NULL
afterwards. When calling ide_complete_rq again to report
the error, this null pointer is derefenced, resulting in a kernel
crash.
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=13399.
Mike Rapoport [Thu, 11 Jun 2009 15:08:39 +0000 (08:08 -0700)]
Input: add driver for Synaptics I2C touchpad
This driver supports Synaptics I2C touchpad controller on eXeda
mobile device. Unfortunaltely it only works in relative mode and
thus is not comaptible with Xorg Synaptics driver.
Signed-off-by: Igor Grinberg <grinberg@compulab.co.il> Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Tero Saarni [Thu, 11 Jun 2009 06:27:24 +0000 (23:27 -0700)]
Input: synaptics - add support for reporting x/y resolution
Synaptics uses anisotropic coordinate system. On some wide touchpads
vertical resolution can be twice as high as horizontal which causes
unequal sensitivity on x/y directions. Add support for reading the
resolution with EVIOCGABS ioctl.
This snapshot has been taken while neither the function tracer nor
the function graph tracer was running.
With dynamic ftrace, such results show a wrong ftrace behaviour
because all calls to ftrace_caller or ftrace_graph_caller (the patched
calls to mcount) are supposed to be patched into nop if none of those
tracers are running.
The problem occurs after the first run of the function tracer. Once we
launch it a second time, the callsites will never be nopped back,
unless you set custom filters.
For example it happens during the self tests at boot time.
The function tracer selftest runs, and then the dynamic tracing is
tested too. After that, the callsites are left un-nopped.
This is because the reset callback of the function tracer tries to
unregister two ftrace callbacks in once: the common function tracer
and the function tracer with stack backtrace, regardless of which
one is currently in use.
It then creates an unbalance on ftrace_start_up value which is expected
to be zero when the last ftrace callback is unregistered. When it
reaches zero, the FTRACE_DISABLE_CALLS is set on the next ftrace
command, triggering the patching into nop. But since it becomes
unbalanced, ie becomes lower than zero, if the kernel functions
are patched again (as in every further function tracer runs), they
won't ever be nopped back.
Note that ftrace_call and ftrace_graph_call are still patched back
to ftrace_stub in the off case, but not the callers of ftrace_call
and ftrace_graph_caller. It means that the tracing is well deactivated
but we waste a useless call into every kernel function.
This patch just unregisters the right ftrace_ops for the function
tracer on its reset callback and ignores the other one which is
not registered, fixing the unbalance. The problem also happens
is .30
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: stable@kernel.org
Linus Torvalds [Sat, 20 Jun 2009 00:45:51 +0000 (17:45 -0700)]
Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c: New macro to initialize i2c address lists on the fly
i2c: Don't advertise i2c functions when not available
i2c: Use rwsem instead of mutex for board info
i2c: Add a sysfs interface to instantiate devices
i2c: Limit core locking to the necessary sections
i2c: Kill the redundant client list
i2c: Kill is_newstyle_driver
i2c: Merge i2c_attach_client into i2c_new_device
i2c: Drop i2c_probe function
i2c: Get rid of the legacy binding model
i2c: Kill client_register and client_unregister methods
Linus Torvalds [Sat, 20 Jun 2009 00:34:46 +0000 (17:34 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin:
Blackfin: convert page/tlb to asm-generic
Blackfin: convert types to asm-generic
Blackfin: convert irq/process to asm-generic
Blackfin: convert signal/mmap to asm-generic
Blackfin: convert locking primitives to asm-generic
Blackfin: convert termios to asm-generic
Blackfin: convert simple headers to asm-generic
Blackfin: convert socket/poll to asm-generic
Blackfin: convert user/elf to asm-generic
Blackfin: convert shm/sysv/ipc to asm-generic
Blackfin: convert asm/ioctls.h to asm-generic/ioctls.h
Blackfin: only build irqpanic.c when needed
Blackfin: pull in asm/io.h in ksyms for prototypes
Blackfin: use common test_bit() rather than __test_bit()
Joe Perches [Thu, 18 Jun 2009 23:49:25 +0000 (16:49 -0700)]
MAINTAINERS: kmemtrace pattern update
Signed-off-by: Joe Perches <joe@perches.com> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Ingo Molnar <mingo@elte.hu> Cc: Zhao Lei <zhaolei@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Thu, 18 Jun 2009 23:49:23 +0000 (16:49 -0700)]
MAINTAINERS: update Ftrace documentation pattern
Signed-off-by: Joe Perches <joe@perches.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Reported-by: Werner Lemberg <wl@gnu.org> Cc: Michal Januszewski <spock@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Thu, 18 Jun 2009 23:49:17 +0000 (16:49 -0700)]
convert some DMA_nnBIT_MASK() callers
We're about to make DMA_nnBIT_MASK() emit `deprecated' warnings. Convert the
remaining stragglers which are visible to the x86_64 build.
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Eric Moore <Eric.Moore@lsil.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Yi Zou <yi.zou@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dirk Eibach [Thu, 18 Jun 2009 23:49:15 +0000 (16:49 -0700)]
char: moxa, prevent opening unavailable ports
In moxa.c there are 32 minor numbers reserved for each device. The number
of ports actually available per device is stored in
moxa_board_conf->numPorts. This number is not considered in moxa_open().
Opening a port that is not available results in a kernel oops. This patch
adds a test to moxa_open() that prevents opening unavailable ports.
Mike Frysinger [Thu, 18 Jun 2009 23:49:14 +0000 (16:49 -0700)]
istallion: add missing __devexit marking
The remove member of the pci_driver stli_pcidriver uses __devexit_p(), so
the remove function itself should be marked with __devexit. Even more so
considering the probe function is marked with __devinit.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roel Kluin [Thu, 18 Jun 2009 23:49:13 +0000 (16:49 -0700)]
dtlk: off by one in {read,write}_tts()
With a postfix increment retries is incremented beyond DTLK_MAX_RETRIES so
the error message is not displayed correctly.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: James R. Van Zandt <jrv@vanzandt.mv.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Thu, 18 Jun 2009 23:49:11 +0000 (16:49 -0700)]
ptrace: wait_task_zombie: do not account traced sub-threads
The bug is ancient.
If we trace the sub-thread of our natural child and this sub-thread exits,
we update parent->signal->cxxx fields. But we should not do this until
the whole thread-group exits, otherwise we account this thread (and all
other live threads) twice.
Add the task_detached() check. No need to check thread_group_empty(),
wait_consider_task()->delay_group_leader() already did this.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Roland McGrath <roland@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Vitaly Mayatskikh <vmayatsk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Atsushi Nemoto [Thu, 18 Jun 2009 23:49:09 +0000 (16:49 -0700)]
rtc: make rtc_update_irq callable with irqs enabled
The rtc_update_irq() might be called with irqs enabled, if a interrupt
handler was registered without IRQF_DISABLED. Use
spin_lock_irqsave/spin_unlock_irqrestore instead of spin_lock/spin_unlock.
Also update kerneldoc and drivers which do extra work to follow the
current interface spec, as suggestted by David Brownell.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anton Vorontsov [Thu, 18 Jun 2009 23:49:08 +0000 (16:49 -0700)]
spi_mpc8xxx: s/83xx/8xxx/g
Since we renamed the file, we might want to rename the file internals too.
Though we don't bother with changing platform driver name and platform
module alias. The stuff is legacy and hopefully we'll remove it soon.
Suggested-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Cc: David Brownell <david-b@pacbell.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>