Jiri Olsa [Wed, 5 Sep 2012 17:51:33 +0000 (19:51 +0200)]
perf tools: Fix cache event name generation
If the event name is specified with all 3 components, the last one
overwrites the previous one during the name composing within the
parse_events_add_cache function.
Fixing this by properly adjusting the string index.
Reported-by: Joel Uckelman <joel@lightboxtechnologies.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joel Uckelman <joel@lightboxtechnologies.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LPU-Reference: 20120905175133.GA18352@krava.brq.redhat.com
[ committer note: Remove the newline fix, done already in 42e1fb7 ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf test: Add roundtrip test for hardware cache events
That nicely catches the problem reported by Joel Uckelman in
http://permalink.gmane.org/gmane.linux.kernel.perf.user/1016 :
[root@sandy ~]# perf test
1: vmlinux symtab matches kallsyms: Ok
2: detect open syscall event: Ok
3: detect open syscall event on all cpus: Ok
4: read samples using the mmap interface: Ok
5: parse events tests: Ok
6: x86 rdpmc test: Ok
7: Validate PERF_RECORD_* events & perf_sample fields: Ok
8: Test perf pmu format parsing: Ok
9: Test dso data interface: Ok
10: roundtrip evsel->name check: FAILED!
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC bug fixes from Olof Johansson:
"Mostly Renesas and Atmel bugfixes this time, targeting boot and build
problems. A couple of patches for gemini and kirkwood as well. On a
whole nothing very controversial."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: gemini: fix the gemini build
ARM: shmobile: armadillo800eva: enable rw rootfs mount
ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
ARM: shmobile: mackerel: fixup usb module order
ARM: shmobile: armadillo800eva: fixup: sound card detection order
ARM: shmobile: marzen: fixup smsc911x id for regulator
ARM: at91/feature-removal-schedule: delay at91_mci removal
ARM: mach-shmobile: armadillo800eva: Enable power button as wakeup source
ARM: mach-shmobile: armadillo800eva: Fix GPIO buttons descriptions
ARM: at91/dts: remove partial parameter in at91sam9g25ek.dts
ARM: at91/clock: fix PLLA overclock warning
ARM: at91: fix rtc-at91sam9 irq issue due to sparse irq support
ARM: at91: fix system timer irq issue due to sparse irq support
ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull a hwmon fix from Guenter Roeck:
"One patch, fixing DIV_ROUND_CLOSEST to support negative dividends.
While the changes are not in the drivers/hwmon directory, the problem
primarily affects hwmon drivers, and it makes sense to push the patch
through the hwmon tree."
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends
Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild fixes from Michal Marek:
"These are two fixes that should go into 3.6. The link-vmlinux.sh one
is obvious.
The other one fixes make firmware_install with certain configurations,
where a file in the toplevel firmware tree gets installed first, and
$(INSTALL_FW_PATH)/$$(dir <file>) results in /lib/firmware/./, which
confuses make 3.82 for some reason."
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
firmware: fix directory creation rule matching with make 3.82
link-vmlinux.sh: Fix stray "echo" in error message
perf test: Add round trip test for sw and hw event names
It basically traverses the hardware and software event name arrays
creating an evlist with all events, then it uses perf_evsel__name to
check that the name is the expected one.
With it I noticed this problem:
[root@sandy ~]# perf test 10
10: roundtrip evsel->name check:invalid or unsupported event: 'CPU-migrations'
Run 'perf list' for a list of valid events
FAILED!
Changed it to "cpu-migrations" in the software event arrays and it
worked.
This is to catch problems like the one reported by Joel Uckelman in
http://permalink.gmane.org/gmane.linux.kernel.perf.user/1016
Hardware cache events will be checked in the following patch.
Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-5jskfkuqvf2fi257zmni0ftz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Thu, 6 Sep 2012 02:10:46 +0000 (11:10 +0900)]
perf header: Prepare tracepoint events regardless of name
Current perf_evlist__set_tracepoint_names is a misnomer because it finds
and sets correspoding event_format in addition to the name. So skipping
it when a event has set name already caused a trouble.
Rename it and set name only a event doesn't have one.
Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1346897446-16569-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Thu, 6 Sep 2012 00:53:36 +0000 (18:53 -0600)]
perf tools: Clean target should do clean for lib/traceevent too
It's built as part of perf, so it should be cleaned too.
Tested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1346892816-61779-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Pull MMC fixes from Chris Ball:
- a firmware bug on several Samsung MoviNAND eMMC models causes
permanent corruption on the device when secure erase and secure trim
requests are made, so we disable those requests on these eMMC devices.
- atmel-mci: fix a hang with some SD cards by waiting for not-busy flag.
- dw_mmc: low-power mode breaks SDIO interrupts; fix PIO error handling;
fix handling of error interrupts.
- mxs-mmc: fix deadlocks; fix compile error due to dma.h arch change.
- omap: fix broken PIO mode causing memory corruption.
- sdhci-esdhc: fix card detection.
* tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
mmc: omap: fix broken PIO mode
mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
mmc: dw_mmc: fix error handling in PIO mode
mmc: dw_mmc: correct mishandling error interrupt
mmc: dw_mmc: amend using error interrupt status
mmc: atmel-mci: not busy flag has also to be used for read operations
mmc: sdhci-esdhc: break out early if clock is 0
mmc: mxs-mmc: fix deadlock caused by recursion loop
mmc: mxs-mmc: fix deadlock in SDIO IRQ case
mmc: bfin_sdh: fix dma_desc_array build error
arch/um/os-Linux/time.c: In function 'deliver_alarm':
arch/um/os-Linux/time.c:117:3: error: too few arguments to function 'alarm_handler'
arch/um/os-Linux/internal.h:1:6: note: declared here
The error was introduced by commit d3c1cfcd ("um: pass siginfo to guest
process") in 3.6-rc1.
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Benjamin Herrenschmidt:
"Here are a few fixes for 3.6 that were piling up while I was away or
busy (I was mostly MIA a week or two before San Diego).
Some fixes from Anton fixing up issues with our relatively new DSCR
control feature, and a few other fixes that are either regressions or
bugs nasty enough to warrant not waiting."
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Don't use __put_user() in patch_instruction
powerpc: Make sure IPI handlers see data written by IPI senders
powerpc: Restore correct DSCR in context switch
powerpc: Fix DSCR inheritance in copy_thread()
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
powerpc/powernv: Always go into nap mode when CPU is offline
powerpc: Give hypervisor decrementer interrupts their own handler
powerpc/vphn: Fix arch_update_cpu_topology() return value
Merge tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"These are some GPIO regression fixes for v3.6:
- Erroneous debug message from of_get_named_gpio_flags()
- Make sure the MC9S08DZ60 GPIO driver depend on I2C being compiled
in (not module) or allmodconfig breaks.
- Check return value from irq_alloc_descs() in the Emma Mobile GPIO
driver.
- Assign the owner field for the rdc321x driver so the module won't
be removed if it has active GPIOs."
* tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: rdc321x: Prevent removal of modules exporting active GPIOs
gpio: em: Fix checking return value of irq_alloc_descs
gpio: mc9s08dz60: Fix build error if I2C=m
gpio: Fix debug message in of_get_named_gpio_flags()
Merge tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"There are nothing scaring, contains only small fixes for HD-audio and
USB-audio:
- EPSS regression fix and GPIO fix for HD-audio IDT codecs
- A series of USB-audio regression fixes that are found since 3.5
kernel"
* tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: snd-usb: fix cross-interface streaming devices
ALSA: snd-usb: fix calls to next_packet_size
ALSA: snd-usb: restore delay information
ALSA: snd-usb: use list_for_each_safe for endpoint resources
ALSA: snd-usb: Fix URB cancellation at stream start
ALSA: hda - Don't trust codec EPSS bit for IDT 92HD83xx & co
ALSA: hda - Avoid unnecessary parameter read for EPSS
ALSA: hda - Do not set GPIOs for speakers on IDT if there are no speakers
Merge tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6
Pull fbdev fixes from Florian Tobias Schandinat:
- a fix by Paul Cercueil to prevent a possible buffer overflow
- a fix by Bruno Prémont to prevent a rare sleep in invalid context
- a fix by Julia Lawall for a double free in auo_k190x
- a fix by Dan Carpenter to prevent a division by zero in mb862xxfb
- a regression fix by Tomi Valkeinen for the SDI output in OMAP
- a fix by Grazvydas Ignotas to fix the console colors in OMAP
* tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6:
OMAPFB: fix framebuffer console colors
OMAPDSS: Fix SDI PLL locking
video: mb862xxfb: prevent divide by zero bug
drivers/video/auo_k190x.c: drop kfree of devm_kzalloc's data
fbcon: Fix bit_putcs() call to kmalloc(s, GFP_KERNEL)
fbcon: prevent possible buffer overflow.
Merge tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi
Pull ubi fix from Artem Bityutskiy:
"A single small fix for memory deallocation: we allocated memory using
'kmem_cache_alloc()' but were freeing it using 'kfree()' in some
cases. Now we fix this by using 'kmem_cache_free()' instead."
* tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi:
UBI: fix a horrible memory deallocation bug
Fix order of arguments to compat_put_time[spec|val]
Commit 644595f89620 ("compat: Handle COMPAT_USE_64BIT_TIME in
net/socket.c") introduced a bug where the helper functions to take
either a 64-bit or compat time[spec|val] got the arguments in the wrong
order, passing the kernel stack pointer off as a user pointer (and vice
versa).
Because of the user address range check, that in turn then causes an
EFAULT due to the user pointer range checking failing for the kernel
address. Incorrectly resuling in a failed system call for 32-bit
processes with a 64-bit kernel.
On odder architectures like HP-PA (with separate user/kernel address
spaces), it can be used read kernel memory.
perf tools: Allow user to indicate path to objdump in command line
When analyzing perf data from hosts of other architecture than one of
the local host it's useful to call objdump that is part of a toolchain
for that architecture. Instead of calling regular objdump, call one that
user specified in command line.
perf tools: Remove the node from rblist in strlist__remove
The following commit:
author David Ahern <dsahern@gmail.com>
Tue, 31 Jul 2012 04:31:33 +0000 (22:31 -0600)
committer Arnaldo Carvalho de Melo <acme@redhat.com>
Fri, 3 Aug 2012 13:39:51 +0000 (10:39 -0300)
commit ee8dd3ca43f151d9fbe1edeef68fb8a77eb9f047
causes a double free during a probe deletion as the node is never
removed from the list via strlist__remove(), even though it gets
'deleted' (read free()'d). This causes a double free when we do
strlist__delete() as the node is already deleted but present in the
rblist.
[suzukikp@suzukikp perf]$ sudo ./perf probe -a do_fork
Added new event:
probe:do_fork (on do_fork)
Make sure we remove the node from the rblist before we delete the node.
The rblist__remove_node() will invoke rblist->node_delete, which will
take care of deleting the node with the suitable function provided by
the user.
Reported-by: Ananth N. Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120829055840.7802.1459.stgit@suzukikp.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Similar to the one in :
https://lkml.org/lkml/2012/8/29/27
Make sure we remove the node from the rblist before we delete the node.
The rblist__remove_node() will invoke rblist->node_delete, which will
take care of deleting the node with the suitable function provided by
the user.
Signed-off-by: Suzuki K Poulose <suzuki@in.ibm.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Suzuki K Poulose <suzuki@in.ibm.com> Link: http://lkml.kernel.org/r/20120831065840.5167.90318.stgit@suzukikp.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Mon, 27 Aug 2012 19:05:54 +0000 (13:05 -0600)]
perf tools: Fix x86 builds with ARCH specified on the command line
e.g., compiling i386 on x86_64 using:
$ make -C tools/perf ARCH=i386
fails with:
CC /tmp/pbuild/util/evsel.o
In file included from util/evsel.c:21:0:
util/perf_regs.h:5:23: fatal error: perf_regs.h: No such file or directory
compilation terminated.
Adding V=1 you see that the include argument for the arch is
'-Iarch/i386/include' is wrong. It is supposed to be -Iarch/x86/include
per the redefinition of ARCH in the Makefile.
According to the make manual,
http://www.gnu.org/software/make/manual/make.html#Override-Directive:
"If a variable has been set with a command argument (see Overriding
Variables), then ordinary assignments in the makefile are ignored. If
you want to set the variable in the makefile even though it was set
with a command argument, you can use an override directive ..."
Make it so.
Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1346094354-74356-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Joe Millenbach [Mon, 3 Sep 2012 00:38:20 +0000 (17:38 -0700)]
x86/kconfig: Remove outdated reference to Intel CPUs in CONFIG_SWIOTLB
Deleted the no longer valid example of which x86 CPUs lack a
hardware IOMMU, and moved the "If unsure..." statement to a new
line to follow the style of surrounding options.
x86/Kconfig: Turn off DEBUG_NX_TEST module in defconfigs
The x86 defconfigs include exactly one module: test_nx.ko, a
special-purpose module which just exists to do evil things like
executing code off the stack to see if the kernel has enabled NX
support. Anyone who actually uses that module can easily enable
it themselves, but the vast majority of kernel builds don't need
it; disable it by default.
Signed-off-by: Josh Triplett <josh@joshtriplett.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@linux.intel.com> Link: http://lkml.kernel.org/r/e72faf875e1172fb1cbec5e6d3cd4122df508a97.1346649518.git.josh@joshtriplett.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
The vast majority of systems either use initramfs or mount a
root filesystem directly from the kernel. Distros have
defaulted to initramfs for years. Only highly specialized
systems would use an actual filesystem-image initrd at this
point, and such systems don't rely on defconfig anyway. Drop
initrd support (and specifically RAM block device support) from
the defconfigs.
Signed-off-by: Josh Triplett <josh@joshtriplett.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/2521e983a63595cd7a331236d929577660f89c72.1346649518.git.josh@joshtriplett.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
x86/Kconfig: Disable CONFIG_CRC_T10DIF in defconfigs
CONFIG_CRC_T10DIF explicitly states that it exists only for use
by out-of-tree modules; anything in-kernel that needs it selects
it. Thus, compile it out by default.
Signed-off-by: Josh Triplett <josh@joshtriplett.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/3aaff7a0af1320427952d411a21b8ded29747a1f.1346649518.git.josh@joshtriplett.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
The current x86 and x86-64 defconfigs do not enable ext4, which
most current distributions default to. Switch the defconfigs to
ext4, so they will boot on current systems without additional
configuration.
Signed-off-by: Josh Triplett <josh@joshtriplett.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/bd8a359506b7e1287c680823de16d67608ec52fe.1346649518.git.josh@joshtriplett.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
x86/Kconfig: Update defconfigs to current results of "make savedefconfig"
The x86 defconfigs have become somewhat out of date compared to
the current result of "make savedefconfig". Update them to the
current output, as a prelude to further defconfig changes, to
avoid unrelated noise in those further changes.
Signed-off-by: Josh Triplett <josh@joshtriplett.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/80c8a5fbeaf6cdb72fb78a016013427efee52668.1346649518.git.josh@joshtriplett.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
powerpc: Don't use __put_user() in patch_instruction
patch_instruction() can be called very early on ppc32, when the kernel
isn't yet running at it's linked address. That can cause the !
is_kernel_addr() test in __put_user() to trip and call might_sleep()
which is very bad at that point during boot.
Use a lower level function instead for now, at least until we get to
rework ppc32 boot process to do the code patching later, like ppc64
does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Tue, 4 Sep 2012 18:33:08 +0000 (18:33 +0000)]
powerpc: Make sure IPI handlers see data written by IPI senders
We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state. This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux(). Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.
In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_message_pass() is a full barrier to ensure that
there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.
This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store. This adds an explicit barrier in the two remaining cases.
These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.
The analysis of the reason why processes were not waking up properly
is due to Milton Miller.
Cc: stable@vger.kernel.org # v3.0+ Reported-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 3 Sep 2012 16:51:10 +0000 (16:51 +0000)]
powerpc: Restore correct DSCR in context switch
During a context switch we always restore the per thread DSCR value.
If we aren't doing explicit DSCR management
(ie thread.dscr_inherit == 0) and the default DSCR changed while
the process has been sleeping we end up with the wrong value.
Check thread.dscr_inherit and select the default DSCR or per thread
DSCR as required.
This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):
Anton Blanchard [Mon, 3 Sep 2012 16:49:47 +0000 (16:49 +0000)]
powerpc: Fix DSCR inheritance in copy_thread()
If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.
We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.
Anton Blanchard [Mon, 3 Sep 2012 16:48:46 +0000 (16:48 +0000)]
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.
If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.
Anton Blanchard [Mon, 3 Sep 2012 16:47:56 +0000 (16:47 +0000)]
powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.
This issue was found with the following test case:
Paul Mackerras [Thu, 26 Jul 2012 18:51:09 +0000 (18:51 +0000)]
powerpc/powernv: Always go into nap mode when CPU is offline
The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode. Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Thu, 26 Jul 2012 13:56:11 +0000 (13:56 +0000)]
powerpc: Give hypervisor decrementer interrupts their own handler
At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event. In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.
When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1. Thus this adds an empty handler
function for them. We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Walleij [Thu, 30 Aug 2012 17:22:36 +0000 (19:22 +0200)]
ARM: gemini: fix the gemini build
Test-compiling obscure machines I notice that the gemini (which
by the way lacks a defconfig) is broken since some time back.
Adding a simple missing include makes it build again.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
[ 2.086181] Waiting for root device /dev/mmcblk0p1...
[ 2.324066] Unhandled fault: imprecise external abort (0x406) at 0x00000000
[ 2.331451] Internal error: : 406 [#1] ARM
[ 2.335784] Modules linked in:
[ 2.339050] CPU: 0 Not tainted (3.6.0-rc3 #60)
[ 2.344146] PC is at default_idle+0x28/0x30
[ 2.348602] LR is at trace_hardirqs_on_caller+0x15c/0x1b0
...
This turned out to be due to memory corruption caused by long-broken
PIO code in drivers/mmc/host/omap.c. (Previously, this driver had
been using DMA; but the above commit caused the MMC driver to fall
back to PIO mode with an unmodified Kconfig.)
The PIO code, added with the rest of the driver in commit 730c9b7e6630f786fcec026fb11d2e6f2c90fdcb ("[MMC] Add OMAP MMC host
driver"), confused bytes with 16-bit words. This bug caused memory
located after the PIO transfer buffer to be corrupted with transfers
larger than 32 bytes. The driver also did not increment the buffer
pointer after the transfer occurred. This bug resulted in data
corruption during any transfer larger than 64 bytes.
Signed-off-by: Paul Walmsley <paul@pwsan.com> Reviewed-by: Felipe Balbi <balbi@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Chris Ball <cjb@laptop.org>
Ian Chen [Wed, 29 Aug 2012 06:05:36 +0000 (15:05 +0900)]
mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
For several MoviNAND eMMC parts, there are known issues with secure
erase and secure trim. For these specific MoviNAND devices, we skip
these operations.
Specifically, there is a bug in the eMMC firmware that causes
unrecoverable corruption when the MMC is erased with MMC_CAP_ERASE
enabled.
Doug Anderson [Wed, 25 Jul 2012 15:33:17 +0000 (08:33 -0700)]
mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
The documentation for the dw_mmc part says that the low power
mode should normally only be set for MMC and SD memory and should
be turned off for SDIO cards that need interrupts detected.
The best place I could find to do this is when the SDIO interrupt
was first enabled. I rely on the fact that dw_mci_setup_bus()
will be called when it's time to reenable.
Signed-off-by: Doug Anderson <dianders@chromium.org> Acked-by: Seungwon Jeon <tgih.jun@samsung.com> Signed-off-by: Chris Ball <cjb@laptop.org>
Seungwon Jeon [Wed, 1 Aug 2012 00:30:46 +0000 (09:30 +0900)]
mmc: dw_mmc: fix error handling in PIO mode
Data transfer will be continued until all the bytes are transmitted,
even if data crc error occurs during a multiple-block data transfer.
This means RXDR/TXDR interrupts will occurs until data transfer is
terminated. Early setting of host->sg to NULL prevents going into
xxx_data_pio functions, hence permanent unhandled RXDR/TXDR interrupts
occurs. And checking error interrupt status in the xxx_data_pio functions
is no need because dw_mci_interrupt does do the same. This patch also
removes it.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com> Acked-by: Jaehoon Chung <jh80.chung@samsung.com> Acked-by: Will Newton <will.newton@imgtec.com> Signed-off-by: Chris Ball <cjb@laptop.org>
Seungwon Jeon [Wed, 1 Aug 2012 00:30:40 +0000 (09:30 +0900)]
mmc: dw_mmc: correct mishandling error interrupt
Datasheet of SYNOPSYS mentions that DTO(Data Transfer Over) interrupt
will be raised even if some error interrupts, however it is actually
found that DTO does not occur. SYNOPSYS has confirmed this issue.
Current implementation defers the call of tasklet_schedule until DTO
when the error interrupts is happened. This patch fixes error handling.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com> Acked-by: Jaehoon Chung <jh80.chung@samsung.com> Acked-by: Will Newton <will.newton@imgtec.com> Signed-off-by: Chris Ball <cjb@laptop.org>
Seungwon Jeon [Wed, 1 Aug 2012 00:30:30 +0000 (09:30 +0900)]
mmc: dw_mmc: amend using error interrupt status
RINTSTS status includes masked interrupts as well as unmasked.
data_status and cmd_status are set by value of RINTSTS in interrupt handler
and tasklet finally uses it to decide whether error is happened or not.
In addition, MINTSTS status is used for setting data_status in PIO.
Masked error interrupt will not be handled and that status can be considered
non-error case.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Reviewed By: Girish K S <girish.shivananjappa@linaro.org> Acked-by: Jaehoon Chung <jh80.chung@samsung.com> Acked-by: Will Newton <will.newton@imgtec.com> Signed-off-by: Chris Ball <cjb@laptop.org>
mmc: atmel-mci: not busy flag has also to be used for read operations
Even if the datasheet says that the not busy flag has to be used only
for write operations, it's false except for version lesser than v2xx.
Not waiting on the not busy flag for read operations can cause the
controller to hang-up during the initialization of some SD cards
with DMA after the first CMD6 -- the next command is sent too early.
Shawn Guo [Wed, 22 Aug 2012 15:10:01 +0000 (23:10 +0800)]
mmc: sdhci-esdhc: break out early if clock is 0
Since commit 30832ab56 ("mmc: sdhci: Always pass clock request value
zero to set_clock host op") was merged, esdhc_set_clock starts hitting
"if (clock == 0)" where ESDHC_SYSTEM_CONTROL has been operated. This
causes SDHCI card-detection function being broken. Fix the regression
by moving "if (clock == 0)" above ESDHC_SYSTEM_CONTROL operation.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>
K.Prasad [Thu, 2 Aug 2012 08:16:35 +0000 (13:46 +0530)]
perf/hwpb: Invoke __perf_event_disable() if interrupts are already disabled
While debugging a warning message on PowerPC while using hardware
breakpoints, it was discovered that when perf_event_disable is invoked
through hw_breakpoint_handler function with interrupts disabled, a
subsequent IPI in the code path would trigger a WARN_ON_ONCE message in
smp_call_function_single function.
This patch calls __perf_event_disable() when interrupts are already
disabled, instead of perf_event_disable().
Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com> Signed-off-by: K.Prasad <Prasad.Krishnan@gmail.com>
[naveen.n.rao@linux.vnet.ibm.com: v3: Check to make sure we target current task] Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120802081635.5811.17737.stgit@localhost.localdomain
[ Fixed build error on MIPS. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Al Viro [Mon, 20 Aug 2012 13:59:25 +0000 (14:59 +0100)]
perf_event: Switch to internal refcount, fix race with close()
Don't mess with file refcounts (or keep a reference to file, for
that matter) in perf_event. Use explicit refcount of its own
instead. Deal with the race between the final reference to event
going away and new children getting created for it by use of
atomic_long_inc_not_zero() in inherit_event(); just have the
latter free what it had allocated and return NULL, that works
out just fine (children of siblings of something doomed are
created as singletons, same as if the child of leader had been
created and immediately killed).
Michael Wang [Tue, 3 Jul 2012 06:34:02 +0000 (14:34 +0800)]
sched: Remove useless code in yield_to()
It's impossible to enter the else branch if we have set
skip_clock_update in task_yield_fair(), as yield_to_task_fair()
will directly return true after invoke task_yield_fair().
Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4FF2925A.9060005@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Namhyung Kim [Thu, 16 Aug 2012 08:03:24 +0000 (17:03 +0900)]
sched/debug: Limit sd->*_idx range on sysctl
Various sd->*_idx's are used for refering the rq's load average table
when selecting a cpu to run. However they can be set to any number
with sysctl knobs so that it can crash the kernel if something bad is
given. Fix it by limiting them into the actual range.
Randy Dunlap [Sun, 19 Aug 2012 00:45:08 +0000 (17:45 -0700)]
sched: Fix kernel-doc warnings in kernel/sched/fair.c
Fix two kernel-doc warnings in kernel/sched/fair.c:
Warning(kernel/sched/fair.c:3660): Excess function parameter 'cpus' description in 'update_sg_lb_stats'
Warning(kernel/sched/fair.c:3806): Excess function parameter 'cpus' description in 'update_sd_lb_stats'
sched: Unthrottle rt runqueues in __disable_runtime()
migrate_tasks() uses _pick_next_task_rt() to get tasks from the
real-time runqueues to be migrated. When rt_rq is throttled
_pick_next_task_rt() won't return anything, in which case
migrate_tasks() can't move all threads over and gets stuck in an
infinite loop.
Instead unthrottle rt runqueues before migrating tasks.
Additionally: move unthrottle_offline_cfs_rqs() to rq_offline_fair()
Peter Zijlstra [Mon, 20 Aug 2012 09:26:57 +0000 (11:26 +0200)]
sched: Fix load avg vs cpu-hotplug
Rabik and Paul reported two different issues related to the same few
lines of code.
Rabik's issue is that the nr_uninterruptible migration code is wrong in
that he sees artifacts due to this (Rabik please do expand in more
detail).
Paul's issue is that this code as it stands relies on us using
stop_machine() for unplug, we all would like to remove this assumption
so that eventually we can remove this stop_machine() usage altogether.
The only reason we'd have to migrate nr_uninterruptible is so that we
could use for_each_online_cpu() loops in favour of
for_each_possible_cpu() loops, however since nr_uninterruptible() is the
only such loop and its using possible lets not bother at all.
The problem Rabik sees is (probably) caused by the fact that by
migrating nr_uninterruptible we screw rq->calc_load_active for both rqs
involved.
So don't bother with fancy migration schemes (meaning we now have to
keep using for_each_possible_cpu()) and instead fold any nr_active delta
after we migrate all tasks away to make sure we don't have any skewed
nr_active accounting.
Reported-by: Rakib Mullick <rakib.mullick@gmail.com> Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1345454817.23018.27.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>