git.karo-electronics.de Git - karo-tx-linux.git/log

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

- Rename libtraceevent 'private' struct member to 'priv' so that it works
   in C++, from Steven Rostedt

- Remove lots of exit()/die() calls from tools so that the main perf exit
   routine can take place, from David Ahern

- Fix x86 build on x86-64, from David Ahern.

- Remove some headers that prevented perf from building on Android,
   from David Ahern

- {int,str,rb}list fixes from Suzuki K Poulose

- perf.data header fixes from Namhyung Kim

- Replace needless mempcpy with memcpy, to allow build on Android, from Irina Tirdea

- Allow user to indicate objdump path, needed in cross environments, from
   Maciek Borzecki

- Fix hardware cache event name generation, fix from Jiri Olsa

- Add round trip test for sw, hw and cache event names, catching the
   problem Jiri fixed, after Jiri's patch, the test passes successfully.

- Clean target should do clean for lib/traceevent too, fix from David Ahern

- Check the right variable for allocation failure, fix from Namhyung Kim

- Set up evsel->tp_format regardless of evsel->name being set already,
   fix from Namhyung Kim

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge branch 'linus'

perf tools: Fix cache event name generation

If the event name is specified with all 3 components, the last one
overwrites the previous one during the name composing within the
parse_events_add_cache function.

Fixing this by properly adjusting the string index.

Reported-by: Joel Uckelman <joel@lightboxtechnologies.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joel Uckelman <joel@lightboxtechnologies.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LPU-Reference: 20120905175133.GA18352@krava.brq.redhat.com
[ committer note: Remove the newline fix, done already in 42e1fb7 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test: Add roundtrip test for hardware cache events

That nicely catches the problem reported by Joel Uckelman in
http://permalink.gmane.org/gmane.linux.kernel.perf.user/1016 :

  [root@sandy ~]# perf test
   1: vmlinux symtab matches kallsyms: Ok
   2: detect open syscall event: Ok
   3: detect open syscall event on all cpus: Ok
   4: read samples using the mmap interface: Ok
   5: parse events tests: Ok
   6: x86 rdpmc test: Ok
   7: Validate PERF_RECORD_* events & perf_sample fields: Ok
   8: Test perf pmu format parsing: Ok
   9: Test dso data interface: Ok
  10: roundtrip evsel->name check: FAILED!

  [root@sandy ~]# perf test -v 10
  10: roundtrip evsel->name check:
  --- start ---
  L1-dcache-misses != L1-dcache-load-misses
  L1-dcache-misses != L1-dcache-store-misses
  L1-dcache-misses != L1-dcache-prefetch-misses
  L1-icache-misses != L1-icache-load-misses
  L1-icache-misses != L1-icache-prefetch-misses
  LLC-misses != LLC-load-misses
  LLC-misses != LLC-store-misses
  LLC-misses != LLC-prefetch-misses
  dTLB-misses != dTLB-load-misses
  dTLB-misses != dTLB-store-misses
  dTLB-misses != dTLB-prefetch-misses
  iTLB-misses != iTLB-load-misses
  branch-misses != branch-load-misses
  node-misses != node-load-misses
  node-misses != node-store-misses
  node-misses != node-prefetch-misses
  ---- end ----
  roundtrip evsel->name check: FAILED!

  [root@sandy ~]#

Now lemme apply Jiri's fix and try it again...

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Joel Uckelman <joel@lightboxtechnologies.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-bbewtxw0rfipp5qy1j3jtg5d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Add fprintf method

For debugging, etc.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-fjimge1ovgh976qlt8dtmlp0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove extraneous newline when parsing hardware cache events

Noticed while developing a 'perf test' entry to verify that
perf_evsel__name works.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-xz6zgh38mp3cjnd2udh38z8f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC bug fixes from Olof Johansson:
"Mostly Renesas and Atmel bugfixes this time, targeting boot and build
  problems.  A couple of patches for gemini and kirkwood as well.  On a
  whole nothing very controversial."

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: gemini: fix the gemini build
  ARM: shmobile: armadillo800eva: enable rw rootfs mount
  ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
  ARM: shmobile: mackerel: fixup usb module order
  ARM: shmobile: armadillo800eva: fixup: sound card detection order
  ARM: shmobile: marzen: fixup smsc911x id for regulator
  ARM: at91/feature-removal-schedule: delay at91_mci removal
  ARM: mach-shmobile: armadillo800eva: Enable power button as wakeup source
  ARM: mach-shmobile: armadillo800eva: Fix GPIO buttons descriptions
  ARM: at91/dts: remove partial parameter in at91sam9g25ek.dts
  ARM: at91/clock: fix PLLA overclock warning
  ARM: at91: fix rtc-at91sam9 irq issue due to sparse irq support
  ARM: at91: fix system timer irq issue due to sparse irq support
  ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc

Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull a hwmon fix from Guenter Roeck:
"One patch, fixing DIV_ROUND_CLOSEST to support negative dividends.

  While the changes are not in the drivers/hwmon directory, the problem
  primarily affects hwmon drivers, and it makes sense to push the patch
  through the hwmon tree."

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends

Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull kbuild fixes from Michal Marek:
"These are two fixes that should go into 3.6.  The link-vmlinux.sh one
  is obvious.

  The other one fixes make firmware_install with certain configurations,
  where a file in the toplevel firmware tree gets installed first, and
  $(INSTALL_FW_PATH)/$$(dir <file>) results in /lib/firmware/./, which
  confuses make 3.82 for some reason."

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  firmware: fix directory creation rule matching with make 3.82
  link-vmlinux.sh: Fix stray "echo" in error message

Remove user-triggerable BUG from mpol_to_str

Trivially triggerable, found by trinity:

  kernel BUG at mm/mempolicy.c:2546!
  Process trinity-child2 (pid: 23988, threadinfo ffff88010197e000, task ffff88007821a670)
  Call Trace:
    show_numa_map+0xd5/0x450
    show_pid_numa_map+0x13/0x20
    traverse+0xf2/0x230
    seq_read+0x34b/0x3e0
    vfs_read+0xac/0x180
    sys_pread64+0xa2/0xc0
    system_call_fastpath+0x1a/0x1f
  RIP: mpol_to_str+0x156/0x360

Cc: stable@vger.kernel.org
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

perf test: Add round trip test for sw and hw event names

It basically traverses the hardware and software event name arrays
creating an evlist with all events, then it uses perf_evsel__name to
check that the name is the expected one.

With it I noticed this problem:

[root@sandy ~]# perf test 10
10: roundtrip evsel->name check:invalid or unsupported event: 'CPU-migrations'
Run 'perf list' for a list of valid events
FAILED!

Changed it to "cpu-migrations" in the software event arrays and it
worked.

This is to catch problems like the one reported by Joel Uckelman in
http://permalink.gmane.org/gmane.linux.kernel.perf.user/1016

Hardware cache events will be checked in the following patch.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-5jskfkuqvf2fi257zmni0ftz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf header: Prepare tracepoint events regardless of name

Current perf_evlist__set_tracepoint_names is a misnomer because it finds
and sets correspoding event_format in addition to the name. So skipping
it when a event has set name already caused a trouble.

Rename it and set name only a event doesn't have one.

Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1346897446-16569-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf header: Fix a typo on evsel

For checking return value of the strdup, 'event' should be 'evsel'.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1346897446-16569-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Clean target should do clean for lib/traceevent too

It's built as part of perf, so it should be cleaned too.

Tested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1346892816-61779-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge branch 'linus'

Merge tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC fixes from Chris Ball:
- a firmware bug on several Samsung MoviNAND eMMC models causes
   permanent corruption on the device when secure erase and secure trim
   requests are made, so we disable those requests on these eMMC devices.
- atmel-mci: fix a hang with some SD cards by waiting for not-busy flag.
- dw_mmc: low-power mode breaks SDIO interrupts; fix PIO error handling;
   fix handling of error interrupts.
- mxs-mmc: fix deadlocks; fix compile error due to dma.h arch change.
- omap: fix broken PIO mode causing memory corruption.
- sdhci-esdhc: fix card detection.

* tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: omap: fix broken PIO mode
  mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
  mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
  mmc: dw_mmc: fix error handling in PIO mode
  mmc: dw_mmc: correct mishandling error interrupt
  mmc: dw_mmc: amend using error interrupt status
  mmc: atmel-mci: not busy flag has also to be used for read operations
  mmc: sdhci-esdhc: break out early if clock is 0
  mmc: mxs-mmc: fix deadlock caused by recursion loop
  mmc: mxs-mmc: fix deadlock in SDIO IRQ case
  mmc: bfin_sdh: fix dma_desc_array build error

uml: fix compile error in deliver_alarm()

Fix the following compile error on UML.

  arch/um/os-Linux/time.c: In function 'deliver_alarm':
  arch/um/os-Linux/time.c:117:3: error: too few arguments to function 'alarm_handler'
  arch/um/os-Linux/internal.h:1:6: note: declared here

The error was introduced by commit d3c1cfcd ("um: pass siginfo to guest
process") in 3.6-rc1.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Martin Pärtel <martin.partel@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dj: memory scribble in logi_dj

Allocate a structure not a pointer to it !

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull powerpc fixes from Benjamin Herrenschmidt:
"Here are a few fixes for 3.6 that were piling up while I was away or
  busy (I was mostly MIA a week or two before San Diego).

  Some fixes from Anton fixing up issues with our relatively new DSCR
  control feature, and a few other fixes that are either regressions or
  bugs nasty enough to warrant not waiting."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Don't use __put_user() in patch_instruction
  powerpc: Make sure IPI handlers see data written by IPI senders
  powerpc: Restore correct DSCR in context switch
  powerpc: Fix DSCR inheritance in copy_thread()
  powerpc: Keep thread.dscr and thread.dscr_inherit in sync
  powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
  powerpc/powernv: Always go into nap mode when CPU is offline
  powerpc: Give hypervisor decrementer interrupts their own handler
  powerpc/vphn: Fix arch_update_cpu_topology() return value

Merge tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
"These are some GPIO regression fixes for v3.6:
   - Erroneous debug message from of_get_named_gpio_flags()
   - Make sure the MC9S08DZ60 GPIO driver depend on I2C being compiled
     in (not module) or allmodconfig breaks.
   - Check return value from irq_alloc_descs() in the Emma Mobile GPIO
     driver.
   - Assign the owner field for the rdc321x driver so the module won't
     be removed if it has active GPIOs."

* tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: rdc321x: Prevent removal of modules exporting active GPIOs
  gpio: em: Fix checking return value of irq_alloc_descs
  gpio: mc9s08dz60: Fix build error if I2C=m
  gpio: Fix debug message in of_get_named_gpio_flags()

Merge tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"There are nothing scaring, contains only small fixes for HD-audio and
  USB-audio:
   - EPSS regression fix and GPIO fix for HD-audio IDT codecs
   - A series of USB-audio regression fixes that are found since 3.5
     kernel"

* tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: snd-usb: fix cross-interface streaming devices
  ALSA: snd-usb: fix calls to next_packet_size
  ALSA: snd-usb: restore delay information
  ALSA: snd-usb: use list_for_each_safe for endpoint resources
  ALSA: snd-usb: Fix URB cancellation at stream start
  ALSA: hda - Don't trust codec EPSS bit for IDT 92HD83xx & co
  ALSA: hda - Avoid unnecessary parameter read for EPSS
  ALSA: hda - Do not set GPIOs for speakers on IDT if there are no speakers

Merge tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6

Pull fbdev fixes from Florian Tobias Schandinat:
- a fix by Paul Cercueil to prevent a possible buffer overflow
- a fix by Bruno Prémont to prevent a rare sleep in invalid context
- a fix by Julia Lawall for a double free in auo_k190x
- a fix by Dan Carpenter to prevent a division by zero in mb862xxfb
- a regression fix by Tomi Valkeinen for the SDI output in OMAP
- a fix by Grazvydas Ignotas to fix the console colors in OMAP

* tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6:
  OMAPFB: fix framebuffer console colors
  OMAPDSS: Fix SDI PLL locking
  video: mb862xxfb: prevent divide by zero bug
  drivers/video/auo_k190x.c: drop kfree of devm_kzalloc's data
  fbcon: Fix bit_putcs() call to kmalloc(s, GFP_KERNEL)
  fbcon: prevent possible buffer overflow.

Merge tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi

Pull ubi fix from Artem Bityutskiy:
"A single small fix for memory deallocation: we allocated memory using
  'kmem_cache_alloc()' but were freeing it using 'kfree()' in some
  cases.  Now we fix this by using 'kmem_cache_free()' instead."

* tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi:
  UBI: fix a horrible memory deallocation bug

Fix order of arguments to compat_put_time[spec|val]

Commit 644595f89620 ("compat: Handle COMPAT_USE_64BIT_TIME in
net/socket.c") introduced a bug where the helper functions to take
either a 64-bit or compat time[spec|val] got the arguments in the wrong
order, passing the kernel stack pointer off as a user pointer (and vice
versa).

Because of the user address range check, that in turn then causes an
EFAULT due to the user pointer range checking failing for the kernel
address. Incorrectly resuling in a failed system call for 32-bit
processes with a 64-bit kernel.

On odder architectures like HP-PA (with separate user/kernel address
spaces), it can be used read kernel memory.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

perf tools: Allow user to indicate path to objdump in command line

When analyzing perf data from hosts of other architecture than one of
the local host it's useful to call objdump that is part of a toolchain
for that architecture. Instead of calling regular objdump, call one that
user specified in command line.

Signed-off-by: Maciek Borzecki <maciek.borzecki@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1346754750.16299.3.camel@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Replace mempcpy with memcpy

mempcpy is not supported by bionic in Android and will lead to
compilation errors.

Replacing mempcpy with memcpy so it will work in Android.

Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/CANg8OW+Y3ZMG-GdhYu2_yKOYH_XEMgw73PdCX_23UTnfYhmttA@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf header: Swap pmu mapping numbers if needed

Like others, the numbers can be saved in a different endian format than
a host machine. Swap them if needed.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Link: http://lkml.kernel.org/r/1346821373-31621-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf header: Set tracepoint event name only if not set

The event name can be set already by processing a event_desc data.

So check it before setting to prevent possible leak.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1346821373-31621-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf header: Use evlist->nr_entries on write_event_desc()

Number of events (evsels) in a evlist is kept on nr_entries field
so that we don't need to recalculate it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1346821373-31621-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: remove unneeded include of network header files

perf does not have networking related functionality, and the inclusion
of these headers is one of the causes of compile failures for Android:

https://lkml.org/lkml/2012/8/23/316
https://lkml.org/lkml/2012/8/28/293

So, remove them.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1346255732-93246-1-git-send-email-dsahern@gmail.com
[ committer note: fix trace-event-perl.c compile failure by reordering includes ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove the node from rblist in strlist__remove

The following commit:

author David Ahern <dsahern@gmail.com>
Tue, 31 Jul 2012 04:31:33 +0000 (22:31 -0600)
committer Arnaldo Carvalho de Melo <acme@redhat.com>
Fri, 3 Aug 2012 13:39:51 +0000 (10:39 -0300)
commit ee8dd3ca43f151d9fbe1edeef68fb8a77eb9f047

causes a double free during a probe deletion as the node is never
removed from the list via strlist__remove(), even though it gets
'deleted' (read free()'d). This causes a double free when we do
strlist__delete() as the node is already deleted but present in the
rblist.

[suzukikp@suzukikp perf]$ sudo ./perf probe -a do_fork
Added new event:
probe:do_fork (on do_fork)

You can now use it in all perf tools, such as:

perf record -e probe:do_fork -aR sleep 1

[suzukikp@suzukikp perf]$ sudo ./perf probe -d do_fork
Removed event: probe:do_fork
*** glibc detected *** ./perf: double free or corruption (fasttop): 0x000000000133d600 ***
======= Backtrace: =========
/lib64/libc.so.6[0x38eec7dda6]
./perf(rblist__delete+0x5c)[0x47d3dc]
./perf(del_perf_probe_events+0xb6)[0x47b826]
./perf(cmd_probe+0x471)[0x42c8d1]
./perf[0x4150b3]
./perf(main+0x501)[0x4148e1]
/lib64/libc.so.6(__libc_start_main+0xed)[0x38eec2169d]
./perf[0x414a61]

Make sure we remove the node from the rblist before we delete the node.
The rblist__remove_node() will invoke rblist->node_delete, which will
take care of deleting the node with the suitable function provided by
the user.

Reported-by: Ananth N. Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20120829055840.7802.1459.stgit@suzukikp.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Fix intlist node removal

Similar to the one in :
https://lkml.org/lkml/2012/8/29/27

Make sure we remove the node from the rblist before we delete the node.
The rblist__remove_node() will invoke rblist->node_delete, which will
take care of deleting the node with the suitable function provided by
the user.

Signed-off-by: Suzuki K Poulose <suzuki@in.ibm.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Suzuki K Poulose <suzuki@in.ibm.com>
Link: http://lkml.kernel.org/r/20120831065840.5167.90318.stgit@suzukikp.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Fix x86 builds with ARCH specified on the command line

e.g., compiling i386 on x86_64 using:
$ make -C tools/perf ARCH=i386

fails with:

    CC /tmp/pbuild/util/evsel.o
In file included from util/evsel.c:21:0:
util/perf_regs.h:5:23: fatal error: perf_regs.h: No such file or directory
compilation terminated.

Adding V=1 you see that the include argument for the arch is
'-Iarch/i386/include' is wrong. It is supposed to be -Iarch/x86/include
per the redefinition of ARCH in the Makefile.

According to the make manual,
http://www.gnu.org/software/make/manual/make.html#Override-Directive:
  "If a variable has been set with a command argument (see Overriding
   Variables), then ordinary assignments in the makefile are ignored. If
   you want to set the variable in the makefile even though it was set
   with a command argument, you can use an override directive ..."

Make it so.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1346094354-74356-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf record: Remove use of die/exit

Allows perf to clean up properly on exit. If perf-record is exiting due
to failure, the on_exit should not run as the session has been deleted.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1346005487-62961-8-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf script: Remove use of die/exit

Allows perf to clean up properly on exit. Only exits left are exec
failures which are appropriate and usage callbacks that list available
options.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1346005487-62961-7-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf help: Remove use of die and handle errors

Allows perf to clean up properly on exit.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1346005487-62961-6-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf stat: Remove use of die/exit and handle errors

Allows perf to clean up properly on program termination.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1346005487-62961-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf lock: Remove use of die and handle errors

Allows perf to clean up properly on exit.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1346005487-62961-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tool: handle errors in synthesized event functions

Handle error from process callback and propagate back to caller.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1346005487-62961-3-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: flush_sample_queue needs to handle errors from handlers

Allows errors to propogate through event processing code and back to
commands.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1346005487-62961-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools lib traceevent: Modify header to work in C++ programs

Replace keyword "private" to "priv" in event-parse.h to allow it to be
used in C++ programs.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1345735321.5069.62.camel@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge branch 'x86/urgent'

x86/kconfig: Remove outdated reference to Intel CPUs in CONFIG_SWIOTLB

Deleted the no longer valid example of which x86 CPUs lack a
hardware IOMMU, and moved the "If unsure..." statement to a new
line to follow the style of surrounding options.

Signed-off-by: Joe Millenbach <jmillenbach@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: team-fjord@googlegroups.com
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1346632700-29113-1-git-send-email-jmillenbach@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge branch 'x86/debug'

x86/iommu: Use NULL instead of plain 0 for __IOMMU_INIT

IOMMU_INIT_POST and IOMMU_INIT_POST_FINISH pass the plain value
0 instead of NULL to __IOMMU_INIT. Fix this and make sparse
happy by doing so.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Link: http://lkml.kernel.org/r/1346621506-30857-8-git-send-email-minipli@googlemail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/iommu: Drop duplicate const in __IOMMU_INIT

It's redundant and makes sparse complain about it.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Link: http://lkml.kernel.org/r/1346621506-30857-7-git-send-email-minipli@googlemail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/fpu/xsave: Keep __user annotation in casts

Don't remove the __user annotation of the fpstate pointer, but
drop the superfluous void * cast instead.

This fixes the following sparse warnings:

  xsave.c:135:15: warning: cast removes address space of expression
  xsave.c:135:15: warning: incorrect type in argument 1 (different address spaces)
  xsave.c:135:15:    expected void const volatile [noderef] <asn:1>*<noident>
  [...]

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1346621506-30857-6-git-send-email-minipli@googlemail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/pci/probe_roms: Add missing __iomem annotation to pci_map_biosrom()

Stay in sync with the declaration and fix the corresponding
sparse warnings.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Link: http://lkml.kernel.org/r/1346621506-30857-5-git-send-email-minipli@googlemail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/signals: ia32_signal.c: add __user casts to fix sparse warnings

Fix the following sparse warnings by adding appropriate __user
casts and annotations:

  ia32_signal.c:165:38: warning: incorrect type in argument 1 (different address spaces)
   ia32_signal.c:165:38:    expected struct sigaltstack const [noderef] [usertype] <asn:1>*<noident>
  ia32_signal.c:165:38:    got struct sigaltstack *
  [...]

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/1346621506-30857-4-git-send-email-minipli@googlemail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/vdso: Add __user annotation to VDSO32_SYMBOL

The address calculated by VDSO32_SYMBOL() is a pointer into
userland. Add the __user annotation to fix related sparse
warnings in its users.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Andy Lutomirski <luto@MIT.EDU>
Link: http://lkml.kernel.org/r/1346621506-30857-3-git-send-email-minipli@googlemail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86: Fix __user annotations in asm/sys_ia32.h

Fix the following sparse warnings:

  sys_ia32.c:293:38: warning: incorrect type in argument 2 (different address spaces)
  sys_ia32.c:293:38:    expected unsigned int [noderef] [usertype] <asn:1>*stat_addr
  sys_ia32.c:293:38:    got unsigned int *stat_addr

Ironically, sys_ia32.h was introduced to fix sparse warnings but
missed that one.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Link: http://lkml.kernel.org/r/1346621506-30857-2-git-send-email-minipli@googlemail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge branch 'x86/build'

Merge branch 'core/urgent'

x86/Kconfig: Turn off DEBUG_NX_TEST module in defconfigs

The x86 defconfigs include exactly one module: test_nx.ko, a
special-purpose module which just exists to do evil things like
executing code off the stack to see if the kernel has enabled NX
support. Anyone who actually uses that module can easily enable
it themselves, but the vast majority of kernel builds don't need
it; disable it by default.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: http://lkml.kernel.org/r/e72faf875e1172fb1cbec5e6d3cd4122df508a97.1346649518.git.josh@joshtriplett.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/Kconfig: Turn off CONFIG_BLK_DEV_RAM

The vast majority of systems either use initramfs or mount a
root filesystem directly from the kernel.  Distros have
defaulted to initramfs for years.  Only highly specialized
systems would use an actual filesystem-image initrd at this
point, and such systems don't rely on defconfig anyway.  Drop
initrd support (and specifically RAM block device support) from
the defconfigs.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2521e983a63595cd7a331236d929577660f89c72.1346649518.git.josh@joshtriplett.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/Kconfig: Disable CONFIG_CRC_T10DIF in defconfigs

CONFIG_CRC_T10DIF explicitly states that it exists only for use
by out-of-tree modules; anything in-kernel that needs it selects
it. Thus, compile it out by default.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3aaff7a0af1320427952d411a21b8ded29747a1f.1346649518.git.josh@joshtriplett.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/Kconfig: Switch to ext4 in defconfigs

The current x86 and x86-64 defconfigs do not enable ext4, which
most current distributions default to. Switch the defconfigs to
ext4, so they will boot on current systems without additional
configuration.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bd8a359506b7e1287c680823de16d67608ec52fe.1346649518.git.josh@joshtriplett.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/Kconfig: Update defconfigs to current results of "make savedefconfig"

The x86 defconfigs have become somewhat out of date compared to
the current result of "make savedefconfig". Update them to the
current output, as a prelude to further defconfig changes, to
avoid unrelated noise in those further changes.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/80c8a5fbeaf6cdb72fb78a016013427efee52668.1346649518.git.josh@joshtriplett.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

mm/memblock: Use NULL instead of 0 for pointers

This type cleanup also fixes the following sparse warning:

mm/memblock.c:249:49: warning: Using plain integer as NULL pointer

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: patches@linaro.org
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge branch 'perf/core'

Merge branch 'perf/urgent'

Merge branch 'core' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/core

Pull oprofile fixes from Robert Richter.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent

Pull s390 oprofile fix from Robert Richter.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

powerpc: Don't use __put_user() in patch_instruction

patch_instruction() can be called very early on ppc32, when the kernel
isn't yet running at it's linked address. That can cause the !
is_kernel_addr() test in __put_user() to trip and call might_sleep()
which is very bad at that point during boot.

Use a lower level function instead for now, at least until we get to
rework ppc32 boot process to do the code patching later, like ppc64
does.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Make sure IPI handlers see data written by IPI senders

We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state.  This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux().  Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.

In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_message_pass() is a full barrier to ensure that
there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi().  In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.

This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store.  This adds an explicit barrier in the two remaining cases.

These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.

The analysis of the reason why processes were not waking up properly
is due to Milton Miller.

Cc: stable@vger.kernel.org # v3.0+
Reported-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Restore correct DSCR in context switch

During a context switch we always restore the per thread DSCR value.
If we aren't doing explicit DSCR management
(ie thread.dscr_inherit == 0) and the default DSCR changed while
the process has been sleeping we end up with the wrong value.

Check thread.dscr_inherit and select the default DSCR or per thread
DSCR as required.

This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):

http://ozlabs.org/~anton/junkcode/dscr_default_test.c

With the four patches applied I can run a combination of all
test cases successfully at the same time:

http://ozlabs.org/~anton/junkcode/dscr_default_test.c
http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Fix DSCR inheritance in copy_thread()

If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.

We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.

This was found with the following test case:

http://ozlabs.org/~anton/junkcode/dscr_default_test.c

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Keep thread.dscr and thread.dscr_inherit in sync

When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.

If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.

This issue was found with the following testcase:

http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Update DSCR on all CPUs when writing sysfs dscr_default

Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.

This issue was found with the following test case:

http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/powernv: Always go into nap mode when CPU is offline

The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode. Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Give hypervisor decrementer interrupts their own handler

At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event.  In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.

When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1.  Thus this adds an empty handler
function for them.  We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/vphn: Fix arch_update_cpu_topology() return value

arch_update_cpu_topology() should only return 1 when the topology has
actually changed, and should return 0 otherwise.

This patch fixes a potential bug where rebuild_sched_domains() would
reinitialize the sched domains even when the topology hasn't changed.

Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

ARM: gemini: fix the gemini build

Test-compiling obscure machines I notice that the gemini (which
by the way lacks a defconfig) is broken since some time back.
Adding a simple missing include makes it build again.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes

Two regression fixes and one boot-loader compatibility fix from Simon Horman.

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  ARM: shmobile: armadillo800eva: enable rw rootfs mount
  ARM: shmobile: mackerel: fixup usb module order
  ARM: shmobile: armadillo800eva: fixup: sound card detection order

mmc: omap: fix broken PIO mode

After commit 26b88520b80695a6fa5fd95b5d97c03f4daf87e0 ("mmc:
omap_hsmmc: remove private DMA API implementation"), the Nokia N800
here stopped booting:

[    2.086181] Waiting for root device /dev/mmcblk0p1...
[    2.324066] Unhandled fault: imprecise external abort (0x406) at 0x00000000
[    2.331451] Internal error: : 406 [#1] ARM
[    2.335784] Modules linked in:
[    2.339050] CPU: 0    Not tainted  (3.6.0-rc3 #60)
[    2.344146] PC is at default_idle+0x28/0x30
[    2.348602] LR is at trace_hardirqs_on_caller+0x15c/0x1b0

...

This turned out to be due to memory corruption caused by long-broken
PIO code in drivers/mmc/host/omap.c.  (Previously, this driver had
been using DMA; but the above commit caused the MMC driver to fall
back to PIO mode with an unmodified Kconfig.)

The PIO code, added with the rest of the driver in commit
730c9b7e6630f786fcec026fb11d2e6f2c90fdcb ("[MMC] Add OMAP MMC host
driver"), confused bytes with 16-bit words.  This bug caused memory
located after the PIO transfer buffer to be corrupted with transfers
larger than 32 bytes.  The driver also did not increment the buffer
pointer after the transfer occurred.  This bug resulted in data
corruption during any transfer larger than 64 bytes.

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.

For several MoviNAND eMMC parts, there are known issues with secure
erase and secure trim. For these specific MoviNAND devices, we skip
these operations.

Specifically, there is a bug in the eMMC firmware that causes
unrecoverable corruption when the MMC is erased with MMC_CAP_ERASE
enabled.

References:

http://forum.xda-developers.com/showthread.php?t=1644364
https://plus.google.com/111398485184813224730/posts/21pTYfTsCkB#111398485184813224730/posts/21pTYfTsCkB

Signed-off-by: Ian Chen <ian.cy.chen@samsung.com>
Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable <stable@vger.kernel.org> [3.0+]
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: Disable low power mode if SDIO interrupts are used

The documentation for the dw_mmc part says that the low power
mode should normally only be set for MMC and SD memory and should
be turned off for SDIO cards that need interrupts detected.

The best place I could find to do this is when the SDIO interrupt
was first enabled. I rely on the fact that dw_mci_setup_bus()
will be called when it's time to reenable.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: fix error handling in PIO mode

Data transfer will be continued until all the bytes are transmitted,
even if data crc error occurs during a multiple-block data transfer.
This means RXDR/TXDR interrupts will occurs until data transfer is
terminated. Early setting of host->sg to NULL prevents going into
xxx_data_pio functions, hence permanent unhandled RXDR/TXDR interrupts
occurs. And checking error interrupt status in the xxx_data_pio functions
is no need because dw_mci_interrupt does do the same. This patch also
removes it.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: correct mishandling error interrupt

Datasheet of SYNOPSYS mentions that DTO(Data Transfer Over) interrupt
will be raised even if some error interrupts, however it is actually
found that DTO does not occur. SYNOPSYS has confirmed this issue.
Current implementation defers the call of tasklet_schedule until DTO
when the error interrupts is happened. This patch fixes error handling.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: amend using error interrupt status

RINTSTS status includes masked interrupts as well as unmasked.
data_status and cmd_status are set by value of RINTSTS in interrupt handler
and tasklet finally uses it to decide whether error is happened or not.
In addition, MINTSTS status is used for setting data_status in PIO.
Masked error interrupt will not be handled and that status can be considered
non-error case.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Reviewed By: Girish K S <girish.shivananjappa@linaro.org>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: atmel-mci: not busy flag has also to be used for read operations

Even if the datasheet says that the not busy flag has to be used only
for write operations, it's false except for version lesser than v2xx.

Not waiting on the not busy flag for read operations can cause the
controller to hang-up during the initialization of some SD cards
with DMA after the first CMD6 -- the next command is sent too early.

Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: stable <stable@vger.kernel.org> [3.5, 3.6]
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: sdhci-esdhc: break out early if clock is 0

Since commit 30832ab56 ("mmc: sdhci: Always pass clock request value
zero to set_clock host op") was merged, esdhc_set_clock starts hitting
"if (clock == 0)" where ESDHC_SYSTEM_CONTROL has been operated. This
causes SDHCI card-detection function being broken. Fix the regression
by moving "if (clock == 0)" above ESDHC_SYSTEM_CONTROL operation.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: mxs-mmc: fix deadlock caused by recursion loop

Release the lock before mmc_signal_sdio_irq is called by
mxs_mmc_enable_sdio_irq.

Backtrace:
[   65.470000] =============================================
[   65.470000] [ INFO: possible recursive locking detected ]
[   65.470000] 3.5.0-rc5 #2 Not tainted
[   65.470000] ---------------------------------------------
[   65.470000] ksdioirqd/mmc0/73 is trying to acquire lock:
[   65.470000]  (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[   65.470000]
[   65.470000] but task is already holding lock:
[   65.470000]  (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[   65.470000]
[   65.470000] other info that might help us debug this:
[   65.470000]  Possible unsafe locking scenario:
[   65.470000]
[   65.470000]        CPU0
[   65.470000]        ----
[   65.470000]   lock(&(&host->lock)->rlock#2);
[   65.470000]   lock(&(&host->lock)->rlock#2);
[   65.470000]
[   65.470000]  *** DEADLOCK ***
[   65.470000]
[   65.470000]  May be due to missing lock nesting notation
[   65.470000]
[   65.470000] 1 lock held by ksdioirqd/mmc0/73:
[   65.470000]  #0:  (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[   65.470000]
[   65.470000] stack backtrace:
[   65.470000] [<c0014990>] (unwind_backtrace+0x0/0xf4) from [<c005ccb8>] (__lock_acquire+0x14f8/0x1b98)
[   65.470000] [<c005ccb8>] (__lock_acquire+0x14f8/0x1b98) from [<c005d3f8>] (lock_acquire+0xa0/0x108)
[   65.470000] [<c005d3f8>] (lock_acquire+0xa0/0x108) from [<c02f671c>] (_raw_spin_lock_irqsave+0x48/0x5c)
[   65.470000] [<c02f671c>] (_raw_spin_lock_irqsave+0x48/0x5c) from [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc])
[   65.470000] [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]) from [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc])
[   65.470000] [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc]) from [<c0219b38>] (sdio_irq_thread+0x1bc/0x274)
[   65.470000] [<c0219b38>] (sdio_irq_thread+0x1bc/0x274) from [<c003c324>] (kthread+0x8c/0x98)
[   65.470000] [<c003c324>] (kthread+0x8c/0x98) from [<c00101ac>] (kernel_thread_exit+0x0/0x8)
[   65.470000] BUG: spinlock lockup suspected on CPU#0, ksdioirqd/mmc0/73
[   65.470000]  lock: 0xc3358724, .magic: dead4ead, .owner: ksdioirqd/mmc0/73, .owner_cpu: 0
[   65.470000] [<c0014990>] (unwind_backtrace+0x0/0xf4) from [<c01b46b0>] (do_raw_spin_lock+0x100/0x144)
[   65.470000] [<c01b46b0>] (do_raw_spin_lock+0x100/0x144) from [<c02f6724>] (_raw_spin_lock_irqsave+0x50/0x5c)
[   65.470000] [<c02f6724>] (_raw_spin_lock_irqsave+0x50/0x5c) from [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc])
[   65.470000] [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]) from [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc])
[   65.470000] [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc]) from [<c0219b38>] (sdio_irq_thread+0x1bc/0x274)
[   65.470000] [<c0219b38>] (sdio_irq_thread+0x1bc/0x274) from [<c003c324>] (kthread+0x8c/0x98)
[   65.470000] [<c003c324>] (kthread+0x8c/0x98) from [<c00101ac>] (kernel_thread_exit+0x0/0x8)

Reported-by: Attila Kinali <attila@kinali.ch>
Signed-off-by: Lauri Hintsala <lauri.hintsala@bluegiga.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: mxs-mmc: fix deadlock in SDIO IRQ case

Release the lock before mmc_signal_sdio_irq is called by mxs_mmc_irq_handler.

Backtrace:
[   79.660000] =============================================
[   79.660000] [ INFO: possible recursive locking detected ]
[   79.660000] 3.4.0-00009-g3e96082-dirty #11 Not tainted
[   79.660000] ---------------------------------------------
[   79.660000] swapper/0 is trying to acquire lock:
[   79.660000]  (&(&host->lock)->rlock#2){-.....}, at: [<c026ea3c>] mxs_mmc_enable_sdio_irq+0x18/0xd4
[   79.660000]
[   79.660000] but task is already holding lock:
[   79.660000]  (&(&host->lock)->rlock#2){-.....}, at: [<c026f744>] mxs_mmc_irq_handler+0x1c/0xe8
[   79.660000]
[   79.660000] other info that might help us debug this:
[   79.660000]  Possible unsafe locking scenario:
[   79.660000]
[   79.660000]        CPU0
[   79.660000]        ----
[   79.660000]   lock(&(&host->lock)->rlock#2);
[   79.660000]   lock(&(&host->lock)->rlock#2);
[   79.660000]
[   79.660000]  *** DEADLOCK ***
[   79.660000]
[   79.660000]  May be due to missing lock nesting notation
[   79.660000]
[   79.660000] 1 lock held by swapper/0:
[   79.660000]  #0:  (&(&host->lock)->rlock#2){-.....}, at: [<c026f744>] mxs_mmc_irq_handler+0x1c/0xe8
[   79.660000]
[   79.660000] stack backtrace:
[   79.660000] [<c0014bd0>] (unwind_backtrace+0x0/0xf4) from [<c005f9c0>] (__lock_acquire+0x1948/0x1d48)
[   79.660000] [<c005f9c0>] (__lock_acquire+0x1948/0x1d48) from [<c005fea0>] (lock_acquire+0xe0/0xf8)
[   79.660000] [<c005fea0>] (lock_acquire+0xe0/0xf8) from [<c03a8460>] (_raw_spin_lock_irqsave+0x44/0x58)
[   79.660000] [<c03a8460>] (_raw_spin_lock_irqsave+0x44/0x58) from [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4)
[   79.660000] [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4) from [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8)
[   79.660000] [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8) from [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254)
[   79.660000] [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254) from [<c006bff8>] (handle_irq_event+0x3c/0x5c)
[   79.660000] [<c006bff8>] (handle_irq_event+0x3c/0x5c) from [<c006e6d0>] (handle_level_irq+0x90/0x110)
[   79.660000] [<c006e6d0>] (handle_level_irq+0x90/0x110) from [<c006b930>] (generic_handle_irq+0x38/0x50)
[   79.660000] [<c006b930>] (generic_handle_irq+0x38/0x50) from [<c00102fc>] (handle_IRQ+0x30/0x84)
[   79.660000] [<c00102fc>] (handle_IRQ+0x30/0x84) from [<c000f058>] (__irq_svc+0x38/0x60)
[   79.660000] [<c000f058>] (__irq_svc+0x38/0x60) from [<c0010520>] (default_idle+0x2c/0x40)
[   79.660000] [<c0010520>] (default_idle+0x2c/0x40) from [<c0010a90>] (cpu_idle+0x64/0xcc)
[   79.660000] [<c0010a90>] (cpu_idle+0x64/0xcc) from [<c04ff858>] (start_kernel+0x244/0x2c8)
[   79.660000] BUG: spinlock lockup on CPU#0, swapper/0
[   79.660000]  lock: c398cb2c, .magic: dead4ead, .owner: swapper/0, .owner_cpu: 0
[   79.660000] [<c0014bd0>] (unwind_backtrace+0x0/0xf4) from [<c01ddb1c>] (do_raw_spin_lock+0xf0/0x144)
[   79.660000] [<c01ddb1c>] (do_raw_spin_lock+0xf0/0x144) from [<c03a8468>] (_raw_spin_lock_irqsave+0x4c/0x58)
[   79.660000] [<c03a8468>] (_raw_spin_lock_irqsave+0x4c/0x58) from [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4)
[   79.660000] [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4) from [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8)
[   79.660000] [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8) from [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254)
[   79.660000] [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254) from [<c006bff8>] (handle_irq_event+0x3c/0x5c)
[   79.660000] [<c006bff8>] (handle_irq_event+0x3c/0x5c) from [<c006e6d0>] (handle_level_irq+0x90/0x110)
[   79.660000] [<c006e6d0>] (handle_level_irq+0x90/0x110) from [<c006b930>] (generic_handle_irq+0x38/0x50)
[   79.660000] [<c006b930>] (generic_handle_irq+0x38/0x50) from [<c00102fc>] (handle_IRQ+0x30/0x84)
[   79.660000] [<c00102fc>] (handle_IRQ+0x30/0x84) from [<c000f058>] (__irq_svc+0x38/0x60)
[   79.660000] [<c000f058>] (__irq_svc+0x38/0x60) from [<c0010520>] (default_idle+0x2c/0x40)
[   79.660000] [<c0010520>] (default_idle+0x2c/0x40) from [<c0010a90>] (cpu_idle+0x64/0xcc)
[   79.660000] [<c0010a90>] (cpu_idle+0x64/0xcc) from [<c04ff858>] (start_kernel+0x244/0x2c8)

Signed-off-by: Lauri Hintsala <lauri.hintsala@bluegiga.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: bfin_sdh: fix dma_desc_array build error

Descriptor array structure has been moved into blackfin dma.h head file.
This patch fix below error:

drivers/mmc/host/bfin_sdh.c:52:8: error: redefinition of 'struct
dma_desc_array'
make[4]: *** [drivers/mmc/host/bfin_sdh.o] Error 1

Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bob Liu <lliubbo@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

Merge branch 'sched/core'

Merge branch 'perf/urgent'

perf/hwpb: Invoke __perf_event_disable() if interrupts are already disabled

While debugging a warning message on PowerPC while using hardware
breakpoints, it was discovered that when perf_event_disable is invoked
through hw_breakpoint_handler function with interrupts disabled, a
subsequent IPI in the code path would trigger a WARN_ON_ONCE message in
smp_call_function_single function.

This patch calls __perf_event_disable() when interrupts are already
disabled, instead of perf_event_disable().

Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Signed-off-by: K.Prasad <Prasad.Krishnan@gmail.com>
[naveen.n.rao@linux.vnet.ibm.com: v3: Check to make sure we target current task]
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120802081635.5811.17737.stgit@localhost.localdomain
[ Fixed build error on MIPS. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf/x86: Enable Intel Cedarview Atom suppport

This patch enables perf_events support for Intel Cedarview
Atom (model 54) processors. Support includes PEBS and LBR.
Tested on my Atom N2600 netbook.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120820092421.GA11284@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf_event: Switch to internal refcount, fix race with close()

Don't mess with file refcounts (or keep a reference to file, for
that matter) in perf_event. Use explicit refcount of its own
instead. Deal with the race between the final reference to event
going away and new children getting created for it by use of
atomic_long_inc_not_zero() in inherit_event(); just have the
latter free what it had allocated and return NULL, that works
out just fine (children of siblings of something doomed are
created as singletons, same as if the child of leader had been
created and immediately killed).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120820135925.GG23464@ZenIV.linux.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched: Remove useless code in yield_to()

It's impossible to enter the else branch if we have set
skip_clock_update in task_yield_fair(), as yield_to_task_fair()
will directly return true after invoke task_yield_fair().

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4FF2925A.9060005@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched: Add time unit suffix to sched sysctl knobs

Unlike others, sched_migration_cost, sched_time_avg and
sched_shares_window doesn't have time unit as suffix. Add them.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1345083330-19486-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/debug: Limit sd->*_idx range on sysctl

Various sd->*_idx's are used for refering the rq's load average table
when selecting a cpu to run. However they can be set to any number
with sysctl knobs so that it can crash the kernel if something bad is
given. Fix it by limiting them into the actual range.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1345104204-8317-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched: Remove AFFINE_WAKEUPS feature flag

Commit beac4c7e4a1c ("sched: Remove AFFINE_WAKEUPS feature") removed
use of the flag but left the definition. Get rid of it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Link: http://lkml.kernel.org/r/1345090865-20851-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge branch 'sched/urgent' into sched/core

Merge in the current fixes branch, we are going to apply dependent patches.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched: Fix kernel-doc warnings in kernel/sched/fair.c

Fix two kernel-doc warnings in kernel/sched/fair.c:

Warning(kernel/sched/fair.c:3660): Excess function parameter 'cpus' description in 'update_sg_lb_stats'
Warning(kernel/sched/fair.c:3806): Excess function parameter 'cpus' description in 'update_sd_lb_stats'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/50303714.3090204@xenotime.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched: Unthrottle rt runqueues in __disable_runtime()

migrate_tasks() uses _pick_next_task_rt() to get tasks from the
real-time runqueues to be migrated. When rt_rq is throttled
_pick_next_task_rt() won't return anything, in which case
migrate_tasks() can't move all threads over and gets stuck in an
infinite loop.

Instead unthrottle rt runqueues before migrating tasks.

Additionally: move unthrottle_offline_cfs_rqs() to rq_offline_fair()

Signed-off-by: Peter Boonstoppel <pboonstoppel@nvidia.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/5FBF8E85CA34454794F0F7ECBA79798F379D3648B7@HQMAIL04.nvidia.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched: Add missing call to calc_load_exit_idle()

Azat Khuzhin reported high loadavg in Linux v3.6

After checking the upstream scheduler code, I found Peter's commit:

5167e8d5417b sched/nohz: Rewrite and fix load-avg computation -- again

not fully applied, missing the call to calc_load_exit_idle().

After that idle exit in sampling window will always be calculated
to non-idle, and the load will be higher than normal.

This patch adds the missing call to calc_load_exit_idle().

Signed-off-by: Charles Wang <muming.wq@taobao.com>
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1345449754-27130-1-git-send-email-muming.wq@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched: Fix load avg vs cpu-hotplug

Rabik and Paul reported two different issues related to the same few
lines of code.

Rabik's issue is that the nr_uninterruptible migration code is wrong in
that he sees artifacts due to this (Rabik please do expand in more
detail).

Paul's issue is that this code as it stands relies on us using
stop_machine() for unplug, we all would like to remove this assumption
so that eventually we can remove this stop_machine() usage altogether.

The only reason we'd have to migrate nr_uninterruptible is so that we
could use for_each_online_cpu() loops in favour of
for_each_possible_cpu() loops, however since nr_uninterruptible() is the
only such loop and its using possible lets not bother at all.

The problem Rabik sees is (probably) caused by the fact that by
migrating nr_uninterruptible we screw rq->calc_load_active for both rqs
involved.

So don't bother with fancy migration schemes (meaning we now have to
keep using for_each_possible_cpu()) and instead fold any nr_active delta
after we migrate all tasks away to make sure we don't have any skewed
nr_active accounting.

Reported-by: Rakib Mullick <rakib.mullick@gmail.com>
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1345454817.23018.27.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>