git.karo-electronics.de Git - karo-tx-linux.git/log

Merge branch 'x86/apic'

Merge branch 'timers/nohz'

Merge branch 'timers/core'

Merge branch 'sched/core'

Merge branch 'perf/core'

Merge branch 'core/locking'

x86/ioapic/kcrash: Prevent crash_kexec() from deadlocking on ioapic_lock

Prevent crash_kexec() from deadlocking on ioapic_lock. When
crash_kexec() is executed on a CPU, the CPU will take ioapic_lock
in disable_IO_APIC(). So if the cpu gets an NMI while locking
ioapic_lock, a deadlock will happen.

In this patch, ioapic_lock is zapped/initialized before disable_IO_APIC().

You can reproduce this deadlock the following way:

1. Add mdelay(1000) after raw_spin_lock_irqsave() in
   native_ioapic_set_affinity()@arch/x86/kernel/apic/io_apic.c

   Although the deadlock can occur without this modification, it will increase
   the potential of the deadlock problem.

2. Build and install the kernel

3. Set up the OS which will run panic() and kexec when NMI is injected
    # echo "kernel.unknown_nmi_panic=1" >> /etc/sysctl.conf
    # vim /etc/default/grub
      add "nmi_watchdog=0 crashkernel=256M" in GRUB_CMDLINE_LINUX line
    # grub2-mkconfig

4. Reboot the OS

5. Run following command for each vcpu on the guest
    # while true; do echo <CPU num> > /proc/irq/<IO-APIC-edge or IO-APIC-fasteoi>/smp_affinitity; done;
   By running this command, cpus will get ioapic_lock for setting affinity.

6. Inject NMI (push a dump button or execute 'virsh inject-nmi <domain>' if you
   use VM). After injecting NMI, panic() is called in an nmi-handler context.
   Then, kexec will normally run in panic(), but the operation will be stopped
   by deadlock on ioapic_lock in crash_kexec()->machine_crash_shutdown()->
   native_machine_crash_shutdown()->disable_IO_APIC()->clear_IO_APIC()->
   clear_IO_APIC_pin()->ioapic_read_entry().

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/20130820070107.28245.83806.stgit@yunodevel
Signed-off-by: Ingo Molnar <mingo@kernel.org>

proc: more readdir conversion bug-fixes

In the previous commit, Richard Genoud fixed proc_root_readdir(), which
had lost the check for whether all of the non-process /proc entries had
been returned or not.

But that in turn exposed _another_ bug, namely that the original readdir
conversion patch had yet another problem: it had lost the return value
of proc_readdir_de(), so now checking whether it had completed
successfully or not didn't actually work right anyway.

This reinstates the non-zero return for the "end of base entries" that
had also gotten lost in commit f0c3b5093add ("[readdir] convert
procfs"). So now you get all the base entries *and* you get all the
process entries, regardless of getdents buffer size.

(Side note: the Linux "getdents" manual page actually has a nice example
application for testing getdents, which can be easily modified to use
different buffers. Who knew? Man-pages can be useful)

Reported-by: Emmanuel Benisty <benisty.e@gmail.com>
Reported-by: Marc Dionne <marc.c.dionne@gmail.com>
Cc: Richard Genoud <richard.genoud@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: return on proc_readdir error

Commit f0c3b5093add ("[readdir] convert procfs") introduced a bug on the
listing of the proc file-system. The return value of proc_readdir()
isn't tested anymore in the proc_root_readdir function.

This lead to an "interesting" behaviour when we are using the getdents()
system call with a buffer too small: instead of failing, it returns the
first entries of /proc (enough to fill the given buffer), plus the PID
directories.

This is not triggered on glibc (as getdents is called with a 32KB
buffer), but on uclibc, the buffer size is only 1KB, thus some proc
entries are missing.

See https://lkml.org/lkml/2013/8/12/288 for more background.

Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes

Pull gfs2 fixes from Steven Whitehouse:
"Out of these five patches, the one for ensuring that the number of
  revokes is not exceeded, and the one for checking the glock is not
  already held in gfs2_getxattr are the two most important.  The latter
  can be triggered by selinux.

  The other three patches are very small and fix mostly fairly trivial
  issues"

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
  GFS2: Check for glock already held in gfs2_getxattr
  GFS2: alloc_workqueue() doesn't return an ERR_PTR
  GFS2: don't overrun reserved revokes
  GFS2: WQ_NON_REENTRANT is meaningless and going away
  GFS2: Fix typo in gfs2_create_inode()

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"Two AMD microcode loader fixes and an OLPC firmware support fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode, AMD: Fix early microcode loading
  x86, microcode, AMD: Make cpu_has_amd_erratum() use the correct struct cpuinfo_x86
  x86: Don't clear olpc_ofw_header when sentinel is detected

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Ingo Molnar:
"Three small fixlets"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  nohz: fix compile warning in tick_nohz_init()
  nohz: Do not warn about unstable tsc unless user uses nohz_full
  sched_clock: Fix integer overflow

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Bit late with these, was under the weather for a a few days, nothing
  too crazy:

  Some radeon regression fixes, one intel regression fix, and one fix to
  avoid a warn with i915 when used with dma-buf"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: unpin backing storage in dmabuf_unmap
  drm/radeon: fix WREG32_OR macro setting bits in a register
  drm/radeon/r7xx: fix copy paste typo in golden register setup
  drm/i915: Don't deref pipe->cpu_transcoder in the hangcheck code
  drm/radeon: fix UVD message buffer validation

kernel: fix new kernel-doc warning in wait.c

Fix new kernel-doc warnings in kernel/wait.c:

  Warning(kernel/wait.c:374): No description found for parameter 'p'
  Warning(kernel/wait.c:374): Excess function parameter 'word' description in 'wake_up_atomic_t'
  Warning(kernel/wait.c:374): Excess function parameter 'bit' description in 'wake_up_atomic_t'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

GFS2: Check for glock already held in gfs2_getxattr

Since the introduction of atomic_open, gfs2_getxattr can be
called with the glock already held, so we need to allow for
this.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: David Teigland <teigland@redhat.com>
Tested-by: David Teigland <teigland@redhat.com>

GFS2: alloc_workqueue() doesn't return an ERR_PTR

alloc_workqueue() returns a NULL on error, it doesn't return an ERR_PTR.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: don't overrun reserved revokes

When run during fsync, a gfs2_log_flush could happen between the
time when gfs2_ail_flush checked the number of blocks to revoke,
and when it actually started the transaction to do those revokes.
This occassionally caused it to need more revokes than it reserved,
causing gfs2 to crash.

Instead of just reserving enough revokes to handle the blocks that
currently need them, this patch makes gfs2_ail_flush reserve the
maximum number of revokes it can, without increasing the total number
of reserved log blocks. This patch also passes the number of reserved
revokes to __gfs2_ail_flush() so that it doesn't go over its limit
and cause a crash like we're seeing. Non-fsync calls to __gfs2_ail_flush
will still cause a BUG() necessary revokes are skipped.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: WQ_NON_REENTRANT is meaningless and going away

dbf2576e37 ("workqueue: make all workqueues non-reentrant") made
WQ_NON_REENTRANT no-op and the flag is going away. Remove its usages.

This patch doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: cluster-devel@redhat.com

GFS2: Fix typo in gfs2_create_inode()

PTR_RET should be PTR_ERR

Reported-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

generic-ipi/locking: Fix misleading smp_call_function_any() description

Fix locking description: after commit 8969a5ede0f9e17da4b9437
("generic-ipi: remove kmalloc()"), wait = 0 can be guaranteed
because we don't kmalloc() anymore.

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Cc: Sheng Yang <sheng@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Link: http://lkml.kernel.org/r/51F5E6F8.1000801@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'drm-intel-fixes-2013-08-15' of git://people.freedesktop.org/~danvet/drm-intel

* tag 'drm-intel-fixes-2013-08-15' of git://people.freedesktop.org/~danvet/drm-intel: (153 commits)
drm/i915: Don't deref pipe->cpu_transcoder in the hangcheck code

drm/i915: unpin backing storage in dmabuf_unmap

This fixes a WARN in i915_gem_free_object when the
obj->pages_pin_count isn't 0.

v2: Add locking to unmap, noticed by Chris Wilson. Note that even
though we call unmap with our own dev->struct_mutex held that won't
result in an immediate deadlock since we never go through the dma_buf
interfaces for our own, reimported buffers. But it's still easy to
blow up and anger lockdep, but that's already the case with our ->map
implementation. Fixing this for real will involve per dma-buf ww mutex
locking by the callers. And lots of fun. So go with the duct-tape
approach for now.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Tested-by: Armin K. <krejzi@email.com> (v1)
Tested-by: Dave Airlie <airlied@redhat.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>

Merge branch 'drm-fixes-3.11' of git://people.freedesktop.org/~agd5f/linux

Just two small fixes for radeon.  One fixes an array overrun
that can cause garbage to get written to registers on some r7xx boards,
the other is a small UVD fix.
Also one audio regresion

* 'drm-fixes-3.11' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: fix WREG32_OR macro setting bits in a register
  drm/radeon/r7xx: fix copy paste typo in golden register setup
  drm/radeon: fix UVD message buffer validation

Linux 3.11-rc6

Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fix from Tejun Heo:
"This contains one patch to fix the return value of cpuset's cgroups
  interface function, which used to always return -ENODEV for the writes
  on the 'memory_pressure_enabled' file"

* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: fix the return value of cpuset_write_u64()

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull jbd2 bug fixes from Ted Ts'o:
"Two jbd2 bug fixes, one of which is a regression fix"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
jbd2: Fix oops in jbd2_journal_file_inode()
jbd2: Fix use after free after error in jbd2_journal_dirty_metadata()

s390: Fix broken build

Fix this build error:

  In file included from fs/exec.c:61:0:
  arch/s390/include/asm/tlb.h:35:23: error: expected identifier or '(' before 'unsigned'
  arch/s390/include/asm/tlb.h:36:1: warning: no semicolon at end of struct or union [enabled by default]
  arch/s390/include/asm/tlb.h: In function 'tlb_gather_mmu':
  arch/s390/include/asm/tlb.h:57:5: error: 'struct mmu_gather' has no member named 'end'

Broken due to commit 2b047252d0 ("Fix TLB gather virtual address range
invalidation corner cases").

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[ Oh well. We had build testing for ppc amd um, but no s390  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: Change ownership for SGI specific modules.

I have taken a different job. I am removing myself as maintainer of
GRU. Dimitri will continue to maintain the SGI GRU driver, changing the
XP/XPC/XPNET maintainer to Cliff Whickman, but leaving behind my
personal email address to answer any questions about the design or
operation of the XP family of drivers.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

jbd2: Fix oops in jbd2_journal_file_inode()

Commit 0713ed0cde76438d05849f1537d3aab46e099475 added
jbd2_journal_file_inode() call into ext4_block_zero_page_range().
However that function gets called from truncate path and thus inode
needn't have jinode attached - that happens in ext4_file_open() but
the file needn't be ever open since mount. Calling
jbd2_journal_file_inode() without jinode attached results in the oops.

We fix the problem by attaching jinode to inode also in ext4_truncate()
and ext4_punch_hole() when we are going to zero out partial blocks.

Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM fixes from Russell King:
"The usual collection of random fixes.  Also some further fixes to the
  last set of security fixes, and some more from Will (which you may
  already have in a slightly different form)"

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7807/1: kexec: validate CPU hotplug support
  ARM: 7812/1: rwlocks: retry trylock operation if strex fails on free lock
  ARM: 7811/1: locks: use early clobber in arch_spin_trylock
  ARM: 7810/1: perf: Fix array out of bounds access in armpmu_map_hw_event()
  ARM: 7809/1: perf: fix event validation for software group leaders
  ARM: Fix FIQ code on VIVT CPUs
  ARM: Fix !kuser helpers case
  ARM: Fix the world famous typo with is_gate_vma()

Merge branch 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k fixes from Geert Uytterhoeven:
"These are two critical fixes, needed by distro kernels, and thus also
  destined for stable:

   - The do_div() commit fixes a crash in mounting btrfs volumes, which
     was a regression from 3.2,

   - The ARAnyM fix allows to have NatFeat drivers as loadable modules,
     which is needed for initrds"

* 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: Truncate base in do_div()
  m68k/atari: ARAnyM - Fix NatFeat module support

Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux

Pull clock controller fixes from Michael Turquette:
"Two small fixes for the Zynq clock controller introduced in 3.11-rc1
  and another Exynos clock patch which fixes a regression that prevents
  the video pipeline from functioning on that platform"

* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
  clk: exynos4: Add CLK_GET_RATE_NOCACHE flag for the Exynos4x12 ISP clocks
  clk/zynq/clkc: Add CLK_SET_RATE_PARENT flag to ethernet muxes
  clk/zynq/clkc: Add dedicated spinlock for the SWDT

Merge tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
"The removal of delayed_work_pending() checks from kernel/power/qos.c
  done in 3.9 introduced a deadlock in pm_qos_work_fn().

  Fix from Stephen Boyd"

* tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()

Merge tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"This batch contains a few USB audio fixes, a couple of HD-audio
  quirks, various small ASoC driver fixes in addition to an ASoC core
  fix that may lead to memory corruption.

  Unfortunately slightly more volume than the previous pull request, but
  all are reasonable regression fixes"

* tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Add a fixup for Gateway LT27
  ASoC: tegra: fix Tegra30 I2S capture parameter setup
  ALSA: usb-audio: Fix invalid volume resolution for Logitech HD Webcam C525
  ALSA: hda - Fix missing mute controls for CX5051
  ALSA: usb-audio: fix automatic Roland/Yamaha MIDI detection
  ALSA: 6fire: make buffers DMA-able (midi)
  ALSA: 6fire: make buffers DMA-able (pcm)
  ALSA: hda - Add pinfix for LG LW25 laptop
  ASoC: cs42l52: Add new TLV for Beep Volume
  ASoC: cs42l52: Reorder Min/Max and update to SX_TLV for Beep Volume
  ASoC: dapm: Fix empty list check in dapm_new_mux()
  ASoC: sgtl5000: fix buggy 'Capture Attenuate Switch' control
  ASoC: sgtl5000: prevent playback to be muted when terminating concurrent capture

Merge tag 'usb-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are some small USB fixes for 3.11-rc6 that have accumulated.

  Nothing huge, a EHCI fix that solves a much-reported audio USB
  problem, some usb-serial driver endian fixes and other minor fixes, a
  wireless USB oops fix, and two new quirks"

* tag 'usb-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: keyspan: fix null-deref at disconnect and release
  USB: mos7720: fix broken control requests
  usb: add two quirky touchscreen
  USB: ti_usb_3410_5052: fix big-endian firmware handling
  USB: adutux: fix big-endian device-type reporting
  USB: usbtmc: fix big-endian probe of Rigol devices
  USB: mos7840: fix big-endian probe
  USB-Serial: Fix error handling of usb_wwan
  wusbcore: fix kernel panic when disconnecting a wireless USB->serial device
  USB: EHCI: accept very late isochronous URBs

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix SKB leak in 8139cp, from Dave Jones.

2) Fix use of *_PAGES interfaces with mlx5 firmware, from Moshe Lazar.

3) RCU conversion of macvtap introduced two races, fixes by Eric
    Dumazet

4) Synchronize statistic flows in bnx2x driver to prevent corruption,
    from Dmitry Kravkov

5) Undo optimization in IP tunneling, we were using the inner IP header
    in some cases to inherit the IP ID, but that isn't correct in some
    circumstances.  From Pravin B Shelar

6) Use correct struct size when parsing netlink attributes in
    rtnl_bridge_getlink().  From Asbjoern Sloth Toennesen

7) Length verifications in tun_get_user() are bogus, from Weiping Pan
    and Dan Carpenter

8) Fix bad merge resolution during 3.11 networking development in
    openvswitch, albeit a harmless one which added some unreachable
    code.  From Jesse Gross

9) Wrong size used in flexible array allocation in openvswitch, from
    Pravin B Shelar

10) Clear out firmware capability flags the be2net driver isn't ready to
    handle yet, from Sarveshwar Bandi

11) Revert DMA mapping error checking addition to cxgb3 driver, it's
    buggy.  From Alexey Kardashevskiy

12) Fix regression in packet scheduler rate limiting when working with a
    link layer of ATM.  From Jesper Dangaard Brouer

13) Fix several errors in TCP Cubic congestion control, in particular
    overflow errors in timestamp calculations.  From Eric Dumazet and
    Van Jacobson

14) In ipv6 routing lookups, we need to backtrack if subtree traversal
    don't result in a match.  From Hannes Frederic Sowa

15) ipgre_header() returns incorrect packet offset.  Fix from Timo Teräs

16) Get "low latency" out of the new MIB counter names.  From Eliezer
    Tamir

17) State check in ndo_dflt_fdb_del() is inverted, from Sridhar
    Samudrala

18) Handle TCP Fast Open properly in netfilter conntrack, from Yuchung
    Cheng

19) Wrong memcpy length in pcan_usb driver, from Stephane Grosjean

20) Fix dealock in TIPC, from Wang Weidong and Ding Tianhong

21) call_rcu() call to destroy SCTP transport is done too early and
    might result in an oops.  From Daniel Borkmann

22) Fix races in genetlink family dumps, from Johannes Berg

23) Flags passed into macvlan by the user need to be validated properly,
    from Michael S Tsirkin

24) Fix skge build on 32-bit, from Stephen Hemminger

25) Handle malformed TCP headers properly in xt_TCPMSS, from Pablo Neira
    Ayuso

26) Fix handling of stacked vlans in vlan_dev_real_dev(), from Nikolay
    Aleksandrov

27) Eliminate MTU calculation overflows in esp{4,6}, from Daniel
    Borkmann

28) neigh_parms need to be setup before calling the ->ndo_neigh_setup()
    method.  From Veaceslav Falico

29) Kill out-of-bounds prefetch in fib_trie, from Eric Dumazet

30) Don't dereference MLD query message if the length isn't value in the
    bridge multicast code, from Linus Lüssing

31) Fix VXLAN IGMP join regression due to an inverted check, from Cong
    Wang

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (70 commits)
  net/mlx5_core: Support MANAGE_PAGES and QUERY_PAGES firmware command changes
  tun: signedness bug in tun_get_user()
  qlcnic: Fix diagnostic interrupt test for 83xx adapters
  qlcnic: Fix beacon state return status handling
  qlcnic: Fix set driver version command
  net: tg3: fix NULL pointer dereference in tg3_io_error_detected and tg3_io_slot_reset
  net_sched: restore "linklayer atm" handling
  drivers/net/ethernet/via/via-velocity.c: update napi implementation
  Revert "cxgb3: Check and handle the dma mapping errors"
  be2net: Clear any capability flags that driver is not interested in.
  openvswitch: Reset tunnel key between input and output.
  openvswitch: Use correct type while allocating flex array.
  openvswitch: Fix bad merge resolution.
  tun: compare with 0 instead of total_len
  rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header
  ethernet/arc/arc_emac - fix NAPI "work > weight" warning
  ip_tunnel: Do not use inner ip-header-id for tunnel ip-header-id.
  bnx2x: prevent crash in shutdown flow with CNIC
  bnx2x: fix PTE write access error
  bnx2x: fix memory leak in VF
  ...

perf: Do not compute time values unnecessarily

We should not be calling calc_timer_values() for events that do not actually
have an mmap()'ed userpage.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130802191630.GT27162@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf: Account freq events globally

Freq events may not always be affine to a particular CPU. As such,
account_event_cpu() may crash if we account per cpu a freq event
that has event->cpu == -1.

To solve this, lets account freq events globally. In practice
this doesn't change much the picture because perf tools create
per-task perf events with one event per CPU by default. Profiling a
single CPU is usually a corner case so there is no much point in
optimizing things that way.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1375460996-16329-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf: Roll back callchain buffer refcount under the callchain mutex

When we fail to allocate the callchain buffers, we roll back the refcount
we did and return from get_callchain_buffers().

However we take the refcount and allocate under the callchain lock
but the rollback is done outside the lock.

As a result, while we roll back, some concurrent callchain user may
call get_callchain_buffers(), see the non-zero refcount and give up
because the buffers are NULL without itself retrying the allocation.

The consequences aren't that bad but that behaviour looks weird enough and
it's better to give their chances to the following callchain users where
we failed.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1375460996-16329-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf/x86/intel/uncore: Enable EV_SEL_EXT bit for PCU

This patch adds support for the SNB-EP PCU uncore PMU extra_sel_bit
(bit 21) which is missing from the documentation in Table-2.75 of
Intel Xeon Processor E5-2600 Product Family Uncore Performance
Monitoring Guide. It is referred to later in Table-2.81. Without
this selection bit explicitly enabled by the kernel, some events
such as COREx_TRANSITION_CYCLES do not count correctly.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1376375382-21350-4-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf/x86/intel/uncore: Add filter support for QPI boxes

The QPI uncore boxes have two pairs of MATCH/MASK registers that
user to filter packet traffic serviced by QPI link layer. These
registers are in auxiliary PCI devices.

This patch adds the auxiliary PCI devices to snbep_uncore_pci_ids
and adds field definitions for the MATCH/MASK registers.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1375856245-10717-2-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf/x86/intel/uncore: Add auxiliary pci device support

The QPI uncore boxes have two pairs of MATCH/MASK registers that
user to filter packet traffic serviced by QPI link layer. These
registers are in auxiliary PCI devices.

This patch changes the meaning of (struct pci_device_id)->driver_data.
The first 8 bits are device index of the same uncore type, the second
8 bytes are uncore type index. Auxiliary PCI device's type is defined
as UNCORE_EXTRA_PCI_DEV(0xff)

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1375856245-10717-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

nohz: Include local CPU in full dynticks global kick

tick_nohz_full_kick_all() is useful to notify all full dynticks
CPUs that there is a system state change to checkout before
re-evaluating the need for the tick.

Unfortunately this is implemented using smp_call_function_many()
that ignores the local CPU. This CPU also needs to re-evaluate
the tick.

on_each_cpu_mask() is not useful either because we don't want to
re-evaluate the tick state in place but asynchronously from an IPI
to avoid messing up with any random locking scenario.

So lets call tick_nohz_full_kick() from tick_nohz_full_kick_all()
so that the usual irq work takes care of it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1375460996-16329-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Fix TLB gather virtual address range invalidation corner cases

Ben Tebulin reported:

"Since v3.7.2 on two independent machines a very specific Git
  repository fails in 9/10 cases on git-fsck due to an SHA1/memory
  failures.  This only occurs on a very specific repository and can be
  reproduced stably on two independent laptops.  Git mailing list ran
  out of ideas and for me this looks like some very exotic kernel issue"

and bisected the failure to the backport of commit 53a59fc67f97 ("mm:
limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").

That commit itself is not actually buggy, but what it does is to make it
much more likely to hit the partial TLB invalidation case, since it
introduces a new case in tlb_next_batch() that previously only ever
happened when running out of memory.

The real bug is that the TLB gather virtual memory range setup is subtly
buggered.  It was introduced in commit 597e1c3580b7 ("mm/mmu_gather:
enable tlb flush range in generic mmu_gather"), and the range handling
was already fixed at least once in commit e6c495a96ce0 ("mm: fix the TLB
range flushed when __tlb_remove_page() runs out of slots"), but that fix
was not complete.

The problem with the TLB gather virtual address range is that it isn't
set up by the initial tlb_gather_mmu() initialization (which didn't get
the TLB range information), but it is set up ad-hoc later by the
functions that actually flush the TLB.  And so any such case that forgot
to update the TLB range entries would potentially miss TLB invalidates.

Rather than try to figure out exactly which particular ad-hoc range
setup was missing (I personally suspect it's the hugetlb case in
zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
did), this patch just gets rid of the problem at the source: make the
TLB range information available to tlb_gather_mmu(), and initialize it
when initializing all the other tlb gather fields.

This makes the patch larger, but conceptually much simpler.  And the end
result is much more understandable; even if you want to play games with
partial ranges when invalidating the TLB contents in chunks, now the
range information is always there, and anybody who doesn't want to
bother with it won't introduce subtle bugs.

Ben verified that this fixes his problem.

Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sched/cputime: Use this_cpu_add() in task_group_account_field()

Use of a this_cpu() operation reduces the number of instructions used
for accounting (account_user_time()) and frees up some registers. This is in
the scheduler tick hotpath.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/00000140596dd165-338ff7f5-893b-4fec-b251-aaac5557239e-000000@email.amazonses.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

cpumask: Fix cpumask leak in partition_sched_domains()

If doms_new is NULL, partition_sched_domains() will reset ndoms_cur
to 0, and free old sched domains with free_sched_domains(doms_cur, ndoms_cur).
As ndoms_cur is 0, the cpumask will not be freed.

Signed-off-by: Xiaotian Feng <xtfeng@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1375790802-11857-1-git-send-email-xtfeng@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'v3.11-rc5' into sched/core

Merge Linux 3.11-rc5, to pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

ALSA: hda - Add a fixup for Gateway LT27

Gateway LT27 needs a fixup for the inverted digital mic.

Reported-by: "Nathanael D. Noblet" <nathanael@gnat.ca>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

net/mlx5_core: Support MANAGE_PAGES and QUERY_PAGES firmware command changes

In the previous QUERY_PAGES command version we used one command to get the
required amount of boot, init and post init pages. The new version uses the
op_mod field to specify whether the query is for the required amount of boot,
init or post init pages. In addition the output field size for the required
amount of pages increased from 16 to 32 bits.

In MANAGE_PAGES command the input_num_entries and output_num_entries fields
sizes changed from 16 to 32 bits and the PAS tables offset changed to 0x10.

In the pages request event the num_pages field also changed to 32 bits.

In the HCA-capabilities-layout the size and location of max_qp_mcg field has
been changed to support 24 bits.

This patch isn't compatible with firmware versions < 5; however, it turns out that the
first GA firmware we will publish will not support previous versions so this should be OK.

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: signedness bug in tun_get_user()

The recent fix d9bf5f1309 "tun: compare with 0 instead of total_len" is
not totally correct. Because "len" and "sizeof()" are size_t type, that
means they are never less than zero.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix diagnostic interrupt test for 83xx adapters

o Do not allow interrupt test when adapter is resetting.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix beacon state return status handling

o Driver was misinterpreting the return status for beacon
  state query leading to incorrect interpretation of beacon
  state and logging an error message for successful status.
  Fixed the driver to properly interpret the return status.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix set driver version command

Driver was issuing set driver version command through all
functions in the adapter. Fix the driver to issue set driver
version once per adapter, through function 0.

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: tg3: fix NULL pointer dereference in tg3_io_error_detected and tg3_io_slot_reset

Commit d8af4dfd8 ("net/tg3: Fix kernel crash") introduced a possible
NULL pointer dereference in tg3 driver when !netdev || !netif_running(netdev)
condition is met and netdev is NULL. Then, the jump to the 'done' label
calls dev_close() with a netdevice that is NULL. Therefore, only call
dev_close() when we have a netdevice, but one that is not running.

[ Add the same checks in tg3_io_slot_reset() per Gavin Shan - by Nithin
Nayak Sujir ]

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Gavin Shan <shangw@linux.vnet.ibm.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'asoc-v3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v3.11

A few driver specific fixes here plus one core fix for a memory
corruption issue in DAPM initialisation which could lead to crashes.

drm/radeon: fix WREG32_OR macro setting bits in a register

This bug (introduced in 3.10) in WREG32_OR made
commit d3418eacad403033e95e49dc14afa37c2112c134
"drm/radeon/evergreen: setup HDMI before enabling it"
cause a regression. Sometimes audio over HDMI wasn't working, sometimes
display was corrupted.

This fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=60687
https://bugzilla.kernel.org/show_bug.cgi?id=60709
https://bugs.freedesktop.org/show_bug.cgi?id=67767

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Merge remote-tracking branch 'asoc/fix/tegra' into asoc-linus

Merge remote-tracking branch 'asoc/fix/sgtl5000' into asoc-linus

Merge remote-tracking branch 'asoc/fix/dapm' into asoc-linus

Merge remote-tracking branch 'asoc/fix/cs42l52' into asoc-linus

ASoC: tegra: fix Tegra30 I2S capture parameter setup

The Tegra30 I2S driver was writing the AHUB interface parameters to the
playback path register rather than the capture path register. This
caused the capture parameters not to be configured at all, so if
capturing using non-HW-default parameters (e.g. 16-bit stereo rather
than 8-bit mono) the audio would be corrupted.

With this fixed, audio capture from an analog microphone works correctly
on the Cardhu board.

Cc: stable@vger.kernel.org
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@linaro.org>

net_sched: restore "linklayer atm" handling

commit 56b765b79 ("htb: improved accuracy at high rates")
broke the "linklayer atm" handling.

tc class add ... htb rate X ceil Y linklayer atm

The linklayer setting is implemented by modifying the rate table
which is send to the kernel.  No direct parameter were
transferred to the kernel indicating the linklayer setting.

The commit 56b765b79 ("htb: improved accuracy at high rates")
removed the use of the rate table system.

To keep compatible with older iproute2 utils, this patch detects
the linklayer by parsing the rate table.  It also supports future
versions of iproute2 to send this linklayer parameter to the
kernel directly. This is done by using the __reserved field in
struct tc_ratespec, to convey the choosen linklayer option, but
only using the lower 4 bits of this field.

Linklayer detection is limited to speeds below 100Mbit/s, because
at high rates the rtab is gets too inaccurate, so bad that
several fields contain the same values, this resembling the ATM
detect.  Fields even start to contain "0" time to send, e.g. at
1000Mbit/s sending a 96 bytes packet cost "0", thus the rtab have
been more broken than we first realized.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch

Jesse Gross says:

====================
Three bug fixes that are fairly small either way but resolve obviously
incorrect code. For net/3.11.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net/ethernet/via/via-velocity.c: update napi implementation

Drivers supporting NAPI should use a NAPI-specific function for receiving
packets. Hence netif_rx is changed to netif_receive_skb.

Furthermore netif_napi_del should be used in the probe and remove function
to clean up the NAPI resource information.

Thanks to Francois Romieu, David Shwatrz and Rami Rosen for their help on
this patch.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "cxgb3: Check and handle the dma mapping errors"

This reverts commit f83331bab149e29fa2c49cf102c0cd8c3f1ce9f9.

As the tests PPC64 (powernv platform) show, IOMMU pages are leaking
when transferring big amount of small packets (<=64 bytes),
"ping -f" and waiting for 15 seconds is the simplest way to confirm the bug.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Santosh Rastapur <santosh@chelsio.com>
Cc: Jay Fenlason <fenlason@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Divy Le ray <divy@chelsio.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Clear any capability flags that driver is not interested in.

It is possible for some versions of firmware to advertise capabilities that driver
is not ready to handle. This may lead to controller stall. Since the driver is
interested only in subset of flags, clearing the rest.

Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core

Pull perf optimizations from Steve Rostedt.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'v3.11-rc5' into perf/core

Merge Linux 3.11-rc5, to sync up with the latest upstream fixes since -rc1.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

* Allow specifying syscalls in 'perf trace', a la strace.

* Simplify symbol filtering by doing it at machine class level,
   from Adrian Hunter.

* Add option to 'perf kvm' to print only events that exceed a specified time
   duration, from David Ahern.

* 'perf sched' improvements, including removing some tracepoints that provide
   the same information as the PERF_RECORD_{FORK,EXIT} events.

* Improve stack trace printing, from David Ahern.

* Update documentation with live command, from David Ahern

* Fix 'perf test' compile failure on do_sort_something, from David Ahern.

* Improve robustness of topology parsing code, from Stephane Eranian.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

openvswitch: Reset tunnel key between input and output.

It doesn't make sense to output a tunnel packet using the same
parameters that it was received with since that will generally
just result in the packet going back to us. As a result, userspace
assumes that the tunnel key is cleared when transitioning through
the switch. In the majority of cases this doesn't matter since a
packet is either going to a tunnel port (in which the key is
overwritten with new values) or to a non-tunnel port (in which
case the key is ignored). However, it's theoreticaly possible that
userspace could rely on the documented behavior, so this corrects
it.

Signed-off-by: Jesse Gross <jesse@nicira.com>

openvswitch: Use correct type while allocating flex array.

Flex array is used to allocate hash buckets which is type struct
hlist_head, but we use `struct hlist_head *` to calculate
array size. Since hlist_head is of size pointer it works fine.

Following patch use correct type.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>

openvswitch: Fix bad merge resolution.

git silently included an extra hunk in vport_cmd_set() during
automatic merging. This code is unreachable so it does not actually
introduce a problem but it is clearly incorrect.

Signed-off-by: Jesse Gross <jesse@nicira.com>

drm/radeon/r7xx: fix copy paste typo in golden register setup

Uses the wrong array size for some asics which can lead
to garbage getting written to registers.

Fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=60674

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

USB: keyspan: fix null-deref at disconnect and release

Make sure to fail properly if the device is not accepted during attach
in order to avoid null-pointer derefs (of missing interface private
data) at disconnect or release.

Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: mos7720: fix broken control requests

The parallel-port code of the drivers used a stack allocated
control-request buffer for asynchronous (and possibly deferred) control
requests. This not only violates the no-DMA-from-stack requirement but
could also lead to corrupt control requests being submitted.

Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: add two quirky touchscreen

These devices tend to become unresponsive after S3

Signed-off-by: Oliver Neukum <oneukum@suse.de>
CC: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Don't deref pipe->cpu_transcoder in the hangcheck code

If we get an error event really early in the driver setup sequence,
which gen3 is especially prone to with various display GTT faults we
Oops. So try to avoid this.

Additionally with Haswell the transcoders are a separate bank of
registers from the pipes (4 transcoders, 3 pipes). In event of an
error, we want to be sure we have a complete and accurate picture of
the machine state, so record all the transcoders in addition to all
the active pipes.

This regression has been introduced in

commit 702e7a56af3780d8b3a717f698209bef44187bb0
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Tue Oct 23 18:29:59 2012 -0200

drm/i915: convert PIPECONF to use transcoder instead of pipe

Based on the patch "drm/i915: Dump all transcoder registers on error"
from Chris Wilson:

v2: Rebase so that we don't try to be clever and try to figure out the
cpu transcoder from hw state. That exercise should be done when we
analyze the error state offline.

The actual bugfix is to not call intel_pipe_to_cpu_transcoder in the
error state capture code in case the pipes aren't fully set up yet.

v3: Simplifiy the err->num_transcoders computation a bit. While at it
make the error capture stuff save on systems without a display block.

v4: Fix fail, spotted by Jani.

v5: Completely new commit message, cc: stable.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=60021
Cc: stable@vger.kernel.org
Tested-by: Dustin King <daking@rescomp.stanford.edu>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Merge branch 'akpm' (patches from Andrew Morton)

Merge a bunch of fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  fs/proc/task_mmu.c: fix buffer overflow in add_page_map()
  arch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig"
  ocfs2: fix null pointer dereference in ocfs2_dir_foreach_blk_id()
  x86 get_unmapped_area(): use proper mmap base for bottom-up direction
  ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page
  ocfs2: Revert 40bd62e to avoid regression in extended allocation
  drivers/rtc/rtc-stmp3xxx.c: provide timeout for potentially endless loop polling a HW bit
  hugetlb: fix lockdep splat caused by pmd sharing
  aoe: adjust ref of head for compound page tails
  microblaze: fix clone syscall
  mm: save soft-dirty bits on file pages
  mm: save soft-dirty bits on swapped pages
  memcg: don't initialize kmem-cache destroying work for root caches

Merge branch 'timers/nohz-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/nohz

Pull nohz improvements from Frederic Weisbecker:

" It mostly contains fixes and full dynticks off-case optimizations. I believe that
   distros want to enable this feature so it seems important to optimize the case
   where the "nohz_full=" parameter is empty. ie: I'm trying to remove any performance
   regression that comes with NO_HZ_FULL=y when the feature is not used.

   This patchset improves the current situation a lot (off-case appears to be around 11% faster
   with hackbench, although I guess it may vary depending on the configuration but it should be
   significantly faster in any case) now there is still some work to do: I can still observe a
   remaining loss of 1.6% throughput seen with hackbench compared to CONFIG_NO_HZ_FULL=n. "

Signed-off-by: Ingo Molnar <mingo@kernel.org>

nohz: Optimize full dynticks's sched hooks with static keys

Scheduler IPIs and task context switches are serious fast path.
Let's try to hide as much as we can the impact of full
dynticks APIs' off case that are called on these sites
through the use of static keys.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

nohz: Optimize full dynticks state checks with static keys

These APIs are frequenctly accessed and priority is given
to optimize the full dynticks off-case in order to let
distros enable this feature without suffering from
significant performance regressions.

Let's inline these APIs and optimize them with static keys.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

nohz: Rename a few state variables

Rename the full dynticks's cpumask and cpumask state variables
to some more exportable names.

These will be used later from global headers to optimize
the main full dynticks APIs in conjunction with static keys.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

vtime: Always debug check snapshot source _before_ updating it

The vtime delta update performed by get_vtime_delta() always check
that the source of the snapshot is valid.

Meanhile the snapshot updaters that rely on get_vtime_delta() also
set the new snapshot origin. But some of them do this right before
the call to get_vtime_delta(), making its debug check useless.

This is easily fixable by moving the snapshot origin update after
the call to get_vtime_delta(). The order doesn't matter there.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

vtime: Always scale generic vtime accounting results

The cputime accounting in full dynticks can be a subtle
mixup of CPUs using tick based accounting and others using
generic vtime.

As long as the tick can have a share on producing these stats, we
want to scale the result against CFS precise accounting as the tick
can miss some task hiding between the periodic interrupt.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

vtime: Optimize full dynticks accounting off case with static keys

If no CPU is in the full dynticks range, we can avoid the full
dynticks cputime accounting through generic vtime along with its
overhead and use the traditional tick based accounting instead.

Let's do this and nope the off case with static keys.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

vtime: Describe overriden functions in dedicated arch headers

If the arch overrides some generic vtime APIs, let it describe
these on a dedicated and standalone header. This way it becomes
convenient to include it in vtime generic headers without irrelevant
stuff in such a low level header.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>

m68k: hardirq_count() only need preempt_mask.h

The m68k irqflags implementation needs to check hardirq
context in some cases.

As it is a very low level header file, it's better to
include preempt_mask.h rather than hardirq.h when the
only purpose is to use irq context APIs. This way we
can avoid future header circular dependencies when
vtime.h will expand to use static keys.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>

hardirq: Split preempt count mask definitions

In order to use static keys with vtime APIs, we'll need to
add static keys headers to vtime.h

hardirq.h then becomes a problem because it needs vtime.h
for irqtime accounting in irq_enter/irq_exit, but it's
often included just to get the irq mask definitions in the
task preempt_count field and the APIs that come along:
in_interrupt(), in_hardirq(), etc...

Some very low level arch headers sometimes need these masks
and APIs such as arch/m68k/include/asm/irqflags.h for example.
But they don't want to include hardirq.h if vtime.h, jump_label.h
and even workqueue.h come along. Including such bloated high
level header from arch headers can quickly result in circular
headers dependency that crash the build.

So let's split hardirq.h in two parts:

* preempt_mask.h that gathers all the preempt_count definitions
and the APIs associated. This one is considered low level and can
be safely included anywhere.

* hardirq.h that includes the previous one. It defines the irq
entry/exit APIs.

To avoid future circular headers dependencies, the preempt_mask.h
inclusion can replace hardirq.h on files that don't implement irq
low level handlers but just need the atomic/context check APIs.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

context_tracking: Split low level state headers

We plan to use the context tracking static key on inline
vtime APIs. For this we need to include the context tracking
headers from those of vtime.

However vtime headers need to stay low level because they are
included in hardirq.h that mostly contains standalone
definitions. But context_tracking.h includes sched.h for
a few task_struct references, therefore it wouldn't be sensible
to include it from vtime.h

To solve this, lets split the context tracking headers and move
out the pure state definitions that only require a few low level
headers. We can safely include that small part in vtime.h later.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

vtime: Fix racy cputime delta update

get_vtime_delta() must be called under the task vtime_seqlock
with the code that does the cputime accounting flush.

Otherwise the cputime reader can be fooled and run into
a race where it sees the snapshot update but misses the
cputime flush. As a result it can report a cputime that is
way too short.

Fix vtime_account_user() that wasn't complying to that rule.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

vtime: Remove a few unneeded generic vtime state checks

Some generic vtime APIs check if the vtime accounting
is enabled on the local CPU before doing their work.

Some of these are not needed because all their callers already
take care of that. Let's remove the checks on these.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

context_tracking: User/kernel broundary cross trace events

This can be useful to track all kernel/user round trips.
And it's also helpful to debug the context tracking subsystem.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

context_tracking: Optimize context switch off case with static keys

No need for syscall slowpath if no CPU is full dynticks,
rather nop this in this case.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

context_tracking: Optimize guest APIs off case with static key

Optimize guest entry/exit APIs with static keys. This minimize
the overhead for those who enable CONFIG_NO_HZ_FULL without
always using it. Having no range passed to nohz_full= should
result in the probes overhead to be minimized.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

context_tracking: Optimize main APIs off case with static key

Optimize user and exception entry/exit APIs with static
keys. This minimize the overhead for those who enable
CONFIG_NO_HZ_FULL without always using it. Having no range
passed to nohz_full= should result in the probes to be nopped
(at least we hope so...).

If this proves not be enough in the long term, we'll need
to bring an exception slow path by re-routing the exception
handlers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

context_tracking: Ground setup for static key use

Prepare for using a static key in the context tracking subsystem.
This will help optimizing the off case on its many users:

* user_enter, user_exit, exception_enter, exception_exit, guest_enter,
guest_exit, vtime_*()

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>

perf trace: Allow specifying which syscalls to trace

Similar to -e in strace, i.e. a comma separated list of syscall names
to trace.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-5zku7q5wug3103k1dzn3yy63@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Improve robustness of topology parsing code

This patch improves the robustness of the build_cpu_topo() routine by
allowing either the CPU parsing or the thread parsing to fail and yet
get perf to produce some topology data which could be useful for the
analysis.

Without this patch, if the cpu parsing fails, the thread parsing is not
attempted vice-versa.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130814100426.GA3444@quad
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tests: Fix compile failure on do_sort_something

Commit b55ae0a9 added code-reading.c which fails to compile on Fedora 16
with compiler version:
$ gcc --version
gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2)

Failure message is:

tests/code-reading.c: In function ‘do_sort_something’:
tests/code-reading.c:305:13: error: stack protector not protecting local variables: variable length buffer [-Werror=stack-protector]
cc1: all warnings being treated as errors
make: *** [/tmp/junk/tests/code-reading.o] Error 1
make: *** Waiting for unfinished jobs....

v2: as Adrian noticed changed sizeof to ARRAY_SIZE

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1376454732-83728-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge tag 'amd_ucode_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent

Pull AMD microcode fixes from Borislav Petkov:

" Those are basically two fixes which correct the AMD early ucode loader
   from accessing cpu_data too early, i.e. before smp_store_cpu_info()
   has copied the boot_cpu_data ontop and overwritten an already empty
   structure (which we shouldn't access that early in the first place
   anyway).

   The second patch is kinda largish for that late in the game but it
   shouldn't be problematic because we're simply switching from using
   cpu_data to use the CPU family number directly and thus again, not use
   uninitialized cpu_data structure. "

Signed-off-by: Ingo Molnar <mingo@kernel.org>