git.karo-electronics.de Git - karo-tx-linux.git/log

agp: fix arbitrary kernel memory writes

commit 194b3da873fd334ef183806db751473512af29ce upstream.

pg_start is copied from userspace on AGPIOC_BIND and AGPIOC_UNBIND ioctl
cmds of agp_ioctl() and passed to agpioc_bind_wrap(). As said in the
comment, (pg_start + mem->page_count) may wrap in case of AGPIOC_BIND,
and it is not checked at all in case of AGPIOC_UNBIND. As a result, user
with sufficient privileges (usually "video" group) may generate either
local DoS or privilege escalation.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm: select FRAMEBUFFER_CONSOLE_PRIMARY if we have FRAMEBUFFER_CONSOLE

commit bf5192edcbc1f0a7f9c054649dbf1a0b3210d9b7 upstream.

Multi-gpu/switcheroo relies on this option to get the console on the
correct GPU at bootup, some distros enable it but it seems some get
it wrong.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

um: mdd support for 64 bit atomic operations

commit 57d8e02e3cd21bccf2b84b26b42feb79e1f0f83e upstream.

This adds support for 64 bit atomic operations on 32 bit UML systems.  XFS
needs them since 2.6.38.

  $ make ARCH=um SUBARCH=i386
  ...
    LD      .tmp_vmlinux1
  fs/built-in.o: In function `xlog_regrant_reserve_log_space':
  xfs_log.c:(.text+0xd8584): undefined reference to `atomic64_read_386'
  xfs_log.c:(.text+0xd85ac): undefined reference to `cmpxchg8b_emu'
  ...

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=32812

Reported-by: Martin Walch <walch.martin@web.de>
Tested-by: Martin Walch <walch.martin@web.de>
Cc: Martin Walch <walch.martin@web.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

NFSv4.1: Ensure state manager thread dies on last umount

commit 47c2199b6eb5fbe38ddb844db7cdbd914d304f9c upstream.

Currently, the state manager may continue to try recovering state forever
even after the last filesystem to reference that nfs_client has umounted.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

nfs: don't lose MS_SYNCHRONOUS on remount of noac mount

commit 26c4c170731f00008f4317a2888a0a07ac99d90d upstream.

On a remount, the VFS layer will clear the MS_SYNCHRONOUS bit on the
assumption that the flags on the mount syscall will have it set if the
remounted fs is supposed to keep it.

In the case of "noac" though, MS_SYNCHRONOUS is implied. A remount of
such a mount will lose the MS_SYNCHRONOUS flag since "sync" isn't part
of the mount options.

Reported-by: Max Matveev <makc@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

vfs: avoid large kmalloc()s for the fdtable

commit 6d4831c283530a5f2c6bd8172c13efa236eb149d upstream.

Azurit reports large increases in system time after 2.6.36 when running
Apache. It was bisected down to a892e2d7dcdfa6c76e6 ("vfs: use kmalloc()
to allocate fdmem if possible").

That patch caused the vfs to use kmalloc() for very large allocations and
this is causing excessive work (and presumably excessive reclaim) within
the page allocator.

Fix it by falling back to vmalloc() earlier - when the allocation attempt
would have been considered "costly" by reclaim.

Reported-by: azurIt <azurit@pobox.sk>
Tested-by: azurIt <azurit@pobox.sk>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Cc: Americo Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

m68k/mm: Set all online nodes in N_NORMAL_MEMORY

commit 4aac0b4815ba592052758f4b468f253d383dc9d6 upstream.

For m68k, N_NORMAL_MEMORY represents all nodes that have present memory
since it does not support HIGHMEM. This patch sets the bit at the time
node_present_pages has been set by free_area_init_node.
At the time the node is brought online, the node state would have to be
done unconditionally since information about present memory has not yet
been recorded.

If N_NORMAL_MEMORY is not accurate, slub may encounter errors since it
uses this nodemask to setup per-cache kmem_cache_node data structures.

This pach is an alternative to the one proposed by David Rientjes
<rientjes@google.com> attempting to set node state immediately when
bringing the node online.

Signed-off-by: Michael Schmitz <schmitz@debian.org>
Tested-by: Thorsten Glaser <tg@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mm: thp: fix /dev/zero MAP_PRIVATE and vm_flags cleanups

commit 78f11a255749d09025f54d4e2df4fbcb031530e2 upstream.

The huge_memory.c THP page fault was allowed to run if vm_ops was null
(which would succeed for /dev/zero MAP_PRIVATE, as the f_op->mmap wouldn't
setup a special vma->vm_ops and it would fallback to regular anonymous
memory) but other THP logics weren't fully activated for vmas with vm_file
not NULL (/dev/zero has a not NULL vma->vm_file).

So this removes the vm_file checks so that /dev/zero also can safely use
THP (the other albeit safer approach to fix this bug would have been to
prevent the THP initial page fault to run if vm_file was set).

After removing the vm_file checks, this also makes huge_memory.c stricter
in khugepaged for the DEBUG_VM=y case. It doesn't replace the vm_file
check with a is_pfn_mapping check (but it keeps checking for VM_PFNMAP
under VM_BUG_ON) because for a is_cow_mapping() mapping VM_PFNMAP should
only be allowed to exist before the first page fault, and in turn when
vma->anon_vma is null (so preventing khugepaged registration). So I tend
to think the previous comment saying if vm_file was set, VM_PFNMAP might
have been set and we could still be registered in khugepaged (despite
anon_vma was not NULL to be registered in khugepaged) was too paranoid.
The is_linear_pfn_mapping check is also I think superfluous (as described
by comment) but under DEBUG_VM it is safe to stay.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=33682

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Caspar Zhang <bugs@casparzhang.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mm: check if PTE is already allocated during page fault

commit cc03638df20acbec5d0d0d9e07234aadde9e698d upstream.

With transparent hugepage support, handle_mm_fault() has to be careful
that a normal PMD has been established before handling a PTE fault.  To
achieve this, it used __pte_alloc() directly instead of pte_alloc_map as
pte_alloc_map is unsafe to run against a huge PMD.  pte_offset_map() is
called once it is known the PMD is safe.

pte_alloc_map() is smart enough to check if a PTE is already present
before calling __pte_alloc but this check was lost.  As a consequence,
PTEs may be allocated unnecessarily and the page table lock taken.  Thi
useless PTE does get cleaned up but it's a performance hit which is
visible in page_test from aim9.

This patch simply re-adds the check normally done by pte_alloc_map to
check if the PTE needs to be allocated before taking the page table lock.
The effect is noticable in page_test from aim9.

  AIM9
                  2.6.38-vanilla 2.6.38-checkptenone
  creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
  page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
  brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
  exec_test      382.00 ( 0.00%)   456.90 (16.39%)
  fork_test       60.11 ( 0.00%)    67.79 (11.34%)
  MMTests Statistics: duration
  Total Elapsed Time (seconds)                611.90    612.22

(While this affects 2.6.38, it is a performance rather than a functional
bug and normally outside the rules -stable.  While the big performance
differences are to a microbench, the difference in fork and exec
performance may be significant enough that -stable wants to consider the
patch)

Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

oom: use pte pages in OOM score

commit f755a042d82b51b54f3bdd0890e5ea56c0fb6807 upstream.

PTE pages eat up memory just like anything else, but we do not account for
them in any way in the OOM scores. They are also _guaranteed_ to get
freed up when a process is OOM killed, while RSS is not.

Reported-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

virtio: console: Enable call to hvc_remove() on console port remove

commit afa2689e19073cd2e762d0f2c1358fab1ab9f18c upstream.

This call was disabled as hot-unplugging one virtconsole port led to
another virtconsole port freezing.

Upon testing it again, this now works, so enable it.

In addition, a bug was found in qemu wherein removing a port of one type
caused the guest output from another port to stop working. I doubt it
was just this bug that caused it (since disabling the hvc_remove() call
did allow other ports to continue working), but since it's all solved
now, we're fine with hot-unplugging of virtconsole ports.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

FLEXCOP-PCI: fix __xlate_proc_name-warning for flexcop-pci

commit b934c20de1398d4a82d2ecfeb588a214a910f13f upstream.

This patch fixes the warning about bad names for sys-fs and other kernel-things. The flexcop-pci driver was using '/'-characters in it, which is not good.
This has been fixed in several attempts by several people, but obviously never made it into the kernel.

Signed-off-by: Patrick Boettcher <pboettcher@kernellabs.com>
Cc: Steffen Barszus <steffenbpunkt@googlemail.com>
Cc: Boris Cuber <me@boris64.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

set memory ranges in N_NORMAL_MEMORY when onlined

commit d9b41e0b54fd7e164daf1e9c539c1070398aa02e upstream.

When a DISCONTIGMEM memory range is brought online as a NUMA node, it
also needs to have its bet set in N_NORMAL_MEMORY. This is necessary for
generic kernel code that utilizes N_NORMAL_MEMORY as a subset of N_ONLINE
for memory savings.

These types of hacks can hopefully be removed once DISCONTIGMEM is either
removed or abstracted away from CONFIG_NUMA.

Fixes a panic in the slub code which only initializes structures for
N_NORMAL_MEMORY to save memory:

Backtrace:
[<000000004021c938>] add_partial+0x28/0x98
[<000000004021faa0>] __slab_free+0x1d0/0x1d8
[<000000004021fd04>] kmem_cache_free+0xc4/0x128
[<000000004033bf9c>] ida_get_new_above+0x21c/0x2c0
[<00000000402a8980>] sysfs_new_dirent+0xd0/0x238
[<00000000402a974c>] create_dir+0x5c/0x168
[<00000000402a9ab0>] sysfs_create_dir+0x98/0x128
[<000000004033d6c4>] kobject_add_internal+0x114/0x258
[<000000004033d9ac>] kobject_add_varg+0x7c/0xa0
[<000000004033df20>] kobject_add+0x50/0x90
[<000000004033dfb4>] kobject_create_and_add+0x54/0xc8
[<00000000407862a0>] cgroup_init+0x138/0x1f0
[<000000004077ce50>] start_kernel+0x5a0/0x840
[<000000004011fa3c>] start_parisc+0xa4/0xb8
[<00000000404bb034>] packet_ioctl+0x16c/0x208
[<000000004049ac30>] ip_mroute_setsockopt+0x260/0xf20

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

slub: fix panic with DISCONTIGMEM

commit 4a5fa3590f09999f6db41bc386bce40848fa9f63 upstream.

Slub makes assumptions about page_to_nid() which are violated by
DISCONTIGMEM and !NUMA.  This violation results in a panic because
page_to_nid() can be non-zero for pages in the discontiguous ranges and
this leads to a null return by get_node().  The assertion by the
maintainer is that DISCONTIGMEM should only be allowed when NUMA is also
defined.  However, at least six architectures: alpha, ia64, m32r, m68k,
mips, parisc violate this.  The panic is a regression against slab, so
just mark slub broken in the problem configuration to prevent users
reporting these panics.

Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ACPI / PM: Avoid infinite recurrence while registering power resources

commit 7bed50c5edf5cba8dd515a31191cbfb6065ddc85 upstream.

There is at least one BIOS with a DSDT containing a power resource
object with a _PR0 entry pointing back to that power resource. In
consequence, while registering that power resource
acpi_bus_get_power_flags() sees that it depends on itself and tries
to register it again, which leads to an infinitely deep recurrence.
This problem was introduced by commit bf325f9538d8c89312be305b9779e
(ACPI / PM: Register power resource devices as soon as they are
needed).

To fix this problem use the observation that power resources cannot
be power manageable and prevent acpi_bus_get_power_flags() from
being called for power resource objects.

References: https://bugzilla.kernel.org/show_bug.cgi?id=31872
Reported-and-tested-by: Pascal Dormeau <pdormeau@free.fr>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Len Brown <lenb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

pfault: fix token handling

commit e35c76cd47c244eaa7a74adaabde4d0a1cadb907 upstream.

f6649a7e "[S390] cleanup lowcore access from external interrupts" changed
handling of external interrupts. Instead of letting the external interrupt
handlers accessing the per cpu lowcore the entry code of the kernel reads
already all fields that are necessary and passes them to the handlers.
The pfault interrupt handler was incorrectly converted. It tries to
dereference a value which used to be a pointer to a lowcore field. After
the conversion however it is not anymore the pointer to the field but its
content. So instead of a dereference only a cast is needed to get the
task pointer that caused the pfault.

Fixes a NULL pointer dereference and a subsequent kernel crash:

Unable to handle kernel pointer dereference at virtual kernel address (null)
Oops: 0004 [#1] SMP
Modules linked in: nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc
                   loop qeth_l3 qeth vmur ccwgroup ext3 jbd mbcache dm_mod
                   dasd_eckd_mod dasd_diag_mod dasd_mod
CPU: 0 Not tainted 2.6.38-2-s390x #1
Process cron (pid: 1106, task: 000000001f962f78, ksp: 000000001fa0f9d0)
Krnl PSW : 0404200180000000 000000000002c03e (pfault_interrupt+0xa2/0x138)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000000 0000000000000001 0000000000000000 0000000000000001
           000000001f962f78 0000000000518968 0000000090000002 000000001ff03280
           0000000000000000 000000000064f000 000000001f962f78 0000000000002603
           0000000006002603 0000000000000000 000000001ff7fe68 000000001ff7fe48
Krnl Code: 000000000002c036: 5820d010            l       %r2,16(%r13)
           000000000002c03a: 1832                lr      %r3,%r2
           000000000002c03c: 1a31                ar      %r3,%r1
          >000000000002c03e: ba23d010            cs      %r2,%r3,16(%r13)
           000000000002c042: a744fffc            brc     4,2c03a
           000000000002c046: a7290002            lghi    %r2,2
           000000000002c04a: e320d0000024        stg     %r2,0(%r13)
           000000000002c050: 07f0                bcr     15,%r0
Call Trace:
([<000000001f962f78>] 0x1f962f78)
  [<000000000001acda>] do_extint+0xf6/0x138
  [<000000000039b6ca>] ext_no_vtime+0x30/0x34
  [<000000007d706e04>] 0x7d706e04
Last Breaking-Event-Address:
  [<0000000000000000>] 0x0

For stable maintainers:
the first kernel which contains this bug is 2.6.37.

Reported-by: Stephen Powell <zlinuxman@wowway.com>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

kvm-390: Let kernel exit SIE instruction on work

commit 9ff4cfb3fcfd48b49fdd9be7381b3be340853aa4 upstream.

From: Christian Borntraeger <borntraeger@de.ibm.com>

This patch fixes the sie exit on interrupts. The low level
interrupt handler returns to the PSW address in pt_regs and not
to the PSW address in the lowcore.
Without this fix a cpu bound guest might never leave guest state
since the host interrupt handler would blindly return to the
SIE instruction, even on need_resched and friends.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

UBIFS: fix false space checking failure

commit 8c230d9a5b5ec7970139acb7e2d165d7a3fe9f9e upstream.

This patch fixes UBIFS mount failure when the debugging support is enabled,
we are recovering from a power cut, we were first mounter R/O and we are
re-mounting R/W. In this case we should not assume that the amount of free
space before we have re-mounted R/W and after are equivalent, because
when we have mounted R/O the file-system is in a non-committed state so
the amount of free space is slightly smaller, due to the fact that we cannot
predict the amount of free space precisely before we commit.

This patch fixes the issue by skipping the debugging check in case of
recovery. This issue was reported by Caizhiyong <caizhiyong@huawei.com>
here: http://thread.gmane.org/gmane.linux.drivers.mtd/34350/focus=34387

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Reported-by: Caizhiyong <caizhiyong@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ath9k_hw: partially revert "fix dma descriptor rx error bit parsing"

commit 115dad7a7f42e68840392767323ceb9306dbdb36 upstream.

The rx error bit parsing was changed to consider PHY errors and various
decryption errors separately. While correct according to the documentation,
this is causing spurious decryption error reports in some situations.

Fix this by restoring the original order of the checks in those places,
where the errors are meant to be mutually exclusive.

If a CRC error is reported, then MIC failure and decryption errors
are irrelevant, and a PHY error is unlikely.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ACPI battery: fribble sysfs files from a resume notifier

commit 25be5821521640eb00b7eb219ffe59664510d073 upstream.

Commit da8aeb92 re-poked the battery on resume, but Linus reports that
it broke his eee and partially reverted it in b23fffd7. Unfortunately
this also results in my x201s giving crack values until the sysfs files
are poked again. In the revert message, it was suggested that we poke it
from a PM notifier, so let's do that.

With this in place, I haven't noticed the units going nutty on my
gnome-power-manager across a dozen suspends or so...

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ideapad: read brightness setting on brightness key notify

commit 2165136585b5c7d6f118f1d90fbde550bb7de212 upstream.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=25922
On ideapad Y530, the brightness key notify will be blocked if the last notify
is not responsed by getting the brightness value. Read value when we get the
notify shall fix the problem and will not have any difference on other ideapads.

Signed-off-by: Ike Panhc <ike.pan@canonical.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

UBIFS: fix master node recovery

commit 6e0d9fd38b750d678bf9fd07db23582f52fafa55 upstream.

This patch fixes the following symptoms:
1. Unmount UBIFS cleanly.
2. Start mounting UBIFS R/W and have a power cut immediately
3. Start mounting UBIFS R/O, this succeeds
4. Try to re-mount UBIFS R/W - this fails immediately or later on,
   because UBIFS will write the master node to the flash area
   which has been written before.

The analysis of the problem:

1. UBIFS is unmounted cleanly, both copies of the master node are clean.
2. UBIFS is being mounter R/W, starts changing master node copy 1, and
   a power cut happens. The copy N1 becomes corrupted.
3. UBIFS is being mounted R/O. It notices the copy N1 is corrupted and
   reads copy N2. Copy N2 is clean.
4. Because of R/O mode, UBIFS cannot recover copy 1.
5. The mount code (ubifs_mount()) sees that the master node is clean,
   so it decides that no recovery is needed.
6. We are re-mounting R/W. UBIFS believes no recovery is needed and
   starts updating the master node, but copy N1 is still corrupted
   and was not recovered!

Fix this problem by marking the master node as dirty every time we
recover it and we are in R/O mode. This forces further recovery and
the UBIFS cleans-up the corruptions and recovers the copy N1 when
re-mounting R/W later.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

kconfig: Avoid buffer underrun in choice input

commit 3ba41621156681afcdbcd624e3191cbc65eb94f4 upstream.

Commit 40aee729b350 ('kconfig: fix default value for choice input')
fixed some cases where kconfig would select the wrong option from a
choice with a single valid option and thus enter an infinite loop.

However, this broke the test for user input of the form 'N?', because
when kconfig selects the single valid option the input is zero-length
and the test will read the byte before the input buffer. If this
happens to contain '?' (as it will in a mips build on Debian unstable
today) then kconfig again enters an infinite loop.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

libata: Implement ATA_FLAG_NO_DIPM and apply it to mcp65

commit ae01b2493c3bf03c504c32ac4ebb01d528508db3 upstream.

NVIDIA mcp65 familiy of controllers cause command timeouts when DIPM
is used. Implement ATA_FLAG_NO_DIPM and apply it.

This problem was reported by Stefan Bader in the following thread.

http://thread.gmane.org/gmane.linux.ide/48841

stable: applicable to 2.6.37 and 38.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ahci: don't enable port irq before handler is registered

commit 7b3a24c57d2eeda8dba9c205342b12689c4679f9 upstream.

The ahci_pmp_attach() & ahci_pmp_detach() unmask port irqs, but they
are also called during port initialization, before ahci host irq
handler is registered. On ce4100 platform, this sometimes triggers
"irq 4: nobody cared" message when loading driver.

Fixed this by not touching the register if the port is in frozen
state, and mark all uninitialized port as frozen.

Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ALSA: hda - Add a fix-up for Acer dmic with ALC271x codec

commit 6981d184376e74391c23c116a068f8d1305f0e57 upstream.

Acer laptops with ALC271x needs a magic initialization for digital-mic
to make it working with mono streams (and PulseAudio).
Added a fix-up applied to Acer with ALC271x generically.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ASoC: codecs: JZ4740: Fix OOPS

commit 1fdf9b49e9e7788d09bad4b08a6a821ac39798f3 upstream.

Commit ce6120cc(ASoC: Decouple DAPM from CODECs) changed the signature of
snd_soc_dapm_widgets_new to take an pointer to a snd_soc_dapm_context instead of
a snd_soc_codec. The call to snd_soc_dapm_widgets_new in jz4740_codec_dev_probe
was not updated to reflect this change, which results in a compiletime warning
and a runtime OOPS.

Since the core code calls snd_soc_dapm_widgets_new after the codec has been
registered it can be dropped here.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ASoC: Fix output PGA enabling in wm_hubs CODECs

commit 39cca168bdfaef9d0c496ec27f292445d6184946 upstream.

The output PGA was not being powered up in headphone and speaker paths,
removing the ability to offer volume control and mute with the output
PGA.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@slimlogic.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

serial/imx: read cts state only after acking cts change irq

commit 5680e94148a86e8c31fdc5cb0ea0d5c6810c05b0 upstream.

If cts changes between reading the level at the cts input (USR1_RTSS)
and acking the irq (USR1_RTSD) the last edge doesn't generate an irq and
uart_handle_cts_change is called with a outdated value for cts.

The race was introduced by commit

ceca629 ([ARM] 2971/1: i.MX uart handle rts irq)

Reported-by: Arwed Springer <Arwed.Springer@de.trumpf.com>
Tested-by: Arwed Springer <Arwed.Springer@de.trumpf.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

tty/n_gsm: fix bug in CRC calculation for gsm1 mode

commit 9db4e4381a8e881ff65a5d3400bfa471f84217e7 upstream.

Problem description:
  gsm_queue() calculate a CRC for arrived frames. As a last step of
  CRC calculation it call

    gsm->fcs = gsm_fcs_add(gsm->fcs, gsm->received_fcs);

  This work perfectly for the case of GSM0 mode as gsm->received_fcs
  contain the last piece of data required to generate final CRC.

  gsm->received_fcs is not used for GSM1 mode. Thus we put an
  additional byte to CRC calculation. As result we get a wrong CRC
  and reject incoming frame.

Signed-off-by: Mikhail Kshevetskiy <mikhail.kshevetskiy@gmail.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/i915/tv: Remember the detected TV type

commit d5627663f2088fa4be447fdcfd52bcb233448d85 upstream.

During detect() we would probe the connection bits to determine if
there was a TV attached, and what video input type (Component, S-Video,
Composite, etc) to use. However, we promptly discarded this vital bit of
information and never propagated it to where it was used to determine
the correct modes and setup the control registers. Fix it!

This fixes a regression from 7b334fcb45b757ffb093696ca3de1b0c8b4a33f1.

Reported-and-tested-by: Mathew McKernan <matmckernan@rauland.com.au>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35977
Signed-off-by: Mathew McKernan <matmckernan@rauland.com.au>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Paul Menzel <paulepanter@users.sourceforge.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/i915: Sanitize the output registers after resume

commit f6e5b1603b8bb7131b6778d0d4e2e5dda120a379 upstream.

Similar to booting, we need to inspect the state left by the BIOS and
remove any conflicting bits before we take over. The example reported by
Seth Forshee is very similar to the bug we encountered with the state left
by grub2, that the crtc pipe<->planning mapping was reversed from our
expectations and so we failed to turn off the outputs when booting or,
in this case, resuming. This may be in fact the same bug, but triggered
at resume time.

This patch rearranges the code we already have to clear up the
conflicting state upon init and calls it from reset (which is called
after we have lost control of the hardware, i.e. along both the boot and
resume paths) instead.

Reported-and-tested-by: Seth Forshee <seth.forshee@canonical.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35796
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/radeon/kms: fix bad shift in atom iio table parser

commit 8e461123f28e6b17456225e70eb834b3b30d28bb upstream.

Noticed by Patrick Lowry.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/nouveau: fix notifier memory corruption bug

commit a18d89ca026140eb8ac4459bf70a01c571dd9a32 upstream.

nouveau_bo_wr32 expects offset to be in words, but we pass value in bytes,
so after commit 73412c3854c877e5f37ad944ee8977addde4d35a ("drm/nouveau: allocate
kernel's notifier object at end of block") we started to overwrite some memory
after notifier buffer object (previously m2mf_ntfy was always 0, so it didn't
matter it was a value in bytes).

Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reported-by: Nigel Cunningham <lkml@nigelcunningham.com.au>
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/radeon/kms: pll tweaks for r7xx

commit 5785e53ffa73f77fb19e378c899027afc07272bc upstream.

Prefer min m to max p only on pre-r7xx asics.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=36197

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

intel-iommu: Fix get_domain_for_dev() error path

commit 2fe9723df8e45fd247782adea244a5e653c30bf4 upstream.

If we run out of domain_ids and fail iommu_attach_domain(), we
fall into domain_exit() without having setup enough of the
domain structure for this to do anything useful. In fact, it
typically runs off into the weeds walking the bogus domain->devices
list. Just free the domain.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

intel-iommu: Unlink domain from iommu

commit a97590e56d0d58e1dd262353f7cbd84e81d8e600 upstream.

When we remove a device, we unlink the iommu from the domain, but
we never do the reverse unlinking of the domain from the iommu.
This means that we never clear iommu->domain_ids, eventually leading
to resource exhaustion if we repeatedly bind and unbind a device
to a driver. Also free empty domains to avoid a resource leak.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

intel-iommu: Fix use after release during device attach

commit 7a6610139a1e1d9297dd1c5d178022eac36839cb upstream.

Obtain the new pgd pointer before releasing the page containing this
value.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, gart: Make sure GART does not map physmem above 1TB

commit 665d3e2af83c8fbd149534db8f57d82fa6fa6753 upstream.

The GART can only map physical memory below 1TB. Make sure
the gart driver in the kernel does not try to map memory
above 1TB.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Link: http://lkml.kernel.org/r/1303134346-5805-5-git-send-email-joerg.roedel@amd.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, gart: Set DISTLBWALKPRB bit always

commit c34151a742d84ae65db2088ea30495063f697fbe upstream.

The DISTLBWALKPRB bit must be set for the GART because the
gatt table is mapped UC. But the current code does not set
the bit at boot when the BIOS setup the aperture correctly.
Fix that by setting this bit when enabling the GART instead
of the other places.

Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Link: http://lkml.kernel.org/r/1303134346-5805-4-git-send-email-joerg.roedel@amd.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

virtio: Decrement avail idx on buffer detach

commit b3258ff1d6086bd2b9eeb556844a868ad7d49bc8 upstream.

When detaching a buffer from a vq, the avail.idx value should be
decremented as well.

This was noticed by hot-unplugging a virtio console port and then
plugging in a new one on the same number (re-using the vqs which were
just 'disowned'). qemu reported

'Guest moved used index from 0 to 256'

when any IO was attempted on the new port.

Reported-by: juzhang <juzhang@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

nfsd4: Fix filp leak

commit a96e5b90804be8b540d30f4a1453fc87f95b3149 upstream.

23fcf2ec93fb8573a653408316af599939ff9a8e (nfsd4: fix oops on lock failure)

The above patch breaks free path for stp->st_file. If stp was inserted
into sop->so_stateids, we have to free stp->st_file refcount. Because
stp->st_file refcount itself is taken whether or not any refcounts are
taken on the stp->st_file->fi_fds[].

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

nfsd4: fix struct file leak on delegation

commit 4ee63624fd927376b97ead3a8d00728d437bc8e8 upstream.

Introduced by acfdf5c383b38f7f4dddae41b97c97f1ae058f49.

Reported-by: Gerhard Heift <ml-nfs-linux-20110412-ef47@gheift.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

qlcnic: limit skb frags for non tso packet

commit 91a403caf0f26c71ce4407fd235b2d6fb225fba9 upstream.

Machines are getting deadlock in four node cluster environment.
All nodes are accessing (find /gfs2 -depth -print|cpio -ocv > /dev/null)
200 GB storage on a GFS2 filesystem.
This result in memory fragmentation and driver receives 18 frags for
1448 byte packets.
For non tso packet, fw drops the tx request, if it has >14 frags.

Fixing it by pulling extra frags.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

p54: Initialize extra_len in p54_tx_80211

commit a6756da9eace8b4af73e9dea43f1fc2889224c94 upstream.

This patch fixes a very serious off-by-one bug in
the driver, which could leave the device in an
unresponsive state.

The problem was that the extra_len variable [used to
reserve extra scratch buffer space for the firmware]
was left uninitialized. Because p54_assign_address
later needs the value to reserve additional space,
the resulting frame could be to big for the small
device's memory window and everything would
immediately come to a grinding halt.

Reference: https://bugs.launchpad.net/bugs/722185

Acked-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: Jason Conti <jason.conti@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

block, blk-sysfs: Fix an err return path in blk_register_queue()

commit ed5302d3c25006a9edc7a7fbea97a30483f89ef7 upstream.

We do not call blk_trace_remove_sysfs() in err return path
if kobject_add() fails. This path fixes it.

Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ath: add missing regdomain pair 0x5c mapping

commit bd39a274fb7b43374c797bafdb7f506598f36f77 upstream.

Joe Culler reported a problem with his AR9170 device:

> ath: EEPROM regdomain: 0x5c
> ath: EEPROM indicates we should expect a direct regpair map
> ath: invalid regulatory domain/country code 0x5c
> ath: Invalid EEPROM contents

It turned out that the regdomain 'APL7_FCCA' was not mapped yet.
According to Luis R. Rodriguez [Atheros' engineer] APL7 maps to
FCC_CTL and FCCA maps to FCC_CTL as well, so the attached patch
should be correct.

Reported-by: Joe Culler <joe.culler@gmail.com>
Acked-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

netxen: limit skb frags for non tso packet

commit c968bdf6912cad6d0fc63d7037cc1c870604a808 upstream.

Machines are getting deadlock in four node cluster environment.
All nodes are accessing (find /gfs2 -depth -print|cpio -ocv > /dev/null)
200 GB storage on a GFS2 filesystem.
This result in memory fragmentation and driver receives 18 frags for
1448 byte packets.
For non tso packet, fw drops the tx request, if it has >14 frags.

Fixing it by pulling extra frags.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ath9k_hw: fix stopping rx DMA during resets

commit 5882da02e9d9089b7e8c739f3e774aaeeff8b7ba upstream.

During PHY errors, the MAC can sometimes fail to enter an idle state on older
hardware (before AR9380) after an rx stop has been requested.

This typically shows up in the kernel log with messages like these:

ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
------------[ cut here ]------------
WARNING: at drivers/net/wireless/ath/ath9k/recv.c:504 ath_stoprecv+0xcc/0xf0 [ath9k]()
Call Trace:
[<8023f0e8>] dump_stack+0x8/0x34
[<80075050>] warn_slowpath_common+0x78/0xa4
[<80075094>] warn_slowpath_null+0x18/0x24
[<80d66d60>] ath_stoprecv+0xcc/0xf0 [ath9k]
[<80d642cc>] ath_set_channel+0xbc/0x270 [ath9k]
[<80d65254>] ath_radio_disable+0x4a4/0x7fc [ath9k]

When this happens, the state that the MAC enters is easy to identify and
does not result in bogus DMA traffic, however to ensure a working state
after a channel change, the hardware should still be reset.

This patch adds detection for this specific MAC state, after which the above
warnings completely disappear in my tests.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: Kyungwan Nam <Kyungwan.Nam@Atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Linux 2.6.38.4

ip: ip_options_compile() resilient to NULL skb route

commit c65353daf137dd41f3ede3baf62d561fca076228 upstream.

Scot Doyle demonstrated ip_options_compile() could be called with an skb
without an attached route, using a setup involving a bridge, netfilter,
and forged IP packets.

Let's make ip_options_compile() and ip_options_rcv_srr() a bit more
robust, instead of changing bridge/netfilter code.

With help from Hiroaki SHIMODA.

Reported-by: Scot Doyle <lkml@scotdoyle.com>
Tested-by: Scot Doyle <lkml@scotdoyle.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bridge: reset IPCB in br_parse_ip_options

commit f8e9881c2aef1e982e5abc25c046820cd0b7cf64 upstream.

Commit 462fb2af9788a82 (bridge : Sanitize skb before it enters the IP
stack), missed one IPCB init before calling ip_options_compile()

Thanks to Scot Doyle for his tests and bug reports.

Reported-by: Scot Doyle <lkml@scotdoyle.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Acked-by: Bandan Das <bandan.das@stratus.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jan Lübbe <jluebbe@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

perf tool: Fix gcc 4.6.0 issues

commit fb7d0b3cefb80a105f7fd26bbc62e0cbf9192822 upstream.

GCC 4.6.0 in Fedora rawhide turned up some compile errors in tools/perf
due to the -Werror=unused-but-set-variable flag.

I've gone through and annotated some of the assignments that had side
effects (ie: return value from a function) with the __used annotation,
and in some cases, just removed unused code.

In a few cases, we were assigning something useful, but not using it in
later parts of the function.

kyle@dreadnought:~/src% gcc --version
gcc (GCC) 4.6.0 20110122 (Red Hat 4.6.0-0.3)

Cc: Ingo Molnar <mingo@redhat.com>
LKML-Reference: <20110124161304.GK27353@bombadil.infradead.org>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
[ committer note: Fixed up the annotation fixes, as that code moved recently ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[Backported to 2.6.38.2 by deleting unused but set variables]
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Bluetooth: Fix HCI_RESET command synchronization

commit f630cf0d5434e3923e1b8226ffa2753ead6b0ce5 upstream.

We can't send new commands before a cmd_complete for the HCI_RESET command
shows up.

Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Reported-by: Justin P. Mattock <justinmattock@gmail.com>
Reported-by: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Tested-by: Justin P. Mattock <justinmattock@gmail.com>
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Tested-by: Ed Tomlinson <edt@aei.ca>

radeon: Fix KMS CP writeback on big endian machines.

commit dc66b325f161bb651493c7d96ad44876b629cf6a upstream.

This is necessary even with PCI(e) GART, and it makes writeback work even with
AGP on my PowerBook. Might still be unreliable with older revisions of UniNorth
and other AGP bridges though.

Signed-off-by: Michel Dänzer <daenzer@vmware.com>
Reviewed-by: Alex Deucher <alex.deucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: Fix unplug of device with active streams

commit b214f191d95ba4b5a35aebd69cd129cf7e3b1884 upstream.

If I unplug a device while the UAS driver is loaded, I get an oops
in usb_free_streams().  This is because usb_unbind_interface() calls
usb_disable_interface() which calls usb_disable_endpoint() which sets
ep_out and ep_in to NULL.  Then the UAS driver calls usb_pipe_endpoint()
which returns a NULL pointer and passes an array of NULL pointers to
usb_free_streams().

I think the correct fix for this is to check for the NULL pointer
in usb_free_streams() rather than making the driver check for this
situation.  My original patch for this checked for dev->state ==
USB_STATE_NOTATTACHED, but the call to usb_disable_interface() is
conditional, so not all drivers would want this check.

Note from Sarah Sharp: This patch does avoid a potential dereference,
but the real fix (which will be implemented later) is to set the
.soft_unbind flag in the usb_driver structure for the UAS driver, and
all drivers that allocate streams.  The driver should free any streams
when it is unbound from the interface.  This avoids leaking stream rings
in the xHCI driver when usb_disable_interface() is called.

This should be queued for stable trees back to 2.6.35.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: xhci - also free streams when resetting devices

commit 2dea75d96ade3c7cd2bfe73f99c7b3291dc3d03a upstream.

Currently, when resetting a device, xHCI driver disables all but one
endpoints and frees their rings, but leaves alone any streams that
might have been allocated. Later, when users try to free allocated
streams, we oops in xhci_setup_no_streams_ep_input_ctx() because
ep->ring is NULL.

Let's free not only rings but also stream data as well, so that
calling free_streams() on a device that was reset will be safe.

This should be queued for stable trees back to 2.6.35.

Reviewed-by: Micah Elizabeth Scott <micah@vmware.com>
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: xhci - fix math in xhci_get_endpoint_interval()

commit dfa49c4ad120a784ef1ff0717168aa79f55a483a upstream.

When parsing exponent-expressed intervals we subtract 1 from the
value and then expect it to match with original + 1, which is
highly unlikely, and we end with frequent spew:

usb 3-4: ep 0x83 - rounding interval to 512 microframes

Also, parsing interval for fullspeed isochronous endpoints was
incorrect - according to USB spec they use exponent-based
intervals (but xHCI spec claims frame-based intervals). I trust
USB spec more, especially since USB core agrees with it.

This should be queued for stable kernels back to 2.6.31.

Reviewed-by: Micah Elizabeth Scott <micah@vmware.com>
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: xhci - fix unsafe macro definitions

commit 5a6c2f3ff039154872ce597952f8b8900ea0d732 upstream.

Macro arguments used in expressions need to be enclosed in parenthesis
to avoid unpleasant surprises.

This should be queued for kernels back to 2.6.31

Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: fix formatting of SuperSpeed endpoints in /proc/bus/usb/devices

commit 2868a2b1ba8f9c7f6c4170519ebb6c62934df70e upstream.

Isochronous and interrupt SuperSpeed endpoints use the same mechanisms
for decoding bInterval values as HighSpeed ones so adjust the code
accordingly.

Also bandwidth reservation for SuperSpeed matches highspeed, not
low/full speed.

Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: EHCI: unlink unused QHs when the controller is stopped

commit 94ae4976e253757e9b03a44d27d41b20f1829d80 upstream.

This patch (as1458) fixes a problem affecting ultra-reliable systems:
When hardware failover of an EHCI controller occurs, the data
structures do not get released correctly. This is because the routine
responsible for removing unused QHs from the async schedule assumes
the controller is running properly (the frame counter is used in
determining how long the QH has been idle) -- but when a failover
causes the controller to be electronically disconnected from the PCI
bus, obviously it stops running.

The solution is simple: Allow scan_async() to remove a QH from the
async schedule if it has been idle for long enough _or_ if the
controller is stopped.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-Tested-by: Dan Duval <dan.duval@stratus.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

usb: qcserial add missing errorpath kfrees

commit cb62d65f966146a39fdde548cb474dacf1d00fa5 upstream.

There are two -ENODEV error paths in qcprobe where the allocated private
data is not freed, this patch adds the two missing kfrees to avoid
leaking memory on the error path

Signed-off-by: Steven Hardy <shardy@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

usb: qcserial avoid pointing to freed memory

commit 99ab3f9e4eaec35fd2d7159c31b71f17f7e613e3 upstream.

Rework the qcprobe logic such that serial->private is not set when
qcprobe exits with -ENODEV, otherwise serial->private will point to freed
memory on -ENODEV

Signed-off-by: Steven Hardy <shardy@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

usb: Fix qcserial memory leak on rmmod

commit 10c9ab15d6aee153968d150c05b3ee3df89673de upstream.

qcprobe function allocates serial->private but this is never freed, this
patch adds a new function qc_release() which frees serial->private, after
calling usb_wwan_release

Signed-off-by: Steven Hardy <shardy@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

powerpc/perf_event: Skip updating kernel counters if register value shrinks

commit 86c74ab317c1ef4d37325e0d7ca8a01a796b0bd7 upstream.

Because of speculative event roll back, it is possible for some event coutners
to decrease between reads on POWER7.  This causes a problem with the way that
counters are updated.  Delta calues are calculated in a 64 bit value and the
top 32 bits are masked.  If the register value has decreased, this leaves us
with a very large positive value added to the kernel counters.  This patch
protects against this by skipping the update if the delta would be negative.
This can lead to a lack of precision in the coutner values, but from my testing
the value is typcially fewer than 10 samples at a time.

Signed-off-by: Eric B Munson <emunson@mgebm.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

powerpc: Fix oops if scan_dispatch_log is called too early

commit 84ffae55af79d7b8834fd0c08d0d1ebf2c77f91e upstream.

We currently enable interrupts before the dispatch log for the boot
cpu is setup. If a timer interrupt comes in early enough we oops in
scan_dispatch_log:

Unable to handle kernel paging request for data at address 0x00000010

...

.scan_dispatch_log+0xb0/0x170
.account_system_vtime+0xa0/0x220
.irq_enter+0x88/0xc0
.do_IRQ+0x48/0x230

The patch below adds a check to scan_dispatch_log to ensure the
dispatch log has been allocated.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

proc: do proper range check on readdir offset

commit d8bdc59f215e62098bc5b4256fd9928bf27053a1 upstream.

Rather than pass in some random truncated offset to the pid-related
functions, check that the offset is in range up-front.

This is just cleanup, the previous commit fixed the real problem.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

next_pidmap: fix overflow condition

commit c78193e9c7bcbf25b8237ad0dec82f805c4ea69b upstream.

next_pidmap() just quietly accepted whatever 'last' pid that was passed
in, which is not all that safe when one of the users is /proc.

Admittedly the proc code should do some sanity checking on the range
(and that will be the next commit), but that doesn't mean that the
helper functions should just do that pidmap pointer arithmetic without
checking the range of its arguments.

So clamp 'last' to PID_MAX_LIMIT. The fact that we then do "last+1"
doesn't really matter, the for-loop does check against the end of the
pidmap array properly (it's only the actual pointer arithmetic overflow
case we need to worry about, and going one bit beyond isn't going to
overflow).

[ Use PID_MAX_LIMIT rather than pid_max as per Eric Biederman ]

Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com>
Analyzed-by: Robert Święcki <robert@swiecki.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: option: Added support for Samsung GT-B3730/GT-B3710 LTE USB modem.

commit 80f9df3e0093ad9f1eeefd2ff7fd27daaa518d25 upstream.

Bind only modem AT command endpoint to option.

Signed-off-by: Marius B. Kotsbak <marius@kotsbak.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: ftdi_sio: add ids for Hameg HO720 and HO730

commit c53c2fab40cf16e13af66f40bfd27200cda98d2f upstream.

usb serial: ftdi_sio: add two missing USB ID's for Hameg interfaces HO720
and HO730

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: ftdi_sio: add PID for OCT DK201 docking station

commit 11a31d84129dc3133417d626643d714c9df5317e upstream.

Add PID 0x0103 for serial port of the OCT DK201 docking station.

Reported-by: Jan Hoogenraad <jan@hoogenraad.net>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: ftdi_sio: Added IDs for CTI USB Serial Devices

commit 5a9443f08c83c294c5c806a689c1184b27cb26b3 upstream.

I added new ProdutIds for two devices from CTI GmbH Leipzig.

Signed-off-by: Christian Simon <simon@swine.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

usb: musb: temporarily make it bool

commit 7a180e70cfc56e131bfe4796773df2acfc7d4180 upstream.

Due to the recent changes to musb's glue layers,
we can't compile musb-hdrc as a module - compilation
will break due to undefined symbol musb_debug. In
order to fix that, we need a big re-work of the
debug support on the MUSB driver.

Because that would mean a lot of new code coming
into the -rc series, it's best to defer that to
next merge window and for now just disable module
support for MUSB.

Once we get the refactor of the debugging support
done, we can simply revert this patch and things
will go back to normal again.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

brk: COMPAT_BRK: fix detection of randomized brk

commit 4471a675dfc7ca676c165079e91c712b09dc9ce4 upstream.

5520e89 ("brk: fix min_brk lower bound computation for COMPAT_BRK")
tried to get the whole logic of brk randomization for legacy
(libc5-based) applications finally right.

It turns out that the way to detect whether brk has actually been
randomized in the end or not introduced by that patch still doesn't work
for those binaries, as reported by Geert:

: /sbin/init from my old m68k ramdisk exists prematurely.
:
: Before the patch:
:
: | brk(0x80005c8e) = 0x80006000
:
: After the patch:
:
: | brk(0x80005c8e) = 0x80005c8e
:
: Old libc5 considers brk() to have failed if the return value is not
: identical to the requested value.

I don't like it, but currently see no better option than a bit flag in
task_struct to catch the CONFIG_COMPAT_BRK && randomize_va_space == 2
case.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

vmscan: all_unreclaimable() use zone->all_unreclaimable as a name

commit 929bea7c714220fc76ce3f75bef9056477c28e74 upstream.

all_unreclaimable check in direct reclaim has been introduced at 2.6.19
by following commit.

2006 Sep 25; commit 408d8544; oom: use unreclaimable info

And it went through strange history. firstly, following commit broke
the logic unintentionally.

2008 Apr 29; commit a41f24ea; page allocator: smarter retry of
      costly-order allocations

Two years later, I've found obvious meaningless code fragment and
restored original intention by following commit.

2010 Jun 04; commit bb21c7ce; vmscan: fix do_try_to_free_pages()
      return value when priority==0

But, the logic didn't works when 32bit highmem system goes hibernation
and Minchan slightly changed the algorithm and fixed it .

2010 Sep 22: commit d1908362: vmscan: check all_unreclaimable
      in direct reclaim path

But, recently, Andrey Vagin found the new corner case. Look,

struct zone {
  ..
        int                     all_unreclaimable;
  ..
        unsigned long           pages_scanned;
  ..
}

zone->all_unreclaimable and zone->pages_scanned are neigher atomic
variables nor protected by lock.  Therefore zones can become a state of
zone->page_scanned=0 and zone->all_unreclaimable=1.  In this case, current
all_unreclaimable() return false even though zone->all_unreclaimabe=1.

This resulted in the kernel hanging up when executing a loop of the form

1. fork
2. mmap
3. touch memory
4. read memory
5. munmmap

as described in
http://www.gossamer-threads.com/lists/linux/kernel/1348725#1348725

Is this ignorable minor issue?  No.  Unfortunately, x86 has very small dma
zone and it become zone->all_unreclamble=1 easily.  and if it become
all_unreclaimable=1, it never restore all_unreclaimable=0.  Why?  if
all_unreclaimable=1, vmscan only try DEF_PRIORITY reclaim and
a-few-lru-pages>>DEF_PRIORITY always makes 0.  that mean no page scan at
all!

Eventually, oom-killer never works on such systems.  That said, we can't
use zone->pages_scanned for this purpose.  This patch restore
all_unreclaimable() use zone->all_unreclaimable as old.  and in addition,
to add oom_killer_disabled check to avoid reintroduce the issue of commit
d1908362 ("vmscan: check all_unreclaimable in direct reclaim path").

Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sched: Fix erroneous all_pinned logic

commit b30aef17f71cf9e24b10c11cbb5e5f0ebe8a85ab upstream.

The scheduler load balancer has specific code to deal with cases of
unbalanced system due to lots of unmovable tasks (for example because of
hard CPU affinity). In those situation, it excludes the busiest CPU that
has pinned tasks for load balance consideration such that it can perform
second 2nd load balance pass on the rest of the system.

This all works as designed if there is only one cgroup in the system.

However, when we have multiple cgroups, this logic has false positives and
triggers multiple load balance passes despite there are actually no pinned
tasks at all.

The reason it has false positives is that the all pinned logic is deep in
the lowest function of can_migrate_task() and is too low level:

load_balance_fair() iterates each task group and calls balance_tasks() to
migrate target load. Along the way, balance_tasks() will also set a
all_pinned variable. Given that task-groups are iterated, this all_pinned
variable is essentially the status of last group in the scanning process.
Task group can have number of reasons that no load being migrated, none
due to cpu affinity. However, this status bit is being propagated back up
to the higher level load_balance(), which incorrectly think that no tasks
were moved. It kick off the all pinned logic and start multiple passes
attempt to move load onto puller CPU.

To fix this, move the all_pinned aggregation up at the iterator level.
This ensures that the status is aggregated over all task-groups, not just
last one in the list.

Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/BANLkTi=ernzNawaR5tJZEsV_QVnfxqXmsQ@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

RTC: add missing "return 0" in new alarm func for rtc-bfin.c

commit 8c122b96866580c99e44f3f07ac93a993d964ec3 upstream.

The new bfin_rtc_alarm_irq_enable function forgot to add a "return 0" to
the end leading to the build warning:
drivers/rtc/rtc-bfin.c: In function 'bfin_rtc_alarm_irq_enable':
drivers/rtc/rtc-bfin.c:253: warning: control reaches end of non-void function

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

i2c-algo-bit: Call pre/post_xfer for bit_test

commit d3b3e15da14ded61c9654db05863b04a2435f4cc upstream.

Apparently some distros set i2c-algo-bit.bit_test to 1 by
default.  In some cases this causes i2c_bit_add_bus
to fail and prevents the i2c bus from being added.  In the
radeon case, we fail to add the ddc i2c buses which prevents
the driver from being able to detect attached monitors.
The i2c bus works fine even if bit_test fails.  This is likely
due to gpio switching that is required and handled in the
pre/post_xfer hooks, so call the pre/post_xfer hooks in the
bit test as well.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=36221

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ARM: 6864/1: hw_breakpoint: clear DBGVCR out of reset

commit e89c0d7090c54d7b11b9b091e495a1ae345dd3ff upstream.

The DBGVCR, used for configuring vector catch debug events, is UNKNOWN
out of reset on ARMv7. When enabling monitor mode, this must be zeroed
to avoid UNPREDICTABLE behaviour.

This patch adds the zeroing code to the debug reset path.

Reported-by: Stepan Moskovchenko <stepanm@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

vfs: Fix absolute RCU path walk failures due to uninitialized seq number

commit c1530019e311c91d14b24d8e74d233152d806e45 upstream.

During RCU walk in path_lookupat and path_openat, the rcu lookup
frequently failed if looking up an absolute path, because when root
directory was looked up, seq number was not properly set in nameidata.

We dropped out of RCU walk in nameidata_drop_rcu due to mismatch in
directory entry's seq number. We reverted to slow path walk that need
to take references.

With the following patch, I saw a 50% increase in an exim mail server
benchmark throughput on a 4-socket Nehalem-EX system.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, amd: Disable GartTlbWlkErr when BIOS forgets it

commit 5bbc097d890409d8eff4e3f1d26f11a9d6b7c07e upstream.

This patch disables GartTlbWlk errors on AMD Fam10h CPUs if
the BIOS forgets to do is (or is just too old). Letting
these errors enabled can cause a sync-flood on the CPU
causing a reboot.

The AMD BKDG recommends disabling GART TLB Wlk Error completely.

This patch is the fix for

https://bugzilla.kernel.org/show_bug.cgi?id=33012

on my machine.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Link: http://lkml.kernel.org/r/20110415131152.GJ18463@8bytes.org
Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, AMD: Set ARAT feature on AMD processors

commit b87cf80af3ba4b4c008b4face3c68d604e1715c6 upstream.

Support for Always Running APIC timer (ARAT) was introduced in
commit db954b5898dd3ef3ef93f4144158ea8f97deb058. This feature
allows us to avoid switching timers from LAPIC to something else
(e.g. HPET) and go into timer broadcasts when entering deep
C-states.

AMD processors don't provide a CPUID bit for that feature but
they also keep APIC timers running in deep C-states (except for
cases when the processor is affected by erratum 400). Therefore
we should set ARAT feature bit on AMD CPUs.

Tested-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
LKML-Reference: <1300205624-4813-1-git-send-email-ostr@amd64.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

UBIFS: fix oops when R/O file-system is fsync'ed

commit 78530bf7f2559b317c04991b52217c1608d5a58d upstream.

This patch fixes severe UBIFS bug: UBIFS oopses when we 'fsync()' an
file on R/O-mounter file-system. We (the UBIFS authors) incorrectly
thought that VFS would not propagate 'fsync()' down to the file-system
if it is read-only, but this is not the case.

It is easy to exploit this bug using the following simple perl script:

use strict;
use File::Sync qw(fsync sync);

die "File path is not specified" if not defined $ARGV[0];
my $path = $ARGV[0];

open FILE, "<", "$path" or die "Cannot open $path: $!";
fsync(\*FILE) or die "cannot fsync $path: $!";
close FILE or die "Cannot close $path: $!";

Thanks to Reuben Dowle <Reuben.Dowle@navico.com> for reporting about this
issue.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Reported-by: Reuben Dowle <Reuben.Dowle@navico.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

MAINTAINERS: update STABLE BRANCH info

commit d00ebeac5f24f290636f7a895dafc124b2930a08 upstream.

Drop Chris Wright from STABLE maintainers. He hasn't done STABLE release
work for quite some time.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

oom-kill: remove boost_dying_task_prio()

commit 341aea2bc48bf652777fb015cc2b3dfa9a451817 upstream.

This is an almost-revert of commit 93b43fa ("oom: give the dying task a
higher priority").

That commit dramatically improved oom killer logic when a fork-bomb
occurs.  But I've found that it has nasty corner case.  Now cpu cgroup has
strange default RT runtime.  It's 0!  That said, if a process under cpu
cgroup promote RT scheduling class, the process never run at all.

If an admin inserts a !RT process into a cpu cgroup by setting
rtruntime=0, usually it runs perfectly because a !RT task isn't affected
by the rtruntime knob.  But if it promotes an RT task via an explicit
setscheduler() syscall or an OOM, the task can't run at all.  In short,
the oom killer doesn't work at all if admins are using cpu cgroup and don't
touch the rtruntime knob.

Eventually, kernel may hang up when oom kill occur.  I and the original
author Luis agreed to disable this logic.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Luis Claudio R. Goncalves <lclaudio@uudg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ramfs: fix memleak on no-mmu arch

commit b836aec53e2bce71de1d5415313380688c851477 upstream.

On no-mmu arch, there is a memleak during shmem test.  The cause of this
memleak is ramfs_nommu_expand_for_mapping() added page refcount to 2
which makes iput() can't free that pages.

The simple test file is like this:

  int main(void)
  {
int i;
key_t k = ftok("/etc", 42);

for ( i=0; i<100; ++i) {
int id = shmget(k, 10000, 0644|IPC_CREAT);
if (id == -1) {
printf("shmget error\n");
}
if(shmctl(id, IPC_RMID, NULL ) == -1) {
printf("shm  rm error\n");
return -1;
}
}
printf("run ok...\n");
return 0;
  }

And the result:

  root:/> free
               total         used         free       shared      buffers
  Mem:         60320        17912        42408            0            0
  -/+ buffers:              17912        42408
  root:/> shmem
  run ok...
  root:/> free
               total         used         free       shared      buffers
  Mem:         60320        19096        41224            0            0
  -/+ buffers:              19096        41224
  root:/> shmem
  run ok...
  root:/> free
               total         used         free       shared      buffers
  Mem:         60320        20296        40024            0            0
  -/+ buffers:              20296        40024
  ...

After this patch the test result is:(no memleak anymore)

  root:/> free
               total         used         free       shared      buffers
  Mem:         60320        16668        43652            0            0
  -/+ buffers:              16668        43652
  root:/> shmem
  run ok...
  root:/> free
               total         used         free       shared      buffers
  Mem:         60320        16668        43652            0            0
  -/+ buffers:              16668        43652

Signed-off-by: Bob Liu <lliubbo@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mm/thp: use conventional format for boolean attributes

commit e27e6151b154ff6e5e8162efa291bc60196d29ea upstream.

The conventional format for boolean attributes in sysfs is numeric ("0" or
"1" followed by new-line). Any boolean attribute can then be read and
written using a generic function. Using the strings "yes [no]", "[yes]
no" (read), "yes" and "no" (write) will frustrate this.

[akpm@linux-foundation.org: use kstrtoul()]
[akpm@linux-foundation.org: test_bit() doesn't return 1/0, per Neil]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Tested-by: David Rientjes <rientjes@google.com>
Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

kstrto*: converting strings to integers done (hopefully) right

commit 33ee3b2e2eb9b4b6c64dcf9ed66e2ac3124e748c upstream.

1. simple_strto*() do not contain overflow checks and crufty,
   libc way to indicate failure.
2. strict_strto*() also do not have overflow checks but the name and
   comments pretend they do.
3. Both families have only "long long" and "long" variants,
   but users want strtou8()
4. Both "simple" and "strict" prefixes are wrong:
   Simple doesn't exactly say what's so simple, strict should not exist
   because conversion should be strict by default.

The solution is to use "k" prefix and add convertors for more types.
Enter
kstrtoull()
kstrtoll()
kstrtoul()
kstrtol()
kstrtouint()
kstrtoint()

kstrtou64()
kstrtos64()
kstrtou32()
kstrtos32()
kstrtou16()
kstrtos16()
kstrtou8()
kstrtos8()

Include runtime testsuite (somewhat incomplete) as well.

strict_strto*() become deprecated, stubbed to kstrto*() and
eventually will be removed altogether.

Use kstrto*() in code today!

Note: on some archs _kstrtoul() and _kstrtol() are left in tree, even if
      they'll be unused at runtime. This is temporarily solution,
      because I don't want to hardcode list of archs where these
      functions aren't needed. Current solution with sizeof() and
      __alignof__ at least always works.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

futex: Set FLAGS_HAS_TIMEOUT during futex_wait restart setup

commit 0cd9c6494ee5c19aef085152bc37f3a4e774a9e1 upstream.

The FLAGS_HAS_TIMEOUT flag was not getting set, causing the restart_block to
restart futex_wait() without a timeout after a signal.

Commit b41277dc7a18ee332d in 2.6.38 introduced the regression by accidentally
removing the the FLAGS_HAS_TIMEOUT assignment from futex_wait() during the setup
of the restart block. Restore the originaly behavior.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=32922
Reported-by: Tim Smith <tsmith201104@yahoo.com>
Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Kacur <jkacur@redhat.com>
Link: http://lkml.kernel.org/r/%3Cdaac0eb3af607f72b9a4d3126b2ba8fb5ed3b883.1302820917.git.dvhart%40linux.intel.com%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc64: Fix build errors with gcc-4.6.0

[ Upstream commit c6fee0810df4e0f4cf9c4834d2569ca01c02cffc ]

Most of the warnings emitted (we fail arch/sparc file
builds with -Werror) were legitimate but harmless, however
one case (n2_pcr_write) was a genuine bug.

Based almost entirely upon a patch by Sam Ravnborg.

Reported-by: Dennis Gilmore <dennis@ausil.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc32: Pass task_struct to schedule_tail() in ret_from_fork

[ Upstream commit 47c7c97a93a5b8f719093dbf83555090b3b8228b ]

We have to pass task_struct of previous process to function
schedule_tail(). Currently in ret_from_fork previous thread_info
is passed:

switch_to: mov %g6, %g3 /* previous thread_info in g6 */

ret_from_fork: call schedule_tail
mov %g3, %o0 /* previous thread_info is passed */

void schedule_tail(struct task_struct *prev);

Signed-off-by: Tkhai Kirill <tkhai@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc32: Fix might-be-used-uninitialized warning in do_sparc_fault().

[ Upstream commit c816be7b5f24585baa9eba1f2413935f771d6ad6 ]

When we try to handle vmalloc faults, we can take a code
path which uses "code" before we actually set it.

Amusingly gcc-3.3 notices this yet gcc-4.x does not.

Reported-by: Bob Breuer <breuerr@mc.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc: Fix .size directive for do_int_load

[ Upstream commit 35043c428f1fcb92feb5792f5878a8852ee00771 ]

gas used to accept (and ignore?) .size directives which referred to
undefined symbols, as this does. In binutils 2.21 these are treated
as errors.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bridge: Reset IPCB when entering IP stack on NF_FORWARD

[ Upstream commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e ]

Whenever we enter the IP stack proper from bridge netfilter we
need to ensure that the skb is in a form the IP stack expects
it to be in.

The entry point on NF_FORWARD did not meet the requirements of
the IP stack, therefore leading to potential crashes/panics.

This patch fixes the problem.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

vlan: should take into account needed_headroom

[ Upstream commit d870bfb9d366c5d466c0f5419a4ec95a3f71ea8a ]

Commit c95b819ad7 (gre: Use needed_headroom)
made gre use needed_headroom instead of hard_header_len

This uncover a bug in vlan code.

We should make sure vlan devices take into account their
real_dev->needed_headroom or we risk a crash in ipgre_header(), because
we dont have enough room to push IP header in skb.

Reported-by: Diddi Oscarsson <diddi@diddi.se>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

xfrm: Refcount destination entry on xfrm_lookup

[ Upstream commit fbd5060875d25f7764fd1c3d35b83a8ed1d88d7b ]

We return a destination entry without refcount if a socket
policy is found in xfrm_lookup. This triggers a warning on
a negative refcount when freeeing this dst entry. So take
a refcount in this case to fix it.

This refcount was forgotten when xfrm changed to cache bundles
instead of policies for outgoing flows.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

tcp: avoid cwnd moderation in undo

[ Upstream commit 67d4120a1793138bc9f4a6eb61d0fc5298ed97e0 ]

In the current undo logic, cwnd is moderated after it was restored
to the value prior entering fast-recovery. It was moderated first
in tcp_try_undo_recovery then again in tcp_complete_cwr.

Since the undo indicates recovery was false, these moderations
are not necessary. If the undo is triggered when most of the
outstanding data have been acknowledged, the (restored) cwnd is
falsely pulled down to a small value.

This patch removes these cwnd moderations if cwnd is undone
a) during fast-recovery
b) by receiving DSACKs past fast-recovery

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sctp: Pass __GFP_NOWARN to hash table allocation attempts.

[ Upstream commit a84b50ceb7d640437d0dc28a2bef0d0de054de89 ]

Like DCCP and other similar pieces of code, there are mechanisms
here to try allocating smaller hash tables if the allocation
fails. So pass in __GFP_NOWARN like the others do instead of
emitting a scary message.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

pppoe: drop PPPOX_ZOMBIEs in pppoe_flush_dev

[ Upstream commit ae07b0b221b6ab2edf9e3abd518aec6cd3f1ba66 ]

otherwise we loop forever if a PPPoE socket was set
to PPPOX_ZOMBIE state by a PADT message when the
ethernet device is going down afterwards.

Signed-off-by: Ulrich Weber <uweber@astaro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

net_sched: fix ip_tos2prio

[ Upstream commit 4a2b9c3756077c05dd8666e458a751d2248b61b6 ]

ECN support incorrectly maps ECN BESTEFFORT packets to TC_PRIO_FILLER
(1) instead of TC_PRIO_BESTEFFORT (0)

This means ECN enabled flows are placed in pfifo_fast/prio low priority
band, giving ECN enabled flows [ECT(0) and CE codepoints] higher drop
probabilities.

This is rather unfortunate, given we would like ECN being more widely
used.

Ref : http://www.coverfire.com/archives/2011/03/13/pfifo_fast-and-ecn/

Signed-off-by: Dan Siemon <dan@coverfire.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Dave Täht <d@taht.net>
Cc: Jonathan Morton <chromatix99@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>