git.karo-electronics.de Git - karo-tx-linux.git/log

USB: CP210x Add 4 Device IDs for AC-Services Devices

commit 4eff0b40a7174896b860312910e0db51f2dcc567 upstream.

This patch adds 4 device IDs for CP2102 based devices manufactured by
AC-Services. See http://www.ac-services.eu for further info.

Signed-off-by: Craig Shelley <craig@microtron.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

i2c/writing-clients: Fix foo_driver.id_table

commit 3116c86033079a1d4d4e84c40028f96b614843b8 upstream.

The i2c_device_id structure variable's name is not used in the
i2c_driver structure.

Signed-off-by: Vikram Narayanan <vikram186@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

i2c: tegra: Enable new slave mode.

commit 65a1a0ace554d61ea5a90377a54df1505275c1b1 upstream.

For Tegra i2c controller to function properly new slave mode must be
enabled.

swarren notes:

In particular, I found this was needed when working on enabling the
Tegra audio driver on the Seaboard board. There are two different PCB
layouts for this board; a "clamshell" version, which works just fine
without this change, and the original non-clamshell version, which needs
this change in order for I2C to operate correctly. Without it, I2C
probing fails for some devices, e.g. with:

wm8903 0-001a: Device with ID register 0 is not a WM8903
wm8903 0-001a: asoc: failed to probe CODEC wm8903.0-001a: -19
asoc: failed to instantiate card tegra-wm8903: -19
ALSA device list:
No soundcards found.

Signed-off-by: Rakesh Iyer <riyer@nvidia.com>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

loop: handle on-demand devices correctly

commit a1c15c59feee36267c43142a41152fbf7402afb6 upstream.

When finding or allocating a loop device, loop_probe() did not take
partition numbers into account so that it can result to a different
device. Consider following example:

$ sudo modprobe loop max_part=15
$ ls -l /dev/loop*
brw-rw---- 1 root disk 7,   0 2011-05-24 22:16 /dev/loop0
brw-rw---- 1 root disk 7,  16 2011-05-24 22:16 /dev/loop1
brw-rw---- 1 root disk 7,  32 2011-05-24 22:16 /dev/loop2
brw-rw---- 1 root disk 7,  48 2011-05-24 22:16 /dev/loop3
brw-rw---- 1 root disk 7,  64 2011-05-24 22:16 /dev/loop4
brw-rw---- 1 root disk 7,  80 2011-05-24 22:16 /dev/loop5
brw-rw---- 1 root disk 7,  96 2011-05-24 22:16 /dev/loop6
brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7
$ sudo mknod /dev/loop8 b 7 128
$ sudo losetup /dev/loop8 ~/temp/disk-with-3-parts.img
$ sudo losetup -a
/dev/loop128: [0805]:278201 (/home/namhyung/temp/disk-with-3-parts.img)
$ ls -l /dev/loop*
brw-rw---- 1 root disk 7,    0 2011-05-24 22:16 /dev/loop0
brw-rw---- 1 root disk 7,   16 2011-05-24 22:16 /dev/loop1
brw-rw---- 1 root disk 7, 2048 2011-05-24 22:18 /dev/loop128
brw-rw---- 1 root disk 7, 2049 2011-05-24 22:18 /dev/loop128p1
brw-rw---- 1 root disk 7, 2050 2011-05-24 22:18 /dev/loop128p2
brw-rw---- 1 root disk 7, 2051 2011-05-24 22:18 /dev/loop128p3
brw-rw---- 1 root disk 7,   32 2011-05-24 22:16 /dev/loop2
brw-rw---- 1 root disk 7,   48 2011-05-24 22:16 /dev/loop3
brw-rw---- 1 root disk 7,   64 2011-05-24 22:16 /dev/loop4
brw-rw---- 1 root disk 7,   80 2011-05-24 22:16 /dev/loop5
brw-rw---- 1 root disk 7,   96 2011-05-24 22:16 /dev/loop6
brw-rw---- 1 root disk 7,  112 2011-05-24 22:16 /dev/loop7
brw-r--r-- 1 root root 7,  128 2011-05-24 22:17 /dev/loop8

After this patch, /dev/loop8 - instead of /dev/loop128 - was
accessed correctly.

In addition, 'range' passed to blk_register_region() should
include all range of dev_t that LOOP_MAJOR can address. It does
not need to be limited by partition numbers unless 'max_loop'
param was specified.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

loop: limit 'max_part' module param to DISK_MAX_PARTS

commit 78f4bb367fd147a0e7e3998ba6e47109999d8814 upstream.

The 'max_part' parameter controls the number of maximum partition
a loop block device can have. However if a user specifies very
large value it would exceed the limitation of device minor number
and can cause a kernel panic (or, at least, produce invalid
device nodes in some cases).

On my desktop system, following command kills the kernel. On qemu,
it triggers similar oops but the kernel was alive:

$ sudo modprobe loop max_part0000
------------[ cut here ]------------
kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
invalid opcode: 0000 [#1] SMP
last sysfs file:
CPU 0
Modules linked in: loop(+)

Pid: 43, comm: insmod Tainted: G        W   2.6.39-qemu+ #155 Bochs Bochs
RIP: 0010:[<ffffffff8113ce61>]  [<ffffffff8113ce61>] internal_create_group=
+0x2a/0x170
RSP: 0018:ffff880007b3fde8  EFLAGS: 00000246
RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4
RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878
RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000
R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50
R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800
FS:  0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:00000000000000=
00
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c=
0)
Stack:
  ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b
  0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868
  0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8
Call Trace:
  [<ffffffff811e66dd>] ? device_add+0x4bc/0x5af
  [<ffffffff811e570b>] ? dev_set_name+0x3c/0x3e
  [<ffffffff8113cfc8>] sysfs_create_group+0xe/0x12
  [<ffffffff810b420e>] blk_trace_init_sysfs+0x14/0x16
  [<ffffffff8116a090>] blk_register_queue+0x47/0xf7
  [<ffffffff8116f527>] add_disk+0xdf/0x290
  [<ffffffffa00060eb>] loop_init+0xeb/0x1b8 [loop]
  [<ffffffffa0006000>] ? 0xffffffffa0005fff
  [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
  [<ffffffff81096804>] sys_init_module+0x9c/0x1e0
  [<ffffffff813329bb>] system_call_fastpath+0x16/0x1b
Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb=
48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe =
48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49
RIP  [<ffffffff8113ce61>] internal_create_group+0x2a/0x170
  RSP <ffff880007b3fde8>
---[ end trace a123eb592043acad ]---

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath()

commit cfa54a0fcfc1017c6f122b6f21aaba36daa07f71 upstream.

I believe I found a problem in __alloc_pages_slowpath, which allows a
process to get stuck endlessly looping, even when lots of memory is
available.

Running an I/O and memory intensive stress-test I see a 0-order page
allocation with __GFP_IO and __GFP_WAIT, running on a system with very
little free memory.  Right about the same time that the stress-test gets
killed by the OOM-killer, the utility trying to allocate memory gets stuck
in __alloc_pages_slowpath even though most of the systems memory was freed
by the oom-kill of the stress-test.

The utility ends up looping from the rebalance label down through the
wait_iff_congested continiously.  Because order=0,
__alloc_pages_direct_compact skips the call to get_page_from_freelist.
Because all of the reclaimable memory on the system has already been
reclaimed, __alloc_pages_direct_reclaim skips the call to
get_page_from_freelist.  Since there is no __GFP_FS flag, the block with
__alloc_pages_may_oom is skipped.  The loop hits the wait_iff_congested,
then jumps back to rebalance without ever trying to
get_page_from_freelist.  This loop repeats infinitely.

The test case is pretty pathological.  Running a mix of I/O stress-tests
that do a lot of fork() and consume all of the system memory, I can pretty
reliably hit this on 600 nodes, in about 12 hours.  32GB/node.

Signed-off-by: Andrew Barry <abarry@cray.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Rik van Riel<riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

HID: magicmouse: ignore 'ivalid report id' while switching modes

commit 23746a66d7d9e73402c68ef00d708796b97ebd72 upstream.

The device reponds with 'invalid report id' when feature report switching it
into multitouch mode is sent to it.

This has been silently ignored before 0825411ade ("HID: bt: Wait for ACK
on Sent Reports"), but since this commit, it propagates -EIO from the _raw
callback .

So let the driver ignore -EIO as response to 0xd7,0x01 report, as that's
how the device reacts in normal mode.

Sad, but following reality.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=35022

Tested-by: Chase Douglas <chase.douglas@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ASoC: fix raumfeld platform

commit 477a66948ef8683f182682cc68e8520baf8a5b43 upstream.

Commit f0fba2ad (ASoC: multi-component - ASoC Multi-Component Support)
broke support for Raumfeld platforms as it didn't take into account the
different hardware features on individual devices.

In particular, Raumfeld speakers have no S/PDIF output, so the members
of the snd_soc_card struct must be set dynamically.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Acked-by: Liam Girdwood <lrg@slimlogic.co.uk>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ASoC: Add some missing volume update bit sets for wm_hubs devices

commit fb5af53d421d80725172427e9076f6e889603df6 upstream.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ASoC: Ensure output PGA is enabled for line outputs in wm_hubs

commit d0b48af6c2b887354d0893e598d92911ce52620e upstream.

Also fix a left/right typo while we're at it.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ALSA: hda - Use LPIB for ATI/AMD chipsets as default

commit 50e3bbf9898840eead86f90a43b3625a2b2f4112 upstream.

ATI and AMD chipsets seem not providing the proper position-buffer
information, and it also doesn't provide FIFO register required by
VIACOMBO fix. It's better to use LPIB for these.

Reported-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ALSA: hda - Fix input-src parse in patch_analog.c

commit 5a2d227fdc7a02ed1b4cebba391d8fb9ad57caaf upstream.

Compare pin type enum to the pin type and not the array index.
Fixes bug#0005368.

Signed-off-by: Adrian Wilkins <adrian.wilkins@nhs.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ALSA: HDA: Add quirk for Lenovo U350

commit d2859fd49200f1f3efd8acdb54b6d51d3ab82302 upstream.

Add model=asus quirk for Lenovo Ideapad U350 to make internal mic
work correctly.

BugLink: http://bugs.launchpad.net/bugs/751681
Reported-by: Kent Baxley <kent.baxley@canonical.com>
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ALSA: HDA: Use one dmic only for Dell Studio 1558

commit e033ebfb399227e01686260ac271029011bc6b47 upstream.

There are no signs of a dmic at node 0x0b, so the user is left with
an additional internal mic which does not exist. This commit removes
that non-existing mic.

BugLink: http://bugs.launchpad.net/bugs/731706
Reported-by: James Page <james.page@canonical.com>
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

slub: Make CONFIG_DEBUG_PAGE_ALLOC work with new fastpath

commit 1393d9a1857471f816d0be1ccc1d6433a86050f6 upstream.

Fastpath can do a speculative access to a page that CONFIG_DEBUG_PAGE_ALLOC may have
marked as invalid to retrieve the pointer to the next free object.

Use probe_kernel_read in that case in order not to cause a page fault.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mm: vmscan: correctly check if reclaimer should schedule during shrink_slab

commit f06590bd718ed950c98828e30ef93204028f3210 upstream.

It has been reported on some laptops that kswapd is consuming large
amounts of CPU and not being scheduled when SLUB is enabled during large
amounts of file copying.  It is expected that this is due to kswapd
missing every cond_resched() point because;

shrink_page_list() calls cond_resched() if inactive pages were isolated
        which in turn may not happen if all_unreclaimable is set in
        shrink_zones(). If for whatver reason, all_unreclaimable is
        set on all zones, we can miss calling cond_resched().

balance_pgdat() only calls cond_resched if the zones are not
        balanced. For a high-order allocation that is balanced, it
        checks order-0 again. During that window, order-0 might have
        become unbalanced so it loops again for order-0 and returns
        that it was reclaiming for order-0 to kswapd(). It can then
        find that a caller has rewoken kswapd for a high-order and
        re-enters balance_pgdat() without ever calling cond_resched().

shrink_slab only calls cond_resched() if we are reclaiming slab
pages. If there are a large number of direct reclaimers, the
shrinker_rwsem can be contended and prevent kswapd calling
cond_resched().

This patch modifies the shrink_slab() case.  If the semaphore is
contended, the caller will still check cond_resched().  After each
successful call into a shrinker, the check for cond_resched() remains in
case one shrinker is particularly slow.

[mgorman@suse.de: preserve call to cond_resched after each call into shrinker]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Colin King <colin.king@canonical.com>
Cc: Raghavendra D Prabhu <raghu.prabhu13@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mm: vmscan: correct use of pgdat_balanced in sleeping_prematurely

commit afc7e326a3f5bafc41324d7926c324414e343ee5 upstream.

There are a few reports of people experiencing hangs when copying large
amounts of data with kswapd using a large amount of CPU which appear to be
due to recent reclaim changes.  SLUB using high orders is the trigger but
not the root cause as SLUB has been using high orders for a while.  The
root cause was bugs introduced into reclaim which are addressed by the
following two patches.

Patch 1 corrects logic introduced by commit 1741c877 ("mm: kswapd:
        keep kswapd awake for high-order allocations until a percentage of
        the node is balanced") to allow kswapd to go to sleep when
        balanced for high orders.

Patch 2 notes that it is possible for kswapd to miss every
        cond_resched() and updates shrink_slab() so it'll at least reach
        that scheduling point.

Chris Wood reports that these two patches in isolation are sufficient to
prevent the system hanging.  AFAIK, they should also resolve similar hangs
experienced by James Bottomley.

This patch:

Johannes Weiner poined out that the logic in commit 1741c877 ("mm: kswapd:
keep kswapd awake for high-order allocations until a percentage of the
node is balanced") is backwards.  Instead of allowing kswapd to go to
sleep when balancing for high order allocations, it keeps it kswapd
running uselessly.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Colin King <colin.king@canonical.com>
Cc: Raghavendra D Prabhu <raghu.prabhu13@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

md/bitmap: fix saving of events_cleared and other state.

commit 8258c53208d7a9b7207e7d4dae36d2ea384cb278 upstream.

If a bitmap is found to be 'stale' the events_cleared value
is set to match 'events'.
However if the array is degraded this does not get stored on disk.
This can subsequently lead to incorrect behaviour.

So change bitmap_update_sb to always update events_cleared in the
superblock from the known events_cleared.
For neatness also set ->state from ->flags.
This requires updating ->state whenever we update ->flags, which makes
sense anyway.

This is suitable for any active -stable release.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

md: Fix race when creating a new md device.

commit b0140891a8cea36469f58d23859e599b1122bd37 upstream.

There is a race when creating an md device by opening /dev/mdXX.

If two processes do this at much the same time they will follow the
call path
  __blkdev_get -> get_gendisk -> kobj_lookup

The first will call
  -> md_probe -> md_alloc -> add_disk -> blk_register_region

and the race happens when the second gets to kobj_lookup after
add_disk has called blk_register_region but before it returns to
md_alloc.

In the case the second will not call md_probe (as the probe is already
done) but will get a handle on the gendisk, return to __blkdev_get
which will then call md_open (via the ->open) pointer.

As mddev->gendisk hasn't been set yet, md_open will think something is
wrong an return with ERESTARTSYS.

This can loop endlessly while the first thread makes no progress
through add_disk.  Nothing is blocking it, but due to scheduler
behaviour it doesn't get a turn.
So this is essentially a live-lock.

We fix this by simply moving the assignment to mddev->gendisk before
the call the add_disk() so md_open doesn't get confused.
Also move blk_queue_flush earlier because add_disk should be as late
as possible.

To make sure that md_open doesn't complete until md_alloc has done all
that is needed, we take mddev->open_mutex during the last part of
md_alloc.  md_open will wait for this.

This can cause a lock-up on boot so Cc:ing for stable.
For 2.6.36 and earlier a different patch will be needed as the
'blk_queue_flush' call isn't there.

Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

seqlock: Don't smp_rmb in seqlock reader spin loop

commit 5db1256a5131d3b133946fa02ac9770a784e6eb2 upstream.

Move the smp_rmb after cpu_relax loop in read_seqlock and add
ACCESS_ONCE to make sure the test and return are consistent.

A multi-threaded core in the lab didn't like the update
from 2.6.35 to 2.6.36, to the point it would hang during
boot when multiple threads were active. Bisection showed
af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867 (clockevents:
Remove the per cpu tick skew) as the culprit and it is
supported with stack traces showing xtime_lock waits including
tick_do_update_jiffies64 and/or update_vsyscall.

Experimentation showed the combination of cpu_relax and smp_rmb
was significantly slowing the progress of other threads sharing
the core, and this patch is effective in avoiding the hang.

A theory is the rmb is affecting the whole core while the
cpu_relax is causing a resource rebalance flush, together they
cause an interfernce cadance that is unbroken when the seqlock
reader has interrupts disabled.

At first I was confused why the refactor in
3c22cd5709e8143444a6d08682a87f4c57902df3 (kernel: optimise
seqlock) didn't affect this patch application, but after some
study that affected seqcount not seqlock. The new seqcount was
not factored back into the seqlock. I defer that the future.

While the removal of the timer interrupt offset created
contention for the xtime lock while a cpu does the
additonal work to update the system clock, the seqlock
implementation with the tight rmb spin loop goes back much
further, and is just waiting for the right trigger.

Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: <linuxppc-dev@lists.ozlabs.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Link: http://lkml.kernel.org/r/%3Cseqlock-rmb%40mdm.bga.com%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Fix for buffer overflow in ldm_frag_add not sufficient

commit cae13fe4cc3f24820ffb990c09110626837e85d4 upstream.

As Ben Hutchings discovered [1], the patch for CVE-2011-1017 (buffer
overflow in ldm_frag_add) is not sufficient. The original patch in
commit c340b1d64000 ("fs/partitions/ldm.c: fix oops caused by corrupted
partition table") does not consider that, for subsequent fragments,
previously allocated memory is used.

[1] http://lkml.org/lkml/2011/5/6/407

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Timo Warns <warns@pre-sense.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

staging: brcm80211: bugfix for div by zero in minstrel_ht_tx_status

commit e9c661e08c2a6015c1b7cba1cecefa27a089df71 upstream.

Caused by brcmsmac.ko suppling a 0 value to Mac80211. Mac80211 subsequently
divides by this number. Bug only occurred on AMPDU traffic. This is a fix
for https://bugzilla.kernel.org/show_bug.cgi?id=32032, titled
'Divide error in minstrel_ht_tx_status followed by hang', reported by
Wouter Cloetens.

Signed-off-by: Roland Vossen <rvossen@broadcom.com>
Reviewed-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

staging: r8712u: Fix driver to support ad-hoc mode

commit 62819fd9481021db7f87d5f61f2e2fd2be1dfcfa upstream.

Driver r8712u is unable to handle ad-hoc mode. The issue is that when
the driver first starts, there will not be an SSID for association.
The fix is to always call the "select and join from scan" routine when
in ad-hoc mode.

Note: Ad-hoc mode worked intermittently before. If the driver had
previously been associated, then things were OK.

Signed-off-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

staging: usbip: fix wrong endian conversion

commit cacd18a8476ce145ca5dcd46dc5b75585fd1289c upstream.

Fix number_of_packets wrong endian conversion in function
correct_endian_ret_submit()

Signed-off-by: David Chang <dchang@novell.com>
Acked-by: Arjan Mels <arjan.mels@gmx.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

When mandatory encryption on share, fail mount

commit 6848b7334b24b47aa3d0e70342ff839ffa95d5fa upstream.

    When mandatory encryption is configured in samba server on a
    share (smb.conf parameter "smb encrypt = mandatory") the
    server will hang up the tcp session when we try to send
    the first frame after the tree connect if it is not a
    QueryFSUnixInfo, this causes cifs mount to hang (it must
    be killed with ctl-c).  Move the QueryFSUnixInfo call
    earlier in the mount sequence, and check whether the SetFSUnixInfo
    fails due to mandatory encryption so we can return a sensible
    error (EACCES) on mount.

    In a future patch (for 2.6.40) we will support mandatory
    encryption.

Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

rcu: Fix unpaired rcu_irq_enter() from locking selftests

commit ba9f207c9f82115aba4ce04b22e0081af0ae300f upstream.

HARDIRQ_ENTER() maps to irq_enter() which calls rcu_irq_enter().
But HARDIRQ_EXIT() maps to __irq_exit() which doesn't call
rcu_irq_exit().

So for every locking selftest that simulates hardirq disabled,
we create an imbalance in the rcu extended quiescent state
internal state.

As a result, after the first missing rcu_irq_exit(), subsequent
irqs won't exit dyntick-idle mode after leaving the interrupt
handler. This means that RCU won't see the affected CPU as being
in an extended quiescent state, resulting in long grace-period
delays (as in grace periods extending for hours).

To fix this, just use __irq_enter() to simulate the hardirq
context. This is sufficient for the locking selftests as we
don't need to exit any extended quiescent state or perform
any check that irqs normally do when they wake up from idle.

As a side effect, this patch makes it possible to restore
"rcu: Decrease memory-barrier usage based on semi-formal proof",
which eventually helped finding this bug.

Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

oprofile, x86: Enable preemption during pci device setup in IBS init

commit 3d2606f42984613d324ad3047cf503bcddc3880a upstream.

IBS initialization is a mix of per-core register access and per-node
pci device setup. Register access should be pinned to the cpu, but pci
setup must run with preemption enabled.

This patch better separates the code into non-/preemptible sections
and fixes sleeping with preemption disabled. See bug message below.

Fixes also freeing the eilvt entry by introducing put_eilvt().

BUG: sleeping function called from invalid context at mm/slub.c:824
in_atomic(): 1, irqs_disabled(): 0, pid: 32357, name: modprobe
INFO: lockdep is turned off.
Pid: 32357, comm: modprobe Not tainted 2.6.39-rc7+ #14
Call Trace:
[<ffffffff8104bdc8>] __might_sleep+0x112/0x117
[<ffffffff81129693>] kmem_cache_alloc_trace+0x4b/0xe7
[<ffffffff81278f14>] kzalloc.constprop.0+0x29/0x2b
[<ffffffff81278f4c>] pci_get_subsys+0x36/0x78
[<ffffffff81022689>] ? setup_APIC_eilvt+0xfb/0x139
[<ffffffff81278fa4>] pci_get_device+0x16/0x18
[<ffffffffa06c8b5d>] op_amd_init+0xd3/0x211 [oprofile]
[<ffffffffa064d000>] ? 0xffffffffa064cfff
[<ffffffffa064d298>] op_nmi_init+0x21e/0x26a [oprofile]
[<ffffffffa064d062>] oprofile_arch_init+0xe/0x26 [oprofile]
[<ffffffffa064d010>] oprofile_init+0x10/0x42 [oprofile]
[<ffffffff81002099>] do_one_initcall+0x7f/0x13a
[<ffffffff81096524>] sys_init_module+0x132/0x281
[<ffffffff814cc682>] system_call_fastpath+0x16/0x1b

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, cpufeature: Update CPU feature RDRND to RDRAND

commit 7ccafc5f75c87853f3c49845d5a884f2376e03ce upstream.

The Intel manual changed the name of the CPUID bit to match the
instruction name. We should follow suit for sanity's sake. (See Intel SDM
Volume 2, Table 3-20 "Feature Information Returned in the ECX Register".)

[ hpa: we can only do this at this time because there are currently no CPUs
with this feature on the market, hence this is pre-hardware enabling.
However, Cc:'ing stable so that stable can present a consistent ABI. ]

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Link: http://lkml.kernel.org/r/20110524232926.GA27728@outflux.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, efi: Retain boot service code until after switching to virtual mode

commit 916f676f8dc016103f983c7ec54c18ecdbb6e349 upstream.

UEFI stands for "Unified Extensible Firmware Interface", where "Firmware"
is an ancient African word meaning "Why do something right when you can
do it so wrong that children will weep and brave adults will cower before
you", and "UEI" is Celtic for "We missed DOS so we burned it into your
ROMs". The UEFI specification provides for runtime services (ie, another
way for the operating system to be forced to depend on the firmware) and
we rely on these for certain trivial tasks such as setting up the
bootloader. But some hardware fails to work if we attempt to use these
runtime services from physical mode, and so we have to switch into virtual
mode. So far so dreadful.

The specification makes it clear that the operating system is free to do
whatever it wants with boot services code after ExitBootServices() has been
called. SetVirtualAddressMap() can't be called until ExitBootServices() has
been. So, obviously, a whole bunch of EFI implementations call into boot
services code when we do that. Since we've been charmingly naive and
trusted that the specification may be somehow relevant to the real world,
we've already stuffed a picture of a penguin or something in that address
space. And just to make things more entertaining, we've also marked it
non-executable.

This patch allocates the boot services regions during EFI init and makes
sure that they're executable. Then, after SetVirtualAddressMap(), it
discards them and everyone lives happily ever after. Except for the ones
who have to work on EFI, who live sad lives haunted by the knowledge that
someone's eventually going to write yet another firmware specification.

[ hpa: adding this to urgent with a stable tag since it fixes currently-broken
hardware. However, I do not know what the dependencies are and so I do
not know which -stable versions this may be a candidate for. ]

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Link: http://lkml.kernel.org/r/1306331593-28715-1-git-send-email-mjg@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, amd: Use _safe() msr access for GartTlbWlk disable code

commit d47cc0db8fd6011de2248df505fc34990b7451bf upstream.

The workaround for Bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=33012

introduced a read and a write to the MC4 mask msr.

Unfortunatly this MSR is not emulated by the KVM hypervisor
so that the kernel will get a #GP and crashes when applying
this workaround when running inside KVM.

This issue was reported as:

https://bugzilla.kernel.org/show_bug.cgi?id=35132

and is fixed with this patch. The change just let the kernel
ignore any #GP it gets while accessing this MSR by using the
_safe msr access methods.

Reported-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, amd: Do not enable ARAT feature on AMD processors below family 0x12

commit e9cdd343a5e42c43bcda01e609fa23089e026470 upstream.

Commit b87cf80af3ba4b4c008b4face3c68d604e1715c6 added support for
ARAT (Always Running APIC timer) on AMD processors that are not
affected by erratum 400. This erratum is present on certain processor
families and prevents APIC timer from waking up the CPU when it
is in a deep C state, including C1E state.

Determining whether a processor is affected by this erratum may
have some corner cases and handling these cases is somewhat
complicated. In the interest of simplicity we won't claim ARAT
support on processor families below 0x12 and will go back to
broadcasting timer when going idle.

Signed-off-by: Boris Ostrovsky <ostr@amd64.org>
Link: http://lkml.kernel.org/r/1306423192-19774-1-git-send-email-ostr@amd64.org
Tested-by: Boris Petkov <borislav.petkov@amd.com>
Cc: Hans Rosenfeld <Hans.Rosenfeld@amd.com>
Cc: Andreas Herrmann <Andreas.Herrmann3@amd.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, ioapic: Fix potential resume deadlock

commit b64ce24daffb634b5b3133a2e411bd4de50654e8 upstream.

Fix a potential deadlock when resuming; here the calling
function has disabled interrupts, so we cannot sleep.

Change the memory allocation flag from GFP_KERNEL to GFP_ATOMIC.

TODO: We can do away with this memory allocation during resume
      by reusing the ioapic suspend/resume code that uses boot time
      allocated buffers, but we want to keep this -stable patch
      simple.

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110518233157.385970138@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

um: Use RWSEM_GENERIC_SPINLOCK on x86

commit 3a3679078aed2c451ebc32836bbd3b8219a65e01 upstream.

Commit d12337 (rwsem: Remove redundant asmregparm annotation)
broke rwsem on UML.

As we cannot compile UML with -mregparm=3 and keeping asmregparm only
for UML is inadequate the easiest solution is using RWSEM_GENERIC_SPINLOCK.

Thanks to Thomas Gleixner for the idea.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: user-mode-linux-devel@lists.sourceforge.net
Link: http://lkml.kernel.org/r/%3C1306183893-26655-1-git-send-email-richard%40nod.at%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

target: Fix task->task_execute_queue=1 clear bug + LUN_RESET OOPs

commit af57c3ac9947990da2608561b71f4799eb7795c6 upstream.

This patch fixes a bug where task->task_execute_queue=1 was not being
cleared once se_task had been removed from se_device->execute_task_list,
resulting in an OOPs in core_tmr_lun_reset() for the task->task_active=0
case where transport_remove_task_from_execute_queue() was incorrectly
being called.

This patch fixes two cases in transport_get_task_from_execute_queue()
and transport_remove_task_from_execute_queue() to properly clear
task->task_execute_queue=0 once list_del(&task->t_execute_list) has
been called.

It also adds an explict check in transport_remove_task_from_execute_queue()
to dump_stack + return if called with task->task_execute_queue=0.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

target: Fix bug with task_sg chained transport_free_dev_tasks release

commit f436677262a5b524ac87675014c6d4e8ee153029 upstream.

This patch addresses a bug in the target core release path for HW
operation where transport_free_dev_tasks() was incorrectly being called
from transport_lun_remove_cmd() while releasing a se_cmd reference and
calling struct target_core_fabric_ops->queue_data_in().

This would result in a OOPs with HW target mode when the release of
se_task->task_sg[] would happen before pci_unmap_sg() can be called in
HW target mode fabric module code. This patch addresses the issue by
moving transport_free_dev_tasks() from transport_lun_remove_cmd() into
transport_generic_free_cmd(), and adding TRANSPORT_FREE_CMD_INTR and
transport_generic_free_cmd_intr() to allow se_cmd descriptor release
to happen fromfrom within transport_processing_thread() process context
when release of se_cmd is not possible from HW interrupt context.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

target: Fix interrupt context bug with stats_lock and core_tmr_alloc_req

commit 53ab6709b4d35b1924240854d794482fd7d33d4a upstream.

This patch fixes two bugs wrt to the interrupt context usage of target
core with HW target mode drivers. It first converts the usage of struct
se_device->stats_lock in transport_get_lun_for_cmd() and core_tmr_lun_reset()
to properly use spin_lock_irq() to address an BUG with CONFIG_LOCKDEP_SUPPORT=y
enabled.

This patch also adds a 'in_interrupt()' check to allow GFP_ATOMIC usage from
core_tmr_alloc_req() to fix a 'sleeping in interrupt context' BUG with HW
target fabrics that require this logic to function.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

target: Fix multi task->task_sg[] chaining logic bug

commit 97868c8905a1537153d406c4a3aa39a503a5c299 upstream.

This patch fixes a bug in transport_do_task_sg_chain() used by HW target
mode modules with sg_chain() to provide a single sg_next() walkable memory
layout for use with pci_map_sg() and friends. This patch addresses an
issue with mapping multiple small block max_sector tasks across multiple
struct se_task->task_sg[] mappings for HW target mode operation.

This was causing OOPs with (cmd->t_task->t_tasks_no > 1) I/O traffic for
HW target drivers using transport_do_task_sg_chain(), and has been tested
so far with tcm_fc(openfcoe), tcm_qla2xxx, and ib_srpt fabrics with
t_tasks_no > 1 IBLOCK backends using a smaller max_sectors to trigger the
original issue.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Acked-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Fix Ultrastor asm snippet

commit fad4dab5e44e10acf6b0235e469cb8e773b58e31 upstream.

Commit 1292500b replaced

"=m" (*field) : "1" (*field)

with

"=m" (*field) :

with comment "The following patch fixes it by using the '+' operator on
the (*field) operand, marking it as read-write to gcc."
'+' was actually forgotten. This really puts it.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bnx2i: Updated the connection shutdown/cleanup timeout

commit d5307a078bb0288945c900c6f4a2fd77ba6d0817 upstream.

Modified the 10s wait time for inflight offload connections to
advance to the next state to 2s based on test result.
Modified the 20s shutdown timeout to 30s based on test result.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bnx2i: Fixed packet error created when the sq_size is set to 16

commit 7287c63e986fe1a51a89f4bb1327320274a7a741 upstream.

The number of chip's internal command cell, which is use to generate
SCSI cmd packets to the target, was not initialized correctly by
the driver when the sq_size is changed from the default 128.
This, in turn, will create a problem where the chip's transmit pipe
will erroneously reuse an old command cell that is no longer valid.
The fix is to correctly initialize the chip's command cell upon setup.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mpt2sas: move even handling of MPT2SAS_TURN_ON_FAULT_LED into process context

commit 3ace8e052be5293ebb3e00f819effccc64108a38 upstream.

Driver was a sending a SEP request during interrupt context which
required to go to sleep.

The fix is to rearrange the code so a fake event
MPT2SAS_TURN_ON_FAULT_LED is fired from interrupt context, then later
during the kernel worker threads processing, the SEP request is issued
to firmware.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

dst: catch uninitialized metrics

[ Upstream commit 1f37070d3ff325827c6213e51b57f21fd5ac9d05 ]

Catch cases where dst_metric_set() and other functions are called
but _metrics is NULL.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

net: fix __dst_destroy_metrics_generic()

[ Upstream commit b30c516f875004f025f4d10147bde28c5e98466b ]

dst_default_metrics is readonly, we dont want to kfree() it later.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sch_sfq: avoid giving spurious NET_XMIT_CN signals

[ Upstream commit 8efa885406359af300d46910642b50ca82c0fe47 ]

While chasing a possible net_sched bug, I found that IP fragments have
litle chance to pass a congestioned SFQ qdisc :

- Say SFQ qdisc is full because one flow is non responsive.
- ip_fragment() wants to send two fragments belonging to an idle flow.
- sfq_enqueue() queues first packet, but see queue limit reached :
- sfq_enqueue() drops one packet from 'big consumer', and returns
NET_XMIT_CN.
- ip_fragment() cancel remaining fragments.

This patch restores fairness, making sure we return NET_XMIT_CN only if
we dropped a packet from the same flow.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Jamal Hadi Salim <hadi@cyberus.ca>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bonding: prevent deadlock on slave store with alb mode (v3)

[ Upstream commit 9fe0617d9b6d21f700ee9e658e1c9fe3be2fb402 ]

This soft lockup was recently reported:

[root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters
[root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.
bonding bond5: master_dev is not up in bond_enslave
[root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.

BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
CPU 12:
Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
be2d
Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
RIP: 0010:[<ffffffff80064bf0>]  [<ffffffff80064bf0>]
.text.lock.spinlock+0x26/00
RSP: 0018:ffff810113167da8  EFLAGS: 00000286
RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
FS:  00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0

Call Trace:
[<ffffffff80064af9>] _spin_lock_bh+0x9/0x14
[<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1
[<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0
[<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450
[<ffffffff8006457b>] __down_write_nested+0x12/0x92
[<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7
[<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8
[<ffffffff80016b87>] vfs_write+0xce/0x174
[<ffffffff80017450>] sys_write+0x45/0x6e
[<ffffffff8005d28d>] tracesys+0xd5/0xe0

It occurs because we are able to change the slave configuarion of a bond while
the bond interface is down.  The bonding driver initializes some data structures
only after its ndo_open routine is called.  Among them is the initalization of
the alb tx and rx hash locks.  So if we add or remove a slave without first
opening the bond master device, we run the risk of trying to lock/unlock a
spinlock that has garbage for data in it, which results in our above softlock.

Note that sometimes this works, because in many cases an unlocked spinlock has
the raw_lock parameter initialized to zero (meaning that the kzalloc of the
net_device private data is equivalent to calling spin_lock_init), but thats not
true in all cases, and we aren't guaranteed that condition, so we need to pass
the relevant spinlocks through the spin_lock_init function.

Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
the ndo_init path, so they are ready for use by the bond_store_slaves path.

Change notes:
v2) Based on conversation with Jay and Nicolas it seems that the ability to
enslave devices while the bond master is down should be safe to do.  As such
this is an outlier bug, and so instead we'll just initalize the errant spinlocks
in the init path rather than the open path, solving the problem.  We'll also
remove the warnings about the bond being down during enslave operations, since
it should be safe

v3) Fix spelling error

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: jtluka@redhat.com
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: nicolas.2p.debian@gmail.com
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sch_sfq: fix peek() implementation

[ Upstream commit 07bd8df5df4369487812bf85a237322ff3569b77 ]

Since commit eeaeb068f139 (sch_sfq: allow big packets and be fair),
sfq_peek() can return a different skb that would be normally dequeued by
sfq_dequeue() [ if current slot->allot is negative ]

Use generic qdisc_peek_dequeued() instead of custom implementation, to
get consistent result.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Jesper Dangaard Brouer <hawk@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sctp: fix memory leak of the ASCONF queue when free asoc

[ Upstream commit 8b4472cc13136d04727e399c6fdadf58d2218b0a ]

If an ASCONF chunk is outstanding, then the following ASCONF
chunk will be queued for later transmission. But when we free
the asoc, we forget to free the ASCONF queue at the same time,
this will cause memory leak.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bridge: initialize fake_rtable metrics

[ Upstream commit 33eb9873a283a2076f2b5628813d5365ca420ea9 ]

bridge netfilter code uses a fake_rtable, and we must init its _metric
field or risk NULL dereference later.

Ref: https://bugzilla.kernel.org/show_bug.cgi?id=35672

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

igmp: call ip_mc_clear_src() only when we have no users of ip_mc_list

[ Upstream commit 24cf3af3fed5edcf90bc2a0ed181e6ce1513d2dc ]

In igmp_group_dropped() we call ip_mc_clear_src(), which resets the number
of source filters per mulitcast. However, igmp_group_dropped() is also
called on NETDEV_DOWN, NETDEV_PRE_TYPE_CHANGE and NETDEV_UNREGISTER, which
means that the group might get added back on NETDEV_UP, NETDEV_REGISTER and
NETDEV_POST_TYPE_CHANGE respectively, leaving us with broken source
filters.

To fix that, we must clear the source filters only when there are no users
in the ip_mc_list, i.e. in ip_mc_dec_group() and on device destroy.

Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()

[ Upstream commit c182f90bc1f22ce5039b8722e45621d5f96862c2 ]

During the sctp_close() call, we do not use rcu primitives to
destroy the address list attached to the endpoint.  At the same
time, we do the removal of addresses from this list before
attempting to remove the socket from the port hash

As a result, it is possible for another process to find the socket
in the port hash that is in the process of being closed.  It then
proceeds to traverse the address list to find the conflict, only
to have that address list suddenly disappear without rcu() critical
section.

Fix issue by closing address list removal inside RCU critical
section.

Race can result in a kernel crash with general protection fault or
kernel NULL pointer dereference:

kernel: general protection fault: 0000 [#1] SMP
kernel: RIP: 0010:[<ffffffffa02f3dde>]  [<ffffffffa02f3dde>] sctp_bind_addr_conflict+0x64/0x82 [sctp]
kernel: Call Trace:
kernel:  [<ffffffffa02f415f>] ? sctp_get_port_local+0x17b/0x2a3 [sctp]
kernel:  [<ffffffffa02f3d45>] ? sctp_bind_addr_match+0x33/0x68 [sctp]
kernel:  [<ffffffffa02f4416>] ? sctp_do_bind+0xd3/0x141 [sctp]
kernel:  [<ffffffffa02f5030>] ? sctp_bindx_add+0x4d/0x8e [sctp]
kernel:  [<ffffffffa02f5183>] ? sctp_setsockopt_bindx+0x112/0x4a4 [sctp]
kernel:  [<ffffffff81089e82>] ? generic_file_aio_write+0x7f/0x9b
kernel:  [<ffffffffa02f763e>] ? sctp_setsockopt+0x14f/0xfee [sctp]
kernel:  [<ffffffff810c11fb>] ? do_sync_write+0xab/0xeb
kernel:  [<ffffffff810e82ab>] ? fsnotify+0x239/0x282
kernel:  [<ffffffff810c2462>] ? alloc_file+0x18/0xb1
kernel:  [<ffffffff8134a0b1>] ? compat_sys_setsockopt+0x1a5/0x1d9
kernel:  [<ffffffff8134aaf1>] ? compat_sys_socketcall+0x143/0x1a4
kernel:  [<ffffffff810467dc>] ? sysenter_dispatch+0x7/0x32

Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

macvlan: fix panic if lowerdev in a bond

[ Upstream commit d93515611bbc70c2fe4db232e5feb448ed8e4cc9 ]

commit a35e2c1b6d905 (macvlan: use rx_handler_data pointer to store
macvlan_port pointer V2) added a bug in macvlan_port_create()

Steps to reproduce the bug:

# ifenslave bond0 eth0 eth1

# ip link add link eth0 up name eth0#1 type macvlan
->error EBUSY

# ip link add link eth0 up name eth0#1 type macvlan
->panic

Fix: Dont set IFF_MACVLAN_PORT in error case.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

net: add skb_dst_force() in sock_queue_err_skb()

[ Upstream commit abb57ea48fd9431fa320a5c55f73e6b5a44c2efb ]

Commit 7fee226ad239 (add a noref bit on skb dst) forgot to use
skb_dst_force() on packets queued in sk_error_queue

This triggers following warning, for applications using IP_CMSG_PKTINFO
receiving one error status

------------[ cut here ]------------
WARNING: at include/linux/skbuff.h:457 ip_cmsg_recv_pktinfo+0xa6/0xb0()
Hardware name: 2669UYD
Modules linked in: isofs vboxnetadp vboxnetflt nfsd ebtable_nat ebtables
lib80211_crypt_ccmp uinput xcbc hdaps tp_smapi thinkpad_ec radeonfb fb_ddc
radeon ttm drm_kms_helper drm ipw2200 intel_agp intel_gtt libipw i2c_algo_bit
i2c_i801 agpgart rng_core cfbfillrect cfbcopyarea cfbimgblt video raid10 raid1
raid0 linear md_mod vboxdrv
Pid: 4697, comm: miredo Not tainted 2.6.39-rc6-00569-g5895198-dirty #22
Call Trace:
[<c17746b6>] ? printk+0x1d/0x1f
[<c1058302>] warn_slowpath_common+0x72/0xa0
[<c15bbca6>] ? ip_cmsg_recv_pktinfo+0xa6/0xb0
[<c15bbca6>] ? ip_cmsg_recv_pktinfo+0xa6/0xb0
[<c1058350>] warn_slowpath_null+0x20/0x30
[<c15bbca6>] ip_cmsg_recv_pktinfo+0xa6/0xb0
[<c15bbdd7>] ip_cmsg_recv+0x127/0x260
[<c154f82d>] ? skb_dequeue+0x4d/0x70
[<c1555523>] ? skb_copy_datagram_iovec+0x53/0x300
[<c178e834>] ? sub_preempt_count+0x24/0x50
[<c15bdd2d>] ip_recv_error+0x23d/0x270
[<c15de554>] udp_recvmsg+0x264/0x2b0
[<c15ea659>] inet_recvmsg+0xd9/0x130
[<c1547752>] sock_recvmsg+0xf2/0x120
[<c11179cb>] ? might_fault+0x4b/0xa0
[<c15546bc>] ? verify_iovec+0x4c/0xc0
[<c1547660>] ? sock_recvmsg_nosec+0x100/0x100
[<c1548294>] __sys_recvmsg+0x114/0x1e0
[<c1093895>] ? __lock_acquire+0x365/0x780
[<c1148b66>] ? fget_light+0xa6/0x3e0
[<c1148b7f>] ? fget_light+0xbf/0x3e0
[<c1148aee>] ? fget_light+0x2e/0x3e0
[<c1549f29>] sys_recvmsg+0x39/0x60

Close bug https://bugzilla.kernel.org/show_bug.cgi?id=34622

Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

net: ethtool: fix IPV6 checksum feature name string

[ Upstream commit 7cc31a9ae1477abc79d5992b3afe889f25c50c99 ]

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

net: Change netdev_fix_features messages loglevel

[ Upstream commit 604ae14ffb6d75d6eef4757859226b758d6bf9e3 ]

Cool, how about we make 'Features changed' debug as well?
This way userspace can't fill up the log just by tweaking tun features
with an ioctl.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

net: use hlist_del_rcu() in dev_change_name()

[ Upstream commit 372b2312010bece1e36f577d6c99a6193ec54cbd ]

Using plain hlist_del() in dev_change_name() is wrong since a
concurrent reader can crash trying to dereference LIST_POISON1.

Bug introduced in commit 72c9528bab94 (net: Introduce
dev_get_by_name_rcu())

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

netfilter: nf_ct_sip: fix SDP parsing in TCP SIP messages for some Cisco phones

[ Upstream commit e6e4d9ed11fb1fab8b3256a3dc14d71b5e984ac4 ]

Some Cisco phones do not place the Content-Length field at the end of the
SIP message. This is valid, due to a misunderstanding of the specification
the parser expects the SDP body to start directly after the Content-Length
field. Fix the parser to scan for \r\n\r\n to locate the beginning of the
SDP body.

Reported-by: Teresa Kang <teresa_kang@gemtek.com.tw>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

netfilter: nf_ct_sip: validate Content-Length in TCP SIP messages

[ Upstream commit 274ea0e2a4cdf18110e5931b8ecbfef6353e5293 ]

Verify that the message length of a single SIP message, which is calculated
based on the Content-Length field contained in the SIP message, does not
exceed the packet boundaries.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ext4: release page cache in ext4_mb_load_buddy error path

commit 26626f1172fb4f3f323239a6a5cf4e082643fa46 upstream.

Add missing page_cache_release in the error path of ext4_mb_load_buddy

Signed-off-by: Yang Ruirui <ruirui.r.yang@tieto.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

jbd: fix fsync() tid wraparound bug

commit d9b01934d56a96d9f4ae2d6204d4ea78a36f5f36 upstream.

If an application program does not make any changes to the indirect
blocks or extent tree, i_datasync_tid will not get updated. If there
are enough commits (i.e., 2**31) such that tid_geq()'s calculations
wrap, and there isn't a currently active transaction at the time of
the fdatasync() call, this can end up triggering a BUG_ON in
fs/jbd/commit.c:

J_ASSERT(journal->j_running_transaction != NULL);

It's pretty rare that this can happen, since it requires the use of
fdatasync() plus *very* frequent and excessive use of fsync(). But
with the right workload, it can.

We fix this by replacing the use of tid_geq() with an equality test,
since there's only one valid transaction id that is valid for us to
start: namely, the currently running transaction (if it exists).

Reported-by: Martin_Zielinski@McAfee.com
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

jbd: Fix forever sleeping process in do_get_write_access()

commit 2842bb20eed2e25cde5114298edc62c8883a1d9a upstream.

In do_get_write_access() we wait on BH_Unshadow bit for buffer to get
from shadow state. The waking code in journal_commit_transaction() has
a bug because it does not issue a memory barrier after the buffer is moved
from the shadow state and before wake_up_bit() is called. Thus a waitqueue
check can happen before the buffer is actually moved from the shadow state
and waiting process may never be woken. Fix the problem by issuing proper
barrier.

Reported-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ext3: Fix fs corruption when make_indexed_dir() fails

commit 86c4f6d85595cd7da635dc6985d27bfa43b1ae10 upstream.

When make_indexed_dir() fails (e.g. because of ENOSPC) after it has allocated
block for index tree root, we did not properly mark all changed buffers dirty.
This lead to only some of these buffers being written out and thus effectively
corrupting the directory.

Fix the issue by marking all changed data dirty even in the error failure case.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

usb-storage: fix up the unusual_realtek device list

commit a8e62dd6d91f3bc3687abbb26227e5fc39c4829c upstream.

This patch (as1461) fixes the unusual_devs entries for the Realtek USB
card reader. They should be ordered by PID, and they should not
override the Subclass and Protocol values provided by the device.
Otherwise a notification about unnecessary entries gets printed in the
kernel log during probing.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-By: Tony Vroon <tony@linx.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

pata_cm64x: fix boot crash on parisc

commit 9281b16caac1276817b77033c5b8a1f5ca30102c upstream.

The old IDE cmd64x checks the status of the CNTRL register to see if
the ports are enabled before probing them. pata_cmd64x doesn't do
this, which causes a HPMC on parisc when it tries to poke at the
secondary port because apparently the BAR isn't wired up (and a
non-responding piece of memory causes a HPMC).

Fix this by porting the CNTRL register port detection logic from IDE
cmd64x. In addition, following converns from Alan Cox, add a check to
see if a mobility electronics bridge is the immediate parent and forgo
the check if it is (prevents problems on hotplug controllers).

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

wire up syncfs syscall

commit 2e7bad5f34b5beed47542490c760ed26574e38ba upstream.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

wire up the fhandle syscalls

commit a71aae4cec120ee85cf32608fca40a4605461214 upstream.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

wire up clock_adjtime syscall

commit c3f957a22eca106bd28136943305b390b4337ebf upstream.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

wire up fanotify syscalls

commit 1824074b07ee66fa0f714e08579ad85075132d7b upstream.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

VFS: move BUG_ON test for symlink nd->depth after current->link_count test

commit 1a4022f88d40e1255920b017556092ab926d7f66 upstream.

This solves a serious VFS-level bug in nested_symlink (which was
rewritten from do_follow_link), and follows the order of depth tests
that existed before.

The bug triggers a BUG_ON in fs/namei.c:1381, when running racer with
symlink and rename ops.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Acked-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mac80211: fix a few RCU issues

commit a3836e02ba4c50db958d32d710b226f2408623dc upstream.

A few configuration functions correctly do
rcu_read_lock() but don't correctly reference
some pointers protected by RCU. Fix that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

nl80211: Fix set_key regression with some drivers

commit 0e579d6a8f4aea346da818f13ee71401c125e639 upstream.

Commit dbd2fd656f2060abfd3a16257f8b51ec60f6d2ed added a mechanism for
user space to indicate whether a default key is being configured for
only unicast or only multicast frames instead of all frames. This
commit added a driver capability flag for indicating whether separate
default keys are supported and validation of the set_key command based
on that capability.

However, this single capability flag is not enough to cover possible
difference based on mode (AP/IBSS/STA) and the way this change was
introduced resulted in a regression with drivers that do not indicate
the new capability (i.e.., more or less any non-mac80211 driver using
cfg80211) when using a recent wpa_supplicant snapshot.

Fix the regression by removing the new check which is not strictly
speaking needed. The new separate default key functionality is needed
only for RSN IBSS which has a separate capability indication.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mac80211: use wake_queue to restart trasmit

commit 470ab2a23b453518ac86937572b4531d8925ca55 upstream.

netif_tx_start_all_queues is used to allow the upper layer
to transmit frames but it does not restart transmission.
To restart the trasmission use netif_tx_wake_all_queues.
Not doing so, sometimes stalls the transmission and the
application has to be restarted to proceed further.

This issue was originally found while sending udp traffic
in higer bandwidth in open environment without bgscan.

Signed-off-by: Rajkumar Manoharan <rmanoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

iwlwifi: fix bugs in change_interface

commit a2b76b3b31568da9d281a393845f17689594ccdf upstream.

If change_interface gets invoked during a firmware
restart, it may crash; prevent that from happening
by checking if ctx->vif is assigned.

Additionally, in my initial commit I forgot to set
the vif->p2p variable correctly, so fix that too.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

libata: Use Maximum Write Same Length to report discard size limit

commit 5f4e206666f834340b69ddb43f86de3851c8675a upstream.

Previously we used Maximum Unmap LBA Count in the Block Limits VPD to
signal the maximum number of sectors we could handle in a single Write
Same command.

Starting with SBC3r26 the Block Limits VPD has an explicit limit on the
number of blocks in a Write Same. This means we can stop abusing a field
related to the Unmap command and let our SAT use the proper value in the
VPD (Maximum Write Same Length).

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit

commit 26afb7c661080ae3f1f13ddf7f0c58c4f931c22b upstream.

As reported in BZ #30352:

  https://bugzilla.kernel.org/show_bug.cgi?id=30352

there's a kernel bug related to reading the last allowed page on x86_64.

The _copy_to_user() and _copy_from_user() functions use the following
check for address limit:

  if (buf + size >= limit)
fail();

while it should be more permissive:

  if (buf + size > limit)
fail();

That's because the size represents the number of bytes being
read/write from/to buf address AND including the buf address.
So the copy function will actually never touch the limit
address even if "buf + size == limit".

Following program fails to use the last page as buffer
due to the wrong limit check:

#include <sys/mman.h>
#include <sys/socket.h>
#include <assert.h>

#define PAGE_SIZE       (4096)
#define LAST_PAGE       ((void*)(0x7fffffffe000))

int main()
{
        int fds[2], err;
        void * ptr = mmap(LAST_PAGE, PAGE_SIZE, PROT_READ | PROT_WRITE,
                          MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
        assert(ptr == LAST_PAGE);
        err = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
        assert(err == 0);
        err = send(fds[0], ptr, PAGE_SIZE, 0);
        perror("send");
        assert(err == PAGE_SIZE);
        err = recv(fds[1], ptr, PAGE_SIZE, MSG_WAITALL);
        perror("recv");
        assert(err == PAGE_SIZE);
        return 0;
}

The other place checking the addr limit is the access_ok() function,
which is working properly. There's just a misleading comment
for the __range_not_ok() macro - which this patch fixes as well.

The last page of the user-space address range is a guard page and
Brian Gerst observed that the guard page itself due to an erratum on K8 cpus
(#121 Sequential Execution Across Non-Canonical Boundary Causes Processor
Hang).

However, the test code is using the last valid page before the guard page.
The bug is that the last byte before the guard page can't be read
because of the off-by-one error. The guard page is left in place.

This bug would normally not show up because the last page is
part of the process stack and never accessed via syscalls.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1305210630-7136-1-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SATA: enable non-queueable flush flag

commit 900e599eb0c16390ff671652a44e0ea90532220e upstream.

Enable non-queueable flush flag for SATA.

Stable: 2.6.39 only

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mtd: omap: fix subpage ecc issue with prefetch

commit c5d8c0cae4af7d78823d32fcd1c458ee1a1b5489 upstream.

When reading/writing a subpage (When HW ECC is not available/enabled)
for number of bytes not aligned to 4, the mis-aligned bytes are handled
first (by cpu copy method) before enabling the Prefetch engine to/from
'p'(start of buffer 'buf'). Then it reads/writes rest of the bytes with
the help of Prefetch engine, if available, or again using cpu copy method.
Currently, reading/writing of rest of bytes, is not done correctly since
its trying to read/write again to/from begining of buffer 'buf',
overwriting the mis-aligned bytes.

Read & write using prefetch engine got broken in commit '2c01946c'.
We never hit a scenario of not getting 'gpmc_prefetch_enable' call
success. So, problem did not get caught up.

Signed-off-by: Kishore Kadiyala <kishore.kadiyala@ti.com>
Signed-off-by: Vimal Singh <vimal.newwork@gmail.com>
Reported-by: Bryan DE FARIA <bdefaria@adeneo-embedded.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mtd: return badblockbits back

commit 26d9be11485ea8c1102c3e8eaa7667412eef4950 upstream.

In commit c7b28e25cb9beb943aead770ff14551b55fa8c79 the initialization of
the backblockbits was accidentally removed. This patch returns it back,
because otherwise some NAND drivers are broken.

This problem was reported by "Saxena, Parth <parth.saxena@ti.com>" here:
http://lists.infradead.org/pipermail/linux-mtd/2011-April/035221.html

Reported-by: Parth Saxena <parth.saxena@ti.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Tested-by: Parth Saxena <parth.saxena@ti.com>
Acked-by: Parth Saxena <parth.saxena@ti.com>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mtd: mtdconcat: fix NAND OOB write

commit 431e1ecabddcd7cbba237182ddf431771f98bb4c upstream.

Currently mtdconcat is broken for NAND. An attemtpt to create
JFFS2 filesystem on concatenation of several NAND devices fails
with OOB write errors. This patch fixes that problem.

Signed-off-by: Felix Radensky <felix@embedded-sol.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

block: always allocate genhd->ev if check_events is implemented

commit 75e3f3ee3c64968d42f4843ec49e579f84b5aa0c upstream.

9fd097b149 (block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe
drivers) removed DISK_EVENT_MEDIA_CHANGE from legacy/fringe block
drivers which have inadequate ->check_events(). Combined with earlier
change 7c88a168da (block: don't propagate unlisted DISK_EVENTs to
userland), this enables using ->check_events() for internal processing
while avoiding enabling in-kernel block event polling which can lead
to infinite event loop.

Unfortunately, this made many drivers including floppy without any bit
set in disk->events and ->async_events in which case disk_add_events()
simply skipped allocation of disk->ev, which disables whole event
handling. As ->check_events() is still used during open processing
for revalidation, this can lead to open failure.

This patch always allocates disk->ev if ->check_events is implemented.
In the long term, it would make sense to simply include the event
structure inline into genhd as it's now used by virtually all block
devices.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ondrej Zary <linux@rainbow-software.org>
Reported-by: Alex Villacis Lasso <avillaci@ceibo.fiec.espol.edu.ec>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

block: add proper state guards to __elv_next_request

commit 0a58e077eb600d1efd7e54ad9926a75a39d7f8ae upstream.

blk_cleanup_queue() calls elevator_exit() and after this, we can't
touch the elevator without oopsing. __elv_next_request() must check
for this state because in the refcounted queue model, we can still
call it after blk_cleanup_queue() has been called.

This was reported as causing an oops attributable to scsi.

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

block: Fix discard topology stacking and reporting

commit a934a00a69e940b126b9bdbf83e630ef5fe43523 upstream.

In some cases we would end up stacking discard_zeroes_data incorrectly.
Fix this by enabling the feature by default for stacking drivers and
clearing it for low-level drivers. Incorporating a device that does not
support dzd will then cause the feature to be disabled in the stacking
driver.

Also ensure that the maximum discard value does not overflow when
exported in sysfs and return 0 in the alignment and dzd fields for
devices that don't support discard.

Reported-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

block: hold queue if flush is running for non-queueable flush drive

commit 3ac0cc4508709d42ec9aa351086c7d38bfc0660c upstream.

In some drives, flush requests are non-queueable. When flush request is
running, normal read/write requests can't run. If block layer dispatches
such request, driver can't handle it and requeue it.  Tejun suggested we
can hold the queue when flush is running. This can avoid unnecessary
requeue.  Also this can improve performance. For example, we have
request flush1, write1, flush 2. flush1 is dispatched, then queue is
hold, write1 isn't inserted to queue. After flush1 is finished, flush2
will be dispatched. Since disk cache is already clean, flush2 will be
finished very soon, so looks like flush2 is folded to flush1.

In my test, the queue holding completely solves a regression introduced by
commit 53d63e6b0dfb95882ec0219ba6bbd50cde423794:

    block: make the flush insertion use the tail of the dispatch list

    It's not a preempt type request, in fact we have to insert it
    behind requests that do specify INSERT_FRONT.

which causes about 20% regression running a sysbench fileio
workload.

Stable: 2.6.39 only

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

block: add a non-queueable flush flag

commit f3876930952390a31c3a7fd68dd621464a36eb80 upstream.

flush request isn't queueable in some drives. Add a flag to let driver
notify block layer about this. We can optimize flush performance with the
knowledge.

Stable: 2.6.39 only

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

block: move bd_set_size() above rescan_partitions() in __blkdev_get()

commit 7e69723fef8771a9d57bd27d36281d756130b4b5 upstream.

02e352287a4 (block: rescan partitions on invalidated devices on
-ENOMEDIA too) relocated partition rescan above explicit bd_set_size()
to simplify condition check.  As rescan_partitions() does its own bdev
size setting, this doesn't break anything; however,
rescan_partitions() prints out the following messages when adjusting
bdev size, which can be confusing.

  sda: detected capacity change from 0 to 146815737856
  sdb: detected capacity change from 0 to 146815737856

This patch restores the original order and remove the warning
messages.

stable: Please apply together with 02e352287a4 (block: rescan
        partitions on invalidated devices on -ENOMEDIA too).

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Tony Luck <tony.luck@gmail.com>
Tested-by: Tony Luck <tony.luck@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ext4: don't set PageUptodate in ext4_end_bio()

commit 39db00f1c45e770856264bdb3ceca27980b01965 upstream.

In the bio completion routine, we should not be setting
PageUptodate at all -- it's set at sys_write() time, and is
unaffected by success/failure of the write to disk.

This can cause a page corruption bug when the file system's
block size is less than the architecture's VM page size.

if we have only written a single block -- we might end up
setting the page's PageUptodate flag, indicating that page
is completely read into memory, which may not be true.
This could cause subsequent reads to get bad data.

This commit also takes the opportunity to clean up error
handling in ext4_end_bio(), and remove some extraneous code:

   - fixes ext4_end_bio() to set AS_EIO in the
     page->mapping->flags on error, which was left out by
     mistake.  This is needed so that fsync() will
     return an error if there was an I/O error.
   - remove the clear_buffer_dirty() call on unmapped
     buffers for each page.
   - consolidate page/buffer error handling in a single
     section.

Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Jim Meyering <jim@meyering.net>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

iwlagn: fix iwl_is_any_associated

commit 054ec924944912413e4ee927b8cf02f476d08783 upstream.

The function iwl_is_any_associated() was intended
to check both contexts, but due to an oversight
it only checks the BSS context. This leads to a
problem with scanning since the passive dwell
time isn't restricted appropriately and a scan
that includes passive channels will never finish
if only the PAN context is associated since the
default dwell time of 120ms won't fit into the
normal 100 TU DTIM interval.

Fix the function by using for_each_context() and
also reorganise the other functions a bit to take
advantage of each other making the code easier to
read.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

powerpc/oprofile: Handle events that raise an exception without overflowing

commit ad5d5292f16c6c1d7d3e257c4c7407594286b97e upstream.

Commit 0837e3242c73566fc1c0196b4ec61779c25ffc93 fixes a situation on POWER7
where events can roll back if a specualtive event doesn't actually complete.
This can raise a performance monitor exception. We need to catch this to ensure
that we reset the PMC. In all cases the PMC will be less than 256 cycles from
overflow.

This patch lifts Anton's fix for the problem in perf and applies it to oprofile
as well.

Signed-off-by: Eric B Munson <emunson@mgebm.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ath9k_hw: Fix STA connection issues with AR9380 (XB113).

commit be0e6aa5a0c487a2a0880dda8bc70f7f1860fc39 upstream.

XB113 (AR9380) 3x3 SB 5G only cards were failing to connect to APs
due to incorrect xpabiaslevel configuration. fix it.

Cc: Ray Li <ray.li@greenwavereality.com>
Cc: Kathy Giori <kathy.giori@atheros.com>
Cc: Aeolus Yang <aeolus.yang@atheros.com>
Cc: compat@orbit-lab.org
Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ath9k_hw: fix dual band assumption for XB113

commit 9ba7f4f5eba5f4b44c7796bbad29f8ec3a7d5864 upstream.

The XB113 cards are single band, 5 GHz-only, but the
default settings were configured to assume it was dual
band. Users of these cards then would see 2.4 GHz channels
but you would never get any scan results from these channels
given that the radio is not present.

Cc: Fiona Cain <Fiona.Cain@atheros.com>
Cc: Ray Li <ray.li@greenwavereality.com>
Cc: Kathy Giori <kathy.giori@atheros.com>
Cc: Aeolus Yang <aeolus.yang@atheros.com>
Cc: Dan Friedman <dan.friedman@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ath9k_hw: fix power for the HT40 duplicate frames

commit cf3a03b9c99a0b2715741d116f50f513f545bb2d upstream.

With AR9003 at about ~ 10 feet from an AP that uses RTS / CTS you
will be able to associate but not not get data through given that
the power for the rates used was set too low. This increases the
power and permits data connectivity at longer distances from
access points when connected with HT40. Without this you will not
get any data through when associated to APs configured in HT40
at about more than 10 feet away.

Cc: Fiona Cain <fcain@atheros.com>
Cc: Zhen Xie <Zhen.Xie@Atheros.com>
Cc: Kathy Giori <kathy.giori@atheros.com>
Cc: Neha Choksi <neha.choksi@atheros.com>
Cc: Wayne Daniel <wayne.daniel@atheros.com>
Cc: Gaurav Jauhar <gaurav.jauhar@atheros.com>
Cc: Samira Naraghi <samira.naraghi@atheros.com>
CC: Ashok Chennupati <ashok.chennupati@atheros.com>
Cc: Lance Zimmerman <lance.zimmerman@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ath9k_hw: do noise floor calibration only on required chains

commit 28ef6450f0182f95c4f50aaa0ab2043a09c72b0a upstream.

At present the noise floor calibration is processed in supported
control and extension chains rather than required chains.
Unnccesarily doing nfcal in all supported chains leads to
invalid nf readings on extn chains and these invalid values
got updated into history buffer. While loading those values
from history buffer is moving the chip to deaf state.

This issue was observed in AR9002/AR9003 chips while doing
associate/dissociate in HT40 mode and interface up/down
in iterative manner. After some iterations, the chip was moved
to deaf state. Somehow the pci devices are recovered by poll work
after chip reset. Raading the nf values in all supported extension chains
when the hw is not yet configured in HT40 mode results invalid values.

Signed-off-by: Rajkumar Manoharan <rmanoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86, cpufeature: Fix cpuid leaf 7 feature detection

commit 2494b030ba9334c7dd7df9b9f7abe4eacc950ec5 upstream.

CPUID leaf 7, subleaf 0 returns the maximum subleaf in EAX, not the
number of subleaves. Since so far only subleaf 0 is defined (and only
the EBX bitfield) we do not need to qualify the test.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305660806-17519-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

powerpc/kexec: Fix memory corruption from unallocated slaves

commit 3d2cea732d68aa270c360f55d8669820ebce188a upstream.

Commit 1fc711f7ffb01089efc58042cfdbac8573d1b59a (powerpc/kexec: Fix race
in kexec shutdown) moved the write to signal the cpu had exited the kernel
from before the transition to real mode in kexec_smp_wait to kexec_wait.

Unfornately it missed that kexec_wait is used both by cpus leaving the
kernel and by secondary slave cpus that were not allocated a paca for
what ever reason -- they could be beyond nr_cpus or not described in
the current device tree for whatever reason (for example, kexec-load
was not refreshed after a cpu hotplug operation). Cpus coming through
that path they will write to paca[NR_CPUS] which is beyond the space
allocated for the paca data and overwrite memory not allocated to pacas
but very likely still real mode accessable).

Move the write back to kexec_smp_wait, which is used only by cpus that
found their paca, but after the transition to real mode.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

powerpc/kdump64: Don't reference freed memory as pacas

commit bd9e5eefecb3d69018bb95796298019d309cbec8 upstream.

Starting with 1426d5a3bd07589534286375998c0c8c6fdc5260 (powerpc:
Dynamically allocate pacas) the space for pacas beyond cpu_possible
is freed, but we failed to update the loop in crash.c.

Since c1854e00727f50f7ac99e98d26ece04c087ef785 (powerpc: Set nr_cpu_ids
early and use it to free PACAs) the number of pacas allocated is
always nr_cpu_ids.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

qla2xxx: Fix vport delete hang when logins are outstanding.

commit 9f40682e2857a3c2ddb80a87b185af3c6a708346 upstream.

Timer is required to flush out entries that may be present in work queues.

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

qla2xxx: Fix virtual port failing to login after chip reset.

commit cefcaba67ab97fb756b3a6af5139c94d861b660d upstream.

This patch ensures qla82xx_watchdog is not being run for the vport. It also
makes sure that beacon ON is not done for the vport, as it will lead to the
waking up of the dpc thread again and again.

Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

qla2xxx: Fix hang during driver unload when vport is active.

commit 43ebf16d762b082663976b679b813e1b546548d1 upstream.

Bumping ref count during fc_vport_terminate() was the cause. vport
delete would wait for ref count to drop to zero and that would never
happen.

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

qla2xxx: Properly set the dsd_list_len for dsd_chaining in cmd type 6.

commit fa96d927362a422405d65491326f8ef763572e84 upstream.

The firmware spec has the fcp_data_dseg_len defined as a 32-bit
value, while the corresponding field in the driver structure has
it defined as a 16-bit value.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ftrace: Only update the function code on write to filter files

commit 058e297d34a404caaa5ed277de15698d8dc43000 upstream.

If function tracing is enabled, a read of the filter files will
cause the call to stop_machine to update the function trace sites.
It should only call stop_machine on write.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

net: recvmmsg: Strip MSG_WAITFORONE when calling recvmsg

commit b9eb8b8752804cecbacdb4d24b52e823cf07f107 upstream.

recvmmsg fails on a raw socket with EINVAL. The reason for this is
packet_recvmsg checks the incoming flags:

        err = -EINVAL;
        if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT|MSG_ERRQUEUE))
                goto out;

This patch strips out MSG_WAITFORONE when calling recvmmsg which
fixes the issue.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>