git.karo-electronics.de Git - karo-tx-linux.git/log

pnfs: Increase the refcount when LAYOUTGET fails the first time

commit 39e88fcfb1d5c6c4b1ff76ca2ab76cf449b850e8 upstream.

The layout will be set unusable if LAYOUTGET fails. Is it reasonable to
increase the refcount iff LAYOUTGET fails the first time?

Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFS: Fix access to suid/sgid executables

commit f8d9a897d4384b77f13781ea813156568f68b83e upstream.

nfs_open_permission_mask() should only check MAY_EXEC for files that
are opened with __FMODE_EXEC.

Also fix NFSv4 access-in-open path in a similar way -- openflags must be
used because fmode will not always have FMODE_EXEC set.

This patch fixes https://bugzilla.kernel.org/show_bug.cgi?id=49101

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfsd: avoid permission checks on EXCLUSIVE_CREATE replay

commit 7007c90fb9fef593b4aeaeee57e6a6754276c97c upstream.

With NFSv4, if we create a file then open it we explicit avoid checking
the permissions on the file during the open because the fact that we
created it ensures we should be allow to open it (the create and the
open should appear to be a single operation).

However if the reply to an EXCLUSIVE create gets lots and the client
resends the create, the current code will perform the permission check -
because it doesn't realise that it did the open already..

This patch should fix this.

Note that I haven't actually seen this cause a problem. I was just
looking at the code trying to figure out a different EXCLUSIVE open
related issue, and this looked wrong.

(Fix confirmed with pynfs 4.0 test OPEN4--bfields)

Signed-off-by: NeilBrown <neilb@suse.de>
[bfields: use OWNER_OVERRIDE and update for 4.1]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfsd4: fix oops on unusual readlike compound

commit d5f50b0c290431c65377c4afa1c764e2c3fe5305 upstream.

If the argument and reply together exceed the maximum payload size, then
a reply with a read-like operation can overlow the rq_pages array.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfsd: fix v4 reply caching

commit 57d276d71aef7d8305ff002a070cb98deb2edced upstream.

Very embarassing: 1091006c5eb15cba56785bd5b498a8d0b9546903 "nfsd: turn
on reply cache for NFSv4" missed a line, effectively leaving the reply
cache off in the v4 case. I thought I'd tested that, but I guess not.

This time, wrote a pynfs test to confirm it works.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfs: fix wrong object type in lockowner_slab

commit 3c40794b2dd0f355ef4e6bf8d85af5dcd7da7ece upstream.

The object type in the cache of lockowner_slab is wrong, and it is
better to fix it.

Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFS: Don't use SetPageError in the NFS writeback code

commit ada8e20d044c0fa5610e504ce6fb4578ebd3edd9 upstream.

The writeback code is already capable of passing errors back to user space
by means of the open_context->error. In the case of ENOSPC, Neil Brown
is reporting seeing 2 errors being returned.

Neil writes:

"e.g. if /mnt2/ if an nfs mounted filesystem that has no space then

strace dd if=/dev/zero conv=fsync >> /mnt2/afile count=1

reported Input/output error and the relevant parts of the strace output are:

write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
fsync(1) = -1 EIO (Input/output error)
close(1) = -1 ENOSPC (No space left on device)"

Neil then shows that the duplication of error messages appears to be due to
the use of the PageError() mechanism, which causes filemap_fdatawait_range
to return the extra EIO. The regression was introduced by
commit 7b281ee026552f10862b617a2a51acf49c829554 (NFS: fsync() must exit
with an error if page writeback failed).

Fix this by removing the call to SetPageError(), and just relying on
open_context->error reporting the ENOSPC back to fsync().

Reported-by: Neil Brown <neilb@suse.de>
Tested-by: Neil Brown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFS: Fix calls to drop_nlink()

commit 1f018458b30b0d5c535c94e577aa0acbb92e1395 upstream.

It is almost always wrong for NFS to call drop_nlink() after removing a
file. What we really want is to mark the inode's attributes for
revalidation, and we want to ensure that the VFS drops it if we're
reasonably sure that this is the final unlink().
Do the former using the usual cache validity flags, and the latter
by testing if inode->i_nlink == 1, and clearing it in that case.

This also fixes the following warning reported by Neil Brown and
Jeff Layton (among others).

[634155.004438] WARNING:
at /home/abuild/rpmbuild/BUILD/kernel-desktop-3.5.0/lin [634155.004442]
Hardware name: Latitude E6510 [634155.004577]  crc_itu_t crc32c_intel
snd_hwdep snd_pcm snd_timer snd soundcor [634155.004609] Pid: 13402, comm:
bash Tainted: G        W    3.5.0-36-desktop # [634155.004611] Call Trace:
[634155.004630]  [<ffffffff8100444a>] dump_trace+0xaa/0x2b0
[634155.004641]  [<ffffffff815a23dc>] dump_stack+0x69/0x6f
[634155.004653]  [<ffffffff81041a0b>] warn_slowpath_common+0x7b/0xc0
[634155.004662]  [<ffffffff811832e4>] drop_nlink+0x34/0x40
[634155.004687]  [<ffffffffa05bb6c3>] nfs_dentry_iput+0x33/0x70 [nfs]
[634155.004714]  [<ffffffff8118049e>] dput+0x12e/0x230
[634155.004726]  [<ffffffff8116b230>] __fput+0x170/0x230
[634155.004735]  [<ffffffff81167c0f>] filp_close+0x5f/0x90
[634155.004743]  [<ffffffff81167cd7>] sys_close+0x97/0x100
[634155.004754]  [<ffffffff815c3b39>] system_call_fastpath+0x16/0x1b
[634155.004767]  [<00007f2a73a0d110>] 0x7f2a73a0d10f

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFS: avoid NULL dereference in nfs_destroy_server

commit f259613a1e4b44a0cf85a5dafd931be96ee7c9e5 upstream.

In rare circumstances, nfs_clone_server() of a v2 or v3 server can get
an error between setting server->destory (to nfs_destroy_server), and
calling nfs_start_lockd (which will set server->nlm_host).

If this happens, nfs_clone_server will call nfs_free_server which
will call nfs_destroy_server and thence nlmclnt_done(NULL). This
causes the NULL to be dereferenced.

So add a guard to only call nlmclnt_done() if ->nlm_host is not NULL.

The other guards there are irrelevant as nlm_host can only be non-NULL
if one of these flags are set - so remove those tests. (Thanks to Trond
for this suggestion).

This is suitable for any stable kernel since 2.6.25.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfs: don't zero out the rest of the page if we hit the EOF on a DIO READ

commit 67fad106a219e083c91c79695bd1807dde1bf7b9 upstream.

Eryu provided a test program that would segfault when attempting to read
past the EOF on file that was opened O_DIRECT. The buffer given to the
read() call was on the stack, and when he attempted to read past it it
would scribble over the rest of the stack page.

If we hit the end of the file on a DIO READ request, then we don't want
to zero out the rest of the buffer. These aren't pagecache pages after
all, and there's no guarantee that the buffers that were passed in
represent entire pages.

Reported-by: Eryu Guan <eguan@redhat.com>
Cc: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSv4: Check for buffer length in __nfs4_get_acl_uncached

commit 7d3e91a89b7adbc2831334def9e494dd9892f9af upstream.

Commit 1f1ea6c "NFSv4: Fix buffer overflow checking in
__nfs4_get_acl_uncached" accidently dropped the checking for too small
result buffer length.

If someone uses getxattr on "system.nfs4_acl" on an NFSv4 mount
supporting ACLs, the ACL has not been cached and the buffer suplied is
too short, we still copy the complete ACL, resulting in kernel and user
space memory corruption.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfs: don't extend writes to cover entire page if pagecache is invalid

commit 81d9bce5309288086b58b4d97a644e495fef75f2 upstream.

Jian reported that the following sequence would leave "testfile" with
corrupt data:

    # mount localhost:/export /mnt/nfs/ -o vers=3
    # echo abc > /mnt/nfs/testfile; echo def >> /export/testfile; echo ghi >> /mnt/nfs/testfile
    # cat -v /export/testfile
    abc
    ^@^@^@^@ghi

While there's no locking involved here, the operations are serialized,
so CTO should prevent corruption.

The first write to the file is fine and writes 4 bytes. The file is then
extended on the server. When it's reopened a GETATTR is issued and the
size change is noticed. This causes NFS_INO_INVALID_DATA to be set on
the file. Because the file is opened for write only,
nfs_want_read_modify_write() returns 0 to nfs_write_begin().
nfs_updatepage then calls nfs_write_pageuptodate() to see if it should
extend the nfs_page to cover the whole page. NFS_INO_INVALID_DATA is
still set on the file at that point, but that flag is ignored and
nfs_pageuptodate erroneously extends the write to cover the whole page,
with the write done on the server side filled in with zeroes.

This patch just has that function check for NFS_INO_INVALID_DATA in
addition to NFS_INO_REVAL_PAGECACHE. This fixes the bug, but looking
over the code, I wonder if we might have a similar bug in
nfs_revalidate_size(). The difference between those two flags is very
subtle, so it seems like we ought to be checking for
NFS_INO_INVALID_DATA in most of the places that we look for
NFS_INO_REVAL_PAGECACHE.

I believe this is regression introduced by commit 8d197a568. The code
did check for NFS_INO_INVALID_DATA prior to that patch.

Original bug report is here:

    https://bugzilla.redhat.com/show_bug.cgi?id=885743

Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFS: Add sequence_priviliged_ops for nfs4_proc_sequence()

commit 6bdb5f213c4344324f600dde885f25768fbd14db upstream.

If I mount an NFS v4.1 server to a single client multiple times and then
run xfstests over each mountpoint I usually get the client into a state
where recovery deadlocks.  The server informs the client of a
cb_path_down sequence error, the client then does a
bind_connection_to_session and checks the status of the lease.

I found that bind_connection_to_session sets the NFS4_SESSION_DRAINING
flag on the client, but this flag is never unset before
nfs4_check_lease() reaches nfs4_proc_sequence().  This causes the client
to deadlock, halting all NFS activity to the server.  nfs4_proc_sequence()
is only called by the state manager, so I can change it to run in privileged
mode to bypass the NFS4_SESSION_DRAINING check and avoid the deadlock.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / scan: Do not use dummy HID for system bus ACPI nodes

commit 4f5f64cf0cc916220aaa055992e31195470cfe37 upstream.

At one point acpi_device_set_id() checks if acpi_device_hid(device)
returns NULL, but that never happens, so system bus devices with an
empty list of PNP IDs are given the dummy HID ("device") instead of
the "system bus HID" ("LNXSYBUS"). Fix the code to use the right
check.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

libata: restore acpi disable functionality

commit 0d0cdb028f9d9771e2b346038707734121f906e3 upstream.

Commit 66fa7f215 "libata-acpi: improve ACPI disabling" introdcued the
behaviour of disabling ATA ACPI if ata_acpi_on_devcfg failed the 2nd
time, but commit 30dcf76ac dropped this behaviour and this caused
problem for Dimitris Damigos, where his laptop can not resume correctly.

The bugzilla page for it is:
https://bugzilla.kernel.org/show_bug.cgi?id=49331

The problem is, ata_dev_push_id will fail the 2nd time it is invoked,
and due to disabling ACPI code is dropped, ata_acpi_on_devcfg which
calls ata_dev_push_id will keep failing and eventually made the device
disabled.

This patch restores the original behaviour, if acpi failed the 2nd time,
disable acpi functionality for the device(and we do not event need to
add a debug message for this as it is still there ;-).

Reported-by: Dimitris Damigos <damigos@freemail.gr>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI: do acpisleep dmi check when CONFIG_ACPI_SLEEP is set

commit 0ac1b1d7b7424cd6f129b5454b504b3cae746f0e upstream.

The current acpisleep DMI checks only run when CONFIG_SUSPEND is set.
And this may break hibernation on some platforms when CONFIG_SUSPEND
is cleared.

Move acpisleep DMI check into #ifdef CONFIG_ACPI_SLEEP instead.

[rjw: Added acpi_sleep_dmi_check() and rebased on top of earlier
patches adding entries to acpisleep_dmi_table[].]
References: https://bugzilla.kernel.org/show_bug.cgi?id=45921
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: network: fix bind() error path

commit e79cc615a9bb44da72c499ccfa2c9c4bbea3aa84 upstream.

I think this is wrong since 72c973dd ("usb: gadget: add
usb_endpoint_descriptor to struct usb_ep"). If we fail to allocate an ep
or bail out early we shouldn't check for the descriptor which is
assigned at ep_enable() time.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tatyana Brokhman <tlinder@codeaurora.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: uvc: fix error path in uvc_function_bind()

commit 0f9df939385527049c8062a099fbfa1479fe7ce0 upstream.

The "video->minor = -1" assigment is done in V4L2 by
video_register_device() so it is removed here.
Now. uvc_function_bind() calls in error case uvc_function_unbind() for
cleanup. The problem is that uvc_function_unbind() frees the uvc struct
and uvc_bind_config() does as well in error case of usb_add_function().
Removing kfree() in usb_add_function() would make the patch smaller but
it would look odd because the new allocated memory is not cleaned up.
However it is not guaranteed that if we call usb_add_function() we also
get to the bind function.
Therefore the patch extracts the conditional cleanup from
uvc_function_unbind() applies to uvc_function_bind().
uvc_function_unbind() now contains only the complete cleanup which is
required once everything has been registrated.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Bhupesh Sharma <bhupesh.sharma@st.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: tcm_usb_gadget: NULL terminate the FS descriptor list

commit fad8deb274edcef1c8ca38946338f5f4f8126fe2 upstream.

The descriptor list for FS speed was not NULL terminated. This patch
fixes this.

While here one of the twe two bAlternateSetting assignments for the BOT
interface. Both assign 0, one is enough.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: phonet: free requests in pn_bind()'s error path

commit d0eca719dd11ad0619e8dd6a1f3eceb95b0216dd upstream.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: midi: free hs descriptors

commit d185039f7982eb82cf8d03b6fb6689587ca5af24 upstream.

The HS descriptors are only created if HS is supported by the UDC but we
never free them.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: chipidea: fix use after free bug

commit 98c35534420d3147553bd3071a5fc63cd56de5b1 upstream.

The pointer to a platform_device struct must not be dereferenced after
the device has been unregistered.

This bug produces a crash when unloading the ci13xxx kernel module
compiled with CONFIG_PAGE_POISONING enabled.

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

p54usb: add USBIDs for two more p54usb devices

commit 4010fe21a315b4223c25376714c6a2b61b722e5c upstream.

This patch adds USBIDs for:
- DrayTek Vigor 530
- Zoom 4410a

It also adds a note about Gemtek WUBI-100GW
and SparkLAN WL-682 USBID conflict [WUBI-100GW
is a ISL3886+NET2280 (LM86 firmare) solution,
whereas WL-682 is a ISL3887 (LM87 firmware)]
device.

Source: <http://www.wikidevi.com/wiki/Intersil/p54/usb/windows>

Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

p54usb: add USB ID for T-Com Sinus 154 data II

commit 3194b7fcdf6caea338b5d2c72d76fed80437649c upstream.

Added USB ID for T-Com Sinus 154 data II.

Signed-off-by: Tomasz Guszkowski <tsg@o2.pl>
Acked-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtlwifi: fix incorrect use of usb_alloc_coherent with usb_control_msg

commit 4c3de5920c486b8eefa6187ee6a181864c161100 upstream.

Incorrect use of usb_alloc_coherent memory as input buffer to usb_control_msg
can cause problems in arch DMA code, for example kernel BUG at
'arch/arm/include/asm/dma-mapping.h:321' on ARM (linux-3.4).

Change _usb_writeN_sync use kmalloc'd buffer instead.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

qmi_wwan/cdc_ether: add Dell Wireless 5800 (Novatel E362) USB IDs

commit 0370acd4d4d2595a11b0b0a793acb506e19b9d4c upstream.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - add mute LED for HP Pavilion 17 (Realtek codec)

commit 6d3cd5d444223c41eabb70dccff14fae4e8cb8b1 upstream.

The mute LED is in this case connected to the Mic1 VREF.

The machine also exposes the following string in BIOS:
"HP_Mute_LED_0_A", so if more machines are coming, it probably
makes sense to try to do something more generic, like for the
IDT codec.

BugLink: https://bugs.launchpad.net/bugs/1096789
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Fix pin configuration of HP Pavilion dv7

commit 8ae5865ec77c22462c736846a0679947a6953548 upstream.

Fix the quirk entry for HP Pavilion dv7 in order to make the bass
speaker working.

Reported-and-tested-by: Tomas Pospisek <tpo2@sourcepole.ch>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Fix the wrong pincaps set in ALC861VD dallas/hp fixup

commit b78562b10fa66175e30b76073e32a0ad8d92aa83 upstream.

The workaround to force VREF50 for dallas/hp model with ALC861VD
was introduced in commit 8fdcb6fe4204bdb4c6991652717ab5063751414e,
but it contained wrong pincap override bits.

This patch fixes to exclude VREF80 pincap bit correctly.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Always turn on pins for HDMI/DP

commit 6169b673618bf0b2518ce413b54925782a603f06 upstream.

We've seen the broken HDMI *video* output on some machines with GM965,
and the debugging session pointed that the culprit is the disabled
audio output pins. Toggling these pins dynamically on demand caused
flickering of HDMI TV.

This patch changes the behavior to keep the pin ON constantly.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51421

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Add stereo-dmic fixup for Acer Aspire One 522

commit 63a077e27648b4043b1ca1b4e29f0c42d99616b6 upstream.

Acer Aspire One 522 has the infamous digital mic unit that needs the
phase inversion fixup for stereo.

Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=715737

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Move runtime PM check to runtime_idle callback

commit 6eb827d23577a4efec2b10a9c4cc9ded268a1d1c upstream.

The runtime_idle callback is the right place to check the suspend
capability, but currently we do it wrongly in the runtime_suspend
callback. This leads to a kernel error message like:
pci_pm_runtime_suspend(): azx_runtime_suspend+0x0/0x50 [snd_hda_intel] returns -11
and the runtime PM core would even repeat the attempts.

Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: Fix missing autopm for MIDI input

commit f5f165418cabf2218eb466c0e94693b8b1aee88b upstream.

The commit [88a8516a: ALSA: usbaudio: implement USB autosuspend] added
the support of autopm for USB MIDI output, but it didn't take the MIDI
input into account.

This patch adds the following for fixing the autopm:
- Manage the URB start at the first MIDI input stream open, instead of
the time of instance creation
- Move autopm code to the common substream_open()
- Make snd_usbmidi_input_start/_stop() more robust and add the running
state check

Reviewd-by: Clemens Ladisch <clemens@ladisch.de>
Tested-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: Avoid autopm calls after disconnection

commit 59866da9e4ae54819e3c4e0a8f426bdb0c2ef993 upstream.

Add a similar protection against the disconnection race and the
invalid use of usb instance after disconnection, as well as we've done
for the USB audio PCM.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51201

Reviewd-by: Clemens Ladisch <clemens@ladisch.de>
Tested-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tmpfs mempolicy: fix /proc/mounts corrupting memory

commit f2a07f40dbc603c15f8b06e6ec7f768af67b424f upstream.

Recently I suggested using "mount -o remount,mpol=local /tmp" in NUMA
mempolicy testing.  Very nasty.  Reading /proc/mounts, /proc/pid/mounts
or /proc/pid/mountinfo may then corrupt one bit of kernel memory, often
in a page table (causing "Bad swap" or "Bad page map" warning or "Bad
pagetable" oops), sometimes in a vm_area_struct or rbnode or somewhere
worse.  "mpol=prefer" and "mpol=prefer:Node" are equally toxic.

Recent NUMA enhancements are not to blame: this dates back to 2.6.35,
when commit e17f74af351c "mempolicy: don't call mpol_set_nodemask() when
no_context" skipped mpol_parse_str()'s call to mpol_set_nodemask(),
which used to initialize v.preferred_node, or set MPOL_F_LOCAL in flags.
With slab poisoning, you can then rely on mpol_to_str() to set the bit
for node 0x6b6b, probably in the next page above the caller's stack.

mpol_parse_str() is only called from shmem_parse_options(): no_context
is always true, so call it unused for now, and remove !no_context code.
Set v.nodes or v.preferred_node or MPOL_F_LOCAL as mpol_to_str() might
expect.  Then mpol_to_str() can ignore its no_context argument also,
the mpol being appropriately initialized whether contextualized or not.
Rename its no_context unused too, and let subsequent patch remove them
(that's not needed for stable backporting, which would involve rejects).

I don't understand why MPOL_LOCAL is described as a pseudo-policy:
it's a reasonable policy which suffers from a confusing implementation
in terms of MPOL_PREFERRED with MPOL_F_LOCAL.  I believe this would be
much more robust if MPOL_LOCAL were recognized in switch statements
throughout, MPOL_F_LOCAL deleted, and MPOL_PREFERRED use the (possibly
empty) nodes mask like everyone else, instead of its preferred_node
variant (I presume an optimization from the days before MPOL_LOCAL).
But that would take me too long to get right and fully tested.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: Fix PageHead when !CONFIG_PAGEFLAGS_EXTENDED

commit ad4b3fb7ff9940bcdb1e4cd62bd189d10fa636ba upstream.

Unfortunately with !CONFIG_PAGEFLAGS_EXTENDED, (!PageHead) is false, and
(PageHead) is true, for tail pages.  If this is indeed the intended
behavior, which I doubt because it breaks cache cleaning on some ARM
systems, then the nomenclature is highly problematic.

This patch makes sure PageHead is only true for head pages and PageTail
is only true for tail pages, and neither is true for non-compound pages.

[ This buglet seems ancient - seems to have been introduced back in Apr
  2008 in commit 6a1e7f777f61: "pageflags: convert to the use of new
  macros".  And the reason nobody noticed is because the PageHead()
  tests are almost all about just sanity-checking, and only used on
  pages that are actual page heads.  The fact that the old code returned
  true for tail pages too was thus not really noticeable.   - Linus ]

Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <Will.Deacon@arm.com>
Cc: Steve Capper <Steve.Capper@arm.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: fix calculation of dirtyable memory

commit c8b74c2f6604923de91f8aa6539f8bb934736754 upstream.

The system uses global_dirtyable_memory() to calculate number of
dirtyable pages/pages that can be allocated to the page cache.  A bug
causes an underflow thus making the page count look like a big unsigned
number.  This in turn confuses the dirty writeback throttling to
aggressively write back pages as they become dirty (usually 1 page at a
time).  This generally only affects systems with highmem because the
underflowed count gets subtracted from the global count of dirtyable
memory.

The problem was introduced with v3.2-4896-gab8fabd

Fix is to ensure we don't get an underflowed total of either highmem or
global dirtyable memory.

Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
Signed-off-by: Puneet Kumar <puneetster@chromium.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Damien Wyart <damien.wyart@free.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

virtio: force vring descriptors to be allocated from lowmem

commit b92b1b89a33c172c075edccf6afb0edc41d851fd upstream.

Virtio devices may attempt to add descriptors to a virtqueue from atomic
context using GFP_ATOMIC allocation. This is problematic because such
allocations can fall outside of the lowmem mapping, causing virt_to_phys
to report bogus physical addresses which are subsequently passed to
userspace via the buffers for the virtual device.

This patch masks out __GFP_HIGH and __GFP_HIGHMEM from the requested
flags when allocating descriptors for a virtqueue. If an atomic
allocation is requested and later fails, we will return -ENOSPC which
will be handled by the driver.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

virtio: 9p: correctly pass physical address to userspace for high pages

commit b9cdc88df8e63e81c723b82c286fc97f5d0dc325 upstream.

When using a virtio transport, the 9p net device may pass the physical
address of a kernel buffer to userspace via a scatterlist inside a
virtqueue. If the kernel buffer is mapped outside of the linear mapping
(e.g. highmem), then virt_to_page will return a bogus value and we will
populate the scatterlist with junk.

This patch uses kmap_to_page when populating the page array for a kernel
buffer.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: highmem: export kmap_to_page for modules

commit f0263d2d222e9e25f2587e51a9dc58c6fb2a9352 upstream.

Some virtio device drivers (9p) need to translate high virtual addresses
to physical addresses, which are inserted into the virtqueue for
processing by userspace.

This patch exports the kmap_to_page symbol, so that the affected drivers
can be compiled as modules.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86, 8042: Enable A20 using KBC to fix S3 resume on some MSI laptops

commit ad68652412276f68ad4fe3e1ecf5ee6880876783 upstream.

Some MSI laptop BIOSes are broken - INT 15h code uses port 92h to enable A20
line but resume code assumes that KBC was used.
The laptop will not resume from S3 otherwise but powers off after a while
and then powers on again stuck with a blank screen.

Fix it by enabling A20 using KBC in i8042_platform_init for x86.

Fixes https://bugzilla.kernel.org/show_bug.cgi?id=12878

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Link: http://lkml.kernel.org/r/201212112218.06551.linux@rainbow-software.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: signal: push the unwinding prologue on the signal stack

commit 304ef4e8367244b547734143c792a2ab764831e8 upstream.

To allow debuggers to unwind through signal frames, we create a fake
stack unwinding prologue containing the link register and frame pointer
of the interrupted context. The signal frame is then offset by 16 bytes
to make room for the two saved registers which are pushed onto the frame
of the *interrupted* context, rather than placed directly above the
signal stack.

This doesn't work when an alternative signal stack is set up for a SEGV
handler, which is raised in response to RLIMIT_STACK being reached. In
this case, we try to push the unwinding prologue onto the full stack and
subsequently take a fault which we fail to resolve, causing setup_return
to return -EFAULT and handle_signal to force_sigsegv on the current task.

This patch fixes the problem by including the unwinding prologue as part
of the rt_sigframe definition, which is populated during setup_sigframe,
ensuring that it always ends up on the signal stack.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: Make !dirty ptes read-only

commit 33eaa58f854770dc9c98411a356c98e3a53edfda upstream.

The AArch64 Linux port relies on the mm code to wrprotect clean ptes.
This however is not the case with newly created ptes and
PAGE_SHARED(_EXEC) is writable but !dirty.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

exec: do not leave bprm->interp on stack

commit b66c5984017533316fd1951770302649baf1aa33 upstream.

If a series of scripts are executed, each triggering module loading via
unprintable bytes in the script header, kernel stack contents can leak
into the command line.

Normally execution of binfmt_script and binfmt_misc happens recursively.
However, when modules are enabled, and unprintable bytes exist in the
bprm->buf, execution will restart after attempting to load matching
binfmt modules.  Unfortunately, the logic in binfmt_script and
binfmt_misc does not expect to get restarted.  They leave bprm->interp
pointing to their local stack.  This means on restart bprm->interp is
left pointing into unused stack memory which can then be copied into the
userspace argv areas.

After additional study, it seems that both recursion and restart remains
the desirable way to handle exec with scripts, misc, and modules.  As
such, we need to protect the changes to interp.

This changes the logic to require allocation for any changes to the
bprm->interp.  To avoid adding a new kmalloc to every exec, the default
value is left as-is.  Only when passing through binfmt_script or
binfmt_misc does an allocation take place.

For a proof of concept, see DoTest.sh from:

   http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: halfdog <me@halfdog.net>
Cc: P J P <ppandit@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

SGI-XP: handle non-fatal traps

commit 891348ca0f66206f1dc0e30d63757e3df1ae2d15 upstream.

We found a user code which was raising a divide-by-zero trap. That trap
would lead to XPC connections between system-partitions being torn down
due to the die_chain notifier callouts it received.

This also revealed a different issue where multiple callers into
xpc_die_deactivate() would all attempt to do the disconnect in parallel
which would sometimes lock up but often overwhelm the console on very
large machines as each would print at least one line of output at the
end of the deactivate.

I reviewed all the users of the die_chain notifier and changed the code
to ignore the notifier callouts for reasons which will not actually lead
to a system to continue on to call die().

[akpm@linux-foundation.org: fix ia64]
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pnpacpi: fix incorrect TEST_ALPHA() test

commit cdc87c5a30f407ed1ce43d8a22261116873d5ef1 upstream.

TEST_ALPHA() is broken and always returns 0.

[akpm@linux-foundation.org: return false for '@' as well, per Bjorn]
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b43: fix tx path skb leaks

commit 78f18df4b323d2ac14d6c82e2fc3c8dc4556bccc upstream.

ieee80211_free_txskb() needs to be used instead of dev_kfree_skb_any for
tx packets passed to the driver from mac80211

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b43legacy: Fix firmware loading when driver is built into the kernel

commit 576d28a7c73013717311cfcb514dbcae27c82eeb upstream.

Recent versions of udev cause synchronous firmware loading from the
probe routine to fail because the request to user space times out.
The original fix for b43legacy (commit a3ea2c7) moved the firmware
load from the probe routine to a work queue, but it still used synchronous
firmware loading. This method is OK when b43legacy is built as a module;
however, it fails when the driver is compiled into the kernel.

This version changes the code to load the initial firmware file
using request_firmware_nowait(). A completion event is used to
hold the work queue until that file is available. The remaining
firmware files are read synchronously.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

firmware loader: Fix the concurrent request_firmware() race for kref_get/put

commit bd9eb7fbe69111ea0ff1f999ef4a5f26d223d1d5 upstream.

There is one race that both request_firmware() with the same
firmware name.

The race scenerio is as below:
CPU1                                                  CPU2
request_firmware() -->
_request_firmware_load() return err                   another request_firmware() is coming -->
_request_firmware_cleanup is called -->               _request_firmware_prepare -->
release_firmware --->                                 fw_lookup_and_allocate_buf -->
                                                      spin_lock(&fwc->lock)
...                                                   __fw_lookup_buf() return true
fw_free_buf() will be called -->                      ...
kref_put -->
decrease the refcount to 0
                                                      kref_get(&tmp->ref) ==> it will trigger warning
                                                                              due to refcount == 0
__fw_free_buf() -->
...                                                   spin_unlock(&fwc->lock)
spin_lock(&fwc->lock)
list_del(&buf->list)
spin_unlock(&fwc->lock)
kfree(buf)
                                                      After that, the freed buf will be used.

The key race is decreasing refcount to 0 and list_del is not protected together by
fwc->lock, and it is possible another thread try to get it between refcount==0
and list_del.

Fix it here to protect it together.

Acked-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

firmware loader: Fix the race FW_STATUS_DONE is followed by class_timeout

commit ce2fcbd99cef580623116bb33531dbc3e6f690b0 upstream.

There is a race as below when calling request_firmware():
CPU1                                   CPU2
write 0 > loading
mutex_lock(&fw_lock)
...
set_bit FW_STATUS_DONE                 class_timeout is coming
                                       set_bit FW_STATUS_ABORT
complete_all &completion
...
mutex_unlock(&fw_lock)

In this time, the bit FW_STATUS_DONE and FW_STATUS_ABORT are set,
and request_firmware() will return failure due to condition in
_request_firmware_load():
if (!buf->size || test_bit(FW_STATUS_ABORT, &buf->status))
retval = -ENOENT;

But from the above scenerio, it should be a successful requesting.
So we need judge if the bit FW_STATUS_DONE is already set before
calling fw_load_abort() in timeout function.

As Ming's proposal, we need change the timer into sched_work to
benefit from using &fw_lock mutex also.

Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Acked-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: fix a race in gro_cell_poll()

[ Upstream commit f8e8f97c11d5ff3cc47d85b97c7c35e443dcf490 ]

Dmitry Kravkov reported packet drops for GRE packets since GRO support
was added.

There is a race in gro_cell_poll() because we call napi_complete()
without any synchronization with a concurrent gro_cells_receive()

Once bug was triggered, we queued packets but did not schedule NAPI
poll.

We can fix this issue using the spinlock protected the napi_skbs queue,
as we have to hold it to perform skb dequeue anyway.

As we open-code skb_dequeue(), we no longer need to mask IRQS, as both
producer and consumer run under BH context.

Bug added in commit c9e6bc644e (net: add gro_cells infrastructure)

Reported-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 3.7.1

Staging: bcm: Add two products and remove an existing product.

commit 4f29ef050848245f7c180b95ccf67dfcd76b1fd8 upstream.

This patch adds two new products and modifies
the device id table to include them. In addition,
product of 0xbccd - BCM_USB_PRODUCT_ID_SM250 is
removed because Beceem, ZTE, Sprint use this id
for block devices.

Reported-by: Muhammad Minhazul Haque <mdminhazulhaque@gmail.com>
Signed-off-by: Kevin McKinney <klmckinney1@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rcu: Fix batch-limit size problem

commit 878d7439d0f45a95869e417576774673d1fa243f upstream.

Commit 29c00b4a1d9e27 (rcu: Add event-tracing for RCU callback
invocation) added a regression in rcu_do_batch()

Under stress, RCU is supposed to allow to process all items in queue,
instead of a batch of 10 items (blimit), but an integer overflow makes
the effective limit being 1. So, unless there is frequent idle periods
(during which RCU ignores batch limits), RCU can be forced into a
state where it cannot keep up with the callback-generation rate,
eventually resulting in OOM.

This commit therefore converts a few variables in rcu_do_batch() from
int to long to fix this problem, along with the module parameters
controlling the batch limits.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: EHCI: bugfix: urb->hcpriv should not be NULL

commit 2656a9abcf1ec8dd5fee6a75d6997a0f2fa0094e upstream.

This patch (as1632b) fixes a bug in ehci-hcd.  The USB core uses
urb->hcpriv to determine whether or not an URB is active; host
controller drivers are supposed to set this pointer to a non-NULL
value when an URB is queued.  However ehci-hcd sets it to NULL for
isochronous URBs, which defeats the check in usbcore.

In itself this isn't a big deal.  But people have recently found that
certain sequences of actions will cause the snd-usb-audio driver to
reuse URBs without waiting for them to complete.  In the absence of
proper checking by usbcore, the URBs get added to their endpoint list
twice.  This leads to list corruption and a system freeze.

The patch makes ehci-hcd assign a meaningful value to urb->hcpriv for
isochronous URBs.  Improving robustness always helps.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Artem S. Tashkinov <t.artem@lycos.com>
Reported-by: Christof Meerwald <cmeerw@cmeerw.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf test: fix a build error on builtin-test

commit 12f8f74b2a4d26c4facfa7ef99487cf0930f6ef7 upstream.

Recently I build perf and get a build error on builtin-test.c. The error is as
following:

$ make
CC perf.o
CC builtin-test.o
cc1: warnings being treated as errors
builtin-test.c: In function ‘sched__get_first_possible_cpu’:
builtin-test.c:977: warning: implicit declaration of function ‘CPU_ALLOC’
builtin-test.c:977: warning: nested extern declaration of ‘CPU_ALLOC’
builtin-test.c:977: warning: assignment makes pointer from integer without a cast
builtin-test.c:978: warning: implicit declaration of function ‘CPU_ALLOC_SIZE’
builtin-test.c:978: warning: nested extern declaration of ‘CPU_ALLOC_SIZE’
builtin-test.c:979: warning: implicit declaration of function ‘CPU_ZERO_S’
builtin-test.c:979: warning: nested extern declaration of ‘CPU_ZERO_S’
builtin-test.c:982: warning: implicit declaration of function ‘CPU_FREE’
builtin-test.c:982: warning: nested extern declaration of ‘CPU_FREE’
builtin-test.c:992: warning: implicit declaration of function ‘CPU_ISSET_S’
builtin-test.c:992: warning: nested extern declaration of ‘CPU_ISSET_S’
builtin-test.c:998: warning: implicit declaration of function ‘CPU_CLR_S’
builtin-test.c:998: warning: nested extern declaration of ‘CPU_CLR_S’
make: *** [builtin-test.o] Error 1

This problem is introduced in 3e7c439a. CPU_ALLOC and related macros are
missing in sched__get_first_possible_cpu function. In 54489c18, commiter
mentioned that CPU_ALLOC has been removed. So CPU_ALLOC calls in this
function are removed to let perf to be built.

Signed-off-by: Vinson Lee <vlee@twitter.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vinson Lee <vlee@twitter.com>
Cc: Zheng Liu <wenqing.lz@taobao.com>
Link: http://lkml.kernel.org/r/1352422726-31114-1-git-send-email-vlee@twitter.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cdc-acm: implement TIOCSSERIAL to avoid blocking close(2)

commit ba2d8ce9db0a61505362bb17b8899df3d3326146 upstream.

Some devices (ex Nokia C7) simply don't respond at all when data is sent
to some of their USB interfaces.  The data gets stuck in the TTYs queue
and sits there until close(2), which them blocks because closing_wait
defaults to 30 seconds (even though the fd is O_NONBLOCK).  This is
rarely desired.  Implement the standard mechanism to adjust closing_wait
and let applications handle it how they want to.

See also 02303f73373aa1da19dbec510ec5a4e2576f9610 for usb_wwan.c.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Acked-by: Oliver Neukum <oneukum@suse.de>
Tested-by: Aleksander Morgado <aleksander@gnu.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ring-buffer: Fix race between integrity check and readers

commit 9366c1ba13fbc41bdb57702e75ca4382f209c82f upstream.

The function rb_check_pages() was added to make sure the ring buffer's
pages were sane. This check is done when the ring buffer size is modified
as well as when the iterator is released (closing the "trace" file),
as that was considered a non fast path and a good place to do a sanity
check.

The problem is that the check does not have any locks around it.
If one process were to read the trace file, and another were to read
the raw binary file, the check could happen while the reader is reading
the file.

The issues with this is that the check requires to clear the HEAD page
before doing the full check and it restores it afterward. But readers
require the HEAD page to exist before it can read the buffer, otherwise
it gives a nasty warning and disables the buffer.

By adding the reader lock around the check, this keeps the race from
happening.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ring-buffer: Fix NULL pointer if rb_set_head_page() fails

commit 54f7be5b831254199522523ccab4c3d954bbf576 upstream.

The function rb_set_head_page() searches the list of ring buffer
pages for a the page that has the HEAD page flag set. If it does
not find it, it will do a WARN_ON(), disable the ring buffer and
return NULL, as this should never happen.

But if this bug happens to happen, not all callers of this function
can handle a NULL pointer being returned from it. That needs to be
fixed.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ftrace: Clear bits properly in reset_iter_read()

commit 70f77b3f7ec010ff9624c1f2e39a81babc9e2429 upstream.

There is a typo here where '&' is used instead of '|' and it turns the
statement into a noop. The original code is equivalent to:

iter->flags &= ~((1 << 2) & (1 << 4));

Link: http://lkml.kernel.org/r/20120609161027.GD6488@elgon.mountain
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xhci: Extend Fresco Logic MSI quirk.

commit bba18e33f25072ebf70fd8f7f0cdbf8cdb59a746 upstream.

Ali reports that plugging a device into the Fresco Logic xHCI host with
PCI device ID 1400 produces an IRQ error:

do_IRQ: 3.176 No irq handler for vector (irq -1)

Other early Fresco Logic host revisions don't support MSI, even though
their PCI config space claims they do.  Extend the quirk to disabling
MSI to this chipset revision.  Also enable the short transfer quirk,
since it's likely this revision also has that quirk, and it should be
harmless to enable.

04:00.0 0c03: 1b73:1400 (rev 01) (prog-if 30 [XHCI])
        Subsystem: 1d5c:1000
        Physical Slot: 3
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 51
        Region 0: Memory at d4600000 (32-bit, non-prefetchable) [size=64K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000feeff00c  Data: 41b1
        Capabilities: [80] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <2us, L1 <32us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 unlimited, L1 unlimited
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Kernel driver in use: xhci_hcd

This patch should be backported to stable kernels as old as 2.6.36, that
contain the commit f5182b4155b9d686c5540a6822486400e34ddd98 "xhci:
Disable MSI for some Fresco Logic hosts."

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reported-by: A Sh <smr.ash1991@gmail.com>
Tested-by: A Sh <smr.ash1991@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: OHCI: workaround for hardware bug: retired TDs not added to the Done Queue

commit 50ce5c0683aa83eb161624ea89daa5a9eee0c2ce upstream.

This patch (as1636) is a partial workaround for a hardware bug
affecting OHCI controllers by NVIDIA at least, maybe others too.  When
the controller retires a Transfer Descriptor, it is supposed to add
the TD onto the Done Queue.  But sometimes this doesn't happen, with
the result that ohci-hcd never realizes the corresponding transfer has
finished.  Symptoms can vary; a typical result is that USB audio stops
working after a while.

The patch works around the problem by recognizing that TDs are always
processed in order.  Therefore, if a later TD is found on the Done
Queue than all the earlier TDs for the same endpoint must be finished
as well.

Unfortunately this won't solve the problem in cases where the missing
TD is the last one in the endpoint's queue.  A complete fix would
require a signficant amount of change to the driver.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / video: Add "Asus UL30VT" to ACPI video detect blacklist

commit d0c2ce16bec0afa6013b4c5220ca4c9c67210215 upstream.

The ACPI video driver can't control backlight correctly on
Asus UL30VT. Vendor driver (asus-laptop) can work. This patch is to
add "Asus UL30VT" to ACPI video detect blacklist in order to use
asus-laptop for video control on the "Asus UL30VT" rather than ACPI
video driver.

References: https://bugzilla.kernel.org/show_bug.cgi?id=32592
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / video: ignore BIOS initial backlight value for HP Folio 13-2000

commit 129ff8f8d58297b04f47b5d6fad81aa2d08404e1 upstream.

Or else the laptop will boot with a dimmed screen.

References: https://bugzilla.kernel.org/show_bug.cgi?id=51141
Tested-by: Stefan Nagy <public@stefan-nagy.at>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / PNP: Do not crash due to stale pointer use during system resume

commit a6b5e88c0e42093b9057856f35770966c8c591e3 upstream.

During resume from system suspend the 'data' field of
struct pnp_dev in pnpacpi_set_resources() may be a stale pointer,
due to removal of the associated ACPI device node object in the
previous suspend-resume cycle.  This happens, for example, if a
dockable machine is booted in the docking station and then suspended
and resumed and suspended again.  If that happens,
pnpacpi_build_resource_template() called from pnpacpi_set_resources()
attempts to use that pointer and crashes.

However, pnpacpi_set_resources() actually checks the device's ACPI
handle, attempts to find the ACPI device node object attached to it
and returns an error code if that fails, so in fact it knows what the
correct value of dev->data should be.  Use this observation to update
dev->data with the correct value if necessary and dump a call trace
if that's the case (once).

We still need to fix the root cause of this issue, but preventing
systems from crashing because of it is an improvement too.

Reported-and-tested-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
References: https://bugzilla.kernel.org/show_bug.cgi?id=51071
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / PM: Add Sony Vaio VPCEB1S1E to nonvs blacklist.

commit 876ab79055019e248508cfd0dee7caa3c0c831ed upstream.

Sony Vaio VPCEB1S1E does not resume correctly without
acpi_sleep=nonvs, so add it to the ACPI sleep blacklist.

References: https://bugzilla.kernel.org/show_bug.cgi?id=48781
Reported-by: Sébastien Wilmet <swilmet@gnome.org>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / battery: Correct battery capacity values on Thinkpads

commit 4000e626156935dfb626321ce09cae2c833eabbb upstream.

Add a quirk to correctly report battery capacity on 2010 and 2011
Lenovo Thinkpad models.

The affected models that I tested (x201, t410, t410s, and x220)
exhibit a problem where, when battery capacity reporting unit is mAh,
the values being reported are wrong.  Pre-2010 and 2012 models appear
to always report in mWh and are thus unaffected.  Also, in mid-2012
Lenovo issued a BIOS update for the 2011 models that fixes the issue
(tested on x220 with a post-1.29 BIOS).  No such update is available
for the 2010 models, so those still need this patch.

Problem description: for some reason, the affected Thinkpads switch
the reporting unit between mAh and mWh; generally, mAh is used when a
laptop is plugged in and mWh when it's unplugged, although a
suspend/resume or rmmod/modprobe is needed for the switch to take
effect.  The values reported in mAh are *always* wrong.  This does
not appear to be a kernel regression; I believe that the values were
never reported correctly.  I tested back to kernel 2.6.34, with
multiple machines and BIOS versions.

Simply plugging a laptop into mains before turning it on is enough to
reproduce the problem.  Here's a sample /proc/acpi/battery/BAT0/info
from Thinkpad x220 (before a BIOS update) with a 4-cell battery:

present:                 yes
design capacity:         2886 mAh
last full capacity:      2909 mAh
battery technology:      rechargeable
design voltage:          14800 mV
design capacity warning: 145 mAh
design capacity low:     13 mAh
cycle count:              0
capacity granularity 1:  1 mAh
capacity granularity 2:  1 mAh
model number:            42T4899
serial number:           21064
battery type:            LION
OEM info:                SANYO

Once the laptop switches the unit to mWh (unplug from mains, suspend,
resume), the output changes to:

present:                 yes
design capacity:         28860 mWh
last full capacity:      29090 mWh
battery technology:      rechargeable
design voltage:          14800 mV
design capacity warning: 1454 mWh
design capacity low:     200 mWh
cycle count:              0
capacity granularity 1:  1 mWh
capacity granularity 2:  1 mWh
model number:            42T4899
serial number:           21064
battery type:            LION
OEM info:                SANYO

Can you see how the values for "design capacity", etc., differ by a
factor of 10 instead of 14.8 (the design voltage of this battery)?
On the battery itself it says: 14.8V, 1.95Ah, 29Wh, so clearly the
values reported in mWh are correct and the ones in mAh are not.

My guess is that this problem has been around ever since those
machines were released, but because the most common Thinkpad
batteries are rated at 10.8V, the error (8%) is small enough that it
simply hasn't been noticed or at least nobody could be bothered to
look into it.

My patch works around the problem by adjusting the incorrectly
reported mAh values by "10000 / design_voltage".  The patch also has
code to figure out if it should be activated or not.  It only
activates on Lenovo Thinkpads, only when the unit is mAh, and, as an
extra precaution, only when the battery capacity reported through
ACPI does not match what is reported through DMI (I've never
encountered a machine where the first two conditions would be true
but the last would not, but better safe than sorry).

I've been using this patch for close to a year on several systems
without any problems.

References: https://bugzilla.kernel.org/show_bug.cgi?id=41062
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: mark uas driver as BROKEN

commit fb37ef98015f864d22be223a0e0d93547cd1d4ef upstream.

As reported https://bugzilla.kernel.org/show_bug.cgi?id=51031, the UAS
driver causes problems and has been asked to be not built into any of
the major distributions. To prevent users from running into problems
with it, and for distros that were not notified, just mark the whole
thing as broken.

Acked-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: cp210x: add Virtenio Preon32 device id

commit 356fe44f4b8ece867bdb9876b1854d7adbef9de2 upstream.

Signed-off-by: Markus Becker <mab@comnets.uni-bremen.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: ftdi_sio: fixup BeagleBone A5+ quirk

commit 1a88d5eee2ef2ad1d3c4e32043e9c4c5347d4fc1 upstream.

BeagleBone A5+ devices ended up getting shipped with the
'BeagleBone/XDS100V2' product string, and not XDS100 like it
was agreed, so adjust the quirk to match.

For details, see the thread on the beagle list:

https://groups.google.com/forum/#!msg/beagleboard/zrFPew9_Wvo/ibWr1-eE8JwJ

Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: ftdi_sio: Add support for Newport AGILIS motor drivers

commit d7e14b375b40c04cd735b115713043b69a2c68ac upstream.

The Newport AGILIS model AG-UC8 compact piezo motor controller
(http://search.newport.com/?q=*&x2=sku&q2=AG-UC8)
is yet another device using an FTDI USB-to-serial chip. It works
fine with the ftdi_sio driver when adding

options ftdi-sio product=0x3000 vendor=0x104d

to modprobe.d. udevadm reports "Newport" as the manufacturer,
and "Agilis" as the product name.

Signed-off-by: Martin Teichmann <lkb.teichmann@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: option: blacklist network interface on Huawei E173

commit f36446cf9bbebaa03a80d95cfeeafbaf68218249 upstream.

The Huawei E173 will normally appear as 12d1:1436 in Linux.  But
the modem has another mode with different device ID and a slightly
different set of descriptors. This is the mode used by Windows like
this:

  3Modem:      USB\VID_12D1&PID_140C&MI_00\6&3A1D2012&0&0000
  Networkcard: USB\VID_12D1&PID_140C&MI_01\6&3A1D2012&0&0001
  Appli.Inter: USB\VID_12D1&PID_140C&MI_02\6&3A1D2012&0&0002
  PC UI Inter: USB\VID_12D1&PID_140C&MI_03\6&3A1D2012&0&0003

All interfaces have the same ff/ff/ff class codes in this mode.
Blacklisting the network interface to allow it to be picked up by
the network driver.

Reported-by: Thomas Schäfer <tschaefer@t-online.de>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: add new zte 3g-dongle's pid to option.c

commit 31b6a1048b7292efff8b5b53ae3d9d29adde385e upstream.

Signed-off-by: Rui li <li.rui27@zte.com.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86: hpet: Fix masking of MSI interrupts

commit 6acf5a8c931da9d26c8dd77d784daaf07fa2bff0 upstream.

HPET_TN_FSB is not a proper mask bit; it merely toggles between MSI and
legacy interrupt delivery. The proper mask bit is HPET_TN_ENABLE, so
use both bits when (un)masking the interrupt.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/5093E09002000078000A60E6@nat28.tlf.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ezusb: add dependency to USB

commit 9db72fe6c943852032d9ed863c87e8c02d3cb9da upstream.

This fixes an error during modpost, when ezusb is built into the kernel
while USB is built as module.

Signed-off-by: René Bürgel <rene.buergel@sohard.de>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

telephony: ijx: buffer overflow in ixj_write_cid()

[Not needed in 3.8 or newer as this driver is removed there. - gregkh]

We get this from user space and nothing has been done to ensure that
these strings are NUL terminated.

Reported-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86,AMD: Power driver support for AMD's family 16h processors

commit 22e32f4f57778ebc6e17812fa3008361c05d64f9 upstream.

Add family 16h PCI ID to AMD's power driver to allow it report
power consumption on these processors.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: ux500: fix bit error

commit 2630b17b6ee47ac79b4f5120ac49105027f644ea upstream.

This fixes a bit error in the U8500 clock implementation: the
unused p2_pclk12 registered at bit 12 in periphereral group 6
was defined as using bit 11 rather than bit 12.

When walking over and disabling the unused clocks in the tree
at late init time, p2_pclk12 was disabled, by effectively
clearing the but for p2_pclk11 instead of bit 12 as it should
have, thus disabling gpio block 6 and 7.

Reported-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Philippe Begnic <philippe.begnic@st.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: dmapool: use provided gfp flags for all dma_alloc_coherent() calls

commit 387870f2d6d679746020fa8e25ef786ff338dc98 upstream.

dmapool always calls dma_alloc_coherent() with GFP_ATOMIC flag,
regardless the flags provided by the caller. This causes excessive
pruning of emergency memory pools without any good reason. Additionaly,
on ARM architecture any driver which is using dmapools will sooner or
later trigger the following error:
"ERROR: 256 KiB atomic DMA coherent pool is too small!
Please increase it with coherent_pool= kernel parameter!".
Increasing the coherent pool size usually doesn't help much and only
delays such error, because all GFP_ATOMIC DMA allocations are always
served from the special, very limited memory pool.

This patch changes the dmapool code to correctly use gfp flags provided
by the dmapool caller.

Reported-by: Soeren Moch <smoch@web.de>
Reported-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Soeren Moch <smoch@web.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 3.7

Input: matrix-keymap - provide proper module license

The matrix-keymap module is currently lacking a proper module license,
add one so we don't have this module tainting the entire kernel. This
issue has been present since commit 1932811f426f ("Input: matrix-keymap
- uninline and prepare for device tree support")

Signed-off-by: Florian Fainelli <florian@openwrt.org>
CC: stable@vger.kernel.org # v3.5+
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Netlink socket dumping had several missing verifications and checks.

    In particular, address comparisons in the request byte code
    interpreter could access past the end of the address in the
    inet_request_sock.

    Also, address family and address prefix lengths were not validated
    properly at all.

    This means arbitrary applications can read past the end of certain
    kernel data structures.

    Fixes from Neal Cardwell.

2) ip_check_defrag() operates in contexts where we're in the process
    of, or about to, input the packet into the real protocols
    (specifically macvlan and AF_PACKET snooping).

    Unfortunately, it does a pskb_may_pull() which can modify the
    backing packet data which is not legal if the SKB is shared.  It
    very much can be shared in this context.

    Deal with the possibility that the SKB is segmented by using
    skb_copy_bits().

    Fix from Johannes Berg based upon a report by Eric Leblond.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  ipv4: ip_check_defrag must not modify skb before unsharing
  inet_diag: validate port comparison byte code to prevent unsafe reads
  inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run()
  inet_diag: validate byte code to prevent oops in inet_diag_bc_run()
  inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state

Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage

This reverts commits a50915394f1fc02c2861d3b7ce7014788aa5066e and
d7c3b937bdf45f0b844400b7bf6fd3ed50bac604.

This is a revert of a revert of a revert.  In addition, it reverts the
even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the
original commits in linux-next.

It turns out that the original patch really was bogus, and that the
original revert was the correct thing to do after all.  We thought we
had fixed the problem, and then reverted the revert, but the problem
really is fundamental: waking up kswapd simply isn't the right thing to
do, and direct reclaim sometimes simply _is_ the right thing to do.

When certain allocations fail, we simply should try some direct reclaim,
and if that fails, fail the allocation.  That's the right thing to do
for THP allocations, which can easily fail, and the GPU allocations want
to do that too.

So starting kswapd is sometimes simply wrong, and removing the flag that
said "don't start kswapd" was a mistake.  Let's hope we never revisit
this mistake again - and certainly not this many times ;)

Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ipv4: ip_check_defrag must not modify skb before unsharing

ip_check_defrag() might be called from af_packet within the
RX path where shared SKBs are used, so it must not modify
the input SKB before it has unshared it for defragmentation.
Use skb_copy_bits() to get the IP header and only pull in
everything later.

The same is true for the other caller in macvlan as it is
called from dev->rx_handler which can also get a shared SKB.

Reported-by: Eric Leblond <eric@regit.org>
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "mm: avoid waking kswapd for THP allocations when compaction is deferred or contended"

This reverts commit 782fd30406ecb9d9b082816abe0c6008fc72a7b0.

We are going to reinstate the __GFP_NO_KSWAPD flag that has been
removed, the removal reverted, and then removed again. Making this
commit a pointless fixup for a problem that was caused by the removal of
__GFP_NO_KSWAPD flag.

The thing is, we really don't want to wake up kswapd for THP allocations
(because they fail quite commonly under any kind of memory pressure,
including when there is tons of memory free), and these patches were
just trying to fix up the underlying bug: the original removal of
__GFP_NO_KSWAPD in commit c654345924f7 ("mm: remove __GFP_NO_KSWAPD")
was simply bogus.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

inet_diag: validate port comparison byte code to prevent unsafe reads

Add logic to verify that a port comparison byte code operation
actually has the second inet_diag_bc_op from which we read the port
for such operations.

Previously the code blindly referenced op[1] without first checking
whether a second inet_diag_bc_op struct could fit there. So a
malicious user could make the kernel read 4 bytes beyond the end of
the bytecode array by claiming to have a whole port comparison byte
code (2 inet_diag_bc_op structs) when in fact the bytecode was not
long enough to hold both.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run()

Add logic to check the address family of the user-supplied conditional
and the address family of the connection entry. We now do not do
prefix matching of addresses from different address families (AF_INET
vs AF_INET6), except for the previously existing support for having an
IPv4 prefix match an IPv4-mapped IPv6 address (which this commit
maintains as-is).

This change is needed for two reasons:

(1) The addresses are different lengths, so comparing a 128-bit IPv6
prefix match condition to a 32-bit IPv4 connection address can cause
us to unwittingly walk off the end of the IPv4 address and read
garbage or oops.

(2) The IPv4 and IPv6 address spaces are semantically distinct, so a
simple bit-wise comparison of the prefixes is not meaningful, and
would lead to bogus results (except for the IPv4-mapped IPv6 case,
which this commit maintains).

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet_diag: validate byte code to prevent oops in inet_diag_bc_run()

Add logic to validate INET_DIAG_BC_S_COND and INET_DIAG_BC_D_COND
operations.

Previously we did not validate the inet_diag_hostcond, address family,
address length, and prefix length. So a malicious user could make the
kernel read beyond the end of the bytecode array by claiming to have a
whole inet_diag_hostcond when the bytecode was not long enough to
contain a whole inet_diag_hostcond of the given address family. Or
they could make the kernel read up to about 27 bytes beyond the end of
a connection address by passing a prefix length that exceeded the
length of addresses of the given family.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state

Fix inet_diag to be aware of the fact that AF_INET6 TCP connections
instantiated for IPv4 traffic and in the SYN-RECV state were actually
created with inet_reqsk_alloc(), instead of inet6_reqsk_alloc(). This
means that for such connections inet6_rsk(req) returns a pointer to a
random spot in memory up to roughly 64KB beyond the end of the
request_sock.

With this bug, for a server using AF_INET6 TCP sockets and serving
IPv4 traffic, an inet_diag user like `ss state SYN-RECV` would lead to
inet_diag_fill_req() causing an oops or the export to user space of 16
bytes of kernel memory as a garbage IPv6 address, depending on where
the garbage inet6_rsk(req) pointed.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mm: vmscan: fix inappropriate zone congestion clearing

commit c702418f8a2f ("mm: vmscan: do not keep kswapd looping forever due
to individual uncompactable zones") removed zone watermark checks from
the compaction code in kswapd but left in the zone congestion clearing,
which now happens unconditionally on higher order reclaim.

This messes up the reclaim throttling logic for zones with
dirty/writeback pages, where zones should only lose their congestion
status when their watermarks have been restored.

Remove the clearing from the zone compaction section entirely. The
preliminary zone check and the reclaim loop in kswapd will clear it if
the zone is considered balanced.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vfs: fix O_DIRECT read past end of block device

The direct-IO write path already had the i_size checks in mm/filemap.c,
but it turns out the read path did not, and removing the block size
checks in fs/block_dev.c (commit bbec0270bdd8: "blkdev_max_block: make
private to fs/buffer.c") removed the magic "shrink IO to past the end of
the device" code there.

Fix it by truncating the IO to the size of the block device, like the
write path already does.

NOTE! I suspect the write path would be *much* better off doing it this
way in fs/block_dev.c, rather than hidden deep in mm/filemap.c. The
mm/filemap.c code is extremely hard to follow, and has various
conditionals on the target being a block device (ie the flag passed in
to 'generic_write_checks()', along with a conditional update of the
inode timestamp etc).

It is also quite possible that we should treat this whole block device
size as a "s_maxbytes" issue, and try to make the logic even more
generic. However, in the meantime this is the fairly minimal targeted
fix.

Noted by Milan Broz thanks to a regression test for the cryptsetup
reencrypt tool.

Reported-and-tested-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"Two stragglers:

   1) The new code that adds new flushing semantics to GRO can cause SKB
      pointer list corruption, manage the lists differently to avoid the
      OOPS.  Fix from Eric Dumazet.

   2) When TCP fast open does a retransmit of data in a SYN-ACK or
      similar, we update retransmit state that we shouldn't triggering a
      WARN_ON later.  Fix from Yuchung Cheng."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: gro: fix possible panic in skb_gro_receive()
  tcp: bug fix Fast Open client retransmission

net: gro: fix possible panic in skb_gro_receive()

commit 2e71a6f8084e (net: gro: selective flush of packets) added
a bug for skbs using frag_list. This part of the GRO stack is rarely
used, as it needs skb not using a page fragment for their skb->head.

Most drivers do use a page fragment, but some of them use GFP_KERNEL
allocations for the initial fill of their RX ring buffer.

napi_gro_flush() overwrite skb->prev that was used for these skb to
point to the last skb in frag_list.

Fix this using a separate field in struct napi_gro_cb to point to the
last fragment.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: bug fix Fast Open client retransmission

If SYN-ACK partially acks SYN-data, the client retransmits the
remaining data by tcp_retransmit_skb(). This increments lost recovery
state variables like tp->retrans_out in Open state. If loss recovery
happens before the retransmission is acked, it triggers the WARN_ON
check in tcp_fastretrans_alert(). For example: the client sends
SYN-data, gets SYN-ACK acking only ISN, retransmits data, sends
another 4 data packets and get 3 dupacks.

Since the retransmission is not caused by network drop it should not
update the recovery state variables. Further the server may return a
smaller MSS than the cached MSS used for SYN-data, so the retranmission
needs a loop. Otherwise some data will not be retransmitted until timeout
or other loss recovery events.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mmc-fixes-for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC fixes from Chris Ball:
"Two small regression fixes:

   - sdhci-s3c: Fix runtime PM regression against 3.7-rc1
   - sh-mmcif: Fix oops against 3.6"

* tag 'mmc-fixes-for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: sh-mmcif: avoid oops on spurious interrupts (second try)
  Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts"
  mmc: sdhci-s3c: fix missing clock for gpio card-detect

tmpfs: fix shared mempolicy leak

This fixes a regression in 3.7-rc, which has since gone into stable.

Commit 00442ad04a5e ("mempolicy: fix a memory corruption by refcount
imbalance in alloc_pages_vma()") changed get_vma_policy() to raise the
refcount on a shmem shared mempolicy; whereas shmem_alloc_page() went
on expecting alloc_page_vma() to drop the refcount it had acquired.
This deserves a rework: but for now fix the leak in shmem_alloc_page().

Hugh: shmem_swapin() did not need a fix, but surely it's clearer to use
the same refcounting there as in shmem_alloc_page(), delete its onstack
mempolicy, and the strange mpol_cond_copy() and __mpol_cond_copy() -
those were invented to let swapin_readahead() make an unknown number of
calls to alloc_pages_vma() with one mempolicy; but since 00442ad04a5e,
alloc_pages_vma() has kept refcount in balance, so now no problem.

Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: vmscan: do not keep kswapd looping forever due to individual uncompactable zones

When a zone meets its high watermark and is compactable in case of
higher order allocations, it contributes to the percentage of the node's
memory that is considered balanced.

This requirement, that a node be only partially balanced, came about
when kswapd was desparately trying to balance tiny zones when all bigger
zones in the node had plenty of free memory.  Arguably, the same should
apply to compaction: if a significant part of the node is balanced
enough to run compaction, do not get hung up on that tiny zone that
might never get in shape.

When the compaction logic in kswapd is reached, we know that at least
25% of the node's memory is balanced properly for compaction (see
zone_balanced and pgdat_balanced).  Remove the individual zone checks
that restart the kswapd cycle.

Otherwise, we may observe more endless looping in kswapd where the
compaction code loops back to reclaim because of a single zone and
reclaim does nothing because the node is considered balanced overall.

See for example

  https://bugzilla.redhat.com/show_bug.cgi?id=866988

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-and-tested-by: Thorsten Leemhuis <fedora@leemhuis.info>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: John Ellson <john.ellson@comcast.net>
Tested-by: Zdenek Kabelac <zkabelac@redhat.com>
Tested-by: Bruno Wolff III <bruno@wolff.to>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: compaction: validate pfn range passed to isolate_freepages_block

Commit 0bf380bc70ec ("mm: compaction: check pfn_valid when entering a
new MAX_ORDER_NR_PAGES block during isolation for migration") added a
check for pfn_valid() when isolating pages for migration as the scanner
does not necessarily start pageblock-aligned.

Since commit c89511ab2f8f ("mm: compaction: Restart compaction from near
where it left off"), the free scanner has the same problem.  This patch
makes sure that the pfn range passed to isolate_freepages_block() is
within the same block so that pfn_valid() checks are unnecessary.

In answer to Henrik's wondering why others have not reported this:
reproducing this requires a large enough hole with the right aligment to
have compaction walk into a PFN range with no memmap.  Size and
alignment depends in the memory model - 4M for FLATMEM and 128M for
SPARSEMEM on x86.  It needs a "lucky" machine.

Reported-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: sh-mmcif: avoid oops on spurious interrupts (second try)

On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
interrupts without any active request. To prevent the Oops, that results
in such cases, don't dereference the mmc request pointer until we make
sure, that we are indeed processing such a request.

Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Tested-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Chris Ball <cjb@laptop.org>

Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts"

This reverts commit 8464dd52d3198dd05, which was a misapplied debugging
version of the patch, not the final patch itself.

Signed-off-by: Chris Ball <cjb@laptop.org>
Cc: stable@vger.kernel.org