git.karo-electronics.de Git - karo-tx-linux.git/log

mm: account for MAP_SHARED mappings using VM_MAYSHARE and not VM_SHARED in hugetlbfs

commit f83a275dbc5ca1721143698e844243fcadfabf6a upstream.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13302

hugetlbfs reserves huge pages but does not fault them at mmap() time to
ensure that future faults succeed.  The reservation behaviour differs
depending on whether the mapping was mapped MAP_SHARED or MAP_PRIVATE.
For MAP_SHARED mappings, hugepages are reserved when mmap() is first
called and are tracked based on information associated with the inode.
Other processes mapping MAP_SHARED use the same reservation.  MAP_PRIVATE
track the reservations based on the VMA created as part of the mmap()
operation.  Each process mapping MAP_PRIVATE must make its own
reservation.

hugetlbfs currently checks if a VMA is MAP_SHARED with the VM_SHARED flag
and not VM_MAYSHARE.  For file-backed mappings, such as hugetlbfs,
VM_SHARED is set only if the mapping is MAP_SHARED and the file was opened
read-write.  If a shared memory mapping was mapped shared-read-write for
populating of data and mapped shared-read-only by other processes, then
hugetlbfs would account for the mapping as if it was MAP_PRIVATE.  This
causes processes to fail to map the file MAP_SHARED even though it should
succeed as the reservation is there.

This patch alters mm/hugetlb.c and replaces VM_SHARED with VM_MAYSHARE
when the intent of the code was to check whether the VMA was mapped
MAP_SHARED or MAP_PRIVATE.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <starlight@binnacle.cx>
Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

igb: fix LRO warning

This fix is only needed for 2.6.29.y tree, since in 2.6.30 and later IGB
has moved to using GRO instead of LRO.

igb supports LRO, but was not setting any hooks to the ->set_flags
ethtool_ops function. This would trigger warnings if the user tried
to enable or disable LRO.

Based on the patch provided by Stephen Hemminger <shemminger@vyatta.com>

Reported-by: Sergey Kononenko <sergk@sergk.org.ua>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

hwmon: (lm78) Add missing __devexit_p()

commit 39d8bbedb9571a89d638f5b05358f26ab503d7a6 upstream.

The remove function uses __devexit, so the .remove assignment needs
__devexit_p() to fix a build error with hotplug disabled.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e1000: add missing length check to e1000 receive routine

commit ea30e11970a96cfe5e32c03a29332554573b4a10 upstream.

Patch to fix bad length checking in e1000.  E1000 by default does two
things:

1) Spans rx descriptors for packets that don't fit into 1 skb on recieve
2) Strips the crc from a frame by subtracting 4 bytes from the length prior to
doing an skb_put

Since the e1000 driver isn't written to support receiving packets that span
multiple rx buffers, it checks the End of Packet bit of every frame, and
discards it if its not set.  This places us in a situation where, if we have a
spanning packet, the first part is discarded, but the second part is not (since
it is the end of packet, and it passes the EOP bit test).  If the second part of
the frame is small (4 bytes or less), we subtract 4 from it to remove its crc,
underflow the length, and wind up in skb_over_panic, when we try to skb_put a
huge number of bytes into the skb.  This amounts to a remote DOS attack through
careful selection of frame size in relation to interface MTU.  The fix for this
is already in the e1000e driver, as well as the e1000 sourceforge driver, but no
one ever pushed it to e1000.  This is lifted straight from e1000e, and prevents
small frames from causing the underflow described above

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drivers/serial/mpc52xx_uart.c: fix array overindexing check

commit b898f4f869da5b9d41f297fff87aca4cd42d80b3 upstream.

The check for an overindexing of mpc52xx_uart_{ports,nodes} has an
off-by-one.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

cpuidle: make AMC C1E work in processor_idle

commit 87ad57bacb25c3f24c54f142ef445f68277705f0 upstream

When AMD C1E is enabled, local APIC timer will stop even in C1. This patch uses
broadcast ipi to replace local APIC timer in C1.

http://bugzilla.kernel.org/show_bug.cgi?id=13233

[ impact: avoid boot hang in AMD CPU with C1E enabled ]

Tested-by: Dmitry Lyzhyn <thisistempbox@yahoo.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

cpuidle: fix AMD C1E suspend hang

commit 7d60e8ab0d5507229dfbdf456501cc378610fa01 upstream.

When AMD C1E is enabled, local APIC timer will stop even in C1. To avoid
suspend/resume hang, this patch removes C1 and replace it with a cpu_relax() in
suspend/resume path. This hasn't any impact in runtime path.

http://bugzilla.kernel.org/show_bug.cgi?id=13233

[ impact: avoid suspend/resume hang in AMD CPU with C1E enabled ]

Tested-by: Dmitry Lyzhyn <thisistempbox@yahoo.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bnx2: Fix panic in bnx2_poll_work().

commit 581daf7e00c5e766f26aff80a61a860a17b0d75a upstream.

Add barrier() to bnx2_get_hw_{tx|rx}_cons() to fix this issue:

http://bugzilla.kernel.org/show_bug.cgi?id=12698

This issue was reported by multiple i386 users.  Without barrier(),
the compiled code looks like the following where %eax contains the
address of the tx_cons or rx_cons in the DMA status block.  The
status block contents can change between the cmpb and the movzwl
instruction.  The driver would crash if the value was not 0xff during
the cmpb instruction, but changed to 0xff during the movzwl
instruction.

6828: 80 38 ff              cmpb   $0xff,(%eax)
682b: 0f b7 10              movzwl (%eax),%edx

With the added barrier(), the compiled code now looks correct:

683d: 0f b7 10              movzwl (%eax),%edx
6840: 0f b6 c2              movzbl %dl,%eax
6843: 3d ff 00 00 00        cmp    $0xff,%eax

Thanks to Pascal de Bruijn <pmjdebruijn@pcode.nl> for reporting the
problem and Holger Noefer <hnoefer@pironet-ndh.com> for patiently
testing test patches for us.

[greg - took out version change]

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

3w-xxxx: scsi_dma_unmap fix

commit 7b14f58ad65f9d74e4273fb45360cfea824495aa upstream.

This patch fixes the following regression that occurred during the
scsi_dma_map()/unmap()
changes when compiling with CONFIG_DMA_API_DEBUG=y :

WARNING: at lib/dma-debug.c:496 check_unmap+0x142/0x542()
Hardware name:
3w-xxxx 0000:02:02.0: DMA-API: device driver tries to free DMA memory
it has not allocated [device address=0x0000000000000000] [size=36
bytes]

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86: work around Fedora-11 x86-32 kernel failures on Intel Atom CPUs

commit 211b3d03c7400f48a781977a50104c9d12f4e229 upstream

[Trivial backport to 2.6.27 by cebbert@redhat.com]

x86: work around Fedora-11 x86-32 kernel failures on Intel Atom CPUs

Impact: work around boot crash

Work around Intel Atom erratum AAH41 (probabilistically) - it's triggering
in the field.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

tcp: fix >2 iw selection

[ Upstream commit 86bcebafc5e7f5163ccf828792fe694b112ed6fa ]

A long-standing feature in tcp_init_metrics() is such that
any of its goto reset prevents call to tcp_init_cwnd().

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

net: fix skb_seq_read returning wrong offset/length for page frag data

[ Upstream commit 995b337952cdf7e05d288eede580257b632a8343 ]

When called with a consumed value that is less than skb_headlen(skb)
bytes into a page frag, skb_seq_read() incorrectly returns an
offset/length relative to skb->data. Ensure that data which should come
from a page frag does.

Signed-off-by: Thomas Chenault <thomas_chenault@dell.com>
Tested-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

pktgen: do not access flows[] beyond its length

[ Upstream commit 5b5f792a6a9a2f9ae812d151ed621f72e99b1725 ]

typo -- pkt_dev->nflows is for stats only, the number of concurrent
flows is stored in cflows.

Reported-By: Vladimir Ivashchenko <hazard@francoudi.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

myr10ge: again fix lro_gen_skb() alignment

[ Upstream commit 636d2f68a0814d84de26c021b2c15e3b4ffa29de ]

Add LRO alignment initially committed in
621544eb8c3beaa859c75850f816dd9b056a00a3 ("[LRO]: fix lro_gen_skb()
alignment") and removed in 0dcffac1a329be69bab0ac604bf7283737108e68
("myri10ge: add multislices support") during conversion to
multi-slice.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

vlan/macvlan: fix NULL pointer dereferences in ethtool handlers

[ Upstream commit 7816a0a862d851d0b05710e7d94bfe390f3180e2 ]

Check whether the underlying device provides a set of ethtool ops before
checking for individual handlers to avoid NULL pointer dereferences.

Reported-by: Art van Breemen <ard@telegraafnet.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bonding: fix alb mode locking regression

[ Upstream commit 815bcc2719c12b6f5b511706e2d19728e07f0b02 ]

Fix locking issue in alb MAC address management; removed
incorrect locking and replaced with correct locking. This bug was
introduced in commit 059fe7a578fba5bbb0fdc0365bfcf6218fa25eb0
("bonding: Convert locks to _bh, rework alb locking for new locking")

Bug reported by Paul Smith <paul@mad-scientist.net>, who also
tested the fix.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc64: Reschedule KGDB capture to a software interrupt.

[ Upstream commit 42cc77c861e8e850e86252bb5b1e12e006261973 ]

Otherwise it might interrupt switch_to() midstream and use
half-cooked register window state.

Reported-by: Chris Torek <chris.torek@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc64: Fix lost interrupts on sun4u.

[ Upstream commit d0cac39e4ec8097e4c7099d291b1fdcc0fe56b58 ]

Based upon a report by Meelis Roos.

Sparc64 SBUS and PCI controllers use a combination of IMAP and ICLR
registers to manage device interrupts.

The IMAP register contains the "valid" enable bit as well as CPU
targetting information. Whereas the ICLR register is written with
zero at the end of handling an interrupt to reset the state machine
for that interrupt to IDLE so it can be sent again.

For PCI slot and SBUS slot devices we can have multiple interrupts
sharing the same IMAP register. There are individual ICLR registers
but only one IMAP register for managing those.

We represent each shared case with individual virtual IRQs so the
generic IRQ layer thinks there is only one user of the IRQ instance.

In such shared IMAP cases this is wrong, so if there are multiple
active users then a free_irq() call will prematurely turn off the
interrupt by clearing the Valid bit in the IMAP register even though
there are other active users.

Fix this by simply doing nothing in sun4u_disable_irq() and checking
IRQF_DISABLED during IRQ dispatch.

This situation doesn't exist in the hypervisor sun4v cases, so I left
those alone.

Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc64: Fix crash with /proc/iomem

[ Upstream commit 192d7a4667c6d11d1a174ec4cad9a3c5d5f9043c ]

When you compile kernel on Sparc64 with heap memory checking and type
"cat /proc/iomem", you get a crash, because pointers in struct
resource are uninitialized.

Most code fills struct resource with zeros, so I assume that it is
responsibility of the caller of request_resource to initialized it,
not the responsibility of request_resource functuion.

After 2.6.29 is out, there could be a check for uninitialized fields
added to request_resource to avoid crashes like this.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc64: Flush TLB before releasing pages.

[ Upstream commit 86ee79c3dbd48d7430fd81edc1da3516c9f6dabc ]

tlb_flush_mmu() needs to flush pending TLB entries before
processing the mmu_gather ->pages list.

Noticed by Benjamin Herrenschmidt.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc64: Fix MM refcount check in smp_flush_tlb_pending().

[ Upstream commit f9384d41c02408dd404aa64d66d0ef38adcf6479 ]

As explained by Benjamin Herrenschmidt:

> CPU 0 is running the context, task->mm == task->active_mm == your
> context. The CPU is in userspace happily churning things.
>
> CPU 1 used to run it, not anymore, it's now running fancyfsd which
> is a kernel thread, but current->active_mm still points to that
> same context.
>
> Because there's only one "real" user, mm_users is 1 (but mm_count is
> elevated, it's just that the presence on CPU 1 as active_mm has no
> effect on mm_count().
>
> At this point, fancyfsd decides to invalidate a mapping currently mapped
> by that context, for example because a networked file has changed
> remotely or something like that, using unmap_mapping_ranges().
>
> So CPU 1 goes into the zapping code, which eventually ends up calling
> flush_tlb_pending(). Your test will succeed, as current->active_mm is
> indeed the target mm for the flush, and mm_users is indeed 1. So you
> will -not- send an IPI to the other CPU, and CPU 0 will continue happily
> accessing the pages that should have been unmapped.

To fix this problem, check ->mm instead of ->active_mm, and this
means:

> So if you test current->mm, you effectively account for mm_users == 1,
> so the only way the mm can be active on another processor is as a lazy
> mm for a kernel thread. So your test should work properly as long
> as you don't have a HW that will do speculative TLB reloads into the
> TLB on that other CPU (and even if you do, you flush-on-switch-in should
> get rid of any crap here).

And therefore we should be OK.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc: Fix bus type probing for ESP and LE devices.

If there is a dummy "espdma" or "ledma" parent device above ESP scsi
or LE ethernet device nodes, we have to match the bus as SBUS.

Otherwise the address and size cell counts are wrong and we don't
calculate the final physical device resource values correctly at all.

Commit 5280267c1dddb8d413595b87dc406624bb497946 ("sparc: Fix handling
of LANCE and ESP parent nodes in of_device.c") was meant to fix this
problem, but that only influences the inner loop of
build_device_resources(). We need this logic to also kick in at the
beginning of build_device_resources() as well, when we make the first
attempt to determine the device's immediate parent bus type for 'reg'
property element extraction.

Based almost entirely upon a patch by Friedrich Oslage.

Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sparc64: Fix smp_callin() locking.

[ Upstream commit 8e255baa449df3049a8827a7f1f4f12b6921d0d1 ]

Interrupts must be disabled when taking the IPI lock.

Caught by lockdep.

Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

TPM: get_event_name stack corruption

commit fbaa58696cef848de818768783ef185bd3f05158 upstream.

get_event_name uses sprintf to fill a buffer declared on the stack.  It fills
the buffer 2 bytes at a time.  What the code doesn't take into account is that
sprintf(buf, "%02x", data) actually writes 3 bytes.  2 bytes for the data and
then it nul terminates the string.  Since we declare buf to be 40 characters
long and then we write 40 bytes of data into buf sprintf is going to write 41
characters.  The fix is to leave room in buf for the nul terminator.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

nfs: Fix NFS v4 client handling of MAY_EXEC in nfs_permission.

commit 7ee2cb7f32b299c2b06a31fde155457203e4b7dd upstream.

The problem is that permission checking is skipped if atomic open is
possible, but when exec opens a file, it just opens it O_READONLY which
means EXEC permission will not be checked at that time.

This problem is observed by the following sequence (executed as root):

  mount -t nfs4 server:/ /mnt4
  echo "ls" >/mnt4/foo
  chmod 744 /mnt4/foo
  su guest -c "mnt4/foo"

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Eugene Teo <eugeneteo@kernel.sg>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

icom: fix rmmod crash

commit 95caa0a9bdaf93607bd0cc8932f53112496f2f22 upstream.

Actually the icom driver is crashing when is being removed because
the driver is kfreeing the adapter structure before calling
pci_release_regions(), which result in the following error:

  Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d33
  Faulting instruction address: 0xc000000000246b80
  Oops: Kernel access of bad area, sig: 11 [#1]
  ....
  [c000000012d436a0] [c0000000001002d0] .kfree+0x120/0x34c (unreliable)
  [c000000012d43730] [c000000000246d60] .pci_release_selected_regions+0x3c/0x68
  [c000000012d437c0] [d000000002d54700] .icom_kref_release+0xf4/0x118 [icom]
  [c000000012d43850] [c000000000232e50] .kref_put+0x74/0x94
  [c000000012d438d0] [d000000002d56c58] .icom_remove+0x40/0xa4 [icom]
  [c000000012d43960] [c000000000249e48] .pci_device_remove+0x50/0x90
  [c000000012d439e0] [c0000000002d68d8] .__device_release_driver+0x94/0xd4
  [c000000012d43a70] [c0000000002d7104] .driver_detach+0xf8/0x12c
  [c000000012d43b00] [c0000000002d549c] .bus_remove_driver+0xbc/0x11c
  [c000000012d43b90] [c0000000002d71dc] .driver_unregister+0x60/0x80
  [c000000012d43c20] [c00000000024a07c] .pci_unregister_driver+0x44/0xe8
  [c000000012d43cb0] [d000000002d56bf4] .icom_exit+0x1c/0x40 [icom]
  [c000000012d43d30] [c000000000095fa8] .SyS_delete_module+0x214/0x2a8
  [c000000012d43e30] [c00000000000852c] syscall_exit+0x0/0x40

Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ocfs2: fix i_mutex locking in ocfs2_splice_to_file()

commit 328eaaba4e41a04c1dc4679d65bea3fee4349d86 upstream.

Rearrange locking of i_mutex on destination and call to
ocfs2_rw_lock() so locks are only held while buffers are copied with
the pipe_to_file() actor, and not while waiting for more data on the
pipe.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

splice: fix i_mutex locking in generic_splice_write()

commit eb443e5a25d43996deb62b9bcee1a4ce5dea2ead upstream.

Rearrange locking of i_mutex on destination so it's only held while
buffers are copied with the pipe_to_file() actor, and not while
waiting for more data on the pipe.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

splice: remove i_mutex locking in splice_from_pipe()

commit 2933970b960223076d6affcf7a77e2bc546b8102 upstream.

splice_from_pipe() is only called from two places:

- generic_splice_sendpage()
- splice_write_null()

Neither of these require i_mutex to be taken on the destination inode.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

splice: split up __splice_from_pipe()

commit b3c2d2ddd63944ef2a1e4a43077b602288107e01 upstream.

Split up __splice_from_pipe() into four helper functions:

  splice_from_pipe_begin()
  splice_from_pipe_next()
  splice_from_pipe_feed()
  splice_from_pipe_end()

splice_from_pipe_next() will wait (if necessary) for more buffers to
be added to the pipe.  splice_from_pipe_feed() will feed the buffers
to the supplied actor and return when there's no more data available
(or if all of the requested data has been copied).

This is necessary so that implementations can do locking around the
non-waiting splice_from_pipe_feed().

This patch should not cause any change in behavior.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

powerpc/5200: Don't specify IRQF_SHARED in PSC UART driver

commit d9f0c5f9bc74f16d0ea0f6c518b209e48783a796 upstream.

The MPC5200 PSC device is wired up to a dedicated interrupt line
which is never shared. This patch removes the IRQF_SHARED flag
from the request_irq() call which eliminates the "IRQF_DISABLED
is not guaranteed on shared IRQs" warning message from the console
output.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ehea: fix invalid pointer access

commit 0b2febf38a33d7c40fb7bb4a58c113a1fa33c412 upstream.

This patch fixes an invalid pointer access in case the receive queue
holds no pointer to the next skb when the queue is empty.

Signed-off-by: Hannes Hering <hering2@de.ibm.com>
Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

NFS: Fix the notifications when renaming onto an existing file

commit b1e4adf4ea41bb8b5a7bfc1a7001f137e65495df upstream.

NFS appears to be returning an unnecessary "delete" notification when
we're doing an atomic rename. See

http://bugzilla.gnome.org/show_bug.cgi?id=575684

The fix is to get rid of the redundant call to d_delete().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

nfsd4: check for negative dentry before use in nfsv4 readdir

commit b2c0cea6b1cb210e962f07047df602875564069e upstream.

After 2f9092e1020246168b1309b35e085ecd7ff9ff72 "Fix i_mutex vs. readdir
handling in nfsd" (and 14f7dd63 "Copy XFS readdir hack into nfsd code"),
an entry may be removed between the first mutex_unlock and the second
mutex_lock. In this case, lookup_one_len() will return a negative
dentry. Check for this case to avoid a NULL dereference.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reviewed-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

epoll: fix size check in epoll_create()

commit bfe3891a5f5d3b78146a45f40e435d14f5ae39dd upstream.

Fix a size check WRT the manual pages. This was inadvertently broken by
commit 9fe5ad9c8cef9ad5873d8ee55d1cf00d9b607df0 ("flag parameters
add-on: remove epoll_create size param").

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: <Hiroyuki.Mach@gmail.com>
Cc: rohit verma <rohit.170309@gmail.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

cifs: Fix unicode string area word alignment in session setup

commit 27b87fe52baba0a55e9723030e76fce94fabcea4 refreshed.

cifs: fix unicode string area word alignment in session setup

The handling of unicode string area alignment is wrong.
decode_unicode_ssetup improperly assumes that it will always be preceded
by a pad byte. This isn't the case if the string area is already
word-aligned.

This problem, combined with the bad buffer sizing for the serverDomain
string can cause memory corruption. The bad alignment can make it so
that the alignment of the characters is off. This can make them
translate to characters that are greater than 2 bytes each. If this
happens we can overflow the allocation.

Fix this by fixing the alignment in CIFS_SessSetup instead so we can
verify it against the head of the response. Also, clean up the
workaround for improperly terminated strings by checking for a
odd-length unicode buffers and then forcibly terminating them.

Finally, resize the buffer for serverDomain. Now that we've fixed
the alignment, it's probably fine, but a malicious server could
overflow it.

A better solution for handling these strings is still needed, but
this should be a suitable bandaid.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Cc: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

cifs: Fix buffer size in cifs_convertUCSpath

Relevant commits 7fabf0c9479fef9fdb9528a5fbdb1cb744a744a4 and
f58841666bc22e827ca0dcef7b71c7bc2758ce82. The upstream commits adds
cifs_from_ucs2 that includes functionality of cifs_convertUCSpath and
does cleanup.

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Acked-by: Steve French <sfrench@us.ibm.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

cifs: Fix incorrect destination buffer size in cifs_strncpy_to_host

Relevant commits 968460ebd8006d55661dec0fb86712b40d71c413 and
066ce6899484d9026acd6ba3a8dbbedb33d7ae1b. Minimal hunks to fix buffer
size and fix an existing problem pointed out by Guenter Kukuk that length
of src is used for NULL termination of dst.

cifs: Rename cifs_strncpy_to_host and fix buffer size

There is a possibility for the path_name and node_name buffers to
overflow if they contain charcters that are >2 bytes in the local
charset. Resize the buffer allocation so to avoid this possibility.

Also, as pointed out by Jeff Layton, it would be appropriate to
rename the function to cifs_strlcpy_to_host to reflect the fact
that the copied string is always NULL terminated.

Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

cifs: Increase size of tmp_buf in cifs_readdir to avoid potential overflows

Commit 7b0c8fcff47a885743125dd843db64af41af5a61 refreshed and use
a #define from commit f58841666bc22e827ca0dcef7b71c7bc2758ce82.

cifs: Increase size of tmp_buf in cifs_readdir to avoid potential overflows

Increase size of tmp_buf to possible maximum to avoid potential
overflows. Also moved UNICODE_NAME_MAX definition so that it can be used
elsewhere.

Pointed-out-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

cifs: Fix buffer size for tcon->nativeFileSystem field

Commit f083def68f84b04fe3f97312498911afce79609e refreshed.

cifs: fix buffer size for tcon->nativeFileSystem field

The buffer for this was resized recently to fix a bug. It's still
possible however that a malicious server could overflow this field
by sending characters in it that are >2 bytes in the local charset.
Double the size of the buffer to account for this possibility.

Also get rid of some really strange and seemingly pointless NULL
termination. It's NULL terminating the string in the source buffer,
but by the time that happens, we've already copied the string.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Cc: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

NFS: Close page_mkwrite() races

commit 7fdf523067666b0eaff330f362401ee50ce187c4 upstream

Follow up to Nick Piggin's patches to ensure that nfs_vm_page_mkwrite
returns with the page lock held, and sets the VM_FAULT_LOCKED flag.

See http://bugzilla.kernel.org/show_bug.cgi?id=12913

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

NFS: Fix the return value in nfs_page_mkwrite()

commit 2b2ec7554cf7ec5e4412f89a5af6abe8ce950700 upstream

Commit c2ec175c39f62949438354f603f4aa170846aabb ("mm: page_mkwrite
change prototype to match fault") exposed a bug in the NFS
implementation of page_mkwrite. We should be returning 0 on success...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

GFS2: Fix page_mkwrite() return code

commit e56985da455b9dc0591b8cb2006cc94b6f4fb0f4 upstream

This allows for the possibility of returning VM_FAULT_OOM as
well as VM_FAULT_SIGBUS. This ensures that the correct action
is taken.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mm: close page_mkwrite races

commit b827e496c893de0c0f142abfaeb8730a2fd6b37f upstream

mm: close page_mkwrite races

Change page_mkwrite to allow implementations to return with the page
locked, and also change it's callers (in page fault paths) to hold the
lock until the page is marked dirty.  This allows the filesystem to have
full control of page dirtying events coming from the VM.

Rather than simply hold the page locked over the page_mkwrite call, we
call page_mkwrite with the page unlocked and allow callers to return with
it locked, so filesystems can avoid LOR conditions with page lock.

The problem with the current scheme is this: a filesystem that wants to
associate some metadata with a page as long as the page is dirty, will
perform this manipulation in its ->page_mkwrite.  It currently then must
return with the page unlocked and may not hold any other locks (according
to existing page_mkwrite convention).

In this window, the VM could write out the page, clearing page-dirty.  The
filesystem has no good way to detect that a dirty pte is about to be
attached, so it will happily write out the page, at which point, the
filesystem may manipulate the metadata to reflect that the page is no
longer dirty.

It is not always possible to perform the required metadata manipulation in
->set_page_dirty, because that function cannot block or fail.  The
filesystem may need to allocate some data structure, for example.

And the VM cannot mark the pte dirty before page_mkwrite, because
page_mkwrite is allowed to fail, so we must not allow any window where the
page could be written to if page_mkwrite does fail.

This solution of holding the page locked over the 3 critical operations
(page_mkwrite, setting the pte dirty, and finally setting the page dirty)
closes out races nicely, preventing page cleaning for writeout being
initiated in that window.  This provides the filesystem with a strong
synchronisation against the VM here.

- Sage needs this race closed for ceph filesystem.
- Trond for NFS (http://bugzilla.kernel.org/show_bug.cgi?id=12913).
- I need it for fsblock.
- I suspect other filesystems may need it too (eg. btrfs).
- I have converted buffer.c to the new locking. Even simple block allocation
  under dirty pages might be susceptible to i_size changing under partial page
  at the end of file (we also have a buffer.c-side problem here, but it cannot
  be fixed properly without this patch).
- Other filesystems (eg. NFS, maybe btrfs) will need to change their
  page_mkwrite functions themselves.

[ This also moves page_mkwrite another step closer to fault, which should
  eventually allow page_mkwrite to be moved into ->fault, and thus avoiding a
  filesystem calldown and page lock/unlock cycle in __do_fault. ]

[akpm@linux-foundation.org: fix derefs of NULL ->mapping]
Cc: Sage Weil <sage@newdream.net>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

fs: fix page_mkwrite error cases in core code and btrfs

commit 56a76f8275c379ed73c8a43cfa1dfa2f5e9cfa19 upstream

fs: fix page_mkwrite error cases in core code and btrfs

page_mkwrite is called with neither the page lock nor the ptl held.  This
means a page can be concurrently truncated or invalidated out from
underneath it.  Callers are supposed to prevent truncate races themselves,
however previously the only thing they can do in case they hit one is to
raise a SIGBUS.  A sigbus is wrong for the case that the page has been
invalidated or truncated within i_size (eg.  hole punched).  Callers may
also have to perform memory allocations in this path, where again, SIGBUS
would be wrong.

The previous patch ("mm: page_mkwrite change prototype to match fault")
made it possible to properly specify errors.  Convert the generic buffer.c
code and btrfs to return sane error values (in the case of page removed
from pagecache, VM_FAULT_NOPAGE will cause the fault handler to exit
without doing anything, and the fault will be retried properly).

This fixes core code, and converts btrfs as a template/example.  All other
filesystems defining their own page_mkwrite should be fixed in a similar
manner.

Acked-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mm: page_mkwrite change prototype to match fault

commit c2ec175c39f62949438354f603f4aa170846aabb upstream

mm: page_mkwrite change prototype to match fault

Change the page_mkwrite prototype to take a struct vm_fault, and return
VM_FAULT_xxx flags.  There should be no functional change.

This makes it possible to return much more detailed error information to
the VM (and also can provide more information eg.  virtual_address to the
driver, which might be important in some special cases).

This is required for a subsequent fix.  And will also make it easier to
merge page_mkwrite() with fault() in future.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

i2c-algo-pca: Let PCA9564 recover from unacked data byte (state 0x30)

commit 2196d1cf4afab93fb64c2e5b417096e49b661612 upstream

Currently, the i2c-algo-pca driver does nothing if the chip enters state
0x30 (Data byte in I2CDAT has been transmitted; NOT ACK has been
received). Thus, the i2c bus connected to the controller gets stuck
afterwards.

I have seen this kind of error on a custom board in certain load
situations most probably caused by interference or noise.

A possible reaction is to let the controller generate a STOP condition.
This is documented in the PCA9564 data sheet (2006-09-01) and the same
is done for other NACK states as well.

Further, state 0x38 isn't handled completely, either. Try to do another
START in this case like the data sheet says. As this couldn't be tested,
I've added a comment to try to reset the chip if the START doesn't help
as suggested by Wolfram Sang.

Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

i2c-algo-bit: Fix timeout test

commit 0cdba07bb23cdd3e0d64357ec3d983e6b75e541f upstream

When fetching DDC using i2c algo bit, we were often seeing timeouts
before getting valid EDID on a retry. The VESA spec states 2ms is the
DDC timeout, so when this translates into 1 jiffie and we are close
to the end of the time period, it could return with a timeout less than
2ms.

Change this code to use time_after instead of time_after_eq.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

dup2: Fix return value with oldfd == newfd and invalid fd

commit 2b79bc4f7ebbd5af3c8b867968f9f15602d5f802 upstream.

The return value of dup2 when oldfd == newfd and the fd isn't valid is
not getting properly sign extended. We end up with 4294967287 instead
of -EBADF.

I've reproduced this on SLE11 (2.6.27.21), openSUSE Factory
(2.6.29-rc5), and Ubuntu 9.04 (2.6.28).

This patch uses a signed int for the error value so it is properly
extended.

Commit 6c5d0512a091480c9f981162227fdb1c9d70e555 introduced this
regression.

Reported-by: Jiri Dluhos <jdluhos@novell.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: Gadget: fix UTF conversion in the usbstring library

commit 0f43158caddcbb110916212ebe4e39993ae70864 upstream.

This patch (as1234) fixes a bug in the UTF8 -> UTF-16 conversion
routine in the gadget/usbstring library. In a UTF-8 multi-byte
sequence, all bytes after the first should have their high-order
two bits set to 10, not 11.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

md: remove ability to explicit set an inactive array to 'clean'.

commit 5bf295975416f8e97117bbbcfb0191c00bc3e2b4 upstream.

Being able to write 'clean' to an 'array_state' of an inactive array
to activate it in 'clean' mode is both unnecessary and inconvenient.

It is unnecessary because the same can be achieved by writing
'active'.  This activates and array, but it still remains 'clean'
until the first write.

It is inconvenient because writing 'clean' is more often used to
cause an 'active' array to revert to 'clean' mode (thus blocking
any writes until a 'write-pending' is promoted to 'active').

Allowing 'clean' to both activate an array and mark an active array as
clean can lead to races:  One program writes 'clean' to mark the
active array as clean at the same time as another program writes
'inactive' to deactivate (stop) and active array.  Depending on which
writes first, the array could be deactivated and immediately
reactivated which isn't what was desired.

So just disable the use of 'clean' to activate an array.

This avoids a race that can be triggered with mdadm-3.0 and external
metadata, so it suitable for -stable.

Reported-by: Rafal Marszewski <rafal.marszewski@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

md/raid10: don't clear bitmap during recovery if array will still be degraded.

commit 18055569127253755d01733f6ecc004ed02f88d0 upstream.

If we have a raid10 with multiple missing devices, and we recover just
one of these to a spare, then we risk (depending on the bitmap and
array chunk size) clearing bits of the bitmap for which recovery isn't
complete (because a device is still missing).

This can lead to a subsequent "re-add" being recovered without
any IO happening, which would result in loss of data.

This patch takes the safe approach of not clearing bitmap bits
if the array will still be degraded.

This patch is suitable for all active -stable kernels.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

md: fix some (more) errors with bitmaps on devices larger than 2TB.

commit db305e507d554430a69ede901a6308e6ecb72349 upstream.

If a write intent bitmap covers more than 2TB, we sometimes work with
values beyond 32bit, so these need to be sector_t. This patches
add the required casts to some unsigned longs that are being shifted
up.

This will affect any raid10 larger than 2TB, or any raid1/4/5/6 with
member devices that are larger than 2TB.

Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: "Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

md: fix loading of out-of-date bitmap.

commit b74fd2826c5acce20e6f691437b2d19372bc2057 upstream.

When md is loading a bitmap which it knows is out of date, it fills
each page with 1s and writes it back out again.  However the
write_page call makes used of bitmap->file_pages and
bitmap->last_page_size which haven't been set correctly yet.  So this
can sometimes fail.

Move the setting of file_pages and last_page_size to before the call
to write_page.

This bug can cause the assembly on an array to fail, thus making the
data inaccessible.  Hence I think it is a suitable candidate for
-stable.

Reported-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

rndis_wlan: fix initialization order for workqueue&workers

commit e805e4d0b53506dff4255a2792483f094e7fcd2c upstream.

rndis_wext_link_change() might be called from rndis_command() at
initialization stage and priv->workqueue/priv->work have not been
initialized yet. This causes invalid opcode at rndis_wext_bind on
some brands of bcm4320.

Fix by initializing workqueue/workers in rndis_wext_bind() before
rndis_command is used.

This bug has existed since 2.6.25, reported at:
http://bugzilla.kernel.org/show_bug.cgi?id=12794

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

proc: avoid information leaks to non-privileged processes

commit f83ce3e6b02d5e48b3a43b001390e2b58820389d upstream.

By using the same test as is used for /proc/pid/maps and /proc/pid/smaps,
only allow processes that can ptrace() a given process to see information
that might be used to bypass address space layout randomization (ASLR).
These include eip, esp, wchan, and start_stack in /proc/pid/stat as well
as the non-symbolic output from /proc/pid/wchan.

ASLR can be bypassed by sampling eip as shown by the proof-of-concept
code at http://code.google.com/p/fuzzyaslr/ As part of a presentation
(http://www.cr0.org/paper/to-jt-linux-alsr-leak.pdf) esp and wchan were
also noted as possibly usable information leaks as well. The
start_stack address also leaks potentially useful information.

Cc: Stable Team <stable@kernel.org>
Signed-off-by: Jake Edge <jake@lwn.net>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mv643xx_eth: 64bit mib counter read fix

commit 93af7aca44f0e82e67bda10a0fb73d383edcc8bd upstream.

On several mv643xx_eth hardware versions, the two 64bit mib counters
for 'good octets received' and 'good octets sent' are actually 32bit
counters, and reading from the upper half of the register has the same
effect as reading from the lower half of the register: an atomic
read-and-clear of the entire 32bit counter value. This can under heavy
traffic occasionally lead to small numbers being added to the upper
half of the 64bit mib counter even though no 32bit wrap has occured.

Since we poll the mib counters at least every 30 seconds anyway, we
might as well just skip the reads of the upper halves of the hardware
counters without breaking the stats, which this patch does.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Ignore madvise(MADV_WILLNEED) for hugetlbfs-backed regions

commit a425a638c858fd10370b573bde81df3ba500e271 upstream.

madvise(MADV_WILLNEED) forces page cache readahead on a range of memory
backed by a file. The assumption is made that the page required is
order-0 and "normal" page cache.

On hugetlbfs, this assumption is not true and order-0 pages are
allocated and inserted into the hugetlbfs page cache. This leaks
hugetlbfs page reservations and can cause BUGs to trigger related to
corrupted page tables.

This patch causes MADV_WILLNEED to be ignored for hugetlbfs-backed
regions.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

clockevents: prevent endless loop in tick_handle_periodic()

commit 74a03b69d1b5ce00a568e142ca97e76b7f5239c6 upstream.

tick_handle_periodic() can lock up hard when a one shot clock event
device is used in combination with jiffies clocksource.

Avoid an endless loop issue by requiring that a highres valid
clocksource be installed before we call tick_periodic() in a loop when
using ONESHOT mode. The result is we will only increment jiffies once
per interrupt until a continuous hardware clocksource is available.

Without this, we can run into a endless loop, where each cycle through
the loop, jiffies is updated which increments time by tick_period or
more (due to clock steering), which can cause the event programming to
think the next event was before the newly incremented time and fail
causing tick_periodic() to be called again and the whole process loops
forever.

[ Impact: prevent hard lock up ]

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: serial: fix lifetime and locking problems

This is commit 2d93148ab6988cad872e65d694c95e8944e1b626 back-ported to
2.6.27.

This patch (as1229-1) fixes a few lifetime and locking problems in the
usb-serial driver. The main symptom is that an invalid kevent is
created when the serial device is unplugged while a connection is
active.

Ports should be unregistered when device is disconnected,
not when the parent usb_serial structure is deallocated.

Each open file should hold a reference to the corresponding
port structure, and the reference should be released when
the file is closed.

serial->disc_mutex should be acquired in serial_open(), to
resolve the classic race between open and disconnect.

serial_close() doesn't need to hold both serial->disc_mutex
and port->mutex at the same time.

Release the subdriver's module reference only after releasing
all the other references, in case one of the release routines
needs to invoke some code in the subdriver module.

Replace a call to flush_scheduled_work() (which is prone to
deadlocks) with cancel_work_sync(). Also, add a call to
cancel_work_sync() in the disconnect routine.

Reduce the scope of serial->disc_mutex in serial_disconnect().
The only place it really needs to protect is where the
"disconnected" flag is set.

Call the shutdown method from within serial_disconnect()
instead of destroy_serial(), because some subdrivers expect
the port data structures still to be in existence when
their shutdown method runs.

This fixes the bug reported in

http://bugs.freedesktop.org/show_bug.cgi?id=20703

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

MIPS: CVE-2009-0029: Enable syscall wrappers

Backport of upstream commits by:
  Ralf Baechle <ralf@linux-mips.org>
  Xiaotian Feng <Xiaotian.Feng@windriver.com>

upstream commits:
  dbda6ac0897603f6c6dfadbbc37f9882177ec7ac
  d6c178e9694e7e0c7ffe0289cf4389a498cac735
  c189846ecf900cd6b3ad7d3cef5b45a746ce646b

Signed-off-by: dann frazier <dannf@debian.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ACPI: Revert conflicting workaround for BIOS w/ mangled PRT entries

upstream 82babbb3887e234c995626e4121d411ea9070ca5
backported to apply cleanly to 2.6.27.21
and apply with offset -1 to 2.6.28.9

2f894ef9c8b36a35d80709bedca276d2fc691941
in Linux-2.6.21 worked around BIOS with mangled _PRT entries:
http://bugzilla.kernel.org/show_bug.cgi?id=6859

d0e184abc5983281ef189db2c759d65d56eb1b80
worked around the same issue via ACPICA, and shipped in 2.6.27.

Unfortunately the two workarounds conflict:
http://bugzilla.kernel.org/show_bug.cgi?id=12270

So revert the Linux specific one.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86/PCI: don't call e820_all_mapped with -1 in the mmconfig case

commit 044cd80942e47b9de0915b627902adf05c52377f upstream.

e820_all_mapped need end is (addr + size) instead of (addr + size - 1)

Cc: stable@kernel.org
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

PCI quirk: disable MSI on VIA VT3364 chipsets

commit 162dedd39dcc6eca3fc0d29cf19658c6c13b840e upstream.

Without this patch, Broadcom BCM5906 Ethernet controllers set up via MSI
cause the machine to hang. Tejun agreed that the best is to blacklist
the whole chipset and after adding it, seeing the other VIA quirks
disabling MSI, this very much looks like the right way.

Cc: <stable@kernel.org>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

pagemap: require aligned-length, non-null reads of /proc/pid/pagemap

commit 0816178638c15ce5472d39d771a96860dff4141a upstream.

The intention of commit aae8679b0ebcaa92f99c1c3cb0cd651594a43915
("pagemap: fix bug in add_to_pagemap, require aligned-length reads of
/proc/pid/pagemap") was to force reads of /proc/pid/pagemap to be a
multiple of 8 bytes, but now it allows to read 0 bytes, which actually
puts some data to user's buffer. According to POSIX, if count is zero,
read() should return zero and has no other results.

Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Cc: Thomas Tuttle <ttuttle@google.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

kbuild: fix Module.markers permission error under cygwin

commit 99e3a1eb3c22bb671c6f3d22d8244bfc9fad8185 upstream.

While building the kernel, we end-up calling modpost with -K and -M
options for the same file (Modules.markers).  This is resulting in
modpost's main function calling read_markers() and then write_markers() on
the same file.

We then have read_markers() mmap'ing the file, and writer_markers()
opening that same file for writing.

The issue is that read_markers() exits without munmap'ing the file and is
as a matter holding a reference on Modules.markers.  When write_markers()
is opening that very same file for writing, we still have a reference on
it and cygwin (Windows?) is then making fopen() fail with EPERM.

Calling release_file() before exiting read_markers() clears that reference
(and memory leak) and fopen() then succeeds.

Tested on both cygwin (1.3.22) and Linux.  Also ran modpost within
valgrind on Linux to make sure that the munmap'ed file was not accessed
after read_markers()

Signed-off-by: Cedric Hombourger <chombourger@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

b43: Refresh RX poison on buffer recycling

upstream commit: cf68636a9773aa97915497fe54fa4a51e3f08f3a

The RX buffer poison needs to be refreshed, if we recycle an RX buffer,
because it might be (partially) overwritten by some DMA operations.

Cc: stable@kernel.org
Cc: Francesco Gringoli <francesco.gringoli@ing.unibs.it>
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

b43: Poison RX buffers

upstream commit: ec9a1d8c13e36440eda0f3c79b8149080e3ab5ba

This patch adds poisoning and sanity checking to the RX DMA buffers.
This is used for protection against buggy hardware/firmware that raises
RX interrupts without doing an actual DMA transfer.

This mechanism protects against rare "bad packets" (due to uninitialized skb data)
and rare kernel crashes due to uninitialized RX headers.

The poison is selected to not match on valid frames and to be cheap for checking.

The poison check mechanism _might_ trigger incorrectly, if we are voluntarily
receiving frames with bad PLCP headers. However, this is nonfatal, because the
chance of such a match is basically zero and in case it happens it just results
in dropping the packet.
Bad-PLCP RX defaults to off, and you should leave it off unless you want to listen
to the latest news broadcasted by your microwave oven.

This patch also moves the initialization of the RX-header "length" field in front of
the mapping of the DMA buffer. The CPU should not touch the buffer after we mapped it.

Cc: stable@kernel.org
Reported-by: Francesco Gringoli <francesco.gringoli@ing.unibs.it>
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

forcedeth: Fix resume from hibernation regression.

upstream commit: 35a7433c789ba6df6d96b70fa745ae9e6cac0038

Reset phy state on resume, fixing a regression caused by powering down
the phy on hibernate.

Signed-off-by: Ed Swierk <eswierk@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Tvrtko Ursulin <tvrtko.ursulin@sophos.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: Unusual Device support for Gold MP3 Player Energy

upstream commit: 46c6e93faa85d1362e1d127dc28cf9d0b304a6f1

Reported by Alessio Treglia on
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/125250

User was getting the following errors in dmesg:

[ 2158.139386] sd 5:0:0:1: ioctl_internal_command return code = 8000002
[ 2158.139390] : Current: sense key: No Sense
[ 2158.139393] Additional sense: No additional sense information

Adds unusual device support.

modified: drivers/usb/storage/unusual_devs.h

Signed-off-by: Chuck Short <zulcss@ubuntu.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

virtio-rng: Remove false BUG for spurious callbacks

upstream commit: e5b89542ea18020961882228c26db3ba87f6e608

The virtio-rng drivers checks for spurious callbacks. Since
callbacks can be implemented via shared interrupts (e.g. PCI) this
could lead to guest kernel oopses with lots of virtio devices.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/i915: add support for G41 chipset

commit 72021788678523047161e97b3dfed695e802a5fd upstream.

This had been delayed for some time due to failure to work on the one piece
of G41 hardware we had, and lack of success reports from anybody else.
Current hardware appears to be OK.

Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
[anholt: hand-applied due to conflicts with IGD patches]
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

unreached code in selinux_ip_postroute_iptables_compat() (CVE-2009-1184)

Not upstream in 2.6.30, as the function was removed there, making this a
non-issue.

Node and port send checks can skip in the compat_net=1 case. This bug
was introduced in commit effad8d.

Signed-off-by: Eugene Teo <eugeneteo@kernel.sg>
Reported-by: Dan Carpenter <error27@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ACPI: EC: fix compilation warning

commit 3e54048691bce3f323fd5460695273be379803b9 upstream.

Fix the warning introduced in commit c5279dee26c0e8d7c4200993bfc4b540d2469598,
and give the dummy variable a more verbose name.

drivers/acpi/ec.c: In function 'acpi_ec_ecdt_probe':
drivers/acpi/ec.c:1015: warning: ISO C90 forbids mixed declarations and code

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Acked-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ACPI: EC: Add some basic check for ECDT data

commit c5279dee26c0e8d7c4200993bfc4b540d2469598 upstream.

One more ASUS comes with empty ECDT, add a guard for it...

http://bugzilla.kernel.org/show_bug.cgi?id=11880

Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

thinkpad-acpi: fix LED blinking through timer trigger

commit 75bd3bf2ade9d548be0d2bde60b5ee0fdce0b127 upstream.

The set_blink hook code in the LED subdriver would never manage to get
a LED to blink, and instead it would just turn it on. The consequence
of this is that the "timer" trigger would not cause the LED to blink
if given default parameters.

This problem exists since 2.6.26-rc1.

To fix it, switch the deferred LED work handling to use the
thinkpad-acpi-specific LED status (off/on/blink) directly.

This also makes the code easier to read, and to extend later.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: stable@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

PCI: fix incorrect mask of PM No_Soft_Reset bit

commit 998dd7c719f62dcfa91d7bf7f4eb9c160e03d817 upstream.

Reviewed-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

fs core fixes

Please add the following 4 commits to 2.6.27-stable and 2.6.28-stable.
However, there has been a lot of change here between 2.6.28 and 2.6.29:
in particular, fs/exec.c's unsafe_exec() grew into the more complicated
check_unsafe_exec(). So applying the original patches gives too many
rejects: at the bottom is the diffstat and the combined patch required.

1
Commit: 53e9309e01277ec99c38e84e0ca16921287cf470
Author: Hugh Dickins <hugh@veritas.com>
Date: Sat, 28 Mar 2009 23:16:03 +0000 (+0000)
Subject: compat_do_execve should unshare_files

2
Commit: e426b64c412aaa3e9eb3e4b261dc5be0d5a83e78
Author: Hugh Dickins <hugh@veritas.com>
Date: Sat, 28 Mar 2009 23:20:19 +0000 (+0000)
Subject: fix setuid sometimes doesn't

3
Commit: 7c2c7d993044cddc5010f6f429b100c63bc7dffb
Author: Hugh Dickins <hugh@veritas.com>
Date: Sat, 28 Mar 2009 23:21:27 +0000 (+0000)
Subject: fix setuid sometimes wouldn't

4
Commit: f1191b50ec11c8e2ca766d6d99eb5bb9d2c084a3
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Mon, 30 Mar 2009 11:35:18 +0000 (-0400)
Subject: check_unsafe_exec() doesn't care about signal handlers sharing

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

fix ptrace slowness

commit 53da1d9456fe7f87a920a78fdbdcf1225d197cb7 upstream.

This patch fixes bug #12208:

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12208
Subject : uml is very slow on 2.6.28 host

This turned out to be not a scheduler regression, but an already
existing problem in ptrace being triggered by subtle scheduler
changes.

The problem is this:

- task A is ptracing task B
- task B stops on a trace event
- task A is woken up and preempts task B
- task A calls ptrace on task B, which does ptrace_check_attach()
- this calls wait_task_inactive(), which sees that task B is still on the runq
- task A goes to sleep for a jiffy
- ...

Since UML does lots of the above sequences, those jiffies quickly add
up to make it slow as hell.

This patch solves this by not rescheduling in read_unlock() after
ptrace_stop() has woken up the tracer.

Thanks to Oleg Nesterov and Ingo Molnar for the feedback.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

exit_notify: kill the wrong capable(CAP_KILL) check (CVE-2009-1337)

CVE-2009-1337

commit 432870dab85a2f69dc417022646cb9a70acf7f94 upstream.

The CAP_KILL check in exit_notify() looks just wrong, kill it.

Whatever logic we have to reset ->exit_signal, the malicious user
can bypass it if it execs the setuid application before exiting.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

crypto: ixp4xx - Fix handling of chained sg buffers

commit 0d44dc59b2b434b29aafeae581d06f81efac7c83 upstream.

- keep dma functions away from chained scatterlists.
Use the existing scatterlist iteration inside the driver
to call dma_map_single() for each chunk and avoid dma_map_sg().

Signed-off-by: Christian Hohnstaedt <chohnstaedt@innominate.com>
Tested-By: Karl Hiramoto <karl@hiramoto.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

b44: Use kernel DMA addresses for the kernel DMA API

commit 37efa239901493694a48f1d6f59f8de17c2c4509 upstream.

We must not use the device DMA addresses for the kernel DMA API, because
device DMA addresses have an additional offset added for the SSB translation.

Use the original dma_addr_t for the sync operation.

Cc: stable@kernel.org
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ath9k: AR9280 PCI devices must serialize IO as well

This is a port of:
commit SHA1 5ec905a8df3fa877566ba98298433fbfb3d688cc
for 2.6.27

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ath9k: implement IO serialization

This is a port of:
commit SHA1 6158425be398936af1fd04451f78ffad01529cb0
for 2.6.27

All 802.11n PCI devices (Cardbus, PCI, mini-PCI) require
serialization of IO when on non-uniprocessor systems. PCI
express devices not not require this.

This should fix our only last standing open ath9k kernel.org
bugzilla bug report:

http://bugzilla.kernel.org/show_bug.cgi?id=12110

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

powerpc: Sanitize stack pointer in signal handling code

This has been backported to 2.6.27.x from commit efbda86098 in Linus' tree.

On powerpc64 machines running 32-bit userspace, we can get garbage bits in the
stack pointer passed into the kernel.  Most places handle this correctly, but
the signal handling code uses the passed value directly for allocating signal
stack frames.

This fixes the issue by introducing a get_clean_sp function that returns a
sanitized stack pointer.  For 32-bit tasks on a 64-bit kernel, the stack
pointer is masked correctly.  In all other cases, the stack pointer is simply
returned.

Additionally, we pass an 'is_32' parameter to get_sigframe now in order to
get the properly sanitized stack.  The callers are know to be 32 or 64-bit
statically.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mm: check for no mmaps in exit_mmap()

commit dcd4a049b9751828c516c59709f3fdf50436df85 upstream.

When dup_mmap() ooms we can end up with mm->mmap == NULL. The error
path does mmput() and unmap_vmas() gets a NULL vma which it
dereferences.

In exit_mmap() there is nothing to do at all for this case, we can
cancel the callpath right there.

[akpm@linux-foundation.org: add sorely-needed comment]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Kir Kolyshkin <kir@openvz.org>
Tested-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

r8169: reset IntrStatus after chip reset

Upstream as d78ad8cbfe73ad568de38814a75e9c92ad0a907c (post 2.6.29).

Original comment (Karsten):
On a MSI MS-6702E mainboard, when in rtl8169_init_one() for the first time
after BIOS has run, IntrStatus reads 5 after chip has been reset.
IntrStatus should equal 0 there, so patch changes IntrStatus reset to happen
after chip reset instead of before.

Remark (Francois):
Assuming that the loglevel of the driver is increased above NETIF_MSG_INTR,
the bug reveals itself with a typical "interrupt 0025 in poll" message
at startup. In retrospect, the message should had been read as an hint of
an unexpected hardware state several months ago :o(

Fixes (at least part of) https://bugzilla.redhat.com/show_bug.cgi?id=460747

Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Josep <josep.puigdemont@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

r8169: use hardware auto-padding.

Upstream as 97d477a914b146e7e6722ded21afa79886ae8ccd (post 2.6.28).

It shortens the code and fixes the current pci_unmap leak with
padded skb reported by Dave Jones.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

r8169: Don't update statistics counters when interface is down

Upstream as 355423d0849f4506bc71ab2738d38cb74429aaef (post 2.6.28).

Some Realtek chips (RTL8169sb/8110sb in my case) are unable to retrieve
ethtool statistics when the interface is down. The process stays in
endless loop in rtl8169_get_ethtool_stats. This is because these chips
need to have receiver enabled (CmdRxEnb bit in ChipCmd register) that is
cleared when the interface is going down. It's better to update statistics
only when the interface is up and otherwise return copy of statistics
grabbed when the interface was up (in rtl8169_close).

It is interesting that PCI-E NICs (like 8168b/8111b...) are not affected.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

block: revert part of 18ce3751ccd488c78d3827e9f6bf54e6322676fb

commit 78f707bfc723552e8309b7c38a8d0cc51012e813 upstream.

The above commit added WRITE_SYNC and switched various places to using
that for committing writes that will be waited upon immediately after
submission. However, this causes a performance regression with AS and CFQ
for ext3 at least, since sync_dirty_buffer() will submit some writes with
WRITE_SYNC while ext3 has sumitted others dependent writes without the sync
flag set. This causes excessive anticipation/idling in the IO scheduler
because sync and async writes get interleaved, causing a big performance
regression for the below test case (which is meant to simulate sqlite
like behaviour).

---- test case ----

int main(int argc, char **argv)
{

int fdes, i;
FILE *fp;
struct timeval start;
struct timeval end;
struct timeval res;

gettimeofday(&start, NULL);
for (i=0; i<ROWS; i++) {
fp = fopen("test_file", "a");
fprintf(fp, "Some Text Data\n");
fdes = fileno(fp);
fsync(fdes);
fclose(fp);
}
gettimeofday(&end, NULL);

timersub(&end, &start, &res);
fprintf(stdout, "time to write %d lines is %ld(msec)\n", ROWS,
(res.tv_sec*1000000 + res.tv_usec)/1000);

return 0;
}

-------------------

Thanks to Sean.White@APCC.com for tracking down this performance
regression and providing a test case.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

kprobes: Fix locking imbalance in kretprobes

commit f02b8624fedca39886b0eef770dca70c2f0749b3 upstream.

Fix locking imbalance in kretprobes:

=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
kthreadd/2 is trying to release lock (&rp->lock) at:
[<c06b3080>] pre_handler_kretprobe+0xea/0xf4
but there are no more locks to release!

other info that might help us debug this:
1 lock held by kthreadd/2:
#0: (rcu_read_lock){..--}, at: [<c06b2b24>] __atomic_notifier_call_chain+0x0/0x5a

stack backtrace:
Pid: 2, comm: kthreadd Not tainted 2.6.29-rc8 #1
Call Trace:
[<c06ae498>] ? printk+0xf/0x17
[<c06b3080>] ? pre_handler_kretprobe+0xea/0xf4
[<c044ce6c>] print_unlock_inbalance_bug+0xc3/0xce
[<c0444d4b>] ? clocksource_read+0x7/0xa
[<c04450a4>] ? getnstimeofday+0x5f/0xf6
[<c044a9ca>] ? register_lock_class+0x17/0x293
[<c044b72c>] ? mark_lock+0x1e/0x30b
[<c0448956>] ? tick_dev_program_event+0x4a/0xbc
[<c0498100>] ? __slab_alloc+0xa5/0x415
[<c06b2fbe>] ? pre_handler_kretprobe+0x28/0xf4
[<c06b3080>] ? pre_handler_kretprobe+0xea/0xf4
[<c044cf1b>] lock_release_non_nested+0xa4/0x1a5
[<c06b3080>] ? pre_handler_kretprobe+0xea/0xf4
[<c044d15d>] lock_release+0x141/0x166
[<c06b07dd>] _spin_unlock_irqrestore+0x19/0x50
[<c06b3080>] pre_handler_kretprobe+0xea/0xf4
[<c06b20b5>] kprobe_exceptions_notify+0x1c9/0x43e
[<c06b2b02>] notifier_call_chain+0x26/0x48
[<c06b2b5b>] __atomic_notifier_call_chain+0x37/0x5a
[<c06b2b24>] ? __atomic_notifier_call_chain+0x0/0x5a
[<c06b2b8a>] atomic_notifier_call_chain+0xc/0xe
[<c0442d0d>] notify_die+0x2d/0x2f
[<c06b0f9c>] do_int3+0x1f/0x71
[<c06b0e84>] int3+0x2c/0x34
[<c042d476>] ? do_fork+0x1/0x288
[<c040221b>] ? kernel_thread+0x71/0x79
[<c043ed1b>] ? kthread+0x0/0x60
[<c043ed1b>] ? kthread+0x0/0x60
[<c04040b8>] ? kernel_thread_helper+0x0/0x10
[<c043ec7f>] kthreadd+0xac/0x148
[<c043ebd3>] ? kthreadd+0x0/0x148
[<c04040bf>] kernel_thread_helper+0x7/0x10

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Tested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20090318113621.GB4129@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

hugetlbfs: return negative error code for bad mount option

upstream commit: c12ddba09394c60e1120e6997794fa6ed52da884

This fixes the following BUG:

  # mount -o size=MM -t hugetlbfs none /huge
  hugetlbfs: Bad value 'MM' for mount option 'size=MM'
  ------------[ cut here ]------------
  kernel BUG at fs/super.c:996!

Due to

BUG_ON(!mnt->mnt_sb);

in vfs_kern_mount().

Also, remove unused #include <linux/quotaops.h>

Cc: William Irwin <wli@holomorphy.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

agp: zero pages before sending to userspace

upstream commit: 59de2bebabc5027f93df999d59cc65df591c3e6e

CVE-2009-1192

AGP pages might be mapped into userspace finally, so the pages should be
set to zero before userspace can use it. Otherwise there is potential
information leakage.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: usb-storage: augment unusual_devs entry for Simple Tech/Datafab

upstream commit: e4813eec8d47c8299d968bd5349dc881fa481c26

This patch (as1227) adds the MAX_SECTORS_64 flag to the unusual_devs
entry for the Simple Tech/Datafab controller. This fixes Bugzilla
#12882.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: binbin <binbinsh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

USB: fix oops in cdc-wdm in case of malformed descriptors

upstream commit: e13c594f3a1fc2c78e7a20d1a07974f71e4b448f

cdc-wdm needs to ignore extremely malformed descriptors.

Signed-off-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

USB: ftdi_sio: add vendor/project id for JETI specbos 1201 spectrometer

upstream commit: ae27d84351f1f3568118318a8c40ff3a154bd629

Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

usb gadget: fix ethernet link reports to ethtool

upstream commit: 237e75bf1e558f7330f8deb167fa3116405bef2c

The g_ether USB gadget driver currently decides whether or not there's a
link to report back for eth_get_link based on if the USB link speed is
set. The USB gadget speed is however often set even before the device is
enumerated. It seems more sensible to only report a "link" if we're
actually connected to a host that wants to talk to us. The patch below
does this for me - tested with the PXA27x UDC driver.

Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>