]> git.karo-electronics.de Git - karo-tx-linux.git/log
karo-tx-linux.git
13 years agofuse: verify ioctl retries
Miklos Szeredi [Tue, 30 Nov 2010 15:39:27 +0000 (16:39 +0100)]
fuse: verify ioctl retries

commit 7572777eef78ebdee1ecb7c258c0ef94d35bad16 upstream.

Verify that the total length of the iovec returned in FUSE_IOCTL_RETRY
doesn't overflow iov_length().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosgi-xp: incoming XPC channel messages can come in after the channel's partition struc...
Robin Holt [Tue, 26 Oct 2010 21:21:15 +0000 (14:21 -0700)]
sgi-xp: incoming XPC channel messages can come in after the channel's partition structures have been torn down

commit 09358972bff5ce99de496bbba97c85d417b3c054 upstream.

Under some workloads, some channel messages have been observed being
delayed on the sending side past the point where the receiving side has
been able to tear down its partition structures.

This condition is already detected in xpc_handle_activate_IRQ_uv(), but
that information is not given to xpc_handle_activate_mq_msg_uv().  As a
result, xpc_handle_activate_mq_msg_uv() assumes the structures still exist
and references them, causing a NULL-pointer deref.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agords: Integer overflow in RDS cmsg handling
Dan Rosenberg [Wed, 17 Nov 2010 06:37:16 +0000 (06:37 +0000)]
rds: Integer overflow in RDS cmsg handling

commit 218854af84038d828a32f061858b1902ed2beec6 upstream.

In rds_cmsg_rdma_args(), the user-provided args->nr_local value is
restricted to less than UINT_MAX.  This seems to need a tighter upper
bound, since the calculation of total iov_size can overflow, resulting
in a small sock_kmalloc() allocation.  This would probably just result
in walking off the heap and crashing when calling rds_rdma_pages() with
a high count value.  If it somehow doesn't crash here, then memory
corruption could occur soon after.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoeconet: fix CVE-2010-3848
Phil Blundell [Wed, 24 Nov 2010 19:51:47 +0000 (11:51 -0800)]
econet: fix CVE-2010-3848

commit a27e13d370415add3487949c60810e36069a23a6 upstream.

Don't declare variable sized array of iovecs on the stack since this
could cause stack overflow if msg->msgiovlen is large.  Instead, coalesce
the user-supplied data into a new buffer and use a single iovec for it.

Signed-off-by: Phil Blundell <philb@gnu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoeconet: fix CVE-2010-3850
Phil Blundell [Wed, 24 Nov 2010 19:49:53 +0000 (11:49 -0800)]
econet: fix CVE-2010-3850

commit 16c41745c7b92a243d0874f534c1655196c64b74 upstream.

Add missing check for capable(CAP_NET_ADMIN) in SIOCSIFADDR operation.

Signed-off-by: Phil Blundell <philb@gnu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoeconet: disallow NULL remote addr for sendmsg(), fixes CVE-2010-3849
Phil Blundell [Wed, 24 Nov 2010 19:49:19 +0000 (11:49 -0800)]
econet: disallow NULL remote addr for sendmsg(), fixes CVE-2010-3849

commit fa0e846494792e722d817b9d3d625a4ef4896c96 upstream.

Later parts of econet_sendmsg() rely on saddr != NULL, so return early
with EINVAL if NULL was passed otherwise an oops may occur.

Signed-off-by: Phil Blundell <philb@gnu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agocrypto: padlock - Fix AES-CBC handling on odd-block-sized input
Herbert Xu [Thu, 4 Nov 2010 18:38:39 +0000 (14:38 -0400)]
crypto: padlock - Fix AES-CBC handling on odd-block-sized input

commit c054a076a1bd4731820a9c4d638b13d5c9bf5935 upstream.

On certain VIA chipsets AES-CBC requires the input/output to be
a multiple of 64 bytes.  We had a workaround for this but it was
buggy as it sent the whole input for processing when it is meant
to only send the initial number of blocks which makes the rest
a multiple of 64 bytes.

As expected this causes memory corruption whenever the workaround
kicks in.

Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agox25: Prevent crashing when parsing bad X.25 facilities
Dan Rosenberg [Fri, 12 Nov 2010 20:44:42 +0000 (12:44 -0800)]
x25: Prevent crashing when parsing bad X.25 facilities

commit 5ef41308f94dcbb3b7afc56cdef1c2ba53fa5d2f upstream.

Now with improved comma support.

On parsing malformed X.25 facilities, decrementing the remaining length
may cause it to underflow.  Since the length is an unsigned integer,
this will result in the loop continuing until the kernel crashes.

This patch adds checks to ensure decrementing the remaining length does
not cause it to wrap around.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoV4L/DVB: ivtvfb: prevent reading uninitialized stack memory
Dan Rosenberg [Wed, 15 Sep 2010 21:44:22 +0000 (18:44 -0300)]
V4L/DVB: ivtvfb: prevent reading uninitialized stack memory

commit 405707985594169cfd0b1d97d29fcb4b4c6f2ac9 upstream.

The FBIOGET_VBLANK device ioctl allows unprivileged users to read 16
bytes of uninitialized stack memory, because the "reserved" member of
the fb_vblank struct declared on the stack is not altered or zeroed
before being copied back to the user.  This patch takes care of it.

Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Signed-off-by: Andy Walls <awalls@md.metrocast.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agocan-bcm: fix minor heap overflow
Oliver Hartkopp [Wed, 10 Nov 2010 12:10:30 +0000 (12:10 +0000)]
can-bcm: fix minor heap overflow

commit 0597d1b99fcfc2c0eada09a698f85ed413d4ba84 upstream.

On 64-bit platforms the ASCII representation of a pointer may be up to 17
bytes long. This patch increases the length of the buffer accordingly.

http://marc.info/?l=linux-netdev&m=128872251418192&w=2

Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomemory corruption in X.25 facilities parsing
andrew hendry [Wed, 3 Nov 2010 12:54:53 +0000 (12:54 +0000)]
memory corruption in X.25 facilities parsing

commit a6331d6f9a4298173b413cf99a40cc86a9d92c37 upstream.

Signed-of-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agox25: Patch to fix bug 15678 - x25 accesses fields beyond end of packet.
John Hughes [Thu, 8 Apr 2010 04:29:25 +0000 (21:29 -0700)]
x25: Patch to fix bug 15678 - x25 accesses fields beyond end of packet.

commit f5eb917b861828da18dc28854308068c66d1449a upstream.

Here is a patch to stop X.25 examining fields beyond the end of the packet.

For example, when a simple CALL ACCEPTED was received:

10 10 0f

x25_parse_facilities was attempting to decode the FACILITIES field, but this
packet contains no facilities field.

Signed-off-by: John Hughes <john@calva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agonet: Limit socket I/O iovec total length to INT_MAX.
David S. Miller [Thu, 28 Oct 2010 18:41:55 +0000 (11:41 -0700)]
net: Limit socket I/O iovec total length to INT_MAX.

commit 8acfe468b0384e834a303f08ebc4953d72fb690a upstream.

This helps protect us from overflow issues down in the
individual protocol sendmsg/recvmsg handlers.  Once
we hit INT_MAX we truncate out the rest of the iovec
by setting the iov_len members to zero.

This works because:

1) For SOCK_STREAM and SOCK_SEQPACKET sockets, partial
   writes are allowed and the application will just continue
   with another write to send the rest of the data.

2) For datagram oriented sockets, where there must be a
   one-to-one correspondance between write() calls and
   packets on the wire, INT_MAX is going to be far larger
   than the packet size limit the protocol is going to
   check for and signal with -EMSGSIZE.

Based upon a patch by Linus Torvalds.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agonet: Truncate recvfrom and sendto length to INT_MAX.
Linus Torvalds [Sat, 30 Oct 2010 23:43:10 +0000 (16:43 -0700)]
net: Truncate recvfrom and sendto length to INT_MAX.

commit 253eacc070b114c2ec1f81b067d2fed7305467b0 upstream.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agogenirq: Fix incorrect proc spurious output
Kenji Kaneshige [Tue, 30 Nov 2010 08:36:08 +0000 (17:36 +0900)]
genirq: Fix incorrect proc spurious output

commit 25c9170ed64a6551beefe9315882f754e14486f4 upstream.

Since commit a1afb637(switch /proc/irq/*/spurious to seq_file) all
/proc/irq/XX/spurious files show the information of irq 0.

Current irq_spurious_proc_open() passes on NULL as the 3rd argument,
which is used as an IRQ number in irq_spurious_proc_show(), to the
single_open(). Because of this, all the /proc/irq/XX/spurious file
shows IRQ 0 information regardless of the IRQ number.

To fix the problem, irq_spurious_proc_open() must pass on the
appropreate data (IRQ number) to single_open().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Yong Zhang <yong.zhang0@gmail.com>
LKML-Reference: <4CF4B778.90604@jp.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agonohz/s390: fix arch_needs_cpu() return value on offline cpus
Heiko Carstens [Wed, 1 Dec 2010 09:08:01 +0000 (10:08 +0100)]
nohz/s390: fix arch_needs_cpu() return value on offline cpus

commit 398812159e328478ae49b4bd01f0d71efea96c39 upstream.

This fixes the same problem as described in the patch "nohz: fix
printk_needs_cpu() return value on offline cpus" for the arch_needs_cpu()
primitive:

arch_needs_cpu() may return 1 if called on offline cpus. When a cpu gets
offlined it schedules the idle process which, before killing its own cpu,
will call tick_nohz_stop_sched_tick().
That function in turn will call arch_needs_cpu() in order to check if the
local tick can be disabled. On offline cpus this function should naturally
return 0 since regardless if the tick gets disabled or not the cpu will be
dead short after. That is besides the fact that __cpu_disable() should already
have made sure that no interrupts on the offlined cpu will be delivered anyway.

In this case it prevents tick_nohz_stop_sched_tick() to call
select_nohz_load_balancer(). No idea if that really is a problem. However what
made me debug this is that on 2.6.32 the function get_nohz_load_balancer() is
used within __mod_timer() to select a cpu on which a timer gets enqueued.
If arch_needs_cpu() returns 1 then the nohz_load_balancer cpu doesn't get
updated when a cpu gets offlined. It may contain the cpu number of an offline
cpu. In turn timers get enqueued on an offline cpu and not very surprisingly
they never expire and cause system hangs.

This has been observed 2.6.32 kernels. On current kernels __mod_timer() uses
get_nohz_timer_target() which doesn't have that problem. However there might
be other problems because of the too early exit tick_nohz_stop_sched_tick()
in case a cpu goes offline.

This specific bug was indrocuded with 3c5d92a0 "nohz: Introduce
arch_needs_cpu".

In this case a cpu hotplug notifier is used to fix the issue in order to keep
the normal/fast path small. All we need to do is to clear the condition that
makes arch_needs_cpu() return 1 since it is just a performance improvement
which is supposed to keep the local tick running for a short period if a cpu
goes idle. Nothing special needs to be done except for clearing the condition.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agowmi: use memcmp instead of strncmp to compare GUIDs
Thadeu Lima de Souza Cascardo [Sun, 28 Nov 2010 21:46:50 +0000 (19:46 -0200)]
wmi: use memcmp instead of strncmp to compare GUIDs

commit 8b14d7b22c61f17ccb869e0047d9df6dd9f50a9f upstream.

While looking for the duplicates in /sys/class/wmi/, I couldn't find
them. The code that looks for duplicates uses strncmp in a binary GUID,
which may contain zero bytes. The right function is memcmp, which is
also used in another section of wmi code.

It was finding 49142400-C6A3-40FA-BADB-8A2652834100 as a duplicate of
39142400-C6A3-40FA-BADB-8A2652834100. Since the first byte is the fourth
printed, they were found as equal by strncmp.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agovmscan: raise the bar to PAGEOUT_IO_SYNC stalls
Wu Fengguang [Tue, 10 Aug 2010 00:20:01 +0000 (17:20 -0700)]
vmscan: raise the bar to PAGEOUT_IO_SYNC stalls

commit e31f3698cd3499e676f6b0ea12e3528f569c4fa3 upstream.

Fix "system goes unresponsive under memory pressure and lots of
dirty/writeback pages" bug.

http://lkml.org/lkml/2010/4/4/86

In the above thread, Andreas Mohr described that

Invoking any command locked up for minutes (note that I'm
talking about attempted additional I/O to the _other_,
_unaffected_ main system HDD - such as loading some shell
binaries -, NOT the external SSD18M!!).

This happens when the two conditions are both meet:
- under memory pressure
- writing heavily to a slow device

OOM also happens in Andreas' system.  The OOM trace shows that 3 processes
are stuck in wait_on_page_writeback() in the direct reclaim path.  One in
do_fork() and the other two in unix_stream_sendmsg().  They are blocked on
this condition:

(sc->order && priority < DEF_PRIORITY - 2)

which was introduced in commit 78dc583d (vmscan: low order lumpy reclaim
also should use PAGEOUT_IO_SYNC) one year ago.  That condition may be too
permissive.  In Andreas' case, 512MB/1024 = 512KB.  If the direct reclaim
for the order-1 fork() allocation runs into a range of 512KB
hard-to-reclaim LRU pages, it will be stalled.

It's a severe problem in three ways.

Firstly, it can easily happen in daily desktop usage.  vmscan priority can
easily go below (DEF_PRIORITY - 2) on _local_ memory pressure.  Even if
the system has 50% globally reclaimable pages, it still has good
opportunity to have 0.1% sized hard-to-reclaim ranges.  For example, a
simple dd can easily create a big range (up to 20%) of dirty pages in the
LRU lists.  And order-1 to order-3 allocations are more than common with
SLUB.  Try "grep -v '1 :' /proc/slabinfo" to get the list of high order
slab caches.  For example, the order-1 radix_tree_node slab cache may
stall applications at swap-in time; the order-3 inode cache on most
filesystems may stall applications when trying to read some file; the
order-2 proc_inode_cache may stall applications when trying to open a
/proc file.

Secondly, once triggered, it will stall unrelated processes (not doing IO
at all) in the system.  This "one slow USB device stalls the whole system"
avalanching effect is very bad.

Thirdly, once stalled, the stall time could be intolerable long for the
users.  When there are 20MB queued writeback pages and USB 1.1 is writing
them in 1MB/s, wait_on_page_writeback() will stuck for up to 20 seconds.
Not to mention it may be called multiple times.

So raise the bar to only enable PAGEOUT_IO_SYNC when priority goes below
DEF_PRIORITY/3, or 6.25% LRU size.  As the default dirty throttle ratio is
20%, it will hardly be triggered by pure dirty pages.  We'd better treat
PAGEOUT_IO_SYNC as some last resort workaround -- its stall time is so
uncomfortably long (easily goes beyond 1s).

The bar is only raised for (order < PAGE_ALLOC_COSTLY_ORDER) allocations,
which are easy to satisfy in 1TB memory boxes.  So, although 6.25% of
memory could be an awful lot of pages to scan on a system with 1TB of
memory, it won't really have to busy scan that much.

Andreas tested an older version of this patch and reported that it mostly
fixed his problem.  Mel Gorman helped improve it and KOSAKI Motohiro will
fix it further in the next patch.

Reported-by: Andreas Mohr <andi@lisas.de>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoPrioritize synchronous signals over 'normal' signals
Linus Torvalds [Tue, 2 Mar 2010 16:36:46 +0000 (08:36 -0800)]
Prioritize synchronous signals over 'normal' signals

commit a27341cd5fcb7cf2d2d4726e9f324009f7162c00 upstream.

This makes sure that we pick the synchronous signals caused by a
processor fault over any pending regular asynchronous signals sent to
use by [t]kill().

This is not strictly required semantics, but it makes it _much_ easier
for programs like Wine that expect to find the fault information in the
signal stack.

Without this, if a non-synchronous signal gets picked first, the delayed
asynchronous signal will have its signal context pointing to the new
signal invocation, rather than the instruction that caused the SIGSEGV
or SIGBUS in the first place.

This is not all that pretty, and we're discussing making the synchronous
signals more explicit rather than have these kinds of implicit
preferences of SIGSEGV and friends.  See for example

http://bugzilla.kernel.org/show_bug.cgi?id=15395

for some of the discussion.  But in the meantime this is a simple and
fairly straightforward work-around, and the whole

if (x & Y)
x &= Y;

thing can be compiled into (and gcc does do it) just three instructions:

movq    %rdx, %rax
andl    $Y, %eax
cmovne  %rax, %rdx

so it is at least a simple solution to a subtle issue.

Reported-and-tested-by: Pavel Vilim <wylda@volny.cz>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoBtrfs: kfree correct pointer during mount option parsing
Josef Bacik [Thu, 25 Feb 2010 20:38:35 +0000 (20:38 +0000)]
Btrfs: kfree correct pointer during mount option parsing

commit da495ecc0fb096b383754952a1c152147bc95b52 upstream.

We kstrdup the options string, but then strsep screws with the pointer,
so when we kfree() it, we're not giving it the right pointer.

Tested-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomm: fix corruption of hibernation caused by reusing swap during image saving
KAMEZAWA Hiroyuki [Tue, 10 Aug 2010 00:20:09 +0000 (17:20 -0700)]
mm: fix corruption of hibernation caused by reusing swap during image saving

commit 966cca029f739716fbcc8068b8c6dfe381f86fc3 upstream.

Since 2.6.31, swap_map[]'s refcounting was changed to show that a used
swap entry is just for swap-cache, can be reused.  Then, while scanning
free entry in swap_map[], a swap entry may be able to be reclaimed and
reused.  It was caused by commit c9e444103b5e7a5 ("mm: reuse unused swap
entry if necessary").

But this caused deta corruption at resume. The scenario is

- Assume a clean-swap cache, but mapped.

- at hibernation_snapshot[], clean-swap-cache is saved as
  clean-swap-cache and swap_map[] is marked as SWAP_HAS_CACHE.

- then, save_image() is called.  And reuse SWAP_HAS_CACHE entry to save
  image, and break the contents.

After resume:

- the memory reclaim runs and finds clean-not-referenced-swap-cache and
  discards it because it's marked as clean.  But here, the contents on
  disk and swap-cache is inconsistent.

Hance memory is corrupted.

This patch avoids the bug by not reclaiming swap-entry during hibernation.
This is a quick fix for backporting.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Ondreg Zary <linux@rainbow-software.org>
Tested-by: Ondreg Zary <linux@rainbow-software.org>
Tested-by: Andrea Gelmini <andrea.gelmini@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoext4: Prevent creation of files larger than RLIMIT_FSIZE using fallocate
Nikanth Karthikesan [Sun, 16 May 2010 18:00:00 +0000 (14:00 -0400)]
ext4: Prevent creation of files larger than RLIMIT_FSIZE using fallocate

commit 6d19c42b7cf81c39632b6d4dbc514e8449bcd346 upstream.

Currently using posix_fallocate one can bypass an RLIMIT_FSIZE limit
and create a file larger than the limit. Add a check for that.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Amit Arora <aarora@in.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agocompat: Make compat_alloc_user_space() incorporate the access_ok()
H. Peter Anvin [Tue, 7 Sep 2010 23:16:18 +0000 (16:16 -0700)]
compat: Make compat_alloc_user_space() incorporate the access_ok()

commit c41d68a513c71e35a14f66d71782d27a79a81ea6 upstream.

compat_alloc_user_space() expects the caller to independently call
access_ok() to verify the returned area.  A missing call could
introduce problems on some architectures.

This patch incorporates the access_ok() check into
compat_alloc_user_space() and also adds a sanity check on the length.
The existing compat_alloc_user_space() implementations are renamed
arch_compat_alloc_user_space() and are used as part of the
implementation of the new global function.

This patch assumes NULL will cause __get_user()/__put_user() to either
fail or access userspace on all architectures.  This should be
followed by checking the return value of compat_access_user_space()
for NULL in the callers, at which time the access_ok() in the callers
can also be removed.

Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James Bottomley <jejb@parisc-linux.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agox86-64, compat: Retruncate rax after ia32 syscall entry tracing
Roland McGrath [Tue, 14 Sep 2010 19:22:58 +0000 (12:22 -0700)]
x86-64, compat: Retruncate rax after ia32 syscall entry tracing

commit eefdca043e8391dcd719711716492063030b55ac upstream.

In commit d4d6715, we reopened an old hole for a 64-bit ptracer touching a
32-bit tracee in system call entry.  A %rax value set via ptrace at the
entry tracing stop gets used whole as a 32-bit syscall number, while we
only check the low 32 bits for validity.

Fix it by truncating %rax back to 32 bits after syscall_trace_enter,
in addition to testing the full 64 bits as has already been added.

Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agox86-64, compat: Test %rax for the syscall number, not %eax
H. Peter Anvin [Tue, 14 Sep 2010 19:42:41 +0000 (12:42 -0700)]
x86-64, compat: Test %rax for the syscall number, not %eax

commit 36d001c70d8a0144ac1d038f6876c484849a74de upstream.

On 64 bits, we always, by necessity, jump through the system call
table via %rax.  For 32-bit system calls, in theory the system call
number is stored in %eax, and the code was testing %eax for a valid
system call number.  At one point we loaded the stored value back from
the stack to enforce zero-extension, but that was removed in checkin
d4d67150165df8bf1cc05e532f6efca96f907cab.  An actual 32-bit process
will not be able to introduce a non-zero-extended number, but it can
happen via ptrace.

Instead of re-introducing the zero-extension, test what we are
actually going to use, i.e. %rax.  This only adds a handful of REX
prefixes to the code.

Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoext4: consolidate in_range() definitions
Akinobu Mita [Thu, 4 Mar 2010 04:55:01 +0000 (23:55 -0500)]
ext4: consolidate in_range() definitions

commit 731eb1a03a8445cde2cb23ecfb3580c6fa7bb690 upstream.

There are duplicate macro definitions of in_range() in mballoc.h and
balloc.c.  This consolidates these two definitions into ext4.h, and
changes extents.c to use in_range() as well.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomm: fix up some user-visible effects of the stack guard page
Linus Torvalds [Sun, 15 Aug 2010 18:35:52 +0000 (11:35 -0700)]
mm: fix up some user-visible effects of the stack guard page

commit d7824370e26325c881b665350ce64fb0a4fde24a upstream.

This commit makes the stack guard page somewhat less visible to user
space. It does this by:

 - not showing the guard page in /proc/<pid>/maps

   It looks like lvm-tools will actually read /proc/self/maps to figure
   out where all its mappings are, and effectively do a specialized
   "mlockall()" in user space.  By not showing the guard page as part of
   the mapping (by just adding PAGE_SIZE to the start for grows-up
   pages), lvm-tools ends up not being aware of it.

 - by also teaching the _real_ mlock() functionality not to try to lock
   the guard page.

   That would just expand the mapping down to create a new guard page,
   so there really is no point in trying to lock it in place.

It would perhaps be nice to show the guard page specially in
/proc/<pid>/maps (or at least mark grow-down segments some way), but
let's not open ourselves up to more breakage by user space from programs
that depends on the exact deails of the 'maps' file.

Special thanks to Henrique de Moraes Holschuh for diving into lvm-tools
source code to see what was going on with the whole new warning.

Reported-and-tested-by: François Valenduc <francois.valenduc@tvcablenet.be
Reported-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomm: fix page table unmap for stack guard page properly
Linus Torvalds [Sat, 14 Aug 2010 18:44:56 +0000 (11:44 -0700)]
mm: fix page table unmap for stack guard page properly

commit 11ac552477e32835cb6970bf0a70c210807f5673 upstream.

We do in fact need to unmap the page table _before_ doing the whole
stack guard page logic, because if it is needed (mainly 32-bit x86 with
PAE and CONFIG_HIGHPTE, but other architectures may use it too) then it
will do a kmap_atomic/kunmap_atomic.

And those kmaps will create an atomic region that we cannot do
allocations in.  However, the whole stack expand code will need to do
anon_vma_prepare() and vma_lock_anon_vma() and they cannot do that in an
atomic region.

Now, a better model might actually be to do the anon_vma_prepare() when
_creating_ a VM_GROWSDOWN segment, and not have to worry about any of
this at page fault time.  But in the meantime, this is the
straightforward fix for the issue.

See https://bugzilla.kernel.org/show_bug.cgi?id=16588 for details.

Reported-by: Wylda <wylda@volny.cz>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: Mike Pagano <mpagano@gentoo.org>
Reported-by: François Valenduc <francois.valenduc@tvcablenet.be>
Tested-by: Ed Tomlinson <edt@aei.ca>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomm: fix missing page table unmap for stack guard page failure case
Linus Torvalds [Fri, 13 Aug 2010 16:24:04 +0000 (09:24 -0700)]
mm: fix missing page table unmap for stack guard page failure case

commit 5528f9132cf65d4d892bcbc5684c61e7822b21e9 upstream.

.. which didn't show up in my tests because it's a no-op on x86-64 and
most other architectures.  But we enter the function with the last-level
page table mapped, and should unmap it at exit.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomm: keep a guard page below a grow-down stack segment
Linus Torvalds [Fri, 13 Aug 2010 00:54:33 +0000 (17:54 -0700)]
mm: keep a guard page below a grow-down stack segment

commit 320b2b8de12698082609ebbc1a17165727f4c893 upstream.

This is a rather minimally invasive patch to solve the problem of the
user stack growing into a memory mapped area below it.  Whenever we fill
the first page of the stack segment, expand the segment down by one
page.

Now, admittedly some odd application might _want_ the stack to grow down
into the preceding memory mapping, and so we may at some point need to
make this a process tunable (some people might also want to have more
than a single page of guarding), but let's try the minimal approach
first.

Tested with trivial application that maps a single page just below the
stack, and then starts recursing.  Without this, we will get a SIGSEGV
_after_ the stack has smashed the mapping.  With this patch, we'll get a
nice SIGBUS just as the stack touches the page just above the mapping.

Requested-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agostaging: rtl8187se: Change panic to warn when RF switch turned off
Larry Finger [Sat, 13 Nov 2010 19:01:56 +0000 (13:01 -0600)]
staging: rtl8187se: Change panic to warn when RF switch turned off

commit f36d83a8cb7224f45fdfa1129a616dff56479a09 upstream.

This driver issues a kernel panic over conditions that do not
justify such drastic action. Change these to log entries with
a stack dump.

This patch fixes the system crash reported in
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/674285.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-and-Tested-by: Robie Basik <rb-oss-3@justgohome.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoStaging: frontier: fix up some sysfs attribute permissions
Greg Kroah-Hartman [Tue, 16 Nov 2010 19:18:33 +0000 (11:18 -0800)]
Staging: frontier: fix up some sysfs attribute permissions

commit 3bad28ec006ad6ab2bca4e5103860b75391e3c9d and
2a767fda5d0d8dcff465724dfad6ee131489b3f2 upstream merged together.

They should not be writable by any user

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Taht <d@teklibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoStaging: samsung-laptop: fix up my fixup for some sysfs attribute permissions
Greg Kroah-Hartman [Thu, 18 Nov 2010 19:21:04 +0000 (11:21 -0800)]
Staging: samsung-laptop: fix up my fixup for some sysfs attribute permissions

commit 4d7bc388b44e42a1feafa35e50eef4f24d6ca59d upstream.

They should be writable by root, not readable.
Doh, stupid me with the wrong flags.

Reported-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoStaging: samsung-laptop: fix up some sysfs attribute permissions
Greg Kroah-Hartman [Tue, 16 Nov 2010 19:21:03 +0000 (11:21 -0800)]
Staging: samsung-laptop: fix up some sysfs attribute permissions

commit 90c05b97fdec8d2196e420d98f774bab731af7aa upstream.

They should not be writable by any user

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoudp: add rehash on connect()
Eric Dumazet [Wed, 8 Sep 2010 05:08:44 +0000 (05:08 +0000)]
udp: add rehash on connect()

commit 719f835853a92f6090258114a72ffe41f09155cd upstream.

commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation)
added a secondary hash on UDP, hashed on (local addr, local port).

Problem is that following sequence :

fd = socket(...)
connect(fd, &remote, ...)

not only selects remote end point (address and port), but also sets
local address, while UDP stack stored in secondary hash table the socket
while its local address was INADDR_ANY (or ipv6 equivalent)

Sequence is :
 - autobind() : choose a random local port, insert socket in hash tables
              [while local address is INADDR_ANY]
 - connect() : set remote address and port, change local address to IP
              given by a route lookup.

When an incoming UDP frame comes, if more than 10 sockets are found in
primary hash table, we switch to secondary table, and fail to find
socket because its local address changed.

One solution to this problem is to rehash datagram socket if needed.

We add a new rehash(struct socket *) method in "struct proto", and
implement this method for UDP v4 & v6, using a common helper.

This rehashing only takes care of secondary hash table, since primary
hash (based on local port only) is not changed.

Reported-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoleds: fix bug with reading NAS SS4200 dmi code
Steven Rostedt [Wed, 24 Nov 2010 20:56:52 +0000 (12:56 -0800)]
leds: fix bug with reading NAS SS4200 dmi code

commit 50d431e8a15701b599c98afe2b464eb33c952477 upstream.

While running randconfg with ktest.pl I stumbled upon this bug:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000003
  IP: [<ffffffff815fe44f>] strstr+0x39/0x86
  PGD 0
  Oops: 0000 [#1] SMP
  last sysfs file:
  CPU 0
  Modules linked in:

  Pid: 1, comm: swapper Not tainted 2.6.37-rc1-test+ #6 DG965MQ/
  RIP: 0010:[<ffffffff815fe44f>]  [<ffffffff815fe44f>] strstr+0x39/0x86
  RSP: 0018:ffff8800797cbd80  EFLAGS: 00010213
  RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffffffffffffffff
  RDX: 0000000000000000 RSI: ffffffff82eb7ac9 RDI: 0000000000000003
  RBP: ffff8800797cbda0 R08: ffff880000000003 R09: 0000000000030725
  R10: ffff88007d294c00 R11: 0000000000014c00 R12: 0000000000000020
  R13: ffffffff82eb7ac9 R14: ffffffffffffffff R15: ffffffff82eb7b08
  FS:  0000000000000000(0000) GS:ffff88007d200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000003 CR3: 0000000002a1d000 CR4: 00000000000006f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process swapper (pid: 1, threadinfo ffff8800797ca000, task ffff8800797d0000)
  Stack:
   00000000000000ba ffffffff82eb7ac9 ffffffff82eb7ab8 00000000000000ba
   ffff8800797cbdf0 ffffffff81e2050f ffff8800797cbdc0 00000000815f913b
   ffff8800797cbe00 ffffffff82eb7ab8 0000000000000000 0000000000000000
  Call Trace:
   [<ffffffff81e2050f>] dmi_matches+0x117/0x154
   [<ffffffff81e205d7>] dmi_check_system+0x3d/0x8d
   [<ffffffff82e1ad25>] ? nas_gpio_init+0x0/0x2c8
   [<ffffffff82e1ad49>] nas_gpio_init+0x24/0x2c8
   [<ffffffff820d750d>] ? wm8350_led_init+0x0/0x20
   [<ffffffff82e1ad25>] ? nas_gpio_init+0x0/0x2c8
   [<ffffffff810022f7>] do_one_initcall+0xab/0x1b2
   [<ffffffff82da749c>] kernel_init+0x248/0x331
   [<ffffffff8100e624>] kernel_thread_helper+0x4/0x10
   [<ffffffff82da7254>] ? kernel_init+0x0/0x331

Found that the nas_led_whitelist dmi_system_id structure array had no
NULL end delimiter, causing the dmi_check_system() loop to read an
undefined entry.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Dave Hansen <dave@sr71.net>
Acked-by: Richard Purdie <rpurdie@linux.intel.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoARM: 6482/2: Fix find_next_zero_bit and related assembly
James Jones [Tue, 23 Nov 2010 23:21:37 +0000 (00:21 +0100)]
ARM: 6482/2: Fix find_next_zero_bit and related assembly

commit 0e91ec0c06d2cd15071a6021c94840a50e6671aa upstream.

The find_next_bit, find_first_bit, find_next_zero_bit
and find_first_zero_bit functions were not properly
clamping to the maxbit argument at the bit level. They
were instead only checking maxbit at the byte level.
To fix this, add a compare and a conditional move
instruction to the end of the common bit-within-the-
byte code used by all the functions and be sure not to
clobber the maxbit argument before it is used.

Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: James Jones <jajones@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoARM: 6489/1: thumb2: fix incorrect optimisation in usracc
Will Deacon [Fri, 19 Nov 2010 12:18:31 +0000 (13:18 +0100)]
ARM: 6489/1: thumb2: fix incorrect optimisation in usracc

commit 1142b71d85894dcff1466dd6c871ea3c89e0352c upstream.

Commit 8b592783 added a Thumb-2 variant of usracc which, when it is
called with \rept=2, calls usraccoff once with an offset of 0 and
secondly with a hard-coded offset of 4 in order to avoid incrementing
the pointer again. If \inc != 4 then we will store the data to the wrong
offset from \ptr. Luckily, the only caller that passes \rept=2 to this
function is __clear_user so we haven't been actively corrupting user data.

This patch fixes usracc to pass \inc instead of #4 to usraccoff
when it is called a second time.

Reported-by: Tony Thompson <tony.thompson@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoperf_events: Fix perf_counter_mmap() hook in mprotect()
Pekka Enberg [Mon, 8 Nov 2010 19:29:07 +0000 (21:29 +0200)]
perf_events: Fix perf_counter_mmap() hook in mprotect()

commit 63bfd7384b119409685a17d5c58f0b56e5dc03da upstream.

As pointed out by Linus, commit dab5855 ("perf_counter: Add mmap event hooks to
mprotect()") is fundamentally wrong as mprotect_fixup() can free 'vma' due to
merging. Fix the problem by moving perf_event_mmap() hook to
mprotect_fixup().

Note: there's another successful return path from mprotect_fixup() if old
flags equal to new flags. We don't, however, need to call
perf_event_mmap() there because 'perf' already knows the VMA is
executable.

Reported-by: Dave Jones <davej@redhat.com>
Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoDECnet: don't leak uninitialized stack byte
Dan Rosenberg [Tue, 23 Nov 2010 11:02:13 +0000 (11:02 +0000)]
DECnet: don't leak uninitialized stack byte

commit 3c6f27bf33052ea6ba9d82369fb460726fb779c0 upstream.

A single uninitialized padding byte is leaked to userspace.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agox86: Ignore trap bits on single step exceptions
Frederic Weisbecker [Thu, 11 Nov 2010 20:18:43 +0000 (21:18 +0100)]
x86: Ignore trap bits on single step exceptions

commit 6c0aca288e726405b01dacb12cac556454d34b2a upstream.

When a single step exception fires, the trap bits, used to
signal hardware breakpoints, are in a random state.

These trap bits might be set if another exception will follow,
like a breakpoint in the next instruction, or a watchpoint in the
previous one. Or there can be any junk there.

So if we handle these trap bits during the single step exception,
we are going to handle an exception twice, or we are going to
handle junk.

Just ignore them in this case.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=21332

Reported-by: Michael Stefaniuc <mstefani@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Alexandre Julliard <julliard@winehq.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agonommu: yield CPU while disposing VM
Steven J. Magnani [Wed, 24 Nov 2010 20:56:54 +0000 (12:56 -0800)]
nommu: yield CPU while disposing VM

commit 04c3496152394d17e3bc2316f9731ee3e8a026bc upstream.

Depending on processor speed, page size, and the amount of memory a
process is allowed to amass, cleanup of a large VM may freeze the system
for many seconds.  This can result in a watchdog timeout.

Make sure other tasks receive some service when cleaning up large VMs.

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Cc: Greg Ungerer <gerg@snapgear.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agobacklight: grab ops_lock before testing bd->ops
Uwe Kleine-König [Wed, 24 Nov 2010 20:57:14 +0000 (12:57 -0800)]
backlight: grab ops_lock before testing bd->ops

commit d1d73578e053b981c3611e5a211534290d24a5eb upstream.

According to the comment describing ops_lock in the definition of struct
backlight_device and when comparing with other functions in backlight.c
the mutex must be hold when checking ops to be non-NULL.

Fixes a problem added by c835ee7f4154992e6 ("backlight: Add suspend/resume
support to the backlight core") in Jan 2009.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Richard Purdie <rpurdie@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agodo_exit(): make sure that we run with get_fs() == USER_DS
Nelson Elhage [Thu, 2 Dec 2010 22:31:21 +0000 (14:31 -0800)]
do_exit(): make sure that we run with get_fs() == USER_DS

commit 33dd94ae1ccbfb7bf0fb6c692bc3d1c4269e6177 upstream.

If a user manages to trigger an oops with fs set to KERNEL_DS, fs is not
otherwise reset before do_exit().  do_exit may later (via mm_release in
fork.c) do a put_user to a user-controlled address, potentially allowing
a user to leverage an oops into a controlled write into kernel memory.

This is only triggerable in the presence of another bug, but this
potentially turns a lot of DoS bugs into privilege escalations, so it's
worth fixing.  I have proof-of-concept code which uses this bug along
with CVE-2010-3849 to write a zero to an arbitrary kernel address, so
I've tested that this is not theoretical.

A more logical place to put this fix might be when we know an oops has
occurred, before we call do_exit(), but that would involve changing
every architecture, in multiple places.

Let's just stick it in do_exit instead.

[akpm@linux-foundation.org: update code comment]
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agofuse: fix attributes after open(O_TRUNC)
Ken Sumrall [Wed, 24 Nov 2010 20:57:00 +0000 (12:57 -0800)]
fuse: fix attributes after open(O_TRUNC)

commit a0822c55779d9319939eac69f00bb729ea9d23da upstream.

The attribute cache for a file was not being cleared when a file is opened
with O_TRUNC.

If the filesystem's open operation truncates the file ("atomic_o_trunc"
feature flag is set) then the kernel should invalidate the cached st_mtime
and st_ctime attributes.

Also i_size should be explicitly be set to zero as it is used sometimes
without refreshing the cache.

Signed-off-by: Ken Sumrall <ksumrall@android.com>
Cc: Anfei <anfei.zhou@gmail.com>
Cc: "Anand V. Avati" <avati@gluster.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoacpi-cpufreq: fix a memleak when unloading driver
Zhang Rui [Tue, 12 Oct 2010 01:09:37 +0000 (09:09 +0800)]
acpi-cpufreq: fix a memleak when unloading driver

commit dab5fff14df2cd16eb1ad4c02e83915e1063fece upstream.

We didn't free per_cpu(acfreq_data, cpu)->freq_table
when acpi_freq driver is unloaded.

Resulting in the following messages in /sys/kernel/debug/kmemleak:

unreferenced object 0xf6450e80 (size 64):
  comm "modprobe", pid 1066, jiffies 4294677317 (age 19290.453s)
  hex dump (first 32 bytes):
    00 00 00 00 e8 a2 24 00 01 00 00 00 00 9f 24 00  ......$.......$.
    02 00 00 00 00 6a 18 00 03 00 00 00 00 35 0c 00  .....j.......5..
  backtrace:
    [<c123ba97>] kmemleak_alloc+0x27/0x50
    [<c109f96f>] __kmalloc+0xcf/0x110
    [<f9da97ee>] acpi_cpufreq_cpu_init+0x1ee/0x4e4 [acpi_cpufreq]
    [<c11cd8d2>] cpufreq_add_dev+0x142/0x3a0
    [<c11920b7>] sysdev_driver_register+0x97/0x110
    [<c11cce56>] cpufreq_register_driver+0x86/0x140
    [<f9dad080>] 0xf9dad080
    [<c1001130>] do_one_initcall+0x30/0x160
    [<c10626e9>] sys_init_module+0x99/0x1e0
    [<c1002d97>] sysenter_do_call+0x12/0x26
    [<ffffffff>] 0xffffffff

https://bugzilla.kernel.org/show_bug.cgi?id=15807#c21

Tested-by: Toralf Forster <toralf.foerster@gmx.de>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: serial: ftdi_sio: Vardaan USB RS422/485 converter PID added
Jacques Viviers [Wed, 24 Nov 2010 09:56:38 +0000 (11:56 +0200)]
USB: serial: ftdi_sio: Vardaan USB RS422/485 converter PID added

commit 6fdbad8021151a9e93af8159a6232c8f26415c09 upstream.

Add the PID for the Vardaan Enterprises VEUSB422R3 USB to RS422/485
converter. It uses the same chip as the FTDI_8U232AM_PID 0x6001.

This should also work with the stable branches for:
2.6.31, 2.6.32, 2.6.33, 2.6.34, 2.6.35, 2.6.36

Signed-off-by: Jacques Viviers <jacques.viviers@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: ftdi_sio: Add ID for RT Systems USB-29B radio cable
Michael Stuermer [Wed, 17 Nov 2010 23:45:43 +0000 (00:45 +0100)]
USB: ftdi_sio: Add ID for RT Systems USB-29B radio cable

commit 28942bb6a9dd4e2ed793675e515cfb8297ed355b upstream.

Another variant of the RT Systems programming cable for ham radios.

Signed-off-by: Michael Stuermer <ms@mallorn.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: misc: usbsevseg: fix up some sysfs attribute permissions
Greg Kroah-Hartman [Mon, 15 Nov 2010 19:36:44 +0000 (11:36 -0800)]
USB: misc: usbsevseg: fix up some sysfs attribute permissions

commit e24d7ace4e822debcb78386bf279c9aba4d7fbd1 upstream.

They should not be writable by any user.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Harrison Metzger <harrisonmetz@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: misc: trancevibrator: fix up a sysfs attribute permission
Greg Kroah-Hartman [Mon, 15 Nov 2010 19:34:26 +0000 (11:34 -0800)]
USB: misc: trancevibrator: fix up a sysfs attribute permission

commit d489a4b3926bad571d404ca6508f6744b9602776 upstream.

It should not be writable by any user.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sam Hocevar <sam@zoy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: ftdi_sio: revert "USB: ftdi_sio: fix DTR/RTS line modes"
Johan Hovold [Sun, 12 Sep 2010 14:31:45 +0000 (16:31 +0200)]
USB: ftdi_sio: revert "USB: ftdi_sio: fix DTR/RTS line modes"

commit 677aeafe19e88c282af74564048243ccabb1c590 upstream.

This reverts commit 6a1a82df91fa0eb1cc76069a9efe5714d087eccd.

RTS and DTR should not be modified based on CRTSCTS when calling
set_termios.

Modem control lines are raised at port open by the tty layer and should stay
raised regardless of whether hardware flow control is enabled or not.

This is in conformance with the way serial ports work today and many
applications depend on this behaviour to be able to talk to hardware
implementing hardware flow control (without the applications actually using
it).

Hardware which expects different behaviour on these lines can always
use TIOCMSET/TIOCMBI[SC] after port open to change them.

Reported-by: Daniel Mack <daniel@caiaq.de>
Reported-by: Dave Mielke <dave@mielke.cc>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: misc: usbled: fix up some sysfs attribute permissions
Greg Kroah-Hartman [Mon, 15 Nov 2010 19:35:49 +0000 (11:35 -0800)]
USB: misc: usbled: fix up some sysfs attribute permissions

commit 48f115470e68d443436b76b22dad63ffbffd6b97 upstream.

They should not be writable by any user.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: misc: cypress_cy7c63: fix up some sysfs attribute permissions
Greg Kroah-Hartman [Mon, 15 Nov 2010 19:32:38 +0000 (11:32 -0800)]
USB: misc: cypress_cy7c63: fix up some sysfs attribute permissions

commit c990600d340641150f7270470a64bd99a5c0b225 upstream.

They should not be writable by any user.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oliver Bock <bock@tfh-berlin.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: atm: ueagle-atm: fix up some permissions on the sysfs files
Greg Kroah-Hartman [Mon, 15 Nov 2010 19:11:45 +0000 (11:11 -0800)]
USB: atm: ueagle-atm: fix up some permissions on the sysfs files

commit e502ac5e1eca99d7dc3f12b2a6780ccbca674858 upstream.

Some of the sysfs files had the incorrect permissions.  Some didn't make
sense at all (writable for a file that you could not write to?)

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthieu Castet <castet.matthieu@free.fr>
Cc: Stanislaw Gruszka <stf_xl@wp.pl>
Cc: Damien Bergamini <damien.bergamini@free.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: storage: sierra_ms: fix sysfs file attribute
Greg Kroah-Hartman [Mon, 15 Nov 2010 19:17:52 +0000 (11:17 -0800)]
USB: storage: sierra_ms: fix sysfs file attribute

commit d9624e75f6ad94d8a0718c1fafa89186d271a78c upstream.

A non-writable sysfs file shouldn't have writable attributes.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kevin Lloyd <klloyd@sierrawireless.com>
Cc: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: EHCI: fix obscure race in ehci_endpoint_disable
Alan Stern [Tue, 16 Nov 2010 15:57:37 +0000 (10:57 -0500)]
USB: EHCI: fix obscure race in ehci_endpoint_disable

commit 02e2c51ba3e80acde600721ea784c3ef84da5ea1 upstream.

This patch (as1435) fixes an obscure and unlikely race in ehci-hcd.
When an async URB is unlinked, the corresponding QH is removed from
the async list.  If the QH's endpoint is then disabled while the URB
is being given back, ehci_endpoint_disable() won't find the QH on the
async list, causing it to believe that the QH has been lost.  This
will lead to a memory leak at best and quite possibly to an oops.

The solution is to trust usbcore not to lose track of endpoints.  If
the QH isn't on the async list then it doesn't need to be taken off
the list, but the driver should still wait for the QH to become IDLE
before disabling it.

In theory this fixes Bugzilla #20182.  In fact the race is so rare
that it's not possible to tell whether the bug is still present.
However, adding delays and making other changes to force the race
seems to show that the patch works.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
CC: David Brownell <david-b@pacbell.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoStaging: rt2870: Add USB ID for Buffalo Airstation WLI-UC-GN
John Tapsell [Thu, 25 Mar 2010 13:30:45 +0000 (13:30 +0000)]
Staging: rt2870: Add USB ID for Buffalo Airstation WLI-UC-GN

commit 251d380034c6c34efe75ffb89d863558ba68ec6a upstream.

BugLink: http://bugs.launchpad.net/bugs/441990
This was tested to successfully enable the hardware.

Signed-off-by: John Tapsell <johnflux@gmail.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agousb: core: fix information leak to userland
Vasiliy Kulikov [Sat, 6 Nov 2010 14:41:28 +0000 (17:41 +0300)]
usb: core: fix information leak to userland

commit 886ccd4520064408ce5876cfe00554ce52ecf4a7 upstream.

Structure usbdevfs_connectinfo is copied to userland with padding byted
after "slow" field uninitialized.  It leads to leaking of contents of
kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agousb: misc: iowarrior: fix information leak to userland
Vasiliy Kulikov [Sat, 6 Nov 2010 14:41:31 +0000 (17:41 +0300)]
usb: misc: iowarrior: fix information leak to userland

commit eca67aaeebd6e5d22b0d991af1dd0424dc703bfb upstream.

Structure iowarrior_info is copied to userland with padding byted
between "serial" and "revision" fields uninitialized.  It leads to
leaking of contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Acked-by: Kees Cook <kees.cook@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agousb: misc: sisusbvga: fix information leak to userland
Vasiliy Kulikov [Sat, 6 Nov 2010 14:41:35 +0000 (17:41 +0300)]
usb: misc: sisusbvga: fix information leak to userland

commit 5dc92cf1d0b4b0debbd2e333b83f9746c103533d upstream.

Structure sisusb_info is copied to userland with "sisusb_reserved" field
uninitialized.  It leads to leaking of contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoUSB: option: fix when the driver is loaded incorrectly for some Huawei devices.
ma rui [Mon, 1 Nov 2010 03:32:18 +0000 (11:32 +0800)]
USB: option: fix when the driver is loaded incorrectly for some Huawei devices.

commit 58c0d9d70109bd7e82bdb9517007311a48499960 upstream.

When huawei datacard with PID 0x14AC is insterted into Linux system, the
present kernel will load the "option" driver to all the interfaces. But
actually, some interfaces run as other function and do not need "option"
driver.

In this path, we modify the id_tables, when the PID is 0x14ac ,VID is
0x12d1, Only when the interface's Class is 0xff,Subclass is 0xff, Pro is
0xff, it does need "option" driver.

Signed-off-by: ma rui <m00150988@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoefifb: check that the base address is plausible on pci systems
Peter Jones [Wed, 22 Sep 2010 20:05:04 +0000 (13:05 -0700)]
efifb: check that the base address is plausible on pci systems

commit 85a00d9bbfb4704fbf368944b1cb9fed8f1598c5 upstream.

Some Apple machines have identical DMI data but different memory
configurations for the video.  Given that, check that the address in our
table is actually within the range of a PCI BAR on a VGA device in the
machine.

This also fixes up the return value from set_system(), which has always
been wrong, but never resulted in bad behavior since there's only ever
been one matching entry in the dmi table.

The patch

1) stops people's machines from crashing when we get their display wrong,
   which seems to be unfortunately inevitable,

2) allows us to support identical dmi data with differing video memory
   configurations

This also adds me as the efifb maintainer, since I've effectively been
acting as such for quite some time.

Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: maximilian attems <max@stro.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoPCI: fix offset check for sysfs mmapped files
Darrick J. Wong [Tue, 16 Nov 2010 17:13:41 +0000 (09:13 -0800)]
PCI: fix offset check for sysfs mmapped files

commit 8c05cd08a7504b855c265263e84af61aabafa329 upstream.

I just loaded 2.6.37-rc2 on my machines, and I noticed that X no longer starts.
Running an strace of the X server shows that it's doing this:

open("/sys/bus/pci/devices/0000:07:00.0/resource0", O_RDWR) = 10
mmap(NULL, 16777216, PROT_READ|PROT_WRITE, MAP_SHARED, 10, 0) = -1 EINVAL (Invalid argument)

This code seems to be asking for a shared read/write mapping of 16MB worth of
BAR0 starting at file offset 0, and letting the kernel assign a starting
address.  Unfortunately, this -EINVAL causes X not to start.  Looking into
dmesg, there's a complaint like so:

process "Xorg" tried to map 0x01000000 bytes at page 0x00000000 on 0000:07:00.0 BAR 0 (start 0x        96000000, size 0x         1000000)

...with the following code in pci_mmap_fits:

pci_start = (mmap_api == PCI_MMAP_SYSFS) ?
pci_resource_start(pdev, resno) >> PAGE_SHIFT : 0;
        if (start >= pci_start && start < pci_start + size &&
                        start + nr <= pci_start + size)

It looks like the logic here is set up such that when the mmap call comes via
sysfs, the check in pci_mmap_fits wants vma->vm_pgoff to be between the
resource's start and end address, and the end of the vma to be no farther than
the end.  However, the sysfs PCI resource files always start at offset zero,
which means that this test always fails for programs that mmap the sysfs files.
Given the comment in the original commit
3b519e4ea618b6943a82931630872907f9ac2c2b, I _think_ the old procfs files
require that the file offset be equal to the resource's base address when
mmapping.

I think what we want here is for pci_start to be 0 when mmap_api ==
PCI_MMAP_PROCFS.  The following patch makes that change, after which the Matrox
and Mach64 X drivers work again.

Acked-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoPCI: fix size checks for mmap() on /proc/bus/pci files
Martin Wilck [Wed, 10 Nov 2010 10:03:21 +0000 (11:03 +0100)]
PCI: fix size checks for mmap() on /proc/bus/pci files

commit 3b519e4ea618b6943a82931630872907f9ac2c2b upstream.

The checks for valid mmaps of PCI resources made through /proc/bus/pci files
that were introduced in 9eff02e2042f96fb2aedd02e032eca1c5333d767 have several
problems:

1. mmap() calls on /proc/bus/pci files are made with real file offsets > 0,
whereas under /sys/bus/pci/devices, the start of the resource corresponds
to offset 0. This may lead to false negatives in pci_mmap_fits(), which
implicitly assumes the /sys/bus/pci/devices layout.

2. The loop in proc_bus_pci_mmap doesn't skip empty resouces. This leads
to false positives, because pci_mmap_fits() doesn't treat empty resources
correctly (the calculated size is 1 << (8*sizeof(resource_size_t)-PAGE_SHIFT)
in this case!).

3. If a user maps resources with BAR > 0, pci_mmap_fits will emit bogus
WARNINGS for the first resources that don't fit until the correct one is found.

On many controllers the first 2-4 BARs are used, and the others are empty.
In this case, an mmap attempt will first fail on the non-empty BARs
(including the "right" BAR because of 1.) and emit bogus WARNINGS because
of 3., and finally succeed on the first empty BAR because of 2.
This is certainly not the intended behaviour.

This patch addresses all 3 issues.
Updated with an enum type for the additional parameter for pci_mmap_fits().

Signed-off-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agolibata: fix NULL sdev dereference race in atapi_qc_complete()
Tejun Heo [Mon, 1 Nov 2010 10:39:19 +0000 (11:39 +0100)]
libata: fix NULL sdev dereference race in atapi_qc_complete()

commit 2a5f07b5ec098edc69e05fdd2f35d3fbb1235723 upstream.

SCSI commands may be issued between __scsi_add_device() and dev->sdev
assignment, so it's unsafe for ata_qc_complete() to dereference
dev->sdev->locked without checking whether it's NULL or not.  Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agodrm/ttm: Clear the ghost cpu_writers flag on ttm_buffer_object_transfer.
Francisco Jerez [Tue, 21 Sep 2010 00:15:15 +0000 (02:15 +0200)]
drm/ttm: Clear the ghost cpu_writers flag on ttm_buffer_object_transfer.

commit 0fbecd400dd0a82d465b3086f209681e8c54cb0f upstream.

It makes sense for a BO to move after a process has requested
exclusive RW access on it (e.g. because the BO used to be located in
unmappable VRAM and we intercepted the CPU access from the fault
handler).

If we let the ghost object inherit cpu_writers from the original
object, ttm_bo_release_list() will raise a kernel BUG when the ghost
object is destroyed. This can be reproduced with the nouveau driver on
nv5x.

Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Tested-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agobio: take care not overflow page count when mapping/copying user data
Jens Axboe [Wed, 10 Nov 2010 13:36:25 +0000 (14:36 +0100)]
bio: take care not overflow page count when mapping/copying user data

commit cb4644cac4a2797afc847e6c92736664d4b0ea34 upstream.

If the iovec is being set up in a way that causes uaddr + PAGE_SIZE
to overflow, we could end up attempting to map a huge number of
pages. Check for this invalid input type.

Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomm/vfs: revalidate page->mapping in do_generic_file_read()
Dave Hansen [Thu, 11 Nov 2010 22:05:15 +0000 (14:05 -0800)]
mm/vfs: revalidate page->mapping in do_generic_file_read()

commit 8d056cb965b8fb7c53c564abf28b1962d1061cd3 upstream.

70 hours into some stress tests of a 2.6.32-based enterprise kernel, we
ran into a NULL dereference in here:

int block_is_partially_uptodate(struct page *page, read_descriptor_t *desc,
                                        unsigned long from)
{
----> struct inode *inode = page->mapping->host;

It looks like page->mapping was the culprit.  (xmon trace is below).
After closer examination, I realized that do_generic_file_read() does a
find_get_page(), and eventually locks the page before calling
block_is_partially_uptodate().  However, it doesn't revalidate the
page->mapping after the page is locked.  So, there's a small window
between the find_get_page() and ->is_partially_uptodate() where the page
could get truncated and page->mapping cleared.

We _have_ a reference, so it can't get reclaimed, but it certainly
can be truncated.

I think the correct thing is to check page->mapping after the
trylock_page(), and jump out if it got truncated.  This patch has been
running in the test environment for a month or so now, and we have not
seen this bug pop up again.

xmon info:

  1f:mon> e
  cpu 0x1f: Vector: 300 (Data Access) at [c0000002ae36f770]
      pc: c0000000001e7a6c: .block_is_partially_uptodate+0xc/0x100
      lr: c000000000142944: .generic_file_aio_read+0x1e4/0x770
      sp: c0000002ae36f9f0
     msr: 8000000000009032
     dar: 0
   dsisr: 40000000
    current = 0xc000000378f99e30
    paca    = 0xc000000000f66300
      pid   = 21946, comm = bash
  1f:mon> r
  R00 = 0025c0500000006d   R16 = 0000000000000000
  R01 = c0000002ae36f9f0   R17 = c000000362cd3af0
  R02 = c000000000e8cd80   R18 = ffffffffffffffff
  R03 = c0000000031d0f88   R19 = 0000000000000001
  R04 = c0000002ae36fa68   R20 = c0000003bb97b8a0
  R05 = 0000000000000000   R21 = c0000002ae36fa68
  R06 = 0000000000000000   R22 = 0000000000000000
  R07 = 0000000000000001   R23 = c0000002ae36fbb0
  R08 = 0000000000000002   R24 = 0000000000000000
  R09 = 0000000000000000   R25 = c000000362cd3a80
  R10 = 0000000000000000   R26 = 0000000000000002
  R11 = c0000000001e7b60   R27 = 0000000000000000
  R12 = 0000000042000484   R28 = 0000000000000001
  R13 = c000000000f66300   R29 = c0000003bb97b9b8
  R14 = 0000000000000001   R30 = c000000000e28a08
  R15 = 000000000000ffff   R31 = c0000000031d0f88
  pc  = c0000000001e7a6c .block_is_partially_uptodate+0xc/0x100
  lr  = c000000000142944 .generic_file_aio_read+0x1e4/0x770
  msr = 8000000000009032   cr  = 22000488
  ctr = c0000000001e7a60   xer = 0000000020000000   trap =  300
  dar = 0000000000000000   dsisr = 40000000
  1f:mon> t
  [link register   ] c000000000142944 .generic_file_aio_read+0x1e4/0x770
  [c0000002ae36f9f0c000000000142a14 .generic_file_aio_read+0x2b4/0x770 (unreliable)
  [c0000002ae36fb40c0000000001b03e4 .do_sync_read+0xd4/0x160
  [c0000002ae36fce0c0000000001b153c .vfs_read+0xec/0x1f0
  [c0000002ae36fd80c0000000001b1768 .SyS_read+0x58/0xb0
  [c0000002ae36fe30c00000000000852c syscall_exit+0x0/0x40
  --- Exception: c00 (System Call) at 00000080a840bc54
  SP (fffca15df30) is in userspace
  1f:mon> di c0000000001e7a6c
  c0000000001e7a6c  e9290000      ld      r9,0(r9)
  c0000000001e7a70  418200c0      beq     c0000000001e7b30        # .block_is_partially_uptodate+0xd0/0x100
  c0000000001e7a74  e9440008      ld      r10,8(r4)
  c0000000001e7a78  78a80020      clrldi  r8,r5,32
  c0000000001e7a7c  3c000001      lis     r0,1
  c0000000001e7a80  812900a8      lwz     r9,168(r9)
  c0000000001e7a84  39600001      li      r11,1
  c0000000001e7a88  7c080050      subf    r0,r8,r0
  c0000000001e7a8c  7f805040      cmplw   cr7,r0,r10
  c0000000001e7a90  7d6b4830      slw     r11,r11,r9
  c0000000001e7a94  796b0020      clrldi  r11,r11,32
  c0000000001e7a98  419d00a8      bgt     cr7,c0000000001e7b40    # .block_is_partially_uptodate+0xe0/0x100
  c0000000001e7a9c  7fa55840      cmpld   cr7,r5,r11
  c0000000001e7aa0  7d004214      add     r8,r0,r8
  c0000000001e7aa4  79080020      clrldi  r8,r8,32
  c0000000001e7aa8  419c0078      blt     cr7,c0000000001e7b20    # .block_is_partially_uptodate+0xc0/0x100

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: <arunabal@in.ibm.com>
Cc: <sbest@us.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agolatencytop: fix per task accumulator
Ken Chen [Thu, 11 Nov 2010 22:05:16 +0000 (14:05 -0800)]
latencytop: fix per task accumulator

commit 38715258aa2e8cd94bd4aafadc544e5104efd551 upstream.

Per task latencytop accumulator prematurely terminates due to erroneous
placement of latency_record_count.  It should be incremented whenever a
new record is allocated instead of increment on every latencytop event.

Also fix search iterator to only search known record events instead of
blindly searching all pre-allocated space.

Signed-off-by: Ken Chen <kenchen@google.com>
Reviewed-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agonetfilter: nf_conntrack: allow nf_ct_alloc_hashtable() to get highmem pages
Eric Dumazet [Thu, 28 Oct 2010 10:34:21 +0000 (12:34 +0200)]
netfilter: nf_conntrack: allow nf_ct_alloc_hashtable() to get highmem pages

commit 6b1686a71e3158d3c5f125260effce171cc7852b upstream.

commit ea781f197d6a8 (use SLAB_DESTROY_BY_RCU and get rid of call_rcu())
did a mistake in __vmalloc() call in nf_ct_alloc_hashtable().

I forgot to add __GFP_HIGHMEM, so pages were taken from LOWMEM only.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoALSA: hda: Use "alienware" model quirk for another SSID
Daniel T Chen [Thu, 2 Dec 2010 00:16:07 +0000 (19:16 -0500)]
ALSA: hda: Use "alienware" model quirk for another SSID

commit 0defe09ca70daccdc83abd9c3c24cd89ae6a1141 upstream.

BugLink: https://launchpad.net/bugs/683695
The original reporter states that headphone jacks do not appear to
work.  Upon inspecting his codec dump, and upon further testing, it is
confirmed that the "alienware" model quirk is correct.

Reported-and-tested-by: Cody Thierauf
Signed-off-by: Daniel T Chen <crimsun@ubuntu.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoALSA: HDA: Add an extra DAC for Realtek ALC887-VD
David Henningsson [Wed, 24 Nov 2010 13:17:47 +0000 (14:17 +0100)]
ALSA: HDA: Add an extra DAC for Realtek ALC887-VD

commit cc1c452e509aefc28f7ad2deed75bc69d4f915f7 upstream.

The patch enables ALC887-VD to use the DAC at nid 0x26,
which makes it possible to use this DAC for e g Headphone
volume.

Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoALSA: ac97: Apply quirk for Dell Latitude D610 binding Master and Headphone controls
Daniel T Chen [Mon, 1 Nov 2010 05:14:51 +0000 (01:14 -0400)]
ALSA: ac97: Apply quirk for Dell Latitude D610 binding Master and Headphone controls

commit 0613a59456980161d0cd468bae6c63d772743102 upstream.

BugLink: https://launchpad.net/bugs/669279
The original reporter states: "The Master mixer does not change the
volume from the headphone output (which is affected by the headphone
mixer). Instead it only seems to control the on-board speaker volume.
This confuses PulseAudio greatly as the Master channel is merged into
the volume mix."

Fix this symptom by applying the hp_only quirk for the reporter's SSID.
The fix is applicable to all stable kernels.

Reported-and-tested-by: Ben Gamari <bgamari@gmail.com>
Signed-off-by: Daniel T Chen <crimsun@ubuntu.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoALSA: hda - Fixed ALC887-VD initial error
Kailang Yang [Mon, 22 Nov 2010 09:59:36 +0000 (10:59 +0100)]
ALSA: hda - Fixed ALC887-VD initial error

commit 01e0f1378c47947b825eac05c98697ab1be1c86f upstream.

ALC887-VD is like ALC888-VD. It can not be initialized as ALC882.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agofirewire: ohci: fix race in AR split packet handling
Clemens Ladisch [Mon, 25 Oct 2010 09:42:20 +0000 (11:42 +0200)]
firewire: ohci: fix race in AR split packet handling

commit a1f805e5e73a8fe166b71c6592d3837df0cd5e2e upstream.

When handling an AR buffer that has been completely filled, we assumed
that its descriptor will not be read by the controller and can be
overwritten.  However, when the last received packet happens to end at
the end of the buffer, the controller might not yet have moved on to the
next buffer and might read the branch address later.  If we overwrite
and free the page before that, the DMA context will either go dead
because of an invalid Z value, or go off into some random memory.

To fix this, ensure that the descriptor does not get overwritten by
using only the actual buffer instead of the entire page for reassembling
the split packet.  Furthermore, to avoid freeing the page too early,
move on to the next buffer only when some data in it guarantees that the
controller has moved on.

This should eliminate the remaining firewire-net problems.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agofirewire: ohci: fix buffer overflow in AR split packet handling
Clemens Ladisch [Mon, 25 Oct 2010 09:41:53 +0000 (11:41 +0200)]
firewire: ohci: fix buffer overflow in AR split packet handling

commit 85f7ffd5d2b320f73912b15fe8cef34bae297daf upstream.

When the controller had to split a received asynchronous packet into two
buffers, the driver tries to reassemble it by copying both parts into
the first page.  However, if size + rest > PAGE_SIZE, i.e., if the yet
unhandled packets before the split packet, the split packet itself, and
any received packets after the split packet are together larger than one
page, then the memory after the first page would get overwritten.

To fix this, do not try to copy the data of all unhandled packets at
once, but copy the possibly needed data every time when handling
a packet.

This gets rid of most of the infamous crashes and data corruptions when
using firewire-net.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoTTY: ldisc, fix open flag handling
Jiri Slaby [Wed, 24 Nov 2010 23:27:54 +0000 (00:27 +0100)]
TTY: ldisc, fix open flag handling

commit 7f90cfc505d613f4faf096e0d84ffe99208057d9 upstream.

When a concrete ldisc open fails in tty_ldisc_open, we forget to clear
TTY_LDISC_OPEN. This causes a false warning on the next ldisc open:
WARNING: at drivers/char/tty_ldisc.c:445 tty_ldisc_open+0x26/0x38()
Hardware name: System Product Name
Modules linked in: ...
Pid: 5251, comm: a.out Tainted: G        W  2.6.32-5-686 #1
Call Trace:
 [<c1030321>] ? warn_slowpath_common+0x5e/0x8a
 [<c1030357>] ? warn_slowpath_null+0xa/0xc
 [<c119311c>] ? tty_ldisc_open+0x26/0x38
 [<c11936c5>] ? tty_set_ldisc+0x218/0x304
...

So clear the bit when failing...

Introduced in c65c9bc3efa (tty: rewrite the ldisc locking) back in
2.6.31-rc1.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@linux.intel.com>
Reported-by: Sergey Lapin <slapin@ossfans.org>
Tested-by: Sergey Lapin <slapin@ossfans.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agotty_ldisc: Fix BUG() on hangup
Philippe Rétornaz [Wed, 27 Oct 2010 15:13:21 +0000 (17:13 +0200)]
tty_ldisc: Fix BUG() on hangup

commit 1c95ba1e1de7edffc0c4e275e147f1a9eb1f81ae upstream.

A kernel BUG when bluetooth rfcomm connection drop while the associated
serial port is open is sometime triggered.

It seems that the line discipline can disappear between the
tty_ldisc_put and tty_ldisc_get. This patch fall back to the N_TTY line
discipline if the previous discipline is not available anymore.

Signed-off-by: Philippe Retornaz <philippe.retornaz@epfl.ch>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoTTY: restore tty_ldisc_wait_idle
Jiri Slaby [Sun, 31 Oct 2010 22:17:51 +0000 (23:17 +0100)]
TTY: restore tty_ldisc_wait_idle

commit 100eeae2c5ce23b4db93ff320ee330ef1d740151 upstream.

It was removed in 65b770468e98 (tty-ldisc: turn ldisc user count into
a proper refcount), but we need to wait for last user to quit the
ldisc before we close it in tty_set_ldisc.

Otherwise weird things start to happen. There might be processes
waiting in tty_read->n_tty_read on tty->read_wait for input to appear
and at that moment, a change of ldisc is fatal. n_tty_close is called,
it frees read_buf and the waiting process is still in the middle of
reading and goes nuts after it is woken.

Previously we prevented close to happen when others are in ldisc ops
by tty_ldisc_wait_idle in tty_set_ldisc. But the commit above removed
that. So revoke the change and test whether there is 1 user (=we), and
allow the close then.

We can do that without ldisc/tty locks, because nobody else can open
the device due to TTY_LDISC_CHANGING bit set, so we in fact wait for
everybody to leave.

I don't understand why tty_ldisc_lock would be needed either when the
counter is an atomic variable, so this is a lockless
tty_ldisc_wait_idle.

On the other hand, if we fail to wait (timeout or signal), we have to
reenable the halted ldiscs, so we take ldisc lock and reuse the setup
path at the end of tty_set_ldisc.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Sebastian Andrzej Siewior <bigeasy@breakpoint.cc>
LKML-Reference: <20101031104136.GA511@Chamillionaire.breakpoint.cc>
LKML-Reference: <1287669539-22644-1-git-send-email-jslaby@suse.cz>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agotty: prevent DOS in the flush_to_ldisc
Jiri Olsa [Mon, 8 Nov 2010 18:01:47 +0000 (19:01 +0100)]
tty: prevent DOS in the flush_to_ldisc

commit e045fec48970df84647a47930fcf7a22ff7229c0 upstream.

There's a small window inside the flush_to_ldisc function,
where the tty is unlocked and calling ldisc's receive_buf
function. If in this window new buffer is added to the tty,
the processing might never leave the flush_to_ldisc function.

This scenario will hog the cpu, causing other tty processing
starving, and making it impossible to interface the computer
via tty.

I was able to exploit this via pty interface by sending only
control characters to the master input, causing the flush_to_ldisc
to be scheduled, but never actually generate any output.

To reproduce, please run multiple instances of following code.

- SNIP
#define _XOPEN_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
        int i, slave, master = getpt();
        char buf[8192];

        sprintf(buf, "%s", ptsname(master));
        grantpt(master);
        unlockpt(master);

        slave = open(buf, O_RDWR);
        if (slave < 0) {
                perror("open slave failed");
                return 1;
        }

        for(i = 0; i < sizeof(buf); i++)
                buf[i] = rand() % 32;

        while(1) {
                write(master, buf, sizeof(buf));
        }

        return 0;
}
- SNIP

The attached patch (based on -next tree) fixes this by checking on the
tty buffer tail. Once it's reached, the current work is rescheduled
and another could run.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomd: fix return value of rdev_size_change()
Justin Maggard [Wed, 24 Nov 2010 05:36:17 +0000 (16:36 +1100)]
md: fix return value of rdev_size_change()

commit c26a44ed1e552aaa1d4ceb71842002d235fe98d7 upstream.

When trying to grow an array by enlarging component devices,
rdev_size_store() expects the return value of rdev_size_change() to be
in sectors, but the actual value is returned in KBs.

This functionality was broken by commit
     dd8ac336c13fd8afdb082ebacb1cddd5cf727889
so this patch is suitable for any kernel since 2.6.30.

Signed-off-by: Justin Maggard <jmaggard10@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomd/raid1: really fix recovery looping when single good device fails.
NeilBrown [Wed, 24 Nov 2010 05:39:46 +0000 (16:39 +1100)]
md/raid1: really fix recovery looping when single good device fails.

commit 8f9e0ee38f75d4740daa9e42c8af628d33d19a02 upstream.

Commit 4044ba58dd15cb01797c4fd034f39ef4a75f7cc3 supposedly fixed a
problem where if a raid1 with just one good device gets a read-error
during recovery, the recovery would abort and immediately restart in
an infinite loop.

However it depended on raid1_remove_disk removing the spare device
from the array.  But that does not happen in this case.  So add a test
so that in the 'recovery_disabled' case, the device will be removed.

This suitable for any kernel since 2.6.29 which is when
recovery_disabled was introduced.

Reported-by: Sebastian Färber <faerber@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoeCryptfs: Clear LOOKUP_OPEN flag when creating lower file
Tyler Hicks [Thu, 23 Sep 2010 07:35:04 +0000 (02:35 -0500)]
eCryptfs: Clear LOOKUP_OPEN flag when creating lower file

commit 2e21b3f124eceb6ab5a07c8a061adce14ac94e14 upstream.

eCryptfs was passing the LOOKUP_OPEN flag through to the lower file
system, even though ecryptfs_create() doesn't support the flag. A valid
filp for the lower filesystem could be returned in the nameidata if the
lower file system's create() function supported LOOKUP_OPEN, possibly
resulting in unencrypted writes to the lower file.

However, this is only a potential problem in filesystems (FUSE, NFS,
CIFS, CEPH, 9p) that eCryptfs isn't known to support today.

https://bugs.launchpad.net/ecryptfs/+bug/641703

Reported-by: Kevin Buhr
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoviafb: use proper register for colour when doing fill ops
Florian Tobias Schandinat [Wed, 22 Sep 2010 02:33:52 +0000 (02:33 +0000)]
viafb: use proper register for colour when doing fill ops

commit efd4f6398dc92b5bf392670df862f42a19f34cf2 upstream.

The colour was written to a wrong register for fillrect operations.
This sometimes caused empty console space (for example after 'clear')
to have a different colour than desired. Fix this by writing to the
correct register.
Many thanks to Daniel Drake and Jon Nettleton for pointing out this
issue and pointing me in the right direction for the fix.

Fixes http://dev.laptop.org/ticket/9323

Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Joseph Chan <JosephChan@via.com.tw>
Cc: Daniel Drake <dsd@laptop.org>
Cc: Jon Nettleton <jon.nettleton@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agodrivers/char/vt_ioctl.c: fix VT_OPENQRY error value
Graham Gower [Wed, 27 Oct 2010 22:33:00 +0000 (15:33 -0700)]
drivers/char/vt_ioctl.c: fix VT_OPENQRY error value

commit 1e0ad2881d50becaeea70ec696a80afeadf944d2 upstream.

When all VT's are in use, VT_OPENQRY casts -1 to unsigned char before
returning it to userspace as an int.  VT255 is not the next available
console.

Signed-off-by: Graham Gower <graham.gower@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agonet: NETIF_F_HW_CSUM does not imply FCoE CRC offload
Ben Hutchings [Fri, 22 Oct 2010 04:38:26 +0000 (04:38 +0000)]
net: NETIF_F_HW_CSUM does not imply FCoE CRC offload

commit 66c68bcc489fadd4f5e8839e966e3a366e50d1d5 upstream.

NETIF_F_HW_CSUM indicates the ability to update an TCP/IP-style 16-bit
checksum with the checksum of an arbitrary part of the packet data,
whereas the FCoE CRC is something entirely different.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosys_semctl: fix kernel stack leakage
Dan Rosenberg [Thu, 30 Sep 2010 22:15:31 +0000 (15:15 -0700)]
sys_semctl: fix kernel stack leakage

commit 982f7c2b2e6a28f8f266e075d92e19c0dd4c6e56 upstream.

The semctl syscall has several code paths that lead to the leakage of
uninitialized kernel stack memory (namely the IPC_INFO, SEM_INFO,
IPC_STAT, and SEM_STAT commands) during the use of the older, obsolete
version of the semid_ds struct.

The copy_semid_to_user() function declares a semid_ds struct on the stack
and copies it back to the user without initializing or zeroing the
"sem_base", "sem_pending", "sem_pending_last", and "undo" pointers,
allowing the leakage of 16 bytes of kernel stack memory.

The code is still reachable on 32-bit systems - when calling semctl()
newer glibc's automatically OR the IPC command with the IPC_64 flag, but
invoking the syscall directly allows users to use the older versions of
the struct.

Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoipc: shm: fix information leak to userland
Vasiliy Kulikov [Sat, 30 Oct 2010 14:22:49 +0000 (18:22 +0400)]
ipc: shm: fix information leak to userland

commit 3af54c9bd9e6f14f896aac1bb0e8405ae0bc7a44 upstream.

The shmid_ds structure is copied to userland with shm_unused{,2,3}
fields unitialized.  It leads to leaking of contents of kernel stack
memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoipc: initialize structure memory to zero for compat functions
Dan Rosenberg [Wed, 27 Oct 2010 22:34:17 +0000 (15:34 -0700)]
ipc: initialize structure memory to zero for compat functions

commit 03145beb455cf5c20a761e8451e30b8a74ba58d9 upstream.

This takes care of leaking uninitialized kernel stack memory to
userspace from non-zeroed fields in structs in compat ipc functions.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoxen: don't bother to stop other cpus on shutdown/reboot
Jeremy Fitzhardinge [Mon, 29 Nov 2010 22:16:53 +0000 (14:16 -0800)]
xen: don't bother to stop other cpus on shutdown/reboot

commit 31e323cca9d5c8afd372976c35a5d46192f540d1 upstream.

Xen will shoot all the VCPUs when we do a shutdown hypercall, so there's
no need to do it manually.

In any case it will fail because all the IPI irqs have been pulled
down by this point, so the cross-CPU calls will simply hang forever.

Until change 76fac077db6b34e2c6383a7b4f3f4f7b7d06d8ce the function calls
were not synchronously waited for, so this wasn't apparent.  However after
that change the calls became synchronous leading to a hang on shutdown
on multi-VCPU guests.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Alok Kataria <akataria@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoxen: ensure that all event channels start off bound to VCPU 0
Ian Campbell [Fri, 8 Oct 2010 15:59:12 +0000 (16:59 +0100)]
xen: ensure that all event channels start off bound to VCPU 0

commit b0097adeec27e30223c989561ab0f7aa60d1fe93 upstream.

All event channels startbound to VCPU 0 so ensure that cpu_evtchn_mask
is initialised to reflect this. Otherwise there is a race after registering an
event channel but before the affinity is explicitly set where the event channel
can be delivered. If this happens then the event channel remains pending in the
L1 (evtchn_pending) array but is cleared in L2 (evtchn_pending_sel), this means
the event channel cannot be reraised until another event channel happens to
trigger the same L2 entry on that VCPU.

sizeof(cpu_evtchn_mask(0))==sizeof(unsigned long*) which is not correct, and
causes only the first 32 or 64 event channels (depending on architecture) to be
initially bound to VCPU0. Use sizeof(struct cpu_evtchn_s) instead.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosgi-xpc: XPC fails to discover partitions with all nasids above 128
Robin@sgi.com [Wed, 24 Nov 2010 20:56:59 +0000 (12:56 -0800)]
sgi-xpc: XPC fails to discover partitions with all nasids above 128

commit c22c7aeff69796f46ae0fcec141538e28f50b24e upstream.

UV hardware defines 256 memory protection regions versus the baseline 64
with increasing size for the SN2 ia64.  This was overlooked when XPC was
modified to accomodate both UV and SN2.

Without this patch, a user could reconfigure their existing system and
suddenly disable cross-partition communications with no indication of what
has gone wrong.  It also prevents larger configurations from using
cross-partition communication.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agossb: b43-pci-bridge: Add new vendor for BCM4318
Daniel Klaffenbach [Sat, 20 Nov 2010 03:25:21 +0000 (21:25 -0600)]
ssb: b43-pci-bridge: Add new vendor for BCM4318

commit 1d8638d4038eb8709edc80e37a0bbb77253d86e9 upstream.

Add new vendor for Broadcom 4318.

Signed-off-by: Daniel Klaffenbach <danielklaffenbach@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomm: fix is_mem_section_removable() page_order BUG_ON check
KAMEZAWA Hiroyuki [Tue, 26 Oct 2010 21:22:08 +0000 (14:22 -0700)]
mm: fix is_mem_section_removable() page_order BUG_ON check

commit 572438f9b52236bd8938b1647cc15e027d27ef55 upstream.

page_order() is called by memory hotplug's user interface to check the
section is removable or not.  (is_mem_section_removable())

It calls page_order() withoug holding zone->lock.
So, even if the caller does

if (PageBuddy(page))
ret = page_order(page) ...
The caller may hit BUG_ON().

For fixing this, there are 2 choices.
  1. add zone->lock.
  2. remove BUG_ON().

is_mem_section_removable() is used for some "advice" and doesn't need to
be 100% accurate.  This is_removable() can be called via user program..
We don't want to take this important lock for long by user's request.  So,
this patch removes BUG_ON().

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomm: fix return value of scan_lru_pages in memory unplug
KAMEZAWA Hiroyuki [Tue, 26 Oct 2010 21:21:10 +0000 (14:21 -0700)]
mm: fix return value of scan_lru_pages in memory unplug

commit f8f72ad5396987e05a42cf7eff826fb2a15ff148 upstream.

scan_lru_pages returns pfn. So, it's type should be "unsigned long"
not "int".

Note: I guess this has been work until now because memory hotplug tester's
      machine has not very big memory....
      physical address < 32bit << PAGE_SHIFT.

Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agohwmon: (lm85) Fix ADT7468 frequency table
Jean Delvare [Thu, 28 Oct 2010 18:31:50 +0000 (20:31 +0200)]
hwmon: (lm85) Fix ADT7468 frequency table

commit fa7a5797e57d2ed71f9a6fb44f0ae42c2d7b74b7 upstream.

The ADT7468 uses the same frequency table as the ADT7463.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agonuma: fix slab_node(MPOL_BIND)
Eric Dumazet [Wed, 27 Oct 2010 17:33:43 +0000 (19:33 +0200)]
numa: fix slab_node(MPOL_BIND)

commit 800416f799e0723635ac2d720ad4449917a1481c upstream.

When a node contains only HighMem memory, slab_node(MPOL_BIND)
dereferences a NULL pointer.

[ This code seems to go back all the way to commit 19770b32609b: "mm:
  filter based on a nodemask as well as a gfp_mask".  Which was back in
  April 2008, and it got merged into 2.6.26.  - Linus ]

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoum: fix global timer issue when using CONFIG_NO_HZ
Richard Weinberger [Tue, 26 Oct 2010 21:21:13 +0000 (14:21 -0700)]
um: fix global timer issue when using CONFIG_NO_HZ

commit 482db6df1746c4fa7d64a2441d4cb2610249c679 upstream.

This fixes a issue which was introduced by fe2cc53e ("uml: track and make
up lost ticks").

timeval_to_ns() returns long long and not int.  Due to that UML's timer
did not work properlt and caused timer freezes.

Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoum: remove PAGE_SIZE alignment in linker script causing kernel segfault.
Richard Weinberger [Tue, 26 Oct 2010 21:21:16 +0000 (14:21 -0700)]
um: remove PAGE_SIZE alignment in linker script causing kernel segfault.

commit 6915e04f8847bea16d0890f559694ad8eedd026c upstream.

The linker script cleanup that I did in commit 5d150a97f93 ("um: Clean up
linker script using standard macros.") (2.6.32) accidentally introduced an
ALIGN(PAGE_SIZE) when converting to use INIT_TEXT_SECTION; Richard
Weinberger reported that this causes the kernel to segfault with
CONFIG_STATIC_LINK=y.

I'm not certain why this extra alignment is a problem, but it seems likely
it is because previously

__init_begin = _stext = _text = _sinittext

and with the extra ALIGN(PAGE_SIZE), _sinittext becomes different from the
rest.  So there is likely a bug here where something is assuming that
_sinittext is the same as one of those other symbols.  But reverting the
accidental change fixes the regression, so it seems worth committing that
now.

Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Reported-by: Richard Weinberger <richard@nod.at>
Cc: Jeff Dike <jdike@addtoit.com>
Tested by: Antoine Martin <antoine@nagafix.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agopercpu: fix list_head init bug in __percpu_counter_init()
Masanori ITOH [Tue, 26 Oct 2010 21:21:20 +0000 (14:21 -0700)]
percpu: fix list_head init bug in __percpu_counter_init()

commit 8474b591faf3bb0a1e08a60d21d6baac498f15e4 upstream.

WARNING: at lib/list_debug.c:26 __list_add+0x3f/0x81()
Hardware name: Express5800/B120a [N8400-085]
list_add corruption. next->prev should be prev (ffffffff81a7ea00), but was dead000000200200. (next=ffff88080b872d58).
Modules linked in: aoe ipt_MASQUERADE iptable_nat nf_nat autofs4 sunrpc bridge 8021q garp stp llc ipv6 cpufreq_ondemand acpi_cpufreq freq_table dm_round_robin dm_multipath kvm_intel kvm uinput lpfc scsi_transport_fc igb ioatdma scsi_tgt i2c_i801 i2c_core dca iTCO_wdt iTCO_vendor_support pcspkr shpchp megaraid_sas [last unloaded: aoe]
Pid: 54, comm: events/3 Tainted: G        W  2.6.34-vanilla1 #1
Call Trace:
[<ffffffff8104bd77>] warn_slowpath_common+0x7c/0x94
[<ffffffff8104bde6>] warn_slowpath_fmt+0x41/0x43
[<ffffffff8120fd2e>] __list_add+0x3f/0x81
[<ffffffff81212a12>] __percpu_counter_init+0x59/0x6b
[<ffffffff810d8499>] bdi_init+0x118/0x17e
[<ffffffff811f2c50>] blk_alloc_queue_node+0x79/0x143
[<ffffffff811f2d2b>] blk_alloc_queue+0x11/0x13
[<ffffffffa02a931d>] aoeblk_gdalloc+0x8e/0x1c9 [aoe]
[<ffffffffa02aa655>] aoecmd_sleepwork+0x25/0xa8 [aoe]
[<ffffffff8106186c>] worker_thread+0x1a9/0x237
[<ffffffffa02aa630>] ? aoecmd_sleepwork+0x0/0xa8 [aoe]
[<ffffffff81065827>] ? autoremove_wake_function+0x0/0x39
[<ffffffff810616c3>] ? worker_thread+0x0/0x237
[<ffffffff810653ad>] kthread+0x7f/0x87
[<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
[<ffffffff8106532e>] ? kthread+0x0/0x87
[<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10

It's because there is no initialization code for a list_head contained in
the struct backing_dev_info under CONFIG_HOTPLUG_CPU, and the bug comes up
when block device drivers calling blk_alloc_queue() are used.  In case of
me, I got them by using aoe.

Signed-off-by: Masanori Itoh <itoumsn@nttdata.co.jp>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>