Johannes Berg [Thu, 11 Nov 2010 22:05:21 +0000 (14:05 -0800)]
led-class: always implement blinking
Currently, blinking LEDs can be awkward because it is not guaranteed that
all LEDs implement blinking. The trigger that wants it to blink then
needs to implement its own timer solution.
Rather than require that, add led_blink_set() API that triggers can use.
This function will attempt to use hw blinking, but if that fails
implements a timer for it. To stop blinking again, brightness_set() also
needs to be wrapped into API that will stop the software blink.
As a result of this, the timer trigger becomes a very trivial one, and
hopefully we can finally see triggers using blinking as well because it's
always easy to use.
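
The hardware-first, software-fallback idea described above can be sketched in plain C. This is only an illustration under stated assumptions: struct led_sketch, the hw_blink hook and both helper functions are hypothetical stand-ins, not the kernel's led-class structures, and a flag takes the place of the real timer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an LED class device. */
struct led_sketch {
	/* Optional hardware blink hook: returns 0 on success. May be NULL. */
	int (*hw_blink)(struct led_sketch *led,
			unsigned long on_ms, unsigned long off_ms);
	int sw_blink_active;	/* nonzero while the software fallback runs */
	unsigned long on_ms, off_ms;
	int brightness;
};

/* Try hardware blinking first; fall back to a software blink otherwise. */
static void led_blink_set_sketch(struct led_sketch *led,
				 unsigned long on_ms, unsigned long off_ms)
{
	assert(led != NULL);
	if (led->hw_blink && led->hw_blink(led, on_ms, off_ms) == 0)
		return;			/* hardware blinks on its own */
	led->on_ms = on_ms;
	led->off_ms = off_ms;
	led->sw_blink_active = 1;	/* real code would arm a timer here */
}

/* Wrapped brightness setter: stops any software blink before applying. */
static void led_brightness_set_sketch(struct led_sketch *led, int value)
{
	assert(led != NULL);
	led->sw_blink_active = 0;	/* real code would cancel the timer */
	led->brightness = value;
}

/* Two toy hardware hooks for trying the sketch out. */
static int hw_blink_ok(struct led_sketch *led,
		       unsigned long on_ms, unsigned long off_ms)
{
	(void)led; (void)on_ms; (void)off_ms;
	return 0;
}

static int hw_blink_unsupported(struct led_sketch *led,
				unsigned long on_ms, unsigned long off_ms)
{
	(void)led; (void)on_ms; (void)off_ms;
	return -1;
}
```

A trigger written against such an interface never needs its own timer: it asks for blinking and later sets a brightness, and the wrapper decides which mechanism does the work.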
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Richard Purdie <rpurdie@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Jones [Thu, 11 Nov 2010 22:05:20 +0000 (14:05 -0800)]
hugetlbfs: lessen the impact of a deprecation warning
WARN_ONCE is a bit strong for a deprecation warning, given that it spews a
huge backtrace.
Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Thu, 11 Nov 2010 22:05:19 +0000 (14:05 -0800)]
radix-tree: fix RCU bug
Salman Qazi describes the following radix-tree bug:
In the following case, we can get a deadlock:
0. The radix tree contains two items, one has the index 0.
1. The reader (in this case find_get_pages) takes the rcu_read_lock.
2. The reader acquires slot(s) for item(s) including the index 0 item.
3. The non-zero index item is deleted, and as a consequence the other item is
moved to the root of the tree. The place where it used to be is queued for
deletion after the readers finish.
3b. The zero item is deleted, removing it from the direct slot, it remains in
the rcu-delayed indirect node.
4. The reader looks at the index 0 slot, and finds that the page has 0 ref
count
5. The reader looks at it again, hoping that the item will either be freed or
the ref count will increase. This never happens, as the slot it is looking
at will never be updated. Also, this slot can never be reclaimed because
the reader is holding rcu_read_lock and is in an infinite loop.
The fix is to re-use the same "indirect" pointer case that requires a slot
lookup retry into a general "retry the lookup" bit.
Signed-off-by: Nick Piggin <npiggin@kernel.dk> Reported-by: Salman Qazi <sqazi@google.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Rosenberg [Thu, 11 Nov 2010 22:05:18 +0000 (14:05 -0800)]
Restrict unprivileged access to kernel syslog
The kernel syslog contains debugging information that is often useful
during exploitation of other vulnerabilities, such as kernel heap
addresses. Rather than futilely attempt to sanitize hundreds (or
thousands) of printk statements and simultaneously cripple useful
debugging functionality, it is far simpler to create an option that
prevents unprivileged users from reading the syslog.
This patch, loosely based on grsecurity's GRKERNSEC_DMESG, creates the
dmesg_restrict sysctl. When set to "0", the default, no restrictions are
enforced. When set to "1", only users with CAP_SYS_ADMIN can read the
kernel syslog via dmesg(8) or other mechanisms.
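
The access rule described above fits in a few lines. The sketch below is a userspace illustration, not the patch itself: dmesg_restrict_sketch and syslog_read_check() are hypothetical names, the boolean argument stands in for a CAP_SYS_ADMIN check, and returning -EPERM on denial is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the sysctl value: 0 (the default) or 1. */
static int dmesg_restrict_sketch;

/*
 * Returns 0 if the caller may read the kernel syslog, -EPERM otherwise.
 * has_cap_sys_admin stands in for a CAP_SYS_ADMIN capability check.
 */
static int syslog_read_check(bool has_cap_sys_admin)
{
	assert(dmesg_restrict_sketch == 0 || dmesg_restrict_sketch == 1);
	if (dmesg_restrict_sketch && !has_cap_sys_admin)
		return -EPERM;
	return 0;
}
```

With the default value of 0 the check always passes, so existing setups keep their behaviour unless an administrator opts in.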
[akpm@linux-foundation.org: explain the config option in kernel.txt] Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Eugene Teo <eugeneteo@kernel.org> Acked-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Thu, 11 Nov 2010 22:05:18 +0000 (14:05 -0800)]
oom: document obsolete oom_adj tunable
/proc/pid/oom_adj was deprecated in August 2010 with the introduction of
the new oom killer heuristic.
This patch copies the Documentation/feature-removal-schedule.txt entry for
this tunable to the Documentation/ABI/obsolete directory so nobody misses
it.
Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shaohua Li [Thu, 11 Nov 2010 22:05:17 +0000 (14:05 -0800)]
vmscan: avoid setting zone congested if no page dirty
nr_dirty and nr_congested are increased only when the page is dirty. So
if all pages are clean, both of them will be zero. In this case, we should
not mark the zone congested.
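
A minimal sketch of the corrected condition, assuming the previous test simply compared the two counters for equality (that detail is an assumption, as is the helper name should_mark_congested()):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Both counters only grow for dirty pages, so nr_congested can never
 * exceed nr_dirty. With no dirty pages at all, "0 == 0" must not count
 * as "every dirty page was congested".
 */
static bool should_mark_congested(unsigned long nr_dirty,
				  unsigned long nr_congested)
{
	assert(nr_congested <= nr_dirty);
	return nr_dirty != 0 && nr_dirty == nr_congested;
}
```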
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Reviewed-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ken Chen [Thu, 11 Nov 2010 22:05:16 +0000 (14:05 -0800)]
latencytop: fix per task accumulator
Per task latencytop accumulator prematurely terminates due to erroneous
placement of latency_record_count. It should be incremented whenever a
new record is allocated instead of being incremented on every latencytop event.
Also fix search iterator to only search known record events instead of
blindly searching all pre-allocated space.
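
Both points can be illustrated with a small userspace accumulator. The types and names below (struct record_sketch, struct task_acc_sketch, account_event(), MAX_RECORDS_SKETCH) are hypothetical and much simpler than the latencytop structures; the sketch only shows the counting rule.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RECORDS_SKETCH 4

struct record_sketch {
	int key;		/* identifies the kind of latency event */
	unsigned long hits;	/* how often it was seen */
};

struct task_acc_sketch {
	struct record_sketch rec[MAX_RECORDS_SKETCH];
	int count;		/* number of records actually in use */
};

/* Returns the index of the record used, or -1 if the table is full. */
static int account_event(struct task_acc_sketch *t, int key)
{
	assert(t != NULL);

	/* Search only the records known to be in use. */
	for (int i = 0; i < t->count; i++) {
		if (t->rec[i].key == key) {
			t->rec[i].hits++;
			return i;
		}
	}

	if (t->count >= MAX_RECORDS_SKETCH)
		return -1;

	/* The count grows only here, when a new record is allocated. */
	int slot = t->count++;
	t->rec[slot].key = key;
	t->rec[slot].hits = 1;
	return slot;
}
```

If the count were bumped on every event instead, a handful of repeated events would exhaust the table while most slots were still unused.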
Signed-off-by: Ken Chen <kenchen@google.com> Reviewed-by: Arjan van de Ven <arjan@infradead.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Hansen [Thu, 11 Nov 2010 22:05:15 +0000 (14:05 -0800)]
mm/vfs: revalidate page->mapping in do_generic_file_read()
70 hours into some stress tests of a 2.6.32-based enterprise kernel, we
ran into a NULL dereference in here:
int block_is_partially_uptodate(struct page *page, read_descriptor_t *desc,
unsigned long from)
{
----> struct inode *inode = page->mapping->host;
It looks like page->mapping was the culprit. (xmon trace is below).
After closer examination, I realized that do_generic_file_read() does a
find_get_page(), and eventually locks the page before calling
block_is_partially_uptodate(). However, it doesn't revalidate the
page->mapping after the page is locked. So, there's a small window
between the find_get_page() and ->is_partially_uptodate() where the page
could get truncated and page->mapping cleared.
We _have_ a reference, so it can't get reclaimed, but it certainly
can be truncated.
I think the correct thing is to check page->mapping after the
trylock_page(), and jump out if it got truncated. This patch has been
running in the test environment for a month or so now, and we have not
seen this bug pop up again.
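
The lock-then-revalidate pattern can be sketched outside the kernel as follows. The structures and lock_and_revalidate() are hypothetical stand-ins; a boolean flag replaces trylock_page(), and a NULL mapping models a truncated page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mapping_sketch {
	int host_id;
};

struct page_sketch {
	struct mapping_sketch *mapping;	/* NULL once the page is truncated */
	bool locked;
};

/*
 * Lock the page, then check the mapping again before using it.
 * Returns true with the page still locked if it is safe to dereference
 * page->mapping, false (page unlocked) if it was truncated meanwhile.
 */
static bool lock_and_revalidate(struct page_sketch *page)
{
	assert(page != NULL && !page->locked);
	page->locked = true;		/* stands in for trylock_page() */
	if (page->mapping == NULL) {
		page->locked = false;	/* unlock and jump out */
		return false;
	}
	return true;
}
```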
kernel/range.c: fix clean_sort_range() for the case of full array
clean_sort_range() should return a number of nonempty elements of range
array, but if the array is full clean_sort_range() returns 0.
The problem is that the number of nonempty elements is evaluated by
finding the first empty element of the array. If there is no such element
it returns an initial value of local variable nr_range that is zero.
The fix is trivial: it changes initial value of nr_range to size of the
array.
The bug can lead to loss of information regarding all ranges, since
typically returned value of clean_sort_range() is considered as an actual
number of ranges in the array after a series of add/subtract operations.
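
The effect of the initial value is easy to reproduce in isolation. In the sketch below, struct range_sketch and the helper names are hypothetical, and treating an entry with end == 0 as empty is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct range_sketch {
	unsigned long start, end;	/* end == 0 marks an empty entry here */
};

/* Index of the first empty entry, or 'initial' if there is none. */
static int count_until_empty(const struct range_sketch *r, int az, int initial)
{
	int nr_range = initial;

	assert(r != NULL && az >= 0);
	for (int i = 0; i < az; i++) {
		if (!r[i].end) {
			nr_range = i;
			break;
		}
	}
	return nr_range;
}

/* Old behaviour: a full array falls through with nr_range still 0. */
static int count_ranges_buggy(const struct range_sketch *r, int az)
{
	return count_until_empty(r, az, 0);
}

/* Fixed behaviour: a full array reports its size. */
static int count_ranges_fixed(const struct range_sketch *r, int az)
{
	return count_until_empty(r, az, az);
}
```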
Found by Analytical Verification project of Linux Verification Center
(linuxtesting.org), thanks to Alexander Kolosov.
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Yinghai Lu <yinghai@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Thu, 11 Nov 2010 22:05:13 +0000 (14:05 -0800)]
drivers/misc/bh1770glc.c: error handling in bh1770_power_state_store()
There was a signedness bug so "ret" was never less than zero and that
breaks the error handling. Also in the original code it would overwrite
ret and the result is still negative but it's a bogus number instead of the
correct error code.
Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: Samu Onkalo <samu.p.onkalo@nokia.com> Cc: Jonathan Cameron <jic23@cam.ac.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Thu, 11 Nov 2010 22:05:12 +0000 (14:05 -0800)]
memcg: null dereference on allocation failure
The original code had a null dereference if alloc_percpu() failed. This
was introduced in commit 711d3d2c9bc3 ("memcg: cpu hotplug aware percpu
count updates").
Catalin Marinas [Thu, 11 Nov 2010 22:05:10 +0000 (14:05 -0800)]
include/linux/highmem.h needs hardirq.h
Commit 3e4d3af501cc ("mm: stack based kmap_atomic()") introduced the
kmap_atomic_idx_push() function which warns on in_irq() with
CONFIG_DEBUG_HIGHMEM enabled. This patch includes linux/hardirq.h for
the in_irq definition.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Dumazet [Thu, 11 Nov 2010 22:05:08 +0000 (14:05 -0800)]
atomic: add atomic_inc_not_zero_hint()
Followup of the perf tools session at Netfilter Workshop 2010.
In the network stack we make heavy use of atomic_inc_not_zero() in
contexts where we know the probable value of the atomic before the
increment (2 for UDP sockets, for example).
Using a special version of atomic_inc_not_zero() that takes this hint can
help the processor use fewer bus transactions.
On x86 (MESI protocol), for example, this avoids entering the Shared state,
because "lock cmpxchg" issues an RFO (Read For Ownership).
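
The idea can be sketched in userspace with C11 atomics. This is not the kernel implementation: inc_not_zero_hint() is a hypothetical name, and falling back to a plain load when the hint is 0 is a choice made for the sketch. The point is that the first compare-and-swap is attempted with the hinted value, without reading the variable first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Increment *v unless it is zero. 'hint' is the value the caller expects
 * to find. Returns 1 if the increment happened, 0 if *v was zero.
 */
static int inc_not_zero_hint(atomic_int *v, int hint)
{
	int expected = hint;

	assert(v != NULL);

	/* A hint of 0 is useless as a starting guess, so read the value. */
	if (hint == 0)
		expected = atomic_load(v);

	while (expected != 0) {
		/* On failure, 'expected' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &expected, expected + 1))
			return 1;
	}
	return 0;
}
```

A wrong hint costs one failed compare-and-swap, after which the loop continues with the value actually observed, so correctness does not depend on the hint.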
akpm: Adds a new include/linux/atomic.h. This means that new code should
henceforth include linux/atomic.h and not asm/atomic.h. The presence of
include/linux/atomic.h will in fact cause checkpatch.pl to warn about use
of asm/atomic.h. The new include/linux/atomic.h becomes the place where
arch-neutral atomic_t code should be placed.
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Thu, 11 Nov 2010 22:05:07 +0000 (14:05 -0800)]
rapidio: use resource_size()
The size calculation is done incorrectly here because it should include
both the start and end (end - start + 1). It's easiest to just use
resource_size() which does the right thing.
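
As a quick illustration of the inclusive calculation, here is a userspace sketch; struct resource_sketch and resource_size_sketch() are stand-ins that mirror only the end - start + 1 arithmetic, not the kernel's struct resource.

```c
#include <assert.h>
#include <stddef.h>

struct resource_sketch {
	unsigned long start;	/* first address of the region */
	unsigned long end;	/* last address of the region, inclusive */
};

/* Size of a region whose end address is inclusive. */
static unsigned long resource_size_sketch(const struct resource_sketch *res)
{
	assert(res != NULL && res->end >= res->start);
	return res->end - res->start + 1;
}
```

A region spanning 0x1000 to 0x1fff is 0x1000 bytes long; computing end - start alone would report 0xfff and drop the last byte.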
I was worried there was something non-standard going on because the
printk() subtracts "end - 1", but the rest of the file uses the normal
resource size calculations. This function is only called from
fsl_rio_setup() in arch/powerpc/sysdev/fsl_rio.c and the calculation
there is also:
Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: Alexandre Bounine <alexandre.bounine@idt.com> Acked-by: Li Yang <leoli@freescale.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drivers/macintosh/adb-iop.c: flags should be unsigned long
Fix these warnings:
drivers/macintosh/adb-iop.c: In function `adb_iop_complete':
drivers/macintosh/adb-iop.c:85: warning: comparison of distinct pointer types lacks a cast
drivers/macintosh/adb-iop.c:92: warning: comparison of distinct pointer types lacks a cast
drivers/macintosh/adb-iop.c: In function `adb_iop_listen':
drivers/macintosh/adb-iop.c:111: warning: comparison of distinct pointer types lacks a cast
drivers/macintosh/adb-iop.c:151: warning: comparison of distinct pointer types lacks a cast
Both commits 0a3d763f1a68 ("ptrace: cleanup arch_ptrace() on um") and 9b05a69e0534 ("ptrace: change signature of arch_ptrace()") broke the um
build. This patch fixes the issues.
0a3d763f1a68 introduced the undeclared variable "datavp". The patch seems
completely untested. :-(
9b05a69e0534 changed arch_ptrace()'s signature but did not update
um/include/asm/ptrace-generic.h.
Signed-off-by: Richard Weinberger <richard@nod.at> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Jeff Dike <jdike@addtoit.com> Tested-by: Will Newton <will.newton@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix a memleak in cifs_setattr_nounix()
cifs: make cifs_ioctl handle NULL filp->private_data correctly
Pekka Enberg [Mon, 8 Nov 2010 19:29:07 +0000 (21:29 +0200)]
perf_events: Fix perf_counter_mmap() hook in mprotect()
As pointed out by Linus, commit dab5855 ("perf_counter: Add mmap event hooks to
mprotect()") is fundamentally wrong as mprotect_fixup() can free 'vma' due to
merging. Fix the problem by moving perf_event_mmap() hook to
mprotect_fixup().
Note: there's another successful return path from mprotect_fixup() if the old
flags are equal to the new flags. We don't, however, need to call
perf_event_mmap() there because 'perf' already knows the VMA is
executable.
Reported-by: Dave Jones <davej@redhat.com> Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The commit 1025774c that removed inode_setattr() seems to have introduced this
memleak by returning early without freeing 'full_path'.
Reported-by: Andrew Hendry <andrew.hendry@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>
Tetsuo Handa [Mon, 8 Nov 2010 02:20:49 +0000 (11:20 +0900)]
kernel: Constify temporary variable in roundup()
Fix build error with GCC 3.x caused by commit b28efd54
"kernel: roundup should only reference arguments once" by constifying
temporary variable used in that macro.
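
A sketch of such a macro, for illustration only: ROUNDUP_SKETCH and roundup_ul() are hypothetical names, the exact kernel definition may differ, and the statement-expression and __typeof__ syntax are GNU C extensions. The const temporary holds the second argument so it is evaluated exactly once.

```c
#include <assert.h>

/* Round x up to the next multiple of y, evaluating y only once. */
#define ROUNDUP_SKETCH(x, y) (				\
{							\
	const __typeof__(y) y_once = (y);		\
	(((x) + (y_once - 1)) / y_once) * y_once;	\
}							\
)

/* Small typed wrapper that also rejects a zero divisor. */
static unsigned long roundup_ul(unsigned long x, unsigned long y)
{
	assert(y != 0);
	return ROUNDUP_SKETCH(x, y);
}
```

Because the argument is copied into the temporary, an expression with side effects such as y++ is applied a single time.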
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
Meelis Roos [Mon, 8 Nov 2010 21:38:14 +0000 (13:38 -0800)]
sparc: fix openpromfs compile
Fix openpromfs compilation by adding a missing semicolon in
fs/openpromfs/inode.c openprom_mount().
Signed-off-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 8 Nov 2010 19:54:53 +0000 (11:54 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Add new ext4 inode tracepoints
ext4: Don't call sb_issue_discard() in ext4_free_blocks()
ext4: do not try to grab the s_umount semaphore in ext4_quota_off
ext4: fix potential race when freeing ext4_io_page structures
ext4: handle writeback of inodes which are being freed
ext4: initialize the percpu counters before replaying the journal
ext4: "ret" may be used uninitialized in ext4_lazyinit_thread()
ext4: fix lazyinit hang after removing request
Jeff Layton [Mon, 8 Nov 2010 12:28:32 +0000 (07:28 -0500)]
cifs: make cifs_ioctl handle NULL filp->private_data correctly
Commit 13cfb7334e made cifs_ioctl use the tlink attached to the
cifsFileInfo for a filp. This ignores the case of an open directory
however, which in CIFS can have a NULL private_data until a readdir
is done on it.
This patch re-adds the NULL pointer checks that were removed in commit 50ae28f01 and moves the setting of tcon and "caps" variables lower.
Long term, a better fix would be to establish a f_op->open routine for
directories that populates that field at open time, but that requires
some other changes to how readdir calls are handled.
Reported-by: Kjell Rune Skaaraas <kjella79@yahoo.no> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
TTY: move .gitignore from drivers/char/ to drivers/tty/vt/
TTY: create drivers/tty/vt and move the vt code there
TTY: create drivers/tty and move the tty core files there
Linus Torvalds [Mon, 8 Nov 2010 18:54:49 +0000 (10:54 -0800)]
Merge branch 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-next-2.6
* 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-next-2.6:
Staging: ath6kl: remove empty files that mess with 'distclean'
staging: ath6kl: Fixing the driver to use modified mmc_host structure
Staging: solo6x10: fix build problem
Linus Torvalds [Mon, 8 Nov 2010 18:53:21 +0000 (10:53 -0800)]
Merge branch 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: clkfwk: Fix up checkpatch warnings.
sh: make some needlessly global sh7724 clocks static
sh: add clk_round_parent() to optimize parent clock rate
sh: Simplify phys_addr_mask()/PTE_PHYS_MASK for 29/32-bit.
sh: nommu: Support building without an uncached mapping.
sh: nommu: use 32-bit phys mode.
sh: mach-se: Fix up SE7206 no ioport build.
sh: intc: Update for single IRQ reservation helper.
sh: clkfwk: Fix up rate rounding error handling.
sh: mach-se: Rip out superfluous 7751 PIO routines.
sh: mach-se: Rip out superfluous 770x PIO routines.
sh: mach-edosk7705: Kill off machtype, consolidate board def.
sh: mach-edosk7705: update for this century, kill off PIO trapping.
sh: mach-se: Rip out superfluous 7206 PIO routines.
sh: mach-systemh: Kill off dead board.
sh: mach-snapgear: Kill off machtype, consolidate board def.
sh: mach-snapgear: Rip out superfluous PIO routines.
sh: mach-microdev: SuperIO-relative ioport mapping.
Theodore Ts'o [Mon, 8 Nov 2010 18:49:33 +0000 (13:49 -0500)]
ext4: Don't call sb_issue_discard() in ext4_free_blocks()
Commit 5c521830cf (ext4: Support discard requests when running in
no-journal mode) attempts to add sb_issue_discard() for data blocks
(in data=writeback mode) and in no-journal mode. Unfortunately, this
no longer works, because in commit dd3932eddf (block: remove
BLKDEV_IFL_WAIT), sb_issue_discard() only presents a synchronous
interface, and there are times when we call ext4_free_blocks() when we
are holding a spinlock, or are otherwise in an atomic context.
For now, I've removed the call to sb_issue_discard() to prevent a
deadlock or (if spinlock debugging is enabled) failures like this:
Theodore Ts'o [Mon, 8 Nov 2010 18:43:33 +0000 (13:43 -0500)]
ext4: handle writeback of inodes which are being freed
The following BUG can occur when an inode which is getting freed
still has dirty pages outstanding, and it gets deleted (in this case
because it was the target of a rename). In ordered mode, we need to
make sure the data pages are written just in case we crash before the
rename (or unlink) is committed. If the inode is being freed then
when we try to igrab the inode, we end up tripping the BUG_ON at
fs/ext4/page-io.c:146.
To solve this problem, we need to keep track of the number of io
callbacks which are pending, and avoid destroying the inode until they
have all been completed. That way we don't have to bump the inode
count to keep the inode from being destroyed; an approach which
doesn't work because the count could have already been dropped down to
zero before the inode writeback has started (at which point we're not
allowed to bump the count back up to 1, since it's already started
getting freed).
Thanks to Dave Chinner for suggesting this approach, which is also
used by XFS.
sh: add clk_round_parent() to optimize parent clock rate
Sometimes it is possible and reasonable to adjust the parent clock rate to
improve precision of the child clock, e.g., if the child clock has no siblings.
clk_round_parent() is a new addition to the SH clock-framework API, that
implements such an optimization for child clocks with divisors, taking all
integer values in a range.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Vivek Goyal [Sat, 6 Nov 2010 12:16:05 +0000 (08:16 -0400)]
floppy: fix another use-after-free
While scanning the floppy code due to c093ee4f07f4 ("floppy: fix
use-after-free in module load failure path"), I found one more instance
of trying to access disk->queue pointer after doing put_disk() on
gendisk. For some reason, the floppy module still loads/unloads fine. The
object is probably still around with the right pointer values.
o There seems to be one more instance of trying to cleanup the request
queue after we have called put_disk() on associated gendisk.
o This fix is more out of code inspection. Even without this fix for
some reason I am able to load/unload floppy module without any
issues.
TTY: move .gitignore from drivers/char/ to drivers/tty/vt/
The autogenerated files (consolemap_deftbl.c and defkeymap.c) need to
be ignored by git, so move the .gitignore file that was doing it to the
proper location now that the files have moved as well.
Linus Torvalds [Sat, 6 Nov 2010 01:57:04 +0000 (18:57 -0700)]
ipw2x00: remove the right /proc/net entry
Commit 27ae60f8f7aa ("ipw2x00: replace "ieee80211" with "libipw" where
appropriate") changed DRV_NAME to be "libipw", but didn't properly fix
up the places where it was used to specify the name for the /proc/net/
directory.
For backwards compatibility reasons, that directory name remained
"ieee80211", but due to the DRV_NAME change, the error case printouts
and the cleanup functions now used "libipw" instead. Which made it all
fail badly.
For example, on module unload as reported by Randy:
WARNING: at fs/proc/generic.c:816 remove_proc_entry+0x156/0x35e()
name 'libipw'
because it's trying to unregister a /proc directory that obviously
doesn't even exist.
Clean it all up to use DRV_PROCNAME for the actual /proc directory name.
Reported-and-tested-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Pavel Roskin <proski@gnu.org> Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 6 Nov 2010 00:49:22 +0000 (17:49 -0700)]
Merge branch 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PPC: BookE: Load the lower half of MSR
KVM: PPC: BookE: fix sleep with interrupts disabled
KVM: PPC: e500: Call kvm_vcpu_uninit() before kvmppc_e500_tlb_uninit().
PPC: KVM: Book E doesn't have __end_interrupts.
KVM: x86: Issue smp_call_function_many with preemption disabled
KVM: x86: fix information leak to userland
KVM: PPC: fix information leak to userland
KVM: MMU: fix rmap_remove on non present sptes
KVM: Write protect memory after slot swap
Linus Torvalds [Sat, 6 Nov 2010 00:45:59 +0000 (17:45 -0700)]
floppy: fix use-after-free in module load failure path
Commit 488211844e0c ("floppy: switch to one queue per drive instead of
sharing a queue") introduced a use-after-free. We do "put_disk()" on
the disk device _before_ we then clean up the queue associated with that
disk.
Move the put_disk() down to avoid dereferencing a free'd data structure.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (41 commits)
inet_diag: Make sure we actually run the same bytecode we audited.
netlink: Make nlmsg_find_attr take a const nlmsghdr*.
fib: fib_result_assign() should not change fib refcounts
netfilter: ip6_tables: fix information leak to userspace
cls_cgroup: Fix crash on module unload
memory corruption in X.25 facilities parsing
net dst: fix percpu_counter list corruption and poison overwritten
rds: Remove kfreed tcp conn from list
rds: Lost locking in loop connection freeing
de2104x: fix panic on load
atl1 : fix panic on load
netxen: remove unused firmware exports
caif: Remove noisy printout when disconnecting caif socket
caif: SPI-driver bugfix - incorrect padding.
caif: Bugfix for socket priority, bindtodev and dbg channel.
smsc911x: Set Ethernet EEPROM size to supported device's size
ipv4: netfilter: ip_tables: fix information leak to userland
ipv4: netfilter: arp_tables: fix information leak to userland
cxgb4vf: remove call to stop TX queues at load time.
cxgb4: remove call to stop TX queues at load time.
...
Linus Torvalds [Fri, 5 Nov 2010 21:17:22 +0000 (14:17 -0700)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: ohci: fix race when reading count in AR descriptor
firewire: ohci: avoid reallocation of AR buffers
firewire: ohci: fix race in AR split packet handling
firewire: ohci: fix buffer overflow in AR split packet handling
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: make cifs_set_oplock_level() take a cifsInodeInfo pointer
cifs: dereferencing first then checking
cifs: trivial comment fix: tlink_tree is now a rbtree
[CIFS] Cleanup unused variable build warning
cifs: convert tlink_tree to a rbtree
cifs: store pointer to master tlink in superblock (try #2)
cifs: trivial doc fix: note setlease implemented
CIFS: Add cifs_set_oplock_level
FS: cifs, remove unneeded NULL tests
Oleg Nesterov [Fri, 5 Nov 2010 15:53:42 +0000 (16:53 +0100)]
posix-cpu-timers: workaround to suppress the problems with mt exec
posix-cpu-timers.c correctly assumes that the dying process does
posix_cpu_timers_exit_group() and removes all !CPUCLOCK_PERTHREAD
timers from signal->cpu_timers list.
But, it also assumes that timer->it.cpu.task is always the group
leader, and thus the dead ->task means the dead thread group.
This is obviously not true after de_thread() changes the leader.
After that almost every posix_cpu_timer_ method has problems.
It is not simple to fix this bug correctly. First of all, I think
that timer->it.cpu should use struct pid instead of task_struct.
Also, the locking should be reworked completely. In particular,
tasklist_lock should not be used at all. This all needs a lot of
nontrivial and hard-to-test changes.
Change __exit_signal() to do posix_cpu_timers_exit_group() when
the old leader dies during exec. This is not the fix, just the
temporary hack to hide the problem for 2.6.37 and stable. IOW,
this is obviously wrong but this is what we currently have anyway:
cpu timers do not work after mt exec.
In theory this change adds another race. The exiting leader can
detach the timers which were attached to the new leader. However,
the window between de_thread() and release_task() is small, we
can pretend that sys_timer_create() was called before de_thread().
Pavel Shilovsky [Wed, 3 Nov 2010 07:58:57 +0000 (10:58 +0300)]
cifs: make cifs_set_oplock_level() take a cifsInodeInfo pointer
All the callers already have a pointer to struct cifsInodeInfo. Use it.
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
Jean Delvare [Fri, 5 Nov 2010 14:59:29 +0000 (10:59 -0400)]
hwmon: (ltc4261) Fix error message format
adapter->id is deprecated and not set by any adapter driver, so this
was certainly not what the author wanted to use. adapter->nr maybe,
but as dev_err() already includes this value, as well as the client's
address, there's no point repeating them. Better print a simple error
message in plain English words.
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/tile: mark "hardwall" device as non-seekable
asm-generic/stat.h: support 64-bit file time_t for stat()
arch/tile: don't allow user code to set the PL via ptrace or signal return
arch/tile: correct double syscall restart for nested signals
arch/tile: avoid __must_check warning on one strict_strtol check
arch/tile: bomb raw_local_irq_ to arch_local_irq_
arch/tile: complete migration to new kmap_atomic scheme
Scott Wood [Thu, 30 Sep 2010 19:28:50 +0000 (14:28 -0500)]
KVM: PPC: BookE: fix sleep with interrupts disabled
It is not legal to call mutex_lock() with interrupts disabled.
This will assert with debug checks enabled.
If there's a real need to disable interrupts here, it could be done
after the mutex is acquired -- but I don't see why it's needed at all.
Signed-off-by: Scott Wood <scottwood@freescale.com> Reviewed-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
Vasiliy Kulikov [Sat, 30 Oct 2010 18:54:47 +0000 (22:54 +0400)]
KVM: x86: fix information leak to userland
Structures kvm_vcpu_events, kvm_debugregs, kvm_pit_state2 and
kvm_clock_data are copied to userland with some padding and reserved
fields uninitialized. It leads to leaking of contents of kernel stack
memory. We have to initialize them to zero.
In patch v1 Jan Kiszka suggested to fill reserved fields with zeros
instead of memset'ting the whole struct. It makes sense as these
fields are explicitly marked as padding. No more fields need zeroing.
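
The approach of zeroing only the padding and reserved fields can be sketched as follows. struct events_sketch and fill_events() are hypothetical; the layout is chosen so that the structure has no implicit padding, which is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reply structure with explicit padding and reserved space. */
struct events_sketch {
	unsigned char pending;
	unsigned char pad[3];		/* explicit padding */
	unsigned int reserved[2];	/* reserved for future use */
};

/*
 * Fill the reply that would be copied to userspace. Every byte that does
 * not carry data is set to zero so no stale stack contents leak out.
 */
static void fill_events(struct events_sketch *out, unsigned char pending)
{
	assert(out != NULL);
	out->pending = pending;
	memset(out->pad, 0, sizeof(out->pad));
	memset(out->reserved, 0, sizeof(out->reserved));
}
```

Zeroing field by field only works when all padding is explicit, as it is here; a structure with compiler-inserted padding would need a memset() over the whole object instead.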
1. userspace calls GET_DIRTY_LOG
2. kvm_mmu_slot_remove_write_access is called and makes a page ro
3. page fault happens and makes the page writeable
fault is logged in the bitmap appropriately
4. kvm_vm_ioctl_get_dirty_log swaps slot pointers
a lot of time passes
5. guest writes into the page
6. userspace calls GET_DIRTY_LOG
At point (5), bitmap is clean and page is writeable,
thus, guest modification of memory is not logged
and GET_DIRTY_LOG returns an empty bitmap.
The rule is that all pages are either dirty in the current bitmap,
or write-protected, which is violated here.
It seems that just moving kvm_mmu_slot_remove_write_access down
to after the slot pointer swap should fix this bug.
KVM-Stable-Tag. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Linus Torvalds [Fri, 5 Nov 2010 14:54:40 +0000 (07:54 -0700)]
Merge branch 'for-linus-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
* 'for-linus-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k, m68knommu: Do not include linux/hardirq.h in asm/irqflags.h
m68knommu: add back in declaration of do_IRQ
Jeff Layton [Tue, 2 Nov 2010 20:22:50 +0000 (16:22 -0400)]
cifs: dereferencing first then checking
This patch is based on Dan's original patch. His original description is
below:
Smatch complained about a couple of "checking for NULL after dereferencing"
bugs. I'm not super familiar with the code so I did the conservative
thing and moved the dereferences after the checks.
The dereferences in cifs_lock() and cifs_fsync() were added in ba00ba64cf0 "cifs: make various routines use the cifsFileInfo->tcon
pointer". The dereference in find_writable_file() was added in 6508d904e6f "cifs: have find_readable/writable_file filter by fsuid".
The comments there say it's possible to trigger the NULL dereference
under stress.
Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
Nelson Elhage [Wed, 3 Nov 2010 16:35:41 +0000 (16:35 +0000)]
inet_diag: Make sure we actually run the same bytecode we audited.
We were using nlmsg_find_attr() to look up the bytecode by attribute when
auditing, but then just using the first attribute when actually running
bytecode. So, if we received a message with two attribute elements, where only
the second had type INET_DIAG_REQ_BYTECODE, we would validate and run different
bytecode strings.
Fix this by consistently using nlmsg_find_attr everywhere.
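A toy model of the mismatch, assuming a simplified attribute array instead of
real netlink TLVs. struct attr, ATTR_BYTECODE, find_attr() and first_attr() are
hypothetical names for illustration; the kernel helper named in this commit is
nlmsg_find_attr().

```c
#include <stddef.h>

/* Hypothetical, simplified attribute; real netlink attributes are TLVs. */
struct attr {
	int type;
	const char *payload;
};

#define ATTR_BYTECODE 1	/* stand-in for INET_DIAG_REQ_BYTECODE */

/* Lookup by type: return the first attribute of that type, or NULL. */
static const struct attr *find_attr(const struct attr *attrs, size_t n, int type)
{
	for (size_t i = 0; i < n; i++) {
		if (attrs[i].type == type)
			return &attrs[i];
	}
	return NULL;
}

/* Buggy lookup: assumes the bytecode is always the first attribute. */
static const struct attr *first_attr(const struct attr *attrs, size_t n)
{
	return n > 0 ? &attrs[0] : NULL;
}
```

With a message whose second element carries the bytecode type, find_attr() and
first_attr() return different attributes, which is the audit/run mismatch
described above. Using the by-type lookup in both places removes it.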
Signed-off-by: Nelson Elhage <nelhage@ksplice.com> Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 4 Nov 2010 01:21:39 +0000 (01:21 +0000)]
fib: fib_result_assign() should not change fib refcounts
After commit ebc0ffae5 (RCU conversion of fib_lookup()),
fib_result_assign() should not change fib refcounts anymore.
Thanks to Michael who did the bisection and bug report.
Reported-by: Michael Ellerman <michael@ellerman.id.au> Tested-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Mundt [Thu, 4 Nov 2010 03:51:08 +0000 (12:51 +0900)]
sh: Simplify phys_addr_mask()/PTE_PHYS_MASK for 29/32-bit.
Given that __in_29bit_mode() is a constant for the non-PMB case, we can
simply use the PMB-facing version of phys_addr_mask() and drop the other
variants.
Paul Mundt [Thu, 4 Nov 2010 03:46:19 +0000 (12:46 +0900)]
sh: nommu: Support building without an uncached mapping.
Now that nommu selects 32BIT we run into the situation where SH-2A
supports an uncached identity mapping by way of the BSC, while the SH-2
does not. This provides stubs for the PC manglers and tidies up some of
the system*.h mess in the process.
Paul Mundt [Thu, 4 Nov 2010 03:32:24 +0000 (12:32 +0900)]
sh: nommu: use 32-bit phys mode.
The nommu code has regressed somewhat in that 29BIT gets set for the
SH-2/2A configs regardless of the fact that they are really 32BIT sans
MMU or PMB. This does a bit of tidying to get nommu properly selecting
32BIT as it was before.
Paul Mundt [Thu, 4 Nov 2010 03:21:25 +0000 (12:21 +0900)]
mmc: sh_mmcif: Convert extern inline to static inline.
Presently the extern inline case results in a compiler warning on ARM due
to the memory barrier definition used in the I/O routines. These
ultimately all want to be static inline anyways, so just convert them all
in place.
Herbert Xu [Wed, 3 Nov 2010 13:31:05 +0000 (13:31 +0000)]
cls_cgroup: Fix crash on module unload
Somewhere along the line, net_cls_subsys_id became a macro when
cls_cgroup is built as a module. Not only does this make cls_cgroup
completely useless, it also causes a crash on module unload.
This patch fixes this by removing that macro.
Thanks to Eric Dumazet for diagnosing this problem.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xiaotian Feng [Tue, 2 Nov 2010 16:11:05 +0000 (16:11 +0000)]
net dst: fix percpu_counter list corruption and poison overwritten
Recent kernels show percpu_counter list corruption and poison overwritten
warnings, which are caused by commit fc66f95c.
Commit fc66f95c switched dst entries to percpu_counter. In ip6_route_net_init,
the kernel initializes the percpu_counter for dst entries, but the counter is
never destroyed in ip6_route_net_exit. So when the related data is freed, the
freed percpu_counter is still on the list, and inserting or removing another
percpu_counter then corrupts the list. Also, if the insert or remove operation
modifies the ->prev or ->next pointer of the freed object, the poison is
overwritten.
With this patch applied, the percpu_counter list corruption and poison
overwritten warnings disappear.
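Why a missing destroy corrupts the list can be seen in a toy model. This is a
sketch under stated assumptions, not the kernel's percpu_counter: struct
counter, counter_list and the three functions are hypothetical, and the model
only mirrors the idea that init links an object onto a global list and destroy
must unlink it before the memory is freed.

```c
#include <stddef.h>

/* Hypothetical toy counter; not the kernel's struct percpu_counter. */
struct counter {
	long value;
	struct counter *prev, *next;
};

/* Global circular list head that every initialised counter is linked onto. */
static struct counter counter_list = { 0, &counter_list, &counter_list };

/* Init links the counter onto the global list. */
static void counter_init(struct counter *c)
{
	c->value = 0;
	c->next = counter_list.next;
	c->prev = &counter_list;
	counter_list.next->prev = c;
	counter_list.next = c;
}

/* Destroy unlinks the counter; it must run before the memory is freed. */
static void counter_destroy(struct counter *c)
{
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->prev = c->next = NULL;
}

/* Count the entries currently linked onto the global list. */
static size_t counter_list_len(void)
{
	size_t n = 0;

	for (struct counter *c = counter_list.next; c != &counter_list; c = c->next)
		n++;
	return n;
}
```

If the memory behind a counter is freed without counter_destroy(), its
neighbours still point at it, so the next insert or remove writes through
->prev or ->next into freed memory.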
Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amerigo Wang [Tue, 2 Nov 2010 18:25:31 +0000 (18:25 +0000)]
netxen: remove unused firmware exports
Quote from Amit Salecha:
"Actually I was not updated, NX_UNIFIED_ROMIMAGE_NAME (phanfw.bin) is already
submitted and its present in linux-firmware.git.
I will get back to you on NX_P2_MN_ROMIMAGE_NAME, NX_P3_CT_ROMIMAGE_NAME and
NX_P3_MN_ROMIMAGE_NAME. Whether this will be submitted ?"
We have to remove these, otherwise we will get wrong info from modinfo.
Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Amit Kumar Salecha <amit.salecha@qlogic.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dhananjay Phadke <dhananjay.phadke@qlogic.com> Cc: Narender Kumar <narender.kumar@qlogic.com> Acked-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 3 Nov 2010 17:44:55 +0000 (13:44 -0400)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ASoC: tpa6130a2: Get rid of compile warning from tpa6130a2_power
ALSA: hda - MacBookAir3,1(3,2) alsa support
ASoC: fix the building issue of missing codec field in 'struct snd_soc_card'
ALSA: usb-audio - Support for Power/Status LED on Creative USB X-Fi S51
ALSA: asihpi - Unsafe memory management when allocating control cache
ASoC: Update WARN uses in wm_hubs
ASoC: Include cx20442 to SND_SOC_ALL_CODECS
ASoC: Fix SND_SOC_ALL_CODECS typo for jz4740
ASoC: Remove volatility from WM8900 POWER1 register
ALSA: lx6464es - make 1 bit signed bitfield unsigned
ALSA: cs46xx memory management fixes for cs46xx_dsp_spos_create()
ALSA: usb - driver neglects kmalloc return value check and may deref NULL
ASoC: tpa6130a2: Fix unbalanced regulator disables
ASoC: tlv320dac33: Mode1 FIFO auto configuration fix
ASoC: tlv320dac33: Limit the US_TO_SAMPLES macro
ASoC: tlv320dac33: Error handling for broken chip
ASoC: Check return value of strict_strtoul() in pmdown_time_set()
Theodore Ts'o [Wed, 3 Nov 2010 16:03:21 +0000 (12:03 -0400)]
ext4: initialize the percpu counters before replaying the journal
We now initialize the percpu counters before replaying the journal,
but after the journal replay, we recalculate the global counters, to deal
with the possibility of the per-blockgroup counts getting updated by
the journal replay.