Sanjay Lal [Wed, 20 Mar 2013 04:06:49 +0000 (15:06 +1100)]
mips: define KVM_USER_MEM_SLOTS
ARCH=mips, config=fuloong2e_defconfig:
akpm3:/usr/src/25> make arch/mips/kernel/early_printk.o
...
CC arch/mips/kernel/asm-offsets.s
In file included from arch/mips/kernel/asm-offsets.c:20:
include/linux/kvm_host.h:334: error: `KVM_USER_MEM_SLOTS' undeclared here (not in a function)
Signed-off-by: Sanjay Lal <sanjayl@kymasys.com> Reported-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Josh Boyer [Wed, 20 Mar 2013 04:06:49 +0000 (15:06 +1100)]
kmsg: honor dmesg_restrict sysctl on /dev/kmsg
Originally, the addition of dmesg_restrict covered both the syslog
method of accessing dmesg, as well as /dev/kmsg itself. This was done
indirectly by security_syslog calling cap_syslog before doing any LSM
checks.
However, commit 12b3052c3ee ("capabilities/syslog: open code cap_syslog
logic to fix build failure") moved the code around and pushed the checks
into the caller itself. That seems to have inadvertently dropped the
checks for dmesg_restrict on /dev/kmsg. Most people haven't noticed
because util-linux dmesg(1) defaults to using the syslog method for access
in older versions. With util-linux 2.22 and a kernel newer than 3.5,
dmesg(1) defaults to reading directly from /dev/kmsg.
Fix this by making an explicit check in the devkmsg_open function.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=903192
Signed-off-by: Josh Boyer <jwboyer@redhat.com> Reported-by: Christian Kujau <lists@nerdbynature.de> Cc: Eric Paris <eparis@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Vladimir Davydov [Wed, 20 Mar 2013 04:06:48 +0000 (15:06 +1100)]
mqueue: sys_mq_open: do not call mnt_drop_write() if read-only
mnt_drop_write() must be called only if mnt_want_write() succeeded,
otherwise the mnt_writers counter will diverge.
mnt_writers counters are used to check if remounting FS as read-only is
OK, so after an extra mnt_drop_write() call, it would be impossible to
remount mqueue FS as read-only. Besides, on umount a warning would be
printed like this one:
[ 194.714880] =====================================
[ 194.719680] [ BUG: bad unlock balance detected! ]
[ 194.724488] 3.9.0-rc3 #5 Not tainted
[ 194.728159] -------------------------------------
[ 194.732958] a.out/12486 is trying to release lock (sb_writers) at:
[ 194.739355] [<ffffffff811b177f>] mnt_drop_write+0x1f/0x30
[ 194.744851] but there are no more locks to release!
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Doug Ledford <dledford@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alexander Duyck [Wed, 20 Mar 2013 04:06:48 +0000 (15:06 +1100)]
dma-debug: update DMA debug API to better handle multiple mappings of a buffer
There were reports of the igb driver unmapping buffers without calling
dma_mapping_error. On closer inspection issues were found in the DMA
debug API and how it handled multiple mappings of the same buffer.
The issue I found is the fact that the debug_dma_mapping_error would only
set the map_err_type to MAP_ERR_CHECKED in the case that the was only one
match for device and device address. However in the case of non-IOMMU,
multiple addresses existed and as a result it was not setting this field
once a second mapping was instantiated. I have resolved this by changing
the search so that it instead will now set MAP_ERR_CHECKED on the first
buffer that matches the device and DMA address that is currently in the
state MAP_ERR_NOT_CHECKED.
A secondary side effect of this patch is that in the case of multiple
buffers using the same address only the last mapping will have a valid
map_err_type. The previous mappings will all end up with map_err_type set
to MAP_ERR_CHECKED because of the dma_mapping_error call in
debug_dma_map_page. However this behavior may be preferable as it means
you will likely only see one real error per multi-mapped buffer, versus
the current behavior of multiple false errors mer multi-mapped buffer.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Reviewed-by: Shuah Khan <shuah.khan@hp.com> Tested-by: Shuah Khan <shuah.khan@hp.com> Cc: Jakub Kicinski <kubakici@wp.pl> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alexander Duyck [Wed, 20 Mar 2013 04:06:47 +0000 (15:06 +1100)]
dma-debug: fix locking bug in check_unmap()
In check_unmap() it is possible to get into a dead-locked state if
dma_mapping_error is called. The problem is that the bucket is locked in
check_unmap, and locked again by debug_dma_mapping_error which is called
by dma_mapping_error. To resolve that we must release the lock on the
bucket before making the call to dma_mapping_error.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Reviewed-by: Shuah Khan <shuah.khan@hp.com> Tested-by: Shuah Khan <shuah.khan@hp.com> Cc: Jakub Kicinski <kubakici@wp.pl> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
zone_end_pfn is "unsigned" (32 bits). Changing it to
"unsigned long" (64 bits) fixes the problem.
zone_end_pfn() was added recently in commit 108bcc96ef70 ("mm: add & use
zone_end_pfn() and zone_spans_pfn()")
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/include/linux/mmzone.h?id=108bcc96ef7047c02cad4d229f04da38186a3f3f
Signed-off-by: Russ Anderson <rja@sgi.com> Reported-by: George Beshers <gbeshers@sgi.com> Acked-by: Hedi Berriche <hedi@sgi.com> Cc: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Oleg Nesterov [Wed, 20 Mar 2013 04:06:47 +0000 (15:06 +1100)]
poweroff: change orderly_poweroff() to use schedule_work()
David said:
Commit 6c0c0d4d108 ("poweroff: fix bug in orderly_poweroff()")
apparently fixes one bug in orderly_poweroff(), but introduces
another. The comments on orderly_poweroff() claim it can be called
from any context - and indeed we call it from interrupt context in
arch/powerpc/platforms/pseries/ras.c for example. But since that
commit this is no longer safe, since call_usermodehelper_fns() is not
safe in interrupt context without the UMH_NO_WAIT option.
orderly_poweroff() can be used from any context but UMH_WAIT_EXEC is
sleepable. Move the "force" logic into __orderly_poweroff() and change
orderly_poweroff() to use the global poweroff_work which simply calls
__orderly_poweroff().
While at it, remove the unneeded "int argc" and change argv_split() to use
GFP_KERNEL.
We use the global "bool poweroff_force" to pass the argument, this can
obviously affect the previous request if it is pending/running. So we
only allow the "false => true" transition assuming that the pending "true"
should succeed anyway. If schedule_work() fails after that we know that
work->func() was not called yet, it must see the new value.
This means that orderly_poweroff() becomes async even if we do not run the
command and always succeeds, schedule_work() can only fail if the work is
already pending. We can export __orderly_poweroff() and change the
non-atomic callers which want the old semantics.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reported-by: David Gibson <david@gibson.dropbear.id.au> Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Feng Hong <hongfeng@marvell.com> Cc: Kees Cook <keescook@chromium.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wanpeng Li [Wed, 20 Mar 2013 04:06:46 +0000 (15:06 +1100)]
mm/hugetlb: fix total hugetlbfs pages count when using memory overcommit accouting
hugetlb_total_pages is used for overcommit calculations but the current
implementation considers only the default hugetlb page size (which is
either the first defined hugepage size or the one specified by
default_hugepagesz kernel boot parameter).
If the system is configured for more than one hugepage size, which is
possible since a137e1cc ("hugetlbfs: per mount huge page sizes") then the
overcommit estimation done by __vm_enough_memory() (resp. shown by
meminfo_proc_show) is not precise - there is an impression of more
available/allowed memory. This can lead to an unexpected ENOMEM/EFAULT
resp. SIGSEGV when memory is accounted.
Testcase:
boot: hugepagesz=1G hugepages=1
the default overcommit ratio is 50
before patch:
egrep 'CommitLimit' /proc/meminfo
CommitLimit: 55434168 kB
after patch:
egrep 'CommitLimit' /proc/meminfo
CommitLimit: 54909880 kB
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@vger.kernel.org> [3.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
wake_up_klogd() is useless when CONFIG_PRINTK=n because neither printk()
nor printk_sched() are in use and there are actually no waiter on log_wait
waitqueue. It should be a stub in this case for users like
bust_spinlocks().
Otherwise this results in this warning when CONFIG_PRINTK=n
and CONFIG_IRQ_WORK=n:
kernel/built-in.o In function `wake_up_klogd':
(.text.wake_up_klogd+0xb4): undefined reference to `irq_work_queue'
To fix this, provide an off-case for wake_up_klogd() when CONFIG_PRINTK=n.
There is much more from console_unlock() and other console related code in
printk.c that should be moved under CONFIG_PRINTK. But for now, focus on
a minimal fix as we passed the merged window already.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reported-by: James Hogan <james.hogan@imgtec.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>