]> git.karo-electronics.de Git - karo-tx-linux.git/log
karo-tx-linux.git
10 years agovsprintf: ignore %n again
Kees Cook [Tue, 5 Nov 2013 06:07:01 +0000 (17:07 +1100)]
vsprintf: ignore %n again

This ignores %n in printf again, as was originally documented.
Implementing %n poses a greater security risk than utility, so it should
stay ignored.  To help anyone attempting to use %n, a warning will be
emitted if it is encountered.

Based on an earlier patch by Joe Perches.

Because %n was designed to write to pointers on the stack, it has been
frequently used as an attack vector when bugs are found that leak
user-controlled strings into functions that ultimately process format
strings.  While this class of bug can still be turned into an information
leak, removing %n eliminates the common method of elevating such a bug
into an arbitrary kernel memory writing primitive, significantly reducing
the danger of this class of bug.

For seq_file users that need to know the length of a written string for
padding, please see seq_setwidth() and seq_pad() instead.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Joe Perches <joe@perches.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoseq_file: remove "%n" usage from seq_file users
Tetsuo Handa [Tue, 5 Nov 2013 06:07:00 +0000 (17:07 +1100)]
seq_file: remove "%n" usage from seq_file users

All seq_printf() users are using "%n" for calculating padding size,
convert them to use seq_setwidth() / seq_pad() pair.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoseq_file: introduce seq_setwidth() and seq_pad()
Tetsuo Handa [Tue, 5 Nov 2013 06:07:00 +0000 (17:07 +1100)]
seq_file: introduce seq_setwidth() and seq_pad()

There are several users who want to know bytes written by seq_*() for
alignment purpose.  Currently they are using %n format for knowing it
because seq_*() returns 0 on success.

This patch introduces seq_setwidth() and seq_pad() for allowing them to
align without using %n format.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm-dynamically-allocate-page-ptl-if-it-cannot-be-embedded-to-struct-page-fix-fix
Andrew Morton [Tue, 5 Nov 2013 06:06:59 +0000 (17:06 +1100)]
mm-dynamically-allocate-page-ptl-if-it-cannot-be-embedded-to-struct-page-fix-fix

Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: try to detect that page->ptl is in use
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:59 +0000 (17:06 +1100)]
mm: try to detect that page->ptl is in use

prep_new_page() initialize page->private (and therefore page->ptl) with 0.
 Make sure nobody took it in use in between allocation of the page and
page table constructor.

It can happen if arch try to use slab for page table allocation: slab code
uses page->slab_cache and page->first_page (for tail pages), which share
storage with page->ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: dynamically allocate page->ptl if it cannot be embedded to struct page
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:58 +0000 (17:06 +1100)]
mm: dynamically allocate page->ptl if it cannot be embedded to struct page

If split page table lock is in use, we embed the lock into struct page of
table's page.  We have to disable split lock, if spinlock_t is too big be
to be embedded, like when DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC enabled.

This patch add support for dynamic allocation of split page table lock if
we can't embed it to struct page.

page->ptl is unsigned long now and we use it as spinlock_t if
sizeof(spinlock_t) <= sizeof(long), otherwise it's pointer to spinlock_t.

The spinlock_t allocated in pgtable_page_ctor() for PTE table and in
pgtable_pmd_page_ctor() for PMD table.  All other helpers converted to
support dynamically allocated page->ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Reviewed-by: Peter Zijlstra <peterz@infradead.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoxtensa: use buddy allocator for PTE table
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:58 +0000 (17:06 +1100)]
xtensa: use buddy allocator for PTE table

At the moment xtensa uses slab allocator for PTE table.  It doesn't work
with enabled split page table lock: slab uses page->slab_cache and
page->first_page for its pages.  These fields share stroage with
page->ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Chris Zankel <chris@zankel.net>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoiommu/arm-smmu: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:57 +0000 (17:06 +1100)]
iommu/arm-smmu: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoxtensa: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:56 +0000 (17:06 +1100)]
xtensa: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agox86: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:56 +0000 (17:06 +1100)]
x86: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agounicore32: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:55 +0000 (17:06 +1100)]
unicore32: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoum: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:55 +0000 (17:06 +1100)]
um: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agotile: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:54 +0000 (17:06 +1100)]
tile: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosparc: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:54 +0000 (17:06 +1100)]
sparc: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosh: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:53 +0000 (17:06 +1100)]
sh: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoscore: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:53 +0000 (17:06 +1100)]
score: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Acked-by: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agos390: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:52 +0000 (17:06 +1100)]
s390: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agopowerpc: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:52 +0000 (17:06 +1100)]
powerpc: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoparisc: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:51 +0000 (17:06 +1100)]
parisc: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomips: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:50 +0000 (17:06 +1100)]
mips: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agometag: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:50 +0000 (17:06 +1100)]
metag: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agom68k-handle-pgtable_page_ctor-fail-fix-fix
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:49 +0000 (17:06 +1100)]
m68k-handle-pgtable_page_ctor-fail-fix-fix

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agom68k-handle-pgtable_page_ctor-fail-fix
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:49 +0000 (17:06 +1100)]
m68k-handle-pgtable_page_ctor-fail-fix

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agom68k: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:48 +0000 (17:06 +1100)]
m68k: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agom32r: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:48 +0000 (17:06 +1100)]
m32r: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoia64: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:47 +0000 (17:06 +1100)]
ia64: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohexagon: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:47 +0000 (17:06 +1100)]
hexagon: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofrv: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:46 +0000 (17:06 +1100)]
frv: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agocris: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:45 +0000 (17:06 +1100)]
cris: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mikael Starvik <starvik@axis.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoavr32: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:45 +0000 (17:06 +1100)]
avr32: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoarm64: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:44 +0000 (17:06 +1100)]
arm64: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoarm: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:44 +0000 (17:06 +1100)]
arm: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoarc: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:43 +0000 (17:06 +1100)]
arc: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [for arch/arc bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoalpha: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:43 +0000 (17:06 +1100)]
alpha: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoopenrisc: add missing pgtable_page_ctor/dtor calls
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:42 +0000 (17:06 +1100)]
openrisc: add missing pgtable_page_ctor/dtor calls

It will fix NR_PAGETABLE accounting.  It's also required if the arch is
going ever support split ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomn10300: add missing pgtable_page_ctor/dtor calls
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:42 +0000 (17:06 +1100)]
mn10300: add missing pgtable_page_ctor/dtor calls

It will fix NR_PAGETABLE accounting.  It's also required if the arch is
going ever support split ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomicroblaze: add missing pgtable_page_ctor/dtor calls
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:41 +0000 (17:06 +1100)]
microblaze: add missing pgtable_page_ctor/dtor calls

It will fix NR_PAGETABLE accounting.  It's also required if the arch is
going ever support split ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: allow pgtable_page_ctor() to fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:40 +0000 (17:06 +1100)]
mm: allow pgtable_page_ctor() to fail

Change pgtable_page_ctor() return type from void to bool.  Returns true,
if initialization is successful and false otherwise.

Current implementation never fails, but it will change later.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoxtensa: fix potential NULL-pointer dereference
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:40 +0000 (17:06 +1100)]
xtensa: fix potential NULL-pointer dereference

Add missing check for memory allocation fail.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agom32r: fix potential NULL-pointer dereference
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:39 +0000 (17:06 +1100)]
m32r: fix potential NULL-pointer dereference

Add missing check for memory allocation fail.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agocris: fix potential NULL-pointer dereference
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:39 +0000 (17:06 +1100)]
cris: fix potential NULL-pointer dereference

Add missing check for memory allocation fail.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mikael Starvik <starvik@axis.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agox86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:38 +0000 (17:06 +1100)]
x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds

In split page table lock case, we embed spinlock_t into struct page.  For
obvious reason, we don't want to increase size of struct page if
spinlock_t is too big, like with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC or on
-rt kernel.  So we disable split page table lock, if spinlock_t is too big.

This patchset allows to allocate the lock dynamically if spinlock_t is
big.  In this page->ptl is used to store pointer to spinlock instead of
spinlock itself.  It costs additional cache line for indirect access, but
fix page fault scalability for multi-threaded applications.

LOCK_STAT depends on DEBUG_SPINLOCK, so on current kernel enabling
LOCK_STAT to analyse scalability issues breaks scalability.  ;)

The patchset mostly fixes this.  Results for ./thp_memscale -c 80 -b 512M
on 4-socket machine:

baseline, no CONFIG_LOCK_STAT: 9.115460703 seconds time elapsed
baseline, CONFIG_LOCK_STAT=y: 53.890567123 seconds time elapsed
patched, no CONFIG_LOCK_STAT: 8.852250368 seconds time elapsed
patched, CONFIG_LOCK_STAT=y: 11.069770759 seconds time elapsed

Patch count is scary, but most of them trivial. Overview:

 Patches 1-4 Few bug fixes. No dependencies to other patches.
Probably should applied as soon as possible.

 Patch 5 Changes signature of pgtable_page_ctor(). We will use it
for dynamic lock allocation, so it can fail.

 Patches 6-8 Add missing constructor/destructor calls on few archs.
It's fixes NR_PAGETABLE accounting and prepare to use
split ptl.

 Patches 9-33 Add pgtable_page_ctor() fail handling to all archs.

 Patches 34 Finally adds support of dynamically-allocated page->pte.
Also contains documentation for split page table lock.

This patch (of 34):

I've missed that we preallocate few pmds on pgd_alloc() if X86_PAE
enabled.  Let's add missed constructor/destructor calls.

I haven't noticed it during testing since prep_new_page() clears
page->mapping and therefore page->ptl.  It's effectively equal to
spin_lock_init(&page->ptl).

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agox86-mm-enable-split-page-table-lock-for-pmd-level-checkpatch-fixes
Andrew Morton [Tue, 5 Nov 2013 06:06:38 +0000 (17:06 +1100)]
x86-mm-enable-split-page-table-lock-for-pmd-level-checkpatch-fixes

ERROR: need consistent spacing around '|' (ctx:VxW)
#62: FILE: arch/x86/include/asm/pgalloc.h:84:
+ page = alloc_pages(GFP_KERNEL | __GFP_REPEAT| __GFP_ZERO, 0);
                                              ^

total: 1 errors, 0 warnings, 32 lines checked

./patches/x86-mm-enable-split-page-table-lock-for-pmd-level.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agox86, mm: enable split page table lock for PMD level
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:37 +0000 (17:06 +1100)]
x86, mm: enable split page table lock for PMD level

Enable PMD split page table lock for X86_64 and PAE.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: implement split page table lock for PMD level
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:37 +0000 (17:06 +1100)]
mm: implement split page table lock for PMD level

The basic idea is the same as with PTE level: the lock is embedded into
struct page of table's page.

We can't use mm->pmd_huge_pte to store pgtables for THP, since we don't
take mm->page_table_lock anymore. Let's reuse page->lru of table's page
for that.

pgtable_pmd_page_ctor() returns true, if initialization is successful and
false otherwise.  Current implementation never fails, but assumption that
constructor can fail will help to port it to -rt where spinlock_t is
rather huge and cannot be embedded into struct page -- dynamic allocation
is required.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: convert the rest to new page table lock api
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:36 +0000 (17:06 +1100)]
mm: convert the rest to new page table lock api

Only trivial cases left. Let's convert them altogether.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm-hugetlb-convert-hugetlbfs-to-use-split-pmd-lock-checkpatch-fixes
Andrew Morton [Tue, 5 Nov 2013 06:06:35 +0000 (17:06 +1100)]
mm-hugetlb-convert-hugetlbfs-to-use-split-pmd-lock-checkpatch-fixes

ERROR: code indent should use tabs where possible
#65: FILE: include/linux/hugetlb.h:396:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#65: FILE: include/linux/hugetlb.h:396:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#67: FILE: include/linux/hugetlb.h:398:
+       if (huge_page_size(h) == PMD_SIZE)$

WARNING: suspect code indent for conditional statements (7, 15)
#67: FILE: include/linux/hugetlb.h:398:
+       if (huge_page_size(h) == PMD_SIZE)
+               return pmd_lockptr(mm, (pmd_t *) pte);

ERROR: code indent should use tabs where possible
#68: FILE: include/linux/hugetlb.h:399:
+               return pmd_lockptr(mm, (pmd_t *) pte);$

WARNING: please, no spaces at the start of a line
#68: FILE: include/linux/hugetlb.h:399:
+               return pmd_lockptr(mm, (pmd_t *) pte);$

WARNING: please, no spaces at the start of a line
#69: FILE: include/linux/hugetlb.h:400:
+       VM_BUG_ON(huge_page_size(h) == PAGE_SIZE);$

WARNING: please, no spaces at the start of a line
#70: FILE: include/linux/hugetlb.h:401:
+       return &mm->page_table_lock;$

ERROR: code indent should use tabs where possible
#90: FILE: include/linux/hugetlb.h:436:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#90: FILE: include/linux/hugetlb.h:436:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#92: FILE: include/linux/hugetlb.h:438:
+       return &mm->page_table_lock;$

ERROR: code indent should use tabs where possible
#97: FILE: include/linux/hugetlb.h:443:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#97: FILE: include/linux/hugetlb.h:443:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#99: FILE: include/linux/hugetlb.h:445:
+       spinlock_t *ptl;$

WARNING: please, no spaces at the start of a line
#100: FILE: include/linux/hugetlb.h:446:
+       ptl = huge_pte_lockptr(h, mm, pte);$

WARNING: please, no spaces at the start of a line
#101: FILE: include/linux/hugetlb.h:447:
+       spin_lock(ptl);$

WARNING: please, no spaces at the start of a line
#102: FILE: include/linux/hugetlb.h:448:
+       return ptl;$

WARNING: line over 80 characters
#264: FILE: mm/hugetlb.c:2668:
+  * race occurs while re-acquiring page table lock, and

total: 4 errors, 14 warnings, 474 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-hugetlb-convert-hugetlbfs-to-use-split-pmd-lock.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm, hugetlb: convert hugetlbfs to use split pmd lock
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:35 +0000 (17:06 +1100)]
mm, hugetlb: convert hugetlbfs to use split pmd lock

Hugetlb supports multiple page sizes. We use split lock only for PMD
level, but not for PUD.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm, thp: do not access mm->pmd_huge_pte directly
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:34 +0000 (17:06 +1100)]
mm, thp: do not access mm->pmd_huge_pte directly

Currently mm->pmd_huge_pte protected by page table lock.  It will not work
with split lock.  We have to have per-pmd pmd_huge_pte for proper access
serialization.

For now, let's just introduce wrapper to access mm->pmd_huge_pte.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm, thp: move ptl taking inside page_check_address_pmd()
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:34 +0000 (17:06 +1100)]
mm, thp: move ptl taking inside page_check_address_pmd()

With split page table lock we can't know which lock we need to take before
we find the relevant pmd.

Let's move lock taking inside the function.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm, thp: change pmd_trans_huge_lock() to return taken lock
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:33 +0000 (17:06 +1100)]
mm, thp: change pmd_trans_huge_lock() to return taken lock

With split ptlock it's important to know which lock pmd_trans_huge_lock()
took.  This patch adds one more parameter to the function to return the
lock.

In most places migration to new api is trivial.  Exception is
move_huge_pmd(): we need to take two locks if pmd tables are different.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: introduce api for split page table lock for PMD level
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:33 +0000 (17:06 +1100)]
mm: introduce api for split page table lock for PMD level

Basic api, backed by mm->page_table_lock for now. Actual implementation
will be added later.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: convert mm->nr_ptes to atomic_long_t
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:32 +0000 (17:06 +1100)]
mm: convert mm->nr_ptes to atomic_long_t

With split page table lock for PMD level we can't hold mm->page_table_lock
while updating nr_ptes.

Let's convert it to atomic_long_t to avoid races.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:31 +0000 (17:06 +1100)]
mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS

We're going to introduce split page table lock for PMD level.  Let's
rename existing split ptlock for PTE level to avoid confusion.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: avoid increase sizeof(struct page) due to split page table lock
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:31 +0000 (17:06 +1100)]
mm: avoid increase sizeof(struct page) due to split page table lock

Alex Thorlton noticed that some massively threaded workloads work poorly,
if THP enabled.  This patchset fixes this by introducing split page table
lock for PMD tables.  hugetlbfs is not covered yet.

This patchset is based on work by Naoya Horiguchi.

: akpm result summary:
:
: THP off, v3.12-rc2: 18.059261877 seconds time elapsed
: THP off, patched:   16.768027318 seconds time elapsed
:
: THP on, v3.12-rc2:  42.162306788 seconds time elapsed
: THP on, patched:    8.397885779 seconds time elapsed
:
: HUGETLB, v3.12-rc2: 47.574936948 seconds time elapsed
: HUGETLB, patched:   19.447481153 seconds time elapsed

THP off, v3.12-rc2:
-------------------

 Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):

    1037072.835207 task-clock                #   57.426 CPUs utilized            ( +-  3.59% )
            95,093 context-switches          #    0.092 K/sec                    ( +-  3.93% )
               140 cpu-migrations            #    0.000 K/sec                    ( +-  5.28% )
        10,000,550 page-faults               #    0.010 M/sec                    ( +-  0.00% )
 2,455,210,400,261 cycles                    #    2.367 GHz                      ( +-  3.62% ) [83.33%]
 2,429,281,882,056 stalled-cycles-frontend   #   98.94% frontend cycles idle     ( +-  3.67% ) [83.33%]
 1,975,960,019,659 stalled-cycles-backend    #   80.48% backend  cycles idle     ( +-  3.88% ) [66.68%]
    46,503,296,013 instructions              #    0.02  insns per cycle
                                             #   52.24  stalled cycles per insn  ( +-  3.21% ) [83.34%]
     9,278,997,542 branches                  #    8.947 M/sec                    ( +-  4.00% ) [83.34%]
        89,881,640 branch-misses             #    0.97% of all branches          ( +-  1.17% ) [83.33%]

      18.059261877 seconds time elapsed                                          ( +-  2.65% )

THP on, v3.12-rc2:
------------------

 Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):

    3114745.395974 task-clock                #   73.875 CPUs utilized            ( +-  1.84% )
           267,356 context-switches          #    0.086 K/sec                    ( +-  1.84% )
                99 cpu-migrations            #    0.000 K/sec                    ( +-  1.40% )
            58,313 page-faults               #    0.019 K/sec                    ( +-  0.28% )
 7,416,635,817,510 cycles                    #    2.381 GHz                      ( +-  1.83% ) [83.33%]
 7,342,619,196,993 stalled-cycles-frontend   #   99.00% frontend cycles idle     ( +-  1.88% ) [83.33%]
 6,267,671,641,967 stalled-cycles-backend    #   84.51% backend  cycles idle     ( +-  2.03% ) [66.67%]
   117,819,935,165 instructions              #    0.02  insns per cycle
                                             #   62.32  stalled cycles per insn  ( +-  4.39% ) [83.34%]
    28,899,314,777 branches                  #    9.278 M/sec                    ( +-  4.48% ) [83.34%]
        71,787,032 branch-misses             #    0.25% of all branches          ( +-  1.03% ) [83.33%]

      42.162306788 seconds time elapsed                                          ( +-  1.73% )

HUGETLB, v3.12-rc2:
-------------------

 Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs):

    2588052.787264 task-clock                #   54.400 CPUs utilized            ( +-  3.69% )
           246,831 context-switches          #    0.095 K/sec                    ( +-  4.15% )
               138 cpu-migrations            #    0.000 K/sec                    ( +-  5.30% )
            21,027 page-faults               #    0.008 K/sec                    ( +-  0.01% )
 6,166,666,307,263 cycles                    #    2.383 GHz                      ( +-  3.68% ) [83.33%]
 6,086,008,929,407 stalled-cycles-frontend   #   98.69% frontend cycles idle     ( +-  3.77% ) [83.33%]
 5,087,874,435,481 stalled-cycles-backend    #   82.51% backend  cycles idle     ( +-  4.41% ) [66.67%]
   133,782,831,249 instructions              #    0.02  insns per cycle
                                             #   45.49  stalled cycles per insn  ( +-  4.30% ) [83.34%]
    34,026,870,541 branches                  #   13.148 M/sec                    ( +-  4.24% ) [83.34%]
        68,670,942 branch-misses             #    0.20% of all branches          ( +-  3.26% ) [83.33%]

      47.574936948 seconds time elapsed                                          ( +-  2.09% )

THP off, patched:
-----------------

 Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):

     943301.957892 task-clock                #   56.256 CPUs utilized            ( +-  3.01% )
            86,218 context-switches          #    0.091 K/sec                    ( +-  3.17% )
               121 cpu-migrations            #    0.000 K/sec                    ( +-  6.64% )
        10,000,551 page-faults               #    0.011 M/sec                    ( +-  0.00% )
 2,230,462,457,654 cycles                    #    2.365 GHz                      ( +-  3.04% ) [83.32%]
 2,204,616,385,805 stalled-cycles-frontend   #   98.84% frontend cycles idle     ( +-  3.09% ) [83.32%]
 1,778,640,046,926 stalled-cycles-backend    #   79.74% backend  cycles idle     ( +-  3.47% ) [66.69%]
    45,995,472,617 instructions              #    0.02  insns per cycle
                                             #   47.93  stalled cycles per insn  ( +-  2.51% ) [83.34%]
     9,179,700,174 branches                  #    9.731 M/sec                    ( +-  3.04% ) [83.35%]
        89,166,529 branch-misses             #    0.97% of all branches          ( +-  1.45% ) [83.33%]

      16.768027318 seconds time elapsed                                          ( +-  2.47% )

THP on, patched:
----------------

 Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):

     458793.837905 task-clock                #   54.632 CPUs utilized            ( +-  0.79% )
            41,831 context-switches          #    0.091 K/sec                    ( +-  0.97% )
                98 cpu-migrations            #    0.000 K/sec                    ( +-  1.66% )
            57,829 page-faults               #    0.126 K/sec                    ( +-  0.62% )
 1,077,543,336,716 cycles                    #    2.349 GHz                      ( +-  0.81% ) [83.33%]
 1,067,403,802,964 stalled-cycles-frontend   #   99.06% frontend cycles idle     ( +-  0.87% ) [83.33%]
   864,764,616,143 stalled-cycles-backend    #   80.25% backend  cycles idle     ( +-  0.73% ) [66.68%]
    16,129,177,440 instructions              #    0.01  insns per cycle
                                             #   66.18  stalled cycles per insn  ( +-  7.94% ) [83.35%]
     3,618,938,569 branches                  #    7.888 M/sec                    ( +-  8.46% ) [83.36%]
        33,242,032 branch-misses             #    0.92% of all branches          ( +-  2.02% ) [83.32%]

       8.397885779 seconds time elapsed                                          ( +-  0.18% )

HUGETLB, patched:
-----------------

 Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs):

     395353.076837 task-clock                #   20.329 CPUs utilized            ( +-  8.16% )
            55,730 context-switches          #    0.141 K/sec                    ( +-  5.31% )
               138 cpu-migrations            #    0.000 K/sec                    ( +-  4.24% )
            21,027 page-faults               #    0.053 K/sec                    ( +-  0.00% )
   930,219,717,244 cycles                    #    2.353 GHz                      ( +-  8.21% ) [83.32%]
   914,295,694,103 stalled-cycles-frontend   #   98.29% frontend cycles idle     ( +-  8.35% ) [83.33%]
   704,137,950,187 stalled-cycles-backend    #   75.70% backend  cycles idle     ( +-  9.16% ) [66.69%]
    30,541,538,385 instructions              #    0.03  insns per cycle
                                             #   29.94  stalled cycles per insn  ( +-  3.98% ) [83.35%]
     8,415,376,631 branches                  #   21.286 M/sec                    ( +-  3.61% ) [83.36%]
        32,645,478 branch-misses             #    0.39% of all branches          ( +-  3.41% ) [83.32%]

      19.447481153 seconds time elapsed                                          ( +-  2.00% )

This patch (of 11):

CONFIG_GENERIC_LOCKBREAK increases sizeof(spinlock_t) to 8 bytes.  It
leads to increase sizeof(struct page) by 4 bytes on 32-bit system if split
page table lock is in use, since page->ptl shares space in union with
longs and pointers.

Let's disable split page table lock on 32-bit systems with
GENERIC_LOCKBREAK enabled.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm-drop-actor-argument-of-do_generic_file_read-fix
Andrew Morton [Tue, 5 Nov 2013 06:06:30 +0000 (17:06 +1100)]
mm-drop-actor-argument-of-do_generic_file_read-fix

fix mm-drop-actor-argument-of-do_generic_file_read for linux-next changes

Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: drop actor argument of do_generic_file_read()
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:30 +0000 (17:06 +1100)]
mm: drop actor argument of do_generic_file_read()

There's only one caller of do_generic_file_read() and the only actor is
file_read_actor().  No reason to have a callback parameter.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agonet/netfilter/ipset/ip_set_hash_netportnet.c: fix build with older gccs
Andrew Morton [Tue, 5 Nov 2013 06:06:29 +0000 (17:06 +1100)]
net/netfilter/ipset/ip_set_hash_netportnet.c: fix build with older gccs

net/netfilter/ipset/ip_set_hash_netportnet.c: In function 'hash_netportnet4_kadt':
net/netfilter/ipset/ip_set_hash_netportnet.c:151: error: unknown field 'cidr' specified in initializer
net/netfilter/ipset/ip_set_hash_netportnet.c:151: warning: missing braces around initializer
net/netfilter/ipset/ip_set_hash_netportnet.c:151: warning: (near initialization for 'e.<anonymous>')

etc

gcc-4.4.4 doesn't like that anonymous union initializer and I couldnt'
find a way of tricking it into doing the right thing, so open-code it.

Cc: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agonet/netfilter/ipset/ip_set_hash_netnet.c: fix build with older gcc
Andrew Morton [Tue, 5 Nov 2013 06:06:29 +0000 (17:06 +1100)]
net/netfilter/ipset/ip_set_hash_netnet.c: fix build with older gcc

net/netfilter/ipset/ip_set_hash_netnet.c: In function 'hash_netnet4_kadt':
net/netfilter/ipset/ip_set_hash_netnet.c:141: error: unknown field 'cidr' specified in initializer
net/netfilter/ipset/ip_set_hash_netnet.c:141: warning: missing braces around initializer
net/netfilter/ipset/ip_set_hash_netnet.c:141: warning: (near initialization for 'e.<anonymous>')

etc.

gcc-4.4.4 doesn't like that anonymous union initializer and I couldnt'
find a way of tricking it into doing the right thing, so open-code it.

Cc: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoMerge branch 'akpm-current/current'
Stephen Rothwell [Tue, 5 Nov 2013 06:29:26 +0000 (17:29 +1100)]
Merge branch 'akpm-current/current'

Conflicts:
arch/x86/mm/init.c
drivers/net/wireless/rt2x00/rt2800pci.c
fs/bio-integrity.c
mm/bounce.c

10 years agoscripts/bloat-o-meter: use .startswith rather than fragile slicing
Josh Triplett [Tue, 5 Nov 2013 05:57:53 +0000 (16:57 +1100)]
scripts/bloat-o-meter: use .startswith rather than fragile slicing

str.startswith has existed since at least Python 2.0, in 2000; use it
rather than a fragile comparison against an initial slice of a string,
which requires hard-coding the length of the string to compare against.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoscripts/bloat-o-meter: ignore changes in the size of linux_banner
Josh Triplett [Tue, 5 Nov 2013 05:57:52 +0000 (16:57 +1100)]
scripts/bloat-o-meter: ignore changes in the size of linux_banner

linux_banner can change size due to changes in the compiler, build number,
or the user@host the system was compiled on; ignore size changes in
linux_banner entirely.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoipc-msg-fix-message-length-check-for-negative-values-fix
Andrew Morton [Tue, 5 Nov 2013 05:57:52 +0000 (16:57 +1100)]
ipc-msg-fix-message-length-check-for-negative-values-fix

fix i386 min() warnings

ipc/msgutil.c: In function 'alloc_msg':
ipc/msgutil.c:54: warning: comparison of distinct pointer types lacks a cast
ipc/msgutil.c:66: warning: comparison of distinct pointer types lacks a cast
ipc/msgutil.c: In function 'load_msg':
ipc/msgutil.c:94: warning: comparison of distinct pointer types lacks a cast
ipc/msgutil.c:101: warning: comparison of distinct pointer types lacks a cast
ipc/msgutil.c: In function 'copy_msg':
ipc/msgutil.c:127: warning: comparison of distinct pointer types lacks a cast
ipc/msgutil.c:135: warning: comparison of distinct pointer types lacks a cast
ipc/msgutil.c: In function 'store_msg':
ipc/msgutil.c:155: warning: comparison of distinct pointer types lacks a cast
ipc/msgutil.c:162: warning: comparison of distinct pointer types lacks a cast

Cc: Brad Spengler <spender@grsecurity.net>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Pax Team <pageexec@freemail.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoipc, msg: fix message length check for negative values
Mathias Krause [Tue, 5 Nov 2013 05:57:51 +0000 (16:57 +1100)]
ipc, msg: fix message length check for negative values

On 64 bit systems the test for negative message sizes is bogus as the
size, which may be positive when evaluated as a long, will get truncated
to an int when passed to load_msg().  So a long might very well contain a
positive value but when truncated to an int it would become negative.

That in combination with a small negative value of msg_ctlmax (which will
be promoted to an unsigned type for the comparison against msgsz, making
it a big positive value and therefore make it pass the check) will lead to
two problems: 1/ The kmalloc() call in alloc_msg() will allocate a too
small buffer as the addition of alen is effectively a subtraction.  2/ The
copy_from_user() call in load_msg() will first overflow the buffer with
userland data and then, when the userland access generates an access
violation, the fixup handler copy_user_handle_tail() will try to fill the
remainder with zeros -- roughly 4GB.  That almost instantly results in a
system crash or reset.

,-[ Reproducer (needs to be run as root) ]--
| #include <sys/stat.h>
| #include <sys/msg.h>
| #include <unistd.h>
| #include <fcntl.h>
|
| int main(void) {
|     long msg = 1;
|     int fd;
|
|     fd = open("/proc/sys/kernel/msgmax", O_WRONLY);
|     write(fd, "-1", 2);
|     close(fd);
|
|     msgsnd(0, &msg, 0xfffffff0, IPC_NOWAIT);
|
|     return 0;
| }
'---

Fix the issue by preventing msgsz from getting truncated by consistently
using size_t for the message length.  This way the size checks in
do_msgsnd() could still be passed with a negative value for msg_ctlmax but
we would fail on the buffer allocation in that case and error out.

Also change the type of m_ts from int to size_t to avoid similar nastiness
in other code paths -- it is used in similar constructs, i.e.  signed vs.
unsigned checks.  It should never become negative under normal
circumstances, though.

Setting msg_ctlmax to a negative value is an odd configuration and should
be prevented.  As that might break existing userland, it will be handled
in a separate commit so it could easily be reverted and reworked without
reintroducing the above described bug.

Hardening mechanisms for user copy operations would have catched that bug
early -- e.g.  checking slab object sizes on user copy operations as the
usercopy feature of the PaX patch does.  Or, for that matter, detect the
long vs.  int sign change due to truncation, as the size overflow plugin
of the very same patch does.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Pax Team <pageexec@freemail.hu>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org> [ v2.3.27+ -- yes, that old ;) ]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoipc/util.c: remove unnecessary work pending test
Xie XiuQi [Tue, 5 Nov 2013 05:57:51 +0000 (16:57 +1100)]
ipc/util.c: remove unnecessary work pending test

Remove unnecessary work pending test before calling schedule_work().  It
has been tested in queue_work_on() already.  No functional changed.

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodevpts: plug the memory leak in kill_sb
Ilija Hadzic [Tue, 5 Nov 2013 05:57:50 +0000 (16:57 +1100)]
devpts: plug the memory leak in kill_sb

When devpts is unmounted, there may be a no-longer-used IDR tree hanging
off the superblock we are about to kill.  This needs to be cleaned up
before destroying the SB.

The leak is usually not a big deal because unmounting devpts is typically
done when shutting down the whole machine.  However, shutting down an LXC
container instead of a physical machine exposes the problem (the garbage
is detectable with kmemleak).

Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokfifo-api-type-safety-checkpatch-fixes
Andrew Morton [Tue, 5 Nov 2013 05:57:50 +0000 (16:57 +1100)]
kfifo-api-type-safety-checkpatch-fixes

ERROR: code indent should use tabs where possible
#201: FILE: include/linux/kfifo.h:504:
+ ^Itypeof(__tmp->ptr_const) __buf = (buf); \$

WARNING: please, no space before tabs
#201: FILE: include/linux/kfifo.h:504:
+ ^Itypeof(__tmp->ptr_const) __buf = (buf); \$

WARNING: please, no spaces at the start of a line
#201: FILE: include/linux/kfifo.h:504:
+ ^Itypeof(__tmp->ptr_const) __buf = (buf); \$

total: 1 errors, 2 warnings, 221 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/kfifo-api-type-safety.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokfifo API type safety
Stefani Seibold [Tue, 5 Nov 2013 05:57:49 +0000 (16:57 +1100)]
kfifo API type safety

This patch enhances the type safety for the kfifo API.  It is now safe to
put const data into a non const FIFO and the API will now generate a
compiler warning when reading from the fifo where the destination address
is pointing to a const variable.

As a side effect the kfifo_put() does now expect the value of an element
instead a pointer to the element.  This was suggested Russell King.  It
make the handling of the kfifo_put easier since there is no need to create
a helper variable for getting the address of a pointer or to pass integers
of different sizes.

IMHO the API break is okay, since there are currently only six users of
kfifo_put().

The code is also cleaner by kicking out the "if (0)" expressions.

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokfifo: kfifo_copy_{to,from}_user: fix copied bytes calculation
Lars-Peter Clausen [Tue, 5 Nov 2013 05:57:48 +0000 (16:57 +1100)]
kfifo: kfifo_copy_{to,from}_user: fix copied bytes calculation

'copied' and 'len' are in bytes, while 'ret' is in elements, so we need to
multiply 'ret' with the size of one element to get the correct result.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago./Makefile: export initial ramdisk compression config option
P J P [Tue, 5 Nov 2013 05:57:48 +0000 (16:57 +1100)]
./Makefile: export initial ramdisk compression config option

Make menuconfig allows one to choose compression format of an initial
ramdisk image.  But this choice does not result in duly compressed ramdisk
image.  Because - $ make install - does not pass on the selected
compression choice to the dracut(8) tool, which creates the initramfs
file.  dracut(8) generates the image with the default compression, ie.
gzip(1).

This patch exports the selected compression option to a sub-shell
environment, so that it could be used by dracut(8) tool to generate
appropriately compressed initramfs images.

There isn't a straightforward way to pass on options to dracut(8) via
positional parameters.  Because it is indirectly invoked at the end of a $
make install sequence.

 # make install
   -> arch/$arch/boot/Makefile
    -> arch/$arch/boot/install.sh
     -> /sbing/installkernel ...
      -> /sbin/new-kernel-pkg ...
       -> /sbin/dracut ...

Signed-off-by: P J P <ppandit@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoinit/Kconfig: add option to disable kernel compression
Christian Ruppert [Tue, 5 Nov 2013 05:57:47 +0000 (16:57 +1100)]
init/Kconfig: add option to disable kernel compression

Some ARC users say they can boot faster with without kernel compression.
This probably depends on things like the FLASH chip they use etc.

Until now, kernel compression can only be disabled by removing "select
HAVE_<compression>" lines from the architecture Kconfig.  So add the
Kconfig logic to permit disabling of kernel compression.

Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers: w1: make w1_slave::flags long to avoid memory corruption
Michal Nazarewicz [Tue, 5 Nov 2013 05:57:47 +0000 (16:57 +1100)]
drivers: w1: make w1_slave::flags long to avoid memory corruption

On architectures where long is more then 32 bits, modifying a 32-bit field
with set_bit (and other atomic bit operations) may cause bytes following
the field to by modified.

Because the endianness of the bits within a field is the native endianness
of the CPU[1], on big-endian machines, bit number zero is in the last byte
of the field.

Therefore, `set_bit(0, ptr)' on a 64-bit big-endian machine is roughly
equivalent to `((char *)ptr)[7] |= 1', and since w1 driver uses a 32-bit
field for holding the flags, this causes bytes beyond the field to be
modified.

[1] From Documentation/atomic_ops.txt:

    Native atomic bit operations are defined to operate on objects
    aligned to the size of an "unsigned long" C data type, and are
    least of that size.  The endianness of the bits within each
    "unsigned long" are the native endianness of the cpu.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/w1/masters/ds1wm.cuse dev_get_platdata()
Jingoo Han [Tue, 5 Nov 2013 05:57:46 +0000 (16:57 +1100)]
drivers/w1/masters/ds1wm.cuse dev_get_platdata()

Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly.  This is a cosmetic change to make
the code simpler and enhance the readability.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/memstick/core/mspro_block.c: fix attributes array allocation
Michal Nazarewicz [Tue, 5 Nov 2013 05:57:46 +0000 (16:57 +1100)]
drivers/memstick/core/mspro_block.c: fix attributes array allocation

attrs field of attribute_group structure is a pointer to a pointer (as in
an array of pointers) rather than pointer to attribute struct (as in an
array of structures), so when allocating size of the pointer sholud be
used instead of the structure it is pointing to.

While at it, also change the call to use kcalloc rather than kzalloc.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agopps : add non blocking option to PPS_FETCH ioctl.
Paul Chavent [Tue, 5 Nov 2013 05:57:45 +0000 (16:57 +1100)]
pps : add non blocking option to PPS_FETCH ioctl.

The PPS_FETCH ioctl is blocking still the reception of a PPS event.  But,
in some case, one may immediately need the last event date.  This patch
allow to get the result of PPS_FETCH if the device has the O_NONBLOCK flag
set.

Signed-off-by: Paul Chavent <paul.chavent@onera.fr>
Acked-by: Rodolfo Giometti <giometti@enneenne.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/pps/clients/pps-gpio.c: remove redundant of_match_ptr
Sachin Kamat [Tue, 5 Nov 2013 05:57:45 +0000 (16:57 +1100)]
drivers/pps/clients/pps-gpio.c: remove redundant of_match_ptr

The data structure of_match_ptr() protects is always compiled in.  Hence
of_match_ptr() is not needed.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Rodolfo Giometti <giometti@enneenne.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/panic.c: reduce 1 byte usage for print tainted buffer
Chen Gang [Tue, 5 Nov 2013 05:57:44 +0000 (16:57 +1100)]
kernel/panic.c: reduce 1 byte usage for print tainted buffer

sizeof("Tainted: ") already counts '\0', and after first sprintf(), 's'
will start from the current string end (its' value is '\0').

So need not add additional 1 byte for maximized usage of 'buf' in
print_tainted().

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agogcov: reuse kbasename helper
Andy Shevchenko [Tue, 5 Nov 2013 05:57:43 +0000 (16:57 +1100)]
gcov: reuse kbasename helper

To get name of the file from a pathname let's use kbasename() helper.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/gcov/fs.c: use pr_warn()
Andrew Morton [Tue, 5 Nov 2013 05:57:43 +0000 (16:57 +1100)]
kernel/gcov/fs.c: use pr_warn()

pr_warning() is deprecated in favor of pr_warn()

Cc: Andy Gospodarek <agospoda@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Frantisek Hrbata <fhrbata@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/module.c: use pr_foo()
Andrew Morton [Tue, 5 Nov 2013 05:57:42 +0000 (16:57 +1100)]
kernel/module.c: use pr_foo()

kernel/module.c uses a mix of printk(KERN_foo and pr_foo().  Convert it
all to pr_foo and make the offered cleanups.

Not sure what to do about the printk(KERN_DEFAULT).  We don't have a
pr_default().

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Joe Perches <joe@perches.com>
Cc: Frantisek Hrbata <fhrbata@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agogcov: compile specific gcov implementation based on gcc version
Frantisek Hrbata [Tue, 5 Nov 2013 05:57:42 +0000 (16:57 +1100)]
gcov: compile specific gcov implementation based on gcc version

Compile the correct gcov implementation file for the specific gcc version.

Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Gospodarek <agospoda@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agogcov-add-support-for-gcc-47-gcov-format-fix-3
Peter Oberparleiter [Tue, 5 Nov 2013 05:57:41 +0000 (16:57 +1100)]
gcov-add-support-for-gcc-47-gcov-format-fix-3

gcc_4_7.c needs vmalloc.h

Cc: Frantisek Hrbata <fhrbata@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agogcov-add-support-for-gcc-47-gcov-format-checkpatch-fixes
Andrew Morton [Tue, 5 Nov 2013 05:57:41 +0000 (16:57 +1100)]
gcov-add-support-for-gcc-47-gcov-format-checkpatch-fixes

ERROR: that open brace { should be on the previous line
#227: FILE: kernel/gcov/gcc_4_7.c:179:
+ for (fi_idx = 0; fi_idx < info->n_functions; fi_idx++)
+ {

ERROR: that open brace { should be on the previous line
#269: FILE: kernel/gcov/gcc_4_7.c:221:
+ for (fi_idx = 0; fi_idx < src->n_functions; fi_idx++)
+ {

total: 2 errors, 0 warnings, 574 lines checked

./patches/gcov-add-support-for-gcc-47-gcov-format.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Frantisek Hrbata <fhrbata@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agogcov-add-support-for-gcc-47-gcov-format-fix-fix
Andrew Morton [Tue, 5 Nov 2013 05:57:40 +0000 (16:57 +1100)]
gcov-add-support-for-gcc-47-gcov-format-fix-fix

checkpatch whined about the argument order

Cc: Frantisek Hrbata <fhrbata@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agogcov-add-support-for-gcc-47-gcov-format-fix
Andrew Morton [Tue, 5 Nov 2013 05:57:40 +0000 (16:57 +1100)]
gcov-add-support-for-gcc-47-gcov-format-fix

use kmemdup() and kcalloc()

Cc: Frantisek Hrbata <fhrbata@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agogcov: add support for gcc 4.7 gcov format
Frantisek Hrbata [Tue, 5 Nov 2013 05:57:39 +0000 (16:57 +1100)]
gcov: add support for gcc 4.7 gcov format

The gcov in-memory format changed in gcc 4.7.  The biggest change, which
requires this special implementation, is that gcov_info no longer contains
array of counters for each counter type for all functions and gcov_fn_info
is not used for mapping of function's counters to these arrays(offset).
Now each gcov_fn_info contans it's counters, which makes things a little
bit easier.

This is heavily based on the previous gcc_3_4.c implementation and patches
provided by Peter Oberparleiter.  Specially the buffer gcda implementation
for iterator.

Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Gospodarek <agospoda@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agogcov: move gcov structs definitions to a gcc version specific file
Frantisek Hrbata [Tue, 5 Nov 2013 05:57:39 +0000 (16:57 +1100)]
gcov: move gcov structs definitions to a gcc version specific file

Since also the gcov structures(gcov_info, gcov_fn_info, gcov_ctr_info) can
change between gcc releases, as shown in gcc 4.7, they cannot be defined
in a common header and need to be moved to a specific gcc implemention
file.  This also requires to make the gcov_info structure opaque for the
common code and to introduce simple helpers for accessing data inside
gcov_info.

Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Gospodarek <agospoda@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/taskstats.c: return -ENOMEM when alloc memory fails in add_del_listener()
Chen Gang [Tue, 5 Nov 2013 05:57:38 +0000 (16:57 +1100)]
kernel/taskstats.c: return -ENOMEM when alloc memory fails in add_del_listener()

For registering in add_del_listener(), when kmalloc_node() fails, need
return -ENOMEM instead of success code, and cmd_attr_register_cpumask()
wants to know about it.

After modification, give a simple common test "build -> boot up ->
kernel/controllers/cgroup/getdelays by LTP tools".

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/taskstats.c: add nla_nest_cancel() for failure processing between nla_nest_sta...
Chen Gang [Tue, 5 Nov 2013 05:57:37 +0000 (16:57 +1100)]
kernel/taskstats.c: add nla_nest_cancel() for failure processing between nla_nest_start() and nla_nest_end()

When failure occurs between nla_nest_start() and nla_nest_end(), we should
call nla_nest_cancel() to clean up related things.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/sysctl_binary.c: use scnprintf() instead of snprintf()
Chen Gang [Tue, 5 Nov 2013 05:57:37 +0000 (16:57 +1100)]
kernel/sysctl_binary.c: use scnprintf() instead of snprintf()

snprintf() will return the 'ideal' length which may be larger than real
buffer length, if we only want to use real length, need use scnprintf()
instead of.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/sysctl.c: check return value after call proc_put_char() in __do_proc_doulongve...
Chen Gang [Tue, 5 Nov 2013 05:57:36 +0000 (16:57 +1100)]
kernel/sysctl.c: check return value after call proc_put_char() in __do_proc_doulongvec_minmax()

Need to check the return value of proc_put_char(), as was done in
__do_proc_doulongvec_minmax().

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/kexec.c: use vscnprintf() instead of vsnprintf() in vmcoreinfo_append_str()
Chen Gang [Tue, 5 Nov 2013 05:57:36 +0000 (16:57 +1100)]
kernel/kexec.c: use vscnprintf() instead of vsnprintf() in vmcoreinfo_append_str()

vsnprintf() may let 'r' larger than sizeof(buf), in this case, if 'r' is
also less than "vmcoreinfo_max_size - vmcoreinfo_size" (left size of
destination buffer), next memcpy() will read the unexpected addresses.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoexec/ptrace: fix get_dumpable() incorrect tests
Kees Cook [Tue, 5 Nov 2013 05:57:35 +0000 (16:57 +1100)]
exec/ptrace: fix get_dumpable() incorrect tests

The get_dumpable() return value is not boolean.  Most users of the
function actually want to be testing for non-SUID_DUMP_USER(1) rather than
SUID_DUMP_DISABLE(0).  The SUID_DUMP_ROOT(2) is also considered a
protected state.  Almost all places did this correctly, excepting the two
places fixed in this patch.

Wrong logic:
    if (dumpable == SUID_DUMP_DISABLE) { /* be protective */ }
        or
    if (dumpable == 0) { /* be protective */ }
        or
    if (!dumpable) { /* be protective */ }

Correct logic:
    if (dumpable != SUID_DUMP_USER) { /* be protective */ }
        or
    if (dumpable != 1) { /* be protective */ }

Without this patch, if the system had set the sysctl fs/suid_dumpable=2, a
user was able to ptrace attach to processes that had dropped privileges to
that user.  (This may have been partially mitigated if Yama was enabled.)

The macros have been moved into the file that declares get/set_dumpable(),
which means things like the ia64 code can see them too.

CVE-2013-2929

Reported-by: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoprocfs: clean up proc_reg_get_unmapped_area for 80-column limit
HATAYAMA Daisuke [Tue, 5 Nov 2013 05:57:35 +0000 (16:57 +1100)]
procfs: clean up proc_reg_get_unmapped_area for 80-column limit

Clean up proc_reg_get_unmapped_area due to its 80-column limit
violation.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoDocumentation/ABI: document the non-ABI status of Kconfig and symbols
Josh Triplett [Tue, 5 Nov 2013 05:57:34 +0000 (16:57 +1100)]
Documentation/ABI: document the non-ABI status of Kconfig and symbols

Discussion at Kernel Summit made it clear that the presence or absence of
specific Kconfig symbols are not considered ABI, and that no userspace (or
bootloader, etc) should rely on them.

In addition, kernel-internal symbols are well established as non-ABI, per
Documentation/stable_api_nonsense.txt.

Document both of these in Documentation/ABI/README, in a new section for
notable bits of non-ABI.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Cc: Rob Landley <rob@landley.net>
Cc: Tao Ma <boyu.mt@taobao.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel-doc: improve "no structured comments found" error
Johannes Berg [Tue, 5 Nov 2013 05:57:33 +0000 (16:57 +1100)]
kernel-doc: improve "no structured comments found" error

When using '!Ffile function' in a docbook template, and the function no
longer exists, you get a "no structured comments found" error from the
kernel-doc processing script.  It's useful to know which functions it was
looking for, so print them out in this case.  Also do the same for '!Pfile
doc-section'

The same error also happens when using '!Efile' when some exported
functions aren't documented (in the same file.) There's a very large
number of such functions though, so don't print the message in this case
-- right now it would give ~850 messages.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Cc: Rob Landley <rob@landley.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoDocumentation/trace/tracepoints.txt: add links to TRACE_EVENT documentation
Stefan Raspl [Tue, 5 Nov 2013 05:57:33 +0000 (16:57 +1100)]
Documentation/trace/tracepoints.txt: add links to TRACE_EVENT documentation

Existing tracepoint documentation doesn't mention the popular TRACE_EVENT
macro.  Since an excellent series of articles on proper usage already
exists, respective links are added to the existing documentation.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Cc: Rob Landley <rob@landley.net>
Cc: Jiri Kosina <jkosina@suse.cz>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofat: permit to return phy block number by fibmap in fallocated region
Namjae Jeon [Tue, 5 Nov 2013 05:57:32 +0000 (16:57 +1100)]
fat: permit to return phy block number by fibmap in fallocated region

Make the fibmap call the return the proper physical block number for any
offset request in the fallocated range.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofat: fallback to buffered write in case of fallocatded region on direct IO
Namjae Jeon [Tue, 5 Nov 2013 05:57:32 +0000 (16:57 +1100)]
fat: fallback to buffered write in case of fallocatded region on direct IO

For normal cases of direct IO write, trying to seek to location greater
than file size, makes it fall back to buffered write to fill that region.
Similarly, in case for write in Fallocated region, make it fall to
buffered write.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofat: zero out seek range on _fat_get_block
Namjae Jeon [Tue, 5 Nov 2013 05:57:31 +0000 (16:57 +1100)]
fat: zero out seek range on _fat_get_block

For normal buffered write operations, normally if we try to write to an
offset > than file size, it does a cont_expand_zero till that offset.
Now, in case of fallocated regions, since the blocks are already
allocated.  So, make it zero out that buffers for those blocks till the
seek'ed offset.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>