git.karo-electronics.de Git - karo-tx-linux.git/log
10 years ago  rtc: hym8563: include clkout code only if COMMON_CLK active
Heiko Stuebner [Fri, 3 Jan 2014 03:10:23 +0000 (14:10 +1100)]
rtc: hym8563: include clkout code only if COMMON_CLK active

The contents of clk-provider.h, struct clk_hw etc., are only available if
CONFIG_COMMON_CLK is selected.  Therefore IS_ENABLED(CONFIG_COMMON_CLK) is
not sufficient, and real preprocessor conditions are necessary to keep the
code in question from being compiled on non-COMMON_CLK systems.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
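
To illustrate the point of the hym8563 fix: a plain C "if" on a
compile-time constant still has its body parsed and type-checked, so it
cannot mention a type that was never declared, whereas an #ifdef removes
the code before the compiler sees it.  The following is a minimal
userspace sketch, not the driver code; CONFIG_DEMO_COMMON_CLK and every
demo_* name are hypothetical stand-ins.

```c
/*
 * Userspace sketch only. CONFIG_DEMO_COMMON_CLK and the demo_* names are
 * hypothetical; they are not symbols from the hym8563 driver.
 */
#define CONFIG_DEMO_COMMON_CLK 1	/* delete this line to model !COMMON_CLK */

#ifdef CONFIG_DEMO_COMMON_CLK
/* Stands in for struct clk_hw, which is only declared when the option is on. */
struct demo_clk_hw {
	unsigned long rate_hz;
};

/* Clock-output code that depends on the conditionally declared type. */
unsigned long demo_clkout_rate(void)
{
	struct demo_clk_hw hw = { .rate_hz = 32768 };

	return hw.rate_hz;
}
#else
/* Stub used when the clock framework is not available. */
unsigned long demo_clkout_rate(void)
{
	return 0;
}
#endif
```

With the #define removed, struct demo_clk_hw is never declared, yet the
file still builds because the preprocessor drops the code that uses it.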
10 years ago  rtc: add hym8563 rtc-driver
Heiko Stuebner [Fri, 3 Jan 2014 03:10:23 +0000 (14:10 +1100)]
rtc: add hym8563 rtc-driver

The Haoyu Microelectronics HYM8563 provides rtc and alarm functions as
well as a clock output of up to 32kHz.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  dt-bindings: add hym8563 binding
Heiko Stuebner [Fri, 3 Jan 2014 03:10:22 +0000 (14:10 +1100)]
dt-bindings: add hym8563 binding

Add binding documentation for the hym8563 rtc chip.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  drivers/rtc/rtc-vr41xx.c: use devm_*() functions
Jingoo Han [Fri, 3 Jan 2014 03:10:22 +0000 (14:10 +1100)]
drivers/rtc/rtc-vr41xx.c: use devm_*() functions

Use devm_*() functions to make cleanup paths simpler, and remove
unnecessary remove().

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
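
The devm_*() conversions in this series rely on device-managed
resources: each resource registers a release action with the device,
and the actions run automatically, in reverse order, on detach or on a
failed probe.  That is why an explicit remove() can become unnecessary.
The following is a toy userspace model of that idea under stated
assumptions; it is not the kernel's devres implementation, and all
demo_* names are hypothetical.

```c
#include <stddef.h>

#define DEMO_MAX_ACTIONS 8

typedef void (*demo_release_fn)(void *data);

/* Toy "device" that remembers how to undo what probe acquired. */
struct demo_device {
	struct {
		demo_release_fn fn;
		void *data;
	} actions[DEMO_MAX_ACTIONS];
	size_t nr_actions;
};

/* Record a release action; returns 0 on success, -1 if the table is full. */
int demo_devm_add_action(struct demo_device *dev, demo_release_fn fn, void *data)
{
	if (dev->nr_actions == DEMO_MAX_ACTIONS)
		return -1;
	dev->actions[dev->nr_actions].fn = fn;
	dev->actions[dev->nr_actions].data = data;
	dev->nr_actions++;
	return 0;
}

/* Run every recorded action, newest first, as on detach or failed probe. */
void demo_devm_release_all(struct demo_device *dev)
{
	while (dev->nr_actions > 0) {
		dev->nr_actions--;
		dev->actions[dev->nr_actions].fn(dev->actions[dev->nr_actions].data);
	}
}

/* Example release action: count how many resources were released. */
void demo_count_release(void *data)
{
	(*(int *)data)++;
}
```

A probe routine in this model only registers actions; no per-driver
cleanup path has to mirror it.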
10 years ago  drivers/rtc/rtc-twl.c: use devm_*() functions
Jingoo Han [Fri, 3 Jan 2014 03:10:22 +0000 (14:10 +1100)]
drivers/rtc/rtc-twl.c: use devm_*() functions

Use devm_*() functions to make cleanup paths simpler, and remove
unnecessary remove().

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  drivers/rtc/rtc-ds1742.c: add devicetree support
Alexander Shiyan [Fri, 3 Jan 2014 03:10:22 +0000 (14:10 +1100)]
drivers/rtc/rtc-ds1742.c: add devicetree support

This patch allows the driver to be enabled with devicetree.

Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  drivers/rtc/rtc-mxc.c: check the return value from clk_prepare_enable()
Fabio Estevam [Fri, 3 Jan 2014 03:10:21 +0000 (14:10 +1100)]
drivers/rtc/rtc-mxc.c: check the return value from clk_prepare_enable()

clk_prepare_enable() may fail, so let's check its return value and
propagate it in the case of error.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  drivers/rtc/rtc-mxc.c: remove unneeded label
Fabio Estevam [Fri, 3 Jan 2014 03:10:21 +0000 (14:10 +1100)]
drivers/rtc/rtc-mxc.c: remove unneeded label

There is no need to jump to the 'exit_free_pdata' label when
devm_clk_get() fails, as we can directly return the error and simplify the
code a bit.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  drivers/rtc/rtc-ds1305.c: remove unnecessary spi_set_drvdata()
Jingoo Han [Fri, 3 Jan 2014 03:10:21 +0000 (14:10 +1100)]
drivers/rtc/rtc-ds1305.c: remove unnecessary spi_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, so there is no need to clear the device driver data
manually.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  drivers/rtc/rtc-as3722: use devm for rtc and irq registration
Laxman Dewangan [Fri, 3 Jan 2014 03:10:21 +0000 (14:10 +1100)]
drivers/rtc/rtc-as3722: use devm for rtc and irq registration

Use devm_* calls for rtc and irq registration and get rid of
remove callback for platform driver.

Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  autofs: fix symlinks aren't checked for expiry
Ian Kent [Fri, 3 Jan 2014 03:10:20 +0000 (14:10 +1100)]
autofs: fix symlinks aren't checked for expiry

The autofs4 module doesn't consider symlinks for expire as it did in the
older autofs v3 module (so it's actually a long standing regression).

The user space daemon has focused on the use of bind mounts instead of
symlinks for a long time now and that's why this has not been noticed.
But with the future addition of amd map parsing to automount(8), not to
mention amd itself (of am-utils), symlink expiry will be needed.

The direct and offset mount types can't be symlinks and the tree mounts of
version 4 were always real mounts, so only indirect mounts need to expire
symlinks.

Since the current users of the autofs4 module haven't reported this as a
problem to date this patch probably isn't a candidate for backport to
stable.

Signed-off-by: Ian Kent <ikent@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  autofs: use IS_ROOT to replace root dentry checks
Rui Xiang [Fri, 3 Jan 2014 03:10:20 +0000 (14:10 +1100)]
autofs: use IS_ROOT to replace root dentry checks

Use the helper macro !IS_ROOT to replace parent != dentry->d_parent.  This
is purely a cleanup.

Signed-off-by: Rui Xiang <rui.xiang@huawei.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
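
For readers unfamiliar with the helper used in the IS_ROOT cleanup: a
root dentry is its own parent, so "is this the root?" reduces to a
self-comparison.  The following is a small userspace sketch with a
hypothetical struct demo_dentry; only the shape of the macro mirrors the
kernel's IS_ROOT().

```c
#include <stdbool.h>

/* Hypothetical, stripped-down stand-in for struct dentry. */
struct demo_dentry {
	struct demo_dentry *d_parent;
};

/* Same shape as the kernel helper: a root dentry points at itself. */
#define DEMO_IS_ROOT(x) ((x) == (x)->d_parent)

/* Reads more directly than an open-coded pointer comparison. */
bool demo_has_parent(struct demo_dentry *dentry)
{
	return !DEMO_IS_ROOT(dentry);
}
```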
10 years ago  autofs: fix the return value of autofs4_fill_super
Rui Xiang [Fri, 3 Jan 2014 03:10:20 +0000 (14:10 +1100)]
autofs: fix the return value of autofs4_fill_super

When the kzalloc() of sbi or ino fails, autofs4_fill_super() should return
-ENOMEM.

It should also return the error value from autofs_prepare_pipe().

Signed-off-by: Rui Xiang <rui.xiang@huawei.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
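
The error-handling pattern described in the autofs4_fill_super fix can
be sketched in userspace as follows.  This is an illustration under
stated assumptions, not the autofs code: demo_prepare_pipe() is a
hypothetical stand-in for autofs_prepare_pipe(), and its -EBADF return
is invented for the example.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, much-reduced stand-in for the autofs superblock info. */
struct demo_sbi {
	int pipefd;
};

/* Hypothetical helper: returns 0 or a negative errno value. */
int demo_prepare_pipe(int pipefd)
{
	return pipefd < 0 ? -EBADF : 0;
}

int demo_fill_super(int pipefd, struct demo_sbi **out)
{
	struct demo_sbi *sbi;
	int ret;

	sbi = calloc(1, sizeof(*sbi));
	if (!sbi)
		return -ENOMEM;	/* allocation failure gets its own code */

	ret = demo_prepare_pipe(pipefd);
	if (ret < 0) {
		free(sbi);
		return ret;	/* pass the helper's error through unchanged */
	}

	sbi->pipefd = pipefd;
	*out = sbi;
	return 0;
}
```

The caller can then tell an out-of-memory condition apart from a bad
pipe instead of seeing one generic error for both.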
10 years ago  autofs4: translate pids to the right namespace for the daemon
Miklos Szeredi [Fri, 3 Jan 2014 03:10:20 +0000 (14:10 +1100)]
autofs4: translate pids to the right namespace for the daemon

The PID and the TGID of the process triggering the mount are sent to the
daemon.  Currently the global pid values are sent (ones valid in the
initial pid namespace) but this is wrong if the autofs daemon itself is
not running in the initial pid namespace.

So send the pid values that are valid in the namespace of the autofs daemon.

The namespace to use is taken from the oz_pgrp pid pointer, which was set
at mount time to the mounting process' pid namespace.

If the pid translation fails (the triggering process is in an unrelated
pid namespace) then the automount fails with ENOENT.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Acked-by: Ian Kent <raven@themaw.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  autofs4: allow autofs to work outside the initial PID namespace
Sukadev Bhattiprolu [Fri, 3 Jan 2014 03:10:20 +0000 (14:10 +1100)]
autofs4: allow autofs to work outside the initial PID namespace

Enable autofs4 to work in a "container".  oz_pgrp is converted from pid_t
to struct pid and this is stored at mount time based on the "pgrp=" option
or if the option is missing then the current pgrp.

The "pgrp=" option is interpreted in the PID namespace of the current
process.  This option is flawed in that it doesn't carry the namespace
information, so it should be deprecated.  AFAICS the autofs daemon always
sends the current pgrp, which is the default anyway.

The oz_pgrp is also set from the AUTOFS_DEV_IOCTL_SETPIPEFD_CMD ioctl.
This ioctl sets oz_pgrp to the current pgrp.  It is not allowed to change
the pid namespace.

oz_pgrp is used mainly to determine whether the process traversing the
autofs mount tree is the autofs daemon itself or not.  This function now
compares the pid pointers instead of the pid_t values.

One other use of oz_pgrp is in autofs4_show_options.  There it shows the
virtual pid number (i.e. the one that is valid inside the PID namespace
of the calling process).

For debugging printk()s, convert oz_pgrp to the value in the initial pid
namespace.

Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Acked-by: Ian Kent <raven@themaw.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  init/main.c: remove unused declaration of tc_init()
Geert Uytterhoeven [Fri, 3 Jan 2014 03:10:19 +0000 (14:10 +1100)]
init/main.c: remove unused declaration of tc_init()

Its user was removed in v2.5.2.4.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  fs/ramfs: move ramfs_aops to inode.c
Axel Lin [Fri, 3 Jan 2014 03:10:19 +0000 (14:10 +1100)]
fs/ramfs: move ramfs_aops to inode.c

ramfs_aops is identical in file-mmu.c and file-nommu.c.  Thus move it to
fs/ramfs/inode.c and make it static.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  fs/ramfs/file-nommu.c: make ramfs_nommu_get_unmapped_area() and ramfs_nommu_mmap...
Axel Lin [Fri, 3 Jan 2014 03:10:19 +0000 (14:10 +1100)]
fs/ramfs/file-nommu.c: make ramfs_nommu_get_unmapped_area() and ramfs_nommu_mmap() static

Since commit 853ac43ab194f "shmem: unify regular and tiny shmem",
ramfs_nommu_get_unmapped_area() and ramfs_nommu_mmap() are not directly
referenced outside of file-nommu.c.  Thus make them static.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  binfmt_elf.c: use get_random_int() to fix entropy depleting
Jeff Liu [Fri, 3 Jan 2014 03:10:19 +0000 (14:10 +1100)]
binfmt_elf.c: use get_random_int() to fix entropy depleting

Entropy is quickly depleted under normal operations like ls(1), cat(1),
etc. on kernels from 2.6.30 to current mainline, for instance:

$ cat /proc/sys/kernel/random/entropy_avail
3428
$ cat /proc/sys/kernel/random/entropy_avail
2911
$ cat /proc/sys/kernel/random/entropy_avail
2620

We observed this problem has been occurring since 2.6.30 with
fs/binfmt_elf.c: create_elf_tables()->get_random_bytes(), introduced by
f06295b44c296c8f ("ELF: implement AT_RANDOM for glibc PRNG seeding").

/*
 * Generate 16 random bytes for userspace PRNG seeding.
 */
get_random_bytes(k_rand_bytes, sizeof(k_rand_bytes));

The patch introduces a wrapper around get_random_int() which has lower
overhead than calling get_random_bytes() directly.

With this patch applied:
$ cat /proc/sys/kernel/random/entropy_avail
2731
$ cat /proc/sys/kernel/random/entropy_avail
2802
$ cat /proc/sys/kernel/random/entropy_avail
2878

Analyzed by John Sobecki.

This has been applied on a specific Oracle kernel and has been running on
the customer's production environment (the original bug reporter) for
several months; it has worked fine until now.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <aedilger@gmail.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: John Sobecki <john.sobecki@oracle.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
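
The binfmt_elf.c change describes a wrapper that fills the 16-byte
AT_RANDOM buffer from an int-returning generator instead of calling
get_random_bytes().  A rough userspace sketch of such a wrapper is shown
below.  It is not the patch itself: demo_random_int() is a hypothetical,
deterministic stand-in for get_random_int() and is not random at all.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for get_random_int(); deterministic, NOT random. */
unsigned int demo_random_int(void)
{
	static unsigned int state = 1;

	state = state * 1103515245u + 12345u;
	return state;
}

/* Fill buf with nbytes bytes, consuming one int's worth per call. */
void demo_fill_from_ints(unsigned char *buf, size_t nbytes)
{
	while (nbytes > 0) {
		unsigned int value = demo_random_int();
		size_t chunk = nbytes < sizeof(value) ? nbytes : sizeof(value);

		memcpy(buf, &value, chunk);	/* a final chunk may be partial */
		buf += chunk;
		nbytes -= chunk;
	}
}
```

The loop copies at most sizeof(unsigned int) bytes per iteration, so it
handles buffer sizes that are not a multiple of the int size.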
10 years ago  checkpatch: update the FSF/GPL address check
Joe Perches [Fri, 3 Jan 2014 03:10:18 +0000 (14:10 +1100)]
checkpatch: update the FSF/GPL address check

The FSF address check is a bit too verbose looking for the GPL text.
Quiet it a bit by requiring --strict for the GPL bit.

Also make the address tests match a few uses of abbreviations for street
names and make it case insensitive.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  checkpatch: check for if's with unnecessary parentheses
Joe Perches [Fri, 3 Jan 2014 03:10:18 +0000 (14:10 +1100)]
checkpatch: check for if's with unnecessary parentheses

If statements don't need multiple parentheses around tested comparisons
like "if ((foo == bar))".

An == comparison may be a sign of an intended assignment, so emit a
slightly different message if so.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  checkpatch: improve space before tab --fix option
Joe Perches [Fri, 3 Jan 2014 03:10:18 +0000 (14:10 +1100)]
checkpatch: improve space before tab --fix option

This test should remove all the spaces before a tab, not just one space.

Substitute a tab for each 8-space block before a tab and remove any
remaining run of fewer than 8 spaces before a tab.

This SPACE_BEFORE_TAB test is done after CODE_INDENT.

If there are spaces used at the beginning of a line that should be
converted to tabs, please make sure that the CODE_INDENT test and
conversion is done before this SPACE_BEFORE_TAB test and conversion.

Reported-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
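
The substitution rule from the space-before-tab commit can be expressed
as a small string transform.  checkpatch itself is a Perl script that
works on added patch lines; the C function below is only a sketch of the
rule as worded in the commit message, and demo_fix_space_before_tab() is
a hypothetical name.

```c
#include <stddef.h>

/*
 * Copy src to dst, rewriting every run of spaces that is directly
 * followed by a tab: each full block of 8 spaces becomes one tab and the
 * leftover (fewer than 8) spaces are dropped. dst must have room for
 * strlen(src) + 1 bytes. Returns the length of the result.
 */
size_t demo_fix_space_before_tab(const char *src, char *dst)
{
	char *out = dst;

	while (*src) {
		size_t spaces = 0;

		while (src[spaces] == ' ')
			spaces++;

		if (spaces > 0 && src[spaces] == '\t') {
			/* one tab per full block of 8; the remainder is dropped */
			for (size_t i = 0; i < spaces / 8; i++)
				*out++ = '\t';
			src += spaces;
		} else if (spaces > 0) {
			/* spaces not followed by a tab are kept as they are */
			for (size_t i = 0; i < spaces; i++)
				*out++ = ' ';
			src += spaces;
			continue;
		}
		*out++ = *src++;	/* copy the tab or any other character */
	}
	*out = '\0';
	return (size_t)(out - dst);
}
```

Two spaces before a tab simply disappear, while nine spaces before a tab
turn into one extra tab.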
10 years ago  checkpatch: add a --fix-inplace option
Joe Perches [Fri, 3 Jan 2014 03:10:18 +0000 (14:10 +1100)]
checkpatch: add a --fix-inplace option

Add the ability to fix and overwrite existing files/patches instead of
creating a new file "<filename>.EXPERIMENTAL-checkpatch-fixes".

Suggested-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  checkpatch-add-tests-for-function-pointer-style-misuses-fix
Josh Triplett [Fri, 3 Jan 2014 03:10:18 +0000 (14:10 +1100)]
checkpatch-add-tests-for-function-pointer-style-misuses-fix

Fix an ambiguity in one warning message and a copy/paste problem in
another.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  checkpatch: add tests for function pointer style misuses
Joe Perches [Fri, 3 Jan 2014 03:10:17 +0000 (14:10 +1100)]
checkpatch: add tests for function pointer style misuses

Kernel style uses function pointers in this form:
"type (*funcptr)(args...)"

Emit warnings when this function pointer form isn't used.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
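
As a concrete instance of the "type (*funcptr)(args...)" form named in
the function pointer commit, the sketch below declares and uses a
function pointer in that style.  The demo_* names are hypothetical, and
the misformatted variants in the comment are assumed examples of what
the test targets, not captured checkpatch output.

```c
/* Kernel style: no space after "(*" and none before the argument list. */
struct demo_ops {
	int (*op)(int, int);
};

int demo_add(int a, int b)
{
	return a + b;
}

int demo_apply(const struct demo_ops *ops, int a, int b)
{
	return ops->op(a, b);
}

/*
 * Variants that deviate from the form above (assumed examples):
 *	int (* op)(int, int);
 *	int (*op) (int, int);
 */
```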
10 years ago  checkpatch: attempt to find missing switch/case break;
Joe Perches [Fri, 3 Jan 2014 03:10:17 +0000 (14:10 +1100)]
checkpatch: attempt to find missing switch/case break;

switch case statements missing a break statement are an unfortunately
common error.

e.g.:
commit 4a2c94c9b6c0 ("HID: kye: Add report fixup for Genius Manticore Keyboard")

case blocks should end in a break/return/goto/continue.

If a fall-through is used, it should have a comment showing that it is
intentional.  Ideally that comment should be something like:
"/* fall-through */"

Add a test to look for missing break statements.

This looks only at the context lines before an inserted case so it's
possible to have false positives when the context contains a close brace
and the break is before the brace and not part of the patch context.

Looking at recent patches, this is a pretty rare occurrence.  The normal
kernel style uses a break as the last line of the previous block.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
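
The convention described in the switch/case commit looks like this in
practice: every case block ends in a break (or return/goto/continue),
and the one deliberate fall-through carries a comment.  This is a minimal
sketch; demo_classify() is a hypothetical function written for the
illustration.

```c
int demo_classify(int code)
{
	int score = 0;

	switch (code) {
	case 0:
		score += 1;
		/* fall-through */
	case 1:
		score += 10;
		break;
	case 2:
		score += 100;
		break;
	default:
		break;
	}

	return score;
}
```

Without the comment, a reader of a patch cannot tell whether case 0 was
meant to continue into case 1 or is simply missing its break.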
10 years ago  checkpatch: add warning of future __GFP_NOFAIL use
David Rientjes [Fri, 3 Jan 2014 03:10:17 +0000 (14:10 +1100)]
checkpatch: add warning of future __GFP_NOFAIL use

gfp.h and page_alloc.c already specify that __GFP_NOFAIL is deprecated and
no new users should be added.

Add a warning to checkpatch to catch this.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  checkpatch: warn only on "space before semicolon" at end of line
Joe Perches [Fri, 3 Jan 2014 03:10:17 +0000 (14:10 +1100)]
checkpatch: warn only on "space before semicolon" at end of line

The "space before a non-naked semicolon" test has unwanted output when
used in "for ( ;; )" loops.

Make the test work only on end-of-line statement
termination semicolons.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  checkpatch: more comprehensive split strings warning
Joe Perches [Fri, 3 Jan 2014 03:10:16 +0000 (14:10 +1100)]
checkpatch: more comprehensive split strings warning

The current checkpatch test for split strings does not find several cases
that should be found.

For instance:

  /* Else poor success; go back to mode in "active" table */
  } else {
  IWL_DEBUG_RATE(mvm,
-        "LQ: GOING BACK TO THE OLD TABLE suc=%d cur-tpt=%d old-tpt=%d\n",
+        "GOING BACK TO THE OLD TABLE: SR %d "
+        "cur-tpt %d old-tpt %d\n",
         window->success_ratio,
         window->average_tpt,
        lq_sta->last_tpt);

does not currently emit a warning.

Improve the test to find these cases.

Add more exceptions to reduce false positives for assembly and octal/hex
string constants.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  firmware/dmi_scan: generalize for use by other archs
Ard Biesheuvel [Fri, 3 Jan 2014 03:10:16 +0000 (14:10 +1100)]
firmware/dmi_scan: generalize for use by other archs

This patch makes a couple of changes to the SMBIOS/DMI scanning
code so it can be used on other archs (such as ARM and arm64):
(a) wrap the calls to ioremap()/iounmap(); this allows the use of a
    flavor of ioremap() more suitable for random unaligned access;
(b) allow the non-EFI fallback probe into hardcoded physical address
    0xF0000 to be disabled.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  lib: Add CRC64 ECMA module
Marian Chereji [Fri, 3 Jan 2014 03:10:16 +0000 (14:10 +1100)]
lib: Add CRC64 ECMA module

Add implementation of CRC64 ECMA checksum.

We have an IP Acceleration driver for Freescale network processors which
is using this CRC64.  However, it still needs some work in order for it to
become upstreamable.

Signed-off-by: Marian Chereji <marian.chereji@freescale.com>
Reviewed-by: Varvara Andrei-B21317 <andrei.varvara@freescale.com>
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
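
For orientation, a bit-at-a-time CRC-64 over the ECMA-182 polynomial can
be sketched as below.  The commit message does not state the module's
parameters, so the choices here (zero initial value, MSB-first
processing, no final XOR) are assumptions taken from the plain ECMA-182
definition, and demo_crc64_ecma() is a hypothetical name rather than the
module's API.  A real implementation would normally be table-driven.

```c
#include <stddef.h>
#include <stdint.h>

/* ECMA-182 generator polynomial (x^64 term implicit). */
#define DEMO_CRC64_ECMA_POLY 0x42F0E1EBA9EA3693ULL

/* Assumed parameters: init 0, MSB-first, no reflection, no final XOR. */
uint64_t demo_crc64_ecma(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint64_t crc = 0;

	while (len--) {
		crc ^= (uint64_t)*p++ << 56;	/* feed the next byte at the top */
		for (int bit = 0; bit < 8; bit++) {
			if (crc & 0x8000000000000000ULL)
				crc = (crc << 1) ^ DEMO_CRC64_ECMA_POLY;
			else
				crc <<= 1;
		}
	}
	return crc;
}
```

With these parameters an empty buffer yields 0, and the single byte 0x01
yields the polynomial itself, which makes for an easy sanity check.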
10 years ago  test: fix sparse warnings in user_copy tests
Kees Cook [Fri, 3 Jan 2014 03:10:16 +0000 (14:10 +1100)]
test: fix sparse warnings in user_copy tests

Sparse fix for "test: check copy_to/from_user boundary validation":

To keep sparse happy with the horrible things being done with the user
memory pointers, declare both __user and non-__user cases ahead of time to
avoid needing to do the casts later.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  test: check copy_to/from_user boundary validation
Kees Cook [Fri, 3 Jan 2014 03:10:15 +0000 (14:10 +1100)]
test: check copy_to/from_user boundary validation

To help avoid an architecture failing to correctly check kernel/user
boundaries when handling copy_to_user, copy_from_user, put_user, or
get_user, perform some simple tests and fail to load if any of them behave
unexpectedly.

Specifically, this is to make sure there is a way to notice if fixes like
8404663f81 ("ARM: 7527/1: uaccess: explicitly check __user pointer when
!CPU_USE_DOMAINS") ever regress, for any architecture.

Additionally, add a new "user" selftest target, which loads this module.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  test: add minimal module for verification testing
Kees Cook [Fri, 3 Jan 2014 03:10:15 +0000 (14:10 +1100)]
test: add minimal module for verification testing

This is a pair of test modules I'd like to see in the tree.  I have been
adding various tests that trigger crashes to lkdtm, but these modules don't
make sense there: they either need to be distinctly separate, or their
pass/fail state doesn't need to crash the machine.

These live in lib/ for now, along with a few other in-kernel test modules,
and use the slightly more common "test_" naming convention, instead of
"test-".  We should likely standardize on the former:

$ find . -name 'test_*.c' | grep -v /tools/ | wc -l
4
$ find . -name 'test-*.c' | grep -v /tools/ | wc -l
2

The first is entirely a no-op module, designed to allow simple testing of
the module loading and verification interface.  It's useful to have a
module that has no other uses or dependencies so it can be reliably used
for just testing module loading and verification.

The second is a module that exercises the user memory access functions, in
an effort to make sure that we can quickly catch any regressions in
boundary checking (e.g.  like what was recently fixed on ARM).

This patch (of 2):

When doing module loading verification tests (for example, with module
signing, or LSM hooks), it is very handy to have a module that can be
built on all systems under test, isn't auto-loaded at boot, and has no
device or similar dependencies.  This creates the "test_module.ko" module
for that purpose, which only reports its load and unload to printk.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  lib/cmdline.c: declare exported symbols immediately
Felipe Contreras [Fri, 3 Jan 2014 03:10:15 +0000 (14:10 +1100)]
lib/cmdline.c: declare exported symbols immediately

WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
+EXPORT_SYMBOL(memparse);

WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
+EXPORT_SYMBOL(get_option);

WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
+EXPORT_SYMBOL(get_options);

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Cc: Levente Kurusa <levex@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  lib/cmdline.c: fix style issues
Felipe Contreras [Fri, 3 Jan 2014 03:10:15 +0000 (14:10 +1100)]
lib/cmdline.c: fix style issues

WARNING: space prohibited between function name and open parenthesis '('
+int get_option (char **str, int *pint)

WARNING: space prohibited between function name and open parenthesis '('
+ *pint = simple_strtol (cur, str, 0);

ERROR: trailing whitespace
+ $

WARNING: please, no spaces at the start of a line
+ $

WARNING: space prohibited between function name and open parenthesis '('
+ res = get_option ((char **)&str, ints + i);

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  lib/kstrtox.c: remove redundant cleanup
Felipe Contreras [Fri, 3 Jan 2014 03:10:15 +0000 (14:10 +1100)]
lib/kstrtox.c: remove redundant cleanup

We can't reach the cleanup code unless the flag KSTRTOX_OVERFLOW is not
set, so there's no point in clearing a bit that we know is not set.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Acked-by: Levente Kurusa <levex@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  drivers/video/backlight/backlight.c: remove backlight sysfs uevent
Kyungmin Park [Fri, 3 Jan 2014 03:10:14 +0000 (14:10 +1100)]
drivers/video/backlight/backlight.c: remove backlight sysfs uevent

Most mobile phones have an Ambient Light Sensor, and they change the
backlight brightness according to the measured lux.  This means the
brightness changes frequently through writes to the sysfs node, and each
write generates a uevent.

Usually nothing in user space consumes these backlight change events, yet
each one forks udev worker threads, which takes about 5ms.  The main
problem is that this hurts other process activity, so remove the uevent.

Kay said
"Uevents are for the major, low-frequent, global device state-changes,
 not for carrying-out any sort of measurement data. Subsystems which
 need that should use other facilities like poll()-able sysfs file or
 any other subscription-based, client-tracking interface which does not
 cause overhead if it isn't used. Uevents are not the right thing to
 use here, and upstream udev should not paper-over broken kernel
 subsystems."

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Cc: Henrique de Moraes Holschuh <ibm-acpi@hmh.eng.br>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  backlight: tosa: use devm_lcd_device_register()
Jingoo Han [Fri, 3 Jan 2014 03:10:14 +0000 (14:10 +1100)]
backlight: tosa: use devm_lcd_device_register()

Use devm_lcd_device_register() to make cleanup paths simpler.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  backlight: l4f00242t03: use devm_lcd_device_register()
Jingoo Han [Fri, 3 Jan 2014 03:10:14 +0000 (14:10 +1100)]
backlight: l4f00242t03: use devm_lcd_device_register()

Use devm_lcd_device_register() to make cleanup paths simpler.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  backlight: jornada720: use devm_lcd_device_register()
Jingoo Han [Fri, 3 Jan 2014 03:10:14 +0000 (14:10 +1100)]
backlight: jornada720: use devm_lcd_device_register()

Use devm_lcd_device_register() to make cleanup paths simpler,
and remove unnecessary remove().

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  backlight: tosa: use devm_backlight_device_register()
Jingoo Han [Fri, 3 Jan 2014 03:10:13 +0000 (14:10 +1100)]
backlight: tosa: use devm_backlight_device_register()

Use devm_backlight_device_register() to make cleanup paths simpler.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  backlight: ot200_bl: use devm_backlight_device_register()
Jingoo Han [Fri, 3 Jan 2014 03:10:13 +0000 (14:10 +1100)]
backlight: ot200_bl: use devm_backlight_device_register()

Use devm_backlight_device_register() to make cleanup paths simpler.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  backlight: omap1: use devm_backlight_device_register()
Jingoo Han [Fri, 3 Jan 2014 03:10:13 +0000 (14:10 +1100)]
backlight: omap1: use devm_backlight_device_register()

Use devm_backlight_device_register() to make cleanup paths simpler,
and remove unnecessary remove().

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  backlight: hp680_bl: use devm_backlight_device_register()
Jingoo Han [Fri, 3 Jan 2014 03:10:13 +0000 (14:10 +1100)]
backlight: hp680_bl: use devm_backlight_device_register()

Use devm_backlight_device_register() to make cleanup paths simpler.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  backlight: jornada720: use devm_backlight_device_register()
Jingoo Han [Fri, 3 Jan 2014 03:10:13 +0000 (14:10 +1100)]
backlight: jornada720: use devm_backlight_device_register()

Use devm_backlight_device_register() to make cleanup paths simpler,
and remove unnecessary remove().

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  MAINTAINERS: add an entry for the Macintosh HFSPlus Filesystem
Geert Uytterhoeven [Fri, 3 Jan 2014 03:10:12 +0000 (14:10 +1100)]
MAINTAINERS: add an entry for the Macintosh HFSPlus Filesystem

To make scripts/get_maintainer.pl output something sensible.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago  get_maintainer: add commit author information to --rolestats
Joe Perches [Fri, 3 Jan 2014 03:10:12 +0000 (14:10 +1100)]
get_maintainer: add commit author information to --rolestats

get_maintainer currently uses "Signed-off-by" style lines to find
interested parties to send patches to when the MAINTAINERS file does not
have a specific section entry with a matching file pattern.

Add statistics for commit authors and lines added and deleted to the
information provided by --rolestats.

These statistics are also emitted whenever --rolestats and --git are
selected even when there is a specified maintainer.

This can have the effect of expanding the number of people that are shown
as possible "maintainers" of a particular file because "authors",
"added_lines", and "removed_lines" are also used as criteria for the
--max-maintainers option separate from the "commit_signers".

The first "--git-max-maintainers" values of each criterion
are emitted.  Any "ties" are not shown.

For example: (forcedeth does not have a named maintainer)

Old output:

$ ./scripts/get_maintainer.pl -f drivers/net/ethernet/nvidia/forcedeth.c
"David S. Miller" <davem@davemloft.net> (commit_signer:8/10=80%)
Jiri Pirko <jiri@resnulli.us> (commit_signer:2/10=20%)
Patrick McHardy <kaber@trash.net> (commit_signer:2/10=20%)
Larry Finger <Larry.Finger@lwfinger.net> (commit_signer:1/10=10%)
Peter Zijlstra <peterz@infradead.org> (commit_signer:1/10=10%)
netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
linux-kernel@vger.kernel.org (open list)

New output:

$ ./scripts/get_maintainer.pl -f drivers/net/ethernet/nvidia/forcedeth.c
"David S. Miller" <davem@davemloft.net> (commit_signer:8/10=80%)
Jiri Pirko <jiri@resnulli.us> (commit_signer:2/10=20%,authored:2/10=20%,removed_lines:3/33=9%)
Patrick McHardy <kaber@trash.net> (commit_signer:2/10=20%,authored:2/10=20%,added_lines:12/95=13%,removed_lines:10/33=30%)
Larry Finger <Larry.Finger@lwfinger.net> (commit_signer:1/10=10%,authored:1/10=10%,added_lines:35/95=37%)
Peter Zijlstra <peterz@infradead.org> (commit_signer:1/10=10%)
"Peter Hüwe" <PeterHuewe@gmx.de> (authored:1/10=10%,removed_lines:15/33=45%)
Joe Perches <joe@perches.com> (authored:1/10=10%)
Neil Horman <nhorman@tuxdriver.com> (added_lines:40/95=42%)
Bill Pemberton <wfp5p@virginia.edu> (removed_lines:3/33=9%)
netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
linux-kernel@vger.kernel.org (open list)

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agovsprintf: add %pad extension for dma_addr_t use
Joe Perches [Fri, 3 Jan 2014 03:10:12 +0000 (14:10 +1100)]
vsprintf: add %pad extension for dma_addr_t use

dma_addr_t's can be either u32 or u64 depending on a CONFIG option.

There are a few hundred dma_addr_t's printed via either cast to unsigned
long long, unsigned long or no cast at all.

Add %pad to be able to emit them without the cast.
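
As a rough userspace illustration of the idea (not the vsprintf code
itself): the helper below takes the address by reference and derives the
field width from the type, so call sites need no cast.  The names
dma_addr_sketch_t and format_dma_addr_sketch() are hypothetical, and the
64-bit typedef is an assumption standing in for the CONFIG-dependent
kernel type.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption for this sketch: a 64-bit DMA address type. */
typedef uint64_t dma_addr_sketch_t;

/*
 * Format *addr as zero-padded hex.  The width is two hex digits per
 * byte of the typedef, so it follows the type instead of a cast chosen
 * at every call site.
 */
static int format_dma_addr_sketch(char *buf, size_t len,
				  const dma_addr_sketch_t *addr)
{
	assert(buf != NULL && addr != NULL);
	return snprintf(buf, len, "0x%0*" PRIx64,
			(int)(2 * sizeof(*addr)), (uint64_t)*addr);
}
```

Passing a pointer rather than the value is what lets one format routine
cope with either width of the underlying type.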

Update Documentation/printk-formats.txt too.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: "Shevchenko, Andriy" <andriy.shevchenko@intel.com>
Cc: Rob Landley <rob@landley.net>
Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoprintk-cache-mark-printk_once-test-variable-__read_mostly-fix
Joe Perches [Fri, 3 Jan 2014 03:10:12 +0000 (14:10 +1100)]
printk-cache-mark-printk_once-test-variable-__read_mostly-fix

fix ia64 build

Cc: James Hogan <james.hogan@imgtec.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoprintk/cache: Mark printk_once test variable __read_mostly
Joe Perches [Fri, 3 Jan 2014 03:10:11 +0000 (14:10 +1100)]
printk/cache: Mark printk_once test variable __read_mostly

Add #include <linux/cache.h> to define __read_mostly.

Convert cache.h to use uapi/linux/kernel.h instead
of linux/kernel.h to avoid recursive #includes.

Convert the ALIGN macro to __ALIGN_KERNEL.

printk_once only sets the bool variable tested
once so mark it __read_mostly.

Neaten the alignment so it matches the rest of the
pr_<level>_once #defines too.
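
A minimal userspace sketch of the print-once pattern, for illustration
only.  PRINT_ONCE_SKETCH, once_emitted and report_sketch() are
hypothetical names; in the kernel the flag would additionally carry the
__read_mostly annotation, which has no portable userspace equivalent and
is therefore omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int once_emitted;	/* how many times the body actually ran */

/*
 * The flag is written once and then only read on every later call,
 * which is why a read-mostly placement suits it.
 */
#define PRINT_ONCE_SKETCH(...)				\
	do {						\
		static bool already_done;		\
		if (!already_done) {			\
			already_done = true;		\
			once_emitted++;			\
			printf(__VA_ARGS__);		\
		}					\
	} while (0)

static void report_sketch(int value)
{
	PRINT_ONCE_SKETCH("first value seen: %d\n", value);
	/* By the time we return, the body has run exactly once. */
	assert(once_emitted == 1);
}
```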

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodynamic-debug-howto.txt: update since new wildcard support
Du, Changbin [Fri, 3 Jan 2014 03:10:11 +0000 (14:10 +1100)]
dynamic-debug-howto.txt: update since new wildcard support

Document how to use the new wildcard support.

Signed-off-by: Du, Changbin <changbin.du@gmail.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodynamic_debug: add wildcard support to filter files/functions/modules
Du, Changbin [Fri, 3 Jan 2014 03:10:11 +0000 (14:10 +1100)]
dynamic_debug: add wildcard support to filter files/functions/modules

Add wildcard '*' (matches zero or more characters) and '?' (matches one
character) support when querying debug flags.

Debug messages can now be enabled by pattern, e.g.:
1. enable debug logs in all usb drivers
    echo "file drivers/usb/* +p" > <debugfs>/dynamic_debug/control
2. enable debug logs for usb xhci code
    echo "file *xhci* +p" > <debugfs>/dynamic_debug/control

Signed-off-by: Du, Changbin <changbin.du@gmail.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agolib/parser.c: put EXPORT_SYMBOLs in the conventional place
Andrew Morton [Fri, 3 Jan 2014 03:10:11 +0000 (14:10 +1100)]
lib/parser.c: put EXPORT_SYMBOLs in the conventional place

Cc: Du, Changbin <changbin.du@gmail.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agolib/parser.c: add match_wildcard function
Du, Changbin [Fri, 3 Jan 2014 03:10:10 +0000 (14:10 +1100)]
lib/parser.c: add match_wildcard function

The match_wildcard() function is a simple implementation of a wildcard
matching algorithm.  It supports only the two usual wildcards:
    '*' - matches zero or more characters
    '?' - matches one character
This algorithm is safe since it is non-recursive.
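
One way such a non-recursive matcher can be written is sketched below
for userspace.  This is an illustration of the technique, not the code
added to lib/parser.c; wildcard_match_sketch() is a hypothetical name.
On a mismatch it backtracks to the most recent '*' instead of recursing,
so stack usage stays constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * '*' matches zero or more characters, '?' matches exactly one.
 * Returns true if the whole of str matches the whole of pat.
 */
static bool wildcard_match_sketch(const char *pat, const char *str)
{
	const char *star = NULL;	/* pattern position after the last '*' */
	const char *retry = NULL;	/* string position that '*' has consumed up to */

	assert(pat != NULL && str != NULL);

	while (*str) {
		if (*pat == '*') {
			/* Remember where to resume if a later literal fails. */
			star = ++pat;
			retry = str;
		} else if (*pat == '?' || *pat == *str) {
			pat++;
			str++;
		} else if (star) {
			/* Let the last '*' swallow one more character. */
			pat = star;
			str = ++retry;
		} else {
			return false;
		}
	}

	/* Trailing '*' may match the empty remainder. */
	while (*pat == '*')
		pat++;
	return *pat == '\0';
}
```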

Signed-off-by: Du, Changbin <changbin.du@gmail.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/misc/ti-st/st_core.c: fix NULL dereference on protocol type check
Gustavo Padovan [Fri, 3 Jan 2014 03:10:10 +0000 (14:10 +1100)]
drivers/misc/ti-st/st_core.c: fix NULL dereference on protocol type check

If the type we receive is greater than ST_MAX_CHANNELS, we cannot use it
as an array index, since doing so would access unknown memory.
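
The general shape of such a guard, sketched for userspace with
hypothetical names (MAX_CHANNELS_SKETCH, handlers_sketch and friends are
not from st_core.c): validate the received type before it is ever used
as an index.  The sketch rejects an index equal to the table size as
well, since that is also out of range.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHANNELS_SKETCH 16	/* hypothetical table size */

typedef void (*recv_fn_sketch)(const unsigned char *data, size_t len);

static recv_fn_sketch handlers_sketch[MAX_CHANNELS_SKETCH];

static void noop_recv_sketch(const unsigned char *data, size_t len)
{
	(void)data;
	(void)len;
}

static void register_handler_sketch(unsigned int type, recv_fn_sketch fn)
{
	assert(type < MAX_CHANNELS_SKETCH);
	handlers_sketch[type] = fn;
}

/* Returns NULL for an out-of-range or unregistered type. */
static recv_fn_sketch lookup_handler_sketch(unsigned int type)
{
	if (type >= MAX_CHANNELS_SKETCH)
		return NULL;	/* never index with an unchecked value */
	return handlers_sketch[type];
}
```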

 Unable to handle kernel NULL pointer dereference at virtual address 0000001b
 pgd = c0004000
 [0000001b] *pgd=00000000
 Internal error: Oops: 17 [#1] PREEMPT SMP ARM
 Modules linked in: btwilink wl12xx wlcore mac80211 cfg80211 rfcomm bnep bluo
 CPU: 0    Tainted: G        W     (3.4.0+ #15)
 PC is at st_int_recv+0x278/0x344
 LR is at get_parent_ip+0x14/0x30
 pc : [<c03b01a8>]    lr : [<c007273c>]    psr: 200f0193
 sp : dc631ed0  ip : e3e21c24  fp : dc631f04
 r10: 00000000  r9 : 600f0113  r8 : 0000003f
 r7 : e3e21b14  r6 : 00000067  r5 : e2e49c1c  r4 : e3e21a80
 r3 : 00000001  r2 : 00000001  r1 : 00000001  r0 : 600f0113
 Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
 Control: 10c5387d  Table: 9c50004a  DAC: 00000015

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agostack protector: provide -fstack-protector-strong build option
Kees Cook [Fri, 3 Jan 2014 03:10:10 +0000 (14:10 +1100)]
stack protector: provide -fstack-protector-strong build option

This changes the stack protector config option into a choice of "None",
"Regular", and "Strong".  For "Strong", the kernel is built with
-fstack-protector-strong (gcc 4.9 and later).  This option increases the
coverage of the stack protector without the heavy performance hit of
-fstack-protector-all.

For reference, the stack protector options available in gcc are:

-fstack-protector-all:
Adds the stack-canary saving prefix and stack-canary checking suffix to
_all_ function entry and exit. Results in substantial use of stack space
for saving the canary for deep stack users (e.g. historically xfs), and
measurable (though shockingly still low) performance hit due to all the
saving/checking. Really not suitable for sane systems, and was entirely
removed as an option from the kernel many years ago.

-fstack-protector:
Adds the canary save/check to functions that define an 8
(--param=ssp-buffer-size=N, N=8 by default) or more byte local char
array. Traditionally, stack overflows happened with string-based
manipulations, so this was a way to find those functions. Very few
total functions actually get the canary; no measurable performance or
size overhead.

-fstack-protector-strong
Adds the canary for a wider set of functions, since it's not just those
with strings that have ultimately been vulnerable to stack-busting.  With
this superset, more functions end up with a canary, but it still remains
small compared to all functions with no measurable change in performance.
Based on the original design document, a function gets the canary when it
contains any of:

- local variable's address used as part of the RHS of an assignment or
  function argument
- local variable is an array (or union containing an array), regardless
  of array type or length
- uses register local variables
https://docs.google.com/a/google.com/document/d/1xXBH6rRZue4f296vGt9YQcuLVQHeE516stHwt8M9xyU
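
To make the criteria above concrete, here is a small set of hypothetical
functions annotated with which option would be expected to instrument
them, going only by the rules quoted in this changelog.  Whether a
canary is really emitted depends on the compiler version and on
optimization (inlining can remove the locals entirely), so treat the
comments as expectations, not guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 16-byte local char array: meets the default 8-byte threshold, so
 * plain -fstack-protector would be expected to cover it. */
static size_t copy_name_sketch(const char *src)
{
	char name[16];

	assert(src != NULL);
	strncpy(name, src, sizeof(name) - 1);
	name[sizeof(name) - 1] = '\0';
	return strlen(name);
}

/* Local array that is not a char array: only the "array of any type or
 * length" rule of -fstack-protector-strong applies. */
static int sum_pair_sketch(int a, int b)
{
	int pair[2] = { a, b };

	return pair[0] + pair[1];
}

static void store_sketch(int *out, int v)
{
	*out = v;
}

/* Address of a local passed as a function argument: covered by the
 * -strong criteria only. */
static int via_pointer_sketch(int v)
{
	int local = 0;

	store_sketch(&local, v);
	return local;
}

/* No arrays and no address taken: none of the listed criteria match,
 * so only -fstack-protector-all would instrument this one. */
static int add_sketch(int a, int b)
{
	return a + b;
}
```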

Comparison of "size" and "objdump" output when built with gcc-4.9 in
three configurations:
- defconfig
    11430641 text size
    36110 function bodies
- defconfig + CONFIG_CC_STACKPROTECTOR
    11468490 text size (+0.33%)
    1015 of 36110 functions stack-protected (2.81%)
- defconfig + CONFIG_CC_STACKPROTECTOR_STRONG via this patch
    11692790 text size (+2.24%)
    7401 of 36110 functions stack-protected (20.5%)

With -strong, ARM's compressed boot code now triggers stack protection, so
a static guard was added.  Since this is only used during decompression
and was never used before, the exposure here is very small.  Once it
switches to the full kernel, the stack guard is back to normal.

Chrome OS has been using -fstack-protector-strong for its kernel builds
for the last 8 months with no problems.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agostack protector: create HAVE_CC_STACKPROTECTOR for centralized use
Kees Cook [Fri, 3 Jan 2014 03:10:10 +0000 (14:10 +1100)]
stack protector: create HAVE_CC_STACKPROTECTOR for centralized use

Instead of duplicating the CC_STACKPROTECTOR Kconfig and Makefile logic in
each architecture, switch to using HAVE_CC_STACKPROTECTOR and keep
everything in one place.  This retains the x86-specific bug verification
scripts.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/smp.c: remove cpumask_ipi
Roman Gushchin [Fri, 3 Jan 2014 03:10:10 +0000 (14:10 +1100)]
kernel/smp.c: remove cpumask_ipi

After 9a46ad6 ("smp: make smp_call_function_many() use logic similar to
smp_call_function_single()"), cfd->cpumask is accessed only in
smp_call_function_many().  So there is no more need to copy it into
cfd->cpumask_ipi before putting csd into the list.  The cpumask_ipi field
is obsolete and can be removed.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Wang YanQing <udknight@gmail.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoremove extra definitions of U32_MAX
Alex Elder [Fri, 3 Jan 2014 03:10:09 +0000 (14:10 +1100)]
remove extra definitions of U32_MAX

Now that the definition is centralized in <linux/kernel.h>, the
definitions of U32_MAX (and related) elsewhere in the kernel can be
removed.

Signed-off-by: Alex Elder <elder@linaro.org>
Acked-by: Sage Weil <sage@inktank.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel.h: define u8, s8, u32, etc. limits
Alex Elder [Fri, 3 Jan 2014 03:10:09 +0000 (14:10 +1100)]
kernel.h: define u8, s8, u32, etc. limits

Create constants that define the maximum and minimum values
representable by the kernel types u8, s8, u16, s16, and so on.
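
One way such limits can be derived from the types themselves is sketched
below for userspace.  The *_SKETCH names and the stdint-based typedefs
are assumptions made so the example stands alone; they are not claimed
to be the exact definitions this patch adds.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  u8;
typedef int8_t   s8;
typedef uint32_t u32;
typedef int32_t  s32;
typedef uint64_t u64;
typedef int64_t  s64;

/* All-ones of the unsigned type is its maximum; shifting that right by
 * one gives the signed maximum; the signed minimum is -max - 1. */
#define U8_MAX_SKETCH	((u8)~0U)
#define S8_MAX_SKETCH	((s8)(U8_MAX_SKETCH >> 1))
#define S8_MIN_SKETCH	((s8)(-S8_MAX_SKETCH - 1))
#define U32_MAX_SKETCH	((u32)~0U)
#define S32_MAX_SKETCH	((s32)(U32_MAX_SKETCH >> 1))
#define S32_MIN_SKETCH	((s32)(-S32_MAX_SKETCH - 1))
#define U64_MAX_SKETCH	((u64)~0ULL)
#define S64_MAX_SKETCH	((s64)(U64_MAX_SKETCH >> 1))
#define S64_MIN_SKETCH	((s64)(-S64_MAX_SKETCH - 1))

/* Cross-check against the C library's own limits at compile time. */
static_assert(U8_MAX_SKETCH == UINT8_MAX, "u8 max");
static_assert(S8_MIN_SKETCH == INT8_MIN, "s8 min");
static_assert(U32_MAX_SKETCH == UINT32_MAX, "u32 max");
static_assert(S32_MIN_SKETCH == INT32_MIN, "s32 min");
static_assert(S64_MAX_SKETCH == INT64_MAX, "s64 max");
```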

Signed-off-by: Alex Elder <elder@linaro.org>
Cc: Sage Weil <sage@inktank.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoconditionally define U32_MAX
Alex Elder [Fri, 3 Jan 2014 03:10:09 +0000 (14:10 +1100)]
conditionally define U32_MAX

The symbol U32_MAX is defined in several spots.  Change these definitions
to be conditional.  This is in preparation for the next patch, which
centralizes the definition in <linux/kernel.h>.

Signed-off-by: Alex Elder <elder@linaro.org>
Cc: Sage Weil <sage@inktank.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoum: use generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:09 +0000 (14:10 +1100)]
um: use generic fixmap.h

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Richard Weinberger <richard@nod.at>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agotile: use generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:08 +0000 (14:10 +1100)]
tile: use generic fixmap.h

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosh: use generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:08 +0000 (14:10 +1100)]
sh: use generic fixmap.h

Signed-off-by: Mark Salter <msalter@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agopowerpc: use generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:08 +0000 (14:10 +1100)]
powerpc: use generic fixmap.h

Signed-off-by: Mark Salter <msalter@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomips: use generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:08 +0000 (14:10 +1100)]
mips: use generic fixmap.h

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomicroblaze: use generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:07 +0000 (14:10 +1100)]
microblaze: use generic fixmap.h

Signed-off-by: Mark Salter <msalter@redhat.com>
Tested-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agometag: use generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:07 +0000 (14:10 +1100)]
metag: use generic fixmap.h

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohexagon: use generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:07 +0000 (14:10 +1100)]
hexagon: use generic fixmap.h

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoarm: use generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:07 +0000 (14:10 +1100)]
arm: use generic fixmap.h

ARM is different from other architectures in that fixmap pages are indexed
with a positive offset from FIXADDR_START.  Other architectures index with
a negative offset from FIXADDR_TOP.  In order to use the generic fixmap.h
definitions, this patch redefines FIXADDR_TOP to be inclusive of the
useable range.  That is, FIXADDR_TOP is the virtual address of the topmost
fixed page.  The newly defined FIXADDR_END is the first virtual address
past the fixed mappings.
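
The negative-offset indexing described above can be sketched as follows.
All constants and *_sketch names are hypothetical values picked for the
example, not ARM's real layout; the point is only that index 0 maps to
FIXADDR_TOP itself (inclusive) and that FIXADDR_END lies one page above
it.

```c
#include <assert.h>

#define PAGE_SHIFT_SKETCH	12
#define PAGE_SIZE_SKETCH	(1UL << PAGE_SHIFT_SKETCH)
#define PAGE_MASK_SKETCH	(~(PAGE_SIZE_SKETCH - 1))

/* Hypothetical address of the topmost fixed page. */
#define FIXADDR_TOP_SKETCH	0xfffe0000UL
/* First address past the fixed mappings. */
#define FIXADDR_END_SKETCH	(FIXADDR_TOP_SKETCH + PAGE_SIZE_SKETCH)
#define NR_FIX_SKETCH		16U

/* Index -> virtual address: a negative offset from the top. */
static unsigned long fix_to_virt_sketch(unsigned int idx)
{
	assert(idx < NR_FIX_SKETCH);
	return FIXADDR_TOP_SKETCH - ((unsigned long)idx << PAGE_SHIFT_SKETCH);
}

/* Virtual address -> index: the inverse of the above. */
static unsigned int virt_to_fix_sketch(unsigned long vaddr)
{
	assert(vaddr < FIXADDR_END_SKETCH);
	return (unsigned int)((FIXADDR_TOP_SKETCH -
			       (vaddr & PAGE_MASK_SKETCH)) >> PAGE_SHIFT_SKETCH);
}
```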

Signed-off-by: Mark Salter <msalter@redhat.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agox86: use generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:07 +0000 (14:10 +1100)]
x86: use generic fixmap.h

Signed-off-by: Mark Salter <msalter@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoadd generic fixmap.h
Mark Salter [Fri, 3 Jan 2014 03:10:06 +0000 (14:10 +1100)]
add generic fixmap.h

Many architectures provide an asm/fixmap.h which defines support for
compile-time 'special' virtual mappings which need to be made before
paging_init() has run.  This support is also used for early ioremap on
x86.  Much of this support is identical across the architectures.  This
patch consolidates all of the common bits into asm-generic/fixmap.h which
is intended to be included from arch/*/include/asm/fixmap.h.

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jonas Bonn <jonas.bonn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agologfs: check for the return value after calling find_or_create_page()
Younger Liu [Fri, 3 Jan 2014 03:10:06 +0000 (14:10 +1100)]
logfs: check for the return value after calling find_or_create_page()

In get_mapping_page(), after calling find_or_create_page(), the return
value should be checked.

This patch was posted earlier at
http://www.spinics.net/lists/linux-fsdevel/msg66948.html but has not
been applied yet.

Signed-off-by: Younger Liu <liuyiyang@hisense.com>
Cc: Younger Liu <younger.liucn@gmail.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Reviewed-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: Jörn Engel <joern@logfs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/block/Kconfig: update RAM block device module name
Fabian Frederick [Fri, 3 Jan 2014 03:10:06 +0000 (14:10 +1100)]
drivers/block/Kconfig: update RAM block device module name

RAM block device support module name changed to brd.ko some years ago with
an "rd" alias to match previous module implementation.  This patch updates
its Kconfig definition.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/mailbox/omap: make mbox->irq signed for error handling
Dan Carpenter [Fri, 3 Jan 2014 03:10:06 +0000 (14:10 +1100)]
drivers/mailbox/omap: make mbox->irq signed for error handling

There is a bug in omap2_mbox_probe() where we do:

mbox->irq = platform_get_irq(pdev, info->irq_id);
if (mbox->irq < 0) {

The problem is that mbox->irq is unsigned so the error handling doesn't
work.  I've changed it to a signed integer.
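
The failure mode is easy to reproduce in userspace.  In the sketch
below, get_irq_sketch() is a hypothetical stand-in for
platform_get_irq() and -6 merely mimics a negative errno value; the
struct names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a lookup that returns a negative error code on failure. */
static int get_irq_sketch(bool fail)
{
	return fail ? -6 : 42;
}

struct mbox_unsigned_sketch { unsigned int irq; };
struct mbox_signed_sketch { int irq; };

/* Buggy variant: the comparison can never be true. */
static bool error_seen_unsigned_sketch(bool fail)
{
	struct mbox_unsigned_sketch mbox;

	mbox.irq = get_irq_sketch(fail);
	/* The error code has wrapped around to a large positive value. */
	assert(!fail || mbox.irq > 0);
	return mbox.irq < 0;
}

/* Fixed variant: a signed field keeps the error code negative. */
static bool error_seen_signed_sketch(bool fail)
{
	struct mbox_signed_sketch mbox;

	mbox.irq = get_irq_sketch(fail);
	return mbox.irq < 0;
}
```

Compilers can flag the buggy comparison (gcc does so with -Wtype-limits,
which -Wextra enables).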

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Suman Anna <s-anna@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Omar Ramirez Luna <omar.ramirez@copitl.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoasm/types.h: Remove include/asm-generic/int-l64.h
Geert Uytterhoeven [Fri, 3 Jan 2014 03:10:05 +0000 (14:10 +1100)]
asm/types.h: Remove include/asm-generic/int-l64.h

Now that all 64-bit architectures have been converted to int-ll64.h, we can
remove int-l64.h in kernelspace.

For backwards compatibility, alpha, ia64, mips64, and powerpc64 still use
int-l64.h in userspace.

This is the (reworked for UAPI) non-documentation part of the more than
two-year-old "asm/types.h: All architectures use int-ll64.h in kernelspace"
(https://lkml.org/lkml/2011/8/13/104).

Since <asm/types.h> (from include/uapi/asm-generic/types.h) is used for
both kernel and user space, include/asm-generic/int-ll64.h cannot just
become include/asm-generic/types.h, as Arnd suggested.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel: use lockless list for smp_call_function_single
Christoph Hellwig [Fri, 3 Jan 2014 03:10:05 +0000 (14:10 +1100)]
kernel: use lockless list for smp_call_function_single

Make smp_call_function_single and friends more efficient by using
a lockless list.
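
For readers unfamiliar with the idea, a lockless singly linked list can
be sketched in userspace with C11 atomics as below.  This illustrates
the general technique only and is not the kernel's implementation; every
*_sketch name is hypothetical.  Producers push with a compare-and-swap
loop, and the consumer detaches the whole list with a single exchange,
so neither side takes a lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct lnode_sketch {
	struct lnode_sketch *next;
	int payload;
};

struct lhead_sketch {
	_Atomic(struct lnode_sketch *) first;
};

static void linit_sketch(struct lhead_sketch *head)
{
	atomic_init(&head->first, (struct lnode_sketch *)NULL);
}

/* Push one node at the front; retry if another producer raced us. */
static void lpush_sketch(struct lhead_sketch *head, struct lnode_sketch *node)
{
	struct lnode_sketch *old;

	assert(head != NULL && node != NULL);
	old = atomic_load(&head->first);
	do {
		node->next = old;
	} while (!atomic_compare_exchange_weak(&head->first, &old, node));
}

/* Detach and return every queued node in one atomic step. */
static struct lnode_sketch *ldel_all_sketch(struct lhead_sketch *head)
{
	return atomic_exchange(&head->first, (struct lnode_sketch *)NULL);
}
```

Nodes come back in reverse order of insertion, so a consumer that cares
about ordering has to reverse the detached chain itself.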

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoswap: swapin_nr_pages() can be static
Fengguang Wu [Fri, 3 Jan 2014 03:10:05 +0000 (14:10 +1100)]
swap: swapin_nr_pages() can be static

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoswap: add a simple detector for inappropriate swapin readahead
Shaohua Li [Fri, 3 Jan 2014 03:10:05 +0000 (14:10 +1100)]
swap: add a simple detector for inappropriate swapin readahead

This patch improves the swap readahead algorithm.  It is Hugh's patch with
slight changes of mine.

Hugh's original changelog:

swapin readahead does a blind readahead, whether or not the swapin
is sequential.  This may be ok on harddisk, because large reads have
relatively small costs, and if the readahead pages are unneeded they
can be reclaimed easily - though, what if their allocation forced
reclaim of useful pages?  But on SSD devices large reads are more
expensive than small ones: if the readahead pages are unneeded,
reading them in caused significant overhead.

This patch adds very simplistic random read detection.  Stealing
the PageReadahead technique from Konstantin Khlebnikov's patch,
avoiding the vma/anon_vma sophistications of Shaohua Li's patch,
swapin_nr_pages() simply looks at readahead's current success
rate, and narrows or widens its readahead window accordingly.
There is little science to its heuristic: it's about as stupid
as can be whilst remaining effective.

The table below shows elapsed times (in centiseconds) when running
a single repetitive swapping load across a 1000MB mapping in 900MB
ram with 1GB swap (the harddisk tests had taken painfully too long
when I used mem=500M, but SSD shows similar results for that).

Vanilla is the 3.6-rc7 kernel on which I started; Shaohua denotes
his Sep 3 patch in mmotm and linux-next; HughOld denotes my Oct 1
patch which Shaohua showed to be defective; HughNew this Nov 14
patch, with page_cluster as usual at default of 3 (8-page reads);
HughPC4 this same patch with page_cluster 4 (16-page reads);
HughPC0 with page_cluster 0 (1-page reads: no readahead).

HDD for swapping to harddisk, SSD for swapping to VertexII SSD.
Seq for sequential access to the mapping, cycling five times around;
Rand for the same number of random touches.  Anon for a MAP_PRIVATE
anon mapping; Shmem for a MAP_SHARED anon mapping, equivalent to tmpfs.

One weakness of Shaohua's vma/anon_vma approach was that it did
not optimize Shmem: seen below.  Konstantin's approach was perhaps
mistuned, 50% slower on Seq: did not compete and is not shown below.

HDD        Vanilla Shaohua HughOld HughNew HughPC4 HughPC0
Seq Anon     73921   76210   75611   76904   78191  121542
Seq Shmem    73601   73176   73855   72947   74543  118322
Rand Anon   895392  831243  871569  845197  846496  841680
Rand Shmem 1058375 1053486  827935  764955  764376  756489

SSD        Vanilla Shaohua HughOld HughNew HughPC4 HughPC0
Seq Anon     24634   24198   24673   25107   21614   70018
Seq Shmem    24959   24932   25052   25703   22030   69678
Rand Anon    43014   26146   28075   25989   26935   25901
Rand Shmem   45349   45215   28249   24268   24138   24332

These tests are, of course, two extremes of a very simple case:
under heavier mixed loads I've not yet observed any consistent
improvement or degradation, and wider testing would be welcome.

Shaohua Li:

Testing shows that Vanilla is slightly better than Hugh's patch on a
sequential workload.  I observed that with Hugh's patch the readahead size
sometimes shrinks too fast (from 8 to 1 immediately) in a sequential
workload if there is no hit, and in that case continuing to do readahead is
actually good.

I have not prepared a sophisticated algorithm for the sequential workload
because so far we can't guarantee that sequentially accessed pages are
swapped out sequentially.  So I slightly changed Hugh's heuristic: don't
shrink the readahead size too fast.

Here is my test result (in seconds, average of 3 runs):

        Vanilla  Hugh   New
Seq         356   370   360
Random     4525  2447  2444

The attached graph shows the swapin/swapout throughput I collected with
'vmstat 2'.  The first part is a random workload (until around 1200 on the
x-axis) and the second part is a sequential workload.  Swapin and swapout
throughput are almost identical in steady state in both workloads, which is
the expected behavior.  In Vanilla, by contrast, swapin is much bigger than
swapout, especially in the random workload (because of wrong readahead).

Original patches by: Shaohua Li and Konstantin Khlebnikov.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoswap: fix setting PAGE_SIZE blocksize during swapoff/swapon race
Krzysztof Kozlowski [Fri, 3 Jan 2014 03:10:05 +0000 (14:10 +1100)]
swap: fix setting PAGE_SIZE blocksize during swapoff/swapon race

Fix race between swapoff and swapon resulting in setting blocksize of
PAGE_SIZE for block devices during swapoff.

The swapon modifies swap_info->old_block_size before acquiring
swapon_mutex.  It reads block_size of bdev, stores it under
swap_info->old_block_size and sets new block_size to PAGE_SIZE.

On the other hand the swapoff sets the device's block_size to
old_block_size after releasing swapon_mutex.

This patch locks the swapon_mutex much earlier during swapon. It also
releases the swapon_mutex later during swapoff.

The race can be triggered by the following scenario:
 - One block swap device with block size of 512
 - thread 1: Swapon is called, swap is activated,
   p->old_block_size = block_size(p->bdev); /512/
   block_size(p->bdev) = PAGE_SIZE;
   Thread ends.

 - thread 2: Swapoff is called and proceeds to just after releasing the
   swapon_mutex. The swap is now fully disabled except for restoring the
   block size to its old value. The p->bdev->block_size is still equal to
   PAGE_SIZE.

 - thread 3: New swapon is called. This swap is disabled so without
   acquiring the swapon_mutex:
   - p->old_block_size = block_size(p->bdev); /PAGE_SIZE (!!!)/
   - block_size(p->bdev) = PAGE_SIZE;
   Swap is activated and thread ends.

 - thread 2: resumes work and sets blocksize to old value:
   - set_blocksize(bdev, p->old_block_size)
   But now the p->old_block_size is equal to PAGE_SIZE.
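
The interleaving above can be replayed deterministically in a
single-threaded userspace sketch.  The structs and function names are
hypothetical simplifications; only the two fields involved in the race
are modelled, and no locking is shown since the point is what happens
when the steps run in this order.

```c
#include <assert.h>

#define PAGE_SIZE_SKETCH 4096u

struct bdev_sketch {
	unsigned int block_size;
};

struct swap_sketch {
	struct bdev_sketch *bdev;
	unsigned int old_block_size;
};

/* What swapon does to the two fields. */
static void swapon_sketch(struct swap_sketch *p)
{
	p->old_block_size = p->bdev->block_size;
	p->bdev->block_size = PAGE_SIZE_SKETCH;
}

/* The last step of swapoff, run after the mutex was released. */
static void swapoff_restore_sketch(struct swap_sketch *p)
{
	p->bdev->block_size = p->old_block_size;
}

/* Replay the scenario and return the device's final block size. */
static unsigned int replay_race_sketch(void)
{
	struct bdev_sketch bdev = { 512 };
	struct swap_sketch p = { &bdev, 0 };

	swapon_sketch(&p);		/* thread 1 */
	assert(p.old_block_size == 512);

	/* thread 2: swapoff has dropped the mutex, restore still pending */

	swapon_sketch(&p);		/* thread 3: saves PAGE_SIZE as "old" */
	swapoff_restore_sketch(&p);	/* thread 2 resumes */

	return bdev.block_size;
}
```

The device ends up with a block size of PAGE_SIZE instead of its
original 512, which is the corruption the wider locking prevents.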

The patch swap-fix-set_blocksize-race-during-swapon-swapoff does not fix
this particular issue.  It reduces the possibility of races as the swapon
must overwrite p->old_block_size before acquiring swapon_mutex in swapoff.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Weijie Yang <weijie.yang.kh@gmail.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm-dump-page-when-hitting-a-vm_bug_on-using-vm_bug_on_page-fix-fix
Andrew Morton [Fri, 3 Jan 2014 03:10:04 +0000 (14:10 +1100)]
mm-dump-page-when-hitting-a-vm_bug_on-using-vm_bug_on_page-fix-fix

Fix the patch for mm-print-more-details-for-bad_page.patch.

Also fix up an include mess - various files were using mmdebug.h
facilities but were not including that file.

Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE fix
Sasha Levin [Fri, 3 Jan 2014 03:10:04 +0000 (14:10 +1100)]
mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE fix

I messed up and forgot to commit this fix before sending out the original
patch.

It fixes build issues in various files using VM_BUG_ON_PAGE.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE
Sasha Levin [Fri, 3 Jan 2014 03:10:04 +0000 (14:10 +1100)]
mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE

Most of the VM_BUG_ON assertions are performed on a page.  Usually, when
one of these assertions fails we'll get a BUG_ON with a call stack and the
registers.

Based on recent requests to add a small piece of code that dumps the page
at various VM_BUG_ON sites, I've noticed that the page dump is quite
useful to people debugging issues in mm.

This patch adds a VM_BUG_ON_PAGE(cond, page) which, beyond doing what
VM_BUG_ON() does, also dumps the page before executing the actual BUG_ON.
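
The shape of such a macro can be sketched in userspace as follows.  The
struct, the dump routine and ASSERT_ON_PAGE_SKETCH are hypothetical
stand-ins; abort() plays the role of BUG() and the printed fields are
invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct page_sketch {
	unsigned long flags;
	int count;
};

static int dumps_sketch;	/* number of dumps performed so far */

static void dump_page_sketch(const struct page_sketch *page)
{
	assert(page != NULL);
	dumps_sketch++;
	fprintf(stderr, "page:%p flags:%#lx count:%d\n",
		(const void *)page, page->flags, page->count);
}

/* Dump the offending page first, then stop, so the report carries the
 * page state in addition to the call stack. */
#define ASSERT_ON_PAGE_SKETCH(cond, page)		\
	do {						\
		if (cond) {				\
			dump_page_sketch(page);		\
			abort();			\
		}					\
	} while (0)

static int page_count_checked_sketch(const struct page_sketch *page)
{
	ASSERT_ON_PAGE_SKETCH(page->count < 0, page);
	return page->count;
}
```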

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/proc/page.c: add PageAnon check to surely detect thp
Naoya Horiguchi [Fri, 3 Jan 2014 03:10:03 +0000 (14:10 +1100)]
fs/proc/page.c: add PageAnon check to surely detect thp

stable_page_flags() checks !PageHuge && PageTransCompound && PageLRU to
determine whether a given page is a thp.  But sometimes that is not enough
and we fail to detect a thp while it sits on a pagevec.  This happens only
for a few seconds after LRU list operations, but it makes it difficult to
control applications that depend on this flag.

So this patch adds a PageAnon check to detect thps on a pagevec.  It might
not extend to a future thp pagecache, but it is OK at least for now.
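
The effect of the extra check can be sketched with plain booleans.  The
struct and function names are hypothetical, and combining the new check
as "LRU or anon" is an assumption inferred from the description above (a
thp on a pagevec is not on the LRU but is still anonymous); it is not
copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page_state_sketch {
	bool huge;		/* hugetlbfs page */
	bool trans_compound;	/* part of a transparent compound page */
	bool lru;		/* currently on an LRU list */
	bool anon;		/* anonymous mapping */
};

/* Check as described before the patch. */
static bool is_thp_old_sketch(const struct page_state_sketch *p)
{
	assert(p != NULL);
	return !p->huge && p->trans_compound && p->lru;
}

/* With the additional PageAnon-style check (assumed to be an OR). */
static bool is_thp_new_sketch(const struct page_state_sketch *p)
{
	assert(p != NULL);
	return !p->huge && p->trans_compound && (p->lru || p->anon);
}
```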

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: remove BUG_ON() from mlock_vma_page()
Bob Liu [Fri, 3 Jan 2014 03:10:03 +0000 (14:10 +1100)]
mm: remove BUG_ON() from mlock_vma_page()

objrmap doesn't work for nonlinear VMAs because the assumption that
offset-into-file correlates with offset-into-virtual-addresses does not
hold.  Hence what try_to_unmap_cluster does is a mini "virtual scan" of
each nonlinear VMA which maps the file to which the target page belongs.
If the VMA is locked, it mlocks the pages in the cluster rather than
unmapping them.  However, only the page being checked is guaranteed to be
locked, not every page in the cluster, which triggers the BUG_ON() below.

It's safe to mlock_vma_page() without PageLocked, so fix this issue by
removing that BUG_ON().

[  253.869145] kernel BUG at mm/mlock.c:82!
[  253.869549] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[  253.870098] Dumping ftrace buffer:
[  253.870098]    (ftrace buffer empty)
[  253.870098] Modules linked in:
[  253.870098] CPU: 10 PID: 9162 Comm: trinity-child75 Tainted: G        W 3.13.0-rc4-next-20131216-sasha-00011-g5f105ec-dirty #4137
[  253.873310] task: ffff8800c98cb000 ti: ffff8804d34e8000 task.ti: ffff8804d34e8000
[  253.873310] RIP: 0010:[<ffffffff81281f28>]  [<ffffffff81281f28>] mlock_vma_page+0x18/0xc0
[  253.873310] RSP: 0000:ffff8804d34e99e8  EFLAGS: 00010246
[  253.873310] RAX: 006fffff8038002c RBX: ffffea00474944c0 RCX: ffff880807636000
[  253.873310] RDX: ffffea0000000000 RSI: 00007f17a9bca000 RDI: ffffea00474944c0
[  253.873310] RBP: ffff8804d34e99f8 R08: ffff880807020000 R09: 0000000000000000
[  253.873310] R10: 0000000000000001 R11: 0000000000002000 R12: 00007f17a9bca000
[  253.873310] R13: ffffea00474944c0 R14: 00007f17a9be0000 R15: ffff880807020000
[  253.873310] FS:  00007f17aa31a700(0000) GS:ffff8801c9c00000(0000) knlGS:0000000000000000
[  253.873310] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  253.873310] CR2: 00007f17a94fa000 CR3: 00000004d3b02000 CR4: 00000000000006e0
[  253.873310] DR0: 00007f17a74ca000 DR1: 0000000000000000 DR2: 0000000000000000
[  253.873310] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[  253.873310] Stack:
[  253.873310]  0000000b3de28067 ffff880b3de28e50 ffff8804d34e9aa8 ffffffff8128bc31
[  253.873310]  0000000000000301 ffffea0011850220 ffff8809a4039000 ffffea0011850238
[  253.873310]  ffff8804d34e9aa8 ffff880807636060 0000000000000001 ffff880807636348
[  253.873310] Call Trace:
[  253.873310]  [<ffffffff8128bc31>] try_to_unmap_cluster+0x1c1/0x340
[  253.873310]  [<ffffffff8128c60a>] try_to_unmap_file+0x20a/0x2e0
[  253.873310]  [<ffffffff8128c7b3>] try_to_unmap+0x73/0x90
[  253.873310]  [<ffffffff812b526d>] __unmap_and_move+0x18d/0x250
[  253.873310]  [<ffffffff812b53e9>] unmap_and_move+0xb9/0x180
[  253.873310]  [<ffffffff812b559b>] migrate_pages+0xeb/0x2f0
[  253.873310]  [<ffffffff812a0660>] ? queue_pages_pte_range+0x1a0/0x1a0
[  253.873310]  [<ffffffff812a193c>] migrate_to_node+0x9c/0xc0
[  253.873310]  [<ffffffff812a30b8>] do_migrate_pages+0x1b8/0x240
[  253.873310]  [<ffffffff812a3456>] SYSC_migrate_pages+0x316/0x380
[  253.873310]  [<ffffffff812a31ec>] ? SYSC_migrate_pages+0xac/0x380
[  253.873310]  [<ffffffff811763c6>] ? vtime_account_user+0x96/0xb0
[  253.873310]  [<ffffffff812a34ce>] SyS_migrate_pages+0xe/0x10
[  253.873310]  [<ffffffff843c4990>] tracesys+0xdd/0xe2
[  253.873310] Code: 0f 1f 00 65 48 ff 04 25 10 25 1d 00 48 83 c4 08
5b c9 c3 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 07 48 89 fb
a8 01 75 10 <0f> 0b 66 0f 1f 44 00 00 eb fe 66 0f 1f 44 00 00 f0 0f ba
2f 15
[  253.873310] RIP  [<ffffffff81281f28>] mlock_vma_page+0x18/0xc0
[  253.873310]  RSP <ffff8804d34e99e8>
[  253.904194] ---[ end trace be59c4a7f8edab3f ]---

Signed-off-by: Bob Liu <bob.liu@oracle.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago memcg: do not use vmalloc for mem_cgroup allocations
Vladimir Davydov [Fri, 3 Jan 2014 03:10:03 +0000 (14:10 +1100)]
memcg: do not use vmalloc for mem_cgroup allocations

The vmalloc was introduced by 333279 ("memcgroup: use vmalloc for
mem_cgroup allocation"), because at that time MAX_NUMNODES was used for
defining the per-node array in the mem_cgroup structure so that the
structure could be huge even if the system had only one NUMA node.

The situation was significantly improved by patch 45cf7e ("memcg: reduce
the size of struct memcg 244-fold"), which made the size of the mem_cgroup
structure calculated dynamically depending on the real number of NUMA
nodes installed on the system (nr_node_ids), so now there is no point in
using vmalloc here: the structure is allocated rarely and on most systems
its size is about 1K.
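
To make the size argument concrete, here is a small userspace sketch.
The struct layout, the MAX_NUMNODES_SKETCH value and all function names
are assumptions for illustration; only the contrast between sizing by a
compile-time maximum and sizing by the number of nodes present comes from
the text above:

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_NUMNODES_SKETCH 1024 /* assumed compile-time upper bound */

struct per_node_info { long stats[4]; }; /* hypothetical per-node payload */

struct memcg_sketch {
	long usage;
	struct per_node_info *nodeinfo[]; /* one slot per possible node */
};

/* Size when the array is dimensioned by the compile-time maximum. */
static size_t memcg_size_static(void)
{
	return sizeof(struct memcg_sketch) +
	       MAX_NUMNODES_SKETCH * sizeof(struct per_node_info *);
}

/* Size when the array is dimensioned by the nodes actually present. */
static size_t memcg_size_dynamic(unsigned int nr_node_ids)
{
	return sizeof(struct memcg_sketch) +
	       nr_node_ids * sizeof(struct per_node_info *);
}

/* A small, rarely allocated object: a plain zeroing allocator is enough. */
static struct memcg_sketch *memcg_alloc(unsigned int nr_node_ids)
{
	return calloc(1, memcg_size_dynamic(nr_node_ids));
}
```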

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@openvz.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm-munlock-fix-potential-race-with-thp-page-split-fix
Andrew Morton [Fri, 3 Jan 2014 03:10:03 +0000 (14:10 +1100)]
mm-munlock-fix-potential-race-with-thp-page-split-fix

avoid a coding-style ugly

Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: munlock: fix potential race with THP page split
Vlastimil Babka [Fri, 3 Jan 2014 03:10:03 +0000 (14:10 +1100)]
mm: munlock: fix potential race with THP page split

Since commit ff6a6da60 ("mm: accelerate munlock() treatment of THP pages")
munlock skips tail pages of a munlocked THP page.  There is some attempt
to prevent bad consequences of racing with a THP page split, but code
inspection indicates that there are two problems that may lead to a
non-fatal, yet wrong outcome.

First, __split_huge_page_refcount() copies flags including PageMlocked
from the head page to the tail pages.  Clearing PageMlocked by
munlock_vma_page() in the middle of this operation might result in part of
tail pages left with PageMlocked flag.  As the head page still appears to
be a THP page until all tail pages are processed, munlock_vma_page() might
think it munlocked the whole THP page and skip all the former tail pages.
Before ff6a6da60, those pages would be cleared in further iterations of
munlock_vma_pages_range(), but NR_MLOCK would still become undercounted
(related to the next point).

Second, NR_MLOCK accounting is based on a call to hpage_nr_pages() after
PageMlocked is cleared.  The accounting might also become inconsistent due
to a race with __split_huge_page_refcount():

- undercount when HUGE_PMD_NR is subtracted, but some tail pages are
  left with PageMlocked set and counted again (only possible before
  ff6a6da60)

- overcount when hpage_nr_pages() sees a normal page (split has already
  finished), but the parallel split has meanwhile cleared PageMlocked from
  additional tail pages

This patch prevents both problems via extending the scope of lru_lock in
munlock_vma_page().  This is convenient because:

- __split_huge_page_refcount() takes lru_lock for its whole operation

- munlock_vma_page() typically takes lru_lock anyway for page isolation

As this becomes a second function where page isolation is done with
lru_lock already held, factor this out to a new
__munlock_isolate_lru_page() function and clean up the code around.
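
The core of the fix, clearing the flag and reading the page count inside
one critical section, can be modelled in userspace roughly as follows.
The spinlock built on atomic_flag, the struct, HUGE_PMD_NR_SKETCH and the
function names are all stand-ins invented for this sketch:

```c
#include <stdatomic.h>
#include <stdbool.h>

#define HUGE_PMD_NR_SKETCH 512 /* assumed: 2 MiB huge page / 4 KiB base pages */

static atomic_flag lru_lock_sketch = ATOMIC_FLAG_INIT; /* stand-in for lru_lock */
static long nr_mlock;                                  /* stand-in for NR_MLOCK */

struct page_sketch {
	bool mlocked; /* PageMlocked */
	bool is_huge; /* still an unsplit THP head page */
};

static void sketch_lock(void)
{
	while (atomic_flag_test_and_set(&lru_lock_sketch))
		; /* spin until the lock is free */
}

static void sketch_unlock(void)
{
	atomic_flag_clear(&lru_lock_sketch);
}

/*
 * Clear the flag and read the page count in one critical section, so a
 * concurrent split (which would hold the same lock) cannot run in between.
 * Returns the number of base pages that were munlocked.
 */
static int munlock_page_sketch(struct page_sketch *page)
{
	int nr_pages = 0;

	sketch_lock();
	if (page->mlocked) {
		page->mlocked = false;
		nr_pages = page->is_huge ? HUGE_PMD_NR_SKETCH : 1;
		nr_mlock -= nr_pages;
	}
	sketch_unlock();
	return nr_pages;
}
```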

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm-print-more-details-for-bad_page-fix
Andrew Morton [Fri, 3 Jan 2014 03:10:02 +0000 (14:10 +1100)]
mm-print-more-details-for-bad_page-fix

switch to pr_alert.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: print more details for bad_page()
Dave Hansen [Fri, 3 Jan 2014 03:10:02 +0000 (14:10 +1100)]
mm: print more details for bad_page()

bad_page() is cool in that it prints out a bunch of data about the page.
But, I can never remember which page flags are good and which are bad, or
whether ->index or ->mapping is required to be NULL.

This patch allows bad/dump_page() callers to specify a string about why
they are dumping the page and adds explanation strings to a number of
places.  It also adds a 'bad_flags' argument to bad_page(), which it then
dumps out separately from the flags which are actually set.

This way, the messages will show specifically why the page was bad,
*specifically* which flags it is complaining about, if it was a page flag
combination which was the problem.
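
A minimal userspace sketch of the reporting idea follows.  The flag
values, the message wording and bad_page_sketch() itself are hypothetical;
the only part taken from the description is that a reason string and the
offending subset of flags are reported separately from the full flag word:

```c
#include <stdio.h>

/* Hypothetical flag bits for illustration only. */
#define PG_LOCKED_SKETCH 0x1UL
#define PG_LRU_SKETCH    0x2UL
#define PG_DIRTY_SKETCH  0x4UL

/*
 * Format a report with the reason, every flag that is set, and,
 * separately, only those set flags that the caller considers bad.
 * Returns the number of characters snprintf() would have written.
 */
static int bad_page_sketch(char *buf, size_t len, const char *reason,
			   unsigned long flags, unsigned long bad_flags)
{
	return snprintf(buf, len,
			"bad page: %s\nflags: %#lx\nbad because of flags: %#lx\n",
			reason, flags, flags & bad_flags);
}
```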

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm/zswap.c: change params from hidden to ro
Dan Streetman [Fri, 3 Jan 2014 03:10:02 +0000 (14:10 +1100)]
mm/zswap.c: change params from hidden to ro

The "compressor" and "enabled" params are currently hidden; this changes
them to read-only, so userspace can tell whether zswap is enabled and see
which compressor is in use.
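
From userspace, a read-only module parameter shows up as a one-line file
under /sys/module/zswap/parameters/.  The helper below is a hedged sketch
of how a program might read one; read_param() is a name made up here and
is not part of any kernel or libc API:

```c
#include <stdio.h>
#include <string.h>

/*
 * Read a single-line module parameter file into buf and strip the
 * trailing newline. Returns 0 on success, -1 if the file cannot be read.
 */
static int read_param(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

For example, read_param("/sys/module/zswap/parameters/compressor", buf,
sizeof(buf)) would fill buf with the compressor name on a kernel that has
this change and zswap built in.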

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Cc: Vladimir Murzin <murzin.v@gmail.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Weijie Yang <weijie.yang@samsung.com>
Acked-by: Seth Jennings <sjennings@variantweb.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: documentation: remove hopelessly out-of-date locking doc
Dave Hansen [Fri, 3 Jan 2014 03:10:02 +0000 (14:10 +1100)]
mm: documentation: remove hopelessly out-of-date locking doc

Documentation/vm/locking is a blast from the past.  In the entire git
history, it has had precisely three modifications.  Two of those look to
be pure renames, and the third was from 2005.

The doc contains such gems as:

> The page_table_lock is grabbed while holding the
> kernel_lock spinning monitor.

> Page stealers hold kernel_lock to protect against a bunch of
> races.

Or this which talks about mmap_sem:

> 4. The exception to this rule is expand_stack, which just
>    takes the read lock and the page_table_lock, this is ok
>    because it doesn't really modify fields anybody relies on.

expand_stack() no longer takes any locks directly, and the
mmap_sem acquisition was long ago moved up into the page fault
code itself.

It could be argued that we need to rewrite this, but it is
dangerous to leave it as-is.  It will confuse more people than it
helps.

Signed-off-by: Dave Hansen <dave.hansen@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm/migrate: remove unused function, fail_migrate_page()
Joonsoo Kim [Fri, 3 Jan 2014 03:10:01 +0000 (14:10 +1100)]
mm/migrate: remove unused function, fail_migrate_page()

fail_migrate_page() isn't used anywhere, so remove it.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm/migrate: remove putback_lru_pages, fix comment on putback_movable_pages
Joonsoo Kim [Fri, 3 Jan 2014 03:10:01 +0000 (14:10 +1100)]
mm/migrate: remove putback_lru_pages, fix comment on putback_movable_pages

Parts of putback_lru_pages() and putback_movable_pages() are duplicated,
which makes it unclear which one should be used.  We can remove
putback_lru_pages() since it is not really needed now.  This makes the
code easier to understand and maintain.

The comment on putback_movable_pages() is also stale now, so fix it.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm/migrate: correct failure handling if !hugepage_migration_support()
Joonsoo Kim [Fri, 3 Jan 2014 03:10:01 +0000 (14:10 +1100)]
mm/migrate: correct failure handling if !hugepage_migration_support()

We should remove the page from the list if we fail with -ENOSYS, since
migrate_pages() considers all error cases except -ENOMEM and -EAGAIN to be
permanent failures and assumes that the page has been removed from the
list.  Without this patch, we could overcount the number of failures.

In addition, we should put back the new hugepage if
!hugepage_migration_support().  If not, we would leak hugepage memory.
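
The error classification described above can be sketched as a small
function.  The enum and classify_migrate_rc() are hypothetical names, and
treating 0 as success is an assumption of this sketch; only the special
handling of -ENOMEM and -EAGAIN comes from the text:

```c
#include <errno.h>

enum migrate_action {
	MIGRATE_ABORT,     /* -ENOMEM: stop the whole pass */
	MIGRATE_RETRY,     /* -EAGAIN: page stays on the list for a retry */
	MIGRATE_DONE,      /* success */
	MIGRATE_PERMANENT, /* anything else: counted as failed, and the page
			      is expected to be off the list already */
};

static enum migrate_action classify_migrate_rc(int rc)
{
	switch (rc) {
	case -ENOMEM:
		return MIGRATE_ABORT;
	case -EAGAIN:
		return MIGRATE_RETRY;
	case 0:
		return MIGRATE_DONE;
	default:
		return MIGRATE_PERMANENT;
	}
}
```

Under this model -ENOSYS falls into the permanent bucket, which is why the
failing path has to take the page off the list itself.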

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm/migrate: add comment about permanent failure path
Naoya Horiguchi [Fri, 3 Jan 2014 03:10:01 +0000 (14:10 +1100)]
mm/migrate: add comment about permanent failure path

Let's add a comment about where the failed page goes, which makes the
code more readable.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm, page_alloc: warn for non-blockable __GFP_NOFAIL allocation failure
David Rientjes [Fri, 3 Jan 2014 03:10:01 +0000 (14:10 +1100)]
mm, page_alloc: warn for non-blockable __GFP_NOFAIL allocation failure

__GFP_NOFAIL may return NULL when coupled with GFP_NOWAIT or GFP_ATOMIC.

Luckily, nothing currently does such craziness.  So instead of causing
such allocations to loop (potentially forever), we maintain the current
behavior and also warn about the new users of the deprecated flag.
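
A simplified model of that decision is sketched below.  The flag values,
the warned variable and should_retry_sketch() are invented for this
example, and the sketch ignores every other retry condition the real
allocator has:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical bit values; the real ones live in the kernel's gfp.h. */
#define GFP_WAIT_SKETCH   0x1u /* caller may sleep */
#define GFP_NOFAIL_SKETCH 0x2u /* caller cannot handle failure */

static bool warned; /* emulates a warn-once */

/*
 * Decide what to do after an allocation attempt failed.
 * Returns true if the allocator should keep retrying, false if it
 * should give up and return NULL. In this sketch only NOFAIL callers
 * that are allowed to sleep retry.
 */
static bool should_retry_sketch(unsigned int gfp_mask)
{
	bool can_wait = (gfp_mask & GFP_WAIT_SKETCH) != 0;
	bool nofail = (gfp_mask & GFP_NOFAIL_SKETCH) != 0;

	if (!can_wait) {
		/* Cannot loop without sleeping: warn once on NOFAIL, then fail. */
		if (nofail && !warned) {
			warned = true;
			fprintf(stderr, "warning: NOFAIL allocation cannot wait\n");
		}
		return false;
	}
	return nofail;
}
```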

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: compaction: reset scanner positions immediately when they meet
Vlastimil Babka [Fri, 3 Jan 2014 03:10:00 +0000 (14:10 +1100)]
mm: compaction: reset scanner positions immediately when they meet

Compaction used to start its migrate and free page scanners at the zone's
lowest and highest pfn, respectively.  Later, caching was introduced to
remember the scanners' progress across compaction attempts so that
pageblocks are not re-scanned uselessly.  Additionally, pageblocks where
isolation failed are marked to be quickly skipped when encountered again
in future compactions.

Currently, both the reset of cached pfn's and clearing of the pageblock
skip information for a zone is done in __reset_isolation_suitable().  This
function gets called when:

 - compaction is restarting after being deferred
 - compact_blockskip_flush flag is set in compact_finished() when the scanners
   meet (and not again cleared when direct compaction succeeds in allocation)
   and kswapd acts upon this flag before going to sleep

This behavior is suboptimal for several reasons:

 - when direct sync compaction is called after async compaction fails (in the
   allocation slowpath), it will effectively do nothing, unless kswapd
   happens to process the compact_blockskip_flush flag meanwhile. This is racy
   and goes against the purpose of sync compaction to more thoroughly retry
   the compaction of a zone where async compaction has failed.
   The restart-after-deferring path cannot help here as deferring happens only
   after the sync compaction fails. It is also done only for the preferred
   zone, while the compaction might be done for a fallback zone.

 - the mechanism of marking pageblocks to be skipped has little value since the
   cached pfn's are reset only together with the pageblock skip flags. This
   effectively limits pageblock skip usage to parallel compactions.

This patch changes compact_finished() so that cached pfn's are reset
immediately when the scanners meet.  Clearing pageblock skip flags is
unchanged, as well as the other situations where cached pfn's are reset.
This allows the sync-after-async compaction to retry pageblocks not marked
as skipped, such as !MIGRATE_MOVABLE blocks that async compaction now
skips without marking them.
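
The changed behaviour can be modelled with a few lines of C.  The struct
and compact_finished_sketch() are hypothetical simplifications; the point
illustrated is only that the cached positions are reset as soon as the
scanners meet, while the skip bits are left alone:

```c
#include <stdbool.h>

/* Hypothetical subset of per-zone compaction state. */
struct zone_sketch {
	unsigned long start_pfn;          /* first pfn of the zone */
	unsigned long end_pfn;            /* one past the last pfn */
	unsigned long cached_migrate_pfn; /* where the migrate scanner resumes */
	unsigned long cached_free_pfn;    /* where the free scanner resumes */
};

/*
 * Called when a compaction pass ends. If the scanners have met, reset the
 * cached positions right away so the next attempt starts from the zone
 * edges. Pageblock skip bits are deliberately left alone here.
 * Returns true if the scanners met.
 */
static bool compact_finished_sketch(struct zone_sketch *zone,
				    unsigned long migrate_pfn,
				    unsigned long free_pfn)
{
	if (free_pfn > migrate_pfn)
		return false;

	zone->cached_migrate_pfn = zone->start_pfn;
	zone->cached_free_pfn = zone->end_pfn;
	return true;
}
```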

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: compaction: do not mark unmovable pageblocks as skipped in async compaction
Vlastimil Babka [Fri, 3 Jan 2014 03:10:00 +0000 (14:10 +1100)]
mm: compaction: do not mark unmovable pageblocks as skipped in async compaction

Compaction temporarily marks pageblocks where it fails to isolate pages as
to-be-skipped in further compactions, in order to improve efficiency.  One
of the reasons to fail isolating pages is that isolation is not attempted
in pageblocks that are not of MIGRATE_MOVABLE (or CMA) type.

The problem is that blocks skipped due to not being MIGRATE_MOVABLE in
async compaction become skipped due to the temporary mark also in future
sync compaction.  Moreover, this may follow quite soon during
__alloc_pages_slowpath, without much time for kswapd to clear the pageblock
skip marks.  This goes against the idea that sync compaction should try to
scan these blocks more thoroughly than the async compaction.

The fix is to ensure in async compaction that these !MIGRATE_MOVABLE
blocks are not marked to be skipped.  Note this should not affect
performance or locking impact of further async compactions, as skipping a
block due to being !MIGRATE_MOVABLE is done soon after skipping a block
marked to be skipped, both without locking.
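
As a rough model of the fixed behaviour, the sketch below scans one
pageblock.  The struct, the nr_isolatable parameter and
scan_pageblock_sketch() are invented for illustration and leave out
everything else the real scanner does:

```c
#include <stdbool.h>

/* Hypothetical per-pageblock state. */
struct pageblock_sketch {
	bool movable; /* MIGRATE_MOVABLE or CMA */
	bool skip;    /* temporary "skip me" mark */
};

/*
 * Scan one pageblock. Returns true if isolation was attempted.
 * nr_isolatable models how many pages could be isolated in it.
 */
static bool scan_pageblock_sketch(struct pageblock_sketch *pb, bool sync,
				  int nr_isolatable)
{
	if (pb->skip)
		return false; /* marked earlier, skip cheaply */

	if (!sync && !pb->movable)
		return false; /* async: skip, but do NOT set pb->skip */

	if (nr_isolatable == 0)
		pb->skip = true; /* isolation was tried and failed */
	return true;
}
```

With this shape, a block that async compaction passed over for not being
movable is still visible to a later sync pass.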

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>