]> git.karo-electronics.de Git - karo-tx-linux.git/log
karo-tx-linux.git
11 years agortc: rtc-at91sam9: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:36 +0000 (09:57 +1000)]
rtc: rtc-at91sam9: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d063100 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc: rtc-at91rm9200: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:36 +0000 (09:57 +1000)]
rtc: rtc-at91rm9200: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d063100 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc: rtc-at32ap700x: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:36 +0000 (09:57 +1000)]
rtc: rtc-at32ap700x: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d063100 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc: rtc-ab8500: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:35 +0000 (09:57 +1000)]
rtc: rtc-ab8500: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d063100 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc: rtc-ab3100: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:35 +0000 (09:57 +1000)]
rtc: rtc-ab3100: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d063100 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc: rtc-88pm860x: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:35 +0000 (09:57 +1000)]
rtc: rtc-88pm860x: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d063100 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-x1205.c: fix checkpatch issues
Sachin Kamat [Thu, 9 May 2013 23:57:35 +0000 (09:57 +1000)]
drivers/rtc/rtc-x1205.c: fix checkpatch issues

Fixes the following types of issues:
ERROR: do not use assignment in if condition
ERROR: open brace '{' following struct go on the same line
ERROR: else should follow close brace '}'
WARNING: please, no space before tabs

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-vr41xx.c: fix spacing issues
Sachin Kamat [Thu, 9 May 2013 23:57:34 +0000 (09:57 +1000)]
drivers/rtc/rtc-vr41xx.c: fix spacing issues

Fixes the following types of issues:
ERROR: code indent should use tabs where possible

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-v3020.c: fix spacing issues
Sachin Kamat [Thu, 9 May 2013 23:57:34 +0000 (09:57 +1000)]
drivers/rtc/rtc-v3020.c: fix spacing issues

Fixes the following type of issues:
WARNING: please, no space before tabs

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-rs5c313.c: fix spacing related issues
Sachin Kamat [Thu, 9 May 2013 23:57:34 +0000 (09:57 +1000)]
drivers/rtc/rtc-rs5c313.c: fix spacing related issues

Fixes the following types of checkpatch issues:
WARNING: please, no space before tabs
ERROR: space prohibited after that open parenthesis '('
ERROR: space prohibited before that close parenthesis ')'
ERROR: need consistent spacing around '>>' (ctx:VxW)

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-rs5c313.c: include <linux/io.h> instead of <asm/io.h>
Sachin Kamat [Thu, 9 May 2013 23:57:33 +0000 (09:57 +1000)]
drivers/rtc/rtc-rs5c313.c: include <linux/io.h> instead of <asm/io.h>

Use #include <linux/io.h> instead of <asm/io.h> as pointed out by
checkpatch.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Nobuhiro Iwamatsu <iwamatsu@nigauri.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-pcf8583.c: move assignment outside if condition
Sachin Kamat [Thu, 9 May 2013 23:57:33 +0000 (09:57 +1000)]
drivers/rtc/rtc-pcf8583.c: move assignment outside if condition

Fixes the following checkpatch error:
ERROR: do not use assignment in if condition

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-pcf2123.c: remove space before tabs
Sachin Kamat [Thu, 9 May 2013 23:57:33 +0000 (09:57 +1000)]
drivers/rtc/rtc-pcf2123.c: remove space before tabs

Silences the following checkpatch warning:
WARNING: please, no space before tabs

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-omap.c: include <linux/io.h> instead of <asm/io.h>
Sachin Kamat [Thu, 9 May 2013 23:57:32 +0000 (09:57 +1000)]
drivers/rtc/rtc-omap.c: include <linux/io.h> instead of <asm/io.h>

Use #include <linux/io.h> instead of <asm/io.h> as pointed out by
checkpatch.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: George G. Davis <gdavis@mvista.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-mxc.c: fix checkpatch error
Sachin Kamat [Thu, 9 May 2013 23:57:32 +0000 (09:57 +1000)]
drivers/rtc/rtc-mxc.c: fix checkpatch error

Fixes the following error:
ERROR: spaces required around that '>=' (ctx:WxV)

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-msm6242.c: use pr_warn
Sachin Kamat [Thu, 9 May 2013 23:57:32 +0000 (09:57 +1000)]
drivers/rtc/rtc-msm6242.c: use pr_warn

pr_warn is preferred to pr_warning.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-mpc5121.c: remove space before tab
Sachin Kamat [Thu, 9 May 2013 23:57:31 +0000 (09:57 +1000)]
drivers/rtc/rtc-mpc5121.c: remove space before tab

WARNING: please, no space before tabs

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-max8997.c: remove space before semicolon
Sachin Kamat [Thu, 9 May 2013 23:57:31 +0000 (09:57 +1000)]
drivers/rtc/rtc-max8997.c: remove space before semicolon

Fixes the following warning:
WARNING: space prohibited before semicolon

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-max77686.c: remove space before semicolon
Sachin Kamat [Thu, 9 May 2013 23:57:31 +0000 (09:57 +1000)]
drivers/rtc/rtc-max77686.c: remove space before semicolon

Fixes the following warning:
WARNING: space prohibited before semicolon

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-max6902.c: remove unwanted spaces
Sachin Kamat [Thu, 9 May 2013 23:57:31 +0000 (09:57 +1000)]
drivers/rtc/rtc-max6902.c: remove unwanted spaces

Silences the following type of warnings:
WARNING: space prohibited between function name and open parenthesis '('

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-m41t80.c: fix spacing related issue
Sachin Kamat [Thu, 9 May 2013 23:57:30 +0000 (09:57 +1000)]
drivers/rtc/rtc-m41t80.c: fix spacing related issue

Silences the following checkpatch warning:
WARNING: space prohibited before semicolon

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-fm3130.c: fix whitespace related issue
Sachin Kamat [Thu, 9 May 2013 23:57:30 +0000 (09:57 +1000)]
drivers/rtc/rtc-fm3130.c: fix whitespace related issue

Silences the following checkpatch warning:
WARNING: please, no space before tabs

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-ds3234.c: fix whitespace issue
Sachin Kamat [Thu, 9 May 2013 23:57:30 +0000 (09:57 +1000)]
drivers/rtc/rtc-ds3234.c: fix whitespace issue

Fixes the following warning:
WARNING: please, no space before tabs

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-ds1511.c: fix issues related to spaces and braces
Sachin Kamat [Thu, 9 May 2013 23:57:29 +0000 (09:57 +1000)]
drivers/rtc/rtc-ds1511.c: fix issues related to spaces and braces

Fixes the following types of issues:
WARNING: please, no spaces at the start of a line
WARNING: braces {} are not necessary for single statement blocks

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-ds1374.c: fix spacing related issues
Sachin Kamat [Thu, 9 May 2013 23:57:29 +0000 (09:57 +1000)]
drivers/rtc/rtc-ds1374.c: fix spacing related issues

Fixes the following types of issues:
ERROR: code indent should use tabs where possible
WARNING: please, no spaces at the start of a line

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-ds1305.c: add missing braces around sizeof
Sachin Kamat [Thu, 9 May 2013 23:57:29 +0000 (09:57 +1000)]
drivers/rtc/rtc-ds1305.c: add missing braces around sizeof

Silences the following type of warnings:
WARNING: sizeof buf should be sizeof(buf)

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-davinci.c: fix whitespace warning
Sachin Kamat [Thu, 9 May 2013 23:57:28 +0000 (09:57 +1000)]
drivers/rtc/rtc-davinci.c: fix whitespace warning

Silences the following warning:
WARNING: please, no space before tabs

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-cmos.c: fix whitespace related errors
Sachin Kamat [Thu, 9 May 2013 23:57:28 +0000 (09:57 +1000)]
drivers/rtc/rtc-cmos.c: fix whitespace related errors

Fixes the following types of issues:

ERROR: space required after that ',' (ctx:VxV)
WARNING: please, no spaces at the start of a line

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-at91rm9200.c: include <linux/uaccess.h>
Sachin Kamat [Thu, 9 May 2013 23:57:28 +0000 (09:57 +1000)]
drivers/rtc/rtc-at91rm9200.c: include <linux/uaccess.h>

Silences the following checkpatch warning:
WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Andrew Victor <linux@maxim.org.za>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-at32ap700x.c: fix checkpatch error
Sachin Kamat [Thu, 9 May 2013 23:57:28 +0000 (09:57 +1000)]
drivers/rtc/rtc-at32ap700x.c: fix checkpatch error

Fixes the following error:
ERROR: space required before the open parenthesis '('

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/interface.c: fix checkpatch errors
Sachin Kamat [Thu, 9 May 2013 23:57:27 +0000 (09:57 +1000)]
drivers/rtc/interface.c: fix checkpatch errors

Fixes the following types of errors:
ERROR: "foo* bar" should be "foo *bar"
ERROR: else should follow close brace '}'
WARNING: braces {} are not necessary for single statement blocks

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-v3020.c: remove redundant goto
Sachin Kamat [Thu, 9 May 2013 23:57:27 +0000 (09:57 +1000)]
drivers/rtc/rtc-v3020.c: remove redundant goto

Remove a redundant goto statement left over during the conversion of
this driver to use devm_* APIs.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc: rtc-88pm80x: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:27 +0000 (09:57 +1000)]
rtc: rtc-88pm80x: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d063100 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoautofs4: translate pids to the right namespace for the daemon
Miklos Szeredi [Thu, 9 May 2013 23:57:26 +0000 (09:57 +1000)]
autofs4: translate pids to the right namespace for the daemon

The PID and the TGID of the process triggering the mount are sent to the
daemon.  Currently the global pid values are sent (ones valid in the
initial pid namespace) but this is wrong if the autofs daemon itself is
not running in the initial pid namespace.

So send the pid values that are valid in the namespace of the autofs daemon.

The namespace to use is taken from the oz_pgrp pid pointer, which was set
at mount time to the mounting process' pid namespace.

If the pid translation fails (the triggering process is in an unrelated
pid namespace) then the automount fails with ENOENT.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoautofs4: allow autofs to work outside the initial PID namespace
Sukadev Bhattiprolu [Thu, 9 May 2013 23:57:26 +0000 (09:57 +1000)]
autofs4: allow autofs to work outside the initial PID namespace

Enable autofs4 to work in a "container".  oz_pgrp is converted from pid_t
to struct pid and this is stored at mount time based on the "pgrp=" option
or if the option is missing then the current pgrp.

The "pgrp=" option is interpreted in the PID namespace of the current
process.  This option is flawed in that it doesn't carry the namespace
information, so it should be deprecated.  AFAICS the autofs daemon always
sends the current pgrp, which is the default anyway.

The oz_pgrp is also set from the AUTOFS_DEV_IOCTL_SETPIPEFD_CMD ioctl.
This ioctl sets oz_pgrp to the current pgrp.  It is not allowed to change
the pid namespace.

oz_pgrp is used mainly to determine whether the process traversing the
autofs mount tree is the autofs daemon itself or not.  This function now
compares the pid pointers instead of the pid_t values.

One other use of oz_pgrp is in autofs4_show_options.  There is shows the
virtual pid number (i.e.  the one that is valid inside the PID namespace
of the calling process)

For debugging printk convert oz_pgrp to the value in the initial pid
namespace.

Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoinit: remove permanent string buffer from do_one_initcall()
Steven Rostedt [Thu, 9 May 2013 23:57:25 +0000 (09:57 +1000)]
init: remove permanent string buffer from do_one_initcall()

do_one_initcall() uses a 64 byte string buffer to save a message. This
buffer is declared static and is only used at boot up and when a module
is loaded. As 64 bytes is very small, and this function has very limited
scope, there's no reason to waste permanent memory with this string and
not just simply put it on the stack.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobinfmt_elf.c: use get_random_int() to fix entropy depleting
Jeff Liu [Thu, 9 May 2013 23:57:25 +0000 (09:57 +1000)]
binfmt_elf.c: use get_random_int() to fix entropy depleting

Entropy is quickly depleted under normal operations like ls(1), cat(1),
etc...  between 2.6.30 to current mainline, for instance:

$ cat /proc/sys/kernel/random/entropy_avail
3428
$ cat /proc/sys/kernel/random/entropy_avail
2911
$cat /proc/sys/kernel/random/entropy_avail
2620

We observed this problem has been occurring since 2.6.30 with
fs/binfmt_elf.c: create_elf_tables()->get_random_bytes(), introduced by
f06295b44c296c8f ("ELF: implement AT_RANDOM for glibc PRNG seeding").

/*
 * Generate 16 random bytes for userspace PRNG seeding.
 */
get_random_bytes(k_rand_bytes, sizeof(k_rand_bytes));

The patch introduces a wrapper around get_random_int() which has lower
overhead than calling get_random_bytes() directly.

With this patch applied:
$ cat /proc/sys/kernel/random/entropy_avail
2731
$ cat /proc/sys/kernel/random/entropy_avail
2802
$ cat /proc/sys/kernel/random/entropy_avail
2878

Analyzed by John Sobecki.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <aedilger@gmail.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Arnd Bergmann <arnn@arndb.de>
Cc: John Sobecki <john.sobecki@oracle.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agolib-bitmapc-speed-up-bitmap_find_free_region-fix
Andrew Morton [Thu, 9 May 2013 23:57:25 +0000 (09:57 +1000)]
lib-bitmapc-speed-up-bitmap_find_free_region-fix

reduce scope of locals, remove barely comprehensible comment

Cc: Chanho Min <chanho.min@lge.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Nadia Yvette Chambers <nyc@holomorphy.com>
Cc: anish singh <anish198519851985@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agolib/bitmap.c: speed up bitmap_find_free_region
Chanho Min [Thu, 9 May 2013 23:57:24 +0000 (09:57 +1000)]
lib/bitmap.c: speed up bitmap_find_free_region

In bitmap_find_free_region(), if we skip the all-ones words and find bits
in a not-all-ones word, we can improve performance of it.

For example, If bitmap_find_free_region() is called with order=0, First,
It scans bitmap array by the increment of long type, then find 1 free bit
within 1 long type value.  In 32 bits system and 1024 bits size, in the
worst case, We need 1024 for-loops to find 1 free bit.  But, If this is
applied, it takes 64 for-loops.  Instead, It will be needed additional
if-comparison of every word and It can take time slightly as 'Test case
3'.  But, In many cases, It will speed up significantly.

Test cases bellows show the time taken to execute bitmap_find_free_region()
before and after patch.

Test condition : ARMv7 1GHZ 32 bits system, 1024 bits size, gcc 4.6.2

Test case 1: order is 0. all bits are one except that last one bit is zero.
 before patch: 29727 ns
 after patch: 2349 ns

Test case 2: order is 1. all bits are one except that last 2 contiguous bits
are zero.
 before patch: 15475 ns
 after patch: 2225 ns

Test case 3: order is 1. all words are not-all-ones and don't have 2 contiguous
bits except that last 2 contiguous are zero.
 before patch: 15475 ns
 after patch: 16131 ns

Signed-off-by: Chanho Min <chanho.min@lge.com>
Cc: Nadia Yvette Chambers <nyc@holomorphy.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: anish singh <anish198519851985@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/leds/leds-ot200.c: fix error caused by shifted mask
Christian Gmeiner [Thu, 9 May 2013 23:57:24 +0000 (09:57 +1000)]
drivers/leds/leds-ot200.c: fix error caused by shifted mask

During the development of this driver an in-house register documentation
was used.  The last week some integration tests were done and this problem
was found.  It turned out that the released register documentation is
wrong.

The fix is very simple: shift all masks by one.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Bryan Wu <cooloney@gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: pcf50633: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:24 +0000 (09:57 +1000)]
backlight: pcf50633: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d063100 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: lp8788: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:23 +0000 (09:57 +1000)]
backlight: lp8788: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d063100 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ep93xx: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:23 +0000 (09:57 +1000)]
backlight: ep93xx: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d063100 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: atmel-pwm-bl: remove unnecessary platform_set_drvdata()
Jingoo Han [Thu, 9 May 2013 23:57:23 +0000 (09:57 +1000)]
backlight: atmel-pwm-bl: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or on
probe failure, since commit 0998d06310 ("device-core: Ensure drvdata =
NULL when no driver is bound").  Thus, it is not needed to manually clear
the device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agosmp: Give WARN()ing when calling smp_call_function_many()/single() in serving irq
Chuansheng Liu [Thu, 9 May 2013 23:57:22 +0000 (09:57 +1000)]
smp: Give WARN()ing when calling smp_call_function_many()/single() in serving irq

Currently the functions smp_call_function_many()/single() will give a
WARN()ing only in the case of irqs_disabled(), but that check is not
enough to guarantee execution of the SMP cross-calls.

In many other cases such as softirq handling/interrupt handling, the two
APIs still can not be called, just as the smp_call_function_many()
comments say:

  * You must not call this function with disabled interrupts or from a
  * hardware interrupt handler or from a bottom half handler. Preemption
  * must be disabled when calling this function.

There is a real case for softirq DEADLOCK case:

CPUA                            CPUB
                                spin_lock(&spinlock)
                                Any irq coming, call the irq handler
                                irq_exit()
spin_lock_irq(&spinlock)
<== Blocking here due to
CPUB hold it
                                  __do_softirq()
                                    run_timer_softirq()
                                      timer_cb()
                                        call smp_call_function_many()
                                          send IPI interrupt to CPUA
                                            wait_csd()

Then both CPUA and CPUB will be deadlocked here.

So we should give a warning in the nmi, hardirq or softirq context as well.

Moreover, adding one new macro in_serving_irq() which indicates we are
processing nmi, hardirq or sofirq.

Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Lai Jiangshan <eag0628@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agopanic-add-cpu-pid-to-warn_slowpath_common-in-warning-printks-fix
Andrew Morton [Thu, 9 May 2013 23:57:22 +0000 (09:57 +1000)]
panic-add-cpu-pid-to-warn_slowpath_common-in-warning-printks-fix

remove stray quote

Cc: Alex Thorlton <athorlton@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agopanic: add cpu/pid to warn_slowpath_common in WARNING printk()s
Alex Thorlton [Thu, 9 May 2013 23:57:21 +0000 (09:57 +1000)]
panic: add cpu/pid to warn_slowpath_common in WARNING printk()s

Add the cpu/pid that called WARN() so that the stack traces can be matched
up with the WARNING messages.

Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Reviewed-by: Robin Holt <holt@sgi.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Vikram Mulukutla <markivx@codeaurora.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodump_stack-serialize-the-output-from-dump_stack-fix
Andrew Morton [Thu, 9 May 2013 23:57:21 +0000 (09:57 +1000)]
dump_stack-serialize-the-output-from-dump_stack-fix

- fix comment indenting
- avoid inclusion of <asm/> files - use <linux/> where possible
- fix uniprocessor build (__dump_stack undefined)
- remove unneeded ifdef around smp.h inclusion

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Robin Holt <holt@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: athorlton@sgi.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodump_stack: serialize the output from dump_stack()
Alex Thorlton [Thu, 9 May 2013 23:57:21 +0000 (09:57 +1000)]
dump_stack: serialize the output from dump_stack()

tAdd adds functionality to serialize the output from dump_stack() to avoid
mangling of the output when dump_stack is called simultaneously from
multiple cpus.

Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Reported-by: Russ Anderson <rja@sgi.com>
Reviewed-by: Robin Holt <holt@sgi.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoerr.h: IS_ERR() can accept __user pointers
Dan Carpenter [Thu, 9 May 2013 23:57:20 +0000 (09:57 +1000)]
err.h: IS_ERR() can accept __user pointers

Sparse generates a false positive when you pass a __user or __iomem
pointer to the IS_ERR() functions.

drivers/rtc/rtc-ds1286.c:344:36: sparse: incorrect type in argument 1 (different address spaces)
drivers/rtc/rtc-ds1286.c:344:36:    expected void const *ptr
drivers/rtc/rtc-ds1286.c:344:36:    got unsigned int [noderef] [usertype] <asn:2>*rtcregs

We can silence these by adding a __force here and upgrading to the
latest git release of Sparse.

This change has no effect when using current Sparse releases.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Christopher Li <sparse@chrisli.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: add vm event counters for balloon pages compaction
Rafael Aquini [Thu, 9 May 2013 23:57:20 +0000 (09:57 +1000)]
mm: add vm event counters for balloon pages compaction

Introduce a new set of vm event counters to keep track of ballooned pages
compaction activity.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomemcg-debugging-facility-to-access-dangling-memcgs-fix
Andrew Morton [Thu, 9 May 2013 23:57:19 +0000 (09:57 +1000)]
memcg-debugging-facility-to-access-dangling-memcgs-fix

fix up Kconfig text

Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomemcg: debugging facility to access dangling memcgs
Glauber Costa [Thu, 9 May 2013 23:57:19 +0000 (09:57 +1000)]
memcg: debugging facility to access dangling memcgs

If memcg is tracking anything other than plain user memory (swap, tcp buf
mem, or slab memory), it is possible - and normal - that a reference will
be held by the group after it is dead.  Still, for developers, it would be
extremely useful to be able to query about those states during debugging.

This patch provides a debugging facility in the root memcg, so we can
inspect which memcgs still have pending objects, and what is the cause of
this state.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm/dmapool.c: fix null dev in dma_pool_create()
Xi Wang [Thu, 9 May 2013 23:57:19 +0000 (09:57 +1000)]
mm/dmapool.c: fix null dev in dma_pool_create()

A few drivers invoke dma_pool_create() with a null dev.  Note that dev is
dereferenced in dev_to_node(dev), causing a null pointer dereference.

A long term solution is to disallow null dev.  Once the drivers are fixed,
we can simplify the core code here.  For now we add WARN_ON(!dev) to
notify the driver maintainers and avoid the null pointer dereference.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/usb/gadget/amd5536udc.c: avoid calling dma_pool_create() with NULL dev
Xi Wang [Thu, 9 May 2013 23:57:19 +0000 (09:57 +1000)]
drivers/usb/gadget/amd5536udc.c: avoid calling dma_pool_create() with NULL dev

Calling dma_pool_create() with dev==NULL will oops on a NUMA machine.
Rather than changing dma_pool_create() we wish to disallow passing
dev==NULL.  This requires fixing up the small number of drivers which are
passing in dev==NULL.

Use &dev->pdev->dev instead of NULL.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrop_caches: add some documentation and info message
Michal Hocko [Thu, 9 May 2013 23:57:18 +0000 (09:57 +1000)]
drop_caches: add some documentation and info message

I would like to resurrect Dave's patch.  The last time it was posted was
here https://lkml.org/lkml/2010/9/16/250 and there didn't seem to be any
strong opposition.

Kosaki was worried about possible excessive logging when somebody drops
caches too often (but then he claimed he didn't have a strong opinion on
that) but I would say opposite.  If somebody does that then I would really
like to know that from the log when supporting a system because it almost
for sure means that there is something fishy going on.  It is also worth
mentioning that only root can write drop caches so this is not an flooding
attack vector.

I am bringing that up again because this can be really helpful when
chasing strange performance issues which (surprise surprise) turn out to
be related to artificially dropped caches done because the admin thinks
this would help...

I have just refreshed the original patch on top of the current mm tree
but I could live with KERN_INFO as well if people think that KERN_NOTICE
is too hysterical.

: From: Dave Hansen <dave@linux.vnet.ibm.com>
: Date: Fri, 12 Oct 2012 14:30:54 +0200
:
: There is plenty of anecdotal evidence and a load of blog posts
: suggesting that using "drop_caches" periodically keeps your system
: running in "tip top shape".  Perhaps adding some kernel
: documentation will increase the amount of accurate data on its use.
:
: If we are not shrinking caches effectively, then we have real bugs.
: Using drop_caches will simply mask the bugs and make them harder
: to find, but certainly does not fix them, nor is it an appropriate
: "workaround" to limit the size of the caches.
:
: It's a great debugging tool, and is really handy for doing things
: like repeatable benchmark runs.  So, add a bit more documentation
: about it, and add a little KERN_NOTICE.  It should help developers
: who are chasing down reclaim-related bugs.

[mhocko@suse.cz: refreshed to current -mm tree]
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: memmap_init_zone() performance improvement
Mike Yoknis [Thu, 9 May 2013 23:57:18 +0000 (09:57 +1000)]
mm: memmap_init_zone() performance improvement

We have what we call an "architectural simulator".  It is a computer
program that pretends that it is a computer system.  We use it to test the
firmware before real hardware is available.  We have booted Linux on our
simulator.  As you would expect it takes longer to boot on the simulator
than it does on real hardware.

With my patch - boot time 41 minutes
Without patch - boot time 94 minutes

These numbers do not scale linearly to real hardware.  But indicate to me
a place where Linux can be improved.

memmap_init_zone() loops through every Page Frame Number (pfn), including
pfn values that are within the gaps between existing memory sections.  The
unneeded looping will become a boot performance issue when machines
configure larger memory ranges that will contain larger and more numerous
gaps.

The code will skip across invalid pfn values to reduce the number of loops
executed.

Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoinclude-linux-mmzoneh-cleanups-fix
Andrew Morton [Thu, 9 May 2013 23:57:18 +0000 (09:57 +1000)]
include-linux-mmzoneh-cleanups-fix

use zone_idx() some more, further simplify is_highmem()

Cc: Lin Feng <linfeng@cn.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoinclude/linux/mmzone.h: cleanups
Andrew Morton [Thu, 9 May 2013 23:57:17 +0000 (09:57 +1000)]
include/linux/mmzone.h: cleanups

- implement zone_idx() in C to fix its references-args-twice macro bug

- use zone_idx() in is_highmem() to remove large amounts of silly fluff.

Cc: Lin Feng <linfeng@cn.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm/page_alloc.c: fix watermark check in __zone_watermark_ok()
Tomasz Stanislawski [Thu, 9 May 2013 23:57:17 +0000 (09:57 +1000)]
mm/page_alloc.c: fix watermark check in __zone_watermark_ok()

The watermark check consists of two sub-checks.  The first one is:

if (free_pages <= min + lowmem_reserve)
return false;

The check assures that there is minimal amount of RAM in the zone.  If CMA
is used then the free_pages is reduced by the number of free pages in CMA
prior to the over-mentioned check.

if (!(alloc_flags & ALLOC_CMA))
free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);

This prevents the zone from being drained from pages available for
non-movable allocations.

The second check prevents the zone from getting too fragmented.

for (o = 0; o < order; o++) {
free_pages -= z->free_area[o].nr_free << o;
min >>= 1;
if (free_pages <= min)
return false;
}

The field z->free_area[o].nr_free is equal to the number of free pages
including free CMA pages.  Therefore the CMA pages are subtracted twice.
This may cause a false positive fail of __zone_watermark_ok() if the CMA
area gets strongly fragmented.  In such a case there are many 0-order free
pages located in CMA.  Those pages are subtracted twice therefore they
will quickly drain free_pages during the check against fragmentation.  The
test fails even though there are many free non-cma pages in the zone.

This patch fixes this issue by subtracting CMA pages only for a purpose of
(free_pages <= min + lowmem_reserve) check.

Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm-remove-compressed-copy-from-zram-in-memory-fix
Andrew Morton [Thu, 9 May 2013 23:57:17 +0000 (09:57 +1000)]
mm-remove-compressed-copy-from-zram-in-memory-fix

tweak comment text

Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Shaohua Li <shli@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: remove compressed copy from zram in-memory
Minchan Kim [Thu, 9 May 2013 23:57:16 +0000 (09:57 +1000)]
mm: remove compressed copy from zram in-memory

Swap subsystem does lazy swap slot free with expecting the page would be
swapped out again so we can avoid unnecessary write.

But the problem in in-memory swap(ex, zram) is that it consumes memory
space until vm_swap_full(ie, used half of all of swap device) condition
meet.  It could be bad if we use multiple swap device, small in-memory
swap and big storage swap or in-memory swap alone.

This patch makes swap subsystem free swap slot as soon as swap-read is
completed and make the swapcache page dirty so the page should be written
out the swap device to reclaim it.  It means we never lose it.

I tested this patch with kernel compile workload.

1. before

compile time : 9882.42
zram max wasted space by fragmentation: 13471881 byte
memory space consumed by zram: 174227456 byte
the number of slot free notify: 206684

2. after

compile time : 9653.90
zram max wasted space by fragmentation: 11805932 byte
memory space consumed by zram: 154001408 byte
the number of slot free notify: 426972

* changelog from v3
  * Rebased on next-20130508

* changelog from v1
  * Add more comment

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>
Cc: Shaohua Li <shli@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: remove free_area_cache
Michel Lespinasse [Thu, 9 May 2013 23:57:16 +0000 (09:57 +1000)]
mm: remove free_area_cache

Since all architectures have been converted to use vm_unmapped_area(),
there is no remaining use for the free_area_cache.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm, memcg: don't take task_lock in task_in_mem_cgroup
David Rientjes [Thu, 9 May 2013 23:57:16 +0000 (09:57 +1000)]
mm, memcg: don't take task_lock in task_in_mem_cgroup

For processes that have detached their mm's, task_in_mem_cgroup()
unnecessarily takes task_lock() when rcu_read_lock() is all that is
necessary to call mem_cgroup_from_task().

While we're here, switch task_in_mem_cgroup() to return bool.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm/thp: use the correct function when updating access flags
Aneesh Kumar K.V [Thu, 9 May 2013 23:57:15 +0000 (09:57 +1000)]
mm/thp: use the correct function when updating access flags

We should use pmdp_set_access_flags to update access flags.  Archs like
powerpc use extra checks(_PAGE_BUSY) when updating a hugepage PTE.  A
set_pmd_at doesn't do those checks.  We should use set_pmd_at only when
updating a none hugepage PTE.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>a
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agopagemap: prepare to reuse constant bits with page-shift
Pavel Emelyanov [Thu, 9 May 2013 23:57:15 +0000 (09:57 +1000)]
pagemap: prepare to reuse constant bits with page-shift

In order to reuse bits from pagemap entries gracefully, we leave the
entries as is but on pagemap open emit a warning in dmesg, that bits 55-60
are about to change in a couple of releases.  Next, if a user issues
soft-dirty clear command via the clear_refs file (it was disabled before
v3.9) we assume that he's aware of the new pagemap format, note that fact
and report the bits in pagemap in the new manner.

The "migration strategy" looks like this then:

1. existing users are not affected -- they don't touch soft-dirty feature, thus
   see old bits in pagemap, but are warned and have time to fix themselves
2. those who use soft-dirty know about new pagemap format
3. some time soon we get rid of any signs of page-shift in pagemap as well as
   this trick with clear-soft-dirty affecting pagemap format.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: soft-dirty bits for user memory changes tracking
Pavel Emelyanov [Thu, 9 May 2013 23:57:15 +0000 (09:57 +1000)]
mm: soft-dirty bits for user memory changes tracking

The soft-dirty is a bit on a PTE which helps to track which pages a task
writes to. In order to do this tracking one should

  1. Clear soft-dirty bits from PTEs ("echo 4 > /proc/PID/clear_refs)
  2. Wait some time.
  3. Read soft-dirty bits (55'th in /proc/PID/pagemap2 entries)

To do this tracking, the writable bit is cleared from PTEs when the
soft-dirty bit is. Thus, after this, when the task tries to modify a page
at some virtual address the #PF occurs and the kernel sets the soft-dirty
bit on the respective PTE.

Note, that although all the task's address space is marked as r/o after the
soft-dirty bits clear, the #PF-s that occur after that are processed fast.
This is so, since the pages are still mapped to physical memory, and thus
all the kernel does is finds this fact out and puts back writable, dirty
and soft-dirty bits on the PTE.

Another thing to note, is that when mremap moves PTEs they are marked with
soft-dirty as well, since from the user perspective mremap modifies the
virtual memory at mremap's new address.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agopagemap-introduce-pagemap_entry_t-without-pmshift-bits-v4
Pavel Emelyanov [Thu, 9 May 2013 23:57:14 +0000 (09:57 +1000)]
pagemap-introduce-pagemap_entry_t-without-pmshift-bits-v4

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agopagemap: introduce pagemap_entry_t without pmshift bits
Pavel Emelyanov [Thu, 9 May 2013 23:57:14 +0000 (09:57 +1000)]
pagemap: introduce pagemap_entry_t without pmshift bits

These bits are always constant (== PAGE_SHIFT) and just occupy space in
the entry.  Moreover, in next patch we will need to report one more bit in
the pagemap, but all bits are already busy on it.

That said, describe the pagemap entry that has 6 more free zero bits.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoclear_refs: introduce private struct for mm_walk
Pavel Emelyanov [Thu, 9 May 2013 23:57:14 +0000 (09:57 +1000)]
clear_refs: introduce private struct for mm_walk

In the next patch the clear-refs-type will be required in
clear_refs_pte_range funciton, so prepare the walk->private to carry this
info.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoclear_refs: sanitize accepted commands declaration
Pavel Emelyanov [Thu, 9 May 2013 23:57:14 +0000 (09:57 +1000)]
clear_refs: sanitize accepted commands declaration

This is the implementation of the soft-dirty bit concept that should help
keep track of changes in user memory, which in turn is very-very required
by the checkpoint-restore project (http://criu.org).

To create a dump of an application(s) we save all the information about it
to files, and the biggest part of such dump is the contents of tasks' memory.
However, there are usage scenarios where it's not required to get _all_ the
task memory while creating a dump. For example, when doing periodical dumps,
it's only required to take full memory dump only at the first step and then
take incremental changes of memory. Another example is live migration. We
copy all the memory to the destination node without stopping all tasks, then
stop them, check for what pages has changed, dump it and the rest of the state,
then copy it to the destination node. This decreases freeze time significantly.

That said, some help from kernel to watch how processes modify the contents
of their memory is required.

The proposal is to track changes with the help of new soft-dirty bit this way:

1. First do "echo 4 > /proc/$pid/clear_refs".
   At that point kernel clears the soft dirty _and_ the writable bits from all
   ptes of process $pid. From now on every write to any page will result in #pf
   and the subsequent call to pte_mkdirty/pmd_mkdirty, which in turn will set
   the soft dirty flag.

2. Then read the /proc/$pid/pagemap2 and check the soft-dirty bit reported there
   (the 55'th one). If set, the respective pte was written to since last call
   to clear refs.

The soft-dirty bit is the _PAGE_BIT_HIDDEN one. Although it's used by kmemcheck,
the latter one marks kernel pages with it, while the former bit is put on user
pages so they do not conflict to each other.

This patch:

A new clear-refs type will be added in the next patch, so prepare
code for that.

[akpm@linux-foundation.org: don't assume that sizeof(enum clear_refs_types) == sizeof(int)]
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agowatchdog: trigger all-cpu backtrace when locked up and going to panic
Sasha Levin [Thu, 9 May 2013 23:57:13 +0000 (09:57 +1000)]
watchdog: trigger all-cpu backtrace when locked up and going to panic

Send an NMI to all CPUs when a lockup is detected and the lockup watchdog
code is configured to panic.  This gives us a fairly uptodate snapshot of
all CPUs in the system.

This lets us get stack trace of all CPUs which makes life easier trying to
debug a deadlock, and the NMI doesn't change anything since the next step
is a kernel panic.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoblock: restore /proc/partitions to not display non-partitionable removable devices
Josh Hunt [Thu, 9 May 2013 23:57:13 +0000 (09:57 +1000)]
block: restore /proc/partitions to not display non-partitionable removable devices

We found with newer kernels we started seeing the cdrom device showing
up in /proc/partitions, but it was not there before.

Looking into this I found that commit d27769ec ("block: add
GENHD_FL_NO_PART_SCAN") introduces this change in behavior.  It's not
clear to me from the commit's changelog if this change was intentional or
not.  This comment still remains: /* Don't show non-partitionable
removeable devices or empty devices */ so I've decided to send a patch to
restore the behavior of not printing unpartitionable removable devices.

Signed-off-by: Josh Hunt <johunt@akamai.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agolglock: update lockdep annotations to report recursive local locks
Michel Lespinasse [Thu, 9 May 2013 23:57:12 +0000 (09:57 +1000)]
lglock: update lockdep annotations to report recursive local locks

Oleg Nesterov recently noticed that the lockdep annotations in lglock.c
are not sufficient to detect some obvious deadlocks, such as
lg_local_lock(LOCK) + lg_local_lock(LOCK) or spin_lock(X) +
lg_local_lock(Y) vs lg_local_lock(Y) + spin_lock(X).

Both issues are easily fixed by indicating to lockdep that lglock's local
locks are not recursive.  We shouldn't use the rwlock acquire/release
functions here, as lglock doesn't share the same semantics.  Instead we
can base our lockdep annotations on the lock_acquire_shared (for local
lglock) and lock_acquire_exclusive (for global lglock) helpers.

I am not proposing new lglock specific helpers as I don't see the point of
the existing second level of helpers :)

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agolockdep: introduce lock_acquire_exclusive/shared helper macros
Michel Lespinasse [Thu, 9 May 2013 23:57:12 +0000 (09:57 +1000)]
lockdep: introduce lock_acquire_exclusive/shared helper macros

In lockdep.h, the spinlock/mutex/rwsem/rwlock/lock_map acquire macros have
different definitions based on the value of CONFIG_PROVE_LOCKING.  We have
separate ifdefs for each of these definitions, which seems redundant.

Introduce lock_acquire_{exclusive,shared,shared_recursive} helpers which
will have different definitions based on CONFIG_PROVE_LOCKING.  Then all
other helper macros can be defined based on the above ones, which reduces
the amount of ifdefined code.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoipvs: change type of netns_ipvs->sysctl_sync_qlen_max
Zhang Yanfei [Thu, 9 May 2013 23:57:12 +0000 (09:57 +1000)]
ipvs: change type of netns_ipvs->sysctl_sync_qlen_max

This member of struct netns_ipvs is calculated from nr_free_buffer_pages
so change its type to unsigned long in case of overflow.  Also, type of
its related proc var sync_qlen_max and the return type of function
sysctl_sync_qlen_max() should be changed to unsigned long, too.

Besides, the type of ipvs_master_sync_state->sync_queue_len should be
changed to unsigned long accordingly.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoconfigfs: use capped length for ->store_attribute()
Dan Carpenter [Thu, 9 May 2013 23:57:11 +0000 (09:57 +1000)]
configfs: use capped length for ->store_attribute()

The difference between "count" and "len" is that "len" is capped at 4095.
Changing it like this makes it match how sysfs_write_file() is
implemented.

This is a static analysis patch.  I haven't found any store_attribute()
functions where this change makes a difference.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/infiniband/core/cm.c: convert to using idr_alloc_cyclic()
Zhao Hongjiang [Thu, 9 May 2013 23:57:11 +0000 (09:57 +1000)]
drivers/infiniband/core/cm.c: convert to using idr_alloc_cyclic()

commit 3e6628c4b347 ("idr: introduce idr_alloc_cyclic()") adds a new
idr_alloc_cyclic routine and converts several of these users to it.  This
is just a missed one - add it.

Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoposix_timers: Fix racy timer delta caching on task exit
Frederic Weisbecker [Thu, 9 May 2013 23:57:10 +0000 (09:57 +1000)]
posix_timers: Fix racy timer delta caching on task exit

When a task exits, we perform a caching of the remaining cputime delta
before expiring of its timers.

This is done from the following places:

* When the task is reaped. We iterate through its list of
  posix cpu timers and store the remaining timer delta to
  the timer struct instead of the absolute value.
  (See posix_cpu_timers_exit() / posix_cpu_timers_exit_group() )

* When we call posix_cpu_timer_get() or posix_cpu_timer_schedule().
  If the timer's task is considered dying when watched from these
  places, the same conversion from absolute to relative expiry time
  is performed. Then the given task's reference is released.
  (See clear_dead_task() ).

The relevance of this caching is questionable but this is another
and deeper debate.

The big issue here is that these two sources of caching don't mix
up very well together.

More specifically, the caching can easily be done twice, resulting
in a wrong delta as it gets spuriously substracted a second time by
the elapsed clock. This can happen in the following scenario:

1) The task exits and gets reaped: we call posix_cpu_timers_exit()
   and the absolute timer expiry values are converted to a relative
   delta.

2) timer_gettime() -> posix_cpu_timer_get() is called and relies on
   clear_dead_task() because  tsk->exit_state == EXIT_DEAD.
   The delta gets substracted again by the elapsed clock and we return
   a wrong result.

To fix this, just remove the caching done on task reaping time.  It
doesn't bring much value on its own.  The caching done from
posix_cpu_timer_get/schedule is enough.

And it would also be hard to get it really right: we could make it put and
clear the target task in the timer struct so that readers know if they are
dealing with a relative cached of absolute value.  But it would be racy.
The only safe way to do it would be to lock the itimer->it_lock so that we
know nobody reads the cputime expiry value while we modify it and its
target task reference.  Doing so would involve some funny workarounds to
avoid circular lock against the sighand lock.  There is just no reason to
maintain this.

The user visible effect of this patch can be observed by running the
following code: it creates a subthread that launches a posix cputimer
which expires after 10 seconds. But then the subthread only busy loops for 2
seconds and exits. The parent reaps the subthread and read the timer value.
Its expected value should the be the initial timer's expiration value
minus the cputime elapsed in the subthread. Roughly 10 - 2 = 8 seconds:

#include <sys/time.h>
#include <stdio.h>
#include <unistd.h>
#include <time.h>
#include <pthread.h>

static timer_t id;
static struct itimerspec val = { .it_value.tv_sec = 10, }, new;

static void *thread(void *unused)
{
int err;
struct timeval start, end, diff;

timer_create(CLOCK_THREAD_CPUTIME_ID, NULL, &id);
if (err < 0) {
perror("Can't create timer\n");
return NULL;
}

/* Arm 10 sec timer */
err = timer_settime(id, 0, &val, NULL);
if (err < 0) {
perror("Can't set timer\n");
return NULL;
}

/* Exit after 2 seconds of execution */
gettimeofday(&start, NULL);
        do {
gettimeofday(&end, NULL);
timersub(&end, &start, &diff);
} while (diff.tv_sec < 2);

return NULL;
}

int main(int argc, char **argv)
{
pthread_t pthread;
int err;

err = pthread_create(&pthread, NULL, thread, NULL);
if (err) {
perror("Can't create thread\n");
return -1;
}
pthread_join(pthread, NULL);
/* Just wait a little bit to make sure the child got reaped */
sleep(1);
err = timer_gettime(id, &new);
if (err)
perror("Can't get timer value\n");
printf("%d %ld\n", new.it_value.tv_sec, new.it_value.tv_nsec);

return 0;
}

Before the patch:

       $ ./posix_cpu_timers
       6 2278074

After the patch:

      $ ./posix_cpu_timers
      8 1158766

Before the patch, the elapsed time got two more seconds spuriously accounted.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoposix-timers: correctly get dying task time sample in posix_cpu_timer_schedule()
Frederic Weisbecker [Thu, 9 May 2013 23:57:10 +0000 (09:57 +1000)]
posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule()

In order to re-arm a timer after it fired, we take a sample of the current
process or thread cputime.

If the task is dying though, we don't arm anything but we cache the
remaining timer expiration delta for further reads.

Something similar is performed in posix_cpu_timer_get() but here we forget
to take the process wide cputime sample before caching it.

As a result we are storing random stack content, leading every further
reads of that timer to return junk values.

Fix this by taking the appropriate sample in the case of process wide
timers.

This probably doesn't matter much in practice because, at this stage, the
thread is the last one in the group and we reached exit_notify().  This
implies that we called exit_itimers() and there should be no more timers
to handle for that task.

So this is likely dead code anyway but let's fix the current logic
and the warning that came along:

    kernel/posix-cpu-timers.c: In function 'posix_cpu_timer_schedule':
    kernel/posix-cpu-timers.c:1127: warning: 'now' may be used uninitialized in this function

Then we can start to think further about cleaning up that code.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoselftests: add basic posix timers selftests
Frederic Weisbecker [Thu, 9 May 2013 23:57:10 +0000 (09:57 +1000)]
selftests: add basic posix timers selftests

Add some initial basic tests on a few posix timers interface such as
setitimer() and timer_settime().

These simply check that expiration happens in a reasonable timeframe after
expected elapsed clock time (user time, user + system time, real time,
...).

This is helpful for finding basic breakages while hacking
on this subsystem.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoposix_cpu_timers: consolidate expired timers check
Frederic Weisbecker [Thu, 9 May 2013 23:57:09 +0000 (09:57 +1000)]
posix_cpu_timers: consolidate expired timers check

Consolidate the common code amongst per thread and per process timers list
on tick time.

List traversal, expiry check and subsequent updates can be shared in a
common helper.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoposix_cpu_timers: consolidate timer list cleanups
Frederic Weisbecker [Thu, 9 May 2013 23:57:09 +0000 (09:57 +1000)]
posix_cpu_timers: consolidate timer list cleanups

Cleaning up the posix cpu timers on task exit shares some common code
among timer list types, most notably the list traversal and expiry time
update.

Unify this in a common helper.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoposix_cpu_timer: consolidate expiry time type
Frederic Weisbecker [Thu, 9 May 2013 23:57:09 +0000 (09:57 +1000)]
posix_cpu_timer: consolidate expiry time type

The posix cpu timer expiry time is stored in a union of two types: a 64
bits field if we rely on scheduler precise accounting, or a cputime_t if
we rely on jiffies.

This results in quite some duplicate code and special cases to handle the
two types.

Just unify this into a single 64 bits field.  cputime_t can always fit
into it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agocyber2000fb: avoid palette corruption at higher clocks
Ondrej Zary [Thu, 9 May 2013 23:57:08 +0000 (09:57 +1000)]
cyber2000fb: avoid palette corruption at higher clocks

When 1280x1024@75Hz mode is set, console palette is not set properly -
sometimes the background is white, sometimes yellow and text colors are
also messed up.  This does not happen at 1280x1024@60Hz and below.

It seems that the HW needs some time before setting the palette - maybe
the PLL needs more time to lock at higher speeds.  This patch fixes the
problem but without knowing what register to check for PLL lock(?), the
delay might be excessive.

On Fri, 28 Jan 2011 18:15:37 +0000
Russell King <rmk@arm.linux.org.uk> wrote:

> On Tue, Jan 18, 2011 at 01:14:24PM -0800, Andrew Morton wrote:
> > Russell, I have an (old) note here that this is awaiting an ack from
> > yourself?
>
> Well, I can reproduce this problem on the Netwinders here.  I'm not sure
> that we should delay all mode switches by one second - and any attempt
> to reduce this value does result in the palette not being set correctly.
>
> For 1280x1024-75, the dotclock is 135MHz, which gives a PLL values of
> 0x41 and 0x06.  That's: M=0x41+1, N=0x06+1, P=0x00 (top 2 bits of 0x06)
> -> Q=1
>
>  Fpll = 14.31818MHz * M / N
>  Fout = Fpll / Q
>
> The PLL itself is formed by dividing the 14-ish MHz frequency by N and
> phase comparing the output of the VCO, divided by M, and adjusting the
> VCO until the two correlate.  As VCOs typically tend to have a limited
> range, it's normal to divide the output frequency to produce a greater
> range - and in this case that's done by Q.
>
> For the 800x600-100 copied from /etc/fb.modes, this has a dotclock of
> 67.5MHz, which is exactly half this rate.  The PLL values for this are:
> M=0x41+1, N=0x06+1, P=0x01, giving PLL values of 0x41 and 0x46.
>
> Booting with 800x600-100 does not suffer the problem.  So it's not
> related to PLL lock time.  There's something else going on.
>
> Another experiment I tried was forcing the PLL values to produce 108MHz
> instead of 135MHz.  108MHz is the dotclock for 1280x1024-60.  This too
> doesn't suffer the problem.
>
> I've also tried chosing other delay values.  100ms is too short and
> produces the problem, but 1s works.  1s for a PLL to lock is a hell of
> a time, especially for a PLL operating in the MHz range.
>
> I've tried setting the PLL to a known good freqency, and then switching
> to 135MHz - the problem persists.  It's not like 135MHz is reaching the
> limits - it'll go up to 206MHz.
>
> So, I don't think this has anything to do with PLL locking.  I think
> there's something else going on which isn't immediately obvious - maybe
> bandwidth starvation preventing us from writing properly to the palette?
> As it's a horrible VGA, where you write the same register multiple times
> I wouldn't be surprised if some writes were going missing.
>
> I'll see if I can play around with it some more this evening, but I've
> spent an awful long time on just this issue already this afternoon...
>
> I think further investigation needs to happen on this patch before it's
> acceptable.  Or maybe we should prevent the cyberpro coming up in

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrm/fb-helper: don't sleep for screen unblank when an oops is in progress
Daniel Vetter [Thu, 9 May 2013 23:57:08 +0000 (09:57 +1000)]
drm/fb-helper: don't sleep for screen unblank when an oops is in progress

Otherwise the system will burn even brighter and worse, leave the user
wondering what's going on exactly.

Since we already have a panic handler which will (try) to restore the
entire fbdev console mode, we can just bail out.  Inspired by a patch from
Konstantin Khlebnikov.  The callchain leading to this, cut&pasted from
Konstantin's original patch:

callstack:
panic()
bust_spinlocks(1)
unblank_screen()
vc->vc_sw->con_blank()
fbcon_blank()
fb_blank()
info->fbops->fb_blank()
drm_fb_helper_blank()
drm_fb_helper_dpms()
drm_modeset_lock_all()
mutex_lock(&dev->mode_config.mutex)

Note that the entire locking in the fb helper around panic/sysrq and kdbg
is ...  non-existant.  So we have a decent change of blowing up
everything.  But since reworking this ties in with funny concepts like the
fbdev notifier chain or the impressive things which happen around
console_lock while oopsing, I'll leave that as an exercise for braver
souls than me.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Dave Airlie <airlied@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoaudit: fix mq_open and mq_unlink to add the MQ root as a hidden parent audit_names...
Jeff Layton [Thu, 9 May 2013 23:57:08 +0000 (09:57 +1000)]
audit: fix mq_open and mq_unlink to add the MQ root as a hidden parent audit_names record

The old audit PATH records for mq_open looked like this:

type=PATH msg=audit(1366282323.982:869): item=1 name=(null) inode=6777
dev=00:0c mode=041777 ouid=0 ogid=0 rdev=00:00
obj=system_u:object_r:tmpfs_t:s15:c0.c1023
type=PATH msg=audit(1366282323.982:869): item=0 name="test_mq" inode=26732
dev=00:0c mode=0100700 ouid=0 ogid=0 rdev=00:00
obj=staff_u:object_r:user_tmpfs_t:s15:c0.c1023

...with the audit related changes that went into 3.7, they now look like this:

type=PATH msg=audit(1366282236.776:3606): item=2 name=(null) inode=66655
dev=00:0c mode=0100700 ouid=0 ogid=0 rdev=00:00
obj=staff_u:object_r:user_tmpfs_t:s15:c0.c1023
type=PATH msg=audit(1366282236.776:3606): item=1 name=(null) inode=6926
dev=00:0c mode=041777 ouid=0 ogid=0 rdev=00:00
obj=system_u:object_r:tmpfs_t:s15:c0.c1023
type=PATH msg=audit(1366282236.776:3606): item=0 name="test_mq"

Both of these look wrong to me.  As Steve Grubb pointed out:

"What we need is 1 PATH record that identifies the MQ. The other PATH
 records probably should not be there."

Fix it to record the mq root as a parent, and flag it such that it should
be hidden from view when the names are logged, since the root of the mq
filesystem isn't terribly interesting.  With this change, we get a single
PATH record that looks more like this:

type=PATH msg=audit(1368021604.836:484): item=0 name="test_mq" inode=16914
dev=00:0c mode=0100644 ouid=0 ogid=0 rdev=00:00
obj=unconfined_u:object_r:user_tmpfs_t:s0

In order to do this, a new audit_inode_parent_hidden() function is added.
If we do it this way, then we avoid having the existing callers of
audit_inode needing to do any sort of flag conversion if auditing is
inactive.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reported-by: Jiri Jaburek <jjaburek@redhat.com>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agox86: make 'mem=' option to work for efi platform
Wen Congyang [Thu, 9 May 2013 23:57:07 +0000 (09:57 +1000)]
x86: make 'mem=' option to work for efi platform

Current mem boot option only can work for non efi environment.  If the
user specifies add_efi_memmap, it cannot work for efi environment.  In the
efi environment, we call e820_add_region() to add the memory map.  So we
can modify __e820_add_region() and the mem boot option can work for efi
environment.

Note: Only E820_RAM is limited, and BOOT_SERVICES_{CODE,DATA} are always
mapped(If its address >= mem_limit, the memory won't be freed in
efi_free_boot_services()).

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Rob Landley <rob@landley.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agosound/soc/codecs/si476x.c: don't use 0bNNN
Andrew Morton [Thu, 9 May 2013 23:57:07 +0000 (09:57 +1000)]
sound/soc/codecs/si476x.c: don't use 0bNNN

spacr64 gcc-3.4.5 (at least) spits this back.

Cc: Andrey Smirnov <andrey.smirnov@convergeddevices.net>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agorandom: fix accounting race condition with lockless irq entropy_count update
Jiri Kosina [Thu, 9 May 2013 23:57:06 +0000 (09:57 +1000)]
random: fix accounting race condition with lockless irq entropy_count update

Commit 902c098a3663 ("random: use lockless techniques in the interrupt
path") turned IRQ path from being spinlock protected into lockless
cmpxchg-retry update.

That commit removed r->lock serialization between crediting entropy bits
from IRQ context and accounting when extracting entropy on userspace read
path, but didn't turn the r->entropy_count reads/updates in account() to
use cmpxchg as well.

It has been observed, that under certain circumstances this leads to
read() on /dev/urandom to return 0 (EOF), as r->entropy_count gets
corrupted and becomes negative, which in turn results in propagating 0 all
the way from account() to the actual read() call.

Convert the accounting code to be the proper lockless counterpart of what
has been partially done by 902c098a3663.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Greg KH <greg@kroah.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/char/random.c: fix priming of last_data
Jarod Wilson [Thu, 9 May 2013 23:57:06 +0000 (09:57 +1000)]
drivers/char/random.c: fix priming of last_data

Commit ec8f02da9e ("random: prime last_data value per fips requirements")
added priming of last_data per fips requirements.  Unfortuantely, it did
so in a way that can lead to multiple threads all incrementing nbytes, but
only one actually doing anything with the extra data, which leads to some
fun random corruption and panics.

The fix is to simply do everything needed to prime last_data in a single
shot, so there's no window for multiple cpus to increment nbytes -- in
fact, we won't even increment or decrement nbytes anymore, we'll just
extract the needed EXTRACT_SIZE one time per pool and then carry on with
the normal routine.

All these changes have been tested across multiple hosts and architectures
where panics were previously encoutered.  The code changes are are
strictly limited to areas only touched when when booted in fips mode.

This change should also go into 3.8-stable, to make the myriads of fips
users on 3.8.x happy.

Signed-off-by: Jarod Wilson <jarod@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Jan Stodola <jstodola@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Matt Mackall <mpm@selenic.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agokmsg-honor-dmesg_restrict-sysctl-on-dev-kmsg-fix
Andrew Morton [Thu, 9 May 2013 23:57:06 +0000 (09:57 +1000)]
kmsg-honor-dmesg_restrict-sysctl-on-dev-kmsg-fix

use pr_warn_once()

Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agokmsg: honor dmesg_restrict sysctl on /dev/kmsg
Kees Cook [Thu, 9 May 2013 23:57:05 +0000 (09:57 +1000)]
kmsg: honor dmesg_restrict sysctl on /dev/kmsg

The dmesg_restrict sysctl currently covers the syslog method for access
dmesg, however /dev/kmsg isn't covered by the same protections.  Most
people haven't noticed because util-linux dmesg(1) defaults to using the
syslog method for access in older versions.  With util-linux dmesg(1)
defaults to reading directly from /dev/kmsg.

To fix /dev/kmsg, let's compare the existing interfaces and what they allow:

- /proc/kmsg allows:
 - open (SYSLOG_ACTION_OPEN) if CAP_SYSLOG since it uses a destructive
   single-reader interface (SYSLOG_ACTION_READ).
 - everything, after an open.

- syslog syscall allows:
 - anything, if CAP_SYSLOG.
 - SYSLOG_ACTION_READ_ALL and SYSLOG_ACTION_SIZE_BUFFER, if dmesg_restrict==0.
 - nothing else (EPERM).

The use-cases were:
- dmesg(1) needs to do non-destructive SYSLOG_ACTION_READ_ALLs.
- sysklog(1) needs to open /proc/kmsg, drop privs, and still issue the
  destructive SYSLOG_ACTION_READs.

AIUI, dmesg(1) is moving to /dev/kmsg, and systemd-journald doesn't clear
the ring buffer.

Based on the comments in devkmsg_llseek, it sounds like actions besides
reading aren't going to be supported by /dev/kmsg (i.e.
SYSLOG_ACTION_CLEAR), so we have a strict subset of the non-destructive
syslog syscall actions.

To this end, move the check as Josh had done, but also rename the
constants to reflect their new uses (SYSLOG_FROM_CALL becomes
SYSLOG_FROM_READER, and SYSLOG_FROM_FILE becomes SYSLOG_FROM_PROC).
SYSLOG_FROM_READER allows non-destructive actions, and SYSLOG_FROM_PROC
allows destructive actions after a capabilities-constrained
SYSLOG_ACTION_OPEN check.

- /dev/kmsg allows:
 - open if CAP_SYSLOG or dmesg_restrict==0
 - reading/polling, after open

Addresses https://bugzilla.redhat.com/show_bug.cgi?id=903192

Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Josh Boyer <jwboyer@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoMAINTAINERS: update Hyper-V file list
Haiyang Zhang [Thu, 9 May 2013 23:57:05 +0000 (09:57 +1000)]
MAINTAINERS: update Hyper-V file list

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoipcsem-fix-semctl-getncnt-fix
Andrew Morton [Thu, 9 May 2013 23:57:05 +0000 (09:57 +1000)]
ipcsem-fix-semctl-getncnt-fix

same coding-style fixes

Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoipcsem-fix-semctl-getzcnt-fix
Andrew Morton [Thu, 9 May 2013 23:57:04 +0000 (09:57 +1000)]
ipcsem-fix-semctl-getzcnt-fix

checkpatch fixes

Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agokernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()
Chen Gang [Thu, 9 May 2013 23:57:03 +0000 (09:57 +1000)]
kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()

audit_add_tree_rule() must set 'rule->tree = NULL;' firstly, to protect
the rule itself freed in kill_rules().

The reason is when it is killed, the 'rule' itself may have already
released, we should not access it.  one example: we add a rule to an
inode, just at the same time the other task is deleting this inode.

The work flow for adding a rule:

    audit_receive() -> (need audit_cmd_mutex lock)
      audit_receive_skb() ->
        audit_receive_msg() ->
          audit_receive_filter() ->
            audit_add_rule() ->
              audit_add_tree_rule() -> (need audit_filter_mutex lock)
                ...
                unlock audit_filter_mutex
                get_tree()
                ...
                iterate_mounts() -> (iterate all related inodes)
                  tag_mount() ->
                    tag_trunk() ->
                      create_trunk() -> (assume it is 1st rule)
                        fsnotify_add_mark() ->
                          fsnotify_add_inode_mark() ->  (add mark to inode->i_fsnotify_marks)
                        ...
                        get_tree(); (each inode will get one)
                ...
                lock audit_filter_mutex

The work flow for deleting an inode:

    __destroy_inode() ->
     fsnotify_inode_delete() ->
       __fsnotify_inode_delete() ->
        fsnotify_clear_marks_by_inode() ->  (get mark from inode->i_fsnotify_marks)
          fsnotify_destroy_mark() ->
           fsnotify_destroy_mark_locked() ->
             audit_tree_freeing_mark() ->
               evict_chunk() ->
                 ...
                 tree->goner = 1
                 ...
                 kill_rules() ->   (assume current->audit_context == NULL)
                   call_rcu() ->   (rule->tree != NULL)
                     audit_free_rule_rcu() ->
                       audit_free_rule()
                 ...
                 audit_schedule_prune() ->  (assume current->audit_context == NULL)
                   kthread_run() ->    (need audit_cmd_mutex and audit_filter_mutex lock)
                     prune_one() ->    (delete it from prue_list)
                       put_tree(); (match the original get_tree above)

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm compaction: fix of improper cache flush in migration code
Leonid Yegoshin [Thu, 9 May 2013 23:57:03 +0000 (09:57 +1000)]
mm compaction: fix of improper cache flush in migration code

Page 'new' during MIGRATION can't be flushed with flush_cache_page().
Using flush_cache_page(vma, addr, pfn) is justified only if the page is
already placed in process page table, and that is done right after
flush_cache_page().  But without it the arch function has no knowledge of
process PTE and does nothing.

Besides that, flush_cache_page() flushes an application cache page, but
the kernel has a different page virtual address and dirtied it.

Replace it with flush_dcache_page(new) which is the proper usage.

The old page is flushed in try_to_unmap_one() before migration.

This bug takes place in Sead3 board with M14Kc MIPS CPU without cache
aliasing (but Harvard arch - separate I and D cache) in tight memory
environment (128MB) each 1-3days on SOAK test.  It fails in cc1 during
kernel build (SIGILL, SIGBUS, SIGSEG) if CONFIG_COMPACTION is switched ON.

Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: Leonid Yegoshin <yegoshin@mips.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Miller <davem@davemloft.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agorapidio/tsi721: fix bug in MSI interrupt handling
Alexandre Bounine [Thu, 9 May 2013 23:57:03 +0000 (09:57 +1000)]
rapidio/tsi721: fix bug in MSI interrupt handling

Fix bug in MSI interrupt handling which causes loss of event notifications.

Typical indication of lost MSI interrupts are stalled message and doorbell
transfers between RapidIO endpoints.  To avoid loss of MSI interrupts all
interrupts from the device must be disabled on entering the interrupt
handler routine and re-enabled when exiting it.  Re-enabling device
interrupts will trigger new MSI message(s) if Tsi721 registered new events
since entering interrupt handler routine.

This patch is applicable to kernel versions starting from v3.2.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoshm-fix-null-pointer-deref-when-userspace-specifies-invalid-hugepage-size-fix
Andrew Morton [Thu, 9 May 2013 23:57:02 +0000 (09:57 +1000)]
shm-fix-null-pointer-deref-when-userspace-specifies-invalid-hugepage-size-fix

eliminate ugly 80-col tricks

Cc: Dave Jones <davej@redhat.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Li Zefan <lizfan@huawei.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>