git.karo-electronics.de Git - karo-tx-linux.git/log

mqueue: revert bump up DFLT_*MAX

Mqueue limitation is slightly naieve parameter likes other ipcs because
unprivileged user can consume kernel memory by using ipcs.

Thus, too aggressive raise bring us security issue. Example, current
setting allow evil unprivileged user use 256GB (= 256 * 1024 * 1024*1024)
and it's enough large to system will belome unresponsive. Don't do that.

Instead, every admin should adjust the knobs for their own systems.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Joe Korty <joe.korty@ccur.com>
Cc: Amerigo Wang <amwang@redhat.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ipc/mqueue: update maximums for the mqueue subsystem

Commit b231cca4381e ("message queues: increase range limits") changed the
maximum size of a message in a message queue from INT_MAX to 8192*128.
Unfortunately, we had customers that relied on a size much larger than
8192*128 on their production systems.  After reviewing POSIX, we found
that it is silent on the maximum message size.  We did find a couple other
areas in which it was not silent.  Fix up the mqueue maximums so that the
customer's system can continue to work, and document both the POSIX and
real world requirements in ipc_namespace.h so that we don't have this
issue crop back up.

Also, commit 9cf18e1dd74cd0 ("ipc: HARD_MSGMAX should be higher not lower
on 64bit") fiddled with HARD_MSGMAX without realizing that the number was
intentionally in place to limit the msg queue depth to one that was small
enough to kmalloc an array of pointers (hence why we divided 128k by
sizeof(long)).  If we wish to meet POSIX requirements, we have no choice
but to change our allocation to a vmalloc instead (at least for the large
queue size case).  With that, it's possible to increase our allowed
maximum to the POSIX requirements (or more if we choose).

Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ipc/mqueue: enforce hard limits

In two places we don't enforce the hard limits for CAP_SYS_RESOURCE apps.
In preparation for making more reasonable hard limits, start enforcing
them even on CAP_SYS_RESOURCE.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ipc/mqueue: switch back to using non-max values on create

Commit b231cca4381 ("message queues: increase range limits") changed how
we create a queue that does not include an attr struct passed to open so
that it creates the queue with whatever the maximum values are.  However,
if the admin has set the maximums to allow flexibility in creating a queue
(aka, both a large size and large queue are allowed, but combined they
create a queue too large for the RLIMIT_MSGQUEUE of the user), then
attempts to create a queue without an attr struct will fail.  Switch back
to using acceptable defaults regardless of what the maximums are.

Note: so far, we only know of a few applications that rely on this
behavior (specifically, set the maximums in /proc, then run the
application which calls mq_open() without passing in an attr struct, and
the application expects the newly created message queue to have the
maximum sizes that were set in /proc used on the mq_open() call, and all
of those applications that we know of are actually part of regression test
suites that were coded to do something like this:

for size in 4096 65536 $((1024 * 1024)) $((16 * 1024 * 1024)); do
echo $size > /proc/sys/fs/mqueue/msgsize_max
mq_open || echo "Error opening mq with size $size"
done

These test suites that depend on any behavior like this are broken.  The
concept that programs should rely upon the system wide maximum in order to
get their desired results instead of simply using a attr struct to specify
what they want is fundamentally unfriendly programming practice for any
multi-tasking OS.

Fixing this will break those few apps that we know of (and those app
authors recognize the brokenness of their code and the need to fix it).
However, the following patch "mqueue: separate mqueue default value"
allows a workaround in the form of new knobs for the default msg queue
creation parameters for any software out there that we don't already know
about that might rely on this behavior at the moment.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ipc/mqueue: cleanup definition names and locations

Since commit b231cca4381 ("message queues: increase range limits") on Oct
18, 2008, calls to mq_open() that did not pass in an attribute struct and
expected to get default values for the size of the queue and the max
message size now get the system wide maximums instead of hardwired
defaults like they used to get.

This was uncovered when one of the earlier patches in this patch set
increased the default system wide maximums at the same time it increased
the hard ceiling on the system wide maximums (a customer specifically
needed the hard ceiling brought back up, the new ceiling that commit
b231cca4381 introduced was too low for their production systems).  By
increasing the default maximums and not realising they were tied to any
attempt to create a message queue without an attribute struct, I had
inadvertently made it such that all message queue creation attempts
without an attribute struct were failing because the new default maximums
would create a queue that exceeded the default rlimit for message queue
bytes.

As a result, the system wide defaults were brought back down to their
previous levels, and the system wide ceilings on the maximums were raised
to meet the customer's needs.  However, the fact that the no attribute
struct behavior of mq_open() could be broken by changing the system wide
maximums for message queues was seen as fundamentally broken itself.  So
we hardwired the no attribute case back like it used to be.  But, then we
realized that on the very off chance that some piece of software in the
wild depended on that behavior, we could work around that issue by adding
two new knobs to /proc that allowed setting the defaults for message
queues created without an attr struct separately from the system wide
maximums.

What is not an option IMO is to leave the current behavior in place.  No
piece of software should ever rely on setting the system wide maximums in
order to get a desired message queue.  Such a reliance would be so
fundamentally multitasking OS unfriendly as to not really be tolerable.
Fortunately, we don't know of any software in the wild that uses this
except for a regression test program that caught the issue in the first
place.  If there is though, we have made accommodations with the two new
/proc knobs (and that's all the accommodations such fundamentally broken
software can be allowed)..

This patch:

The various defines for minimums and maximums of the sysctl controllable
mqueue values are scattered amongst different files and named
inconsistently.  Move them all into ipc_namespace.h and make them have
consistent names.  Additionally, make the number of queues per namespace
also have a minimum and maximum and use the same sysctl function as the
other two settable variables.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ipc/sem.c: alternatives to preempt_disable()

ipc/sem.c uses a custom wakeup scheme that relies on preempt_disable().
On -RT, this causes increased latencies and debug warnings.

The patch adds two additional schemes:
- one built around a completion - could be better for -RT kernels
- one built around a spinlock - unfortunately it's broken
- and the current one

My preferred solution would be the spinlock implementation: RT would use
premptible spinlocks, mainline normal spinlocks. Thus both get the
optimal implementation without any special code in ipc/sem.c.
Unfortunately, I don't see how it could be fixed.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fork: call complete_vfork_done() after clearing child_tid and flushing rss-counters

Child should wake up the parent from vfork() only after finishing all
operations with shared mm. There is no sense in using
CLONE_CHILD_CLEARTID together with CLONE_VFORK, but it looks more accurate
now.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

proc: use IS_ERR_OR_NULL()

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

procfs: use more apprioriate types when dumping /proc/N/stat

- use int fpr priority and nice, since task_nice()/task_prio() return that

- field 24: get_mm_rss() returns unsigned long

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

proc: pass "fd" by value in /proc/*/{fd,fdinfo} code

Pass "fd" directly, not via pointer -- one less memory read.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

proc: don't do dummy rcu_read_lock/rcu_read_unlock on error path

rcu_read_lock()/rcu_read_unlock() is nop for TINY_RCU, but is not a nop
for, say, PREEMPT_RCU.

proc_fill_cache() is called without RCU lock, there is no need to
lock/unlock on error path, simply jump out of the loop.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

proc: use mm_access() instead of ptrace_may_access()

mm_access() handles this much better, and avoids some race conditions.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

proc: remove mm_for_maps()

mm_for_maps() is a simple wrapper for mm_access(), and the name is
misleading, so just remove it and use mm_access() directly.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

proc: unify ptrace_may_access() locking code

Unify mutex_lock+ptrace_may_access code and rename lock_trace() to
task_access_lock(), which better describes what it does.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

proc-clean-up-proc-pid-environ-handling-checkpatch-fixes

ERROR: "foo* bar" should be "foo *bar"
#26: FILE: fs/proc/base.c:680:
+static int __mem_open(struct inode* inode, struct file* file, unsigned int mode)

ERROR: "foo* bar" should be "foo *bar"
#26: FILE: fs/proc/base.c:680:
+static int __mem_open(struct inode* inode, struct file* file, unsigned int mode)

ERROR: "foo* bar" should be "foo *bar"
#43: FILE: fs/proc/base.c:708:
+static int mem_open(struct inode* inode, struct file* file)

ERROR: "foo* bar" should be "foo *bar"
#43: FILE: fs/proc/base.c:708:
+static int mem_open(struct inode* inode, struct file* file)

ERROR: "foo* bar" should be "foo *bar"
#55: FILE: fs/proc/base.c:809:
+static int environ_open(struct inode* inode, struct file* file)

ERROR: "foo* bar" should be "foo *bar"
#55: FILE: fs/proc/base.c:809:
+static int environ_open(struct inode* inode, struct file* file)

total: 6 errors, 0 warnings, 100 lines checked

./patches/proc-clean-up-proc-pid-environ-handling.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

proc: clean up /proc/<pid>/environ handling

Similar to e268337dfe2 ("proc: clean up and fix /proc/<pid>/mem
handling"), move the check of permission to open(), this will simplify
read() code.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

stack usage: add pid to warning printk in check_stack_usage

In embedded systems, sometimes the same program (busybox) is the cause of
multiple warnings. Outputting the pid with the program name in the
warning printk helps distinguish which instances of a program are using
the stack most.

This is a small patch, but useful.

Signed-off-by: Tim Bird <tim.bird@am.sony.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

cred: remove task_is_dead() from __task_cred() validation

commit 8f92054e ("CRED: Fix __task_cred()'s lockdep check and banner
comment"):

    add the following validation condition:

        task->exit_state >= 0

    to permit the access if the target task is dead and therefore
    unable to change its own credentials.

OK, but afaics currently this can only help wait_task_zombie() which calls
__task_cred() without rcu lock.

Remove this validation and change wait_task_zombie() to use task_uid()
instead.  This means we do rcu_read_lock() only to shut up the lockdep,
but we already do the same in, say, wait_task_stopped().

task_is_dead() should die, task->exit_state != 0 means that this task has
passed exit_notify(), only do_wait-like code paths should use this.

Unfortunately, we can't kill task_is_dead() right now, it has already
acquired buggy users in drivers/staging.  The fix already exists.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kmod: avoid deadlock from recursive kmod call

The system deadlocks (at least since 2.6.10) when
call_usermodehelper(UMH_WAIT_EXEC) request triggered
call_usermodehelper(UMH_WAIT_PROC) request.

This is because "khelper thread is waiting for the worker thread at
wait_for_completion() in do_fork() since the worker thread was created
with CLONE_VFORK flag" and "the worker thread cannot call complete()
because do_execve() is blocked at UMH_WAIT_PROC request" and "the khelper
thread cannot start processing UMH_WAIT_PROC request because the khelper
thread is waiting for the worker thread at wait_for_completion() in
do_fork()".

In order to avoid deadlock, do not try to call wait_for_completion() in
call_usermodehelper_exec() if the worker thread was created by khelper
thread with CLONE_VFORK flag.

The easiest example to observe this deadlock is to use a corrupted
/sbin/hotplug binary (like shown below).

  # : > /tmp/dummy
  # chmod 755 /tmp/dummy
  # echo /tmp/dummy > /proc/sys/kernel/hotplug
  # modprobe whatever

call_usermodehelper("/tmp/dummy", UMH_WAIT_EXEC) is called from
kobject_uevent_env() in lib/kobject_uevent.c upon loading/unloading a
module.  do_execve("/tmp/dummy") triggers a call to
request_module("binfmt-0000") from search_binary_handler() which in turn
calls call_usermodehelper(UMH_WAIT_PROC).

There are various hooks called during do_execve() operation (e.g.
security_bprm_check(), audit_bprm(), "struct
linux_binfmt"->load_binary()).  If one of such hooks triggers
UMH_WAIT_EXEC, this deadlock will happen even if /sbin/hotplug is not
corrupted.

[akpm@linux-foundation.org: add comment to kmod_thread_locker]
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kmod.c: fix kernel-doc warning

Warning(kernel/kmod.c:419): No description found for parameter 'depth'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kmod: move call_usermodehelper_fns() to .c file and unexport all it's helpers

If we move call_usermodehelper_fns() to kmod.c file and EXPORT_SYMBOL it
we can avoid exporting all it's helper functions:
call_usermodehelper_setup
call_usermodehelper_setfns
call_usermodehelper_exec
And make all of them static to kmod.c

Since the optimizer will see all these as a single call site it will
inline them inside call_usermodehelper_fns(). So we loose the call to
_fns but gain 3 calls to the helpers. (Not that it matters)

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kmod: convert two call sites to call_usermodehelper_fns()

Both kernel/sys.c && security/keys/request_key.c where inlining the exact
same code as call_usermodehelper_fns(); So simply convert these sites to
directly use call_usermodehelper_fns().

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kmod: unexport call_usermodehelper_freeinfo()

call_usermodehelper_freeinfo() is not used outside of kmod.c. So unexport
it, and make it static to kmod.c

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

HPFS: remove PRINTK() macro

The PRINTK() macro isn't really used. Let's just remove it because it
is ugly and out of date.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/rtc/rtc-m41t93.c: don't let get_time() reset M41T93_FLAG_OF

If the rtc reports the time might be invalid due to oscillator failure,
M41T93_FLAG_OF flag must not be reset by get_time() as the read operation
doesn't make the time valid.

Without this patch, only the first get_time() reported an invalid time,
the second get_time() reported a valid time althought the reported time is
probably wrong due to oscillator failure.

Instead of resetting in get_time(), with this patch M41T93_FLAG_OF is
reset in set_time() when a valid time is to be written.

Signed-off-by: Nikolaus Voss <n.voss@weinmann.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

rtc: ds1307: add trickle charger support

Some DS13XX devices have "trickle chargers". Its configuration register
is at different locations, the setup is the same, though. Since the
configuration is board specific, introduce a platform_data to this driver.
Tested with a DS1339 on a custom board.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Alessandro Zummo <alessandro.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

rtc: ds1307: remove superfluous initialization

ds1307 was kzalloced, so no need to zero members of the struct.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

rtc-rename-config_rtc_mxc-to-config_rtc_drv_mxc-fix

Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

rtc: rename CONFIG_RTC_MXC to CONFIG_RTC_DRV_MXC

In order to keep consistency with other rtc drivers,rename CONFIG_RTC_MXC
to CONFIG_RTC_DRV_MXC.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/rtc/Kconfig: place RTC_DRV_IMXDI and RTC_MXC under "on-CPU RTC drivers"

RTC_DRV_IMXDI and RTC_MXC are on-chip RTC modules, so move them under
"on-CPU RTC drivers" selection menu.

While at it change the dependency of RTC_DRV_IMXDI from ARCH_MX25 to
SOC_IMX25.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/rtc/rtc-pcf8563.c: add RTC_VL_READ/RTC_VL_CLR ioctl feature

Changes are based on arch/cris/arch-v10/drivers/pcf8563.c

Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Mikael Starvik <starvik@axis.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

rtc: add ioctl to get/clear battery low voltage status

Currently there is no generic way to get the RTC battery status within an
application. So add an ioctl to read the status bit. The idea is that
the bit is set once a low voltage is detected. It stays there until it is
reset using the RTC_VL_CLR ioctl.

Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/rtc/rtc-ep93xx.c: convert to use module_platform_driver()

Use module_platform_driver() to remove the boilerplate code.

Also, change the probe and remove functions to __devinit/__devexit.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

rtc/spear: add Device Tree probing capability

SPEAr platforms now support DT and so must convert all drivers support DT.
This patch adds DT probing support for rtc and updates its documentation
too.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Cc: Stefan Roese <sr@denx.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Rajeev Kumar <rajeev-dlh.kumar@st.com>
Cc: Rob Herring <robherring2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

checkpatch: suggest pr_<level> over printk(KERN_<LEVEL>

Suggest the shorter pr_<level> instead of printk(KERN_<LEVEL>.

Prefer to use pr_<level> over bare printks.
Prefer to use pr_warn over pr_warning.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

spinlock_debug: print kallsyms name for lock

When a spinlock warning is printed we usually get

BUG: spinlock bad magic on CPU#0, modprobe/111
lock: 0xdff09f38, .magic: 00000000, .owner: /0, .owner_cpu: 0

but it's nicer to print the symbol for the lock if we have it so that we
can avoid 'grep dff09f38 /proc/kallsyms' to find out which lock it was.
Use kallsyms to print the symbol name so we get something a bit easier to
read

BUG: spinlock bad magic on CPU#0, modprobe/112
lock: test_lock, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0

If the lock is not in kallsyms %ps will fall back to printing the address
directly.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

vsprintf: fix %ps on non symbols when using kallsyms

Using %ps in a printk format will sometimes fail silently and print the
empty string if the address passed in does not match a symbol that
kallsyms knows about.  But using %pS will fall back to printing the full
address if kallsyms can't find the symbol.  Make %ps act the same as %pS
by falling back to printing the address.

While we're here also make %ps print the module that a symbol comes from
so that it matches what %pS already does.  Take this simple function for
example (in a module):

static void test_printk(void)
{
int test;
pr_info("with pS: %pS\n", &test);
pr_info("with ps: %ps\n", &test);
}

Before this patch:

with pS: 0xdff7df44
with ps:

After this patch:

with pS: 0xdff7df44
with ps: 0xdff7df44

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

lib/bitmap.c: fix documentation for scnprintf() functions

The code comments for bscnl_emit() and bitmap_scnlistprintf() are
describing snprintf() return semantics, but these functions use
scnprintf() return semantics. Fix that, and document the
bitmap_scnprintf() return value as well.

Cc: Ryota Ozaki <ozaki.ryota@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

lib/string_helpers.c: make arrays static

Moving these arrays into static storage shrinks the kernel a bit:

   text    data     bss     dec     hex filename
    723     112      64     899     383 lib/string_helpers.o
    516     272      64     852     354 lib/string_helpers.o

Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

lib/test-kstrtox.c: mark const init data with __initconst instead of __initdata

As long as there is no other non-const variable marked __initdata in the
same compilation unit it doesn't hurt. If there were one however
compilation would fail with

error: $variablename causes a section type conflict

because a section containing const variables is marked read only and so
cannot contain non-const variables.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

list_debug: WARN for adding something already in the list

We were bitten by this at one point and added an additional sanity test
for DEBUG_LIST. You can't validly add a list_head to a list where either
prev or next is the same as the thing you're adding.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds-heartbeat-stop-on-shutdown-checkpatch-fixes

ERROR: space prohibited after that '!' (ctx:WxW)
#45: FILE: drivers/leds/ledtrig-heartbeat.c:121:
+ if( ! rc )
^

ERROR: space prohibited after that open parenthesis '('
#45: FILE: drivers/leds/ledtrig-heartbeat.c:121:
+ if( ! rc )

ERROR: space prohibited before that close parenthesis ')'
#45: FILE: drivers/leds/ledtrig-heartbeat.c:121:
+ if( ! rc )

ERROR: space required before the open parenthesis '('
#45: FILE: drivers/leds/ledtrig-heartbeat.c:121:
+ if( ! rc )

total: 4 errors, 0 warnings, 36 lines checked

./patches/leds-heartbeat-stop-on-shutdown.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Holler <holler@ahsoftware.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds: heartbeat: stop on shutdown

A halted kernel should not show a heartbeat.

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Cc: Shuah Khan <shuahkhan@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/leds/leds-lm3530.c: simplify als configuration on initialization

For better code readability, ALS code is moved to new a function -
lm3530_als_configure()

Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Shreshtha Kumar SAHU <shreshthakumar.sahu@stericsson.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

include/linux/led-lm3530.h: comment correction about the range of brightness

max brightness is 127, so the range of brt_val should be from 0 to 127

Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Shreshtha Kumar SAHU <shreshthakumar.sahu@stericsson.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds: change ledtrig-timer to use activated flag

Change existing timer trigger to use the new ->activated flag to set
activate successful status in activate routine and check it in deactivate
routine to do cleanup.

Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds: change existing triggers to use activated flag

Change existing triggers backlight, gpio, and heartbeat to use the new
->activated flag to set activate successful status in their activate
routines and check it in their deactivate routines to do cleanup.

Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds: add new field to led_classdev struct to save activation state

Add a new field to led_classdev to save activattion state after activate
routine is successful. This saved state is used in deactivate routine to
do cleanup such as removing device files, and free memory allocated during
activation. Currently trigger_data not being null is used for this
purpose.

Existing triggers will need changes to use this new field.

Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds: Use kcalloc instead of kzalloc to allocate array

The advantage of kcalloc is that will prevent integer overflows which
could result from the multiplication of number of elements and size and it
is also a bit nicer to read.

The semantic patch that makes this change is available
in https://lkml.org/lkml/2011/11/25/107

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds: simple_strtoul() cleanup

led-class.c and ledtrig-timer.c still use simple_strtoul(). Change them
to use kstrtoul() instead of obsolete simple_strtoul().

Also fix the existing int ret declaration to be ssize_t to match the
return type for _store functions in ledtrig-timer.c.

Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds: lm3556: Don't call kfree for the memory allocated by devm_kzalloc

The devm_* functions eliminate the need for manual resource releasing
and simplify error handling. Resources allocated by devm_* are freed
automatically on driver detach.

Thus adding kfree calls here will introduce double free bug.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Geon Si Jeong <gshark.jeong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds-add-led-driver-for-lm3556-chip-checkpatch-fixes

WARNING: please write a paragraph that describes the config symbol fully
#35: FILE: drivers/leds/Kconfig:405:
+config LEDS_LM3556

ERROR: "foo * bar" should be "foo *bar"
#204: FILE: drivers/leds/leds-lm3556.c:142:
+static int lm3556_read_reg(struct i2c_client *client, u8 reg, u8 * val)

total: 1 errors, 1 warnings, 736 lines checked

./patches/leds-add-led-driver-for-lm3556-chip.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Geon Si Jeong <gshark.jeong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds: add LED driver for lm3556 chip

A simple driver for the Texas Instruments LM3556 chip.

The LM3556 is a 4 MHz fixed-frequency synchronous boost converter plus
1.5A constant current driver for a high-current white LED. Datasheet:
www.national.com/ds/LM/LM3556.pdf

Tested on OMAP4430

Signed-off-by: Geon Si Jeong <gshark.jeong@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Daniel Jeong <daniel.jeong@ti.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Greg KH <greg@kroah.com>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Cc: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds-led-module-for-da9052-53-pmic-v2-fix

Cc: Ashish Jangam <ashish.jangam@kpitcummins.com>
Cc: David Dajun Chen <dchen@diasemi.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

leds: driver for DA9052/53 PMIC v2

LED Driver for Dialog Semiconductor DA9052/53 PMICs.

Signed-off-by: David Dajun Chen <dchen@diasemi.com>
Signed-off-by: Ashish Jangam <ashish.jangam@kpitcummins.com>
Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/leds/leds-lp5521.c: fix lp5521_read() error handling

Gcc 4.6.2 complains that:
drivers/leds/leds-lp5521.c: In function `lp5521_load_program':
drivers/leds/leds-lp5521.c:214:21: warning: `mode' may be used uninitialized in this function [-Wuninitialized]
drivers/leds/leds-lp5521.c: In function `lp5521_probe':
drivers/leds/leds-lp5521.c:788:5: warning: `buf' may be used uninitialized in this function [-Wuninitialized]
drivers/leds/leds-lp5521.c:740:6: warning: `ret' may be used uninitialized in this function [-Wuninitialized]

These are real problems if lp5521_read() returns an error. When that
happens we should handle it, instead of ignoring it or doing a bitwise OR
with all the other error codes and continuing.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Milo <Milo.Kim@ti.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fbdev: add events for early fb event support

Add FB_EARLY_EVENT_BLANK and FB_R_EARLY_EVENT_BLANK event mode supports.
first, fb_notifier_call_chain() is called with FB_EARLY_EVENT_BLANK and
fb_blank() of specific fb driver is called and then
fb_notifier_call_chain() is called with FB_EVENT_BLANK again at
fb_blank(). and if fb_blank() was failed then fb_nitifier_call_chain()
would be called with FB_R_EARLY_EVENT_BLANK to revert the previous
effects.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

lcd: add callbacks for early fb event blank support

This patchset adds early fb blank feature that a callback of lcd panel
driver is called prior to specific fb driver's one.  In the case of
MIPI-DSI based video mode LCD Panel, for lcd power off, the power off
commands should be transferred to lcd panel with display and mipi-dsi
controller enabled because the commands is set to lcd panel at vsync porch
period.  and in opposite case, the callback of fb driver should be called
prior to lcd panel driver's one because of same issue.  Also if fb_blank
mode is changed to FB_BLANK_POWERDOWN then display controller would be
off(clock disable) but lcd panel would be still on.  at this time, you
could see some issue like sparkling on lcd panel because video clock to be
delivered to ldi module of lcd panel was disabled.  this issue could
occurs for all lcd panels.

The callback order is as the following:

at fb_blank function of fbmem.c
-> fb_notifier_call_chain(FB_EARLY_EVENT_BLANK)
       -> lcd panel driver's early_set_power()
-> info->fbops->fb_blank()
       -> spcefic fb driver's fb_blank()
-> fb_notifier_call_chain(FB_EVENT_BLANK)
       -> lcd panel driver's set_power()
   -> fb_notifier_call_chain(FB_R_EARLY_EVENT_BLANK) if
info->fops->fb_blank() was failed.

fb_notifier_call_chain(FB_R_EARLY_EVENT_BLANK) would be called to revert
the effects of previous FB_EARLY_EVENT_BLANK call.  and note that if
early_set_power() of lcd_ops is NULL then early fb blank callback would be
ignored.

This patch:

Add early_set_power and r_early_set_power callbacks.  early_set_power
callback is called prior to fb_blank() of fbmem.c and r_early_set_power
callback is called if fb_blank() was failed to revert the effects of the
early_set_power call of lcd panel driver.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

blacklight: remove redundant spi driver bus initialization

In ancient times it was necessary to manually initialize the bus field of
an spi_driver to spi_bus_type. These days this is done in
spi_driver_register() so we can drop the manual assignment.

The patch was generated using the following coccinelle semantic patch:
// <smpl>
@@
identifier _driver;
@@
struct spi_driver _driver = {
.driver = {
- .bus = &spi_bus_type,
},
};
// </smpl>

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

MAINTAINERS: remove Alessandro

I'm definitely not up to date on the subject matter any more
and haven't been able to comment on the messages on the topic.

Signed-off-by: Alessandro Rubini <rubini@ipvvis.unipv.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

hamradio/scc: orphan driver in MAINTAINERS

The email address doesn't exist anymore and the website yields a 404.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

vsprintf-further-optimize-decimal-conversion-checkpatch-fixes

WARNING: line over 80 characters
#112: FILE: lib/vsprintf.c:136:
+ * (x * 0x00cd) >> 11     x <       1029 shorter code than * 0x67 (on i386)

ERROR: trailing statements should be on next line
#192: FILE: lib/vsprintf.c:178:
+ if (q == 0) return buf;

ERROR: trailing statements should be on next line
#195: FILE: lib/vsprintf.c:181:
+ if (r == 0) return buf;

ERROR: trailing statements should be on next line
#198: FILE: lib/vsprintf.c:184:
+ if (q == 0) return buf;

ERROR: trailing statements should be on next line
#201: FILE: lib/vsprintf.c:187:
+ if (r == 0) return buf;

ERROR: trailing statements should be on next line
#204: FILE: lib/vsprintf.c:190:
+ if (q == 0) return buf;

ERROR: trailing statements should be on next line
#207: FILE: lib/vsprintf.c:193:
+ if (r == 0) return buf;

ERROR: trailing statements should be on next line
#210: FILE: lib/vsprintf.c:196:
+ if (q == 0) return buf;

ERROR: space prohibited after that '&' (ctx:WxW)
#290: FILE: lib/vsprintf.c:267:
+ d2  = (h      ) & 0xffff;
                ^

ERROR: space prohibited before that close parenthesis ')'
#290: FILE: lib/vsprintf.c:267:
+ d2  = (h      ) & 0xffff;

total: 9 errors, 1 warnings, 310 lines checked

./patches/vsprintf-further-optimize-decimal-conversion.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

vsprintf-further-optimize-decimal-conversion-v2

Here is a new version. I also plugged a hole in num_to_str() -
it was assuming it's safe to call put_dec() for num=0.
(We never tripped over it before because the single caller
of num_to_str() takes care of that case).

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

vsprintf: further optimize decimal conversion

Previous code was using optimizations which were developed to work well
even on narrow-word CPUs (by today's standards).  But Linux runs only on
32-bit and wider CPUs.  We can use that.

First: using 32x32->64 multiply and trivial 32-bit shift, we can correctly
divide by 10 much larger numbers, and thus we can print groups of 9 digits
instead of groups of 5 digits.

Next: there are two algorithms to print larger numbers.  One is generic:
divide by 1000000000 and repeatedly print groups of (up to) 9 digits.
It's conceptually simple, but requires an (unsigned long long) /
1000000000 division.

Second algorithm splits 64-bit unsigned long long into 16-bit chunks,
manipulates them cleverly and generates groups of 4 decimal digits.  It so
happens that it does NOT require long long division.

If long is > 32 bits, division of 64-bit values is relatively easy, and we
will use the first algorithm.  If long long is > 64 bits (strange
architecture with VERY large long long), second algorithm can't be used,
and we again use the first one.

Else (if long is 32 bits and long long is 64 bits) we use second one.

And third: there is a simple optimization which takes fast path not only
for zero as was done before, but for all one-digit numbers.

In all tested cases new code is faster than old one, in many cases by 30%,
in few cases by more than 50% (for example, on x86-32, conversion of
12345678).  Code growth is ~0 in 32-bit case and ~130 bytes in 64-bit
case.

This patch is based upon an original from Michal Nazarewicz.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

vsprintf-correctly-handle-width-when-flag-used-in-%p-format-checkpatch-fixes

ERROR: "(foo*)" should be "(foo *)"
#20: FILE: lib/vsprintf.c:869:
+ int default_width = 2 * sizeof(void*) + (spec.flags & SPECIAL ? 2 : 0);

total: 1 errors, 0 warnings, 32 lines checked

./patches/vsprintf-correctly-handle-width-when-flag-used-in-%p-format.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

vsprintf: correctly handle width when '#' flag used in %#p format.

needs decent changelog!

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

sethostname/setdomainname: notify userspace when there is a change in uts_kern_table

sethostname() and setdomainname() notify userspace on failure (without
modifying uts_kern_table). Change things so that we only notify userspace
on success, when uts_kern_table was actually modified.

Signed-off-by: Sasikantha babu <sasikanth.v19@gmail.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: WANG Cong <amwang@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

introduce SIZE_MAX

ULONG_MAX is often used to check for integer overflow when calculating
allocation size. While ULONG_MAX happens to work on most systems, there
is no guarantee that `size_t' must be the same size as `long'.

This patch introduces SIZE_MAX, the maximum value of `size_t', to improve
portability and readability for allocation size validation.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Acked-by: Alex Elder <elder@dreamhost.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

parisc: use set_current_blocked() and block_sigmask()

As described in e6fa16ab ("signal: sigprocmask() should do
retarget_shared_pending()") the modification of current->blocked is
incorrect as we need to check whether the signal we're about to block is
pending in the shared queue.

Also, use the new helper function introduced in commit 5e6292c0f28f
("signal: add block_sigmask() for adding sigmask to current->blocked")
which centralises the code for updating current->blocked after
successfully delivering a signal and reduces the amount of duplicate code
across architectures. In the past some architectures got this code wrong,
so using this helper function should stop that from happening again.

Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

CodingStyle: add kmalloc_array() to memory allocators

Add the new kmalloc_array() to the list of general-purpose memory
allocators in chapter 14.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Acked-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

checkpatch: check for spin_is_locked()

spin_is_locked() is usually misued. In checkpatch.pl:

- warn when it is used at all

- error out when it is asserted on free, because that's usually broken
(e.g. doesn't work on on uni processor builds). Recommend
lockdep_assert_held() instead.

[joe@perches.com: some improvements]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

include/linux/spinlock.h: add a kerneldoc comment to spin_is_locked() that discourages its use

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

spinlockstxt-add-a-discussion-on-why-spin_is_locked-is-bad-fix

- s/hold/held in various places

- reindent for 80 cols

- a few grammatical tweaks

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

spinlocks.txt: add a discussion on why spin_is_locked() is bad

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/net/ethernet/smsc/smsc911x.h: use lockdep_assert_held() instead of home grown buggy construct

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/net/irda/sir_dev.c: remove spin_is_locked()

It's hard to imagine how this spin_is_locked() debugging check is not
totally racy. Remove it.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Samuel Ortiz <samuel@sortiz.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

futex: use lockdep_assert_held() for lock checking

Use lockdep_assert_held() for lock checking instead of a strange homegrown
variant. This removes the return for this case, but that is unlikely to
be useful anyway.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/huge_memory.c: use lockdep_assert_held()

Use lockdep_assert_held() to check for locks instead of an opencoded
variant.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

XFS: fix lock ASSERT on UP

ASSERT(!spin_is_locked()) doesn't work on UP builds. Replace with a
standard lockdep_assert_held()

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ben Myers <bpm@sgi.com>
Cc: Alex Elder <elder@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/scsi/aha152x.c: remove broken usage of spin_is_locked()

Remove racy usage of spin_is_locked. The author seems to have been
unclear on the concept of locking.

This is debug code normally not enabled, but I caught it on a tree sweep.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

sgi-xp: use lockdep_assert_held()

!spin_is_locked() is always true on a UP build, so it cannot be used for
assertions. Replace with lockdep_assert_held().

I realize UP builds are not very likely for this driver, but it's still
better to fix it.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

um/kernel/trap.c: port OOM changes to handle_page_fault()

Commit d065bd810b6deb6 ("mm: retry page fault when blocking on disk
transfer") and commit 37b23e0525 ("x86,mm: make pagefault killable")

The above commits introduced changes into the x86 pagefault handler
for making the page fault handler retryable as well as killable.

These changes reduce the mmap_sem hold time, which is crucial
during OOM killer invocation.

Port these changes to um.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

frv: use set_current_blocked() and block_sigmask()

As described in e6fa16ab ("signal: sigprocmask() should do
retarget_shared_pending()") the modification of current->blocked is
incorrect as we need to check whether the signal we're about to block is
pending in the shared queue.

Also, use the new helper function introduced in commit 5e6292c0f28f
("signal: add block_sigmask() for adding sigmask to current->blocked")
which centralises the code for updating current->blocked after
successfully delivering a signal and reduces the amount of duplicate code
across architectures. In the past some architectures got this code wrong,
so using this helper function should stop that from happening again.

Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

security/keys/keyctl.c: suppress memory allocation failure warning

This allocation may be large. The code is probing to see if it will
succeed and if not, it falls back to vmalloc(). We should suppress any
page-allocation failure messages when the fallback happens.

Reported-by: Dave Jones <davej@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/memcg: use vm_swappiness from target memory cgroup

Use vm_swappiness from memory cgroup which is triggered this memory
reclaim. This is more reasonable and allows to kill one argument.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memcg: revise the position of threshold index while unregistering event

Index current_threshold should point to threshold just below or equal to
usage. See below: http://www.spinics.net/lists/cgroups/msg00844.html

Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
Reviewed-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memcg: make threshold index in the right position

Index current_threshold may point to threshold that just equal to usage
after last call of __mem_cgroup_threshold.  But after registering a new
event, it will change (pointing to threshold just below usage).  So make
it consistent here.

For example:
now:
threshold array:  3  [5]  7  9   (usage = 6, [index] = 5)

next turn (after calling __mem_cgroup_threshold):
threshold array:  3   5  [7]  9   (usage = 7, [index] = 7)

after registering a new event (threshold = 10):
threshold array:  3  [5]  7  9  10 (usage = 7, [index] = 5)

Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memcg: remove redundant parentheses

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memcg: mark stat field of mem_cgroup struct as __percpu

It fixes a lot of sparse warnings.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memcg: remove unused variable

mm/memcontrol.c: In function `mc_handle_file_pte':
mm/memcontrol.c:5206:16: warning: variable `inode' set but not used [-Wunused-but-set-variable]

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memcg: mark more functions/variables as static

Based on sparse output.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/memcg: kill mem_cgroup_lru_del()

This patch kills mem_cgroup_lru_del(), we can use
mem_cgroup_lru_del_list() instead. On 0-order isolation we already have
right lru list id.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: remove lru type checks from __isolate_lru_page()

After patch "mm: forbid lumpy-reclaim in shrink_active_list()" we can
completely remove anon/file and active/inactive lru type filters from
__isolate_lru_page(), because isolation for 0-order reclaim always
isolates pages from right lru list. And pages-isolation for lumpy
shrink_inactive_list() or memory-compaction anyway allowed to isolate
pages from all evictable lru lists.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: mark mm-inline functions as __always_inline

GCC sometimes ignores "inline" directives even for small and simple functions.
This supposed to be fixed in gcc 4.7, but it was released only yesterday.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm-push-lru-index-into-shrink_active_list-fix

fix kerneldoc, per Minchan

Cc: Glauber Costa <glommer@parallels.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: push lru index into shrink_[in]active_list()

Let's toss lru index through call stack to isolate_lru_pages(), this is
better than its reconstructing from individual bits.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/memcg: move reclaim_stat into lruvec

With mem_cgroup_disabled() now explicit, it becomes clear that the
zone_reclaim_stat structure actually belongs in lruvec, per-zone when
memcg is disabled but per-memcg per-zone when it's enabled.

We can delete mem_cgroup_get_reclaim_stat(), and change
update_page_reclaim_stat() to update just the one set of stats, the one
which get_scan_count() will actually use.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/memcg: scanning_global_lru means mem_cgroup_disabled

Although one has to admire the skill with which it has been concealed,
scanning_global_lru(mz) is actually just an interesting way to test
mem_cgroup_disabled(). Too many developer hours have been wasted on
confusing it with global_reclaim(): just use mem_cgroup_disabled().

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memcg swap: use mem_cgroup_uncharge_swap()

That stuff __mem_cgroup_commit_charge_swapin() does with a swap entry, it
has a name and even a declaration: just use mem_cgroup_uncharge_swap().

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memcg swap: mem_cgroup_move_swap_account never needs fixup

The need_fixup arg to mem_cgroup_move_swap_account() is always false,
so just remove it.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>