git.karo-electronics.de Git - karo-tx-linux.git/log
11 years ago pagemap: prepare to reuse constant bits with page-shift
Pavel Emelyanov [Fri, 7 Jun 2013 00:07:32 +0000 (10:07 +1000)]
pagemap: prepare to reuse constant bits with page-shift

In order to reuse bits from pagemap entries gracefully, we leave the
entries as is but on pagemap open emit a warning in dmesg, that bits 55-60
are about to change in a couple of releases.  Next, if a user issues
soft-dirty clear command via the clear_refs file (it was disabled before
v3.9) we assume that he's aware of the new pagemap format, note that fact
and report the bits in pagemap in the new manner.

The "migration strategy" looks like this then:

1. existing users are not affected -- they don't touch soft-dirty feature, thus
   see old bits in pagemap, but are warned and have time to fix themselves
2. those who use soft-dirty know about new pagemap format
3. some time soon we get rid of any signs of page-shift in pagemap as well as
   this trick with clear-soft-dirty affecting pagemap format.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago soft-dirty: call mmu notifiers when write-protecting ptes
Pavel Emelyanov [Fri, 7 Jun 2013 00:07:31 +0000 (10:07 +1000)]
soft-dirty: call mmu notifiers when write-protecting ptes

As noticed by Xiao, since the soft-dirty clear command modifies page tables,
we have to flush TLBs and call mmu notifiers.  While the former is done by
the clear_refs engine itself, the latter still has to be added.

One thing to note about this -- in order not to call per-page invalidate
notifier (_all_ address space is about to be changed), the
_invalidate_range_start and _end are used.  But for those start and end
are not known exactly.  To address this, the same trick as in exit_mmap()
is used -- start is 0 and end is (unsigned long)-1.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago mm: soft-dirty bits for user memory changes tracking
Pavel Emelyanov [Fri, 7 Jun 2013 00:07:31 +0000 (10:07 +1000)]
mm: soft-dirty bits for user memory changes tracking

The soft-dirty is a bit on a PTE which helps to track which pages a task
writes to. In order to do this tracking one should

  1. Clear soft-dirty bits from PTEs ("echo 4 > /proc/PID/clear_refs")
  2. Wait some time.
  3. Read soft-dirty bits (bit 55 in /proc/PID/pagemap2 entries)

To do this tracking, the writable bit is cleared from PTEs when the
soft-dirty bit is. Thus, after this, when the task tries to modify a page
at some virtual address the #PF occurs and the kernel sets the soft-dirty
bit on the respective PTE.

Note that although the task's whole address space is marked as r/o after
the soft-dirty bits are cleared, the #PF-s that occur after that are
processed fast.  This is because the pages are still mapped to physical
memory, so all the kernel has to do is notice this fact and put the
writable, dirty and soft-dirty bits back on the PTE.

Also note that when mremap moves PTEs they are marked as soft-dirty as
well, since from the user's perspective mremap modifies the virtual memory
at mremap's new address.
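
The three tracking steps above can be sketched from the reader's side.  This
is a minimal sketch, not code from the patch: it assumes one 64-bit entry per
virtual page in the pagemap file (as in the existing pagemap format) and takes
the soft-dirty position, bit 55, from the text above.  The helper and macro
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Soft-dirty position in a pagemap entry, per the text; name is illustrative. */
#define PM_SOFT_DIRTY_BIT 55

/* Byte offset in the pagemap file of the entry that describes vaddr. */
static uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
        assert(page_size != 0);
        return (vaddr / page_size) * sizeof(uint64_t);
}

/* Returns 1 if the entry reports the page as written since the last clear. */
static int pagemap_entry_soft_dirty(uint64_t entry)
{
        return (int)((entry >> PM_SOFT_DIRTY_BIT) & 1);
}
```

A monitoring tool would read 8 bytes at pagemap_offset(vaddr, page_size) and
pass them to pagemap_entry_soft_dirty().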

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago pagemap-introduce-pagemap_entry_t-without-pmshift-bits-v4
Pavel Emelyanov [Fri, 7 Jun 2013 00:07:31 +0000 (10:07 +1000)]
pagemap-introduce-pagemap_entry_t-without-pmshift-bits-v4

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago pagemap: introduce pagemap_entry_t without pmshift bits
Pavel Emelyanov [Fri, 7 Jun 2013 00:07:30 +0000 (10:07 +1000)]
pagemap: introduce pagemap_entry_t without pmshift bits

These bits are always constant (== PAGE_SHIFT) and just occupy space in
the entry.  Moreover, in the next patch we will need to report one more bit
in the pagemap, but all bits of the entry are already in use.

That said, describe the pagemap entry that has 6 more free zero bits.
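
To illustrate why the field carries no information, here is a hedged sketch
of decoding an old-format entry.  The page-shift position (bits 55-60)
follows the text of this series; the PFN (bits 0-54) and present (bit 63)
positions are assumed from the kernel's pagemap documentation, and all names
are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define OLD_PM_PSHIFT_OFFSET 55                  /* bits 55-60 */
#define OLD_PM_PSHIFT_MASK   0x3fULL             /* 6 bits */
#define OLD_PM_PFN_MASK      ((1ULL << 55) - 1)  /* bits 0-54 */
#define OLD_PM_PRESENT       (1ULL << 63)        /* bit 63 */

/* Build an old-format entry for a present page. */
static uint64_t old_entry_make(uint64_t pfn, unsigned int pshift)
{
        assert(pshift <= OLD_PM_PSHIFT_MASK);
        return OLD_PM_PRESENT |
               ((uint64_t)pshift << OLD_PM_PSHIFT_OFFSET) |
               (pfn & OLD_PM_PFN_MASK);
}

/* Extract the 6-bit page-shift field. */
static unsigned int old_entry_pshift(uint64_t entry)
{
        return (unsigned int)((entry >> OLD_PM_PSHIFT_OFFSET) &
                              OLD_PM_PSHIFT_MASK);
}

/* Extract the page frame number. */
static uint64_t old_entry_pfn(uint64_t entry)
{
        return entry & OLD_PM_PFN_MASK;
}
```

With 4K pages every entry decodes to a shift of 12, so the six bits can be
reclaimed without losing anything a reader could not get elsewhere.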

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago clear_refs: introduce private struct for mm_walk
Pavel Emelyanov [Fri, 7 Jun 2013 00:07:30 +0000 (10:07 +1000)]
clear_refs: introduce private struct for mm_walk

In the next patch the clear-refs-type will be required in the
clear_refs_pte_range function, so prepare the walk->private to carry this
info.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago clear_refs: sanitize accepted commands declaration
Pavel Emelyanov [Fri, 7 Jun 2013 00:07:29 +0000 (10:07 +1000)]
clear_refs: sanitize accepted commands declaration

This is the implementation of the soft-dirty bit concept that should help
keep track of changes in user memory, which in turn is badly needed
by the checkpoint-restore project (http://criu.org).

To create a dump of an application(s) we save all the information about it
to files, and the biggest part of such a dump is the contents of the tasks'
memory.  However, there are usage scenarios where it's not required to get
_all_ the task memory while creating a dump.  For example, when doing
periodic dumps, a full memory dump is only required at the first step;
after that, only incremental changes of memory need to be taken.  Another
example is live migration.  We copy all the memory to the destination node
without stopping the tasks, then stop them, check which pages have changed,
dump those and the rest of the state, then copy it to the destination node.
This decreases freeze time significantly.

That said, some help from kernel to watch how processes modify the contents
of their memory is required.

The proposal is to track changes with the help of new soft-dirty bit this way:

1. First do "echo 4 > /proc/$pid/clear_refs".
   At that point the kernel clears the soft-dirty _and_ the writable bits from
   all ptes of process $pid. From now on every write to any page will result in
   a #PF and the subsequent call to pte_mkdirty/pmd_mkdirty, which in turn will
   set the soft-dirty flag.

2. Then read /proc/$pid/pagemap2 and check the soft-dirty bit reported there
   (bit 55). If set, the respective pte was written to since the last call
   to clear_refs.

The soft-dirty bit is the _PAGE_BIT_HIDDEN one. Although it's also used by
kmemcheck, the latter marks kernel pages with it, while the soft-dirty bit is
put on user pages, so they do not conflict with each other.

This patch:

A new clear-refs type will be added in the next patch, so prepare
code for that.
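
A userspace sketch of what a sanitized command declaration can look like:
parse the written text into a plain integer, range-check it, and only then
treat it as one of the enum values.  The enumerator names are modelled on
this series (including the soft-dirty type that the next patch adds) and
should be treated as assumptions; parse_clear_refs() is a hypothetical
helper, not the kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

enum clear_refs_types {
        CLEAR_REFS_ALL = 1,
        CLEAR_REFS_ANON,
        CLEAR_REFS_MAPPED,
        CLEAR_REFS_SOFT_DIRTY,  /* the "4" written to clear_refs above */
        CLEAR_REFS_LAST,
};

/* Returns the command as an int, or -EINVAL if it is not a known type. */
static int parse_clear_refs(const char *buf)
{
        char *end;
        long val;

        assert(buf != NULL);
        errno = 0;
        val = strtol(buf, &end, 10);
        if (errno != 0 || end == buf)
                return -EINVAL;
        /* Check the range on the integer, before any conversion to the enum. */
        if (val < CLEAR_REFS_ALL || val >= CLEAR_REFS_LAST)
                return -EINVAL;
        return (int)val;
}
```

Keeping the parsed value in an integer type until it has been validated
avoids any assumption about the size of the enum type.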

[akpm@linux-foundation.org: don't assume that sizeof(enum clear_refs_types) == sizeof(int)]
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago watchdog: trigger all-cpu backtrace when locked up and going to panic
Sasha Levin [Fri, 7 Jun 2013 00:07:29 +0000 (10:07 +1000)]
watchdog: trigger all-cpu backtrace when locked up and going to panic

Send an NMI to all CPUs when a lockup is detected and the lockup watchdog
code is configured to panic.  This gives us a fairly up-to-date snapshot of
all CPUs in the system.

This lets us get a stack trace of all CPUs, which makes it easier to
debug a deadlock, and the NMI doesn't change anything since the next step
is a kernel panic.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago block: restore /proc/partitions to not display non-partitionable removable devices
Josh Hunt [Fri, 7 Jun 2013 00:07:29 +0000 (10:07 +1000)]
block: restore /proc/partitions to not display non-partitionable removable devices

We found with newer kernels we started seeing the cdrom device showing
up in /proc/partitions, but it was not there before.

Looking into this I found that commit d27769ec ("block: add
GENHD_FL_NO_PART_SCAN") introduces this change in behavior.  It's not
clear to me from the commit's changelog if this change was intentional or
not.  This comment still remains: /* Don't show non-partitionable
removeable devices or empty devices */ so I've decided to send a patch to
restore the behavior of not printing unpartitionable removable devices.

Signed-off-by: Josh Hunt <johunt@akamai.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/cdrom/cdrom.c: use kzalloc() for failing hardware
Jonathan Salwan [Fri, 7 Jun 2013 00:07:28 +0000 (10:07 +1000)]
drivers/cdrom/cdrom.c: use kzalloc() for failing hardware

In drivers/cdrom/cdrom.c mmc_ioctl_cdrom_read_data() allocates a memory
area with kmalloc in line 2885.

2885         cgc->buffer = kmalloc(blocksize, GFP_KERNEL);
2886         if (cgc->buffer == NULL)
2887                 return -ENOMEM;

In line 2908 we can find the copy_to_user function:

2908         if (!ret && copy_to_user(arg, cgc->buffer, blocksize))

The cgc->buffer is never cleared or initialized before this function.  If
ret = 0 in the previous basic block, it's possible to expose some
kernel memory bytes to userspace.

When we read a block from the disk it normally fills the ->buffer but if
the drive is malfunctioning there is a chance that it would only be
partially filled.  The result is an information leak to userspace.
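
The effect of a zeroing allocation can be modelled in userspace.  The sketch
below uses calloc() as a stand-in for kzalloc() and a hypothetical
short_read() helper as a stand-in for a drive that fills only part of the
buffer; it is not the driver code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a malfunctioning drive: fills only the first `filled` bytes. */
static void short_read(unsigned char *buf, size_t filled)
{
        memset(buf, 0xAB, filled);
}

/*
 * Returns the number of non-zero bytes left past the filled part of a
 * zero-allocated buffer, or -1 if the allocation fails.
 */
static long stale_bytes_after_short_read(size_t blocksize, size_t filled)
{
        unsigned char *buf;
        long stale = 0;
        size_t i;

        assert(filled <= blocksize);
        buf = calloc(1, blocksize);     /* userspace analogue of kzalloc() */
        if (buf == NULL)
                return -1;

        short_read(buf, filled);
        for (i = filled; i < blocksize; i++)
                if (buf[i] != 0)
                        stale++;

        free(buf);
        return stale;
}
```

Because the buffer starts out zeroed, whatever the device fails to overwrite
is copied out as zeroes rather than as old heap contents.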

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago block/compat_ioctl.c: do not leak info to user-space
Cong Wang [Fri, 7 Jun 2013 00:07:28 +0000 (10:07 +1000)]
block/compat_ioctl.c: do not leak info to user-space

There is a hole in struct hd_geometry, so we have to zero the struct on
stack before copying it to user-space.
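
A hedged userspace sketch of the same idea: zero the whole struct, padding
included, before filling in the members.  struct geo is only a look-alike of
struct hd_geometry with an assumed member layout, and the helper names are
hypothetical.  Note that the C standard leaves padding contents unspecified
after member stores; the sketch relies on the common behaviour that plain
member stores do not touch padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Look-alike of struct hd_geometry; on 64-bit there is a hole before start. */
struct geo {
        unsigned char heads;
        unsigned char sectors;
        unsigned short cylinders;
        unsigned long start;
};

/* Zero the whole object first, then set the members. */
static void geo_fill(struct geo *g, unsigned char heads, unsigned char sectors,
                     unsigned short cylinders, unsigned long start)
{
        assert(g != NULL);
        memset(g, 0, sizeof(*g));
        g->heads = heads;
        g->sectors = sectors;
        g->cylinders = cylinders;
        g->start = start;
}

/* Returns 1 if every byte between cylinders and start is zero. */
static int geo_hole_is_zero(const struct geo *g)
{
        const unsigned char *p = (const unsigned char *)g;
        size_t i = offsetof(struct geo, cylinders) + sizeof(g->cylinders);

        for (; i < offsetof(struct geo, start); i++)
                if (p[i] != 0)
                        return 0;
        return 1;
}

/* Start from a "dirty stack" pattern and check that the hole ends up clean. */
static int geo_demo(void)
{
        struct geo g;

        memset(&g, 0xff, sizeof(g));
        geo_fill(&g, 16, 63, 1024, 2048);
        return geo_hole_is_zero(&g);
}
```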

Signed-off-by: Cong Wang <amwang@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/cdrom/gdrom.c: fix device number leak
Libo Chen [Fri, 7 Jun 2013 00:07:28 +0000 (10:07 +1000)]
drivers/cdrom/gdrom.c: fix device number leak

Without this patch, gdrom_major will leak when gd.cd_info alloc fails.

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Jens Axboe <axboe@kernel.dk>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/net/irda/donauboe.c: convert to module_pci_driver
Libo Chen [Fri, 7 Jun 2013 00:07:27 +0000 (10:07 +1000)]
drivers/net/irda/donauboe.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/scsi/mvumi.c: convert to module_pci_driver
Libo Chen [Fri, 7 Jun 2013 00:07:27 +0000 (10:07 +1000)]
drivers/scsi/mvumi.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/scsi/initio.c: convert to module_pci_driver
Libo Chen [Fri, 7 Jun 2013 00:07:27 +0000 (10:07 +1000)]
drivers/scsi/initio.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/scsi/dmx3191d.c: convert to module_pci_driver
Libo Chen [Fri, 7 Jun 2013 00:07:27 +0000 (10:07 +1000)]
drivers/scsi/dmx3191d.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/scsi/dc395x.c: convert to module_pci_driver
Libo Chen [Fri, 7 Jun 2013 00:07:26 +0000 (10:07 +1000)]
drivers/scsi/dc395x.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/scsi/a100u2w.c: convert to module_pci_driver
Libo Chen [Fri, 7 Jun 2013 00:07:26 +0000 (10:07 +1000)]
drivers/scsi/a100u2w.c: convert to module_pci_driver

Use module_pci_driver instead of init/exit to make the code cleaner.

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago lglock: update lockdep annotations to report recursive local locks
Michel Lespinasse [Fri, 7 Jun 2013 00:07:26 +0000 (10:07 +1000)]
lglock: update lockdep annotations to report recursive local locks

Oleg Nesterov recently noticed that the lockdep annotations in lglock.c
are not sufficient to detect some obvious deadlocks, such as
lg_local_lock(LOCK) + lg_local_lock(LOCK) or spin_lock(X) +
lg_local_lock(Y) vs lg_local_lock(Y) + spin_lock(X).

Both issues are easily fixed by indicating to lockdep that lglock's local
locks are not recursive.  We shouldn't use the rwlock acquire/release
functions here, as lglock doesn't share the same semantics.  Instead we
can base our lockdep annotations on the lock_acquire_shared (for local
lglock) and lock_acquire_exclusive (for global lglock) helpers.

I am not proposing new lglock specific helpers as I don't see the point of
the existing second level of helpers :)

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago lockdep: introduce lock_acquire_exclusive/shared helper macros
Michel Lespinasse [Fri, 7 Jun 2013 00:07:25 +0000 (10:07 +1000)]
lockdep: introduce lock_acquire_exclusive/shared helper macros

In lockdep.h, the spinlock/mutex/rwsem/rwlock/lock_map acquire macros have
different definitions based on the value of CONFIG_PROVE_LOCKING.  We have
separate ifdefs for each of these definitions, which seems redundant.

Introduce lock_acquire_{exclusive,shared,shared_recursive} helpers which
will have different definitions based on CONFIG_PROVE_LOCKING.  Then all
other helper macros can be defined based on the above ones, which reduces
the amount of ifdefined code.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago include/linux/sched.h: don't use task->pid/tgid in same_thread_group/has_group_leader_pid
Oleg Nesterov [Fri, 7 Jun 2013 00:07:25 +0000 (10:07 +1000)]
include/linux/sched.h: don't use task->pid/tgid in same_thread_group/has_group_leader_pid

task_struct->pid/tgid should go away.

1. Change same_thread_group() to use task->signal for comparison.

2. Change has_group_leader_pid(task) to compare task_pid(task) with
   signal->leader_pid.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Sergey Dyasly <dserrg@gmail.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago softirq: use _RET_IP_
Davidlohr Bueso [Fri, 7 Jun 2013 00:07:24 +0000 (10:07 +1000)]
softirq: use _RET_IP_

Use the already defined macro to pass the function return address.

Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago ocfs2: add missing dlm_put() in dlm_begin_reco_handler()
Xue jiufei [Fri, 7 Jun 2013 00:07:24 +0000 (10:07 +1000)]
ocfs2: add missing dlm_put() in dlm_begin_reco_handler()

dlm_begin_reco_handler() returns without putting dlm when dlm recovery
state is DLM_RECO_STATE_FINALIZE.

Signed-off-by: joyce <xuejiufei@huawei.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago ocfs2: should not use le32_add_cpu to set ocfs2_dinode i_flags
Joseph Qi [Fri, 7 Jun 2013 00:07:24 +0000 (10:07 +1000)]
ocfs2: should not use le32_add_cpu to set ocfs2_dinode i_flags

If we use le32_add_cpu to set ocfs2_dinode i_flags, it may corrupt the
corresponding flag.  So we should change it to bitwise and/or
operations.
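
A small sketch of why addition is the wrong tool for flag bits: adding a
flag that is already set carries into the neighbouring bit, while a bitwise
or leaves the word unchanged.  The helpers are hypothetical and operate on
host-order values; the little-endian conversion done for the on-disk field
is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if exactly one bit is set in flag. */
static int is_single_bit(uint32_t flag)
{
        return flag != 0 && (flag & (flag - 1)) == 0;
}

/* Setting a flag by addition: only correct if the flag was clear before. */
static uint32_t flag_set_add(uint32_t flags, uint32_t flag)
{
        assert(is_single_bit(flag));
        return flags + flag;
}

/* Setting a flag with bitwise or: idempotent. */
static uint32_t flag_set_or(uint32_t flags, uint32_t flag)
{
        assert(is_single_bit(flag));
        return flags | flag;
}

/* Clearing a flag with bitwise and: idempotent as well. */
static uint32_t flag_clear_and(uint32_t flags, uint32_t flag)
{
        assert(is_single_bit(flag));
        return flags & ~flag;
}
```

With flag 0x4 already set, flag_set_add() turns 0x4 into 0x8, which clears
the intended flag and sets an unrelated one; flag_set_or() keeps 0x4.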

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: shencanquan <shencanquan@huawei.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago fs/ocfs2/dlm/dlmrecovery.c:dlm_request_all_locks(): ret should be int instead of...
Joseph Qi [Fri, 7 Jun 2013 00:07:23 +0000 (10:07 +1000)]
fs/ocfs2/dlm/dlmrecovery.c:dlm_request_all_locks(): ret should be int instead of enum

In dlm_request_all_locks, ret is of type enum, but o2net_send_message
returns a value of type int.  As a result, the error branch that follows is
never taken.  So we should change the type of ret from enum to int.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago fs/ocfs2/dlm/dlmrecovery.c: remove duplicate declarations
Joseph Qi [Fri, 7 Jun 2013 00:07:23 +0000 (10:07 +1000)]
fs/ocfs2/dlm/dlmrecovery.c: remove duplicate declarations

The following 3 functions have already been declared in dlmcommon.h, so
there is no need to declare them again in dlmrecovery.c:

  dlm_complete_recovery_thread
  dlm_launch_recovery_thread
  dlm_kick_recovery_thread

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/atm/he.c: convert to module_pci_driver
Libo Chen [Fri, 7 Jun 2013 00:07:23 +0000 (10:07 +1000)]
drivers/atm/he.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago mISDN: add support for group membership check
Jeff Mahoney [Fri, 7 Jun 2013 00:07:22 +0000 (10:07 +1000)]
mISDN: add support for group membership check

This patch adds a module parameter to allow a group access to the mISDN
devices.  Otherwise, unprivileged users on systems with ISDN hardware
have the ability to dial out, potentially causing expensive bills.

Based on a different implementation by Patrick Koppen.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Cc: Patrick Koppen <isdn4linux@koppen.de>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/net/ethernet/ibm/ehea/ehea_main.c: add alias entry for portN properties
Olaf Hering [Fri, 7 Jun 2013 00:07:22 +0000 (10:07 +1000)]
drivers/net/ethernet/ibm/ehea/ehea_main.c: add alias entry for portN properties

Use a separate table for alias entries in the ehea module, otherwise the
probe() function will operate on the separate ports instead of the
lhea-"root" entry of the device-tree.

Addresses https://bugzilla.novell.com/show_bug.cgi?id=435215

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Olaf Hering <ohering@suse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago configfs: use capped length for ->store_attribute()
Dan Carpenter [Fri, 7 Jun 2013 00:07:21 +0000 (10:07 +1000)]
configfs: use capped length for ->store_attribute()

The difference between "count" and "len" is that "len" is capped at 4095.
Changing it like this makes it match how sysfs_write_file() is
implemented.

This is a static analysis patch.  I haven't found any store_attribute()
functions where this change makes a difference.
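
The cap itself can be sketched as follows, assuming it is derived from a
4096-byte page (which gives the 4095 quoted above).  capped_len() is a
hypothetical helper that models the relation between "count" and "len", not
the configfs code.

```c
#include <assert.h>
#include <stddef.h>

/* Length actually passed on for a write of `count` bytes. */
static size_t capped_len(size_t count, size_t page_size)
{
        assert(page_size > 0);
        /* Leave room for a terminating NUL within one page. */
        return count >= page_size ? page_size - 1 : count;
}
```

For writes shorter than a page the two values are equal, which is consistent
with the observation that no existing store_attribute() is affected.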

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/mtd/chips/gen_probe.c: refactor call to request_module()
Kees Cook [Fri, 7 Jun 2013 00:07:21 +0000 (10:07 +1000)]
drivers/mtd/chips/gen_probe.c: refactor call to request_module()

This reduces the size of the stack frame when calling request_module().
Performing the sprintf before the call is not needed.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/mfd/lpc_ich.c: convert to module_pci_driver
Libo Chen [Fri, 7 Jun 2013 00:07:21 +0000 (10:07 +1000)]
drivers/mfd/lpc_ich.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Peter Tyser <ptyser@xes-inc.com>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/ide/delkin_cb.c: convert to module_pci_driver
Libo Chen [Fri, 7 Jun 2013 00:07:20 +0000 (10:07 +1000)]
drivers/ide/delkin_cb.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago drivers/infiniband/core/cm.c: convert to using idr_alloc_cyclic()
Zhao Hongjiang [Fri, 7 Jun 2013 00:07:20 +0000 (10:07 +1000)]
drivers/infiniband/core/cm.c: convert to using idr_alloc_cyclic()

commit 3e6628c4b347 ("idr: introduce idr_alloc_cyclic()") adds a new
idr_alloc_cyclic routine and converts several of these users to it.  This
is just a missed one - add it.

Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago kernel/timer.c: fix jiffies wrap behavior of round_jiffies*()
Bart Van Assche [Fri, 7 Jun 2013 00:07:20 +0000 (10:07 +1000)]
kernel/timer.c: fix jiffies wrap behavior of round_jiffies*()

Make sure that the round_jiffies*() functions return a time that is
in the future when the jiffies counter has recently wrapped.
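
The wrap problem can be illustrated with a wrap-safe comparison modelled on
the kernel's time_after() idiom.  Whether the patch uses exactly this
comparison is not stated here, so treat the sketch as an illustration of the
failure mode only; the function names are hypothetical and the cast assumes
the usual two's complement conversion.

```c
#include <assert.h>
#include <limits.h>

/* Plain comparison: breaks once the counter has wrapped. */
static int naive_after(unsigned long a, unsigned long b)
{
        return a > b;
}

/*
 * Wrap-safe "a is after b": the unsigned difference, reinterpreted as
 * signed, is negative when a is ahead of b by less than half the range.
 */
static int wrap_safe_after(unsigned long a, unsigned long b)
{
        return (long)(b - a) < 0;
}

static void wrap_example(void)
{
        unsigned long before_wrap = ULONG_MAX - 5UL;
        unsigned long after_wrap = 5UL;         /* 11 ticks later */

        assert(!naive_after(after_wrap, before_wrap));
        assert(wrap_safe_after(after_wrap, before_wrap));
        assert(!wrap_safe_after(before_wrap, after_wrap));
}
```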

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago posix_timers: fix racy timer delta caching on task exit
Frederic Weisbecker [Fri, 7 Jun 2013 00:07:19 +0000 (10:07 +1000)]
posix_timers: fix racy timer delta caching on task exit

When a task exits, we cache the remaining cputime delta before expiring
its timers.

This is done from the following places:

* When the task is reaped. We iterate through its list of
  posix cpu timers and store the remaining timer delta to
  the timer struct instead of the absolute value.
  (See posix_cpu_timers_exit() / posix_cpu_timers_exit_group() )

* When we call posix_cpu_timer_get() or posix_cpu_timer_schedule().
  If the timer's task is considered dying when watched from these
  places, the same conversion from absolute to relative expiry time
  is performed. Then the given task's reference is released.
  (See clear_dead_task() ).

The relevance of this caching is questionable but this is another
and deeper debate.

The big issue here is that these two sources of caching don't mix
very well.

More specifically, the caching can easily be done twice, resulting
in a wrong delta as the elapsed clock gets spuriously subtracted from it
a second time. This can happen in the following scenario:

1) The task exits and gets reaped: we call posix_cpu_timers_exit()
   and the absolute timer expiry values are converted to a relative
   delta.

2) timer_gettime() -> posix_cpu_timer_get() is called and relies on
   clear_dead_task() because tsk->exit_state == EXIT_DEAD.
   The elapsed clock gets subtracted from the delta again and we return
   a wrong result.
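
The arithmetic of the scenario can be sketched with plain integers, using a
10 second expiry and 2 seconds of elapsed cputime as sample values.
expiry_to_delta() is a hypothetical stand-in for the absolute-to-relative
conversion, not the kernel helper.

```c
#include <assert.h>

/* Convert an absolute expiry into the time remaining at `now`. */
static unsigned long long expiry_to_delta(unsigned long long expires,
                                          unsigned long long now)
{
        return expires > now ? expires - now : 0;
}

static void double_caching_example(void)
{
        unsigned long long expires = 10, elapsed = 2;
        /* Step 1: conversion done when the task is reaped. */
        unsigned long long once = expiry_to_delta(expires, elapsed);
        /* Step 2: the same conversion applied again to the cached delta. */
        unsigned long long twice = expiry_to_delta(once, elapsed);

        assert(once == 8);      /* correct remaining time */
        assert(twice == 6);     /* elapsed time subtracted twice */
}
```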

To fix this, just remove the caching done on task reaping time.  It
doesn't bring much value on its own.  The caching done from
posix_cpu_timer_get/schedule is enough.

And it would also be hard to get it really right: we could make it put and
clear the target task in the timer struct so that readers know if they are
dealing with a relative (cached) or an absolute value.  But it would be racy.
The only safe way to do it would be to lock the itimer->it_lock so that we
know nobody reads the cputime expiry value while we modify it and its
target task reference.  Doing so would involve some funny workarounds to
avoid circular lock against the sighand lock.  There is just no reason to
maintain this.

The user-visible effect of this patch can be observed by running the
following code: it creates a subthread that launches a posix cputimer
which expires after 10 seconds. But then the subthread only busy loops for 2
seconds and exits. The parent reaps the subthread and reads the timer value.
The expected value is the initial timer's expiration value
minus the cputime elapsed in the subthread. Roughly 10 - 2 = 8 seconds:

#include <sys/time.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <time.h>
#include <pthread.h>

static timer_t id;
static struct itimerspec val = { .it_value.tv_sec = 10, }, new;

static void *thread(void *unused)
{
        int err;
        struct timeval start, end, diff;

        /* Create a timer driven by this thread's CPU time */
        err = timer_create(CLOCK_THREAD_CPUTIME_ID, NULL, &id);
        if (err < 0) {
                perror("Can't create timer");
                return NULL;
        }

        /* Arm 10 sec timer */
        err = timer_settime(id, 0, &val, NULL);
        if (err < 0) {
                perror("Can't set timer");
                return NULL;
        }

        /* Exit after 2 seconds of execution */
        gettimeofday(&start, NULL);
        do {
                gettimeofday(&end, NULL);
                timersub(&end, &start, &diff);
        } while (diff.tv_sec < 2);

        return NULL;
}

int main(int argc, char **argv)
{
        pthread_t pthread;
        int err;

        /* pthread_create() returns an error number and does not set errno */
        err = pthread_create(&pthread, NULL, thread, NULL);
        if (err) {
                fprintf(stderr, "Can't create thread: error %d\n", err);
                return -1;
        }
        pthread_join(pthread, NULL);
        /* Just wait a little bit to make sure the child got reaped */
        sleep(1);
        err = timer_gettime(id, &new);
        if (err)
                perror("Can't get timer value");
        /* Print the remaining time as "seconds nanoseconds" */
        printf("%ld %ld\n", (long)new.it_value.tv_sec, new.it_value.tv_nsec);

        return 0;
}

Before the patch:

       $ ./posix_cpu_timers
       6 2278074

After the patch:

       $ ./posix_cpu_timers
       8 1158766

Before the patch, the elapsed time got two more seconds spuriously accounted.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule()
Frederic Weisbecker [Fri, 7 Jun 2013 00:07:19 +0000 (10:07 +1000)]
posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule()

In order to re-arm a timer after it fired, we take a sample of the current
process or thread cputime.

If the task is dying though, we don't arm anything but we cache the
remaining timer expiration delta for further reads.

Something similar is performed in posix_cpu_timer_get() but here we forget
to take the process wide cputime sample before caching it.

As a result we are storing random stack content, causing every further
read of that timer to return junk values.

Fix this by taking the appropriate sample in the case of process wide
timers.

This probably doesn't matter much in practice because, at this stage, the
thread is the last one in the group and we reached exit_notify().  This
implies that we called exit_itimers() and there should be no more timers
to handle for that task.

So this is likely dead code anyway but let's fix the current logic
and the warning that came along:

    kernel/posix-cpu-timers.c: In function 'posix_cpu_timer_schedule':
    kernel/posix-cpu-timers.c:1127: warning: 'now' may be used uninitialized in this function

Then we can start to think further about cleaning up that code.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago selftests: add basic posix timers selftests
Frederic Weisbecker [Fri, 7 Jun 2013 00:07:19 +0000 (10:07 +1000)]
selftests: add basic posix timers selftests

Add some initial basic tests for a few posix timers interfaces such as
setitimer() and timer_settime().

These simply check that expiration happens in a reasonable timeframe after
expected elapsed clock time (user time, user + system time, real time,
...).

This is helpful for finding basic breakages while hacking
on this subsystem.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Frederic Weisbecker [Fri, 7 Jun 2013 00:07:18 +0000 (10:07 +1000)]
posix_cpu_timers: consolidate expired timers check

Consolidate the common code shared by the per-thread and per-process timer
lists at tick time.

List traversal, expiry check and subsequent updates can be shared in a
common helper.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Frederic Weisbecker [Fri, 7 Jun 2013 00:07:18 +0000 (10:07 +1000)]
posix_cpu_timers: consolidate timer list cleanups

Cleaning up the posix cpu timers on task exit shares some common code
among timer list types, most notably the list traversal and expiry time
update.

Unify this in a common helper.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Frederic Weisbecker [Fri, 7 Jun 2013 00:07:18 +0000 (10:07 +1000)]
posix_cpu_timer: consolidate expiry time type

The posix cpu timer expiry time is stored in a union of two types: a
64-bit field if we rely on scheduler precise accounting, or a cputime_t if
we rely on jiffies.

This results in quite some duplicate code and special cases to handle the
two types.

Just unify this into a single 64-bit field.  cputime_t can always fit
into it.
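
A minimal sketch of why the single field removes special cases.  The type
and function names are hypothetical, and cputime_sketch_t only stands in
for a jiffies-based cputime_t no wider than 64 bits.

```c
#include <stdint.h>

typedef unsigned long cputime_sketch_t;	/* assumed jiffies-based type */

/* Before: callers must know which member of the union is live. */
union old_expiry_sketch {
	uint64_t sched;
	cputime_sketch_t cpu;
};

int expired_old(const union old_expiry_sketch *e, int precise, uint64_t now)
{
	if (precise)
		return now >= e->sched;
	return (cputime_sketch_t)now >= e->cpu;
}

/* After: one 64-bit field, one comparison, no special case. */
int expired_new(uint64_t expires, uint64_t now)
{
	return now >= expires;
}
```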

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrew Morton [Fri, 7 Jun 2013 00:07:17 +0000 (10:07 +1000)]
drivers-iommu-msm_iommu_devc-fix-leak-and-clean-up-error-paths-fix

remove now-unneeded initialization of ctx_drvdata, remove unneeded braces

Cc: David Brown <davidb@codeaurora.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Libo Chen <clbchenlibo.chen@huawei.com>
Cc: Libo Chen <libo.chen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Libo Chen [Fri, 7 Jun 2013 00:07:17 +0000 (10:07 +1000)]
drivers/iommu/msm_iommu_dev.c: fix leak and clean up error paths

Fix two obvious problems:

1. msm_iommu_driver is registered first, so it must be unregistered
   when registering msm_iommu_ctx_driver fails.

2. We don't need to kfree drvdata before kzalloc has succeeded.
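
The first point is the usual unwind-on-error pattern.  The sketch below
shows it with hypothetical stand-in functions rather than the real
platform driver calls; the second_fails parameter exists only so that the
error path can be exercised.

```c
/* Tracks whether the first (hypothetical) driver is registered. */
int first_registered;

int register_first(void)
{
	first_registered = 1;
	return 0;
}

void unregister_first(void)
{
	first_registered = 0;
}

int register_second(int fail)
{
	return fail ? -1 : 0;
}

/* Register both; undo the first registration if the second one fails. */
int init_sketch(int second_fails)
{
	int ret;

	ret = register_first();
	if (ret)
		return ret;

	ret = register_second(second_fails);
	if (ret)
		goto err_unregister_first;

	return 0;

err_unregister_first:
	unregister_first();
	return ret;
}
```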

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Acked-by: David Brown <davidb@codeaurora.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ondrej Zary [Fri, 7 Jun 2013 00:07:17 +0000 (10:07 +1000)]
cyber2000fb: avoid palette corruption at higher clocks

When 1280x1024@75Hz mode is set, console palette is not set properly -
sometimes the background is white, sometimes yellow and text colors are
also messed up.  This does not happen at 1280x1024@60Hz and below.

It seems that the HW needs some time before setting the palette - maybe
the PLL needs more time to lock at higher speeds.  This patch fixes the
problem but without knowing what register to check for PLL lock(?), the
delay might be excessive.

On Fri, 28 Jan 2011 18:15:37 +0000
Russell King <rmk@arm.linux.org.uk> wrote:

> On Tue, Jan 18, 2011 at 01:14:24PM -0800, Andrew Morton wrote:
> > Russell, I have an (old) note here that this is awaiting an ack from
> > yourself?
>
> Well, I can reproduce this problem on the Netwinders here.  I'm not sure
> that we should delay all mode switches by one second - and any attempt
> to reduce this value does result in the palette not being set correctly.
>
> For 1280x1024-75, the dotclock is 135MHz, which gives PLL values of
> 0x41 and 0x06.  That's: M=0x41+1, N=0x06+1, P=0x00 (top 2 bits of 0x06)
> -> Q=1
>
>  Fpll = 14.31818MHz * M / N
>  Fout = Fpll / Q
>
> The PLL itself is formed by dividing the 14-ish MHz frequency by N and
> phase comparing the output of the VCO, divided by M, and adjusting the
> VCO until the two correlate.  As VCOs typically tend to have a limited
> range, it's normal to divide the output frequency to produce a greater
> range - and in this case that's done by Q.
>
> For the 800x600-100 copied from /etc/fb.modes, this has a dotclock of
> 67.5MHz, which is exactly half this rate.  The PLL values for this are:
> M=0x41+1, N=0x06+1, P=0x01, giving PLL values of 0x41 and 0x46.
>
> Booting with 800x600-100 does not suffer the problem.  So it's not
> related to PLL lock time.  There's something else going on.
>
> Another experiment I tried was forcing the PLL values to produce 108MHz
> instead of 135MHz.  108MHz is the dotclock for 1280x1024-60.  This too
> doesn't suffer the problem.
>
> I've also tried choosing other delay values.  100ms is too short and
> produces the problem, but 1s works.  1s for a PLL to lock is a hell of
> a time, especially for a PLL operating in the MHz range.
>
> I've tried setting the PLL to a known good frequency, and then switching
> to 135MHz - the problem persists.  It's not like 135MHz is reaching the
> limits - it'll go up to 206MHz.
>
> So, I don't think this has anything to do with PLL locking.  I think
> there's something else going on which isn't immediately obvious - maybe
> bandwidth starvation preventing us from writing properly to the palette?
> As it's a horrible VGA, where you write the same register multiple times
> I wouldn't be surprised if some writes were going missing.
>
> I'll see if I can play around with it some more this evening, but I've
> spent an awful long time on just this issue already this afternoon...
>
> I think further investigation needs to happen on this patch before it's
> acceptable.  Or maybe we should prevent the cyberpro coming up in
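
The PLL arithmetic in the quoted mail can be checked with a small sketch.
The register layout (M in the first byte, N in the low six bits and P in
the top two bits of the second byte) follows the mail; treating Q as
1 << P is an assumption that only the two quoted cases (P=0 and P=1)
support.

```c
/* Compute Fout in MHz from the two PLL register bytes. */
double pll_fout_mhz(unsigned int m_reg, unsigned int n_reg)
{
	const double fref = 14.31818;		/* reference clock in MHz */
	unsigned int m = m_reg + 1;
	unsigned int n = (n_reg & 0x3f) + 1;	/* low six bits */
	unsigned int p = (n_reg >> 6) & 0x3;	/* top two bits */
	unsigned int q = 1u << p;		/* assumed: P=0 -> 1, P=1 -> 2 */

	return fref * m / n / q;		/* Fpll = fref * M / N, Fout = Fpll / Q */
}
```

Under these assumptions, 0x41/0x06 comes out at roughly 135MHz and
0x41/0x46 at roughly 67.5MHz, matching the figures in the mail.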

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Paul Bolle [Fri, 7 Jun 2013 00:07:16 +0000 (10:07 +1000)]
drivers/video/acornfb.c: remove dead code

acornfb checks for HAS_VIDC while support for that macro was removed in
v2.6.23 (when the arm26 port was removed).  So we can remove a bit of dead
code.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sachin Kamat [Fri, 7 Jun 2013 00:07:16 +0000 (10:07 +1000)]
drivers/video/imxfb.c: make local symbols static

These symbols are used only in this file.  Make them static.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sachin Kamat [Fri, 7 Jun 2013 00:07:16 +0000 (10:07 +1000)]
drivers/video/udlfb.c: make local symbol static

'dlfb_handle_damage' is used only in this file. Make it static.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sachin Kamat [Fri, 7 Jun 2013 00:07:15 +0000 (10:07 +1000)]
drivers/video/udlfb.c: use NULL instead of 0

Pointer variables should be initialized with NULL instead of 0.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sachin Kamat [Fri, 7 Jun 2013 00:07:15 +0000 (10:07 +1000)]
drivers/video/smscufx.c: use NULL instead of 0

'info' is a pointer. Use NULL instead of 0.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Libo Chen [Fri, 7 Jun 2013 00:07:15 +0000 (10:07 +1000)]
drivers/media/pci/pt1/pt1.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Libo Chen [Fri, 7 Jun 2013 00:07:14 +0000 (10:07 +1000)]
drivers/media/pci/pluto2/pluto2.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Libo Chen [Fri, 7 Jun 2013 00:07:14 +0000 (10:07 +1000)]
drivers/media/pci/mantis/hopper_cards.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Libo Chen [Fri, 7 Jun 2013 00:07:14 +0000 (10:07 +1000)]
drivers/media/pci/dm1105/dm1105.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Konstantin Dimitrov <kosio.dimitrov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Libo Chen [Fri, 7 Jun 2013 00:07:14 +0000 (10:07 +1000)]
drivers/media/pci/mantis/mantis_cards.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jeff Mahoney [Fri, 7 Jun 2013 00:07:13 +0000 (10:07 +1000)]
drm/nouveau: make vga_switcheroo code depend on VGA_SWITCHEROO

Commit 8116188fdef594 ("nouveau/acpi: hook up to the MXM method for mux
switching.") broke the build on non-x86 architectures due to the new
dependency on MXM and MXM being an x86 platform driver.

It built previously since the vga switcheroo registration routines were
zeroed out on !X86.  The code was built in but unused.

This patch makes all of the DSM code depend on CONFIG_VGA_SWITCHEROO,
allowing it to build on non-x86 and shrinking the module size as well.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Takashi Iwai [Fri, 7 Jun 2013 00:07:13 +0000 (10:07 +1000)]
drm/cirrus: correct register values for 16bpp

When the mode is set with 16bpp on QEMU, the output gets totally broken.
The culprit is the bogus register values set for 16bpp, which were likely
copied from the wrong place.

Addresses https://bugzilla.novell.com/show_bug.cgi?id=799216

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: David Airlie <airlied@linux.ie>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Daniel Vetter [Fri, 7 Jun 2013 00:07:13 +0000 (10:07 +1000)]
drm/fb-helper: don't sleep for screen unblank when an oops is in progress

Otherwise the system will burn even brighter and worse, leave the user
wondering what's going on exactly.

Since we already have a panic handler which will try to restore the
entire fbdev console mode, we can just bail out.  Inspired by a patch from
Konstantin Khlebnikov.  The callchain leading to this, cut&pasted from
Konstantin's original patch:

callstack:
panic()
bust_spinlocks(1)
unblank_screen()
vc->vc_sw->con_blank()
fbcon_blank()
fb_blank()
info->fbops->fb_blank()
drm_fb_helper_blank()
drm_fb_helper_dpms()
drm_modeset_lock_all()
mutex_lock(&dev->mode_config.mutex)

Note that the entire locking in the fb helper around panic/sysrq and kdbg
is ...  non-existent.  So we have a decent chance of blowing up
everything.  But since reworking this ties in with funny concepts like the
fbdev notifier chain or the impressive things which happen around
console_lock while oopsing, I'll leave that as an exercise for braver
souls than me.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Dave Airlie <airlied@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Libo Chen [Fri, 7 Jun 2013 00:07:12 +0000 (10:07 +1000)]
drivers/pcmcia/yenta_socket.c: convert to module_pci_driver

Use module_pci_driver instead of open-coded init/exit functions to make
the code cleaner.

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Libo Chen [Fri, 7 Jun 2013 00:07:12 +0000 (10:07 +1000)]
drivers/pcmcia/pd6729.c: convert to module_pci_driver

Use module_pci_driver instead of open-coded init/exit functions to make
the code cleaner.

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Chen Gang [Fri, 7 Jun 2013 00:07:12 +0000 (10:07 +1000)]
kernel/auditfilter.c: fix leak in audit_add_rule() error path

If both 'tree' and 'watch' are valid we must call audit_put_tree(), just
like the preceding code within audit_add_rule().

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Raphael S. Carvalho [Fri, 7 Jun 2013 00:07:11 +0000 (10:07 +1000)]
kernel/auditfilter.c: fixing build warning

kernel/auditfilter.c:426: warning: this decimal constant is unsigned only in ISO C90

Signed-off-by: Raphael S. Carvalho <raphael.scarv@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jeff Layton [Fri, 7 Jun 2013 00:07:11 +0000 (10:07 +1000)]
audit: fix mq_open and mq_unlink to add the MQ root as a hidden parent audit_names record

The old audit PATH records for mq_open looked like this:

type=PATH msg=audit(1366282323.982:869): item=1 name=(null) inode=6777
dev=00:0c mode=041777 ouid=0 ogid=0 rdev=00:00
obj=system_u:object_r:tmpfs_t:s15:c0.c1023
type=PATH msg=audit(1366282323.982:869): item=0 name="test_mq" inode=26732
dev=00:0c mode=0100700 ouid=0 ogid=0 rdev=00:00
obj=staff_u:object_r:user_tmpfs_t:s15:c0.c1023

...with the audit related changes that went into 3.7, they now look like this:

type=PATH msg=audit(1366282236.776:3606): item=2 name=(null) inode=66655
dev=00:0c mode=0100700 ouid=0 ogid=0 rdev=00:00
obj=staff_u:object_r:user_tmpfs_t:s15:c0.c1023
type=PATH msg=audit(1366282236.776:3606): item=1 name=(null) inode=6926
dev=00:0c mode=041777 ouid=0 ogid=0 rdev=00:00
obj=system_u:object_r:tmpfs_t:s15:c0.c1023
type=PATH msg=audit(1366282236.776:3606): item=0 name="test_mq"

Both of these look wrong to me.  As Steve Grubb pointed out:

"What we need is 1 PATH record that identifies the MQ. The other PATH
 records probably should not be there."

Fix it to record the mq root as a parent, and flag it such that it should
be hidden from view when the names are logged, since the root of the mq
filesystem isn't terribly interesting.  With this change, we get a single
PATH record that looks more like this:

type=PATH msg=audit(1368021604.836:484): item=0 name="test_mq" inode=16914
dev=00:0c mode=0100644 ouid=0 ogid=0 rdev=00:00
obj=unconfined_u:object_r:user_tmpfs_t:s0

In order to do this, a new audit_inode_parent_hidden() function is added.
If we do it this way, then we avoid having the existing callers of
audit_inode needing to do any sort of flag conversion if auditing is
inactive.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reported-by: Jiri Jaburek <jjaburek@redhat.com>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wen Congyang [Fri, 7 Jun 2013 00:07:11 +0000 (10:07 +1000)]
x86: make 'mem=' option to work for efi platform

Currently the mem= boot option only works in non-EFI environments.  If the
user specifies add_efi_memmap, it does not take effect under EFI.  In the
EFI environment, we call e820_add_region() to add the memory map.  So we
can modify __e820_add_region() so that the mem= boot option also works for
EFI.

Note: only E820_RAM is limited; BOOT_SERVICES_{CODE,DATA} are always
mapped (if the address is >= mem_limit, the memory won't be freed in
efi_free_boot_services()).

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Rob Landley <rob@landley.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrew Morton [Fri, 7 Jun 2013 00:07:10 +0000 (10:07 +1000)]
sound/soc/codecs/si476x.c: don't use 0bNNN

sparc64 gcc-3.4.5 (at least) spits this back.

Cc: Andrey Smirnov <andrey.smirnov@convergeddevices.net>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Libo Chen [Fri, 7 Jun 2013 00:07:10 +0000 (10:07 +1000)]
drivers/platform/x86/intel_ips.c: convert to module_pci_driver

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes Weiner [Fri, 7 Jun 2013 00:07:10 +0000 (10:07 +1000)]
mm: memcontrol: fix lockless reclaim hierarchy iterator

The lockless reclaim hierarchy iterator currently has a misplaced barrier
that can lead to use-after-free crashes.

The reclaim hierarchy iterator consists of a sequence count and a position
pointer that are read and written locklessly, with memory barriers
enforcing ordering.

The write side sets the position pointer first, then updates the sequence
count to "publish" the new position.  Likewise, the read side must read
the sequence count first, then the position.  If the sequence count is up
to date, it's guaranteed that the position is up to date as well:

  writer:                         reader:
  iter->position = position       if iter->sequence == expected:
  smp_wmb()                           smp_rmb()
  iter->sequence = sequence           position = iter->position

However, the read side barrier is currently misplaced, which can lead to
dereferencing stale position pointers that no longer point to valid
memory.  Fix this.
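
The intended ordering can be sketched in userspace with C11 atomics.
This is only an analogue: the release and acquire fences play the roles
of smp_wmb() and smp_rmb(), and the struct and function names are made up
for the example.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace analogue of the iterator state. */
struct iter_sketch {
	_Atomic(void *) position;
	atomic_uint sequence;
};

/* Writer: store the position first, then publish it via the sequence. */
void iter_publish(struct iter_sketch *it, void *pos, unsigned int seq)
{
	atomic_store_explicit(&it->position, pos, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* role of smp_wmb() */
	atomic_store_explicit(&it->sequence, seq, memory_order_relaxed);
}

/* Reader: check the sequence first, and only then read the position. */
void *iter_read(struct iter_sketch *it, unsigned int expected)
{
	if (atomic_load_explicit(&it->sequence, memory_order_relaxed) != expected)
		return NULL;
	atomic_thread_fence(memory_order_acquire);	/* role of smp_rmb() */
	return atomic_load_explicit(&it->position, memory_order_relaxed);
}
```

The reader's fence sits between the sequence load and the position load;
placing it anywhere else would no longer order those two reads against
each other.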

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: <stable@kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Akinobu Mita [Fri, 7 Jun 2013 00:07:09 +0000 (10:07 +1000)]
frontswap: fix incorrect zeroing and allocation size for frontswap_map

The bitmap accessed by bitops must have enough size to hold the required
numbers of bits rounded up to a multiple of BITS_PER_LONG.  And the bitmap
must not be zeroed by memset() if the number of bits cleared is not a
multiple of BITS_PER_LONG.

This fixes incorrect zeroing and allocation size for frontswap_map.  The
incorrect zeroing part doesn't cause any problem because frontswap_map is
freed just after zeroing.  But the wrongly calculated allocation size may
cause problems.

For 32bit systems, the allocation size of frontswap_map is about twice as
large as the required size.  For 64bit systems, the allocation size is
smaller than required if the number of bits is not a multiple of
BITS_PER_LONG.
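
A sketch of the size rule stated above: round the bit count up to whole
longs, then convert to bytes.  The macro and function names are
hypothetical; the kernel has its own helpers for this.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for a bitmap of nbits bits that is accessed by bitops. */
size_t bitmap_bytes(size_t nbits)
{
	size_t longs = (nbits + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;

	return longs * sizeof(unsigned long);
}
```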

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Chen Gang [Fri, 7 Jun 2013 00:07:09 +0000 (10:07 +1000)]
kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()

audit_add_tree_rule() must first set 'rule->tree = NULL;' to protect
against the rule itself being freed in kill_rules().

The reason is that when it is killed, the 'rule' itself may already have
been released, so we should not access it.  One example: we add a rule to
an inode just as another task is deleting this inode.

The work flow for adding a rule:

    audit_receive() -> (need audit_cmd_mutex lock)
      audit_receive_skb() ->
        audit_receive_msg() ->
          audit_receive_filter() ->
            audit_add_rule() ->
              audit_add_tree_rule() -> (need audit_filter_mutex lock)
                ...
                unlock audit_filter_mutex
                get_tree()
                ...
                iterate_mounts() -> (iterate all related inodes)
                  tag_mount() ->
                    tag_trunk() ->
                      create_trunk() -> (assume it is 1st rule)
                        fsnotify_add_mark() ->
                          fsnotify_add_inode_mark() ->  (add mark to inode->i_fsnotify_marks)
                        ...
                        get_tree(); (each inode will get one)
                ...
                lock audit_filter_mutex

The work flow for deleting an inode:

    __destroy_inode() ->
     fsnotify_inode_delete() ->
       __fsnotify_inode_delete() ->
        fsnotify_clear_marks_by_inode() ->  (get mark from inode->i_fsnotify_marks)
          fsnotify_destroy_mark() ->
           fsnotify_destroy_mark_locked() ->
             audit_tree_freeing_mark() ->
               evict_chunk() ->
                 ...
                 tree->goner = 1
                 ...
                 kill_rules() ->   (assume current->audit_context == NULL)
                   call_rcu() ->   (rule->tree != NULL)
                     audit_free_rule_rcu() ->
                       audit_free_rule()
                 ...
                 audit_schedule_prune() ->  (assume current->audit_context == NULL)
                   kthread_run() ->    (need audit_cmd_mutex and audit_filter_mutex lock)
                     prune_one() ->    (delete it from prune_list)
                       put_tree(); (match the original get_tree above)

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrew Morton [Fri, 7 Jun 2013 00:07:09 +0000 (10:07 +1000)]
drivers-base-cpuc-fix-maxcpus-boot-option-fix

fix CONFIG_SMP=n build

Cc: Greg KH <greg@kroah.com>
Cc: Youquan Song <youquan.song@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Youquan Song [Fri, 7 Jun 2013 00:07:08 +0000 (10:07 +1000)]
drivers/base/cpu.c: fix maxcpus boot option

The maxcpus boot option limits the maximum number of CPUs in the system,
but this option is broken in recent kernels.  Although maxcpus is used to
limit the number of CPUs, the current kernel registers all present CPUs
in sysfs.  udev enumerates every CPU registered in sysfs and brings up
any CPU that is offline.  So the maxcpus option is broken.

With this patch, only CPUs that are within the maxcpus limit are
registered in sysfs.  This keeps the maxcpus limit in force during udev
enumeration and against other attempts to bring up CPUs beyond the limit,
such as echo 1 > /sys/devices/system/cpu/online

Signed-off-by: Youquan Song <youquan.song@intel.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Naoya Horiguchi [Fri, 7 Jun 2013 00:07:08 +0000 (10:07 +1000)]
mm: migration: add migrate_entry_wait_huge()

When we have a page fault for an address which is backed by a hugepage
under migration, the kernel can't wait correctly and busy loops on the
hugepage fault until the migration finishes.  As a result, users who try
to kick hugepage migration (via soft offlining, for example) occasionally
experience long delays or soft lockups.

This is because pte_offset_map_lock() can't get a correct migration entry
or a correct page table lock for a hugepage.  This patch introduces
migration_entry_wait_huge() to solve this.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> [2.6.35+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tomasz Stanislawski [Fri, 7 Jun 2013 00:07:08 +0000 (10:07 +1000)]
mm/page_alloc.c: fix watermark check in __zone_watermark_ok()

The watermark check consists of two sub-checks.  The first one is:

    if (free_pages <= min + lowmem_reserve)
        return false;

The check ensures that there is a minimal amount of RAM in the zone.  If
CMA is used then free_pages is reduced by the number of free pages in CMA
prior to the above-mentioned check.

    if (!(alloc_flags & ALLOC_CMA))
        free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);

This prevents the zone from being drained of pages available for
non-movable allocations.

The second check prevents the zone from getting too fragmented.

    for (o = 0; o < order; o++) {
        free_pages -= z->free_area[o].nr_free << o;
        min >>= 1;
        if (free_pages <= min)
            return false;
    }

The field z->free_area[o].nr_free is equal to the number of free pages
including free CMA pages.  Therefore the CMA pages are subtracted twice.
This may cause a spurious failure of __zone_watermark_ok() if the CMA
area gets strongly fragmented.  In such a case there are many 0-order free
pages located in CMA.  Those pages are subtracted twice, so they quickly
drain free_pages during the check against fragmentation.  The test fails
even though there are many free non-CMA pages in the zone.

This patch fixes this issue by subtracting CMA pages only for the purpose
of the (free_pages <= min + lowmem_reserve) check.
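
A simplified model of the corrected check, for illustration only.  The
zone struct and its fields are hypothetical stand-ins; only the shape of
the two sub-checks follows the description above.

```c
#include <stdbool.h>

#define SKETCH_MAX_ORDER 11

/* Hypothetical, heavily simplified zone state. */
struct zone_sketch {
	long nr_free[SKETCH_MAX_ORDER];	/* free blocks per order, CMA included */
	long free_cma_pages;		/* free pages that live in CMA */
};

bool watermark_ok_sketch(const struct zone_sketch *z, int order, long min,
			 long lowmem_reserve, long free_pages, bool alloc_cma)
{
	long free_cma = 0;

	if (!alloc_cma)
		free_cma = z->free_cma_pages;

	/* CMA pages are discounted for this first check only. */
	if (free_pages - free_cma <= min + lowmem_reserve)
		return false;

	/* nr_free already includes CMA pages, so nothing is subtracted twice. */
	for (int o = 0; o < order; o++) {
		free_pages -= z->nr_free[o] << o;
		min >>= 1;
		if (free_pages <= min)
			return false;
	}
	return true;
}
```

For example, take a zone with 600 free order-0 pages, all of them in CMA,
plus 20 free order-5 blocks, and an order-5 request with min = 100.  The
check now passes.  If free_pages itself were reduced by the 600 CMA pages
up front, the first loop iteration would subtract them again and the
check would fail.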

Laura said:

  We were observing allocation failures of higher order pages (order 5 =
  128K typically) under tight memory conditions resulting in driver
  failure.  The output from the page allocation failure showed plenty of
  free pages of the appropriate order/type/zone and mostly CMA pages in
  the lower orders.

  For full disclosure, we still observed some page allocation failures
  even after applying the patch but the number was drastically reduced and
  those failures were attributed to fragmentation/other system issues.

Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: <stable@vger.kernel.org> [3.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dan Carpenter [Fri, 7 Jun 2013 00:07:07 +0000 (10:07 +1000)]
drivers/misc/sgi-gru/grufile.c: fix info leak in gru_get_config_info()

The "info.fill" array isn't initialized so it can leak uninitialized stack
information to user space.
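
The usual remedy for this class of leak is to zero the whole structure
before filling in the fields that are actually set.  The sketch below
uses a made-up struct, with memcpy() standing in for copy_to_user(); it
is not the driver's actual layout.

```c
#include <string.h>

/* Hypothetical config structure with reserved space at the end. */
struct config_info_sketch {
	int cpus;
	int nodes;
	int fill[16];	/* reserved, never assigned explicitly */
};

void get_config_info_sketch(struct config_info_sketch *out)
{
	struct config_info_sketch info;

	/* Zero everything first so fill[] and any padding carry no stack data. */
	memset(&info, 0, sizeof(info));
	info.cpus = 4;
	info.nodes = 1;

	memcpy(out, &info, sizeof(info));	/* stands in for copy_to_user() */
}
```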

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Robin Holt <holt@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kent Overstreet [Fri, 7 Jun 2013 00:07:07 +0000 (10:07 +1000)]
aio: fix io_destroy() regression by using call_rcu()

There was a regression introduced by 36f5588 ("aio: refcounting cleanup"),
reported by Jens Axboe - the refcounting cleanup switched to using RCU in
the shutdown path, but the synchronize_rcu() was done in the context of
the io_destroy() syscall, greatly increasing the time it could block.

This patch switches it to call_rcu() and makes shutdown asynchronous (more
asynchronous than it was originally; before the refcount changes
io_destroy() would still wait on pending kiocbs).

Note that there's a global quota on the max outstanding kiocbs, and that
quota must be manipulated synchronously; otherwise io_setup() could return
-EAGAIN when there isn't quota available, and userspace won't have any way
of waiting until shutdown of the old kioctxs has finished (besides busy
looping).

So we release our quota before kioctx shutdown has finished, which should
be fine since the quota never corresponded to anything real anyways.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Tested-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc-at91rm9200: use shadow IMR on at91sam9x5
Johan Hovold [Fri, 7 Jun 2013 00:07:07 +0000 (10:07 +1000)]
rtc-at91rm9200: use shadow IMR on at91sam9x5

Add support for the at91sam9x5-family which must use the shadow interrupt
mask due to a hardware issue (causing RTC_IMR to always be zero).

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc-at91rm9200: add shadow interrupt mask
Johan Hovold [Fri, 7 Jun 2013 00:07:06 +0000 (10:07 +1000)]
rtc-at91rm9200: add shadow interrupt mask

Add shadow interrupt-mask register which can be used on SoCs where the
actual hardware register is broken.

Note that some care needs to be taken to make sure the shadow mask
corresponds to the actual hardware state.  The added overhead is not an
issue for the non-broken SoCs due to the relatively infrequent
interrupt-mask updates.  We do, however, only use the shadow mask value as
a fall-back when it is actually needed, as there is still a theoretical
possibility that the mask is incorrect (see the code for details).
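
The idea can be sketched in user space as follows.  This is a minimal
model, not the driver's code: struct fake_rtc, its fields and the three
helper functions are hypothetical names, and the "hardware" is just a
variable.  Every enable/disable also updates a software copy of the mask,
and the copy is consulted only when the configuration says the hardware
register cannot be trusted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the RTC interrupt state; not the real driver. */
struct fake_rtc {
	uint32_t hw_mask;      /* interrupts actually enabled in "hardware" */
	uint32_t shadow_imr;   /* software copy kept in step with hw_mask */
	bool imr_reads_zero;   /* models the broken mask register */
	bool use_shadow_imr;   /* per-SoC configuration choice */
};

/* Enable interrupts and mirror the change in the shadow mask. */
static void rtc_enable_irq(struct fake_rtc *rtc, uint32_t mask)
{
	assert(rtc);
	rtc->hw_mask |= mask;
	rtc->shadow_imr |= mask;
}

/* Disable interrupts and mirror the change in the shadow mask. */
static void rtc_disable_irq(struct fake_rtc *rtc, uint32_t mask)
{
	assert(rtc);
	rtc->hw_mask &= ~mask;
	rtc->shadow_imr &= ~mask;
}

/* What a read of the mask register would return on this "SoC". */
static uint32_t hw_read_imr(const struct fake_rtc *rtc)
{
	return rtc->imr_reads_zero ? 0 : rtc->hw_mask;
}

/* Use the shadow value only as a fall-back, otherwise trust the hardware. */
static uint32_t rtc_read_imr(const struct fake_rtc *rtc)
{
	assert(rtc);
	if (rtc->use_shadow_imr)
		return rtc->shadow_imr;
	return hw_read_imr(rtc);
}
```

With imr_reads_zero set and use_shadow_imr clear, rtc_read_imr() in this
model reports no enabled interrupts at all, which corresponds to the
unhandled-interrupt symptom the series addresses.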

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc-at91rm9200: refactor interrupt-register handling
Johan Hovold [Fri, 7 Jun 2013 00:07:06 +0000 (10:07 +1000)]
rtc-at91rm9200: refactor interrupt-register handling

Add accessors for the interrupt register.

This will allow us to easily add a shadow interrupt-mask register to
use on SoCs where the interrupt-mask register cannot be used.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc-at91rm9200: add configuration support
Johan Hovold [Fri, 7 Jun 2013 00:07:06 +0000 (10:07 +1000)]
rtc-at91rm9200: add configuration support

Add configuration support which can be used to implement SoC-specific
workarounds for broken hardware.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agortc-at91rm9200: add match-table compile guard
Johan Hovold [Fri, 7 Jun 2013 00:07:06 +0000 (10:07 +1000)]
rtc-at91rm9200: add match-table compile guard

The members of Atmel's at91sam9x5 family (9x5) have a broken RTC interrupt
mask register (AT91_RTC_IMR).  It does not reflect enabled interrupts but
instead always returns zero.

The kernel's rtc-at91rm9200 driver handles the RTC for the 9x5 family.
Currently when the date/time is set, an interrupt is generated and this
driver neglects to handle the interrupt.  The kernel complains about the
un-handled interrupt and disables it henceforth.  This not only breaks the
RTC function but, since that interrupt is shared (Atmel's SYS interrupt),
other things break as well (e.g. the debug port no longer accepts
characters).

Tested on the at91sam9g25. Bug confirmed by Atmel.

This patch (of 5):

Add missing match-table compile guard.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agofs/ocfs2/namei.c: remove unnecessary ERROR when removing non-empty directory
Goldwyn Rodrigues [Fri, 7 Jun 2013 00:07:05 +0000 (10:07 +1000)]
fs/ocfs2/namei.c: remove unnecessary ERROR when removing non-empty directory

While removing a non-empty directory, the kernel dumps a message:
(rmdir,21743,1):ocfs2_unlink:953 ERROR: status = -39

Suppress the error message so that it is not printed in dmesg and users
don't panic.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoswap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O...
Rafael Aquini [Fri, 7 Jun 2013 00:07:05 +0000 (10:07 +1000)]
swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion

read_swap_cache_async() can race against get_swap_page(), and stumble
across a SWAP_HAS_CACHE entry in the swap map whose page wasn't brought
into the swapcache yet.  This swap_map state is expected to be
transitory, but the actual placement of discard at scan_swap_map() inserts
a wait for I/O completion, thus making the thread at
read_swap_cache_async() loop around its -EEXIST case while the other
end at get_swap_page() is scheduled away at scan_swap_map().  This can
leave the system deadlocked if the I/O completion happens to be waiting on
the CPU waitqueue where read_swap_cache_async() is busy looping and
!CONFIG_PREEMPT.

This patch introduces a cond_resched() call to make the aforementioned
read_swap_cache_async() busy loop bail out when necessary,
thus avoiding the subtle race window.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree
Tony Lindgren [Fri, 7 Jun 2013 00:07:05 +0000 (10:07 +1000)]
drivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree

When booted in legacy mode device_init_wakeup() gets called by
drivers/mfd/twl-core.c when the children are initialized.  However, when
booted using device tree, the children are created with
of_platform_populate() instead of add_children().

This means that the RTC driver will not have device_init_wakeup() set, and
we need to call it from the driver probe like RTC drivers typically do.

Without this we cannot test PM wake-up events on omaps for cases where
there may not be any physical wake-up event.

Signed-off-by: Tony Lindgren <tony@atomide.com>
Reported-by: Kevin Hilman <khilman@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agocciss: fix broken mutex usage in ioctl
Stephen M. Cameron [Fri, 7 Jun 2013 00:07:04 +0000 (10:07 +1000)]
cciss: fix broken mutex usage in ioctl

If a new logical drive is added and the CCISS_REGNEWD ioctl is invoked (as
is normal with the Array Configuration Utility) the process will hang as
below.  It attempts to acquire the same mutex twice, once in do_ioctl()
and once in cciss_unlocked_open().  The BKL was recursive, the mutex
isn't.

Linux version 3.10.0-rc2 (scameron@localhost.localdomain) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Fri May 24 14:32:12 CDT 2013
[...]
acu             D 0000000000000001     0  3246   3191 0x00000080
 ffff8800da833a18 0000000000000086 ffff8800da833fd8 0000000000012d00
 ffff8800da832010 0000000000012d00 0000000000012d00 0000000000012d00
 ffff8800da833fd8 0000000000012d00 ffff8800da8294e0 ffff8800db50eb10
Call Trace:
 [<ffffffff8153a2d9>] schedule+0x29/0x70
 [<ffffffff8153a5de>] schedule_preempt_disabled+0xe/0x10
 [<ffffffff81538e0b>] __mutex_lock_slowpath+0x17b/0x220
 [<ffffffff81538c6b>] mutex_lock+0x2b/0x50
 [<ffffffffa0002bdf>] cciss_unlocked_open+0x2f/0x110 [cciss]
 [<ffffffff811a06f3>] __blkdev_get+0xd3/0x470
 [<ffffffff811a0aec>] blkdev_get+0x5c/0x1e0
 [<ffffffff8123cd42>] register_disk+0x182/0x1a0
 [<ffffffff8123cedc>] add_disk+0x17c/0x310
 [<ffffffffa0002dfa>] cciss_add_disk+0x13a/0x170 [cciss]
 [<ffffffffa000adfb>] cciss_update_drive_info+0x39b/0x480 [cciss]
 [<ffffffffa000b818>] rebuild_lun_table+0x258/0x370 [cciss]
 [<ffffffffa000bc7f>] cciss_ioctl+0x34f/0x470 [cciss]
 [<ffffffff81538c5e>] ? mutex_lock+0x1e/0x50
 [<ffffffffa000bde9>] do_ioctl+0x49/0x70 [cciss]
 [<ffffffff81239e48>] __blkdev_driver_ioctl+0x28/0x30
 [<ffffffff8123a4e0>] blkdev_ioctl+0x200/0x7b0
 [<ffffffff81186633>] ? mntput+0x23/0x40
 [<ffffffff8119ef0c>] block_ioctl+0x3c/0x40
 [<ffffffff8117a309>] do_vfs_ioctl+0x89/0x350
 [<ffffffff8116a12e>] ? ____fput+0xe/0x10
 [<ffffffff81063ae4>] ? task_work_run+0x94/0xf0
 [<ffffffff8117a671>] SyS_ioctl+0xa1/0xb0
 [<ffffffff81544102>] system_call_fastpath+0x16/0x1b

This mutex usage was added into the ioctl path when the big kernel lock
was removed.  As it turns out, these paths are all thread safe anyway (or
can easily be made so) and we don't want ioctl() to be single threaded in
any case.
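
The self-deadlock described above can be modelled in user space.  This is
only a sketch: toy_mutex, model_open() and model_ioctl() are hypothetical
stand-ins, and a flag replaces the real mutex so that the nested
acquisition is reported as -EDEADLK rather than hanging.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy non-recursive lock: a flag stands in for the mutex. */
struct toy_mutex {
	bool held;
};

static int toy_lock(struct toy_mutex *m)
{
	assert(m);
	if (m->held)
		return -EDEADLK;  /* a real non-recursive mutex would block here */
	m->held = true;
	return 0;
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(m && m->held);
	m->held = false;
}

/* Hypothetical stand-in for the open path, which takes the lock itself. */
static int model_open(struct toy_mutex *m)
{
	int ret = toy_lock(m);

	if (ret)
		return ret;
	toy_unlock(m);
	return 0;
}

/*
 * Hypothetical stand-in for the ioctl path.  With lock_in_ioctl set, the
 * lock is held while the open path runs, as in the reported hang.
 */
static int model_ioctl(struct toy_mutex *m, bool lock_in_ioctl)
{
	int ret;

	if (lock_in_ioctl && toy_lock(m))
		return -EDEADLK;
	ret = model_open(m);
	if (lock_in_ioctl)
		toy_unlock(m);
	return ret;
}
```

In this model, model_ioctl(&m, true) fails with -EDEADLK because the open
path tries to take a lock its caller already holds, while
model_ioctl(&m, false) succeeds.  A recursive lock such as the BKL would
have allowed the nested acquisition.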

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mike Miller <mike.miller@hp.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoaudit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE
Oleg Nesterov [Fri, 7 Jun 2013 00:07:04 +0000 (10:07 +1000)]
audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE

audit_log_start() does wait_for_auditd() in a loop until
audit_backlog_wait_time passes or audit_skb_queue has room.

If signal_pending() is true this becomes a busy-wait loop, because
schedule() in TASK_INTERRUPTIBLE won't block.

Thanks to Guy for fully investigating and explaining the problem.

(akpm: that'll cause the system to lock up on a non-preemptible
uniprocessor kernel)

(Guy: "Our customer was in fact running a uniprocessor machine, and they
reported a system hang.")

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Guy Streeter <streeter@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers-rtc-rtc-cmosc-fix-accidentally-enabling-rtc-channel-fix
Andrew Morton [Fri, 7 Jun 2013 00:07:04 +0000 (10:07 +1000)]
drivers-rtc-rtc-cmosc-fix-accidentally-enabling-rtc-channel-fix

coding-style tweak

Cc: Derek Basehore <dbasehore@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-cmos.c: fix accidentally enabling rtc channel
Derek Basehore [Fri, 7 Jun 2013 00:07:03 +0000 (10:07 +1000)]
drivers/rtc/rtc-cmos.c: fix accidentally enabling rtc channel

During resume, we call hpet_rtc_timer_init after masking an irq bit in
hpet.  This will cause the call to hpet_disable_rtc_channel to be undone
if RTC_AIE is the only bit not masked.

Allowing the cmos interrupt handler to run before resuming caused some
issues where the timer for the alarm was not removed.  This would cause
other, later timers to not be cleared, so utilities such as hwclock would
time out when waiting for the update interrupt.

Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/rtc/rtc-tps6586x.c: device wakeup flags correction
Dmitry Osipenko [Fri, 7 Jun 2013 00:07:03 +0000 (10:07 +1000)]
drivers/rtc/rtc-tps6586x.c: device wakeup flags correction

Use device_init_wakeup() instead of device_set_wakeup_capable() and move
it before rtc dev registering.  This fixes the alarmtimer not being
registered when the tps6586x rtc is the only wakeup compatible rtc in the
system.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Laxman Dewangan <ldewangan@nvidia.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomemcg: don't initialize kmem-cache destroying work for root caches
Andrey Vagin [Fri, 7 Jun 2013 00:07:03 +0000 (10:07 +1000)]
memcg: don't initialize kmem-cache destroying work for root caches

struct memcg_cache_params has a union.  Different parts of this union are
used for root and non-root caches.  A part with destroying work is used
only for non-root caches.
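
Why the unconditional initialization is harmful can be shown with a
simplified layout.  This is a sketch under assumptions: struct
cache_params, struct work and the member names below are hypothetical and
much smaller than the real structure.  Because both halves of the union
share storage, writing the work item for a root cache overwrites slots of
the array that the root cache actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; the real struct differs. */
struct work {
	void (*fn)(struct work *);
};

struct cache_params {
	bool is_root_cache;
	union {
		void *child_caches[4];        /* valid only for root caches */
		struct {
			void *owner;
			struct work destroy;  /* valid only for non-root caches */
		} child;
	} u;
};

static void destroy_fn(struct work *w)
{
	(void)w;
}

static void params_init(struct cache_params *p, bool is_root)
{
	assert(p);
	memset(p, 0, sizeof(*p));
	p->is_root_cache = is_root;
	/*
	 * Setting up the work item for a root cache would scribble over
	 * child_caches[], which occupies the same bytes.
	 */
	if (!is_root)
		p->u.child.destroy.fn = destroy_fn;
}
```

After params_init(&p, true) every child_caches[] slot in this model is
still NULL.  Dropping the !is_root check would leave a function pointer
in one of those slots, to be misread later as a cache pointer.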

[  115.096202] BUG: unable to handle kernel paging request at 0000000fffffffe0
[  115.096785] IP: [<ffffffff8116b641>] kmem_cache_alloc+0x41/0x1f0
[  115.097024] PGD 7ace1067 PUD 0
[  115.097024] Oops: 0000 [#4] SMP
[  115.097024] Modules linked in: netlink_diag af_packet_diag udp_diag tcp_diag inet_diag unix_diag ip6table_filter ip6_tables i2c_piix4 virtio_net virtio_balloon microcode i2c_core pcspkr floppy
[  115.097024] CPU: 0 PID: 1929 Comm: lt-vzctl Tainted: G      D      3.10.0-rc1+ #2
[  115.097024] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  115.097024] task: ffff88007b5aaee0 ti: ffff88007bf0c000 task.ti: ffff88007bf0c000
[  115.097024] RIP: 0010:[<ffffffff8116b641>]  [<ffffffff8116b641>] kmem_cache_alloc+0x41/0x1f0
[  115.097024] RSP: 0018:ffff88007bf0de68  EFLAGS: 00010202
[  115.097024] RAX: 0000000fffffffe0 RBX: 00007fff4014f200 RCX: 0000000000000300
[  115.097024] RDX: 0000000000000005 RSI: 00000000000000d0 RDI: ffff88007d001300
[  115.097024] RBP: ffff88007bf0dea8 R08: 00007f849c3141b7 R09: ffffffff8118e100
[  115.097024] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000000000d0
[  115.097024] R13: 0000000fffffffe0 R14: ffff88007d001300 R15: 0000000000001000
[  115.097024] FS:  00007f849cbb8b40(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[  115.097024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  115.097024] CR2: 0000000fffffffe0 CR3: 000000007bc38000 CR4: 00000000000006f0
[  115.097024] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  115.097024] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  115.097024] Stack:
[  115.097024]  ffffffff8118e100 ffffffff81149ea1 0000000000000008 00007fff4014f200
[  115.097024]  00007fff4014f200 0000000000000000 0000000000000000 0000000000001000
[  115.097024]  ffff88007bf0dee8 ffffffff8118e100 ffff880037598e00 00007fff4014f200
[  115.097024] Call Trace:
[  115.097024]  [<ffffffff8118e100>] ? getname_flags.part.34+0x30/0x140
[  115.097024]  [<ffffffff81149ea1>] ? vma_rb_erase+0x121/0x210
[  115.097024]  [<ffffffff8118e100>] getname_flags.part.34+0x30/0x140
[  115.097024]  [<ffffffff8118e248>] getname+0x38/0x60
[  115.097024]  [<ffffffff81181d55>] do_sys_open+0xc5/0x1e0
[  115.097024]  [<ffffffff81181e92>] SyS_open+0x22/0x30
[  115.097024]  [<ffffffff8161cb82>] system_call_fastpath+0x16/0x1b
[  115.097024] Code: f4 53 48 83 ec 18 8b 05 8e 53 b7 00 4c 8b 4d 08 21 f0 a8 10 74 0d 4c 89 4d c0 e8 1b 76 4a 00 4c 8b 4d c0 e9 92 00 00 00 4d 89 f5 <4d> 8b 45 00 65 4c 03 04 25 48 cd 00 00 49 8b 50 08 4d 8b 38 49
[  115.097024] RIP  [<ffffffff8116b641>] kmem_cache_alloc+0x41/0x1f0
[  115.097024]  RSP <ffff88007bf0de68>
[  115.097024] CR2: 0000000fffffffe0
[  115.121352] ---[ end trace 16bb8e8408b97d0e ]---

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Li Zefan <lizefan@huawei.com>
Cc: <stable@vger.kernel.org> [3.9.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoocfs2: ocfs2_prep_new_orphaned_file() should return ret
Xiaowei.Hu [Fri, 7 Jun 2013 00:07:02 +0000 (10:07 +1000)]
ocfs2: ocfs2_prep_new_orphaned_file() should return ret

If an error occurs, for example an EIO in __ocfs2_prepare_orphan_dir,
ocfs2_prep_new_orphaned_file will release the inode_ac.  When the
caller of ocfs2_prep_new_orphaned_file then gets a 0 return, it dereferences
a NULL ocfs2_alloc_context struct in the functions that follow, and the
kernel panics.
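
The pattern can be reduced to a small user-space sketch.  The names
alloc_ctx, prepare_step() and prep_new_file() are hypothetical and only
mirror the shape of the problem: a helper frees the context on failure,
so it must also propagate the error instead of reporting success.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct alloc_ctx {
	int reserved;
};

/* Hypothetical step that can fail, standing in for the orphan-dir call. */
static int prepare_step(int fail)
{
	return fail ? -EIO : 0;
}

static int prep_new_file(struct alloc_ctx **ctx_out, int fail)
{
	struct alloc_ctx *ctx;
	int ret;

	assert(ctx_out);
	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return -ENOMEM;

	ret = prepare_step(fail);
	if (ret) {
		/* Error path: the context is released ... */
		free(ctx);
		ctx = NULL;
	}
	*ctx_out = ctx;
	/* ... so the error must be returned; "return 0" would hide it. */
	return ret;
}
```

A caller that checks the return value never touches the NULL context.
With a hard-coded "return 0" the caller would go on to use *ctx_out as if
the setup had succeeded.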

Signed-off-by: "Xiaowei.Hu" <xiaowei.hu@oracle.com>
Reviewed-by: shencanquan <shencanquan@huawei.com>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agolib/mpi/mpicoder.c: looping issue, need stop when equal to zero, found by 'EXTRA_FLAG...
Chen Gang [Fri, 7 Jun 2013 00:07:02 +0000 (10:07 +1000)]
lib/mpi/mpicoder.c: looping issue, need stop when equal to zero, found by 'EXTRA_FLAGS=-W'.

The 'while' loop needs to stop when 'nbytes == 0', otherwise it will cause
problems ('nbytes' is a size_t, which is always greater than or equal to
zero, so a 'nbytes >= 0' test can never end the loop).

The related warning: (with EXTRA_CFLAGS=-W)
  lib/mpi/mpicoder.c:40:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
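
A minimal illustration of the loop condition, assuming a helper that
skips leading zero bytes.  skip_leading_zeros() is a hypothetical name
used for this sketch and is not the code in lib/mpi/mpicoder.c.

```c
#include <assert.h>
#include <stddef.h>

/* Count the leading zero bytes of buf, looking at most at nbytes bytes. */
static size_t skip_leading_zeros(const unsigned char *buf, size_t nbytes)
{
	size_t skipped = 0;

	assert(buf || nbytes == 0);
	/*
	 * "nbytes >= 0" is always true for a size_t, so with that test an
	 * all-zero buffer would be read past its end.  "nbytes > 0" stops
	 * the loop once every byte has been examined.
	 */
	while (nbytes > 0 && buf[skipped] == 0) {
		nbytes--;
		skipped++;
	}
	return skipped;
}
```

For an all-zero buffer the function returns the buffer length and never
reads beyond it.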

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agokmsg-honor-dmesg_restrict-sysctl-on-dev-kmsg-fix
Andrew Morton [Fri, 7 Jun 2013 00:07:02 +0000 (10:07 +1000)]
kmsg-honor-dmesg_restrict-sysctl-on-dev-kmsg-fix

use pr_warn_once()

Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agokmsg: honor dmesg_restrict sysctl on /dev/kmsg
Kees Cook [Fri, 7 Jun 2013 00:07:02 +0000 (10:07 +1000)]
kmsg: honor dmesg_restrict sysctl on /dev/kmsg

The dmesg_restrict sysctl currently covers the syslog method for accessing
dmesg, however /dev/kmsg isn't covered by the same protections.  Most
people haven't noticed because util-linux dmesg(1) defaults to using the
syslog method for access in older versions.  Newer versions of util-linux
dmesg(1) default to reading directly from /dev/kmsg.

To fix /dev/kmsg, let's compare the existing interfaces and what they allow:

- /proc/kmsg allows:
 - open (SYSLOG_ACTION_OPEN) if CAP_SYSLOG since it uses a destructive
   single-reader interface (SYSLOG_ACTION_READ).
 - everything, after an open.

- syslog syscall allows:
 - anything, if CAP_SYSLOG.
 - SYSLOG_ACTION_READ_ALL and SYSLOG_ACTION_SIZE_BUFFER, if dmesg_restrict==0.
 - nothing else (EPERM).

The use-cases were:
- dmesg(1) needs to do non-destructive SYSLOG_ACTION_READ_ALLs.
- sysklog(1) needs to open /proc/kmsg, drop privs, and still issue the
  destructive SYSLOG_ACTION_READs.

AIUI, dmesg(1) is moving to /dev/kmsg, and systemd-journald doesn't clear
the ring buffer.

Based on the comments in devkmsg_llseek, it sounds like actions besides
reading aren't going to be supported by /dev/kmsg (i.e.
SYSLOG_ACTION_CLEAR), so we have a strict subset of the non-destructive
syslog syscall actions.

To this end, move the check as Josh had done, but also rename the
constants to reflect their new uses (SYSLOG_FROM_CALL becomes
SYSLOG_FROM_READER, and SYSLOG_FROM_FILE becomes SYSLOG_FROM_PROC).
SYSLOG_FROM_READER allows non-destructive actions, and SYSLOG_FROM_PROC
allows destructive actions after a capabilities-constrained
SYSLOG_ACTION_OPEN check.

- /dev/kmsg allows:
 - open if CAP_SYSLOG or dmesg_restrict==0
 - reading/polling, after open
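
The three sets of rules listed in this message can be written down as one
decision function.  This is a user-space model, not the kernel's check:
the enum names and log_access_allowed() are hypothetical, and returning
-EPERM for non-read actions on /dev/kmsg is an assumption made for the
sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names for the actions and entry points discussed above. */
enum log_action { ACT_OPEN, ACT_READ, ACT_READ_ALL, ACT_SIZE_BUFFER, ACT_CLEAR };
enum log_source { FROM_SYSCALL, FROM_PROC_KMSG, FROM_DEV_KMSG };

/* Returns 0 if the action is allowed in this model, -EPERM otherwise. */
static int log_access_allowed(enum log_source src, enum log_action act,
			      bool cap_syslog, bool dmesg_restrict)
{
	assert(src == FROM_SYSCALL || src == FROM_PROC_KMSG ||
	       src == FROM_DEV_KMSG);

	switch (src) {
	case FROM_PROC_KMSG:
		/* Open needs CAP_SYSLOG; everything is allowed after open. */
		if (act == ACT_OPEN)
			return cap_syslog ? 0 : -EPERM;
		return 0;
	case FROM_SYSCALL:
		if (cap_syslog)
			return 0;
		if (!dmesg_restrict &&
		    (act == ACT_READ_ALL || act == ACT_SIZE_BUFFER))
			return 0;
		return -EPERM;
	case FROM_DEV_KMSG:
		if (act == ACT_OPEN)
			return (cap_syslog || !dmesg_restrict) ? 0 : -EPERM;
		if (act == ACT_READ)
			return 0;     /* reading/polling after open */
		return -EPERM;        /* assumption: nothing else is offered */
	}
	return -EPERM;
}
```

For example, an unprivileged open of /dev/kmsg is refused in this model
when dmesg_restrict is set and allowed when it is clear.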

Addresses https://bugzilla.redhat.com/show_bug.cgi?id=903192

Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Josh Boyer <jwboyer@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomigrate-shutdown-reboot-to-boot-cpu-v11
Robin Holt [Fri, 7 Jun 2013 00:07:01 +0000 (10:07 +1000)]
migrate-shutdown-reboot-to-boot-cpu-v11

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoreboot: migrate shutdown/reboot to boot cpu
Robin Holt [Fri, 7 Jun 2013 00:07:01 +0000 (10:07 +1000)]
reboot: migrate shutdown/reboot to boot cpu

We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus.  The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system.  We are switching to just migrating to the
boot cpu and then continuing with shutdown/reboot.

This also has the effect of not breaking x86's command line parameter for
specifying the reboot cpu.  Note, this code was shamelessly copied from
arch/x86/kernel/reboot.c with bits removed pertaining to the reboot_cpu
command line parameter.

Signed-off-by: Robin Holt <holt@sgi.com>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russ Anderson <rja@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agocpu-hotplug-provide-a-generic-helper-to-disable-enable-cpu-hotplug-v11
Srivatsa S. Bhat [Fri, 7 Jun 2013 00:07:01 +0000 (10:07 +1000)]
cpu-hotplug-provide-a-generic-helper-to-disable-enable-cpu-hotplug-v11

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoCPU hotplug: provide a generic helper to disable/enable CPU hotplug
Srivatsa S. Bhat [Fri, 7 Jun 2013 00:07:00 +0000 (10:07 +1000)]
CPU hotplug: provide a generic helper to disable/enable CPU hotplug

There are instances in the kernel where we would like to disable CPU
hotplug (from sysfs) during some important operation.  Today the freezer
code depends on this and the code to do it was kinda tailor-made for that.

Restructure the code and make it generic enough to be useful for
other usecases too.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russ Anderson <rja@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoMerge remote-tracking branch 'lzo-update/lzo-update'
Stephen Rothwell [Fri, 7 Jun 2013 05:38:51 +0000 (15:38 +1000)]
Merge remote-tracking branch 'lzo-update/lzo-update'

11 years agoMerge remote-tracking branch 'clk/clk-next'
Stephen Rothwell [Fri, 7 Jun 2013 05:37:10 +0000 (15:37 +1000)]
Merge remote-tracking branch 'clk/clk-next'

11 years agoMerge remote-tracking branch 'userns/for-next'
Stephen Rothwell [Fri, 7 Jun 2013 05:35:26 +0000 (15:35 +1000)]
Merge remote-tracking branch 'userns/for-next'

11 years agoMerge remote-tracking branch 'pwm/for-next'
Stephen Rothwell [Fri, 7 Jun 2013 05:33:45 +0000 (15:33 +1000)]
Merge remote-tracking branch 'pwm/for-next'