]> git.karo-electronics.de Git - karo-tx-linux.git/log
karo-tx-linux.git
10 years agoidr: don't need to shink the free list when idr_remove()
Lai Jiangshan [Thu, 22 May 2014 00:44:09 +0000 (10:44 +1000)]
idr: don't need to shink the free list when idr_remove()

After idr subsystem is changed to RCU-awared, the free layer will not go
to the free list.  The free list will not be filled up when idr_remove().
So we don't need to shink it too.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoidr: fix idr_replace()'s returned error code
Lai Jiangshan [Thu, 22 May 2014 00:44:08 +0000 (10:44 +1000)]
idr: fix idr_replace()'s returned error code

When the smaller id is not found, idr_replace() returns -ENOENT.  But when
the id is bigger enough, idr_replace() returns -EINVAL, actually there is
no difference between these two kinds of ids.

These are all unallocated id, the return values of the idr_replace() for
these ids should be the same: -ENOENT.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoidr: fix NULL pointer dereference when ida_remove(unallocated_id)
Lai Jiangshan [Thu, 22 May 2014 00:44:08 +0000 (10:44 +1000)]
idr: fix NULL pointer dereference when ida_remove(unallocated_id)

If the ida has at least one existing id, and when an unallocated ID which
meets a certain condition is passed to the ida_remove(), the system will
crash because it hits NULL pointer dereference.

The condition is that the unallocated ID shares the same lowest idr layer
with the existing ID, but the idr slot would be different if the
unallocated ID were to be allocated.

In this case the matching idr slot for the unallocated_id is NULL, causing
@bitmap to be NULL which the function dereferences without checking
crashing the kernel.

See the test code:

static void test3(void)
{
int id;
DEFINE_IDA(test_ida);

printk(KERN_INFO "Start test3\n");
if (ida_pre_get(&test_ida, GFP_KERNEL) < 0) return;
if (ida_get_new(&test_ida,  &id) < 0) return;
ida_remove(&test_ida, 4000); /* bug: null deference here */
printk(KERN_INFO "End of test3\n");
}

It happens only when the caller tries to free an unallocated ID which is
the caller's fault.  It is not a bug.  But it is better to add the proper
check and complain rather than crashing the kernel.

[tj@kernel.org: updated patch description]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoidr: fix unexpected ID-removal when idr_remove(unallocated_id)
Lai Jiangshan [Thu, 22 May 2014 00:44:08 +0000 (10:44 +1000)]
idr: fix unexpected ID-removal when idr_remove(unallocated_id)

If unallocated_id = (ANY * idr_max(idp->layers) + existing_id) is passed
to idr_remove().  The existing_id will be removed unexpectedly.

The following test shows this unexpected id-removal:

static void test4(void)
{
int id;
DEFINE_IDR(test_idr);

printk(KERN_INFO "Start test4\n");
id = idr_alloc(&test_idr, (void *)1, 42, 43, GFP_KERNEL);
BUG_ON(id != 42);
idr_remove(&test_idr, 42 + IDR_SIZE);
TEST_BUG_ON(idr_find(&test_idr, 42) != (void *)1);
idr_destroy(&test_idr);
printk(KERN_INFO "End of test4\n");
}

ida_remove() shares the similar problem.

It happens only when the caller tries to free an unallocated ID which is
the caller's fault.  It is not a bug.  But it is better to add the proper
check and complain rather than removing an existing_id silently.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoidr: fix overflow bug during maximum ID calculation at maximum height
Lai Jiangshan [Thu, 22 May 2014 00:44:08 +0000 (10:44 +1000)]
idr: fix overflow bug during maximum ID calculation at maximum height

idr_replace() open-codes the logic to calculate the maximum valid ID given
the height of the idr tree; unfortunately, the open-coded logic doesn't
account for the fact that the top layer may have unused slots and
over-shifts the limit to zero when the tree is at its maximum height.

The following test code shows it fails to replace the value for
id=((1<<27)+42):

static void test5(void)
{
int id;
DEFINE_IDR(test_idr);
#define TEST5_START ((1<<27)+42) /* use the highest layer */

printk(KERN_INFO "Start test5\n");
id = idr_alloc(&test_idr, (void *)1, TEST5_START, 0, GFP_KERNEL);
BUG_ON(id != TEST5_START);
TEST_BUG_ON(idr_replace(&test_idr, (void *)2, TEST5_START) != (void *)1);
idr_destroy(&test_idr);
printk(KERN_INFO "End of test5\n");
}

Fix the bug by using idr_max() which correctly takes into account the
maximum allowed shift.

sub_alloc() shares the same problem and may incorrectly fail with -EAGAIN;
however, this bug doesn't affect correct operation because
idr_get_empty_slot(), which already uses idr_max(), retries with the
increased @id in such cases.

[tj@kernel.org: Updated patch description.]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokexec: save PG_head_mask in VMCOREINFO
Petr Tesarik [Thu, 22 May 2014 00:44:07 +0000 (10:44 +1000)]
kexec: save PG_head_mask in VMCOREINFO

To allow filtering of huge pages, makedumpfile must be able to identify
them in the dump.  This can be done by checking the appropriate page flag,
so communicate its value to makedumpfile through the VMCOREINFO interface.

There's only one small catch.  Depending on how many page flags are
available on a given architecture, this bit can be called PG_head or
PG_compound.

I sent a similar patch back in 2012, but Eric Biederman did not like using
an #ifdef.  So, this time I'm adding a common symbol (PG_head_mask)
instead.

See https://lkml.org/lkml/2012/11/28/91 for the previous version.

Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/kexec.c: convert printk to pr_foo()
Fabian Frederick [Thu, 22 May 2014 00:44:07 +0000 (10:44 +1000)]
kernel/kexec.c: convert printk to pr_foo()

+ some pr_warning -> pr_warn and checkpatch warning fixes

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers
Masami Hiramatsu [Thu, 22 May 2014 00:44:07 +0000 (10:44 +1000)]
kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers

Add a "crash_kexec_post_notifiers" boot option to run kdump after running
panic_notifiers and dump kmsg.  This can help rare situations where kdump
fails because of unstable crashed kernel or hardware failure (memory
corruption on critical data/code), or the 2nd kernel is already broken by
the 1st kernel (it's a broken behavior, but who can guarantee that the
"crashed" kernel works correctly?).

Usage: add "crash_kexec_post_notifiers" to kernel boot option.

Note that this actually increases risks of the failure of kdump.
This option should be set only if you worry about the rare case
of kdump failure rather than increasing the chance of success.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Motohiro Kosaki <Motohiro.Kosaki@us.fujitsu.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Cc: Satoru MORIYA <satoru.moriya.br@hitachi.com>
Cc: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agocpu-hotplug-smp-flush-any-pending-ipi-callbacks-before-cpu-offline-v5-checkpatch...
Andrew Morton [Thu, 22 May 2014 00:44:07 +0000 (10:44 +1000)]
cpu-hotplug-smp-flush-any-pending-ipi-callbacks-before-cpu-offline-v5-checkpatch-fixes

WARNING: line over 80 characters
#105: FILE: kernel/stop_machine.c:230:
+  * arrive late due to hardware latencies.

WARNING: line over 80 characters
#106: FILE: kernel/stop_machine.c:231:
+  * So flush out any pending IPI callbacks

WARNING: line over 80 characters
#107: FILE: kernel/stop_machine.c:232:
+  * explicitly, to ensure that the outgoing

WARNING: line over 80 characters
#108: FILE: kernel/stop_machine.c:233:
+  * CPU doesn't go offline with work still

WARNING: line over 80 characters
#111: FILE: kernel/stop_machine.c:236:
+ generic_smp_call_function_single_interrupt();

total: 0 errors, 5 warnings, 60 lines checked

./patches/cpu-hotplug-smp-flush-any-pending-ipi-callbacks-before-cpu-offline-v5.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoCPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
Srivatsa S. Bhat [Thu, 22 May 2014 00:44:06 +0000 (10:44 +1000)]
CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline

During CPU offline, in the stop-machine loop, we use 2 separate stages to
disable interrupts, to ensure that the CPU going offline doesn't get any
new IPIs from the other CPUs after it has gone offline.

However, an IPI sent much earlier might arrive late on the target CPU
(possibly _after_ the CPU has gone offline) due to hardware latencies, and
due to this, the smp-call-function callbacks queued on the outgoing CPU
might not get noticed (and hence not executed) at all.

This is somewhat theoretical, but in any case, it makes sense to
explicitly loop through the call_single_queue and flush any pending
callbacks before the CPU goes completely offline.  So, flush the queued
smp-call-function callbacks in the MULTI_STOP_DISABLE_IRQ_ACTIVE stage,
after disabling interrupts on the active CPU.  This can be trivially
achieved by invoking the generic_smp_call_function_single_interrupt()
function itself (and since the outgoing CPU is still online at this point,
we won't trigger the "IPI to offline CPU" warning in this function; so we
are safe to call it here).

This way, we would have handled all the queued callbacks before going
offline, and also, no new IPIs can be sent by the other CPUs to the
outgoing CPU at that point, because they will all be executing the
stop-machine code with interrupts disabled.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agocpu-hotplug-stop-machine-plug-race-window-that-leads-to-ipi-to-offline-cpu-v5
Srivatsa S. Bhat [Thu, 22 May 2014 00:44:06 +0000 (10:44 +1000)]
cpu-hotplug-stop-machine-plug-race-window-that-leads-to-ipi-to-offline-cpu-v5

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agocpu-hotplug-stop-machine-plug-race-window-that-leads-to-ipi-to-offline-cpu-v3
Srivatsa S. Bhat [Thu, 22 May 2014 00:44:06 +0000 (10:44 +1000)]
cpu-hotplug-stop-machine-plug-race-window-that-leads-to-ipi-to-offline-cpu-v3

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoCPU hotplug, stop-machine: plug race-window that leads to "IPI-to-offline-CPU"
Srivatsa S. Bhat [Thu, 22 May 2014 00:44:06 +0000 (10:44 +1000)]
CPU hotplug, stop-machine: plug race-window that leads to "IPI-to-offline-CPU"

During CPU offline, stop-machine is used to take control over all the
online CPUs (via the per-cpu stopper thread) and then run take_cpu_down()
on the CPU that is to be taken offline.

But stop-machine itself has several stages: _PREPARE, _DISABLE_IRQ, _RUN
etc.  The important thing to note here is that the _DISABLE_IRQ stage
comes much later after starting stop-machine, and hence there is a large
window where other CPUs can send IPIs to the CPU going offline.  As a
result, we can encounter a scenario as depicted below, which causes IPIs
to be sent to the CPU going offline, and that CPU notices them *after* it
has gone offline, triggering the "IPI-to-offline-CPU" warning from the
smp-call-function code.

              CPU 1                                         CPU 2
          (Online CPU)                               (CPU going offline)

       Enter _PREPARE stage                          Enter _PREPARE stage

                                                     Enter _DISABLE_IRQ stage

                                                   =
       Got a device interrupt,                     | Didn't notice the IPI
       and the interrupt handler                   | since interrupts were
       called smp_call_function()                  | disabled on this CPU.
       and sent an IPI to CPU 2.                   |
                                                   =

       Enter _DISABLE_IRQ stage

       Enter _RUN stage                              Enter _RUN stage

                                  =
       Busy loop with interrupts  |                  Invoke take_cpu_down()
       disabled.                  |                  and take CPU 2 offline
                                  =

       Enter _EXIT stage                             Enter _EXIT stage

       Re-enable interrupts                          Re-enable interrupts

                                                     The pending IPI is noted
                                                     immediately, but alas,
                                                     the CPU is offline at
                                                     this point.

So, as we can observe from this scenario, the IPI was sent when CPU 2 was
still online, and hence it was perfectly legal.  But unfortunately it was
noted only after CPU 2 went offline, resulting in the warning from the IPI
handling code.  In other words, the fault was not at the sender, but at
the receiver side - and if we look closely, the real bug is in the
stop-machine sequence itself.

The problem here is that the CPU going offline disabled its local
interrupts (by entering _DISABLE_IRQ phase) *before* the other CPUs.  And
that's the reason why it was not able to respond to the IPI before going
offline.

A simple solution to this problem is to ensure that the CPU going offline
disables its interrupts only *after* the other CPUs do the same thing.  To
achieve this, split the _DISABLE_IRQ state into 2 parts:

1st part: MULTI_STOP_DISABLE_IRQ_INACTIVE, where only the non-active CPUs
(i.e., the "other" CPUs) disable their interrupts.

2nd part: MULTI_STOP_DISABLE_IRQ_ACTIVE, where the active CPU (i.e., the
CPU going offline) disables its interrupts.

With this in place, the CPU going offline will always be the last one to
disable interrupts.  After this step, no further IPIs can be sent to the
outgoing CPU, since all the other CPUs would be executing the stop-machine
code with interrupts disabled.  And by the time stop-machine ends, the CPU
would have gone offline and disappeared from the cpu_online_mask, and
hence future invocations of smp_call_function() and friends will
automatically prune that CPU out.  Thus, we can guarantee that no CPU will
end up *inadvertently* sending IPIs to an offline CPU.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosmp-print-more-useful-debug-info-upon-receiving-ipi-on-an-offline-cpu-v5
Srivatsa S. Bhat [Thu, 22 May 2014 00:44:05 +0000 (10:44 +1000)]
smp-print-more-useful-debug-info-upon-receiving-ipi-on-an-offline-cpu-v5

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosmp-print-more-useful-debug-info-upon-receiving-ipi-on-an-offline-cpu-fix
Andrew Morton [Thu, 22 May 2014 00:44:05 +0000 (10:44 +1000)]
smp-print-more-useful-debug-info-upon-receiving-ipi-on-an-offline-cpu-fix

correctly suppress warning output on second and later occurrences

Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosmp: print more useful debug info upon receiving IPI on an offline CPU
Srivatsa S. Bhat [Thu, 22 May 2014 00:44:05 +0000 (10:44 +1000)]
smp: print more useful debug info upon receiving IPI on an offline CPU

There is a longstanding problem related to CPU hotplug which causes IPIs
to be delivered to offline CPUs, and the smp-call-function IPI handler
code prints out a warning whenever this is detected.  Every once in a
while this (usually harmless) warning gets reported on LKML, but so far it
has not been completely fixed.  Usually the solution involves finding out
the IPI sender and fixing it by adding appropriate synchronization with
CPU hotplug.

However, while going through one such internal bug reports, I found that
there is a significant bug in the receiver side itself (more specifically,
in stop-machine) that can lead to this problem even when the sender code
is perfectly fine.  This patchset fixes that synchronization problem in
the CPU hotplug stop-machine code.

Patch 1 adds some additional debug code to the smp-call-function
framework, to help debug such issues easily.

Patch 2 modifies the stop-machine code to ensure that any IPIs that were
sent while the target CPU was online, would be noticed and handled by that
CPU without fail before it goes offline.  Thus, this avoids scenarios
where IPIs are received on offline CPUs (as long as the sender uses proper
hotplug synchronization).

In fact, I debugged the problem by using Patch 1, and found that the
payload of the IPI was always the block layer's trigger_softirq()
function.  But I was not able to find anything wrong with the block layer
code.  That's when I started looking at the stop-machine code and realized
that there is a race-window which makes the IPI _receiver_ the culprit,
not the sender.  Patch 2 fixes that race and hence this should put an end
to most of the hard-to-debug IPI-to-offline-CPU issues.

This patch (of 2):

Today the smp-call-function code just prints a warning if we get an IPI on
an offline CPU.  This info is sufficient to let us know that something
went wrong, but often it is very hard to debug exactly who sent the IPI
and why, from this info alone.

In most cases, we get the warning about the IPI to an offline CPU,
immediately after the CPU going offline comes out of the stop-machine
phase and reenables interrupts.  Since all online CPUs participate in
stop-machine, the information regarding the sender of the IPI is already
lost by the time we exit the stop-machine loop.  So even if we dump the
stack on each CPU at this point, we won't find anything useful since all
of them will show the stack-trace of the stopper thread.  So we need a
better way to figure out who sent the IPI and why.

To achieve this, when we detect an IPI targeted to an offline CPU, loop
through the call-single-data linked list and print out the payload (i.e.,
the name of the function which was supposed to be executed by the target
CPU).  This would give us an insight as to who might have sent the IPI and
help us debug this further.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: change wait_for_helper() to use kernel_sigaction()
Oleg Nesterov [Thu, 22 May 2014 00:44:04 +0000 (10:44 +1000)]
signals: change wait_for_helper() to use kernel_sigaction()

Now that we have kernel_sigaction() we can change wait_for_helper() to use
it and cleans up the code a bit.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: introduce kernel_sigaction()
Oleg Nesterov [Thu, 22 May 2014 00:44:04 +0000 (10:44 +1000)]
signals: introduce kernel_sigaction()

Now that allow_signal() is really trivial we can unify it with
disallow_signal().  Add the new helper, kernel_sigaction(), and
reimplement allow_signal/disallow_signal as a trivial wrappers.

This saves one EXPORT_SYMBOL() and the new helper can have more users.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: disallow_signal() should flush the potentially pending signal
Oleg Nesterov [Thu, 22 May 2014 00:44:04 +0000 (10:44 +1000)]
signals: disallow_signal() should flush the potentially pending signal

disallow_signal() simply sets SIG_IGN, this is not enough and
recalc_sigpending() is simply pointless because in can never change the
state of TIF_SIGPENDING.

If we ignore a signal, we also need to do flush_sigqueue_mask() for the
case when this signal is pending, this way recalc_sigpending() can
actually clear TIF_SIGPENDING and we do not "leak" the allocated
siginfo's.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: kill the obsolete sigdelset() and recalc_sigpending() in allow_signal()
Oleg Nesterov [Thu, 22 May 2014 00:44:03 +0000 (10:44 +1000)]
signals: kill the obsolete sigdelset() and recalc_sigpending() in allow_signal()

allow_signal() does sigdelset(current->blocked) due to historic reason,
previously it could be called by a daemonize()'ed kthread, and daemonize()
played with current->blocked.

Now that daemonize() has gone away we can remove sigdelset() and
recalc_sigpending().  If a user really wants to unblock a signal, it must
use sigprocmask() or set_current_block() explicitely.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: jffs2: fix the wrong usage of disallow_signal()
Oleg Nesterov [Thu, 22 May 2014 00:44:03 +0000 (10:44 +1000)]
signals: jffs2: fix the wrong usage of disallow_signal()

jffs2_garbage_collect_thread() does disallow_signal(SIGHUP) around
jffs2_garbage_collect_pass() and the comment says "We don't want SIGHUP to
interrupt us".

But disallow_signal() can't ensure that jffs2_garbage_collect_pass() won't
be interrupted by SIGHUP, the problem is that SIGHUP can be already
pending when disallow_signal() is called, and in this case any
interruptible sleep won't block.

Note: this is in fact because disallow_signal() is buggy and should be
fixed, see the next changes.

But there is another reason why disallow_signal() is wrong: SIG_IGN set by
disallow_signal() silently discards any SIGHUP which can be sent before
the next allow_signal(SIGHUP).

Change this code to use sigprocmask(SIG_UNBLOCK/SIG_BLOCK, SIGHUP).  This
even matches the old (and wrong) semantics allow/disallow had when this
logic was written.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: mv {dis,}allow_signal() from sched.h/exit.c to signal.[ch]
Oleg Nesterov [Thu, 22 May 2014 00:44:03 +0000 (10:44 +1000)]
signals: mv {dis,}allow_signal() from sched.h/exit.c to signal.[ch]

Move the declaration/definition of allow_signal/disallow_signal to
signal.h/signal.c.  The new place is more logical and allows to use the
static helpers in signal.c (see the next changes).

While at it, make them return void and remove the valid_signal() check.
Nobody checks the returned value, and in-kernel users must not pass the
wrong signal number.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: cleanup the usage of t/current in do_sigaction()
Oleg Nesterov [Thu, 22 May 2014 00:44:03 +0000 (10:44 +1000)]
signals: cleanup the usage of t/current in do_sigaction()

The usage of "task_struct *t" and "current" in do_sigaction() looks really
annoying and chaotic.  Initially "t" is used as a cached value of current
but not consistently, then it is reused as a loop variable and we have to
use "current" again.

Clean up this mess and also convert the code to use for_each_thread().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: rename rm_from_queue_full() to flush_sigqueue_mask()
Oleg Nesterov [Thu, 22 May 2014 00:44:02 +0000 (10:44 +1000)]
signals: rename rm_from_queue_full() to flush_sigqueue_mask()

"rm_from_queue_full" looks ugly and misleading, especially now that
rm_from_queue() has gone away.  Rename it to flush_sigqueue_mask(), this
matches flush_sigqueue() we already have.

Also remove the obsolete comment which explains the difference with
rm_from_queue() we already killed.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: kill rm_from_queue(), change prepare_signal() to use for_each_thread()
Oleg Nesterov [Thu, 22 May 2014 00:44:02 +0000 (10:44 +1000)]
signals: kill rm_from_queue(), change prepare_signal() to use for_each_thread()

rm_from_queue() doesn't make sense.  The only caller, prepare_signal(),
can use rm_from_queue_full() with the same effect.

While at it, change prepare_signal() to use for_each_thread() instead of
do/while_each_thread.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: s/siginitset/sigemptyset/ in do_sigtimedwait()
Oleg Nesterov [Thu, 22 May 2014 00:44:02 +0000 (10:44 +1000)]
signals: s/siginitset/sigemptyset/ in do_sigtimedwait()

Cosmetic, but siginitset(0) looks a bit strange, sigemptyset() is what
do_sigtimedwait() needs.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosignals: kill sigfindinword()
Oleg Nesterov [Thu, 22 May 2014 00:44:02 +0000 (10:44 +1000)]
signals: kill sigfindinword()

It has no users and it doesn't look useful.  I do not know why/when it was
introduced, I can't even find any user in the git history.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoptrace: task_clear_jobctl_trapping()->wake_up_bit() needs mb()
Oleg Nesterov [Thu, 22 May 2014 00:44:01 +0000 (10:44 +1000)]
ptrace: task_clear_jobctl_trapping()->wake_up_bit() needs mb()

__wake_up_bit() checks waitqueue_active() and thus the caller needs mb()
as wake_up_bit() documents, fix task_clear_jobctl_trapping().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoptrace: fix fork event messages across pid namespaces
Matthew Dempsky [Thu, 22 May 2014 00:44:01 +0000 (10:44 +1000)]
ptrace: fix fork event messages across pid namespaces

When tracing a process in another pid namespace, it's important for fork
event messages to contain the child's pid as seen from the tracer's pid
namespace, not the parent's.  Otherwise, the tracer won't be able to
correlate the fork event with later SIGTRAP signals it receives from the
child.

We still risk a race condition if a ptracer from a different pid namespace
attaches after we compute the pid_t value.  However, sending a bogus fork
event message in this unlikely scenario is still a vast improvement over
the status quo where we always send bogus fork event messages to debuggers
in a different pid namespace than the forking process.

Signed-off-by: Matthew Dempsky <mdempsky@chromium.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Julien Tinnes <jln@chromium.org>
Cc: Roland McGrath <mcgrathr@chromium.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoDocumentation/SubmittingPatches: describe the Fixes: tag
Jacob Keller [Thu, 22 May 2014 00:44:01 +0000 (10:44 +1000)]
Documentation/SubmittingPatches: describe the Fixes: tag

Update the SubmittingPatches process to include howto about the new
'Fixes:' tag to be used when a patch fixes an issue in a previous commit
(found by git-bisect for example).

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/fat/inode.c: clean up string initializations (char[] instead of char *)
Manuel Schölling [Thu, 22 May 2014 00:44:01 +0000 (10:44 +1000)]
fs/fat/inode.c: clean up string initializations (char[] instead of char *)

Initializations like 'char *foo = "bar"' will create two variables: a
static string and a pointer (foo) to that static string.  Instead 'char
foo[] = "bar"' will declare a single variable and will end up in shorter
assembly (according to Jeff Garzik on the KernelJanitor's TODO list).

Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/fat/: add support for DOS 1.x formatted volumes
Conrad Meyer [Thu, 22 May 2014 00:44:00 +0000 (10:44 +1000)]
fs/fat/: add support for DOS 1.x formatted volumes

Add structure for parsed BPB information, struct fat_bios_param_block, and
move all of the deserialization and validation logic from fat_fill_super()
into fat_read_bpb().

Add a 'dos1xfloppy' mount option to infer DOS 2.x BIOS Parameter Block
defaults from block device geometry for ancient floppies and floppy
images, as a fall-back from the default BPB parsing logic.

When fat_read_bpb() finds an invalid FAT filesystem and dos1xfloppy is
set, fall back to fat_read_static_bpb().  fat_read_static_bpb() validates
that the entire BPB is zero, and that the floppy has a DOS-style 8086 code
bootstrapping header.  Then it fills in default BPB values from media size
and a table.[0]

Media size is assumed to be static for archaic FAT volumes. See also:
[1].

Fixes kernel.org bug #42617.

[0]: https://en.wikipedia.org/wiki/File_Allocation_Table#Exceptions
[1]: http://www.win.tue.nl/~aeb/linux/fs/fat/fat-1.html

Signed-off-by: Conrad Meyer <cse.cem@gmail.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/hpfs: use __func__ for logging
Fabian Frederick [Thu, 22 May 2014 00:44:00 +0000 (10:44 +1000)]
fs/hpfs: use __func__ for logging

Normalize function display fx() using __func__

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/hpfs: use pr_fmt for logging
Fabian Frederick [Thu, 22 May 2014 00:44:00 +0000 (10:44 +1000)]
fs/hpfs: use pr_fmt for logging

Also remove redundant level names (warning:...)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/hpfs: convert printk to pr_foo()
Fabian Frederick [Thu, 22 May 2014 00:44:00 +0000 (10:44 +1000)]
fs/hpfs: convert printk to pr_foo()

No level printk in hptfs_error converted to pr_err (others to pr_warn or
pr_info)

This patch also fixes if/then/else checkpatch warnings

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/ufs/balloc.c: remove err parameter in ufs_add_fragments
Fabian Frederick [Thu, 22 May 2014 00:43:59 +0000 (10:43 +1000)]
fs/ufs/balloc.c: remove err parameter in ufs_add_fragments

err is used in ufs_new_fragments (ufs_add_fragments only callsite)
not in ufs_add_fragments.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohfsplus: fix longname handling
Sougata Santra [Thu, 22 May 2014 00:43:59 +0000 (10:43 +1000)]
hfsplus: fix longname handling

Longname is not correctly handled by hfsplus driver.  If an attempt to
create a longname(>255) file/directory is made, it succeeds by creating a
file/directory with HFSPLUS_MAX_STRLEN and incorrect catalog key.  Thus
leaving the volume in an inconsistent state.  This patch fixes this issue.

Although lookup is always called first to create a negative entry, so just
doing a check in lookup would probably fix this issue.  I choose to
propagate error to other iops as well.

Please NOTE: I have factored out hfsplus_cat_build_key_with_cnid from
hfsplus_cat_build_key, to avoid unncessary branching.

Thanks a lot.

TEST:
------
dir="TEST_DIR"
cdir=`pwd`
name255="_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_1234"
name256="${name255}5"

mkdir $dir
cd $dir
touch $name255
rm -f $name255
touch $name256
ls -la
cd $cdir
rm -rf $dir

RESULT:
-------
[sougata@ultrabook tmp]$ cdir=`pwd`
[sougata@ultrabook tmp]$
name255="_123456789_123456789_123456789_123456789_123456789_123456789\
 > _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
 > _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
 > _123456789_123456789_123456789_123456789_123456789_1234"
[sougata@ultrabook tmp]$ name256="${name255}5"
[sougata@ultrabook tmp]$
[sougata@ultrabook tmp]$ mkdir $dir
[sougata@ultrabook tmp]$ cd $dir
[sougata@ultrabook TEST_DIR]$ touch $name255
[sougata@ultrabook TEST_DIR]$ rm -f $name255
[sougata@ultrabook TEST_DIR]$ touch $name256
[sougata@ultrabook TEST_DIR]$ ls -la
ls: cannot access
_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234:
No such file or directory
total 0
drwxrwxr-x 1 sougata sougata 3 Feb 20 19:56 .
drwxrwxrwx 1 root    root    6 Feb 20 19:56 ..
-????????? ? ?       ?       ?            ?
_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234
[sougata@ultrabook TEST_DIR]$ cd $cdir
[sougata@ultrabook tmp]$ rm -rf $dir
rm: cannot remove `TEST_DIR': Directory not empty

-ENAMETOOLONG returned from hfsplus_asc2uni was not propaged to iops.
This allowed hfsplus to create files/directories with HFSPLUS_MAX_STRLEN
and incorrect keys, leaving the FS in an inconsistent state.  This patch
fixes this issue.

Signed-off-by: Sougata Santra <sougata@tuxera.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/hfsplus/wrapper.c: replace shift loop by ilog2
Fabian Frederick [Thu, 22 May 2014 00:43:59 +0000 (10:43 +1000)]
fs/hfsplus/wrapper.c: replace shift loop by ilog2

Replace while blocksize;shift by ilog2

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohfsplus: fix "unused node is not erased" error
Sergei Antonov [Thu, 22 May 2014 00:43:59 +0000 (10:43 +1000)]
hfsplus: fix "unused node is not erased" error

Zero newly allocated extents in the catalog tree if volume attributes tell
us to.  Not doing so we risk getting the "unused node is not erased"
error.  See kHFSUnusedNodeFix flag in Apple's source code for reference.

There was a previous commit clearing the node when it is freed:
  commit 899bed05e9f6bbb21776f9ebd88f5631987f987a
  Author: Vyacheslav Dubeyko <slava@dubeyko.com>
  Date:   Wed Feb 27 17:03:06 2013 -0800
  hfsplus: fix issue with unzeroed unused b-tree nodes
It did not handle newly allocated extents (this patch fixes it). And it zeroed
nodes in all trees unconditionally which is an overkill. This patch adds a
condition and also switches to 'tree->node_size' as a simpler method of getting
the length to zero.

Signed-off-by: Sergei Antonov <saproj@gmail.com>
Cc: Anton Altaparmakov <aia21@cam.ac.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Kyle Laracey <kalaracey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/hfsplus/wrapper.c: replace min/casting by min_t
Fabian Frederick [Thu, 22 May 2014 00:43:58 +0000 (10:43 +1000)]
fs/hfsplus/wrapper.c: replace min/casting by min_t

Also add * before function comments (it was not detected by kernel-doc)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/hfsplus/options.c: replace seq_printf by seq_puts
Fabian Frederick [Thu, 22 May 2014 00:43:58 +0000 (10:43 +1000)]
fs/hfsplus/options.c: replace seq_printf by seq_puts

Replace seq_printf where possible

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/hfsplus/bnode.c: replace min/casting by min_t
Fabian Frederick [Thu, 22 May 2014 00:43:58 +0000 (10:43 +1000)]
fs/hfsplus/bnode.c: replace min/casting by min_t

Also fixes some pr_ formats

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohfsplus: emit proper file type from readdir
Sergei Antonov [Thu, 22 May 2014 00:43:58 +0000 (10:43 +1000)]
hfsplus: emit proper file type from readdir

hfsplus_readdir() incorrectly returned DT_REG for symbolic links and
special files.  Return DT_REG, DT_LNK, DT_FIFO, DT_CHR, DT_BLK, DT_SOCK,
or DT_UNKNOWN according to mode field in catalog record.  Programs relying
on information from readdir will now work correctly with HFS+.

Signed-off-by: Sergei Antonov <saproj@gmail.com>
Cc: Anton Altaparmakov <aia21@cam.ac.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohfsplus: remove unused routine hfsplus_attr_build_key_uni
Hin-Tak Leung [Thu, 22 May 2014 00:43:57 +0000 (10:43 +1000)]
hfsplus: remove unused routine hfsplus_attr_build_key_uni

The directory/file catalog b-tree equivalent, hfsplus_build_key_uni(), is
used by hfsplus_find_cat() for internal referencing between catalog
records.  There is no corresponding usage for attributes - attribute
records do not refer to one another.

Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Sougata Santra <sougata@tuxera.com>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohfsplus-correct-usage-of-hfsplus_attr_max_strlen-for-non-english-attributes-fix-2
Andrew Morton [Thu, 22 May 2014 00:43:57 +0000 (10:43 +1000)]
hfsplus-correct-usage-of-hfsplus_attr_max_strlen-for-non-english-attributes-fix-2

fix build

fs/hfsplus/xattr_security.c: In function 'hfsplus_security_getxattr':
fs/hfsplus/xattr_security.c:23: error: 'NLS_MAX_CHARSET_SIZE' undeclared (first use in this function)
fs/hfsplus/xattr_security.c:23: error: (Each undeclared identifier is reported o

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Sougata Santra <sougata@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohfsplus-correct-usage-of-hfsplus_attr_max_strlen-for-non-english-attributes-fix
Andrew Morton [Thu, 22 May 2014 00:43:57 +0000 (10:43 +1000)]
hfsplus-correct-usage-of-hfsplus_attr_max_strlen-for-non-english-attributes-fix

fix build

fs/hfsplus/xattr_user.c: In function 'hfsplus_user_getxattr':
fs/hfsplus/xattr_user.c:21: error: 'NLS_MAX_CHARSET_SIZE' undeclared (first use in this function)
fs/hfsplus/xattr_user.c:21: error: (Each undeclared identifier is reported only once

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Sougata Santra <sougata@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohfsplus: correct usage of HFSPLUS_ATTR_MAX_STRLEN for non-English attributes
Hin-Tak Leung [Thu, 22 May 2014 00:43:57 +0000 (10:43 +1000)]
hfsplus: correct usage of HFSPLUS_ATTR_MAX_STRLEN for non-English attributes

HFSPLUS_ATTR_MAX_STRLEN (=127) is the limit of attribute names for the
number of unicode character (UTF-16BE) storable in the HFS+ file system.
Almost all the current usage of it is wrong, in relation to NLS to on-disk
conversion.

Except for one use calling hfsplus_asc2uni (which should stay the same)
and its uses in calling hfsplus_uni2asc (which was corrected in the
earlier patch in this series concerning usage of hfsplus_uni2asc), all the
other uses are of the forms:

- char buffer[size]

- bound check: "if (namespace_adjusted_input_length > size) return failure;"

Conversion between on-disk unicode representation and NLS char strings (in
whichever direction) always needs to accommodate the worst-case NLS
conversion, so all char buffers of that size need to have a
NLS_MAX_CHARSET_SIZE x .

The bound checks are all wrong, since they compare nls_length derived from
strlen() to a unicode length limit.

It turns out that all the bound-checks do is to protect hfsplus_asc2uni(),
which can fail if the input is too large.  There is only one usage of it
as far as attributes are concerned, in hfsplus_attr_build_key().  It is in
turn used by hfsplus_find_attr(), hfsplus_create_attr(),
hfsplus_delete_attr().  Thus making sure that errors from
hfsplus_asc2uni() is caught in hfsplus_attr_build_key() and propagated is
sufficient to replace all the bound checks.

Unpropagated errors from hfsplus_asc2uni() in the file catalog code was
addressed recently in an independent patch "hfsplus: fix longname handling"
by Sougata Santra.

Before this patch, trying to set a 55 CJK character (in a UTF-8
locale, > 127/3=42) attribute plus user prefix fails with:

    $ setfattr -n user.`cat testing-string` -v `cat testing-string` \
        testing-string
    setfattr: testing-string: Operation not supported

and retrieving a stored long attributes is particular ugly(!):

    find /mnt/* -type f -exec getfattr -d {} \;
    getfattr: /mnt/testing-string: Input/output error

with console log:
    [268008.389781] hfsplus: unicode conversion failed

After the patch, both of the above works.

FYI, the test attribute string is prepared with:

echo -e -n \
"\xe9\x80\x99\xe6\x98\xaf\xe4\xb8\x80\xe5\x80\x8b\xe9\x9d\x9e\xe5" \
"\xb8\xb8\xe6\xbc\xab\xe9\x95\xb7\xe8\x80\x8c\xe6\xa5\xb5\xe5\x85" \
"\xb6\xe4\xb9\x8f\xe5\x91\xb3\xe5\x92\x8c\xe7\x9b\xb8\xe7\x95\xb6" \
"\xe7\x84\xa1\xe8\xb6\xa3\xe3\x80\x81\xe4\xbb\xa5\xe5\x8f\x8a\xe7" \
"\x84\xa1\xe7\x94\xa8\xe7\x9a\x84\xe3\x80\x81\xe5\x86\x8d\xe5\x8a" \
"\xa0\xe4\xb8\x8a\xe6\xaf\xab\xe7\x84\xa1\xe6\x84\x8f\xe7\xbe\xa9" \
"\xe7\x9a\x84\xe6\x93\xb4\xe5\xb1\x95\xe5\xb1\xac\xe6\x80\xa7\xef" \
"\xbc\x8c\xe8\x80\x8c\xe5\x85\xb6\xe5\x94\xaf\xe4\xb8\x80\xe5\x89" \
"\xb5\xe5\xbb\xba\xe7\x9b\xae\xe7\x9a\x84\xe5\x83\x85\xe6\x98\xaf" \
"\xe7\x82\xba\xe4\xba\x86\xe6\xb8\xac\xe8\xa9\xa6\xe4\xbd\x9c\xe7" \
"\x94\xa8\xe3\x80\x82" | tr -d ' '

(= "pointlessly long attribute for testing", elaborate Chinese in
UTF-8 enoding).

However, it is not possible to set double the size (110 + 5 is still
under 127) in a UTF-8 locale:

    $setfattr -n user.`cat testing-string testing-string` -v \
        `cat testing-string testing-string` testing-string
    setfattr: testing-string: Numerical result out of range

110 CJK char in UTF-8 is 330 bytes - the generic get/set attribute system
call code in linux/fs/xattr.c imposes a 255 byte limit.  One can use a
combination of iconv to encode content, changing terminal locale for
viewing, and an nls=cp932/cp936/cp949/cp950 mount option to fully use
127-unicode attribute in a double-byte locale.

Also, as an additional information, it is possible to (mis-)use unicode
half-width/full-width forms (U+FFxx) to write attributes which looks like
english but not actually ascii.

Thanks Anton Altaparmakov for reviewing the earlier ideas behind this
change.

Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Sougata Santra <sougata@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohfsplus-fixes-worst-case-unicode-to-char-conversion-of-file-names-and-attributes-fix
Andrew Morton [Thu, 22 May 2014 00:43:56 +0000 (10:43 +1000)]
hfsplus-fixes-worst-case-unicode-to-char-conversion-of-file-names-and-attributes-fix

fix build

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Sougata Santra <sougata@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohfsplus: fix worst-case unicode to char conversion of file names and attributes
Hin-Tak Leung [Thu, 22 May 2014 00:43:56 +0000 (10:43 +1000)]
hfsplus: fix worst-case unicode to char conversion of file names and attributes

This is a series of 3 patches which corrects issues in HFS+ concerning the
use of non-english file names and attributes.  Names and attributes are
stored internally as UTF-16 units up to a fixed maximum size, and convert
to and from user-representation by NLS.  The code incorrectly assume that
NLS string lengths are equal to unicode lengths, which is only true for
English ascii usage.

This patch (of 3):

The HFS Plus Volume Format specification (TN1150) states that file names
are stored internally as a maximum of 255 unicode characters, as defined
by The Unicode Standard, Version 2.0 [Unicode, Inc.  ISBN 0-201-48345-9].
File names are converted by the NLS system on Linux before presented to
the user.

255 CJK characters converts to UTF-8 with 1 unicode character to up to 3
bytes, and to GB18030 with 1 unicode character to up to 4 bytes.  Thus,
trying in a UTF-8 locale to list files with names of more than 85 CJK
characters results in:

    $ ls /mnt
    ls: reading directory /mnt: File name too long

The receiving buffer to hfsplus_uni2asc() needs to be 255 x
NLS_MAX_CHARSET_SIZE bytes, not 255 bytes as the code has always been.

Similar consideration applies to attributes, which are stored internally
as a maximum of 127 UTF-16BE units.  See XNU source for an up-to-date
reference on attributes.

Strictly speaking, the maximum value of NLS_MAX_CHARSET_SIZE = 6 is not
attainable in the case of conversion to UTF-8, as going beyond 3 bytes
requires the use of surrogate pairs, i.e.  consuming two input units.

Thanks Anton Altaparmakov for reviewing an earlier version of this change.

This patch fixes all callers of hfsplus_uni2asc(), and also enables the
use of long non-English file names in HFS+.  The getting and setting, and
general usage of long non-English attributes requires further forthcoming
work, in the following patches of this series.

Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Sougata Santra <sougata@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/coda: use __func__
Fabian Frederick [Thu, 22 May 2014 00:43:56 +0000 (10:43 +1000)]
fs/coda: use __func__

Replace all function names by __func__ in pr_foo()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/coda: logging prefix uniformization
Fabian Frederick [Thu, 22 May 2014 00:43:56 +0000 (10:43 +1000)]
fs/coda: logging prefix uniformization

- Add pr_fmt based on module name.

- Remove Coda: coda: from pr_foo()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/coda: replace printk by pr_foo()
Fabian Frederick [Thu, 22 May 2014 00:43:55 +0000 (10:43 +1000)]
fs/coda: replace printk by pr_foo()

No level printk converted to pr_warn or pr_info

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/isofs: logging clean-up
Fabian Frederick [Thu, 22 May 2014 00:43:55 +0000 (10:43 +1000)]
fs/isofs: logging clean-up

-All printk(KERN_foo converted to pr_foo()
-Default printk converted to pr_warn()
-Define DEBUG in pr_debug callsites to keep old printk(DEBUG behaviour
-Add DEBUG_FLAGS in Makefile for previous #ifdef DEBUG
-Coalesce format fragments.
-Separate format/arguments on lines > 80 characters.
-Add ISOFS, ISOFS Rock, zisofs pr_fmt

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jan Kara <jack@suse.cz>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/befs: kernel-doc fixes
Fabian Frederick [Thu, 22 May 2014 00:43:55 +0000 (10:43 +1000)]
fs/befs: kernel-doc fixes

Fix some comment errors.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/befs/linuxvfs.c: remove positive test on sector_t
Fabian Frederick [Thu, 22 May 2014 00:43:55 +0000 (10:43 +1000)]
fs/befs/linuxvfs.c: remove positive test on sector_t

sector_t is unsigned.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/befs/btree.c: Replace strncpy by strlcpy + coding style fixing
Fabian Frederick [Thu, 22 May 2014 00:43:54 +0000 (10:43 +1000)]
fs/befs/btree.c: Replace strncpy by strlcpy + coding style fixing

-strncpy + end of string assignement replaced by strlcpy
-Fix endif };
-Fix typo

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/befs/linuxvfs.c: replace strncpy by strlcpy
Fabian Frederick [Thu, 22 May 2014 00:43:54 +0000 (10:43 +1000)]
fs/befs/linuxvfs.c: replace strncpy by strlcpy

strncpy + end of string assignment replaced by strlcpy

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agortc: rtc-at91rm9200: fix infinite wait for ACKUPD irq
Boris BREZILLON [Thu, 22 May 2014 00:43:54 +0000 (10:43 +1000)]
rtc: rtc-at91rm9200: fix infinite wait for ACKUPD irq

The rtc user must wait at least 1 sec between each time/calandar update
(see atmel's datasheet chapter "Updating Time/Calendar").

Use the 1Hz interrupt to update the at91_rtc_upd_rdy flag and wait for the
at91_rtc_wait_upd_rdy event if the rtc is not ready.

Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Reported-by: Bryan Evenson <bevenson@melinkcorp.com>
Tested-by: Bryan Evenson <bevenson@melinkcorp.com>
Cc: Andrew Victor <linux@maxim.org.za>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-hym8563.c: add optional clock-output-names property
Heiko Stuebner [Thu, 22 May 2014 00:43:54 +0000 (10:43 +1000)]
drivers/rtc/rtc-hym8563.c: add optional clock-output-names property

This enables the setting of a custom clock name for the clock provided by
the hym8563 rtc.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-bfin.c: do not abort when requesting irq fails
Mike Frysinger [Thu, 22 May 2014 00:43:53 +0000 (10:43 +1000)]
drivers/rtc/rtc-bfin.c: do not abort when requesting irq fails

The RTC framework does not let you return an error once a call to
devm_rtc_device_register has succeeded.  Avoid doing that when the IRQ
request fails as we can still support reading/writing the clock without
the IRQ.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Reported-by: Ales Novak <alnovak@suse.cz>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-omap.c: add support for enabling 32khz clock
Sekhar Nori [Thu, 22 May 2014 00:43:53 +0000 (10:43 +1000)]
drivers/rtc/rtc-omap.c: add support for enabling 32khz clock

Newer versions of OMAP RTC IP such as those found in AM335x and DRA7x need
an explicit enable of 32khz functional clock which ticks the RTC.

AM335x support was working so far because of settings done in U-Boot.
However, the DRA7x U-Boot does no such enable of 32khz clock and this
patch is need to get the RTC to work on DRA7x at least.  In general, it is
better to not depend on settings done in U-Boot.

Thanks to Lokesh Vutla for noticing this.

Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-omap.c: use BIT() macro
Sekhar Nori [Thu, 22 May 2014 00:43:53 +0000 (10:43 +1000)]
drivers/rtc/rtc-omap.c: use BIT() macro

Use BIT() macro for RTC_HAS_<FEATURE> defines instead of hand-writing bit
masks.

Use BIT() macros for register bit field definitions.

While at it, fix indentation done using spaces.

No functional change in this patch.

Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-omap.c: remove multiple device id checks
Sekhar Nori [Thu, 22 May 2014 00:43:53 +0000 (10:43 +1000)]
drivers/rtc/rtc-omap.c: remove multiple device id checks

Remove multiple superfluous device id checks.  Since an id_table is
present in the driver probe() should never encounter an empty device id
entry.  In case of OF style match, of_match_device() returns an matching
entry.

For paranoia sake, check for device id entry once and fail probe() if none
is found.  This is much better than checking for it multiple times.

Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agortc-da9063-rtc-driver-fix
Andrew Morton [Thu, 22 May 2014 00:43:52 +0000 (10:43 +1000)]
rtc-da9063-rtc-driver-fix

coding-style tweaks

Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Dajun Chen <david.chen@diasemi.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Mark Brown <broonie@linaro.org>
Cc: Opensource [Steve Twiss] <stwiss.opensource@diasemi.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agortc: da9063: RTC driver
Opensource [Steve Twiss] [Thu, 22 May 2014 00:43:52 +0000 (10:43 +1000)]
rtc: da9063: RTC driver

Add the RTC driver for DA9063.

Signed-off-by: Opensource [Steve Twiss] <stwiss.opensource@diasemi.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Mark Brown <broonie@linaro.org>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: David Dajun Chen <david.chen@diasemi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc: add support for Microchip MCP795
Josef Gajdusek [Thu, 22 May 2014 00:43:52 +0000 (10:43 +1000)]
drivers/rtc: add support for Microchip MCP795

Add driver for SPI RTC Microchip MCP795.  Only supports saving/loading
time from the chip (i.  e.  no alarms/power events/ID).

Signed-off-by: Josef Gajdusek <atx@atx.name>
Cc: Jingoo Han <jg1.han@samsung.com>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-ds1343.c: fix potential race condition
Raghavendra Ganiga [Thu, 22 May 2014 00:43:52 +0000 (10:43 +1000)]
drivers/rtc/rtc-ds1343.c: fix potential race condition

Avoid the potential race condition by avoiding bailing out of driver in
probe after registering with rtc subsystem

Also the set_alarm , read_alarm and alarm_irq_enable returns error if irq
registration fails in probe.

Also the sysfs will not create entry for alarm if irq registration fails
in probe.

Signed-off-by: Raghavendra Chandra Ganiga <ravi23ganiga@gmail.com>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agortc: fix build error
Raghavendra Ganiga [Thu, 22 May 2014 00:43:51 +0000 (10:43 +1000)]
rtc: fix build error

Fix the following build errors reported by kbuild test robot by selecting
REGMAP_SPI in Kconfig file

drivers/built-in.o: In function `ds1343_probe':
rtc-ds1343.c:(.text+0x1baf8f): undefined reference to `devm_regmap_init_spi'

Signed-off-by: Raghavendra Chandra Ganiga <ravi23ganiga@gmail.com>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc: add support for maxim dallas rtc ds1343 and ds1344
Raghavendra Ganiga [Thu, 22 May 2014 00:43:51 +0000 (10:43 +1000)]
drivers/rtc: add support for maxim dallas rtc ds1343 and ds1344

Signed-off-by: Raghavendra Chandra Ganiga <ravi23ganiga@gmail.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agortc: rtc-palmas: make of_device_id array const
Jingoo Han [Thu, 22 May 2014 00:43:51 +0000 (10:43 +1000)]
rtc: rtc-palmas: make of_device_id array const

Make of_device_id array const, because all OF functions handle it as
const.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agortc: rtc-mv: make of_device_id array const
Jingoo Han [Thu, 22 May 2014 00:43:51 +0000 (10:43 +1000)]
rtc: rtc-mv: make of_device_id array const

Make of_device_id array const, because all OF functions handle it as
const.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agortc: isl12057: make of_device_id array const
Jingoo Han [Thu, 22 May 2014 00:43:50 +0000 (10:43 +1000)]
rtc: isl12057: make of_device_id array const

Make of_device_id array const, because all OF functions handle it as
const.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agortc: rtc-hym8563: make of_device_id array const
Jingoo Han [Thu, 22 May 2014 00:43:50 +0000 (10:43 +1000)]
rtc: rtc-hym8563: make of_device_id array const

Make of_device_id array const, because all OF functions handle it as
const.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Heiko Stbner <heiko@sntech.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agortc: rtc-ds1742: make of_device_id array const
Jingoo Han [Thu, 22 May 2014 00:43:50 +0000 (10:43 +1000)]
rtc: rtc-ds1742: make of_device_id array const

Make of_device_id array const, because all OF functions handle it as
const.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-da9052.c: ALARM causes interrupt storm
Anthony Olech [Thu, 22 May 2014 00:43:50 +0000 (10:43 +1000)]
drivers/rtc/rtc-da9052.c: ALARM causes interrupt storm

Setting the alarm to a time not on a minute boundary results in repeated
interrupts being generated by the DA9052/3 PMIC device until the kernel
RTC core sees that the alarm has rung.  Sometimes the number and frequency
of interrupts can cause the kernel to disable the IRQ line used by the
DA9052/3 PMIC with disasterous consequences.  This patch fixes the
problem.

Even though the DA9052/3 PMIC is capable generating periodic interrupts,
ie TICKS, the method used to distinguish RTC_AF from RTC_PF events was
flawed and can not work in conjunction with the regmap_irq kernel core.
Thus that flawed detection has also been removed by the DA9052/3 PMIC RTC
driver's irq handler, so that it no longer reports the wrong type of event
to the kernel RTC core.

The internal static functions within the DA9052/3 PMIC RTC driver have
been changed to pass the 'da9052_rtc' structure instead of the 'da9052'
because there is no backwards pointer from the 'da9052' structure.

This patch fixes the three issues described above.  The first is serious
because usiing the RTC alarm set to a non minute boundary will eventually
cause all component drivers that depend on the interrupt line to fail.
The solution adopted is to round up to alarm time to the next highest
minute.

The second bug, reporting a RTC_PF event instead of an RTC_AF event turns
out to not matter with the current implementation of the kernel RTC core
as it seems to ignore the event type.  However, should that change in the
future it is better to fix the issue now and not have 'problems waiting to
happen'

The third set of changes are to make the da9052_rtc structure available to
all the local internal functions in the driver.  This was done during
testing so that diagnostic data could be stored there.  Should the
solution to the first issue be found not acceptable, then the alternative
of using the TICKS interrupt at the fixed one second interval in order to
step to the exact second of the requested alarm requires an extra (alarm
time) piece of data to be stored.  In devices that use the alarm function
to wake up from sleep, accuracy to the second will result in the device
being awake for up to nearly a minute longer than expected.

Signed-off-by: Anthony Olech <anthony.olech.opensource@diasemi.com>
Cc: David Dajun Chen <dchen@diasemi.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-88pm860x.c: add missing of_node_put()
Krzysztof Kozlowski [Thu, 22 May 2014 00:43:49 +0000 (10:43 +1000)]
drivers/rtc/rtc-88pm860x.c: add missing of_node_put()

Add missing of_node_put() to decrement the reference count.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Cc: Haojian Zhuang <haojian.zhuang@linaro.org>
Cc: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-88pm860x.c: use of_get_child_by_name()
Krzysztof Kozlowski [Thu, 22 May 2014 00:43:49 +0000 (10:43 +1000)]
drivers/rtc/rtc-88pm860x.c: use of_get_child_by_name()

Use of_get_child_by_name() to obtain reference to charger node instead of
of_find_node_by_name() which can walk outside of the parent node.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Cc: Haojian Zhuang <haojian.zhuang@linaro.org>
Cc: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoarch/mips/dec: switch DECstation systems to rtc-cmos
Maciej W. Rozycki [Thu, 22 May 2014 00:43:49 +0000 (10:43 +1000)]
arch/mips/dec: switch DECstation systems to rtc-cmos

This adds an RTC platform device for DECstation systems so that they can
use the rtc-cmos driver for their RTC device.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agortc-rtc-cmos-drivers-char-rtcc-features-for-decstation-support-fix
Andrew Morton [Thu, 22 May 2014 00:43:49 +0000 (10:43 +1000)]
rtc-rtc-cmos-drivers-char-rtcc-features-for-decstation-support-fix

fix weird code layout

Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-cmos.c: drivers/char/rtc.c features for DECstation support
Maciej W. Rozycki [Thu, 22 May 2014 00:43:48 +0000 (10:43 +1000)]
drivers/rtc/rtc-cmos.c: drivers/char/rtc.c features for DECstation support

This brings in drivers/char/rtc.c functionality required for DECstation
and, should the maintainers decide to switch, Alpha systems to use
rtc-cmos.

Specifically these features are made available:

* RTC iomem rather than x86/PCI port I/O mapping, controlled with the
  RTC_IOMAPPED macro as with the original driver.  The DS1287A chip in all
  DECstation systems is mapped in the host bus address space as a
  contiguous block of 64 32-bit words of which the least significant byte
  accesses the RTC chip for both reads and writes.  All the address and
  data window register accesses are made transparently by the chipset glue
  logic so that the device appears directly mapped on the host bus.

* A way to set the size of the address space explicitly with the
  newly-added `address_space' member of the platform part of the RTC
  device structure.  This avoids the unreliable heuristics that does not
  work in a setup where the RTC is not explicitly accessed with the usual
  address and data window register pair.

* The ability to use the RTC periodic interrupt as a system clock
  device, which is implemented by arch/mips/kernel/cevt-ds1287.c for
  DECstation systems and takes the RTC interrupt away from the RTC driver.
   Eventually hooking back to the clock device's interrupt handler should
  be possible for the purpose of the alarm clock and possibly also
  update-in-progress interrupt, but this is not done by this change.

  o To avoid interfering with the clock interrupt all the places where
    the RTC interrupt mask is fiddled with are only executed if and IRQ
    has been assigned to the RTC driver.

  o To avoid changing the clock setup Register A is not fiddled with
    if CMOS_RTC_FLAGS_NOFREQ is set in the newly-added `flags' member of
    the platform part of the RTC device structure.  Originally, in
    drivers/char/rtc.c, this was keyed with the absence of the RTC
    interrupt, just like the interrupt mask, but there only the periodic
    interrupt frequency is set, whereas rtc-cmos also sets the divider
    bits.  Therefore a new flag is introduced so that systems where the
    RTC interrupt is not usable rather than used as a system clock device
    can fully initialise the RTC.

* A small clean-up is made to the IRQ assignment code that makes the IRQ
  number hardcoded to -1 rather than arbitrary -ENXIO (or whatever error
  happens to be returned by platform_get_irq) where no IRQ has been
  assigned to the RTC driver (NO_IRQ might be another candidate, but it
  looks like this macro has inconsistent or missing definitions and
  limited use and might therefore be unsafe).

Verified to work correctly with a DECstation 5000/240 system.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-efi.c: avoid subtracting day twice when computing year days
Lee, Chun-Yi [Thu, 22 May 2014 00:43:48 +0000 (10:43 +1000)]
drivers/rtc/rtc-efi.c: avoid subtracting day twice when computing year days

Compared source code of rtc-lib.c::rtc_year_days() with
efirtc.c::rtc_year_days(), found the code in rtc-efi decreases value of
day twice when it computing year days.  rtc-lib.c::rtc_year_days() has
already decrease days and return the year days from 0 to 365.

Signed-off-by: Lee, Chun-Yi <jlee@suse.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-m41t80.c: add support for MicroCrystal rv4162
Wolfram Sang [Thu, 22 May 2014 00:43:48 +0000 (10:43 +1000)]
drivers/rtc/rtc-m41t80.c: add support for MicroCrystal rv4162

Signed-off-by: Wolfram Sang <wsa@sang-engineering.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-m41t80.c: propagate error value from smbus functions
Wolfram Sang [Thu, 22 May 2014 00:43:48 +0000 (10:43 +1000)]
drivers/rtc/rtc-m41t80.c: propagate error value from smbus functions

Don't replace the value we got from the I2C layer, just pass it on.

Signed-off-by: Wolfram Sang <wsa@sang-engineering.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-m41t80.c: clean up error paths
Wolfram Sang [Thu, 22 May 2014 00:43:47 +0000 (10:43 +1000)]
drivers/rtc/rtc-m41t80.c: clean up error paths

There is no cleanup needed when something fails in probe, so no need for
goto.  Directly return when something fails.

Signed-off-by: Wolfram Sang <wsa@sang-engineering.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-m41t80.c: remove DRV_VERSION macro
Wolfram Sang [Thu, 22 May 2014 00:43:47 +0000 (10:43 +1000)]
drivers/rtc/rtc-m41t80.c: remove DRV_VERSION macro

History is in git, no need for sperate versioning.  Also remove the
success printout, RTC core does it, too.

Signed-off-by: Wolfram Sang <wsa@sang-engineering.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoarm64: add APM X-Gene SoC RTC DTS entry
Loc Ho [Thu, 22 May 2014 00:43:47 +0000 (10:43 +1000)]
arm64: add APM X-Gene SoC RTC DTS entry

This patch adds APM X-Gene SoC RTC DTS entry

Signed-off-by: Rameshwar Prasad Sahu <rsahu@apm.com>
Signed-off-by: Loc Ho <lho@apm.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc: add APM X-Gene SoC RTC driver
Loc Ho [Thu, 22 May 2014 00:43:47 +0000 (10:43 +1000)]
drivers/rtc: add APM X-Gene SoC RTC driver

Add support for the APM X-Gene SoC RTC driver.

Signed-off-by: Rameshwar Prasad Sahu <rsahu@apm.com>
Signed-off-by: Loc Ho <lho@apm.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoDocumentation/devicetree/bindings: add documentation for the APM X-Gene SoC RTC DTS...
Loc Ho [Thu, 22 May 2014 00:43:46 +0000 (10:43 +1000)]
Documentation/devicetree/bindings: add documentation for the APM X-Gene SoC RTC DTS binding

Signed-off-by: Rameshwar Prasad Sahu <rsahu@apm.com>
Signed-off-by: Loc Ho <lho@apm.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/interface.c: fix for fix of alarm initialization
Ales Novak [Thu, 22 May 2014 00:43:46 +0000 (10:43 +1000)]
drivers/rtc/interface.c: fix for fix of alarm initialization

Seems the previous patch "fix infinite loop in initializing the alarm"
did break the infinite loop in alarm initialization, but not in the right
way. The loop indeed should walk through the not-leap years and stop on
the leap one.

This patch does apply on top of the previous one.

Signed-off-by: Ales Novak <alnovak@suse.cz>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/interface.c: fix infinite loop in initializing the alarm
Ales Novak [Thu, 22 May 2014 00:43:46 +0000 (10:43 +1000)]
drivers/rtc/interface.c: fix infinite loop in initializing the alarm

In __rtc_read_alarm(), if the alarm time retrieved by
rtc_read_alarm_internal() from the device contains invalid values (e.g.
month=2,mday=31) and the year not set (=-1), the initialization will loop
infinitely because the year-fixing loop expects the time being invalid due
to leap year.

Fix reduces the loop to the leap years and adds final validity check.

Signed-off-by: Ales Novak <alnovak@suse.cz>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Reported-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_init
Fabian Frederick [Thu, 22 May 2014 00:43:46 +0000 (10:43 +1000)]
fs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_init

autofs_dev_ioctl_init is only called by __init init_autofs4_fs

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoinit/main.c: remove an ifdef
Andrew Morton [Thu, 22 May 2014 00:43:45 +0000 (10:43 +1000)]
init/main.c: remove an ifdef

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND
Oleg Nesterov [Thu, 22 May 2014 00:43:45 +0000 (10:43 +1000)]
kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND

1. Remove CLONE_KERNEL, it has no users and it is dangerous.

   The (old) comment says "List of flags we want to share for kernel
   threads" but this is not true, we do not want to share ->sighand by
   default. This flag can only be used if the caller is sure that both
   parent/child will never play with signals (say, allow_signal/etc).

2. Change rest_init() to clone kernel_init() without CLONE_SIGHAND.

   In this case CLONE_SIGHAND does not really hurt, and it looks like
   optimization because copy_sighand() can avoid kmem_cache_alloc().

   But in fact this only adds the minor pessimization. kernel_init()
   is going to exec the init process, and de_thread() will need to
   unshare ->sighand and do kmem_cache_alloc(sighand_cachep) anyway,
   but it needs to do more work and take tasklist_lock and siglock.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoinit-mainc-add-initcall_blacklist-kernel-parameter-fix
Andrew Morton [Thu, 22 May 2014 00:43:45 +0000 (10:43 +1000)]
init-mainc-add-initcall_blacklist-kernel-parameter-fix

tweak printk text

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Rob Landley <rob@landley.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoinit/main.c: add initcall_blacklist kernel parameter
Prarit Bhargava [Thu, 22 May 2014 00:43:45 +0000 (10:43 +1000)]
init/main.c: add initcall_blacklist kernel parameter

When a module is built into the kernel the module_init() function becomes
an initcall.  Sometimes debugging through dynamic debug can help, however,
debugging built in kernel modules is typically done by changing the
.config, recompiling, and booting the new kernel in an effort to determine
exactly which module caused a problem.

This patchset can be useful stand-alone or combined with initcall_debug.
There are cases where some initcalls can hang the machine before the
console can be flushed, which can make initcall_debug output inaccurate.
Having the ability to skip initcalls can help further debugging of these
scenarios.

Usage: initcall_blacklist=<list of comma separated initcalls>

ex) added "initcall_blacklist=sgi_uv_sysfs_init" as a kernel parameter and
the log contains:

blacklisting initcall sgi_uv_sysfs_init
...
...
initcall sgi_uv_sysfs_init blacklisted

ex) added "initcall_blacklist=foo_bar,sgi_uv_sysfs_init" as a kernel parameter
and the log contains:

blacklisting initcall foo_bar
blacklisting initcall sgi_uv_sysfs_init
...
...
initcall sgi_uv_sysfs_init blacklisted

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Rob Landley <rob@landley.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoinit/main.c: don't use pr_debug()
Andrew Morton [Thu, 22 May 2014 00:43:44 +0000 (10:43 +1000)]
init/main.c: don't use pr_debug()

Pertially revert ea676e846a8171b8 ("init/main.c: convert to pr_foo()").

Unbeknownst to me, pr_debug() is different from the other pr_foo() levels:
pr_debug() is a no-op when DEBUG is not defined.

Happily, init/main.c does have a #define DEBUG so we didn't break
initcall_debug.  But the functioning of initcall_debug should not be
dependent upon the presence of that #define DEBUG.

Reported-by: Russell King <rmk@arm.linux.org.uk>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agobinfmt_elf.c: use get_random_int() to fix entropy depleting
Jeff Liu [Thu, 22 May 2014 00:43:44 +0000 (10:43 +1000)]
binfmt_elf.c: use get_random_int() to fix entropy depleting

Entropy is quickly depleted under normal operations like ls(1), cat(1),
etc...  between 2.6.30 to current mainline, for instance:

$ cat /proc/sys/kernel/random/entropy_avail
3428
$ cat /proc/sys/kernel/random/entropy_avail
2911
$cat /proc/sys/kernel/random/entropy_avail
2620

We observed this problem has been occurring since 2.6.30 with
fs/binfmt_elf.c: create_elf_tables()->get_random_bytes(), introduced by
f06295b44c296c8f ("ELF: implement AT_RANDOM for glibc PRNG seeding").

/*
 * Generate 16 random bytes for userspace PRNG seeding.
 */
get_random_bytes(k_rand_bytes, sizeof(k_rand_bytes));

The patch introduces a wrapper around get_random_int() which has lower
overhead than calling get_random_bytes() directly.

With this patch applied:
$ cat /proc/sys/kernel/random/entropy_avail
2731
$ cat /proc/sys/kernel/random/entropy_avail
2802
$ cat /proc/sys/kernel/random/entropy_avail
2878

Analyzed by John Sobecki.

This has been applied on a specific Oracle kernel and has been running on
the customer's production environment (the original bug reporter) for
several months; it has worked fine until now.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <aedilger@gmail.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Arnd Bergmann <arnn@arndb.de>
Cc: John Sobecki <john.sobecki@oracle.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/binfmt_flat.c: make old_reloc() static
Axel Lin [Thu, 22 May 2014 00:43:44 +0000 (10:43 +1000)]
fs/binfmt_flat.c: make old_reloc() static

old_reloc() is only used in this file, make it static.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/binfmt_elf.c: fix bool assignements
Fabian Frederick [Thu, 22 May 2014 00:43:44 +0000 (10:43 +1000)]
fs/binfmt_elf.c: fix bool assignements

Fix coccinelle warnings.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofs/efs: convert printk(KERN_DEBUG to pr_debug
Fabian Frederick [Thu, 22 May 2014 00:43:43 +0000 (10:43 +1000)]
fs/efs: convert printk(KERN_DEBUG to pr_debug

All KERN_DEBUG callsites being under #ifdef DEBUG we can safely convert
everything to pr_debug without changing current behaviour.

Remove #ifdef DEBUG around pr_debugs only (suggested by Joe Perches)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>