Michal Hocko [Wed, 24 Aug 2011 23:46:22 +0000 (09:46 +1000)]
This patchset aims to address a /proc/stat issue that was
introduced with the tickless kernel. In short, show_stat (the proc handler)
relies on kstat_cpu(i).cpustat statistics which are updated periodically
so those numbers are more or less accurate.
This is, however, not true with a tickless kernel for the idle and iowait
counters because those are not updated while the CPU is in the tickless
state. As the time a CPU might stay tickless is not bounded, we can see
really outdated values.
The biggest problem is that tools which read /proc/stat interpret
unchanged idle/iowait numbers as 0% idle/iowait which might confuse those
who rely on them.
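To illustrate why a stale counter reads as 0% idle, here is a minimal
sketch of how a monitoring tool might derive the idle share from two
/proc/stat samples. The function name and the choice of dividing by the
elapsed interval are assumptions for illustration, not taken from any
particular tool.

```c
#include <stdint.h>

/*
 * Idle share (0..100) over one sampling interval, all values in clock
 * ticks. Hypothetical helper: prev_idle and cur_idle are two readings
 * of the idle column for one CPU, interval_ticks is the elapsed time.
 */
unsigned idle_percent(uint64_t prev_idle, uint64_t cur_idle,
                      uint64_t interval_ticks)
{
    if (interval_ticks == 0)
        return 0;
    return (unsigned)(100 * (cur_idle - prev_idle) / interval_ticks);
}
```

If the kernel never advanced the idle counter during the interval, the
delta is zero and a fully idle CPU is reported as 0% idle.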
The first patch in this series is just a minor clean-up.
The second one changes the semantics of update_ts_time_stat. The current
implementation updates the idle counter regardless of whether we are in an
iowait loop at the moment. I see it as an optimization because cpufreq
drivers, which are the only users of those counters, care about busy vs.
non-busy states, so idle+iowait makes perfect sense. This, however, makes
the idle counter useless for others.
I think that using get_cpu_idle_time_us + get_cpu_iowait_time_us should
have the same meaning (at least this is what we do for jiffies variants).
The third patch changes the semantics of get_cpu_{idle,iowait}_time_us. Both
functions call update_ts_time_stat so they update counters as a side
effect. This should be OK most of the time as governors (the only users)
are singletons. I can still see a potential problem because they might
race with IRQ:
Anyway, we shouldn't update those counters from other contexts so let's
make updating conditional based on the last_update_time parameter.
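The conditional update can be pictured with a simplified userspace
model. The struct, its fields and the function names below are
hypothetical and only mirror the idea: the stored counter is written
when the caller passes last_update_time, and is merely computed on the
fly when the caller passes NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-CPU bookkeeping (not the kernel's struct). */
struct idle_stats {
    int      idle_active;    /* non-zero while the CPU is idle */
    uint64_t idle_entry_us;  /* when the current idle period began */
    uint64_t idle_sleep_us;  /* idle time accumulated so far */
};

/* Fold the running idle period into the stored counter. */
void fold_idle(struct idle_stats *st, uint64_t now_us)
{
    if (st->idle_active) {
        st->idle_sleep_us += now_us - st->idle_entry_us;
        st->idle_entry_us = now_us;
    }
}

/*
 * Return the idle time in microseconds. State is modified only when
 * last_update_time is non-NULL; otherwise the running idle period is
 * added to a local result and the stored counter stays untouched.
 */
uint64_t idle_time_us(struct idle_stats *st, uint64_t now_us,
                      uint64_t *last_update_time)
{
    if (last_update_time) {
        fold_idle(st, now_us);
        *last_update_time = now_us;
        return st->idle_sleep_us;
    }
    if (st->idle_active)
        return st->idle_sleep_us + (now_us - st->idle_entry_us);
    return st->idle_sleep_us;
}
```

A reader such as a proc handler would pass NULL and get a precise value
without touching the counters, while a governor would keep passing a
pointer.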
The final patch is the actual fix. It uses get_cpu_{idle,iowait}_time_us
to get precise counters. We still fall back to kstat_cpu if tickless
kernel is disabled.
The patchset is based on top of and gave it some testing (although I am
still not sure about the cpufreq part and possible side effects). My
testing was quite trivial (8 CPU machine):
mount -t cgroup -o cpuset none /mnt/cgroup
mkdir /mnt/cgroup/a
echo 0-5 > /mnt/cgroup/a/cpuset.cpus
echo 0 > /mnt/cgroup/a/cpuset.mems
for i in `cat /mnt/cgroup/tasks`; do echo $i > /mnt/cgroup/a/tasks; done
[only kernel threads will stay in the root cgroup]
mkdir /mnt/cgroup/b
echo 6,7 > /mnt/cgroup/b/cpuset.cpus
echo 0 > /mnt/cgroup/b/cpuset.mems
[no task in that group so CPU6,7 should be idle most of the time]
Without the last patch, the values for CPU[67] can stay unchanged for up
to several seconds.
This patch:
Get rid of the semicolon so that those expressions can also be used
somewhere other than in an assignment.
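A small sketch of the problem, using made-up macro names (the macros
actually touched by the patch are not named here): a macro that carries
its own semicolon only works as a complete assignment statement.

```c
/* Hypothetical macros, named only for this sketch. */
#define TICKS_WITH_SEMI(x)  ((x) * 2);   /* old style: carries its own ';' */
#define TICKS(x)            ((x) * 2)    /* cleaned up: a plain expression */

int larger(int a, int b)
{
    return a > b ? a : b;
}

int demo(int a, int b)
{
    int t;

    t = TICKS_WITH_SEMI(a)   /* compiles only because the macro ends the statement */

    /*
     * larger(TICKS_WITH_SEMI(a), b) would expand to
     * larger(((a) * 2);, b), which is a syntax error.
     */
    return larger(TICKS(a), b) + t;
}
```

Without the semicolon the macro can be used as a function argument or
inside a larger expression.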
Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Dave Jones <davej@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jonathan Cameron [Wed, 24 Aug 2011 23:46:22 +0000 (09:46 +1000)]
A straightforward-looking use of idr for a device id.
Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: Guenter Roeck <guenter.roeck@ericsson.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Acked-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrea Righi [Wed, 24 Aug 2011 23:46:21 +0000 (09:46 +1000)]
fb_set_suspend() must be called with the console semaphore held, which
means the code path coming in here will first take the console_lock() and
then call lock_fb_info().
However several framebuffer ioctl commands acquire these locks in reverse
order (lock_fb_info() and then console_lock()). This gives rise to a
potential AB-BA deadlock.
Fix this by changing the order of acquisition in the ioctl commands that
make use of console_lock().
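The ordering rule can be sketched in userspace with two pthread mutexes
standing in for console_lock() and lock_fb_info(). The mutexes, the
blank_state variable and both function names are hypothetical; the only
point is that every path takes the two locks in the same order.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the console semaphore and the fb_info lock. */
static pthread_mutex_t console_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t fb_info_mutex = PTHREAD_MUTEX_INITIALIZER;
static int blank_state;

/* Suspend path: console lock first, then the fb_info lock. */
void suspend_path(int state)
{
    pthread_mutex_lock(&console_mutex);
    pthread_mutex_lock(&fb_info_mutex);
    blank_state = state;
    pthread_mutex_unlock(&fb_info_mutex);
    pthread_mutex_unlock(&console_mutex);
}

/*
 * ioctl path: takes the locks in the same order as suspend_path().
 * Taking fb_info_mutex first here would recreate the AB-BA pattern.
 */
int ioctl_path(void)
{
    int state;

    pthread_mutex_lock(&console_mutex);
    pthread_mutex_lock(&fb_info_mutex);
    state = blank_state;
    pthread_mutex_unlock(&fb_info_mutex);
    pthread_mutex_unlock(&console_mutex);
    return state;
}
```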
Signed-off-by: Andrea Righi <arighi@develer.com> Reported-by: Peter Nordström (Palm GBU) <peter.nordstrom@palm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jesper Juhl [Wed, 24 Aug 2011 23:46:19 +0000 (09:46 +1000)]
A call to va_copy() should always be followed by a call to va_end() in the
same function. In kernel/audit.c::audit_log_vformat() this is not always
done. This patch makes sure va_end() is always called.
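As an illustration of the rule (not the audit code itself), here is a
userspace sketch of a function that walks its argument list twice and
reaches va_end() for the copy on every exit path. The function name is
hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a freshly allocated string. The arguments are consumed
 * twice (once to measure, once to write), so a copy is required.
 * Returns NULL on failure; the caller frees the result.
 */
char *format_alloc(const char *fmt, ...)
{
    va_list args, args2;
    char *buf = NULL;
    int len;

    va_start(args, fmt);
    va_copy(args2, args);

    len = vsnprintf(NULL, 0, fmt, args);
    if (len < 0)
        goto out;               /* early exit still reaches va_end() */

    buf = malloc((size_t)len + 1);
    if (!buf)
        goto out;

    vsnprintf(buf, (size_t)len + 1, fmt, args2);
out:
    va_end(args2);
    va_end(args);
    return buf;
}
```

Routing every exit through a single label keeps each va_copy() paired
with exactly one va_end().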
Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Mathias Krause [Wed, 24 Aug 2011 23:46:19 +0000 (09:46 +1000)]
The address limit is already set in flush_old_exec() so this
set_fs(USER_DS) is redundant.
Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The current interrupt traces from irq_handler_entry and irq_handler_exit
show when an interrupt is handled. They provide good data about when
the system has switched to kernel space and how it affects the currently
running processes.
There are some IRQ vectors which trigger the system into kernel space,
which are not handled in generic IRQ handlers. Tracing such events gives
us the information about IRQ interaction with other system events.
The trace also tells where the system is spending its time. We want to
know which cores are handling interrupts and how they are affecting other
processes in the system. Also, the trace provides information about when
the cores are idle and which interrupts are changing that state.
The following patch adds the event definition and trace instrumentation
for interrupt vectors. For x86, a lookup table is provided to print out
readable IRQ vector names. The template can be used to provide interrupt
vector lookup tables on other architectures.
Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Wildgoose [Wed, 24 Aug 2011 23:46:16 +0000 (09:46 +1000)]
This new driver replaces the old PCEngines Alix 2/3 LED driver with a new
driver that controls the LEDs through the leds-gpio driver. The old
driver accessed GPIOs directly, which created a conflict and prevented
the cs5535-gpio driver from also being loaded to read other GPIOs on the
Alix board. With this new driver, we hook into leds-gpio, which in turn
uses GPIO to control the LEDs, so it is possible to both control the LEDs
and access onboard GPIOs.
The driver is moved to platform/geode, and any other geode initialisation
modules should move there as well.
This driver is inspired by leds-net5501.c by Alessandro Zummo.
Ideally, leds-net5501.c should also be moved to platform/geode.
Additionally, the driver relies on parts of the patch 7f131cf3ed ("leds:
leds-alix2c - take port address from MSR") by Daniel Mack to perform
detection of the Alix board.
Signed-off-by: Ed Wildgoose <kernel@wildgooses.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Daniel Mack <daniel@caiaq.de> Reviewed-by: Grant Likely <grant.likely@secretlab.ca> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ludwig Nussel [Wed, 24 Aug 2011 23:46:16 +0000 (09:46 +1000)]
On x86_32 casting the unsigned int result of get_random_int() to long may
result in a negative value. On x86_32 the range of mmap_rnd() therefore
was -255 to 255. The 32bit mode on x86_64 used 0 to 255 as intended.
The bug was introduced by 675a081 ("x86: unify mmap_{32|64}.c") in January
2008.
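The arithmetic can be reproduced in userspace. In this sketch int32_t
stands in for the 32-bit long of x86_32, and the function names are
hypothetical. Keeping the value unsigned is shown as one way to stay in
range; it is not claimed to be the exact change made by the patch.

```c
#include <stdint.h>

/*
 * Signed variant: values above INT32_MAX become negative after the
 * cast (implementation-defined, but this is what common compilers do),
 * and C's % keeps the sign of the dividend.
 */
int32_t rnd_signed(uint32_t r)
{
    return (int32_t)r % (1 << 8);
}

/* Unsigned variant: the result always lies in 0..255. */
uint32_t rnd_unsigned(uint32_t r)
{
    return r % (1u << 8);
}
```

For r = 0x80000001 the signed variant yields -255, while the unsigned
one yields 1.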
Signed-off-by: Ludwig Nussel <ludwig.nussel@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Shérab [Wed, 24 Aug 2011 23:46:15 +0000 (09:46 +1000)]
This makes the iris driver use the platform API, so it is properly exposed
in /sys.
[akpm@linux-foundation.org: remove commented-out code, add missing space to printk, clean up code layout] Signed-off-by: Shérab <Sebastien.Hinderer@ens-lyon.org> Cc: Len Brown <lenb@kernel.org> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Whitcroft [Wed, 24 Aug 2011 23:46:15 +0000 (09:46 +1000)]
Since the commit below which added O_PATH support to the *at() calls, the
error return for readlink/readlinkat for the empty pathname has switched
from ENOENT to EINVAL:
readlinkat(), fchownat() and fstatat() with empty relative pathnames
This is both unexpected for userspace and makes readlink/readlinkat
inconsistent with all other interfaces, and inconsistent with our stated
return for these pathnames.
As the readlinkat call does not have a flags parameter we cannot use the
AT_EMPTY_PATH approach used in the other calls. Therefore expose whether
the original path is in fact empty via a new user_path_at_empty() path
lookup function. Use this to determine whether to default to EINVAL or
ENOENT for failures.
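The userspace-visible behaviour can be checked with a short program
fragment. The helper name is hypothetical; it simply reports the errno
that readlink() sets for an empty pathname, which is ENOENT once the
old behaviour is restored.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Return the errno produced by readlink("") or 0 if the call
 * unexpectedly succeeds.
 */
int readlink_empty_errno(void)
{
    char buf[64];

    errno = 0;
    if (readlink("", buf, sizeof(buf)) >= 0)
        return 0;
    return errno;
}
```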
BugLink: http://bugs.launchpad.net/bugs/817187 Signed-off-by: Andy Whitcroft <apw@canonical.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
hank [Wed, 24 Aug 2011 23:46:14 +0000 (09:46 +1000)]
The parameter's original type is long. On an i386 architecture, it can
easily be larger than 0x80000000, causing this function to convert it to a
sign-extended u64 type. Change the type to unsigned long so we get the
correct result.
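The sign extension is easy to reproduce in userspace. Here int32_t and
uint32_t stand in for long and unsigned long on i386, and the function
names are hypothetical.

```c
#include <stdint.h>

/* Widening a negative 32-bit value fills the upper 32 bits with ones. */
uint64_t widen_signed(int32_t v)
{
    return (uint64_t)v;
}

/* Widening an unsigned 32-bit value leaves the upper 32 bits zero. */
uint64_t widen_unsigned(uint32_t v)
{
    return (uint64_t)v;
}
```

With the bit pattern 0x80000000, the signed path produces
0xFFFFFFFF80000000 and the unsigned path produces 0x0000000080000000.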
[akpm@linux-foundation.org: build fix] Signed-off-by: hank <pyu@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Because of x86-implement-strict-user-copy-checks-for-x86_64.patch, the
following warning is shown when compiling mm/mempolicy.c:
In file included from arch/x86/include/asm/uaccess.h:572,
from include/linux/uaccess.h:5,
from include/linux/highmem.h:7,
from include/linux/pagemap.h:10,
from include/linux/mempolicy.h:70,
from mm/mempolicy.c:68:
In function `copy_from_user',
inlined from `compat_sys_get_mempolicy' at mm/mempolicy.c:1415:
arch/x86/include/asm/uaccess_64.h:64: warning: call to `copy_from_user_overflow' declared with attribute warning: copy_from_user() buffer size is not provably correct
LD mm/built-in.o
Fix this by passing correct buffer size value.
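The description does not spell out the exact change, so the following
is only a userspace analogue of the idea: bound the copy length by the
size of the destination so that the size is provably correct. memcpy()
stands in for copy_from_user(), and the struct, constant and function
names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MASK_WORDS 4   /* hypothetical bitmap size for this sketch */

struct node_mask {
    unsigned long bits[MASK_WORDS];
};

/*
 * Copy at most sizeof(dst->bits) bytes from src, however large the
 * caller-supplied length is. Returns the number of bytes copied.
 */
size_t copy_mask(struct node_mask *dst, const void *src, size_t requested)
{
    size_t n = requested < sizeof(dst->bits) ? requested : sizeof(dst->bits);

    memcpy(dst->bits, src, n);
    return n;
}
```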
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>