For headers that get exported to userland and make use of u32 style
type names, it is advised to include linux/types.h.
This fixes a headers_check warning.
Signed-off-by: Alexander Shishkin <virtuoso@slind.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jean Delvare [Wed, 24 Aug 2011 23:46:29 +0000 (09:46 +1000)]
The current implementation of dmi_name_in_vendors() is an invitation to
lazy coding and false positives [1]. Searching for a string in 8 know
what you're looking for, so you should know where to look. strstr isn't
fast, especially when it fails, so we should avoid calling it when it just
can't succeed.
Looking at the current users of the function, it seems clear to me that
they are looking for a system or board vendor name, so let's limit
dmi_name_in_vendors to these two DMI fields. This much better matches the
function name, BTW.
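A minimal user-space sketch of the false-positive problem, assuming a made-up three-field table (the demo_* names are hypothetical and are not the kernel's DMI API). It contrasts a substring search over every field with one limited to the two vendor fields:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the DMI fields; not the kernel's enum. */
enum demo_dmi_field {
	DEMO_SYS_VENDOR,
	DEMO_BOARD_VENDOR,
	DEMO_PRODUCT_NAME,
	DEMO_FIELD_COUNT
};

/* Substring search over every field: the behaviour being criticised. */
static bool demo_name_in_any_field(const char *const *fields, const char *str)
{
	assert(fields != NULL && str != NULL);
	for (size_t i = 0; i < DEMO_FIELD_COUNT; i++)
		if (fields[i] && strstr(fields[i], str))
			return true;
	return false;
}

/* Substring search limited to the system and board vendor fields. */
static bool demo_name_in_vendors(const char *const *fields, const char *str)
{
	static const int vendor_fields[] = { DEMO_SYS_VENDOR, DEMO_BOARD_VENDOR };

	assert(fields != NULL && str != NULL);
	for (size_t i = 0; i < sizeof(vendor_fields) / sizeof(vendor_fields[0]); i++) {
		const char *f = fields[vendor_fields[i]];

		if (f && strstr(f, str))
			return true;
	}
	return false;
}
```

With a product name of "SCHREIBMEISTER 3000" and an unrelated vendor, the first helper matches "IBM" and the second does not. A vendor string that merely contains "IBM" would still match, since the search remains a substring search.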
[1] We currently have code looking for short names in DMI data, such
as "IBM", "ASUS" or "Acer". I'll let you guess what will happen the day
other vendors ship products named, for example, "SCHREIBMEISTER",
"PEGASUS" or "Acerola".
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Yinghai Lu [Wed, 24 Aug 2011 23:46:28 +0000 (09:46 +1000)]
When doing pci remove/rescan on a system that has more than one iommu, we get:
[ 894.089745] Set context mapping for c4:00.0
[ 894.110890] mpt2sas3: Allocated physical memory: size(4293 kB)
[ 894.112556] mpt2sas3: Current Controller Queue Depth(1883), Max Controller Queue Depth(2144)
[ 894.127278] mpt2sas3: Scatter Gather Elements per IO(128)
[ 894.361295] DRHD: handling fault status reg 2
[ 894.364053] DMAR:[DMA Read] Request device [c4:00.0] fault addr fffbe000
[ 894.364056] DMAR:[fault reason 02] Present bit in context entry is cl
It turns out that on remove/rescan the pci dev is freed and a new dev is
allocated, but the drhd units still keep the old one. So
dmar_find_matched_drhd_unit will return the wrong drhd and iommu for a
device that is not on the first iommu.
So we need to update the devices in drhd_units during pci remove/rescan.
We can save domain/bus/device/function aside in the list and, according to
that info, restore the new dev to drhd_units later. Then
dmar_find_matched_drhd_unit and device_to_iommu can return the right drhd
and iommu.
Add remove_dev_from_drhd/restore_dev_to_drhd functions to do the real
work. Call them on device ADD_DEVICE and UNBOUND_DRIVER.
We need to do the same thing for atsr (expose dmar_atsr_units and add
atsru->segment).
After the patch, we find the right iommu for the new dev and no longer get
the DMAR error.
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Vinod Koul <vinod.koul@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Akinobu Mita [Wed, 24 Aug 2011 23:46:27 +0000 (09:46 +1000)]
The dqc_bitmap field of struct ocfs2_local_disk_chunk is 32-bit aligned,
but not 64-bit aligned. The dqc_bitmap is accessed by ocfs2_set_bit(),
ocfs2_clear_bit(), ocfs2_test_bit(), or ocfs2_find_next_zero_bit(). These
are wrapper macros for ext2_*_bit(), which need to take an unsigned long
aligned address (though some architectures are able to handle unaligned
addresses correctly).
So some 64bit architectures may not be able to access the dqc_bitmap
correctly.
This avoids such unaligned access by using different wrapper functions for
ext2_*_bit(). The code is taken from fs/ext4/mballoc.c, which also needs
to handle unaligned bitmap access.
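The idea behind such a wrapper can be sketched in user space as follows. This is an illustration under assumptions, not the code from the patch: the demo_* helpers are hypothetical, and the byte-wise bit operation stands in for the kernel primitive that operates on unsigned long words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Round addr down to an unsigned long boundary and increase *bit by the
 * number of bits that were skipped, so (addr, bit) names the same bit.
 */
static void *demo_correct_addr_and_bit(int *bit, void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t misalign = a & (sizeof(unsigned long) - 1);

	assert(bit != NULL);
	*bit += (int)(misalign * 8);
	return (void *)(a - misalign);
}

/* Little-endian bit numbering: bit n lives in byte n / 8, position n % 8. */
static void demo_set_bit_le(int bit, void *addr)
{
	((unsigned char *)addr)[bit >> 3] |= (unsigned char)(1u << (bit & 7));
}

/* Wrapper: fix up the address first, then call the underlying operation. */
static void demo_set_bit_unaligned(int bit, void *addr)
{
	addr = demo_correct_addr_and_bit(&bit, addr);
	demo_set_bit_le(bit, addr);
}
```

Because the bit index grows by exactly the number of bits between the aligned and the original address, the wrapper touches the same byte the caller asked for while only ever handing an aligned pointer to the underlying operation.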
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Akinobu Mita [Wed, 24 Aug 2011 23:46:26 +0000 (09:46 +1000)]
ext4_{set,clear}_bit() is defined as __test_and_{set,clear}_bit_le() for
ext4. Only two ext4_{set,clear}_bit() calls check the return value. The
rest of calls ignore the return value and they can be replaced with
__{set,clear}_bit_le().
This changes ext4_{set,clear}_bit() from __test_and_{set,clear}_bit_le()
to __{set,clear}_bit_le() and introduces ext4_test_and_{set,clear}_bit()
for the two places where old bit needs to be returned.
This ext4_{set,clear}_bit() change is considered safe: if someone uses
these macros without noticing the change, the new ext4_{set,clear}_bit()
have no return value, so any place where the return value is used causes
a compiler error.
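A small sketch of the split described above, using hypothetical demo_* helpers on a byte array rather than the real ext4 macros. The plain variants return void, so a caller that still expects the old bit fails to compile instead of silently misbehaving:

```c
#include <assert.h>
#include <stddef.h>

/* Plain variants: no return value, like the new ext4_{set,clear}_bit(). */
static void demo_set_bit(int nr, unsigned char *map)
{
	assert(map != NULL && nr >= 0);
	map[nr >> 3] |= (unsigned char)(1u << (nr & 7));
}

static void demo_clear_bit(int nr, unsigned char *map)
{
	assert(map != NULL && nr >= 0);
	map[nr >> 3] &= (unsigned char)~(1u << (nr & 7));
}

/* test_and variants: return the previous value of the bit. */
static int demo_test_and_set_bit(int nr, unsigned char *map)
{
	int old;

	assert(map != NULL && nr >= 0);
	old = (map[nr >> 3] >> (nr & 7)) & 1;
	demo_set_bit(nr, map);
	return old;
}

static int demo_test_and_clear_bit(int nr, unsigned char *map)
{
	int old;

	assert(map != NULL && nr >= 0);
	old = (map[nr >> 3] >> (nr & 7)) & 1;
	demo_clear_bit(nr, map);
	return old;
}
```

Writing `if (demo_set_bit(3, map))` is rejected by the compiler because a void expression cannot be used as a condition, which is the safety property the description relies on.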
This also removes unused ext4_find_first_zero_bit().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Christine Chan [Wed, 24 Aug 2011 23:46:25 +0000 (09:46 +1000)]
del_timer_sync() calls debug_object_assert_init() to assert that a timer
has been initialized before calling lock_timer_base(). lock_timer_base()
would spin forever on a NULL (uninitialized) base. The check is added to
del_timer() to prevent silent failure, even though it would not get stuck
in an infinite loop.
Signed-off-by: Christine Chan <cschan@codeaurora.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Michal Hocko [Wed, 24 Aug 2011 23:46:24 +0000 (09:46 +1000)]
The show_stat handler of the /proc/stat file relies on kstat_cpu(cpu)
statistics when printing information about idle and iowait times. This is
OK if we are not using a tickless kernel (CONFIG_NO_HZ) because the
counters are updated periodically.
With NO_HZ things get trickier because we do not do idle/iowait
accounting while we are tickless, so the values might be outdated. Users
of /proc/stat will see unchanged idle/iowait values, which are then
interpreted as 0% idle/iowait time. From the user space POV this is
unexpected behavior and a change of the interface.
Let's fix this by using get_cpu_{idle,iowait}_time_us which accounts the
total idle/iowait time since boot and it doesn't rely on sampling or any
other periodic activity. Fall back to the previous behavior if NO_HZ is
disabled or not configured.
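The fallback logic can be sketched in user space roughly as below. Everything here is a stand-in: struct demo_cpu, the demo_* functions and the all-ones sentinel are assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel meaning "no tickless accounting available". */
#define DEMO_NO_NOHZ UINT64_MAX

/* Hypothetical per-CPU state; field names are illustrative only. */
struct demo_cpu {
	int nohz_enabled;
	uint64_t nohz_idle_us;     /* precise idle time since boot */
	uint64_t sampled_idle_us;  /* periodically sampled, may be stale */
};

/* Stand-in for the precise getter: reports the sentinel without NO_HZ. */
static uint64_t demo_get_cpu_idle_time_us(const struct demo_cpu *c)
{
	assert(c != NULL);
	return c->nohz_enabled ? c->nohz_idle_us : DEMO_NO_NOHZ;
}

/* What a /proc/stat style reader would call. */
static uint64_t demo_get_idle_time(const struct demo_cpu *c)
{
	uint64_t idle = demo_get_cpu_idle_time_us(c);

	if (idle == DEMO_NO_NOHZ)
		return c->sampled_idle_us;  /* previous behaviour */
	return idle;
}
```

The same shape would apply to the iowait counter; only the idle variant is shown to keep the sketch short.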
Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Jones <davej@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Michal Hocko [Wed, 24 Aug 2011 23:46:23 +0000 (09:46 +1000)]
get_cpu_{idle,iowait}_time_us update idle/iowait counters unconditionally
if the given CPU is in the idle loop. This doesn't work well outside of
CPU governors which are singletons so nobody (except for IRQ) can race
with them.
We will need to use both functions from /proc/stat handler to properly
handle nohz idle/iowait times.
Let's update those counters only if the given last_update_time parameter
is non-NULL which means that the caller is interested in updating.
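A rough sketch of the conditional update, with a hypothetical struct demo_tick and demo_get_idle_us() standing in for the real tick state and getter. How the read-only path computes its value is an assumption of this sketch: it adds the pending idle span to the stored counter without writing anything back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU tick state; not the kernel's structure. */
struct demo_tick {
	int idle_active;         /* CPU is currently in the idle loop */
	uint64_t idle_entry_us;  /* when the current idle period began */
	uint64_t idle_sleep_us;  /* accumulated idle time */
};

static uint64_t demo_get_idle_us(struct demo_tick *ts, uint64_t now_us,
				 uint64_t *last_update_time)
{
	assert(ts != NULL);

	if (last_update_time) {
		/* The caller asked for an update: fold the pending span in. */
		if (ts->idle_active) {
			ts->idle_sleep_us += now_us - ts->idle_entry_us;
			ts->idle_entry_us = now_us;
		}
		*last_update_time = now_us;
		return ts->idle_sleep_us;
	}

	/* Read-only caller: compute the value, leave the state untouched. */
	if (ts->idle_active)
		return ts->idle_sleep_us + (now_us - ts->idle_entry_us);
	return ts->idle_sleep_us;
}
```

Both paths report the same number at a given instant; only the one with a non-NULL pointer changes the stored counters.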
Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Jones <davej@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Michal Hocko [Wed, 24 Aug 2011 23:46:23 +0000 (09:46 +1000)]
update_ts_time_stat currently updates idle time even if we are in iowait
loop at the moment. The only real users of the idle counter (via
get_cpu_idle_time_us) are CPU governors and they expect to get cumulative
time for both idle and iowait times. The value (idle_sleeptime) is also
printed to userspace by print_cpu but it prints both idle and iowait times
so the idle part is misleading.
Let's clean this up and fix update_ts_time_stat to account both counters
properly and update consumers of idle to consider iowait time as well. If
we do this we might use get_cpu_{idle,iowait}_time_us from other contexts
as well and we will get expected values.
Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Jones <davej@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Michal Hocko [Wed, 24 Aug 2011 23:46:22 +0000 (09:46 +1000)]
This patchset aims at addressing a /proc/stat issue which was
introduced with the tickless kernel. In short, show_stat (the proc handler)
relies on kstat_cpu(i).cpustat statistics, which are updated periodically,
so those numbers are more or less accurate.
This is, however, not true with tickless kernel for idle and iowait
counters because those are not updated while the cpu is in the tickless
state. As the time when CPU might be tickless is not bounded, we can see
really outdated values.
The biggest problem is that tools which read /proc/stat interpret
unchanged idle/iowait numbers as 0% idle/iowait which might confuse those
who rely on them.
The first patch in this series is just a minor clean-up.
The second one changes update_ts_time_stat semantics. The current
implementation updates the idle counter regardless of whether we are in
the iowait loop at the moment. I see it as an optimization because cpufreq
drivers, which are the only users of those counters, care about busy vs.
non-busy states so idle+iowait makes perfect sense. This, however, makes
the idle counter useless for others.
I think that using get_cpu_idle_time_us + get_cpu_iowait_time_us should
have the same meaning (at least this is what we do for jiffies variants).
The third patch changes get_cpu_{idle,iowait}_time_us semantics. Both
functions call update_ts_time_stat so they update counters as a side
effect. This should be OK most of the time as governors (the only users)
are singletons. I can still see a potential problem because they might
race with IRQ:
Anyway, we shouldn't update those counters from other contexts so let's
make updating conditional based on the last_update_time parameter.
The final patch is the actual fix. It uses get_cpu_{idle,iowait}_time_us
to get precise counters. We still fall back to kstat_cpu if tickless
kernel is disabled.
The patchset is based on top of and gave it some testing (although I am
still not sure about the cpufreq part and possible side effects). My
testing was quite trivial (8 CPU machine):
mount -t cgroup -o cpuset none /mnt/cgroup
mkdir /mnt/cgroup/a
echo 0-5 > /mnt/cgroup/a/cpuset.cpus
echo 0 > /mnt/cgroup/a/cpuset.mems
for i in `cat /mnt/cgroup/tasks`; do echo $i > /mnt/cgroup/a/tasks; done
[only kernel threads will stay in the root cgroup]
mkdir /mnt/cgroup/b
echo 6,7 > /mnt/cgroup/b/cpuset.cpus
echo 0 > /mnt/cgroup/b/cpuset.mems
[no task in that group so CPU6,7 should be idle most of the time]
Without the last patch I can see unchanged values for CPU[67] taking up to
several seconds.
This patch:
Get rid of semicolon so that those expressions can be used also somewhere
else than just in an assignment.
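To illustrate why the trailing semicolon matters, here is a sketch with a hypothetical macro (DEMO_IDLE_TIME is made up for this example and is not one of the macros the patch touches):

```c
#include <assert.h>

/* Hypothetical macro, defined without a trailing semicolon. */
#define DEMO_IDLE_TIME(cpu) ((cpu) * 100)

static int demo_max(int a, int b)
{
	return a > b ? a : b;
}

static int demo_sum_idle(int ncpus)
{
	int total = 0;

	assert(ncpus >= 0);
	for (int cpu = 0; cpu < ncpus; cpu++) {
		/*
		 * The macro is used as a function argument inside a larger
		 * expression. Had it been defined as "((cpu) * 100);", this
		 * line would expand to "demo_max(((cpu) * 100);, 50) + 1"
		 * and fail to compile.
		 */
		total += demo_max(DEMO_IDLE_TIME(cpu), 50) + 1;
	}
	return total;
}
```

A macro that carries its own semicolon only works as the right-hand side of a complete statement such as `x = MACRO(cpu)`, which is the restriction the patch removes.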
Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Dave Jones <davej@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jonathan Cameron [Wed, 24 Aug 2011 23:46:22 +0000 (09:46 +1000)]
A straightforward looking use of idr for a device id.
Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: Guenter Roeck <guenter.roeck@ericsson.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Acked-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrea Righi [Wed, 24 Aug 2011 23:46:21 +0000 (09:46 +1000)]
fb_set_suspend() must be called with the console semaphore held, which
means the code path coming in here will first take the console_lock() and
then call lock_fb_info().
However several framebuffer ioctl commands acquire these locks in reverse
order (lock_fb_info() and then console_lock()). This gives rise to
potential AB-BA deadlock.
Fix this by changing the order of acquisition in the ioctl commands that
make use of console_lock().
Signed-off-by: Andrea Righi <arighi@develer.com> Reported-by: Peter Nordström (Palm GBU) <peter.nordstrom@palm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jesper Juhl [Wed, 24 Aug 2011 23:46:19 +0000 (09:46 +1000)]
A call to va_copy() should always be followed by a call to va_end() in the
same function. In kernel/audit.c::audit_log_vformat() this is not always
done. This patch makes sure va_end() is always called.
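The pattern can be sketched in plain C as follows. This is not the audit code; demo_vformat() and demo_format() are hypothetical helpers that show one way to route every exit path through a single va_end():

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated string; returns NULL on failure. */
static char *demo_vformat(const char *fmt, va_list args)
{
	va_list args2;
	char *buf = NULL;
	int len;

	assert(fmt != NULL);
	va_copy(args2, args);

	/* First pass only measures the output; it consumes 'args'. */
	len = vsnprintf(NULL, 0, fmt, args);
	if (len < 0)
		goto out;

	buf = malloc((size_t)len + 1);
	if (!buf)
		goto out;

	/* Second pass needs the untouched copy. */
	vsnprintf(buf, (size_t)len + 1, fmt, args2);
out:
	va_end(args2);  /* reached on success and on every error path */
	return buf;
}

static char *demo_format(const char *fmt, ...)
{
	va_list args;
	char *s;

	va_start(args, fmt);
	s = demo_vformat(fmt, args);
	va_end(args);
	return s;
}
```

Funnelling the early exits through one label keeps va_copy() and va_end() paired even when an allocation fails. The caller owns the returned string and releases it with free().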
Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Mathias Krause [Wed, 24 Aug 2011 23:46:19 +0000 (09:46 +1000)]
The address limit is already set in flush_old_exec() so this
set_fs(USER_DS) is redundant.
Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The current interrupt traces from irq_handler_entry and irq_handler_exit
show when an interrupt is handled. They provide good data about when
the system has switched to kernel space and how it affects the currently
running processes.
There are some IRQ vectors which trigger the system into kernel space,
which are not handled in generic IRQ handlers. Tracing such events gives
us the information about IRQ interaction with other system events.
The trace also tells where the system is spending its time. We want to
know which cores are handling interrupts and how they are affecting other
processes in the system. Also, the trace provides information about when
the cores are idle and which interrupts are changing that state.
The following patch adds the event definition and trace instrumentation
for interrupt vectors. For x86, a lookup table is provided to print out
readable IRQ vector names. The template can be used to provide interrupt
vector lookup tables on other architectures.
Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Wildgoose [Wed, 24 Aug 2011 23:46:16 +0000 (09:46 +1000)]
This new driver replaces the old PCEngines Alix 2/3 LED driver with a new
driver that controls the LEDs through the leds-gpio driver. The old
driver accessed GPIOs directly, which created a conflict and prevented
also loading the cs5535-gpio driver to read other GPIOs on the Alix board.
With this new driver, we hook into leds-gpio, which in turn uses GPIO to
control the LEDs, so it is possible both to control the LEDs and to
access the onboard GPIOs.
Driver is moved to platform/geode and any other geode initialisation
modules should move here also.
This driver is inspired by leds-net5501.c by Alessandro Zummo.
Ideally, leds-net5501.c should also be moved to platform/geode.
Additionally the driver relies on parts of the patch: 7f131cf3ed ("leds:
leds-alix2c - take port address from MSR") by Daniel Mack to perform
detection of the Alix board.
Signed-off-by: Ed Wildgoose <kernel@wildgooses.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Daniel Mack <daniel@caiaq.de> Reviewed-by: Grant Likely <grant.likely@secretlab.ca> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ludwig Nussel [Wed, 24 Aug 2011 23:46:16 +0000 (09:46 +1000)]
On x86_32 casting the unsigned int result of get_random_int() to long may
result in a negative value. On x86_32 the range of mmap_rnd() therefore
was -255 to 255. The 32bit mode on x86_64 used 0 to 255 as intended.
The bug was introduced by 675a081 ("x86: unify mmap_{32|64}.c") in January
2008.
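The arithmetic can be reproduced in user space with int32_t standing in for a 32-bit long. The demo_* functions are illustrative and are not the kernel's mmap_rnd(). The conversion of an out-of-range unsigned value to int32_t is implementation-defined in C; the sketch assumes the usual two's-complement wraparound that common compilers provide.

```c
#include <assert.h>
#include <stdint.h>

/* Cast to the signed type first, then reduce: '%' keeps the sign. */
static int32_t demo_rnd_buggy(uint32_t random)
{
	return (int32_t)random % 256;
}

/* Reduce while still unsigned, then convert the small result. */
static int32_t demo_rnd_fixed(uint32_t random)
{
	int32_t r = (int32_t)(random % 256u);

	assert(r >= 0 && r <= 255);
	return r;
}
```

For the input 0xFFFFFF01 the first function yields -255 and the second yields 1, which matches the two ranges quoted above.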
Signed-off-by: Ludwig Nussel <ludwig.nussel@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Whitcroft [Wed, 24 Aug 2011 23:46:15 +0000 (09:46 +1000)]
Since the commit below which added O_PATH support to the *at() calls, the
error return for readlink/readlinkat for the empty pathname has switched
from ENOENT to EINVAL:
readlinkat(), fchownat() and fstatat() with empty relative pathnames
This is both unexpected for userspace and makes readlink/readlinkat
inconsistent with all other interfaces, and inconsistent with our stated
return for these pathnames.
As the readlinkat call does not have a flags parameter we cannot use the
AT_EMPTY_PATH approach used in the other calls. Therefore expose whether
the original path is in fact empty via a new user_path_at_empty() path
lookup function. Use this to determine whether to default to EINVAL or
ENOENT for failures.
BugLink: http://bugs.launchpad.net/bugs/817187 Signed-off-by: Andy Whitcroft <apw@canonical.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
hank [Wed, 24 Aug 2011 23:46:14 +0000 (09:46 +1000)]
The parameter's origin type is long. On an i386 architecture, it can
easily be larger than 0x80000000, causing this function to convert it to a
sign-extended u64 type. Change the type to unsigned long so we get the
correct result.
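The effect is easy to reproduce with fixed-width types, using int32_t and uint32_t as stand-ins for a 32-bit long and unsigned long. The demo_* function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Signed 32-bit source: widening copies the sign bit into the top half. */
static uint64_t demo_widen_signed(int32_t value)
{
	return (uint64_t)value;
}

/* Unsigned 32-bit source: widening zero-extends. */
static uint64_t demo_widen_unsigned(uint32_t value)
{
	uint64_t wide = value;

	assert((wide >> 32) == 0);  /* the upper half stays clear */
	return wide;
}
```

A value with the top bit set, such as 0x80000000, becomes 0xFFFFFFFF80000000 through the signed path and stays 0x0000000080000000 through the unsigned one.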
[akpm@linux-foundation.org: build fix] Signed-off-by: hank <pyu@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Because of x86-implement-strict-user-copy-checks-for-x86_64.patch, the
following warning is shown when compiling mm/mempolicy.c:
In file included from arch/x86/include/asm/uaccess.h:572,
from include/linux/uaccess.h:5,
from include/linux/highmem.h:7,
from include/linux/pagemap.h:10,
from include/linux/mempolicy.h:70,
from mm/mempolicy.c:68:
In function `copy_from_user',
inlined from `compat_sys_get_mempolicy' at mm/mempolicy.c:1415:
arch/x86/include/asm/uaccess_64.h:64: warning: call to `copy_from_user_overflow' declared with attribute warning: copy_from_user() buffer size is not provably correct
LD mm/built-in.o
Fix this by passing correct buffer size value.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>