Axel Lin [Fri, 13 May 2011 06:54:06 +0000 (14:54 +0800)]
regulator: Fix memory leak in max8998_pmic_probe failure path
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Kyungmin Park <kyungmin.park@samsung.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Axel Lin [Thu, 12 May 2011 05:47:50 +0000 (13:47 +0800)]
regulator: Fix desc_id for tps65023/6507x/65910
The desc_id variable should not be a static variable.
The rest of the code assumes the desc_id must less than TPSxxxxx_NUM_REGULATOR.
If we set desc_id to be a static variable, checking the return value of
rdev_get_id() may return error.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Anuj Aggarwal <anuj.aggarwal@ti.com> Cc: Graeme Gregory <gg@slimlogic.co.uk> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
linux-next: build failure after merge of the voltage tree
On May 10, 2011, at 9:27 PM, Stephen Rothwell wrote:
> Hi Jorge,
>
> On Tue, 10 May 2011 12:30:36 -0500 Jorge Eduardo Candelaria <jedu@slimlogic.co.uk> wrote:
>>
>> On May 10, 2011, at 3:38 AM, Liam Girdwood wrote:
>>
>>> On Tue, 2011-05-10 at 12:44 +1000, Stephen Rothwell wrote:
>>>> Hi Liam,
>>>>
>>>> After merging the voltage tree, today's linux-next build (x86_64
>>>> allmodconfig) failed like this:
>>>>
>>>> ERROR: "tps65910_gpio_init" [drivers/mfd/tps65910.ko] undefined!
>>>> ERROR: "tps65910_irq_init" [drivers/mfd/tps65910.ko] undefined!
>>>> ERROR: "irq_modify_status" [drivers/mfd/tps65910-irq.ko] undefined!
>>>> ERROR: "irq_set_chip_and_handler_name" [drivers/mfd/tps65910-irq.ko] undefined!
>>>> ERROR: "handle_edge_irq" [drivers/mfd/tps65910-irq.ko] undefined!
>>>>
>>>> I have used the voltage tree from next-20110509 for today.
>>>
>>> Jorge, could you send a fix for this today.
>>
>> The following patch should solve this:
>>
>> From: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
>> MFD: Fix TPS65910 build
>>
>> Support for tps65910 as a module is not available. The driver can
>> only be compiled as built-in. OTOH, the regulator driver can still
>> be built as module without breaking the compilation.
>>
>> Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
>
> Today (even with the above patch included) I got these errors from the
> x86_64 allmodconfig build:
>
> tps65910.c:(.text+0xf4140): undefined reference to `i2c_master_send'
> drivers/built-in.o: In function `tps65910_i2c_read':
> tps65910.c:(.text+0xf41d2): undefined reference to `i2c_transfer'
> drivers/built-in.o: In function `tps65910_i2c_init':
> tps65910.c:(.init.text+0xcb83): undefined reference to `i2c_register_driver'
> drivers/built-in.o: In function `tps65910_i2c_exit':
> tps65910.c:(.exit.text+0x6e0): undefined reference to `i2c_del_driver'
>
> I have used the voltage tree from next-20110509 again today.
Following patch should fix the dependency problems. Please review:
Axel Lin [Tue, 10 May 2011 11:10:36 +0000 (19:10 +0800)]
Revert "regulator: Move VCOINCELL to be the last element of mc13892_regulators array"
I check this patch again and found this actually is not a bug
because MC13xxx_DEFINE explictly defines the order of each entry in the array.
Thus revert the patch.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Mark Brown [Tue, 10 May 2011 13:52:57 +0000 (15:52 +0200)]
regulator: Remove some unused variables from wm831x DCDCs
These became unused with the IRQ removal patch, I'm fairly sure that a
patch was sent earlier by someone else but it doesn't seem to have been
applied and I don't have a copy sitting around any more.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
linux-next: build failure after merge of the voltage tree
On May 10, 2011, at 3:38 AM, Liam Girdwood wrote:
> On Tue, 2011-05-10 at 12:44 +1000, Stephen Rothwell wrote:
>> Hi Liam,
>>
>> After merging the voltage tree, today's linux-next build (x86_64
>> allmodconfig) failed like this:
>>
>> ERROR: "tps65910_gpio_init" [drivers/mfd/tps65910.ko] undefined!
>> ERROR: "tps65910_irq_init" [drivers/mfd/tps65910.ko] undefined!
>> ERROR: "irq_modify_status" [drivers/mfd/tps65910-irq.ko] undefined!
>> ERROR: "irq_set_chip_and_handler_name" [drivers/mfd/tps65910-irq.ko] undefined!
>> ERROR: "handle_edge_irq" [drivers/mfd/tps65910-irq.ko] undefined!
>>
>> I have used the voltage tree from next-20110509 for today.
>
> Jorge, could you send a fix for this today.
>
> Thanks
>
> Liam
>
Support for tps65910 as a module is not available. The driver can
only be compiled as built-in. OTOH, the regulator driver can still
be built as module without breaking the compilation.
Graeme Gregory [Mon, 2 May 2011 21:20:08 +0000 (16:20 -0500)]
TPS65910: Add tps65910 regulator driver
The regulator module consists of 3 DCDCs and 8 LDOs. The output
voltages are configurable and are meant to supply power to the
main processor and other components
Signed-off-by: Graeme Gregory <gg@slimlogic.co.uk> Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Graeme Gregory [Mon, 2 May 2011 21:20:04 +0000 (16:20 -0500)]
TPS65910: IRQ: Add interrupt controller
This module controls the interrupt handling for the tps chip. The
interrupt sources are the following:
- GPIO falling/rising edge detection
- Battery voltage below/above threshold
- PWRON signal
- PWRHOLD signal
- Temperature detection
- RTC alarm and periodic event
Signed-off-by: Graeme Gregory <gg@slimlogic.co.uk> Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk> Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Axel Lin [Mon, 9 May 2011 09:49:40 +0000 (17:49 +0800)]
regulator: Use mc13xxx_reg_write instead of mc13xxx_reg_rmw in mc13892_sw_regulator_set_voltage
Currently, we call mc13xxx_reg_read and mc13xxx_reg_rmw for the same register.
This can be converted to simply a mc13xxx_reg_read and a mc13xxx_reg_write,
thus save a redundant register read.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Mark Brown [Sun, 8 May 2011 21:13:37 +0000 (22:13 +0100)]
regulator: Support voltage offsets to compensate for drops in system
Some systems, particularly physically large systems used for early
prototyping, may experience substantial voltage drops between the regulator
and the consumers as a result of long traces in the system. With these
systems voltages may need to be set higher than requested in order to
ensure reliable system operation.
Allow systems to work around such hardware issues by allowing constraints
to supply an offset to be applied to any requested and reported voltages.
This is not ideal, especially since the voltage drop may be load dependant,
but is sufficient for most affected systems, it is not expected to be used
in production hardware. The offset is applied after all constraint
processing so constraints should be specified in terms of consumer values
not physically configured values.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Mark Brown [Sun, 8 May 2011 21:30:18 +0000 (22:30 +0100)]
regulator: Remove supply_regulator_dev from machine configuration
supply_regulator_dev (using a struct pointer) has been deprecated in favour
of supply_regulator (using a regulator name) for quite a few releases
now with a warning generated if it is used and there are no current in tree
users so just remove the code.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Axel Lin [Thu, 5 May 2011 15:32:58 +0000 (23:32 +0800)]
regulator: Move VCOINCELL to be the last element of mc13892_regulators array
In include/linux/mfd/mc13892.h, we define MC13892_VCOINCELL as 23.
Thus VCOINCELL should be defined as 23th element in mc13892_regulators array, not the first one.
This actually fixes an off-by-one bug while accessing mc13892_regulators array.
For example,
In mc13892_regulator_probe, we use MC13892_VCAM as array index of mc13892_regulators array.
mc13892_regulators[MC13892_VCAM].desc.ops->set_mode
= mc13892_vcam_set_mode;
Currently, it access mc13892_regulators[12] ,which is VAUDIO not VCAM.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Axel Lin [Fri, 1 Apr 2011 10:25:25 +0000 (18:25 +0800)]
regulator: Fix the argument of calling regulator_mode_constrain
The second parameter of regulator_mode_constrain takes a pointer.
This patch fixes below warning:
drivers/regulator/core.c: In function 'regulator_set_mode':
drivers/regulator/core.c:2014: warning: passing argument 2 of 'regulator_mode_constrain' makes pointer from integer without a cast
drivers/regulator/core.c:200: note: expected 'int *' but argument is of type 'unsigned int'
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@vega.(none)>
Saquib Herman [Fri, 1 Apr 2011 04:52:46 +0000 (10:22 +0530)]
regulator: twl: add twl6030 set_mode
Current set_mode logic does not support 6030. The logic for 4030 is
not reusable for 6030 as the mode setting for 6030 now uses the new
CFG_STATE register. We hence rename the old get_status as being
specific to 4030.
Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Saquib Herman <saquib@ti.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@vega.(none)>
Saquib Herman [Fri, 1 Apr 2011 04:52:45 +0000 (10:22 +0530)]
regulator: twl: add twl6030 get_status
Current get_status logic does not support 6030 get_status.
The logic for 4030 is not reusable for 6030 as the status
check for 6030 now depends on the new CFG_STATE register.
We hence rename the old get_status as being specific to
4030 and remove the redundant check for the same.
Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Saquib Herman <saquib@ti.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@vega.(none)>
Saquib Herman [Fri, 1 Apr 2011 04:52:44 +0000 (10:22 +0530)]
regulator: twl: fix twl6030 regulator is_enabled
With TWL6030, it is not enough to ensure that the regulator is the
group of P1 group (CPU/Linux), but we need to check the state as far
as APP is concerned as well.
Split the current is_enabled to 6030 and 4030 specific ones. This
split impacts few macros and variables as well, but sets up the
stage for further fixes to set_mode and get_status in subsequent
patches.
Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Saquib Herman <saquib@ti.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@vega.(none)>
Saquib Herman [Fri, 1 Apr 2011 04:52:43 +0000 (10:22 +0530)]
regulator: twl: remap has no meaning for 6030
TWL6030 does not have remap register. The current implementation
causes value of remap to be written to state register, accidentally
causing the regulators which are probed to be switched on as well.
This is wrong as regulators should be controllable based on calls
to enable/disable for TWL regulator framework. Further, the values
initialized make no sense as well. We hence remove this from the
initalizers and also write to remap register only if the TWL
is 4030.
Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Saquib Herman <saquib@ti.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@vega.(none)>
Saquib Herman [Fri, 1 Apr 2011 04:52:42 +0000 (10:22 +0530)]
regulator: twl: fix twl6030 enable/disable
TWL6030 requires an additional register write to CFG_STATE register
to explicitly state that the regulator is in a certain state. Merely
associating the regulator with the group is not enough. Add the
required register field definitions and fix the handling for
TWL6030 enable/disable.
Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Saquib Herman <saquib@ti.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@vega.(none)>
Axel Lin [Tue, 29 Mar 2011 09:54:58 +0000 (17:54 +0800)]
regulator: Add missing platform_set_drvdata in tps6105x_regulator_probe
Otherwise, calling platform_get_drvdata in tps6105x_regulator_remove
returns NULL.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Mark Brown [Tue, 29 Mar 2011 21:29:12 +0000 (06:29 +0900)]
regulator: When constraining modes fall back to higher power modes
If a mode requested by a consumer is not allowed by constraints
automatically fall back to a higher power mode if possible. This
ensures that consumers get at least the output they requested while
allowing machine drivers to transparently limit lower power modes
if required.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Linus Torvalds [Fri, 27 May 2011 02:01:15 +0000 (19:01 -0700)]
Merge branch 'upstream/tidy-xen-mmu-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'upstream/tidy-xen-mmu-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
xen: fix compile without CONFIG_XEN_DEBUG_FS
Use arbitrary_virt_to_machine() to deal with ioremapped pud updates.
Use arbitrary_virt_to_machine() to deal with ioremapped pmd updates.
xen/mmu: remove all ad-hoc stats stuff
xen: use normal virt_to_machine for ptes
xen: make a pile of mmu pvop functions static
vmalloc: remove vmalloc_sync_all() from alloc_vm_area()
xen: condense everything onto xen_set_pte
xen: use mmu_update for xen_set_pte_at()
xen: drop all the special iomap pte paths.
Linus Torvalds [Tue, 24 May 2011 20:48:51 +0000 (13:48 -0700)]
selinux: don't pass in NULL avd to avc_has_perm_noaudit
Right now security_get_user_sids() will pass in a NULL avd pointer to
avc_has_perm_noaudit(), which then forces that function to have a dummy
entry for that case and just generally test it.
Don't do it. The normal callers all pass a real avd pointer, and this
helper function is incredibly hot. So don't make avc_has_perm_noaudit()
do conditional stuff that isn't needed for the common case.
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
Squashfs: update email address
Squashfs: add extra sanity checks at mount time
Squashfs: add sanity checks to fragment reading at mount time
Squashfs: add sanity checks to lookup table reading at mount time
Squashfs: add sanity checks to id reading at mount time
Squashfs: add sanity checks to xattr reading at mount time
Squashfs: reverse order of filesystem table reading
Squashfs: move table allocation into squashfs_read_table()
By the previous style change, CONFIG_GENERIC_FIND_NEXT_BIT,
CONFIG_GENERIC_FIND_BIT_LE, and CONFIG_GENERIC_FIND_LAST_BIT are not used
to test for existence of find bitops anymore.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Greg Ungerer <gerg@uclinux.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Clifton Barnes [Thu, 26 May 2011 23:26:04 +0000 (16:26 -0700)]
w1: add Maxim/Dallas DS2780 Stand-Alone Fuel Gauge IC support
Add support for the Maxim/Dallas DS2780 Stand-Alone Fuel Gauge IC.
It was suggested to combine this functionality with the current ds2782
driver. Unfortunately, I'm unable to commit the time to refactoring this
driver to that extent and I don't have a platform with the ds2782 part to
validate that there are no regression issues by adding this functionality.
[akpm@linux-foundation.org: use min_t()] Signed-off-by: Clifton Barnes <cabarnes@indesign-llc.com> Tested-by: Haojian Zhuang <haojian.zhuang@gmail.com> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: Ryan Mallon <ryan@bluewatersys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Fries [Thu, 26 May 2011 23:26:03 +0000 (16:26 -0700)]
w1: have netlink search update kernel list
Reorganize so the netlink connector one wire search command will update
the kernel list of detected slave devices. Otherwise, a newly detected
device is unusable because unless it's in the kernel list of known devices
any commands will result in ENODEV status.
Signed-off-by: David Fries <David@Fries.net> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
w1: complete the 1-wire (w1) ds1wm driver search algorithm
This adds multi-slave support of the w1 bus for the ds1wm Synthesizable
1-Wire Bus Master. Also many fixes and tweaks based on the rev3 of the
datasheet http://datasheets.maxim-ic.com/en/ds/DS1WM.pdf
w1: add 1-wire (w1) reset and resume command API support
The first patch adds generic functionnality to w1_io for Resume Command
[A5h] lots of slaves support. I found it useful for multi-commands/reset
workflows with the same slave on a multi-slave bus.
This DS2408 w1 slave driver is not complete for all the features of the
chip, but its sufficient if you use it as a simple IO expander. Enjoy!
The ds1wm had Kconfig dependencies towards ARM && HAVE_CLK. I took them
out since I was using the ds1wm on an x86_64 platform (ds1wm in a FPGA
through pcie) and found them irrelevant.
The clock freq/divisors at the top of ds1wm.c did not have the MSB set to
1. This bit is CLK_EN which turns the whole prescaler and dividers on.
The driver never mentionned this bit either, so I just included this bit
right in the table entries. I also took the liberty to add a couple of
entries to the table. The spec doesn't explicitely mentions these
possibilities but the description and examination of the core shows the
prescalers & dividers can be used for more than the table explicitely
shows. The table I enlarged still doesn't cover all possibilities, but
it's a good start.
I also made a few tweaks to a couple of the read and write algorithms
which made sense while I had my head very deep in the ds1wm documentation.
We stressed it a lot with 10+ slaves on the bus, many ds2408, ds2431 and
ds2433 at the same time doing extensive interaction. It proved quite
stable in our production environment.
This patch:
Add generic functionnality to w1_io for Resume Command [A5h] lots of
slaves support.
Rakib Mullick [Thu, 26 May 2011 23:26:00 +0000 (16:26 -0700)]
kernel/profile.c: remove some duplicate code from profile_hits()
profile_hits() has a common check for prof_on and prof_buffer regardless
of SMP or !SMP. So, remove some duplicate code by splitting profile_hits
into two.
[akpm@linux-foundation.org: make do_profile_hits static] Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Julia Lawall [Thu, 26 May 2011 23:25:59 +0000 (16:25 -0700)]
drivers/char/ppdev.c: put gotten port value
parport_find_number() calls parport_get_port() on its result, so there
should be a corresponding call to parport_put_port() before dropping the
reference. Similar code is found in the function register_device() in the
same file.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
x = parport_find_number(...)
... when != x = rr
when any
when != parport_put_port(x,...)
when != if (...) { ... parport_put_port(x,...) ...}
(
if(<+...x...+>) S1 else S2
|
if(...) { ... when != x = ra
when forall
when != parport_put_port(x,...)
*return...;
}
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Timo Warns [Thu, 26 May 2011 23:25:57 +0000 (16:25 -0700)]
fs/partitions/efi.c: corrupted GUID partition tables can cause kernel oops
The kernel automatically evaluates partition tables of storage devices.
The code for evaluating GUID partitions (in fs/partitions/efi.c) contains
a bug that causes a kernel oops on certain corrupted GUID partition
tables.
This bug has security impacts, because it allows, for example, to
prepare a storage device that crashes a kernel subsystem upon connecting
the device (e.g., a "USB Stick of (Partial) Death").
computes a CRC32 checksum over gpt covering (*gpt)->header_size bytes.
There is no validation of (*gpt)->header_size before the efi_crc32 call.
A corrupted partition table may have large values for (*gpt)->header_size.
In this case, the CRC32 computation access memory beyond the memory
allocated for gpt, which may cause a kernel heap overflow.
Validate value of GUID partition table header size.
[akpm@linux-foundation.org: fix layout and indenting] Signed-off-by: Timo Warns <warns@pre-sense.de> Cc: Matt Domsch <Matt_Domsch@dell.com> Cc: Eugene Teo <eugeneteo@kernel.sg> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Olaf Hering [Thu, 26 May 2011 23:25:54 +0000 (16:25 -0700)]
fs/proc/vmcore.c: add hook to read_from_oldmem() to check for non-ram pages
The balloon driver in a Xen guest frees guest pages and marks them as
mmio. When the kernel crashes and the crash kernel attempts to read the
oldmem via /proc/vmcore a read from ballooned pages will generate 100%
load in dom0 because Xen asks qemu-dm for the page content. Since the
reads come in as 8byte requests each ballooned page is tried 512 times.
With this change a hook can be registered which checks wether the given
pfn is really ram. The hook has to return a value > 0 for ram pages, a
value < 0 on error (because the hypercall is not known) and 0 for non-ram
pages.
This will reduce the time to read /proc/vmcore. Without this change a
512M guest with 128M crashkernel region needs 200 seconds to read it, with
this change it takes just 2 seconds.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Thu, 26 May 2011 23:25:53 +0000 (16:25 -0700)]
proc: fix pagemap_read() error case
Currently, pagemap_read() has three error and/or corner case handling
mistake.
(1) If ppos parameter is wrong, mm refcount will be leak.
(2) If count parameter is 0, mm refcount will be leak too.
(3) If the current task is sleeping in kmalloc() and the system
is out of memory and oom-killer kill the proc associated task,
mm_refcount prevent the task free its memory. then system may
hang up.
<Quote Hugh's explain why we shold call kmalloc() before get_mm()>
check_mem_permission gets a reference to the mm. If we
__get_free_page after check_mem_permission, imagine what happens if the
system is out of memory, and the mm we're looking at is selected for
killing by the OOM killer: while we wait in __get_free_page for more
memory, no memory is freed from the selected mm because it cannot reach
exit_mmap while we hold that reference.
KOSAKI Motohiro [Thu, 26 May 2011 23:25:52 +0000 (16:25 -0700)]
proc: put check_mem_permission after __get_free_page in mem_write
It whould be better if put check_mem_permission after __get_free_page in
mem_write, to be same as function mem_read.
Hugh Dickins explained the reason.
check_mem_permission gets a reference to the mm. If we __get_free_page
after check_mem_permission, imagine what happens if the system is out
of memory, and the mm we're looking at is selected for killing by the
OOM killer: while we wait in __get_free_page for more memory, no memory
is freed from the selected mm because it cannot reach exit_mmap while
we hold that reference.
Reported-by: Jovi Zhang <bookjovi@gmail.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Hugh Dickins <hughd@google.com> Reviewed-by: Stephen Wilson <wilsons@start.ca> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Slaby [Thu, 26 May 2011 23:25:46 +0000 (16:25 -0700)]
coredump: add support for exe_file in core name
Now, exe_file is not proc FS dependent, so we can use it to name core
file. So we add %E pattern for core file name cration which extract path
from mm_struct->exe_file. Then it converts slashes to exclamation marks
and pastes the result to the core file name itself.
This is useful for environments where binary names are longer than 16
character (the current->comm limitation). Also where there are binaries
with same name but in a different path. Further in case the binery itself
changes its current->comm after exec.
So by doing (s/$/#/ -- # is treated as git comment):
Jiri Slaby [Thu, 26 May 2011 23:25:46 +0000 (16:25 -0700)]
mm: extract exe_file handling from procfs
Setup and cleanup of mm_struct->exe_file is currently done in fs/proc/.
This was because exe_file was needed only for /proc/<pid>/exe. Since we
will need the exe_file functionality also for core dumps (so core name can
contain full binary path), built this functionality always into the
kernel.
To achieve that move that out of proc FS to the kernel/ where in fact it
should belong. By doing that we can make dup_mm_exe_file static. Also we
can drop linux/proc_fs.h inclusion in fs/exec.c and kernel/fork.c.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Blackfin arch, like the x86 arch, needs to adjust the PC manually
after a breakpoint is hit as normally this is handled by the remote gdb.
However, rather than starting another arch ifdef mess, create a common
GDB_ADJUSTS_BREAK_OFFSET define for any arch to opt-in via their kgdb.h.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Dongdong Deng <dongdong.deng@windriver.com> Cc: Sergei Shtylyov <sshtylyov@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Thu, 26 May 2011 23:25:39 +0000 (16:25 -0700)]
asm-generic/ptrace.h: start a common low level ptrace helper
This is a series of low level ptrace unification steps to make it easier
for common code (like KGDB) to poke at register state. This also avoids
having to duplicate higher level operations for most ports which don't
have special needs for accessing things.
This patch:
This implements a bunch of helper funcs for poking the registers of a
ptrace structure. Now common code should be able to portably update
specific registers (like kgdb updating the PC).
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Sergei Shtylyov <sshtylyov@mvista.com> Cc: Dongdong Deng <dongdong.deng@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ying Han [Thu, 26 May 2011 23:25:38 +0000 (16:25 -0700)]
memcg: add the pagefault count into memcg stats
Two new stats in per-memcg memory.stat which tracks the number of page
faults and number of major page faults.
"pgfault"
"pgmajfault"
They are different from "pgpgin"/"pgpgout" stat which count number of
pages charged/discharged to the cgroup and have no meaning of reading/
writing page to disk.
It is valuable to track the two stats for both measuring application's
performance as well as the efficiency of the kernel page reclaim path.
Counting pagefaults per process is useful, but we also need the aggregated
value since processes are monitored and controlled in cgroup basis in
memcg.
Functional test: check the total number of pgfault/pgmajfault of all
memcgs and compare with global vmstat value:
Performance test: run page fault test(pft) wit 16 thread on faulting in
15G anon pages in 16G container. There is no regression noticed on the
"flt/cpu/s"
Sample output from pft:
TAG pft:anon-sys-default:
Gb Thr CLine User System Wall flt/cpu/s fault/wsec
15 16 1 0.67s 233.41s 14.76s 16798.546 266356.260
+-------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 10 16682.962 17344.027 16913.524 16928.812 166.5362
+ 10 16695.568 16923.896 16820.604 16824.652 84.816568
No difference proven at 95.0% confidence
Ying Han [Thu, 26 May 2011 23:25:37 +0000 (16:25 -0700)]
memcg: add memory.numastat api for numa statistics
The new API exports numa_maps per-memcg basis. This is a piece of useful
information where it exports per-memcg page distribution across real numa
nodes.
One of the usecases is evaluating application performance by combining
this information w/ the cpu allocation to the application.
The output of the memory.numastat tries to follow w/ simiar format of
numa_maps like:
Ying Han [Thu, 26 May 2011 23:25:36 +0000 (16:25 -0700)]
memcg: rename mem_cgroup_zone_nr_pages() to mem_cgroup_zone_nr_lru_pages()
The caller of the function has been renamed to zone_nr_lru_pages(), and
this is just fixing up in the memcg code. The current name is easily to
be mis-read as zone's total number of pages.
Signed-off-by: Ying Han <yinghan@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If scan < SWAP_CLUSTER_MAX, the scan will be skipped for this time and
priority gets higher. This has some problems.
1. This increases priority as 1 without any scan.
To do scan in this priority, amount of pages should be larger than 512M.
If pages>>priority < SWAP_CLUSTER_MAX, it's recorded and scan will be
batched, later. (But we lose 1 priority.)
If memory size is below 16M, pages >> priority is 0 and no scan in
DEF_PRIORITY forever.
2. If zone->all_unreclaimabe==true, it's scanned only when priority==0.
So, x86's ZONE_DMA will never be recoverred until the user of pages
frees memory by itself.
3. With memcg, the limit of memory can be small. When using small memcg,
it gets priority < DEF_PRIORITY-2 very easily and need to call
wait_iff_congested().
For doing scan before priorty=9, 64MB of memory should be used.
Then, this patch tries to scan SWAP_CLUSTER_MAX of pages in force...when
1. the target is enough small.
2. it's kswapd or memcg reclaim.
Then we can avoid rapid priority drop and may be able to recover
all_unreclaimable in a small zones. And this patch removes nr_saved_scan.
This will allow scanning in this priority even when pages >> priority is
very small.
Ying Han [Thu, 26 May 2011 23:25:33 +0000 (16:25 -0700)]
memcg: reclaim memory from nodes in round-robin order
Presently, memory cgroup's direct reclaim frees memory from the current
node. But this has some troubles. Usually when a set of threads works in
a cooperative way, they tend to operate on the same node. So if they hit
limits under memcg they will reclaim memory from themselves, damaging the
active working set.
For example, assume 2 node system which has Node 0 and Node 1 and a memcg
which has 1G limit. After some work, file cache remains and the usages
are
Node 0: 1M
Node 1: 998M.
and run an application on Node 0, it will eat its foot before freeing
unnecessary file caches.
This patch adds round-robin for NUMA and adds equal pressure to each node.
When using cpuset's spread memory feature, this will work very well.
Namhyung Kim [Thu, 26 May 2011 23:25:29 +0000 (16:25 -0700)]
memcg: mark init_section_page_cgroup() properly
Commit ca371c0d7e23 ("memcg: fix page_cgroup fatal error in FLATMEM")
removes call to alloc_bootmem() in the function so that it can be marked
as __meminit to reduce memory usage when MEMORY_HOTPLUG=n.
Also as the new helper function alloc_page_cgroup() is called only in the
function, it should be marked too.
Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Thu, 26 May 2011 23:25:28 +0000 (16:25 -0700)]
memcg: remove pointless next_mz nullification in mem_cgroup_soft_limit_reclaim()
next_mz is assigned to NULL if __mem_cgroup_largest_soft_limit_node
selects the same mz. This doesn't make much sense as we assign to the
variable right in the next loop.
Compiler will probably optimize this out but it is little bit confusing
for the code reading.
Ying Han [Thu, 26 May 2011 23:25:27 +0000 (16:25 -0700)]
memcg: add the soft_limit reclaim in global direct reclaim.
We recently added the change in global background reclaim which counts the
return value of soft_limit reclaim. Now this patch adds the similar logic
on global direct reclaim.
We should skip scanning global LRU on shrink_zone if soft_limit reclaim
does enough work. This is the first step where we start with counting the
nr_scanned and nr_reclaimed from soft_limit reclaim into global
scan_control.
Signed-off-by: Ying Han <yinghan@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ying Han [Thu, 26 May 2011 23:25:25 +0000 (16:25 -0700)]
memcg: count the soft_limit reclaim in global background reclaim
The global kswapd scans per-zone LRU and reclaims pages regardless of the
cgroup. It breaks memory isolation since one cgroup can end up reclaiming
pages from another cgroup. Instead we should rely on memcg-aware target
reclaim including per-memcg kswapd and soft_limit hierarchical reclaim under
memory pressure.
In the global background reclaim, we do soft reclaim before scanning the
per-zone LRU. However, the return value is ignored. This patch is the first
step to skip shrink_zone() if soft_limit reclaim does enough work.
This is part of the effort which tries to reduce reclaiming pages in global
LRU in memcg. The per-memcg background reclaim patchset further enhances the
per-cgroup targetting reclaim, which I should have V4 posted shortly.
Try running multiple memory intensive workloads within seperate memcgs. Watch
the counters of soft_steal in memory.stat.
In the global background reclaim, we do soft reclaim before scanning the
per-zone LRU. However, the return value is ignored.
We would like to skip shrink_zone() if soft_limit reclaim does enough
work. Also, we need to make the memory pressure balanced across per-memcg
zones, like the logic vm-core. This patch is the first step where we
start with counting the nr_scanned and nr_reclaimed from soft_limit
reclaim into the global scan_control.
Daniel Lezcano [Thu, 26 May 2011 23:25:23 +0000 (16:25 -0700)]
cgroup: remove the ns_cgroup
The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier and
leads to some problems:
* cgroup creation is out-of-control
* cgroup name can conflict when pids are looping
* it is not possible to have a single process handling a lot of
namespaces without falling in a exponential creation time
* we may want to create a namespace without creating a cgroup
The ns_cgroup was replaced by a compatibility flag 'clone_children',
where a newly created cgroup will copy the parent cgroup values.
The userspace has to manually create a cgroup and add a task to
the 'tasks' file.
This patch removes the ns_cgroup as suggested in the following thread:
The 'cgroup_clone' function is removed because it is no longer used.
This is a userspace-visible change. Commit 45531757b45c ("cgroup: notify
ns_cgroup deprecated") (merged into 2.6.27) caused the kernel to emit a
printk warning users that the feature is planned for removal. Since that
time we have heard from XXX users who were affected by this.
Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Jamal Hadi Salim <hadi@cyberus.ca> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Acked-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Blum [Thu, 26 May 2011 23:25:21 +0000 (16:25 -0700)]
cgroups: use flex_array in attach_proc
Convert cgroup_attach_proc to use flex_array.
The cgroup_attach_proc implementation requires a pre-allocated array to
store task pointers to atomically move a thread-group, but asking for a
monolithic array with kmalloc() may be unreliable for very large groups.
Using flex_array provides the same functionality with less risk of
failure.
This is a post-patch for cgroup-procs-write.patch.
Signed-off-by: Ben Blum <bblum@andrew.cmu.edu> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Reviewed-by: Paul Menage <menage@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Blum [Thu, 26 May 2011 23:25:20 +0000 (16:25 -0700)]
cgroups: make procs file writable
Make procs file writable to move all threads by tgid at once.
Add functionality that enables users to move all threads in a threadgroup
at once to a cgroup by writing the tgid to the 'cgroup.procs' file. This
current implementation makes use of a per-threadgroup rwsem that's taken
for reading in the fork() path to prevent newly forking threads within the
threadgroup from "escaping" while the move is in progress.
Signed-off-by: Ben Blum <bblum@andrew.cmu.edu> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Reviewed-by: Paul Menage <menage@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Blum [Thu, 26 May 2011 23:25:19 +0000 (16:25 -0700)]
cgroups: add per-thread subsystem callbacks
Add cgroup subsystem callbacks for per-thread attachment in atomic contexts
Add can_attach_task(), pre_attach(), and attach_task() as new callbacks
for cgroups's subsystem interface. Unlike can_attach and attach, these
are for per-thread operations, to be called potentially many times when
attaching an entire threadgroup.
Also, the old "bool threadgroup" interface is removed, as replaced by
this. All subsystems are modified for the new interface - of note is
cpuset, which requires from/to nodemasks for attach to be globally scoped
(though per-cpuset would work too) to persist from its pre_attach to
attach_task and attach.
This is a pre-patch for cgroup-procs-writable.patch.
Signed-off-by: Ben Blum <bblum@andrew.cmu.edu> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Reviewed-by: Paul Menage <menage@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Blum [Thu, 26 May 2011 23:25:18 +0000 (16:25 -0700)]
cgroups: read-write lock CLONE_THREAD forking per threadgroup
Adds functionality to read/write lock CLONE_THREAD fork()ing per-threadgroup
Add an rwsem that lives in a threadgroup's signal_struct that's taken for
reading in the fork path, under CONFIG_CGROUPS. If another part of the
kernel later wants to use such a locking mechanism, the CONFIG_CGROUPS
ifdefs should be changed to a higher-up flag that CGROUPS and the other
system would both depend on.
This is a pre-patch for cgroup-procs-write.patch.
Signed-off-by: Ben Blum <bblum@andrew.cmu.edu> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Reviewed-by: Paul Menage <menage@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Slaby [Thu, 26 May 2011 23:25:17 +0000 (16:25 -0700)]
Documentation: configfs examples crash fix
When configfs_register_subsystem() fails, we unregister too many
subsystems in configfs_example_init. Decrement i by one to not unregister
non-registered subsystem.
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Joel Becker <joel.becker@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Thu, 26 May 2011 23:25:16 +0000 (16:25 -0700)]
getdelays: show average CPU/IO/SWAP/RECLAIM delays
I find it very handy to show the average delays in milliseconds.
Example output (on 100 concurrent dd reading sparse files):
CPU count real total virtual total delay total delay average
986 3223509952320764330138863410579 39.415ms
IO count delay total delay average
0 0 0ms
SWAP count delay total delay average
0 0 0ms
RECLAIM count delay total delay average
1059 5131834899 4ms
dd: read=0, write=0, cancelled_write=0
Documentation/accounting/getdelays.c: In function `get_family_id':
Documentation/accounting/getdelays.c:172:14: warning: variable `rc' set but not used [-Wunused-but-set-variable]
Reported-by: "Justin P. Mattock" <justinmattock@gmail.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Documentation/accounting/getdelays.c: fix unused var warning
Fixes
Documentation/accounting/getdelays.c: In function `main':
Documentation/accounting/getdelays.c:436:7: warning: variable `i' set but not used [-Wunused-but-set-variable]
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Thu, 26 May 2011 23:25:12 +0000 (16:25 -0700)]
ufs: fix truncated values handling 64 bit metadata
Originally i_lastfrag was 32 bits but then we added support for handling
64 bit metadata and it became a 64 bit variable. That was during 2007, in 54fb996ac15c "[PATCH] ufs2 write: block allocation update". Unfortunately
these casts got left behind so the value got truncated to 32 bit again.
Commit 51ba60c5 ("RTC: Cleanup rtc_class_ops->update_irq_enable()")
removed the only user of the update IRQ, so there is no need to manage it
any more.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Marcelo Roberto Jimenez <mroberto@cpti.cetuc.puc-rio.br> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesse Gross [Thu, 26 May 2011 23:25:02 +0000 (16:25 -0700)]
flex_array: avoid divisions when accessing elements
On most architectures division is an expensive operation and accessing an
element currently requires four of them. This performance penalty
effectively precludes flex arrays from being used on any kind of fast
path. However, two of these divisions can be handled at creation time and
the others can be replaced by a reciprocal divide, completely avoiding
real divisions on access.
[eparis@redhat.com: rebase on top of changes to support 0 len elements]
[eparis@redhat.com: initialize part_nr when array fits entirely in base] Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Eric Paris <eparis@redhat.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Thu, 26 May 2011 23:24:59 +0000 (16:24 -0700)]
m32r: convert cpumask api
We plan to remove cpus_xx() old cpumask APIs later. Also, we plan to
change mm_cpu_mask() implementation, allocate only nr_cpu_ids, thus
*mm_cpu_mask() is dangerous operation.
Andrew Morton [Thu, 26 May 2011 23:24:57 +0000 (16:24 -0700)]
drivers/bcma/host_pci.c needs slab.h
alpha allmodconfig:
drivers/bcma/host_pci.c: In function 'bcma_host_pci_probe':
drivers/bcma/host_pci.c:102: error: implicit declaration of function 'kzalloc'
drivers/bcma/host_pci.c:102: warning: assignment makes pointer from integer without a cast
Cc: <zajec5@gmail.com> Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drivers/video/mb862xx/mb862xxfbdrv.c: In function 'mb862xxfb_ioctl':
drivers/video/mb862xx/mb862xxfbdrv.c:323: error: implicit declaration of function 'copy_to_user'
drivers/video/mb862xx/mb862xxfbdrv.c:327: error: implicit declaration of function 'copy_from_user'
Cc: Paul Mundt <lethal@linux-sh.org> Cc: Anatolij Gustschin <agust@denx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Thu, 26 May 2011 20:25:05 +0000 (15:25 -0500)]
Set cred->user_ns in key_replace_session_keyring
Since this cred was not created with copy_creds(), it needs to get
initialized. Otherwise use of syscall(__NR_keyctl, KEYCTL_SESSION_TO_PARENT);
can lead to a NULL deref. Thanks to Robert for finding this.
But introduced by commit 47a150edc2a ("Cache user_ns in struct cred").
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Reported-by: Robert Święcki <robert@swiecki.net> Cc: David Howells <dhowells@redhat.com> Cc: stable@kernel.org (2.6.39) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>