Afzal Mohammed [Fri, 9 Nov 2012 03:04:55 +0000 (14:04 +1100)]
ARM: davinci: remove rtc kicker release
rtc-omap driver is now capable of handling kicker mechanism, hence remove
kicker handling at platform level, instead provide proper device name so
that driver can handle kicker mechanism by itself
Signed-off-by: Afzal Mohammed <afzal@ti.com> Acked-by: Sekhar Nori <nsekhar@ti.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Sekhar Nori <nsekhar@ti.com> Cc: Kevin Hilman <khilman@ti.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Daniel Mack <zonque@gmail.com> Cc: Vaibhav Hiremath <hvaibhav@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Afzal Mohammed [Fri, 9 Nov 2012 03:04:55 +0000 (14:04 +1100)]
rtc: omap: kicker mechanism support
OMAP RTC IP can have kicker feature. This prevents spurious writes to
register. To write to registers kicker lock has to be released.
Procedure to do it as follows,
1. write to kick0 register, 0x83e70b13
2. write to kick1 register, 0x95a4f1e0
Writing value other than 0x83e70b13 to kick0 enables write locking, more
details about kicker mechanism can be found in section 20.3.3.5.3 of
AM335X TRM @www.ti.com/am335x
Here id table information is added and is used to distinguish those that
require kicker handling and the ones that doesn't need it. There are more
features in the newer IP's compared to legacy ones other than kicker,
which driver currently doesn't handle, supporting additional features
would be easier with the addition of id table.
Older IP (of OMAP1) doesn't have revision register as per TRM, so revision
register can't be relied always to find features, hence id table is being
used.
While at it, replace __raw_writeb/__raw_readb with writeb/readb; this
driver is used on ARMv7 (AM335X SoC)
Signed-off-by: Afzal Mohammed <afzal@ti.com> Acked-by: Sekhar Nori <nsekhar@ti.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Sekhar Nori <nsekhar@ti.com> Cc: Kevin Hilman <khilman@ti.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Daniel Mack <zonque@gmail.com> Cc: Vaibhav Hiremath <hvaibhav@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alan Cox [Fri, 9 Nov 2012 03:04:55 +0000 (14:04 +1100)]
binfmt_elf: fix corner case kfree of uninitialized data
If elf_core_dump() is called and fill_note_info() fails in the kmalloc()
then it returns 0 but has not yet initialised all the needed fields. As a
result we do a kfree(randomness) after correctly skipping the thread data.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Paton J. Lewis [Fri, 9 Nov 2012 03:04:54 +0000 (14:04 +1100)]
epoll: support for disabling items, and a self-test app
It is not currently possible to reliably delete epoll items when using the
same epoll set from multiple threads. After calling epoll_ctl with
EPOLL_CTL_DEL, another thread might still be executing code related to an
event for that epoll item (in response to epoll_wait). Therefore the
deleting thread does not know when it is safe to delete resources
pertaining to the associated epoll item because another thread might be
using those resources.
The deleting thread could wait an arbitrary amount of time after calling
epoll_ctl with EPOLL_CTL_DEL and before deleting the item, but this is
inefficient and could result in the destruction of resources before
another thread is done handling an event returned by epoll_wait.
This patch enhances epoll_ctl to support EPOLL_CTL_DISABLE, which disables
an epoll item. If epoll_ctl returns -EBUSY in this case, then another
thread may handling a return from epoll_wait for this item. Otherwise if
epoll_ctl returns 0, then it is safe to delete the epoll item. This
allows multiple threads to use a mutex to determine when it is safe to
delete an epoll item and its associated resources, which allows epoll
items to be deleted both efficiently and without error in a multi-threaded
environment. Note that EPOLL_CTL_DISABLE is only useful in conjunction
with EPOLLONESHOT, and using EPOLL_CTL_DISABLE on an epoll item without
EPOLLONESHOT returns -EINVAL.
This patch also adds a new test_epoll self-test program to both
demonstrate the need for this feature and test it.
Signed-off-by: Paton J. Lewis <palewis@adobe.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jason Baron <jbaron@redhat.com> Cc: Paul Holland <pholland@adobe.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joe Perches [Fri, 9 Nov 2012 03:04:54 +0000 (14:04 +1100)]
checkpatch: warn on unnecessary line continuations
When the previous line is not a line continuation and the current line has
a line continuation but is not a #define, emit a warning.
Signed-off-by: Joe Perches <joe@perches.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This function is used by sparc, powerpc tile and arm64 for compat support.
The patch adds a generic implementation with a wrapper for PowerPC to do
the u32->int sign extension.
The reason for a single patch covering powerpc, tile, sparc and arm64 is
to keep it bisectable, otherwise kernel building may fail with mismatched
function declarations.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [for tile] Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Oleg Nesterov [Fri, 9 Nov 2012 03:04:52 +0000 (14:04 +1100)]
percpu_rw_semaphore: reimplement to not block the readers unnecessarily
Currently the writer does msleep() plus synchronize_sched() 3 times to
acquire/release the semaphore, and during this time the readers are
blocked completely. Even if the "write" section was not actually started
or if it was already finished.
With this patch down_write/up_write does synchronize_sched() twice and
down_read/up_read are still possible during this time, just they use the
slow path.
percpu_down_write() first forces the readers to use rw_semaphore and
increment the "slow" counter to take the lock for reading, then it
takes that rw_semaphore for writing and blocks the readers.
Also. With this patch the code relies on the documented behaviour of
synchronize_sched(), it doesn't try to pair synchronize_sched() with
barrier.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jan Beulich [Fri, 9 Nov 2012 03:04:52 +0000 (14:04 +1100)]
sscanf: don't ignore field widths for numeric conversions
This is another step towards better standard conformance. Rather than
adding a local buffer to store the specified portion of the string (with
the need to enforce an arbitrary maximum supported width to limit the
buffer size), do a maximum width conversion and then drop as much of it as
is necessary to meet the caller's request.
Also fail on negative field widths.
Uses the deprecated simple_strto*() functions because kstrtoXX() fail on
non-zero terminated strings.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Shevchenko [Fri, 9 Nov 2012 03:04:51 +0000 (14:04 +1100)]
drivers/of/fdt.c: re-use kernel's kbasename()
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Grant Likely <grant.likely@secretlab.ca> Cc: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Shevchenko [Fri, 9 Nov 2012 03:04:50 +0000 (14:04 +1100)]
lib: dynamic_debug: use kbasename()
Remove the custom implementation of the functionality similar to kbasename().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Jason Baron <jbaron@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Shevchenko [Fri, 9 Nov 2012 03:04:50 +0000 (14:04 +1100)]
string: introduce helper to get base file name from given path
There are several places in the kernel that use functionality like
basename(3) with the exception: in case of '/foo/bar/' we expect to get an
empty string. Let's do it common helper for them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Jason Baron <jbaron@redhat.com> Cc: YAMANE Toshiaki <yamanetoshi@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
WARNING: drivers/video/backlight/built-in.o(.data+0x110): Section mismatch in reference from the variable ep93xxbl_driver to the
function .init.text:ep93xxbl_probe()
The variable ep93xxbl_driver references
the function __init ep93xxbl_probe()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Axel Lin [Fri, 9 Nov 2012 03:04:48 +0000 (14:04 +1100)]
drivers/video/backlight/lm3639_bl.c: fix up world writable sysfs file
We don't need the sysfs file to be world writable or group writable.
This file is write-only, change it to S_IWUSR (0200).
Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: "G.Shark Jeong" <gshark.jeong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 9 Nov 2012 03:04:48 +0000 (14:04 +1100)]
drivers/video/backlight/max8925_bl.c: drop devm_kfree of devm_kzalloc'd data
devm_kfree() allocates memory that is released when a driver detaches.
Thus, there is no reason to explicitly call devm_kfree in probe or remove
functions.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 9 Nov 2012 03:04:48 +0000 (14:04 +1100)]
drivers/video/backlight/88pm860x_bl.c: drop devm_kfree of devm_kzalloc'd data
devm_kfree() allocates memory that is released when a driver detaches.
Thus, there is no reason to explicitly call devm_kfree() in probe or remove
functions.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add 'const' to static array that was missing it in its definition. Also,
'const' is added to ili9320_write_regs(), because it is called by
vgg2432a4 driver.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add 'const' to static array that was missing it in its definition.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Eric Miao <eric.y.miao@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add 'const' to static array that was missing it in its definition.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Marek Vasut <marex@denx.de> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The mutex for accessing lp855x registers is used in case of the user-space
interaction. When the brightness is changed via sysfs, the mutex is
required. But the backlight class device already provides it. Thus, the
lp855x mutex is unnecessary.
Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com> Cc: Thierry Reding <thierry.reding@avionic-design.de> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Bryan Wu <bryan.wu@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kim, Milo [Fri, 9 Nov 2012 03:04:45 +0000 (14:04 +1100)]
drivers/video/backlight/lp855x_bl.c: use generic PWM functions
The LP855x family devices support the PWM input for the backlight control.
Period of the PWM is configurable in the platform side. Platform
specific functions are unnecessary anymore because generic PWM functions
are used inside the driver.
(PWM input mode)
To set the brightness, new lp855x_pwm_ctrl() is used.
If a PWM device is not allocated, devm_pwm_get() is called.
The PWM consumer name is from the chip name such as 'lp8550' and 'lp8556'.
To get the brightness value, no additional handling is required.
Just the value of 'props.brightness' is returned.
If the PWM driver is not ready while initializing the LP855x driver, it's
OK. The PWM device can be retrieved later, when the brightness value is
changed.
Documentation is updated with an example.
Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com> Cc: Thierry Reding <thierry.reding@avionic-design.de> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Bryan Wu <bryan.wu@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 9 Nov 2012 03:04:45 +0000 (14:04 +1100)]
backlight: tosa: use devm_gpio_request_one
By using devm_gpio_request_one it is possible to set the direction and
initial value in one shot. Thus, using devm_gpio_request_one can make the
code simpler.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 9 Nov 2012 03:04:44 +0000 (14:04 +1100)]
backlight: lms283gf05: use devm_gpio_request_one
By using devm_gpio_request_one it is possible to set the direction
and initial value in one shot. Thus, using devm_gpio_request_one
can make the code simpler.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Acked-by: Marek Vasut <marex@denx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 9 Nov 2012 03:04:42 +0000 (14:04 +1100)]
backlight: locomolcd: fix checkpatch error and warning
This patch fixes the checkpatch error and warning as below:
WARNING: line over 80 characters
WARNING: space prohibited between function name and open parenthesis '('
ERROR: trailing statements should be on next line
Also, long comments are fixed for the preferred style and
unnecessary lines are removed.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 9 Nov 2012 03:04:41 +0000 (14:04 +1100)]
backlight: ili9320: fix checkpatch error and warning
This patch fixes the checkpatch error and warning as below:
WARNING: please, no space before tabs
WARNING: please, no spaces at the start of a line
WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
WARNING: braces {} are not necessary for single statement blocks
ERROR: code indent should use tabs where possible
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 9 Nov 2012 03:04:41 +0000 (14:04 +1100)]
backlight: hp680_bl: fix checkpatch error and warning
This patch fixes the checkpatch error and warning as below:
WARNING: please, no space before tabs
WARNING: please, no spaces at the start of a line
ERROR: do not initialise statics to 0 or NULL
ERROR: code indent should use tabs where possible
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 9 Nov 2012 03:04:39 +0000 (14:04 +1100)]
backlight: da903x_bl: use dev_get_drvdata() instead of platform_get_drvdata()
dev_get_drvdata() can be used instead of platform_get_drvdata()
to make the code smaller.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Matthew Leach [Fri, 9 Nov 2012 03:04:37 +0000 (14:04 +1100)]
include/linux/init.h: use the stringify operator for the __define_initcall macro
Currently the __define_initcall() macro takes three arguments, fn, id and
level. The level argument is exactly the same as the id argument but
wrapped in quotes. To overcome this need to specify three arguments to
the __define_initcall macro, where one argument is the stringification of
another, we can just use the stringification macro instead.
Signed-off-by: Matthew Leach <matthew@mattleach.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wen Congyang [Fri, 9 Nov 2012 03:04:37 +0000 (14:04 +1100)]
Documentation/kernel-parameters.txt: update mem= option's spec according to its implementation
Current mem= implementation seems buggy because the specification and
implementation don't match. The current mem= has been working for many
years and it's not buggy - it works as expected. So we should update the
specification.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: Rob Landley <rob@landley.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
PBM generated with current tools do not have a whitespace between the
digits. Therefore the pnmtologo tool fails to gernerate the required
C-Array for these images. This patch fixes that behaviour and can handle
both 'old style' and 'new style' PBM files.
Signed-off-by: Andreas Bießmann <andreas@biessmann.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wanpeng Li [Fri, 9 Nov 2012 03:04:36 +0000 (14:04 +1100)]
mm/memblock: reduce overhead in binary search
When checking that the indicated address belongs to the memory region, the
memory regions are checked one by one through a binary search, which will
be time consuming.
If the indicated address isn't in the memory region, then we needn't do
the time-consuming search. Add a check on the indicated address for that
purpose.
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Gavin Shan <shangw@linux.vnet.ibm.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Shaohua Li [Fri, 9 Nov 2012 03:04:35 +0000 (14:04 +1100)]
swap: add a simple detector for inappropriate swapin readahead
The swapin readahead does a blind readahead whether or not the swapin is
sequential. This is ok for harddisk because large reads have relatively
small costs and if the readahead pages are unneeded they can be reclaimed
easily. But for SSD devices large reads are more expensive than small
one. If readahead pages are unneeded, reading them in caused significant
overhead
This patch addes a simple random read detection similar to file mmap
readahead. If a random read is detected, swapin readahead will be
skipped. This improves a lot for a swap workload with random IO in a fast
SSD.
I run anonymous mmap write micro benchmark, which will triger swapin/swapout.
For both harddisk and SSD, the randwrite swap workload run time is reduced
significantly. Sequential write swap workload hasn't chanage.
Interestingly, the randwrite harddisk test is improved too. This might be
because swapin readahead needs to allocate extra memory, which further
tights memory pressure, so more swapout/swapin.
Signed-off-by: Shaohua Li <shli@fusionio.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Michal Hocko [Fri, 9 Nov 2012 03:04:34 +0000 (14:04 +1100)]
drop_caches: add some documentation and info message
I would like to resurrect Dave's patch. The last time it was posted was
here https://lkml.org/lkml/2010/9/16/250 and there didn't seem to be any
strong opposition.
Kosaki was worried about possible excessive logging when somebody drops
caches too often (but then he claimed he didn't have a strong opinion on
that) but I would say opposite. If somebody does that then I would really
like to know that from the log when supporting a system because it almost
for sure means that there is something fishy going on. It is also worth
mentioning that only root can write drop caches so this is not an flooding
attack vector.
I am bringing that up again because this can be really helpful when
chasing strange performance issues which (surprise surprise) turn out to
be related to artificially dropped caches done because the admin thinks
this would help...
I have just refreshed the original patch on top of the current mm tree
but I could live with KERN_INFO as well if people think that KERN_NOTICE
is too hysterical.
: From: Dave Hansen <dave@linux.vnet.ibm.com>
: Date: Fri, 12 Oct 2012 14:30:54 +0200
:
: There is plenty of anecdotal evidence and a load of blog posts
: suggesting that using "drop_caches" periodically keeps your system
: running in "tip top shape". Perhaps adding some kernel
: documentation will increase the amount of accurate data on its use.
:
: If we are not shrinking caches effectively, then we have real bugs.
: Using drop_caches will simply mask the bugs and make them harder
: to find, but certainly does not fix them, nor is it an appropriate
: "workaround" to limit the size of the caches.
:
: It's a great debugging tool, and is really handy for doing things
: like repeatable benchmark runs. So, add a bit more documentation
: about it, and add a little KERN_NOTICE. It should help developers
: who are chasing down reclaim-related bugs.
[mhocko@suse.cz: refreshed to current -mm tree] Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commits 2139cbe627b89 ("cma: fix counting of isolated pages") and d95ea5d18e69951 ("cma: fix watermark checking") introduced a reliable
method of free page accounting when memory is being allocated from CMA
regions, so the workaround introduced earlier by commit 49f223a9cd96c72
("mm: trigger page reclaim in alloc_contig_range() to stabilise
watermarks") can be finally removed.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: cma: skip watermarks check for already isolated blocks in split_free_page()
Since commit 2139cbe627b8 ("cma: fix counting of isolated pages") free
pages in isolated pageblocks are not accounted to NR_FREE_PAGES counters,
so watermarks check is not required if one operates on a free page in
isolated pageblock.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Rientjes [Fri, 9 Nov 2012 03:04:34 +0000 (14:04 +1100)]
mm, oom: fix race when specifying a thread as the oom origin
test_set_oom_score_adj() and compare_swap_oom_score_adj() are used to
specify that current should be killed first if an oom condition occurs in
between the two calls.
The usage is
short oom_score_adj = test_set_oom_score_adj(OOM_SCORE_ADJ_MAX);
...
compare_swap_oom_score_adj(OOM_SCORE_ADJ_MAX, oom_score_adj);
to store the thread's oom_score_adj, temporarily change it to the maximum
score possible, and then restore the old value if it is still the same.
This happens to still be racy, however, if the user writes
OOM_SCORE_ADJ_MAX to /proc/pid/oom_score_adj in between the two calls.
The compare_swap_oom_score_adj() will then incorrectly reset the old value
prior to the write of OOM_SCORE_ADJ_MAX.
To fix this, introduce a new oom_flags_t member in struct signal_struct
that will be used for per-thread oom killer flags. KSM and swapoff can
now use a bit in this member to specify that threads should be killed
first in oom conditions without playing around with oom_score_adj.
This also allows the correct oom_score_adj to always be shown when reading
/proc/pid/oom_score.
Signed-off-by: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Rientjes [Fri, 9 Nov 2012 03:04:33 +0000 (14:04 +1100)]
mm, oom: change type of oom_score_adj to short
The maximum oom_score_adj is 1000 and the minimum oom_score_adj is -1000,
so this range can be represented by the signed short type with no
functional change. The extra space this frees up in struct signal_struct
will be used for per-thread oom kill flags in the next patch.
Signed-off-by: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Randy Dunlap [Fri, 9 Nov 2012 03:04:32 +0000 (14:04 +1100)]
mm: fix slab.c kernel-doc warnings
Fix new kernel-doc warnings in mm/slab.c:
Warning(mm/slab.c:2358): No description found for parameter 'cachep'
Warning(mm/slab.c:2358): Excess function parameter 'name' description in '__kmem_cache_create'
Warning(mm/slab.c:2358): Excess function parameter 'size' description in '__kmem_cache_create'
Warning(mm/slab.c:2358): Excess function parameter 'align' description in '__kmem_cache_create'
Warning(mm/slab.c:2358): Excess function parameter 'ctor' description in '__kmem_cache_create'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rafael Aquini [Fri, 9 Nov 2012 03:04:32 +0000 (14:04 +1100)]
mm: introduce putback_movable_pages()
The patch "mm: introduce compaction and migration for virtio ballooned
pages" hacks around putback_lru_pages() in order to allow ballooned pages
to be re-inserted on balloon page list as if a ballooned page was like a
LRU page.
As ballooned pages are not legitimate LRU pages, this patch introduces
putback_movable_pages() to properly cope with cases where the isolated
pageset contains ballooned pages and LRU pages, thus fixing the mentioned
inelegant hack around putback_lru_pages().
Signed-off-by: Rafael Aquini <aquini@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andi Kleen <andi@firstfloor.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rafael Aquini [Fri, 9 Nov 2012 03:04:32 +0000 (14:04 +1100)]
virtio_balloon: introduce migration primitives to balloon pages
Memory fragmentation introduced by ballooning might reduce significantly
the number of 2MB contiguous memory blocks that can be used within a
guest, thus imposing performance penalties associated with the reduced
number of transparent huge pages that could be used by the guest workload.
Besides making balloon pages movable at allocation time and introducing
the necessary primitives to perform balloon page migration/compaction,
this patch also introduces the following locking scheme, in order to
enhance the syncronization methods for accessing elements of struct
virtio_balloon, thus providing protection against concurrent access
introduced by parallel memory migration threads.
- balloon_lock (mutex) : synchronizes the access demand to elements of
struct virtio_balloon and its queue operations;
Signed-off-by: Rafael Aquini <aquini@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andi Kleen <andi@firstfloor.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rafael Aquini [Fri, 9 Nov 2012 03:04:31 +0000 (14:04 +1100)]
mm: introduce compaction and migration for ballooned pages
Memory fragmentation introduced by ballooning might reduce significantly
the number of 2MB contiguous memory blocks that can be used within a
guest, thus imposing performance penalties associated with the reduced
number of transparent huge pages that could be used by the guest workload.
This patch introduces the helper functions as well as the necessary
changes to teach compaction and migration bits how to cope with pages
which are part of a guest memory balloon, in order to make them movable by
memory compaction procedures.
Signed-off-by: Rafael Aquini <aquini@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andi Kleen <andi@firstfloor.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rafael Aquini [Fri, 9 Nov 2012 03:04:30 +0000 (14:04 +1100)]
mm: introduce a common interface for balloon pages mobility
Memory fragmentation introduced by ballooning might reduce significantly
the number of 2MB contiguous memory blocks that can be used within a
guest, thus imposing performance penalties associated with the reduced
number of transparent huge pages that could be used by the guest workload.
This patch introduces a common interface to help a balloon driver on
making its page set movable to compaction, and thus allowing the system to
better leverage the compation efforts on memory defragmentation.
Signed-off-by: Rafael Aquini <aquini@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andi Kleen <andi@firstfloor.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rafael Aquini [Fri, 9 Nov 2012 03:04:30 +0000 (14:04 +1100)]
mm: redefine address_space.assoc_mapping
Overhaul struct address_space.assoc_mapping renaming it to
address_space.private_data and its type is redefined to void*. By this
approach we consistently name the .private_* elements from struct
address_space as well as allow extended usage for address_space
association with other data structures through ->private_data.
Also, all users of old ->assoc_mapping element are converted to reflect
its new name and type change.
Signed-off-by: Rafael Aquini <aquini@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andi Kleen <andi@firstfloor.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Memory fragmentation introduced by ballooning might reduce significantly
the number of 2MB contiguous memory blocks that can be used within a
guest, thus imposing performance penalties associated with the reduced
number of transparent huge pages that could be used by the guest workload.
This patchset follows the main idea discussed at 2012 LSFMMS session:
"Ballooning for transparent huge pages" -- http://lwn.net/Articles/490114/
to introduce the required changes to the virtio_balloon driver, as well as
the changes to the core compaction & migration bits, in order to make
those subsystems aware of ballooned pages and allow memory balloon pages
become movable within a guest, thus avoiding the aforementioned
fragmentation issue
Following are numbers that prove this patch benefits on allowing
compaction to be more effective at memory ballooned guests.
Results for STRESS-HIGHALLOC benchmark, from Mel Gorman's mmtests suite,
running on a 4gB RAM KVM guest which was ballooning 512mB RAM in 64mB
chunks, at every minute (inflating/deflating), while test was running:
Introduce MIGRATEPAGE_SUCCESS as the default return code for
address_space_operations.migratepage() method and documents the expected
return code for the same method in failure cases.
Signed-off-by: Rafael Aquini <aquini@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andi Kleen <andi@firstfloor.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Fix the x86-64 cache alignment code to take pgoff into account. Use the
x86 and MIPS cache alignment code as the basis for a generic cache
alignment function.
The old x86 code will always align the mmap to aliasing boundaries,
even if the program mmaps the file with a non-zero pgoff.
If program A mmaps the file with pgoff 0, and program B mmaps the file
with pgoff 1. The old code would align the mmaps, resulting in misaligned
pages:
A: 0123
B: 123
After this patch, they are aligned so the pages line up:
A: 0123
B: 123
Proposed by Rik van Riel.
Signed-off-by: Michel Lespinasse <walken@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>