Yauhen Kharuzhy [Fri, 16 Dec 2011 04:50:24 +0000 (15:50 +1100)]
drivers/rtc/rtc-mxc.c: fix setting time for MX1 SoC
There is no way to track year in the i.MX1 RTC: Days Counter register is
9-bit wide only. Attempt to save date after 1970-01-01 plus 512 days
causes endless loop in mxc_rtc_set_mmss(). Fix this by resetting year to
1970.
Signed-off-by: Yauhen Kharuzhy <jekhor@gmail.com> Cc: Daniel Mack <daniel@caiaq.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
YanHong [Fri, 16 Dec 2011 04:50:23 +0000 (15:50 +1100)]
init/do_mounts.c: create /root if it does not exist
If someone supplies an initramfs without /root in it, and we fail to
execute rdinit, we will try to mount root device and fail, for the mount
point does not exits.
But we get error message "VFS: Cannot open root device". It's confusing.
We can give a more detailed error message, or we can go further: if /root
does not exit, create it.
Signed-off-by: YanHong <tempname2@hotmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Woody Suwalski <terraluna977@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Daney [Fri, 16 Dec 2011 04:50:23 +0000 (15:50 +1100)]
MIPS: randomize PIE load address
... by selecting ARCH_BINFMT_ELF_RANDOMIZE_PIE
Signed-off-by: David Daney <david.daney@cavium.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Daney [Fri, 16 Dec 2011 04:50:22 +0000 (15:50 +1100)]
fs: binfmt_elf: create Kconfig variable for PIE randomization
Randomization of PIE load address is hard coded in binfmt_elf.c for X86
and ARM. Create a new Kconfig variable
(CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE) for this and use it instead. Thus
architecture specific policy is pushed out of the generic binfmt_elf.c and
into the architecture Kconfig files.
X86 and ARM Kconfigs are modified to select the new variable so there is
no change in behavior. A follow on patch will select it for MIPS too.
Signed-off-by: David Daney <david.daney@cavium.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jason Baron [Fri, 16 Dec 2011 04:50:22 +0000 (15:50 +1100)]
epoll: limit paths
The current epoll code can be tickled to run basically indefinitely in
both loop detection path check (on ep_insert()), and in the wakeup paths.
The programs that tickle this behavior set up deeply linked networks of
epoll file descriptors that cause the epoll algorithms to traverse them
indefinitely. A couple of these sample programs have been previously
posted in this thread: https://lkml.org/lkml/2011/2/25/297.
To fix the loop detection path check algorithms, I simply keep track of
the epoll nodes that have been already visited. Thus, the loop detection
becomes proportional to the number of epoll file descriptor and links.
This dramatically decreases the run-time of the loop check algorithm. In
one diabolical case I tried it reduced the run-time from 15 mintues (all
in kernel time) to .3 seconds.
Fixing the wakeup paths could be done at wakeup time in a similar manner
by keeping track of nodes that have already been visited, but the
complexity is harder, since there can be multiple wakeups on different
cpus...Thus, I've opted to limit the number of possible wakeup paths when
the paths are created.
This is accomplished, by noting that the end file descriptor points that
are found during the loop detection pass (from the newly added link), are
actually the sources for wakeup events. I keep a list of these file
descriptors and limit the number and length of these paths that emanate
from these 'source file descriptors'. In the current implemetation I
allow 1000 paths of length 1, 500 of length 2, 100 of length 3, 50 of
length 4 and 10 of length 5. Note that it is sufficient to check the
'source file descriptors' reachable from the newly added link, since no
other 'source file descriptors' will have newly added links. This allows
us to check only the wakeup paths that may have gotten too long, and not
re-check all possible wakeup paths on the system.
In terms of the path limit selection, I think its first worth noting that
the most common case for epoll, is probably the model where you have 1
epoll file descriptor that is monitoring n number of 'source file
descriptors'. In this case, each 'source file descriptor' has a 1 path of
length 1. Thus, I believe that the limits I'm proposing are quite
reasonable and in fact may be too generous. Thus, I'm hoping that the
proposed limits will not prevent any workloads that currently work to
fail.
In terms of locking, I have extended the use of the 'epmutex' to all
epoll_ctl add and remove operations. Currently its only used in a subset
of the add paths. I need to hold the epmutex, so that we can correctly
traverse a coherent graph, to check the number of paths. I believe that
this additional locking is probably ok, since its in the setup/teardown
paths, and doesn't affect the running paths, but it certainly is going to
add some extra overhead. Also, worth noting is that the epmuex was
recently added to the ep_ctl add operations in the initial path loop
detection code using the argument that it was not on a critical path.
Another thing to note here, is the length of epoll chains that is allowed.
Currently, eventpoll.c defines:
/* Maximum number of nesting allowed inside epoll sets */
#define EP_MAX_NESTS 4
This basically means that I am limited to a graph depth of 5 (EP_MAX_NESTS
+ 1). However, this limit is currently only enforced during the loop
check detection code, and only when the epoll file descriptors are added
in a certain order. Thus, this limit is currently easily bypassed. The
newly added check for wakeup paths, stricly limits the wakeup paths to a
length of 5, regardless of the order in which ep's are linked together.
Thus, a side-effect of the new code is a more consistent enforcement of
the graph depth.
Thus far, I've tested this, using the sample programs previously
mentioned, which now either return quickly or return -EINVAL. I've also
testing using the piptest.c epoll tester, which showed no difference in
performance. I've also created a number of different epoll networks and
tested that they behave as expectded.
I believe this solves the original diabolical test cases, while still
preserving the sane epoll nesting.
Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Nelson Elhage <nelhage@ksplice.com> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joakim Tjernlund [Fri, 16 Dec 2011 04:50:22 +0000 (15:50 +1100)]
crc32: optimize inner loop
Taking a pointer reference to each row in the crc table matrix, one can
reduce the inner loop with a few insn's
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Cc: Bob Pearson <rpearson@systemfabricworks.com> Cc: Frank Zago <fzago@systemfabricworks.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Whitcroft [Fri, 16 Dec 2011 04:50:21 +0000 (15:50 +1100)]
checkpatch: catch all occurences of type and cast spacing errors per line
Fix up type and cast spacing checks such that all occurences on a line are
examined and reported. For example the line below has a valid cast and a
bad type, but currently we check the cast first which is good and stop:
u16* bar = (u16 *)baz;
We will also only report one of the errors in this example:
u16* bar = (u16*)bad;
Move to iterating across all casts and all types, reporting any failure.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Whitcroft [Fri, 16 Dec 2011 04:50:18 +0000 (15:50 +1100)]
checkpatch: only apply kconfig help checks for options which prompt
The intent of this check is to catch the options which the user will see
and ensure they are properly described. It is also common for internal
only options to have a brief description. Allow this form.
Reported-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andy Whitcroft <apw@canonical.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Whitcroft [Fri, 16 Dec 2011 04:50:18 +0000 (15:50 +1100)]
checkpatch: optimise statement scanner when mid-statement
In the middle of a long definition or similar, there is no possibility of
finding a smaller sub-statement. Optimise this case by skipping statement
aquirey where there are no starts of statement (open brace '{' or
semi-colon ';'). We are likely to scan slightly more than needed still
but this is safest.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Whitcroft [Fri, 16 Dec 2011 04:50:18 +0000 (15:50 +1100)]
checkpatch: ## is not a valid modifier
Inserting a # into the modifiers list will incorrectly add the null string
to the modifiers list, leading to an infinite loop. As neither of these
is a valid modifier form simply ignore them.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joe Perches [Fri, 16 Dec 2011 04:50:17 +0000 (15:50 +1100)]
checkpatch: improve memset and min/max with cast checking
Improve the checking of arguments to memset and min/max tests.
Move the checking of min/max to statement blocks instead of single line.
Change $Constant to allow any case type 0x initiator and trailing ul
specifier. Add $FuncArg type as any function argument with or without a
cast. Print the whole statement when showing memset or min/max messages.
Improve the memset with 0 as 3rd argument error message.
There are still weaknesses in the $FuncArg and $Constant code as arbitrary
parentheses and negative signs are not generically supported.
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Whitcroft [Fri, 16 Dec 2011 04:50:17 +0000 (15:50 +1100)]
checkpatch: check for common memset parameter issues against statments
Move the memset checks over to work against the statement. Also add
checks for 0 and 1 used as lengths. Generally these indicate badly
ordered parameters.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Whitcroft [Fri, 16 Dec 2011 04:50:16 +0000 (15:50 +1100)]
checkpatch: correctly track the end of preprocessor commands in context
When looking for a statement we currently run on through preprocessor
commands. This means that a header file with just definitions is parsed
over and over again combining all of the lines from the current line to
the end of file leading to severe performance issues.
Fix up context accumulation to track preprocessor commands and stop when
reaching the end of them. At the same time vastly simplify the #define
handling.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joe Perches [Fri, 16 Dec 2011 04:50:15 +0000 (15:50 +1100)]
checkpatch: update signature "might be better as" warning
email header lines can look like signature tags. It's valid to have
multiple email recipients on a single line but not valid to have multiple
signatures on a single line.
Validate signatures only when not in the email headers.
Clear the $in_commit_log flag when the patch filename appears.
Add '-' to the valid chars in a message header for headers
like "Message-Id:" and "In-Reply-To:".
Signed-off-by: Joe Perches <joe@perches.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Axel Lin [Fri, 16 Dec 2011 04:50:15 +0000 (15:50 +1100)]
drivers/leds/leds-mc13783.c: fix off-by-one for checking num_leds
The LED id begins from 0. Thus the maximum number of leds should be
MC13783_LED_MAX + 1.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Philippe Retornaz <philippe.retornaz@epfl.ch> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: NeilBrown <neilb@suse.de>
WARNING: please write a paragraph that describes the config symbol fully
#31: FILE: drivers/leds/Kconfig:394:
+ help
WARNING: line over 80 characters
#69: FILE: drivers/leds/leds-tca6507.c:16:
+ * each 8msec that the led is 'on'. The levels are named MASTER, BANK0 and BANK1.
WARNING: line over 80 characters
#71: FILE: drivers/leds/leds-tca6507.c:18:
+ * There are two different blink rates that can be programmed, each with separate
WARNING: line over 80 characters
#75: FILE: drivers/leds/leds-tca6507.c:22:
+ * This drivers does not support double-blink so 'second-off' always matches 'off'.
WARNING: line over 80 characters
#93: FILE: drivers/leds/leds-tca6507.c:40:
+ * brightness is used. As 'full' is always available, the worst case would be to
WARNING: line over 80 characters
#97: FILE: drivers/leds/leds-tca6507.c:44:
+ * Each bank (BANK0 and BANK1) have two usage counts - Leds using the brightness and
WARNING: line over 80 characters
#102: FILE: drivers/leds/leds-tca6507.c:49:
+ * there is a flag saying if it was explicitly requested or defaulted. Similarly
WARNING: line over 80 characters
#103: FILE: drivers/leds/leds-tca6507.c:50:
+ * the banks know if each time was explicit or a default. Defaults are permitted
ERROR: open brace '{' following function declarations go on the next line
#170: FILE: drivers/leds/leds-tca6507.c:117:
+static inline int TO_LEVEL(int brightness) {
ERROR: open brace '{' following function declarations go on the next line
#174: FILE: drivers/leds/leds-tca6507.c:121:
+static inline int TO_BRIGHT(int level) {
WARNING: line over 80 characters
#203: FILE: drivers/leds/leds-tca6507.c:150:
+ int blink; /* 1 if we are hardware-blinking */
WARNING: line over 80 characters
#222: FILE: drivers/leds/leds-tca6507.c:169:
+ * the first pair so there is more change-time visible (i.e. it is softer).
ERROR: space required before the open parenthesis '('
#300: FILE: drivers/leds/leds-tca6507.c:247:
+ switch(bank) {
total: 3 errors, 10 warnings, 732 lines checked
./patches/leds-add-driver-for-tca6507-led-controller.patch has style problems, please review.
If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Please run checkpatch prior to sending patches
Cc: NeilBrown <neilb@suse.de> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Axel Lin [Fri, 16 Dec 2011 04:50:14 +0000 (15:50 +1100)]
drivers/leds/leds-netxbig.c: use gpio_request_one()
Use gpio_request_one() instead of multiple gpiolib calls. This also
simplifies error handling a bit.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Simon Guinot <sguinot@lacie.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Axel Lin [Fri, 16 Dec 2011 04:50:13 +0000 (15:50 +1100)]
drivers/leds/leds-bd2802.c: use gpio_request_one()
Use gpio_request_one() instead of multiple gpiolib calls.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Kim Kyuwon <q1.kim@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Samu Onkalo <samu.p.onkalo@nokia.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Axel Lin [Fri, 16 Dec 2011 04:50:13 +0000 (15:50 +1100)]
leds: convert leds-dac124s085 to module_spi_driver
Factor out some boilerplate code for spi driver registration into
module_spi_driver.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Haojian Zhuang <hzhuang1@marvell.com> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Michael Hennerich <hennerich@blackfin.uclinux.org> Cc: Mike Rapoport <mike@compulab.co.il> Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Axel Lin [Fri, 16 Dec 2011 04:50:12 +0000 (15:50 +1100)]
leds: convert led i2c drivers to module_i2c_driver
Factor out some boilerplate code for i2c driver registration
into module_i2c_driver.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Haojian Zhuang <hzhuang1@marvell.com> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Michael Hennerich <hennerich@blackfin.uclinux.org> Cc: Mike Rapoport <mike@compulab.co.il> Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Axel Lin [Fri, 16 Dec 2011 04:50:12 +0000 (15:50 +1100)]
leds: convert led platform drivers to module_platform_driver
Factor out some boilerplate code for platform driver registration into
module_platform_driver.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Haojian Zhuang <hzhuang1@marvell.com> [led-88pm860x.c] Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Michael Hennerich <hennerich@blackfin.uclinux.org> Cc: Mike Rapoport <mike@compulab.co.il> Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 16 Dec 2011 04:50:11 +0000 (15:50 +1100)]
drivers/video/backlight/ep93xx_bl.c: remove duplicated header include
module.h is included twice.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Ryan Mallon <rmallon@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Donghwa Lee [Fri, 16 Dec 2011 04:50:11 +0000 (15:50 +1100)]
backlight/ld9040.c: regulator control in the driver
This patch supports regulator power control in the driver. Current ld9040
driver was controlled power on/off sequence by callback function in the
board file. But, by doing this, there's no need to register lcd power
on/off callback function in the board file.
Signed-off-by: Donghwa Lee <dh09.lee@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Axel Lin [Fri, 16 Dec 2011 04:50:11 +0000 (15:50 +1100)]
backlight: convert drivers/video/backlight/* to use module_platform_driver()
Convert the drivers in drivers/video/backlight/* to use the
module_platform_driver() macro which makes the code smaller and a bit
simpler.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com> [ep93xx_bl.c] Cc: Mike Rapoport <mike@compulab.co.il> Cc: Richard Purdie <rpurdie@rpsys.net> Acked-by: Michael Hennerich <michael.hennerich@analog.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Paul Bolle [Fri, 16 Dec 2011 04:50:10 +0000 (15:50 +1100)]
backlight: remove ADX backlight device support
Support for the Avionic Design Xanthos backlight device got added in
commit 3b96ea9ef8 ("backlight: Add support for the Avionic Design Xanthos
backlight device."). That support depends on ARCH_PXA_ADX. The code that
should have provided that Kconfig symbol never got submitted. It has
never been possible to even build this driver. Remove it.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Acked-by: Thierry Reding <thierry.reding@avionic-design.de> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kyungmin Park [Fri, 16 Dec 2011 04:50:10 +0000 (15:50 +1100)]
devfreq: add devfreq maintainer entry
As devfreq is merged at mainline. Also update the maintainer entry.
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Cc: Kevin Hilman <khilman@ti.com> Cc: MyungJoo Ham <myungjoo.ham@samsung.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joe Perches [Fri, 16 Dec 2011 04:50:08 +0000 (15:50 +1100)]
MAINTAINERS: update tulip F: patterns
commit a88394cfb58 ("ewrk3/tulip: Move the DEC - Tulip drivers") moved the
files, update the patterns.
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Grant Grundler <grundler@parisc-linux.org> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Tobias Ringstrom <tori@unhappy.mine.nu> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: David Davies <davies@maniac.ultranet.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joe Perches [Fri, 16 Dec 2011 04:50:08 +0000 (15:50 +1100)]
MAINTAINERS: update sdhci F: patterns
commit 38576af1f8c ("mmc: sdhci: make sdhci-of device drivers self
registered") moved the files around. Update the patterns.
Signed-off-by: Joe Perches <joe@perches.com> Cc: Shawn Guo <shawn.guo@linaro.org> Cc: Chris Ball <cjb@laptop.org> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joe Perches [Fri, 16 Dec 2011 04:50:07 +0000 (15:50 +1100)]
MAINTAINERS: update bt8xx gpio F: patterns
Commit c103de240439d ("gpio: reorganize drivers") renamed the file, update
the pattern.
Signed-off-by: Joe Perches <joe@perches.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Michael Buesch <m@bues.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joe Perches [Fri, 16 Dec 2011 04:50:06 +0000 (15:50 +1100)]
MAINTAINERS: update adp gpio F: patterns
Commit c103de240439df ("gpio: reorganize drivers") renamed the files,
update the patterns.
Signed-off-by: Joe Perches <joe@perches.com> Cc: Grant Likely <grant.likely@secretlab.ca> Acked-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ian Campbell [Fri, 16 Dec 2011 04:50:06 +0000 (15:50 +1100)]
get_maintainers.pl: follow renames when looking up commit signers
I happen to have had a commit to various network drivers since the big
renaming/reorg which happened to drivers/net recently. This means that I
now appear to be in the top few commit signers (by %age) for many of them
so am getting sent all sorts of stuff and people who are involved with the
driver are not. e.g. (to pick one at random):
$ ./scripts/get_maintainer.pl -f drivers/net/ethernet/nvidia/forcedeth.c
"David S. Miller" <davem@davemloft.net> (commit_signer:5/7=71%)
Ian Campbell <ian.campbell@citrix.com> (commit_signer:2/7=29%)
Eric Dumazet <eric.dumazet@gmail.com> (commit_signer:1/7=14%)
Jeff Kirsher <jeffrey.t.kirsher@intel.com> (commit_signer:1/7=14%)
Jiri Pirko <jpirko@redhat.com> (commit_signer:1/7=14%)
netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
linux-kernel@vger.kernel.org (open list)
With the following patch the renames are followed and the result appears
much more sensible:
Andi Kleen [Fri, 16 Dec 2011 04:50:04 +0000 (15:50 +1100)]
brlocks/lglocks: clean up code
lglocks and brlocks are currently generated with some complicated macros
in lglock.h. But there's no reason I can see to not just use common
utility functions that get pointers to the lglock.
Since there are at least two users it makes sense to share this code in a
library.
This will also make it later possible to dynamically allocate lglocks.
In general the users now look more like normal function calls with
pointers, not magic macros.
The patch is rather large because I move over all users in one go to keep
it bisectable. This impacts the VFS somewhat in terms of lines changed.
But no actual behaviour change.
Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jesper Juhl [Fri, 16 Dec 2011 04:50:04 +0000 (15:50 +1100)]
audit: always follow va_copy() with va_end()
A call to va_copy() should always be followed by a call to va_end() in the
same function. In kernel/autit.c::audit_log_vformat() this is not always
done. This patch makes sure va_end() is always called.
Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Heiko Carstens [Fri, 16 Dec 2011 04:50:03 +0000 (15:50 +1100)]
mm,slub,x86: decouple size of struct page from CONFIG_CMPXCHG_LOCAL
While implementing cmpxchg_double() on s390 I realized that we don't set
CONFIG_CMPXCHG_LOCAL besides the fact that we have support for it.
However setting that option will increase the size of struct page by eight
bytes on 64 bit, which we certainly do not want. Also, it doesn't make
sense that a present cpu feature should increase the size of struct page.
Besides that it looks like the dependency to CMPXCHG_LOCAL is wrong and
that it should depend on CMPXCHG_DOUBLE instead.
This patch:
If an architecture supports CMPXCHG_LOCAL this shouldn't result
automatically in larger struct pages if the SLUB allocator is used.
Instead introduce a new config option "HAVE_ALIGNED_STRUCT_PAGE" which can
be selected if a double word aligned struct page is required. Also update
x86 Kconfig so that it should work as before.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
WARNING: please, no spaces at the start of a line
#57: FILE: arch/m68k/amiga/config.c:515:
+ __noreturn;$
total: 0 errors, 1 warnings, 106 lines checked
./patches/treewide-convert-uses-of-attrib_noreturn-to-__noreturn.patch has style problems, please review.
If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Please run checkpatch prior to sending patches
Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Shaohua Li [Fri, 16 Dec 2011 04:49:59 +0000 (15:49 +1100)]
intel_idle: fix API misuse
smp_call_function() only lets all other CPUs execute a specific function,
while we expect all CPUs do in intel_idle. Without the fix, we could have
one cpu which has auto_demotion enabled or has no boradcast timer setup.
Usually we don't see impact because auto demotion just harms power and the
intel_idle init is called in CPU 0, where boradcast timer delivers
interrupt, but this still could be a problem.
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Magnus Lynch [Fri, 16 Dec 2011 04:49:59 +0000 (15:49 +1100)]
hpet: factor timer allocate from open
The current implementation of the /dev/hpet driver couples opening the
device with allocating one of the (scarce) timers (aka comparators). This
is a limitation in that the main counter may be valuable to applications
seeking a high-resolution timer who have no use for the interrupt
generating functionality of the comparators.
This patch alters the open semantics so that when the device is opened, no
timer is allocated. Operations that depend on a timer being in context
implicitly attempt allocating a timer, to maintain backward compatibility.
There is also an IOCTL (HPET_ALLOC_TIMER _IO) added so that the
allocation may be done explicitly. (I prefer the explicit open then
allocate pattern but don't know how practical it would be to require all
existing code to be changed.)
/dev/hpet is accessed via mmap(). This is the only interface of /dev/hpet
that is actually used in practice.
[akpm@linux-foundation.org: coding-style tweaks]
[arnd@arndb.de: fix build] Signed-off-by: Magnus Lynch <maglyx@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Acked-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
tracepoint: add tracepoints for debugging oom_score_adj
oom_score_adj is used for guarding processes from OOM-Killer. One of
problem is that it's inherited at fork(). When a daemon set oom_score_adj
and make children, it's hard to know where the value is set.
This patch adds some tracepoints useful for debugging. This patch adds
3 trace points.
- creating new task
- renaming a task (exec)
- set oom_score_adj
To debug, users need to enable some trace pointer. Maybe filtering is useful as
KOSAKI Motohiro [Fri, 16 Dec 2011 04:49:57 +0000 (15:49 +1100)]
mm: simplify find_vma_prev()
commit 297c5eee37 ("mm: make the vma list be doubly linked") added the
vm_prev member to vm_area_struct. We can simplify find_vma_prev() by
using it. Also, this change helps to improve page fault performance
because it has stronger locality of reference.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Shaohua Li <shaohua.li@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrea Arcangeli [Fri, 16 Dec 2011 04:49:57 +0000 (15:49 +1100)]
mremap: enforce rmap src/dst vma ordering in case of vma_merge() succeeding in copy_vma()
migrate was doing an rmap_walk with speculative lock-less access on
pagetables. That could lead it to not serializing properly against mremap
PT locks. But a second problem remains in the order of vmas in the
same_anon_vma list used by the rmap_walk.
If vma_merge succeeds in copy_vma, the src vma could be placed after the
dst vma in the same_anon_vma list. That could still lead to migrate
missing some pte.
This patch adds an anon_vma_moveto_tail() function to force the dst vma at
the end of the list before mremap starts to solve the problem.
If the mremap is very large and there are a lots of parents or childs
sharing the anon_vma root lock, this should still scale better than taking
the anon_vma root lock around every pte copy practically for the whole
duration of mremap.
Update: Hugh noticed special care is needed in the error path where
move_page_tables goes in the reverse direction, a second
anon_vma_moveto_tail() call is needed in the error path.
This program exercises the anon_vma_moveto_tail:
===
int main()
{
static struct timeval oldstamp, newstamp;
long diffsec;
char *p, *p2, *p3, *p4;
if (posix_memalign((void **)&p, 2*1024*1024, SIZE))
perror("memalign"), exit(1);
if (posix_memalign((void **)&p2, 2*1024*1024, SIZE))
perror("memalign"), exit(1);
if (posix_memalign((void **)&p3, 2*1024*1024, SIZE))
perror("memalign"), exit(1);
Michal Hocko [Fri, 16 Dec 2011 04:49:56 +0000 (15:49 +1100)]
mm: fix off-by-two in __zone_watermark_ok()
88f5acf8 ("mm: page allocator: adjust the per-cpu counter threshold when
memory is low") changed the form how free_pages is calculated but it
forgot that we used to do free_pages - ((1 << order) - 1) so we ended up
with off-by-two when calculating free_pages.
Reported-by: Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Uwe Kleine-König [Fri, 16 Dec 2011 04:49:56 +0000 (15:49 +1100)]
bootmem: micro optimize freeing pages in bulk
The first entry of bdata->node_bootmem_map holds the data for
bdata->node_min_pfn up to bdata->node_min_pfn + BITS_PER_LONG - 1. So the
test for freeing all pages of a single map entry can be slightly relaxed.
Moreover use DIV_ROUND_UP in another place instead of open coding it.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Hillf Danton [Fri, 16 Dec 2011 04:49:56 +0000 (15:49 +1100)]
mm: compaction: push isolate search base of compact control one pfn ahead
After isolated the current pfn will no longer be scanned and isolated if
the next round is necessary, so push the isolate_migratepages search base
of the given compact_control one step ahead.
Signed-off-by: Hillf Danton <dhillf@gmail.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes Weiner [Fri, 16 Dec 2011 04:49:55 +0000 (15:49 +1100)]
Btrfs: pass __GFP_WRITE for buffered write page allocations
Tell the page allocator that pages allocated for a buffered write are
expected to become dirty soon.
Signed-off-by: Johannes Weiner <jweiner@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Shaohua Li <shaohua.li@intel.com> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes Weiner [Fri, 16 Dec 2011 04:49:55 +0000 (15:49 +1100)]
mm: filemap: pass __GFP_WRITE from grab_cache_page_write_begin()
Tell the page allocator that pages allocated through
grab_cache_page_write_begin() are expected to become dirty soon.
Signed-off-by: Johannes Weiner <jweiner@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Shaohua Li <shaohua.li@intel.com> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes Weiner [Fri, 16 Dec 2011 04:49:55 +0000 (15:49 +1100)]
mm: try to distribute dirty pages fairly across zones
The maximum number of dirty pages that exist in the system at any time is
determined by a number of pages considered dirtyable and a user-configured
percentage of those, or an absolute number in bytes.
This number of dirtyable pages is the sum of memory provided by all the
zones in the system minus their lowmem reserves and high watermarks, so
that the system can retain a healthy number of free pages without having
to reclaim dirty pages.
But there is a flaw in that we have a zoned page allocator which does not
care about the global state but rather the state of individual memory
zones. And right now there is nothing that prevents one zone from filling
up with dirty pages while other zones are spared, which frequently leads
to situations where kswapd, in order to restore the watermark of free
pages, does indeed have to write pages from that zone's LRU list. This
can interfere so badly with IO from the flusher threads that major
filesystems (btrfs, xfs, ext4) mostly ignore write requests from reclaim
already, taking away the VM's only possibility to keep such a zone
balanced, aside from hoping the flushers will soon clean pages from that
zone.
Enter per-zone dirty limits. They are to a zone's dirtyable memory what
the global limit is to the global amount of dirtyable memory, and try to
make sure that no single zone receives more than its fair share of the
globally allowed dirty pages in the first place. As the number of pages
considered dirtyable excludes the zones' lowmem reserves and high
watermarks, the maximum number of dirty pages in a zone is such that the
zone can always be balanced without requiring page cleaning.
As this is a placement decision in the page allocator and pages are
dirtied only after the allocation, this patch allows allocators to pass
__GFP_WRITE when they know in advance that the page will be written to and
become dirty soon. The page allocator will then attempt to allocate from
the first zone of the zonelist - which on NUMA is determined by the task's
NUMA memory policy - that has not exceeded its dirty limit.
At first glance, it would appear that the diversion to lower zones can
increase pressure on them, but this is not the case. With a full high
zone, allocations will be diverted to lower zones eventually, so it is
more of a shift in timing of the lower zone allocations. Workloads that
previously could fit their dirty pages completely in the higher zone may
be forced to allocate from lower zones, but the amount of pages that
"spill over" are limited themselves by the lower zones' dirty constraints,
and thus unlikely to become a problem.
For now, the problem of unfair dirty page distribution remains for NUMA
configurations where the zones allowed for allocation are in sum not big
enough to trigger the global dirty limits, wake up the flusher threads and
remedy the situation. Because of this, an allocation that could not
succeed on any of the considered zones is allowed to ignore the dirty
limits before going into direct reclaim or even failing the allocation,
until a future patch changes the global dirty throttling and flusher
thread activation so that they take individual zone states into account.
Test results
15M DMA + 3246M DMA32 + 504 Normal = 3765M memory
40% dirty ratio
16G USB thumb drive
10 runs of dd if=/dev/zero of=disk/zeroes bs=32k count=$((10 << 15))
Signed-off-by: Johannes Weiner <jweiner@redhat.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Acked-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Michal Hocko <mhocko@suse.cz> Tested-by: Wu Fengguang <fengguang.wu@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Shaohua Li <shaohua.li@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes Weiner [Fri, 16 Dec 2011 04:49:54 +0000 (15:49 +1100)]
mm: writeback: cleanups in preparation for per-zone dirty limits
The next patch will introduce per-zone dirty limiting functions in
addition to the traditional global dirty limiting.
Rename determine_dirtyable_memory() to global_dirtyable_memory() before
adding the zone-specific version, and fix up its documentation.
Also, move the functions to determine the dirtyable memory and the
function to calculate the dirty limit based on that together so that their
relationship is more apparent and that they can be commented on as a
group.
Signed-off-by: Johannes Weiner <jweiner@redhat.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Acked-by: Mel Gorman <mel@suse.de> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Shaohua Li <shaohua.li@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Mason <chris.mason@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <jweiner@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.cz> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Shaohua Li <shaohua.li@intel.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes Weiner [Fri, 16 Dec 2011 04:49:54 +0000 (15:49 +1100)]
mm: exclude reserved pages from dirtyable memory
Per-zone dirty limits try to distribute page cache pages allocated for
writing across zones in proportion to the individual zone sizes, to reduce
the likelihood of reclaim having to write back individual pages from the
LRU lists in order to make progress.
This patch:
The amount of dirtyable pages should not include the full number of free
pages: there is a number of reserved pages that the page allocator and
kswapd always try to keep free.
The closer (reclaimable pages - dirty pages) is to the number of reserved
pages, the more likely it becomes for reclaim to run into dirty pages:
This patch introduces a per-zone dirty reserve that takes both the lowmem
reserve as well as the high watermark of the zone into account, and a
global sum of those per-zone values that is subtracted from the global
amount of dirtyable pages. The lowmem reserve is unavailable to page
cache allocations and kswapd tries to keep the high watermark free. We
don't want to end up in a situation where reclaim has to clean pages in
order to balance zones.
Not treating reserved pages as dirtyable on a global level is only a
conceptual fix. In reality, dirty pages are not distributed equally
across zones and reclaim runs into dirty pages on a regular basis.
But it is important to get this right before tackling the problem on a
per-zone level, where the distance between reclaim and the dirty pages is
mostly much smaller in absolute numbers.
Signed-off-by: Johannes Weiner <jweiner@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Shaohua Li <shaohua.li@intel.com> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Rientjes [Fri, 16 Dec 2011 04:49:53 +0000 (15:49 +1100)]
mm, debug: test for online nid when allocating on single node
Calling alloc_pages_exact_node() means the allocation only passes the
zonelist of a single node into the page allocator. If that node isn't
online, it's zonelist may never have been initialized causing a strange
oops that may not immediately be clear.
I recently debugged an issue where node 0 wasn't online and an allocator
was passing 0 to alloc_pages_exact_node() and it resulted in a NULL
pointer on zonelist->_zoneref. If CONFIG_DEBUG_VM is enabled, though, it
would be nice to catch this a bit earlier.
Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Shawn Bohrer [Fri, 16 Dec 2011 04:49:52 +0000 (15:49 +1100)]
fadvise: only initiate writeback for specified range with FADV_DONTNEED
Previously POSIX_FADV_DONTNEED would start writeback for the entire file
when the bdi was not write congested. This negatively impacts performance
if the file contians dirty pages outside of the requested range. This
change uses __filemap_fdatawrite_range() to only initiate writeback for
the requested range.
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Acked-by: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Disable slub debug facilities and allocate slabs at minimal order when
debug_guardpage_minorder > 0 to increase probability to catch random
memory corruption by cpu exception.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When debugging with CONFIG_DEBUG_PAGEALLOC and debug_guardpage_minorder >
0, we have lot of free pages that are not marked so. Snapshot code
account them as savable, what cause hibernate memory preallocation
failure.
It is pretty hard to make hibernate allocation succeed with
debug_guardpage_minorder=1. This change at least make it possible when
system has relatively big amount of RAM.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
With CONFIG_DEBUG_PAGEALLOC configured, the CPU will generate an exception
on access (read,write) to an unallocated page, which permits us to catch
code which corrupts memory. However the kernel is trying to maximise
memory usage, hence there are usually few free pages in the system and
buggy code usually corrupts some crucial data.
This patch changes the buddy allocator to keep more free/protected pages
and to interlace free/protected and allocated pages to increase the
probability of catching corruption.
When the kernel is compiled with CONFIG_DEBUG_PAGEALLOC,
debug_guardpage_minorder defines the minimum order used by the page
allocator to grant a request. The requested size will be returned with
the remaining pages used as guard pages.
The default value of debug_guardpage_minorder is zero: no change from
current behaviour.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Daney [Fri, 16 Dec 2011 04:49:51 +0000 (15:49 +1100)]
hugetlb: replace BUG() with BUILD_BUG() for dummy definitions
The file linux/hugetlb.h has many places where dummy symbols were defined
so that the main source code would contain fewer:
#ifdef CONFIG_HUGETLBFS
or
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
If there were any misuse of these symbols, the only symptom would be an
OOPS at runtime. Change the BUG() to BUILD_BUG() to catch any such abuse
at compile time instead.
Signed-off-by: David Daney <david.daney@cavium.com> Cc: David Rientjes <rientjes@google.com> Cc: DM <dm.n9107@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Daney [Fri, 16 Dec 2011 04:49:50 +0000 (15:49 +1100)]
kernel.h: Add BUILD_BUG() macro.
We can place this in definitions that we expect the compiler to remove
by dead code elimination. If this assertion fails, we get a nice
error message at build time.
The GCC function attribute error("message") was added in version 4.3,
so we define a new macro __linktime_error(message) to expand to this
for GCC-4.3 and later. This will give us an error diagnostic from the
compiler on the line that fails. For other compilers
__linktime_error(message) expands to nothing, and we have to be
content with a link time error, but at least we will still get a build
error.
BUILD_BUG() expands to the undefined function __build_bug_failed() and
will fail at link time if the compiler ever emits code for it. On
GCC-4.3 and later, attribute((error())) is used so that the failure
will be noted at compile time instead.
Acked-by: David Howells <dhowells@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: David Daney <david.daney@cavium.com> Cc: David Rientjes <rientjes@google.com> Cc: DM <dm.n9107@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Daney [Fri, 16 Dec 2011 04:49:50 +0000 (15:49 +1100)]
kernel.h: add BUILD_BUG() macro
We can place this in definitions that we expect the compiler to remove by
dead code elimination. If this assertion fails, we get a nice error
message at build time.
The GCC function attribute error("message") was added in version 4.3, so
we define a new macro __linktime_error(message) to expand to this for
GCC-4.3 and later. This will give us an error diagnostic from the
compiler on the line that fails. For other compilers
__linktime_error(message) expands to nothing, and we have to be content
with a link time error, but at least we will still get a build error.
BUILD_BUG() expands to the undefined function __build_bug_failed() and
will fail at link time if the compiler ever emits code for it. On GCC-4.3
and later, attribute((error())) is used so that the failure will be noted
at compile time instead.
Signed-off-by: David Daney <david.daney@cavium.com> Acked-by: David Rientjes <rientjes@google.com> Cc: DM <dm.n9107@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/hugetlb.c: fix virtual address handling in hugetlb fault
handle_mm_fault() passes 'faulted' address to hugetlb_fault(). This
address is not aligned to a hugepage boundary.
Most of the functions for hugetlb pages are aware of that and calculate an
alignment themselves. However some functions such as
copy_user_huge_page() and clear_huge_page() don't handle alignment by
themselves.
This patch make hugeltb_fault() fix the alignment and pass an aligned
addresss (to address of a faulted hugepage) to functions.