git.karo-electronics.de Git - karo-tx-linux.git/log
11 years agofb: hx8357: Use static arrays for LCD configuration
Maxime Ripard [Wed, 20 Feb 2013 02:15:29 +0000 (13:15 +1100)]
fb: hx8357: Use static arrays for LCD configuration

This allows smaller and less error-prone code by using static arrays
and the ARRAY_SIZE macro.
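
As a rough userspace illustration of the idea (not the driver's actual
code): the macro below mirrors the spirit of the kernel's ARRAY_SIZE, while
the sequence bytes, fake_write() and send_power_seq() are hypothetical
placeholders invented for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Element count of a true array, in the spirit of the kernel's ARRAY_SIZE(). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical command sequence; the byte values are placeholders,
 * not real HX8357 register settings. */
static const uint8_t seq_power[] = { 0xB1, 0x44, 0x41, 0x06 };

/* Stand-in for a bus write: it only sums the bytes it receives, so a
 * caller can check that the whole array was passed along. */
static unsigned int fake_write(const uint8_t *buf, size_t len)
{
	unsigned int sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

static unsigned int send_power_seq(void)
{
	/* The length follows the initializer; no hand-counted constant. */
	return fake_write(seq_power, ARRAY_SIZE(seq_power));
}
```

Adding or removing a byte in the initializer changes the length passed to
the write helper automatically, which is where the "less error-prone" part
comes from.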

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agofb: hx8357: Remove trailing period
Maxime Ripard [Wed, 20 Feb 2013 02:15:29 +0000 (13:15 +1100)]
fb: hx8357: Remove trailing period

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agofb: hx8357: Remove useless error message
Maxime Ripard [Wed, 20 Feb 2013 02:15:28 +0000 (13:15 +1100)]
fb: hx8357: Remove useless error message

In case of a failing allocation, a stack dump will be printed anyway, so
the dev_err is redundant.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agofb: hx8357: Fix inverted parameters for kcalloc
Maxime Ripard [Wed, 20 Feb 2013 02:15:28 +0000 (13:15 +1100)]
fb: hx8357: Fix inverted parameters for kcalloc

The element size and the number of elements were swapped in the kcalloc
call.
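
For reference, kcalloc() follows the same argument order as the standard
calloc(): element count first, element size second. A minimal userspace
sketch using calloc(), where alloc_words() is a hypothetical helper name:

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate a zeroed buffer of `count` 16-bit words.
 * calloc(), like kcalloc(), takes the number of elements first and the
 * size of one element second. */
static uint16_t *alloc_words(size_t count)
{
	return calloc(count, sizeof(uint16_t));
}
```

With the arguments swapped the same number of bytes is still requested, so
a fix of this kind is mainly about matching the documented convention.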

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agofb: hx8357: Change parameters of the write function to u8
Maxime Ripard [Wed, 20 Feb 2013 02:15:28 +0000 (13:15 +1100)]
fb: hx8357: Change parameters of the write function to u8

Moving from void* to u8* removes the need for casts later on in the
function.
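
A small userspace sketch of the difference, using uint8_t in place of the
kernel's u8. Both checksum helpers are hypothetical and exist only to
contrast the two parameter types:

```c
#include <stddef.h>
#include <stdint.h>

/* With a void * parameter, the bytes cannot be indexed until the pointer
 * has been converted to a typed one. */
static unsigned int checksum_void(const void *buf, size_t len)
{
	unsigned int sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += ((const uint8_t *)buf)[i];
	return sum;
}

/* With a uint8_t * parameter, the buffer is indexed directly. */
static unsigned int checksum_u8(const uint8_t *buf, size_t len)
{
	unsigned int sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}
```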

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agofb: backlight: add the Himax HX-8357B LCD controller
Maxime Ripard [Wed, 20 Feb 2013 02:15:27 +0000 (13:15 +1100)]
fb: backlight: add the Himax HX-8357B LCD controller

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Brian Lilly <brian@crystalfontz.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/video/backlight/ld9040.c: use devm_regulator_bulk_get() API
Sachin Kamat [Wed, 20 Feb 2013 02:15:27 +0000 (13:15 +1100)]
drivers/video/backlight/ld9040.c: use devm_regulator_bulk_get() API

devm_regulator_bulk_get is device managed and saves some cleanup
and exit code.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Donghwa Lee <dh09.lee@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/video/backlight/l4f00242t03.c: convert to devm_regulator_get()
Axel Lin [Wed, 20 Feb 2013 02:15:27 +0000 (13:15 +1100)]
drivers/video/backlight/l4f00242t03.c: convert to devm_regulator_get()

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Cc: Alberto Panizzo <maramaopercheseimorto@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/video/backlight/lm3639_bl.c: remove ret = -EIO at error paths of probe
Devendra Naga [Wed, 20 Feb 2013 02:15:27 +0000 (13:15 +1100)]
drivers/video/backlight/lm3639_bl.c: remove ret = -EIO at error paths of probe

The APIs already return the correct error codes, so there is no need to
assign -EIO to ret again.

Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Acked-by: Daniel Jeong <daniel.jeong@ti.com>
Cc: G.Shark Jeong <gshark.jeong@gmail.com>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: corgi_lcd: use lcd_get_data instead of dev_get_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:26 +0000 (13:15 +1100)]
backlight: corgi_lcd: use lcd_get_data instead of dev_get_drvdata

Use the wrapper function for getting the driver data using lcd_device
instead of using dev_get_drvdata with &ld->dev, so we can directly pass a
struct lcd_device.
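
The wrapper pattern can be sketched in plain C as follows. The structures
are heavily simplified stand-ins that model only the fields needed here;
they are assumptions of this sketch, not the kernel definitions.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct device {
	void *driver_data;
};

struct lcd_device {
	struct device dev;
};

static void dev_set_drvdata(struct device *dev, void *data)
{
	dev->driver_data = data;
}

static void *dev_get_drvdata(const struct device *dev)
{
	return dev->driver_data;
}

/* Wrapper in the spirit of lcd_get_data(): callers pass the lcd_device
 * itself instead of reaching into it for &ld->dev. */
static void *lcd_get_data(struct lcd_device *ld)
{
	return dev_get_drvdata(&ld->dev);
}
```

The bl_get_data() and spi_get_drvdata() conversions elsewhere in this
series follow the same shape for their respective device types.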

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: omap1: use bl_get_data instead of dev_get_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:26 +0000 (13:15 +1100)]
backlight: omap1: use bl_get_data instead of dev_get_drvdata

Use the wrapper function for getting the driver data using
backlight_device instead of using dev_get_drvdata with &bd->dev, so we can
directly pass a struct backlight_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: tosa: use bl_get_data instead of dev_get_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:26 +0000 (13:15 +1100)]
backlight: tosa: use bl_get_data instead of dev_get_drvdata

Use the wrapper function for getting the driver data using
backlight_device instead of using dev_get_drvdata with &bd->dev, so we can
directly pass a struct backlight_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: corgi_lcd: use bl_get_data instead of dev_get_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:25 +0000 (13:15 +1100)]
backlight: corgi_lcd: use bl_get_data instead of dev_get_drvdata

Use the wrapper function for getting the driver data using
backlight_device instead of using dev_get_drvdata with &bd->dev, so we can
directly pass a struct backlight_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ams369fg06: use bl_get_data instead of dev_get_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:25 +0000 (13:15 +1100)]
backlight: ams369fg06: use bl_get_data instead of dev_get_drvdata

Use the wrapper function for getting the driver data using
backlight_device instead of using dev_get_drvdata with &bd->dev, so we can
directly pass a struct backlight_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agopwm_backlight: use bl_get_data instead of dev_get_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:25 +0000 (13:15 +1100)]
pwm_backlight: use bl_get_data instead of dev_get_drvdata

Use the wrapper function for getting the driver data using
backlight_device instead of using dev_get_drvdata with &bd->dev, so we can
directly pass a struct backlight_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: aat2870: use bl_get_data instead of dev_get_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:24 +0000 (13:15 +1100)]
backlight: aat2870: use bl_get_data instead of dev_get_drvdata

Use the wrapper function for getting the driver data using
backlight_device instead of using dev_get_drvdata with &bd->dev, so we can
directly pass a struct backlight_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: lms501kf03: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:24 +0000 (13:15 +1100)]
backlight: lms501kf03: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: corgi_lcd: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:24 +0000 (13:15 +1100)]
backlight: corgi_lcd: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: tosa: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:23 +0000 (13:15 +1100)]
backlight: tosa: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: vgg2432a4: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:23 +0000 (13:15 +1100)]
backlight: vgg2432a4: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ams369fg06: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:23 +0000 (13:15 +1100)]
backlight: ams369fg06: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: lms283gf05: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:22 +0000 (13:15 +1100)]
backlight: lms283gf05: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: tdo24m: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:22 +0000 (13:15 +1100)]
backlight: tdo24m: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ltv350qv: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:22 +0000 (13:15 +1100)]
backlight: ltv350qv: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: s6e63m0: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:21 +0000 (13:15 +1100)]
backlight: s6e63m0: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ld9040: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:21 +0000 (13:15 +1100)]
backlight: ld9040: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: l4f00242t03: use spi_get_drvdata and spi_set_drvdata
Jingoo Han [Wed, 20 Feb 2013 02:15:21 +0000 (13:15 +1100)]
backlight: l4f00242t03: use spi_get_drvdata and spi_set_drvdata

Use the wrapper functions for getting and setting the driver data using
spi_device instead of using dev_{get|set}_drvdata with &spi->dev, so we
can directly pass a struct spi_device.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight-add-new-lp8788-backlight-driver-checkpatch-fixes
Andrew Morton [Wed, 20 Feb 2013 02:15:21 +0000 (13:15 +1100)]
backlight-add-new-lp8788-backlight-driver-checkpatch-fixes

WARNING: please write a paragraph that describes the config symbol fully
#45: FILE: drivers/video/backlight/Kconfig:380:
+config BACKLIGHT_LP8788

ERROR: code indent should use tabs where possible
#446: FILE: include/linux/mfd/lp8788.h:242:
+^I^I            Only valid when bl_mode is LP8788_BL_COMB_PWM_BASED$

total: 1 errors, 1 warnings, 403 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/backlight-add-new-lp8788-backlight-driver.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: "Kim, Milo" <Milo.Kim@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: add new lp8788 backlight driver
Kim, Milo [Wed, 20 Feb 2013 02:15:20 +0000 (13:15 +1100)]
backlight: add new lp8788 backlight driver

TI LP8788 PMU supports regulators, battery charger, RTC, ADC, backlight
driver and current sinks.  This patch enables LP8788 backlight module.

(Brightness mode)
The brightness is controlled by PWM input or I2C register.
All modes are supported in the driver.

(Platform data)
Configurable data can be defined in the platform side.
 name                  : backlight driver name. (default: "lcd-backlight")
 initial_brightness    : initial value of backlight brightness
 bl_mode               : brightness control by PWM or lp8788 register
 dim_mode              : dimming mode selection
 full_scale            : full scale current setting
 rise_time             : brightness ramp up step time
 fall_time             : brightness ramp down step time
 pwm_pol               : PWM polarity setting when bl_mode is PWM based
 period_ns             : platform specific PWM period value. unit is nano.

The default values are set in case no platform data is defined.
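
The fallback to defaults might look roughly like the sketch below. Only two
of the fields listed above are modeled; the struct names, the helper
bl_resolve_config() and the default brightness value are assumptions made
for illustration. The default name "lcd-backlight" is the one stated above.

```c
#include <stddef.h>

#define DEFAULT_BL_NAME    "lcd-backlight"  /* default name stated above */
#define DEFAULT_BRIGHTNESS 127              /* placeholder, assumed */

/* Hypothetical subset of the platform data. */
struct bl_pdata {
	const char *name;
	int initial_brightness;
};

/* Configuration the driver would actually run with. */
struct bl_config {
	const char *name;
	int initial_brightness;
};

/* Start from the defaults and override them only where platform data
 * was supplied. */
static struct bl_config bl_resolve_config(const struct bl_pdata *pdata)
{
	struct bl_config cfg = { DEFAULT_BL_NAME, DEFAULT_BRIGHTNESS };

	if (pdata) {
		if (pdata->name)
			cfg.name = pdata->name;
		cfg.initial_brightness = pdata->initial_brightness;
	}
	return cfg;
}
```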

Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Thierry Reding <thierry.reding@avionic-design.de>
Cc: "devendra.aaru" <devendra.aaru@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ams369fg06: reorder inclusions of <linux/xxx.h>
Jingoo Han [Wed, 20 Feb 2013 02:15:20 +0000 (13:15 +1100)]
backlight: ams369fg06: reorder inclusions of <linux/xxx.h>

Reorder inclusions of <linux/xxx.h> for readability, according to
alphabetical ordering.  Also, unnecessary header comments are removed.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ams369fg06: remove redundant variable 'before_power'
Jingoo Han [Wed, 20 Feb 2013 02:15:20 +0000 (13:15 +1100)]
backlight: ams369fg06: remove redundant variable 'before_power'

'before_power' was used to check the previous status when resume() is
called.  However, FB_BLANK_POWERDOWN was used in suspend() all the time,
so there is no need to check the previous status.  Also, redundant return
variables are removed to reduce the code.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ams369fg06: replace EFAULT with EINVAL
Jingoo Han [Wed, 20 Feb 2013 02:15:19 +0000 (13:15 +1100)]
backlight: ams369fg06: replace EFAULT with EINVAL

Replace EFAULT with EINVAL, because EFAULT tends to be for the invalid
memory addresses.
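
A minimal sketch of the convention, assuming a hypothetical setter and
limit: an out-of-range argument is reported as -EINVAL, while -EFAULT is
left for bad addresses.

```c
#include <errno.h>

#define MAX_BRIGHTNESS 255  /* hypothetical limit for this sketch */

/* Reject out-of-range values with -EINVAL ("Invalid argument"). */
static int set_brightness(int *state, int value)
{
	if (value < 0 || value > MAX_BRIGHTNESS)
		return -EINVAL;

	*state = value;
	return 0;
}
```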

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ams369fg06: remove unnecessary NULL deference check
Jingoo Han [Wed, 20 Feb 2013 02:15:19 +0000 (13:15 +1100)]
backlight: ams369fg06: remove unnecessary NULL deference check

Remove unnecessary NULL dereference check, because it was already checked in
ams369fg06_probe().  Also, unnecessary parentheses are removed in
ams369fg06_power_is_on().

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ams369fg06: use sleep instead of delay
Jingoo Han [Wed, 20 Feb 2013 02:15:19 +0000 (13:15 +1100)]
backlight: ams369fg06: use sleep instead of delay

Replace mdelay with msleep to remove the busy loop waiting.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: s6e63m0: reorder inclusions of <linux/xxx.h>
Jingoo Han [Wed, 20 Feb 2013 02:15:18 +0000 (13:15 +1100)]
backlight: s6e63m0: reorder inclusions of <linux/xxx.h>

Reorder inclusions of <linux/xxx.h> for readability, according to
alphabetical ordering.  Also, unnecessary header comments are removed.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: s6e63m0: remove redundant variable 'before_power'
Jingoo Han [Wed, 20 Feb 2013 02:15:18 +0000 (13:15 +1100)]
backlight: s6e63m0: remove redundant variable 'before_power'

'before_power' was used to check the previous status when resume() is
called.  However, FB_BLANK_POWERDOWN was used in suspend() all the time,
so there is no need to check the previous status.  Also, redundant return
variables are removed to reduce the code.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: s6e63m0: replace EFAULT with EINVAL
Jingoo Han [Wed, 20 Feb 2013 02:15:18 +0000 (13:15 +1100)]
backlight: s6e63m0: replace EFAULT with EINVAL

Replace EFAULT with EINVAL, because EFAULT tends to be for the invalid
memory addresses.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: s6e63m0: remove unnecessary NULL deference check
Jingoo Han [Wed, 20 Feb 2013 02:15:17 +0000 (13:15 +1100)]
backlight: s6e63m0: remove unnecessary NULL deference check

Remove unnecessary NULL dereference check, because it was already checked in
s6e63m0_probe().  Also, POWER_IS_ON is replaced with
s6e63m0_power_is_on().

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: s6e63m0: use sleep instead of delay
Jingoo Han [Wed, 20 Feb 2013 02:15:17 +0000 (13:15 +1100)]
backlight: s6e63m0: use sleep instead of delay

Replace mdelay with msleep to remove the busy loop waiting.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: s6e63m0: use lowercase names of structs
Jingoo Han [Wed, 20 Feb 2013 02:15:17 +0000 (13:15 +1100)]
backlight: s6e63m0: use lowercase names of structs

Lowercase names of structs should be used, because they are
not preprocessor macros.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ld9040: reorder inclusions of <linux/xxx.h>
Jingoo Han [Wed, 20 Feb 2013 02:15:17 +0000 (13:15 +1100)]
backlight: ld9040: reorder inclusions of <linux/xxx.h>

Reorder inclusions of <linux/xxx.h> for readability, according to
alphabetical ordering.  Also, unnecessary header comments are removed.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ld9040: remove redundant return variables
Jingoo Han [Wed, 20 Feb 2013 02:15:16 +0000 (13:15 +1100)]
backlight: ld9040: remove redundant return variables

Redundant return variables are removed to reduce the code.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ld9040: replace EFAULT with EINVAL
Jingoo Han [Wed, 20 Feb 2013 02:15:16 +0000 (13:15 +1100)]
backlight: ld9040: replace EFAULT with EINVAL

Replace EFAULT with EINVAL, because EFAULT tends to be for the invalid
memory addresses.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ld9040: remove unnecessary NULL deference check
Jingoo Han [Wed, 20 Feb 2013 02:15:16 +0000 (13:15 +1100)]
backlight: ld9040: remove unnecessary NULL deference check

Remove unnecessary NULL dereference check, because it was already checked
in ld9040_probe().  Also, power_is_on is replaced with
ld9040_power_is_on().

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: ld9040: use sleep instead of delay
Jingoo Han [Wed, 20 Feb 2013 02:15:15 +0000 (13:15 +1100)]
backlight: ld9040: use sleep instead of delay

Replace mdelay with msleep to remove the busy loop waiting.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight-add-lms501kf03-lcd-driver-fix-fix
Andrew Morton [Wed, 20 Feb 2013 02:15:15 +0000 (13:15 +1100)]
backlight-add-lms501kf03-lcd-driver-fix-fix

make lms501kf03_shutdown() static, per Fengguang

Cc: "devendra.aaru" <devendra.aaru@gmail.com>
Cc: Ilho Lee <Ilho215.lee@samsung.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight-add-lms501kf03-lcd-driver-fix
Andrew Morton [Wed, 20 Feb 2013 02:15:15 +0000 (13:15 +1100)]
backlight-add-lms501kf03-lcd-driver-fix

remove unused variable `before_power'

Cc: "devendra.aaru" <devendra.aaru@gmail.com>
Cc: Ilho Lee <Ilho215.lee@samsung.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobacklight: add lms501kf03 LCD driver
Jingoo Han [Wed, 20 Feb 2013 02:15:14 +0000 (13:15 +1100)]
backlight: add lms501kf03 LCD driver

Add the lms501kf03 LCD panel driver.  The lms501kf03 LCD panel (800 x 480)
driver uses a 3-wire SPI interface.

Signed-off-by: Ilho Lee <Ilho215.lee@samsung.com>
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: "devendra.aaru" <devendra.aaru@gmail.com>
Cc: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomaintainers-remove-mark-m-hoffman-fix
Andrew Morton [Wed, 20 Feb 2013 02:15:14 +0000 (13:15 +1100)]
maintainers-remove-mark-m-hoffman-fix

Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoMAINTAINERS: Remove Mark M. Hoffman
Jean Delvare [Wed, 20 Feb 2013 02:15:14 +0000 (13:15 +1100)]
MAINTAINERS: Remove Mark M. Hoffman

Mark M.  Hoffman stopped working on the Linux kernel several years ago, so
he should no longer be listed as a driver maintainer.  I'm not even sure
if his e-mail address still works.

I can take over 3 drivers he was responsible for, the 4th one will
fall down to the subsystem maintainer.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoMAINTAINERS: mm: add additional include files to listing
Cody P Schafer [Wed, 20 Feb 2013 02:15:13 +0000 (13:15 +1100)]
MAINTAINERS: mm: add additional include files to listing

Add gfp.h, mmzone.h, memory_hotplug.h & vmalloc.h to the "MEMORY
MANAGEMENT" section so scripts/get_maintainer.pl can do a better job of
making recommendations.

Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoget_maintainer: allow keywords to match filenames
Stephen Warren [Wed, 20 Feb 2013 02:15:13 +0000 (13:15 +1100)]
get_maintainer: allow keywords to match filenames

Allow K: entries in MAINTAINERS to match directly against filenames;
either those extracted from patch +++ or --- lines, or those specified on
the command-line using the -f option.

This potentially allows fewer lines in a MAINTAINERS entry, if all the
relevant files are scattered throughout the whole kernel tree, yet contain
some common keyword.  An example would be using an ARM SoC name as the
keyword to catch all related drivers.
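
The matching idea can be sketched in C with POSIX regular expressions.
get_maintainer.pl itself is a Perl script, so this is only an analogy; the
helper name and the file paths used with it are illustrative assumptions.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if the extended regex `keyword` matches anywhere in `path`,
 * 0 otherwise (including when the pattern fails to compile). */
static int keyword_matches_path(const char *keyword, const char *path)
{
	regex_t re;
	int hit;

	if (regcomp(&re, keyword, REG_EXTENDED | REG_NOSUB) != 0)
		return 0;

	hit = regexec(&re, path, 0, NULL, 0) == 0;
	regfree(&re);
	return hit;
}
```

A single keyword such as an SoC name can then select files in many
directories without listing each one.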

I don't think setting exact_pattern_match_hash would be appropriate here;
at least for intended Tegra use case, this feature is to ensure that all
Tegra-related driver changes get Cc'd to the Tegra mailing list.  Setting
exact_pattern_match_hash would prevent git history parsing for e.g.  S-o-b
tags, which still seems like it would be useful.  Hence, this flag isn't
set.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoget_maintainer.pl: find maintainers for removed files
Geert Uytterhoeven [Wed, 20 Feb 2013 02:15:13 +0000 (13:15 +1100)]
get_maintainer.pl: find maintainers for removed files

For removed files, get_maintainer.pl doesn't find any maintainers (besides
the default linux-kernel@vger.kernel.org), as it only looks at the "+++"
lines, which are "/dev/null" for removals.  Fix this by extending the
parsing to the "---" lines.

E.g. for the two line test patch below the real score maintainers will now
be found:

    --- a/arch/score/include/asm/dma-mapping.h
    +++ /dev/null
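
The extended parsing can be approximated in C as below. This is a sketch of
the idea, not the script's Perl code; patch_header_path() is a hypothetical
helper, and stripping a leading "a/" or "b/" is an assumption about the
patch format.

```c
#include <stddef.h>
#include <string.h>

/* Extract the file path from a "--- ..." or "+++ ..." patch header line.
 * Returns a pointer into `line`, or NULL if the line is not such a
 * header or names /dev/null. */
static const char *patch_header_path(const char *line)
{
	const char *p;

	if (strncmp(line, "--- ", 4) != 0 && strncmp(line, "+++ ", 4) != 0)
		return NULL;

	p = line + 4;
	if (strncmp(p, "/dev/null", 9) == 0)
		return NULL;

	/* Skip the leading "a/" or "b/" path component. */
	if ((p[0] == 'a' || p[0] == 'b') && p[1] == '/')
		p += 2;
	return p;
}
```

For the test patch above, the "---" line yields the removed file's path
and the "+++" line yields nothing, so looking at both line types is what
recovers the maintainers.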

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoprintk: add pr_devel_once and pr_devel_ratelimited
Mikhail Gruzdev [Wed, 20 Feb 2013 02:15:12 +0000 (13:15 +1100)]
printk: add pr_devel_once and pr_devel_ratelimited

Standardize pr_devel logging macros family by adding pr_devel_once and
pr_devel_ratelimited.
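
The "once" behaviour can be illustrated in userspace with a static flag per
call site. The macro and function names below are hypothetical, and the
counter exists only so the effect can be observed; the kernel macros are
built on printk and differ in detail.

```c
#include <stdio.h>

static int emitted;  /* number of messages actually printed */

/* Stand-in for the underlying print call. */
#define log_msg(...)					\
	do {						\
		emitted++;				\
		fprintf(stderr, __VA_ARGS__);		\
	} while (0)

/* Print at most once per call site: the static flag lives in the block
 * that the macro expands to. */
#define log_once(...)					\
	do {						\
		static int done;			\
		if (!done) {				\
			done = 1;			\
			log_msg(__VA_ARGS__);		\
		}					\
	} while (0)

static void poll_device(void)
{
	log_once("device not ready\n");
}
```

Calling poll_device() repeatedly prints the message only on the first call.
A ratelimited variant would replace the flag with a time-based check, which
is not shown here.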

Signed-off-by: Mikhail Gruzdev <michail.gruzdev@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agolib/vsprintf.c: add %pa format specifier for phys_addr_t types
Stepan Moskovchenko [Wed, 20 Feb 2013 02:15:12 +0000 (13:15 +1100)]
lib/vsprintf.c: add %pa format specifier for phys_addr_t types

Add the %pa format specifier for printing a phys_addr_t type and its
derivative types (such as resource_size_t), since the physical address
size on some platforms can vary based on build options, regardless of the
native integer type.
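
The motivation can be shown with a userspace sketch in which the address
type is a build-time typedef and the formatter takes the value by pointer,
as %pa does. The typedef, the helper name and the exact padding are
assumptions of this sketch rather than a description of the kernel output.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel typedef; switch to uint32_t to model a build
 * with 32-bit physical addresses. */
typedef uint64_t phys_addr_t;

/* Format *pa as zero-padded hex with a 0x prefix. The digit count follows
 * sizeof(phys_addr_t), so callers never pick a width-specific format. */
static int format_phys_addr(char *buf, size_t len, const phys_addr_t *pa)
{
	return snprintf(buf, len, "0x%0*llx",
			(int)(2 * sizeof(*pa)), (unsigned long long)*pa);
}
```

Callers stay unchanged when the typedef changes width, which is the problem
a plain %x or %llx cannot solve.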

Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>
Cc: Rob Landley <rob@landley.net>
Cc: George Spelvin <linux@horizon.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agokernel/smp.c: cleanups
Andrew Morton [Wed, 20 Feb 2013 02:15:12 +0000 (13:15 +1100)]
kernel/smp.c: cleanups

We sometimes use "struct call_single_data *data" and sometimes "struct
call_single_data *csd".  Use "csd" consistently.

We sometimes use "struct call_function_data *data" and sometimes "struct
call_function_data *cfd".  Use "cfd" consistently.

Also, avoid some 80-col layout tricks.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoinclude/linux/fs.h: disable preempt when acquire i_size_seqcount write lock
Fan Du [Wed, 20 Feb 2013 02:15:11 +0000 (13:15 +1100)]
include/linux/fs.h: disable preempt when acquire i_size_seqcount write lock

Two rt tasks are bound to one CPU core.

The higher priority rt task A preempts a lower priority rt task B which
has already taken the write seq lock, and then the higher priority rt task
A tries to acquire the read seq lock; it is doomed to lock up.

rt task B with lower priority: call write           rt task A with higher priority: call sync, and preempt task B
i_size_write
  write_seqcount_begin(&inode->i_size_seqcount);    i_size_read
  inode->i_size = i_size;                             read_seqcount_begin <-- lockup here...

So disabling preemption whenever the i_size_seqcount *write* lock is
acquired will cure the problem.
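
The mechanism can be sketched with a simplified sequence counter in C11.
This is a single-threaded illustration with hypothetical names; it omits
the memory barriers and the preemption handling of the kernel's seqcount
API.

```c
#include <stdatomic.h>

struct seq_size {
	atomic_uint seq;   /* odd while a write is in progress */
	long long size;
};

static void size_write(struct seq_size *s, long long v)
{
	atomic_fetch_add(&s->seq, 1);   /* begin: sequence becomes odd */
	s->size = v;
	atomic_fetch_add(&s->seq, 1);   /* end: sequence becomes even */
}

static long long size_read(struct seq_size *s)
{
	unsigned int start;
	long long v;

	do {
		/* Spin until no writer is active. If the writer was
		 * preempted between its two increments by this reader on
		 * the same CPU, this loop never exits. */
		while ((start = atomic_load(&s->seq)) & 1)
			;
		v = s->size;
	} while (atomic_load(&s->seq) != start);

	return v;
}
```

Keeping the writer from being preempted between its two increments
guarantees that a reader on the same CPU never observes an odd sequence.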

Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agosmp: Give WARN()ing when calling smp_call_function_many()/single() in serving irq
Chuansheng Liu [Wed, 20 Feb 2013 02:15:11 +0000 (13:15 +1100)]
smp: Give WARN()ing when calling smp_call_function_many()/single() in serving irq

Currently the functions smp_call_function_many()/single() will give a
WARN()ing only in the case of irqs_disabled(), but that check is not
enough to guarantee execution of the SMP cross-calls.

In many other cases such as softirq handling/interrupt handling, the two
APIs still can not be called, just as the smp_call_function_many()
comments say:

  * You must not call this function with disabled interrupts or from a
  * hardware interrupt handler or from a bottom half handler. Preemption
  * must be disabled when calling this function.

There is a real softirq DEADLOCK case:

CPUA                            CPUB
                                spin_lock(&spinlock)
                                Any irq coming, call the irq handler
                                irq_exit()
spin_lock_irq(&spinlock)
<== Blocking here because
CPUB holds it
                                  __do_softirq()
                                    run_timer_softirq()
                                      timer_cb()
                                        call smp_call_function_many()
                                          send IPI interrupt to CPUA
                                            wait_csd()

Then both CPUA and CPUB will be deadlocked here.

So we should give a warning in the nmi, hardirq or softirq context as well.

Moreover, add one new macro, in_serving_irq(), which indicates that we are
processing nmi, hardirq or softirq.

Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agosys_prctl(): coding-style cleanup
Andrew Morton [Wed, 20 Feb 2013 02:15:11 +0000 (13:15 +1100)]
sys_prctl(): coding-style cleanup

Remove a tabstop from the switch statement, in the usual fashion.  A few
instances of weird wrapping were removed as a result.

Cc: Chen Gang <gang.chen@asianux.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agosys_prctl(): arg2 is unsigned long which is never < 0
Chen Gang [Wed, 20 Feb 2013 02:15:10 +0000 (13:15 +1100)]
sys_prctl(): arg2 is unsigned long which is never < 0

arg2 can never be < 0, because its type is 'unsigned long'.

Also, use the provided macros.
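
To illustrate why such a check is dead code, here is a small sketch.  The
helper name arg_in_range() is hypothetical and is not taken from the patch.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical range check: valid values are 0..max inclusive. */
static bool arg_in_range(unsigned long arg, unsigned long max)
{
	/*
	 * No "arg < 0" test is needed: a negative value converted to
	 * unsigned long wraps to a very large number and fails this test.
	 */
	return arg <= max;
}
```

A caller passing -1 ends up passing ULONG_MAX, which the upper-bound test
already rejects.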

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agosun.com documentation fixes
Christian Kujau [Wed, 20 Feb 2013 02:15:10 +0000 (13:15 +1100)]
sun.com documentation fixes

After I came across a help text for SUNGEM mentioning a broken sun.com
URL, I felt like fixing those up, as they are now pointing to oracle.com
URLs.

Signed-off-by: Christian Kujau <lists@nerdbynature.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agolib/Kconfig.debug: unhide CONFIG_PANIC_ON_OOPS
Kyle McMartin [Wed, 20 Feb 2013 02:15:10 +0000 (13:15 +1100)]
lib/Kconfig.debug: unhide CONFIG_PANIC_ON_OOPS

CONFIG_EXPERT doesn't really make sense, and hides it unintentionally.
Remove superfluous "default n" pointed out by Ingo as well.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agosmp: make smp_call_function_many() use logic similar to smp_call_function_single()
Shaohua Li [Wed, 20 Feb 2013 02:15:10 +0000 (13:15 +1100)]
smp: make smp_call_function_many() use logic similar to smp_call_function_single()

I'm testing a swapout workload on a two-socket Xeon machine.  The workload
has 10 threads, and each thread sequentially accesses a separate memory
region.  TLB flush overhead is very high in this workload.  For each page,
page reclaim needs to move it from the active lru list and then unmap it.
Both steps need a TLB flush.  Because this is a multithreaded workload, the
TLB flushes happen on 10 CPUs.  On x86, TLB flush uses the generic
smp_call_function code, so this workload stresses smp_call_function_many
heavily.

Without patch, perf shows:
+  24.49%  [k] generic_smp_call_function_interrupt
-  21.72%  [k] _raw_spin_lock
   - _raw_spin_lock
      + 79.80% __page_check_address
      + 6.42% generic_smp_call_function_interrupt
      + 3.31% get_swap_page
      + 2.37% free_pcppages_bulk
      + 1.75% handle_pte_fault
      + 1.54% put_super
      + 1.41% grab_super_passive
      + 1.36% __swap_duplicate
      + 0.68% blk_flush_plug_list
      + 0.62% swap_info_get
+   6.55%  [k] flush_tlb_func
+   6.46%  [k] smp_call_function_many
+   5.09%  [k] call_function_interrupt
+   4.75%  [k] default_send_IPI_mask_sequence_phys
+   2.18%  [k] find_next_bit

swapout throughput is around 1300M/s.

With the patch, perf shows:
-  27.23%  [k] _raw_spin_lock
   - _raw_spin_lock
      + 80.53% __page_check_address
      + 8.39% generic_smp_call_function_single_interrupt
      + 2.44% get_swap_page
      + 1.76% free_pcppages_bulk
      + 1.40% handle_pte_fault
      + 1.15% __swap_duplicate
      + 1.05% put_super
      + 0.98% grab_super_passive
      + 0.86% blk_flush_plug_list
      + 0.57% swap_info_get
+   8.25%  [k] default_send_IPI_mask_sequence_phys
+   7.55%  [k] call_function_interrupt
+   7.47%  [k] smp_call_function_many
+   7.25%  [k] flush_tlb_func
+   3.81%  [k] _raw_spin_lock_irqsave
+   3.78%  [k] generic_smp_call_function_single_interrupt

swapout throughput is around 1400M/s.  So there is around a 7%
improvement, and total cpu utilization doesn't change.

Without the patch, cfd_data is shared by all CPUs.
generic_smp_call_function_interrupt does read/write cfd_data several times
which will create a lot of cache ping-pong.  With the patch, the data
becomes per-cpu.  The ping-pong is avoided.  And from the perf data, this
doesn't make the call_single_queue lock contended.
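
The cache ping-pong argument can be pictured with a small userspace
sketch.  It only shows the general idea of giving each CPU its own
cache-line-aligned slot; the 64-byte line size, the CPU count and the
model_ names are assumptions, and this is not the data structure used by
the patch.

```c
#include <assert.h>

#define MODEL_CACHE_LINE 64 /* assumed cache line size in bytes */
#define MODEL_NR_CPUS    4  /* assumed number of CPUs */

/* One slot per CPU, padded so that no two CPUs share a cache line. */
struct model_call_data {
	_Alignas(MODEL_CACHE_LINE) unsigned long pending;
};

static struct model_call_data model_percpu[MODEL_NR_CPUS];

/* Record a pending call for one CPU without touching the other slots. */
static void model_queue_call(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_percpu[cpu].pending++;
}
```

With a single shared structure every writer would dirty the same line;
here a write for one CPU leaves the lines of the other CPUs untouched.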

Next step is to remove generic_smp_call_function_interrupt() from arch
code.

Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoscripts-pnmtologo-fix-for-plain-pbm-checkpatch-fixes
Andrew Morton [Wed, 20 Feb 2013 02:15:09 +0000 (13:15 +1100)]
scripts-pnmtologo-fix-for-plain-pbm-checkpatch-fixes

ERROR: do not initialise statics to 0 or NULL
#24: FILE: scripts/pnmtologo.c:77:
+static int is_plain_pbm = 0;

WARNING: line over 80 characters
#33: FILE: scripts/pnmtologo.c:108:
+  * between the digits. This is Ok cause we know a PBM can only have a '1'

total: 1 errors, 1 warnings, 25 lines checked

./patches/scripts-pnmtologo-fix-for-plain-pbm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Andreas Bießmann <andreas@biessmann.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: use vm_unmapped_area() on alpha architecture
Michel Lespinasse [Wed, 20 Feb 2013 02:15:09 +0000 (13:15 +1100)]
mm: use vm_unmapped_area() on alpha architecture

Update the alpha arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoubifs: wait for page writeback to provide stable pages
Jan Kara [Wed, 20 Feb 2013 02:15:09 +0000 (13:15 +1100)]
ubifs: wait for page writeback to provide stable pages

When stable pages are required, we have to wait if the page is just going
to disk and we want to modify it.  Add proper callback to
ubifs_vm_page_mkwrite().

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoocfs2: wait for page writeback to provide stable pages
Jan Kara [Wed, 20 Feb 2013 02:15:08 +0000 (13:15 +1100)]
ocfs2: wait for page writeback to provide stable pages

When stable pages are required, we have to wait if the page is just going
to disk and we want to modify it.  Add proper callback to
ocfs2_grab_pages_for_write().

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoblock: optionally snapshot page contents to provide stable pages during write
Darrick J. Wong [Wed, 20 Feb 2013 02:15:08 +0000 (13:15 +1100)]
block: optionally snapshot page contents to provide stable pages during write

This provides a band-aid to provide stable page writes on jbd without
needing to backport the fixed locking and page writeback bit handling
schemes of jbd2.  The band-aid works by using bounce buffers to snapshot
page contents instead of waiting.

For those wondering about the ext3 bandage -- fixing the jbd locking
(which was done as part of ext4dev years ago) is a lot of surgery, and
setting PG_writeback on data pages when we actually hold the page lock
dropped ext3 performance by nearly an order of magnitude.  If we're going
to migrate iscsi and raid to use stable page writes, the complaints about
high latency will likely return.  We might as well centralize their page
snapshotting thing to one place.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Tested-by: Andy Lutomirski <luto@amacapital.net>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years ago9pfs: fix filesystem to wait for stable page writeback
Darrick J. Wong [Wed, 20 Feb 2013 02:15:08 +0000 (13:15 +1100)]
9pfs: fix filesystem to wait for stable page writeback

Fix up the ->page_mkwrite handler to provide stable page writes if necessary.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: only enforce stable page writes if the backing device requires it
Darrick J. Wong [Wed, 20 Feb 2013 02:15:07 +0000 (13:15 +1100)]
mm: only enforce stable page writes if the backing device requires it

Create a helper function to check if a backing device requires stable page
writes and, if so, perform the necessary wait.  Then, make it so that all
points in the memory manager that handle making pages writable use the
helper function.  This should provide stable page write support to most
filesystems, while eliminating unnecessary waiting for devices that don't
require the feature.
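
The conditional wait described above can be modelled in userspace as
follows.  This is a sketch only: the struct layouts, the capability bit
value and all model_ names are assumptions for illustration, and the
counter stands in for an actual blocking wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BDI_CAP_STABLE_WRITES 0x1u /* hypothetical capability bit */

struct model_bdi {
	unsigned int capabilities;
};

struct model_page {
	const struct model_bdi *bdi;
	bool under_writeback;
	int waits; /* how many times a writer had to wait on this page */
};

/* Does the backing device need page contents held stable? */
static bool model_stable_pages_required(const struct model_bdi *bdi)
{
	return (bdi->capabilities & MODEL_BDI_CAP_STABLE_WRITES) != 0;
}

/* Wait for writeback only when the backing device asks for stable pages. */
static void model_wait_for_stable_page(struct model_page *page)
{
	assert(page != NULL && page->bdi != NULL);
	if (!model_stable_pages_required(page->bdi))
		return;
	if (page->under_writeback) {
		page->waits++; /* stands in for the real blocking wait */
		page->under_writeback = false;
	}
}
```

A device that does not set the flag never makes the writer wait, which is
the latency improvement this change is after.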

Before this patchset, all filesystems would block, regardless of whether
or not it was necessary.  ext3 would wait, but still generate occasional
checksum errors.  The network filesystems were left to do their own thing,
so they'd wait too.

After this patchset, all the disk filesystems except ext3 and btrfs will
wait only if the hardware requires it.  ext3 (if necessary) snapshots
pages instead of blocking, and btrfs provides its own bdi so the mm will
never wait.  Network filesystems haven't been touched, so either they
provide their own stable page guarantees or they don't block at all.  The
blocking behavior is back to what it was before 3.0 if you don't have a
disk requiring stable page writes.

Here's the result of using dbench to test latency on ext2:

3.8.0-rc3:
 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 WriteX        109347     0.028    59.817
 ReadX         347180     0.004     3.391
 Flush          15514    29.828   287.283

Throughput 57.429 MB/sec  4 clients  4 procs  max_latency=287.290 ms

3.8.0-rc3 + patches:
 WriteX        105556     0.029     4.273
 ReadX         335004     0.005     4.112
 Flush          14982    30.540   298.634

Throughput 55.4496 MB/sec  4 clients  4 procs  max_latency=298.650 ms

As you can see, the maximum write latency drops considerably with this
patch enabled.  The other filesystems (ext3/ext4/xfs/btrfs) behave
similarly, but see the cover letter for those results.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agobdi: allow block devices to say that they require stable page writes
Darrick J. Wong [Wed, 20 Feb 2013 02:15:07 +0000 (13:15 +1100)]
bdi: allow block devices to say that they require stable page writes

This patchset ("stable page writes, part 2") makes some key modifications
to the original 'stable page writes' patchset.  First, it provides
creators (devices and filesystems) of a backing_dev_info a flag that
declares whether or not it is necessary to ensure that page contents
cannot change during writeout.  It is no longer assumed that this is true
of all devices (which was never true anyway).  Second, the flag is used to
relax the wait_on_page_writeback calls so that wait only occurs if the
device needs it.  Third, it fixes up the remaining disk-backed filesystems
to use this improved conditional-wait logic to provide stable page writes
on those filesystems.

It is hoped that (for people not using checksumming devices, anyway) this
patchset will undo the unnecessary performance decreases introduced when the
original stable page write patchset went into 3.0.  Sorry about not fixing
it sooner.

Complaints were registered by several people about the long write
latencies introduced by the original stable page write patchset.
Generally speaking, the kernel ought to allocate as little extra memory as
possible to facilitate writeout, but for people who simply cannot wait, a
second page stability strategy is (re)introduced: snapshotting page
contents.  The waiting behavior is still the default strategy; to enable
page snapshotting, a superblock flag (MS_SNAP_STABLE) must be set.  This
flag is used to bandaid^Henable stable page writeback on ext3[1], and is
not used anywhere else.

Given that there are already a few storage devices and network FSes that
have rolled their own page stability wait/page snapshot code, it would be
nice to move towards consolidating all of these.  It seems possible that
iscsi and raid5 may wish to use the new stable page write support to
enable zero-copy writeout.

Thank you to Jan Kara for helping fix a couple more filesystems.

Per Andrew Morton's request, here are the results of using dbench to measure
latencies on ext2:

3.8.0-rc3:
 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 WriteX        109347     0.028    59.817
 ReadX         347180     0.004     3.391
 Flush          15514    29.828   287.283

Throughput 57.429 MB/sec  4 clients  4 procs  max_latency=287.290 ms

3.8.0-rc3 + patches:
 WriteX        105556     0.029     4.273
 ReadX         335004     0.005     4.112
 Flush          14982    30.540   298.634

Throughput 55.4496 MB/sec  4 clients  4 procs  max_latency=298.650 ms

As you can see, for ext2 the maximum write latency decreases from ~60ms on a
laptop hard disk to ~4ms.  I'm not sure why the flush latencies increase,
though I suspect that being able to dirty pages faster gives the flusher more
work to do.

On ext4, the average write latency decreases as well as all the maximum
latencies:

3.8.0-rc3:
 WriteX         85624     0.152    33.078
 ReadX         272090     0.010    61.210
 Flush          12129    36.219   168.260

Throughput 44.8618 MB/sec  4 clients  4 procs  max_latency=168.276 ms

3.8.0-rc3 + patches:
 WriteX         86082     0.141    30.928
 ReadX         273358     0.010    36.124
 Flush          12214    34.800   165.689

Throughput 44.9941 MB/sec  4 clients  4 procs  max_latency=165.722 ms

XFS seems to exhibit latency improvements similar to ext2:

3.8.0-rc3:
 WriteX        125739     0.028   104.343
 ReadX         399070     0.005     4.115
 Flush          17851    25.004   131.390

Throughput 66.0024 MB/sec  4 clients  4 procs  max_latency=131.406 ms

3.8.0-rc3 + patches:
 WriteX        123529     0.028     6.299
 ReadX         392434     0.005     4.287
 Flush          17549    25.120   188.687

Throughput 64.9113 MB/sec  4 clients  4 procs  max_latency=188.704 ms

...and btrfs, just to round things out, also shows some latency decreases:

3.8.0-rc3:
 WriteX         67122     0.083    82.355
 ReadX         212719     0.005     2.828
 Flush           9547    47.561   147.418

Throughput 35.3391 MB/sec  4 clients  4 procs  max_latency=147.433 ms

3.8.0-rc3 + patches:
 WriteX         64898     0.101    71.631
 ReadX         206673     0.005     7.123
 Flush           9190    47.963   219.034

Throughput 34.0795 MB/sec  4 clients  4 procs  max_latency=219.044 ms

Before this patchset, all filesystems would block, regardless of whether
or not it was necessary.  ext3 would wait, but still generate occasional
checksum errors.  The network filesystems were left to do their own thing,
so they'd wait too.

After this patchset, all the disk filesystems except ext3 and btrfs will
wait only if the hardware requires it.  ext3 (if necessary) snapshots
pages instead of blocking, and btrfs provides its own bdi so the mm will
never wait.  Network filesystems haven't been touched, so either they
provide their own wait code, or they don't block at all.  The blocking
behavior is back to what it was before 3.0 if you don't have a disk
requiring stable page writes.

This patchset has been tested on 3.8.0-rc3 on x64 with ext3, ext4, and xfs.
I've spot-checked 3.8.0-rc4 and seem to be getting the same results as -rc3.

[1] The alternative fixes to ext3 include fixing the locking order and page bit
handling like we did for ext4 (but then why not just use ext4?), or setting
PG_writeback so early that ext3 becomes extremely slow.  I tried that, but the
number of write()s I could initiate dropped by nearly an order of magnitude.
That was a bit much even for the author of the stable page series! :)

This patch:

Creates a per-backing-device flag that tracks whether or not pages must be
held immutable during writeout.  Eventually it will be used to waive
wait_on_page_writeback() if nothing requires stable pages.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: use vm_unmapped_area() on frv architecture
Michel Lespinasse [Wed, 20 Feb 2013 02:15:07 +0000 (13:15 +1100)]
mm: use vm_unmapped_area() on frv architecture

Update the frv arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: add vm event counters for balloon pages compaction
Rafael Aquini [Wed, 20 Feb 2013 02:15:06 +0000 (13:15 +1100)]
mm: add vm event counters for balloon pages compaction

Introduce a new set of vm event counters to keep track of ballooned pages
compaction activity.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomemcg-debugging-facility-to-access-dangling-memcgs-fix
Andrew Morton [Wed, 20 Feb 2013 02:15:06 +0000 (13:15 +1100)]
memcg-debugging-facility-to-access-dangling-memcgs-fix

fix up Kconfig text

Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomemcg: debugging facility to access dangling memcgs
Glauber Costa [Wed, 20 Feb 2013 02:15:06 +0000 (13:15 +1100)]
memcg: debugging facility to access dangling memcgs

If memcg is tracking anything other than plain user memory (swap, tcp buf
mem, or slab memory), it is possible - and normal - that a reference will
be held by the group after it is dead.  Still, for developers, it would be
extremely useful to be able to query about those states during debugging.

This patch provides a debugging facility in the root memcg, so we can
inspect which memcgs still have pending objects, and what is the cause of
this state.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm/dmapool.c: fix null dev in dma_pool_create()
Xi Wang [Wed, 20 Feb 2013 02:15:05 +0000 (13:15 +1100)]
mm/dmapool.c: fix null dev in dma_pool_create()

A few drivers invoke dma_pool_create() with a null dev.  Note that dev is
dereferenced in dev_to_node(dev), causing a null pointer dereference.

A long term solution is to disallow null dev.  Once the drivers are fixed,
we can simplify the core code here.  For now we add WARN_ON(!dev) to
notify the driver maintainers and avoid the null pointer dereference.
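
A minimal userspace model of the guard described above is shown here.
The struct, the "no node" constant and the function name are hypothetical;
the sketch only demonstrates checking the pointer before reading through
it.

```c
#include <stddef.h>

#define MODEL_NUMA_NO_NODE (-1) /* hypothetical "no preferred node" value */

struct model_device {
	int numa_node;
};

/* Pick an allocation node without dereferencing a NULL device. */
static int model_pool_node(const struct model_device *dev)
{
	if (dev == NULL)
		return MODEL_NUMA_NO_NODE; /* the text above adds a WARN_ON here */
	return dev->numa_node;
}
```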

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrivers/usb/gadget/amd5536udc.c: avoid calling dma_pool_create() with NULL dev
Xi Wang [Wed, 20 Feb 2013 02:15:05 +0000 (13:15 +1100)]
drivers/usb/gadget/amd5536udc.c: avoid calling dma_pool_create() with NULL dev

Calling dma_pool_create() with dev==NULL will oops on a NUMA machine.
Rather than changing dma_pool_create() we wish to disallow passing
dev==NULL.  This requires fixing up the small number of drivers which are
passing in dev==NULL.

Use &dev->pdev->dev instead of NULL.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agodrop_caches: add some documentation and info message
Michal Hocko [Wed, 20 Feb 2013 02:15:05 +0000 (13:15 +1100)]
drop_caches: add some documentation and info message

I would like to resurrect Dave's patch.  The last time it was posted was
here https://lkml.org/lkml/2010/9/16/250 and there didn't seem to be any
strong opposition.

Kosaki was worried about possible excessive logging when somebody drops
caches too often (but then he claimed he didn't have a strong opinion on
that), but I would say the opposite.  If somebody does that, then I would
really like to know about it from the log when supporting a system, because
it almost certainly means that there is something fishy going on.  It is
also worth mentioning that only root can write drop_caches, so this is not a
flooding attack vector.

I am bringing that up again because this can be really helpful when
chasing strange performance issues which (surprise surprise) turn out to
be related to artificially dropped caches done because the admin thinks
this would help...

I have just refreshed the original patch on top of the current mm tree
but I could live with KERN_INFO as well if people think that KERN_NOTICE
is too hysterical.

: From: Dave Hansen <dave@linux.vnet.ibm.com>
: Date: Fri, 12 Oct 2012 14:30:54 +0200
:
: There is plenty of anecdotal evidence and a load of blog posts
: suggesting that using "drop_caches" periodically keeps your system
: running in "tip top shape".  Perhaps adding some kernel
: documentation will increase the amount of accurate data on its use.
:
: If we are not shrinking caches effectively, then we have real bugs.
: Using drop_caches will simply mask the bugs and make them harder
: to find, but certainly does not fix them, nor is it an appropriate
: "workaround" to limit the size of the caches.
:
: It's a great debugging tool, and is really handy for doing things
: like repeatable benchmark runs.  So, add a bit more documentation
: about it, and add a little KERN_NOTICE.  It should help developers
: who are chasing down reclaim-related bugs.

[mhocko@suse.cz: refreshed to current -mm tree]
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages
Mel Gorman [Wed, 20 Feb 2013 02:15:05 +0000 (13:15 +1100)]
mm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages

Rob van der Heij reported the following (paraphrased) in private mail.

The scenario is that I want to avoid backups to fill up the page
cache and purge stuff that is more likely to be used again (this is
with s390x Linux on z/VM, so I don't give it as much memory that
we don't care anymore). So I have something with LD_PRELOAD that
intercepts the close() call (from tar, in this case) and issues
a posix_fadvise() just before closing the file.

This mostly works, except for small files (less than 14 pages)
that remain in page cache after the fact.

Unfortunately Rob has not had a chance to test this exact patch but the
test program below should be reproducing the problem he described.

The issue is the per-cpu pagevecs for LRU additions. If the pages are added
by one CPU but fadvise() is called on another then the pages remain resident
as the invalidate_mapping_pages() only drains the local pagevecs via its
call to pagevec_release(). The user-visible effect is that a program that
uses fadvise() properly is not obeyed.

A possible fix for this is to put the necessary smarts into
invalidate_mapping_pages() to globally drain the LRU pagevecs if a pagevec
page could not be discarded. The downside with this is that an inode cache
shrink would send a global IPI and memory pressure potentially causing
global IPI storms is very undesirable.

Instead, this patch adds a check during fadvise(POSIX_FADV_DONTNEED) to
check if invalidate_mapping_pages() discarded all the requested pages. If a
subset of pages is discarded, it drains the LRU pagevecs and tries again. If
the second attempt fails, it assumes it is due to the pages being mapped,
locked or dirty and does not care. With this patch, an application using
fadvise() correctly will be obeyed but there is a downside that a malicious
application can force the kernel to send global IPIs and increase overhead.
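
The retry logic described above can be modelled in userspace like this.
The struct and the model_ functions are assumptions for illustration; they
only mimic the control flow (try, drain everywhere on a shortfall, try
once more) and are not the kernel code.

```c
/* Hypothetical model of where a file's cached pages currently sit. */
struct model_cache {
	unsigned long on_lru;      /* discardable right now */
	unsigned long in_pagevecs; /* parked in other CPUs' pagevecs */
	unsigned long pinned;      /* mapped, locked or dirty */
	unsigned int global_drains;
};

/* Discard everything that is on the LRU; return how many pages went. */
static unsigned long model_invalidate(struct model_cache *c)
{
	unsigned long n = c->on_lru;

	c->on_lru = 0;
	return n;
}

/* Move pages from every CPU's pagevec onto the LRU (the costly step). */
static void model_drain_all(struct model_cache *c)
{
	c->on_lru += c->in_pagevecs;
	c->in_pagevecs = 0;
	c->global_drains++;
}

/* Returns the number of pages discarded over at most two attempts. */
static unsigned long model_fadvise_dontneed(struct model_cache *c)
{
	unsigned long wanted = c->on_lru + c->in_pagevecs + c->pinned;
	unsigned long done = model_invalidate(c);

	if (done < wanted) {
		model_drain_all(c);
		done += model_invalidate(c);
	}
	return done; /* any shortfall is left alone, as described above */
}
```

The global drain is only paid when the first attempt falls short, which
keeps the common case cheap.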

If accepted, I would like this to be considered as a -stable candidate.
It's not an urgent issue but it's a system call that is not working as
advertised which is weak.

The following test program demonstrates the problem. It should never
report that pages are still resident but will without this patch. It
assumes that CPU 0 and 1 exist.

#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

/* Assumed value: the report above concerns files of less than 14 pages */
#define FILESIZE_PAGES 13

int main(void)
{
	int fd;
	int pagesize = getpagesize();
	ssize_t written = 0, expected;
	char *buf;
	unsigned char *vec;
	int resident, i, err;
	cpu_set_t set;

	/* Prepare a buffer for writing */
	expected = FILESIZE_PAGES * pagesize;
	buf = malloc(expected + 1);
	if (buf == NULL) {
		printf("ENOMEM\n");
		exit(EXIT_FAILURE);
	}
	buf[expected] = 0;
	memset(buf, 'a', expected);

	/* Prepare the mincore vec */
	vec = malloc(FILESIZE_PAGES);
	if (vec == NULL) {
		printf("ENOMEM\n");
		exit(EXIT_FAILURE);
	}

	/* Bind ourselves to CPU 0 */
	CPU_ZERO(&set);
	CPU_SET(0, &set);
	if (sched_setaffinity(getpid(), sizeof(set), &set) == -1) {
		perror("sched_setaffinity");
		exit(EXIT_FAILURE);
	}

	/* open file (O_CREAT needs a mode), unlink and write buffer */
	fd = open("fadvise-test-file", O_CREAT|O_EXCL|O_RDWR, 0600);
	if (fd == -1) {
		perror("open");
		exit(EXIT_FAILURE);
	}
	unlink("fadvise-test-file");
	while (written < expected) {
		ssize_t this_write;
		this_write = write(fd, buf + written, expected - written);

		if (this_write == -1) {
			perror("write");
			exit(EXIT_FAILURE);
		}

		written += this_write;
	}
	free(buf);

	/*
	 * Force ourselves to another CPU. If fadvise only flushes the local
	 * CPUs pagevecs then the fadvise will fail to discard all file pages
	 */
	CPU_ZERO(&set);
	CPU_SET(1, &set);
	if (sched_setaffinity(getpid(), sizeof(set), &set) == -1) {
		perror("sched_setaffinity");
		exit(EXIT_FAILURE);
	}

	/* sync and fadvise to discard the page cache */
	fsync(fd);
	/* posix_fadvise returns an error number and does not set errno */
	err = posix_fadvise(fd, 0, expected, POSIX_FADV_DONTNEED);
	if (err != 0) {
		fprintf(stderr, "posix_fadvise: %s\n", strerror(err));
		exit(EXIT_FAILURE);
	}

	/* map the file and use mincore to see which parts of it are resident */
	buf = mmap(NULL, expected, PROT_READ, MAP_SHARED, fd, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		exit(EXIT_FAILURE);
	}
	if (mincore(buf, expected, vec) == -1) {
		perror("mincore");
		exit(EXIT_FAILURE);
	}

	/* Check residency */
	for (i = 0, resident = 0; i < FILESIZE_PAGES; i++) {
		if (vec[i])
			resident++;
	}
	if (resident != 0) {
		printf("Nr unexpected pages resident: %d\n", resident);
		exit(EXIT_FAILURE);
	}

	munmap(buf, expected);
	close(fd);
	free(vec);
	exit(EXIT_SUCCESS);
}

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Rob van der Heij <rvdheij@gmail.com>
Tested-by: Rob van der Heij <rvdheij@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: export mmu notifier invalidates
Cliff Wickman [Wed, 20 Feb 2013 02:15:04 +0000 (13:15 +1100)]
mm: export mmu notifier invalidates

We at SGI have a need to address some very high physical address ranges
with our GRU (global reference unit), sometimes across partitioned machine
boundaries and sometimes with larger addresses than the cpu supports.  We
do this with the aid of our own 'extended vma' module which mimics the
vma.  When something (either unmap or exit) frees an 'extended vma' we use
the mmu notifiers to clean them up.

We had been able to mimic the functions
__mmu_notifier_invalidate_range_start() and
__mmu_notifier_invalidate_range_end() by locking the per-mm lock and
walking the per-mm notifier list.  But with the change to a global srcu
lock (static in mmu_notifier.c) we can no longer do that.  Our module has
no access to that lock.

So we request that these two functions be exported.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Acked-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: accelerate munlock() treatment of THP pages
Michel Lespinasse [Wed, 20 Feb 2013 02:15:04 +0000 (13:15 +1100)]
mm: accelerate munlock() treatment of THP pages

munlock_vma_pages_range() was always incrementing addresses by PAGE_SIZE
at a time.  When munlocking THP pages (or the huge zero page), this
resulted in taking the mm->page_table_lock 512 times in a row.

We can do better by making use of the page_mask returned by
follow_page_mask (for the huge zero page case), or the size of the page
munlock_vma_page() operated on (for the true THP page case).

Note - I am sending this as RFC only for now as I can't currently put my
finger on what if anything prevents split_huge_page() from operating
concurrently on the same page as munlock_vma_page(), which would mess up
our NR_MLOCK statistics.  Is this a latent bug or is there a subtle point
I missed here?

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: accelerate mm_populate() treatment of THP pages
Michel Lespinasse [Wed, 20 Feb 2013 02:15:04 +0000 (13:15 +1100)]
mm: accelerate mm_populate() treatment of THP pages

This change adds a follow_page_mask function which is equivalent to
follow_page, but with an extra page_mask argument.

follow_page_mask sets *page_mask to HPAGE_PMD_NR - 1 when it encounters a
THP page, and to 0 in other cases.

__get_user_pages() makes use of this in order to accelerate populating THP
ranges - that is, when both the pages and vmas arrays are NULL, we don't
need to iterate HPAGE_PMD_NR times to cover a single THP page (and we also
avoid taking mm->page_table_lock that many times).
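The stepping described above can be sketched in userspace as follows.  This
is only an illustration, not the kernel code: fake_page_mask() and
count_lookups() are hypothetical names, and the 4 KiB base page and 512
pages per THP are assumed values.

```c
#include <assert.h>

#define PAGE_SHIFT   12UL   /* assumed: 4 KiB base pages */
#define HPAGE_PMD_NR 512UL  /* assumed: base pages per THP */

/*
 * Hypothetical stand-in for follow_page_mask(): report a THP-sized
 * mask for addresses inside [thp_start, thp_end), and 0 elsewhere.
 */
static unsigned long fake_page_mask(unsigned long addr,
                                    unsigned long thp_start,
                                    unsigned long thp_end)
{
    return (addr >= thp_start && addr < thp_end) ? HPAGE_PMD_NR - 1 : 0;
}

/* Count how many lookups are needed to cover nr_pages from start. */
static unsigned long count_lookups(unsigned long start,
                                   unsigned long nr_pages,
                                   unsigned long thp_start,
                                   unsigned long thp_end)
{
    unsigned long lookups = 0;

    assert(thp_start <= thp_end);
    while (nr_pages) {
        unsigned long page_mask = fake_page_mask(start, thp_start, thp_end);
        /* Base pages left up to the end of the current (huge) page. */
        unsigned long step = 1 + (~(start >> PAGE_SHIFT) & page_mask);

        if (step > nr_pages)
            step = nr_pages;
        start += step << PAGE_SHIFT;
        nr_pages -= step;
        lookups++;
    }
    return lookups;
}
```

In this model an aligned 512-page range backed by one THP needs a single
lookup, while the same range backed by small pages needs 512.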

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: use long type for page counts in mm_populate() and get_user_pages()
Michel Lespinasse [Wed, 20 Feb 2013 02:15:03 +0000 (13:15 +1100)]
mm: use long type for page counts in mm_populate() and get_user_pages()

Use long type for page counts in mm_populate() so as to avoid integer
overflow when running the following test code:

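#include <stdio.h>
#include <sys/mman.h>

int main(void) {
  /* Map 0x100000000000 bytes (16 TiB) of anonymous memory. */
  void *p = mmap(NULL, 0x100000000000, PROT_READ,
                 MAP_PRIVATE | MAP_ANON, -1, 0);
  printf("p: %p\n", p);
  mlockall(MCL_CURRENT);
  printf("done\n");
  return 0;
}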
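
The arithmetic behind the overflow can be checked with a small sketch.  It
assumes 4 KiB pages (a page shift of 12); pages_in() and fits_in_int() are
hypothetical helper names used only for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* Number of whole pages covered by len bytes, kept in a 64-bit type. */
static unsigned long long pages_in(unsigned long long len,
                                   unsigned int page_shift)
{
    assert(page_shift < 64);
    return len >> page_shift;
}

/* Nonzero if the page count can be stored in an int without loss. */
static int fits_in_int(unsigned long long nr_pages)
{
    return nr_pages <= (unsigned long long)INT_MAX;
}
```

With a page shift of 12, the 0x100000000000-byte mapping above covers
0x100000000 pages, which is more than an int can hold.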

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm-accurately-document-nr_free__pages-functions-with-code-comments-fix
Andrew Morton [Wed, 20 Feb 2013 02:15:03 +0000 (13:15 +1100)]
mm-accurately-document-nr_free__pages-functions-with-code-comments-fix

tweak comments

Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: accurately document nr_free_*_pages functions with code comments
Zhang Yanfei [Wed, 20 Feb 2013 02:15:03 +0000 (13:15 +1100)]
mm: accurately document nr_free_*_pages functions with code comments

nr_free_zone_pages(), nr_free_buffer_pages() and nr_free_pagecache_pages()
are horribly badly named, so accurately document them with code comments
to guard against misuse.

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoHWPOISON: change order of error_states[]'s elements
Naoya Horiguchi [Wed, 20 Feb 2013 02:15:02 +0000 (13:15 +1100)]
HWPOISON: change order of error_states[]'s elements

error_states[] has two separate states, "unevictable LRU page" and "mlocked
LRU page", and the former currently has the higher priority.  As a result
the latter is rarely chosen, because pages with PageMlocked very likely
also have PG_unevictable set.  On the other hand, PG_unevictable without
PageMlocked is common for ramfs or SHM_LOCKed shared memory, so reversing
the priority of these two states helps us clearly distinguish them.
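
Why the order matters can be seen in a first-match table lookup.  The sketch
below is a simplified userspace model: the flag bits, struct state, the two
tables and classify() are all hypothetical and only mirror the idea of a
mask/result table searched from the top.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits, for illustration only. */
#define F_LRU         (1UL << 0)
#define F_UNEVICTABLE (1UL << 1)
#define F_MLOCKED     (1UL << 2)

struct state {
    unsigned long mask;
    unsigned long res;
    const char *msg;
};

/* Old order: the unevictable entry is tested first. */
static const struct state old_order[] = {
    { F_LRU | F_UNEVICTABLE, F_LRU | F_UNEVICTABLE, "unevictable LRU page" },
    { F_LRU | F_MLOCKED,     F_LRU | F_MLOCKED,     "mlocked LRU page" },
    { 0, 0, "unknown page" },  /* catch-all, always matches */
};

/* New order: the mlocked entry is tested first. */
static const struct state new_order[] = {
    { F_LRU | F_MLOCKED,     F_LRU | F_MLOCKED,     "mlocked LRU page" },
    { F_LRU | F_UNEVICTABLE, F_LRU | F_UNEVICTABLE, "unevictable LRU page" },
    { 0, 0, "unknown page" },  /* catch-all, always matches */
};

/* Return the message of the first entry whose masked flags match. */
static const char *classify(const struct state *tbl, unsigned long flags)
{
    const struct state *s;

    assert(tbl != NULL);
    for (s = tbl; ; s++)
        if ((flags & s->mask) == s->res)
            return s->msg;
}
```

A page with all three bits set is reported as unevictable by old_order and
as mlocked by new_order, while an unevictable page without the mlocked bit
is reported as unevictable by both.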

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agohwpoison-fix-misjudgement-of-page_action-for-errors-on-mlocked-pages-fix
Andrew Morton [Wed, 20 Feb 2013 02:15:02 +0000 (13:15 +1100)]
hwpoison-fix-misjudgement-of-page_action-for-errors-on-mlocked-pages-fix

tweak comments

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoHWPOISON: fix misjudgement of page_action() for errors on mlocked pages
Naoya Horiguchi [Wed, 20 Feb 2013 02:15:02 +0000 (13:15 +1100)]
HWPOISON: fix misjudgement of page_action() for errors on mlocked pages

memory_failure() can't handle memory errors on mlocked pages correctly,
because page_action() judges such errors as ones on "unknown pages"
instead of ones on "unevictable LRU page" or "mlocked LRU page".  To
determine the page state, page_action() checks the page flags at the time
of the judgement, but those flags are not the same as the ones seen just
after memory_failure() is called, because memory_failure() unmaps the
error page before calling page_action().  This unmapping changes the page
state; in particular, page_remove_rmap() (called from try_to_unmap_one())
clears PG_mlocked, so page_action() can't catch mlocked pages after that.

With this patch, we store the page flags of the error page before
unmapping, and (only) if the first check, using the page flags at that
time, decides the error page is unknown, we do a second check with the
stored page flags.  This implementation doesn't change error handling for
the page types for which the first check can determine the page state
correctly.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomemcg: stop warning on memcg_propagate_kmem
Hugh Dickins [Wed, 20 Feb 2013 02:15:01 +0000 (13:15 +1100)]
memcg: stop warning on memcg_propagate_kmem

Whilst I run the risk of a flogging for disloyalty to the Lord of Sealand,
I do have CONFIG_MEMCG=y CONFIG_MEMCG_KMEM not set, and grow tired of the
"mm/memcontrol.c:4972:12: warning: `memcg_propagate_kmem' defined but not
used [-Wunused-function]" seen in 3.8-rc: move the #ifdef outwards.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agonet: change type of virtio_chan->p9_max_pages
Zhang Yanfei [Wed, 20 Feb 2013 02:15:01 +0000 (13:15 +1100)]
net: change type of virtio_chan->p9_max_pages

This member of struct virtio_chan is calculated from nr_free_buffer_pages,
so change its type to unsigned long to avoid overflow.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agovmscan: change type of vm_total_pages to unsigned long
Zhang Yanfei [Wed, 20 Feb 2013 02:15:01 +0000 (13:15 +1100)]
vmscan: change type of vm_total_pages to unsigned long

This variable is calculated from nr_free_pagecache_pages so
change its type to unsigned long.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agofs/nfsd: change type of max_delegations, nfsd_drc_max_mem and nfsd_drc_mem_used
Zhang Yanfei [Wed, 20 Feb 2013 02:15:00 +0000 (13:15 +1100)]
fs/nfsd: change type of max_delegations, nfsd_drc_max_mem and nfsd_drc_mem_used

The three variables are calculated from nr_free_buffer_pages, so
change their types to unsigned long to avoid overflow.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agofs/buffer.c: change type of max_buffer_heads to unsigned long
Zhang Yanfei [Wed, 20 Feb 2013 02:15:00 +0000 (13:15 +1100)]
fs/buffer.c: change type of max_buffer_heads to unsigned long

max_buffer_heads is calculated from nr_free_buffer_pages(), so change its
type to unsigned long to avoid overflow.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoia64: use %ld to print pages calculated in nr_free_buffer_pages
Zhang Yanfei [Wed, 20 Feb 2013 02:15:00 +0000 (13:15 +1100)]
ia64: use %ld to print pages calculated in nr_free_buffer_pages

Now the function nr_free_buffer_pages returns unsigned long, so use %ld to
print its return value.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: fix return type for functions nr_free_*_pages
Zhang Yanfei [Wed, 20 Feb 2013 02:14:59 +0000 (13:14 +1100)]
mm: fix return type for functions nr_free_*_pages

Currently, the amount of RAM that functions nr_free_*_pages return
is held in unsigned int. But in machines with big memory (exceeding
16TB), the amount may be incorrect because of overflow, so fix it.
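
The 16TB figure follows from the width of the old return type.  A minimal
sketch, assuming 4 KiB pages and a 32-bit unsigned int (pages_as_uint() is a
hypothetical helper, not a kernel function):

```c
#include <assert.h>

#define PAGE_SHIFT 12  /* assumed: 4 KiB pages */

/* Page count as a 32-bit unsigned int return value would report it. */
static unsigned int pages_as_uint(unsigned long long ram_bytes)
{
    /* Only whole pages make sense here. */
    assert((ram_bytes & ((1ULL << PAGE_SHIFT) - 1)) == 0);
    /* The conversion wraps modulo 2^32 when unsigned int is 32 bits. */
    return (unsigned int)(ram_bytes >> PAGE_SHIFT);
}
```

Under these assumptions 8 TiB is still reported correctly as 0x80000000
pages, but exactly 16 TiB wraps to 0 and one page more than that wraps to 1.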

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomemcg: cleanup mem_cgroup_init comment
Michal Hocko [Wed, 20 Feb 2013 02:14:59 +0000 (13:14 +1100)]
memcg: cleanup mem_cgroup_init comment

We should encourage all memcg controller initialization that is independent
of a specific mem_cgroup to be done here, rather than exploiting the
css_alloc callback and assuming that nothing happens before the root cgroup
is created.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <htejun@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomemcg: move memcg_stock initialization to mem_cgroup_init
Michal Hocko [Wed, 20 Feb 2013 02:14:59 +0000 (13:14 +1100)]
memcg: move memcg_stock initialization to mem_cgroup_init

memcg_stock is currently initialized during the root cgroup allocation,
which is OK, but it pointlessly pollutes the memcg allocation code with
something that can be called when the memcg subsystem is initialized by
mem_cgroup_init along with other controller specific parts.

This patch wraps the current memcg_stock initialization code into a helper
and calls it from the controller subsystem initialization code.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <htejun@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomemcg: move mem_cgroup_soft_limit_tree_init to mem_cgroup_init
Michal Hocko [Wed, 20 Feb 2013 02:14:58 +0000 (13:14 +1100)]
memcg: move mem_cgroup_soft_limit_tree_init to mem_cgroup_init

The per-node-zone soft limit tree is currently initialized when the root
cgroup is created, which is OK, but it pointlessly pollutes the memcg
allocation code with something that can be called when the memcg subsystem
is initialized by mem_cgroup_init along with other controller specific
parts.

While we are at it, let's make mem_cgroup_soft_limit_tree_init void: it
doesn't make much sense to report a memory failure, because if we fail to
allocate memory that early during boot then we are screwed anyway
(this saves some code).

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <htejun@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agomm: use up free swap space before reaching OOM kill
Minchan Kim [Wed, 20 Feb 2013 02:14:58 +0000 (13:14 +1100)]
mm: use up free swap space before reaching OOM kill

Recently, Luigi reported that there is a lot of free swap space left when
OOM happens.  It is easily reproduced on zram-over-swap, where many
instances of memory hogs are running and laptop_mode is enabled.  He said
there was no problem when he disabled laptop_mode.  What I found when
investigating the problem is as follows.

Assumption for easy explanation: there are no page cache pages in the
system because they have all been reclaimed already.

1. try_to_free_pages disables may_writepage when laptop_mode is enabled.
2. shrink_inactive_list isolates victim pages from the inactive anon lru list.
3. shrink_page_list adds them to the swapcache via add_to_swap but doesn't
   page them out because sc->may_writepage is 0, so the pages are rotated
   back into the inactive anon lru list.  add_to_swap made the pages dirty
   via SetPageDirty.
4. Step 3 couldn't reclaim any pages, so do_try_to_free_pages raises the
   priority and retries reclaim.
5. shrink_inactive_list tries to isolate victim pages from the inactive anon
   lru list but fails, because it isolates in ISOLATE_CLEAN mode and the
   list is full of pages dirtied in step 3, so it returns without any
   reclaim progress.
6. do_try_to_free_pages doesn't set may_writepage because total_scanned is
   zero: sc->nr_scanned is increased by shrink_page_list, but
   shrink_page_list is not called in step 5 for lack of isolated pages.

The above loop continues until OOM happens.

The problem didn't happen before [1] was merged, because with the old logic
isolation in shrink_inactive_list succeeded and shrink_page_list was called
to page the pages out.  That still failed because of may_writepage, but the
important point is that sc->nr_scanned was increased even though the pages
couldn't be swapped out, so do_try_to_free_pages could set may_writepage.

Since f80c067 ("mm: zone_reclaim: make isolate_lru_page() filter-aware")
was introduced, it is no longer a good idea to depend only on the number
of scanned pages for setting may_writepage.  So this patch adds a new
trigger for setting may_writepage: the priority dropping below
DEF_PRIORITY - 2, which is used to indicate significant memory pressure in
the VM.  That is a good fit for our purpose, since it is better to lose
power savings, or put up with a clickety disk, than to OOM kill.
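
The stall and the new trigger can be modelled with a toy loop.  This is a
sketch under strong simplifying assumptions, not the reclaim code itself:
every page is treated as dirty, the scan-count threshold is reduced to
"anything scanned at all", the per-pass scan count of 32 is arbitrary, and
reclaim_makes_progress() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY 12  /* starting reclaim priority in the kernel */

/*
 * Toy model: all inactive anon pages are dirty, so a pass can only
 * scan (and reclaim) pages once may_writepage is set.  Returns true
 * if some pass makes progress before the priority reaches its limit.
 */
static bool reclaim_makes_progress(bool laptop_mode, bool with_fix)
{
    bool may_writepage = !laptop_mode;
    unsigned long total_scanned = 0;
    int priority;

    assert(DEF_PRIORITY - 2 > 0);  /* the new trigger must be reachable */

    for (priority = DEF_PRIORITY; priority >= 0; priority--) {
        unsigned long nr_scanned;

        /* New trigger: significant pressure forces writepage on. */
        if (with_fix && priority < DEF_PRIORITY - 2)
            may_writepage = true;

        /* Clean-only isolation finds nothing among dirty pages. */
        nr_scanned = may_writepage ? 32 : 0;
        if (nr_scanned)
            return true;

        /* Old trigger: needs scanned pages, so it never fires here. */
        total_scanned += nr_scanned;
        if (total_scanned > 0)
            may_writepage = true;
    }
    return false;
}
```

In this model, laptop_mode without the new trigger never makes progress,
which corresponds to the loop ending in OOM, while with the trigger the
first pass below DEF_PRIORITY - 2 is allowed to write pages out.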

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Luigi Semenzato <semenzato@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 years agoinclude-linux-mmzoneh-cleanups-fix
Andrew Morton [Wed, 20 Feb 2013 02:14:58 +0000 (13:14 +1100)]
include-linux-mmzoneh-cleanups-fix

use zone_idx() some more, further simplify is_highmem()

Cc: Lin Feng <linfeng@cn.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>