Ed Cashin [Thu, 25 Oct 2012 01:15:22 +0000 (12:15 +1100)]
aoe: support larger I/O requests via aoe_maxsectors module param
The GPFS filesystem is an example of an aoe user that requires the aoe
driver to support I/O request sizes larger than the default. Most users
will not need large I/O request sizes, because they would need to be split
up into multiple AoE commands anyway.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Thu, 25 Oct 2012 01:15:22 +0000 (12:15 +1100)]
aoe: support the forgetting (flushing) of a user-specified AoE target
Users sometimes want to cause the aoe driver to forget a particular
previously discovered device when it is no longer online. The aoetools
provide an "aoe-flush" command that users run to perform this
administrative task. The changes below provide the support needed in the
driver.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Thu, 25 Oct 2012 01:15:21 +0000 (12:15 +1100)]
aoe: update cap on outstanding commands based on config query response
The ATA over Ethernet config query response contains a "buffer count"
field reflecting the AoE target's capacity to buffer incoming AoE
commands.
By taking the current value of this field into accound, we increase
performance throughput or avoid network congestion, when the value
has increased or decreased, respectively.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Thu, 25 Oct 2012 01:15:21 +0000 (12:15 +1100)]
aoe: avoid using skb member after dev_queue_xmit
After calling dev_queue_xmit it is no longer safe to access the
members of the skb.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Thu, 25 Oct 2012 01:15:20 +0000 (12:15 +1100)]
Documentation/sparse.txt: document context annotations for lock checking
The context feature of sparse is used with the Linux kernel sources to
check for imbalanced uses of locks. Document the annotations defined in
include/linux/compiler.h that tell sparse what to expect when a lock is
held on function entry, exit, or both.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Christopher Li <sparse@chrisli.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Josh Triplett [Thu, 25 Oct 2012 01:15:19 +0000 (12:15 +1100)]
linux/compiler.h: add __must_hold macro for functions called with a lock held
linux/compiler.h has macros to denote functions that acquire or release
locks, but not to denote functions called with a lock held that return
with the lock still held. Add a __must_hold macro to cover that case.
Signed-off-by: Josh Triplett <josh@joshtriplett.org> Reported-by: Ed Cashin <ecashin@coraid.com> Tested-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Manfred Spraul [Thu, 25 Oct 2012 01:15:19 +0000 (12:15 +1100)]
ipc/sem.c: alternatives to preempt_disable()
ipc/sem.c uses a custom wakeup scheme that relies on preempt_disable().
On -RT, this causes increased latencies and debug warnings.
The patch adds two additional schemes:
- one built around a completion - could be better for -RT kernels
- one built around a spinlock - unfortunately it's broken
- and the current one
My preferred solution would be the spinlock implementation: RT would use
premptible spinlocks, mainline normal spinlocks. Thus both get the
optimal implementation without any special code in ipc/sem.c.
Unfortunately, I don't see how it could be fixed.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
WARNING: line over 80 characters
#114: FILE: tools/testing/selftests/ipc/msgque.c:61:
+ if (msgsnd(msgque->msq_id, &msgque->messages[i].mtype, msgque->messages[i].msize, IPC_NOWAIT) != 0) {
WARNING: line over 80 characters
#128: FILE: tools/testing/selftests/ipc/msgque.c:75:
+ ret = msgrcv(msgque->msq_id, &message.mtype, MAX_MSG_SIZE, 0, IPC_NOWAIT);
WARNING: line over 80 characters
#137: FILE: tools/testing/selftests/ipc/msgque.c:84:
+ printf("Wrong message size: %d (expected %d)\n", ret, msgque->messages[cnt].msize);
WARNING: line over 80 characters
#179: FILE: tools/testing/selftests/ipc/msgque.c:126:
+ printf("Failed to get stats for IPC queue with id %d\n", msgque->kern_id);
WARNING: line over 80 characters
#198: FILE: tools/testing/selftests/ipc/msgque.c:145:
+ ret = msgrcv(msgque->msq_id, &msgque->messages[i].mtype, MAX_MSG_SIZE, i, IPC_NOWAIT | MSG_COPY);
WARNING: line over 80 characters
#214: FILE: tools/testing/selftests/ipc/msgque.c:161:
+ if (msgsnd(msgque->msq_id, &msgbuf.mtype, sizeof(TEST_STRING), IPC_NOWAIT) != 0) {
WARNING: line over 80 characters
#221: FILE: tools/testing/selftests/ipc/msgque.c:168:
+ if (msgsnd(msgque->msq_id, &msgbuf.mtype, sizeof(ANOTHER_TEST_STRING), IPC_NOWAIT) != 0) {
WARNING: space prohibited between function name and open parenthesis '('
#228: FILE: tools/testing/selftests/ipc/msgque.c:175:
+int main (int argc, char **argv)
total: 0 errors, 8 warnings, 256 lines checked
./patches/selftests-ipc-message-queue-copy-feature-test.patch has style problems, please review.
If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Please run checkpatch prior to sending patches
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This patch is required for checkpoint/restore in userspace.
c/r requires some way to get all pending IPC messages without deleting
them from the queue (checkpoint can fail and in this case tasks will be
resumed, so queue have to be valid).
To achive this, new operation flag MSG_COPY for sys_msgrcv() system call
was introduced. If this flag was specified, then mtype is interpreted as
number of the message to copy.
If MSG_COPY is set, then kernel will allocate dummy message with passed
size, and then use new copy_msg() helper function to copy desired message
(instead of unlinking it from the queue).
Notes:
1) Return -ENOSYS if MSG_COPY is specified, but
CONFIG_CHECKPOINT_RESTORE is not set.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
WARNING: line over 80 characters
#33: FILE: include/linux/msg.h:39:
+ long (*msg_fill)(void __user *, struct msg_msg *, size_t ));
ERROR: space prohibited before that close parenthesis ')'
#33: FILE: include/linux/msg.h:39:
+ long (*msg_fill)(void __user *, struct msg_msg *, size_t ));
WARNING: line over 80 characters
#94: FILE: ipc/compat.c:368:
+ return do_msgrcv(first, uptr, second, msgtyp, third, compat_do_msg_fill);
ERROR: space prohibited before that close parenthesis ')'
#142: FILE: ipc/msg.c:774:
+ long (*msg_handler)(void __user *, struct msg_msg *, size_t ))
total: 2 errors, 2 warnings, 165 lines checked
./patches/ipc-message-queue-receive-cleanup.patch has style problems, please review.
If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Please run checkpatch prior to sending patches
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add 3 new variables and sysctls to tune them (by one "next_id" variable
for messages, semaphores and shared memory respectively). This variable
can be used to set desired id for next allocated IPC object. By default
it's equal to -1 and old behaviour is preserved. If this variable is
non-negative, then desired idr will be extracted from it and used as a
start value to search for free IDR slot.
Notes:
1) this patch doesn't guarantee that the new object will have desired
id. So it's up to user space how to handle new object with wrong id.
2) After a sucessful id allocation attempt, "next_id" will be set back
to -1 (if it was non-negative).
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrew Morton [Thu, 25 Oct 2012 01:15:16 +0000 (12:15 +1100)]
procfs-add-vmflags-field-in-smaps-output-v4-fix
remove unneeded brakes per sfr, avoid using bloaty for_each_set_bit()
Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cyrill Gorcunov [Thu, 25 Oct 2012 01:15:16 +0000 (12:15 +1100)]
procfs: add VmFlags field in smaps output
During c/r sessions we've found that there is no way at the moment to
fetch some VMA associated flags, such as mlock() and madvise().
This leads us to a problem -- we don't know if we should call for mlock()
and/or madvise() after restore on the vma area we're bringing back to
life.
This patch intorduces a new field into "smaps" output called VmFlags,
where all set flags associated with the particular VMA is shown as two
letter mnemonics.
[ Strictly speaking for c/r we only need mlock/madvise bits but it has been
said that providing just a few flags looks somehow inconsistent. So all
flags are here now. ]
This feature is made available on CONFIG_CHECKPOINT_RESTORE=n kernels, as
other applications may start to use these fields.
The data is encoded in a somewhat awkward two letters mnemonic form, to
encourage userspace to be prepared for fields being added or removed in
the future.
[a.p.zijlstra@chello.nl: props to use for_each_set_bit]
[sfr@canb.auug.org.au: props to use array instead of struct]
[akpm@linux-foundation.org: overall redesign and simplification] Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I suggest to hide non-existent capabilities. Here is two reasons.
* It's logically and easier for using.
* It helps to checkpoint-restore capabilities of tasks, because tasks
can be restored on another kernel, where CAP_LAST_CAP is bigger.
Signed-off-by: Andrew Vagin <avagin@openvz.org> Cc: Andrew G. Morgan <morgan@kernel.org> Reviewed-by: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: Pavel Emelyanov <xemul@parallels.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Eldad Zack [Thu, 25 Oct 2012 01:15:15 +0000 (12:15 +1100)]
simple_strto*: annotate function as obsolete
Update the documentation for simple_strto* to reflect that it has been
obsoleted and advise the usage of kstrto*.
Signed-off-by: Eldad Zack <eldad@fogrefinery.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Joe Perches <joe@perches.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Rob Landley <rob@landley.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Eldad Zack [Thu, 25 Oct 2012 01:15:15 +0000 (12:15 +1100)]
kstrto*: add documentation
As Bruce Fields pointed out, kstrto* is currently lacking kerneldoc
comments. This patch adds kerneldoc comments to common variants of
kstrto*: kstrto(u)l, kstrto(u)ll and kstrto(u)int.
Signed-off-by: Eldad Zack <eldad@fogrefinery.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Joe Perches <joe@perches.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Rob Landley <rob@landley.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Namjae Jeon [Thu, 25 Oct 2012 01:15:14 +0000 (12:15 +1100)]
fat (exportfs): rebuild directory-inode if fat_dget() fails
This patch enables rebuilding of directory inodes which are not present in
the cache.This is done by traversing the disk clusters to find the
directory entry of the parent directory and using its i_pos to build the
inode.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ravishankar N <ravi.n1@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrew Morton [Thu, 25 Oct 2012 01:15:14 +0000 (12:15 +1100)]
fat-exportfs-rebuild-inode-if-ilookup-fails-fix
fix warnings/types
fs/fat/nfs.c: In function 'fat_nfs_get_inode':
fs/fat/nfs.c:68: warning: passing argument 3 of 'fat_get_blknr_offset' from incompatible pointer type
fs/fat/fat.h:218: note: expected 'sector_t *' but argument is of type 'loff_t *'
fs/fat/inode.c: In function '__fat_write_inode':
fs/fat/inode.c:630: warning: passing argument 3 of 'fat_get_blknr_offset' from incompatible pointer type
fs/fat/fat.h:218: note: expected 'sector_t *' but argument is of type 'loff_t *'
Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Amit Sahrawat <a.sahrawat@samsung.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Ravishankar N <ravi.n1@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Namjae Jeon [Thu, 25 Oct 2012 01:15:14 +0000 (12:15 +1100)]
fat (exportfs): rebuild inode if ilookup() fails
Assign i_pos to kstat->ino and re-introduce fat_encode_fh() and include
i_pos value in the file handle.Use the i_pos value to find the directory
entry of the inode and subsequently rebuild the inode if the cache lookups
fail.
Since this involves accessing the FAT media, it is better to do this only
if the 'nfs' mount option is enabled with nostale_ro. Also introduce a
helper fat_get_blknr_offset() for use in __fat_write_inode() and
fat_nfs_get_inode()
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ravishankar N <ravi.n1@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Namjae Jeon [Thu, 25 Oct 2012 01:15:14 +0000 (12:15 +1100)]
fat: modify nfs mount option
This patchset eliminates the client side ESTALE errors when a FAT
partition exported over NFS has its dentries evicted from the cache.
One of the reasons for this error is lack of permanent inode numbers on
FAT which makes it difficult to construct persistent file handles. This
can be overcome by using fat_encode_fh() that include i_pos in file
handle.
Once the i_pos is available, it is only a matter of reading the directory
entries from the disk clusters to locate the matching entry and rebuild
the corresponding inode.
We reached the conclusion support stable inode's read-only export first
after discussing with OGAWA and Bruce. And will make it writable with
some operation(unlink and rename) limitation next time.
This patch:
Provide two possible values 'stale_rw' and 'nostale_ro' for the -o nfs
mount option. The first one allows all file operations but does not
reduce ESTALE errors on memory constrained systems. The second one
eliminates ESTALE errors but mounts the filesystem as read-only. Not
specifying a value defaults to 'stale_rw'.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ravishankar N <ravi.n1@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
hfsplus: code style fixes - reworked support of extended attributes
This patch fixes code style issues:
1. Rephrase comment.
2. Fix multiline comment style.
3. The hfsplus_alloc_attr_entry() was corrected.
4. The hfsplus_unistr and hfsplus_attr_unistr structures were declared
independently.
Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
hfsplus: add on-disk layout declarations related to attributes tree
Current mainline implementation of hfsplus file system driver treats as
extended attributes only two fields (fdType and fdCreator) of user_info
field in file description record (struct hfsplus_cat_file). It is
possible to get or set only these two fields as extended attributes. But
HFS+ treats as com.apple.FinderInfo extended attribute an union of
user_info and finder_info fields as for file (struct hfsplus_cat_file) as
for folder (struct hfsplus_cat_folder). Moreover, current mainline
implementation of hfsplus file system driver doesn't support special
metadata file - attributes tree.
Mac OS X 10.4 and later support extended attributes by making use of the
HFS+ filesystem Attributes file B*-tree feature which allows for named
forks. Mac OS X supports only inline extended attributes, limiting their
size to 3802 bytes. Any regular file may have a list of extended
attributes. HFS+ supports an arbitrary number of named forks. Each
attribute is denoted by a name and the associated data. The name is a
null-terminated Unicode string. It is possible to list, to get, to set,
and to remove extended attributes from files or directories.
It exists some peculiarity during getting of extended attributes list by
means of getfattr utility. The getfattr utility expects prefix "user."
before any extended attribute's name. So, it ignores any names that don't
contained such prefix. Such behavior of getfattr utility results in
unexpected empty output of extended attributes list even in the case when
file (or folder) contains extended attributes. It needs to use empty
string as regular expression pattern for names matching (getfattr
--match="").
For support of extended attributes in HFS+:
1. It was added necessary on-disk layout declarations related to
Attributes tree into hfsplus_raw.h file.
2. It was added attributes.c file with implementation of functionality
of manipulation by records in Attributes tree.
3. It was reworked hfsplus_listxattr, hfsplus_getxattr,
hfsplus_setxattr functions in ioctl.c. Moreover, it was added
hfsplus_removexattr method.
This patch:
Add all neccessary on-disk layout declarations related to attributes
file.
Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Tested-by: Hin-Tak Leung <htl10@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Roland Stigge [Thu, 25 Oct 2012 01:15:11 +0000 (12:15 +1100)]
ARM: mach-imx: support for DryIce RTC in i.MX53
Enable support for i.MX53 in addition to i.MX25 by providing a dummy clock
on i.MX53 since this one doesn't have a separate clock for internal RTC
but the driver requests one.
Signed-off-by: Roland Stigge <stigge@antcom.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Sascha Hauer <kernel@pengutronix.de> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Roland Stigge [Thu, 25 Oct 2012 01:15:11 +0000 (12:15 +1100)]
drivers/rtc/rtc-imxdi.c: add devicetree support
Add device tree support to the rtc-imxdi driver.
Signed-off-by: Roland Stigge <stigge@antcom.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Sascha Hauer <kernel@pengutronix.de> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Roland Stigge [Thu, 25 Oct 2012 01:15:11 +0000 (12:15 +1100)]
drivers/rtc/rtc-imxdi: support for i.MX53
Enable support for i.MX53 in addition to i.MX25 by enabling the driver on
ARCH_MXC generally.
Signed-off-by: Roland Stigge <stigge@antcom.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Sascha Hauer <kernel@pengutronix.de> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Vaibhav Hiremath [Thu, 25 Oct 2012 01:15:10 +0000 (12:15 +1100)]
rtc: omap: add runtime pm support
OMAP1 RTC driver is used in multiple devices like, OMAPL138 and AM33XX.
Driver currently doesn't handle any clocks, which may be right for OMAP1
architecture but in case of AM33XX, the clock/module needs to be enabled
in order to access the registers.
So convert this driver to runtime pm, which internally handles rest.
[afzal@ti.com: handle error path] Signed-off-by: Vaibhav Hiremath <hvaibhav@ti.com> Signed-off-by: Afzal Mohammed <afzal@ti.com> Acked-by: Sekhar Nori <nsekhar@ti.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Sekhar Nori <nsekhar@ti.com> Cc: Kevin Hilman <khilman@ti.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Daniel Mack <zonque@gmail.com> Cc: Vaibhav Hiremath <hvaibhav@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Afzal Mohammed [Thu, 25 Oct 2012 01:15:10 +0000 (12:15 +1100)]
rtc: omap: depend on am33xx
rtc-omap driver can be reused for AM33xx RTC. Provide dependency in
Kconfig.
Signed-off-by: Afzal Mohammed <afzal@ti.com> Acked-by: Sekhar Nori <nsekhar@ti.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Sekhar Nori <nsekhar@ti.com> Cc: Kevin Hilman <khilman@ti.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Daniel Mack <zonque@gmail.com> Cc: Vaibhav Hiremath <hvaibhav@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Afzal Mohammed [Thu, 25 Oct 2012 01:15:10 +0000 (12:15 +1100)]
rtc: omap: dt support
Enhance rtc-omap driver with DT capability
Signed-off-by: Afzal Mohammed <afzal@ti.com> Acked-by: Sekhar Nori <nsekhar@ti.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Sekhar Nori <nsekhar@ti.com> Cc: Kevin Hilman <khilman@ti.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Daniel Mack <zonque@gmail.com> Cc: Vaibhav Hiremath <hvaibhav@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Afzal Mohammed [Thu, 25 Oct 2012 01:15:09 +0000 (12:15 +1100)]
ARM: davinci: remove rtc kicker release
rtc-omap driver is now capable of handling kicker mechanism, hence remove
kicker handling at platform level, instead provide proper device name so
that driver can handle kicker mechanism by itself
Signed-off-by: Afzal Mohammed <afzal@ti.com> Acked-by: Sekhar Nori <nsekhar@ti.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Sekhar Nori <nsekhar@ti.com> Cc: Kevin Hilman <khilman@ti.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Daniel Mack <zonque@gmail.com> Cc: Vaibhav Hiremath <hvaibhav@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Afzal Mohammed [Thu, 25 Oct 2012 01:15:09 +0000 (12:15 +1100)]
rtc: omap: kicker mechanism support
OMAP RTC IP can have kicker feature. This prevents spurious writes to
register. To write to registers kicker lock has to be released.
Procedure to do it as follows,
1. write to kick0 register, 0x83e70b13
2. write to kick1 register, 0x95a4f1e0
Writing value other than 0x83e70b13 to kick0 enables write locking, more
details about kicker mechanism can be found in section 20.3.3.5.3 of
AM335X TRM @www.ti.com/am335x
Here id table information is added and is used to distinguish those that
require kicker handling and the ones that doesn't need it. There are more
features in the newer IP's compared to legacy ones other than kicker,
which driver currently doesn't handle, supporting additional features
would be easier with the addition of id table.
Older IP (of OMAP1) doesn't have revision register as per TRM, so revision
register can't be relied always to find features, hence id table is being
used.
While at it, replace __raw_writeb/__raw_readb with writeb/readb; this
driver is used on ARMv7 (AM335X SoC)
Signed-off-by: Afzal Mohammed <afzal@ti.com> Acked-by: Sekhar Nori <nsekhar@ti.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Sekhar Nori <nsekhar@ti.com> Cc: Kevin Hilman <khilman@ti.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Daniel Mack <zonque@gmail.com> Cc: Vaibhav Hiremath <hvaibhav@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alan Cox [Thu, 25 Oct 2012 01:15:08 +0000 (12:15 +1100)]
binfmt_elf: fix corner case kfree of uninitialized data
If elf_core_dump() is called and fill_note_info() fails in the kmalloc()
then it returns 0 but has not yet initialised all the needed fields. As a
result we do a kfree(randomness) after correctly skipping the thread data.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Some comment styles in net and drivers/net are flagged inappropriately.
Avoid proclaiming inline comments like:
int a = b; /* some comment */
and block comments like:
/*********************
* some comment
********************/
are defective.
Tested with
$ cat drivers/net/t.c
/* foo */
/*
* foo
*/
/* foo
*/
/* foo
* bar */
/****************************
* some long block comment
***************************/
struct foo {
int bar; /* another test */
};
$
Signed-off-by: Joe Perches <joe@perches.com> Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This function is used by sparc, powerpc tile and arm64 for compat support.
The patch adds a generic implementation with a wrapper for PowerPC to do
the u32->int sign extension.
The reason for a single patch covering powerpc, tile, sparc and arm64 is
to keep it bisectable, otherwise kernel building may fail with mismatched
function declarations.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [for tile] Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Shevchenko [Thu, 25 Oct 2012 01:15:05 +0000 (12:15 +1100)]
lib: dynamic_debug: use kbasename()
Remove the custom implementation of the functionality similar to kbasename().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Jason Baron <jbaron@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Shevchenko [Thu, 25 Oct 2012 01:15:04 +0000 (12:15 +1100)]
string: introduce helper to get base file name from given path
There are several places in the kernel that use functionality like
basename(3) with the exception: in case of '/foo/bar/' we expect to get an
empty string. Let's do it common helper for them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Jason Baron <jbaron@redhat.com> Cc: YAMANE Toshiaki <yamanetoshi@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add 'const' to static array that was missing it in its definition. Also,
'const' is added to ili9320_write_regs(), because it is called by
vgg2432a4 driver.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add 'const' to static array that was missing it in its definition.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Eric Miao <eric.y.miao@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add 'const' to static array that was missing it in its definition.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Marek Vasut <marex@denx.de> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The mutex for accessing lp855x registers is used in case of the user-space
interaction. When the brightness is changed via sysfs, the mutex is
required. But the backlight class device already provides it. Thus, the
lp855x mutex is unnecessary.
Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com> Cc: Thierry Reding <thierry.reding@avionic-design.de> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Bryan Wu <bryan.wu@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kim, Milo [Thu, 25 Oct 2012 01:15:02 +0000 (12:15 +1100)]
drivers/video/backlight/lp855x_bl.c: use generic PWM functions
The LP855x family devices support the PWM input for the backlight control.
Period of the PWM is configurable in the platform side. Platform
specific functions are unnecessary anymore because generic PWM functions
are used inside the driver.
(PWM input mode)
To set the brightness, new lp855x_pwm_ctrl() is used.
If a PWM device is not allocated, devm_pwm_get() is called.
The PWM consumer name is from the chip name such as 'lp8550' and 'lp8556'.
To get the brightness value, no additional handling is required.
Just the value of 'props.brightness' is returned.
If the PWM driver is not ready while initializing the LP855x driver, it's
OK. The PWM device can be retrieved later, when the brightness value is
changed.
Documentation is updated with an example.
Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com> Cc: Thierry Reding <thierry.reding@avionic-design.de> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Bryan Wu <bryan.wu@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Thu, 25 Oct 2012 01:15:02 +0000 (12:15 +1100)]
backlight: tosa: use devm_gpio_request_one
By using devm_gpio_request_one it is possible to set the direction and
initial value in one shot. Thus, using devm_gpio_request_one can make the
code simpler.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Thu, 25 Oct 2012 01:15:01 +0000 (12:15 +1100)]
backlight: lms283gf05: use devm_gpio_request_one
By using devm_gpio_request_one it is possible to set the direction
and initial value in one shot. Thus, using devm_gpio_request_one
can make the code simpler.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Acked-by: Marek Vasut <marex@denx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Thu, 25 Oct 2012 01:14:59 +0000 (12:14 +1100)]
backlight: locomolcd: fix checkpatch error and warning
This patch fixes the checkpatch error and warning as below:
WARNING: line over 80 characters
WARNING: space prohibited between function name and open parenthesis '('
ERROR: trailing statements should be on next line
Also, long comments are fixed for the preferred style and
unnecessary lines are removed.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Thu, 25 Oct 2012 01:14:58 +0000 (12:14 +1100)]
backlight: ili9320: fix checkpatch error and warning
This patch fixes the checkpatch error and warning as below:
WARNING: please, no space before tabs
WARNING: please, no spaces at the start of a line
WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
WARNING: braces {} are not necessary for single statement blocks
ERROR: code indent should use tabs where possible
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Thu, 25 Oct 2012 01:14:57 +0000 (12:14 +1100)]
backlight: hp680_bl: fix checkpatch error and warning
This patch fixes the checkpatch error and warning as below:
WARNING: please, no space before tabs
WARNING: please, no spaces at the start of a line
ERROR: do not initialise statics to 0 or NULL
ERROR: code indent should use tabs where possible
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Thu, 25 Oct 2012 01:14:56 +0000 (12:14 +1100)]
backlight: da903x_bl: use dev_get_drvdata() instead of platform_get_drvdata()
dev_get_drvdata() can be used instead of platform_get_drvdata()
to make the code smaller.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Matthew Leach [Thu, 25 Oct 2012 01:14:55 +0000 (12:14 +1100)]
include/linux/init.h: use the stringify operator for the __define_initcall macro
Currently the __define_initcall() macro takes three arguments, fn, id and
level. The level argument is exactly the same as the id argument but
wrapped in quotes. To overcome this need to specify three arguments to
the __define_initcall macro, where one argument is the stringification of
another, we can just use the stringification macro instead.
Signed-off-by: Matthew Leach <matthew@mattleach.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andreas Bießmann [Thu, 25 Oct 2012 01:14:54 +0000 (12:14 +1100)]
scripts/pnmtologo: fix for plain PBM
PBM generated with current tools do not have a whitespace between the
digits. Therefore the pnmtologo tool fails to gernerate the required
C-Array for these images. This patch fixes that behaviour and can handle
both 'old style' and 'new style' PBM files.
Signed-off-by: Andreas Bießmann <andreas@biessmann.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wanpeng Li [Thu, 25 Oct 2012 01:14:54 +0000 (12:14 +1100)]
mm/memblock: reduce overhead in binary search
When checking that the indicated address belongs to the memory region, the
memory regions are checked one by one through a binary search, which will
be time consuming.
If the indicated address isn't in the memory region, then we needn't do
the time-consuming search. Add a check on the indicated address for that
purpose.
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Gavin Shan <shangw@linux.vnet.ibm.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Shaohua Li [Thu, 25 Oct 2012 01:14:53 +0000 (12:14 +1100)]
swap: add a simple detector for inappropriate swapin readahead
The swapin readahead does a blind readahead whether or not the swapin is
sequential. This is ok for harddisk because large reads have relatively
small costs and if the readahead pages are unneeded they can be reclaimed
easily. But for SSD devices large reads are more expensive than small
one. If readahead pages are unneeded, reading them in caused significant
overhead
This patch addes a simple random read detection similar to file mmap
readahead. If a random read is detected, swapin readahead will be
skipped. This improves a lot for a swap workload with random IO in a fast
SSD.
I run anonymous mmap write micro benchmark, which will triger swapin/swapout.
For both harddisk and SSD, the randwrite swap workload run time is reduced
significantly. Sequential write swap workload hasn't chanage.
Interestingly, the randwrite harddisk test is improved too. This might be
because swapin readahead needs to allocate extra memory, which further
tights memory pressure, so more swapout/swapin.
Signed-off-by: Shaohua Li <shli@fusionio.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Michal Hocko [Thu, 25 Oct 2012 01:14:53 +0000 (12:14 +1100)]
drop_caches: add some documentation and info messsge
I would like to resurrect Dave's patch. The last time it was posted was
here https://lkml.org/lkml/2010/9/16/250 and there didn't seem to be any
strong opposition.
Kosaki was worried about possible excessive logging when somebody drops
caches too often (but then he claimed he didn't have a strong opinion on
that) but I would say opposite. If somebody does that then I would really
like to know that from the log when supporting a system because it almost
for sure means that there is something fishy going on. It is also worth
mentioning that only root can write drop caches so this is not an flooding
attack vector.
I am bringing that up again because this can be really helpful when
chasing strange performance issues which (surprise surprise) turn out to
be related to artificially dropped caches done because the admin thinks
this would help...
I have just refreshed the original patch on top of the current mm tree
but I could live with KERN_INFO as well if people think that KERN_NOTICE
is too hysterical.
: From: Dave Hansen <dave@linux.vnet.ibm.com>
: Date: Fri, 12 Oct 2012 14:30:54 +0200
:
: There is plenty of anecdotal evidence and a load of blog posts
: suggesting that using "drop_caches" periodically keeps your system
: running in "tip top shape". Perhaps adding some kernel
: documentation will increase the amount of accurate data on its use.
:
: If we are not shrinking caches effectively, then we have real bugs.
: Using drop_caches will simply mask the bugs and make them harder
: to find, but certainly does not fix them, nor is it an appropriate
: "workaround" to limit the size of the caches.
:
: It's a great debugging tool, and is really handy for doing things
: like repeatable benchmark runs. So, add a bit more documentation
: about it, and add a little KERN_NOTICE. It should help developers
: who are chasing down reclaim-related bugs.
[mhocko@suse.cz: refreshed to current -mm tree] Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Lai Jiangshan [Thu, 25 Oct 2012 01:14:52 +0000 (12:14 +1100)]
slub, hotplug: ignore unrelated node's hot-adding and hot-removing
SLUB only focuses on the nodes which have normal memory and it ignores the
other node's hot-adding and hot-removing.
Aka: if some memory of a node which has no onlined memory is online, but
this new memory onlined is not normal memory (for example, highmem), we
should not allocate kmem_cache_node for SLUB.
And if the last normal memory is offlined, but the node still has memory,
we should remove kmem_cache_node for that node. (The current code delays
it when all of the memory is offlined)
So we only do something when marg->status_change_nid_normal > 0.
marg->status_change_nid is not suitable here.
The same problem doesn't exist in SLAB, because SLAB allocates kmem_list3
for every node even the node don't have normal memory, SLAB tolerates
kmem_list3 on alien nodes. SLUB only focuses on the nodes which have
normal memory, it don't tolerate alien kmem_cache_node. The patch makes
SLUB become self-compatible and avoids WARNs and BUGs in rare conditions.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Rob Landley <rob@landley.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Lai Jiangshan [Thu, 25 Oct 2012 01:14:52 +0000 (12:14 +1100)]
memory_hotplug: fix possible incorrect node_states[N_NORMAL_MEMORY]
Currently memory_hotplug only manages the node_states[N_HIGH_MEMORY], it
forgets to manage node_states[N_NORMAL_MEMORY]. This may cause
node_states[N_NORMAL_MEMORY] to become incorrect.
Example, if a node is empty before online, and we online a memory which is
in ZONE_NORMAL. And after online, node_states[N_HIGH_MEMORY] is correct,
but node_states[N_NORMAL_MEMORY] is incorrect, the online code doesn't set
the new online node to node_states[N_NORMAL_MEMORY].
The same thing will happen when offlining (the offline code doesn't clear
the node from node_states[N_NORMAL_MEMORY] when needed). Some memory
managment code depends node_states[N_NORMAL_MEMORY], so we have to fix up
the node_states[N_NORMAL_MEMORY].
We add node_states_check_changes_online() and
node_states_check_changes_offline() to detect whether
node_states[N_HIGH_MEMORY] and node_states[N_NORMAL_MEMORY] are changed
while hotpluging.
Also add @status_change_nid_normal to struct memory_notify, thus the
memory hotplug callbacks know whether the node_states[N_NORMAL_MEMORY] are
changed. (We can add a @flags and reuse @status_change_nid instead of
introducing @status_change_nid_normal, but it will add much more
complexity in memory hotplug callback in every subsystem. So introducing
@status_change_nid_normal is better and it doesn't change the sematics of
@status_change_nid)
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Rob Landley <rob@landley.net> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wen Congyang [Thu, 25 Oct 2012 01:14:51 +0000 (12:14 +1100)]
memory-hotplug: build zonelist if a zone is populated after onlining pages
After "memory-hotplug: allocate zone's pcp before onlining pages", we
build zone list before onlining pages to allocate zone's pcp. But the
zone doesn't have pages before onlining pages, and the zone is not in
zonelist, so we still need to build zonelist after onlining pages.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Len Brown <len.brown@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Lameter <cl@linux.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Michal Hocko [Thu, 25 Oct 2012 01:14:51 +0000 (12:14 +1100)]
mm: make zone_pcp_reset independent of MEMORY_HOTREMOVE
340175b7 (mm/hotplug: free zone->pageset when a zone becomes empty)
introduced zone_pcp_reset and hided it inside CONFIG_MEMORY_HOTREMOVE.
Since "memory-hotplug: allocate zone's pcp before onlining pages" the
function is also called from online_pages which is defined outside
CONFIG_MEMORY_HOTREMOVE which causes a linkage error.
The function, although not used outside of MEMORY_{HOTPLUT,HOTREMOVE},
seems like universal enough so let's keep it at its current location
and only remove the HOTREMOVE guard.
Signed-off-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: Wen Congyang <wency@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wen Congyang [Thu, 25 Oct 2012 01:14:51 +0000 (12:14 +1100)]
memory-hotplug: allocate zone's pcp before onlining pages
We use __free_page() to put a page to buddy system when onlining pages.
__free_page() will store NR_FREE_PAGES in zone's pcp.vm_stat_diff, so we
should allocate zone's pcp before onlining pages, otherwise we will lose
some free pages.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Len Brown <len.brown@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Lameter <cl@linux.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wen Congyang [Thu, 25 Oct 2012 01:14:51 +0000 (12:14 +1100)]
memory-hotplug: fix NR_FREE_PAGES mismatch
NR_FREE_PAGES will be wrong after offlining pages. We add/dec
NR_FREE_PAGES like this now:
1. move all pages in buddy system to MIGRATE_ISOLATE, and dec NR_FREE_PAGES
2. don't add NR_FREE_PAGES when it is freed and the migratetype is
MIGRATE_ISOLATE
3. dec NR_FREE_PAGES when offlining isolated pages.
4. add NR_FREE_PAGES when undoing isolate pages.
When we come to step 3, all pages are in MIGRATE_ISOLATE list, and
NR_FREE_PAGES are right. When we come to step4, all pages are not in
buddy system, so we don't change NR_FREE_PAGES in this step, but we change
NR_FREE_PAGES in step3. So NR_FREE_PAGES is wrong after offlining pages.
So there is no need to change NR_FREE_PAGES in step3.
This patch also fixs a problem in step2: if the migratetype is
MIGRATE_ISOLATE, we should not add NR_FRR_PAGES when we remove pages from
pcppages.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Len Brown <len.brown@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Lameter <cl@linux.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wen Congyang [Thu, 25 Oct 2012 01:14:50 +0000 (12:14 +1100)]
memory-hotplug: auto offline page_cgroup when onlining memory block failed
When a memory block is onlined, we will try allocate memory on that node
to store page_cgroup. If onlining the memory block failed, we don't
offline the page cgroup, and we have no chance to offline this page cgroup
unless the memory block is onlined successfully again. It will cause that
we can't hot-remove the memory device on that node, because some memory is
used to store page cgroup. If onlining the memory block is failed, there
is no need to stort page cgroup for this memory. So auto offline
page_cgroup when onlining memory block failed.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Len Brown <len.brown@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Lameter <cl@linux.com> Cc: Minchan Kim <minchan.kim@gmail.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>