git.karo-electronics.de Git - karo-tx-linux.git/log

Merge branch 'akpm/master'

linux/compiler.h: add __must_hold macro for functions called with a lock held

linux/compiler.h has macros to denote functions that acquire or release
locks, but not to denote functions called with a lock held that return
with the lock still held. Add a __must_hold macro to cover that case.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Reported-by: Ed Cashin <ecashin@coraid.com>
Tested-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

dma-debug: new interfaces to debug dma mapping errors

Add dma-debug interface debug_dma_mapping_error() to debug drivers that
fail to check dma mapping errors on addresses returned by dma_map_single()
and dma_map_page() interfaces.  This interface clears a flag set by
debug_dma_map_page() to indicate that dma_mapping_error() has been called
by the driver.  When the driver does an unmap, debug_dma_unmap() checks the
flag and, if it is still set, prints a warning message that includes the
call trace leading up to the unmap.  This interface can be called from
dma_mapping_error() routines to enable dma mapping error check debugging.
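
The flag protocol described above can be sketched as a small userspace
model.  This is only an illustration, not the kernel code: the class names,
the map_err_not_checked field and the warning text are hypothetical, and the
real implementation is C and also records a call trace.

```python
class DmaDebugEntry:
    """One tracked mapping (toy model; field name is hypothetical)."""

    def __init__(self, addr):
        self.addr = addr
        # Set at map time; cleared once the driver checks for an error.
        self.map_err_not_checked = True


class DmaDebugModel:
    """Userspace model of the set / clear / check-on-unmap protocol."""

    def __init__(self):
        self.entries = {}
        self.warnings = []

    def map_page(self, addr):
        # Models debug_dma_map_page(): remember the mapping with the flag set.
        self.entries[addr] = DmaDebugEntry(addr)
        return addr

    def mapping_error(self, addr):
        # Models debug_dma_mapping_error(): the driver did check, clear the flag.
        entry = self.entries.get(addr)
        if entry is not None:
            entry.map_err_not_checked = False

    def unmap(self, addr):
        # Models debug_dma_unmap(): warn if the flag is still set.
        entry = self.entries.pop(addr, None)
        if entry is not None and entry.map_err_not_checked:
            self.warnings.append(
                f"unmap of {addr:#x} without a dma_mapping_error() check"
            )
```

A driver that maps, checks and unmaps produces no warning in this model; one
that maps and unmaps without checking produces exactly one.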

Tested: Intel iommu and swiotlb (iommu=soft) on x86-64 with
        CONFIG_DMA_API_DEBUG enabled and disabled.

Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ipc/sem.c: alternatives to preempt_disable()

ipc/sem.c uses a custom wakeup scheme that relies on preempt_disable().
On -RT, this causes increased latencies and debug warnings.

The patch adds two additional schemes:
- one built around a completion - could be better for -RT kernels
- one built around a spinlock - unfortunately it's broken
- and the current one

My preferred solution would be the spinlock implementation: RT would use
preemptible spinlocks, mainline normal spinlocks. Thus both get the
optimal implementation without any special code in ipc/sem.c.
Unfortunately, I don't see how it could be fixed.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

proc: don't show nonexistent capabilities

Without this patch it is really hard to interpret a bounding set if
CAP_LAST_CAP is unknown for the current kernel.

Non-existent capabilities cannot be deleted from a bounding set with the
help of prctl.

E.g.: Here are two examples without/with this patch.
CapBnd: ffffffe0fdecffff
CapBnd: 00000000fdecffff

I suggest hiding non-existent capabilities.  Here are two reasons.
* It is logical and easier to use.
* It helps to checkpoint-restore capabilities of tasks, because tasks
can be restored on another kernel, where CAP_LAST_CAP is bigger.
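
The effect on the CapBnd line can be reproduced with a simple mask.  This is
a sketch only: the helper name is hypothetical, and CAP_LAST_CAP = 36 is an
assumption chosen because it is consistent with the two example values
above, not a value stated in this commit message.

```python
CAP_LAST_CAP = 36  # assumption: highest capability number known to the kernel


def visible_caps(raw_mask, cap_last_cap=CAP_LAST_CAP):
    """Clear every bit above cap_last_cap, i.e. hide non-existent capabilities."""
    valid_bits = (1 << (cap_last_cap + 1)) - 1
    return raw_mask & valid_bits


# The "without patch" value from the example, masked to existing capabilities.
print(f"CapBnd: {visible_caps(0xffffffe0fdecffff):016x}")
```

With this assumption the masked value matches the "with patch" line of the
example.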

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Cc: Andrew G. Morgan <morgan@kernel.org>
Reviewed-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

simple_strto*: annotate function as obsolete

Update the documentation for simple_strto* to reflect that it has been
obsoleted and advise the usage of kstrto*.

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Joe Perches <joe@perches.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Rob Landley <rob@landley.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kstrto*: add documentation

As Bruce Fields pointed out, kstrto* is currently lacking kerneldoc
comments. This patch adds kerneldoc comments to common variants of
kstrto*: kstrto(u)l, kstrto(u)ll and kstrto(u)int.
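
As a rough illustration of why the strict kstrto* family is preferred over
simple_strto*, the following userspace sketch contrasts the two parsing
styles.  It handles unsigned decimal input only, the function names are
hypothetical, and the rules modelled (optional leading plus sign, at most one
trailing newline, -EINVAL on garbage, -ERANGE on overflow) are a simplified
reading of the kstrto* behaviour rather than a copy of the kernel code.

```python
import errno

DIGITS = "0123456789"


def strict_parse_uint(text, bits=32):
    """kstrto*-style: return (0, value) on success or (-errno, None) on error."""
    s = text[1:] if text.startswith("+") else text
    if s.endswith("\n"):
        s = s[:-1]  # a single trailing newline is tolerated
    if not s or any(c not in DIGITS for c in s):
        return -errno.EINVAL, None  # empty input or trailing garbage
    value = int(s)
    if value >= 1 << bits:
        return -errno.ERANGE, None  # does not fit the target type
    return 0, value


def lenient_parse_uint(text):
    """simple_strto*-style: consume leading digits, silently ignore the rest."""
    digits = ""
    for c in text:
        if c not in DIGITS:
            break
        digits += c
    return int(digits) if digits else 0
```

The lenient variant accepts "42abc" as 42 without any error indication,
which is the kind of silent acceptance the strict variant rejects.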

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Joe Perches <joe@perches.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Rob Landley <rob@landley.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Documentation: update nfs option in filesystem/vfat.txt

Update nfs option in filesystem/vfat.txt

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fat (exportfs): rebuild directory-inode if fat_dget() fails

This patch enables rebuilding of directory inodes which are not present in
the cache.  This is done by traversing the disk clusters to find the
directory entry of the parent directory and using its i_pos to build the
inode.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fat-exportfs-rebuild-inode-if-ilookup-fails-fix

fix warnings/types

fs/fat/nfs.c: In function 'fat_nfs_get_inode':
fs/fat/nfs.c:68: warning: passing argument 3 of 'fat_get_blknr_offset' from incompatible pointer type
fs/fat/fat.h:218: note: expected 'sector_t *' but argument is of type 'loff_t *'
fs/fat/inode.c: In function '__fat_write_inode':
fs/fat/inode.c:630: warning: passing argument 3 of 'fat_get_blknr_offset' from incompatible pointer type
fs/fat/fat.h:218: note: expected 'sector_t *' but argument is of type 'loff_t *'

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fat (exportfs): rebuild inode if ilookup() fails

Assign i_pos to kstat->ino, re-introduce fat_encode_fh() and include the
i_pos value in the file handle.  Use the i_pos value to find the directory
entry of the inode and subsequently rebuild the inode if the cache lookups
fail.

Since this involves accessing the FAT media, it is better to do this only
if the 'nfs' mount option is enabled with nostale_ro.  Also introduce a
helper fat_get_blknr_offset() for use in __fat_write_inode() and
fat_nfs_get_inode().
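
A helper such as fat_get_blknr_offset() presumably splits i_pos into the
block that holds the directory entry and the entry's index within that
block.  The sketch below shows that arithmetic in userspace; the function
name, the shift-and-mask formulation and the example geometry (512-byte
blocks with 32-byte directory entries, so 16 entries per block) are
assumptions for illustration, not taken from the patch.

```python
def split_i_pos(i_pos, dir_per_block_bits):
    """Split i_pos into (block number, entry index within that block)."""
    dir_per_block = 1 << dir_per_block_bits
    blknr = i_pos >> dir_per_block_bits      # which block holds the entry
    offset = i_pos & (dir_per_block - 1)     # which entry inside the block
    return blknr, offset


# Assumed geometry: 16 directory entries per block -> 4 bits.
blknr, offset = split_i_pos(0x1234, 4)
print(hex(blknr), offset)
```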

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fat: modify nfs mount option

This patchset eliminates the client side ESTALE errors when a FAT
partition exported over NFS has its dentries evicted from the cache.

One of the reasons for this error is the lack of permanent inode numbers on
FAT, which makes it difficult to construct persistent file handles.  This
can be overcome by using fat_encode_fh(), which includes i_pos in the file
handle.

Once the i_pos is available, it is only a matter of reading the directory
entries from the disk clusters to locate the matching entry and rebuild
the corresponding inode.

After discussing with OGAWA and Bruce, we concluded that read-only export
with stable inodes should be supported first.  Writable export, with some
limitations on operations (unlink and rename), will follow later.

This patch:

Provide two possible values 'stale_rw' and 'nostale_ro' for the -o nfs
mount option.  The first one allows all file operations but does not
reduce ESTALE errors on memory constrained systems.  The second one
eliminates ESTALE errors but mounts the filesystem as read-only.  Not
specifying a value defaults to 'stale_rw'.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

hfsplus: code style fixes - reworked support of extended attributes

This patch fixes code style issues:
1. Rephrase comment.
2. Fix multiline comment style.
3. The hfsplus_alloc_attr_entry() was corrected.
4. The hfsplus_unistr and hfsplus_attr_unistr structures were declared
independently.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

hfsplus-add-support-of-manipulation-by-attributes-file-checkpatch-fixes

WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
#280: FILE: fs/hfsplus/btree.c:103:
+ printk(KERN_ERR "hfs: invalid attributes max_key_len %d\n",

WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
#589: FILE: fs/hfsplus/inode.c:353:
+ printk(KERN_ERR "hfs: sync non-existent attributes tree\n");

WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
#643: FILE: fs/hfsplus/super.c:482:
+ printk(KERN_ERR "hfs: failed to load attributes file\n");

ERROR: code indent should use tabs where possible
#686: FILE: fs/hfsplus/super.c:660:
+ ^Ireturn err;$

WARNING: please, no space before tabs
#686: FILE: fs/hfsplus/super.c:660:
+ ^Ireturn err;$

WARNING: please, no spaces at the start of a line
#686: FILE: fs/hfsplus/super.c:660:
+ ^Ireturn err;$

total: 1 errors, 5 warnings, 616 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/hfsplus-add-support-of-manipulation-by-attributes-file.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

hfsplus: add support of manipulation by attributes file

Add support of manipulation by attributes file.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Tested-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

hfsplus: rework functionality of getting, setting and deleting of extended attributes

Rework functionality of getting, setting and deleting of extended
attributes.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Tested-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

hfsplus: add functionality of manipulating by records in attributes tree

Add functionality of manipulating by records in attributes tree.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Tested-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

hfsplus: add on-disk layout declarations related to attributes tree

The current mainline hfsplus file system driver treats only two fields
(fdType and fdCreator) of the user_info field in the file description
record (struct hfsplus_cat_file) as extended attributes.  Only these two
fields can be read or set as extended attributes.  HFS+ itself, however,
treats the union of the user_info and finder_info fields as the
com.apple.FinderInfo extended attribute, both for files (struct
hfsplus_cat_file) and for folders (struct hfsplus_cat_folder).  Moreover,
the current mainline driver doesn't support the special metadata file,
the attributes tree.

Mac OS X 10.4 and later support extended attributes by making use of the
HFS+ filesystem Attributes file B*-tree feature which allows for named
forks.  Mac OS X supports only inline extended attributes, limiting their
size to 3802 bytes.  Any regular file may have a list of extended
attributes.  HFS+ supports an arbitrary number of named forks.  Each
attribute is denoted by a name and the associated data.  The name is a
null-terminated Unicode string.  It is possible to list, to get, to set,
and to remove extended attributes from files or directories.

There is a peculiarity when listing extended attributes with the getfattr
utility.  getfattr expects the prefix "user." before any extended
attribute name, so it ignores any names that don't contain that prefix.
As a result, getfattr can print an unexpectedly empty attribute list even
when the file (or folder) has extended attributes.  To list them, use an
empty string as the regular expression pattern for name matching (getfattr
--match="").

For support of extended attributes in HFS+:

1. The necessary on-disk layout declarations related to the Attributes
   tree were added to the hfsplus_raw.h file.

2. An attributes.c file was added, implementing manipulation of records
   in the Attributes tree.

3. The hfsplus_listxattr, hfsplus_getxattr and hfsplus_setxattr
   functions in ioctl.c were reworked.  Moreover, a hfsplus_removexattr
   method was added.

This patch:

Add all necessary on-disk layout declarations related to the attributes
file.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Tested-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers-firmware-dmi_scanc-fetch-dmi-version-from-smbios-if-it-exists-checkpatch-fixes

WARNING: Prefer pr_info(... to printk(KERN_INFO, ...
#56: FILE: drivers/firmware/dmi_scan.c:426:
+ printk(KERN_INFO "SMBIOS %d.%d present.\n",

WARNING: Prefer pr_info(... to printk(KERN_INFO, ...
#61: FILE: drivers/firmware/dmi_scan.c:431:
+ printk(KERN_INFO "Legacy DMI %d.%d present.\n",

WARNING: Prefer pr_debug(... to printk(KERN_DEBUG, ...
#85: FILE: drivers/firmware/dmi_scan.c:455:
+ printk(KERN_DEBUG "SMBIOS version fixup(2.%d->2.%d)\n",

WARNING: Prefer pr_debug(... to printk(KERN_DEBUG, ...
#90: FILE: drivers/firmware/dmi_scan.c:460:
+ printk(KERN_DEBUG "SMBIOS version fixup(2.%d->2.%d)\n",

total: 0 errors, 4 warnings, 104 lines checked

./patches/drivers-firmware-dmi_scanc-fetch-dmi-version-from-smbios-if-it-exists.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Feng Jin <joe.jin@oracle.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/firmware/dmi_scan.c: fetch dmi version from SMBIOS if it exists

The right dmi version is in SMBIOS if it's zero in the DMI region.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Cc: Feng Jin <joe.jin@oracle.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers-firmware-dmi_scanc-check-dmi-version-when-get-system-uuid-fix

tweak code comment

Cc: Feng Jin <joe.jin@oracle.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/firmware/dmi_scan.c: check dmi version when get system uuid

As of version 2.6 of the SMBIOS specification, the first 3
fields of the UUID are supposed to be little-endian encoded.
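
The byte-order rule can be illustrated with Python's uuid module, whose
bytes_le constructor treats the first three fields as little-endian.  This
is a userspace sketch only; the function name and the (major, minor) version
tuple are hypothetical and not part of the kernel patch.

```python
import uuid


def format_dmi_uuid(raw, smbios_version):
    """Format 16 raw UUID bytes according to the SMBIOS version.

    From version 2.6 on, the first three fields are little-endian encoded;
    for older versions the bytes are taken in order.
    """
    if len(raw) != 16:
        raise ValueError("a UUID is exactly 16 bytes")
    if smbios_version >= (2, 6):
        return str(uuid.UUID(bytes_le=bytes(raw)))
    return str(uuid.UUID(bytes=bytes(raw)))


raw = bytes(range(16))
print(format_dmi_uuid(raw, (2, 4)))
print(format_dmi_uuid(raw, (2, 6)))
```

The same 16 bytes therefore print differently depending on the version,
which is why the DMI version has to be checked before formatting.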

Also a minor fix to match variable meaning and mute checkpatch.pl

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Cc: Feng Jin <joe.jin@oracle.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

compat: generic compat_sys_sched_rr_get_interval implementation

This function is used by sparc, powerpc, tile and arm64 for compat support.
The patch adds a generic implementation with a wrapper for PowerPC to do
the u32->int sign extension.
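
The u32->int sign extension mentioned above amounts to reinterpreting an
unsigned 32-bit value as a signed one.  A minimal userspace sketch of that
arithmetic (the function name is hypothetical):

```python
def sign_extend_u32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:          # sign bit set: value is negative
        return value - (1 << 32)
    return value


print(sign_extend_u32(0xFFFFFFFF))  # the u32 bit pattern of int -1
```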

The reason for a single patch covering powerpc, tile, sparc and arm64 is
to keep it bisectable, otherwise kernel building may fail with mismatched
function declarations.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [for tile]
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/memblock: reduce overhead in binary search

When checking that the indicated address belongs to the memory region, the
memory regions are checked one by one through a binary search, which will
be time consuming.

If the indicated address isn't in the memory region, then we needn't do
the time-consuming search. Add a check on the indicated address for that
purpose.
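
The idea can be sketched as a range check in front of a binary search over
sorted, non-overlapping regions.  This is a userspace illustration under
that assumption; the function name and the (base, size) tuple layout are
hypothetical and the kernel code differs in detail.

```python
def region_index(regions, addr):
    """Return the index of the region containing addr, or -1.

    regions is a sorted, non-overlapping list of (base, size) tuples.
    """
    if not regions:
        return -1
    first_base = regions[0][0]
    last_base, last_size = regions[-1]
    # Early exit: addr lies outside the span covered by all regions.
    if addr < first_base or addr >= last_base + last_size:
        return -1

    left, right = 0, len(regions)
    while left < right:
        mid = (left + right) // 2
        base, size = regions[mid]
        if addr < base:
            right = mid
        elif addr >= base + size:
            left = mid + 1
        else:
            return mid
    return -1
```

Addresses that fall in a gap between two regions still go through the
search; only addresses below the first or above the last region are
rejected up front.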

Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Gavin Shan <shangw@linux.vnet.ibm.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

swap-add-a-simple-detector-for-inappropriate-swapin-readahead-fix

tweak code comment

Cc: Hugh Dickins <hughd@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

swap: add a simple detector for inappropriate swapin readahead

Swapin readahead does a blind readahead whether or not the swapin is
sequential.  This is ok for harddisks because large reads have relatively
small costs and if the readahead pages are unneeded they can be reclaimed
easily.  But for SSD devices large reads are more expensive than small
ones.  If the readahead pages are unneeded, reading them in causes
significant overhead.

This patch adds a simple random read detection similar to file mmap
readahead.  If a random read is detected, swapin readahead will be
skipped.  This considerably improves a swap workload with random IO on a
fast SSD.

I ran an anonymous mmap write micro benchmark, which triggers swapin/swapout.

Runtime changes with the patch:

workload    device     runtime change
randwrite   harddisk   -38.7%
seqwrite    harddisk    -1.1%
randwrite   SSD        -46.9%
seqwrite    SSD         +0.3%

For both harddisk and SSD, the randwrite swap workload run time is reduced
significantly.  The sequential write swap workload is unchanged.

Interestingly, the randwrite harddisk test is improved too.  This might be
because swapin readahead needs to allocate extra memory, which further
tightens memory pressure and so causes more swapout/swapin.

Signed-off-by: Shaohua Li <shli@fusionio.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm, memcg: make mem_cgroup_out_of_memory() static

mem_cgroup_out_of_memory() is only referenced from within file scope, so
it can be marked static.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memory hotplug: suppress "Device nodeX does not have a release() function" warning

When calling unregister_node(), the function shows the following message
at device_release().

"Device 'node2' does not have a release() function, it is broken and must
be fixed."

The reason is that the node's device struct does not have a release()
function.

So the patch registers node_device_release() as the device's release()
function to suppress the warning message.  Additionally, the patch adds a
memset() to register_node() to initialize the node struct.  The node struct
is part of the node_devices[] array and cannot be freed by
node_device_release(), so if the system reuses the node struct it would
otherwise contain garbage.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memory hotplug: suppress "Device memoryX does not have a release() function" warning

When calling remove_memory_block(), the function shows the following
message at device_release().

"Device 'memory528' does not have a release() function, it is broken and
must be fixed."

The reason is that memory_block's device struct does not have a release()
function.

So the patch registers memory_block_release() to the device's release()
function for suppressing the warning message. Additionally, the patch
moves kfree(mem) into the release function since the release function is
prepared as a means to free a memory_block struct.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: show migration types in show_mem

This is useful to diagnose the reason for page allocation failure for
cases where there appear to be several free pages.

Example, with this alloc_pages(GFP_ATOMIC) failure:

swapper/0: page allocation failure: order:0, mode:0x0
...
Mem-info:
Normal per-cpu:
CPU    0: hi:   90, btch:  15 usd:  48
CPU    1: hi:   90, btch:  15 usd:  21
active_anon:0 inactive_anon:0 isolated_anon:0
  active_file:0 inactive_file:84 isolated_file:0
  unevictable:0 dirty:0 writeback:0 unstable:0
  free:4026 slab_reclaimable:75 slab_unreclaimable:484
  mapped:0 shmem:0 pagetables:0 bounce:0
Normal free:16104kB min:2296kB low:2868kB high:3444kB active_anon:0kB
inactive_anon:0kB active_file:0kB inactive_file:336kB unevictable:0kB
isolated(anon):0kB isolated(file):0kB present:331776kB mlocked:0kB
dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:300kB
slab_unreclaimable:1936kB kernel_stack:328kB pagetables:0kB unstable:0kB
bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0

Before the patch, it's hard (for me, at least) to say why all these free
chunks weren't considered for allocation:

Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB 1*512kB
1*1024kB 1*2048kB 3*4096kB = 16128kB

After the patch, it's obvious that the reason is that all of these are
in the MIGRATE_CMA (C) freelist:

Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB (C) 1*512kB
(C) 1*1024kB (C) 1*2048kB (C) 3*4096kB (C) = 16128kB

Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

writeback: remove nr_pages_dirtied arg from balance_dirty_pages_ratelimited_nr()

There is no reason to pass the nr_pages_dirtied argument, because
nr_pages_dirtied value from the caller is unused in
balance_dirty_pages_ratelimited_nr().

Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/page_alloc.c: remove duplicate check

While allocating pages using buddy allocator, the compound page is
probably split up to free pages.  Under these circumstances, the compound
page should be destroyed by destroy_compound_page().  However, there is a
duplicate check to judge if the page is compound.

Remove the duplicate check since the compound_order() returns 0 when the
page doesn't have PG_head set in destroy_compound_page().  That is to say,
destroy_compound_page() needn't check PageHead().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fs/block_dev.c: no need to check inode->i_bdev in bd_forget()

Its only caller evict() has promised a non-NULL inode->i_bdev.

Signed-off-by: Yan Hong <clouds.yan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fs: change return values from -EACCES to -EPERM

According to SUSv3:

[EACCES] Permission denied. An attempt was made to access a file in a way
forbidden by its file access permissions.

[EPERM] Operation not permitted. An attempt was made to perform an operation
limited to processes with appropriate privileges or to the owner of a file
or other resource.

So -EPERM should be returned if a capability check fails.

Strictly speaking this is an API change since the error code user sees is

Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

vfs: increment iversion when a file is truncated

When a file is truncated with truncate()/ftruncate() and then closed,
iversion is not updated.  This patch uses ATTR_SIZE flag as an indication
to increment iversion.

Mimi said:

On fput(), i_version is used to detect and flag files that have changed
and need to be re-measured in the IMA measurement policy.  When a file
is truncated with truncate()/ftruncate() and then closed, i_version is
not updated.  As a result, although the file has changed, it will not be
re-measured and added to the IMA measurement list on subsequent access.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drbd: use copy_highpage

Use copy_highpage() to copy from one page to another.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

block: partition: msdos: provide UUIDs for partitions

The MSDOS/MBR partition table includes a 32-bit unique ID, often referred
to as the NT disk signature. When combined with a partition number within
the table, this can form a unique ID similar in concept to EFI/GPT's
partition UUID. Constructing and recording this value in struct
partition_meta_info allows MSDOS partitions to be referred to on the
kernel command-line using the following syntax:

root=PARTUUID=0002dd75-01
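
How such an identifier could be derived from a raw MBR sector is sketched
below.  The function name is hypothetical; the sketch assumes the NT disk
signature is the little-endian 32-bit value at byte offset 440 of the MBR
and that the ID is the signature as eight hex digits, a dash, and the
partition number as two hex digits, which matches the example above.

```python
import struct

MBR_DISK_SIGNATURE_OFFSET = 440  # assumption: location of the NT disk signature


def msdos_partuuid(mbr_sector, partition_number):
    """Build an 'SSSSSSSS-PP' identifier from an MBR sector."""
    (signature,) = struct.unpack_from(
        "<I", mbr_sector, MBR_DISK_SIGNATURE_OFFSET
    )
    return f"{signature:08x}-{partition_number:02x}"


# Build a fake 512-byte MBR carrying the signature from the example.
sector = bytearray(512)
sector[440:444] = (0x0002DD75).to_bytes(4, "little")
print("root=PARTUUID=" + msdos_partuuid(bytes(sector), 1))
```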

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

init: reduce PARTUUID min length to 1 from 36

Reduce the minimum length for a root=PARTUUID= parameter to be considered
valid from 36 to 1.  EFI/GPT partition UUIDs are always exactly 36
characters long, hence the previous limit.  However, the next patch will
support DOS/MBR UUIDs too, which have a different, shorter, format.
Instead of validating any particular length, just ensure that at least
some non-empty value was given by the user.

Also, consider a missing UUID value to be a parsing error, in the same
vein as if /PARTNROFF exists and can't be parsed.  As such, make both
error cases print a message and disable rootwait.  Convert to pr_err while
we're at it.
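
The validation rules described above can be sketched in userspace as
follows.  The function name and the error messages are hypothetical; the
sketch only models "non-empty UUID, optional /PARTNROFF=<number> that must
parse" and leaves the input string untouched.

```python
import re


def parse_partuuid(arg):
    """Parse '<uuid>' or '<uuid>/PARTNROFF=<n>' into (uuid, offset).

    Raises ValueError for an empty UUID or an unparsable /PARTNROFF option.
    """
    uuid_part, slash, rest = arg.partition("/")
    offset = 0
    if slash:
        match = re.fullmatch(r"PARTNROFF=(-?\d+)", rest)
        if match is None:
            raise ValueError("cannot parse /PARTNROFF option")
        offset = int(match.group(1))
    if not uuid_part:
        # No particular length is required, only a non-empty value.
        raise ValueError("missing UUID value")
    return uuid_part, offset


print(parse_partuuid("0002dd75-01/PARTNROFF=1"))
```

Both a full 36-character EFI/GPT UUID and a short DOS/MBR style value pass
the same check, since only emptiness is rejected.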

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

block: store partition_meta_info.uuid as a string

This will allow other types of UUID to be stored here, aside from true
UUIDs.  This also simplifies code that uses this field, since it is usually
constructed from, used as, or compared to other strings.

Note: A simplistic approach here would be to set uuid_str[36]=0 whenever a
/PARTNROFF option was found to be present.  However, this modifies the
input string, and causes subsequent calls to devt_from_partuuid() not to
see the /PARTNROFF option, which causes different results.  In order to
avoid misleading future maintainers, this parameter is marked const.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

cciss: use check_signature()

Use check_signature() to find a signature in the mmio address.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

cciss: cleanup bitops usage

- Remove unnecessary correction of bit and address
- Use BITS_TO_LONGS macro to calculate bitmap size
- Use bitmap_zero()
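
For readers unfamiliar with the helpers named above, the sizing arithmetic
behind BITS_TO_LONGS is a round-up division.  The sketch below models it in
userspace; the function names are hypothetical and BITS_PER_LONG = 64 is an
assumption (it is 32 on 32-bit kernels).

```python
BITS_PER_LONG = 64  # assumption: 64-bit kernel


def bits_to_longs(nbits, bits_per_long=BITS_PER_LONG):
    """Number of longs needed to hold nbits bits (round-up division)."""
    return (nbits + bits_per_long - 1) // bits_per_long


def zeroed_bitmap(nbits):
    """Model of allocating a bitmap and clearing it, as bitmap_zero() would."""
    return [0] * bits_to_longs(nbits)


print(bits_to_longs(65), zeroed_bitmap(65))
```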

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

h8300: select generic atomic64_t support

Rationales from Eric:

So I just looked a little deeper and it appears architectures that do
not support atomic64_t are broken.

The generic atomic64 support came in 2009 to support the perf subsystem
with the expectation that all architectures would implement atomic64
support.

Furthermore upon inspection of the kernel atomic64_t is used in a fair
number of places beyond the performance counters:

block/blk-cgroup.c
drivers/acpi/apei/
drivers/block/rbd.c
drivers/crypto/nx/nx.h
drivers/gpu/drm/radeon/radeon.h
drivers/infiniband/hw/ipath/
drivers/infiniband/hw/qib/
drivers/staging/octeon/
fs/xfs/
include/linux/perf_event.h
include/net/netfilter/nf_conntrack_acct.h
kernel/events/
kernel/trace/
net/mac80211/key.h
net/rds/

The block control group, infiniband, xfs, crypto, 802.11, netfilter.
Nothing quite so fundamental as fs/namespace.c but definitely in
multiplatform-code that should work, and is already broken on those
architectures.

Looking at the implementation of atomic64_add_return in lib/atomic64.c the
code looks as efficient as these kinds of things get.

Which leads me to the conclusion that we need atomic64 support on all
architectures.

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

time: don't inline EXPORT_SYMBOL functions

How is the compiler even handling exported functions that are marked
inline? Anyway, these shouldn't be inline because of that, so remove that
marking.

Based on a larger patch by Mark Charlebois to get LLVM to build the
kernel.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Charlebois <mcharleb@qualcomm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: hank <pyu@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

timeconst.pl: remove deprecated defined(@array)

The use of defined() on arrays and hashes has been deprecated since perl
5.6, but until 5.17.6 it only warned on lexicals, not package globals.

Signed-off-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drm: fix radeon printk format warnings

Fix printk format warnings in gpu/drm/radeon/:

drivers/gpu/drm/radeon/radeon_atpx_handler.c:151:3: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'size_t'
drivers/gpu/drm/radeon/radeon_acpi.c:204:3: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'size_t'
drivers/gpu/drm/radeon/radeon_acpi.c:488:3: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'size_t'
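
The commit text does not show the fix itself. The usual remedy for this class of warning is the 'z' length modifier, sketched below in userspace C with snprintf(); the helper name and message text are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t portably: %zu matches size_t on both 32-bit and 64-bit builds. */
static int sketch_format_size(char *buf, size_t buflen, size_t value)
{
	return snprintf(buf, buflen, "buffer size: %zu", value);
}
```

Using '%lu' only happens to work where size_t and unsigned long are the same type, which is why the warning shows up on some architectures and not others.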

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drm/i915: optimize DIV_ROUND_CLOSEST() call

DIV_ROUND_CLOSEST is faster if the compiler knows it will only be dealing
with unsigned dividends.
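
The difference can be sketched with two plain functions. These are hypothetical illustrations rather than the kernel macro, and both assume a positive divisor: the unsigned variant needs no sign test, while the signed variant has to branch on the sign of the dividend.

```c
/* Rounding division for unsigned operands: no sign check is needed. */
static unsigned int sketch_div_round_closest_u(unsigned int x, unsigned int d)
{
	return (x + d / 2) / d;
}

/* Rounding division for a signed dividend: negative values round away from zero. */
static int sketch_div_round_closest_s(int x, int d)
{
	return (x >= 0) ? (x + d / 2) / d : (x - d / 2) / d;
}
```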

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

pcmcia: move unbind/rebind into dev_pm_ops.complete

Move the device rebind procedures for cardbus devices from the pm.resume
into the pm.complete callback.

The reason for moving the code is: "[...] The PM code needs to send
suspend and resume messages to every device in the right order, and it
can't do that if new devices are being added at the same time.  [...]"

However the situation really isn't quite that rigid.  In particular,
adding new children during a resume callback shouldn't cause much of a
problem because the children don't need to be resumed anyway (since they
were never suspended).  On the other hand, if you do it you will get a
dev_warn() from the PM core, something like 'parent should not be
sleeping'.

Still, it is considered bad form and should be avoided if possible."

(Alan Stern's full comment about the topic can
be found here: <https://lkml.org/lkml/2012/7/10/254>)

Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Greg KH <greg@kroah.com>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fs/debugfs: remove unnecessary inode->i_private initialization

inode->i_private is promised to be NULL on allocation, no need to set it
explicitly.

Signed-off-by: Yan Hong <clouds.yan@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

olpc: fix olpc-xo1-sci.c build errors

Fix build errors when CONFIG_INPUT=m. This is not pretty, but all of the
OLPC kconfig options are bool instead of tristate.

arch/x86/built-in.o: In function `send_lid_state':
olpc-xo1-sci.c:(.text+0x1d323): undefined reference to `input_event'
olpc-xo1-sci.c:(.text+0x1d338): undefined reference to `input_event'
arch/x86/built-in.o: In function `free_ebook_switch':
olpc-xo1-sci.c:(.text+0x1d529): undefined reference to `input_unregister_device'
olpc-xo1-sci.c:(.text+0x1d533): undefined reference to `input_free_device'
arch/x86/built-in.o: In function `free_power_button':
olpc-xo1-sci.c:(.text+0x1d549): undefined reference to `input_unregister_device'
olpc-xo1-sci.c:(.text+0x1d553): undefined reference to `input_free_device'
arch/x86/built-in.o: In function `send_ebook_state':
olpc-xo1-sci.c:(.text+0x1d632): undefined reference to `input_event'
olpc-xo1-sci.c:(.text+0x1d647): undefined reference to `input_event'
arch/x86/built-in.o: In function `xo1_sci_intr':
olpc-xo1-sci.c:(.text+0x1d78e): undefined reference to `input_event'
olpc-xo1-sci.c:(.text+0x1d7a3): undefined reference to `input_event'
olpc-xo1-sci.c:(.text+0x1d7be): undefined reference to `input_event'
arch/x86/built-in.o:olpc-xo1-sci.c:(.text+0x1d7d3): more undefined references to `input_event' follow
arch/x86/built-in.o: In function `free_lid_switch':
olpc-xo1-sci.c:(.text+0x1d7fd): undefined reference to `input_unregister_device'
olpc-xo1-sci.c:(.text+0x1d807): undefined reference to `input_free_device'
arch/x86/built-in.o: In function `setup_lid_switch':
olpc-xo1-sci.c:(.devinit.text+0x155): undefined reference to `input_allocate_device'
olpc-xo1-sci.c:(.devinit.text+0x1a4): undefined reference to `input_register_device'
olpc-xo1-sci.c:(.devinit.text+0x1ce): undefined reference to `input_unregister_device'
olpc-xo1-sci.c:(.devinit.text+0x1d8): undefined reference to `input_free_device'
arch/x86/built-in.o: In function `xo1_sci_probe':
olpc-xo1-sci.c:(.devinit.text+0x235): undefined reference to `input_allocate_device'
olpc-xo1-sci.c:(.devinit.text+0x285): undefined reference to `input_register_device'
olpc-xo1-sci.c:(.devinit.text+0x299): undefined reference to `input_free_device'
olpc-xo1-sci.c:(.devinit.text+0x2e1): undefined reference to `input_register_device'
olpc-xo1-sci.c:(.devinit.text+0x2f5): undefined reference to `input_free_device'
olpc-xo1-sci.c:(.devinit.text+0x54c): undefined reference to `input_allocate_device'

In the long run, fixing this driver kconfig to be tristate instead of bool
would be a very good change.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Andres Salomon <dilinger@queued.net>
Cc: Chris Ball <cjb@laptop.org>
Cc: Jon Nettleton <jon.nettleton@gmail.com>
Cc: Daniel Drake <dsd@laptop.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

arch/x86/platform/uv: fix incorrect tlb flush all issue

The flush tlb optimization code has a logical issue on UV platform.  It
doesn't flush the full range at all, since it simply ignores its 'end'
parameter (and hence also the "all" indicator) in uv_flush_tlb_others()
function.

Cliff's notes:

: I tested the patch on a UV.  It has the effect of either clearing 1 or all
: TLBs in a cpu.  I added some debugging to test for the cases when clearing
: all TLBs is overkill, and in practice it happens very seldom.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Tested-by: Cliff Wickman <cpw@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

arch/x86/tools/insn_sanity.c: identify source of messages

The kernel build prints:

  Building modules, stage 2.
  TEST    posttest
  MODPOST 3821 modules
  TEST    posttest
Success: decoded and checked 1000000 random instructions with 0 errors (seed:0xaac4bc47)
  CC      arch/x86/boot/a20.o
  CC      arch/x86/boot/cmdline.o
  AS      arch/x86/boot/copy.o
  HOSTCC  arch/x86/boot/mkcpustr
  CC      arch/x86/boot/cpucheck.o
  CC      arch/x86/boot/early_serial_console.o

which is irritating because you don't know what program is proudly
pronouncing its success.

So, as described in "console mode programming user interface guidelines
version 101" which doesn't exist, change this program to identify the
source of its messages.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

x86 numa: don't check if node is NUMA_NO_NODE

If we aren't debugging per_cpu maps, the cpu's node is stored in per_cpu
variable numa_node. If `node' is NUMA_NO_NODE, it means the caller wants
to clear the cpu's node. So we should also call set_cpu_numa_node() in
this case.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

arch/x86/platform/iris/iris.c: register a platform device and a platform driver

This makes the iris driver use the platform API, so it is properly exposed
in /sys.

[akpm@linux-foundation.org: remove commented-out code, add missing space to printk, clean up code layout]
Signed-off-by: Shérab <Sebastien.Hinderer@ens-lyon.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

acpi_memhotplug.c: auto bind the memory device which is hotplugged before the driver is loaded

If the memory device is hotplugged before the driver is loaded, the user
cannot see this device under the directory /sys/bus/acpi/devices/, and the
user cannot bind it by hand after the driver is loaded. This patch
introduces a new feature to bind such a device when the driver is being
loaded.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

acpi_memhotplug.c: bind the memory device when the driver is being loaded

We had introduced acpi_hotmem_initialized to avoid a strange add_memory
failure message. But the memory device may not be used by the kernel, and
the device should be bound when the driver is being loaded. Remove
acpi_hotmem_initialized so that the device can be bound when the driver is
being loaded.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

acpi_memhotplug.c: don't allow to eject the memory device if it is being used

Currently we eject the memory device even if it is in use. This is very
dangerous, and it will cause the kernel to panic.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

acpi_memhotplug.c: remove memory info from list before freeing it

We free info, but we forget to remove it from the list. It will cause
unexpected problems when we access the list next time.
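
The ordering matters because a freed entry that is still linked leaves a dangling pointer in the list. A minimal userspace sketch of the "unlink first, then free" pattern follows; it uses a hypothetical singly linked list rather than the kernel's list API, and all names are illustrative.

```c
#include <stdlib.h>

/* Hypothetical list entry standing in for the driver's per-range info. */
struct sketch_info {
	struct sketch_info *next;
	int id;
};

/* Push a new entry on the front of the list; returns the new head. */
static struct sketch_info *sketch_push(struct sketch_info *head, int id)
{
	struct sketch_info *info = malloc(sizeof(*info));

	if (!info)
		return head;	/* allocation failed: list is unchanged */
	info->id = id;
	info->next = head;
	return info;
}

/* Unlink the entry with the given id, then free it. Returns 1 if removed. */
static int sketch_remove_info(struct sketch_info **head, int id)
{
	struct sketch_info **link = head;

	while (*link) {
		struct sketch_info *cur = *link;

		if (cur->id == id) {
			*link = cur->next;	/* unlink first ... */
			free(cur);		/* ... then free */
			return 1;
		}
		link = &cur->next;
	}
	return 0;
}
```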

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

acpi_memhotplug.c: free memory device if acpi_memory_enable_device() failed

If acpi_memory_enable_device() fails, it returns a non-zero value, which
means we failed to bind the memory device to this driver. So we should
free the memory device before acpi_memory_device_add() returns.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

acpi_memhotplug.c: fix memory leak when memory device is unbound from the module acpi_memhotplug

We allocate memory to store acpi_memory_info, so we should free it before
freeing mem_device.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

cpu_hotplug-unmap-cpu2node-when-the-cpu-is-hotremoved-fix

make acpi_unmap_lsapic __ref

Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

x86 cpu_hotplug: unmap cpu2node when the cpu is hotremoved

When a cpu is hotplugged, we call acpi_map_cpu2node() in
_acpi_map_lsapic() to store the cpu's node.  But we don't clear the cpu's
node in acpi_unmap_lsapic() when this cpu is hotremoved.  If the node is
also hotremoved, we will get the following messages:

[ 1646.771485] kernel BUG at include/linux/gfp.h:329!
[ 1646.828729] invalid opcode: 0000 [#1] SMP
[ 1646.877872] Modules linked in: ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crc32c_intel microcode pcspkr i2c_i801 i2c_core lpc_ich mfd_core ioatdma e1000e i7core_edac edac_core sg acpi_memhotplug igb dca sd_mod crc_t10dif megaraid_sas mptsas mptscsih mptbase scsi_transport_sas scsi_mod
[ 1647.588773] Pid: 3126, comm: init Not tainted 3.6.0-rc3-tangchen-hostbridge+ #13 FUJITSU-SV PRIMEQUEST 1800E/SB
[ 1647.711545] RIP: 0010:[<ffffffff811bc3fd>]  [<ffffffff811bc3fd>] allocate_slab+0x28d/0x300
[ 1647.810492] RSP: 0018:ffff88078a049cf8  EFLAGS: 00010246
[ 1647.874028] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 1647.959339] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000246
[ 1648.044659] RBP: ffff88078a049d38 R08: 00000000000040d0 R09: 0000000000000001
[ 1648.129953] R10: 0000000000000000 R11: 0000000000000b5f R12: 00000000000052d0
[ 1648.215259] R13: ffff8807c1417300 R14: 0000000000030038 R15: 0000000000000003
[ 1648.300572] FS:  00007fa9b1b44700(0000) GS:ffff8807c3800000(0000) knlGS:0000000000000000
[ 1648.397272] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1648.465985] CR2: 00007fa9b09acca0 CR3: 000000078b855000 CR4: 00000000000007e0
[ 1648.551265] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1648.636565] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1648.721838] Process init (pid: 3126, threadinfo ffff88078a048000, task ffff8807bb6f2650)
[ 1648.818534] Stack:
[ 1648.842548]  ffff8807c39d7fa0 ffffffff000040d0 00000000000000bb 00000000000080d0
[ 1648.931469]  ffff8807c1417300 ffff8807c39d7fa0 ffff8807c1417300 0000000000000001
[ 1649.020410]  ffff88078a049d88 ffffffff811bc4a0 ffff8807c1410c80 0000000000000000
[ 1649.109464] Call Trace:
[ 1649.138713]  [<ffffffff811bc4a0>] new_slab+0x30/0x1b0
[ 1649.199075]  [<ffffffff811bc978>] __slab_alloc+0x358/0x4c0
[ 1649.264683]  [<ffffffff810b71c0>] ? alloc_fair_sched_group+0xd0/0x1b0
[ 1649.341695]  [<ffffffff811be7d4>] kmem_cache_alloc_node_trace+0xb4/0x1e0
[ 1649.421824]  [<ffffffff8109d188>] ? hrtimer_init+0x48/0x100
[ 1649.488414]  [<ffffffff810b71c0>] ? alloc_fair_sched_group+0xd0/0x1b0
[ 1649.565402]  [<ffffffff810b71c0>] alloc_fair_sched_group+0xd0/0x1b0
[ 1649.640297]  [<ffffffff810a8bce>] sched_create_group+0x3e/0x110
[ 1649.711040]  [<ffffffff810bdbcd>] sched_autogroup_create_attach+0x4d/0x180
[ 1649.793260]  [<ffffffff81089614>] sys_setsid+0xd4/0xf0
[ 1649.854694]  [<ffffffff8167a029>] system_call_fastpath+0x16/0x1b
[ 1649.926483] Code: 89 c4 e9 73 fe ff ff 31 c0 89 de 48 c7 c7 45 de 9e 81 44 89 45 c8 e8 22 05 4b 00 85 db 44 8b 45 c8 0f 89 4f ff ff ff 0f 0b eb fe <0f> 0b 90 eb fd 0f 0b eb fe 89 de 48 c7 c7 45 de 9e 81 31 c0 44
[ 1650.161454] RIP  [<ffffffff811bc3fd>] allocate_slab+0x28d/0x300
[ 1650.232348]  RSP <ffff88078a049cf8>
[ 1650.274029] ---[ end trace adf84c90f3fea3e5 ]---

The reason is that the cpu's node is still set (it is not NUMA_NO_NODE),
so we call alloc_pages_exact_node() to allocate memory on that node even
though the node has been offlined.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

vfs: d_obtain_alias() needs to use "/" as default name.

NFS appears to use d_obtain_alias() to create the root dentry rather than
d_make_root.  This can cause 'prepend_path()' to complain that the root
has a weird name if an NFS filesystem is lazily unmounted.  e.g.  if
"/mnt" is an NFS mount then

{ cd /mnt; umount -l /mnt ; ls -l /proc/self/cwd; }

will cause a WARN message like
   WARNING: at /home/git/linux/fs/dcache.c:2624 prepend_path+0x1d7/0x1e0()
   ...
   Root dentry has weird name <>

to appear in kernel logs.

So change d_obtain_alias() to use "/" rather than "" as the anonymous
name.

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

selinux: fix sel_netnode_insert() suspicious rcu dereference

===============================
[ INFO: suspicious RCU usage. ]
3.5.0-rc1+ #63 Not tainted
-------------------------------
security/selinux/netnode.c:178 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by trinity-child1/8750:
#0: (sel_netnode_lock){+.....}, at: [<ffffffff812d8f8a>] sel_netnode_sid+0x16a/0x3e0

stack backtrace:
Pid: 8750, comm: trinity-child1 Not tainted 3.5.0-rc1+ #63
Call Trace:
[<ffffffff810cec2d>] lockdep_rcu_suspicious+0xfd/0x130
[<ffffffff812d91d1>] sel_netnode_sid+0x3b1/0x3e0
[<ffffffff812d8e20>] ? sel_netnode_find+0x1a0/0x1a0
[<ffffffff812d24a6>] selinux_socket_bind+0xf6/0x2c0
[<ffffffff810cd1dd>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff810cdb55>] ? lock_release_holdtime.part.9+0x15/0x1a0
[<ffffffff81093841>] ? lock_hrtimer_base+0x31/0x60
[<ffffffff812c9536>] security_socket_bind+0x16/0x20
[<ffffffff815550ca>] sys_bind+0x7a/0x100
[<ffffffff816c03d5>] ? sysret_check+0x22/0x5d
[<ffffffff810d392d>] ? trace_hardirqs_on_caller+0x10d/0x1a0
[<ffffffff8133b09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff816c03a9>] system_call_fastpath+0x16/0x1b

The patch below does what Paul McKenney suggested in the previous thread.

Signed-off-by: Dave Jones <davej@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: James Morris <jmorris@namei.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

CRIS: Fix I/O macros

The inb/outb macros for CRIS are broken in several ways: they are missing
() around their parameters and they contain an unprotected if statement.
This was breaking the compile of IPMI on CRIS and thus I was being annoyed
by build regressions, so I fixed them.

Plus I don't think they would have worked at all, since the data values
were missing "&" and the outsl had a "3" instead of a "4" for the size.
From what I can tell, this stuff is not used at all, so this can't be any
more broken than it was before, anyway.
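
The two macro problems named above are generic C pitfalls. The sketch below is not the CRIS code; SKETCH_STORE and sketch_demo() are hypothetical names that show the conventional remedy of parenthesizing every argument and wrapping the body in do { } while (0).

```c
/*
 * A fragile form would be:
 *   #define STORE(dst, val) if (val > 0) *dst = val
 * Its arguments are unparenthesized and its bare "if" can capture a
 * caller's "else". The version below avoids both problems.
 */
#define SKETCH_STORE(dst, val) \
	do { if ((val) > 0) *(dst) = (val); } while (0)

/* Uses the macro inside an if/else to show that the else binds as intended. */
static int sketch_demo(int enabled, int value)
{
	int out = -1;

	if (enabled)
		SKETCH_STORE(&out, value);
	else
		out = 0;	/* belongs to "if (enabled)", not to the macro's if */
	return out;
}
```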

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Mikael Starvik <starvik@axis.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

pidns-remove-recursion-from-free_pid_ns-v5-fix

export put_pid_ns() to modules

i386 allmodconfig:
ERROR: "put_pid_ns" [net/ipv6/ipv6.ko] undefined!

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Greg KH <greg@kroah.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

pidns: remove recursion from free_pid_ns()

free_pid_ns() operates in a recursive fashion:

free_pid_ns(parent)
  put_pid_ns(parent)
    kref_put(&ns->kref, free_pid_ns);
      free_pid_ns

Thus, if namespaces are deeply nested, userspace can trigger an avalanche
of free_pid_ns() calls that exhausts the kernel stack and eventually leads
to a panic.

This patch turns the recursion into an iterative loop.
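
The shape of such a change can be sketched in userspace as follows. This is an illustration under stated assumptions, not the patch itself: struct sketch_ns, its plain (non-atomic) refcount and both helpers are hypothetical, and each child is assumed to hold one reference on its parent.

```c
#include <stdlib.h>

/* Hypothetical namespace object: each child holds one reference on its parent. */
struct sketch_ns {
	struct sketch_ns *parent;
	int refcount;	/* plain int for illustration; real code needs atomics */
};

/* Create a namespace with one reference, taking a reference on the parent. */
static struct sketch_ns *sketch_new_ns(struct sketch_ns *parent)
{
	struct sketch_ns *ns = malloc(sizeof(*ns));

	if (!ns)
		return NULL;
	ns->parent = parent;
	ns->refcount = 1;
	if (parent)
		parent->refcount++;
	return ns;
}

/* Drop one reference and walk up the parents in a loop instead of recursing.
 * Returns the number of namespaces freed. */
static int sketch_put_ns(struct sketch_ns *ns)
{
	int freed = 0;

	while (ns && --ns->refcount == 0) {
		struct sketch_ns *parent = ns->parent;

		free(ns);
		freed++;
		ns = parent;	/* now drop the reference the child held */
	}
	return freed;
}
```

Stack usage stays constant no matter how deep the nesting is, because the walk up the parent chain reuses a single frame.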

Based on a patch by Andrew Vagin.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drivers/video/backlight/lm3639_bl.c: return proper error in lm3639_bled_mode_store() error paths

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kernel/sys.c: fix stack memory content leak via UNAME26

Calling uname() with the UNAME26 personality set allows a leak of kernel
stack contents. This fixes it by defensively calculating the length of
copy_to_user() call, making the len argument unsigned, and initializing
the stack buffer to zero (now technically unneeded, but hey, overkill).
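
A userspace sketch of the defensive pattern described above follows. It is not the kernel code: sketch_copy_release() is a hypothetical helper, memcpy() stands in for copy_to_user(), and the "2.6.%u" format is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format a release string into a zeroed stack buffer and copy out only the
 * bytes that were actually written. Returns the number of bytes copied. */
static size_t sketch_copy_release(char *dst, size_t len, unsigned int minor)
{
	char buf[65] = { 0 };	/* zeroed so no stale stack bytes remain */
	size_t room, used;
	int n;

	if (len == 0)
		return 0;
	room = len < sizeof(buf) ? len : sizeof(buf);	/* clamp to both buffers */
	n = snprintf(buf, room, "2.6.%u", minor);
	if (n < 0)
		return 0;
	/* snprintf() reports the untruncated length, so clamp it again. */
	used = (size_t)n < room ? (size_t)n : room - 1;
	memcpy(dst, buf, used + 1);	/* the string and its terminator only */
	return used + 1;
}
```

Copying used + 1 bytes rather than the caller-supplied length is what keeps uninitialized stack contents from reaching the destination.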

CVE-2012-0957

Reported-by: PaX Team <pageexec@freemail.hu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memstick: memory leak on error in msb_ftl_scan()

We need to free "overwrite_flags" before returning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Maxim Levitsky <maximlevitsly@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memstick: use after free in msb_disk_release()

The original code dereferenced "msb" after freeing it.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Maxim Levitsky <maximlevitsly@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memstick: ms_block: fix compile issue

As suggested by Geert Uytterhoeven:

: http://kisskb.ellerman.id.au/kisskb/buildresult/7280352/
: arch/m68k/include/asm/hardirq.h:23:20: error: expected ')' before 'DRIVER_NAME'
: make[4]: *** [drivers/memstick/core/ms_block.o] Error 1
:
: The reason for this is that pr_fmt() references DRIVER_NAME and is defined
: before the first include, while DRIVER_NAME is only defined in ms_block.h,
: which is the last included file.  If any subsequent include file uses
: pr_fmt() (e.g.  the call to pr_crit() in arch/m68k/include/asm/hardirq.h),
: this causes a build failure.
:
: I suggest moving the DRIVER_NAME define to ms_block.c.  Cfr.  memstick.c
: and mspro_block.c, who already have their own definition.

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memstick: remove unused field from state struct

Oops, I forgot that I have that field there already. Just save memory by
not allocating it.

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

linux/coredump.h needs asm/siginfo.h

commit 5ab1c30 ("coredump: pass siginfo_t* to do_coredump() and below, not
merely signr") added siginfo_t to linux/coredump.h but forgot to include
asm/siginfo.h. This breaks the build for UML/i386. (And any other arch
where asm/siginfo.h is not magically preincluded...)

In file included from arch/x86/um/elfcore.c:2:0:
include/linux/coredump.h:15:25: error: unknown type name 'siginfo_t'
make[1]: *** [arch/x86/um/elfcore.o] Error 1

Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: "Jonathan M. Foote" <jmfoote@cert.org>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Merge remote-tracking branch 'lzo-update/lzo-update'

Merge remote-tracking branch 'signal/for-next'

Merge remote-tracking branch 'dma-buf/for-next'

Merge remote-tracking branch 'kvmtool/master'

Merge remote-tracking branch 'tegra/for-next'

Merge remote-tracking branch 'samsung/for-next'

Conflicts:
arch/arm/mach-exynos/clock-exynos5.c

Merge remote-tracking branch 'renesas/next'

Merge remote-tracking branch 'ixp4xx/next'

Merge remote-tracking branch 'ep93xx/ep93xx-for-next'

Merge remote-tracking branch 'arm-soc/for-next'

Merge remote-tracking branch 'gpio-lw/for-next'

Merge remote-tracking branch 'remoteproc/for-next'

Merge remote-tracking branch 'vhost/linux-next'

Conflicts:
drivers/net/tun.c

Merge remote-tracking branch 'pinctrl/for-next'

Merge remote-tracking branch 'tmem/linux-next'

Merge remote-tracking branch 'regmap/for-next'

Merge remote-tracking branch 'drivers-x86/linux-next'

Merge remote-tracking branch 'workqueues/for-next'

Merge remote-tracking branch 'xen-two/linux-next'

Merge remote-tracking branch 'oprofile/for-next'

Merge remote-tracking branch 'kvm-ppc/kvm-ppc-next'

Merge remote-tracking branch 'kvm/linux-next'

Conflicts:
arch/powerpc/include/asm/Kbuild
arch/powerpc/include/asm/kvm_para.h

Merge remote-tracking branch 'tip/auto-latest'

Conflicts:
include/linux/mempolicy.h
mm/huge_memory.c
mm/mempolicy.c

Merge remote-tracking branch 'spi-mb/spi-next'

Merge remote-tracking branch 'edac-amd/for-next'

Conflicts:
Documentation/edac.txt
drivers/edac/amd64_edac.c

Merge remote-tracking branch 'edac/linux_next'

Merge remote-tracking branch 'fsnotify/for-next'

Conflicts:
kernel/audit_tree.c