Shaohua Li [Fri, 28 Sep 2012 00:19:42 +0000 (10:19 +1000)]
swap: add a simple detector for inappropriate swapin readahead
The swapin readahead does a blind readahead whether or not the swapin is
sequential. This is ok for harddisk because large reads have relatively
small costs and if the readahead pages are unneeded they can be reclaimed
easily. But for SSD devices large reads are more expensive than small
one. If readahead pages are unneeded, reading them in caused significant
overhead
This patch addes a simple random read detection similar to file mmap
readahead. If a random read is detected, swapin readahead will be
skipped. This improves a lot for a swap workload with random IO in a fast
SSD.
I run anonymous mmap write micro benchmark, which will triger swapin/swapout.
For both harddisk and SSD, the randwrite swap workload run time is reduced
significantly. Sequential write swap workload hasn't chanage.
Interestingly, the randwrite harddisk test is improved too. This might be
because swapin readahead needs to allocate extra memory, which further
tights memory pressure, so more swapout/swapin.
Signed-off-by: Shaohua Li <shli@fusionio.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
While allocating pages using buddy allocator, the compound page is
probably split up to free pages. Under these circumstances, the compound
page should be destroyed by destroy_compound_page(). However, there is a
duplicate check to judge if the page is compound.
Remove the duplicate check since the compound_order() returns 0 when the
page doesn't have PG_head set in destroy_compound_page(). That is to say,
destroy_compound_page() needn't check PageHead().
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[EACCES] Permission denied. An attempt was made to access a file in a way
forbidden by its file access permissions.
[EPERM] Operation not permitted. An attempt was made to perform an operation
limited to processes with appropriate privileges or to the owner of a file
or other resource.
So -EPERM should be returned if capability checks fails.
Strictly speaking this is an API change since the error code user sees is
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When a file is truncated with truncate()/ftruncate() and then closed,
iversion is not updated. This patch uses ATTR_SIZE flag as an indication
to increment iversion.
Mimi said:
On fput(), i_version is used to detect and flag files that have changed
and need to be re-measured in the IMA measurement policy. When a file
is truncated with truncate()/ftruncate() and then closed, i_version is
not updated. As a result, although the file has changed, it will not be
re-measured and added to the IMA measurement list on subsequent access.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Acked-by: Mimi Zohar <zohar@us.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Akinobu Mita [Fri, 28 Sep 2012 00:19:00 +0000 (10:19 +1000)]
drbd: use copy_highpage
Use copy_highpage() to copy from one page to another.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stephen Warren [Fri, 28 Sep 2012 00:19:00 +0000 (10:19 +1000)]
block: partition: msdos: provide UUIDs for partitions
The MSDOS/MBR partition table includes a 32-bit unique ID, often referred
to as the NT disk signature. When combined with a partition number within
the table, this can form a unique ID similar in concept to EFI/GPT's
partition UUID. Constructing and recording this value in struct
partition_meta_info allows MSDOS partitions to be referred to on the
kernel command-line using the following syntax:
Stephen Warren [Fri, 28 Sep 2012 00:19:00 +0000 (10:19 +1000)]
init: reduce PARTUUID min length to 1 from 36
Reduce the minimum length for a root=PARTUUID= parameter to be considered
valid from 36 to 1. EFI/GPT partition UUIDs are always exactly 36
characters long, hence the previous limit. However, the next patch will
support DOS/MBR UUIDs too, which have a different, shorter, format.
Instead of validating any particular length, just ensure that at least
some non-empty value was given by the user.
Also, consider a missing UUID value to be a parsing error, in the same
vein as if /PARTNROFF exists and can't be parsed. As such, make both
error cases print a message and disable rootwait. Convert to pr_err while
we're at it.
Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Tejun Heo <tj@kernel.org> Cc: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stephen Warren [Fri, 28 Sep 2012 00:18:59 +0000 (10:18 +1000)]
block: store partition_meta_info.uuid as a string
This will allow other types of UUID to be stored here, aside from true
UUIDs. This also simplifies code that uses this field, since it's usually
constructed from a, used as a, or compared to other, strings.
Note: A simplistic approach here would be to set uuid_str[36]=0 whenever a
/PARTNROFF option was found to be present. However, this modifies the
input string, and causes subsequent calls to devt_from_partuuid() not to
see the /PARTNROFF option, which causes different results. In order to
avoid misleading future maintainers, this parameter is marked const.
Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Tejun Heo <tj@kernel.org> Cc: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Akinobu Mita [Fri, 28 Sep 2012 00:18:59 +0000 (10:18 +1000)]
cciss: use check_signature()
Use check_signature() to find a signature in the mmio address.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Mike Miller <mike.miller@hp.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Akinobu Mita [Fri, 28 Sep 2012 00:18:59 +0000 (10:18 +1000)]
cciss: cleanup bitops usage
- Remove unnecessary correction of bit and address
- Use BITS_TO_LONGS macro to calculate bitmap size
- Use bitmap_zero()
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Mike Miller <mike.miller@hp.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The block control group, infiniband, xfs, crypto, 802.11, netfilter.
Nothing quite so fundamental as fs/namespace.c but definitely in
multiplatform-code that should work, and is already broken on those
architecutres.
Looking at the implementation of atomic64_add_return in lib/atomic64.c the
code looks as efficient as these kinds of things get.
Which leads me to the conclusion that we need atomic64 support on all
architectures.
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Daniel Santos [Fri, 28 Sep 2012 00:18:55 +0000 (10:18 +1000)]
compiler-gcc4.h: correct verion check for __compiletime_error
__attribute__((error(msg))) was introduced in gcc 4.3 (not 4.4) and as I
was unable to find any gcc bugs pertaining to it, I'm presuming that it
has functioned as advertised since 4.3.0.
Signed-off-by: Daniel Santos <daniel.santos@pobox.com> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
How is the compiler even handling exported functions that are marked
inline? Anyway, these shouldn't be inline because of that, so remove that
marking.
Based on a larger patch by Mark Charlebois to get LLVM to build the
kernel.
Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mark Charlebois <mcharleb@qualcomm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: hank <pyu@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The use of defined() on arrays and hashes has been deprecated since perl
5.6, but until 5.17.6 it only warned on lexicals, not package globals.
Signed-off-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jean Delvare [Fri, 28 Sep 2012 00:18:54 +0000 (10:18 +1000)]
drm/i915: optimize DIV_ROUND_CLOSEST() call
DIV_ROUND_CLOSEST is faster if the compiler knows it will only be dealing
with unsigned dividends.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
pcmcia: move unbind/rebind into dev_pm_ops.complete
Move the device rebind procedures for cardbus devices from the pm.resume
into the pm.complete callback.
The reason for moving the code is: "[...] The PM code needs to send
suspend and resume messages to every device in the right order, and it
can't do that if new devices are being added at the same time. [...]"
However the situation really isn't quite that rigid. In particular,
adding new children during a resume callback shouldn't cause much of
problem because the children don't need to be resumed anyway (since they
were never suspended). On the other hand, if you do it you will get a
dev_warn() from the PM core, something like 'parent should not be
sleeping'.
Still, it is considered bad form and should be avoided if possible."
(Alan Stern's full comment about the topic can
be found here: <https://lkml.org/lkml/2012/7/10/254>)
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Greg KH <greg@kroah.com> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Shi [Fri, 28 Sep 2012 00:18:51 +0000 (10:18 +1000)]
arch/x86/platform/uv: fix incorrect tlb flush all issue
The flush tlb optimization code has logical issue on UV platform. It
doesn't flush the full range at all, since it simply ignores its 'end'
parameter (and hence also the "all" indicator) in uv_flush_tlb_others()
function.
Cliff's notes:
: I tested the patch on a UV. It has the effect of either clearing 1 or all
: TLBs in a cpu. I added some debugging to test for the cases when clearing
: all TLBs is overkill, and in practice it happens very seldom.
Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Cliff Wickman <cpw@sgi.com> Tested-by: Cliff Wickman <cpw@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrew Morton [Fri, 28 Sep 2012 00:18:51 +0000 (10:18 +1000)]
arch/x86/tools/insn_sanity.c: identify source of messages
The kernel build prints:
Building modules, stage 2.
TEST posttest
MODPOST 3821 modules
TEST posttest
Success: decoded and checked 1000000 random instructions with 0 errors (seed:0xaac4bc47)
CC arch/x86/boot/a20.o
CC arch/x86/boot/cmdline.o
AS arch/x86/boot/copy.o
HOSTCC arch/x86/boot/mkcpustr
CC arch/x86/boot/cpucheck.o
CC arch/x86/boot/early_serial_console.o
which is irritating because you don't know what program is proudly
pronouncing its success.
So, as described in "console mode programming user interface guidelines
version 101" which doesn't exist, change this program to identify the
source of its messages.
Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If we aren't debugging per_cpu maps, the cpu's node is stored in per_cpu
variable numa_node. If `node' is NUMA_NO_NODE, it means the caller wants
to clear the cpu's node. So we should also call set_cpu_numa_node() in
this case.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
arch/x86/platform/iris/iris.c: register a platform device and a platform driver
This makes the iris driver use the platform API, so it is properly exposed
in /sys.
[akpm@linux-foundation.org: remove commented-out code, add missing space to printk, clean up code layout] Signed-off-by: Shérab <Sebastien.Hinderer@ens-lyon.org> Cc: Len Brown <lenb@kernel.org> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
acpi_memhotplug.c: auto bind the memory device which is hotplugged before the driver is loaded
If the memory device is hotplugged before the driver is loaded, the user
cannot see this device under the directory /sys/bus/acpi/devices/, and the
user cannot bind it by hand after the driver is loaded. This patch
introduces a new feature to bind such device when the driver is being
loaded.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: Len Brown <lenb@kernel.org> Cc: "Brown, Len" <len.brown@intel.com> Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
acpi_memhotplug.c: bind the memory device when the driver is being loaded
We had introduced acpi_hotmem_initialized to avoid strange add_memory fail
message. But the memory device may not be used by the kernel, and the
device should be bound when the driver is being loaded. Remove
acpi_hotmem_initialized to allow that the device can be bound when the
driver is being loaded.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: Len Brown <lenb@kernel.org> Cc: "Brown, Len" <len.brown@intel.com> Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
acpi_memhotplug.c: free memory device if acpi_memory_enable_device() failed
If acpi_memory_enable_device() fails, acpi_memory_enable_device() will
return a non-zero value, which means we fail to bind the memory device to
this driver. So we should free memory device before
acpi_memory_device_add() returns.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: Len Brown <lenb@kernel.org> Cc: "Brown, Len" <len.brown@intel.com> Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Ingo Molnar <mingo@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
x86 cpu_hotplug: unmap cpu2node when the cpu is hotremoved
When a cpu is hotplugged, we call acpi_map_cpu2node() in
_acpi_map_lsapic() to store the cpu's node. But we don't clear the cpu's
node in acpi_unmap_lsapic() when this cpu is hotremoved. If the node is
also hotremoved, We will get the following messages:
The reason is that: the cpu's node is not NUMA_NO_NODE, we will call
alloc_pages_exact_node() to alloc memory on the node, but the node is
offlined.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
vfs: d_obtain_alias() needs to use "/" as default name.
NFS appears to use d_obtain_alias() to create the root dentry rather than
d_make_root. This can cause 'prepend_path()' to complain that the root
has a weird name if an NFS filesystem is lazily unmounted. e.g. if
"/mnt" is an NFS mount then
{ cd /mnt; umount -l /mnt ; ls -l /proc/self/cwd; }
will cause a WARN message like
WARNING: at /home/git/linux/fs/dcache.c:2624 prepend_path+0x1d7/0x1e0()
...
Root dentry has weird name <>
to appear in kernel logs.
So change d_obtain_alias() to use "/" rather than "" as the anonymous
name.
Signed-off-by: NeilBrown <neilb@suse.de> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This patch below does what Paul McKenney suggested in the previous thread.
Signed-off-by: Dave Jones <davej@redhat.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Paul Moore <paul@paul-moore.com> Cc: Eric Paris <eparis@parisplace.org> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The inb/outb macros for CRIS are broken from a number of points of view,
missing () around parameters and they have an unprotected if statement in
them. This was breaking the compile of IPMI on CRIS and thus I was being
annoyed by build regressions, so I fixed them.
Plus I don't think they would have worked at all, since the data values
were missing "&" and the outsl had a "3" instead of a "4" for the size.
From what I can tell, this stuff is not used at all, so this can't be any
more broken than it was before, anyway.
Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Mikael Starvik <starvik@axis.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>