Because of x86-implement-strict-user-copy-checks-for-x86_64.patch
When compiling mm/mempolicy.c the following warning is shown.
In file included from arch/x86/include/asm/uaccess.h:572,
from include/linux/uaccess.h:5,
from include/linux/highmem.h:7,
from include/linux/pagemap.h:10,
from include/linux/mempolicy.h:70,
from mm/mempolicy.c:68:
In function `copy_from_user',
inlined from `compat_sys_get_mempolicy' at mm/mempolicy.c:1415:
arch/x86/include/asm/uaccess_64.h:64: warning: call to `copy_from_user_overflow' declared with attribute warning: copy_from_user() buffer size is not provably correct
LD mm/built-in.o
Fix this by passing correct buffer size value.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Vasiliy Kulikov [Wed, 3 Aug 2011 00:52:32 +0000 (10:52 +1000)]
On thread exit shm_exit_ns() is called, it uses shm_ids(ns).rw_mutex. It
is initialized in shm_init(), but it is not called yet at the moment of
kernel threads exit. Some kernel threads are created in
do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().
Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.
It fixes a kernel oops:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
...
[<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>] (exit_shm+0x28/0x48)
[<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>] (do_exit+0x59c/0x750)
[<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>] (____call_usermodehelper+0x13c/0x154)
[<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
Code: 1afffffae597c00ce58d0000e587d00c (e58cd000)
Reported-by: Manuel Lauss <manuel.lauss@googlemail.com> Reported-by: Richard Weinberger <richard@nod.at> Reported-by: Marc Zyngier <maz@misterjones.org> Tested-by: Manuel Lauss <manuel.lauss@googlemail.com> Tested-by: Richard Weinberger <richard@nod.at> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes Weiner [Wed, 3 Aug 2011 00:52:31 +0000 (10:52 +1000)]
__GFP_OTHER_NODE is used for NUMA allocations on behalf of other nodes.
It's supposed to be passed through from the page allocator to
zone_statistics(), but it never gets there as gfp_allowed_mask is not wide
enough and masks out the flag early in the allocation path.
The result is an accounting glitch where successful NUMA allocations
by-agent are not properly attributed as local.
Increase __GFP_BITS_SHIFT so that it includes __GFP_OTHER_NODE.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Acked-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sergiu Iordache [Wed, 3 Aug 2011 00:52:31 +0000 (10:52 +1000)]
Update the module parameters when platform data is used. This means that
they can be read from /sys/module/ramoops/parameters in order to parse the
memory area.
Signed-off-by: Sergiu Iordache <sergiu@chromium.org> Cc: Marco Stornelli <marco.stornelli@gmail.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Will Drewry [Wed, 3 Aug 2011 00:52:31 +0000 (10:52 +1000)]
Update kernel-parameters.txt to point users to the authoritative comment
for name_to_dev_t. In addition, updates other places where some
name_to_dev_t behavior was discussed. All other references to root=
appear to be for explicit sample usage or just side comments when
discussing other kernel parameters.
Signed-off-by: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Will Drewry [Wed, 3 Aug 2011 00:52:30 +0000 (10:52 +1000)]
This patch makes two changes:
- check for trailing characters after parsing PARTNROFF=%d
- disable root_wait if a syntax error is seen
The former assures that bad input like
root=PARTUUID=<validuuid>/PARTNROFF=5abc
properly fails by attempting to parse an extra character after the
integer. If the integer is missing, sscanf will fail, but if it is
present, and there is a trailing non-nul character, then the extra
field will be parsed and the error case will be hit.
The latter assures that if rootwait has been specified, the error
message isn't flooded to the screen during rootwait's loop. Instead of
adding printk ratelimiting, root_wait was disabled. This stays true to
the rootwait goal of support asynchronous device arrival while still
providing users with helpful messages. With ratelimiting or disabling
logging on rootwait, a range of edge cases turn up where the user would
not be informed of an error properly.
Signed-off-by: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Will Drewry [Wed, 3 Aug 2011 00:52:30 +0000 (10:52 +1000)]
Expand root=PARTUUID=UUID syntax to support selecting a root partition by
integer offset from a known, unique partition. This approach provides
similar properties to specifying a device and partition number, but using
the UUID as the unique path prior to evaluating the offset.
For example,
root=PARTUUID=99DE9194-FC15-4223-9192-FC243948F88B/PARTNROFF=1
selects the partition with UUID 99DE.. then select the next
partition.
This change is motivated by a particular usecase in Chromium OS where the
bootloader can easily determine what partition it is on (by UUID) but
doesn't perform general partition table walking.
That said, support for this model provides a direct mechanism for the user
to modify the root partition to boot without specifically needing to
extract each UUID or update the bootloader explicitly when the root
partition UUID is changed (if it is recreated to be larger, for instance).
Pinning to a /boot-style partition UUID allows the arbitrary root
partition reconfiguration/modifications with slightly less ambiguity than
just [dev][partition] and less stringency than the specific root partition
UUID.
Signed-off-by: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Oleg Nesterov [Wed, 3 Aug 2011 00:52:29 +0000 (10:52 +1000)]
When send_cpu_listeners() finds the orphaned listener it marks it as
!valid and drops listeners->sem. Before it takes this sem for writing,
s->pid can be reused and add_del_listener() can wrongly try to re-use this
entry.
Oleg Nesterov [Wed, 3 Aug 2011 00:52:28 +0000 (10:52 +1000)]
1. 26c4caea "don't allow duplicate entries in listener mode"
changed add_del_listener(REGISTER) so that "next_cpu:" can
reuse the listener allocated for the previous cpu, this
doesn't look exactly right even if minor.
Change the code to kfree() in the already-registered case,
this case is unlikely anyway so the extra kmalloc_node()
shouldn't hurt but looke more correct and clean.
2. use the plain list_for_each_entry() instead of _safe() to
scan listeners->list.
3. Remove the unneeded INIT_LIST_HEAD(&s->list), we are going
to list_add(&s->list).
Daniel Glöckner [Wed, 3 Aug 2011 00:52:28 +0000 (10:52 +1000)]
As the comment explains, the intention of the code is to clear the
OMAP_RTC_CTRL_MODE_12_24 bit, but instead it only clears the
OMAP_RTC_CTRL_SPLIT and OMAP_RTC_CTRL_AUTO_COMP bits, which should be
kept. OMAP_RTC_CTRL_DISABLE, OMAP_RTC_CTRL_SET_32_COUNTER,
OMAP_RTC_CTRL_TEST, and OMAP_RTC_CTRL_ROUND_30S are also better off
being cleared.
Signed-off-by: Daniel Glöckner <dg@emlix.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Akinobu Mita [Wed, 3 Aug 2011 00:52:27 +0000 (10:52 +1000)]
init_fault_attr_dentries() is used to export fault_attr via debugfs. But
it can only export it in debugfs root directory.
Per Forlin is working on mmc_fail_request which adds support to inject
data errors after a completed host transfer in MMC subsystem.
The fault_attr for mmc_fail_request should be defined per mmc host and
export it in debugfs directory per mmc host like
/sys/kernel/debug/mmc0/mmc_fail_request.
init_fault_attr_dentries() doesn't help for mmc_fail_request. So this
introduces fault_create_debugfs_attr() which is able to create a directory
in the arbitrary directory and replace init_fault_attr_dentries().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Tested-by: Per Forlin <per.forlin@linaro.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joe Thornber [Wed, 3 Aug 2011 00:43:45 +0000 (10:43 +1000)]
Initial EXPERIMENTAL implementation of device-mapper thin provisioning
with snapshot support. The 'thin' target is used to create instances of
the virtual devices that are hosted in the 'thin-pool' target. The
thin-pool target provides data sharing among devices. This sharing is
made possible using the persistent-data library in the previous patch.
The main highlight of this implementation, compared to the previous
implementation of snapshots, is that it allows many virtual devices to
be stored on the same data volume, simplifying administration and
allowing sharing of data between volumes (thus reducing disk usage).
Another big feature is support for arbitrary depth of recursive
snapshots (snapshots of snapshots of snapshots ...). The previous
implementation of snapshots did this by chaining together lookup tables,
and so performance was O(depth). This new implementation uses a single
data structure so we don't get this degradation with depth.
For further information and examples of how to use this, please read
Documentation/device-mapper/thin-provisioning.txt
Signed-off-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mikulas Patocka [Wed, 3 Aug 2011 00:43:43 +0000 (10:43 +1000)]
This patch introduces dm_kcopyd_zero() to make it easy to use
kcopyd to write zeros into the requested areas instead
instead of copying. It is implemented by passing a NULL
copying source to dm_kcopyd_copy().
The forthcoming thin provisioning target uses this.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Wed, 3 Aug 2011 00:43:43 +0000 (10:43 +1000)]
DM has always advertised both REQ_FLUSH and REQ_FUA flush capabilities
regardless of whether or not a given DM device's underlying devices
also advertised a need for them.
Block's flush-merge changes from 2.6.39 have proven to be more costly
for DM devices. Performance regressions have been reported even when
DM's underlying devices do not advertise that they have a write cache.
Fix the performance regressions by configuring a DM device's flushing
capabilities based on those of the underlying devices' capabilities.
Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Milan Broz [Wed, 3 Aug 2011 00:43:43 +0000 (10:43 +1000)]
Add optional parameter field to dmcrypt table and support
"allow_discards" option.
Discard requests bypass crypt queue processing. Bio is simple remapped
to underlying device.
Note that discard will be never enabled by default because of security
consequences. It is up to the administrator to enable it for encrypted
devices.
(Note that userspace cryptsetup does not understand new optional
parameters yet. Support for this will come later. Until then, you
should use 'dmsetup' to enable and disable this.)
Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Add the ability to parse and use metadata devices to dm-raid. Although
not strictly required, without the metadata devices, many features of
RAID are unavailable. They are used to store a superblock and bitmap.
The role, or position in the array, of each device must be recorded in
its superblock. This is to help with fault handling, array reshaping,
and sanity checks. RAID 4/5/6 devices must be loaded in a specific order:
in this way, the 'array_position' field helps validate the correctness
of the mapping when it is loaded. It can be used during reshaping to
identify which devices are added/removed. Fault handling is impossible
without this field. For example, when a device fails it is recorded in
the superblock. If this is a RAID1 device and the offending device is
removed from the array, there must be a way during subsequent array
assembly to determine that the failed device was the one removed. This
is done by correlating the 'array_position' field and the bit-field
variable 'failed_devices'.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>