Vasiliy Kulikov [Wed, 3 Aug 2011 00:52:32 +0000 (10:52 +1000)]
On thread exit shm_exit_ns() is called, and it uses shm_ids(ns).rw_mutex.
The mutex is initialized in shm_init(), but shm_init() has not yet run when
the first kernel threads exit: some kernel threads are created in
do_pre_smp_initcalls(), whereas shm_init() is called from do_initcalls().
Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.
It fixes a kernel oops:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
...
[<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>] (exit_shm+0x28/0x48)
[<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>] (do_exit+0x59c/0x750)
[<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>] (____call_usermodehelper+0x13c/0x154)
[<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
Code: 1afffffae597c00ce58d0000e587d00c (e58cd000)
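Why static initialization closes the window can be illustrated in user space. The sketch below is not the kernel patch: struct list_head and LIST_HEAD_INIT are simplified stand-ins for the kernel's list type, and struct demo_sem, static_sem, late_sem and the demo_* helpers are hypothetical names. It assumes the semaphore embeds a waiter list, so a zero-filled semaphore has NULL list links and queuing a waiter on it would dereference NULL, which is one plausible reading of the oops above. A statically initialized object is usable before any init function has run.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's doubly linked list head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Static initializer: an empty list points at itself. */
#define LIST_HEAD_INIT(name) { &(name), &(name) }

/* Hypothetical semaphore-like object with a waiter list. */
struct demo_sem {
    int activity;
    struct list_head wait_list;
};

/* Valid from program start: no init call is needed before first use. */
struct demo_sem static_sem = {
    .activity = 0,
    .wait_list = LIST_HEAD_INIT(static_sem.wait_list),
};

/* Zero-filled until some init function runs: both list links are NULL. */
struct demo_sem late_sem;

/* A list head is safe to use only if both links are set. */
int list_is_initialized(const struct list_head *head)
{
    return head->next != NULL && head->prev != NULL;
}

/* Insert 'entry' at the tail of the list headed by 'head'. */
void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;   /* would crash if head->prev were NULL */
    head->prev = entry;
}

/* Queue a waiter; report -1 instead of crashing on an uninitialized list. */
int demo_queue_waiter(struct demo_sem *sem, struct list_head *waiter)
{
    if (!list_is_initialized(&sem->wait_list))
        return -1;
    list_add_tail(waiter, &sem->wait_list);
    return 0;
}
```

Queuing on static_sem works immediately, while queuing on late_sem is refused until something initializes it. The guard in demo_queue_waiter() exists only to make the failure observable; the kernel code has no such check.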
Reported-by: Manuel Lauss <manuel.lauss@googlemail.com>
Reported-by: Richard Weinberger <richard@nod.at>
Reported-by: Marc Zyngier <maz@misterjones.org>
Tested-by: Manuel Lauss <manuel.lauss@googlemail.com>
Tested-by: Richard Weinberger <richard@nod.at>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Will Drewry [Wed, 3 Aug 2011 00:52:30 +0000 (10:52 +1000)]
This patch makes two changes:
- check for trailing characters after parsing PARTNROFF=%d
- disable root_wait if a syntax error is seen
The former assures that bad input like
root=PARTUUID=<validuuid>/PARTNROFF=5abc
properly fails by attempting to parse an extra character after the
integer. If the integer is missing, sscanf will fail, but if it is
present, and there is a trailing non-nul character, then the extra
field will be parsed and the error case will be hit.
The latter assures that if rootwait has been specified, the error
message isn't flooded to the screen during rootwait's loop. Instead of
adding printk ratelimiting, root_wait was disabled. This stays true to
the rootwait goal of supporting asynchronous device arrival while still
providing users with helpful messages. With ratelimiting or disabling
logging on rootwait, a range of edge cases turn up where the user would
not be informed of an error properly.
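The trailing-character check described above can be sketched in user space as follows. parse_partnroff() is a hypothetical helper name, not a function from the patch; the sketch only shows the sscanf technique of appending %c to the format so that any character after the integer is detected.

```c
#include <stdio.h>

/*
 * Parse "PARTNROFF=<int>" and reject trailing characters.
 * Returns 0 and stores the offset on success, -1 on a syntax error.
 */
int parse_partnroff(const char *s, int *offset)
{
    char extra = 0;
    int value = 0;

    /*
     * sscanf() returns 1 when only the integer converts and 2 when an
     * extra character follows it. A missing or malformed integer gives
     * 0 or EOF. Anything other than exactly 1 is treated as an error.
     */
    if (sscanf(s, "PARTNROFF=%d%c", &value, &extra) != 1)
        return -1;

    *offset = value;
    return 0;
}
```

With this helper, "PARTNROFF=5" is accepted, while "PARTNROFF=5abc" and "PARTNROFF=" are both rejected.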
Signed-off-by: Will Drewry <wad@chromium.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Will Drewry [Wed, 3 Aug 2011 00:52:30 +0000 (10:52 +1000)]
Expand root=PARTUUID=UUID syntax to support selecting a root partition by
integer offset from a known, unique partition. This approach provides
similar properties to specifying a device and partition number, but using
the UUID as the unique path prior to evaluating the offset.
For example,
root=PARTUUID=99DE9194-FC15-4223-9192-FC243948F88B/PARTNROFF=1
selects the partition with UUID 99DE.. and then selects the next
partition.
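The selection rule can be illustrated with a small user-space sketch. struct demo_part and find_by_uuid_offset() are hypothetical; the kernel walks real block devices rather than an array, and this sketch assumes an exact, case-sensitive match on the UUID text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified partition record. */
struct demo_part {
    int partno;         /* partition number on the disk */
    const char *uuid;   /* partition UUID as text */
};

/*
 * Find the partition whose UUID matches, then return the partition whose
 * number equals that partition's number plus 'offset'.
 * Returns NULL if either lookup fails.
 */
const struct demo_part *find_by_uuid_offset(const struct demo_part *tbl,
                                            size_t n, const char *uuid,
                                            int offset)
{
    size_t i, j;

    for (i = 0; i < n; i++) {
        if (strcmp(tbl[i].uuid, uuid) != 0)
            continue;
        /* Anchor found: look for its neighbour by partition number. */
        for (j = 0; j < n; j++) {
            if (tbl[j].partno == tbl[i].partno + offset)
                return &tbl[j];
        }
        return NULL;
    }
    return NULL;
}
```

An offset of 1 returns the partition after the anchor, an offset of 0 returns the anchor itself, and an offset that runs past the table yields NULL.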
This change is motivated by a particular use case in Chromium OS where the
bootloader can easily determine what partition it is on (by UUID) but
doesn't perform general partition table walking.
That said, support for this model provides a direct mechanism for the user
to modify the root partition to boot without specifically needing to
extract each UUID or update the bootloader explicitly when the root
partition UUID is changed (if it is recreated to be larger, for instance).
Pinning to a /boot-style partition UUID allows arbitrary root
partition reconfiguration/modification with slightly less ambiguity than
just [dev][partition] and less stringency than the specific root partition
UUID.
Signed-off-by: Will Drewry <wad@chromium.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joe Thornber [Fri, 5 Aug 2011 00:38:11 +0000 (10:38 +1000)]
Initial EXPERIMENTAL implementation of device-mapper thin provisioning
with snapshot support. The 'thin' target is used to create instances of
the virtual devices that are hosted in the 'thin-pool' target. The
thin-pool target provides data sharing among devices. This sharing is
made possible using the persistent-data library in the previous patch.
The main highlight of this implementation, compared to the previous
implementation of snapshots, is that it allows many virtual devices to
be stored on the same data volume, simplifying administration and
allowing sharing of data between volumes (thus reducing disk usage).
Another big feature is support for arbitrary depth of recursive
snapshots (snapshots of snapshots of snapshots ...). The previous
implementation of snapshots did this by chaining together lookup tables,
and so performance was O(depth). This new implementation uses a single
data structure so we don't get this degradation with depth.
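The O(depth) cost of chained lookups can be seen in a toy model. This is only an illustration of the cost argument, not the thin-provisioning metadata format: struct demo_chain, demo_chain_lookup() and demo_flat_lookup() are hypothetical, and the flat array merely stands in for "a single data structure" keyed by device and block.

```c
#include <stddef.h>

#define DEMO_BLOCKS 4
#define DEMO_UNMAPPED (-1)

/* Chained model: a snapshot stores only its own changes and its origin. */
struct demo_chain {
    const struct demo_chain *origin;
    int map[DEMO_BLOCKS];   /* DEMO_UNMAPPED means "ask the origin" */
};

/* Walk the chain until some table maps the block; count tables visited. */
int demo_chain_lookup(const struct demo_chain *dev, int block, int *steps)
{
    *steps = 0;
    for (; dev != NULL; dev = dev->origin) {
        (*steps)++;
        if (dev->map[block] != DEMO_UNMAPPED)
            return dev->map[block];
    }
    return DEMO_UNMAPPED;
}

/* Single-structure model: one lookup keyed by (device, block). */
int demo_flat_lookup(const int (*table)[DEMO_BLOCKS], int dev, int block)
{
    return table[dev][block];
}
```

In the chained model a block that was never rewritten in a snapshot of a snapshot is found only after visiting every table down to the origin, while the single-structure lookup costs the same regardless of snapshot depth.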
For further information and examples of how to use this, please read
Documentation/device-mapper/thin-provisioning.txt
Signed-off-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
xen/trace: Fix compile error when CONFIG_XEN_PRIVILEGED_GUEST is not set
With CONFIG_XEN and CONFIG_FTRACE set we get this:
arch/x86/xen/trace.c:22: error: ‘__HYPERVISOR_console_io’ undeclared here (not in a function)
arch/x86/xen/trace.c:22: error: array index in initializer not of integer type
arch/x86/xen/trace.c:22: error: (near initialization for ‘xen_hypercall_names’)
arch/x86/xen/trace.c:23: error: ‘__HYPERVISOR_physdev_op_compat’ undeclared here (not in a function)
The issue was that the __HYPERVISOR_* definitions were not pulled in
if CONFIG_XEN_PRIVILEGED_GUEST was not set.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
hwmon: (pmbus/lm25066) Ignore byte writes to non-zero pages
pmbus_clear_faults() attempts to clear faults on non-existing real pages.
As a result, the command error bit in the status register is set, and faults
are not really cleared.
All byte writes to non-zero pages are requests to clear the status register
on that page. Since non-zero pages are virtual and do not exist on the chip,
there is nothing to do, and such requests have to be ignored. This fixes
the above problem.
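The rule described above (forward byte writes for page 0, accept and ignore them for virtual pages) can be sketched as follows. struct demo_chip and the demo_* functions are hypothetical stand-ins; the real driver talks to the chip over I2C through the pmbus core, which is not modelled here.

```c
#include <errno.h>

/* Hypothetical stand-in for the chip: records byte writes that reach it. */
struct demo_chip {
    int real_writes;
    unsigned char last_value;
};

/* Pretend to send a byte command to the real (page 0) register set. */
int demo_chip_write_byte(struct demo_chip *chip, unsigned char value)
{
    chip->real_writes++;
    chip->last_value = value;
    return 0;
}

/*
 * Page-aware byte write. Page 0 is the only real page; non-zero pages
 * are virtual, so a byte write to them is accepted and silently ignored.
 */
int demo_write_byte(struct demo_chip *chip, int page, unsigned char value)
{
    if (page < 0)
        return -EINVAL;
    if (page == 0)
        return demo_chip_write_byte(chip, value);
    return 0;   /* virtual page: nothing to clear on the chip */
}
```

A write to page 1 returns success without touching the chip, so no command error is raised for a page the hardware does not have.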
Boaz Harrosh [Thu, 4 Aug 2011 03:44:16 +0000 (20:44 -0700)]
exofs: Fix truncate for the raid-groups case
In the general raid-group case the truncate was wrong in that
it did not also fix the object length of the neighboring groups.
There are two bad cases in the old code:
1. Space that should be freed was not.
2. If a file that was big is truncated to a small size, then made bigger
again, the holes would not contain zeros but could expose old data.
(If the file grows to more than a full
groups cycle + group size (> S + T))
Since the beginning we realloced the sbi structure when a device table
bigger than one was specified. (I know that was really stupid).
Then much later, when "register bdi" was added (by Jens), it was
registering the pointer to sbi->bdi before the realloc.
We never saw this problem because up until now the realloc did not
do anything, since the device table was small enough to fit in the
original allocation. But once we started testing with large device
tables (bigger than 28) we noticed writeback crashing while operating
on a deallocated pointer.
* Avoid the whole mess by allocating the device table as a second array
and get rid of the variable-sized structure and the rest of this
mess.
* Take the chance to clean up nearby structures and comments.
* Add a needed dprint on startup to indicate the loaded layout.
* Also move the bdi registration to the very end, because it will
only fail in low memory, in which case something will probably have
failed beforehand. There are many more likely causes to not load
before that. This way the error handling is made simpler. (Just doing
this would be enough to fix the BUG)
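The first bullet can be sketched in user space. All names here (struct demo_sbi, demo_registered_bdi, the demo_* functions) are hypothetical and the structures are heavily reduced; the point is only that once the device table is a separate allocation, resizing it no longer moves the structure that contains the bdi, so a pointer registered earlier stays valid.

```c
#include <stdlib.h>

struct demo_bdi { int registered; };
struct demo_dev { int id; };

/* Fixed-size superblock info: the device table lives in its own array. */
struct demo_sbi {
    struct demo_bdi bdi;
    struct demo_dev *devs;
    unsigned numdevs;
};

/* Stand-in for the code that keeps a long-lived pointer to the bdi. */
struct demo_bdi *demo_registered_bdi;

struct demo_sbi *demo_sbi_alloc(void)
{
    return calloc(1, sizeof(struct demo_sbi));
}

/* Create or replace the device table without moving 'sbi' itself. */
int demo_sbi_set_devs(struct demo_sbi *sbi, unsigned numdevs)
{
    struct demo_dev *devs = calloc(numdevs, sizeof(*devs));
    unsigned i;

    if (!devs)
        return -1;
    for (i = 0; i < numdevs; i++)
        devs[i].id = (int)i;

    free(sbi->devs);
    sbi->devs = devs;
    sbi->numdevs = numdevs;
    return 0;
}

void demo_sbi_free(struct demo_sbi *sbi)
{
    if (sbi) {
        free(sbi->devs);
        free(sbi);
    }
}
```

With the old variable-sized layout, growing the table meant reallocating the whole structure, which could leave the registered pointer dangling. Here the address of sbi->bdi is the same before and after the table is sized.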
Boaz Harrosh [Sun, 29 May 2011 07:57:47 +0000 (10:57 +0300)]
nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4
The exofs file system wants to use the pnfs_osd_xdr.h file instead of
redefining pnfs-objects types in its private "pnfs.h" header.
Before we do the switch we must make sure pnfs_osd_xdr.h also
compiles under NFS versions smaller than 4.1, since the exofs code
now needs it regardless of version.
nfs4_string is not the only nfs4 type out in the global scope.
* stable/for-jens:
xen-blkback: refactor vbd remove/disconnect.
xen-blkback: repleace check kthread_should_stop() to remove_requested in xen_blkif_schedule() loop.
xen-blkback: add remove_requested to xen_blkif and some declares
xen/blkback: Make description more obvious.
xen/blk[front|back]: Implement the full FLUSH | FUA support.
xen-blkfront: Fix one off warning about name clash
xen-blkfront: Drop name and minor adjustments for emulated scsi devices
Igor Mammedov [Tue, 2 Aug 2011 09:45:25 +0000 (11:45 +0200)]
xen: Fix misleading WARN message at xen_release_chunk
The WARN message should not complain
"Failed to release memory %lx-%lx err=%d\n"
                          ^^^^^^^
about a range when it fails to release just one page;
instead it should say which pfn was not freed.
In addition, the line:
printk(KERN_INFO "xen_release_chunk: looking at area pfn %lx-%lx: "
...
printk(KERN_CONT "%lu pages freed\n", len);
will be split if the WARN between these two printk calls fires. So fix it
by using a single printk for this.
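Both changes can be sketched in user space with snprintf() standing in for printk() and WARN(). The function names are hypothetical and the exact wording of the per-page failure message is an assumption; the report text is joined from the two printk fragments quoted above.

```c
#include <stdio.h>

/*
 * Format the whole area report in one call, so that no other message
 * can be emitted between its first and second half.
 */
int demo_format_release_report(char *buf, size_t size, unsigned long start,
                               unsigned long end, unsigned long freed)
{
    return snprintf(buf, size,
                    "xen_release_chunk: looking at area pfn %lx-%lx: "
                    "%lu pages freed\n", start, end, freed);
}

/* Per-page failure: name the single pfn instead of a range. */
int demo_format_release_failure(char *buf, size_t size, unsigned long pfn,
                                int err)
{
    return snprintf(buf, size, "Failed to release pfn %lx err=%d\n",
                    pfn, err);
}
```

Because the report is produced only after the loop has counted the freed pages, a failure message issued inside the loop appears before the report rather than in the middle of it.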
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Joe Jin [Thu, 4 Aug 2011 07:25:44 +0000 (15:25 +0800)]
xen-blkback: refactor vbd remove/disconnect.
This patch refactors vbd remove/disconnect.
1. Add a blkback shutdown watch for the remove/disconnect.
2. Don't disconnect the backend while the frontend state is XenbusStateClosing;
wait until the frontend state has changed to XenbusStateClosed.
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Joe Jin [Thu, 4 Aug 2011 07:24:44 +0000 (15:24 +0800)]
xen-blkback: repleace check kthread_should_stop() to remove_requested in xen_blkif_schedule() loop.
When the backend state changes to XenbusStateClosed, remove_requested is set,
so replace the kthread_should_stop() check with a remove_requested check in
the xen_blkif_schedule() loop.
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
--
drivers/block/xen-blkback/blkback.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Joe Jin [Thu, 4 Aug 2011 07:23:41 +0000 (15:23 +0800)]
xen-blkback: add remove_requested to xen_blkif and some declares
Add remove_requested to xen_blkif, along with some declarations.
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Linus Torvalds [Thu, 4 Aug 2011 16:36:20 +0000 (06:36 -1000)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
[PARISC] wire up sendmmsg syscall
[PARISC] fix return type of __atomic64_add_return
[PARISC] Fix futex support
Linus Torvalds [Thu, 4 Aug 2011 16:35:51 +0000 (06:35 -1000)]
Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] signal: use set_restore_sigmask() helper
[S390] smp: remove pointless comments in startup_secondary()
[S390] qdio: Use kstrtoul_from_user
[S390] sclp_async: Use kstrtoul_from_user
[S390] exec: remove redundant set_fs(USER_DS)
[S390] cpu hotplug: on cpu start wait until being marked active
[S390] signal: convert to use set_current_blocked()
[S390] asm offsets: fix coding style
[S390] Add support for IBM zEnterprise 114
[S390] dasd: check if raw track access is supported
[S390] Use diagnose 308 for system reset
[S390] Export store_status() function
[S390] dasd: use vmalloc for statistics input buffer
[S390] Add PSW restart shutdown trigger
[S390] missing return in page_table_alloc_pgste
[S390] qdio: 2nd stage retry on SIGA-W busy conditions
Arnaud Lacombe [Thu, 4 Aug 2011 14:39:44 +0000 (10:39 -0400)]
eisa/pci_eisa.c: fix BUG introduced by 005bdad7b80
While `pci_eisa_driver' still refers to `pci_eisa_init', the .probe() function
should not be called after init memory release, as pointed out by commit
74b9a297. The structure is still referenced in the drivers subsystem, and can
be accessed through sysfs, so the modpost warning is a false positive. Mark
it as such.
At the same time, the warning referenced in 005bdad7b80 only mentioned
`pci_eisa_driver', not `pci_eisa_pci_tbl', so remove its marking.
Trond Myklebust [Tue, 2 Aug 2011 18:46:52 +0000 (14:46 -0400)]
NFSv4.1: Return NFS4ERR_BADSESSION to callbacks during session resets
If the client is in the process of resetting the session when it receives
a callback, then returning NFS4ERR_DELAY may cause a deadlock with the
DESTROY_SESSION call.
Basically, if the client returns NFS4ERR_DELAY in response to the
CB_SEQUENCE call, then the server is entitled to believe that the
client is busy because it is already processing that call. In that
case, the server is perfectly entitled to respond with a
NFS4ERR_BACK_CHAN_BUSY to any DESTROY_SESSION call.
Fix this by having the client reply with a NFS4ERR_BADSESSION in
response to the callback if it is resetting the session.
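The decision described above can be sketched as a small status function. The enum values follow the numbering in the NFSv4 and NFSv4.1 specifications; struct demo_session and demo_cb_sequence_status() are hypothetical, and the "busy" case is an assumed placeholder for whatever conditions legitimately produce NFS4ERR_DELAY.

```c
/* Status codes as numbered in the NFSv4/NFSv4.1 specifications. */
enum demo_nfs4_status {
    DEMO_NFS4_OK = 0,
    DEMO_NFS4ERR_DELAY = 10008,
    DEMO_NFS4ERR_BADSESSION = 10052,
};

/* Hypothetical, reduced view of the client's session state. */
struct demo_session {
    int resetting;   /* non-zero while the session is being reset */
    int busy;        /* non-zero while the callback slot is in use */
};

/*
 * Choose the reply to CB_SEQUENCE. A session reset takes priority:
 * replying NFS4ERR_DELAY there would tell the server that the client is
 * still processing the call, which can block DESTROY_SESSION.
 */
enum demo_nfs4_status demo_cb_sequence_status(const struct demo_session *s)
{
    if (s->resetting)
        return DEMO_NFS4ERR_BADSESSION;
    if (s->busy)
        return DEMO_NFS4ERR_DELAY;
    return DEMO_NFS4_OK;
}
```

The ordering of the two checks is the point of the sketch: a session that is both resetting and busy still answers NFS4ERR_BADSESSION.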
Trond Myklebust [Tue, 2 Aug 2011 18:46:29 +0000 (14:46 -0400)]
NFSv4.1: Fix the callback 'highest_used_slotid' behaviour
Currently, there is no guarantee that we will call nfs4_cb_take_slot() even
though nfs4_callback_compound() will consistently call
nfs4_cb_free_slot() provided the cb_process_state has set the 'clp' field.
The result is that we can trigger the BUG_ON() upon the next call to
nfs4_cb_take_slot().
This patch fixes the above problem by using the slot id that was taken in
the CB_SEQUENCE operation as a flag for whether or not we need to call
nfs4_cb_free_slot().
It also fixes an atomicity problem: we need to set tbl->highest_used_slotid
atomically with the check for NFS4_SESSION_DRAINING, otherwise we end up
racing with the various tests in nfs4_begin_drain_session().
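The idea of using the slot id as a "was a slot taken?" flag can be sketched as below. Every name is hypothetical (DEMO_NO_SLOT, struct demo_slot_table, struct demo_cb_state and the two functions), the sentinel value is an assumption, and the sketch is single-threaded; in the kernel the draining check and the update of highest_used_slotid would have to happen under the same lock to be atomic.

```c
#define DEMO_NO_SLOT (-1)

/* Reduced slot table: tracks the highest slot in use and drain state. */
struct demo_slot_table {
    int highest_used_slotid;
    int draining;
};

/* Per-callback state: slotid stays DEMO_NO_SLOT until a slot is taken. */
struct demo_cb_state {
    int slotid;
};

/*
 * Take a slot unless the session is draining. The check and the update
 * are done together in one function so they can share one critical
 * section in a locked implementation.
 */
int demo_cb_take_slot(struct demo_slot_table *tbl, struct demo_cb_state *cps,
                      int slotid)
{
    if (tbl->draining)
        return -1;
    tbl->highest_used_slotid = slotid;
    cps->slotid = slotid;
    return 0;
}

/* Free the slot only if this callback actually took one. */
void demo_cb_free_slot(struct demo_slot_table *tbl, struct demo_cb_state *cps)
{
    if (cps->slotid == DEMO_NO_SLOT)
        return;
    tbl->highest_used_slotid = DEMO_NO_SLOT;
    cps->slotid = DEMO_NO_SLOT;
}
```

Calling demo_cb_free_slot() for a callback that never reached CB_SEQUENCE leaves the table untouched, which avoids the unbalanced free that the text describes.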