Joe Thornber [Tue, 9 Aug 2011 00:50:00 +0000 (10:50 +1000)]
Initial EXPERIMENTAL implementation of device-mapper thin provisioning
with snapshot support. The 'thin' target is used to create instances of
the virtual devices that are hosted in the 'thin-pool' target. The
thin-pool target provides data sharing among devices. This sharing is
made possible using the persistent-data library in the previous patch.
The main highlight of this implementation, compared to the previous
implementation of snapshots, is that it allows many virtual devices to
be stored on the same data volume, simplifying administration and
allowing sharing of data between volumes (thus reducing disk usage).
Another big feature is support for arbitrary depth of recursive
snapshots (snapshots of snapshots of snapshots ...). The previous
implementation of snapshots did this by chaining together lookup tables,
and so performance was O(depth). This new implementation uses a single
data structure so we don't get this degradation with depth.
For further information and examples of how to use this, please read
Documentation/device-mapper/thin-provisioning.txt
Signed-off-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mikulas Patocka [Tue, 9 Aug 2011 00:49:58 +0000 (10:49 +1000)]
This patch introduces dm_kcopyd_zero() to make it easy to use
kcopyd to write zeros into the requested areas instead of
copying. It is implemented by passing a NULL copying source
to dm_kcopyd_copy().
The forthcoming thin provisioning target uses this.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
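The NULL-source convention described for dm_kcopyd_zero() can be illustrated with a minimal user-space sketch. The names sketch_copy() and sketch_zero() are hypothetical, and the signatures are not those of dm_kcopyd_copy(); the sketch only shows the idea that zeroing is a copy with no source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for a copy routine: a NULL source
 * means "fill the destination with zeros" instead of copying.
 * This is NOT the kernel's dm_kcopyd_copy() signature.
 */
static void sketch_copy(const unsigned char *src, unsigned char *dst, size_t len)
{
    assert(dst != NULL);
    if (src == NULL)
        memset(dst, 0, len);   /* zeroing path */
    else
        memcpy(dst, src, len); /* ordinary copy path */
}

/* Zeroing is just a copy with no source, mirroring the idea in the patch. */
static void sketch_zero(unsigned char *dst, size_t len)
{
    sketch_copy(NULL, dst, len);
}
```

Keeping a single entry point means the zeroing path reuses whatever the copy path already does; only the source handling differs.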
Allow QUEUE_FLAG_NONROT to propagate up the device stack if all
underlying devices are non-rotational. Tools like ureadahead will
schedule IOs differently based on the rotational flag.
With this patch, I see boot time go from 7.75 s to 7.46 s on my device.
Suggested-by: J. Richard Barnette <jrbarnette@chromium.org>
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: dm-devel@redhat.com
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
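The propagation rule for QUEUE_FLAG_NONROT (set on the stacked device only if all underlying devices are non-rotational) amounts to a logical AND over the stack. A minimal user-space sketch follows; struct sketch_dev and sketch_stack_nonrot() are hypothetical names, not kernel structures or functions, and treating an empty stack as rotational is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one underlying device; not a kernel structure. */
struct sketch_dev {
    bool nonrot; /* true if the device is non-rotational (e.g. flash) */
};

/*
 * The stacked device is reported as non-rotational only if every
 * underlying device is. An empty stack is treated as rotational here,
 * which is an assumption of this sketch.
 */
static bool sketch_stack_nonrot(const struct sketch_dev *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    if (n == 0)
        return false;
    for (size_t i = 0; i < n; i++)
        if (!devs[i].nonrot)
            return false; /* one rotational device is enough to clear the flag */
    return true;
}
```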
Mike Snitzer [Tue, 9 Aug 2011 00:49:57 +0000 (10:49 +1000)]
Commit a63a5cf (dm: improve block integrity support) introduced a
two-phase initialization of a DM device's integrity profile. This
patch avoids dereferencing a NULL 'template_disk' pointer in
blk_integrity_register() if there is an integrity profile mismatch in
dm_table_set_integrity().
This can occur if the integrity profiles for stacked devices in a DM
table are changed between the call to dm_table_prealloc_integrity() and
dm_table_set_integrity().
Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org # 2.6.39
* pm-domains:
ARM / shmobile: Make A3RV be a subdomain of A4LC on SH7372
PM / Domains: Rename argument of pm_genpd_add_subdomain()
PM / Domains: Rename GPD_STATE_WAIT_PARENT to GPD_STATE_WAIT_MASTER
PM / Domains: Allow generic PM domains to have multiple masters
PM / Domains: Add "wait for parent" status for generic PM domains
PM / Domains: Make pm_genpd_poweron() always survive parent removal
PM / Domains: Do not take parent locks to modify subdomain counters
PM / Domains: Implement subdomain counters as atomic fields
ARM / shmobile: Make A3RV be a subdomain of A4LC on SH7372
Instead of coding the undocumented dependencies between power domains
A3RV and A4LC on SH7372 directly into the low-level power up/down
routines, make A3RV be a subdomain of A4LC, which will cause the
same dependencies to hold.
PM / Domains: Rename argument of pm_genpd_add_subdomain()
Change the name of the second argument of pm_genpd_add_subdomain()
so that it is (a) shorter and (b) in agreement with the name of
the second argument of pm_genpd_remove_subdomain().
PM / Domains: Rename GPD_STATE_WAIT_PARENT to GPD_STATE_WAIT_MASTER
Since it is now possible for a PM domain to have multiple masters
instead of one parent, rename the "wait for parent" status to reflect
the new situation.
PM / Domains: Allow generic PM domains to have multiple masters
Currently, for a given generic PM domain there may be only one parent
domain (i.e. a PM domain it depends on). However, there is at least
one real-life case in which there should be two parents (masters) for
one PM domain (the A3RV domain on SH7372 turns out to depend on both
the A4LC domain and the A4R domain at the same time). For
this reason, allow a PM domain to have multiple parents (masters) by
introducing objects representing links between PM domains.
The (logical) links between PM domains represent relationships in
which one domain is a master (i.e. it is depended on) and another
domain is a slave (i.e. it depends on the master) with the rule that
the slave cannot be powered on if the master is not powered on and
the master cannot be powered off if the slave is not powered off.
Each struct generic_pm_domain object representing a PM domain has
two lists of links, a list of links in which it is a master and
a list of links in which it is a slave. The first of these lists
replaces the list of subdomains and the second one is used in place
of the parent pointer.
Each link is represented by a struct gpd_link object containing
pointers to the master and the slave and two struct list_head
members allowing it to hook into two lists (the master's list
of "master" links and the slave's list of "slave" links). This
allows the code to get to the link from each side (either from
the master or from the slave) and follow it in each direction.
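The link layout described above can be sketched in user-space C. This is an illustration under stated assumptions, not the code from the patch: the list helpers are a minimal stand-in for the kernel's list_head API, and the field and function names (master_links, slave_links, link_domains(), count_masters(), first_master()) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Field names below are illustrative, not taken from the patch. */
struct domain {
    const char *name;
    struct list_head master_links; /* links in which this domain is the master */
    struct list_head slave_links;  /* links in which this domain is the slave */
};

struct link {
    struct domain *master;
    struct domain *slave;
    struct list_head master_node; /* hooks into master->master_links */
    struct list_head slave_node;  /* hooks into slave->slave_links */
};

static void domain_init(struct domain *d, const char *name)
{
    d->name = name;
    list_init(&d->master_links);
    list_init(&d->slave_links);
}

/* Record that 'slave' depends on 'master' by hooking one link into both lists. */
static void link_domains(struct link *l, struct domain *master, struct domain *slave)
{
    assert(master != slave);
    l->master = master;
    l->slave = slave;
    list_add_tail(&l->master_node, &master->master_links);
    list_add_tail(&l->slave_node, &slave->slave_links);
}

/* Count the masters of a domain by walking its list of "slave" links. */
static size_t count_masters(const struct domain *d)
{
    size_t n = 0;
    for (const struct list_head *p = d->slave_links.next; p != &d->slave_links; p = p->next)
        n++;
    return n;
}

/* Follow the first "slave" link from the slave side to reach its master. */
static struct domain *first_master(struct domain *d)
{
    struct list_head *p = d->slave_links.next;
    if (p == &d->slave_links)
        return NULL; /* no masters */
    struct link *l = (struct link *)((char *)p - offsetof(struct link, slave_node));
    return l->master;
}
```

Because each link sits on two lists at once, a domain such as A3RV can have both A4LC and A4R as masters without any per-domain parent pointer.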
PM / Domains: Add "wait for parent" status for generic PM domains
The next patch will make it possible for a generic PM domain to have
multiple parents (i.e. multiple PM domains it depends on). To
prepare for that change it is necessary to change pm_genpd_poweron()
so that it doesn't jump to the start label after running itself
recursively for the parent domain. For this purpose, introduce a new
PM domain status value GPD_STATE_WAIT_PARENT that will be set by
pm_genpd_poweron() before calling itself recursively for the parent
domain and modify the code in drivers/base/power/domain.c so that
the GPD_STATE_WAIT_PARENT status is guaranteed to be preserved during
the execution of pm_genpd_poweron() for the parent.
This change also causes pm_genpd_add_subdomain() and
pm_genpd_remove_subdomain() to wait for a started pm_genpd_poweron() to
complete and allows pm_genpd_runtime_resume() to avoid dropping the
lock after powering on the PM domain.
PM / Domains: Make pm_genpd_poweron() always survive parent removal
If pm_genpd_remove_subdomain() is called to remove a PM domain's
subdomain and pm_genpd_poweron() is called for that subdomain at
the same time, and the pm_genpd_poweron() called by it recursively
for the parent returns an error, the first pm_genpd_poweron()'s
error code path will attempt to decrement the subdomain counter of
a PM domain that it's not a subdomain of any more.
Rearrange the code in pm_genpd_poweron() to prevent this from
happening.
PM / Domains: Do not take parent locks to modify subdomain counters
After the subdomain counter in struct generic_pm_domain has been
changed into an atomic_t field, it is possible to modify
pm_genpd_poweron() and pm_genpd_poweroff() so that they don't take
the parent's lock. This requires pm_genpd_poweron() to increment
the parent's subdomain counter before calling itself recursively
for the parent and to decrement it if an error is to be returned.
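The ordering described here (increment the parent's counter, recurse into the parent, undo the increment on error) can be sketched in user-space C with C11 atomics. This is a simplified model, not the code in drivers/base/power/domain.c: struct sketch_domain, sketch_domain_init(), sketch_poweron() and the fail_power_on test hook are all hypothetical, and the real function's locking and status handling are omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of a PM domain. */
struct sketch_domain {
    struct sketch_domain *parent; /* NULL for a top-level domain */
    atomic_int sd_count;          /* number of active subdomains */
    bool powered;
    bool fail_power_on;           /* test hook: simulate a hardware failure */
};

static void sketch_domain_init(struct sketch_domain *d, struct sketch_domain *parent)
{
    d->parent = parent;
    atomic_init(&d->sd_count, 0);
    d->powered = false;
    d->fail_power_on = false;
}

/* Returns 0 on success and a negative value on failure. */
static int sketch_poweron(struct sketch_domain *d)
{
    assert(d != NULL);
    if (d->powered)
        return 0;

    if (d->parent) {
        /* Claim the parent first, without taking any parent lock. */
        atomic_fetch_add(&d->parent->sd_count, 1);
        int ret = sketch_poweron(d->parent);
        if (ret) {
            /* The parent could not be powered on: drop the claim. */
            atomic_fetch_sub(&d->parent->sd_count, 1);
            return ret;
        }
    }

    if (d->fail_power_on) {
        /* Our own power-on failed: also drop the claim on the parent. */
        if (d->parent)
            atomic_fetch_sub(&d->parent->sd_count, 1);
        return -1;
    }

    d->powered = true;
    return 0;
}
```

Incrementing before the recursive call means the parent already sees an active subdomain while it is being powered on, so the counter never under-reports a subdomain that is in the middle of powering up.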
PM / Domains: Implement subdomain counters as atomic fields
Currently, pm_genpd_poweron() and pm_genpd_poweroff() need to take
the parent PM domain's lock in order to modify the parent's counter
of active subdomains in a non-racy way. This causes the locking to be
considerably complex and in fact is not necessary, because the
subdomain counters may be implemented as atomic fields and they
won't have to be modified under a lock.
Replace the unsigned int sd_count field in struct generic_pm_domain
by an atomic_t one and modify the code in drivers/base/power/domain.c
to take this change into account.
This patch doesn't change the locking yet; that is going to be done
in a separate subsequent patch.
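As a rough user-space analogy for the field change, C11's atomic_int can play the role of the kernel's atomic_t. The struct and helper names below (counter_domain, sd_inc(), sd_dec()) are hypothetical and only show that the counter can be updated safely without holding a lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the counter part of struct generic_pm_domain. */
struct counter_domain {
    atomic_int sd_count; /* was a plain unsigned int guarded by a lock */
};

/* Register one more active subdomain; no lock is needed. */
static void sd_inc(struct counter_domain *d)
{
    atomic_fetch_add(&d->sd_count, 1);
}

/* Drop one active subdomain and return how many remain. */
static int sd_dec(struct counter_domain *d)
{
    int old = atomic_fetch_sub(&d->sd_count, 1);
    assert(old > 0); /* the counter must never go negative */
    return old - 1;
}
```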
Colin Cross [Mon, 8 Aug 2011 21:39:36 +0000 (23:39 +0200)]
PM / Runtime: Add might_sleep() to runtime PM functions
Some of the entry points to pm runtime are not safe to
call in atomic context unless pm_runtime_irq_safe() has
been called. Inspecting the code, it is not immediately
obvious that the functions sleep at all, as they run
inside a spin_lock_irqsave, but under some conditions
they can drop the lock and turn on irqs.
If a driver incorrectly calls the pm_runtime APIs, it can
cause sleeping and irq processing when it expects to stay
in atomic context.
Add might_sleep_if to the majority of the __pm_runtime_* entry points
to enforce correct usage.
Add pm_runtime_put_sync_autosuspend to the list of
functions that can be called in atomic context.
Signed-off-by: Colin Cross <ccross@android.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>