git.karo-electronics.de Git - karo-tx-linux.git/log

Add read-only and fail-io modes to thin provisioning.

If a transaction commit fails the pool's metadata device will transition
to "read-only" mode. If a further commit fails while already in read-only
mode, the transition to "fail-io" mode occurs.

Once in fail-io mode the pool and all associated thin devices will
report a status of "Fail".

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
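
Note: the mode transitions described in this message can be sketched in
userspace C as below. This is only an illustration; the enum, the function
names and the "write" mode label are hypothetical and are not taken from the
dm-thin source.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only. */
enum pool_mode_sketch {
	MODE_WRITE,      /* normal operation (label assumed) */
	MODE_READ_ONLY,  /* entered after a failed commit */
	MODE_FAIL_IO     /* entered after a failed commit in read-only mode */
};

/* Return the mode the pool is in after a commit attempt. */
static enum pool_mode_sketch mode_after_commit(enum pool_mode_sketch mode,
					       bool commit_failed)
{
	if (!commit_failed)
		return mode;

	switch (mode) {
	case MODE_WRITE:
		return MODE_READ_ONLY;  /* first failure: degrade */
	case MODE_READ_ONLY:
	case MODE_FAIL_IO:
		return MODE_FAIL_IO;    /* failure while degraded */
	}
	return MODE_FAIL_IO;
}

/* The pool and its thin devices report "Fail" only in fail-io mode. */
static bool reports_fail(enum pool_mode_sketch mode)
{
	return mode == MODE_FAIL_IO;
}
```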

Joe Thornber [Tue, 24 Jul 2012 23:25:33 +0000 (09:25 +1000)]

Introduce dm_pool_abort_metadata to abort the current metadata
transaction. Generally this will only be called when bad things are
happening and dm-thin is trying to roll back to a good state for
read-only mode.

This is complicated by the fact that the metadata device may have failed
completely, leaving the abort unable to read the old transaction.
In this case the metadata object is placed in a 'fail' mode and
every operation fails apart from destroying it.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:32 +0000 (09:25 +1000)]

Introduce dm_pool_metadata_set_read_only to put the underlying block
manager into read-only mode.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:32 +0000 (09:25 +1000)]

Introduce dm_bm_set_read_only to switch the block manager into a
read-only mode. To be used when dm-thin degrades due to io errors on
the metadata device.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:32 +0000 (09:25 +1000)]

Reduce the number of metadata commits by using
dm_thin_changed_this_transaction to check if metadata was changed on a
per thin device granularity.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
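
Note: the idea of committing only when a thin device changed metadata can be
sketched as below. The structs and helpers are hypothetical stand-ins, and the
sketch assumes a single thin device so that a commit simply clears its flag;
it is not the dm-thin implementation.

```c
#include <stdbool.h>

/* Hypothetical per-device state: has this device touched the metadata
 * since the last commit? */
struct thin_dev_sketch {
	bool changed_this_transaction;
};

/* Hypothetical pool state: counts how many commits were issued. */
struct pool_sketch {
	unsigned int commits;
};

/* Any metadata change on behalf of the device marks it as changed. */
static void thin_insert_mapping_sketch(struct thin_dev_sketch *td)
{
	td->changed_this_transaction = true;
}

/* On a flush, commit only if this device changed the metadata. */
static void thin_handle_flush_sketch(struct pool_sketch *pool,
				     struct thin_dev_sketch *td)
{
	if (!td->changed_this_transaction)
		return;               /* nothing to make durable */

	pool->commits++;              /* stands in for a metadata commit */
	td->changed_this_transaction = false;
}
```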

Joe Thornber [Tue, 24 Jul 2012 23:25:31 +0000 (09:25 +1000)]

Introduce dm_thin_changed_this_transaction to dm-thin-metadata to publish a
useful bit of information we're already tracking. This will help dm thin
decide when to commit.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:31 +0000 (09:25 +1000)]

Add a parameter to dm_pool_metadata_open to indicate whether or not an unformatted
metadata area should be formatted.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:30 +0000 (09:25 +1000)]

Tidy up error path in __open_metadata and __format_metadata in dm-thin-metadata.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Mike Snitzer [Tue, 24 Jul 2012 23:25:30 +0000 (09:25 +1000)]

Factor out __check_incompat_features and only call it once when we open
the metadata device rather than at the beginning of every transaction.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:29 +0000 (09:25 +1000)]

Remove some duplicate initialisation of struct dm_pool_metadata.

These pmd fields are initialised by both:
__format_metadata's calls to dm_btree_empty
__write_initial_superblock + __begin_transaction

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:29 +0000 (09:25 +1000)]

Remove 'create' parameter from __create_persistent_data_objects() in dm-thin-metadata.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:29 +0000 (09:25 +1000)]

Move the check for __superblock_all_zeroes from
__create_persistent_data_objects() down to __open_or_format_metadata in
dm-thin-metadata.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:28 +0000 (09:25 +1000)]

Remove nr_blocks arg from __create_persistent_data_objects in dm-thin-metadata.
It was always passed as zero.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:28 +0000 (09:25 +1000)]

Split __open_or_format_metadata into __format_metadata and __open_metadata in
dm-thin-metadata.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:27 +0000 (09:25 +1000)]

Clean up __open_or_format_metadata in dm-thin-metadata by using struct
dm_pool_metadata members to replace local variables.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:27 +0000 (09:25 +1000)]

Zero the unused uuid when initialising the metadata superblock.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:26 +0000 (09:25 +1000)]

Lift the call to __begin_transaction out of __write_initial_superblock in dm-thin-metadata.
Called higher up the call chain now.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:26 +0000 (09:25 +1000)]

Move dm_commit_pool_metadata inline into __write_initial_superblock in dm-thin-metadata.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:26 +0000 (09:25 +1000)]

Factor out __write_initial_superblock and also pull some other initial
creation code out of dm_pool_metadata_open.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:25 +0000 (09:25 +1000)]

Lift some initialisation out of __open_or_format_metadata in dm-thin-metadata.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:25 +0000 (09:25 +1000)]

Factor __destroy_persistent_data_objects out of dm_pool_metadata_close.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:24 +0000 (09:25 +1000)]

Move block manager creation and the check for unformatted metadata into
__create_persistent_data_objects().

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:24 +0000 (09:25 +1000)]

Rename init_pmd to __create_persistent_data_objects in dm-thin-metadata.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:24 +0000 (09:25 +1000)]

Introduce wrappers to handle write locking the superblock
appropriately.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:23 +0000 (09:25 +1000)]

Stop using dm_bm_unlock_move when shadowing blocks in the transaction
manager as an optimisation and remove the function as it is then no
longer used.

Some code, such as the space maps, keeps using on-disk data structures
from the previous transaction. It can do this because blocks won't
be reallocated until the subsequent transaction. Using
dm_bm_unlock_move to copy blocks sounds like a win, but it forces a
synchronous read should the old block be accessed.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:23 +0000 (09:25 +1000)]

Tidy the transaction manager creation functions.

They no longer lock the superblock. Superblock locking is pulled out to
the caller.

Also export dm_bm_write_lock_zero.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:22 +0000 (09:25 +1000)]

Remove an optimisation that tracks whether or not a thin metadata commit
is needed.

If dm_pool_commit_metadata() is called and no changes have been made
to the metadata then this optimisation avoided writing to disk.

Removing because we're going to do something better later.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:22 +0000 (09:25 +1000)]

This patch introduces a separate struct for the block_manager.
It also uses IS_ERR to check the return value of dm_bufio_client_create
instead of testing incorrectly for NULL.

Prior to this patch a struct dm_block_manager was really an alias for
a struct dm_bufio_client. We want to add some functionality to the
block manager that will require extra fields, so this one to one
mapping is no longer valid.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
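
Note: the difference between an IS_ERR check and a NULL check can be shown
with a userspace emulation of the kernel's error-pointer convention. The
helpers err_ptr, is_err and ptr_err below are stand-ins for ERR_PTR, IS_ERR
and PTR_ERR, and client_create_sketch is a hypothetical constructor that
fails the way an ERR_PTR-returning function does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer falls in the top MAX_ERRNO_SKETCH addresses. */
static bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Recover the negative errno value from an error pointer. */
static long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Hypothetical constructor: returns an error pointer, never NULL. */
static void *client_create_sketch(bool fail)
{
	static int dummy_client;

	return fail ? err_ptr(-ENOMEM) : (void *)&dummy_client;
}

/* Returns 0 on success or a negative errno value. */
static int block_manager_create_sketch(bool fail)
{
	void *client = client_create_sketch(fail);

	if (is_err(client))          /* a "client == NULL" test would miss this */
		return (int)ptr_err(client);
	return 0;
}
```

A failed call returns a non-NULL pointer, which is why a NULL test lets the
error through.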

Joe Thornber [Tue, 24 Jul 2012 23:25:21 +0000 (09:25 +1000)]

Factor __setup_btree_details out of init_pmd in dm-thin-metadata.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Alasdair G Kergon [Tue, 24 Jul 2012 23:25:21 +0000 (09:25 +1000)]

Use boolean bit fields for flags in struct dm_target.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
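
Note: a minimal sketch of what boolean bit fields look like. The struct is a
hypothetical stand-in for the flag members of struct dm_target; only the two
flag names that appear elsewhere in this log are used.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the flag members of struct dm_target. */
struct target_flags_sketch {
	bool flush_supported:1;
	bool split_discard_requests:1;
};

/* Flags as a thin-like target would set them, per the messages in this log. */
static struct target_flags_sketch thin_like_flags(void)
{
	struct target_flags_sketch f = {
		.flush_supported = true,
		.split_discard_requests = true,
	};

	return f;
}
```

One property of a bool bit field is that any non-zero value assigned to it is
stored as true, whereas a one-bit unsigned field keeps only the lowest bit.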

Joe Thornber [Tue, 24 Jul 2012 23:25:21 +0000 (09:25 +1000)]

The thin provisioning target commits internal metadata on flush. So it
should receive flushes regardless of whether the underlying devices
support them.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:20 +0000 (09:25 +1000)]

Allow targets to override the 'supports flush' calculation.

Set 'flush_supported' if a target needs to receive flushes regardless of
whether or not its underlying devices have support.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
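
Note: the override can be sketched as below. The struct and the function are
hypothetical simplifications; in particular the per-device flush check is
collapsed into a single boolean per target.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one target in a table. */
struct target_sketch {
	bool flush_supported;         /* target asks for flushes regardless */
	bool underlying_dev_flushes;  /* some underlying device supports flush */
};

/* A table passes flushes down if any target overrides the calculation
 * or if any underlying device supports flush. */
static bool table_supports_flush_sketch(const struct target_sketch *targets,
					size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (targets[i].flush_supported)
			return true;
		if (targets[i].underlying_dev_flushes)
			return true;
	}
	return false;
}
```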

Joe Thornber [Tue, 24 Jul 2012 23:25:20 +0000 (09:25 +1000)]

Introduce bitmap_index_changed to track whether or not the index changed,
and only commit a space map if it did.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:19 +0000 (09:25 +1000)]

Unlock the superblock even if initial dm_bufio_write_dirty_buffers fails.

Also, remove redundant flush calls. dm_bm_flush_and_unlock's calls to
dm_bufio_write_dirty_buffers already result in dm_bufio_issue_flush
being called.

This avoids warnings about unflushed dirty buffers from bufio.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Joe Thornber [Tue, 24 Jul 2012 23:25:19 +0000 (09:25 +1000)]

There's no need to break sharing, triggering a copy, for a write that has no
data (i.e. a flush).

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
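
Note: the condition can be sketched as a small predicate. The struct and its
field names are hypothetical; they only model the three facts the decision
depends on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a request hitting a shared block. */
struct request_sketch {
	bool is_write;        /* write direction */
	uint32_t data_bytes;  /* 0 for an empty write such as a flush */
	bool block_shared;    /* target block is shared with a snapshot */
};

/* Sharing only needs to be broken for a write that carries data. */
static bool needs_break_sharing(const struct request_sketch *rq)
{
	return rq->block_shared && rq->is_write && rq->data_bytes != 0;
}
```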

Joe Thornber [Tue, 24 Jul 2012 23:25:18 +0000 (09:25 +1000)]

Fix memory leak in process_prepared_mapping by always freeing
the dm_thin_new_mapping structs from the mapping_pool mempool on
the error paths.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
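
Note: the shape of such a fix, with a single exit path that always frees the
mapping, can be sketched in userspace as below. calloc and free stand in for
the mempool, and every name here is hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Count of allocated mappings, used to observe leaks in this sketch. */
static int live_mappings;

struct mapping_sketch {
	int err;  /* non-zero if the earlier copy or zero step failed */
};

static struct mapping_sketch *mapping_alloc(void)
{
	struct mapping_sketch *m = calloc(1, sizeof(*m));

	if (m)
		live_mappings++;
	return m;
}

static void mapping_free(struct mapping_sketch *m)
{
	free(m);
	live_mappings--;
}

/* Every path, including both error paths, leaves through "out",
 * so the mapping is always returned to the allocator. */
static int process_prepared_sketch(struct mapping_sketch *m, bool insert_fails)
{
	int r = 0;

	if (m->err) {
		r = -1;
		goto out;
	}
	if (insert_fails) {
		r = -1;
		goto out;
	}
out:
	mapping_free(m);
	return r;
}
```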

Jonathan E Brassow [Tue, 24 Jul 2012 23:25:18 +0000 (09:25 +1000)]

In preparation for RAID10 inclusion in dm-raid, we move the sectors_per_dev
calculation later in the device creation process. This is because we won't
know up-front how many stripes vs how many mirrors there are, which will
change the calculation.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Jonathan E Brassow [Tue, 24 Jul 2012 23:25:18 +0000 (09:25 +1000)]

In preparation for RAID10 addition to dm-raid, we change an 'if' conditional
to a 'switch' conditional to make it easier to see what is being checked for
each RAID type.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Mike Snitzer [Tue, 24 Jul 2012 23:25:17 +0000 (09:25 +1000)]

A SCSI device handler might get attached to a device during the
initial device scan.  We do not necessarily want to override
this when loading a multipath table, so this patch adds a new
multipath feature argument "retain_attached_hw_handler".

During SCSI device scan all loaded SCSI device handlers will be
consulted for a match (via scsi_dh's provided .match).  If a match is
found that device handler will be attached.  We need a way to have
userspace multipathd's provided 'hw_handler' not override the already
attached hardware handler.

When specifying the new feature 'retain_attached_hw_handler' multipath
will use the currently attached hardware handler instead of trying to
attach the one specified during table load.  If no hardware handler is
attached the specified hardware handler will still be used.

Leverages scsi_dh_attach's ability to increment the scsi_dh's reference
count if the same scsi_dh name is provided when attaching. The currently
attached scsi_dh name is determined with scsi_dh_attached_handler_name.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Tested-by: Babu Moger <babu.moger@netapp.com>
Reviewed-by: Chandra Seetharaman <sekharan@us.ibm.com>
Acked-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
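
Note: the selection rule described in this message can be sketched as a pure
function. choose_hw_handler is a hypothetical helper, not a function from
dm-mpath; a NULL "attached" argument models a device with no handler attached.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Pick the hardware handler name to attach.
 *   attached  - name of the handler already attached, or NULL if none
 *   requested - name given in the multipath table
 *   retain    - whether "retain_attached_hw_handler" was specified
 */
static const char *choose_hw_handler(const char *attached,
				     const char *requested,
				     bool retain)
{
	if (retain && attached != NULL)
		return attached;   /* keep what the SCSI scan attached */
	return requested;          /* otherwise use the table's handler */
}
```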

Mikulas Patocka [Tue, 24 Jul 2012 23:25:17 +0000 (09:25 +1000)]

dm-thin will most likely be used with a block size that is a power of
two, so it should be optimized for this case.

This patch changes division and modulo operations to shifts and bit
masks if block size is a power of two.

The test that bi_sector is divisible by the block size is removed from
io_overlaps_block. Device mapper never sends bios that span a block
boundary. Consequently, if bi_size is equal to the block size,
bi_sector must already be on a block boundary.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
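
Note: the shift and mask fast path can be sketched in userspace as below. The
function names are hypothetical, and a real implementation would compute the
shift once at setup rather than on every call.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if n is a non-zero power of two. */
static bool is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* log2 of a power of two, found by counting shifts. */
static unsigned int log2_pow2(uint32_t n)
{
	unsigned int shift = 0;

	while (n > 1) {
		n >>= 1;
		shift++;
	}
	return shift;
}

/* Block number containing the sector (block_size in sectors, non-zero). */
static uint64_t block_of(uint64_t sector, uint32_t block_size)
{
	if (is_pow2(block_size))
		return sector >> log2_pow2(block_size);  /* shift */
	return sector / block_size;                      /* general case */
}

/* Offset of the sector within its block. */
static uint64_t offset_in_block(uint64_t sector, uint32_t block_size)
{
	if (is_pow2(block_size))
		return sector & (block_size - 1);        /* bit mask */
	return sector % block_size;                      /* general case */
}
```

For example, with a block size of 128 sectors, sector 1000 falls in block 7 at
offset 104.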

Mikulas Patocka [Tue, 24 Jul 2012 23:25:16 +0000 (09:25 +1000)]

This patch sets the variable "ti->split_discard_requests", so that
device mapper core splits discard requests on a block boundary.

Consequently, a discard request that spans multiple blocks is never sent
to dm-thin. The patch also removes some code in process_discard that
deals with discards that span multiple blocks.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Mikulas Patocka [Tue, 24 Jul 2012 23:25:16 +0000 (09:25 +1000)]

This patch introduces a new variable split_discard_requests. It can be
set by targets so that discard requests are split on max_io_len
boundaries.

When split_discard_requests is not set, discard requests are only split on
boundaries between targets, as was the case before this patch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
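
Note: splitting a request on max_io_len boundaries can be sketched as below.
Both helpers are hypothetical and only do the arithmetic; they do not model
bios or the device mapper core.

```c
#include <assert.h>
#include <stdint.h>

/* Length of the first piece when [sector, sector + len) is split so that
 * no piece crosses a multiple of max_io_len. All values are in sectors. */
static uint64_t first_piece_len(uint64_t sector, uint64_t len,
				uint64_t max_io_len)
{
	uint64_t to_boundary;

	assert(max_io_len != 0);
	to_boundary = max_io_len - (sector % max_io_len);
	return len < to_boundary ? len : to_boundary;
}

/* Number of pieces the whole range is split into. */
static unsigned int count_pieces(uint64_t sector, uint64_t len,
				 uint64_t max_io_len)
{
	unsigned int pieces = 0;

	while (len > 0) {
		uint64_t piece = first_piece_len(sector, len, max_io_len);

		sector += piece;
		len -= piece;
		pieces++;
	}
	return pieces;
}
```

For example, a discard of 300 sectors starting at sector 100 with a
max_io_len of 128 is split into four pieces of 28, 128, 128 and 16 sectors.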

Mike Snitzer [Tue, 24 Jul 2012 23:25:16 +0000 (09:25 +1000)]

Non power of 2 blocksize support is needed to properly align thinp IO
on storage that has non power of 2 optimal IO sizes (e.g. RAID6 10+2).

Use sector_div to support non power of 2 blocksize for the pool's
data device. This provides comparable performance to the power of 2
math that was performed until now (as tested on modern x86_64 hardware).

The kernel currently assumes that limits->discard_granularity is a power
of two so the thin target only enables discard support if the block
size is a power of two.

Eliminate pool structure's 'block_shift', 'offset_mask' and
remaining 4 byte holes.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
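
Note: the division-based mapping can be sketched in userspace as below.
sector_div_sketch imitates the calling convention of the kernel's sector_div
(divide in place, return the remainder); the other two helpers are
hypothetical and are not the dm-thin functions.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for sector_div(): divides *n by base in place
 * and returns the remainder. */
static uint32_t sector_div_sketch(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/* Virtual block number containing the sector. */
static uint64_t get_block_sketch(uint64_t sector, uint32_t sectors_per_block)
{
	uint64_t block = sector;

	(void)sector_div_sketch(&block, sectors_per_block);
	return block;
}

/* Sector on the data device once the block has been mapped to data_block. */
static uint64_t remap_sketch(uint64_t sector, uint32_t sectors_per_block,
			     uint64_t data_block)
{
	uint64_t tmp = sector;
	uint32_t offset = sector_div_sketch(&tmp, sectors_per_block);

	return data_block * sectors_per_block + offset;
}
```

With a block size of 1280 sectors (not a power of two), sector 3000 lies in
block 2 at offset 440, so mapping it to data block 5 gives sector 6840.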

Mikulas Patocka [Tue, 24 Jul 2012 23:25:15 +0000 (09:25 +1000)]

dm-stripe is usually used with a chunk size that is a power of two.
Previous code used slow divide and modulo operations with a power-of-two
chunk size. This patch changes it to use faster shifts and bit masks.

stripe_width is already optimized in a similar way.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Mikulas Patocka [Tue, 24 Jul 2012 23:25:15 +0000 (09:25 +1000)]

There is no technical limitation in device mapper that would prevent the
dm-stripe target from using a stripe size smaller than page size.

This patch removes the limit and makes stripe volumes portable across
architectures with different page size.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Linux kernel for KaRo TX COM modules
