When devices in a RAID array are not in-sync, they are supposed to be
reported as such in the status output as an 'a' character, which means
"alive, but not in-sync". But when the entire array is rebuilt 'A' is
being used, which is incorrect. This patch corrects this to 'a'.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Allow userspace dm log implementations to register their log device so it
is no longer missing from the list of device dependencies.
When device mapper targets use a device they normally call dm_get_device
which includes it in the device list returned to userspace applications
such as LVM through the DM_TABLE_DEPS ioctl. Userspace log devices
don't use dm_get_device as userspace opens them so they are missing from
the list of dependencies.
This patch extends the DM_ULOG_CTR operation to allow userspace to
respond with the name of the log device (if appropriate) to be
registered via 'dm_get_device'. DM_ULOG_REQUEST_VERSION is incremented.
This is backwards compatible. If the kernel and userspace log server
have both been updated, the new information will be passed down to the
kernel and the device will be registered. If the kernel is new, but
the log server is old, the log server will not pass down any device
information and the kernel will simply bypass the device registration
as before. If the kernel is old but the log server is new, the log
server will see the old version number and not pass the device info.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Joe Thornber [Tue, 1 Nov 2011 00:15:35 +0000 (11:15 +1100)]
Initial EXPERIMENTAL implementation of device-mapper thin provisioning
with snapshot support. The 'thin' target is used to create instances of
the virtual devices that are hosted in the 'thin-pool' target. The
thin-pool target provides data sharing among devices. This sharing is
made possible using the persistent-data library in the previous patch.
The main highlight of this implementation, compared to the previous
implementation of snapshots, is that it allows many virtual devices to
be stored on the same data volume, simplifying administration and
allowing sharing of data between volumes (thus reducing disk usage).
Another big feature is support for arbitrary depth of recursive
snapshots (snapshots of snapshots of snapshots ...). The previous
implementation of snapshots did this by chaining together lookup tables,
and so performance was O(depth). This new implementation uses a single
data structure so we don't get this degradation with depth.
For further information and examples of how to use this, please read
Documentation/device-mapper/thin-provisioning.txt
Signed-off-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mikulas Patocka [Tue, 1 Nov 2011 00:15:34 +0000 (11:15 +1100)]
The dm-bufio interface allows you to do cached I/O on devices,
holding recently read blocks in memory and performing delayed writes.
We don't use buffer cache or page cache already present in the kernel, because:
* we need to handle block sizes larger than a page
* we can't allocate memory to perform reads or we'd have deadlocks
Currently, when a cache is required, we limit its size to a fraction of
available memory. Usage can be viewed and changed in
/sys/module/dm_bufio/parameters/ .
The first user is thin provisioning, but more dm users are planned.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Introduce DM_TARGET_IMMUTABLE to indicate that the target type cannot be mixed
with any other target type, and once loaded into a device, it cannot be
replaced with a table containing a different type.
Mikulas Patocka [Tue, 1 Nov 2011 00:15:32 +0000 (11:15 +1100)]
This patch introduces dm_kcopyd_zero() to make it easy to use
kcopyd to write zeros into the requested areas instead
instead of copying. It is implemented by passing a NULL
copying source to dm_kcopyd_copy().
The forthcoming thin provisioning target uses this.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Allow QUEUE_FLAG_NONROT to propagate up the device stack if all
underlying devices are non-rotational. Tools like ureadahead will
schedule IOs differently based on the rotational flag.
With this patch, I see boot time go from 7.75 s to 7.46 s on my device.
Suggested-by: J. Richard Barnette <jrbarnette@chromium.org> Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: Neil Brown <neilb@suse.de> Cc: Jens Axboe <jaxboe@fusionio.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: dm-devel@redhat.com Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Pawel Moll [Mon, 24 Oct 2011 13:07:03 +0000 (14:07 +0100)]
virtio: Add platform bus driver for memory mapped virtio device
This patch, based on virtio PCI driver, adds support for memory
mapped (platform) virtio device. This should allow environments
like qemu to use virtio-based block & network devices even on
platforms without PCI support.
One can define and register a platform device which resources
will describe memory mapped control registers and "mailbox"
interrupt. Such device can be also instantiated using the Device
Tree node with compatible property equal "virtio,mmio".
Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Michael S.Tsirkin <mst@redhat.com> Signed-off-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
virtio-pci: Use PCI MMIO instead of PIO when available
Currently virtio-pci is specced so that configuration of the device is
done through a PCI IO space (via BAR 0 of the virtual PCI device).
However, use of PCI IO space (aka PIO) is long deprecated, and can be
awkward to use on some systems (for example IBM pSeries machines
typically have many PCI domains, and not all firmware/hypervisor
versions necessarily support PCI PIO access on all domains).
Therefore, it would be preferable for the virtio virtual PCI device to
advertise a PCI memory space (aka MMIO) BAR and have configuration
done through this interface instead. This can be done backwards
compatibly by advertising the MMIO BAR in addition to the existing PIO
BAR so that the guest driver can choose whichever interface.
In anticipation of adding such an MMIO BAR to virtio host-side
implementations (e.g. qemu), this patch updates the Linux virtio-pci
driver to attempt to use BAR 2 (which will be MMIO) in preference to
the existing PIO BAR 0.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Krishna Kumar [Wed, 5 Oct 2011 05:38:59 +0000 (11:08 +0530)]
virtio: Dont add "config" to list for !per_vq_vector
For the MSI but non-per_vq_vector case, the config/change vq
also gets added to the list of vqs that need to process the
MSI interrupt. This is not needed as config has it's own
handler (vp_config_changed). In any case, vring_interrupt()
finds nothing needs to be done on this vq.
I tested this patch by testing the "Fallback:" and "Finally
fall back" cases in vp_find_vqs(). Please review.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
virtio: console: wait for first console port for early console output
On s390 I have seen some random
"Warning: unable to open an initial console"
boot failure. Turns out that tty_open fails, because the
hvc_alloc was not yet done. In former times this could not happen,
since the probe function automatically called hvc_alloc. With newer
versions (multiport) some host<->guest interaction is required
before hvc_alloc is called. This might be too late, especially if
an initramfs is involved. Lets use a completion if we have
multiport and an early console.
[Amit:
* Use NULL instead of 0 for pointer comparison
* Rename 'port_added' to 'early_console_added'
* Re-format, re-word commit message
* Rebase patch on top of current queue]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Acked-by: Chrstian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Wed, 14 Sep 2011 07:36:46 +0000 (13:06 +0530)]
virtio: console: add port stats for bytes received, sent and discarded
This commit adds port-specific stats for the number of bytes received,
sent and discarded. They're exposed via the debugfs interface. This
data can be used to check for data loss bugs (or disprove such claims).
It can also be used for accounting, if there's such a need.
The stats remain valid throughout the lifetime of the port. Unplugging
a port will reset the stats. The numbers are not reset across port
opens/closes.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Wed, 14 Sep 2011 07:36:45 +0000 (13:06 +0530)]
virtio: console: make discard_port_data() use get_inbuf()
discard_port_data() used virtqueue_get_buf() directly instead of using
get_inbuf(). Fix this, so that we get accounting for all received
bytes. This also simplifies the code a lot.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Wed, 14 Sep 2011 07:36:43 +0000 (13:06 +0530)]
virtio: console: make get_inbuf() return port->inbuf if present
Instead of pulling in a buffer from the vq each time it's called,
get_inbuf() now checks if the current active buffer, in port->inbuf is
valid. If it is, just returns a pointer to it. This ends up
simplifying a lot of code calling get_inbuf() since the check for
port->inbuf being valid was done by all the callers.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Wed, 14 Sep 2011 07:36:40 +0000 (13:06 +0530)]
virtio: console: Ignore port name update request if name already set
We don't allow port name changes dynamically for a port. So any
requests by the host to change the name are ignored.
Before this patch, if the hypervisor sent a port name while we had one
set already, we would leak memory equivalent to the size of the old
name.
This scenario wasn't expected so far, but with the suspend-resume
support, we'll send the VIRTIO_CONSOLE_PORT_READY message after restore,
which can get us into this situation.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>