Cyrill Gorcunov [Thu, 5 May 2011 19:06:40 +0000 (23:06 +0400)]
kvm tools: Gather Virtio-PCI constants into one place
It's better than having them sprinkled in .c files. Note
that the pin for the ring device is changed so it is no longer shared
with the block device (this is done for the sake of simplicity).
Also, the comment style is tuned up a bit in virtio-pci.h
just to be consistent.
Ingo Molnar [Thu, 5 May 2011 08:00:45 +0000 (10:00 +0200)]
kvm tools: Fix 32-bit build of the asm/system.h include
Provide wrappers and other environmental dependencies that the
asm/system.h header file from hell needs to build fine in user-space.
Sidenote: right now alternative() defaults to the compatible, slightly
slower barrier instructions that work on all x86 systems.
If this ever shows up in profiles then kvm could provide an alternatives
patching machinery as well. Right now those instructions are emitted
into special sections and then discarded by the linker harmlessly.
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Enable virtio-rng, a virtio random number generator.
Guest kernel should be compiled with CONFIG_HW_RANDOM_VIRTIO.
Once enabled, an RNG device will be available at /dev/hwrng.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
virtio-blk has been converted to use the threadpool. All the threading code has
been removed, which left only simple callback handling code.
New threadpool job types are created while handling VIRTIO_PCI_QUEUE_PFN, one
for every queue (just one in the case of virtio-blk). The module signals for
work after receiving VIRTIO_PCI_QUEUE_NOTIFY and expects the threadpool to call
virtio_blk_do_io to handle the I/O. The module may signal work several times
while virtio_blk_do_io is already running, but there is no need to handle
multithreading there since the threadpool runs each job serially and not in
parallel.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
This patch adds a generic pool to create a common interface for working with
threads within the kvm tool. The main idea is to use this threadpool for all
I/O threads instead of having every I/O module write its own thread code. The
process of working with the thread pool is supposed to be very simple.
During initialization, each module that is interested in working with the
threadpool calls threadpool__add_jobtype with the callback function and a
void* parameter. For example, virtio modules register every virt_queue as
a new job type. During operation, when there's work to do for a specific job,
the module signals it to the queue and expects the callback to be
called with the proper parameters. It is assured that the callback will be
called once for every signal action and that each callback will be called only
once at a time (i.e. callback functions themselves don't need to handle
threading).
[ penberg@kernel.org: Use Lindent ] Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
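The guarantee described in the thread pool commit above (one callback per
signal, never two at once for the same job type) can be sketched as follows.
This is only an illustration under stated assumptions: the struct and the
job_type_*() names are hypothetical and are not the actual
threadpool__add_jobtype interface, and a real pool would also wake a worker
thread from the signal path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef void (*job_callback_t)(void *param);

/* Hypothetical per-job-type state; not the structure used by kvm tools. */
struct job_type {
	pthread_mutex_t	lock;
	job_callback_t	callback;
	void		*param;
	unsigned int	pending;	/* signals not yet serviced */
	bool		running;	/* true while a worker runs the callback */
};

static void job_type_init(struct job_type *job, job_callback_t cb, void *param)
{
	pthread_mutex_init(&job->lock, NULL);
	job->callback	= cb;
	job->param	= param;
	job->pending	= 0;
	job->running	= false;
}

/* Called by an I/O module when there is work to do for this job type. */
static void job_type_signal(struct job_type *job)
{
	pthread_mutex_lock(&job->lock);
	job->pending++;
	pthread_mutex_unlock(&job->lock);
}

/*
 * Called by any worker thread. Runs the callback once per pending signal,
 * unless another worker is already servicing this job type. Returns the
 * number of callbacks this call executed.
 */
static unsigned int job_type_drain(struct job_type *job)
{
	unsigned int ran = 0;

	pthread_mutex_lock(&job->lock);
	if (job->running) {
		pthread_mutex_unlock(&job->lock);
		return 0;
	}
	job->running = true;
	while (job->pending > 0) {
		job->pending--;
		/* Drop the lock so modules can keep signalling meanwhile. */
		pthread_mutex_unlock(&job->lock);
		job->callback(job->param);
		ran++;
		pthread_mutex_lock(&job->lock);
	}
	job->running = false;
	pthread_mutex_unlock(&job->lock);

	return ran;
}

/* Example callback: counts how many times it was invoked. */
static void count_calls(void *param)
{
	(*(int *)param)++;
}
```

Because the running flag is only changed under the lock, a callback such as
count_calls() needs no locking of its own, which is the property the commit
message promises to I/O modules.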
kvm tools: display appropriate error message when default kernel image could not be found
This change was recommended by Ingo Molnar in his reply to the mail 'Use the
root partition of the host to boot the guest machine'. The patch tells the user
to run the 'kvm run --help' command when the kvm tool cannot find a default
kernel image to boot.
prasad@prasad-kvm:~/KVM/linux-kvm/tools/kvm$ ./kvm run
Fatal: could not find default kernel image in:
./bzImage
../../arch/x86/boot/bzImage
/boot/vmlinuz-2.6.35-25-generic
/boot/bzImage-2.6.35-25-generic
Please see 'kvm run --help' for more options.
Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
kvm tools: check read permission before using the root partition of the host to boot VM
Commit fbe8d0f (kvm tools: Use the root partition of the host to boot the
guest machine) changed the default image for the virtual machine to the root
partition of the host machine. This patch adds a check to ensure that the kvm
tool has read permission on this partition before using it.
Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
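A read-permission check of the kind described above could look like the
sketch below. It assumes the check is done with access(2); the commit message
does not say which call the patch actually uses, and image_is_readable() is a
hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Returns true when the calling process may read `path`.
 * Note that access() checks against the real (not effective) user ID.
 */
static bool image_is_readable(const char *path)
{
	return access(path, R_OK) == 0;
}
```

A caller would test the host root partition with this helper first and fall
back to an error message when it returns false.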
kvm tools: Add NR_CPUS definition in case of non-configured kernel sources
Pekka reported
|
| I see this if I ignore the reject:
|
| penberg@tiger:~/linux/tools/kvm$ make
| In file included from mptable.c:10:
| ../../arch/x86/include/asm/mpspec_def.h:20:6: error: "NR_CPUS" is not defined
This is because the Linux kernel source tree might not be configured (bare
sources), so we add our own definition in case NR_CPUS is not defined.
[ penberg@kernel.org: fix up compilation error ] Reported-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
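The usual shape of such a fallback is a preprocessor guard, sketched below.
The value 255 is a placeholder chosen for illustration, not necessarily the
value the patch uses.

```c
#include <assert.h>

/*
 * Provide a local definition only when the kernel configuration has not
 * already supplied one. The value is an assumed placeholder.
 */
#ifndef NR_CPUS
#define NR_CPUS 255
#endif
```

With the guard in place, a configured tree keeps its own NR_CPUS and bare
sources still compile.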
This is a raw prototype for MP table support; most resources
such as IRQ pins and sources are hardcoded, among other limitations.
Note we still limit the number of CPUs to a single CPU
until full SMP support appears. In particular, don't forget to
remove "nolapic" from the command line then.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Using mmap to allocate the RAM enables us to allocate large blocks of memory
for the guest, allowing us to boot guests with a large amount of RAM. Because
KVM used to oops (now it just fails) when KVM_SET_USER_MEMORY_REGION was called
with a large memory block, we had to move the actual ioctl to after
KVM_CREATE_VCPU.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
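An anonymous mmap() allocation of guest RAM can be sketched as below. The
flag set (including MAP_NORESERVE) and the guest_ram_*() helper names are
assumptions made for illustration; the commit message does not list the flags
the patch passes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Returns a zero-filled, page-aligned region of `size` bytes, or NULL. */
static void *guest_ram_alloc(size_t size)
{
	void *ram = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

	return ram == MAP_FAILED ? NULL : ram;
}

static void guest_ram_free(void *ram, size_t size)
{
	if (ram)
		munmap(ram, size);
}
```

Pages from an anonymous mapping are only backed by physical memory once they
are touched, which is what makes large guest sizes practical.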
Pekka Enberg [Tue, 19 Apr 2011 16:57:36 +0000 (19:57 +0300)]
kvm tools: Fix qcow1_read_cluster() return value on error
qcow1_read_cluster() returns a negative number on error, but its return value
type is unsigned. Fix that up and also fix the call-site in qcow1_read_sector()
to deal with a negative return value properly.
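The class of bug fixed here can be reproduced in a few lines. The two
functions below are hypothetical stand-ins, not the real qcow1_read_cluster();
they only show why a negative error code is lost when the return type is
unsigned.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/* Buggy shape: -1 is converted to UINT32_MAX, so "< 0" is never true. */
static uint32_t read_len_unsigned(int fail)
{
	return fail ? -1 : 512;
}

/* Fixed shape: a signed return type preserves the error value. */
static ssize_t read_len_signed(int fail)
{
	return fail ? -1 : 512;
}
```

A caller that tests `read_len_unsigned(1) < 0` would treat the failure as a
very large successful read, whereas the signed variant lets the caller detect
the error.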
Add I/O thread to handle I/O operations in virtio-blk.
There is currently support for multiple virtio queues but the kernel side
supports only one virtio queue. It's not too much of a performance impact and
the ABI does support multiple queues there, so I've preferred to do it like
that to keep it flexible.
I/O performance itself doesn't increase much due to the patch; what changes is
system responsiveness during I/O operations. On an unthreaded system, the VCPU
is frozen until the I/O request is complete. On a threaded system, on the
other hand, the VCPU is free to do other work or queue more I/O while
waiting for the original I/O request to complete.
kvm tools: Use the root partition of the host to boot the guest machine
The kvm run command should automatically pick up the image file to boot if one
is not explicitly specified.
Quoting Ingo Molnar:
Looks good here, with your patch applied 'kvm run' will now boot host userspace
with zero configuration needed (!), when there's a bzImage built in that kernel
tree:
Pekka Enberg [Sun, 17 Apr 2011 10:42:28 +0000 (13:42 +0300)]
kvm tools: Untangle qcow1_read_sector() function
This patch rearranges qcow1_read_sector() code so that all normal, zero
cluster, and I/O error paths share the same exit point which also deals with L2
table freeing.
kvm tools: check the cluster boundary in the qcow read code
The QCOW1 code should always read data in cluster-sized units. A new function
qcow1_read_cluster() is added to read a cluster's worth of data. The current
function to read the data, i.e. qcow1_read_sector(), is modified to use this
newly added function and to respect the cluster boundary.
Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
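Respecting the cluster boundary comes down to limiting each read to the bytes
that remain in the current cluster. The sketch below shows that arithmetic
only; the helper names are hypothetical, and it assumes the cluster size is a
power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes from `offset` up to the end of the cluster that contains it. */
static uint64_t bytes_left_in_cluster(uint64_t offset, uint64_t cluster_size)
{
	return cluster_size - (offset & (cluster_size - 1));
}

/* Largest read starting at `offset` that stays inside one cluster. */
static uint64_t clamp_to_cluster(uint64_t offset, uint64_t len,
				 uint64_t cluster_size)
{
	uint64_t left = bytes_left_in_cluster(offset, cluster_size);

	return len < left ? len : left;
}
```

A read loop would call clamp_to_cluster() on each iteration, read that many
bytes, and advance the offset until the request is satisfied.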
Amos Kong [Thu, 14 Apr 2011 04:37:45 +0000 (12:37 +0800)]
kvm tools: Add a script to setup private bridge
We can use this script to create or delete a private bridge,
launch a DHCP server on the bridge using dnsmasq, and
set up an iptables forwarding rule so that the guest can access the public
network.
# ./set_private_br.sh vbr0 192.168.33
add new private bridge: vbr0
# brctl show
bridge name bridge id STP enabled interfaces
vbr0 8000.000000000000 yes
# ifconfig vbr0
vbr0 Link encap:Ethernet HWaddr 82:0f:f5:8f:92:47
inet addr:192.168.33.1 Bcast:192.168.33.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:1979 (1.9 KB)
# ps aux |grep dnsmasq
nobody .. dnsmasq --strict-order --bind-interfaces --listen-address 192.168.33.1 \
--dhcp-range 192.168.33.1,192.168.33.254
Signed-off-by: Amos Kong <kongjianjun@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
kvm tools: Avoid using disk_image->priv member in disk_image__new()
disk_image->priv is supposed to be a private member for users of
disk_image__new(). Other block device drivers, for example qcow, might need
this pointer to hold their header.
Add a new function disk_image__new_readonly() which calls disk_image__new()
to allocate a new disk and then sets the priv member to the mmapped address.
Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Tue, 12 Apr 2011 16:01:28 +0000 (00:01 +0800)]
kvm tools: Implement virtio network device
This patch implements a virtio network device.
Use '-n virtio or --network=virtio' to enable it.
The current implementation uses tap which needs root privileges to create a
virtual network device (tap0) on host side. Actually, what we need is
CAP_NET_ADMIN.
The host side tap0 is set to 192.168.33.2/24.
You need to configure the guest side eth0 to any ip address in
192.168.33.0/24.
Here are some scp performance tests for different implementations:
None of rx and tx as thread:
guest to host 3.2MB/s
host to guest 3.1MB/s
Only rx as thread:
guest to host 14.7MB/s
host to guest 33.4MB/s
Both rx and tx as thread (this patch works this way):
guest to host 19.8MB/s
host to guest 32.5MB/s
Signed-off-by: Asias He <asias.hejun@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
kvm tools: Add option to load disk image read only
As suggested by Christoph Hellwig and Pekka Enberg, add a '--readonly' flag to
prevent runtime changes to the disk from being saved in the image file. Please
note that since the changes are saved in the VM instead of being written out, a
large number of modified blocks may kill the tool.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Amos Kong [Sun, 10 Apr 2011 08:33:39 +0000 (16:33 +0800)]
kvm tools: Make virt_queue__available return false if queue is not initialized
virtio_console__inject_interrupt tries to use virt queues before the guest
tells us to initialize them.
(gdb) r run -i linux-0.2.img -k ./vmlinuz-2.6.38-rc6+ -r ./initrd.img-2.6.38-rc6+ -p=init=1 -m 500 -c
Starting program: /project/rh/kvm-tools/tools/kvm/kvm run -i linux-0.2.img -k ./vmlinuz-2.6.38-rc6+ -r ./initrd.img-2.6.38-rc6+ -p=init=1 -m 500 -c
[Thread debugging using libthread_db enabled]
[New Thread 0x7fffd6e2d700 (LWP 19280)]
Warning: request type 8
Program received signal SIGSEGV, Segmentation fault.
0x00000000004026ca in virt_queue__available (vq=0x60d3c8) at include/kvm/virtio.h:31
31 return vq->vring.avail->idx != vq->last_avail_idx;
(gdb)
(gdb) bt
(gdb) p *vq
$2 = {vring = {num = 0, desc = 0x0, avail = 0x0, used = 0x0}, pfn = 0, last_avail_idx = 0}
include/kvm/virtio-console.h:
59 void virtio_console__inject_interrupt(struct kvm *self)
....
71 if (term_readable(CONSOLE_VIRTIO) && virt_queue__available(vq)) {
72 head = virt_queue__get_iov(vq, iov, &out, &in, self);
^^^^ then this block will not be executed if
virtio_queue is unavailable.
Changes from v1:
- move the check of virt_queue out of virt_queue__get_iov()
Reported-by: Amos Kong <akong@redhat.com> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Asias He <asias.hejun@gmail.com> Signed-off-by: Amos Kong <akong@redhat.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
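The fix named in the subject line amounts to a NULL check before the avail
ring is dereferenced. The sketch below uses simplified stand-in types modelled
on the gdb dump above; the sketch_ names and field widths are assumptions, not
the definitions in include/kvm/virtio.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the real vring structures. */
struct sketch_vring_avail {
	uint16_t flags;
	uint16_t idx;
};

struct sketch_vring {
	unsigned int			num;
	void				*desc;
	struct sketch_vring_avail	*avail;
	void				*used;
};

struct sketch_virt_queue {
	struct sketch_vring	vring;
	uint32_t		pfn;
	uint16_t		last_avail_idx;
};

static bool sketch_virt_queue_available(const struct sketch_virt_queue *vq)
{
	/* The guest has not set up the queue yet: nothing is available. */
	if (!vq->vring.avail)
		return false;

	return vq->vring.avail->idx != vq->last_avail_idx;
}
```

With the check in place, a zeroed queue like the one printed by gdb reports
"not available" instead of faulting on avail == 0x0.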
Asias He [Sun, 10 Apr 2011 06:50:31 +0000 (14:50 +0800)]
kvm tools: Exit KVM session on Ctrl+a and 'x'
This patch makes Ctrl+a the escape key and makes Ctrl+a followed by 'x'
terminate the KVM session. It works for both the serial and the virtio console.
If you want to send Ctrl+a to the guest, type Ctrl+a twice.
[ penberg@kernel.org: Make terminated message more readable. ] Tested-by: Amos Kong <akong@redhat.com> Signed-off-by: Asias He <asias.hejun@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
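The escape handling described above can be modelled as a two-state filter on
console input. This is a sketch with hypothetical names, not the code from the
patch. It also assumes that any character other than 'x' typed after Ctrl+a is
passed through to the guest, which the commit message does not specify.

```c
#include <assert.h>
#include <stdbool.h>

#define CTRL_A	0x01

enum term_action {
	TERM_PASS,	/* forward the character to the guest */
	TERM_SWALLOW,	/* consume the character, send nothing */
	TERM_EXIT,	/* terminate the session */
};

struct term_escape {
	bool armed;	/* true right after a Ctrl+a was typed */
};

static enum term_action term_escape_feed(struct term_escape *st, int c)
{
	if (st->armed) {
		st->armed = false;
		if (c == 'x')
			return TERM_EXIT;
		/* Ctrl+a Ctrl+a forwards a single literal Ctrl+a. */
		return TERM_PASS;
	}

	if (c == CTRL_A) {
		st->armed = true;
		return TERM_SWALLOW;
	}

	return TERM_PASS;
}
```

Keeping the state in a small struct lets the serial and virtio consoles share
the same filter.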
Pekka Enberg [Sun, 10 Apr 2011 08:17:35 +0000 (11:17 +0300)]
kvm tools: Remove the BUGS file
Commit b95f0c7 ("kvm tools: Emit a more informative error message when /dev/kvm
does not open") fixed the only issue listed in BUGS so remove the file.
This patch fixes the following error in virtio-blk.c:
virtio-blk.c: In function ‘virtio_blk_do_io_request’:
virtio-blk.c:125:6: error: variable ‘err’ set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Giuseppe Calderaro <giuseppecalderaro@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Sat, 9 Apr 2011 13:23:35 +0000 (16:23 +0300)]
kvm tools: Use mutex_lock() and mutex_unlock() wrappers
This patch implements less hostile mutex_lock() and mutex_unlock() wrappers on
top of the pthread API equivalents as suggested by Ingo Molnar:
glibc/pthreads mutex API semantics are pretty silly IMO.
I *think* it would be better to try to match the kernel API here, and provide
trivial wrappers around mutex_lock()/mutex_unlock(). We wont ever bring down
threads in a hostile way, so we wont actually need the error returns. CPU
threads should probably only exit once the kvm process exits, after all
cleanup has been done.
That way usage would be more obvious and more familar to kernel developers :-)
[ It would also open up the possibility, in the far future, to bring lockdep to
user-space ;-) ]
Cc: Asias He <asias.hejun@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Pekka Enberg <penberg@kernel.org>
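Wrappers in the spirit of the suggestion above might look like the sketch
below: they return void and treat any pthread error as fatal. The function
bodies and the error messages are assumptions for illustration; only the names
mutex_lock() and mutex_unlock() come from the commit message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static void mutex_lock(pthread_mutex_t *mutex)
{
	if (pthread_mutex_lock(mutex) != 0) {
		fprintf(stderr, "unexpected pthread_mutex_lock() failure\n");
		abort();
	}
}

static void mutex_unlock(pthread_mutex_t *mutex)
{
	if (pthread_mutex_unlock(mutex) != 0) {
		fprintf(stderr, "unexpected pthread_mutex_unlock() failure\n");
		abort();
	}
}

/* Example user: callers never have to check a return value. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter;

static int counter_inc(void)
{
	int value;

	mutex_lock(&counter_lock);
	value = ++counter;
	mutex_unlock(&counter_lock);

	return value;
}
```

Call sites then read like kernel code, with no error handling around each
lock operation.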
Pekka Enberg [Sat, 9 Apr 2011 10:53:11 +0000 (13:53 +0300)]
kvm tools: Use per-VCPU threads for execution
This patch makes the core hypervisor use per-VCPU threads for execution.
NOTE: We only start one thread right now because we're unable to let the guest
kernel know about more cores at the moment.
Cc: Asias He <asias.hejun@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Pekka Enberg <penberg@kernel.org>