git.karo-electronics.de Git - karo-tx-linux.git/log
13 years ago kvm tools: Add QCOW write support
Prasad Joshi [Tue, 10 May 2011 14:43:30 +0000 (15:43 +0100)]
kvm tools: Add QCOW write support

The patch adds QCOW write support for both versions of QCOW.

The code is based on the QCOW image format specifications which are available on:

  http://people.gnome.org/~markmc/qcow-image-format-version-1.html

  http://people.gnome.org/~markmc/qcow-image-format.html

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Use constants for commonly used mmap flags
Sasha Levin [Wed, 11 May 2011 16:52:57 +0000 (19:52 +0300)]
kvm tools: Use constants for commonly used mmap flags

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Rename 'self' variables
Sasha Levin [Wed, 11 May 2011 16:52:56 +0000 (19:52 +0300)]
kvm tools: Rename 'self' variables

Give proper names to vars named 'self'.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Add debug() helper
Cyrill Gorcunov [Wed, 11 May 2011 16:10:51 +0000 (20:10 +0400)]
kvm tools: Add debug() helper

Useful for debugging. It also adds a "--debug" option so that
debug prints are shown only if the user asked for them.
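
A minimal sketch of such a gated helper follows. The names debug_enabled
and debug_print are hypothetical and are not taken from the patch; the
flag stands in for whatever the "--debug" option sets.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag; a real tool would set it while parsing "--debug". */
bool debug_enabled;

/*
 * Print a debug message to stderr only when debugging was requested.
 * Returns the number of characters written, or 0 when disabled.
 */
int debug_print(const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!debug_enabled)
		return 0;

	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);
	return n;
}
```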

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Use definitions from kernel headers
Sasha Levin [Wed, 11 May 2011 15:17:25 +0000 (18:17 +0300)]
kvm tools: Use definitions from kernel headers

Instead of redefining virtio PCI constants (or not using them at all), use
the constants from the kernel headers.

Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Prevent PFN wraparound
Sasha Levin [Wed, 11 May 2011 15:17:24 +0000 (18:17 +0300)]
kvm tools: Prevent PFN wraparound

queue->pfn may be used to point at addresses larger
than 32 bits.
Prevent a wraparound when shifting it left.
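
The wraparound can be illustrated with a small sketch. It assumes a
32-bit PFN field and 4 KiB pages (a shift of 12); the function names are
hypothetical and only contrast the two behaviours.

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12

/* Buggy variant: the shift happens in 32 bits, so the high bits are lost. */
uint64_t demo_pfn_to_addr_wrapping(uint32_t pfn)
{
	return pfn << DEMO_PAGE_SHIFT;
}

/* Fixed variant: widen to 64 bits first, then shift. */
uint64_t demo_pfn_to_addr(uint32_t pfn)
{
	return (uint64_t)pfn << DEMO_PAGE_SHIFT;
}
```

With pfn = 0x100000 the first variant yields 0, while the second yields
0x100000000, the address just above 4 GiB.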

Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Add memory gap for larger RAM sizes
Sasha Levin [Wed, 11 May 2011 15:17:23 +0000 (18:17 +0300)]
kvm tools: Add memory gap for larger RAM sizes

e820 is expected to leave a memory gap within the low 32
bits of RAM space. From the documentation of e820_setup_gap():

  /*
   * Search for the biggest gap in the low 32 bits of the e820
   * memory space.  We pass this space to PCI to assign MMIO resources
   * for hotplug or unconfigured devices in.
   * Hopefully the BIOS let enough space left.
   */

Not leaving such a gap causes errors and hangs during the boot process.

This patch adds a memory gap between 0xe0000000 and 0x100000000 when using more
than 0xe0000000 bytes for guest RAM.

This patch updates the e820 table and the slot allocations used for
KVM_SET_USER_MEMORY_REGION.
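
As a rough sketch of the resulting layout, guest RAM can be split into
one region below the gap and one above it. The struct and function names
are hypothetical; only the two gap boundaries come from the text above.

```c
#include <stdint.h>

#define DEMO_GAP_START 0xe0000000ULL
#define DEMO_GAP_END   0x100000000ULL

/* Hypothetical description of one guest-physical RAM region. */
struct demo_ram_region {
	uint64_t guest_phys;	/* guest-physical start address */
	uint64_t size;		/* length in bytes */
};

/*
 * Lay out ram_size bytes of guest RAM around the gap.
 * Returns the number of regions filled in (1 or 2).
 */
int demo_layout_guest_ram(uint64_t ram_size, struct demo_ram_region regions[2])
{
	if (ram_size <= DEMO_GAP_START) {
		regions[0].guest_phys = 0;
		regions[0].size = ram_size;
		return 1;
	}

	/* Everything that does not fit below the gap goes above 4 GiB. */
	regions[0].guest_phys = 0;
	regions[0].size = DEMO_GAP_START;
	regions[1].guest_phys = DEMO_GAP_END;
	regions[1].size = ram_size - DEMO_GAP_START;
	return 2;
}
```

Each region would then presumably map to its own memory slot and its own
e820 entry.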

Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Dump vCPUs in order
Ingo Molnar [Mon, 9 May 2011 07:45:32 +0000 (09:45 +0200)]
kvm tools: Dump vCPUs in order

* Ingo Molnar <mingo@elte.hu> wrote:

> The patch below addresses these concerns, serializes the output, tidies up the
> printout, resulting in this new output:

There's one bug remaining that my patch does not address: the vCPUs are not
printed in order:

# vCPU #0's dump:
# vCPU #2's dump:
# vCPU #24's dump:
# vCPU #5's dump:
# vCPU #39's dump:
# vCPU #38's dump:
# vCPU #51's dump:
# vCPU #11's dump:
# vCPU #10's dump:
# vCPU #12's dump:

This is undesirable as the order of printout is highly random, so successive
dumps are difficult to compare.

The patch below serializes the signalling itself. (this is on top of the
previous patch)

The patch also tweaks the vCPU printout line a bit so that it does not start
with '#', which is discarded if such messages are pasted into Git commit
messages.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Fix and improve the CPU register dump debug output code
Ingo Molnar [Mon, 9 May 2011 07:27:11 +0000 (09:27 +0200)]
kvm tools: Fix and improve the CPU register dump debug output code

* Pekka Enberg <penberg@kernel.org> wrote:

> Ingo Molnar reported that 'kill -3' didn't work on his machine:
>
>   * Ingo Molnar <mingo@elte.hu> wrote:
>
>   > This is really cumbersome to debug - is there some good way to get to the RIP
>   > that the guest is hanging in? If kvm would print that out to the host console
>   > (even if it's just the raw RIP initially) on a kill -3 that would help
>   > enormously.
>
>   Looks like the code should be doing that already - but the ioctl(KVM_GET_SREGS)
>   hangs:
>
>     [pid   748] ioctl(6, KVM_GET_SREGS
>
> Avi Kivity pointed out that it's not safe to call KVM_GET_SREGS (or other vcpu
> related ioctls) from other threads:
>
>   > is it not OK to call KVM_GET_SREGS from other threads than the one
>   > that's doing KVM_RUN?
>
>   From Documentation/kvm/api.txt:
>
>    - vcpu ioctls: These query and set attributes that control the operation
>      of a single virtual cpu.
>
>      Only run vcpu ioctls from the same thread that was used to create the
>      vcpu.
>
> Fix that up by using pthread_kill() to force the threads that are doing KVM_RUN
> to do the register dumps.
>
> Reported-by: Ingo Molnar <mingo@elte.hu>
> Cc: Asias He <asias.hejun@gmail.com>
> Cc: Avi Kivity <avi@redhat.com>
> Cc: Cyrill Gorcunov <gorcunov@gmail.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Prasad Joshi <prasadjoshi124@gmail.com>
> Cc: Sasha Levin <levinsasha928@gmail.com>
> Signed-off-by: Pekka Enberg <penberg@kernel.org>
> ---
>  tools/kvm/kvm-run.c |   20 +++++++++++++++++---
>  1 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/tools/kvm/kvm-run.c b/tools/kvm/kvm-run.c
> index eb50b6a..58e2977 100644
> --- a/tools/kvm/kvm-run.c
> +++ b/tools/kvm/kvm-run.c
> @@ -127,6 +127,18 @@ static const struct option options[] = {
>   OPT_END()
>  };
>
> +static void handle_sigusr1(int sig)
> +{
> + struct kvm_cpu *cpu = current_kvm_cpu;
> +
> + if (!cpu)
> + return;
> +
> + kvm_cpu__show_registers(cpu);
> + kvm_cpu__show_code(cpu);
> + kvm_cpu__show_page_tables(cpu);
> +}
> +
>  static void handle_sigquit(int sig)
>  {
>   int i;
> @@ -134,9 +146,10 @@ static void handle_sigquit(int sig)
>   for (i = 0; i < nrcpus; i++) {
>   struct kvm_cpu *cpu = kvm_cpus[i];
>
> - kvm_cpu__show_registers(cpu);
> - kvm_cpu__show_code(cpu);
> - kvm_cpu__show_page_tables(cpu);
> + if (!cpu)
> + continue;
> +
> + pthread_kill(cpu->thread, SIGUSR1);
>   }
>
>   serial8250__inject_sysrq(kvm);

I can see a couple of problems with the debug printout code, which currently
produces a stream of such dumps for each vcpu:

Registers:
 rip: 0000000000000000   rsp: 00000000000016ca flags: 0000000000010002
 rax: 0000000000000000   rbx: 0000000000000000   rcx: 0000000000000000
 rdx: 0000000000000000   rsi: 0000000000000000   rdi: 0000000000000000
 rbp: 0000000000008000   r8:  0000000000000000   r9:  0000000000000000
 r10: 0000000000000000   r11: 0000000000000000   r12: 0000000000000000
 r13: 0000000000000000   r14: 0000000000000000   r15: 0000000000000000
 cr0: 0000000060000010   cr2: 0000000000000070   cr3: 0000000000000000
 cr4: 0000000000000000   cr8: 0000000000000000
Segment registers:
 register  selector  base              limit     type  p dpl db s l g avl
 cs        f000      00000000000f0000  0000ffff  03    1 3   0  1 0 0 0
 ss        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 ds        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 es        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 fs        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 gs        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 tr        0000      0000000000000000  0000ffff  0b    1 0   0  0 0 0 0
 ldt       0000      0000000000000000  0000ffff  02    1 0   0  0 0 0 0
 gdt                 0000000000000000 0000ffff
 idt                 0000000000000000 0000ffff
 [ efer: 0000000000000000  apic base: 00000000fee00900  nmi: enabled ]
Interrupt bitmap:
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <cf> eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 f6 c4 0e 75 4b
Stack:
  0x000016ca: 00 00 00 00  00 00 00 00
  0x000016d2: 00 00 00 00  00 00 00 00
  0x000016da: 00 00 00 00  00 00 00 00
  0x000016e2: 00 00 00 00  00 00 00 00

The problems are:

 - This does not work very well on SMP with lots of vcpus, because the printing
   is unserialized, resulting in a jumbled mess of an output, all vcpus trying
   to print to the console at once, often mixing lines and characters randomly.

 - stdout from a signal handler must be flushed, otherwise lines can remain
   buffered if someone saves the output via 'tee' for example.

 - the dumps from the various CPUs are not distinguishable - they are just
   dumped after each other with no identification

 - the various printouts are rather hard to parse visually - it's not easy to see
   various properties "at a glance" because the dump is visually confusing.

The patch below addresses these concerns, serializes the output, tidies up the
printout, resulting in this new output:

#
# vCPU #0's dump:
#

 Registers:
 ----------
 rip: 0000000000000000   rsp: 00000000000008bc flags: 0000000000010002
 rax: 0000000000000000   rbx: 0000000000000000   rcx: 0000000000000000
 rdx: 0000000000000000   rsi: 0000000000000000   rdi: 0000000000000000
 rbp: 0000000000008000    r8: 0000000000000000    r9: 0000000000000000
 r10: 0000000000000000   r11: 0000000000000000   r12: 0000000000000000
 r13: 0000000000000000   r14: 0000000000000000   r15: 0000000000000000
 cr0: 0000000060000010   cr2: 0000000000000070   cr3: 0000000000000000
 cr4: 0000000000000000   cr8: 0000000000000000

 Segment registers:
 ------------------
 register  selector  base              limit     type  p dpl db s l g avl
 cs        f000      00000000000f0000  0000ffff  03    1 3   0  1 0 0 0
 ss        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 ds        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 es        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 fs        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 gs        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 tr        0000      0000000000000000  0000ffff  0b    1 0   0  0 0 0 0
 ldt       0000      0000000000000000  0000ffff  02    1 0   0  0 0 0 0
 gdt                 0000000000000000  0000ffff
 idt                 0000000000000000  0000ffff

 APIC:
 -----
 efer: 0000000000000000  apic base: 00000000fee00900  nmi: enabled

 Interrupt bitmap:
 -----------------
 0000000000000000 0000000000000000 0000000000000000 0000000000000000

 Code:
 -----
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <cf> eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 f6 c4 0e 75 4b

 Stack:
 ------
  0x000008bc: 00 00 00 00  00 00 00 00
  0x000008c4: 00 00 00 00  00 00 00 00
  0x000008cc: 00 00 00 00  00 00 00 00
  0x000008d4: 00 00 00 00  00 00 00 00

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Add missing space after kernel params
Sasha Levin [Sun, 8 May 2011 18:58:04 +0000 (21:58 +0300)]
kvm tools: Add missing space after kernel params

Add missing space so that user-provided kernel params
will be properly concatenated to default params.

Instead of just adding a space at the end, add it with
a separate strcat(), since it's not the first (and wouldn't
have been the last) time a space wasn't added.
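
A minimal sketch of the approach, with the separator appended on its own
so it cannot be forgotten when the default string changes. The default
parameter string and the function name are assumptions made for the
illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical defaults, modelled on the command lines shown in this log. */
#define DEMO_DEFAULT_PARAMS "notsc noapic noacpi pci=conf1 console=ttyS0"

/*
 * Build "<defaults> <user params>" into buf.
 * Returns 0 on success, -1 if the result would not fit in len bytes.
 */
int demo_build_cmdline(char *buf, size_t len, const char *user_params)
{
	size_t need = strlen(DEMO_DEFAULT_PARAMS) + 1;	/* +1 for the NUL */

	if (user_params)
		need += 1 + strlen(user_params);	/* +1 for the space */
	if (need > len)
		return -1;

	strcpy(buf, DEMO_DEFAULT_PARAMS);
	if (user_params) {
		strcat(buf, " ");	/* separator added by a separate strcat() */
		strcat(buf, user_params);
	}
	return 0;
}
```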

Reported-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Fix virtio console hangs by removing IRQ injection for tx path
Asias He [Sun, 8 May 2011 13:09:25 +0000 (21:09 +0800)]
kvm tools: Fix virtio console hangs by removing IRQ injection for tx path

As virtio spec says:

"""
 Because this is high importance and low bandwidth, the current Linux
 implementation polls for the buffer to be used, rather than waiting
 for an interrupt, simplifying the implementation significantly.
"""

drivers/char/virtio_console.c:

  send_buf() {
          ...
          /* Tell Host to go! */
          virtqueue_kick(out_vq);
          ...
          while (!virtqueue_get_buf(out_vq, &len))
                  cpu_relax();
          ...
  }

The console hangs can be reproduced simply by running the yes command, which
generates a tremendous number of console I/Os and IRQs.

[   16.786440] irq 4: nobody cared (try booting with the "irqpoll" option)
[   16.786440] Pid: 1437, comm: yes Tainted: G        W 2.6.39-rc6+ #56
[   16.786440] Call Trace:
[   16.786440]  [<c16578eb>] __report_bad_irq+0x30/0x89
[   16.786440]  [<c10980e6>] note_interrupt+0x118/0x17a
[   16.786440]  [<c1096e7d>] handle_irq_event_percpu+0x168/0x179
[   16.786440]  [<c1096eba>] handle_irq_event+0x2c/0x46
[   16.786440]  [<c1098516>] ? unmask_irq+0x1e/0x1e
[   16.786440]  [<c1098566>] handle_level_irq+0x50/0x6e
[   16.786440]  <IRQ>  [<c102fa69>] ? do_IRQ+0x35/0x7f
[   16.786440]  [<c1665ea9>] ? common_interrupt+0x29/0x30
[   16.786440]  [<c16610d6>] ? _raw_spin_unlock_irqrestore+0x7/0x28
[   16.786440]  [<c1364f65>] ? hvc_write+0x88/0x9e
[   16.786440]  [<c1355500>] ? do_output_char+0x88/0x18a
[   16.786440]  [<c1355631>] ? process_output+0x2f/0x42
[   16.786440]  [<c1355af6>] ? n_tty_write+0x211/0x2dc
[   16.786440]  [<c1059d77>] ? try_to_wake_up+0x226/0x226
[   16.786440]  [<c13534a4>] ? tty_write+0x15e/0x1d1
[   16.786440]  [<c12c1644>] ? security_file_permission+0x22/0x26
[   16.786440]  [<c13558e5>] ? process_echoes+0x241/0x241
[   16.786440]  [<c10dd9d2>] ? vfs_write+0x84/0xd7
[   16.786440]  [<c1353346>] ? tty_write_lock+0x3d/0x3d
[   16.786440]  [<c10ddb92>] ? sys_write+0x3b/0x5d
[   16.786440]  [<c166594c>] ? sysenter_do_call+0x12/0x22
[   16.786440] handlers:
[   16.786440] [<c1351397>] (vp_interrupt+0x0/0x3a)
[   16.786440] Disabling IRQ #4

Tested-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio rng
Asias He [Sun, 8 May 2011 13:09:24 +0000 (21:09 +0800)]
kvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio rng

This patch uses the IRQ injection mechanism introduced by
virt_queue__trigger_irq(), which respects the virtio IRQ status
and VRING_AVAIL_F_NO_INTERRUPT.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio blk
Asias He [Sun, 8 May 2011 13:09:23 +0000 (21:09 +0800)]
kvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio blk

This patch uses the IRQ injection mechanism introduced by
virt_queue__trigger_irq(), which respects the virtio IRQ status
and VRING_AVAIL_F_NO_INTERRUPT.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio console
Asias He [Sun, 8 May 2011 13:09:22 +0000 (21:09 +0800)]
kvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio console

This patch uses the IRQ injection mechanism introduced by
virt_queue__trigger_irq(), which respects the virtio IRQ status
and VRING_AVAIL_F_NO_INTERRUPT.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Fix 'kill -3' hangs
Pekka Enberg [Sun, 8 May 2011 09:56:04 +0000 (12:56 +0300)]
kvm tools: Fix 'kill -3' hangs

Ingo Molnar reported that 'kill -3' didn't work on his machine:

  * Ingo Molnar <mingo@elte.hu> wrote:

  > This is really cumbersome to debug - is there some good way to get to the RIP
  > that the guest is hanging in? If kvm would print that out to the host console
  > (even if it's just the raw RIP initially) on a kill -3 that would help
  > enormously.

  Looks like the code should be doing that already - but the ioctl(KVM_GET_SREGS)
  hangs:

    [pid   748] ioctl(6, KVM_GET_SREGS

Avi Kivity pointed out that it's not safe to call KVM_GET_SREGS (or other vcpu
related ioctls) from other threads:

  > is it not OK to call KVM_GET_SREGS from other threads than the one
  > that's doing KVM_RUN?

  From Documentation/kvm/api.txt:

   - vcpu ioctls: These query and set attributes that control the operation
     of a single virtual cpu.

     Only run vcpu ioctls from the same thread that was used to create the
     vcpu.

Fix that up by using pthread_kill() to force the threads that are doing KVM_RUN
to do the register dumps.

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Asias He <asias.hejun@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Enable earlyprintk=serial by default
Ingo Molnar [Sun, 8 May 2011 07:39:34 +0000 (09:39 +0200)]
kvm tools: Enable earlyprintk=serial by default

Enable the earlyprintk console to the serial port, to allow the debugging of
very early hangs/crashes.

Since we already enable the serial console by default, this is a natural
extension of it.

I have tested that it indeed works, by provoking an early hang that triggers
after the early console is enabled but before the real console is registered. In
that case, before the patch we get:

  $ ./kvm run --cpus 2
  [ silent hang ]

With this patch applied I got the early output:

 $ ./kvm run --cpus 60
 [    0.000000] console [earlyser0] enabled
 [    0.000000] Initializing cgroup subsys cpu
 [    0.000000] Linux version 2.6.39-rc6-tip-02944-g87b0bcf-dirty (mingo@aldebaran) (gcc version 4.6.0 20110419 (Red Hat 4.6.0-5) (GCC) ) #84 SMP Mon May 9 02:34:26 CEST 2011
 [    0.000000] Command line: notsc noapic noacpi pci=conf1 console=ttyS0 earlyprintk=serialroot=/dev/vda1 rw
 [    0.000000] locking up the box!

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Warn if guest RAM size exceeds host RAM size
Pekka Enberg [Sat, 7 May 2011 19:07:15 +0000 (22:07 +0300)]
kvm tools: Warn if guest RAM size exceeds host RAM size

Guest memory size that's larger than host physical RAM can cause swap deaths on
the host so warn the user about it.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Don't use all of host RAM for guests by default
Pekka Enberg [Sat, 7 May 2011 19:01:54 +0000 (22:01 +0300)]
kvm tools: Don't use all of host RAM for guests by default

This patch caps the default guest RAM size at 80% of the host RAM to
avoid swapping the host to death.
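
A minimal sketch of such a cap, assuming the host RAM size is read with
sysconf(); the patch itself may obtain it differently, and both function
names are hypothetical.

```c
#include <stdint.h>
#include <unistd.h>

/* Host physical RAM in bytes, or 0 if it cannot be determined. */
uint64_t demo_host_ram_bytes(void)
{
	long pages = sysconf(_SC_PHYS_PAGES);
	long page_size = sysconf(_SC_PAGE_SIZE);

	if (pages < 0 || page_size < 0)
		return 0;
	return (uint64_t)pages * (uint64_t)page_size;
}

/* Limit a default guest RAM size to 80% of the given host RAM size. */
uint64_t demo_cap_default_ram(uint64_t wanted, uint64_t host)
{
	uint64_t limit = host / 10 * 8;

	return wanted > limit ? limit : wanted;
}
```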

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Fix up mtable srcbusirq assignment for PCI devices
Cyrill Gorcunov [Sat, 7 May 2011 15:02:58 +0000 (19:02 +0400)]
kvm tools: Fix up mtable srcbusirq assignment for PCI devices

The kernel expects srcbusirq to follow the MP specification and to consist
of a tuple of the PCI device number with the pin encoded. Make it so, otherwise
the kernel complains about a "buggy MP table".
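
The encoding can be sketched as below. This follows my reading of the
MP specification's I/O interrupt entry for PCI buses (bits 0-1 carry the
interrupt pin with INTA# as 0, bits 2-6 the device number); treat the
layout and the function name as assumptions rather than a quote from the
patch.

```c
#include <stdint.h>

/*
 * Encode the source bus IRQ field for a PCI device.
 * device: PCI device number (0..31)
 * pin:    1 = INTA#, 2 = INTB#, 3 = INTC#, 4 = INTD#
 */
uint8_t demo_mp_pci_srcbusirq(uint8_t device, uint8_t pin)
{
	return (uint8_t)(((device & 0x1f) << 2) | ((pin - 1) & 0x3));
}
```

For example, device 1 on INTA# encodes to 4, and device 3 on INTB#
encodes to 13.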

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Fix up PCI pin assignment to conform specification
Cyrill Gorcunov [Sat, 7 May 2011 15:02:57 +0000 (19:02 +0400)]
kvm tools: Fix up PCI pin assignment to conform specification

Only 4 pins are allowed for every PCI-compliant device as per the PCI 2.2 spec,
Section 2.2.6 ("Interrupt Pins"). Multifunction devices can use up to all four
INTA#, INTB#, INTC#, INTD# pins; for our single-function devices pin INTA# is enough.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Limit CPU count by KVM_CAP_NR_VCPUS
Pekka Enberg [Sat, 7 May 2011 14:37:28 +0000 (17:37 +0300)]
kvm tools: Limit CPU count by KVM_CAP_NR_VCPUS

This patch limits the number of CPUs to KVM_CAP_NR_VCPUS when the user specifies
more CPUs with the "--cpus=N" command line option than the in-kernel KVM
is able to handle.
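
A rough sketch of the check, assuming the limit is queried with the
KVM_CHECK_EXTENSION ioctl on /dev/kvm. The function names are
hypothetical, and error handling is reduced to "no limit known".

```c
#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the kernel for KVM_CAP_NR_VCPUS; returns -1 if it cannot be queried. */
int demo_kvm_max_vcpus(void)
{
	int fd = open("/dev/kvm", O_RDWR);
	int ret;

	if (fd < 0)
		return -1;

	ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_NR_VCPUS);
	close(fd);
	return ret > 0 ? ret : -1;
}

/* Clamp the requested CPU count to the kernel's limit, if one is known. */
int demo_limit_cpu_count(int requested, int kernel_max)
{
	if (kernel_max > 0 && requested > kernel_max)
		return kernel_max;
	return requested;
}
```

A caller would combine the two, for example
demo_limit_cpu_count(nrcpus, demo_kvm_max_vcpus()).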

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Rename pci_device to pci_hdr for clarity
Sasha Levin [Sat, 7 May 2011 10:50:45 +0000 (13:50 +0300)]
kvm tools: Rename pci_device to pci_hdr for clarity

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Scale guest RAM size by CPU count
Pekka Enberg [Sat, 7 May 2011 14:18:14 +0000 (17:18 +0300)]
kvm tools: Scale guest RAM size by CPU count

This patch increases the default RAM size to 256 MB for one CPU and introduces
linear scaling of the RAM size with the CPU count, as suggested by Ingo Molnar:

     64MB*(nr_cpus + 3)

     ------------------
       1 CPUs:   256 MB
       2 CPUs:   320 MB
       3 CPUs:   384 MB
       4 CPUs:   448 MB
       5 CPUs:   512 MB
       6 CPUs:   576 MB
       7 CPUs:   640 MB
       8 CPUs:   704 MB
       9 CPUs:   768 MB
      10 CPUs:   832 MB
      11 CPUs:   896 MB
      12 CPUs:   960 MB
      13 CPUs:  1024 MB
      14 CPUs:  1088 MB
      15 CPUs:  1152 MB
      16 CPUs:  1216 MB
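
The formula above translates directly into code; the function name is
hypothetical.

```c
#include <stdint.h>

/* Default guest RAM size in MB: 64MB * (nr_cpus + 3). */
uint64_t demo_default_ram_mb(unsigned int nr_cpus)
{
	return 64ULL * (nr_cpus + 3);
}
```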

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Convert virtio devices to use IRQ registry
Sasha Levin [Fri, 6 May 2011 11:24:12 +0000 (14:24 +0300)]
kvm tools: Convert virtio devices to use IRQ registry

Instead of using static IRQ/device data, register the device
upon initialization and use the assigned parameters when issuing
IRQs.

Clean up static definitions of IRQs.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Dynamically add devices when creating mptable
Sasha Levin [Fri, 6 May 2011 11:24:11 +0000 (14:24 +0300)]
kvm tools: Dynamically add devices when creating mptable

Enumerate registered devices to build a complete
and updated mptable containing all registered pci
devices.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Introduce IRQ registry
Sasha Levin [Fri, 6 May 2011 11:24:10 +0000 (14:24 +0300)]
kvm tools: Introduce IRQ registry

Instead of having static definitions of devices, use a
dynamic registry of PCI devices.

The structure is a rbtree which holds device types (net,
blk, etc). Each device entry holds a list of IRQ lines
associated with that device (pin).

Devices dynamically register upon initialization, and receive
a set of: device id, irq pin and irq line.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Simplify search for root device
Sasha Levin [Fri, 6 May 2011 07:26:35 +0000 (10:26 +0300)]
kvm tools: Simplify search for root device

Use /dev/block to find the block device used for root
instead of searching through mounts.
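
One way to do this, sketched under assumptions: stat() the root
directory, then build the /dev/block/MAJOR:MINOR path from its device
number. The function name is hypothetical and the patch may proceed
differently.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Write the "/dev/block/MAJOR:MINOR" path for the device holding "/"
 * into buf. Returns 0 on success, -1 on failure.
 */
int demo_root_block_link(char *buf, size_t len)
{
	struct stat st;
	int n;

	if (stat("/", &st) < 0)
		return -1;

	n = snprintf(buf, len, "/dev/block/%u:%u",
		     major(st.st_dev), minor(st.st_dev));
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

On a typical system that path is a symlink to the real device node, so
resolving it (for instance with realpath()) would give the root block
device without parsing the mount table.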

Tested-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Enable SMP support
Sasha Levin [Fri, 6 May 2011 23:51:11 +0000 (02:51 +0300)]
kvm tools: Enable SMP support

This patch enables SMP support:

[    0.155072] Brought up 3 CPUs
[    0.155074] Total of 3 processors activated (15158.58 BogoMIPS).

virtio-console was being loaded no matter the cmdline options
and it was causing some hangs (have to look into that).

I'll send this patch to a larger audience once someone
else can confirm it actually works/doesn't work.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: fix a memory leak in qcow2_read_header
Prasad Joshi [Fri, 6 May 2011 16:39:46 +0000 (17:39 +0100)]
kvm tools: fix a memory leak in qcow2_read_header

Free the memory allocated for qcow_header if the header read operation fails.

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Respect VRING_AVAIL_F_NO_INTERRUPT
Asias He [Sat, 7 May 2011 02:34:20 +0000 (10:34 +0800)]
kvm tools: Respect VRING_AVAIL_F_NO_INTERRUPT

Do not inject an IRQ when the guest suppresses it.

This reduces IRQ injection further and bumps
host to guest bandwidth to 6178.78 Mbps (cpu 63.96%).
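
The two suppression checks, this one and the ISR check from the
"Respect ISR status in virtio header" commit further down this log, can
be modelled in a small sketch. Every demo_* name is hypothetical; the
flag value 1 matches VRING_AVAIL_F_NO_INTERRUPT in <linux/virtio_ring.h>,
and the counter merely stands in for a real IRQ injection.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as VRING_AVAIL_F_NO_INTERRUPT in <linux/virtio_ring.h>. */
#define DEMO_VRING_AVAIL_F_NO_INTERRUPT 1
#define DEMO_ISR_QUEUE 0x1

/* Hypothetical, simplified view of one virtqueue's interrupt state. */
struct demo_queue {
	uint16_t avail_flags;	/* written by the guest driver */
	uint8_t isr;		/* set by the device, cleared when the guest reads it */
	unsigned int injected;	/* how many IRQs were actually injected */
};

/* Inject an IRQ unless the guest suppressed it or one is still pending. */
bool demo_trigger_irq(struct demo_queue *q)
{
	if (q->avail_flags & DEMO_VRING_AVAIL_F_NO_INTERRUPT)
		return false;
	if (q->isr & DEMO_ISR_QUEUE)
		return false;	/* previous interrupt not yet acknowledged */

	q->isr |= DEMO_ISR_QUEUE;
	q->injected++;
	return true;
}

/* Guest-side ISR read: returns the status and clears it as a side effect. */
uint8_t demo_guest_read_isr(struct demo_queue *q)
{
	uint8_t status = q->isr;

	q->isr = 0;
	return status;
}
```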

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Respect ISR status in virtio header
Asias He [Sat, 7 May 2011 02:34:19 +0000 (10:34 +0800)]
kvm tools: Respect ISR status in virtio header

Inject an IRQ into the guest only when the ISR status is low, which means
the guest has read the ISR status and the device has cleared this bit as
a side effect of that read.

This avoids a lot of unnecessary IRQ injections from device to
guest.

A Netperf test shows that this patch changes:

the host to guest bandwidth
from 2866.27 Mbps (cpu 33.96%) to 5548.87 Mbps (cpu 53.87%),

the guest to host bandwidth
from 1408.86 Mbps (cpu 99.9%) to 1301.29 Mbps (cpu 99.9%).

The bottleneck of the guest to host bandwidth is guest cpu power.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Gather Virtio-PCI constants into one place
Cyrill Gorcunov [Thu, 5 May 2011 19:06:40 +0000 (23:06 +0400)]
kvm tools: Gather Virtio-PCI constants into one place

It's better than having them sprinkled across .c files. Note
that the pin for the ring device is changed so it is no longer shared
with the block device (this is done for the sake of simplicity).

Also the comment style is tuned up a bit in virtio-pci.h
just to be consistent.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Fix loading root device as image
Sasha Levin [Thu, 5 May 2011 19:53:32 +0000 (22:53 +0300)]
kvm tools: Fix loading root device as image

Fix the loading of root device when no image name was
specified.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Cleanup virtio code some more
Pekka Enberg [Thu, 5 May 2011 19:45:49 +0000 (22:45 +0300)]
kvm tools: Cleanup virtio code some more

This patch cleans up some more style problems in virtio code.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: virtio-rng code cleanup
Sasha Levin [Thu, 5 May 2011 19:16:54 +0000 (22:16 +0300)]
kvm tools: virtio-rng code cleanup

Clean coding style and naming within virtio-rng.

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: virtio-net code cleanup
Sasha Levin [Thu, 5 May 2011 18:34:34 +0000 (21:34 +0300)]
kvm tools: virtio-net code cleanup

Clean coding style and naming within virtio-net.

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: virtio-console code cleanup
Sasha Levin [Thu, 5 May 2011 18:34:33 +0000 (21:34 +0300)]
kvm tools: virtio-console code cleanup

Clean coding style and naming within virtio-console.

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: virtio-blk code cleanup
Sasha Levin [Thu, 5 May 2011 18:34:32 +0000 (21:34 +0300)]
kvm tools: virtio-blk code cleanup

Clean coding style and naming within virtio-blk.

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Abolishment of uint*_t types
Sasha Levin [Thu, 5 May 2011 18:34:31 +0000 (21:34 +0300)]
kvm tools: Abolishment of uint*_t types

Remove the uint*_t types from the code.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Move virtio drivers under virtio directory
Pekka Enberg [Thu, 5 May 2011 14:39:19 +0000 (17:39 +0300)]
kvm tools: Move virtio drivers under virtio directory

This patch moves the virtio drivers under virtio directory.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Add cmdline options for loading multiple images
Sasha Levin [Thu, 5 May 2011 10:24:31 +0000 (13:24 +0300)]
kvm tools: Add cmdline options for loading multiple images

Introduce a new syntax for loading disk images.
Example:

./kvm run --image image1.img,ro --image image2.img

This loads image1.img as read-only and image2.img as
read/write.
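
Parsing one such "--image" argument could look like the sketch below.
The struct, the function name and the choice to reject unknown suffixes
are assumptions for the illustration, not the patch's actual parser.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parsed form of one "--image path[,ro]" argument. */
struct demo_image_arg {
	char path[256];
	bool readonly;
};

/* Returns 0 on success, -1 on an empty, overlong or malformed argument. */
int demo_parse_image_arg(const char *arg, struct demo_image_arg *out)
{
	const char *comma = strchr(arg, ',');
	size_t len = comma ? (size_t)(comma - arg) : strlen(arg);

	if (len == 0 || len >= sizeof(out->path))
		return -1;

	memcpy(out->path, arg, len);
	out->path[len] = '\0';
	out->readonly = false;

	if (comma) {
		/* Only the "ro" suffix is understood in this sketch. */
		if (strcmp(comma + 1, "ro") != 0)
			return -1;
		out->readonly = true;
	}
	return 0;
}
```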

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Add support for multiple virtio-blk
Sasha Levin [Thu, 5 May 2011 10:24:30 +0000 (13:24 +0300)]
kvm tools: Add support for multiple virtio-blk

Add support for multiple blk_devices by un-globalizing
the current blk_device and allow multiple blk_devices.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Move disk_image into virtio-blk
Sasha Levin [Thu, 5 May 2011 10:24:29 +0000 (13:24 +0300)]
kvm tools: Move disk_image into virtio-blk

There may be multiple disk images on a running guest,
each associated with a virtio-blk.

Move disk_image into virtio-blk in preparation for
multiple disk images.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years ago kvm tools: Fix 32-bit build of the asm/system.h include
Ingo Molnar [Thu, 5 May 2011 08:00:45 +0000 (10:00 +0200)]
kvm tools: Fix 32-bit build of the asm/system.h include

Provide wrappers and other environmental dependencies that the
asm/system.h header file from hell needs to build fine in user-space.

Sidenote: right now alternative() defaults to the compatible, slightly
slower barrier instructions that work on all x86 systems.

If this ever shows up in profiles then kvm could provide an alternatives
patching machinery as well. Right now those instructions are emitted
into special sections and then discarded by the linker harmlessly.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix virt_queue__set_used_elem
Sasha Levin [Tue, 3 May 2011 20:28:07 +0000 (23:28 +0300)]
kvm tools: Fix virt_queue__set_used_elem

Increase idx only after updating the used element.
Not doing so may mark a buffer as used without having
its head and length updated.
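
The ordering can be sketched as follows. This is an illustration under
assumptions, not the patched function: the struct layout, RING_SIZE, and the
name set_used_elem() are hypothetical, and the memory barrier is an addition
that the commit message does not mention:

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 16	/* hypothetical ring size */

struct used_elem {
	uint32_t id;	/* head of the completed descriptor chain */
	uint32_t len;	/* number of bytes written to the buffer */
};

struct used_ring {
	uint16_t idx;	/* next free slot; the guest polls this */
	struct used_elem ring[RING_SIZE];
};

struct used_elem *set_used_elem(struct used_ring *used, uint32_t head,
				uint32_t len)
{
	struct used_elem *elem;

	assert(used);

	/* 1. Fill in the element while it is still invisible to the guest. */
	elem = &used->ring[used->idx % RING_SIZE];
	elem->id = head;
	elem->len = len;

	/* 2. Make sure the element is written before the index moves. */
	__sync_synchronize();

	/* 3. Only now publish the element by advancing idx. */
	used->idx++;

	return elem;
}
```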

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Drop ALIGN from bios.h
Sasha Levin [Tue, 3 May 2011 20:28:06 +0000 (23:28 +0300)]
kvm tools: Drop ALIGN from bios.h

Drops align from bios.h, fixes related code to use
<linux/kernel.h> instead.

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix virtio rng build breakage
Pekka Enberg [Tue, 3 May 2011 14:04:35 +0000 (17:04 +0300)]
kvm tools: Fix virtio rng build breakage

Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add cmdline switch to enable virtio-rng
Sasha Levin [Sat, 30 Apr 2011 13:30:25 +0000 (16:30 +0300)]
kvm tools: Add cmdline switch to enable virtio-rng

Add --virtio-rnd switch to enable virtio RNG in the guest.
Once enabled, the RNG device will be located at /dev/hwrng.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Introduce virtio-rng
Sasha Levin [Sat, 30 Apr 2011 13:30:24 +0000 (16:30 +0300)]
kvm tools: Introduce virtio-rng

Enable virtio-rng, a virtio random number generator.
The guest kernel should be compiled with CONFIG_HW_RANDOM_VIRTIO.
Once enabled, an RNG device will be located at /dev/hwrng.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Lock job_mutex before signalling
Sasha Levin [Sat, 30 Apr 2011 13:30:23 +0000 (16:30 +0300)]
kvm tools: Lock job_mutex before signalling

Lock the mutex before signalling to prevent unexpected
scheduling.
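
A minimal pthread sketch of the pattern, assuming a condition variable
guarded by a mutex. The struct job_queue type and both helpers are
hypothetical names for illustration; only the lock-then-signal ordering
comes from the commit message:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical queue state; "pending" is the predicate waiters check. */
struct job_queue {
	pthread_mutex_t mutex;
	pthread_cond_t cond;
	int pending;
};

/* Producer side: update the predicate and signal while holding the mutex. */
void job_queue_signal(struct job_queue *q)
{
	assert(q);

	pthread_mutex_lock(&q->mutex);
	q->pending++;
	pthread_cond_signal(&q->cond);
	pthread_mutex_unlock(&q->mutex);
}

/* Consumer side: wait until work is pending, then take one unit of it. */
void job_queue_wait(struct job_queue *q)
{
	assert(q);

	pthread_mutex_lock(&q->mutex);
	while (q->pending == 0)
		pthread_cond_wait(&q->cond, &q->mutex);
	q->pending--;
	pthread_mutex_unlock(&q->mutex);
}
```

Because the waiter tests "pending" under the same mutex, a signal cannot slip
in between its check and its call to pthread_cond_wait().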

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Modify thread pool API
Sasha Levin [Fri, 29 Apr 2011 15:15:02 +0000 (18:15 +0300)]
kvm tools: Modify thread pool API

Modify API function names and type names.

[ penberg@kernel.org: drop virtio net parts ]
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agoRevert "kvm tools: Use threadpool for virtio-net"
Pekka Enberg [Sat, 30 Apr 2011 09:15:57 +0000 (12:15 +0300)]
Revert "kvm tools: Use threadpool for virtio-net"

This reverts commit a37089da817ce7aad9789aeb9fc09b68e088ad9a.

13 years agokvm tools: Emulate RTC to fix system time in guests
Pekka Enberg [Thu, 28 Apr 2011 18:50:17 +0000 (21:50 +0300)]
kvm tools: Emulate RTC to fix system time in guests

This patch fixes system time in guests by implementing proper CMOS RTC clock
support.

  # Before:

  sh-2.05b# date
  Fri Aug  7 04:02:01 UTC 2009

  # After:

  sh-2.05b# date
  Thu Apr 28 19:12:21 UTC 2011
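
A rough sketch of what serving CMOS RTC reads involves, assuming the
conventional MC146818 register indexes and BCD encoding. The function names
and the idea of passing a struct tm are illustrative assumptions and are not
taken from the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Encode a value below 100 as packed BCD, e.g. 28 becomes 0x28. */
uint8_t bin2bcd(unsigned int val)
{
	assert(val < 100);
	return (uint8_t)(((val / 10) << 4) | (val % 10));
}

/* Return the BCD value of one RTC register for the given broken-down time. */
uint8_t rtc_read_reg(const struct tm *tm, uint8_t reg)
{
	assert(tm);

	switch (reg) {
	case 0x00: return bin2bcd((unsigned int)tm->tm_sec);
	case 0x02: return bin2bcd((unsigned int)tm->tm_min);
	case 0x04: return bin2bcd((unsigned int)tm->tm_hour);
	case 0x07: return bin2bcd((unsigned int)tm->tm_mday);
	case 0x08: return bin2bcd((unsigned int)tm->tm_mon + 1);	/* tm_mon is 0-based */
	case 0x09: return bin2bcd((unsigned int)tm->tm_year % 100);	/* years since 1900 */
	default:   return 0;
	}
}
```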

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix disk image double-free on KVM panic
Pekka Enberg [Thu, 28 Apr 2011 18:36:35 +0000 (21:36 +0300)]
kvm tools: Fix disk image double-free on KVM panic

The kvm_cmd_run() calls disk_image__close() before exiting so we must not call
it in kvm_cpu_thread() in the "panic_kvm" case.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use threadpool for virtio-net
Sasha Levin [Thu, 28 Apr 2011 13:40:45 +0000 (16:40 +0300)]
kvm tools: Use threadpool for virtio-net

virtio-net has been converted to use the threadpool.  This is very similar to
the change done in virtio-blk, only here we had 2 queues to handle.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use threadpool for virtio-console
Sasha Levin [Thu, 28 Apr 2011 13:40:44 +0000 (16:40 +0300)]
kvm tools: Use threadpool for virtio-console

This is very similar to the change done in virtio-net.

Notice that one signal here comes from outside the module (actual terminal)
while the other one is generated by the virtio module.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use threadpool for virtio-blk
Sasha Levin [Thu, 28 Apr 2011 13:40:43 +0000 (16:40 +0300)]
kvm tools: Use threadpool for virtio-blk

virtio-blk has been converted to use the threadpool. All the threading code has
been removed, which left only simple callback handling code.

New threadpool job types are created within VIRTIO_PCI_QUEUE_PFN for every
queue (just one in the case of virtio-blk).  The module signals for work after
receiving VIRTIO_PCI_QUEUE_NOTIFY and expects the threadpool to call
virtio_blk_do_io to handle the I/O.  It is possible that the module will signal
work several times while virtio_blk_do_io is already working, but there is no
need to handle multithreading there since the threadpool runs the callbacks of
each job serially and never in parallel.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Introduce generic I/O thread pool
Sasha Levin [Thu, 28 Apr 2011 13:40:42 +0000 (16:40 +0300)]
kvm tools: Introduce generic I/O thread pool

This patch adds a generic pool to create a common interface for working with
threads within the kvm tool. The main idea is to use this threadpool for all
I/O threads instead of having every I/O module write its own thread code. The
process of working with the thread pool is supposed to be very simple.

During initialization, each module which is interested in working with the
threadpool will call threadpool__add_jobtype with the callback function and a
void* parameter. For example, virtio modules will register every virt_queue as
a new job type.  During operation, when there's work to do for a specific job,
the module will signal it to the queue and expect the callback to be
called with the proper parameters. It is assured that the callback will be
called once for every signal action and that each callback will run only once
at a time (i.e. callback functions themselves don't need to handle threading).
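
The two guarantees (one callback invocation per signal, never two at once for
the same job) can be sketched as below. This is a simplified model under
assumptions, not the pool from the patch: struct job, job_signal(),
job_run_pending() and count_calls() are hypothetical names, and worker thread
creation and sleeping are left out:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical job record: one per registered callback. */
struct job {
	void (*callback)(void *param);
	void *param;
	pthread_mutex_t mutex;
	int signalcount;	/* signals not yet handled */
	int running;		/* nonzero while a worker owns this job */
};

/* Module side: record that there is one more unit of work. */
void job_signal(struct job *job)
{
	assert(job);

	pthread_mutex_lock(&job->mutex);
	job->signalcount++;
	pthread_mutex_unlock(&job->mutex);
}

/*
 * Worker side: run the callback once per outstanding signal.
 * Returns how many times the callback ran; 0 if another worker owns the job.
 */
int job_run_pending(struct job *job)
{
	int runs = 0;

	assert(job && job->callback);

	pthread_mutex_lock(&job->mutex);
	if (job->running) {
		pthread_mutex_unlock(&job->mutex);
		return 0;
	}
	job->running = 1;

	while (job->signalcount > 0) {
		job->signalcount--;
		/* Drop the lock so the module can keep signalling meanwhile. */
		pthread_mutex_unlock(&job->mutex);
		job->callback(job->param);
		runs++;
		pthread_mutex_lock(&job->mutex);
	}

	job->running = 0;
	pthread_mutex_unlock(&job->mutex);
	return runs;
}

/* Example callback: counts its own invocations through param. */
void count_calls(void *param)
{
	(*(int *)param)++;
}
```

Signals that arrive while the callback is running are picked up by the same
worker on its next loop iteration, so the callback itself needs no locking.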

[ penberg@kernel.org: Use Lindent ]
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add kernel headers required for using list
Sasha Levin [Thu, 28 Apr 2011 13:40:41 +0000 (16:40 +0300)]
kvm tools: Add kernel headers required for using list

Adds kernel headers so that <linux/list.h> (and others) could be included
directly.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Prevent duplicate definitions of ALIGN
Sasha Levin [Thu, 28 Apr 2011 13:40:40 +0000 (16:40 +0300)]
kvm tools: Prevent duplicate definitions of ALIGN

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Show KVM state on SIGQUIT
Pekka Enberg [Tue, 26 Apr 2011 20:07:11 +0000 (23:07 +0300)]
kvm tools: Show KVM state on SIGQUIT

SysRq-t isn't useful during early boot problems because serial console is not
set up. Therefore, also dump KVM state on SIGQUIT.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: display appropriate error message when default kernel image could not...
Prasad Joshi [Tue, 26 Apr 2011 11:28:27 +0000 (12:28 +0100)]
kvm tools: display appropriate error message when default kernel image could not be found

This change was recommended by Ingo Molnar in his reply to mail 'Use the root
partition of the host to boot the guest machine'. The patch informs user to
explicitly run the 'kvm run --help' command, in case the kvm tool could not find
a default kernel image to boot.

prasad@prasad-kvm:~/KVM/linux-kvm/tools/kvm$ ./kvm run
Fatal: could not find default kernel image in:
./bzImage
../../arch/x86/boot/bzImage
/boot/vmlinuz-2.6.35-25-generic
/boot/bzImage-2.6.35-25-generic

Please see 'kvm run --help' for more options.

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: check read permission before using the root partition of the host to boot VM
Prasad Joshi [Tue, 26 Apr 2011 10:59:18 +0000 (11:59 +0100)]
kvm tools: check read permission before using the root partition of the host to boot VM

Commit fbe8d0f (kvm tools: Use the root partition of the host to boot the
guest machine) changed the default image for the virtual machine to the root
partition of the host machine. This patch adds a check to ensure that the kvm
tool has read permission on this partition before using it.

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add NR_CPUS definition in case of non-configured kernel sources
Cyrill Gorcunov [Mon, 25 Apr 2011 18:07:19 +0000 (22:07 +0400)]
kvm tools: Add NR_CPUS definition in case of non-configured kernel sources

Pekka reported
|
| I see this if I ignore the reject:
|
| penberg@tiger:~/linux/tools/kvm$ make
| In file included from mptable.c:10:
| ../../arch/x86/include/asm/mpspec_def.h:20:6: error: "NR_CPUS" is not defined

This is because the Linux kernel source tree might not be configured (bare
sources), so we add our own definition in case NR_CPUS is not defined.

[ penberg@kernel.org: fix up compilation error ]
Reported-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use system wide msr-index.h instead of own definitions
Cyrill Gorcunov [Mon, 25 Apr 2011 17:26:06 +0000 (21:26 +0400)]
kvm tools: Use system wide msr-index.h instead of own definitions

To eliminate code duplication.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add MP tables support
Cyrill Gorcunov [Mon, 25 Apr 2011 17:26:05 +0000 (21:26 +0400)]
kvm tools: Add MP tables support

This is a raw prototype for MP table support; most resources
such as IRQ pins and sources are hardcoded, among other limitations.

Note we still limit the number of cpus to run to a single cpu
until full SMP support appears. In particular, don't forget to
remove "nolapic" from the command line then.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use non shared pin/irqs for virtio devices
Cyrill Gorcunov [Mon, 25 Apr 2011 17:26:04 +0000 (21:26 +0400)]
kvm tools: Use non shared pin/irqs for virtio devices

There is no need for shared IRQs for virtio devices.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Implement virtio net TSO/UFO support
Asias He [Sun, 24 Apr 2011 12:46:37 +0000 (20:46 +0800)]
kvm tools: Implement virtio net TSO/UFO support

This patch bumps host to guest TCP bandwidth from

1060 Mbit/s to 1760 Mbit/s,

and guest to host TCP bandwidth from

 342 Mbit/s to  619 Mbit/s.

*************************
Without TSO and UFO
*************************
(guest <- host)
root@sid1:~# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.33.15 port 5001 connected with 192.168.33.2 port 38733
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.23 GBytes  1.06 Gbits/sec
^Croot@sid1:~# iperf -s -u
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size:   110 KByte (default)
------------------------------------------------------------
[  3] local 192.168.33.15 port 5001 connected with 192.168.33.2 port 54933
[ ID] Interval       Transfer     Bandwidth       Jitter   Lost/Total Datagrams
[  3]  0.0-10.0 sec  1.25 MBytes  1.05 Mbits/sec  0.030 ms    0/  893 (0%)

(guest -> host)
root@sid1:~# iperf -c host
------------------------------------------------------------
Client connecting to host, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.33.15 port 42197 connected with 192.168.33.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec    408 MBytes    342 Mbits/sec
root@sid1:~# iperf -c host -u
------------------------------------------------------------
Client connecting to host, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size:   110 KByte (default)
------------------------------------------------------------
[  3] local 192.168.33.15 port 56176 connected with 192.168.33.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.25 MBytes  1.05 Mbits/sec
[  3] Sent 893 datagrams
[  3] Server Report:
[  3]  0.0-10.0 sec  1.25 MBytes  1.05 Mbits/sec  0.012 ms    0/  893 (0%)

*************************
With TSO and UFO
*************************

(guest <- host)
root@sid1:~# iperf  -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.33.15 port 5001 connected with 192.168.33.2 port 42767
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  2.05 GBytes  1.76 Gbits/sec
root@sid1:~# iperf  -s -u
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size:   110 KByte (default)
------------------------------------------------------------
[  3] local 192.168.33.15 port 5001 connected with 192.168.33.2 port 35049
[ ID] Interval       Transfer     Bandwidth       Jitter   Lost/Total Datagrams
[  3]  0.0-10.0 sec  1.25 MBytes  1.05 Mbits/sec  0.031 ms    0/  893 (0%)

(guest -> host)
asias@hj:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.33.2 port 5001 connected with 192.168.33.15 port 60868
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   738 MBytes   619 Mbits/sec
asias@hj:~$ iperf -s -u
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size:  112 KByte (default)
------------------------------------------------------------
[  3] local 192.168.33.2 port 5001 connected with 192.168.33.15 port 40602
[ ID] Interval       Transfer     Bandwidth        Jitter   Lost/Total Datagrams
[  3]  0.0-10.0 sec  1.25 MBytes  1.05 Mbits/sec   0.030 ms    0/  893 (0%)

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix possible leak in qcow
Sasha Levin [Sat, 23 Apr 2011 14:05:08 +0000 (17:05 +0300)]
kvm tools: Fix possible leak in qcow

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add missing space before root= option
Cyrill Gorcunov [Wed, 20 Apr 2011 15:49:12 +0000 (19:49 +0400)]
kvm tools: Add missing space before root= option

If user passes own options we need an extra space; otherwise options get
joined.
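
A small sketch of the joining rule, assuming the user options come first and
the root= option is appended. The build_cmdline() helper and its parameters
are hypothetical and only illustrate where the separating space belongs:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build "<user params> root=<dev>" into buf, or just "root=<dev>" when the
 * user passed nothing. Returns 0 on success, -1 if buf is too small.
 */
int build_cmdline(char *buf, size_t size, const char *user_params,
		  const char *root_dev)
{
	int n;

	assert(buf && root_dev);

	if (user_params && user_params[0] != '\0')
		n = snprintf(buf, size, "%s root=%s", user_params, root_dev);
	else
		n = snprintf(buf, size, "root=%s", root_dev);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Without the space in the first format string, "console=ttyS0" and
"root=/dev/vda" would be joined into a single, meaningless option.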

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add read-only support for QCOW2 images
Pekka Enberg [Tue, 19 Apr 2011 19:56:00 +0000 (22:56 +0300)]
kvm tools: Add read-only support for QCOW2 images

This patch extends the QCOW1 format to also support QCOW2 images as specified
by the following document:

  http://people.gnome.org/~markmc/qcow-image-format.html

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use mmap to allocate guest RAM
Sasha Levin [Tue, 19 Apr 2011 17:28:39 +0000 (20:28 +0300)]
kvm tools: Use mmap to allocate guest RAM

Using mmap to allocate the RAM enables us to allocate large blocks of memory
for the guest, which makes it possible to boot guests with a large amount of
RAM.  Because KVM used to oops (now it just fails) when
KVM_SET_USER_MEMORY_REGION was called with a large memory block, we had to
move the actual ioctl to after KVM_CREATE_VCPU.
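
A minimal sketch of such an allocation, assuming an anonymous private mapping.
The alloc_guest_ram() name is hypothetical and the exact flags are an
assumption; the commit message does not list them:

```c
#define _GNU_SOURCE	/* for MAP_ANONYMOUS and MAP_NORESERVE */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Reserve "size" bytes of zero-filled memory for guest RAM.
 * Pages are only backed by physical memory once they are touched.
 * Returns NULL on failure.
 */
void *alloc_guest_ram(size_t size)
{
	void *ram;

	assert(size > 0);

	ram = mmap(NULL, size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

	return ram == MAP_FAILED ? NULL : ram;
}
```

The returned pointer is what would later be handed to
KVM_SET_USER_MEMORY_REGION as the userspace address of the region.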

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix qcow1_read_cluster() return value on error
Pekka Enberg [Tue, 19 Apr 2011 16:57:36 +0000 (19:57 +0300)]
kvm tools: Fix qcow1_read_cluster() return value on error

The qcow1_read_cluster() function returns a negative number on error but its
return type is unsigned. Fix that up and also fix the call-site in
qcow1_read_sector() to deal with a negative return value properly.
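
The class of bug can be sketched like this. The read_cluster() and
read_sector() helpers are hypothetical stand-ins built on pread(), not the
qcow code itself:

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns the number of bytes read, or -1 on error. The return type must be
 * signed: with an unsigned type, -1 would turn into a huge positive count.
 */
ssize_t read_cluster(int fd, void *buf, size_t len, off_t offset)
{
	assert(buf);
	return pread(fd, buf, len, offset);
}

/* Call-site: propagate the error instead of treating it as a length. */
int read_sector(int fd, void *buf, size_t len, off_t offset)
{
	ssize_t nr = read_cluster(fd, buf, len, offset);

	if (nr < 0)
		return -1;

	return 0;
}
```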

Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Kill redundant assignment in qcow1_read_cluster()
Pekka Enberg [Tue, 19 Apr 2011 16:55:49 +0000 (19:55 +0300)]
kvm tools: Kill redundant assignment in qcow1_read_cluster()

Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use threading for virtio block devices
Sasha Levin [Mon, 18 Apr 2011 13:02:31 +0000 (16:02 +0300)]
kvm tools: Use threading for virtio block devices

Add I/O thread to handle I/O operations in virtio-blk.

There is currently support for multiple virtio queues but the kernel side
supports only one virtio queue. It's not too much of a performance impact and
the ABI does support multiple queues there, so I've preferred to do it like
that to keep it flexible.

I/O performance itself doesn't increase much due to the patch; what changes is
system responsiveness during I/O operations.  On an unthreaded system, the VCPU
is frozen until the I/O request is complete. On a threaded system, on the
other hand, the VCPU is free to do other work or queue more I/O while
waiting for the original I/O request to complete.

[ penberg@kernel.org: cleanups ]
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix possible leak in disk_image
Sasha Levin [Tue, 19 Apr 2011 07:32:16 +0000 (10:32 +0300)]
kvm tools: Fix possible leak in disk_image

Close leaking fd if ioctl fails.
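
A sketch of the error path, assuming the ioctl in question queries the size of
a block device with BLKGETSIZE64 (an assumption; the commit message does not
name the request). The open_blkdev() helper is hypothetical:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>	/* BLKGETSIZE64 */

/*
 * Open a block device and query its size in bytes.
 * Returns the open fd on success, -1 on failure with no fd left open.
 */
int open_blkdev(const char *path, uint64_t *size)
{
	int fd;

	assert(path && size);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (ioctl(fd, BLKGETSIZE64, size) < 0) {
		/* Without this close() the descriptor would leak. */
		close(fd);
		return -1;
	}

	return fd;
}
```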

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use the root partition of the host to boot the guest machine
Prasad Joshi [Tue, 19 Apr 2011 14:29:04 +0000 (15:29 +0100)]
kvm tools: Use the root partition of the host to boot the guest machine

The kvm run command should automatically pick up the image file to boot if one
is not explicitly specified.

Quoting Ingo Molnar:

  Looks good here, with your patch applied 'kvm run' will now boot host userspace
  with zero configuration needed (!), when there's a bzImage built in that kernel
  tree:

    aldebaran:~/linux/linux/tools/kvm> ./kvm run

    [...]

    Welcome to Fedora release 16 (Rawhide)!

Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Untangle qcow1_read_sector() function
Pekka Enberg [Sun, 17 Apr 2011 10:42:28 +0000 (13:42 +0300)]
kvm tools: Untangle qcow1_read_sector() function

This patch rearranges qcow1_read_sector() code so that all normal, zero
cluster, and I/O error paths share the same exit point which also deals with L2
table freeing.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Cleanup qcow1_read_sector() function
Pekka Enberg [Sun, 17 Apr 2011 10:40:37 +0000 (13:40 +0300)]
kvm tools: Cleanup qcow1_read_sector() function

Fix formatting issues and rename 'length' variable to 'nr_read'.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Correct a variable naming spelling mistake
Prasad Joshi [Sat, 16 Apr 2011 20:02:33 +0000 (21:02 +0100)]
kvm tools: Correct a variable naming spelling mistake

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: check the cluster boundary in the qcow read code
Prasad Joshi [Sat, 16 Apr 2011 19:40:25 +0000 (20:40 +0100)]
kvm tools: check the cluster boundary in the qcow read code

The QCOW1 code should always read data in cluster-sized units. A new function
qcow1_read_cluster() is added to read a cluster's worth of data. The current
function to read the data, i.e. qcow1_read_sector(), is modified to use this
newly added function and preserve the cluster boundary.
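
The boundary check amounts to clipping each read at the end of the current
cluster. A minimal sketch, assuming a power-of-two cluster size given as
cluster_bits; the bytes_to_cluster_end() helper is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how many of the "want" bytes starting at "offset" can be read
 * without crossing into the next cluster.
 */
uint32_t bytes_to_cluster_end(uint64_t offset, uint32_t want,
			      uint32_t cluster_bits)
{
	uint64_t cluster_size, in_cluster, left;

	assert(cluster_bits < 32);

	cluster_size = 1ULL << cluster_bits;
	in_cluster = offset & (cluster_size - 1);	/* position inside cluster */
	left = cluster_size - in_cluster;		/* bytes until boundary */

	return want < left ? want : (uint32_t)left;
}
```

A caller would loop, reading the returned length each time and advancing the
offset, until the whole request has been served.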

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix iov shifting
Sasha Levin [Sat, 16 Apr 2011 21:03:55 +0000 (00:03 +0300)]
kvm tools: Fix iov shifting

Use helper function to remove full iov when 'reading in full'.

Thanks Konstantin Khlebnikov.

[ penberg@kernel.org: fix formatting ]
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Rename _sg to _iov and remove dead code
Sasha Levin [Sat, 16 Apr 2011 16:53:21 +0000 (19:53 +0300)]
kvm tools: Rename _sg to _iov and remove dead code

Use _iov to indicate scatter-gather.
Remove simple IO ops from raw image - Dead code.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add scatter-gather support for disk images
Sasha Levin [Sat, 16 Apr 2011 15:05:56 +0000 (18:05 +0300)]
kvm tools: Add scatter-gather support for disk images

Add optional support for scatter-gather to disk_image.
Formats that can't take advantage of scatter-gather fallback to simple IO.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add scatter-gather variants of IO functions
Sasha Levin [Sat, 16 Apr 2011 15:05:55 +0000 (18:05 +0300)]
kvm tools: Add scatter-gather variants of IO functions

Added scatter-gather variants of [x,xp][read,write]() and [p]read_in_full().
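
A sketch of what a "read in full" variant over an iovec array can look like,
built on readv(). The names readv_in_full() and shift_iov() are hypothetical
and the code is an illustration under those assumptions, not the patch:

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Advance the iovec cursor past "nr" bytes that were already transferred. */
void shift_iov(struct iovec **iov, int *iovcnt, size_t nr)
{
	/* Drop every entry that was consumed completely. */
	while (*iovcnt > 0 && nr >= (*iov)->iov_len) {
		nr -= (*iov)->iov_len;
		(*iov)++;
		(*iovcnt)--;
	}

	/* Trim the front of a partially consumed entry. */
	if (*iovcnt > 0) {
		(*iov)->iov_base = (char *)(*iov)->iov_base + nr;
		(*iov)->iov_len -= nr;
	}
}

/*
 * Keep calling readv() until all buffers are full, EOF, or an error.
 * Note: the entries of the caller's iov array are modified.
 * Returns the total number of bytes read, or -1 on error.
 */
ssize_t readv_in_full(int fd, struct iovec *iov, int iovcnt)
{
	ssize_t total = 0;

	assert(iov);

	while (iovcnt > 0) {
		ssize_t nr = readv(fd, iov, iovcnt);

		if (nr < 0) {
			if (errno == EINTR || errno == EAGAIN)
				continue;
			return -1;
		}
		if (nr == 0)	/* EOF */
			break;

		total += nr;
		shift_iov(&iov, &iovcnt, (size_t)nr);
	}

	return total;
}
```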

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tool: Change the type of QCOW table index and offset variables to u64
Prasad Joshi [Sat, 16 Apr 2011 13:50:57 +0000 (14:50 +0100)]
kvm tool: Change the type of QCOW table index and offset variables to u64

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix leak in QCOW
Sasha Levin [Sat, 16 Apr 2011 11:45:43 +0000 (14:45 +0300)]
kvm tools: Fix leak in QCOW

Fixed leak when reading a zero sector, also simplified flow a bit.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tool: Remove the __stringify*() definition from util.h
Prasad Joshi [Sat, 16 Apr 2011 11:15:06 +0000 (12:15 +0100)]
kvm tool: Remove the __stringify*() definition from util.h

Include the Linux kernel header file linux/stringify.h instead of
redefining the __stringify* macros.

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Setup bridged network by a script
Amos Kong [Fri, 15 Apr 2011 10:19:34 +0000 (18:19 +0800)]
kvm tools: Setup bridged network by a script

Use the original hardcoded network setup by default.

#./kvm run ... -n virtio --tapscript=./util/kvm-ifup-vbr0
# brctl show
bridge name     bridge id               STP enabled     interfaces
vbr0            8000.e272c7c391f4       no              tap0
guest)# ifconfig eth6
eth6      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          inet addr:192.168.33.192  Bcast:192.168.33.255  Mask:255.255.255.0
          inet6 addr: fe80::211:22ff:fe33:4455/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:22 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3725 (3.6 KiB)  TX bytes:852 (852.0 b)
guest)# ping amosk.info
PING amosk.info (69.175.108.82) 56(84) bytes of data.
64 bytes from nurpulat.uz (69.175.108.82): icmp_seq=1 ttl=43 time=306 ms

Changes from v1:
- rebased to latest tree
- replace system() by execv()

Signed-off-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add a script to setup tap device
Amos Kong [Thu, 14 Apr 2011 04:37:55 +0000 (12:37 +0800)]
kvm tools: Add a script to setup tap device

# ./kvm-ifup-vbr0 $tap_name

Signed-off-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add a script to setup private bridge
Amos Kong [Thu, 14 Apr 2011 04:37:45 +0000 (12:37 +0800)]
kvm tools: Add a script to setup private bridge

We can use this script to create/delete a private bridge,
launch a DHCP server on the bridge with dnsmasq, and set up
an iptables forwarding rule so that the guest can access the public network.

# ./set_private_br.sh vbr0 192.168.33
add new private bridge: vbr0
# brctl show
bridge name     bridge id               STP enabled     interfaces
vbr0            8000.000000000000       yes
# ifconfig vbr0
vbr0      Link encap:Ethernet  HWaddr 82:0f:f5:8f:92:47
          inet addr:192.168.33.1  Bcast:192.168.33.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:1979 (1.9 KB)
# ps aux |grep dnsmasq
nobody .. dnsmasq --strict-order --bind-interfaces --listen-address 192.168.33.1 \
--dhcp-range 192.168.33.1,192.168.33.254

Signed-off-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Rename raw_image__close_sector_ro_mmap
Sasha Levin [Fri, 15 Apr 2011 08:12:31 +0000 (11:12 +0300)]
kvm tools: Rename raw_image__close_sector_ro_mmap

Renamed to raw_image__close_ro_mmap since it has nothing to do with sectors.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix virtio console input problem
Asias He [Fri, 15 Apr 2011 14:55:04 +0000 (22:55 +0800)]
kvm tools: Fix virtio console input problem

term_getc only gets one char at a time, so term_getc_iov should
send one char back to the guest.

Otherwise, you will get four input chars when you only type one, like below:

sid login: r^@^@^@o^@^@^@o^@^@^@t^@^@^@

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Free l1_table in qcow1_disk_close() and in error path of qcow1_probe()
Prasad Joshi [Fri, 15 Apr 2011 14:18:57 +0000 (15:18 +0100)]
kvm tools: Free l1_table in qcow1_disk_close() and in error path of qcow1_probe()

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Avoid byte-order conversion during each read operation
Prasad Joshi [Fri, 15 Apr 2011 14:18:56 +0000 (15:18 +0100)]
kvm tools: Avoid byte-order conversion during each read operation
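
One way to avoid converting on every read is to convert the big-endian table
once, right after loading it. A minimal sketch under that assumption; the
l1_table_to_cpu() helper is a hypothetical name and the commit message does
not say which structure the patch converts:

```c
#define _DEFAULT_SOURCE	/* for be64toh() in <endian.h> */
#include <assert.h>
#include <endian.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert an on-disk (big-endian) table to host byte order in place, so
 * later lookups can use the entries directly.
 */
void l1_table_to_cpu(uint64_t *table, size_t nr_entries)
{
	size_t i;

	assert(table || nr_entries == 0);

	for (i = 0; i < nr_entries; i++)
		table[i] = be64toh(table[i]);
}
```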

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix function names in qcow.c
Prasad Joshi [Fri, 15 Apr 2011 14:18:55 +0000 (15:18 +0100)]
kvm tools: Fix function names in qcow.c

The function name sect_to_l1_offset() is changed to get_l1_index() as it
returns the l1 table index rather than offset.

Also change

    - sect_to_l2_offset to get_l2_index
    - sect_to_cluster_offset to get_cluster_offset

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add read-only support for block devices
Pekka Enberg [Wed, 13 Apr 2011 19:56:02 +0000 (22:56 +0300)]
kvm tools: Add read-only support for block devices

Add support for booting guests from host block devices in read-only mode.

Quoting Ingo Molnar:

  Booting into the host's userspace works out of box now. The
  following disk-image-less command:

     ./kvm run ../../arch/x86/boot/bzImage --readonly --image=/dev/sda --params="root=/dev/vda1"

  has booted all the way into the host's Fedora Rawhide userspace:

     ===================================
     Welcome to Fedora release 16 (Rawhide)!

     [...]

     Fedora release 16 (Rawhide)
     Kernel 2.6.39-rc3-tip+ on an x86_64 (ttyS0)

     aldebaran login:
     ===================================

  I ran this as unprivileged user (who had read access to the partitions in
  question).

  This is a very useful feature for testing kernels - i can test any random Linux
  box's userspace via KVM, without having to copy an image there! :-)

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add option to specify guest MAC
Sasha Levin [Thu, 14 Apr 2011 19:17:43 +0000 (22:17 +0300)]
kvm tools: Add option to specify guest MAC

Add --guest-mac to specify the MAC of the guest.
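
A sketch of parsing such an argument, assuming the usual colon-separated
hexadecimal form (e.g. 00:11:22:33:44:55). The parse_guest_mac() helper is a
hypothetical name and not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse "xx:xx:xx:xx:xx:xx" into six bytes.
 * Returns 0 on success, -1 if the string is malformed.
 */
int parse_guest_mac(const char *str, uint8_t mac[6])
{
	unsigned int b[6];
	char trailing;
	int i;

	assert(str && mac);

	/* The extra %c catches trailing garbage: exactly 6 items must match. */
	if (sscanf(str, "%2x:%2x:%2x:%2x:%2x:%2x%c",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &trailing) != 6)
		return -1;

	for (i = 0; i < 6; i++)
		mac[i] = (uint8_t)b[i];

	return 0;
}
```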

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Enable network by default
Sasha Levin [Thu, 14 Apr 2011 19:17:42 +0000 (22:17 +0300)]
kvm tools: Enable network by default

Enable virtio networking by default, warn if user doesn't have tun/tap interface or not enough permissions to start it.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Organize net parameters into struct
Sasha Levin [Thu, 14 Apr 2011 19:17:41 +0000 (22:17 +0300)]
kvm tools: Organize net parameters into struct

Move network configuration parameters into a struct.
The number of network parameters will grow rather large, so better to do it
early.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>