git.karo-electronics.de Git - karo-tx-linux.git/log
karo-tx-linux.git
Sasha Levin [Thu, 26 May 2011 10:30:05 +0000 (13:30 +0300)]
kvm tools: Add basic ioport dynamic allocation

Add a very simple allocation of ioports.

This prevents the need to coordinate ioports between different
modules.
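
A minimal sketch of what such an allocator could look like, assuming a simple bump allocator that hands out fixed-size blocks of ports. The names `dyn_ioport_alloc`, `DYN_START`, `DYN_BLOCK`, `DYN_END` and their values are hypothetical and not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: ports below DYN_START stay reserved for legacy devices. */
#define DYN_START 0x6200u
#define DYN_BLOCK 0x400u	/* ports handed out per request */
#define DYN_END   0x10000u	/* one past the last x86 I/O port */

static uint32_t dyn_next = DYN_START;

/* Hand out the next unused block of ports; returns -1 once the space runs out. */
int dyn_ioport_alloc(void)
{
	uint32_t base = dyn_next;

	assert(base >= DYN_START);
	if (base + DYN_BLOCK > DYN_END)
		return -1;

	dyn_next = base + DYN_BLOCK;
	return (int)base;
}
```

With something like this, each module would call `dyn_ioport_alloc()` once at init time and register its handlers at the returned base, so no two modules have to agree on port numbers up front.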

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Thu, 26 May 2011 10:30:04 +0000 (13:30 +0300)]
kvm tools: Add optional parameter used in ioport callbacks

Allow specifying an optional parameter when registering an
ioport range. The callback functions provided by the registering
module will be called with the same parameter.

This may be used to keep context during callbacks on IO operations.
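
The idea can be sketched as follows. This is an illustration under assumptions, not the interface from the patch: `port_register`, `port_dispatch`, `port_cb`, `struct demo_dev` and `demo_io` are hypothetical names, and a flat array stands in for whatever lookup structure is really used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: 'param' is whatever was passed at registration. */
typedef int (*port_cb)(uint16_t port, void *data, int size, void *param);

struct port_range {
	uint16_t first, count;
	port_cb  cb;
	void    *param;
};

#define MAX_RANGES 8
static struct port_range ranges[MAX_RANGES];
static size_t nr_ranges;

/* Remember the callback together with its context pointer. */
int port_register(uint16_t first, uint16_t count, port_cb cb, void *param)
{
	if (nr_ranges == MAX_RANGES)
		return -1;
	ranges[nr_ranges++] = (struct port_range){ first, count, cb, param };
	return 0;
}

/* Find the range covering 'port' and invoke its callback with the stored param. */
int port_dispatch(uint16_t port, void *data, int size)
{
	for (size_t i = 0; i < nr_ranges; i++) {
		const struct port_range *r = &ranges[i];

		if (port >= r->first && port - r->first < r->count)
			return r->cb(port, data, size, r->param);
	}
	return -1;
}

/* Example device: the context counts how often the device was accessed. */
struct demo_dev { unsigned accesses; };

int demo_io(uint16_t port, void *data, int size, void *param)
{
	struct demo_dev *dev = param;

	(void)port; (void)data; (void)size;
	assert(dev != NULL);
	dev->accesses++;
	return 0;
}
```

Passing the device structure as the parameter lets one callback serve several device instances without global state.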

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Mon, 23 May 2011 14:52:51 +0000 (17:52 +0300)]
kvm tools: Code cleanups to hw/vesa.c

Tidy up the code in hw/vesa.c:

  - Make videomem local to hw/vesa.c

  - Remove debugging printf() calls

  - Fix up coding style issues

Cc: John Floren <john@jfloren.net>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cyrill Gorcunov [Mon, 23 May 2011 14:39:17 +0000 (18:39 +0400)]
kvm tools: Drop unused vars from int10.c code

A couple of functions define an 'ah' variable but never actually
use it, so the gcc 4.6.x series complains with

  CC       bios/bios-rom.bin
  bios/int10.c: In function ‘int10_putchar’:
  bios/int10.c:86:9: error: variable ‘ah’ set but not used [-Werror=unused-but-set-variable]
  bios/int10.c: In function ‘int10_vesa’:
  bios/int10.c:96:9: error: variable ‘ah’ set but not used [-Werror=unused-but-set-variable]
  cc1: all warnings being treated as errors

so get rid of them.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
CC: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
John Floren [Mon, 23 May 2011 12:15:18 +0000 (15:15 +0300)]
kvm tools: Initialize and use VESA and VNC

Requirements - Kernel compiled with:
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_VESA=y
CONFIG_FRAMEBUFFER_CONSOLE=y

Start VNC server by starting kvm tools with "--vnc".
Connect to the VNC server by running: "vncviewer :0".

Since there is no support for input devices at this time,
it may be useful to start kvm tools with an additional
' -p "console=ttyS0" ' parameter so that a serial console
can be used alongside the graphical one.

Signed-off-by: John Floren <john@jfloren.net>
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
John Floren [Mon, 23 May 2011 12:15:17 +0000 (15:15 +0300)]
kvm tools: Update makefile and feature tests

Update feature tests to test for libvncserver.

VESA support doesn't get compiled in unless libvncserver
is installed.

Signed-off-by: John Floren <john@jfloren.net>
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
John Floren [Mon, 23 May 2011 12:15:16 +0000 (15:15 +0300)]
kvm tools: Add VESA device

Add a simple VESA device which moves a framebuffer
from the guest kernel to a VNC server.

VESA device PCI code is very similar to virtio-* PCI code.

Signed-off-by: John Floren <john@jfloren.net>
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
John Floren [Mon, 23 May 2011 12:15:15 +0000 (15:15 +0300)]
kvm tools: Add video mode to kernel initialization

Allow setting video mode in guest kernel.

For possible values see Documentation/fb/vesafb.txt

Signed-off-by: John Floren <john@jfloren.net>
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
John Floren [Mon, 23 May 2011 12:15:14 +0000 (15:15 +0300)]
kvm tools: Add BIOS INT10 handler

INT10 handler is a basic implementation of BIOS video services.

The handler implements a VESA interface which is initialized at
the very beginning of loading the kernel.

Signed-off-by: John Floren <john@jfloren.net>
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Prasad Joshi [Sun, 22 May 2011 16:24:07 +0000 (17:24 +0100)]
kvm tools: Release memory allocated during virtio block initialization

Add a new function, virtio_blk__delete(), which goes through the array
of block devices and releases the memory allocated for each block device.

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Prasad Joshi [Sun, 22 May 2011 16:24:06 +0000 (17:24 +0100)]
kvm tools: Add a wrapper function to initialize all virtio block devices

The patch moves the code for initialization of all of the virtio block
devices to the virtio subsystem.

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Prasad Joshi [Sun, 22 May 2011 16:24:05 +0000 (17:24 +0100)]
kvm tools: Close the disk images after the guest shuts down

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Prasad Joshi [Sun, 22 May 2011 16:24:04 +0000 (17:24 +0100)]
kvm tools: Add a wrapper function to open disk images

Ingo suggested moving the disk image subsystem code out of the
kvm-run.c file. The code to open all of the specified disk
images is now moved to a wrapper function in disk/core.c.

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Sun, 22 May 2011 07:55:20 +0000 (10:55 +0300)]
kvm tools: Drop dummy PCI ioport registrations

There's no point in registering dummy ops for PCI ioports because pci__init()
registers the real ones before we enter the guest kernel.

Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cyrill Gorcunov [Sat, 21 May 2011 12:16:51 +0000 (16:16 +0400)]
kvm tools: Print out a warning on io-port re-registration

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cyrill Gorcunov [Sat, 21 May 2011 12:10:34 +0000 (16:10 +0400)]
kvm tools, 9p: Test for tuncation result

Without using 'res' I get

 | cyrill@sun kvm $ make
 |  CC       virtio/9p.o
 | virtio/9p.c: In function ‘virtio_p9_wstat’:
 | virtio/9p.c:448:6: error: variable ‘res’ set but not used [-Werror=unused-but-set-variable]
 | cc1: all warnings being treated as errors
 | make: *** [virtio/9p.o] Error 1

so add a basic check for the ftruncate result. This eliminates the warning,
and we might need to use the 'res' status later in the caller code.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Sat, 21 May 2011 12:01:05 +0000 (15:01 +0300)]
kvm tools: Don't register dummy ops for serial ports

The serial8250__init() function registers ops for all serial ports before we
start running the guest kernel so drop the dummy op registrations.

Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Sat, 21 May 2011 12:04:10 +0000 (15:04 +0300)]
kvm tools, serial: Register 0x2e8 ioport

We already register ioports for 0x2f8 and 0x3e8 and mark them as inactive, so
mark the 0x2e8 ioport as such as well.  This is a preparatory step to dropping
serial port dummy registrations from ioport__setup_legacy().

Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Sat, 21 May 2011 08:51:51 +0000 (11:51 +0300)]
kvm tools: Modify ioport to use interval rbtree

Currently the ioport implementation is based on a USHRT_MAX length
array of ptrs to ioport_operations.

Instead, use an interval rbtree to map the ioports to
ioport_operations.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Sat, 21 May 2011 08:51:50 +0000 (11:51 +0300)]
kvm tools: Fix rbtree-interval balancing

Augmentation was started on the pre-rotation node found in the search;
augment the rotated node instead.

Max high is the max of max highs below it, not the max of highs below it.
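
The second point can be illustrated with a small sketch. The node layout and the name `itv_update_max_high` are hypothetical (colour and parent links are left out); only the rule itself comes from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layout; colour and parent links are omitted for brevity. */
struct itv_node {
	uint64_t low, high;	/* the interval stored in this node */
	uint64_t max_high;	/* largest 'high' in the subtree rooted here */
	struct itv_node *left, *right;
};

/*
 * Recompute max_high from the node's own 'high' and the children's max_high
 * values (not their 'high' values, which would miss deeper descendants).
 */
void itv_update_max_high(struct itv_node *n)
{
	uint64_t max;

	assert(n != NULL);
	max = n->high;
	if (n->left && n->left->max_high > max)
		max = n->left->max_high;
	if (n->right && n->right->max_high > max)
		max = n->right->max_high;
	n->max_high = max;
}
```

If a grandchild holds the largest 'high', only the child's max_high carries that value upwards, which is why comparing against the child's plain 'high' gives a wrong result.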

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Fri, 20 May 2011 14:23:05 +0000 (17:23 +0300)]
kvm tools: Cleanup e820 code

Several cleanups in the patch:
 - Use kernel headers for e820 types and definitions.
 - A byte sized entry count for e820 entries was used;
this should be dword sized. Update in-memory layout and
bios code to fix it.
 - Use struct e820map to calculate offsets used by bios code.

Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Thu, 19 May 2011 07:45:27 +0000 (15:45 +0800)]
kvm tools: Update README

This patch updates:

- kernel configuration options
- kvm command line options
- authors

in README.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Fri, 20 May 2011 08:37:09 +0000 (11:37 +0300)]
kvm tools: Add virtio-9p

Overview:
9p allows for simple RPC based resource sharing over
different transports (in our case, virtio).

This is the implementation of (most of) the original
9p2000 protocol, without the .u or the .l extensions.

How to use:
1. Make sure kernel is compiled with:
    CONFIG_NET_9P=y
    CONFIG_NET_9P_VIRTIO=y
    CONFIG_NET_9P_DEBUG=y (At least until code is stable)
    CONFIG_9P_FS=y

2. Start KVM with '--virtio-9p <dirname>'. What happens now is that
a virtio transport with the name 'kvm_9p' is created. The server side
of the transport maps dirname to the root of the file system.

3. Within the guest, mount the fs:
mount -t 9p -otrans=virtio kvm_9p <local_dir> -oversion=9p2000
This will mount the 9p server to local_dir.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Fri, 20 May 2011 08:37:08 +0000 (11:37 +0300)]
kvm tools: Copy net/9p/9p.h

Header could not be included directly because among some minor
issues, the original header declared the same function twice:
int p9_errstr2errno(char *errstr, int len);
int p9_errstr2errno(char *, int);

A patch has been sent to 9P maintainers, this header should
be removed once the patch is in.

Until then, use a modified copy of the header.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Amerigo Wang [Fri, 20 May 2011 07:01:24 +0000 (15:01 +0800)]
kvm tools: implement "help xxx" command

'kvm run --help' works fine but 'kvm help run' shows nothing,
this patch implements it.

Acked-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Wed, 18 May 2011 19:56:24 +0000 (22:56 +0300)]
kvm tools: Default guest cpu count to host cpu count

If the user hasn't specified a cpu count for the guest, use
the number of online cpus on the host.
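
A minimal sketch of this defaulting logic, assuming the online cpu count is queried with sysconf(). The function name `guest_cpu_count` and the fallback to one cpu are assumptions for illustration:

```c
#include <assert.h>
#include <unistd.h>

/* 'requested' is the user-supplied count; 0 or less means "not specified". */
int guest_cpu_count(int requested)
{
	long online;

	if (requested > 0)
		return requested;

	/* _SC_NPROCESSORS_ONLN is a common extension (glibc, BSDs), not strict POSIX. */
	online = sysconf(_SC_NPROCESSORS_ONLN);
	if (online < 1)
		online = 1;	/* fall back to a single VCPU if the query fails */

	assert(online >= 1);
	return (int)online;
}
```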

Tested-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Thu, 19 May 2011 12:28:30 +0000 (15:28 +0300)]
kvm tools: Exit properly on SMP guests

When shutting down SMP guests only VCPU #0 will receive
a KVM_EXIT_SHUTDOWN. Waiting for all VCPU threads to exit
causes them to hang.

Instead, notify all VCPU threads once VCPU #0 thread is terminated
so they could also stop properly.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cyrill Gorcunov [Wed, 18 May 2011 19:40:51 +0000 (23:40 +0400)]
kvm tools: Fix alignment for mpf_intel table

Thomas and Asias reported that the kernel doesn't find MP
tables on a 32-bit host. This is because previously the
alignment was done on the address obtained from calloc,
missing the fact that MP tables are put into guest
memory *with* an offset, so the MP signature address should be
calculated keeping this offset in mind as well and
then aligned.
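
The difference can be sketched as below: the alignment is applied to the guest address (base plus offset) and then converted back into an offset within the host buffer. The names `align_up` and `mpf_offset` are hypothetical, and the 16-byte alignment is an assumption based on the MP floating pointer structure having to sit on a 16-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

#define MPF_ALIGN 16u	/* assumed alignment of the MP floating pointer */

/* Round 'addr' up to the next multiple of 'align' (a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * 'guest_base' is the guest physical address where the host buffer will be
 * copied, and 'used' is the number of bytes already filled in. The returned
 * value is the offset inside the buffer at which the structure must be placed
 * so that its guest address is aligned.
 */
uint64_t mpf_offset(uint64_t guest_base, uint64_t used)
{
	return align_up(guest_base + used, MPF_ALIGN) - guest_base;
}
```

Aligning only the calloc'ed host pointer gives no guarantee about the guest address, because the two address spaces differ by an arbitrary amount.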

Reported-by: Thomas Heil <heil@terminal-consulting.de>
Reported-by: Asias He <asias.hejun@gmail.com>
Tested-by: Thomas Heil <heil@terminal-consulting.de>
Tested-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Amerigo Wang [Thu, 19 May 2011 06:36:28 +0000 (14:36 +0800)]
kvm tools: build rbtree.o from source

Don't link the rbtree.o from kernel object tree, build
rbtree.o from source by ourselves.

Acked-by: Ingo Molnar <mingo@elte.hu>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Wed, 18 May 2011 19:52:37 +0000 (22:52 +0300)]
kvm tools: Move hardware drivers to hw directory

This patch moves hypervisor native hardware emulation drivers to "hw" directory
like we've done for virtio and disk image code.

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Wed, 18 May 2011 19:19:40 +0000 (22:19 +0300)]
kvm tools: Fail if passed initrd is not really an initrd

We recently changed the meaning of "-i" from disk image to initrd. This has
confused many users because kvm just reports:

  Fatal: mmap() failed.

if a disk image is passed as initrd. This patch fixes that by checking for the
first two ID bytes in initrd:

  $ ./kvm run -i ~/images/linux-0.2.qcow
    # kvm run -k ../../arch/x86/boot/bzImage -m 256 -c 1
    Fatal: /home/penberg/images/linux-0.2.qcow is not an initrd
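
A check of the first two ID bytes could look like the sketch below. It assumes the initrd is a gzip-compressed image, whose stream starts with the bytes 0x1f 0x8b; the message above does not say which ID bytes are tested, and the name `looks_like_gzip` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* gzip streams start with the ID bytes 0x1f 0x8b (RFC 1952). */
bool looks_like_gzip(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}
```

A caller would read the first two bytes of the file, run this check, and rewind the file before loading it.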

Reported-by: Thomas Heil <heil@terminal-consulting.de>
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cyrill Gorcunov [Wed, 18 May 2011 19:08:57 +0000 (22:08 +0300)]
kvm tools: Add conditional compilation of symbol resolving

Thomas reported that on some systems there might be no bfd
library installed. So we take the perf approach and check for
library presence at compilation time.

Reported-by: Thomas Heil <heil@terminal-consulting.de>
Tested-by: Thomas Heil <heil@terminal-consulting.de>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cyrill Gorcunov [Wed, 18 May 2011 18:59:46 +0000 (21:59 +0300)]
kvm tools: Prefix error() and friends helpers with pr_

To make them look more like the familiar kernel functions.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Wed, 18 May 2011 18:41:36 +0000 (21:41 +0300)]
kvm tools: Make host_ram_size() more robust

This patch fixes cryptic "out of memory" errors on hosts where sysconf() fails
by defaulting to MIN_RAM_SIZE_MB.

Reported-by: <born2befrag@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:15 +0000 (16:19 +0800)]
kvm tools: Print debug info for qcow1_nowrite_sector

Print debug info when we are in qcow1_nowrite_sector

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:14 +0000 (16:19 +0800)]
kvm tools: Add debug info for disk_image__{read, write}

Print debug info when read/write error occurs

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:13 +0000 (16:19 +0800)]
kvm tools: Do not use 'inline' for disk_image__flush

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:12 +0000 (16:19 +0800)]
kvm tools: Remove unnecessary S_ISBLK check

Let's do it in blkdev__probe.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:11 +0000 (16:19 +0800)]
kvm tools: Rename raw_image_ops to blk_dev_ops

This patch also adds some comments to disk/blk.c

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:10 +0000 (16:19 +0800)]
kvm tools: Rename struct disk_image_operations ops name for raw image

This patch renames:

raw_image__read_sector_ro_mmap to raw_image__read_sector
raw_image__write_sector_ro_mmap to raw_image__write_sector
raw_image__close_ro_mmap to raw_image__close

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:09 +0000 (16:19 +0800)]
kvm tools: Tune up ops in 'struct disk_image_operations'

Make read/write ops in 'struct disk_image_operations'
always return the number of bytes read/written and close/flush
ops return int.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:08 +0000 (16:19 +0800)]
kvm tools: Split blk device code from raw.c to blk.c

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:07 +0000 (16:19 +0800)]
kvm tools: Consolidate disk_image__{new, new_readonly}

This patch simplifies the disk image API.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:06 +0000 (16:19 +0800)]
kvm tools: Remove dead coe disk_image__{read, write}_sector

This code is not used anymore. Let's remove it.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:05 +0000 (16:19 +0800)]
kvm tools: Rename disk_image__{read, write}_sector_iov

This patch renames disk_image__{read, write}_sector_iov to
disk_image__{read, write}.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:04 +0000 (16:19 +0800)]
kvm tools: Split raw image and blk device code from disk/core.c

This patch moves raw image and blk device code into disk/raw.c

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:03 +0000 (16:19 +0800)]
kvm tools: Rename disk-image.c to core.c

This patch prepares for splitting disk-image.c into core.c,
blk.c and raw.c.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Wed, 18 May 2011 08:19:02 +0000 (16:19 +0800)]
kvm tools: Move disk image related code under disk directory

This patch moves disk-image.c and qcow.c under the disk directory.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Tue, 17 May 2011 15:17:12 +0000 (18:17 +0300)]
kvm tools: Fix includes for preadv/pwritev

"bornto befrag <born2befrag@gmail.com>" writes:

  > When i compile i kvm native tool tools/kvm && make i get this
  >
  >  CC       read-write.o
  > cc1: warnings being treated as errors
  > read-write.c: In function ‘xpreadv’:
  > read-write.c:255: error: implicit declaration of function ‘preadv’
  > read-write.c:255: error: nested extern declaration of ‘preadv’
  > read-write.c: In function ‘xpwritev’:
  > read-write.c:268: error: implicit declaration of function ‘pwritev’
  > read-write.c:268: error: nested extern declaration of ‘pwritev’
  > make: *** [read-write.o] Error 1

Fix that up by including <sys/uio.h> for preadv()/pwritev().
Reported-by: <born2befrag@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Tue, 17 May 2011 12:11:04 +0000 (15:11 +0300)]
kvm tools: Use virtio IDs from <linux/virtio_ids.h>

Instead of redefining virtio IDs in our headers, use IDs defined
in <linux/virtio_ids.h>.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Tue, 17 May 2011 12:08:00 +0000 (15:08 +0300)]
kvm tools: Add MMIO address mapper

When we have a MMIO exit, we need to find which device
has registered to use the accessed MMIO space.

The mapper maps ranges of guest physical addresses to
callback functions.

Implementation is based on an interval red-black tree.

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Tue, 17 May 2011 12:07:59 +0000 (15:07 +0300)]
kvm tools: Add interval red-black tree helper

An interval rb-tree allows storing interval ranges directly
and quickly looking up an overlap with a single point or a range.

The helper is based on the kernel rb-tree implementation
(located in <linux/rbtree.h>), which allows for the augmentation
of the classical rb-tree to be used as an interval tree.
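
A point lookup on such a tree can be sketched as below. This is a textbook interval-tree search on a plain binary search tree keyed on the low end, not the helper's actual code; the node layout and the name `itv_search` are hypothetical, and red-black balancing is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node: a plain binary search tree keyed on 'low'. */
struct itv_node {
	uint64_t low, high;	/* closed interval [low, high] */
	uint64_t max_high;	/* largest 'high' in this subtree */
	struct itv_node *left, *right;
};

/* Return a node whose interval contains 'point', or NULL if there is none. */
struct itv_node *itv_search(struct itv_node *node, uint64_t point)
{
	while (node) {
		assert(node->max_high >= node->high);
		if (point >= node->low && point <= node->high)
			return node;

		/* The left subtree can only help if some interval there reaches 'point'. */
		if (node->left && node->left->max_high >= point)
			node = node->left;
		else
			node = node->right;
	}
	return NULL;
}
```

The max_high field is what makes the lookup fast: a whole subtree is skipped whenever none of its intervals extends as far as the queried point.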

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Sun, 15 May 2011 10:22:39 +0000 (13:22 +0300)]
kvm tools: Return correct values from disk IOV functions

Currently read/write IOV functions return an incorrect
value instead of the number of bytes read/written.

This incorrect value may cause errors within the virtio layer.

Return the correct number of bytes read/written from _iov functions.
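
To illustrate the expected return convention, here is a small sketch of a vectored read that reports the total number of bytes transferred. It reads from an in-memory buffer standing in for a disk image; `mem_disk_readv` is a hypothetical name and not one of the functions touched by the patch.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Copy from an in-memory "disk" into the iovec array, starting at 'offset'.
 * Returns the total number of bytes copied, which may be short if the disk ends.
 */
ssize_t mem_disk_readv(const char *disk, size_t disk_size, size_t offset,
		       const struct iovec *iov, int iovcnt)
{
	ssize_t total = 0;

	assert(disk != NULL && iov != NULL);
	for (int i = 0; i < iovcnt && offset < disk_size; i++) {
		size_t n = iov[i].iov_len;

		if (n > disk_size - offset)
			n = disk_size - offset;	/* clamp at end of disk */
		memcpy(iov[i].iov_base, disk + offset, n);
		offset += n;
		total += (ssize_t)n;
	}
	return total;
}
```

A caller such as a virtio backend can then pass the returned byte count on as the length of the completed request.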

[ penberg@kernel.org: don't use 'inline' for out-of-line functions ]
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Sasha Levin [Fri, 13 May 2011 15:39:12 +0000 (18:39 +0300)]
kvm tools: Add boot test to checks

'make check' will now try booting a kernel and will exit
gracefully once the kernel has finished loading.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Prasad Joshi [Fri, 13 May 2011 14:02:46 +0000 (15:02 +0100)]
kvm tools: Add VIRTIO_BLK_T_FLUSH feature to handle flush operation from VM

The virtual machine calls 'sync' when the machine
is halted. Adding the virtio flush feature will
ensure that the data is synced to disk before
the virtual machine is halted. This is needed to
ensure the integrity of the data.

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Fri, 13 May 2011 02:40:09 +0000 (10:40 +0800)]
kvm tools: Tune the command-line option

With this patch we can have
-c --cpus
-m --mem
-d --disk
-k --kernel
-i --initrd
which is more consistent and easy to remember.

The patch also frees up the -s and -g options.

Ingo suggested
'''
 The debug options should probably be concentrated under a --debug option
 anyway, to allow things like:

  --debug single-step,ioport

 Even if the debug options are kept they should be streamlined along
 the same pattern:

 >>         --debug-single-step     Enable single stepping
 >>         --debug-ioport          Enable ioport debugging

 But having a --debug option that recognizes all the debug flags would
 be nicer.

 It would also allow future enhancements to group debug features, like:

   --debug all                # turn on everything and the kitchen sink for early hangs
   --debug all,-single-step   # turn on everything except single-step debugging
   --debug nonverbose         # turn on all non-noisy debug options we have

 Maybe even:

   --debug memcheck

 ... could run kvm under valgrind automatically - that way we can hide
 any secondary tool complexities from the user and turn those tools into
 simple debug options :-)
'''

Let's do this --debug option consolidation later.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Asias He [Fri, 13 May 2011 02:40:08 +0000 (10:40 +0800)]
kvm tools: Bring VIRTIO_BLK_F_SEG_MAX feature back to virtio blk

commit b764422bb0b46b00b896f6d4538ac3d3dde9e56b
(kvm tools: Add support for multiple virtio-blk)
stopped advertising the VIRTIO_BLK_F_SEG_MAX feature to the guest.

There is no reason we should not support it. Just bring it back.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Ingo Molnar [Fri, 13 May 2011 08:19:09 +0000 (10:19 +0200)]
kvm tools: Fix type mismatches on GCC 4.4 on 32-bit systems

The tools/kvm build still fails on 32-bit:

 cc1: warnings being treated as errors
 qcow.c: In function ‘qcow1_write_sector’:
 qcow.c:307: error: comparison between signed and unsigned integer expressions
 make: *** [qcow.o] Error 1
 make: *** Waiting for unfinished jobs....

using:

 gcc version 4.4.4 20100630 (Red Hat 4.4.4-10) (GCC)

The patch below addresses them.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Ingo Molnar [Thu, 12 May 2011 08:09:29 +0000 (10:09 +0200)]
kvm tools: Use standardized style for the virtio/net.c driver

I had a quick look at virtio/net.c and it still had quite a few style
inefficiencies, all of which are patterns which I pointed out before:

 - use short names for devices within the driver, so not 'net_device' but
   'ndev' - everyone hacking net.c knows that this is the network driver so
   'ndev' is a self-explanatory (and very short) term of art ...

 - use 'pci_header' instead of the ambiguous and misleading
   'virtio_net_pci_device' naming.

 - do not repeat 'net' in struct net_device fields! So rename ndev->net_config
   to ndev->config.

 - In the kernel we generally use _lock names for mutexes. This is conceptually
   more generic. So rename the net device mutexes accordingly.

 - group #include lines in a topical way instead of a random mess

 - fix vertical alignment mismatches

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Wed, 11 May 2011 16:14:39 +0000 (19:14 +0300)]
kvm tools: Lookup symbol based on RIP for 'kill -3'

To make debugging easier, look up the symbol from the guest kernel image based on RIP
when the user sends 'kill -3' to the hypervisor.

Example output looks as follows:

  Code:
  -----
  rip: [<ffffffff812cb3a0>] delay_loop+30 (/home/penberg/linux/arch/x86/lib/delay.c:32)

Cc: Asias He <asias.hejun@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Wed, 11 May 2011 18:43:14 +0000 (21:43 +0300)]
kvm tools: Fix typo in converting bytes to MBs

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pekka Enberg [Wed, 11 May 2011 18:23:03 +0000 (21:23 +0300)]
kvm tools: Fix read-only support in QCOW

If the user specifies a read-only image, make sure we never write to it.
Booting from a read-only image looks like this now:

  $ ./kvm run -i ~/images/linux-0.2.qcow2,ro

  [ snip ]
  [    1.250236] end_request: I/O error, dev vda, sector 32856
  [    1.252867] Buffer I/O error on device vda, logical block 16428
  [    1.255706] lost page write due to I/O error on vda
  [    1.258120] EXT4-fs (vda): previous I/O error to superblock detected
  [    1.261157] end_request: I/O error, dev vda, sector 2
  [    1.263333] Buffer I/O error on device vda, logical block 1
  [    1.264944] lost page write due to I/O error on vda
  [    1.266139] EXT4-fs (vda): re-mounted. Opts:
  [    1.284390] end_request: I/O error, dev vda, sector 35842
  [    1.285679] Buffer I/O error on device vda, logical block 17921
  [    1.287175] EXT4-fs warning (device vda): ext4_end_bio:259: I/O error writing to inode 3756 (offset 0 size 1024 starting block 17922)

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Cc: Asias He <asias.hejun@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Print out important command line options at startup
Pekka Enberg [Wed, 11 May 2011 18:00:01 +0000 (21:00 +0300)]
kvm tools: Print out important command line options at startup

It's important to know what the guest configuration looks like when debugging
issues so print out important command line options at startup:

  $ ./kvm run -k ../../arch/x86/boot/bzImage -m 512
    # kvm run -k ../../arch/x86/boot/bzImage -m 512 -c 1

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Asias He <asias.hejun@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use '-c' for '--cpus', not '--console'
Pekka Enberg [Wed, 11 May 2011 17:57:48 +0000 (20:57 +0300)]
kvm tools: Use '-c' for '--cpus', not '--console'

This patch changes the '-c' command line option to specify the number of CPUs
because it's used more often than console switching.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: QCOW code cleanups
Pekka Enberg [Wed, 11 May 2011 17:38:37 +0000 (20:38 +0300)]
kvm tools: QCOW code cleanups

Fix up coding style issues in QCOW code.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add QCOW write support
Prasad Joshi [Tue, 10 May 2011 14:43:30 +0000 (15:43 +0100)]
kvm tools: Add QCOW write support

The patch adds QCOW write support for both versions of QCOW.

The code is based on the QCOW image format specifications which are available on:

  http://people.gnome.org/~markmc/qcow-image-format-version-1.html

  http://people.gnome.org/~markmc/qcow-image-format.html

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use constants for commonly used mmap flags
Sasha Levin [Wed, 11 May 2011 16:52:57 +0000 (19:52 +0300)]
kvm tools: Use constants for commonly used mmap flags

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Rename 'self' variables
Sasha Levin [Wed, 11 May 2011 16:52:56 +0000 (19:52 +0300)]
kvm tools: Rename 'self' variables

Give proper names to vars named 'self'.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add debug() helper
Cyrill Gorcunov [Wed, 11 May 2011 16:10:51 +0000 (20:10 +0400)]
kvm tools: Add debug() helper

Useful for debugging. The patch also adds a "--debug" option so that
debug prints are shown only when the user asks for them.
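
A helper of this kind might look roughly like the sketch below. The flag and function names are hypothetical; the real helper may well be a macro with a different name:

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag; it would be set when --debug is given. */
static bool debug_enabled;

/* Print to stderr only if debugging was requested; returns chars written. */
static int debug_printf(const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!debug_enabled)
		return 0;

	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);

	return n;
}
```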

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use definitions from kernel headers
Sasha Levin [Wed, 11 May 2011 15:17:25 +0000 (18:17 +0300)]
kvm tools: Use definitions from kernel headers

Instead of redefining virtio pci constants (or not using them at all), use
constants from kernel header.

Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Prevent PFN wraparound
Sasha Levin [Wed, 11 May 2011 15:17:24 +0000 (18:17 +0300)]
kvm tools: Prevent PFN wraparound

queue->pfn may be used to point at addresses larger
than 32 bits.
Prevent a wraparound when shifting it left.
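
The wraparound can be illustrated with a short sketch, assuming a 32-bit PFN field and 4 KiB pages (the function names are made up for the example):

```c
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12 /* assumed 4 KiB guest pages */

/* Problematic form: the shift is done in 32 bits, so the top bits are lost. */
static uint64_t pfn_to_addr_wrapping(uint32_t pfn)
{
	return (uint32_t)(pfn << PAGE_SHIFT_SKETCH);
}

/* Safe form: widen to 64 bits first, then shift. */
static uint64_t pfn_to_addr(uint32_t pfn)
{
	return (uint64_t)pfn << PAGE_SHIFT_SKETCH;
}
```

For a PFN of 0x100000 (the page at 4 GiB) the first variant yields 0, while the second yields 0x100000000.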

Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add memory gap for larger RAM sizes
Sasha Levin [Wed, 11 May 2011 15:17:23 +0000 (18:17 +0300)]
kvm tools: Add memory gap for larger RAM sizes

e820 is expected to leave a memory gap within the low 32
bits of RAM space. From the documentation of e820_setup_gap():

  /*
   * Search for the biggest gap in the low 32 bits of the e820
   * memory space.  We pass this space to PCI to assign MMIO resources
   * for hotplug or unconfigured devices in.
   * Hopefully the BIOS let enough space left.
   */

Not leaving such a gap causes errors and hangs during the boot process.

This patch adds a memory gap between 0xe0000000 and 0x100000000 when using more
than 0xe0000000 bytes for guest RAM.

This patch updates the e820 table and the slot allocations used for
KVM_SET_USER_MEMORY_REGION.
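
The resulting layout can be sketched as follows. The gap boundaries are taken from the description above; the macro, struct and function names are assumptions for illustration only:

```c
#include <stdint.h>

#define GAP_START_SKETCH 0xe0000000ULL  /* start of the 32-bit gap */
#define GAP_END_SKETCH   0x100000000ULL /* RAM resumes at 4 GiB    */

/* Hypothetical description of one guest RAM region (one memory slot). */
struct ram_region_sketch {
	uint64_t guest_phys;
	uint64_t size;
};

/* Split guest RAM around the gap; returns the number of regions (1 or 2). */
static int split_guest_ram(uint64_t ram_size, struct ram_region_sketch out[2])
{
	if (ram_size <= GAP_START_SKETCH) {
		out[0] = (struct ram_region_sketch){ 0, ram_size };
		return 1;
	}

	out[0] = (struct ram_region_sketch){ 0, GAP_START_SKETCH };
	out[1] = (struct ram_region_sketch){ GAP_END_SKETCH,
					     ram_size - GAP_START_SKETCH };
	return 2;
}
```

Each returned region would correspond to one e820 RAM entry and one slot passed to KVM_SET_USER_MEMORY_REGION.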

Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Dump vCPUs in order
Ingo Molnar [Mon, 9 May 2011 07:45:32 +0000 (09:45 +0200)]
kvm tools: Dump vCPUs in order

* Ingo Molnar <mingo@elte.hu> wrote:

> The patch below addresses these concerns, serializes the output, tidies up the
> printout, resulting in this new output:

There's one bug remaining that my patch does not address: the vCPUs are not
printed in order:

# vCPU #0's dump:
# vCPU #2's dump:
# vCPU #24's dump:
# vCPU #5's dump:
# vCPU #39's dump:
# vCPU #38's dump:
# vCPU #51's dump:
# vCPU #11's dump:
# vCPU #10's dump:
# vCPU #12's dump:

This is undesirable as the order of printout is highly random, so successive
dumps are difficult to compare.

The patch below serializes the signalling itself. (this is on top of the
previous patch)

The patch also tweaks the vCPU printout line a bit so that it does not start
with '#', which is discarded if such messages are pasted into Git commit
messages.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix and improve the CPU register dump debug output code
Ingo Molnar [Mon, 9 May 2011 07:27:11 +0000 (09:27 +0200)]
kvm tools: Fix and improve the CPU register dump debug output code

* Pekka Enberg <penberg@kernel.org> wrote:

> Ingo Molnar reported that 'kill -3' didn't work on his machine:
>
>   * Ingo Molnar <mingo@elte.hu> wrote:
>
>   > This is really cumbersome to debug - is there some good way to get to the RIP
>   > that the guest is hanging in? If kvm would print that out to the host console
>   > (even if it's just the raw RIP initially) on a kill -3 that would help
>   > enormously.
>
>   Looks like the code should be doing that already - but the ioctl(KVM_GET_SREGS)
>   hangs:
>
>     [pid   748] ioctl(6, KVM_GET_SREGS
>
> Avi Kivity pointed out that it's not safe to call KVM_GET_SREGS (or other vcpu
> related ioctls) from other threads:
>
>   > is it not OK to call KVM_GET_SREGS from other threads than the one
>   > that's doing KVM_RUN?
>
>   From Documentation/kvm/api.txt:
>
>    - vcpu ioctls: These query and set attributes that control the operation
>      of a single virtual cpu.
>
>      Only run vcpu ioctls from the same thread that was used to create the
>      vcpu.
>
> Fix that up by using pthread_kill() to force the threads that are doing KVM_RUN
> to do the register dumps.
>
> Reported: Ingo Molnar <mingo@elte.hu>
> Cc: Asias He <asias.hejun@gmail.com>
> Cc: Avi Kivity <avi@redhat.com>
> Cc: Cyrill Gorcunov <gorcunov@gmail.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Prasad Joshi <prasadjoshi124@gmail.com>
> Cc: Sasha Levin <levinsasha928@gmail.com>
> Signed-off-by: Pekka Enberg <penberg@kernel.org>
> ---
>  tools/kvm/kvm-run.c |   20 +++++++++++++++++---
>  1 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/tools/kvm/kvm-run.c b/tools/kvm/kvm-run.c
> index eb50b6a..58e2977 100644
> --- a/tools/kvm/kvm-run.c
> +++ b/tools/kvm/kvm-run.c
> @@ -127,6 +127,18 @@ static const struct option options[] = {
>   OPT_END()
>  };
>
> +static void handle_sigusr1(int sig)
> +{
> + struct kvm_cpu *cpu = current_kvm_cpu;
> +
> + if (!cpu)
> + return;
> +
> + kvm_cpu__show_registers(cpu);
> + kvm_cpu__show_code(cpu);
> + kvm_cpu__show_page_tables(cpu);
> +}
> +
>  static void handle_sigquit(int sig)
>  {
>   int i;
> @@ -134,9 +146,10 @@ static void handle_sigquit(int sig)
>   for (i = 0; i < nrcpus; i++) {
>   struct kvm_cpu *cpu = kvm_cpus[i];
>
> - kvm_cpu__show_registers(cpu);
> - kvm_cpu__show_code(cpu);
> - kvm_cpu__show_page_tables(cpu);
> + if (!cpu)
> + continue;
> +
> + pthread_kill(cpu->thread, SIGUSR1);
>   }
>
>   serial8250__inject_sysrq(kvm);

i can see a couple of problems with the debug printout code, which currently
produces a stream of such dumps for each vcpu:

Registers:
 rip: 0000000000000000   rsp: 00000000000016ca flags: 0000000000010002
 rax: 0000000000000000   rbx: 0000000000000000   rcx: 0000000000000000
 rdx: 0000000000000000   rsi: 0000000000000000   rdi: 0000000000000000
 rbp: 0000000000008000   r8:  0000000000000000   r9:  0000000000000000
 r10: 0000000000000000   r11: 0000000000000000   r12: 0000000000000000
 r13: 0000000000000000   r14: 0000000000000000   r15: 0000000000000000
 cr0: 0000000060000010   cr2: 0000000000000070   cr3: 0000000000000000
 cr4: 0000000000000000   cr8: 0000000000000000
Segment registers:
 register  selector  base              limit     type  p dpl db s l g avl
 cs        f000      00000000000f0000  0000ffff  03    1 3   0  1 0 0 0
 ss        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 ds        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 es        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 fs        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 gs        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 tr        0000      0000000000000000  0000ffff  0b    1 0   0  0 0 0 0
 ldt       0000      0000000000000000  0000ffff  02    1 0   0  0 0 0 0
 gdt                 0000000000000000 0000ffff
 idt                 0000000000000000 0000ffff
 [ efer: 0000000000000000  apic base: 00000000fee00900  nmi: enabled ]
Interrupt bitmap:
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <cf> eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 f6 c4 0e 75 4b
Stack:
  0x000016ca: 00 00 00 00  00 00 00 00
  0x000016d2: 00 00 00 00  00 00 00 00
  0x000016da: 00 00 00 00  00 00 00 00
  0x000016e2: 00 00 00 00  00 00 00 00

The problems are:

 - This does not work very well on SMP with lots of vcpus, because the printing
   is unserialized, resulting in a jumbled mess of an output, all vcpus trying
   to print to the console at once, often mixing lines and characters randomly.

 - stdout from a signal handler must be flushed, otherwise lines can remain
   buffered if someone saves the output via 'tee' for example.

 - the dumps from the various CPUs are not distinguishable - they are just
   dumped after each other with no identification

 - the various printouts are rather hard to parse visually - it's not easy to see
   various properties "at a glance" because the dump is visually confusing.

The patch below addresses these concerns, serializes the output, tidies up the
printout, resulting in this new output:

#
# vCPU #0's dump:
#

 Registers:
 ----------
 rip: 0000000000000000   rsp: 00000000000008bc flags: 0000000000010002
 rax: 0000000000000000   rbx: 0000000000000000   rcx: 0000000000000000
 rdx: 0000000000000000   rsi: 0000000000000000   rdi: 0000000000000000
 rbp: 0000000000008000    r8: 0000000000000000    r9: 0000000000000000
 r10: 0000000000000000   r11: 0000000000000000   r12: 0000000000000000
 r13: 0000000000000000   r14: 0000000000000000   r15: 0000000000000000
 cr0: 0000000060000010   cr2: 0000000000000070   cr3: 0000000000000000
 cr4: 0000000000000000   cr8: 0000000000000000

 Segment registers:
 ------------------
 register  selector  base              limit     type  p dpl db s l g avl
 cs        f000      00000000000f0000  0000ffff  03    1 3   0  1 0 0 0
 ss        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 ds        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 es        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 fs        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 gs        1000      0000000000010000  0000ffff  03    1 3   0  1 0 0 0
 tr        0000      0000000000000000  0000ffff  0b    1 0   0  0 0 0 0
 ldt       0000      0000000000000000  0000ffff  02    1 0   0  0 0 0 0
 gdt                 0000000000000000  0000ffff
 idt                 0000000000000000  0000ffff

 APIC:
 -----
 efer: 0000000000000000  apic base: 00000000fee00900  nmi: enabled

 Interrupt bitmap:
 -----------------
 0000000000000000 0000000000000000 0000000000000000 0000000000000000

 Code:
 -----
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <cf> eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 f6 c4 0e 75 4b

 Stack:
 ------
  0x000008bc: 00 00 00 00  00 00 00 00
  0x000008c4: 00 00 00 00  00 00 00 00
  0x000008cc: 00 00 00 00  00 00 00 00
  0x000008d4: 00 00 00 00  00 00 00 00

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Add missing space after kernel params
Sasha Levin [Sun, 8 May 2011 18:58:04 +0000 (21:58 +0300)]
kvm tools: Add missing space after kernel params

Add missing space so that user-provided kernel params
will be properly concatenated to default params.

Instead of just adding a space at the end, add it with
a separate strcat(), since it's not the first (and wouldn't
have been the last) time a space wasn't added.
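
A minimal sketch of the idea, with the separator appended in its own step so that neither the default string nor the user string has to carry it. The function name and the buffer handling are assumptions; the original uses plain strcat():

```c
#include <stddef.h>
#include <string.h>

/* Build "<defaults> <user>"; dst_size must be at least 1. */
static void build_cmdline(char *dst, size_t dst_size,
			  const char *defaults, const char *user)
{
	dst[0] = '\0';
	strncat(dst, defaults, dst_size - 1);

	if (user && *user) {
		/* The separator is added on its own, never baked into a string. */
		strncat(dst, " ", dst_size - strlen(dst) - 1);
		strncat(dst, user, dst_size - strlen(dst) - 1);
	}
}
```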

Reported-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix virtio console hangs by removing IRQ injection for tx path
Asias He [Sun, 8 May 2011 13:09:25 +0000 (21:09 +0800)]
kvm tools: Fix virtio console hangs by removing IRQ injection for tx path

As virtio spec says:

"""
 Because this is high importance and low bandwidth, the current Linux
 implementation polls for the buffer to be used, rather than waiting
 for an interrupt, simplifying the implementation significantly.
"""

drivers/char/virtio_console.c

  send_buf() {
          ...
          /* Tell Host to go! */
          virtqueue_kick(out_vq);
          ...
          while (!virtqueue_get_buf(out_vq, &len))
                  cpu_relax();
          ...
  }

The console hangs can be reproduced simply by running the 'yes' command, which
generates a tremendous amount of console I/O and IRQs:

[   16.786440] irq 4: nobody cared (try booting with the "irqpoll" option)
[   16.786440] Pid: 1437, comm: yes Tainted: G        W 2.6.39-rc6+ #56
[   16.786440] Call Trace:
[   16.786440]  [<c16578eb>] __report_bad_irq+0x30/0x89
[   16.786440]  [<c10980e6>] note_interrupt+0x118/0x17a
[   16.786440]  [<c1096e7d>] handle_irq_event_percpu+0x168/0x179
[   16.786440]  [<c1096eba>] handle_irq_event+0x2c/0x46
[   16.786440]  [<c1098516>] ? unmask_irq+0x1e/0x1e
[   16.786440]  [<c1098566>] handle_level_irq+0x50/0x6e
[   16.786440]  <IRQ>  [<c102fa69>] ? do_IRQ+0x35/0x7f
[   16.786440]  [<c1665ea9>] ? common_interrupt+0x29/0x30
[   16.786440]  [<c16610d6>] ? _raw_spin_unlock_irqrestore+0x7/0x28
[   16.786440]  [<c1364f65>] ? hvc_write+0x88/0x9e
[   16.786440]  [<c1355500>] ? do_output_char+0x88/0x18a
[   16.786440]  [<c1355631>] ? process_output+0x2f/0x42
[   16.786440]  [<c1355af6>] ? n_tty_write+0x211/0x2dc
[   16.786440]  [<c1059d77>] ? try_to_wake_up+0x226/0x226
[   16.786440]  [<c13534a4>] ? tty_write+0x15e/0x1d1
[   16.786440]  [<c12c1644>] ? security_file_permission+0x22/0x26
[   16.786440]  [<c13558e5>] ? process_echoes+0x241/0x241
[   16.786440]  [<c10dd9d2>] ? vfs_write+0x84/0xd7
[   16.786440]  [<c1353346>] ? tty_write_lock+0x3d/0x3d
[   16.786440]  [<c10ddb92>] ? sys_write+0x3b/0x5d
[   16.786440]  [<c166594c>] ? sysenter_do_call+0x12/0x22
[   16.786440] handlers:
[   16.786440] [<c1351397>] (vp_interrupt+0x0/0x3a)
[   16.786440] Disabling IRQ #4

Tested-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio rng
Asias He [Sun, 8 May 2011 13:09:24 +0000 (21:09 +0800)]
kvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio rng

This patch uses the IRQ injection mechanism introduced by
virt_queue__trigger_irq(), which respects virtio IRQ status
and VRING_AVAIL_F_NO_INTERRUPT.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio blk
Asias He [Sun, 8 May 2011 13:09:23 +0000 (21:09 +0800)]
kvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio blk

This patch uses the IRQ injection mechanism introduced by
virt_queue__trigger_irq(), which respects virtio IRQ status
and VRING_AVAIL_F_NO_INTERRUPT.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio console
Asias He [Sun, 8 May 2011 13:09:22 +0000 (21:09 +0800)]
kvm tools: Use virt_queue__trigger_irq() to trigger IRQ for virtio console

This patch uses the IRQ injection mechanism introduced by
virt_queue__trigger_irq(), which respects virtio IRQ status
and VRING_AVAIL_F_NO_INTERRUPT.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix 'kill -3' hangs
Pekka Enberg [Sun, 8 May 2011 09:56:04 +0000 (12:56 +0300)]
kvm tools: Fix 'kill -3' hangs

Ingo Molnar reported that 'kill -3' didn't work on his machine:

  * Ingo Molnar <mingo@elte.hu> wrote:

  > This is really cumbersome to debug - is there some good way to get to the RIP
  > that the guest is hanging in? If kvm would print that out to the host console
  > (even if it's just the raw RIP initially) on a kill -3 that would help
  > enormously.

  Looks like the code should be doing that already - but the ioctl(KVM_GET_SREGS)
  hangs:

    [pid   748] ioctl(6, KVM_GET_SREGS

Avi Kivity pointed out that it's not safe to call KVM_GET_SREGS (or other vcpu
related ioctls) from other threads:

  > is it not OK to call KVM_GET_SREGS from other threads than the one
  > that's doing KVM_RUN?

  From Documentation/kvm/api.txt:

   - vcpu ioctls: These query and set attributes that control the operation
     of a single virtual cpu.

     Only run vcpu ioctls from the same thread that was used to create the
     vcpu.

Fix that up by using pthread_kill() to force the threads that are doing KVM_RUN
to do the register dumps.

Reported: Ingo Molnar <mingo@elte.hu>
Cc: Asias He <asias.hejun@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Enable earlyprintk=serial by default
Ingo Molnar [Sun, 8 May 2011 07:39:34 +0000 (09:39 +0200)]
kvm tools: Enable earlyprintk=serial by default

Enable the earlyprintk console to the serial port, to allow the debugging of
very early hangs/crashes.

Since we already enable the serial console by default, this is a natural
extension of it.

I have tested that it indeed works, by provoking an early hang that triggers
after the early console is enabled but before the real console is registered. In
that case before the patch we get:

  $ ./kvm run --cpus 2
  [ silent hang ]

With this patch applied i got the early output:

 $ ./kvm run --cpus 60
 [    0.000000] console [earlyser0] enabled
 [    0.000000] Initializing cgroup subsys cpu
 [    0.000000] Linux version 2.6.39-rc6-tip-02944-g87b0bcf-dirty (mingo@aldebaran) (gcc version 4.6.0 20110419 (Red Hat 4.6.0-5) (GCC) ) #84 SMP Mon May 9 02:34:26 CEST 2011
 [    0.000000] Command line: notsc noapic noacpi pci=conf1 console=ttyS0 earlyprintk=serialroot=/dev/vda1 rw
 [    0.000000] locking up the box!

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Warn if guest RAM size exceeds host RAM size
Pekka Enberg [Sat, 7 May 2011 19:07:15 +0000 (22:07 +0300)]
kvm tools: Warn if guest RAM size exceeds host RAM size

Guest memory size that's larger than host physical RAM can cause swap deaths on
the host so warn the user about it.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Don't use all of host RAM for guests by default
Pekka Enberg [Sat, 7 May 2011 19:01:54 +0000 (22:01 +0300)]
kvm tools: Don't use all of host RAM for guests by default

This patch caps the default guest RAM size at 80% of the host RAM to
avoid swapping the host to death.
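
The cap itself is simple arithmetic. A minimal sketch, assuming both sizes are given in the same unit (the function name is hypothetical):

```c
#include <stdint.h>

/* Limit the requested default guest RAM to 80% of host RAM. */
static uint64_t cap_guest_ram(uint64_t wanted, uint64_t host_ram)
{
	/* Divide first so that very large host sizes cannot overflow. */
	uint64_t limit = host_ram / 5 * 4;

	return wanted > limit ? limit : wanted;
}
```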

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix up mtable srcbusirq assignment for PCI devices
Cyrill Gorcunov [Sat, 7 May 2011 15:02:58 +0000 (19:02 +0400)]
kvm tools: Fix up mtable srcbusirq assignment for PCI devices

The kernel expects srcbusirq to follow the MP specification and to consist
of a tuple of the PCI device number with the pin encoded. Make it so; otherwise
the kernel reports that a "buggy MP table" was found.
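
As far as the MP specification describes it, the source bus IRQ field of a PCI interrupt entry carries the interrupt pin in bits 0-1 (0 for INTA#) and the PCI device number in bits 2-6. A sketch of that encoding, with a hypothetical helper name and a 1-based pin as found in PCI config space:

```c
#include <stdint.h>

/*
 * device: PCI device number (0..31)
 * pin:    1 = INTA#, 2 = INTB#, 3 = INTC#, 4 = INTD#
 */
static uint8_t mp_pci_srcbusirq(uint8_t device, uint8_t pin)
{
	return (uint8_t)(((device & 0x1f) << 2) | ((pin - 1) & 0x3));
}
```

For example, device 1 on INTA# encodes as 4 and device 3 on INTB# as 13.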

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix up PCI pin assignment to conform specification
Cyrill Gorcunov [Sat, 7 May 2011 15:02:57 +0000 (19:02 +0400)]
kvm tools: Fix up PCI pin assignment to conform specification

Only 4 pins are allowed for every PCI compliant device as per PCI 2.2 spec
Section 2.2.6 ("Interrupt Pins"). Multifunctional devices can use up to all
INTA#,B#,C#,D# pins; for our single function devices, pin INTA# is enough.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Limit CPU count by KVM_CAP_NR_VCPUS
Pekka Enberg [Sat, 7 May 2011 14:37:28 +0000 (17:37 +0300)]
kvm tools: Limit CPU count by KVM_CAP_NR_VCPUS

This patch limits the number of CPUs to KVM_CAP_NR_VCPUS when user specifies
more CPUs with the "--cpus=N" command line option than what the in-kernel KVM
is able to handle.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Rename pci_device to pci_hdr for clarity
Sasha Levin [Sat, 7 May 2011 10:50:45 +0000 (13:50 +0300)]
kvm tools: Rename pci_device to pci_hdr for clarity

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Scale guest RAM size by CPU count
Pekka Enberg [Sat, 7 May 2011 14:18:14 +0000 (17:18 +0300)]
kvm tools: Scale guest RAM size by CPU count

This patch increases the default RAM size to 256 MB for one CPU and introduces
linear scaling of RAM size based on CPU count, as suggested by Ingo Molnar:

     64MB*(nr_cpus + 3)

     ------------------
       1 CPUs:   256 MB
       2 CPUs:   320 MB
       3 CPUs:   384 MB
       4 CPUs:   448 MB
       5 CPUs:   512 MB
       6 CPUs:   576 MB
       7 CPUs:   640 MB
       8 CPUs:   704 MB
       9 CPUs:   768 MB
      10 CPUs:   832 MB
      11 CPUs:   896 MB
      12 CPUs:   960 MB
      13 CPUs:  1024 MB
      14 CPUs:  1088 MB
      15 CPUs:  1152 MB
      16 CPUs:  1216 MB
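
The table above follows directly from the formula; as a one-line sketch (the function name is hypothetical):

```c
#include <stdint.h>

/* Default guest RAM in MB: 64MB * (nr_cpus + 3). */
static uint64_t default_ram_size_mb(unsigned int nr_cpus)
{
	return 64ULL * (nr_cpus + 3);
}
```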

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Convert virtio devices to use IRQ registry
Sasha Levin [Fri, 6 May 2011 11:24:12 +0000 (14:24 +0300)]
kvm tools: Convert virtio devices to use IRQ registry

Instead of using static IRQ/device data, register the device
upon initialization and use the assigned parameters when issuing
IRQs.

Clean up static definitions of IRQs.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Dynamically add devices when creating mptable
Sasha Levin [Fri, 6 May 2011 11:24:11 +0000 (14:24 +0300)]
kvm tools: Dynamically add devices when creating mptable

Enumerate registered devices to build a complete
and updated mptable containing all registered pci
devices.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Introduce IRQ registry
Sasha Levin [Fri, 6 May 2011 11:24:10 +0000 (14:24 +0300)]
kvm tools: Introduce IRQ registry

Instead of having static definitions of devices, use a
dynamic registry of pci devices.

The structure is a rbtree which holds device types (net,
blk, etc). Each device entry holds a list of IRQ lines
associated with that device (pin).

Devices dynamically register upon initialization, and receive
a set of: device id, irq pin and irq line.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Simplify search for root device
Sasha Levin [Fri, 6 May 2011 07:26:35 +0000 (10:26 +0300)]
kvm tools: Simplify search for root device

Use /dev/block to find the block device used for root
instead of searching through mounts.
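
One way such a lookup can work is sketched below: stat() the root directory, take the device number of the filesystem it lives on, and form the matching /dev/block/MAJOR:MINOR path. This is an assumption about the approach, not the actual patch, and the function names are made up:

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Format "/dev/block/MAJOR:MINOR" for a device number; 0 on success. */
static int block_dev_path(dev_t dev, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/dev/block/%u:%u", major(dev), minor(dev));

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Find the /dev/block entry for the device that holds "/". */
static int root_dev_path(char *buf, size_t len)
{
	struct stat st;

	if (stat("/", &st) < 0)
		return -1;

	return block_dev_path(st.st_dev, buf, len);
}
```

No parsing of the mount table is needed; the kernel already reports the device number through stat().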

Tested-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Enable SMP support
Sasha Levin [Fri, 6 May 2011 23:51:11 +0000 (02:51 +0300)]
kvm tools: Enable SMP support

This patch enables SMP support:

[    0.155072] Brought up 3 CPUs
[    0.155074] Total of 3 processors activated (15158.58 BogoMIPS).

virtio-console was being loaded no matter the cmdline options
and it was causing some hangs (have to look into that).

I'll send this patch to a larger audience once someone
else can confirm it actually works/doesn't work.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: fix a memory leak in qcow2_read_header
Prasad Joshi [Fri, 6 May 2011 16:39:46 +0000 (17:39 +0100)]
kvm tools: fix a memory leak in qcow2_read_header

Free the memory allocated for qcow_header if the header read operation fails.
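
The pattern is the usual free-on-error-path one. A minimal sketch with a made-up header struct and function name (the real qcow2_read_header() differs):

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, heavily simplified header layout. */
struct header_sketch {
	uint32_t magic;
	uint32_t version;
};

/* Returns a heap-allocated header, or NULL on failure without leaking. */
static struct header_sketch *read_header_sketch(int fd)
{
	struct header_sketch *hdr = malloc(sizeof(*hdr));

	if (!hdr)
		return NULL;

	if (pread(fd, hdr, sizeof(*hdr), 0) != (ssize_t)sizeof(*hdr)) {
		free(hdr); /* this is the step that was missing */
		return NULL;
	}

	return hdr;
}
```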

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Respect VRING_AVAIL_F_NO_INTERRUPT
Asias He [Sat, 7 May 2011 02:34:20 +0000 (10:34 +0800)]
kvm tools: Respect VRING_AVAIL_F_NO_INTERRUPT

Do not inject an IRQ when the guest suppresses it.

This can reduce IRQ injection further and bumps
host to guest bandwidth to 6178.78 Mbps (cpu 63.96%).
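
The check amounts to testing one flag bit in the available ring that the guest driver owns. A minimal sketch; the struct here is a cut-down stand-in for the real vring layout, and the flag value is defined locally as in linux/virtio_ring.h:

```c
#include <stdbool.h>
#include <stdint.h>

#ifndef VRING_AVAIL_F_NO_INTERRUPT
#define VRING_AVAIL_F_NO_INTERRUPT 1 /* value used by linux/virtio_ring.h */
#endif

/* Cut-down stand-in for the head of the available ring. */
struct avail_ring_sketch {
	uint16_t flags;
	uint16_t idx;
};

/* True if the guest has not asked the device to suppress interrupts. */
static bool guest_wants_irq(const struct avail_ring_sketch *avail)
{
	return !(avail->flags & VRING_AVAIL_F_NO_INTERRUPT);
}
```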

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Respect ISR status in virtio header
Asias He [Sat, 7 May 2011 02:34:19 +0000 (10:34 +0800)]
kvm tools: Respect ISR status in virtio header

Inject an IRQ into the guest only when the ISR status is low, which means
the guest has read the ISR status and the device has cleared this bit as
a side effect of that read.

This avoids a lot of unnecessary IRQ injections from the device to the
guest.

Netperf test shows this patch changes:

the host to guest bandwidth
from 2866.27 Mbps (cpu 33.96%) to 5548.87 Mbps (cpu 53.87%),

the guest to host bandwidth
from 1408.86 Mbps (cpu 99.9%) to 1301.29 Mbps (cpu 99.9%).

The bottleneck of the guest to host bandwidth is guest cpu power.
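
The ISR gating described in this patch can be sketched as below. All names are hypothetical, and a counter stands in for actually raising the interrupt line:

```c
#include <stdbool.h>
#include <stdint.h>

#define ISR_QUEUE_SKETCH 0x1 /* assumed: bit 0 signals a queue interrupt */

/* Hypothetical per-device state. */
struct virtio_dev_sketch {
	uint8_t      isr;
	unsigned int irqs_injected; /* stands in for raising the IRQ line */
};

/* Inject only if the guest has consumed the previous interrupt. */
static bool maybe_inject_irq(struct virtio_dev_sketch *dev)
{
	if (dev->isr & ISR_QUEUE_SKETCH)
		return false; /* still pending, guest has not read ISR yet */

	dev->isr |= ISR_QUEUE_SKETCH;
	dev->irqs_injected++;
	return true;
}

/* Guest reads the ISR register; the read clears it as a side effect. */
static uint8_t guest_read_isr(struct virtio_dev_sketch *dev)
{
	uint8_t val = dev->isr;

	dev->isr = 0;
	return val;
}
```

Back-to-back completions therefore collapse into a single interrupt until the guest acknowledges it by reading the ISR.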

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Gather Virtio-PCI constants into one place
Cyrill Gorcunov [Thu, 5 May 2011 19:06:40 +0000 (23:06 +0400)]
kvm tools: Gather Virtio-PCI constants into one place

It's better than having them sprinkled across .c files. Note
that the pin for the ring device is changed so it is no longer shared
with the block device (this is done for the sake of simplicity).

Also, the comment style is tuned up a bit in virtio-pci.h
just to be consistent.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Fix loading root device as image
Sasha Levin [Thu, 5 May 2011 19:53:32 +0000 (22:53 +0300)]
kvm tools: Fix loading root device as image

Fix the loading of root device when no image name was
specified.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: Cleanup virtio code some more
Pekka Enberg [Thu, 5 May 2011 19:45:49 +0000 (22:45 +0300)]
kvm tools: Cleanup virtio code some more

This patch cleans up some more style problems in virtio code.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
13 years agokvm tools: virtio-rng code cleanup
Sasha Levin [Thu, 5 May 2011 19:16:54 +0000 (22:16 +0300)]
kvm tools: virtio-rng code cleanup

Clean up coding style and naming within virtio-rng.

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>