git.karo-electronics.de Git - linux-beck.git/log
linux-beck.git
16 years agoKVM: Apply the kernel sigmask to vcpus blocked due to being uninitialized
Avi Kivity [Sun, 6 Jul 2008 12:48:31 +0000 (15:48 +0300)]
KVM: Apply the kernel sigmask to vcpus blocked due to being uninitialized

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: VMX: Add ept_sync_context in flush_tlb
Sheng Yang [Sun, 6 Jul 2008 11:16:51 +0000 (19:16 +0800)]
KVM: VMX: Add ept_sync_context in flush_tlb

Fix a potential issue caused by kvm_mmu_slot_remove_write_access(). The
old behavior doesn't sync the EPT TLB with the modified EPT entry, which
results in inconsistent contents of the EPT TLB and the EPT table.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held
Marcelo Tosatti [Thu, 3 Jul 2008 21:33:02 +0000 (18:33 -0300)]
KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held

kvm_mmu_zap_page() needs slots lock held (rmap_remove->gfn_to_memslot,
for example).

Since kvm_lock spinlock is held in mmu_shrink(), do a non-blocking
down_read_trylock().

Untested.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agox86: KVM guest: make kvm_smp_prepare_boot_cpu() static
Adrian Bunk [Mon, 30 Jun 2008 22:19:19 +0000 (01:19 +0300)]
x86: KVM guest: make kvm_smp_prepare_boot_cpu() static

This patch makes the needlessly global kvm_smp_prepare_boot_cpu() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: SVM: fix suspend/resume support
Joerg Roedel [Wed, 2 Jul 2008 14:02:11 +0000 (16:02 +0200)]
KVM: SVM: fix suspend/resume support

On suspend, the svm_hardware_disable function is called, which frees all
svm_data variables. On resume they are not re-allocated. This patch moves the
deallocation of svm_data from the hardware_disable function to the
hardware_unsetup function, which is not called on suspend.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: s390: rename private structures
Christian Borntraeger [Fri, 27 Jun 2008 13:05:40 +0000 (15:05 +0200)]
KVM: s390: rename private structures

While doing some tests with our lcrash implementation I have seen a
naming conflict with prefix_info in kvm_host.h vs. addrconf.h.

To avoid future conflicts, let's rename the private definitions in
asm/kvm_host.h by adding the kvm_s390 prefix.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: s390: Set guest storage limit and offset to sane values
Christian Borntraeger [Fri, 27 Jun 2008 13:05:38 +0000 (15:05 +0200)]
KVM: s390: Set guest storage limit and offset to sane values

Some machines do not accept 16EB as guest storage limit. Let's change the
default for the guest storage limit to a sane value. We also should set
the guest_origin to what userspace thinks it is. This allows guests
starting at an address != 0.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Fix memory leak on guest exit
Carsten Otte [Fri, 27 Jun 2008 13:05:34 +0000 (15:05 +0200)]
KVM: Fix memory leak on guest exit

This patch fixes a memory leak: we want to free the physmem when destroying
the vm.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: s390: dont allocate dirty bitmap
Carsten Otte [Fri, 27 Jun 2008 13:05:31 +0000 (15:05 +0200)]
KVM: s390: dont allocate dirty bitmap

This patch #ifdefs the bitmap array for dirty tracking. We don't have dirty
tracking on s390 today, and we'd love to use our storage keys to store the
dirty information for migration. Therefore, we won't need this array at all,
and given our limited vmalloc space, allocating it limits the number of guests
we can run.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: move slots_lock acquision down to vapic_exit
Marcelo Tosatti [Mon, 23 Jun 2008 15:04:25 +0000 (12:04 -0300)]
KVM: move slots_lock acquision down to vapic_exit

There is no need to grab slots_lock if the vapic_page will not
be touched.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: VMX: Fake emulate Intel perfctr MSRs
Chris Lalancette [Fri, 20 Jun 2008 07:51:30 +0000 (09:51 +0200)]
KVM: VMX: Fake emulate Intel perfctr MSRs

Older Linux guests (in this case, 2.6.9) can attempt to
access the performance counter MSRs without a fixup section, and injecting
a GPF kills the guest.  Work around by allowing the guest to write those MSRs.

Tested by me on RHEL-4 i386 and x86_64 guests, as well as F-9 guests.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: VMX: Fix a wrong usage of vmcs_config
Sheng Yang [Wed, 18 Jun 2008 06:43:38 +0000 (14:43 +0800)]
KVM: VMX: Fix a wrong usage of vmcs_config

The function ept_update_paging_mode_cr0() writes to
CPU_BASED_VM_EXEC_CONTROL based on vmcs_config.cpu_based_exec_ctrl. That's
wrong because the variable may not be consistent with the content in the
CPU_BASED_VM_EXEC_CONTROL MSR.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Fix printk format
Avi Kivity [Sun, 22 Jun 2008 13:46:22 +0000 (16:46 +0300)]
KVM: MMU: Fix printk format

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: When debug is enabled, make it a run-time parameter
Avi Kivity [Sun, 22 Jun 2008 13:45:24 +0000 (16:45 +0300)]
KVM: MMU: When debug is enabled, make it a run-time parameter

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: lazily evaluate segment registers
Avi Kivity [Sun, 22 Jun 2008 13:22:51 +0000 (16:22 +0300)]
KVM: x86 emulator: lazily evaluate segment registers

Instead of prefetching all segment bases before emulation, read them at the
last moment.  Since most of them are unneeded, we save some cycles on
Intel machines where this is a bit expensive.

Signed-off-by: Avi Kivity <avi@qumranet.com>
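
As a rough illustration of the lazy evaluation described in the commit above (a behavioural model, not the emulator's code), the Python sketch below reads a segment base only when an instruction first needs it. The class name SegmentBases, the read_base callback and the caching of results are all assumptions made for this example.

```python
class SegmentBases:
    """Fetch segment bases on demand instead of prefetching all of them."""

    def __init__(self, read_base):
        # read_base is a hypothetical callback standing in for the costly read.
        self._read_base = read_base
        self._cache = {}

    def base(self, seg):
        # Only the first use of a segment pays for the read.
        if seg not in self._cache:
            self._cache[seg] = self._read_base(seg)
        return self._cache[seg]


reads = []

def slow_read(seg):
    reads.append(seg)  # record each costly read
    return {"cs": 0x0, "ds": 0x1000, "ss": 0x2000}[seg]

bases = SegmentBases(slow_read)
bases.base("ds")
bases.base("ds")
print(reads)  # a single read was issued, for "ds" only
```

Segments that an instruction never references are never read, which is where the saving comes from.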
16 years agoKVM: x86 emulator: avoid segment base adjust for lea
Avi Kivity [Mon, 16 Jun 2008 05:45:54 +0000 (22:45 -0700)]
KVM: x86 emulator: avoid segment base adjust for lea

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: simplify rip relative decoding
Avi Kivity [Mon, 16 Jun 2008 05:09:11 +0000 (22:09 -0700)]
KVM: x86 emulator: simplify rip relative decoding

rip relative decoding is relative to the instruction pointer of the next
instruction; by moving address adjustment until after decoding is complete,
we remove the need to determine the instruction size.

Signed-off-by: Avi Kivity <avi@qumranet.com>
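
The address arithmetic behind the commit above can be sketched as follows. This is a minimal model, assuming a 32-bit signed displacement and 64-bit wrap-around; the function name rip_relative_target is hypothetical. The point is that the adjustment needs the address of the next instruction, which is known for free once decoding has finished.

```python
def rip_relative_target(insn_start: int, insn_len: int, disp32: int) -> int:
    """Resolve a RIP-relative operand after decoding has completed."""
    # Interpret the raw 32-bit displacement as a signed value.
    disp = disp32 - (1 << 32) if disp32 & 0x80000000 else disp32
    # RIP-relative addressing is relative to the *next* instruction.
    next_rip = insn_start + insn_len
    return (next_rip + disp) & 0xFFFFFFFFFFFFFFFF


# A 7-byte instruction at 0x1000 with displacement 0x10 refers to 0x1017.
print(hex(rip_relative_target(0x1000, 7, 0x10)))
```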
16 years agoKVM: x86 emulator: simplify r/m decoding
Avi Kivity [Mon, 16 Jun 2008 04:53:26 +0000 (21:53 -0700)]
KVM: x86 emulator: simplify r/m decoding

Consolidate the duplicated code when not in any special case.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: simplify sib decoding
Avi Kivity [Mon, 16 Jun 2008 04:23:17 +0000 (21:23 -0700)]
KVM: x86 emulator: simplify sib decoding

Instead of using sparse switches, use simpler if/else sequences.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: handle undecoded rex.b with r/m = 5 in certain cases
Avi Kivity [Mon, 16 Jun 2008 04:13:41 +0000 (21:13 -0700)]
KVM: x86 emulator: handle undecoded rex.b with r/m = 5 in certain cases

x86_64 does not decode rex.b in certain cases, where the r/m field = 5.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: emulate nop and xchg reg, acc (opcodes 0x90 - 0x97)
Mohammed Gamal [Sun, 15 Jun 2008 16:37:38 +0000 (19:37 +0300)]
KVM: x86 emulator: emulate nop and xchg reg, acc (opcodes 0x90 - 0x97)

Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Use printk_rlimit() instead of reporting emulation failures just once
Avi Kivity [Fri, 13 Jun 2008 19:45:42 +0000 (22:45 +0300)]
KVM: Use printk_rlimit() instead of reporting emulation failures just once

Emulation failure reports are useful, so allow more than one over the lifetime
of the module.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Support mixed endian machines
Tan, Li [Fri, 23 May 2008 06:54:09 +0000 (14:54 +0800)]
KVM: Support mixed endian machines

Currently kvmtrace is not portable. This prevents copying a trace file
from a big-endian target to a little-endian workstation for analysis.
With this patch, the kernel outputs metadata containing a magic number to the
trace log, and changes 64-bit words to be u64 instead of a pair of u32s.

Signed-off-by: Tan Li <li.tan@intel.com>
Acked-by: Jerone Young <jyoung5@us.ibm.com>
Acked-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
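
To show why a magic number in the metadata makes a trace portable, here is a minimal sketch in Python. The magic value, the record layout (one 32-bit magic followed by 64-bit words) and the function names are assumptions for illustration only and do not reflect the real kvmtrace format.

```python
import struct

TRACE_MAGIC = 0x4B565452  # hypothetical magic value, chosen for this sketch only


def write_trace(records, byteorder):
    """Serialise 64-bit records in the producer's native byte order."""
    prefix = "<" if byteorder == "little" else ">"
    out = struct.pack(prefix + "I", TRACE_MAGIC)
    for value in records:
        out += struct.pack(prefix + "Q", value)
    return out


def read_trace(blob):
    """Detect the producer's byte order from the magic, then decode."""
    for prefix in ("<", ">"):
        (magic,) = struct.unpack_from(prefix + "I", blob, 0)
        if magic == TRACE_MAGIC:
            break
    else:
        raise ValueError("unrecognised trace file")
    count = (len(blob) - 4) // 8
    return list(struct.unpack_from(f"{prefix}{count}Q", blob, 4))
```

A reader on either kind of host decodes the same values, because the byte order is taken from the file rather than from the machine doing the analysis. Storing each 64-bit word as a single u64 matters for the same reason: a pair of u32s would also need its halves put back in the right order.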
16 years agoKVM: Do not calculate linear rip in emulation failure report
Glauber Costa [Tue, 10 Jun 2008 13:46:53 +0000 (10:46 -0300)]
KVM: Do not calculate linear rip in emulation failure report

If we're not gonna do anything (case in which failure is already
reported), we do not need to even bother with calculating the linear rip.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: only abort guest entry if timer count goes from 0->1
Marcelo Tosatti [Wed, 11 Jun 2008 22:52:53 +0000 (19:52 -0300)]
KVM: only abort guest entry if timer count goes from 0->1

Only abort guest entry if the timer count went from 0->1, since for 1->2
or larger the bit will either be set already or a timer irq will have
been injected.

Using atomic_inc_and_test() for it also introduces an SMP barrier
to the LAPIC version (thought it was unnecessary because of timer
migration, but guest can be scheduled to a different pCPU between exit
and kvm_vcpu_block(), so there is the possibility for a race).

Noticed by Avi.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
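
The 0->1 rule from the commit above can be modelled in a few lines of Python. This is only a sketch: the class PendingTimer is hypothetical, and a lock stands in for the kernel's atomic operation.

```python
import threading


class PendingTimer:
    """Count pending timer expiries; signal only on the 0 -> 1 transition."""

    def __init__(self):
        self._pending = 0
        self._lock = threading.Lock()  # stands in for an atomic increment

    def tick(self):
        """Record one expiry. Return True if guest entry should be aborted."""
        with self._lock:
            self._pending += 1
            # For 1 -> 2 or larger, the earlier tick has already signalled.
            return self._pending == 1

    def deliver(self):
        """Consume one pending expiry, if any."""
        with self._lock:
            if self._pending:
                self._pending -= 1
```

Only the first tick after the count has drained back to zero returns True; later ticks pile up silently until they are delivered.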
16 years agoKVM: Add coalesced MMIO support (ia64 part)
Laurent Vivier [Fri, 30 May 2008 14:05:57 +0000 (16:05 +0200)]
KVM: Add coalesced MMIO support (ia64 part)

This patch enables coalesced MMIO for ia64 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

[akpm: fix compile error on ia64]

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Add coalesced MMIO support (powerpc part)
Laurent Vivier [Fri, 30 May 2008 14:05:56 +0000 (16:05 +0200)]
KVM: Add coalesced MMIO support (powerpc part)

This patch enables coalesced MMIO for powerpc architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Add coalesced MMIO support (x86 part)
Laurent Vivier [Fri, 30 May 2008 14:05:55 +0000 (16:05 +0200)]
KVM: Add coalesced MMIO support (x86 part)

This patch enables coalesced MMIO for x86 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Add coalesced MMIO support (common part)
Laurent Vivier [Fri, 30 May 2008 14:05:54 +0000 (16:05 +0200)]
KVM: Add coalesced MMIO support (common part)

This patch adds all needed structures to coalesce MMIOs.
Until an architecture uses it, it is not compiled.

Coalesced MMIO introduces two ioctl()s to define the MMIO zones that
can be coalesced:

- KVM_REGISTER_COALESCED_MMIO registers a coalesced MMIO zone.
  It takes one parameter (struct kvm_coalesced_mmio_zone) which defines
  a memory area where MMIOs can be coalesced until the next switch to
  user space. The maximum number of MMIO zones is KVM_COALESCED_MMIO_ZONE_MAX.

- KVM_UNREGISTER_COALESCED_MMIO cancels all registered zones inside
  the given bounds (bounds are also given by struct kvm_coalesced_mmio_zone).

The userspace client can check kernel coalesced MMIO availability by asking
ioctl(KVM_CHECK_EXTENSION) for the KVM_CAP_COALESCED_MMIO capability.
The ioctl() call for KVM_CAP_COALESCED_MMIO will return 0 if not supported,
or the page offset at which the ring buffer is stored.
The page offset depends on the architecture.

After an ioctl(KVM_RUN), the first page of the mapped KVM memory points to
a kvm_run structure. The offset given by KVM_CAP_COALESCED_MMIO is
an offset to the coalesced MMIO ring, expressed in units of PAGE_SIZE relative
to the address of the start of the kvm_run structure. The MMIO ring buffer
is defined by the structure kvm_coalesced_mmio_ring.

[akio: fix oops during guest shutdown]

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
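
The description above can be modelled with a short Python sketch. It is a simplified illustration under stated assumptions, not the kernel's data layout: the 4096-byte page size, the CoalescedRing class, its first/last indices and the rule that a full ring rejects further writes are all assumptions made for the example.

```python
from collections import namedtuple

PAGE_SIZE = 4096  # assumed page size for this sketch
MmioWrite = namedtuple("MmioWrite", "addr data")


def ring_address(kvm_run_addr, page_offset, page_size=PAGE_SIZE):
    """Ring location: page_offset pages past the start of the kvm_run mapping."""
    return kvm_run_addr + page_offset * page_size


class CoalescedRing:
    """Producer/consumer ring: writes queue up until user space drains them."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.first = 0  # next entry to consume
        self.last = 0   # next free slot

    def push(self, addr, data):
        nxt = (self.last + 1) % len(self.slots)
        if nxt == self.first:
            return False  # ring full: the write cannot be coalesced
        self.slots[self.last] = MmioWrite(addr, data)
        self.last = nxt
        return True

    def drain(self):
        """Called on the next switch to user space; returns writes in order."""
        out = []
        while self.first != self.last:
            out.append(self.slots[self.first])
            self.first = (self.first + 1) % len(self.slots)
        return out
```

In this model one slot is always left empty so that a full ring can be told apart from an empty one, which is why a ring of capacity 4 holds 3 writes.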
16 years agoKVM: kvm_io_device: extend in_range() to manage len and write attribute
Laurent Vivier [Fri, 30 May 2008 14:05:53 +0000 (16:05 +0200)]
KVM: kvm_io_device: extend in_range() to manage len and write attribute

Modify member in_range() of structure kvm_io_device to pass length and the type
of the I/O (write or read).

This modification allows kvm_io_device to be used with coalesced MMIO.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Avoid page prefetch on SVM
Avi Kivity [Thu, 29 May 2008 11:56:28 +0000 (14:56 +0300)]
KVM: MMU: Avoid page prefetch on SVM

SVM cannot benefit from page prefetching since guest page fault bypass
cannot be made to work there.  Avoid accessing the guest page table in
this case.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Move nonpaging_prefetch_page()
Avi Kivity [Thu, 29 May 2008 11:55:03 +0000 (14:55 +0300)]
KVM: MMU: Move nonpaging_prefetch_page()

In preparation for next patch. No code change.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: implement 'push imm' (opcode 0x68)
Avi Kivity [Thu, 29 May 2008 11:38:38 +0000 (14:38 +0300)]
KVM: x86 emulator: implement 'push imm' (opcode 0x68)

Encountered in FC6 boot sequence, now that we don't force ss.rpl = 0 during
the protected mode transition.  Not really necessary, but nice to have.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: simplify push imm8 emulation
Avi Kivity [Thu, 29 May 2008 11:26:29 +0000 (14:26 +0300)]
KVM: x86 emulator: simplify push imm8 emulation

Instead of fetching the data explicitly, use SrcImmByte.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Optimize prefetch_page()
Avi Kivity [Thu, 29 May 2008 11:20:16 +0000 (14:20 +0300)]
KVM: MMU: Optimize prefetch_page()

Instead of reading each pte individually, read 256 bytes worth of ptes and
batch process them.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: Add support for mov r, sreg (0x8c) instruction
Guillaume Thouvenin [Tue, 27 May 2008 13:13:28 +0000 (15:13 +0200)]
KVM: x86 emulator: Add support for mov r, sreg (0x8c) instruction

Add support for mov r, sreg (0x8c) instruction

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: Add support for mov seg, r (0x8e) instruction
Guillaume Thouvenin [Tue, 27 May 2008 12:49:15 +0000 (14:49 +0200)]
KVM: x86 emulator: Add support for mov seg, r (0x8e) instruction

Add support for mov seg, r (0x8e) instruction.

[avi: drop the sreg decoding table in favor of 1:1 encoding]

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: adds support to mov r,imm (opcode 0xb8) instruction
Guillaume Thouvenin [Tue, 27 May 2008 08:19:16 +0000 (10:19 +0200)]
KVM: x86 emulator: adds support to mov r,imm (opcode 0xb8) instruction

Add support for the mov r, imm (0xb8) instruction.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: add support for jmp far 0xea
Guillaume Thouvenin [Tue, 27 May 2008 08:19:08 +0000 (10:19 +0200)]
KVM: x86 emulator: add support for jmp far 0xea

Add support for jmp far (opcode 0xea) instruction.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: Update c->dst.bytes in decode instruction
Guillaume Thouvenin [Tue, 27 May 2008 08:22:20 +0000 (10:22 +0200)]
KVM: x86 emulator: Update c->dst.bytes in decode instruction

Update c->dst.bytes in decode instruction instead of instruction
itself.  It's needed because if c->dst.bytes is equal to 0, the
instruction is not emulated.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Prefixes segment functions that will be exported with "kvm_"
Guillaume Thouvenin [Tue, 27 May 2008 08:18:46 +0000 (10:18 +0200)]
KVM: Prefixes segment functions that will be exported with "kvm_"

Prefix functions that will be exported with kvm_.
We also prefixed set_segment(), even though it is still static,
for consistency.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MTRR support
Avi Kivity [Mon, 26 May 2008 17:06:35 +0000 (20:06 +0300)]
KVM: MTRR support

Add emulation for the memory type range registers, needed by VMware ESX 3.5,
and by PCI device assignment.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Order segment register constants in the same way as cpu operand encoding
Avi Kivity [Tue, 27 May 2008 13:26:01 +0000 (16:26 +0300)]
KVM: Order segment register constants in the same way as cpu operand encoding

This can be used to simplify the x86 instruction decoder.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: VMX: Enable NMI with in-kernel irqchip
Sheng Yang [Thu, 15 May 2008 10:23:25 +0000 (18:23 +0800)]
KVM: VMX: Enable NMI with in-kernel irqchip

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: IOAPIC/LAPIC: Enable NMI support
Sheng Yang [Thu, 15 May 2008 01:52:48 +0000 (09:52 +0800)]
KVM: IOAPIC/LAPIC: Enable NMI support

[avi: fix ia64 build breakage]

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Remove unnecessary ->decache_regs() call
Avi Kivity [Sun, 25 May 2008 11:38:15 +0000 (14:38 +0300)]
KVM: Remove unnecessary ->decache_regs() call

Since we aren't modifying any register, there's no need to decache
the register state.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Remove decache_vcpus_on_cpu() and related callbacks
Avi Kivity [Tue, 13 May 2008 13:29:20 +0000 (16:29 +0300)]
KVM: Remove decache_vcpus_on_cpu() and related callbacks

Obsoleted by the vmx-specific per-cpu list.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: VMX: Add list of potentially locally cached vcpus
Avi Kivity [Tue, 13 May 2008 13:22:47 +0000 (16:22 +0300)]
KVM: VMX: Add list of potentially locally cached vcpus

VMX hardware can cache the contents of a vcpu's vmcs.  This cache needs
to be flushed when migrating a vcpu to another cpu, or (which is the case
that interests us here) when disabling hardware virtualization on a cpu.

The current implementation of decaching iterates over the list of all vcpus,
picks the ones that are potentially cached on the cpu that is being offlined,
and flushes the cache.  The problem is that it uses mutex_trylock() to gain
exclusive access to the vcpu, which fires off a (benign) warning about using
the mutex in an interrupt context.

To avoid this, and to make things generally nicer, add a new per-cpu list
of potentially cached vcpus.  This makes the decaching code much simpler.  The
list is vmx-specific since other hardware doesn't have this issue.

[andrea: fix crash on suspend/resume]

Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
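
A small Python model of the per-cpu list idea from the commit above. The class VcpuCacheTracker and its methods are hypothetical names for illustration, and the model ignores locking entirely. It shows why decaching gets simpler: disabling a cpu walks only that cpu's own list instead of scanning every vcpu.

```python
from collections import defaultdict


class VcpuCacheTracker:
    """Track, per cpu, which vcpus may have state cached on that cpu."""

    def __init__(self):
        self.cached_on = defaultdict(list)  # cpu id -> vcpus possibly cached there
        self.home = {}                      # vcpu -> cpu it was last loaded on

    def load(self, vcpu, cpu):
        """Run a vcpu on a cpu, moving it between lists on migration."""
        old = self.home.get(vcpu)
        if old == cpu:
            return
        if old is not None:
            self.cached_on[old].remove(vcpu)  # flushed from the old cpu
        self.cached_on[cpu].append(vcpu)
        self.home[vcpu] = cpu

    def disable_cpu(self, cpu):
        """Return the vcpus to flush when virtualization is disabled on cpu."""
        flushed = self.cached_on.pop(cpu, [])
        for vcpu in flushed:
            del self.home[vcpu]
        return flushed
```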
16 years agoKVM: Handle virtualization instruction #UD faults during reboot
Avi Kivity [Tue, 13 May 2008 10:23:38 +0000 (13:23 +0300)]
KVM: Handle virtualization instruction #UD faults during reboot

KVM turns off hardware virtualization extensions during reboot, in order
to disassociate the memory used by the virtualization extensions from the
processor, and in order to have the system in a consistent state.
Unfortunately virtual machines may still be running while this goes on,
and once virtualization extensions are turned off, any virtualization
instruction will #UD on execution.

Fix by adding an exception handler to virtualization instructions; if we get
an exception during reboot, we simply spin waiting for the reset to complete.
If it's a true exception, BUG() so we can have our stack trace.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Fix false flooding when a pte points to page table
Avi Kivity [Thu, 15 May 2008 10:51:35 +0000 (13:51 +0300)]
KVM: MMU: Fix false flooding when a pte points to page table

The KVM MMU tries to detect when a speculative pte update is not actually
used by demand fault, by checking the accessed bit of the shadow pte.  If
the shadow pte has not been accessed, we deem that page table flooded and
remove the shadow page table, allowing further pte updates to proceed
without emulation.

However, if the pte itself points at a page table and is only used for write
operations, the accessed bit will never be set since all accesses will happen
through the emulator.

This is exactly what happens with kscand on old (2.4.x) HIGHMEM kernels.
The kernel points a kmap_atomic() pte at a page table, and then
proceeds with read-modify-write operations to look at the dirty and accessed
bits.  We get a false flood trigger on the kmap ptes, which results in the
mmu spending all its time setting up and tearing down shadows.

Fix by setting the shadow accessed bit on emulated accesses.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: VMX: Trivial vmcs_write64() code simplification
Avi Kivity [Mon, 12 May 2008 16:25:43 +0000 (19:25 +0300)]
KVM: VMX: Trivial vmcs_write64() code simplification

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: SVM: Fake MSR_K7 performance counters
Chris Lalancette [Mon, 5 May 2008 17:05:16 +0000 (13:05 -0400)]
KVM: SVM: Fake MSR_K7 performance counters

Attached is a patch that fixes a guest crash when booting older Linux kernels.
The problem stems from the fact that we are currently emulating
MSR_K7_EVNTSEL[0-3], but not emulating MSR_K7_PERFCTR[0-3].  Because of this,
setup_k7_watchdog() in the Linux kernel receives a GPF when it attempts to
write into MSR_K7_PERFCTR, which causes an oops.

The patch fixes it by just "fake" emulating the appropriate MSRs, throwing
away the data in the process.  This causes the NMI watchdog to not actually
work, but it's not such a big deal in a virtualized environment.

When we get a write to one of these counters, we printk_ratelimit() a warning.
I decided to print it out for all writes, even if the data is 0; it doesn't
seem to make sense to me to special case when data == 0.

Tested by myself on a RHEL-4 guest, and Joerg Roedel on a Windows XP 64-bit
guest.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: PIT: support mode 3
Aurelien Jarno [Fri, 2 May 2008 15:02:23 +0000 (17:02 +0200)]
KVM: PIT: support mode 3

The in-kernel PIT emulation ignores pending timers if operating
under mode 3, which for example Hurd uses.

This mode should output a square wave, high for (N+1)/2 counts and low
for (N-1)/2 counts. As we only care about the resulting interrupts, the
period is N, and mode 3 is the same as mode 2 with regard to
interrupts.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
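
The arithmetic in the commit above can be checked with a short sketch. The function name mode3_square_wave is hypothetical, and the use of integer division (which gives equal halves for even N) is an assumption about how the rounding is meant.

```python
def mode3_square_wave(n):
    """Return (high, low, period) in counts for a mode 3 reload value n."""
    high = (n + 1) // 2   # (N+1)/2 counts high for odd N
    low = n - high        # (N-1)/2 counts low for odd N
    return high, low, high + low


for n in (4, 5):
    high, low, period = mode3_square_wave(n)
    print(n, high, low, period)
```

Whatever the split between the high and low phases, they always sum to N, so one interrupt per N counts matches what mode 2 delivers.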
16 years agoKVM: Handle vma regions with no backing page
Anthony Liguori [Wed, 30 Apr 2008 20:37:07 +0000 (15:37 -0500)]
KVM: Handle vma regions with no backing page

This patch allows VMAs that contain no backing page to be used for guest
memory.  This is useful for assigning mmio regions to a guest.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: SVM: add tracing support for TDP page faults
Joerg Roedel [Wed, 30 Apr 2008 15:56:04 +0000 (17:56 +0200)]
KVM: SVM: add tracing support for TDP page faults

To distinguish between real page faults and nested page faults they should be
traced as different events. This is implemented by this patch.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: SVM: add missing kvmtrace markers
Joerg Roedel [Wed, 30 Apr 2008 15:56:03 +0000 (17:56 +0200)]
KVM: SVM: add missing kvmtrace markers

This patch adds the missing kvmtrace markers to the svm
module of kvm.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: add missing kvmtrace bits
Joerg Roedel [Wed, 30 Apr 2008 15:56:02 +0000 (17:56 +0200)]
KVM: add missing kvmtrace bits

This patch adds some kvmtrace bits to the generic x86 code
where it is instrumented from SVM.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: SVM: implement dedicated INTR exit handler
Joerg Roedel [Wed, 30 Apr 2008 15:56:01 +0000 (17:56 +0200)]
KVM: SVM: implement dedicated INTR exit handler

With an exit handler for INTR intercepts it's possible to account them using
kvmtrace.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: SVM: implement dedicated NMI exit handler
Joerg Roedel [Wed, 30 Apr 2008 15:56:00 +0000 (17:56 +0200)]
KVM: SVM: implement dedicated NMI exit handler

With an exit handler for NMI intercepts it's possible to account them using
kvmtrace.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: VMX: move APIC_ACCESS trace entry to generic code
Joerg Roedel [Wed, 30 Apr 2008 15:55:59 +0000 (17:55 +0200)]
KVM: VMX: move APIC_ACCESS trace entry to generic code

This patch moves the trace entry for APIC accesses from the VMX code to the
generic lapic code. This way APIC accesses from SVM will also be traced.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: add statics were possible, function definition in lapic.h
Harvey Harrison [Sun, 27 Apr 2008 19:14:13 +0000 (12:14 -0700)]
KVM: add statics were possible, function definition in lapic.h

Noticed by sparse:
arch/x86/kvm/vmx.c:1583:6: warning: symbol 'vmx_disable_intercept_for_msr' was not declared. Should it be static?
arch/x86/kvm/x86.c:3406:5: warning: symbol 'kvm_task_switch_16' was not declared. Should it be static?
arch/x86/kvm/x86.c:3429:5: warning: symbol 'kvm_task_switch_32' was not declared. Should it be static?
arch/x86/kvm/mmu.c:1968:6: warning: symbol 'kvm_mmu_remove_one_alloc_mmu_page' was not declared. Should it be static?
arch/x86/kvm/mmu.c:2014:6: warning: symbol 'mmu_destroy_caches' was not declared. Should it be static?
arch/x86/kvm/lapic.c:862:5: warning: symbol 'kvm_lapic_get_base' was not declared. Should it be static?
arch/x86/kvm/i8254.c:94:5: warning: symbol 'pit_get_gate' was not declared. Should it be static?
arch/x86/kvm/i8254.c:196:5: warning: symbol '__pit_timer_fn' was not declared. Should it be static?
arch/x86/kvm/i8254.c:561:6: warning: symbol '__inject_pit_timer_intr' was not declared. Should it be static?

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: remove long -> void *user -> long cast
Christian Borntraeger [Mon, 21 Apr 2008 11:48:24 +0000 (13:48 +0200)]
KVM: remove long -> void *user -> long cast

kvm_dev_ioctl casts the arg value to void __user *, just to recast it
again to long. This seems unnecessary.

According to objdump the binary code on x86 is unchanged by this patch.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfashe...
Linus Torvalds [Thu, 17 Jul 2008 17:55:51 +0000 (10:55 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  [PATCH] ocfs2: fix oops in mmap_truncate testing
  configfs: call drop_link() to cleanup after create_link() failure
  configfs: Allow ->make_item() and ->make_group() to return detailed errors.
  configfs: Fix failing mkdir() making racing rmdir() fail
  configfs: Fix deadlock with racing rmdir() and rename()
  configfs: Make configfs_new_dirent() return error code instead of NULL
  configfs: Protect configfs_dirent s_links list mutations
  configfs: Introduce configfs_dirent_lock
  ocfs2: Don't snprintf() without a format.
  ocfs2: Fix CONFIG_OCFS2_DEBUG_FS #ifdefs
  ocfs2/net: Silence build warnings on sparc64
  ocfs2: Handle error during journal load
  ocfs2: Silence an error message in ocfs2_file_aio_read()
  ocfs2: use simple_read_from_buffer()
  ocfs2: fix printk format warnings with OCFS2_FS_STATS=n
  [PATCH 2/2] ocfs2: Instrument fs cluster locks
  [PATCH 1/2] ocfs2: Add CONFIG_OCFS2_FS_STATS config option

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-fixes-2.6
Linus Torvalds [Thu, 17 Jul 2008 17:55:07 +0000 (10:55 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-fixes-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-fixes-2.6:
  pcmcia: ide-cs: Remove outdated comment
  pcmcia: fix cisinfo_t removal
  pcmcia: fix return value in cm4000_cs.c

16 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 17 Jul 2008 17:38:59 +0000 (10:38 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix asm/e820.h for userspace inclusion
  x86: fix numaq_tsc_disable
  x86: fix kernel_physical_mapping_init() for large x86 systems

16 years agoMerge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 17 Jul 2008 17:37:10 +0000 (10:37 -0700)]
Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ftrace: do not trace library functions
  ftrace: do not trace scheduler functions
  ftrace: fix lockup with MAXSMP
  ftrace: fix merge buglet

16 years agox86: fix asm/e820.h for userspace inclusion
Rusty Russell [Tue, 15 Jul 2008 05:02:27 +0000 (15:02 +1000)]
x86: fix asm/e820.h for userspace inclusion

asm-x86/e820.h is included from userspace.  'x86: make e820.c to have
common functions' (b79cd8f1268bab57ff85b19d131f7f23deab2dee) broke it:

make -C Documentation/lguest
cc -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include
lguest.c  -lz -o lguest
In file included from ../../include/asm-x86/bootparam.h:8,
                 from lguest.c:45:
../../include/asm/e820.h:66: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:67: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:68: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:72: error: expected ‘=’, ‘,’, ‘;’, ‘asm’
or ‘__attribute__’ before ‘e820_update_range’
...
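
An editorial sketch, not the actual patch: errors like the ones above are what a userspace compiler reports when a header exposes prototypes that use kernel-only types such as u64. One common way to keep such a header includable from userspace is to guard the kernel-only declarations with __KERNEL__. The names e820entry_example and example_update_range are hypothetical, and the struct layout is only assumed to resemble a packed addr/size/type map entry.

```c
#include <stdint.h>

/* Part that userspace may see: fixed-width types only.
 * Hypothetical stand-in for a firmware memory map entry. */
struct e820entry_example {
    uint64_t addr;  /* start of the memory segment */
    uint64_t size;  /* size of the memory segment */
    uint32_t type;  /* type of the memory segment */
} __attribute__((packed));

#ifdef __KERNEL__
/* Kernel-only prototype: u64 is not defined in a userspace build,
 * so it stays hidden from programs that merely include the header. */
extern u64 example_update_range(u64 start, u64 size,
                                unsigned old_type, unsigned new_type);
#endif
```

Built without -D__KERNEL__, only the structure definition is visible, so a userspace tool can still use the layout without tripping over the prototype.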

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix numaq_tsc_disable
Yinghai Lu [Tue, 15 Jul 2008 06:29:01 +0000 (23:29 -0700)]
x86: fix numaq_tsc_disable

fix:

 arch/x86/kernel/numaq_32.c: In function ‘numaq_tsc_disable’:
 arch/x86/kernel/numaq_32.c:99: warning: ‘return’ with a value, in function returning void

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'linus' into x86/urgent
Ingo Molnar [Thu, 17 Jul 2008 17:24:56 +0000 (19:24 +0200)]
Merge branch 'linus' into x86/urgent

16 years agofix build error of arch/ia64/kvm/*
Takashi Iwai [Thu, 17 Jul 2008 16:09:12 +0000 (18:09 +0200)]
fix build error of arch/ia64/kvm/*

Fix calls of smp_call_function*() in arch/ia64/kvm for recent API
changes.

    CC [M]  arch/ia64/kvm/kvm-ia64.o
  arch/ia64/kvm/kvm-ia64.c: In function 'handle_global_purge':
  arch/ia64/kvm/kvm-ia64.c:398: error: too many arguments to function 'smp_call_function_single'
  arch/ia64/kvm/kvm-ia64.c: In function 'kvm_vcpu_kick':
  arch/ia64/kvm/kvm-ia64.c:1696: error: too many arguments to function 'smp_call_function_single'

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob...
Linus Torvalds [Thu, 17 Jul 2008 16:15:23 +0000 (09:15 -0700)]
Merge branch 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace

* 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace:
  fix dangling zombie when new parent ignores children
  do_wait: return security_task_wait() error code in place of -ECHILD
  ptrace children revamp
  do_wait reorganization

16 years agoUpdate scripts/Makefile.fwinst to cope with older make
David Woodhouse [Thu, 17 Jul 2008 06:44:32 +0000 (23:44 -0700)]
Update scripts/Makefile.fwinst to cope with older make

Also fix unwanted rebuilds of the firmware/ihex2fw tool by including
the .ihex2fw.cmd file when present.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Reported-and-tested-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Thu, 17 Jul 2008 16:05:38 +0000 (09:05 -0700)]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] dasd: use -EOPNOTSUPP instead of -ENOTSUPP
  [S390] qdio: new qdio driver.
  [S390] cio: Export chsc_error_from_response().
  [S390] vmur: Fix return code handling.
  [S390] Fix stacktrace compile bug.
  [S390] Increase default warning stacksize.
  [S390] dasd: Fix cleanup in dasd_{fba,diag}_check_characteristics().
  [S390] chsc headers userspace cleanup
  [S390] dasd: fix unsolicited SIM handling.
  [S390] zfcpdump: Make SCSI disk dump tool recognize storage holes

16 years agoFix collateral damage to top level Makefile
Grant Likely [Thu, 17 Jul 2008 07:06:55 +0000 (01:06 -0600)]
Fix collateral damage to top level Makefile

The patch named "powerpc/mpc5121: Add clock driver", also contained
an unrelated and bogus change to the top-level makefile.  This patch
backs out the bad bit.

(SHA1 of offending patch: 137e95906e294913fab02162e8a1948ade49acb5)

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Repented-by: John Rigby <jrigby@freescale.com>
[ Heh. Normally I pick these out from the diffstats, but I guess
  I've grown to trust the ppc tree too much ;)   - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoftrace: do not trace library functions
Ingo Molnar [Thu, 17 Jul 2008 15:40:48 +0000 (17:40 +0200)]
ftrace: do not trace library functions

make function tracing more robust: do not trace library functions.

We've already got a sizable list of exceptions:

 ifdef CONFIG_FTRACE
 # Do not profile string.o, since it may be used in early boot or vdso
 CFLAGS_REMOVE_string.o = -pg
 # Also do not profile any debug utilities
 CFLAGS_REMOVE_spinlock_debug.o = -pg
 CFLAGS_REMOVE_list_debug.o = -pg
 CFLAGS_REMOVE_debugobjects.o = -pg
 CFLAGS_REMOVE_find_next_bit.o = -pg
 CFLAGS_REMOVE_cpumask.o = -pg
 CFLAGS_REMOVE_bitmap.o = -pg
 endif

... and the pattern has been that random library functionality showed
up in ftrace's critical path (outside of its recursion check), causing
hard to debug lockups.

So be a bit defensive about it and exclude all lib/*.o functions by
default. It's not that they are overly interesting for tracing purposes
anyway. Specific ones can still be traced, in an opt-in manner.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoftrace: do not trace scheduler functions
Ingo Molnar [Tue, 15 Apr 2008 20:39:31 +0000 (22:39 +0200)]
ftrace: do not trace scheduler functions

do not trace scheduler functions - it's still a bit fragile
and can lock up with:

  http://redhat.com/~mingo/misc/config-Thu_Jul_17_13_34_52_CEST_2008

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoftrace: fix lockup with MAXSMP
Ingo Molnar [Thu, 17 Jul 2008 15:38:17 +0000 (17:38 +0200)]
ftrace: fix lockup with MAXSMP

MAXSMP brings in lots of use of various bitops in smp_processor_id()
and friends - causing ftrace to lock up during bootup:

  calling  anon_inode_init+0x0/0x130
  initcall anon_inode_init+0x0/0x130 returned 0 after 0 msecs
  calling  acpi_event_init+0x0/0x57
  [ hard hang ]

So exclude the bitops facilities from tracing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago[S390] dasd: use -EOPNOTSUPP instead of -ENOTSUPP
Stefan Haberland [Thu, 17 Jul 2008 15:16:49 +0000 (17:16 +0200)]
[S390] dasd: use -EOPNOTSUPP instead of -ENOTSUPP

The return value -ENOTSUPP is not valid in a userspace context; use
-EOPNOTSUPP instead.
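
A small userspace sketch of why this matters, under stated assumptions: ENOTSUPP is a kernel-internal code (value 524 in include/linux/errno.h) that the C library's <errno.h> does not define, whereas EOPNOTSUPP is a standard errno value. The macro KERNEL_ENOTSUPP and the helper errno_is_known_to_libc() are hypothetical names, and the "Unknown error" prefix check assumes glibc's strerror() wording.

```c
#include <errno.h>
#include <string.h>

/* Kernel-internal value of ENOTSUPP; not defined by userspace <errno.h>. */
#define KERNEL_ENOTSUPP 524

/* Returns 1 if strerror() has a real message for err, 0 if it falls
 * back to a generic "Unknown error N" string (glibc wording assumed). */
static int errno_is_known_to_libc(int err)
{
    const char *msg = strerror(err);

    return strncmp(msg, "Unknown error", 13) != 0;
}
```

A caller that receives EOPNOTSUPP gets a meaningful message from strerror() or perror(); a leaked kernel-internal code would typically be reported only as an unknown error number.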

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] qdio: new qdio driver.
Jan Glauber [Thu, 17 Jul 2008 15:16:48 +0000 (17:16 +0200)]
[S390] qdio: new qdio driver.

List of major changes:
- split qdio driver into several files
- separation of thin interrupt code
- improved handling for multiple thin interrupt devices
- inbound and outbound processing now always runs in tasklet context
- significantly fewer tasklet schedules per interrupt needed
- merged qebsm with non-qebsm handling
- cleanup qdio interface and added kerneldoc
- coding style

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Utz Bacher <utz.bacher@de.ibm.com>
Reviewed-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] cio: Export chsc_error_from_response().
Cornelia Huck [Thu, 17 Jul 2008 15:16:47 +0000 (17:16 +0200)]
[S390] cio: Export chsc_error_from_response().

Make chsc_error_from_response() available to chsc callers outside
of chsc.c (namely qdio) to avoid duplicating error checking code.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] vmur: Fix return code handling.
Frank Munzert [Thu, 17 Jul 2008 15:16:46 +0000 (17:16 +0200)]
[S390] vmur: Fix return code handling.

Use -EOPNOTSUPP instead of -ENOTSUPP.

Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] Fix stacktrace compile bug.
Heiko Carstens [Thu, 17 Jul 2008 15:16:45 +0000 (17:16 +0200)]
[S390] Fix stacktrace compile bug.

Add missing module.h include to fix this:

  CC      arch/s390/kernel/stacktrace.o
arch/s390/kernel/stacktrace.c:84: warning: data definition has no type or storage class
arch/s390/kernel/stacktrace.c:84: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
arch/s390/kernel/stacktrace.c:84: warning: parameter names (without types) in function declaration
arch/s390/kernel/stacktrace.c:97: warning: data definition has no type or storage class
arch/s390/kernel/stacktrace.c:97: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
arch/s390/kernel/stacktrace.c:97: warning: parameter names (without types) in function declaration

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] Increase default warning stacksize.
Heiko Carstens [Thu, 17 Jul 2008 15:16:44 +0000 (17:16 +0200)]
[S390] Increase default warning stacksize.

Compiling a kernel with allmodconfig or allyesconfig results in tons
of gcc warnings, because the default stack frame size above which
gcc emits a warning is just 256 bytes.
Increase this to 2048, so these warnings don't distract from the real
warnings that we need to watch for.
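
For illustration only, here is the kind of function that trips a 256-byte threshold but not a 2048-byte one. The function name and buffer size are hypothetical. The warning can be reproduced on any architecture with GCC's generic -Wframe-larger-than=256 option; that option is an assumption chosen for the sketch, not necessarily the flag the s390 build passes, and the warning depends on the buffer not being optimized away.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_BUF_SIZE 1024   /* larger than 256, smaller than 2048 */

/* Fills a large on-stack buffer with `fill` and returns the byte sum.
 * The 1024-byte local array is what makes the stack frame large. */
unsigned long example_large_frame(unsigned char fill)
{
    unsigned char buf[EXAMPLE_BUF_SIZE];
    unsigned long sum = 0;
    size_t i;

    memset(buf, fill, sizeof(buf));
    for (i = 0; i < sizeof(buf); i++)
        sum += buf[i];
    return sum;
}
```

With a 256-byte limit, every function of this shape produces a warning; with a 2048-byte limit, only much larger frames are reported.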

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] dasd: Fix cleanup in dasd_{fba,diag}_check_characteristics().
Cornelia Huck [Thu, 17 Jul 2008 15:16:43 +0000 (17:16 +0200)]
[S390] dasd: Fix cleanup in dasd_{fba,diag}_check_characteristics().

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] chsc headers userspace cleanup
Adrian Bunk [Thu, 17 Jul 2008 15:16:42 +0000 (17:16 +0200)]
[S390] chsc headers userspace cleanup

Kernel headers shouldn't expose functions to userspace.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] dasd: fix unsolicited SIM handling.
Stefan Haberland [Thu, 17 Jul 2008 15:16:41 +0000 (17:16 +0200)]
[S390] dasd: fix unsolicited SIM handling.

Add missing schedule_bh and check that there is 32 bit sense data.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] zfcpdump: Make SCSI disk dump tool recognize storage holes
Frank Munzert [Thu, 17 Jul 2008 15:16:40 +0000 (17:16 +0200)]
[S390] zfcpdump: Make SCSI disk dump tool recognize storage holes

The kernel part of zfcpdump establishes a new debugfs file zcore/memmap
which exports information on memory layout (start address and length of each
memory chunk) to its userspace counterpart.

Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years agoftrace: fix merge buglet
Ingo Molnar [Thu, 17 Jul 2008 11:26:50 +0000 (13:26 +0200)]
ftrace: fix merge buglet

-tip testing found a bootup hang here:

  initcall anon_inode_init+0x0/0x130 returned 0 after 0 msecs
  calling  acpi_event_init+0x0/0x57

the bootup should have continued with:

  initcall acpi_event_init+0x0/0x57 returned 0 after 45 msecs

but it hung hard there instead.

bisection led to this commit:

| commit 5806b81ac1c0c52665b91723fd4146a4f86e386b
| Merge: d14c8a6... 6712e29...
| Author: Ingo Molnar <mingo@elte.hu>
| Date:   Mon Jul 14 16:11:52 2008 +0200
|     Merge branch 'auto-ftrace-next' into tracing/for-linus

turns out that i made this mistake in the merge:

  ifdef CONFIG_FTRACE
  # Do not profile debug utilities
  CFLAGS_REMOVE_tsc_64.o = -pg
  CFLAGS_REMOVE_tsc_32.o = -pg

those two files got unified meanwhile - so the dont-profile annotation
got lost. The proper rule is:

  CFLAGS_REMOVE_tsc.o = -pg

i guess this could have been caught sooner if the CFLAGS_REMOVE* kbuild
rule aborted the build if it met a target that does not exist anymore?

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agofix dangling zombie when new parent ignores children
Roland McGrath [Wed, 9 Apr 2008 06:12:30 +0000 (23:12 -0700)]
fix dangling zombie when new parent ignores children

This fixes an arcane bug that we think was a regression introduced
by commit b2b2cbc4b2a2f389442549399a993a8306420baf.  When a parent
ignores SIGCHLD (or uses SA_NOCLDWAIT), its children would self-reap
but they don't if it's using ptrace on them.  When the parent thread
later exits and ceases to ptrace a child but leaves other live
threads in the parent's thread group, any zombie children are left
dangling.  The fix makes them self-reap then, as they would have
done earlier if ptrace had not been in use.

Signed-off-by: Roland McGrath <roland@redhat.com>
16 years agodo_wait: return security_task_wait() error code in place of -ECHILD
Roland McGrath [Mon, 31 Mar 2008 01:41:25 +0000 (18:41 -0700)]
do_wait: return security_task_wait() error code in place of -ECHILD

This reverts the effect of commit f2cc3eb133baa2e9dc8efd40f417106b2ee520f3
"do_wait: fix security checks".  That change reverted the effect of commit
73243284463a761e04d69d22c7516b2be7de096c.  The rationale for the original
commit still stands.  The inconsistent treatment of children hidden by
ptrace was an unintended omission in the original change and in no way
invalidates its purpose.

This makes do_wait return the error returned by security_task_wait()
(usually -EACCES) in place of -ECHILD when there are some children the
caller would be able to wait for if not for the permission failure.  A
permission error will give the user a clue to look for security policy
problems, rather than for mysterious wait bugs.
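
From the caller's side, the distinction might be handled roughly as follows. This is a sketch under assumptions: wait_errno() and explain_wait_error() are hypothetical helper names, and the EACCES text simply restates the usual case mentioned above.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Polls for any child without blocking. Returns 0 if waitpid() did not
 * fail, otherwise the errno it reported. */
static int wait_errno(void)
{
    int status;

    errno = 0;
    if (waitpid(-1, &status, WNOHANG) == -1)
        return errno;
    return 0;
}

/* Maps a wait error to a hint about where to look next. */
static const char *explain_wait_error(int err)
{
    switch (err) {
    case ECHILD:
        return "no waitable children";
    case EACCES:
        return "children exist, but security policy denied the wait";
    default:
        return strerror(err);
    }
}
```

In a process with no children, wait_errno() returns ECHILD. A permission failure, by contrast, points the user at security policy rather than at a suspected wait bug.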

Signed-off-by: Roland McGrath <roland@redhat.com>
16 years agoptrace children revamp
Roland McGrath [Tue, 25 Mar 2008 01:36:23 +0000 (18:36 -0700)]
ptrace children revamp

ptrace no longer fiddles with the children/sibling links, and the
old ptrace_children list is gone.  Now ptrace, whether of one's own
children or another's via PTRACE_ATTACH, just uses the new ptraced
list instead.

There should be no user-visible difference that matters.  The only
change is the order in which do_wait() sees multiple stopped
children and stopped ptrace attachees.  Since wait_task_stopped()
was changed earlier so it no longer reorders the children list, we
already know this won't cause any new problems.

Signed-off-by: Roland McGrath <roland@redhat.com>
16 years agodo_wait reorganization
Roland McGrath [Thu, 20 Mar 2008 02:24:59 +0000 (19:24 -0700)]
do_wait reorganization

This breaks out the guts of do_wait into three subfunctions.
The control flow is less nonobvious without so much goto.
do_wait_thread and ptrace_do_wait contain the main work of the outer loop.
wait_consider_task contains the main work of the inner loop.

Signed-off-by: Roland McGrath <roland@redhat.com>
16 years agoscsi_dh: Verify "dev" is a sdev before accessing it.
Chandra Seetharaman [Thu, 17 Jul 2008 00:35:08 +0000 (17:35 -0700)]
scsi_dh: Verify "dev" is a sdev before accessing it.

Before accessing the device data structure in hardware handlers,
make sure it is indeed an sdev device.

Yinghai Lu <yhlu.kernel@gmail.com> found the bug on Jul 16, 2008,
and later tested/verified the following fix.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes...
Linus Torvalds [Thu, 17 Jul 2008 00:25:46 +0000 (17:25 -0700)]
Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6

* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits)
  Revert "x86/PCI: ACPI based PCI gap calculation"
  PCI: remove unnecessary volatile in PCIe hotplug struct controller
  x86/PCI: ACPI based PCI gap calculation
  PCI: include linux/pm_wakeup.h for device_set_wakeup_capable
  PCI PM: Fix pci_prepare_to_sleep
  x86/PCI: Fix PCI config space for domains > 0
  Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n
  PCI: Simplify PCI device PM code
  PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep
  PCI ACPI: Rework PCI handling of wake-up
  ACPI: Introduce new device wakeup flag 'prepared'
  ACPI: Introduce acpi_device_sleep_wake function
  PCI: rework pci_set_power_state function to call platform first
  PCI: Introduce platform_pci_power_manageable function
  ACPI: Introduce acpi_bus_power_manageable function
  PCI: make pci_name use dev_name
  PCI: handle pci_name() being const
  PCI: add stub for pci_set_consistent_dma_mask()
  PCI: remove unused arch pcibios_update_resource() functions
  PCI: fix pci_setup_device()'s sprinting into a const buffer
  ...

Fixed up conflicts in various files (arch/x86/kernel/setup_64.c,
arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c,
drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86
and ACPI updates manually.

16 years agoRevert "x86/PCI: ACPI based PCI gap calculation"
Jesse Barnes [Wed, 16 Jul 2008 23:21:47 +0000 (16:21 -0700)]
Revert "x86/PCI: ACPI based PCI gap calculation"

This reverts commit 809d9a8f93bd8504dcc34b16bbfdfd1a8c9bb1ed.

This one isn't quite ready for prime time.  It needs more testing and
additional feedback from the ACPI guys.

16 years ago[PATCH] ocfs2: fix oops in mmap_truncate testing
Coly Li [Mon, 30 Jun 2008 10:45:45 +0000 (18:45 +0800)]
[PATCH] ocfs2: fix oops in mmap_truncate testing

This patch fixes a mmap_truncate bug which was found by the ocfs2 test suite.

In an ocfs2 cluster with more than one node, run the mmap_truncate program,
which races mmap writes and truncates from multiple processes. While the test
is running, a stat from another node forces writeout, causing an oops in
ocfs2_get_block() because it sees a buffer to write which isn't allocated.

This patch fixes the bug by clearing the dirty and uptodate bits in the
buffer, leaving the buffer unmapped, and returning.

The fix was suggested by Mark Fasheh, and I coded up the patch.

Signed-off-by: Coly Li <coyli@suse.de>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
16 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
Linus Torvalds [Wed, 16 Jul 2008 22:11:07 +0000 (15:11 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: (68 commits)
  sdio_uart: Fix SDIO break control to now return success or an error
  mmc: host driver for Ricoh Bay1Controllers
  sdio: sdio_io.c Fix sparse warnings
  sdio: fix the use of hard coded timeout value.
  mmc: OLPC: update vdd/powerup quirk comment
  mmc: fix spares errors of sdhci.c
  mmc: remove multiwrite capability
  wbsd: fix bad dma_addr_t conversion
  atmel-mci: Driver for Atmel on-chip MMC controllers
  mmc: fix sdio_io sparse errors
  mmc: wbsd.c fix shadowing of 'dma' variable
  MMC: S3C24XX: Refuse incorrectly aligned transfers
  MMC: S3C24XX: Add maintainer entry
  MMC: S3C24XX: Update error debugging.
  MMC: S3C24XX: Add media presence test to request handling.
  MMC: S3C24XX: Fix use of msecs where jiffies are needed
  MMC: S3C24XX: Add MODULE_ALIAS() entries for the platform devices
  MMC: S3C24XX: Fix s3c2410_dma_request() return code check.
  MMC: S3C24XX: Allow card-detect on non-IRQ capable pin
  MMC: S3C24XX: Ensure host->mrq->data is valid
  ...

Manually fixed up bogus executable bits on drivers/mmc/core/sdio_io.c
and include/linux/mmc/sdio_func.h when merging.

16 years agoMerge branch 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6
Linus Torvalds [Wed, 16 Jul 2008 22:02:57 +0000 (15:02 -0700)]
Merge branch 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6

* 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6:
  UBIFS: include to compilation
  UBIFS: add new flash file system
  UBIFS: add brief documentation
  MAINTAINERS: add UBIFS section
  do_mounts: allow UBI root device name
  VFS: export sync_sb_inodes
  VFS: move inode_lock into sync_sb_inodes

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
Linus Torvalds [Wed, 16 Jul 2008 21:53:54 +0000 (14:53 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
  IDE: Report errors during drive reset back to user space
  Update documentation of HDIO_DRIVE_RESET ioctl
  IDE: Remove unused code
  IDE: Fix HDIO_DRIVE_RESET handling
  hd.c: remove the #include <linux/mc146818rtc.h>
  update the BLK_DEV_HD help text
  move ide/legacy/hd.c to drivers/block/
  ide/legacy/hd.c: use late_initcall()
  remove BLK_DEV_HD_ONLY
  ide: endian annotations in ide-floppy.c
  ide-floppy: zero out the whole struct ide_atapi_pc on init
  ide-floppy: fold idefloppy_create_test_unit_ready_cmd into idefloppy_open
  ide-cd: move request prep chunk from cdrom_do_newpc_cont to rq issue path
  ide-cd: move request prep from cdrom_start_rw_cont to rq issue path
  ide-cd: move request prep from cdrom_start_seek_continuation to rq issue path
  ide-cd: fold cdrom_start_seek into ide_cd_do_request
  ide-cd: simplify request issuing path
  ide-cd: mv ide_do_rw_cdrom ide_cd_do_request
  ide-cd: cdrom_start_seek: remove unused argument block
  ide-cd: ide_do_rw_cdrom: add the catch-all bad request case to the if-else block
  ...

16 years agoMerge branch 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak...
Linus Torvalds [Wed, 16 Jul 2008 21:52:12 +0000 (14:52 -0700)]
Merge branch 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6

* 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6: (87 commits)
  Fix FADT parsing
  Add the ability to reset the machine using the RESET_REG in ACPI's FADT table.
  ACPI: use dev_printk when possible
  PNPACPI: add support for HP vendor-specific CCSR descriptors
  PNP: avoid legacy IDE IRQs
  PNP: convert resource options to single linked list
  ISAPNP: handle independent options following dependent ones
  PNP: remove extra 0x100 bit from option priority
  PNP: support optional IRQ resources
  PNP: rename pnp_register_*_resource() local variables
  PNPACPI: ignore _PRS interrupt numbers larger than PNP_IRQ_NR
  PNP: centralize resource option allocations
  PNP: remove redundant pnp_can_configure() check
  PNP: make resource assignment functions return 0 (success) or -EBUSY (failure)
  PNP: in debug resource dump, make empty list obvious
  PNP: improve resource assignment debug
  PNP: increase I/O port & memory option address sizes
  PNP: introduce pnp_irq_mask_t typedef
  PNP: make resource option structures private to PNP subsystem
  PNP: define PNP-specific IORESOURCE_IO_* flags alongside IRQ, DMA, MEM
  ...