Avi Kivity [Tue, 1 Feb 2011 14:32:03 +0000 (16:32 +0200)]
KVM: x86 emulator: vendor specific instructions
Mark some instructions as vendor specific, and allow the caller to request
emulation only of vendor specific instructions. This is useful in some
circumstances (responding to a #UD fault).
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Tue, 1 Feb 2011 14:32:02 +0000 (16:32 +0200)]
KVM: Drop bogus x86_decode_insn() error check
x86_decode_insn() doesn't return X86EMUL_* values, so the check
for X86EMUL_PROPOGATE_FAULT will always fail. There is a proper
check later on, so there is no need for a replacement for this
code.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Jan Kiszka [Tue, 1 Feb 2011 12:27:15 +0000 (13:27 +0100)]
KVM: x86: Drop obsolete warning about INIT on runnable VCPU
This warning was once used for debugging QEMU user space. Though
uncommon, it is actually possible to send an INIT request to a running
VCPU. So better drop this warning before someone misuses it to flood
kernel logs this way.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Huang Ying [Sun, 30 Jan 2011 03:15:49 +0000 (11:15 +0800)]
KVM: Replace is_hwpoison_address with __get_user_pages
is_hwpoison_address only checks whether the page table entry is
hwpoisoned, regardless the memory page mapped. While __get_user_pages
will check both.
QEMU will clear the poisoned page table entry (via unmap/map) to make
it possible to allocate a new memory page for the virtual address
across guest rebooting. But it is also possible that the underlying
memory page is kept poisoned even after the corresponding page table
entry is cleared, that is, a new memory page can not be allocated.
__get_user_pages can catch these situations.
Huang Ying [Sun, 30 Jan 2011 03:15:48 +0000 (11:15 +0800)]
mm: make __get_user_pages return -EHWPOISON for HWPOISON page optionally
Make __get_user_pages return -EHWPOISON for HWPOISON page only if
FOLL_HWPOISON is specified. With this patch, the interested callers
can distinguish HWPOISON pages from general FAULT pages, while other
callers will still get -EFAULT for all these pages, so the user space
interface need not to be changed.
This feature is needed by KVM, where UCR MCE should be relayed to
guest for HWPOISON page, while instruction emulation and MMIO will be
tried for general FAULT page.
The idea comes from Andrew Morton.
Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Huang Ying [Sun, 30 Jan 2011 03:15:47 +0000 (11:15 +0800)]
mm: export __get_user_pages
In most cases, get_user_pages and get_user_pages_fast should be used
to pin user pages in memory. But sometimes, some special flags except
FOLL_GET, FOLL_WRITE and FOLL_FORCE are needed, for example in
following patch, KVM needs FOLL_HWPOISON. To support these users,
__get_user_pages is exported directly.
There are some symbol name conflicts in infiniband driver, fixed them too.
Signed-off-by: Huang Ying <ying.huang@intel.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Michel Lespinasse <walken@google.com> CC: Roland Dreier <roland@kernel.org> CC: Ralph Campbell <infinipath@qlogic.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
john cooper [Fri, 21 Jan 2011 05:21:00 +0000 (00:21 -0500)]
KVM: x86: handle guest access to BBL_CR_CTL3 MSR
A correction to Intel cpu model CPUID data (patch queued)
caused winxp to BSOD when booted with a Penryn model.
This was traced to the CPUID "model" field correction from
6 -> 23 (as is proper for a Penryn class of cpu). Only in
this case does the problem surface.
The cause for this failure is winxp accessing the BBL_CR_CTL3
MSR which is unsupported by current kvm, appears to be a
legacy MSR not fully characterized yet existing in current
silicon, and is apparently carried forward in MSR space to
accommodate vintage code as here. It is not yet conclusive
whether this MSR implements any of its legacy functionality
or is just an ornamental dud for compatibility. While I
found no silicon version specific documentation link to
this MSR, a general description exists in Intel's developer's
reference which agrees with the functional behavior of
other bootloader/kernel code I've examined accessing
BBL_CR_CTL3. Regrettably winxp appears to be setting bit #19
called out as "reserved" in the above document.
So to minimally accommodate this MSR, kvm msr get will provide
the equivalent mock data and kvm msr write will simply toss the
guest passed data without interpretation. While this treatment
of BBL_CR_CTL3 addresses the immediate problem, the approach may
be modified pending clarification from Intel.
Signed-off-by: john cooper <john.cooper@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Xiao Guangrong [Wed, 12 Jan 2011 07:41:22 +0000 (15:41 +0800)]
KVM: make make_all_cpus_request() lockless
Now, we have 'vcpu->mode' to judge whether need to send ipi to other
cpus, this way is very exact, so checking request bit is needless,
then we can drop the spinlock let it's collateral
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Wed, 12 Jan 2011 07:40:31 +0000 (15:40 +0800)]
KVM: Add "exiting guest mode" state
Currently we keep track of only two states: guest mode and host
mode. This patch adds an "exiting guest mode" state that tells
us that an IPI will happen soon, so unless we need to wait for the
IPI, we can avoid it completely.
Also
1: No need atomically to read/write ->mode in vcpu's thread
2: reorganize struct kvm_vcpu to make ->mode and ->requests
in the same cache line explicitly
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Heiko Carstens [Mon, 17 Jan 2011 20:21:08 +0000 (21:21 +0100)]
KVM: fix build warning within __kvm_set_memory_region() on s390
Get rid of this warning:
CC arch/s390/kvm/../../../virt/kvm/kvm_main.o
arch/s390/kvm/../../../virt/kvm/kvm_main.c:596:12: warning: 'kvm_create_dirty_bitmap' defined but not used
The only caller of the function is within a !CONFIG_S390 section, so add the
same ifdef around kvm_create_dirty_bitmap() as well.
Avi Kivity [Thu, 6 Jan 2011 16:09:12 +0000 (18:09 +0200)]
KVM: VMX: Avoid atomic operation in vmx_vcpu_run
Instead of exchanging the guest and host rcx, have separate storage
for each. This allows us to avoid using the xchg instruction, which
is is a little slower than normal operations.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Mon, 3 Jan 2011 12:28:51 +0000 (14:28 +0200)]
KVM: VMX: Save and restore tr selector across mode switches
When emulating real mode we play with tr hidden state, but leave
tr.selector alone. That works well, except for save/restore, since
loading TR writes it to the hidden state in vmx->rmode.
Fix by also saving and restoring the tr selector; this makes things
more consistent and allows migration to work during the early
boot stages of Windows XP.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Sedat Dilek [Sun, 2 Jan 2011 23:01:29 +0000 (00:01 +0100)]
KVM guest: Fix section mismatch derived from kvm_guest_cpu_online()
WARNING: arch/x86/built-in.o(.text+0x1bb74): Section mismatch in reference from the function kvm_guest_cpu_online() to the function .cpuinit.text:kvm_guest_cpu_init()
The function kvm_guest_cpu_online() references
the function __cpuinit kvm_guest_cpu_init().
This is often because kvm_guest_cpu_online lacks a __cpuinit
annotation or the annotation of kvm_guest_cpu_init is wrong.
Linus Torvalds [Thu, 17 Mar 2011 02:09:57 +0000 (19:09 -0700)]
Merge branch 'mnt_devname' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'mnt_devname' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
vfs: bury ->get_sb()
nfs: switch NFS from ->get_sb() to ->mount()
nfs: stop mangling ->mnt_devname on NFS
vfs: new superblock methods to override /proc/*/mount{s,info}
nfs: nfs_do_{ref,sub}mount() superblock argument is redundant
nfs: make nfs_path() work without vfsmount
nfs: store devname at disconnected NFS roots
nfs: propagate devname to nfs{,4}_get_root()
Linus Torvalds [Thu, 17 Mar 2011 02:08:03 +0000 (19:08 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k/block: amiflop - Remove superfluous amiga_chip_alloc() cast
m68k/atari: ARAnyM - Add support for network access
m68k/atari: ARAnyM - Add support for console access
m68k/atari: ARAnyM - Add support for block access
m68k/atari: Initial ARAnyM support
m68k: Kconfig - Remove unneeded "default n"
m68k: Makefiles - Change to new flags variables
m68k/amiga: Reclaim Chip RAM for PPC exception handlers
m68k: Allow all kernel traps to be handled via exception fixups
m68k: Use base_trap_init() to initialize vectors
m68k: Add helper function handle_kernel_fault()
Linus Torvalds [Thu, 17 Mar 2011 02:05:40 +0000 (19:05 -0700)]
Merge branch 'remove' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'remove' of master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6629/2: aaec2000: remove support for mach-aaec2000
ARM: lh7a40x: remove unmaintained platform support
Fix up trivial conflicts in
- arch/arm/mach-{aaec2000,lh7a40x}/include/mach/memory.h (removed)
- drivers/usb/gadget/Kconfig (USB_[GADGET_]LH7A40X removed, others added)
Linus Torvalds [Thu, 17 Mar 2011 02:03:06 +0000 (19:03 -0700)]
Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (91 commits)
ARM: 6806/1: irq: introduce entry and exit functions for chained handlers
ARM: 6781/1: Thumb-2: Work around buggy Thumb-2 short branch relocations in gas
ARM: 6747/1: P2V: Thumb2 support
ARM: 6798/1: aout-core: zero thread debug registers in a.out core dump
ARM: 6796/1: Footbridge: Fix I/O mappings for NOMMU mode
ARM: 6784/1: errata: no automatic Store Buffer drain on Cortex-A9
ARM: 6772/1: errata: possible fault MMU translations following an ASID switch
ARM: 6776/1: mach-ux500: activate fix for errata 753970
ARM: 6794/1: SPEAr: Append UL to device address macros.
ARM: 6793/1: SPEAr: Remove unused *_SIZE macros from spear*.h files
ARM: 6792/1: SPEAr: Replace SIZE macro's with SZ_4K macros
ARM: 6791/1: SPEAr3xx: Declare device structures after shirq code
ARM: 6790/1: SPEAr: Clock Framework: Rename usbd clock and align apb_clk entry
ARM: 6789/1: SPEAr3xx: Rename sdio to sdhci
ARM: 6788/1: SPEAr: Include mach/hardware.h instead of mach/spear.h
ARM: 6787/1: SPEAr: Reorder #includes in .h & .c files.
ARM: 6681/1: SPEAr: add debugfs support to clk API
ARM: 6703/1: SPEAr: update clk API support
ARM: 6679/1: SPEAr: make clk API functions more generic
ARM: 6737/1: SPEAr: formalized timer support
...
Linus Torvalds [Thu, 17 Mar 2011 02:01:29 +0000 (19:01 -0700)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] tioca: Fix assignment from incompatible pointer warnings
[IA64] mca.c: Fix cast from integer to pointer warning
[IA64] setup.c Typo fix "Architechtuallly"
[IA64] Add CONFIG_MISC_DEVICES=y to configs that need it.
[IA64] disable interrupts at end of ia64_mca_cpe_int_handler()
[IA64] Add DMA_ERROR_CODE define.
pstore: fix build warning for unused return value from sysfs_create_file
pstore: X86 platform interface using ACPI/APEI/ERST
pstore: new filesystem interface to platform persistent storage
Linus Torvalds [Thu, 17 Mar 2011 00:21:00 +0000 (17:21 -0700)]
Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
BKL: That's all, folks
fs/locks.c: Remove stale FIXME left over from BKL conversion
ipx: remove the BKL
appletalk: remove the BKL
x25: remove the BKL
ufs: remove the BKL
hpfs: remove the BKL
drivers: remove extraneous includes of smp_lock.h
tracing: don't trace the BKL
adfs: remove the big kernel lock
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
bonding: enable netpoll without checking link status
xfrm: Refcount destination entry on xfrm_lookup
net: introduce rx_handler results and logic around that
bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
bonding: wrap slave state work
net: get rid of multiple bond-related netdevice->priv_flags
bonding: register slave pointer for rx_handler
be2net: Bump up the version number
be2net: Copyright notice change. Update to Emulex instead of ServerEngines
e1000e: fix kconfig for crc32 dependency
netfilter ebtables: fix xt_AUDIT to work with ebtables
xen network backend driver
bonding: Improve syslog message at device creation time
bonding: Call netif_carrier_off after register_netdevice
bonding: Incorrect TX queue offset
net_sched: fix ip_tos2prio
xfrm: fix __xfrm_route_forward()
be2net: Fix UDP packet detected status in RX compl
Phonet: fix aligned-mode pipe socket buffer header reserve
netxen: support for GbE port settings
...
Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
with the staging updates.
Linus Torvalds [Wed, 16 Mar 2011 22:19:35 +0000 (15:19 -0700)]
Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
* 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (961 commits)
staging: hv: fix memory leaks
staging: hv: Remove NULL check before kfree
Staging: hv: Get rid of vmbus_child_dev_add()
Staging: hv: Change the signature for vmbus_child_device_register()
Staging: hv: Get rid of vmbus_cleanup() function
Staging: hv: Get rid of vmbus_dev_rm() function
Staging: hv: Change the signature for vmbus_on_isr()
Staging: hv: Eliminate vmbus_event_dpc()
Staging: hv: Get rid of the function vmbus_msg_dpc()
Staging: hv: Change the signature for vmbus_cleanup()
Staging: hv: Simplify root device management
staging: rtl8192e: Don't copy dev pointer to skb
staging: rtl8192e: Pass priv to cmdpkt functions
staging: rtl8192e: Pass priv to firmware download functions
staging: rtl8192e: Pass priv to rtl8192_interrupt
staging: rtl8192e: Pass rtl8192_priv to dm functions
staging: rtl8192e: Pass ieee80211_device to callbacks
staging: rtl8192e: Pass ieee80211_device to callbacks
staging: rtl8192e: Pass ieee80211_device to callbacks
staging: rtl8192e: Pass ieee80211_device to callbacks
...
Linus Torvalds [Wed, 16 Mar 2011 22:11:04 +0000 (15:11 -0700)]
Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (76 commits)
pch_uart: reference clock on CM-iTC
pch_phub: add new device ML7213
n_gsm: fix UIH control byte : P bit should be 0
n_gsm: add a documentation
serial: msm_serial_hs: Add MSM high speed UART driver
tty_audit: fix tty_audit_add_data live lock on audit disabled
tty: move cd1865.h to drivers/staging/tty/
Staging: tty: fix build with epca.c driver
pcmcia: synclink_cs: fix prototype for mgslpc_ioctl()
Staging: generic_serial: fix double locking bug
nozomi: don't use flush_scheduled_work()
tty/serial: Relax the device_type restriction from of_serial
MAINTAINERS: Update HVC file patterns
tty: phase out of ioctl file pointer for tty3270 as well
tty: forgot to remove ipwireless from drivers/char/pcmcia/Makefile
pch_uart: Fix DMA channel miss-setting issue.
pch_uart: fix exclusive access issue
pch_uart: fix auto flow control miss-setting issue
pch_uart: fix uart clock setting issue
pch_uart : Use dev_xxx not pr_xxx
...
Fix up trivial conflicts in drivers/misc/pch_phub.c (same patch applied
twice, then changes to the same area in one branch)
Linus Torvalds [Wed, 16 Mar 2011 22:04:26 +0000 (15:04 -0700)]
Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (172 commits)
USB: Add support for SuperSpeed isoc endpoints
xhci: Clean up cycle bit math used during stalls.
xhci: Fix cycle bit calculation during stall handling.
xhci: Update internal dequeue pointers after stalls.
USB: Disable auto-suspend for USB 3.0 hubs.
USB: Remove bogus USB_PORT_STAT_SUPER_SPEED symbol.
xhci: Return canceled URBs immediately when host is halted.
xhci: Fixes for suspend/resume of shared HCDs.
xhci: Fix re-init on power loss after resume.
xhci: Make roothub functions deal with device removal.
xhci: Limit roothub ports to 15 USB3 & 31 USB2 ports.
xhci: Return a USB 3.0 hub descriptor for USB3 roothub.
xhci: Register second xHCI roothub.
xhci: Change xhci_find_slot_id_by_port() API.
xhci: Refactor bus suspend state into a struct.
xhci: Index with a port array instead of PORTSC addresses.
USB: Set usb_hcd->state and flags for shared roothubs.
usb: Make core allocate resources per PCI-device.
usb: Store bus type in usb_hcd, not in driver flags.
usb: Change usb_hcd->bandwidth_mutex to a pointer.
...
Al Viro [Wed, 16 Mar 2011 10:59:40 +0000 (06:59 -0400)]
vfs: new superblock methods to override /proc/*/mount{s,info}
a) ->show_devname(m, mnt) - what to put into devname columns in mounts,
mountinfo and mountstats
b) ->show_path(m, mnt) - what to put into relative path column in mountinfo
Leaving those NULL gives old behaviour. NFS switched to using those.
Al Viro [Wed, 16 Mar 2011 09:44:14 +0000 (05:44 -0400)]
nfs: store devname at disconnected NFS roots
part 2: make sure that disconnected roots have corresponding mnt_devname
values stashed into them.
Have nfs*_get_root() stuff a copy of devname into ->d_fsdata of the
found root, provided that it is disconnected.
Have ->d_release() free it when dentry goes away.
Have the places where NFS uses ->d_fsdata for sillyrename (and that
can *never* happen to a disconnected root - dentry will be attached
to its parent) free old devname copies if they find those.
Andy Gospodarek [Mon, 14 Mar 2011 12:05:21 +0000 (12:05 +0000)]
bonding: enable netpoll without checking link status
Only slaves that are up should transmit netpoll frames, so there is no
need to check to see if a slave is up before enabling netpoll on it.
This resolves a reported failure on active-backup bonds where a slave
interface is down when netpoll was enabled.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Tested-by: WANG Cong <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert [Tue, 15 Mar 2011 21:12:49 +0000 (21:12 +0000)]
xfrm: Refcount destination entry on xfrm_lookup
We return a destination entry without refcount if a socket
policy is found in xfrm_lookup. This triggers a warning on
a negative refcount when freeeing this dst entry. So take
a refcount in this case to fix it.
This refcount was forgotten when xfrm changed to cache bundles
instead of policies for outgoing flows.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 12 Mar 2011 03:14:39 +0000 (03:14 +0000)]
net: introduce rx_handler results and logic around that
This patch allows rx_handlers to better signalize what to do next to
it's caller. That makes skb->deliver_no_wcard no longer needed.
kernel-doc for rx_handler_result is taken from Nicolas' patch.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 16 Mar 2011 08:46:43 +0000 (08:46 +0000)]
bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
Since bond-related code was moved from net/core/dev.c into bonding,
IFF_SLAVE_INACTIVE is no longer needed. Replace is with flag "inactive"
stored in slave structure
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 12 Mar 2011 03:14:37 +0000 (03:14 +0000)]
bonding: wrap slave state work
transfers slave->state into slave->backup (that it's going to transfer
into bitfield. Introduce wrapper inlines to do the work with it.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 16 Mar 2011 08:45:23 +0000 (08:45 +0000)]
net: get rid of multiple bond-related netdevice->priv_flags
Now when bond-related code is moved from net/core/dev.c into bonding
code, multiple priv_flags are not needed anymore. So let them rot.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 12 Mar 2011 03:14:35 +0000 (03:14 +0000)]
bonding: register slave pointer for rx_handler
Register slave pointer as rx_handler data. That would eventually prevent
need to loop over slave devices to find the right slave.
Use synchronize_net to ensure that bond_handle_frame does not get slave
structure freed when working with that.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Frank Peters <frank.peters@comcast.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric B Munson [Tue, 15 Mar 2011 23:12:18 +0000 (16:12 -0700)]
Documentation: update cgroup pid and cpuset information
The cgroup documentation does not specify how a process can be removed
from a particular group. This patch adds a note at the end of the
simple example about how this is done. Also, some cgroups (like
cpusets) require user input before a new group can be used. This is
noted in the patch as well.
Signed-off-by: Eric B Munson <emunson@mgebm.net> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rob Landley [Tue, 15 Mar 2011 23:11:29 +0000 (16:11 -0700)]
Documentation: remove obsolete files from 00-INDEX
Time interpolators were removed in git 1f564ad6d41828 ("[IA64] remove
time interpolator"), and Voyager support went away in git b6b6e2b112caf
("Documentation: remove obsolete voyager.txt file")
Signed-off-by: Rob Landley <rlandley@parallels.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Graf [Wed, 16 Mar 2011 17:32:13 +0000 (18:32 +0100)]
netfilter ebtables: fix xt_AUDIT to work with ebtables
Even though ebtables uses xtables it still requires targets to
return EBT_CONTINUE instead of XT_CONTINUE. This prevented
xt_AUDIT to work as ebt module.
Upon Jan's suggestion, use a separate struct xt_target for
NFPROTO_BRIDGE having its own target callback returning
EBT_CONTINUE instead of cloning the module.
Signed-off-by: Thomas Graf <tgraf@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
Linus Torvalds [Wed, 16 Mar 2011 17:10:02 +0000 (10:10 -0700)]
Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Fix binutils-2.21 symbol related build failures
x86-64, trampoline: Remove unused variable
x86, reboot: Fix the use of passed arguments in 32-bit BIOS reboot
x86, reboot: Move the real-mode reboot code to an assembly file
x86: Make the GDT_ENTRY() macro in <asm/segment.h> safe for assembly
x86, trampoline: Use the unified trampoline setup for ACPI wakeup
x86, trampoline: Common infrastructure for low memory trampolines
Fix up trivial conflicts in arch/x86/kernel/Makefile
Linus Torvalds [Wed, 16 Mar 2011 16:24:44 +0000 (09:24 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (21 commits)
PM / Hibernate: Reduce autotuned default image size
PM / Core: Introduce struct syscore_ops for core subsystems PM
PM QoS: Make pm_qos settings readable
PM / OPP: opp_find_freq_exact() documentation fix
PM: Documentation/power/states.txt: fix repetition
PM: Make system-wide PM and runtime PM treat subsystems consistently
PM: Simplify kernel/power/Kconfig
PM: Add support for device power domains
PM: Drop pm_flags that is not necessary
PM: Allow pm_runtime_suspend() to succeed during system suspend
PM: Clean up PM_TRACE dependencies and drop unnecessary Kconfig option
PM: Remove CONFIG_PM_OPS
PM: Reorder power management Kconfig options
PM: Make CONFIG_PM depend on (CONFIG_PM_SLEEP || CONFIG_PM_RUNTIME)
PM / ACPI: Remove references to pm_flags from bus.c
PM: Do not create wakeup sysfs files for devices that cannot wake up
USB / Hub: Do not call device_set_wakeup_capable() under spinlock
PM: Use appropriate printk() priority level in trace.c
PM / Wakeup: Don't update events_check_enabled in pm_get_wakeup_count()
PM / Wakeup: Make pm_save_wakeup_count() work as documented
...
Linus Torvalds [Wed, 16 Mar 2011 16:24:25 +0000 (09:24 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
RDMA/cma: Replace global lock in rdma_destroy_id() with id-specific one
IB/cm: Cancel pending LAP message when exiting IB_CM_ESTABLISH state
IB/cm: Bump reference count on cm_id before invoking callback
RDMA/cma: Fix crash in request handlers
IB/ipath: Don't reset disabled devices
IB/qib: Fix M_Key field in SubnGet and SubnGetResp MADs
IB/qib: Set default LE2 value for active cables to 0
RDMA/cxgb4: Debugfs dump_qp() updates
RDMA/cxgb4: Dispatch FATAL event on EEH errors
RDMA/cxgb4: Use ULP_MODE_TCPDDP
RDMA/cxgb4: Enable on-chip SQ support by default
RDMA/cxgb4: Do CIDX_INC updates every 1/16 CQ depth CQE reaps
RDMA/cxgb4: Remove db_drop_task
RDMA/cxgb4: Turn on delayed ACK
IB/qib: Return correct MAD when setting link width to 255
Linus Torvalds [Wed, 16 Mar 2011 16:15:43 +0000 (09:15 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (33 commits)
AppArmor: kill unused macros in lsm.c
AppArmor: cleanup generated files correctly
KEYS: Add an iovec version of KEYCTL_INSTANTIATE
KEYS: Add a new keyctl op to reject a key with a specified error code
KEYS: Add a key type op to permit the key description to be vetted
KEYS: Add an RCU payload dereference macro
AppArmor: Cleanup make file to remove cruft and make it easier to read
SELinux: implement the new sb_remount LSM hook
LSM: Pass -o remount options to the LSM
SELinux: Compute SID for the newly created socket
SELinux: Socket retains creator role and MLS attribute
SELinux: Auto-generate security_is_socket_class
TOMOYO: Fix memory leak upon file open.
Revert "selinux: simplify ioctl checking"
selinux: drop unused packet flow permissions
selinux: Fix packet forwarding checks on postrouting
selinux: Fix wrong checks for selinux_policycap_netpeer
selinux: Fix check for xfrm selinux context algorithm
ima: remove unnecessary call to ima_must_measure
IMA: remove IMA imbalance checking
...
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: tcrypt - do not attempt to write to readonly variable
random: update interface comments to reflect reality
crypto: picoxcell - add support for the picoxcell crypto engines
crypto: sha1 - Add test vector to test partial block processing
hwrng: omap - Convert release_resource to release_region/release_mem_region
crypto: aesni-intel - Fix remaining leak in rfc4106_set_hash_key
crypto: omap-sham - don't treat NULL clk as an error
crypto: omap-aes - don't treat NULL clk as an error
crypto: testmgr - mark ghash as fips_allowed
crypto: testmgr - mark xts(aes) as fips_allowed
crypto: skcipher - remove redundant NULL check
hwrng: pixocell - add support for picoxcell TRNG
crypto: aesni-intel - Don't leak memory in rfc4106_set_hash_subkey
Linus Torvalds [Wed, 16 Mar 2011 15:58:09 +0000 (08:58 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: (46 commits)
fs/9p: Make the writeback_fid owned by root
fs/9p: Writeback dirty data before setattr
fs/9p: call vmtruncate before setattr 9p opeation
fs/9p: Properly update inode attributes on link
fs/9p: Prevent multiple inclusion of same header
fs/9p: Workaround vfs rename rehash bug
fs/9p: Mark directory inode invalid for many directory inode operations
fs/9p: Add . and .. dentry revalidation flag
fs/9p: mark inode attribute invalid on rename, unlink and setattr
fs/9p: Add support for marking inode attribute invalid
fs/9p: Initialize root inode number for dotl
fs/9p: Update link count correctly on different file system operations
fs/9p: Add drop_inode 9p callback
fs/9p: Add direct IO support in cached mode
fs/9p: Fix inode i_size update in file_write
fs/9p: set default readahead pages in cached mode
fs/9p: Move writeback fid to v9fs_inode
fs/9p: Add v9fs_inode
fs/9p: Don't set stat.st_blocks based on nrpages
fs/9p: Add inode hashing
...
Linus Torvalds [Wed, 16 Mar 2011 15:57:32 +0000 (08:57 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (29 commits)
ahci: add another PCI ID for marvell
libata: Use 'bool' return value for ata_id_XXX
sata_fsl: Update RX_WATER_MARK for TRANSCFG
sata_fsl: Fix wrong Device Error Register usage
libata: Include WWN ID in inquiry VPD emulation
ata/pata_arasan_cf: fill dma chan->private from pdata->dma_priv
ata: pata: Convert pr_*(DRV_NAME ...) to pr_fmt/pr_<level>
pata_arasan_cf: fix printk format string warning
pata_arasan_cf: Adding support for arasan compact flash host controller
libata-sff: add ata_sff_queue_work() & ata_sff_queue_delayed_work()
ahci: AHCI mode SATA patch for Intel Patsburg SATA RAID controller
ahci: recognize Marvell 88se9125 PCIe SATA 6.0 Gb/s controller
libata: remove ATA_FLAG_LPM
libata: remove ATA_FLAG_NO_LEGACY
libata: remove ATA_FLAG_MMIO
libata: remove ATA_FLAG_{SRST|SATA_RESET}
ipr/sas_ata: use mode mask macros from <linux/ata.h>
sata_dwc_460ex: add debugging options
sata_dwc_460ex: fix misuse of ata_get_cmd_descript()
sata_dwc_460ex: fix return value of dma_dwc_xfer_setup()
...
Linus Torvalds [Wed, 16 Mar 2011 15:22:41 +0000 (08:22 -0700)]
Merge branch 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
percpu: Generic support for this_cpu_cmpxchg_double()
alpha: use L1_CACHE_BYTES for cacheline size in the linker script
percpu: align percpu readmostly subsection to cacheline
Fix up trivial conflict in arch/x86/kernel/vmlinux.lds.S due to the
percpu alignment having changed ("x86: Reduce back the alignment of the
per-CPU data section")
Linus Torvalds [Wed, 16 Mar 2011 15:20:19 +0000 (08:20 -0700)]
Merge branch 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix build failure introduced by s/freezeable/freezable/
workqueue: add system_freezeable_wq
rds/ib: use system_wq instead of rds_ib_fmr_wq
net/9p: replace p9_poll_task with a work
net/9p: use system_wq instead of p9_mux_wq
xfs: convert to alloc_workqueue()
reiserfs: make commit_wq use the default concurrency level
ocfs2: use system_wq instead of ocfs2_quota_wq
ext4: convert to alloc_workqueue()
scsi/scsi_tgt_lib: scsi_tgtd isn't used in memory reclaim path
scsi/be2iscsi,qla2xxx: convert to alloc_workqueue()
misc/iwmc3200top: use system_wq instead of dedicated workqueues
i2o: use alloc_workqueue() instead of create_workqueue()
acpi: kacpi*_wq don't need WQ_MEM_RECLAIM
fs/aio: aio_wq isn't used in memory reclaim path
input/tps6507x-ts: use system_wq instead of dedicated workqueue
cpufreq: use system_wq instead of dedicated workqueues
wireless/ipw2x00: use system_wq instead of dedicated workqueues
arm/omap: use system_wq in mailbox
workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER
Linus Torvalds [Wed, 16 Mar 2011 15:10:07 +0000 (08:10 -0700)]
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
smp: Document transitivity for memory barriers.
rcu: add comment saying why DEBUG_OBJECTS_RCU_HEAD depends on PREEMPT.
rcupdate: remove dead code
rcu: add documentation saying which RCU flavor to choose
rcutorture: Get rid of duplicate sched.h include
rcu: call __rcu_read_unlock() in exit_rcu for tiny RCU
Linus Torvalds [Wed, 16 Mar 2011 15:04:07 +0000 (08:04 -0700)]
Increase OSF partition limit from 8 to 18
It turns out that while a maximum of 8 partitions may be what people
"should" have had, you can actually fit up to 18 entries(*) in a sector.
And some people clearly were taking advantage of that, like Michael
Cree, who had ten partitions on one of his OSF disks.
(*) The OSF partition data starts at byte offset 64 in the first sector,
and the array of 16-byte partition entries start at offset 148 in
the on-disk partition structure.
iprune_sem is continously giving us lockdep warnings because we do take it in
read mode in the reclaim path, but we're also doing non-NOFS allocations under
it taken in write mode.
Taking a bit deeper look at it I think it's fixable quite trivially:
- for invalidate_inodes we do not need iprune_sem at all. We have an active
reference on the superblock, so the filesystem is not going away until it
has finished.
- for evict_inodes we do need it, to make sure prune_icache has done it's
work before we tear down the superblock. But there is no reason to
hold it over the actual reclaim operation - it's enough to cycle through
it after the actual reclaim to make sure we wait for any pending
prune_icache to complete. We just have to remove the WARN_ON for
otherwise busy inodes as they can actually happen now.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Boris Ostrovsky [Tue, 15 Mar 2011 16:13:44 +0000 (12:13 -0400)]
x86, AMD: Set ARAT feature on AMD processors
Support for Always Running APIC timer (ARAT) was introduced in
commit db954b5898dd3ef3ef93f4144158ea8f97deb058. This feature
allows us to avoid switching timers from LAPIC to something else
(e.g. HPET) and go into timer broadcasts when entering deep
C-states.
AMD processors don't provide a CPUID bit for that feature but
they also keep APIC timers running in deep C-states (except for
cases when the processor is affected by erratum 400). Therefore
we should set ARAT feature bit on AMD CPUs.
Tested-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Mark Langsdorf <mark.langsdorf@amd.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
LKML-Reference: <1300205624-4813-1-git-send-email-ostr@amd64.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>