Avi Kivity [Mon, 25 Oct 2010 13:23:55 +0000 (15:23 +0200)]
KVM: Avoid double interrupt injection with vapic
After an interrupt injection, the PPR changes, and we have to reflect that
into the vapic. This causes a KVM_REQ_EVENT to be set, which causes the
whole interrupt injection routine to be run again (harmlessly).
Optimize by only setting KVM_REQ_EVENT if the ppr was lowered; otherwise
there is no chance that a new injection is needed.
Avi Kivity [Thu, 21 Oct 2010 10:20:33 +0000 (12:20 +0200)]
KVM: SVM: Move fs/gs/ldt save/restore to heavyweight exit path
ldt is never used in the kernel context; same goes for fs (x86_64) and gs
(i386). So save/restore them in the heavyweight exit path instead
of the lightweight path.
By itself, this doesn't buy us much, but it paves the way for moving vmload
and vmsave to the heavyweight exit path, since they modify the same registers.
[jan: fix copy/pase mistake on i386]
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Thu, 21 Oct 2010 10:20:31 +0000 (12:20 +0200)]
KVM: SVM: Move guest register save out of interrupts disabled section
Saving guest registers is just a memory copy, and does not need to be in the
critical section. Move outside the critical section to improve latency a
bit.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Andi Kleen [Wed, 20 Oct 2010 15:56:17 +0000 (17:56 +0200)]
KVM: Move KVM context switch into own function
gcc 4.5 with some special options is able to duplicate the VMX
context switch asm in vmx_vcpu_run(). This results in a compile error
because the inline asm sequence uses an on local label. The non local
label is needed because other code wants to set up the return address.
This patch moves the asm code into an own function and marks
that explicitely noinline to avoid this problem.
Better would be probably to just move it into an .S file.
The diff looks worse than the change really is, it's all just
code movement and no logic change.
Gleb Natapov [Thu, 14 Oct 2010 09:22:55 +0000 (11:22 +0200)]
KVM: Let host know whether the guest can handle async PF in non-userspace context.
If guest can detect that it runs in non-preemptable context it can
handle async PFs at any time, so let host know that it can send async
PF even if guest cpu is not in userspace.
Gleb Natapov [Thu, 14 Oct 2010 09:22:54 +0000 (11:22 +0200)]
KVM paravirt: Handle async PF in non preemptable context
If async page fault is received by idle task or when preemp_count is
not zero guest cannot reschedule, so do sti; hlt and wait for page to be
ready. vcpu can still process interrupts while it waits for the page to
be ready.
Gleb Natapov [Thu, 14 Oct 2010 09:22:53 +0000 (11:22 +0200)]
KVM: Inject asynchronous page fault into a PV guest if page is swapped out.
Send async page fault to a PV guest if it accesses swapped out memory.
Guest will choose another task to run upon receiving the fault.
Allow async page fault injection only when guest is in user mode since
otherwise guest may be in non-sleepable context and will not be able
to reschedule.
Vcpu will be halted if guest will fault on the same page again or if
vcpu executes kernel code.
Gleb Natapov [Thu, 14 Oct 2010 09:22:52 +0000 (11:22 +0200)]
KVM: Handle async PF in a guest.
When async PF capability is detected hook up special page fault handler
that will handle async page fault events and bypass other page faults to
regular page fault handler. Also add async PF handling to nested SVM
emulation. Async PF always generates exit to L1 where vcpu thread will
be scheduled out until page is available.
Gleb Natapov [Mon, 18 Oct 2010 13:22:23 +0000 (15:22 +0200)]
KVM: Add memory slot versioning and use it to provide fast guest write interface
Keep track of memslots changes by keeping generation number in memslots
structure. Provide kvm_write_guest_cached() function that skips
gfn_to_hva() translation if memslots was not changed since previous
invocation.
Gleb Natapov [Sun, 17 Oct 2010 16:13:42 +0000 (18:13 +0200)]
KVM: Retry fault before vmentry
When page is swapped in it is mapped into guest memory only after guest
tries to access it again and generate another fault. To save this fault
we can map it immediately since we know that guest is going to access
the page. Do it only when tdp is enabled for now. Shadow paging case is
more complicated. CR[034] and EFER registers should be switched before
doing mapping and then switched back.
Gleb Natapov [Thu, 14 Oct 2010 09:22:46 +0000 (11:22 +0200)]
KVM: Halt vcpu if page it tries to access is swapped out
If a guest accesses swapped out memory do not swap it in from vcpu thread
context. Schedule work to do swapping and put vcpu into halted state
instead.
Interrupts will still be delivered to the guest and if interrupt will
cause reschedule guest will continue to run another task.
[avi: remove call to get_user_pages_noio(), nacked by Linus; this
makes everything synchrnous again]
Avi Kivity [Fri, 31 Dec 2010 08:52:15 +0000 (10:52 +0200)]
KVM: i8259: initialize isr_ack
isr_ack is never initialized. So, until the first PIC reset, interrupts
may fail to be injected. This can cause Windows XP to fail to boot, as
reported in the fallout from the fix to
https://bugzilla.kernel.org/show_bug.cgi?id=21962.
Reported-and-tested-by: Nicolas Prochazka <prochazka.nicolas@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 28 Dec 2010 10:09:07 +0000 (12:09 +0200)]
KVM: MMU: Fix incorrect direct gfn for unpaged mode shadow
We use the physical address instead of the base gfn for the four
PAE page directories we use in unpaged mode. When the guest accesses
an address above 1GB that is backed by a large host page, a BUG_ON()
in kvm_mmu_set_gfn() triggers.
Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=21962 Reported-and-tested-by: Nicolas Prochazka <prochazka.nicolas@gmail.com>
KVM-Stable-Tag. Signed-off-by: Avi Kivity <avi@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/tile: handle rt_sigreturn() more cleanly
arch/tile: handle CLONE_SETTLS in copy_thread(), not user space
Linus Torvalds [Sat, 18 Dec 2010 18:13:24 +0000 (10:13 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86: avoid high BIOS area when allocating address space
x86: avoid E820 regions when allocating address space
x86: avoid low BIOS area when allocating address space
resources: add arch hook for preventing allocation in reserved areas
Revert "resources: support allocating space within a region from the top down"
Revert "PCI: allocate bus resources from the top down"
Revert "x86/PCI: allocate space from the end of a region, not the beginning"
Revert "x86: allocate space within a region top-down"
Revert "PCI: fix pci_bus_alloc_resource() hang, prefer positive decode"
PCI: Update MCP55 quirk to not affect non HyperTransport variants
Chris Metcalf [Tue, 14 Dec 2010 21:07:25 +0000 (16:07 -0500)]
arch/tile: handle rt_sigreturn() more cleanly
The current tile rt_sigreturn() syscall pattern uses the common idiom
of loading up pt_regs with all the saved registers from the time of
the signal, then anticipating the fact that we will clobber the ABI
"return value" register (r0) as we return from the syscall by setting
the rt_sigreturn return value to whatever random value was in the pt_regs
for r0.
However, this breaks in our 64-bit kernel when running "compat" tasks,
since we always sign-extend the "return value" register to properly
handle returned pointers that are in the upper 2GB of the 32-bit compat
address space. Doing this to the sigreturn path then causes occasional
random corruption of the 64-bit r0 register.
Instead, we stop doing the crazy "load the return-value register"
hack in sigreturn. We already have some sigreturn-specific assembly
code that we use to pass the pt_regs pointer to C code. We extend that
code to also set the link register to point to a spot a few instructions
after the usual syscall return address so we don't clobber the saved r0.
Now it no longer matters what the rt_sigreturn syscall returns, and the
pt_regs structure can be cleanly and completely reloaded.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Chris Metcalf [Tue, 14 Dec 2010 20:57:49 +0000 (15:57 -0500)]
arch/tile: handle CLONE_SETTLS in copy_thread(), not user space
Previously we were just setting up the "tp" register in the
new task as started by clone() in libc. However, this is not
quite right, since in principle a signal might be delivered to
the new task before it had its TLS set up. (Of course, this race
window still exists for resetting the libc getpid() cached value
in the new task, in principle. But in any case, we are now doing
this exactly the way all other architectures do it.)
This change is important for 2.6.37 since the tile glibc we will
be submitting upstream will not set TLS in user space any more,
so it will only work on a kernel that has this fix. It should
also be taken for 2.6.36.x in the stable tree if possible.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Cc: stable <stable@kernel.org>
Kevin Cernekee [Wed, 3 Nov 2010 05:28:01 +0000 (22:28 -0700)]
MIPS: Fix build errors in sc-mips.c
Seen with malta_defconfig on Linus' tree:
CC arch/mips/mm/sc-mips.o
arch/mips/mm/sc-mips.c: In function 'mips_sc_is_activated':
arch/mips/mm/sc-mips.c:77: error: 'config2' undeclared (first use in this function)
arch/mips/mm/sc-mips.c:77: error: (Each undeclared identifier is reported only once
arch/mips/mm/sc-mips.c:77: error: for each function it appears in.)
arch/mips/mm/sc-mips.c:81: error: 'tmp' undeclared (first use in this function)
make[2]: *** [arch/mips/mm/sc-mips.o] Error 1
make[1]: *** [arch/mips/mm] Error 2
make: *** [arch/mips] Error 2
[Ralf: Cosmetic changes to minimize the number of arguments passed to
mips_sc_is_activated]
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/1752/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Bjorn Helgaas [Thu, 16 Dec 2010 17:39:02 +0000 (10:39 -0700)]
x86: avoid high BIOS area when allocating address space
This prevents allocation of the last 2MB before 4GB.
The experiment described here shows Windows 7 ignoring the last 1MB:
https://bugzilla.kernel.org/show_bug.cgi?id=23542#c27
This patch ignores the top 2MB instead of just 1MB because H. Peter Anvin
says "There will be ROM at the top of the 32-bit address space; it's a fact
of the architecture, and on at least older systems it was common to have a
shadow 1 MiB below."
Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Bjorn Helgaas [Thu, 16 Dec 2010 17:38:56 +0000 (10:38 -0700)]
x86: avoid E820 regions when allocating address space
When we allocate address space, e.g., to assign it to a PCI device, don't
allocate anything mentioned in the BIOS E820 memory map.
On recent machines (2008 and newer), we assign PCI resources from the
windows described by the ACPI PCI host bridge _CRS. On many Dell
machines, these windows overlap some E820 reserved areas, e.g.,
If we put devices at 0xbff00000, they don't work, probably because
that's really RAM, not I/O memory. This patch prevents that by removing
the 0xbfe4dc00-0xbfffffff area from the "available" resource.
I'm not very happy with this solution because Windows solves the problem
differently (it seems to ignore E820 reserved areas and it allocates
top-down instead of bottom-up; details at comment 45 of the bugzilla
below). That means we're vulnerable to BIOS defects that Windows would not
trip over. For example, if BIOS described a device in ACPI but didn't
mention it in E820, Windows would work fine but Linux would fail.
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228 Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Bjorn Helgaas [Thu, 16 Dec 2010 17:38:51 +0000 (10:38 -0700)]
x86: avoid low BIOS area when allocating address space
This implements arch_remove_reservations() so allocate_resource() can
avoid any arch-specific reserved areas. This currently just avoids the
BIOS area (the first 1MB), but could be used for E820 reserved areas if
that turns out to be necessary.
We previously avoided this area in pcibios_align_resource(). This patch
moves the test from that PCI-specific path to a generic path, so *all*
resource allocations will avoid this area.
Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Bjorn Helgaas [Thu, 16 Dec 2010 17:38:46 +0000 (10:38 -0700)]
resources: add arch hook for preventing allocation in reserved areas
This adds arch_remove_reservations(), which an arch can implement if it
needs to protect part of the address space from allocation.
Sometimes that can be done by just putting a region in the resource tree,
but there are cases where that doesn't work well. For example, x86 BIOS
E820 reservations are not related to devices, so they may overlap part of,
all of, or more than a device resource, so they may not end up at the
correct spot in the resource tree.
Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We're going back to considering bus resources in the order we found
them (in _CRS order, when we're using _CRS), so we don't need to
define any ordering.
Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Linus Torvalds [Fri, 17 Dec 2010 17:45:25 +0000 (09:45 -0800)]
Merge branch 'for_linus' of git://github.com/at91linux/linux-2.6-at91
* 'for_linus' of git://github.com/at91linux/linux-2.6-at91:
at91: Refactor Stamp9G20 and PControl G20 board file
at91: Fix uhpck clock rate in upll case
Linus Torvalds [Fri, 17 Dec 2010 17:32:39 +0000 (09:32 -0800)]
Merge branch 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Fix preemption counter leak in kvm_timer_init()
KVM: enlarge number of possible CPUID leaves
KVM: SVM: Do not report xsave in supported cpuid
KVM: Fix OSXSAVE after migration
Linus Torvalds [Fri, 17 Dec 2010 17:31:59 +0000 (09:31 -0800)]
Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM / Runtime: Fix pm_runtime_suspended()
PM / Hibernate: Restore old swap signature to avoid user space breakage
PM / Hibernate: Fix PM_POST_* notification with user-space suspend
Linus Torvalds [Fri, 17 Dec 2010 17:27:30 +0000 (09:27 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Fix conflict of Mic Boot controls
ALSA: HDA: Enable subwoofer on Asus G73Jw
ALSA: HDA: Fix auto-mute on Lenovo Edge 14
ASoC: Fix bias power down of non-DAPM codec
ASoC: WM8580: Fix R8 initial value
ASoC: fix deemphasis control in wm8904/55/60 codecs
Takashi Iwai [Fri, 17 Dec 2010 14:23:41 +0000 (15:23 +0100)]
ALSA: hda - Fix conflict of Mic Boot controls
Due to the recent change for multiple mics assignment, we need to handle
the index of each Mic Boost control respectively. Otherwise the driver
gets the control element conflicts, and gives the unsable state.
at91: Refactor Stamp9G20 and PControl G20 board file
As PControl G20 is a carrier board for the Stamp9G20 SoM, some code can
be shared. Therefore board-stamp9g20.c is refactored to allow reusing the
SoM initialization and board-pcontrol-g20.c is modified to use it.
Signed-off-by: Christian Glindkamp <christian.glindkamp@taskit.de> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Ryan Mallon [Wed, 2 Jun 2010 00:55:36 +0000 (12:55 +1200)]
at91: Fix uhpck clock rate in upll case
The uhpck clock should be divided from the utmi clock, not its parent
(main). This change is mostly cosmetic as the uhpck rate value is not
used anywhere except for the debugfs clock output.
Signed-off-by: Ryan Mallon <ryan@bluewatersys.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Linus Torvalds [Thu, 16 Dec 2010 23:45:49 +0000 (15:45 -0800)]
Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
fanotify: fill in the metadata_len field on struct fanotify_event_metadata
fanotify: split version into version and metadata_len
fanotify: Dont try to open a file descriptor for the overflow event
fanotify: Introduce FAN_NOFD
fanotify: do not leak user reference on allocation failure
inotify: stop kernel memory leak on file creation failure
fanotify: on group destroy allow all waiters to bypass permission check
fanotify: Dont allow a mask of 0 if setting or removing a mark
fanotify: correct broken ref counting in case adding a mark failed
fanotify: if set by user unset FMODE_NONOTIFY before fsnotify_perm() is called
fanotify: remove packed from access response message
fanotify: deny permissions when no event was sent
Linus Torvalds [Thu, 16 Dec 2010 23:45:25 +0000 (15:45 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (28 commits)
MIPS: Add a CONFIG_FORCE_MAX_ZONEORDER Kconfig option.
MIPS: LD/SD o32 macro GAS fix update
MIPS: Alchemy: fix build with SERIAL_8250=n
MIPS: Rename mips_dma_cache_sync back to dma_cache_sync
MIPS: MT: Fix typo in comment.
SSB: Fix nvram_get on BCM47xx platform
MIPS: BCM47xx: Swap serial console if ttyS1 was specified.
MIPS: BCM47xx: Use sscanf for parsing mac address
MIPS: BCM47xx: Fill values for b43 into SSB sprom
MIPS: BCM47xx: Do not read config from CFE
MIPS: FDT size is a be32
MIPS: Fix CP0 COUNTER clockevent race
MIPS: Fix regression on BCM4710 processor detection
MIPS: JZ4740: Fix pcm device name
MIPS: Separate two consecutive loads in memset.S
MIPS: Send proper signal and siginfo on FP emulator faults.
MIPS: AR7: Fix loops per jiffies on TNETD7200 devices
MIPS: AR7: Fix double ar7_gpio_init declaration
MIPS: Rework GENERIC_HARDIRQS Kconfig.
MIPS: Alchemy: Add return value check for strict_strtoul()
...
Neil Horman [Wed, 8 Dec 2010 14:47:48 +0000 (09:47 -0500)]
PCI: Update MCP55 quirk to not affect non HyperTransport variants
I wrote this quirk awhile ago to properly setup MCP55 chips on hypertransport
busses so that interrupts reached whatever cpu happend to boot the kdump kernel.
while that works well, it was recently shown to me that a a non-hypertransport
variant of the MCP55 exists, and on those system the register that this quirk
manipulates causes hangs if you write to it. Since the quirk was only meant to
handle errors found on MCP55 chips that have a HT interface, this patch adds a
filter to make sure the chip is an HT capable before making the needed register
adjustment. This lets the broken MCP55s work with kdump while not breaking the
non-HT variants.
that fixes a problem with the LD/SD macro currently implemented by GAS for
the o32 ABI in an inconsistent way. This is best illustrated with a
simple program, which I'm copying here from the message above for easier
reference:
The GAS fix makes the macro behave in a consistent way and pairs of LW/SW
instructions to be output as appropriate regardless of the size of the
offset associated with the address used. The machine instruction is still
available, but to reach it macros have to be disabled first. This has a
side effect of requiring the use of a machine-addressable memory operand.
As some platforms require 64-bit operations for accesses to some I/O
registers LD/SD instructions are used in a couple of places in Linux
regardless of the ABI selected. Here's a fix for some pieces of code
affected I've been able to track down. The fix should be backwards
compatible with all supported binutils releases in existence and can be
used as a reference for any other places or off-tree code. The use of the
"R" constraint guarantees a machine-addressable operand.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/1680/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Manuel Lauss [Mon, 25 Oct 2010 16:44:11 +0000 (18:44 +0200)]
MIPS: Alchemy: fix build with SERIAL_8250=n
In commit 7d172bfe ("Alchemy: Add UART PM methods") I introduced
platform PM methods which call a function of the 8250 driver;
this patch works around link failures when the kernel is built
without 8250 support.
Signed-off-by: Manuel Lauss <manuel.lauss@googlemail.com>
To: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/1737/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Hauke Mehrtens [Sat, 27 Nov 2010 18:26:32 +0000 (19:26 +0100)]
SSB: Fix nvram_get on BCM47xx platform
The nvram_get function was never in the mainline kernel, it only existed in
an external OpenWrt patch. Use nvram_getenv function, which is in mainline
and use an include instead of an extra function declaration. et0macaddr
contains the mac address in text from like 00:11:22:33:44:55. We have to
parse it before adding it into macaddr.
nvram_parse_macaddr will be merged into asm/mach-bcm47xx/nvram.h through
the MIPS git tree and will be available soon. It will not build now without
nvram_parse_macaddr, but it hasn't before either.
Hauke Mehrtens [Sat, 27 Nov 2010 16:46:01 +0000 (17:46 +0100)]
MIPS: BCM47xx: Swap serial console if ttyS1 was specified.
Some devices like the Netgear WGT634U are using ttyS1 for default console
output. We should switch to that console if it was given in the kernel_args
parameters.
Hauke Mehrtens [Sat, 27 Nov 2010 16:45:58 +0000 (17:45 +0100)]
MIPS: BCM47xx: Do not read config from CFE
The config options read out here are not stored in CFE but only in NVRAM on
the devices. Remove reading from CFE and only access the NVRAM. Reading out
CFE does not harm but is useless here.
Kevin Cernekee [Tue, 23 Nov 2010 18:26:44 +0000 (10:26 -0800)]
MIPS: Fix CP0 COUNTER clockevent race
Consider the following test case:
write_c0_compare(read_c0_count());
Even if the counter doesn't increment during execution, this might not
generate an interrupt until the counter wraps around. The CPU may
perform the comparison each time CP0 COUNT increments, not when CP0
COMPARE is written.
If mips_next_event() is called with a very small delta, and CP0 COUNT
increments during the calculation of "cnt += delta", it is possible
that CP0 COMPARE will be written with the current value of CP0 COUNT.
If this is detected, the function should return -ETIME, to indicate
that the interrupt might not have actually gotten scheduled.
Tony Wu [Wed, 10 Nov 2010 13:48:15 +0000 (21:48 +0800)]
MIPS: Separate two consecutive loads in memset.S
partial_fixup is used in noreorder block.
Separating two consecutive loads can save one cycle on processors with
GPR intrelock and can fix load-use on processors that need a load delay slot.
Also do so for fwd_fixup.
[Ralf: Only R2000/R3000 class processors are lacking the the load-user
interlock and even some of those got it retrofitted. With R2000/R3000
being fairly uncommon these days the impact of this bug should be minor.]
Signed-off-by: Tony Wu <tung7970@gmail.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/1768/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Thu, 21 Oct 2010 23:32:26 +0000 (16:32 -0700)]
MIPS: Send proper signal and siginfo on FP emulator faults.
We were unconditionally sending SIGBUS with an empty siginfo on FP
emulator faults. This differs from what happens when real floating
point hardware would get a fault.
For most faults we need to send SIGSEGV with the faulting address
filled in in the struct siginfo.
Reported-by: Camm Maguire <camm@maguirefamily.org> Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org Cc: Camm Maguire <camm@maguirefamily.org>
Patchwork: https://patchwork.linux-mips.org/patch/1727/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Florian Fainelli [Sun, 31 Oct 2010 22:49:58 +0000 (23:49 +0100)]
MIPS: AR7: Fix loops per jiffies on TNETD7200 devices
TNETD7200 run their CPU clock faster than the default CPU clock we assume.
In order to have the correct loops per jiffies settings, initialize clocks right
before setting mips_hpt_frequency. As a side effect, we can no longer use
msleep in clocks.c which requires other parts of the kernel to be initialized,
so replace these with mdelay.
Yoichi Yuasa [Mon, 8 Nov 2010 08:23:52 +0000 (17:23 +0900)]
MIPS: Alchemy: Add return value check for strict_strtoul()
arch/mips/alchemy/devboards/prom.c: In function 'prom_init':
arch/mips/alchemy/devboards/prom.c:60: error: ignoring return value of
'strict_strtoul', declared with attribute warn_unused_result
Wu Zhangjin [Mon, 8 Nov 2010 13:25:24 +0000 (21:25 +0800)]
MIPS: Loongson: Add return value check for strict_strtoul()
cc1: warnings being treated as errors
arch/mips/loongson/common/env.c: In function 'prom_init_env':
arch/mips/loongson/common/env.c:49: error: ignoring return value of 'strict_strtol', declared with attribute warn_unused_result
arch/mips/loongson/common/env.c:50: error: ignoring return value of 'strict_strtol', declared with attribute warn_unused_result
arch/mips/loongson/common/env.c:51: error: ignoring return value of 'strict_strtol', declared with attribute warn_unused_result
arch/mips/loongson/common/env.c:52: error: ignoring return value of 'strict_strtol', declared with attribute warn_unused_result
Jesper Juhl [Sat, 30 Oct 2010 16:37:16 +0000 (18:37 +0200)]
MIPS: VPE loader: Check vmalloc return value in vpe_open
The return value of the vmalloc() call in arch/mips/kernel/vpe.c::vpe_open()
is not checked, so we potentially store a null pointer in v->pbuffer. Add
a check for a null return and then return -ENOMEM in that case.
[Ralf: The check added by Jesper's original patch is where it logically
should be. Adding it eleminated the need for the checks in a few other
places, so I removed them. There still is a zillion of other things that
need to be fixed in this file / API.]
David Daney [Tue, 2 Nov 2010 00:43:07 +0000 (17:43 -0700)]
MIPS: Don't clobber personality high bits.
The high bits of current->personality carry settings that we don't want to
clobber on each exec. Only clobber them if the lower bits that indicate
either PER_LINUX or PER_LINUX32 are invalid.
The clobbering prevents us from using useful bits like ADDR_NO_RANDOMIZE.
WARNING: arch/mips/built-in.o(.text+0xc): Section mismatch in reference from the
function jz4740_init_cmdline() to the variable .init.data:arcs_cmdline
While were at it, make jz4740_init_cmdline static as well.
Al Viro [Thu, 4 Nov 2010 11:13:59 +0000 (11:13 +0000)]
MIPS: Don't stomp on caller's ->regs[2] in copy_thread()
We never needed that (->regs[2] is overwritten on return from syscall paths
with return value of syscall, so storing it there early made no sense) and
with new restart logics since d27240bf7e61d2656de18e158ec910a902030847 it
has become really bad - we lose the original syscall number before the
place where we decide that we might need a syscall restart.
Note that for child we do need the assignment to regs[2] - it won't go
through the normal return from syscall path.
[Ralf: Issue found and reported by Lluís; initial investigations by me;
bug finally found and patch by Al; testing by me and Lluís.]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Lluís Batlle i Rossell <viriketo@gmail.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Thu, 16 Dec 2010 16:33:44 +0000 (08:33 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: define separate EVIOCGKEYCODE_V2/EVIOCSKEYCODE_V2
Input: wacom - add another Bamboo Pen ID (0xd4)
There are some situations (e.g. in __pm_generic_call()), where
pm_runtime_suspended() is used to decide whether or not to execute
a device's (system) ->suspend() callback. The callback is not
executed if pm_runtime_suspended() returns true, but it does so
for devices that don't even support runtime PM, because the
power.disable_depth device field is ignored by it. This leads to
problems (i.e. devices are not suspened when they should), so rework
pm_runtime_suspended() so that it returns false if the device's
power.disable_depth field is different from zero.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org
PM / Hibernate: Restore old swap signature to avoid user space breakage
Commit 3624eb0 (PM / Hibernate: Modify signature used to mark swap)
attempted to modify hibernate signature used to mark swap partitions
containing hibernation images, so that old kernels don't try to
handle compressed images. However, this change broke resume from
hibernation on Fedora 14 that apparently doesn't pass the resume=
argument to the kernel and tries to trigger resume from early user
space. This doesn't work, because the signature is now different,
so the old signature has to be restored to avoid the problem.
Reported-by: Dr. David Alan Gilbert <linux@treblig.org> Reported-by: Zhang Rui <rui.zhang@intel.com> Reported-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Takashi Iwai [Thu, 9 Dec 2010 23:16:39 +0000 (00:16 +0100)]
PM / Hibernate: Fix PM_POST_* notification with user-space suspend
The user-space hibernation sends a wrong notification after the image
restoration because of thinko for the file flag check. RDONLY
corresponds to hibernation and WRONLY to restoration, confusingly.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org
Rusty Russell [Thu, 16 Dec 2010 23:03:15 +0000 (17:03 -0600)]
lguest: populate initial_page_table
Two x86 patches broke lguest:
1) v2.6.35-492-g72d7c3b, which changed x86 to use the memblock allocator.
In lguest, the host places linear page tables at the top of mem, which
used to be enough to get us up to the swapper_pg_dir page tables. With
the first patch, the direct mapping tables used that memory:
Before: kernel direct mapping tables up to 4000000 @ 7000-1a000
After: kernel direct mapping tables up to 4000000 @ 3fed000-4000000
I initially fixed this by lying about the amount of memory we had, so
the kernel wouldn't blatt the lguest boot pagetables (yuk!), but then...
This was initialized in a part of head_32.S which isn't executed by
lguest; it is then copied into swapper_pg_dir. So we have to initialize
it; and anyway we switch to it before we blatt the old tables, so that
fixes the previous damage as well.
For the moment, I cut & pasted the code into lguest's boot code, but
next merge window I will merge them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: x86@kernel.org
Rusty Russell [Thu, 16 Dec 2010 23:03:15 +0000 (17:03 -0600)]
lguest: restore boot speed
lguest is dumb and drops *all* the pagetables for set_pte (which is
only used for kernel mapping manipulation, so it's OK without highmem).
But it's used a lot in boot, too. As a guest optimization, we
suppressed this flushing until the first page switch. Now we have
initial_page_table, that happens much earlier, so extend the heuristic
to wait until we switch to something other than the swapper_pg_dir or
initial_page_table.
As measured on my laptop under kvm, this dropped the time-to-mount-root
from 48 seconds to 4.3 seconds.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Thu, 16 Dec 2010 23:03:13 +0000 (17:03 -0600)]
lguest: fix crash lguest_time_init
fe25c7fc2e "x86: lguest: Convert to new irq chip functions" converted
enable_lguest_irq() to take a struct irq_data *, but didn't fix the one
internal caller.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To: x86@kernel.org
Ryusuke Konishi [Thu, 16 Dec 2010 00:57:57 +0000 (09:57 +0900)]
nilfs2: fix regression of garbage collection ioctl
On 2.6.37-rc1, garbage collection ioctl of nilfs was broken due to the
commit 263d90cefc7d82a0 ("nilfs2: remove own inode hash used for GC"),
and leading to filesystem corruption.
The patch doesn't queue gc-inodes for log writer if they are reused
through the vfs inode cache. Here, gc-inode is the inode which
buffers blocks to be relocated on GC. That patch queues gc-inodes in
nilfs_init_gcinode() function, but this function is not called when
they don't have I_NEW flag. Thus, some of live blocks are wrongly
overrode without being moved to new logs.
This resolves the problem by moving the gc-inode queueing to an outer
function to ensure it's done right.
Linus Torvalds [Wed, 15 Dec 2010 20:41:17 +0000 (12:41 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix typo which broke '..' detection in ext4_find_entry()
ext4: Turn off multiple page-io submission by default