Jiri Kosina [Wed, 30 Jan 2008 12:31:07 +0000 (13:31 +0100)]
x86: PIE executable randomization
main executable of (specially compiled/linked -pie/-fpie) ET_DYN binaries
onto a random address (in cases in which mmap() is allowed to perform a
randomization).
The code has been extraced from Ingo's exec-shield patch
http://people.redhat.com/mingo/exec-shield/
This patches proceeds with the integration of msr.h, making
the code unified, instead of having a version for each architecture.
We stick with the native_* functions, and then paravirt comes for free.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch uses the _ASM_ALIGN and _ASM_PTR macros
to make the fixups in native_read/write_msr_safe look the same
for x86_64 and i386. Besides using this macros, we also have to
take the explicit instruction suffixes out. It's okay
because all this instructions uses registers, and can be sized by
them.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patche changes the native_write_msr() and friends interface
to explicitly take 2 32-bit registers instead of a 64-bit value.
The change will ease the merge with 64-bit code. As the 64-bit
value will be passed as two registers anyway in i386,
the PVOP_CALL interface has to account for that and use low/high parameters
It would force the x86_64 version to be different.
The change does not make i386 generated code less efficient. As said above,
it would get the values from two registers anyway.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
the rdpmc instruction gets a counter argument in rcx. However,
the i386 version was ignoring it. To make both x86_64 and i386 versions
the same, as well as to comply with the instruction semantics, this
parameter is added in the i386 version
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Targetting paravirt, this patch introduces native_read_tscp, in
place of rdtscp() macro. When in a paravirt guest, this will
involve a function call, and thus, cannot be done in the vdso area.
These users then have to call the native version directly
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
cpuid is not very different between i386 and x86_64.
We move away the x86_64 version from msr.h, and
unify them at processor.h, where they belong.
cpuid() paravirt then comes for free.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch splits get_cycles_sync() into __get_cycles_sync(),
and the rdtscll part. Paravirt guests cannot issue rdtscl directly,
as it involves a function call in vdso area.
So, using the __get_cycles_sync() base, we introduce vget_cycles_sync,
which then calls the native version of rdtscll. Ideally, however, a guest
should define its own clocksource, together with a vread function
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
x86: allow sched clock to be overridden by paravirt
This patch turns the sched_clock into native_sched_clock.
sched clock becomes a weak symbol, which can then give its
place to a paravirt definition.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The functions under #ifdef CONFIG_SMP in msr.h are the same
for both x86_64 and i386, and this patches removes one of them,
putting them in a single location
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
x86: wipe out traditional opt from x86_64 Makefile
Among other things, using -traditional as a gcc option stops us from
using macro token pasting, which is a feature we heavily rely on.
There was still a use of -traditional in arch/x86/kernel/Makefile_64,
which this patch removes.
I don't see any problems building kernels in my x86_64 box without
-traditional.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Joerg Roedel [Wed, 30 Jan 2008 12:31:02 +0000 (13:31 +0100)]
x86: use __PAGE_KERNEL* instead of _KERNPG_TABLE
This minor cleanup replaces _KERNPG_TABLE with the __PAGE_KERNEL* for 2MB PTEs
in the 64-bit memory initialization code. The __PAGE_KERNEL* defines are more
appropriate for PTEs.
Len Brown [Wed, 30 Jan 2008 12:31:02 +0000 (13:31 +0100)]
x86: 32-bit IOAPIC: de-fang IRQ compression
commit c434b7a6aedfe428ad17cd61b21b125a7b7a29ce
(x86: avoid wasting IRQs for PCI devices)
created a concept of "IRQ compression" on i386
to conserve IRQ numbers on systems with many
sparsely populated IO APICs.
The same scheme was also added to x86_64,
but later removed when x86_64 recieved an IRQ over-haul
that made it unnecessary -- including per-CPU
IRQ vectors that greatly increased the IRQ capacity
on the machine.
i386 has not received the analogous over-haul,
and thus a previous attempt to delete IRQ compression
from i386 was rejected on the theory that there may
exist machines that actually need it. The fact is
that the author of IRQ compression patch was unable
to confirm the actual existence of such a system.
As a result, all i386 kernels with IOAPIC support
pay the following:
1. confusion
IRQ compression re-names the traditional IOAPIC
pin numbers (aka ACPI GSI's) into sequential IRQ #s:
This makes /proc/interrupts look different
depending on system configuration and device probe order.
It is also different than the x86_64 kernel running
on the exact same system. As a result, programmers
get confused when comparing systems.
2. complexity
The IRQ code in Linux is already overly complex,
and IRQ compression makes it worse. There have
already been two bug workarounds related to IRQ
compression -- the IRQ0 timer workaround and
the VIA PCI IRQ workaround.
3. size
All i386 kernels with IOAPIC support contain an int[4096] --
a 4 page array to contain the renamed IRQs.
So while the irq compression code on i386 should really
be deleted -- even before merging the x86_64 irq-overhaul,
this patch simply disables it on all high volume systems
to avoid problems #1 and #2 on most all i386 systems.
A large system with pin numbers >=64 will still have compression
to conserve limited IRQ numbers for sparse IOAPICS. However,
the vast majority of the planet, those with only pin numbers < 64
will use an identity GSI -> IRQ mapping.
Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Roland McGrath [Wed, 30 Jan 2008 12:31:01 +0000 (13:31 +0100)]
x86: x86 ia32 ptrace arch merge
This moves the sys32_ptrace code into arch/x86/kernel/ptrace.c,
verbatim except for a few hard-coded sizes replaced with sizeof.
Here this code can use the shared local functions in this file.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:31:01 +0000 (13:31 +0100)]
x86: x86 ia32 ptrace getreg/putreg merge
This reimplements the 64-bit IA32-emulation register access
functions in arch/x86/kernel/ptrace.c, where they can share
some guts with the native access functions directly.
These functions are not used yet, but this paves the way to move
IA32 ptrace support into this file to share its local functions.
[akpm@linuxfoundation.org: Build fix]
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:31:01 +0000 (13:31 +0100)]
x86: x86 ptrace getreg/putreg merge
This merges 64-bit support into the low-level register access
functions in arch/x86/kernel/ptrace.c, paving the way to share
this file between 32-bit and 64-bit builds.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:59 +0000 (13:30 +0100)]
x86: x86-32 thread_struct.debugreg
This replaces the debugreg[7] member of thread_struct with individual
members debugreg0, etc. This saves two words for the dummies 4 and 5,
and harmonizes the code between 32 and 64.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:58 +0000 (13:30 +0100)]
x86: x86-64 ia32 ptrace get/putreg32 current task
This generalizes the getreg32 and putreg32 functions so they can be used on
the current task, as well as on a task stopped in TASK_TRACED and switched
off. This lays the groundwork to share this code for all kinds of
user-mode machine state access, not just ptrace.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:58 +0000 (13:30 +0100)]
x86: x86-32 ptrace get/putreg current task
This generalizes the getreg and putreg functions so they can be used on the
current task, as well as on a task stopped in TASK_TRACED and switched off.
This lays the groundwork to share this code for all kinds of user-mode
machine state access, not just ptrace.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:58 +0000 (13:30 +0100)]
x86: x86-64 ptrace get/putreg current task
This generalizes the getreg and putreg functions so they can be used on the
current task, as well as on a task stopped in TASK_TRACED and switched off.
This lays the groundwork to share this code for all kinds of user-mode
machine state access, not just ptrace.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
H. Peter Anvin [Wed, 30 Jan 2008 12:30:56 +0000 (13:30 +0100)]
x86: use generic register names in struct sigcontext
Switch struct sigcontext (defined in <asm/sigcontext*.h>) to using
register names withut e- or r-prefixes for both 32- and 64-bit x86.
This is intended as a preliminary step in unifying this code between
architectures.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
H. Peter Anvin [Wed, 30 Jan 2008 12:30:56 +0000 (13:30 +0100)]
x86: use generic register names in struct user_regs_struct
Switch struct user_regs_struct (defined in <asm/user.h>, which is no
longer exported to userspace) to using register names without e- or
r-prefixes for both 32 and 64 bit x86. This is intended as a
preliminary step in unifying this code between architectures.
Also, be a bit more strict in truncating 32-bit "extended" segment
register values to 16 bits.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
H. Peter Anvin [Wed, 30 Jan 2008 12:30:56 +0000 (13:30 +0100)]
x86: rename the struct pt_regs members for 32/64-bit consistency
We have a lot of code which differs only by the naming of specific
members of structures that contain registers. In order to enable
additional unifications, this patch drops the e- or r- size prefix
from the register names in struct pt_regs, and drops the x- prefixes
for segment registers on the 32-bit side.
This patch also performs the equivalent renames in some additional
places that might be candidates for unification in the future.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The patch to suppress bitops-related warnings added a pile of ugly
casts. Many of these were related to the management of x86 CPU
capabilities. Clean these up by adding specific set/clear_cpu_cap
macros, and use them consistently.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This unifies the set/clear/test bit functions of asm/bitops.h.
I have not attempted to merge the bit-finding functions, since they
rely on the machine word size and can't be easily restructured to work
generically without a lot of #ifdefs. In particular, the 64-bit code
can assume the presence of conditional move instructions, whereas
32-bit needs to be more careful.
The inline assembly for the bit operations has been changed to remove
explicit sizing hints on the instructions, so the assembler will pick
the appropriate instruction forms depending on the architecture and
the context.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:55 +0000 (13:30 +0100)]
x86: PTRACE_SINGLEBLOCK
This adds the PTRACE_SINGLEBLOCK request on x86, matching the ia64 feature.
The implementation comes from the generic ptrace code and relies on the
low-level machine support provided by arch_has_block_step() et al.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:54 +0000 (13:30 +0100)]
x86: debugctlmsr kprobes
This adjusts the x86 kprobes implementation to cope with per-thread
MSR_IA32_DEBUGCTLMSR being set for user mode. I haven't delved deep
enough into the kprobes code to be really sure this covers all the
cases where the user-mode BTF setting needs to be cleared or restored.
It looks about right to me.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:54 +0000 (13:30 +0100)]
x86: debugctlmsr kconfig
This adds the (internal) Kconfig macro CONFIG_X86_DEBUGCTLMSR,
to be defined when configuring to support only hardware that
definitely supports MSR_IA32_DEBUGCTLMSR with the BTF flag.
The Intel documentation says "P6 family" and later processors all have it.
I think the Kconfig dependencies are right to have it set for those and
unset for others (i.e., when 586 and earlier are supported).
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:53 +0000 (13:30 +0100)]
ptrace: generic PTRACE_SINGLEBLOCK
This makes ptrace_request handle PTRACE_SINGLEBLOCK along with
PTRACE_CONT et al. The new generic code makes use of the
arch_has_block_step macro and generic entry points on machines
that define them.
[ mingo@elte.hu: bugfix ]
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:53 +0000 (13:30 +0100)]
ptrace: arch_has_block_step
This defines the new macro arch_has_block_step() in linux/ptrace.h, a
default for when asm/ptrace.h does not define it. This is the analog
of arch_has_single_step() for step-until-branch on machines that have
it. It declares the new user_enable_block_step function, which goes
with the existing user_enable_single_step and user_disable_single_step.
This is not used yet, but paves the way to harmonize on this interface
for the arch-specific calls on all machines.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:52 +0000 (13:30 +0100)]
x86: x86-32 ptrace debugreg cleanup
This cleans up the 32-bit ptrace code to separate the guts of the
debug register access from the implementation of PTRACE_PEEKUSR and
PTRACE_POKEUSR. The new functions ptrace_[gs]et_debugreg match the
new 64-bit entry points for parity, but they don't need to be global.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:52 +0000 (13:30 +0100)]
x86: x86-64 ptrace debugreg cleanup
This cleans up the 64-bit ptrace code to separate the guts of the
debug register access from the implementation of PTRACE_PEEKUSR and
PTRACE_POKEUSR. The new functions ptrace_[gs]et_debugreg are made
global so that the ia32 code can later be changed to call them too.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:51 +0000 (13:30 +0100)]
powerpc: arch_has_single_step
This defines the new standard arch_has_single_step macro. It makes the
existing set_single_step and clear_single_step entry points global, and
renames them to the new standard names user_enable_single_step and
user_disable_single_step, respectively.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:51 +0000 (13:30 +0100)]
ptrace: generic resume
This makes ptrace_request handle all the ptrace requests that wake
up the traced task. These do low-level ptrace implementation magic
that is not arch-specific and should be kept out of arch code. The
implementations on each arch usually do the same thing. The new
generic code makes use of the arch_has_single_step macro and generic
entry points to handle PTRACE_SINGLESTEP.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:50 +0000 (13:30 +0100)]
x86 single_step: TIF_FORCED_TF
This changes the single-step support to use a new thread_info flag
TIF_FORCED_TF instead of the PT_DTRACE flag in task_struct.ptrace.
This keeps arch implementation uses out of this non-arch field.
This changes the ptrace access to eflags to mask TF and maintain
the TIF_FORCED_TF flag directly if userland sets TF, instead of
relying on ptrace_signal_deliver. The 64-bit and 32-bit kernels
are harmonized on this same behavior. The ptrace_signal_deliver
approach works now, but this change makes the low-level register
access code reliable when called from different contexts than a
ptrace stop, which will be possible in the future.
The 64-bit do_debug exception handler is also changed not to clear TF
from user-mode registers. This matches the 32-bit kernel's behavior.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:50 +0000 (13:30 +0100)]
x86: single_step: share code
This removes the single-step code from ptrace_32.c and uses the step.c code
shared with the 64-bit kernel. The two versions of the code were nearly
identical already, so the shared code has only a couple of simple #ifdef's.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:50 +0000 (13:30 +0100)]
x86: single_step moved
This moves the single-step support code from ptrace_64.c into a new file
step.c, verbatim. This paves the way for consolidating this code between
64-bit and 32-bit versions.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:48 +0000 (13:30 +0100)]
x86: arch_has_single_step
This defines the new standard arch_has_single_step macro. It makes the
existing set_singlestep and clear_singlestep entry points global, and
renames them to the new standard names user_enable_single_step and
user_disable_single_step, respectively.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:48 +0000 (13:30 +0100)]
x86: segment selector macros
This copies into asm-x86/segment_64.h some macros from asm-x86/segment_32.h
for dissecting segment selectors. This lets other code use these macros
uniformly on 32/64-bit rather than duplicating the constants elsewhere.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:47 +0000 (13:30 +0100)]
ptrace: arch_has_single_step
This defines the new macro arch_has_single_step() in linux/ptrace.h, a
default for when asm/ptrace.h does not define it. It declares the new
user_enable_single_step and user_disable_single_step functions.
This is not used yet, but paves the way to harmonize on this interface
for the arch-specific calls on all machines.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
x86: fall back on interrupt disable in cmpxchg8b on 80386 and 80486
Actually, on 386, cmpxchg and cmpxchg_local fall back on
cmpxchg_386_u8/16/32: it disables interruptions around non atomic
updates to mimic the cmpxchg behavior.
The comment:
/* Poor man's cmpxchg for 386. Unsuitable for SMP */
already present in cmpxchg_386_u32 tells much about how this cmpxchg
implementation should not be used in a SMP context. However, the cmpxchg_local
can perfectly use this fallback, since it only needs to be atomic wrt the local
cpu.
This patch adds a cmpxchg_486_u64 and uses it as a fallback for cmpxchg64
and cmpxchg64_local on 80386 and 80486.
Q:
but why is it called cmpxchg_486 when the other functions are called
A:
Because the standard cmpxchg is missing only on 386, but cmpxchg8b is
missing both on 386 and 486.
Citing Intel's Instruction set reference:
cmpxchg:
This instruction is not supported on Intel processors earlier than the
Intel486 processors.
cmpxchg8b:
This instruction encoding is not supported on Intel processors earlier
than the Pentium processors.
Q:
What's the reason to have cmpxchg64_local on 32 bit architectures?
Without that need all this would just be a few simple defines.
A:
cmpxchg64_local on 32 bits architectures takes unsigned long long
parameters, but cmpxchg_local only takes longs. Since we have cmpxchg8b
to execute a 8 byte cmpxchg atomically on pentium and +, it makes sense
to provide a flavor of cmpxchg and cmpxchg_local using this instruction.
Also, for 32 bits architectures lacking the 64 bits atomic cmpxchg, it
makes sense _not_ to define cmpxchg64 while cmpxchg could still be
available.
Moreover, the fallback for cmpxchg8b on i386 for 386 and 486 is a
However, cmpxchg64_local will be emulated by disabling interrupts on all
architectures where it is not supported atomically.
Therefore, we *could* turn cmpxchg64_local into a cmpxchg_local, but it
would make the 386/486 fallbacks ugly, make its design different from
cmpxchg/cmpxchg64 (which really depends on atomic operations and cannot
be emulated) and require the __cmpxchg_local to be expressed as a macro
rather than an inline function so the parameters would not be fixed to
unsigned long long in every case.
So I think cmpxchg64_local makes sense there, but I am open to
suggestions.
Q:
Are there any callers?
A:
I am actually using it in LTTng in my timestamping code. I use it to
work around CPUs with asynchronous TSCs. I need to update 64 bits
values atomically on this 32 bits architecture.
Changelog:
- Ran though checkpatch.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ralf Baechle [Wed, 30 Jan 2008 12:30:47 +0000 (13:30 +0100)]
mips, x86: optimize the i8259 code a bit
The timer code always calls the clock_event_device set_net_event and
set_mode methods with interrupts disabled, so no need to use
spin_lock_irqsave / spin_unlock_irqrestore for those.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by:Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
x86: 64-bit, make sparsemem vmemmap the only memory model
Use sparsemem as the only memory model for UP, SMP and NUMA. Measurements
indicate that DISCONTIGMEM has a higher overhead than sparsemem. And
FLATMEMs benefits are minimal. So I think its best to simply standardize
on sparsemem.
Results of page allocator tests (test can be had via git from slab git
tree branch tests)
Measurements in cycle counts. 1000 allocations were performed and then the
average cycle count was calculated.
Sanitize user specified e820 memory ranges, using the same logic that is
applied to the values returned by the BIOS. This ensures consistent
handling regardless of the source of the memory mappings.
Allows overriding portions of the memory map without specifying one in
it's entirety (memmap=exactmap).
E.g. marking a range of bad RAM as reserved with memmap=48M$528M
Roland McGrath [Wed, 30 Jan 2008 12:30:46 +0000 (13:30 +0100)]
x86: TLS cleanup
This consolidates the four different places that implemented the same
encoding magic for the GDT-slot 32-bit TLS support. The old tls32.c was
renamed and is now only slightly modified to be the shared implementation.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Zachary Amsden <zach@vmware.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:45 +0000 (13:30 +0100)]
x86: desc_empty
This replaces the desc_empty macro with an inline. It now handles
easily any of the four different types used between 32/64 code to
refer to these 8 bytes. It's identical in both asm-x86/processor_64.h
and asm-x86/processor_32.h, so if these files ever get merged this
function can be in the common code.
This also removes the desc_equal macro because nothing uses it.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:45 +0000 (13:30 +0100)]
x86: ptrace fs/gs_base
The fs_base and gs_base fields are available in user_regs_struct.
But reading these via ptrace (PTRACE_GETREGS or PTRACE_PEEKUSR) does
not give a reliably useful value. The thread_struct fields are 0
when do_arch_prctl decided to use a GDT slot instead of MSR_FS_BASE,
which it does for a value under 1<<32.
This changes ptrace access to fs_base and gs_base to work like
PTRACE_ARCH_PRCTL does. That is, it reads the base address that
user-mode memory access using the fs/gs instruction prefixes will
use, regardless of how it's being implemented in the kernel. The
MSR vs GDT is an implementation detail that is pretty much hidden
from userland in the actual using, and there is no reason that
ptrace should give the internal implementation picture rather than
the user-mode semantic picture. In the case of setting the value,
this can implicitly change the fsindex/gsindex value (also
separately in user_regs_struct), which is what happens when the
thread calls arch_prctl itself. In a PTRACE_SETREGS, the fs_base
change will come after the fsindex change due to the order of the
struct, and so a change the debugger made to fs_base will have the
effect intended, another part of the user_regs_struct will now
differ when read back from what the debugger wrote.
This makes PTRACE_ARCH_PRCTL obsolete. We could consider declaring
it deprecated and removing it one day, though there is no hurry.
For the foreseeable future, debuggers have to assume an old kernel
that does not report reliable fs_base/gs_base values in user_regs_struct
and stick to PTRACE_ARCH_PRCTL anyway.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:45 +0000 (13:30 +0100)]
x86: use get_desc_base
This changes a couple of places to use the get_desc_base function.
They were duplicating the same calculation with different equivalent code.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:44 +0000 (13:30 +0100)]
x86: get_desc_base
This defines the get_desc_base function in asm-x86/desc_64.h to match the
one in desc_32.h. If these two files ever get merged together, this
function could be the same in both.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:44 +0000 (13:30 +0100)]
x86 vDSO: canonicalize sysenter .eh_frame
Some assembler versions automagically optimize .eh_frame contents,
changing their size. The CFI in sysenter.S was not using optimal
formatting, so it would be changed by newer/smarter assemblers.
This ran afoul of the wired constant for padding out the other vDSO
images to match its size. This changes the original hand-coded
source to use the optimal format encoding for its operations. That
leaves nothing more for a fancy assembler to do, so the sizes will
match the wired-in expected size regardless of the assembler version.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:44 +0000 (13:30 +0100)]
x86 vDSO: reorder vdso32 code
This reorders the code in the 32-bit vDSO images to put the signal
trampolines first and __kernel_vsyscall after them. The order does
not matter to userland, it just uses what AT_SYSINFO or e_entry
says. Since the signal trampolines are the same size in both
versions of the vDSO, putting them first is the simplest way to get
the addresses to line up. This makes it work to use a more compact
layout for the vDSO.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:43 +0000 (13:30 +0100)]
x86 vDSO: consolidate vdso32
This makes x86_64's ia32 emulation support share the sources used in the
32-bit kernel for the 32-bit vDSO and much of its setup code.
The 32-bit vDSO mapping now behaves the same on x86_64 as on native 32-bit.
The abi.syscall32 sysctl on x86_64 now takes the same values that
vm.vdso_enabled takes on the 32-bit kernel. That is, 1 means a randomized
vDSO location, 2 means the fixed old address. The CONFIG_COMPAT_VDSO
option is now available to make this the default setting, the same meaning
it has for the 32-bit kernel. (This does not affect the 64-bit vDSO.)
The argument vdso32=[012] can be used on both 32-bit and 64-bit kernels to
set this paramter at boot time. The vdso=[012] argument still does this
same thing on the 32-bit kernel.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:43 +0000 (13:30 +0100)]
x86 vDSO: ia32 vdso32-syscall build
This puts the syscall version of the 32-bit vDSO in arch/x86/vdso/vdso32/
for 64-bit IA32 support. This is not used yet, but it paves the way for
consolidating the 32-bit vDSO source and build logic all in one place.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:43 +0000 (13:30 +0100)]
x86 vDSO: ia32 sysenter_return
This changes the 64-bit kernel's support for the 32-bit sysenter
instruction to use stored fields rather than constants for the
user-mode return address, as the 32-bit kernel does. This adds a
sysenter_return field to struct thread_info, as 32-bit has. There
is no observable effect from this yet. It makes the assembly code
independent of the 32-bit vDSO mapping address, paving the way for
making the vDSO address vary as it does on the 32-bit kernel.
[ akpm@linux-foundation.org: build fix on !CONFIG_IA32_EMULATION ]
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:42 +0000 (13:30 +0100)]
x86 vDSO: vdso32 setup
This moves arch/x86/kernel/sysenter_32.c to arch/x86/vdso/vdso32-setup.c,
keeping all the code relating only to vDSO magic in the vdso/ subdirectory.
This is a pure renaming, but it paves the way to consolidating the code for
dealing with 32-bit vDSOs across CONFIG_X86_32 and CONFIG_IA32_EMULATION.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:42 +0000 (13:30 +0100)]
x86 vDSO: vdso32 build
This builds the 32-bit vDSO images in the arch/x86/vdso subdirectory.
Nothing uses the images yet, but this paves the way for consolidating
the vDSO build logic all in one place. The new images use a linker
script sharing the layout parts from vdso-layout.lds.S with the 64-bit
vDSO. A new vdso32-syms.lds is generated in the style of vdso-syms.lds.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Wed, 30 Jan 2008 12:30:42 +0000 (13:30 +0100)]
x86 vDSO: arch/x86/vdso/vdso32
This moves the i386 vDSO sources into arch/x86/vdso/vdso32/, a
new directory. This patch is a pure renaming, but paves the way
for consolidating the vDSO build logic.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>