As it stands, irq_exit() may or may not be called with
irqs disabled, depending on __ARCH_IRQ_EXIT_IRQS_DISABLED
that the arch can define.
It makes tick_nohz_irq_exit() unsafe. For example two
interrupts can race in tick_nohz_stop_sched_tick(): the inner
most one computes the expiring time on top of the timer list,
then it's interrupted right before reprogramming the
clock. The new interrupt enqueues a new timer list timer,
it reprogram the clock to take it into account and it exits.
The CPUs resumes the inner most interrupt and performs the clock
reprogramming without considering the new timer list timer.
Linus Torvalds [Wed, 20 Feb 2013 04:12:57 +0000 (20:12 -0800)]
Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 UV3 support update from Ingo Molnar:
"Support for the SGI Ultraviolet System 3 (UV3) platform - the upcoming
third major iteration and upscaling of the SGI UV supercomputing
platform."
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, uv, uv3: Trim MMR register definitions after code changes for SGI UV3
x86, uv, uv3: Check current gru hub support for SGI UV3
x86, uv, uv3: Update Time Support for SGI UV3
x86, uv, uv3: Update x2apic Support for SGI UV3
x86, uv, uv3: Update Hub Info for SGI UV3
x86, uv, uv3: Update ACPI Check to include SGI UV3
x86, uv, uv3: Update MMR register definitions for SGI Ultraviolet System 3 (UV3)
Linus Torvalds [Wed, 20 Feb 2013 04:11:07 +0000 (20:11 -0800)]
Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform changes from Ingo Molnar:
- Support for the Technologic Systems TS-5500 platform, by Vivien
Didelot
- Improved NUMA support on AMD systems:
Add support for federated systems where multiple memory controllers
can exist and see each other over multiple PCI domains. This
basically means that AMD node ids can be more than 8 now and the code
handling this is taught to incorporate PCI domain into those IDs.
- Support for the Goldfish virtual Android emulator, by Jun Nakajima,
Intel, Google, et al.
- Misc fixlets.
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Add TS-5500 platform support
x86/srat: Simplify memory affinity init error handling
x86/apb/timer: Remove unnecessary "if"
goldfish: platform device for x86
amd64_edac: Fix type usage in NB IDs and memory ranges
amd64_edac: Fix PCI function lookup
x86, AMD, NB: Use u16 for northbridge IDs in amd_get_nb_id
x86, AMD, NB: Add multi-domain support
Linus Torvalds [Wed, 20 Feb 2013 04:10:21 +0000 (20:10 -0800)]
Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/hyperv changes from Ingo Molnar:
"The biggest change is support for Windows 8's improved hypervisor
interrupt model on the Linux Hyper-V guest subsystem code side.
Smallish fixes otherwise."
* 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, hyperv: HYPERV depends on X86_LOCAL_APIC
X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
X86: Add a check to catch Xen emulation of Hyper-V
x86: Hyper-V: register clocksource only if its advertised
Linus Torvalds [Wed, 20 Feb 2013 04:09:48 +0000 (20:09 -0800)]
Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/debug changes from Ingo Molnar:
"Two init annotations and a built-in memtest speedup"
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/memtest: Shorten time for tests
x86: Convert a few mistaken __cpuinit annotations to __init
x86/EFI: Properly init-annotate BGRT code
Linus Torvalds [Wed, 20 Feb 2013 03:13:49 +0000 (19:13 -0800)]
Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanup patches from Ingo Molnar:
"Misc smaller cleanups"
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: ptrace.c only needs export.h and not the full module.h
x86, apb_timer: remove unused variable percpu_timer
um: don't compare a pointer to 0
arch/x86/platform/uv: use ARRAY_SIZE where possible
Linus Torvalds [Wed, 20 Feb 2013 03:12:03 +0000 (19:12 -0800)]
Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull two x86 kernel build changes from Ingo Molnar:
"The first change modifies how 'make oldconfig' works on cross-bitness
situations on x86. It was felt the new behavior of preserving the
bitness of the .config is more logical. This is a leftover of the
merge.
The second change eliminates a Perl warning. (There's another, more
complete fix resulting of this warning fix, which second fix in flight
to you via the kbuild tree, which will remove the timeconst.pl script
altogether.)"
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timeconst.pl: Eliminate Perl warning
x86: Default to ARCH=x86 to avoid overriding CONFIG_64BIT
Linus Torvalds [Wed, 20 Feb 2013 03:11:10 +0000 (19:11 -0800)]
Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 bootup changes from Ingo Molnar:
"Deal with bootloaders which fail to initialize unknown fields in
boot_params to zero, by sanitizing boot params passed in.
This unbreaks versions of kexec-utils. Other bootloaders do not
appear to show sensitivity to this change, but it's a possibility for
breakage nevertheless."
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, boot: Sanitize boot_params if not zeroed on creation
Linus Torvalds [Wed, 20 Feb 2013 03:09:42 +0000 (19:09 -0800)]
Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/asm changes from Ingo Molnar:
"The biggest change (by line count) is the unification of the XOR code
and then the introduction of an additional SSE based XOR assembly
method.
The other bigger change is the head_32.S rework/cleanup by Borislav
Petkov.
Last but not least there's the usual laundry list of small but
dangerous (and hopefully perfectly tested) changes to subtle low level
x86 code, plus cleanups."
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, head_32: Give the 6 label a real name
x86, head_32: Remove second CPUID detection from default_entry
x86: Detect CPUID support early at boot
x86, head_32: Remove i386 pieces
x86: Require MOVBE feature in cpuid when we use it
x86: Enable ARCH_USE_BUILTIN_BSWAP
x86/xor: Add alternative SSE implementation only prefetching once per 64-byte line
x86/xor: Unify SSE-base xor-block routines
x86: Fix a typo
x86/mm: Fix the argument passed to sync_global_pgds()
x86/mm: Convert update_mmu_cache() and update_mmu_cache_pmd() to functions
ix86: Tighten asmlinkage_protect() constraints
Linus Torvalds [Wed, 20 Feb 2013 03:07:27 +0000 (19:07 -0800)]
Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic changes from Ingo Molnar:
"Main changes:
- Multiple MSI support added to the APIC, PCI and AHCI code - acked
by all relevant maintainers, by Alexander Gordeev.
The advantage is that multiple AHCI ports can have multiple MSI
irqs assigned, and can thus spread to multiple CPUs.
[ Drivers can make use of this new facility via the
pci_enable_msi_block_auto() method ]
- x86 IOAPIC code from interrupt remapping cleanups from Joerg
Roedel:
These patches move all interrupt remapping specific checks out of
the x86 core code and replaces the respective call-sites with
function pointers. As a result the interrupt remapping code is
better abstraced from x86 core interrupt handling code.
- Various smaller improvements, fixes and cleanups."
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess
x86, kvm: Fix intialization warnings in kvm.c
x86, irq: Move irq_remapped out of x86 core code
x86, io_apic: Introduce eoi_ioapic_pin call-back
x86, msi: Introduce x86_msi.compose_msi_msg call-back
x86, irq: Introduce setup_remapped_irq()
x86, irq: Move irq_remapped() check into free_remapped_irq
x86, io-apic: Remove !irq_remapped() check from __target_IO_APIC_irq()
x86, io-apic: Move CONFIG_IRQ_REMAP code out of x86 core
x86, irq: Add data structure to keep AMD specific irq remapping information
x86, irq: Move irq_remapping_enabled declaration to iommu code
x86, io_apic: Remove irq_remapping_enabled check in setup_timer_IRQ0_pin
x86, io_apic: Move irq_remapping_enabled checks out of check_timer()
x86, io_apic: Convert setup_ioapic_entry to function pointer
x86, io_apic: Introduce set_affinity function pointer
x86, msi: Use IRQ remapping specific setup_msi_irqs routine
x86, hpet: Introduce x86_msi_ops.setup_hpet_msi
x86, io_apic: Introduce x86_io_apic_ops.print_entries for debugging
x86, io_apic: Introduce x86_io_apic_ops.disable()
x86, apic: Mask IO-APIC and PIC unconditionally on LAPIC resume
...
Linus Torvalds [Wed, 20 Feb 2013 03:05:45 +0000 (19:05 -0800)]
Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer changes from Ingo Molnar:
"Main changes:
- ntp: Add CONFIG_RTC_SYSTOHC: a generic RTC driver facility
complementing the existing CONFIG_RTC_HCTOSYS, which uses NTP to
keep the hardware clock updated.
- posix-timers: Fix clock_adjtime to always return timex data on
success. This is changing the ABI, but no breakage was expected
and found - caution is warranted nevertheless.
- clockevents: refactor timer broadcast handling to be more generic
and less duplicated with matching architecture code (mostly ARM
motivated.)
- various fixes and cleanups"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers/x86/hpet: Use HPET_COUNTER to specify the hpet counter in vread_hpet()
posix-cpu-timers: Fix nanosleep task_struct leak
clockevents: Fix generic broadcast for FEAT_C3STOP
time, Fix setting of hardware clock in NTP code
hrtimer: Prevent hrtimer_enqueue_reprogram race
clockevents: Add generic timer broadcast function
clockevents: Add generic timer broadcast receiver
timekeeping: Switch HAS_PERSISTENT_CLOCK to ALWAYS_USE_PERSISTENT_CLOCK
x86/time/rtc: Don't print extended CMOS year when reading RTC
x86: Select HAS_PERSISTENT_CLOCK on x86
timekeeping: Add CONFIG_HAS_PERSISTENT_CLOCK option
rtc: Skip the suspend/resume handling if persistent clock exist
timekeeping: Add persistent_clock_exist flag
posix-timers: Fix clock_adjtime to always return timex data on success
Round the calculated scale factor in set_cyc2ns_scale()
NTP: Add a CONFIG_RTC_SYSTOHC configuration
MAINTAINERS: Update John Stultz's email
time: create __getnstimeofday for WARNless calls
Linus Torvalds [Wed, 20 Feb 2013 03:04:55 +0000 (19:04 -0800)]
Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull preparatory smp/hotplug patches from Ingo Molnar:
"Some early preparatory changes for the WIP hotplug rework by Thomas
Gleixner."
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stop_machine: Use smpboot threads
stop_machine: Store task reference in a separate per cpu variable
smpboot: Allow selfparking per cpu threads
Linus Torvalds [Wed, 20 Feb 2013 02:19:48 +0000 (18:19 -0800)]
Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:
"Main changes:
- scheduler side full-dynticks (user-space execution is undisturbed
and receives no timer IRQs) preparation changes that convert the
cputime accounting code to be full-dynticks ready, from Frederic
Weisbecker.
- Initial sched.h split-up changes, by Clark Williams
- select_idle_sibling() performance improvement by Mike Galbraith:
" 1 tbench pair (worst case) in a 10 core + SMT package:
pre 15.22 MB/sec 1 procs
post 252.01 MB/sec 1 procs "
- sched_rr_get_interval() ABI fix/change. We think this detail is not
used by apps (so it's not an ABI in practice), but lets keep it
under observation.
- misc RT scheduling cleanups, optimizations"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
sched/rt: Add <linux/sched/rt.h> header to <linux/init_task.h>
cputime: Remove irqsave from seqlock readers
sched, powerpc: Fix sched.h split-up build failure
cputime: Restore CPU_ACCOUNTING config defaults for PPC64
sched/rt: Move rt specific bits into new header file
sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice
sched: Move sched.h sysctl bits into separate header
sched: Fix signedness bug in yield_to()
sched: Fix select_idle_sibling() bouncing cow syndrome
sched/rt: Further simplify pick_rt_task()
sched/rt: Do not account zero delta_exec in update_curr_rt()
cputime: Safely read cputime of full dynticks CPUs
kvm: Prepare to add generic guest entry/exit callbacks
cputime: Use accessors to read task cputime stats
cputime: Allow dynamic switch between tick/virtual based cputime accounting
cputime: Generic on-demand virtual cputime accounting
cputime: Move default nsecs_to_cputime() to jiffies based cputime file
cputime: Librarize per nsecs resolution cputime definitions
cputime: Avoid multiplication overflow on utime scaling
context_tracking: Export context state for generic vtime
...
Fix up conflict in kernel/context_tracking.c due to comment additions.
Linus Torvalds [Wed, 20 Feb 2013 01:49:41 +0000 (17:49 -0800)]
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf changes from Ingo Molnar:
"There are lots of improvements, the biggest changes are:
Main kernel side changes:
- Improve uprobes performance by adding 'pre-filtering' support, by
Oleg Nesterov.
- Make some POWER7 events available in sysfs, equivalent to what was
done on x86, from Sukadev Bhattiprolu.
- tracing updates by Steve Rostedt - mostly misc fixes and smaller
improvements.
- Use perf/event tracing to report PCI Express advanced errors, by
Tony Luck.
- Enable northbridge performance counters on AMD family 15h, by Jacob
Shin.
- This tracing commit:
tracing: Remove the extra 4 bytes of padding in events
changes the ABI. All involved parties (PowerTop in particular)
seem to agree that it's safe to do now with the introduction of
libtraceevent, but the devil is in the details ...
Main tooling side changes:
- Add 'event group view', from Namyung Kim:
To use it, 'perf record' should group events when recording. And
then perf report parses the saved group relation from file header
and prints them together if --group option is provided. You can
use the 'perf evlist' command to see event group information:
$ perf record -e '{ref-cycles,cycles}' noploop 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.385 MB perf.data (~16807 samples) ]
$ perf evlist --group
{ref-cycles,cycles}
With this example, default perf report will show you each event
separately.
You can use --group option to enable event group view:
As you can see the Overhead column now contains both of ref-cycles
and cycles and header line shows group information also - 'anon
group { ref-cycles, cycles }'. The output is sorted by period of
group leader first.
- Initial GTK+ annotate browser, from Namhyung Kim.
- Add option for runtime switching perf data file in perf report,
just press 's' and a menu with the valid files found in the current
directory will be presented, from Feng Tang.
- Add support to display whole group data for raw columns, from Jiri
Olsa.
- Add per processor socket count aggregation in perf stat, from
Stephane Eranian.
- Add interval printing in 'perf stat', from Stephane Eranian.
- 'perf test' improvements
- Add support for wildcards in tracepoint system name, from Jiri
Olsa.
- Add anonymous huge page recognition, from Joshua Zhu.
- perf build-id cache now can show DSOs present in a perf.data file
that are not in the cache, to integrate with build-id servers being
put in place by organizations such as Fedora.
- perf top now shares more of the evsel config/creation routines with
'record', paving the way for further integration like 'top'
snapshots, etc.
- perf top now supports DWARF callchains.
- Fix mmap limitations on 32-bit, fix from David Miller.
- 'perf bench numa mem' NUMA performance measurement suite
- ... and lots of fixes, performance improvements, cleanups and other
improvements I failed to list - see the shortlog and git log for
details."
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (270 commits)
perf/x86/amd: Enable northbridge performance counters on AMD family 15h
perf/hwbp: Fix cleanup in case of kzalloc failure
perf tools: Fix build with bison 2.3 and older.
perf tools: Limit unwind support to x86 archs
perf annotate: Make it to be able to skip unannotatable symbols
perf gtk/annotate: Fail early if it can't annotate
perf gtk/annotate: Show source lines with gray color
perf gtk/annotate: Support multiple event annotation
perf ui/gtk: Implement basic GTK2 annotation browser
perf annotate: Fix warning message on a missing vmlinux
perf buildid-cache: Add --update option
uprobes/perf: Avoid uprobe_apply() whenever possible
uprobes/perf: Teach trace_uprobe/perf code to use UPROBE_HANDLER_REMOVE
uprobes/perf: Teach trace_uprobe/perf code to pre-filter
uprobes/perf: Teach trace_uprobe/perf code to track the active perf_event's
uprobes: Introduce uprobe_apply()
perf: Introduce hw_perf_event->tp_target and ->tp_list
uprobes/perf: Always increment trace_uprobe->nhit
uprobes/tracing: Kill uprobe_trace_consumer, embed uprobe_consumer into trace_uprobe
uprobes/tracing: Introduce is_trace_uprobe_enabled()
...
Linus Torvalds [Wed, 20 Feb 2013 01:47:58 +0000 (17:47 -0800)]
Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq core changes from Ingo Molnar:
"The biggest changes are the IRQ-work and printk changes from Frederic
Weisbecker, which prepare the code for 'full dynticks' (the ability to
stop or slow down the periodic tick arbitrarily, not just in idle time
as today):
- Don't stop tick with irq works pending. This fix is generally
useful and concerns archs that can't raise self IPIs.
- Flush irq works before CPU offlining.
- Introduce "lazy" irq works that can wait for the next tick to be
executed, unless it's stopped.
- Implement klogd wake up using irq work. This removes the ad-hoc
printk_tick()/printk_needs_cpu() hooks and make it working even in
dynticks mode.
- Cleanups and fixes."
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Export enable/disable_percpu_irq()
arch Kconfig: Remove references to IRQ_PER_CPU
irq_work: Remove return value from the irq_work_queue() function
genirq: Avoid deadlock in spurious handling
printk: Wake up klogd using irq_work
irq_work: Make self-IPIs optable
irq_work: Warn if there's still work on cpu_down
irq_work: Flush work on CPU_DYING
irq_work: Don't stop the tick with pending works
nohz: Add API to check tick state
irq_work: Remove CONFIG_HAVE_IRQ_WORK
irq_work: Fix racy check on work pending flag
irq_work: Fix racy IRQ_WORK_BUSY flag setting
Linus Torvalds [Wed, 20 Feb 2013 01:45:20 +0000 (17:45 -0800)]
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:
"SRCU changes:
- These include debugging aids, updates that move towards the goal of
permitting srcu_read_lock() and srcu_read_unlock() to be used from
idle and offline CPUs, and a few small fixes.
Changes to rcutorture and to RCU documentation:
- Posted to LKML at https://lkml.org/lkml/2013/1/26/188
Enhancements to uniprocessor handling in tiny RCU:
- Posted to LKML at https://lkml.org/lkml/2013/1/27/2
Tag RCU callbacks with grace-period number to simplify callback
advancement:
- Posted to LKML at https://lkml.org/lkml/2013/1/26/203
Miscellaneous fixes:
- Posted to LKML at https://lkml.org/lkml/2013/1/26/204"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock()
srcu: Update synchronize_srcu_expedited()'s comments
srcu: Update synchronize_srcu()'s comments
srcu: Remove checks preventing idle CPUs from calling srcu_read_lock()
srcu: Remove checks preventing offline CPUs from calling srcu_read_lock()
srcu: Simple cleanup for cleanup_srcu_struct()
srcu: Add might_sleep() annotation to synchronize_srcu()
srcu: Simplify __srcu_read_unlock() via this_cpu_dec()
rcu: Allow rcutorture to be built at low optimization levels
rcu: Make rcutorture's shuffler task shuffle recently added tasks
rcu: Allow TREE_PREEMPT_RCU on UP systems
rcu: Provide RCU CPU stall warnings for tiny RCU
context_tracking: Add comments on interface and internals
rcu: Remove obsolete Kconfig option from comment
rcu: Remove unused code originally used for context tracking
rcu: Consolidate debugging Kconfig options
rcu: Correct 'optimized' to 'optimize' in header comment
rcu: Trace callback acceleration
rcu: Tag callback lists with corresponding grace-period number
rcutorture: Don't compare ptr with 0
...
Ingo Molnar [Sat, 16 Feb 2013 08:46:48 +0000 (09:46 +0100)]
sched/rt: Add <linux/sched/rt.h> header to <linux/init_task.h>
IA64 relied on it through sched.h inclusion:
arch/ia64/kernel/init_task.c:38:11: error: 'MAX_PRIO' undeclared here (not in a function)
arch/ia64/kernel/init_task.c:38:11: error: 'RR_TIMESLICE' undeclared here (not in a function)
Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-xaan1twswggedMR0airtpjui@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Mon, 18 Feb 2013 17:58:02 +0000 (09:58 -0800)]
mm: fix pageblock bitmap allocation
Commit c060f943d092 ("mm: use aligned zone start for pfn_to_bitidx
calculation") fixed out calculation of the index into the pageblock
bitmap when a !SPARSEMEM zome was not aligned to pageblock_nr_pages.
However, the _allocation_ of that bitmap had never taken this alignment
requirement into accout, so depending on the exact size and alignment of
the zone, the use of that index could then access past the allocation,
resulting in some very subtle memory corruption.
This was reported (and bisected) by Ingo Molnar: one of his random
config builds would hang with certain very specific kernel command line
options.
In the meantime, commit c060f943d092 has been marked for stable, so this
fix needs to be back-ported to the stable kernels that backported the
commit to use the right alignment.
Jacob Shin [Wed, 6 Feb 2013 17:26:29 +0000 (11:26 -0600)]
perf/x86/amd: Enable northbridge performance counters on AMD family 15h
On AMD family 15h processors, there are 4 new performance
counters (in addition to 6 core performance counters) that can
be used for counting northbridge events (i.e. DRAM accesses).
Their bit fields are almost identical to the core performance
counters. However, unlike the core performance counters, these
MSRs are shared between multiple cores (that share the same
northbridge).
We will reuse the same code path as existing family 10h
northbridge event constraints handler logic to enforce
this sharing.
Signed-off-by: Jacob Shin <jacob.shin@amd.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jacob Shin <jacob.shin@amd.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360171589-6381-7-git-send-email-jacob.shin@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Fri, 15 Feb 2013 20:12:55 +0000 (12:12 -0800)]
Merge tag 'stable/for-linus-3.8-rc7-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull xen fixes from Konrad Rzeszutek Wilk:
"Two fixes:
- A simple bug-fix for redundant NULL check.
- CVE-2013-0228/XSA-42: x86/xen: don't assume %ds is usable in
xen_iret for 32-bit PVOPS
and two reverts:
- Revert the PVonHVM kexec. The patch introduces a regression with
older hypervisor stacks, such as Xen 4.1."
* tag 'stable/for-linus-3.8-rc7-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
Revert "xen PVonHVM: use E820_Reserved area for shared_info"
Revert "xen/PVonHVM: fix compile warning in init_hvm_pv_info"
xen: remove redundant NULL check before unregister_and_remove_pcpu().
x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.
Revert "[media] dvb_frontend: return -ENOTTY for unimplement IOCTL"
As reported by Klaus Schmidinger:
"In VDR I use an ioctl() call with FE_READ_UNCORRECTED_BLOCKS on a
device (using stb0899). After this call I check 'errno' for
EOPNOTSUPP to determine whether this device supports this call. This
used to work just fine, until a few months ago I noticed that my
devices using stb0899 didn't display their signal quality in VDR's OSD
any more. After further investigation I found that
ioctl(FE_READ_UNCORRECTED_BLOCKS) no longer returns EOPNOTSUPP, but
rather ENOTTY. And since I stop getting the signal quality in case
any unknown errno value appears, this broke my signal quality query
function."
Linus Torvalds [Fri, 15 Feb 2013 20:04:57 +0000 (12:04 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull one more x86 fix from Peter Anvin:
"Sigh. One more patch in the "please don't brick my Samsung" series"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: Clear EFI_RUNTIME_SERVICES rather than EFI_BOOT by "noefi" boot parameter
Linus Torvalds [Fri, 15 Feb 2013 20:04:08 +0000 (12:04 -0800)]
Merge tag '3.8-pci-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"This is another fix for v3.8. It fixes an oops that happens when a
Thunderbolt adapter is unplugged (remove device, poll for PME events
on no-longer-existing device, oops)."
* tag '3.8-pci-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/PM: Clean up PME state when removing a device
Linus Torvalds [Fri, 15 Feb 2013 20:03:09 +0000 (12:03 -0800)]
Merge tag 'omapdss-for-3.8-rc8' of git://gitorious.org/linux-omap-dss2/linux
Pull omapdss fixes from Tomi Valkeinen:
"It'd be great if these two late fixes would still make it into 3.8.
The other one fixes ARM kernel compilation when using 'allyesconfig',
and the other makes DPI displays function again on OMAP3630 boards:
- Fix ARM compilation with "allyesconfig" (omapdrm: fix the
dependency to omapdss)
- fix DPI displays on OMAP3630 (OMAPDSS: add FEAT_DPI_USES_VDDS_DSI
to omap3630_dss_feat_list)"
* tag 'omapdss-for-3.8-rc8' of git://gitorious.org/linux-omap-dss2/linux:
omapdrm: fix the dependency to omapdss
OMAPDSS: add FEAT_DPI_USES_VDDS_DSI to omap3630_dss_feat_list
Linus Torvalds [Fri, 15 Feb 2013 19:59:27 +0000 (11:59 -0800)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c maintainer info update from Wolfram Sang:
"Since my old email and repos are not working anymore, and this already
caused some confusion, I think a MAINTAINERS update for 3.8 is
helpful. So, people trying I2C with the new kernel can properly reach
me and find my repos."
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
MAINTAINERS: change my email and repos
The trinity fuzzer triggered a task_struct reference leak via
clock_nanosleep with CPU_TIMERs. do_cpu_nanosleep() calls
posic_cpu_timer_create(), but misses a corresponding
posix_cpu_timer_del() which leads to the task_struct reference leak.
Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20130215100810.GF4392@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We are doing this b/c on 32-bit PVonHVM with older hypervisors
(Xen 4.1) it ends up bothing up the start_info. This is bad b/c
we use it for the time keeping, and the timekeeping code loops
forever - as the version field never changes. Olaf says to
revert it, so lets do that.
Acked-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Paul Gortmaker [Thu, 14 Feb 2013 20:14:02 +0000 (15:14 -0500)]
x86: ptrace.c only needs export.h and not the full module.h
Commit cb57a2b4cff7edf2a4e32c0163200e9434807e0a ("x86-32: Export
kernel_stack_pointer() for modules") added an include of the
module.h header in conjunction with adding an EXPORT_SYMBOL_GPL
of kernel_stack_pointer.
But module.h should be avoided for simple exports, since it in turn
includes the world. Swap the module.h for export.h instead.
Vinson Lee [Wed, 13 Feb 2013 21:48:58 +0000 (13:48 -0800)]
perf tools: Fix build with bison 2.3 and older.
The %name-prefix "prefix" syntax is not available on bison 2.3 and
older. Substitute with the -p "prefix" command-line option for
compatibility with older versions of bison.
This patch fixes this build error with older versions of bison.
Namhyung Kim [Thu, 7 Feb 2013 09:02:14 +0000 (18:02 +0900)]
perf annotate: Make it to be able to skip unannotatable symbols
Add --skip-missing option for skipping symbols that cannot be used for
annotation. It's the case of kernel symbols that user doesn't have a
vmlinux image file.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360227734-375-8-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Thu, 7 Feb 2013 09:02:10 +0000 (18:02 +0900)]
perf gtk/annotate: Show source lines with gray color
In order to differentiate source lines from asm line, print them with
gray color. To do this, it needs to be escaped since sometimes it
contains "<" and/or ">" characters so that it should not be considered
as a markup tags. Use glib's g_markup_escape_text() for this.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360227734-375-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Thu, 7 Feb 2013 09:02:09 +0000 (18:02 +0900)]
perf gtk/annotate: Support multiple event annotation
Show multiple annotation result for each evsel. Each result represents
the most frquently sampled symbol/function for the evsel and it will be
shown in a tab window.
For this add a reference to main container (notebook) to the pgctx. At
the first call to annotate browser, hist_entry__find_annotations() will
setup a new browser, and next calls will add new tabs to the browser.
But it requires final perf_gtk__show_annotations() to start processing
GUI events.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360227734-375-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Thu, 7 Feb 2013 09:02:12 +0000 (18:02 +0900)]
perf annotate: Fix warning message on a missing vmlinux
When perf annotate runs with no vmlinux file it cannot annotate kernel
symbols because the kallsyms only provides symbol addresses. So it
recommends to run perf buildid-cache to install proper vmlinux image.
But running perf buildid-cache -av vmlinux as the message gives me a
following error:
$ perf buildid-cache -av /home/namhyung/build/kernel/vmlinux
Couldn't add v: No such file or directory
Since the -a option receives a parameter, 'v' should not be after the
option.
In addition -a option is not work for this case since the build-id cache
already has a kallsyms with same build-id so it'll fail with EEXIST.
Use recently added -u (--update) option for it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360227734-375-6-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thomas Gleixner [Thu, 31 Jan 2013 12:11:14 +0000 (12:11 +0000)]
stop_machine: Use smpboot threads
Use the smpboot thread infrastructure. Mark the stopper thread
selfparking and park it after it has finished the take_cpu_down()
work.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Arjan van de Veen <arjan@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Richard Weinberger <rw@linutronix.de> Cc: Magnus Damm <magnus.damm@gmail.com> Link: http://lkml.kernel.org/r/20130131120741.686315164@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 31 Jan 2013 12:11:13 +0000 (12:11 +0000)]
stop_machine: Store task reference in a separate per cpu variable
To allow the stopper thread being managed by the smpboot thread
infrastructure separate out the task storage from the stopper data
structure.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Arjan van de Veen <arjan@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Richard Weinberger <rw@linutronix.de> Cc: Magnus Damm <magnus.damm@gmail.com> Link: http://lkml.kernel.org/r/20130131120741.626690384@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 31 Jan 2013 12:11:12 +0000 (12:11 +0000)]
smpboot: Allow selfparking per cpu threads
The stop machine threads are still killed when a cpu goes offline. The
reason is that the thread is used to bring the cpu down, so it can't
be parked along with the other per cpu threads.
Allow a per cpu thread to be excluded from automatic parking, so it
can park itself once it's done
Add a create callback function as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Arjan van de Veen <arjan@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Richard Weinberger <rw@linutronix.de> Cc: Magnus Damm <magnus.damm@gmail.com> Link: http://lkml.kernel.org/r/20130131120741.553993267@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tomi Valkeinen [Thu, 7 Feb 2013 14:35:52 +0000 (16:35 +0200)]
omapdrm: fix the dependency to omapdss
omapdrm uses "select" in Kconfig to enable omapdss. This doesn't work
correctly, as "select" forces omapdss to be enabled in the config even
if it normally could not be enabled because of missing Kconfig
dependencies.
This causes a build break on ARM, when using allyesconfig:
drivers/video/omap2/dss/dss.c: In function 'dss_calc_clock_div':
drivers/video/omap2/dss/dss.c:572:20: error: 'CONFIG_OMAP2_DSS_MIN_FCK_PER_PCK' undeclared (first use in this function)
drivers/video/omap2/dss/dss.c:572:20: note: each undeclared identifier is reported only once for each function it appears in
Instead of using select, this patch changes omapdrm to use "depend
on".
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Satoru Takeuchi [Thu, 14 Feb 2013 00:12:52 +0000 (09:12 +0900)]
efi: Clear EFI_RUNTIME_SERVICES rather than EFI_BOOT by "noefi" boot parameter
There was a serious problem in samsung-laptop that its platform driver is
designed to run under BIOS and running under EFI can cause the machine to
become bricked or can cause Machine Check Exceptions.
Discussion about this problem:
https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
https://bugzilla.kernel.org/show_bug.cgi?id=47121
Unfortunately this problem comes back again if users specify "noefi" option.
This parameter clears EFI_BOOT and that driver continues to run even if running
under EFI. Refer to the document, this parameter should clear
EFI_RUNTIME_SERVICES instead.
Documentation/x86/x86_64/uefi.txt:
===============================================================================
...
- If some or all EFI runtime services don't work, you can try following
kernel command line parameters to turn off some or all EFI runtime
services.
noefi turn off all EFI runtime services
...
===============================================================================
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Link: http://lkml.kernel.org/r/511C2C04.2070108@jp.fujitsu.com Cc: Matt Fleming <matt.fleming@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Jan Beulich [Thu, 24 Jan 2013 13:11:10 +0000 (13:11 +0000)]
x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.
This fixes CVE-2013-0228 / XSA-42
Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
in 32bit PV guest can use to crash the > guest with the panic like this:
Petr says: "
I've analysed the bug and I think that xen_iret() cannot cope with
mangled DS, in this case zeroed out (null selector/descriptor) by either
xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
entry was invalidated by the reproducer. "
Jan took a look at the preliminary patch and came up a fix that solves
this problem:
"This code gets called after all registers other than those handled by
IRET got already restored, hence a null selector in %ds or a non-null
one that got loaded from a code or read-only data descriptor would
cause a kernel mode fault (with the potential of crashing the kernel
as a whole, if panic_on_oops is set)."
The way to fix this is to realize that the we can only relay on the
registers that IRET restores. The two that are guaranteed are the
%cs and %ss as they are always fixed GDT selectors. Also they are
inaccessible from user mode - so they cannot be altered. This is
the approach taken in this patch.
Another alternative option suggested by Jan would be to relay on
the subtle realization that using the %ebp or %esp relative references uses
the %ss segment. In which case we could switch from using %eax to %ebp and
would not need the %ss over-rides. That would also require one extra
instruction to compensate for the one place where the register is used
as scaled index. However Andrew pointed out that is too subtle and if
further work was to be done in this code-path it could escape folks attention
and lead to accidents.
Reviewed-by: Petr Matousek <pmatouse@redhat.com> Reported-by: Petr Matousek <pmatouse@redhat.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
David S. Miller [Wed, 13 Feb 2013 20:21:06 +0000 (12:21 -0800)]
sparc64: Fix get_user_pages_fast() wrt. THP.
Mostly mirrors the s390 logic, as unlike x86 we don't need the
SetPageReferenced() bits.
On sparc64 we also lack a user/privileged bit in the huge PMDs.
In order to make this work for THP and non-THP builds, some header
file adjustments were necessary. Namely, provide the PMD_HUGE_* bit
defines and the pmd_large() inline unconditionally rather than
protected by TRANSPARENT_HUGEPAGE.
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"This is primarily to get those r8169 reverts sorted, but other fixes
have accumulated meanwhile.
1) Revert two r8169 changes to fix suspend/resume for some users,
from Francois Romieu.
2) PCI dma mapping errors in atl1c are not checked for and this cause
hard crashes for some users, from Xiong Huang.
3) In 3.8.x we merged the removal of the EXPERIMENTAL dependency for
'dlm' but the same patch for 'sctp' got lost somewhere, resulting
in the potential for build errors since there are cross
dependencies. From Kees Cook.
4) SCTP's ipv6 socket route validation makes boolean tests
incorrectly, fix from Daniel Borkmann.
5) mac80211 does sizeof(ptr) instead of (sizeof(ptr) * nelem), from
Cong Ding.
6) arp_rcv() can crash on shared non-linear packets, from Eric
Dumazet.
7) Avoid crashes in macvtap by setting ->gso_type consistently in
ixgbe, qlcnic, and bnx2x drivers. From Michael S Tsirkin and
Alexander Duyck.
8) Trinity fuzzer spots infinite loop in __skb_recv_datagram(), fix
from Eric Dumazet.
9) STP protocol frames should use high packet priority, otherwise an
overloaded bridge can get stuck. From Stephen Hemminger.
10) The HTB packet scheduler was converted some time ago to store
internal timestamps in nanoseconds, but we don't convert back into
psched ticks for the user during dumps. Fix from Jiri Pirko.
11) mwl8k channel table doesn't set the .band field properly,
resulting in NULL pointer derefs. Fix from Jonas Gorski.
12) mac80211 doesn't accumulate channels properly during a scan so we
can downgrade heavily to a much less desirable connection speed.
Fix from Johannes Berg.
13) PHY probe failure in stmmac can result in resource leaks and
double MDIO registery later, from Giuseppe CAVALLARO.
14) Correct ipv6 checksumming in ip6t_NPT netfilter module, also fix
address prefix mangling, from YOSHIFUJI Hideaki."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
net, sctp: remove CONFIG_EXPERIMENTAL
net: sctp: sctp_v6_get_dst: fix boolean test in dst cache
batman-adv: Fix NULL pointer dereference in DAT hash collision avoidance
net/macb: fix race with RX interrupt while doing NAPI
atl1c: add error checking for pci_map_single functions
htb: fix values in opt dump
ixgbe: Only set gso_type to SKB_GSO_TCPV4 as RSC does not support IPv6
net: fix infinite loop in __skb_recv_datagram()
net: qmi_wwan: add Yota / Megafon M100-1 4g modem
mwl8k: fix band for supported channels
bridge: set priority of STP packets
mac80211: fix channel selection bug
arp: fix possible crash in arp_rcv()
bnx2x: set gso_type
qlcnic: set gso_type
ixgbe: fix gso type
stmmac: mdio register has to fail if the phy is not found
stmmac: fix macro used for debugging the xmit
Revert "r8169: enable internal ASPM and clock request settings".
Revert "r8169: enable ALDPS for power saving".
...
Linus Torvalds [Wed, 13 Feb 2013 20:19:49 +0000 (12:19 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
"One (hopefully) last batch of x86 fixes. You asked for the patch by
patch justifications, so here they are:
x86, MCE: Retract most UAPI exports
This one unexports from userspace a bunch of definitions which should
never have been exported. We really don't want to create an
accidental legacy here.
x86, doc: Add a bootloader ID for OVMF
This is a documentation-only patch, just recording the official
assignment of a boot loader ID.
x86: Do not leak kernel page mapping locations
Security: avoid making it needlessly easy for user space to probe the
kernel memory layout.
x86/mm: Check if PUD is large when validating a kernel address
Prevent failures using /proc/kcore when using 1G pages.
x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems
Works around a BIOS problem causing boot failures on affected hardware."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Check if PUD is large when validating a kernel address
x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems
x86, doc: Add a bootloader ID for OVMF
x86: Do not leak kernel page mapping locations
x86, MCE: Retract most UAPI exports
Wolfram Sang [Fri, 8 Feb 2013 19:54:40 +0000 (20:54 +0100)]
MAINTAINERS: change my email and repos
Change to my private email, change to my shiny new kernel.org repos,
and drop outdated entry from the former maintainer. Drop my PCA entry,
too, since it belongs to the I2C realm anyhow.
Signed-off-by: Wolfram Sang <wolfram@the-dreams.de>
Devices are added to pci_pme_list when drivers use pci_enable_wake()
or pci_wake_from_d3(), but they aren't removed from the list unless
the driver explicitly disables wakeup. Many drivers never disable
wakeup, so their devices remain on the list even after they are
removed, e.g., via hotplug. A subsequent PME poll will oops when
it tries to touch the device.
This patch disables PME# on a device before removing it, which removes
the device from pci_pme_list. This is safe even if the device never
had PME# enabled.
This oops can be triggered by unplugging a Thunderbolt ethernet adapter
on a Macbook Pro, as reported by Daniel below.
[bhelgaas: changelog]
Reference: http://lkml.kernel.org/r/CAMVG2svG21yiM1wkH4_2pen2n+cr2-Zv7TbH3Gj+8MwevZjDbw@mail.gmail.com Reported-and-tested-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org
Kees Cook [Wed, 13 Feb 2013 00:24:56 +0000 (16:24 -0800)]
net, sctp: remove CONFIG_EXPERIMENTAL
This config item has not carried much meaning for a while now and is
almost always enabled by default. As agreed during the Linux kernel
summit, remove it.
Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Tue, 12 Feb 2013 13:30:16 +0000 (13:30 +0000)]
net: sctp: sctp_v6_get_dst: fix boolean test in dst cache
We walk through the bind address list and try to get the best source
address for a given destination. However, currently, we take the
'continue' path of the loop when an entry is invalid (!laddr->valid)
*and* the entry state does not equal SCTP_ADDR_SRC (laddr->state !=
SCTP_ADDR_SRC).
Thus, still, invalid entries with SCTP_ADDR_SRC might not 'continue'
as well as valid entries with SCTP_ADDR_{NEW, SRC, DEL}, with a possible
false baddr and matchlen as a result, causing in worst case dst route
to be false or possibly NULL.
This test should actually be a '||' instead of '&&'. But lets fix it
and make this a bit easier to read by having the condition the same way
as similarly done in sctp_v4_get_dst.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pau Koning [Tue, 12 Feb 2013 00:18:45 +0000 (00:18 +0000)]
batman-adv: Fix NULL pointer dereference in DAT hash collision avoidance
An entry in DAT with the hashed position of 0 can cause a NULL pointer
dereference when the first entry is checked by batadv_choose_next_candidate.
This first candidate automatically has the max value of 0 and the max_orig_node
of NULL. Not checking max_orig_node for NULL in batadv_is_orig_node_eligible
will lead to a NULL pointer dereference when checking for the lowest address.
Nicolas Ferre [Tue, 12 Feb 2013 10:08:48 +0000 (11:08 +0100)]
net/macb: fix race with RX interrupt while doing NAPI
When interrupts are disabled, an RX condition can occur but
it is not reported when enabling interrupts again. We need to check
RSR and use napi_reschedule() if condition is met.
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Mon, 11 Feb 2013 14:44:40 +0000 (14:44 +0000)]
atl1c: add error checking for pci_map_single functions
it is reported that code hit DMA-API errors on 3.8-rc6+,
(see https://bugzilla.redhat.com/show_bug.cgi?id=908436, and
https://bugzilla.redhat.com/show_bug.cgi?id=908550)
this patch just adds error handler for
pci_map_single and skb_frag_dma_map.
Signed-off-by: xiong <xiong@qca.qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Investigation determined that the bug triggered when reading
system RAM at the 4G mark. On this system, that was the first
address using 1G pages for the virt->phys direct mapping so the
PUD is pointing to a physical address, not a PMD page.
The problem is that the page table walker in kern_addr_valid() is
not checking pud_large() and treats the physical address as if
it was a PMD. If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If
the data happens to look like a present PMD though, it will be
walked resulting in the oops above.
This patch adds the necessary pud_large() check.
Unfortunately the problem was not readily reproducible and now
they are running the backup program without accessing
/proc/kcore so the patch has not been validated but I think it
makes sense.
Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.coM> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: stable@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 13 Feb 2013 08:45:59 +0000 (09:45 +0100)]
Merge branch 'rcu/srcu' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull SRCU changes from Paul E. McKenney.
" These include debugging aids, updates that move towards the goal
of permitting srcu_read_lock() and srcu_read_unlock() to be used
from idle and offline CPUs, and a few small fixes. "
H. Peter Anvin [Wed, 13 Feb 2013 01:46:23 +0000 (17:46 -0800)]
x86, hyperv: HYPERV depends on X86_LOCAL_APIC
In order to compile in the special Hyper-V interrupt vector, we need
infrastructure in arch/x86/apic/apic.c. At least for now, simply
require CONFIG_X86_LOCAL_APIC in order to enable CONFIG_HYPERV.
X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
Starting with win8, vmbus interrupts can be delivered on any VCPU in the guest
and furthermore can be concurrently active on multiple VCPUs. Support this
interrupt delivery model by setting up a separate IDT entry for Hyper-V vmbus.
interrupts. I would like to thank Jan Beulich <JBeulich@suse.com> and
Thomas Gleixner <tglx@linutronix.de>, for their help.
In this version of the patch, based on the feedback, I have merged the IDT
vector for Xen and Hyper-V and made the necessary adjustments. Furhermore,
based on Jan's feedback I have added the necessary compilation switches.
X86: Add a check to catch Xen emulation of Hyper-V
Xen emulates Hyper-V to host enlightened Windows. Looks like this
emulation may be turned on by default even for Linux guests. Check and
fail Hyper-V detection if we are on Xen.
[ hpa: the problem here is that Xen doesn't emulate Hyper-V well
enough, and if the Xen support isn't compiled in, we end up stubling
over the Hyper-V emulation and try to activate it -- and it fails. ]
Olaf Hering [Mon, 4 Feb 2013 01:22:37 +0000 (17:22 -0800)]
x86: Hyper-V: register clocksource only if its advertised
Enable hyperv_clocksource only if its advertised as a feature.
XenServer 6 returns the signature which is checked in
ms_hyperv_platform(), but it does not offer all features. Currently the
clocksource is enabled unconditionally in ms_hyperv_init_platform(), and
the result is a hanging guest.
Hyper-V spec Bit 1 indicates the availability of Partition Reference
Counter. Register the clocksource only if this bit is set.
The guest in question prints this in dmesg:
[ 0.000000] Hypervisor detected: Microsoft HyperV
[ 0.000000] HyperV: features 0x70, hints 0x0
This bug can be reproduced easily be setting 'viridian=1' in a HVM domU
.cfg file. A workaround without this patch is to boot the HVM guest with
'clocksource=jiffies'.
Linus Torvalds [Wed, 13 Feb 2013 00:20:07 +0000 (16:20 -0800)]
Merge branch 'autofs-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux into akpm
Pull hp parisc automounter fix from Helge Deller:
"This unbreaks automounter support for the parisc architecture (and
probably aarch64 as well).""
* 'autofs-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
unbreak automounter support on 64-bit kernel with 32-bit userspace (v2)
Linus Torvalds [Wed, 13 Feb 2013 00:16:29 +0000 (16:16 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux into akpm
Pull s390 regression fix from Martin Schwidefsky:
"The recent fix for the s390 sched_clock() function uncovered yet
another bug in s390_next_ktime which causes an endless loop in KVM.
This regression should be fixed before v3.8.
I keep the fingers crossed that this is the last one for v3.8."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/timer: avoid overflow when programming clock comparator
Borislav Petkov [Mon, 11 Feb 2013 14:22:16 +0000 (15:22 +0100)]
x86: Detect CPUID support early at boot
We detect CPUID function support on each CPU and save it for later use,
obviating the need to play the toggle EFLAGS.ID game every time. C code
is looking at ->cpuid_level anyway.
Borislav Petkov [Mon, 11 Feb 2013 14:22:15 +0000 (15:22 +0100)]
x86, head_32: Remove i386 pieces
Remove code fragments detecting a 386 CPU since we don't support those
anymore. Also, do not do alignment checks because they're done only at
CPL3. Also, no need to preserve EFLAGS.
Linus Torvalds [Tue, 12 Feb 2013 23:13:42 +0000 (15:13 -0800)]
Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile into akpm
Pull tile bugfixes from Chris Metcalf:
"This includes a variety of minor bug fixes, mostly to do with testing
"make allyesconfig", "make allmodconfig", "make allnoconfig", inspired
to Tejun Heo's observation about Kconfig.freezer not being included.
The largest changes are just syntax changes removing the tile-specific
use of a macro named INT_MASK, which is way too commonly redefined
throughout driver code"
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: tag some code with #ifdef CONFIG_COMPAT
tile: fix memcpy_*io functions for allnoconfig
tile: export a handful of symbols appropriately
drm: fix compile failure by including <linux/swiotlb.h>
tile: avoid defining INT_MASK macro in <arch/interrupts.h>
tile: provide "screen_info" when enabling VT
drivers/input/joystick/analog.c: enable precise timer
tile: include kernel/Kconfig.freezer in tile Kconfig
tile: remove an unused variable in copy_thread()
Linus Torvalds [Tue, 12 Feb 2013 23:12:24 +0000 (15:12 -0800)]
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc into akpm
Pull ARM SoC fixes from Olof Johansson:
"We had a number of fixes queued up, but taking a strict pass-through
and weeding out any that either have been broken for a while, or are
for platforms that need out-of-tree code to be useful anyway, or other
fixes for problems that few users are likely to see in real life, only
this short branch of patches remains.
The three patches here are to make SMP boot work on the Calxeda
platforms again. Some of the rework for cpuids on 3.8 broke it (and
it was discovered late, unfortunately)."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: highbank: mask cluster id from cpu_logical_map
ARM: scu: mask cluster id from cpu_logical_map
ARM: scu: add empty scu_enable for !CONFIG_SMP
Marek Szyprowski [Tue, 12 Feb 2013 21:46:24 +0000 (13:46 -0800)]
mm: cma: fix accounting of CMA pages placed in high memory
The total number of low memory pages is determined as totalram_pages -
totalhigh_pages, so without this patch all CMA pageblocks placed in
highmem were accounted to low memory.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Kyungmin Park <kyungmin.park@samsung.com> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Glauber Costa [Tue, 12 Feb 2013 21:46:22 +0000 (13:46 -0800)]
memcg: fix kmemcg registration for late caches
The designed workflow for the caches in kmemcg is: register it with
memcg_register_cache() if kmemcg is already available or later on when a
new kmemcg appears at memcg_update_cache_sizes() which will handle all
caches in the system. The caches created at boot time will be handled
by the later, and the memcg-caches as well as any system caches that are
registered later on by the former.
There is a bug, however, in memcg_register_cache: we correctly set up
the array size, but do not mark the cache as a root cache.
This means that allocations for any cache appearing late in the game
will see memcg->memcg_params->is_root_cache == false, and in particular,
trigger VM_BUG_ON(!cachep->memcg_params->is_root_cache) in
__memcg_kmem_cache_get.
The obvious fix is to include the missing assignment.
Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gerald Schaefer [Tue, 12 Feb 2013 21:46:20 +0000 (13:46 -0800)]
mm: don't overwrite mm->def_flags in do_mlockall()
With commit 8e72033f2a48 ("thp: make MADV_HUGEPAGE check for
mm->def_flags") the VM_NOHUGEPAGE flag may be set on s390 in
mm->def_flags for certain processes, to prevent future thp mappings.
This would be overwritten by do_mlockall(), which sets it back to 0 with
an optional VM_LOCKED flag set.
To fix this, instead of overwriting mm->def_flags in do_mlockall(), only
the VM_LOCKED flag should be set or cleared.
Linus Walleij [Tue, 12 Feb 2013 21:46:19 +0000 (13:46 -0800)]
drivers/rtc/rtc-pl031.c: restore ST variant functionality
Commit e7e034e18a0a ("drivers/rtc/rtc-pl031.c: fix the missing operation
on enable") accidentally broke the ST variants of PL031.
The bit that is being poked as "clockwatch" enable bit for the ST
variants does the work of bit 0 on this variant. Bit 0 is used for a
clock divider on the ST variants, and setting it to 1 will affect
timekeeping in a very bad way.
David S. Miller [Tue, 12 Feb 2013 21:11:09 +0000 (16:11 -0500)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:
====================
Here is another handful of late-breaking fixes intended for the 3.8
stream... Hopefully the will still make it! :-)
There are three mac80211 fixes pulled from Johannes:
"Here are three fixes still for the 3.8 stream, the fix from Cong Ding
for the bad sizeof (Stephen Hemminger had pointed it out before but I'd
promptly forgotten), a mac80211 managed-mode channel context usage fix
where a downgrade would never stop until reaching non-HT and a bug in
the channel determination that could cause invalid channels like HT40+
on channel 11 to be used."
Also included is a mwl8k fix that avoids an oops when using mwl8k
devices that only support the 5 GHz band.
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Tue, 12 Feb 2013 09:45:44 +0000 (09:45 +0000)]
ixgbe: Only set gso_type to SKB_GSO_TCPV4 as RSC does not support IPv6
The original fix that was applied for setting gso_type required more change
than necessary because it was assumed ixgbe does RSC on IPv6 frames and this
is not correct. RSC is only supported with IPv4/TCP frames only. As such we
can simplify the fix and avoid the unnecessary move of eth_type_trans.
The previous patch "ixgbe: fix gso type" and this patch reduce the entire fix
to one line that sets gso_type to TCPV4 if the frame is RSC.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Tommi Rantala <tt.rantala@gmail.com> Tested-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Rutland [Fri, 8 Feb 2013 15:24:07 +0000 (15:24 +0000)]
clockevents: Fix generic broadcast for FEAT_C3STOP
Commit 12ad100046: "clockevents: Add generic timer broadcast function"
made tick_device_uses_broadcast set up the generic broadcast function
for dummy devices (where !tick_device_is_functional(dev)), but neglected
to set up the broadcast function for devices that stop in low power
states (with the CLOCK_EVT_FEAT_C3STOP flag).
When these devices enter low power states they will not have the generic
broadcast function assigned, and will bring down the system when an
attempt is made to broadcast to them.
This patch ensures that the broadcast function is also assigned for
devices which require broadcast in low power states.
Reported-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> Cc: linux-arm-kernel@lists.infradead.org Cc: nico@linaro.org Cc: Marc.Zyngier@arm.com Cc: Will.Deacon@arm.com Cc: santosh.shilimkar@ti.com Cc: john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Tue, 12 Feb 2013 16:17:35 +0000 (08:17 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Three nouveau fixes, all user visible issues, and one radeon
regression fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: enforce use of radeon_get_ib_value when reading user cmd
drm/nouveau: add lockdep annotations
drm/nv50/fb: Fix nullptr-deref on IGPs
drm/nouveau: use different register to wait for secret scrubber
Jerome Glisse [Mon, 11 Feb 2013 13:57:18 +0000 (08:57 -0500)]
drm/radeon: enforce use of radeon_get_ib_value when reading user cmd
When ever parsing cmd buffer supplied by userspace we need to use
radeon_get_ib_value rather than directly accessing the ib as the user
cmd might not yet be copied into the ib thus the parser might read
value that does not correspond to what user is sending and possibly
allowing user to send malicious command undected.
Signed-off-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Mike Travis [Mon, 11 Feb 2013 19:45:15 +0000 (13:45 -0600)]
x86, uv, uv3: Trim MMR register definitions after code changes for SGI UV3
This patch trims the MMR register definitions after the updates for the
SGI UV3 system have been applied. Note that because these definitions
are automatically generated from the RTL we cannot control the length
of the names. Therefore there are lines that exceed 80 characters.