Cong Wang [Fri, 25 Nov 2011 14:08:45 +0000 (22:08 +0800)]
highmem: mark k[un]map_atomic() with two arguments as deprecated
For backward compatibility, we still keep the deprecated form,
and will warn the users if they still use the deprecated one, like this:
drivers/block/drbd/drbd_bitmap.c: In function ‘bm_page_io_async’:
drivers/block/drbd/drbd_bitmap.c:973:3: warning: ‘kmap_atomic_deprecated’ is deprecated (declared at /home/wangcong/linux-2.6/include/linux/highmem.h:124)
drivers/block/drbd/drbd_bitmap.c:977:3: warning: ‘kunmap_atomic_deprecated’ is deprecated (declared at /home/wangcong/linux-2.6/include/linux/highmem.h:144)
Thanks to Nick Bowler for the cpp trick!
Cc: Cesar Eduardo Barros <cesarb@cesarb.net> Cc: Nick Bowler <nbowler@elliptictech.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Cong Wang <amwang@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: Silence seq_scale() unused warning
ipv4:correct description for tcp_max_syn_backlog
pasemi_mac: Fix building as module
netback: Fix alert message.
r8169: fix Rx index race between FIFO overflow recovery and NAPI handler.
r8169: Rx FIFO overflow fixes.
ipv4: Fix peer validation on cached lookup.
ipv4: make sure RTO_ONLINK is saved in routing cache
iwlwifi: change the default behavior of watchdog timer
iwlwifi: do not re-configure HT40 after associated
iwlagn: fix HW crypto for TX-only keys
Revert "mac80211: clear sta.drv_priv on reconfiguration"
mac80211: fill rate filter for internal scan requests
cfg80211: amend regulatory NULL dereference fix
cfg80211: fix race on init and driver registration
Linus Torvalds [Tue, 6 Dec 2011 19:54:33 +0000 (11:54 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ftrace: Fix hash record accounting bug
perf: Fix parsing of __print_flags() in TP_printk()
jump_label: jump_label_inc may return before the code is patched
ftrace: Remove force undef config value left for testing
tracing: Restore system filter behavior
tracing: fix event_subsystem ref counting
Peter Pan(潘卫平) [Mon, 5 Dec 2011 21:39:41 +0000 (21:39 +0000)]
ipv4:correct description for tcp_max_syn_backlog
Since commit c5ed63d66f24(tcp: fix three tcp sysctls tuning),
sysctl_max_syn_backlog is determined by tcp_hashinfo->ehash_mask,
and the minimal value is 128, and it will increase in proportion to the
memory of machine.
The original description for tcp_max_syn_backlog and sysctl_max_syn_backlog
are out of date.
Changelog:
V2: update description for sysctl_max_syn_backlog
Signed-off-by: Weiping Pan <panweiping3@gmail.com> Reviewed-by: Shan Wei <shanwei88@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Mon, 5 Dec 2011 19:44:22 +0000 (19:44 +0000)]
pasemi_mac: Fix building as module
Commit ded19addf9c937d83b9bfb4d73a836732569041b ('pasemic_mac*: Move
the PA Semi driver') inadvertently split pasemi_mac into two separate
modules with unresolved symbols. Change it back into a single module.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 6 Dec 2011 00:54:15 +0000 (16:54 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
intr_remapping: Fix section mismatch in ir_dev_scope_init()
intel-iommu: Fix section mismatch in dmar_parse_rmrr_atsr_dev()
x86, amd: Fix up numa_node information for AMD CPU family 15h model 0-0fh northbridge functions
x86, AMD: Correct align_va_addr documentation
x86/rtc, mrst: Don't register a platform RTC device for for Intel MID platforms
x86/mrst: Battery fixes
x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode
x86: Fix "Acer Aspire 1" reboot hang
x86/mtrr: Resolve inconsistency with Intel processor manual
x86: Document rdmsr_safe restrictions
x86, microcode: Fix the failure path of microcode update driver init code
Add TAINT_FIRMWARE_WORKAROUND on MTRR fixup
x86/mpparse: Account for bus types other than ISA and PCI
x86, mrst: Change the pmic_gpio device type to IPC
mrst: Added some platform data for the SFI translations
x86,mrst: Power control commands update
x86/reboot: Blacklist Dell OptiPlex 990 known to require PCI reboot
x86, UV: Fix UV2 hub part number
x86: Add user_mode_vm check in stack_overflow_check
Linus Torvalds [Tue, 6 Dec 2011 00:54:00 +0000 (16:54 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix loss of notification with multi-event
perf, x86: Force IBS LVT offset assignment for family 10h
perf, x86: Disable PEBS on SandyBridge chips
trace_events_filter: Use rcu_assign_pointer() when setting ftrace_event_call->filter
perf session: Fix crash with invalid CPU list
perf python: Fix undefined symbol problem
perf/x86: Enable raw event access to Intel offcore events
perf: Don't use -ENOSPC for out of PMU resources
perf: Do not set task_ctx pointer in cpuctx if there are no events in the context
perf/x86: Fix PEBS instruction unwind
oprofile, x86: Fix crash when unloading module (nmi timer mode)
oprofile: Fix crash when unloading module (hr timer mode)
Linus Torvalds [Tue, 6 Dec 2011 00:53:43 +0000 (16:53 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clockevents: Set noop handler in clockevents_exchange_device()
tick-broadcast: Stop active broadcast device when replacing it
clocksource: Fix bug with max_deferment margin calculation
rtc: Fix some bugs that allowed accumulating time drift in suspend/resume
rtc: Disable the alarm in the hardware
Linus Torvalds [Tue, 6 Dec 2011 00:50:24 +0000 (16:50 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched, x86: Avoid unnecessary overflow in sched_clock
sched: Fix buglet in return_cfs_rq_runtime()
sched: Avoid SMT siblings in select_idle_sibling() if possible
sched: Set the command name of the idle tasks in SMP kernels
sched, rt: Provide means of disabling cross-cpu bandwidth sharing
sched: Document wait_for_completion_*() return values
sched_fair: Fix a typo in the comment describing update_sd_lb_stats
sched: Add a comment to effective_load() since it's a pain
Linus Torvalds [Mon, 5 Dec 2011 23:35:16 +0000 (15:35 -0800)]
Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] ap: Setup timer for sending messages after reset.
[S390] cio: fix chsc_chp_vary
[S390] cio: provide fake irb for transport mode IO
[S390] cio: disallow driver io for known to be broken paths
[S390] hibernate: directly trigger subchannel evaluation
[S390] remove reset of system call restart on psw changes
[S390] add missing .set function for NT_S390_LAST_BREAK regset
[S390] fix page change underindication in pgste_update_all
[S390] ptrace inferior call interactions with TIF_SYSCALL
[S390] kdump: Replace is_kdump_kernel() with OLDMEM_BASE check
françois romieu [Sun, 4 Dec 2011 20:30:52 +0000 (20:30 +0000)]
r8169: fix Rx index race between FIFO overflow recovery and NAPI handler.
Since 92fc43b4159b518f5baae57301f26d770b0834c9, rtl8169_tx_timeout ends up
resetting Rx and Tx indexes and thus racing with the NAPI handler via
-> rtl8169_hw_reset
-> rtl_hw_reset
-> rtl8169_init_ring_indexes
What about returning to the original state ?
rtl_hw_reset is only used by rtl8169_hw_reset and rtl8169_init_one.
The latter does not need rtl8169_init_ring_indexes because the indexes
still contain their original values from the newly allocated network
device private data area (i.e. 0).
rtl8169_hw_reset is used by:
1. rtl8169_down
Helper for rtl8169_close. rtl8169_open explicitely inits the indexes
anyway.
2. rtl8169_pcierr_interrupt
Indexes are set by rtl8169_reinit_task.
3. rtl8169_interrupt
rtl8169_hw_reset is needed when the device goes down. See 1.
4. rtl_shutdown
System shutdown handler. Indexes are irrelevant.
5. rtl8169_reset_task
Indexes must be set before rtl_hw_start is called.
6. rtl8169_tx_timeout
Indexes should not be set. This is the job of rtl8169_reset_task anyway.
The removal of rtl8169_hw_reset in rtl8169_tx_timeout and its move in
rtl8169_reset_task do not change the analysis.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: hayeswang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
françois romieu [Sun, 4 Dec 2011 20:30:45 +0000 (20:30 +0000)]
r8169: Rx FIFO overflow fixes.
Realtek has specified that the post 8168c gigabit chips and the post
8105e fast ethernet chips recover automatically from a Rx FIFO overflow.
The driver does not need to clear the RxFIFOOver bit of IntrStatus and
it should rather avoid messing it.
The implementation deserves some explanation:
1. events outside of the intr_event bit mask are now ignored. It enforces
a no-processing policy for the events that either should not be there
or should be ignored.
2. RxFIFOOver was already ignored in rtl_cfg_infos[RTL_CFG_1] for the
whole 8168 line of chips with two exceptions:
- RTL_GIGA_MAC_VER_22 since b5ba6d12bdac21bc0620a5089e0f24e362645efd
("use RxFIFO overflow workaround for 8168c chipset.").
This one should now be correctly handled.
- RTL_GIGA_MAC_VER_11 (8168b) which requires a different Rx FIFO
overflow processing.
Though it does not conform to Realtek suggestion above, the updated
driver includes no change for RTL_GIGA_MAC_VER_12 and RTL_GIGA_MAC_VER_17.
Both are 8168b. RTL_GIGA_MAC_VER_12 is common and a bit old so I'd rather
wait for experimental evidence that the change suggested by Realtek really
helps or does not hurt in unexpected ways.
Removed case statements in rtl8169_interrupt are only 8168 relevant.
3. RxFIFOOver is masked for post 8105e 810x chips, namely the sole 8105e
(RTL_GIGA_MAC_VER_30) itself.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: hayeswang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Steven Rostedt [Sat, 5 Nov 2011 00:32:39 +0000 (20:32 -0400)]
ftrace: Fix hash record accounting bug
If the set_ftrace_filter is cleared by writing just whitespace to
it, then the filter hash refcounts will be decremented but not
updated. This causes two bugs:
1) No functions will be enabled for tracing when they all should be
2) If the users clears the set_ftrace_filter twice, it will crash ftrace:
Steven Rostedt [Fri, 4 Nov 2011 20:32:25 +0000 (16:32 -0400)]
perf: Fix parsing of __print_flags() in TP_printk()
A update is made to the sched:sched_switch event that adds some
logic to the first parameter of the __print_flags() that shows the
state of tasks. This change cause perf to fail parsing the flags.
A simple fix is needed to have the parser be able to process ops
within the argument.
Cc: stable@vger.kernel.org Reported-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Gleb Natapov [Tue, 18 Oct 2011 17:55:51 +0000 (19:55 +0200)]
jump_label: jump_label_inc may return before the code is patched
If cpu A calls jump_label_inc() just after atomic_add_return() is
called by cpu B, atomic_inc_not_zero() will return value greater then
zero and jump_label_inc() will return to a caller before jump_label_update()
finishes its job on cpu B.
Link: http://lkml.kernel.org/r/20111018175551.GH17571@redhat.com Cc: stable@vger.kernel.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Steven Rostedt [Fri, 4 Nov 2011 14:45:23 +0000 (10:45 -0400)]
ftrace: Remove force undef config value left for testing
A forced undef of a config value was used for testing and was
accidently left in during the final commit. This causes x86 to
run slower than needed while running function tracing as well
as causes the function graph selftest to fail when DYNMAIC_FTRACE
is not set. This is because the code in MCOUNT expects the ftrace
code to be processed with the config value set that happened to
be forced not set.
Ilya Dryomov [Mon, 31 Oct 2011 09:07:42 +0000 (11:07 +0200)]
tracing: fix event_subsystem ref counting
Fix a bug introduced by e9dbfae5, which prevents event_subsystem from
ever being released.
Ref_count was added to keep track of subsystem users, not for counting
events. Subsystem is created with ref_count = 1, so there is no need to
increment it for every event, we have nr_events for that. Fix this by
touching ref_count only when we actually have a new user -
subsystem_open().
David S. Miller [Mon, 5 Dec 2011 18:21:42 +0000 (13:21 -0500)]
ipv4: Fix peer validation on cached lookup.
If ipv4_valdiate_peer() fails during a cached entry lookup,
we'll NULL derer since the loop iterator assumes rth is not
NULL.
Letting this be handled as a failure is just bogus, so just make it
not fail. If we have trouble getting a non-NULL neighbour for the
redirected gateway, just restore the original gateway and continue.
The very next use of this cached route will try again.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Seth Forshee [Wed, 16 Nov 2011 23:37:45 +0000 (17:37 -0600)]
toshiba_acpi: Fix machines that don't support HCI_SYSTEM_EVENT
The Satellite C670-10V generates notifications for hotkeys but does
not support HCI_SYSTEM_EVENT. As a result when a hotkey is pressed
it gets stuck in an infinite loop in toshiba_acpi_notify. To fix
this, detect whether or not HCI_SYSTEM_EVENT is supported up-front
and don't try to read system events if it isn't supported. In
addition, limit the number of retries when reading HCI_SYSTEM_EVENT
fails so that this loop cannot run unbounded.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Matthew Garrett <mjg@redhat.com>
intr_remapping: Fix section mismatch in ir_dev_scope_init()
Fix:
Section mismatch in reference from the function
ir_dev_scope_init() to the function
.init.text:dmar_dev_scope_init() The function
ir_dev_scope_init() references the function __init dmar_dev_scope_init().
intel-iommu: Fix section mismatch in dmar_parse_rmrr_atsr_dev()
dmar_parse_rmrr_atsr_dev() calls rmrr_parse_dev() and
atsr_parse_dev() which are both marked as __init.
Section mismatch in reference from the function
dmar_parse_rmrr_atsr_dev() to the function
.init.text:dmar_parse_dev_scope() The function
dmar_parse_rmrr_atsr_dev() references the function __init
dmar_parse_dev_scope().
Section mismatch in reference from the function
dmar_parse_rmrr_atsr_dev() to the function
.init.text:dmar_parse_dev_scope() The function
dmar_parse_rmrr_atsr_dev() references the function __init
dmar_parse_dev_scope().
x86, amd: Fix up numa_node information for AMD CPU family 15h model 0-0fh northbridge functions
I've received complaints that the numa_node attribute for family
15h model 00-0fh (e.g. Interlagos) northbridge functions shows
-1 instead of the proper node ID.
Correct this with attached quirks (similar to quirks for other
AMD CPU families used in multi-socket systems).
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Frank Arnold <frank.arnold@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/20111202072143.GA31916@alberich.amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
Borislav Petkov [Mon, 21 Nov 2011 11:10:19 +0000 (12:10 +0100)]
x86, AMD: Correct align_va_addr documentation
Commit dfb09f9b7ab0 ("x86, amd: Avoid cache aliasing penalties
on AMD family 15h") introduced a kernel command line parameter
called 'align_va_addr' which still refers to arguments used in
an earlier version of the patch and which got changed without
updating the documentation. Correct that omission.
The problem is that in copy_page_range() we turn lazy mode on,
and then in swap_entry_free() we call swap_count_continued()
which ends up in:
map = kmap_atomic(page, KM_USER0) + offset;
and then later we touch *map.
Since we are running in batched mode (lazy) we don't actually
set up the PTE mappings and the kmap_atomic is not done
synchronously and ends up trying to dereference a page that has
not been set.
Looking at kmap_atomic_prot_pfn(), it uses
'arch_flush_lazy_mmu_mode' and doing the same in
kmap_atomic_prot() and __kunmap_atomic() makes the problem go
away.
Interestingly, commit b8bcfe997e4615 ("x86/paravirt: remove lazy
mode in interrupts") removed part of this to fix an interrupt
issue - but it went to far and did not consider this scenario.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Chubb [Mon, 5 Dec 2011 13:53:53 +0000 (16:53 +0300)]
x86: Fix "Acer Aspire 1" reboot hang
Looks like on some Acer Aspire 1s with older bioses, reboot via bios
fails. It works on my machine, (with BIOS version 0.3310) but
not on some others (BIOS version 0.3309).
There's a log of problems at:
https://bbs.archlinux.org/viewtopic.php?id=124136
This patch adds a different callback to the reboot quirk table,
to allow rebooting via keybaord controller.
Reported-by: Uroš Vampl <mobile.leecher@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Peter Chubb <peter.chubb@nicta.com.au> Cc: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1323093233-9481-1-git-send-email-anarsoul@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86/mtrr: Resolve inconsistency with Intel processor manual
Following is from Notes of section 11.5.3 of Intel processor
manual available at:
http://www.intel.com/Assets/PDF/manual/325384.pdf
For the Pentium 4 and Intel Xeon processors, after the sequence of
steps given above has been executed, the cache lines containing the
code between the end of the WBINVD instruction and before the
MTRRS have actually been disabled may be retained in the cache
hierarchy. Here, to remove code from the cache completely, a
second WBINVD instruction must be executed after the MTRRs have
been disabled.
This patch provides resolution for that.
Ideally, I will like to make changes only for Pentium 4 and Xeon
processors. But, I am not finding easier way to do it.
And, extra wbinvd() instruction does not hurt much for other
processors.
Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi> Link: http://lkml.kernel.org/r/4EBD1CC5.3030008@oracle.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86/mpparse: Account for bus types other than ISA and PCI
In commit f8924e770e04 ("x86: unify mp_bus_info"), the 32-bit
and 64-bit versions of MP_bus_info were rearranged to match each
other better. Unfortunately it introduced a regression: prior
to that change we used to always set the mp_bus_not_pci bit,
then clear it if we found a PCI bus. After it, we set
mp_bus_not_pci for ISA buses, clear it for PCI buses, and leave
it alone otherwise.
In the cases of ISA and PCI, there's not much difference. But
ISA is not the only non-PCI bus, so it's better to always set
mp_bus_not_pci and clear it only for PCI.
Without this change, Dan's Dell PowerEdge 4200 panics on boot
with a log indicating interrupt routing trouble unless the
"noapic" option is supplied. With this change, the machine
boots reliably without "noapic".
Jacob Pan [Wed, 16 Nov 2011 16:07:22 +0000 (16:07 +0000)]
x86,mrst: Power control commands update
On the Intel MID devices SCU commands are issued to manage power
off and the like. We need to issue different ones for
non-Lincroft based devices.
Signed-off-by: Alek Du <alek.du@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jack Steiner [Tue, 29 Nov 2011 21:00:58 +0000 (15:00 -0600)]
x86, UV: Fix UV2 hub part number
There was a mixup when the SGI UV2 hub chip was sent to be
fabricated, and it ended up with the wrong part number in the
HRP_NODE_ID mmr. Future versions of the chip will (may) have the
correct part number. Change the UV infrastructure to recognize
both part numbers as valid IDs of a UV2 hub chip.
Mitsuo Hayasaka [Tue, 29 Nov 2011 06:08:21 +0000 (15:08 +0900)]
x86: Add user_mode_vm check in stack_overflow_check
The kernel stack overflow is checked in stack_overflow_check(),
which may wrongly detect the overflow if the stack pointer in
user space points to the kernel stack intentionally or
accidentally. So, the actual overflow is never detected after
this misdetection because WARN_ONCE() is used on the detection
of it.
This patch adds user-mode-vm checking before it to avoid this
problem and bails out early if the user stack is used.
Peter Zijlstra [Mon, 28 Nov 2011 20:12:40 +0000 (21:12 +0100)]
slab, lockdep: Fix silly bug
Commit 30765b92 ("slab, lockdep: Annotate the locks before using
them") moves the init_lock_keys() call from after g_cpucache_up =
FULL, to before it. And overlooks the fact that init_node_lock_keys()
tests for it and ignores everything !FULL.
Introduce a LATE stage and change the lockdep test to be <LATE.
Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: stable@kernel.org Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 26 Nov 2011 01:47:31 +0000 (02:47 +0100)]
perf: Fix loss of notification with multi-event
When you do:
$ perf record -e cycles,cycles,cycles noploop 10
You expect about 10,000 samples for each event, i.e., 10s at
1000samples/sec. However, this is not what's happening. You
get much fewer samples, maybe 3700 samples/event:
$ perf report -D | tail -15
Aggregated stats:
TOTAL events: 10998
MMAP events: 66
COMM events: 2
SAMPLE events: 10930
cycles stats:
TOTAL events: 3644
SAMPLE events: 3644
cycles stats:
TOTAL events: 3642
SAMPLE events: 3642
cycles stats:
TOTAL events: 3644
SAMPLE events: 3644
On a Intel Nehalem or even AMD64, there are 4 counters capable
of measuring cycles, so there is plenty of space to measure those
events without multiplexing (even with the NMI watchdog active).
And even with multiplexing, we'd expect roughly the same number
of samples per event.
The root of the problem was that when the event that caused the buffer
to become full was not the first event passed on the cmdline, the user
notification would get lost. The notification was sent to the file
descriptor of the overflowed event but the perf tool was not polling
on it. The perf tool aggregates all samples into a single buffer,
i.e., the buffer of the first event. Consequently, it assumes
notifications for any event will come via that descriptor.
The seemingly straight forward solution of moving the waitq into the
ringbuffer object doesn't work because of life-time issues. One could
perf_event_set_output() on a fd that you're also blocking on and cause
the old rb object to be freed while its waitq would still be
referenced by the blocked thread -> FAIL.
Therefore link all events to the ringbuffer and broadcast the wakeup
from the ringbuffer object to all possible events that could be waited
upon. This is rather ugly, and we're open to better solutions but it
works for now.
Robert Richter [Tue, 8 Nov 2011 18:20:44 +0000 (19:20 +0100)]
perf, x86: Force IBS LVT offset assignment for family 10h
On AMD family 10h we see firmware bug messages like the following:
[Firmware Bug]: cpu 6, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu
[Firmware Bug]: cpu 6, IBS interrupt offset 0 not available (MSRC001103A=0x0000000000000100)
[Firmware Bug]: using offset 1 for IBS interrupts
[Firmware Bug]: workaround enabled for IBS LVT offset
perf: AMD IBS detected (0x00000007)
We always see this, since the offsets are not assigned by the BIOS for
this family. Force LVT offset assignment in this case. If the OS
assignment fails, fallback to BIOS settings and try to setup this.
The fallback to BIOS settings weakens the family check since
force_ibs_eilvt_setup() may fail e.g. in case of virtual machines.
But setup may still succeed if BIOS offsets are correct.
Other families don't have a workaround implemented that assigns LVT
offsets. It's ok, to drop calling force_ibs_eilvt_setup() for that
families.
With the patch the [Firmware Bug] messages vanish. We see now:
Linus Torvalds [Sun, 4 Dec 2011 19:57:09 +0000 (11:57 -0800)]
x86: Fix boot failures on older AMD CPU's
People with old AMD chips are getting hung boots, because commit bcb80e53877c ("x86, microcode, AMD: Add microcode revision to
/proc/cpuinfo") moved the microcode detection too early into
"early_init_amd()".
At that point we are *so* early in the booth that the exception tables
haven't even been set up yet, so the whole
doesn't actually work: if the rdmsr does a GP fault (due to non-existant
MSR register on older CPU's), we can't fix it up yet, and the boot fails.
Fix it by simply moving the code to a slightly later point in the boot
(init_amd() instead of early_init_amd()), since the kernel itself
doesn't even really care about the microcode patchlevel at this point
(or really ever: it's made available to user space in /proc/cpuinfo, and
updated if you do a microcode load).
Reported-tested-and-bisected-by: Larry Finger <Larry.Finger@lwfinger.net> Tested-by: Bob Tracy <rct@gherkin.frus.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
xen/pm_idle: Make pm_idle be default_idle under Xen.
The idea behind commit d91ee5863b71 ("cpuidle: replace xen access to x86
pm_idle and default_idle") was to have one call - disable_cpuidle()
which would make pm_idle not be molested by other code. It disallows
cpuidle_idle_call to be set to pm_idle (which is excellent).
But in the select_idle_routine() and idle_setup(), the pm_idle can still
be set to either: amd_e400_idle, mwait_idle or default_idle. This
depends on some CPU flags (MWAIT) and in AMD case on the type of CPU.
In case of mwait_idle we can hit some instances where the hypervisor
(Amazon EC2 specifically) sets the MWAIT and we get:
In the case of amd_e400_idle we don't get so spectacular crashes, but we
do end up making an MSR which is trapped in the hypervisor, and then
follow it up with a yield hypercall. Meaning we end up going to
hypervisor twice instead of just once.
The previous behavior before v3.0 was that pm_idle was set to
default_idle regardless of select_idle_routine/idle_setup.
We want to do that, but only for one specific case: Xen. This patch
does that.
Fixes RH BZ #739499 and Ubuntu #881076 Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ipv4: make sure RTO_ONLINK is saved in routing cache
__mkroute_output fails to work with the original tos
and uses value with stripped RTO_ONLINK bit. Make sure we put
the original TOS bits into rt_key_tos because it used to match
cached route.
Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 2 Dec 2011 21:30:58 +0000 (13:30 -0800)]
Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (21 commits)
usb: ftdi_sio: add PID for Propox ISPcable III
Revert "xHCI: reset-on-resume quirk for NEC uPD720200"
xHCI: fix bug in xhci_clear_command_ring()
usb: gadget: fsl_udc: fix dequeuing a request in progress
usb: fsl_mxc_udc.c: Remove compile-time dependency of MX35 SoC type
usb: fsl_mxc_udc.c: Fix build issue by including missing header file
USB: fsl_udc_core: use usb_endpoint_xfer_isoc to judge ISO XFER
usb: udc: Fix gadget driver's speed check in various UDC drivers
usb: gadget: fix g_serial regression
usb: renesas_usbhs: fixup driver speed
usb: renesas_usbhs: fixup gadget.dev.driver when udc_stop.
usb: renesas_usbhs: fixup signal the driver that cable was disconnected
usb: renesas_usbhs: fixup device_register timing
usb: musb: PM: fix context save/restore in suspend/resume path
USB: linux-cdc-acm.inf: add support for the acm_ms gadget
EHCI : Fix a regression in the ISO scheduler
xHCI: reset-on-resume quirk for NEC uPD720200
USB: whci-hcd: fix endian conversion in qset_clear()
USB: usb-storage: unusual_devs entry for Kingston DT 101 G2
usb: option: add SIMCom SIM5218
...
Linus Torvalds [Fri, 2 Dec 2011 21:30:25 +0000 (13:30 -0800)]
Merge branch 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
* 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Staging: comedi: fix integer overflow in do_insnlist_ioctl()
Revert "Staging: comedi: integer overflow in do_insnlist_ioctl()"
Staging: comedi: integer overflow in do_insnlist_ioctl()
Staging: comedi: fix signal handling in read and write
Staging: comedi: fix mmap_count
staging: comedi: fix oops for USB DAQ devices.
staging: comedi: usbduxsigma: Fixed wrong range for the analogue channel.
staging:rts_pstor:Complete scanning_done variable
staging: usbip: bugfix for deadlock
Wey-Yi Guy [Fri, 2 Dec 2011 16:19:19 +0000 (08:19 -0800)]
iwlwifi: change the default behavior of watchdog timer
The current default watchdog timer is enabled, but we are seeing issues on
legacy devices. So change the default setting of watchdog timer to per
device based. But user still can use the "wd_disable" module parameter
to overwrite the system setting
Cc: stable@vger.kernel.org #3.0+ Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Fri, 2 Dec 2011 16:19:17 +0000 (08:19 -0800)]
iwlagn: fix HW crypto for TX-only keys
Group keys in IBSS or AP mode are not programmed
into the device since we give the key to it with
every TX packet. However, we do need mac80211 to
create the MMIC & PN in all cases. Move the code
around to set the key flags all the time. We set
them even when the key is removed again but that
is obviously harmless.
Cc: stable@vger.kernel.org Reported-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Linus Torvalds [Fri, 2 Dec 2011 18:38:20 +0000 (10:38 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: fix attr2 vs large data fork assert
xfs: force buffer writeback before blocking on the ilock in inode reclaim
xfs: validate acl count
Linus Torvalds [Fri, 2 Dec 2011 16:25:04 +0000 (08:25 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
vmwgfx: integer overflow in vmw_kms_update_layout_ioctl()
drm/radeon/kms: fix 2D tiling CS support on EG/CM
drm/radeon/kms: fix scanout of 2D tiled buffers on EG/CM
drm: Fix lack of CRTC disable for drm_crtc_helper_set_config(.fb=NULL)
drm/radeon/kms: add some new pci ids
drm/radeon/kms: Skip ACPI call to ATIF when possible
drm/radeon/kms: Hide debugging message
drm/radeon/kms: add some loop timeouts in pageflip code
drm/nv50/disp: silence compiler warning
drm/nouveau: fix oopses caused by clear being called on unpopulated ttms
drm/nouveau: Keep RAMIN heap within the channel.
drm/nvd0/disp: fix sor dpms typo, preventing dpms on in some situations
drm/nvc0/gr: fix TP init for transform feedback offset queries
drm/nouveau: add dumb ioctl support
Linus Torvalds [Fri, 2 Dec 2011 16:10:51 +0000 (08:10 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix S3/S4 problem on machines with VREF-pin mute-LED
ALSA: hda_intel - revert a quirk that affect VIA chipsets
ALSA: hda - Avoid touching mute-VREF pin for IDT codecs
firmware: Sigma: Fix endianess issues
firmware: Sigma: Skip header during CRC generation
firmware: Sigma: Prevent out of bounds memory access
ALSA: usb-audio - Support for Roland GAIA SH-01 Synthesizer
ASoC: Supply dcs_codes for newer WM1811 revisions
ASoC: Error out if we can't generate a LRCLK at all for WM8994
ASoC: Correct name of Speyside Main Speaker widget
ASoC: skip resume of soc-audio devices without codecs
ASoC: cs42l51: Fix off-by-one for reg_cache_size
ASoC: drop support for PlayPaq with WM8510
ASoC: mpc8610: tell the CS4270 codec that it's the master
ASoC: cs4720: use snd_soc_cache_sync()
ASoC: SAMSUNG: Fix build error
ASoC: max9877: Update register if either val or val2 is changed
ASoC: Fix wrong define for AD1836_ADC_WORD_OFFSET
Thomas Gleixner [Fri, 2 Dec 2011 15:02:45 +0000 (16:02 +0100)]
clockevents: Set noop handler in clockevents_exchange_device()
If a device is shutdown, then there might be a pending interrupt,
which will be processed after we reenable interrupts, which causes the
original handler to be run. If the old handler is the (broadcast)
periodic handler the shutdown state might hang the kernel completely.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
Ido Yariv [Thu, 1 Dec 2011 11:55:08 +0000 (13:55 +0200)]
genirq: Fix race condition when stopping the irq thread
In irq_wait_for_interrupt(), the should_stop member is verified before
setting the task's state to TASK_INTERRUPTIBLE and calling schedule().
In case kthread_stop sets should_stop and wakes up the process after
should_stop is checked by the irq thread but before the task's state
is changed, the irq thread might never exit:
Xi Wang [Mon, 28 Nov 2011 11:25:43 +0000 (12:25 +0100)]
vmwgfx: integer overflow in vmw_kms_update_layout_ioctl()
There are two issues in vmw_kms_update_layout_ioctl(). First, the
for loop forgets to index rects and only checks the first element.
Second, there is a potential integer overflow if userspace passes
in a large arg->num_outputs. The call to kzalloc() would allocate
a small buffer, leading to out-of-bounds read.
Reported-by: Haogang Chen <haogangchen@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>