Linus Torvalds [Fri, 14 Oct 2011 05:06:39 +0000 (17:06 +1200)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: revert to using a kthread for AIL pushing
xfs: force the log if we encounter pinned buffers in .iop_pushbuf
xfs: do not update xa_last_pushed_lsn for locked items
Mika Westerberg [Thu, 13 Oct 2011 09:04:20 +0000 (12:04 +0300)]
x86, mrst: use a temporary variable for SFI irq
SFI tables reside in RAM and should not be modified once they are
written. Current code went to set pentry->irq to zero which causes
subsequent reads to fail with invalid SFI table checksum. This will
break kexec as the second kernel fails to validate SFI tables.
To fix this we use temporary variable for irq number.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hartmut Knaack [Mon, 10 Oct 2011 22:22:45 +0000 (00:22 +0200)]
gpio-pca953x: fix gpio_base
gpio_base was set to 0 if no system platform data or open firmware
platform data was provided. This led to conflicts, if any other gpiochip
with a gpiobase of 0 was instantiated already. Setting it to -1 will
automatically use the first one available.
Signed-off-by: Hartmut Knaack <knaack.h@gmx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
gpio/omap: fix build error with certain OMAP1 configs
With commit f64ad1a0e21a, "gpio/omap: cleanup _set_gpio_wakeup(), remove
ifdefs", access to build time conditionally omitted 'suspend_wakeup'
member of the 'gpio_bank' structure has been placed unconditionally in
function _set_gpio_wakeup(), which is always built. This resulted in the
driver compilation broken for certain OMAP1, i.e., non-OMAP16xx,
configurations.
Really required or not in previously excluded cases, define this
structure member unconditionally as a fix.
Tested with a custom OMAP1510 only configuration.
Signed-off-by: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl> Acked-by: Kevin Hilman <khilman@ti.com> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Chris Metcalf [Wed, 5 Oct 2011 21:09:29 +0000 (17:09 -0400)]
tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files
The 32-bit TILEPro support uses some #defines in <asm/atomic_32.h>
for atomic support routines in assembly. To make this more explicit,
I've turned those includes into includes of <asm/atomic_32.h>, which
should hopefully make it clear that they shouldn't be bombed into
<linux/atomic.h> in any cleanups.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
mscan: too much data copied to CAN frame due to 16 bit accesses
gro: refetch inet6_protos[] after pulling ext headers
bnx2x: fix cl_id allocation for non-eth clients for NPAR mode
mlx4_en: fix endianness with blue frame support
There are a lot of file references to now moved or deleted files in the
whole tree, especially in documentation and Kconfig files. This patch
fixes the references in drivers/ide/.
Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently we have a few issues with the way the workqueue code is used to
implement AIL pushing:
- it accidentally uses the same workqueue as the syncer action, and thus
can be prevented from running if there are enough sync actions active
in the system.
- it doesn't use the HIGHPRI flag to queue at the head of the queue of
work items
At this point I'm not confident enough in getting all the workqueue flags and
tweaks right to provide a perfectly reliable execution context for AIL
pushing, which is the most important piece in XFS to make forward progress
when the log fills.
Revert back to use a kthread per filesystem which fixes all the above issues
at the cost of having a task struct and stack around for each mounted
filesystem. In addition this also gives us much better ways to diagnose
any issues involving hung AIL pushing and removes a small amount of code.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: force the log if we encounter pinned buffers in .iop_pushbuf
We need to check for pinned buffers even in .iop_pushbuf given that inode
items flush into the same buffers that may be pinned directly due operations
on the unlinked inode list operating directly on buffers. To do this add a
return value to .iop_pushbuf that tells the AIL push about this and use
the existing log force mechanisms to unpin it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: do not update xa_last_pushed_lsn for locked items
If an item was locked we should not update xa_last_pushed_lsn and thus skip
it when restarting the AIL scan as we need to be able to lock and write it
out as soon as possible. Otherwise heavy lock contention might starve AIL
pushing too easily, especially given the larger backoff once we moved
xa_last_pushed_lsn all the way to the target lsn.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Chris Mason [Tue, 11 Oct 2011 15:41:40 +0000 (11:41 -0400)]
Btrfs: make sure not to defrag extents past i_size
The btrfs file defrag code will loop through the extents and
force COW on them. But there is a concurrent truncate in the middle of
the defrag, it might end up defragging the same range over and over
again.
The problem is that writepage won't go through and do anything on pages
past i_size, so the cow won't happen, so the file will appear to still
be fragmented. defrag will end up hitting the same extents again and
again.
In the worst case, the truncate can actually live lock with the defrag
because the defrag keeps creating new ordered extents which the truncate
code keeps waiting on.
The fix here is to make defrag check for i_size inside the main loop,
instead of just once before the looping starts.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
llist: Add back llist_add_batch() and llist_del_first() prototypes
Commit 1230db8e1543 ("llist: Make some llist functions inline")
has deleted the definitions, causing problems for (not upstream yet)
code that tries to make use of them.
Is caused by commit 3ae36655 ("x86-64: Rework vsyscall emulation and add
vsyscall= parameter") - the vsyscall emulation code is not fully cooked
yet as UML relies on some rather fragile SIGSEGV semantics.
Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default
to vsyscall=native for now, this patch implements that.
and then it'll go into a loop: writeback -> defrag -> writeback ...
It's because writeback writes [8K, 200K] and then writes [0, 8K].
I tried to make writeback know if the pages are dirtied by defrag,
but the patch was a bit intrusive. Here I simply set writeback_index
when we defrag a file.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
mscan: too much data copied to CAN frame due to 16 bit accesses
Due to the 16 bit access to mscan registers there's too much data copied to
the zero initialized CAN frame when having an odd number of bytes to copy.
This patch ensures that only the requested bytes are copied by using an
8 bit access for the remaining byte.
Reported-by: Andre Naujoks <nautsch@gmail.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Sun, 9 Oct 2011 23:57:36 +0000 (23:57 +0000)]
bnx2x: fix cl_id allocation for non-eth clients for NPAR mode
There are some consolidations of NPAR configuration
when FCoE and iSCSI L2 clients will get the same id,
in this case FCoE ring will be non-functional.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The doorbell register was being unconditionally swapped. In x86, that
meant it was being swapped to BE and written to the descriptor and to
memory, depending on the case of blue frame support or writing to
doorbell register. On PPC, this meant it was being swapped to LE and
then swapped back to BE while writing to the register. But in the blue
frame case, it was being written as LE to the descriptor.
The fix is not to swap doorbell unconditionally, write it to the
register as BE and convert it to BE when writing it to the descriptor.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Reported-by: Richard Hendrickson <richhend@us.ibm.com> Cc: Eli Cohen <eli@dev.mellanox.co.il> Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Fix x86 insn decoder for hardening against invalid length
instructions. This adds length checkings for each byte-read
site and if it exceeds MAX_INSN_SIZE, returns immediately.
This can happen when decoding user-space binary.
Caller can check whether it happened by checking insn.*.got
member is set or not.
Adrian Bunk [Sun, 9 Oct 2011 13:47:26 +0000 (16:47 +0300)]
x86, vsyscall: Use better error message on bad pointers
At least some UML versions run into this and need
vsyscall=native for now.
Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Andrew Lutomirski <luto@mit.edu> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Link: http://lkml.kernel.org/r/20111009134725.GD4586@localhost.pp.htv.fi Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Thu, 6 Oct 2011 12:20:27 +0000 (14:20 +0200)]
x86, nmi, drivers: Fix nmi splitup build bug
nmi.c needs an #include <linux/mca.h>:
arch/x86/kernel/nmi.c: In function ‘unknown_nmi_error’:
arch/x86/kernel/nmi.c:286:6: error: ‘MCA_bus’ undeclared (first use in this function)
arch/x86/kernel/nmi.c:286:6: note: each undeclared identifier is reported only once for each function it appears in
Another one is the hpwdt driver:
drivers/watchdog/hpwdt.c:507:9: error: ‘NMI_DONE’ undeclared (first use in this function)
Robert Richter [Wed, 21 Sep 2011 09:30:18 +0000 (11:30 +0200)]
perf, x86: Implement IBS initialization
This patch implements IBS feature detection and initialzation. The
code is shared between perf and oprofile. If IBS is available on the
system for perf, a pmu is setup.
Don Zickus [Fri, 30 Sep 2011 19:06:23 +0000 (15:06 -0400)]
x86, nmi: Track NMI usage stats
Now that the NMI handler are broken into lists, increment the appropriate
stats for each list. This allows us to see what is going on when they
get printed out in the next patch.
Don Zickus [Fri, 30 Sep 2011 19:06:22 +0000 (15:06 -0400)]
x86, nmi: Add in logic to handle multiple events and unknown NMIs
Previous patches allow the NMI subsystem to process multipe NMI events
in one NMI. As previously discussed this can cause issues when an event
triggered another NMI but is processed in the current NMI. This causes the
next NMI to go unprocessed and become an 'unknown' NMI.
To handle this, we first have to flag whether or not the NMI handler handled
more than one event or not. If it did, then there exists a chance that
the next NMI might be already processed. Once the NMI is flagged as a
candidate to be swallowed, we next look for a back-to-back NMI condition.
This is determined by looking at the %rip from pt_regs. If it is the same
as the previous NMI, it is assumed the cpu did not have a chance to jump
back into a non-NMI context and execute code and instead handled another NMI.
If both of those conditions are true then we will swallow any unknown NMI.
There still exists a chance that we accidentally swallow a real unknown NMI,
but for now things seem better.
An optimization has also been added to the nmi notifier rountine. Because x86
can latch up to one NMI while currently processing an NMI, we don't have to
worry about executing _all_ the handlers in a standalone NMI. The idea is
if multiple NMIs come in, the second NMI will represent them. For those
back-to-back NMI cases, we have the potentail to drop NMIs. Therefore only
execute all the handlers in the second half of a detected back-to-back NMI.
Don Zickus [Fri, 30 Sep 2011 19:06:21 +0000 (15:06 -0400)]
x86, nmi: Wire up NMI handlers to new routines
Just convert all the files that have an nmi handler to the new routines.
Most of it is straight forward conversion. A couple of places needed some
tweaking like kgdb which separates the debug notifier from the nmi handler
and mce removes a call to notify_die.
[Thanks to Ying for finding out the history behind that mce call
https://lkml.org/lkml/2010/5/27/114
And Boris responding that he would like to remove that call because of it
https://lkml.org/lkml/2011/9/21/163]
The things that get converted are the registeration/unregistration routines
and the nmi handler itself has its args changed along with code removal
to check which list it is on (most are on one NMI list except for kgdb
which has both an NMI routine and an NMI Unknown routine).
Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Corey Minyard <minyard@acm.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Corey Minyard <minyard@acm.org> Cc: Jack Steiner <steiner@sgi.com> Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
Don Zickus [Fri, 30 Sep 2011 19:06:20 +0000 (15:06 -0400)]
x86, nmi: Create new NMI handler routines
The NMI handlers used to rely on the notifier infrastructure. This worked
great until we wanted to support handling multiple events better.
One of the key ideas to the nmi handling is to process _all_ the handlers for
each NMI. The reason behind this switch is because NMIs are edge triggered.
If enough NMIs are triggered, then they could be lost because the cpu can
only latch at most one NMI (besides the one currently being processed).
In order to deal with this we have decided to process all the NMI handlers
for each NMI. This allows the handlers to determine if they recieved an
event or not (the ones that can not determine this will be left to fend
for themselves on the unknown NMI list).
As a result of this change it is now possible to have an extra NMI that
was destined to be received for an already processed event. Because the
event was processed in the previous NMI, this NMI gets dropped and becomes
an 'unknown' NMI. This of course will cause printks that scare people.
However, we prefer to have extra NMIs as opposed to losing NMIs and as such
are have developed a basic mechanism to catch most of them. That will be
a later patch.
To accomplish this idea, I unhooked the nmi handlers from the notifier
routines and created a new mechanism loosely based on doIRQ. The reason
for this is the notifier routines have a couple of shortcomings. One we
could't guarantee all future NMI handlers used NOTIFY_OK instead of
NOTIFY_STOP. Second, we couldn't keep track of the number of events being
handled in each routine (most only handle one, perf can handle more than one).
Third, I wanted to eventually display which nmi handlers are registered in
the system in /proc/interrupts to help see who is generating NMIs.
The patch below just implements the new infrastructure but doesn't wire it up
yet (that is the next patch). Its design is based on doIRQ structs and the
atomic notifier routines. So the rcu stuff in the patch isn't entirely untested
(as the notifier routines have soaked it) but it should be double checked in
case I copied the code wrong.
Gleb Natapov [Wed, 5 Oct 2011 12:01:21 +0000 (14:01 +0200)]
perf, intel: Use GO/HO bits in perf-ctr
Intel does not have guest/host-only bit in perf counters like AMD
does. To support GO/HO bits KVM needs to switch EVENTSELn values
(or PERF_GLOBAL_CTRL if available) at a guest entry. If a counter is
configured to count only in a guest mode it stays disabled in a host,
but VMX is configured to switch it to enabled value during guest entry.
This patch adds GO/HO tracking to Intel perf code and provides interface
for KVM to get a list of MSRs that need to be switched on a guest entry.
Only cpus with architectural PMU (v1 or later) are supported with this
patch. To my knowledge there is not p6 models with VMX but without
architectural PMU and p4 with VMX are rare and the interface is general
enough to support them if need arise.
Linus Torvalds [Mon, 10 Oct 2011 02:48:27 +0000 (14:48 +1200)]
Merge branch 'fixes' of git://git.linaro.org/people/arnd/arm-soc
* 'fixes' of git://git.linaro.org/people/arnd/arm-soc:
ARM: mach-ux500: enable fix for ARM errata 754322
ARM: OMAP: musb: Remove a redundant omap4430_phy_init call in usb_musb_init
ARM: OMAP: Fix i2c init for twl4030
ARM: OMAP4: MMC: fix power and audio issue, decouple USBC1 from MMC1
Marc Dietrich [Fri, 7 Oct 2011 15:31:41 +0000 (08:31 -0700)]
ARM: tegra: fix compilation error due to mach/hardware.h removal
This fixes a compilation error in cpu-tegra.c which was introduced in dc8d966bccde ("ARM: convert PCI defines to variables") which removed the
now obsolete mach/hardware.h from the mach-tegra subtree.
Signed-off-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 10 Oct 2011 02:43:06 +0000 (14:43 +1200)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/kms: use hardcoded dig encoder to transmitter mapping for DCE4.1
drm/radeon/kms: fix dp_detect handling for DP bridge chips
drm/radeon/kms: retry aux transactions if there are status flags
Olof Johansson [Fri, 7 Oct 2011 20:27:48 +0000 (13:27 -0700)]
MAINTAINERS: Update tegra maintainer information
A couple of changes to the Tegra maintainership setup:
I'm very glad to bring on Stephen Warren on board as a maintainer. The
work he has done so far is excellent, and the fact that he works for
Nvidia means he has long-term interest in the platform.
Erik Gilling did an astounding amount of work on getting things up and
running but has been a silent partner on the maintainership side for a
while, and is stepping down. Thanks for your contributions so far, Erik.
Finally, update the git URL since I'll take over running the main repo
for a while.
Overall maintainership model isn't changing much at this time: We'll all
three review patches as appropriate, and one of us will collect the main
repo (me at this time).
Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Erik Gilling <konkers@android.com> Acked-by: Colin Cross <ccross@android.com> Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 10 Oct 2011 02:39:03 +0000 (14:39 +1200)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (29 commits)
MIPS: Call oops_enter, oops_exit in die
staging/octeon: Software should check the checksum of no tcp/udp packets
MIPS: Octeon: Enable C0_UserLocal probing.
MIPS: No branches in delay slots for huge pages in handle_tlbl
MIPS: Don't clobber CP0_STATUS value for CONFIG_MIPS_MT_SMTC
MIPS: Octeon: Select CONFIG_HOLES_IN_ZONE
MIPS: PM: Use struct syscore_ops instead of sysdevs for PM (v2)
MIPS: Compat: Use 32-bit wrapper for compat_sys_futex.
MIPS: Do not use EXTRA_CFLAGS
MIPS: Alchemy: DB1200: Disable cascade IRQ in handler
SERIAL: Lantiq: Set timeout in uart_port
MIPS: Lantiq: Fix setting the PCI bus speed on AR9
MIPS: Lantiq: Fix external interrupt sources
MIPS: tlbex: Fix build error in R3000 code.
MIPS: Alchemy: Include Au1100 in PM code.
MIPS: Alchemy: Fix typo in MAC0 registration
MIPS: MSP71xx: Fix build error.
MIPS: Handle __put_user() sleeping.
MIPS: Allow forced irq threading
MIPS: i8259: Mark cascade interrupt non-threaded
...
Steve French [Fri, 7 Oct 2011 04:14:07 +0000 (23:14 -0500)]
[CIFS] Fix first time message on mount, ntlmv2 upgrade delayed to 3.2
Microsoft has a bug with ntlmv2 that requires use of ntlmssp, but
we didn't get the required information on when/how to use ntlmssp to
old (but once very popular) legacy servers (various NT4 fixpacks
for example) until too late to merge for 3.1. Will upgrade
to NTLMv2 in NTLMSSP in 3.2
Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com>
perf tools: Make perf.data more self-descriptive (v8)
The goal of this patch is to include more information about the host
environment into the perf.data so it is more self-descriptive. Overtime,
profiles are captured on various machines and it becomes hard to track
what was recorded, on what machine and when.
This patch provides a way to solve this by extending the perf.data file
with basic information about the host machine. To add those extensions,
we leverage the feature bits capabilities of the perf.data format. The
change is backward compatible with existing perf.data files.
We define the following useful new extensions:
- HEADER_HOSTNAME: the hostname
- HEADER_OSRELEASE: the kernel release number
- HEADER_ARCH: the hw architecture
- HEADER_CPUDESC: generic CPU description
- HEADER_NRCPUS: number of online/avail cpus
- HEADER_CMDLINE: perf command line
- HEADER_VERSION: perf version
- HEADER_TOPOLOGY: cpu topology
- HEADER_EVENT_DESC: full event description (attrs)
- HEADER_CPUID: easy-to-parse low level CPU identication
The small granularity for the entries is to make it easier to extend
without breaking backward compatiblity. Many entries are provided as
ASCII strings.
Perf report/script have been modified to print the basic information as
easy-to-parse ASCII strings. Extended information about CPU and NUMA
topology may be requested with the -I option.
Thanks to David Ahern for reviewing and testing the many versions of
this patch.
$ perf report --stdio
# ========
# captured on : Mon Sep 26 15:22:14 2011
# hostname : quad
# os release : 3.1.0-rc4-tip
# perf version : 3.1.0-rc4
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
# cpuid : GenuineIntel,6,15,11
# total memory : 8105360 kB
# cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
...
Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20110930134040.GA5575@quad Signed-off-by: Stephane Eranian <eranian@google.com>
[ committer notes: Use --show-info in the tools as was in the docs, rename
perf_header_fprintf_info to perf_file_section__fprintf_info, fixup
conflict with f69b64f7 "perf: Support setting the disassembler style" ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf hists browser: Fix TAB/UNTAB use with multiple events
When requesting multiple events, say:
# perf top -e instructions -e cycles -e cache-misses
The first screen lets the user chose what to see first, then to switch
one can either use the left key to get back to the event menu or simply
use TAB to go the next and shift+TAB to go the prev.
When using TAB/UNTAB the call to perf_evlist__set_selected(event) was
missing, fix it.
Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-3xqqh3fwmt914gg43frey14y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf tools: Fix broken number of samples for perf report -n
The perf report -n option was broken because it was not reporting the
correct number of samples depending on the sorting mode. By default,
samples are sorted by comm,dso,sym. That means that samples for the same
command (binary) get collapsed.
The hists__collapse_insert_entry() had a bug whereby it was aggregating
the number of events observed (periods) but not the number of samples.
Consequently, the number of samples reported could be below reality. The
percentage remained correct because based on the periods.
This patch fixes the problem by also aggregating the number of samples.
Here is an example:
1. Make sure newt-devel is not installed when building it
2. Use 'perf top --stdio' just like with report
3. Edit your ~/.perfconfig or system wide config and have this there:
[tui]
top = off
But you shouldn't, since the TUI is so much more powerful, has
integration with annotation and where lots more interesting features
will be developed, so if something annoys you (the colors?) just let me
know and I'll do my best to make it pleasant as a default.
Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-cy2tn4uj1t7c3aqss5l25of5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf annotate browser: Allow navigation to called functions
I.e. when in the annotate TUI window, if Enter is pressed over an
assembly line with a 'callq' it will try to open another TUI window with
that symbol.
This is just a proof of concept and works only on x86_64, more work is
needed to support kernel modules, userland, other arches, etc, but
should already be useful as-is.
Suggested-by: Ingo Molnar <mingo@elte.hu> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-opyvskw5na3qdmkv8vxi3zbr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf top: Reuse the 'report' hist_entry/hists classes
This actually fixes several problems we had in the old 'perf top':
1. Unresolved symbols not show, limitation that came from the old
"KernelTop" codebase, to solve it we would need to do changes
that would make sym_entry have most of the hist_entry fields.
2. It was using the number of samples, not the sum of sample->period.
And brings the --sort code that allows us to have all the views in
'perf report', for instance:
perf browsers: Add live mode to the hists, annotate browsers
This allows passing a timer to be run periodically, which will update
the hists tree that then gers refreshed on the screen, just like the
Live mode (symbol entries, annotation) we already have in 'perf top
--tui'.
Will be used by the new hist_entry/hists based 'top' tool.
Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2r44qd8oe4sagzcgoikl8qzc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf hists: Threaded addition and sorting of entries
By using a mutex just for inserting and rotating two hist_entry rb
trees, so that when sorting we can get the last batch of entries created
from the ring buffer, merge it with whatever we have processed so far
and show the output while new entries are being added.
The 'report' tool continues, for now, to do it without threading, but
will use this in the future to allow visualization of results in long
perf.data sessions while the entries are being processed.
The new 'top' tool will be the first user.
Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-9b05atsn0q6m7fqgrug8fk2i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Thu, 6 Oct 2011 23:15:10 +0000 (16:15 -0700)]
Merge git://github.com/davem330/net
* git://github.com/davem330/net:
net: fix typos in Documentation/networking/scaling.txt
bridge: leave carrier on for empty bridge
netfilter: Use proper rwlock init function
tcp: properly update lost_cnt_hint during shifting
tcp: properly handle md5sig_pool references
macvlan/macvtap: Fix unicast between macvtap interfaces in bridge mode
Paul Menzel [Wed, 31 Aug 2011 15:07:10 +0000 (17:07 +0200)]
x86/PCI: use host bridge _CRS info on ASUS M2V-MX SE
In summary, this DMI quirk uses the _CRS info by default for the ASUS
M2V-MX SE by turning on `pci=use_crs` and is similar to the quirk
added by commit 2491762cfb47 ("x86/PCI: use host bridge _CRS info on
ASRock ALiveSATA2-GLAN") whose commit message should be read for further
information.
Since commit 3e3da00c01d0 ("x86/pci: AMD one chain system to use pci
read out res") Linux gives the following oops:
Trusting the ACPI _CRS information (`pci=use_crs`) fixes this problem.
$ dmesg | grep -i crs # with the quirk
PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
The match has to be against the DMI board entries though since the vendor entries are not populated.
DMI: System manufacturer System Product Name/M2V-MX SE, BIOS 0304 10/30/2007
This quirk should be removed when `pci=use_crs` is enabled for machines
from 2006 or earlier or some other solution is implemented.
Using coreboot [1] with this board the problem does not exist but this
quirk also does not affect it either. To be safe though the check is
tightened to only take effect when the BIOS from American Megatrends is
used.
15:13 < ruik> but coreboot does not need that
15:13 < ruik> because i have there only one root bus
15:13 < ruik> the audio is behind a bridge
$ sudo dmidecode
BIOS Information
Vendor: American Megatrends Inc.
Version: 0304
Release Date: 10/30/2007
This resolves a regression seen by some users of bridging.
Some users use the bridge like a dummy device.
They expect to be able to put an IPv6 address on the device
with no ports attached. Although there are better ways of doing
this, there is no reason to not allow it.
Note: the bridge still will reflect the state of ports in the
bridge if there are any added.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Joerg Roedel [Wed, 5 Oct 2011 12:01:16 +0000 (14:01 +0200)]
perf, core: Introduce attrs to count in either host or guest mode
The two new attributes exclude_guest and exclude_host can
bes used by user-space to tell the kernel to setup
performance counter to either only count while the CPU is in
guest or in host mode.
An additional check is also introduced to make sure
user-space does not try to exclude guest and host mode from
counting.
Peter Zijlstra [Sat, 25 Jun 2011 13:45:46 +0000 (15:45 +0200)]
sched: Unify the ->cpus_allowed mask copy
Currently every sched_class::set_cpus_allowed() implementation has to
copy the cpumask into task_struct::cpus_allowed, this is pointless,
put this copy in the generic code.
Peter Zijlstra [Thu, 16 Jun 2011 10:23:22 +0000 (12:23 +0200)]
sched: Wrap scheduler p->cpus_allowed access
This task is preparatory for the migrate_disable() implementation, but
stands on its own and provides a cleanup.
It currently only converts those sites required for task-placement.
Kosaki-san once mentioned replacing cpus_allowed with a proper
cpumask_t instead of the NR_CPUS sized array it currently is, that
would also require something like this.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Link: http://lkml.kernel.org/n/tip-e42skvaddos99psip0vce41o@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
Suresh Siddha [Mon, 3 Oct 2011 22:09:01 +0000 (15:09 -0700)]
sched: Request for idle balance during nohz idle load balance
rq's idle_at_tick is set to idle/busy during the timer tick
depending on the cpu was idle or not. This will be used later in the load
balance that will be done in the softirq context (which is a process
context in -RT kernels).
For nohz kernels, for the cpu doing nohz idle load balance on behalf of
all the idle cpu's, its rq->idle_at_tick might have a stale value (which is
recorded when it got the timer tick presumably when it is busy).
As the nohz idle load balancing is also being done at the same place
as the regular load balancing, nohz idle load balancing was bailing out
when it sees rq's idle_at_tick not set.
Thus leading to poor system utilization.
Rename rq's idle_at_tick to idle_balance and set it when someone requests
for nohz idle balance on an idle cpu.
Suresh Siddha [Mon, 3 Oct 2011 22:09:00 +0000 (15:09 -0700)]
sched: Use resched IPI to kick off the nohz idle balance
Current use of smp call function to kick the nohz idle balance can deadlock
in this scenario.
1. cpu-A did a generic_exec_single() to cpu-B and after queuing its call single
data (csd) to the call single queue, cpu-A took a timer interrupt. Actual IPI
to cpu-B to process the call single queue is not yet sent.
2. As part of the timer interrupt handler, cpu-A decided to kick cpu-B
for the idle load balancing (sets cpu-B's rq->nohz_balance_kick to 1)
and __smp_call_function_single() with nowait will queue the csd to the
cpu-B's queue. But the generic_exec_single() won't send an IPI to cpu-B
as the call single queue was not empty.
3. cpu-A is busy with lot of interrupts
4. Meanwhile cpu-B is entering and exiting idle and noticed that it has
it's rq->nohz_balance_kick set to '1'. So it will go ahead and do the
idle load balancer and clear its rq->nohz_balance_kick.
5. At this point, csd queued as part of the step-2 above is still locked
and waiting to be serviced on cpu-B.
6. cpu-A is still busy with interrupt load and now it got another timer
interrupt and as part of it decided to kick cpu-B for another idle load
balancing (as it finds cpu-B's rq->nohz_balance_kick cleared in step-4
above) and does __smp_call_function_single() with the same csd that is
still locked.
7. And we get a deadlock waiting for the csd_lock() in the
__smp_call_function_single().
Main issue here is that cpu-B can service the idle load balancer kick
request from cpu-A even with out receiving the IPI and this lead to
doing multiple __smp_call_function_single() on the same csd leading to
deadlock.
To kick a cpu, scheduler already has the reschedule vector reserved. Use
that mechanism (kick_process()) instead of using the generic smp call function
mechanism to kick off the nohz idle load balancing and avoid the deadlock.
[ This issue is present from 2.6.35+ kernels, but marking it -stable
only from v3.0+ as the proposed fix depends on the scheduler_ipi()
that is introduced recently. ]
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
Thomas Gleixner [Wed, 5 Oct 2011 03:24:43 +0000 (03:24 +0000)]
netfilter: Use proper rwlock init function
Replace the open coded initialization with the init function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6:
[SCSI] libsas: fix panic when single phy is disabled on a wide port
[SCSI] qla2xxx: Fix crash in qla2x00_abort_all_cmds() on unload
Alex Deucher [Tue, 4 Oct 2011 16:23:24 +0000 (12:23 -0400)]
drm/radeon/kms: fix dp_detect handling for DP bridge chips
The HPD pin is not reliable for detecting whether a monitor
is connected or not. Skip HPD and just use DDC or load
detection.
Fixes phantom VGA connected bugs.
[Michel: fixes phantom VGA bugs on his llano system.]
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Tue, 4 Oct 2011 21:23:15 +0000 (17:23 -0400)]
drm/radeon/kms: retry aux transactions if there are status flags
If there are error flags in the aux status, retry the transaction.
This makes aux much more reliable, especially on llano systems.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
Yan, Zheng [Sun, 2 Oct 2011 04:21:50 +0000 (04:21 +0000)]
tcp: properly update lost_cnt_hint during shifting
lost_skb_hint is used by tcp_mark_head_lost() to mark the first unhandled skb.
lost_cnt_hint is the number of packets or sacked packets before the lost_skb_hint;
When shifting a skb that is before the lost_skb_hint, if tcp_is_fack() is ture,
the skb has already been counted in the lost_cnt_hint; if tcp_is_fack() is false,
tcp_sacktag_one() will increase the lost_cnt_hint. So tcp_shifted_skb() does not
need to adjust the lost_cnt_hint by itself. When shifting a skb that is equal to
lost_skb_hint, the shifted packets will not be counted by tcp_mark_head_lost().
So tcp_shifted_skb() should adjust the lost_cnt_hint even tcp_is_fack(tp) is true.
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_v4_clear_md5_list() assumes that multiple tcp md5sig peers
only hold one reference to md5sig_pool. but tcp_v4_md5_do_add()
increases use count of md5sig_pool for each peer. This patch
makes tcp_v4_md5_do_add() only increases use count for the first
tcp md5sig peer.
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ward [Sun, 18 Sep 2011 12:53:20 +0000 (12:53 +0000)]
macvlan/macvtap: Fix unicast between macvtap interfaces in bridge mode
Packets should always be forwarded to the lowerdev using dev_forward_skb.
vlan->forward is for packets being forwarded directly to another macvlan/
macvtap device (used for multicast in bridge mode).
Reported-and-tested-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 4 Oct 2011 17:37:06 +0000 (10:37 -0700)]
Merge git://github.com/davem330/net
* git://github.com/davem330/net:
pch_gbe: Fixed the issue on which a network freezes
pch_gbe: Fixed the issue on which PC was frozen when link was downed.
make PACKET_STATISTICS getsockopt report consistently between ring and non-ring
net: xen-netback: correctly restart Tx after a VM restore/migrate
bonding: properly stop queuing work when requested
can bcm: fix incomplete tx_setup fix
RDSRDMA: Fix cleanup of rds_iw_mr_pool
net: Documentation: Fix type of variables
ibmveth: Fix oops on request_irq failure
ipv6: nullify ipv6_ac_list and ipv6_fl_list when creating new socket
cxgb4: Fix EEH on IBM P7IOC
can bcm: fix tx_setup off-by-one errors
MAINTAINERS: tehuti: Alexander Indenbaum's address bounces
dp83640: reduce driver noise
ptp: fix L2 event message recognition
Linus Torvalds [Tue, 4 Oct 2011 16:59:22 +0000 (09:59 -0700)]
Merge branch 'fix/asoc' of git://github.com/tiwai/sound
* 'fix/asoc' of git://github.com/tiwai/sound:
ASoC: omap_mcpdm_remove cannot be __devexit
ASoC: Fix setting update bits for WM8753_LADC and WM8753_RADC
ASoC: use a valid device for dev_err() in Zylonite