Andi Kleen [Wed, 16 Mar 2011 19:44:33 +0000 (15:44 -0400)]
oprofile, x86: Allow setting EDGE/INV/CMASK for counter events
For some performance events it's useful to set the EDGE and INV
bits and the CMASK mask in the counter control register. The list
of predefined events Intel releases for each CPU has some events which
require these settings to get more "natural" to use higher level events.
oprofile currently doesn't allow this.
This patch adds new extra configuration fields for them, so that
they can be specified in oprofilefs.
An updated oprofile daemon can then make use of this to set them.
v2: Write back masked extra value to variable.
Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Robert Richter <robert.richter@amd.com>
Robert Richter [Fri, 11 Feb 2011 16:31:44 +0000 (17:31 +0100)]
oprofile, s390: Rework hwsampler implementation
This patch is a rework of the hwsampler oprofile implementation that
has been applied recently. Now there are less non-architectural
changes. The only changes are:
* introduction of oprofile_add_ext_hw_sample(), and
* removal of section attributes of oprofile_timer_init/_exit().
To setup hwsampler for oprofile we need to modify start()/stop()
callbacks and additional hwsampler control files in oprofilefs. We do
not reinitialize the timer or hwsampler mode by restarting calling
init/exit() anymore, instead hwsampler_running is used to switch the
mode directly in oprofile_hwsampler_start/_stop(). For locking reasons
there is also hwsampler_file that reflects the value in oprofilefs.
The overall diffstat of the oprofile s390 hwsampler implemenation
shows the low impact to non-architectural code:
Heinz Graalfs [Fri, 21 Jan 2011 10:06:53 +0000 (10:06 +0000)]
oprofile, s390: Enhance OProfile to support System zs hardware sampling feature
OProfile is enhanced to export all files for controlling System z's
hardware sampling, and to invoke hwsampler exported functions to
initialize and use System z's hardware sampling.
The patch invokes hwsampler_setup() during oprofile init and exports
following hwsampler files under oprofilefs if hwsampler's setup
succeeded:
A new directory for hardware sampling based files
/dev/oprofile/hwsampling/
The userland daemon must explicitly write to the following files
to disable (or enable) hardware based sampling
/dev/oprofile/hwsampling/hwsampler
to modify the actual sampling rate
/dev/oprofile/hwsampling/hw_interval
to modify the amount of sampling memory (measured in 4K pages)
/dev/oprofile/hwsampling/hw_sdbt_blocks
The following files are read only and show
the possible minimum sampling rate
/dev/oprofile/hwsampling/hw_min_interval
the possible maximum sampling rate
/dev/oprofile/hwsampling/hw_max_interval
The patch splits the oprofile_timer_[init/exit] function so that it
can be also called through user context (oprofilefs) to avoid kernel
oops.
Applied with following changes:
* whitespace changes in Makefile and timer_int.c
Heinz Graalfs [Fri, 21 Jan 2011 10:06:52 +0000 (10:06 +0000)]
oprofile, s390: Add support for hardware based sampling on System z processors
This adds support for hardware based sampling on System z processors
(models z10 and up).
System z's hardware sampling is described in detail in:
SA23-2260-01 "The Load-Program-Parameter and CPU-Measurement Facilities"
The patch introduces
- support for System z's hardware sampler in OProfile's kernel module
- it adds functions that control all hardware sampling related
operations as:
- checking if hardware sampling feature is available, i.e.: on
System z models z10 and up, in LPAR mode only, and authorised
during LPAR activation
- allocating memory for the hardware sampling feature
- starting/stopping hardware sampling
All functions required to start and stop hardware sampling have to be
invoked by the oprofile kernel module as provided by the other patches
of this patch set.
In case hardware based sampling cannot be setup standard timer based
sampling is used by OProfile.
Applied with following changes:
* enable compilation in Makefile
Heinz Graalfs [Fri, 21 Jan 2011 10:06:54 +0000 (10:06 +0000)]
oprofile: Introduce new oprofile sample add function (oprofile_add_ext_hw_sample)
This patch introduces a new oprofile sample add function
(oprofile_add_ext_hw_sample) that can also take task_struct as an
argument, which is used by the hwsampler kernel module when copying
hardware samples to OProfile buffers.
Applied with following changes:
* removed #include <linux/module.h>
* whitespace changes
* removed conditional compilation (CONFIG_HAVE_HWSAMPLER)
* modified order of functions
* fix missing function definition in header file
Ari Kauppi [Thu, 20 Jan 2011 18:57:19 +0000 (13:57 -0500)]
ARM: oprofile: Fix backtraces in timer mode
Always allow backtraces when using oprofile on ARM, even if a PMU
isn't present. Restores functionality originally introduced in commit 1b7b56982fdcd9d85effd76f3928cf5d6eb26155 ("oprofile: Always allow
backtraces on ARM") by Richard Purdie.
It is not that obvious, but there is now only one oprofile_arch_init()
function. So the .backtrace callback is available also in timer mode.
Implemented by removing code and using stubs for oprofile_perf_{init,
exit} provided by <linux/oprofile.h>. This allows cleaning of other
architecture specific implementations too.
Cc: stable@kernel.org # 37.x Signed-off-by: Ari Kauppi <kauppi@papupata.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robert Richter <robert.richter@amd.com>
Linus Torvalds [Sat, 22 Jan 2011 00:50:31 +0000 (16:50 -0800)]
Merge branch 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (101 commits)
[media] staging/lirc: fix mem leaks and ptr err usage
[media] hdpvr: reduce latency of i2c read/write w/recycled buffer
[media] hdpvr: enable IR part
[media] rc/mceusb: timeout should be in ns, not us
[media] v4l2-device: fix 'use-after-freed' oops
[media] v4l2-dev: don't memset video_device.dev
[media] zoran: use video_device_alloc instead of kmalloc
[media] w9966: zero device state after a detach
[media] v4l: Fix a use-before-set in the control framework
[media] v4l: Include linux/videodev2.h in media/v4l2-ctrls.h
[media] DocBook/v4l: update V4L2 revision and update copyright years
[media] DocBook/v4l: fix validation error in dev-rds.xml
[media] v4l2-ctrls: queryctrl shouldn't attempt to replace V4L2_CID_PRIVATE_BASE IDs
[media] v4l2-ctrls: fix missing 'read-only' check
[media] pvrusb2: Provide more information about IR units to lirc_zilog and ir-kbd-i2c
[media] ir-kbd-i2c: Add back defaults setting for Zilog Z8's at addr 0x71
[media] lirc_zilog: Update TODO.lirc_zilog
[media] lirc_zilog: Add Andy Walls to copyright notice and authors list
[media] lirc_zilog: Remove useless struct i2c_driver.command function
[media] lirc_zilog: Remove unneeded tests for existence of the IR Tx function
...
David Howells [Thu, 20 Jan 2011 16:38:27 +0000 (16:38 +0000)]
KEYS: Do some style cleanup in the key management code.
Do a bit of a style clean up in the key management code. No functional
changes.
Done using:
perl -p -i -e 's!^/[*]*/\n!!' security/keys/*.c
perl -p -i -e 's!} /[*] end [a-z0-9_]*[(][)] [*]/\n!}\n!' security/keys/*.c
sed -i -s -e ": next" -e N -e 's/^\n[}]$/}/' -e t -e P -e 's/^.*\n//' -e "b next" security/keys/*.c
To remove /*****/ lines, remove comments on the closing brace of a
function to name the function and remove blank lines before the closing
brace of a function.
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix up CIFSSMBEcho for unaligned access
cifs: fix unaligned accesses in cifsConvertToUCS
cifs: clean up unaligned accesses in cifs_unicode.c
cifs: fix unaligned access in check2ndT2 and coalesce_t2
cifs: clean up unaligned accesses in validate_t2
cifs: use get/put_unaligned functions to access ByteCount
cifs: move time field in cifsInodeInfo
cifs: TCP_Server_Info diet
CIFS: Implement cifs_strict_readv (try #4)
CIFS: Implement cifs_file_strict_mmap (try #2)
CIFS: Implement cifs_strict_fsync
CIFS: Make cifsFileInfo_put work with strict cache mode
Linus Torvalds [Fri, 21 Jan 2011 21:38:57 +0000 (13:38 -0800)]
Merge branch 'fixes-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
* 'fixes-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: note the nested NOT_RUNNING test in worker_clr_flags() isn't a noop
workqueue: relax lockdep annotation on flush_work()
Linus Torvalds [Fri, 21 Jan 2011 21:34:39 +0000 (13:34 -0800)]
Merge branches 'fixes' and 'fwnet' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: core: fix unstable I/O with Canon camcorder
* 'fwnet' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: net: is not experimental anymore
firewire: net: invalidate ARP entries of removed nodes
Michal Simek [Fri, 21 Jan 2011 07:49:56 +0000 (08:49 +0100)]
mm: System without MMU do not need pte_mkwrite
The patch "thp: export maybe_mkwrite" (commit 14fd403f2146) breaks
systems without MMU.
Error log:
CC arch/microblaze/mm/init.o
In file included from include/linux/mman.h:14,
from arch/microblaze/mm/consistent.c:24:
include/linux/mm.h: In function 'maybe_mkwrite':
include/linux/mm.h:482: error: implicit declaration of function 'pte_mkwrite'
include/linux/mm.h:482: error: incompatible types in assignment
Signed-off-by: Michal Simek <monstr@monstr.eu> CC: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Although the last_pfn obtained from the startup info is 0x26700, which
should in turn not be hit, the additional 8MB which are added as extra
memory normally seem to be ok. This lead to looking into the initial
p2m tree construction, which uses the smaller value and assuming that
there is other code handling the extra memory.
When the p2m tree is set up, the leaves are directly pointed to the
array which the domain builder set up. But if the mapping is not on a
boundary that fits into one p2m page, this will result in the last leaf
being only partially valid. And as the invalid entries are not
initialized in that case, things go badly wrong.
I am trying to fix that by checking whether the current leaf is a
complete map and if not, allocate a completely new page and copy only
the valid pointers there. This may not be the most efficient or elegant
solution, but at least it seems to allow me booting DomUs with memory
assignments all over the range.
Thomas Gleixner [Wed, 19 Jan 2011 18:41:35 +0000 (19:41 +0100)]
genirq: Remove __do_IRQ
All architectures are finally converted. Remove the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Mike Frysinger <vapier@gentoo.org> Cc: David Howells <dhowells@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Michal Simek <monstr@monstr.eu> Acked-by: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jeff Dike <jdike@addtoit.com>
Thomas Gleixner [Wed, 22 Sep 2010 17:13:16 +0000 (19:13 +0200)]
m32r: Cleanup direct irq_desc access
The irq descriptors are already initialized by the generic
code. Remove the redundant init code and set the irq chip with the
proper accessor function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 11:26:32 +0000 (12:26 +0100)]
h8300: Use generic irq Kconfig
Switch to the generic irq Kconfig. h8300 has all irq chips converted
to the new functions, so select the GENERIC_HARDIRQS_NO_DEPRECATED
switch as well. Fixup the resulting fallout in show_interrupts().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 11:18:57 +0000 (12:18 +0100)]
h8300: Convert interrupt handling to flow handler
__do_IRQ is deprecated so h8300 needs to be converted to proper flow
handling. The irq chip is simple and does not required any
mask/ack/eoi functions, so we can use handle_simple_irq.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Paul Mundt <lethal@linux-sh.org>
Ben Hutchings [Sat, 8 Jan 2011 14:24:01 +0000 (14:24 +0000)]
powerpc/boot/dts: Install dts from the right directory
The dts-installed variable is initialised using a wildcard path that
will be expanded relative to the build directory. Use the existing
variable dtstree to generate an absolute wildcard path that will work
when building in a separate directory.
Reported-by: Gerhard Pircher <gerhard_pircher@gmx.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Tested-by: Gerhard Pircher <gerhard_pircher@gmx.net> [against 2.6.32] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:52:31 +0000 (19:52 +0000)]
powerpc: machine_check_generic is wrong on 64bit
Decoding machine checks is CPU specific and so machine_check_generic doesn't
do the right thing on 64bit chips. Luckily we never call into this code
because we call ppc_md.machine_check_exception instead if available.
Since we check cur_cpu_spec->machine_check before calling it, we may as
well remove machine_check_generic from 64bit archs.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:50:51 +0000 (19:50 +0000)]
powerpc: Fix corruption when grabbing FWNMI data
The FWNMI code uses a global buffer without any locks to read the RTAS error
information. If two CPUs take a machine check at once then we will corrupt
this buffer.
Since most FWNMI rtas messages are not of the extended type, we can create a
64bit percpu buffer and use it where possible. If we do receive an extended
RTAS log then we fall back to the old behaviour of using the global buffer.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:49:19 +0000 (19:49 +0000)]
powerpc: Rework pseries machine check handler
Rework pseries machine check handler:
- If MSR_RI isn't set, we cannot recover even if the machine check was fully
recovered
- Rename nonfatal to recovered
- Handle RTAS_DISP_LIMITED_RECOVERY
- Use BUS_MCEERR_AR instead of BUS_ADRERR
- Don't check all the RTAS error log fields when receiving a synchronous
machine check. Recent versions of the pseries firmware do not fill them
in during a machine check and instead send a follow up error log with
the detailed information. If we see a synchronous machine check, and we
came from userspace then kill the task.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:48:14 +0000 (19:48 +0000)]
powerpc: Don't silently handle machine checks from userspace
If a machine check comes from userspace we send a SIGBUS to the task and
fail to printk anything.
If we are taking machine checks due to bad hardware we want to know about
it right away. Furthermore if we don't complain loudly then it will look
a lot like a bug in the userspace application, potentially causing a lot
of confusion.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:46:29 +0000 (19:46 +0000)]
powerpc: Never halt RTAS error logging after receiving an unrecoverable machine check
Newer versions of the System p firwmare send a partial RTAS error log in the
machine check handler with a more detailed response appearing sometime later
via check event.
This means at machine check time we do not have enough information to
ascertain exactly what went on. Furthermore, I have found the RTAS error
logs in the machine check handler contain no useful information, so halting on
them makes little sense. If we want to halt it would make more sense to do
it following the error log received sometime later via check event.
In light of this, never halt the error log in the pseries machine
check handler.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Thu, 6 Jan 2011 18:00:36 +0000 (18:00 +0000)]
powerpc/kdump: Disable ftrace during kexec
We should disable ftrace during kexec, some of the tracers are very invasive
and we do not want them going off while doing the low level work of swapping
one kernel out for another. This mirrors what we do on x86.
Even though we cannot return from a kexec on powerpc (since we do not implement
CONFIG_KEXEC_JUMP), add the restore code in case we do one day.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tejun Heo [Mon, 3 Jan 2011 03:49:25 +0000 (03:49 +0000)]
powerpc/cell: Use system_wq in cpufreq_spudemand
With cmwq, there's no reason to use a separate workqueue in
cpufreq_spudemand. Use system_wq instead. The work items are already
sync canceled on stop, so it's already guaranteed that no work is
running when spu_gov_exit() is entered.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linuxppc-dev@lists.ozlabs.org Cc: Dave Jones <davej@redhat.com> Cc: cpufreq@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Steven Rostedt [Wed, 22 Dec 2010 16:42:56 +0000 (16:42 +0000)]
powerpc/ppc32/tracing: Add stack frame to calls of trace_hardirqs_on/off
32-bit variant of the previous patch for 64-bit:
<<
When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
With one level stack. But if we have irqsoff tracing enabled,
it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
goes two stack frames up. If this is from user space, then there may
not exist a second stack....
>>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Steven Rostedt [Thu, 23 Dec 2010 19:46:06 +0000 (19:46 +0000)]
powerpc/ppc64/tracing: Add stack frame to calls of trace_hardirqs_on/off
When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
With one level stack. But if we have irqsoff tracing enabled,
it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
goes two stack frames up. If this is from user space, then there may
not exist a second stack.
Add a second stack when calling trace_hardirqs_on/off() otherwise
the following oops might occur:
Reported-by: Joerg Sommer <joerg@alea.gnuu.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc: Ensure the else case of feature sections will fit
When we create an alternative feature section, the else case must be the
same size or smaller than the body. This is because when we patch the
else case in we just overwrite the body, so there must be room.
Up to now we just did this by inspection, but it's quite easy to enforce
it in the assembler, so we should.
The only change is to add the ifgt block, but that effects the alignment
of the tabs and so the whole macro is modified.
Also add a test, but #if 0 it because we don't want to break the build.
Anyone who's modifying the feature macros should enable the test.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Torvalds [Fri, 21 Jan 2011 02:30:37 +0000 (18:30 -0800)]
Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
smp: Allow on_each_cpu() to be called while early_boot_irqs_disabled status to init/main.c
lockdep: Move early boot local IRQ enable/disable status to init/main.c
ACPI / PM: Call suspend_nvs_free() earlier during resume
It turns out that some device drivers map pages from the ACPI NVS region
during resume using ioremap(), which conflicts with ioremap_cache() used
for mapping those pages by the NVS save/restore code in nvs.c.
Make the NVS pages mapped by the code in nvs.c be unmapped before device
drivers' resume routines run.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit ca9b600be38c ("ACPI / PM: Make suspend_nvs_save() use
acpi_os_map_memory()") attempted to prevent the code in osl.c and nvs.c
from using different ioremap() variants by making the latter use
acpi_os_map_memory() for mapping the NVS pages. However, that also
requires acpi_os_unmap_memory() to be used for unmapping them, which
causes synchronize_rcu() to be executed many times in a row
unnecessarily and introduces substantial delays during resume on some
systems.
Instead of using acpi_os_map_memory() for mapping the NVS pages in nvs.c
introduce acpi_os_ioremap() calling ioremap_cache() and make the code in
both osl.c and nvs.c use it.
Reported-by: Jeff Chua <jeff.chua.linux@gmail.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>