Alan Stern [Sat, 25 Sep 2010 21:35:00 +0000 (23:35 +0200)]
PM / Runtime: Merge synchronous and async runtime routines
This patch (as1423) merges the asynchronous routines
__pm_request_idle(), __pm_request_suspend(), and __pm_request_resume()
with their synchronous counterparts. The RPM_ASYNC bitflag argument
serves to indicate what sort of operation to perform.
In the course of performing this merger, it became apparent that the
various functions don't all behave consistenly with regard to error
reporting and cancellation of outstanding requests. A new routine,
rpm_check_suspend_allowed(), was written to centralize much of the
testing, and the other functions were revised to follow a simple
algorithm:
If the operation is disallowed because of the device's
settings or current state, return an error.
Cancel pending or scheduled requests of lower priority.
Schedule, queue, or perform the desired operation.
A few special cases and exceptions are noted in comments.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Alan Stern [Sat, 25 Sep 2010 21:34:54 +0000 (23:34 +0200)]
PM / Runtime: Replace boolean arguments with bitflags
The "from_wq" argument in __pm_runtime_suspend() and
__pm_runtime_resume() supposedly indicates whether or not the function
was called by the PM workqueue thread, but in fact it isn't always
used this way. It really indicates whether or not the function should
return early if the requested operation is already in progress.
Along with this badly-named boolean argument, later patches in this
series will add several other boolean arguments to these functions and
others. Therefore this patch (as1422) begins the conversion process
by replacing from_wq with a bitflag argument. The same bitflags are
also used in __pm_runtime_get() and __pm_runtime_put(), where they
indicate whether or not the operation should be asynchronous.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Alan Stern [Sat, 25 Sep 2010 21:34:46 +0000 (23:34 +0200)]
PM / Runtime: Move code in drivers/base/power/runtime.c
This patch (as1421) moves the PM runtime accounting subroutines up to
the beginning of runtime.c, taking them out of the middle of the
functions that do the actual work. No operational changes.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Alan Stern [Sat, 25 Sep 2010 21:34:22 +0000 (23:34 +0200)]
sysfs: Add sysfs_merge_group() and sysfs_unmerge_group()
This patch (as1420) adds sysfs_merge_group() and sysfs_unmerge_group()
functions, allowing drivers easily to add and remove sets of
attributes to a pre-existing attribute group directory.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
PM: Fix potential issue with failing asynchronous suspend
There is a potential issue with the asynchronous suspend code that
a device driver suspending asynchronously may not notice that it
should back off. There are two failing scenarions, (1) when the
driver is waiting for a driver suspending synchronously to complete
and that second driver returns error code, in which case async_error
won't be set and the waiting driver will continue suspending and (2)
after the driver has called device_pm_wait_for_dev() and the waited
for driver returns error code, in which case the caller of
device_pm_wait_for_dev() will not know that there was an error and
will continue suspending.
To fix this issue make __device_suspend() set async_error, so
async_suspend() doesn't need to set it any more, and make
device_pm_wait_for_dev() return async_error, so that its callers
can check whether or not they should continue suspending.
No more changes are necessary, since device_pm_wait_for_dev() is
not used by any drivers' suspend routines.
Reported-by: Colin Cross <ccross@android.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Introduce struct wakeup_source for representing system wakeup sources
within the kernel and for collecting statistics related to them.
Make the recently introduced helper functions pm_wakeup_event(),
pm_stay_awake() and pm_relax() use struct wakeup_source objects
internally, so that wakeup statistics associated with wakeup devices
can be collected and reported in a consistent way (the definition of
pm_relax() is changed, which is harmless, because this function is
not called directly by anyone yet). Introduce new wakeup-related
sysfs device attributes in /sys/devices/.../power for reporting the
device wakeup statistics.
Change the global wakeup events counters event_count and
events_in_progress into atomic variables, so that it is not necessary
to acquire a global spinlock in pm_wakeup_event(), pm_stay_awake()
and pm_relax(), which should allow us to avoid lock contention in
these functions on SMP systems with many wakeup devices.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
PM / Hibernate: Make default image size depend on total RAM size
The default hibernation image size is currently hard coded and euqal
to 500 MB, which is not a reasonable default on many contemporary
systems. Make it equal 2/5 of the total RAM size (this is slightly
below the maximum, i.e. 1/2 of the total RAM size, and seems to be
generally suitable).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Tested-by: M. Vefa Bicakci <bicave@superonline.com>
PM / Runtime: Use alloc_workqueue() for creating the PM workqueue
Although we need the PM workqueue to be freezable, we don't need it
to be singlethread. Also, the number of concurrent work items
running on a single CPU need not be constrained. For these reasons
use alloc_workqueue() directly, with suitable arguments, instead of
create_freezeable_workqueue(), to create the runtime PM workqueue.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Tejun Heo <tj@kernel.org>
Bojan Smojver [Thu, 9 Sep 2010 21:06:23 +0000 (23:06 +0200)]
PM / Hibernate: Compress hibernation image with LZO
Compress hibernation image with LZO in order to save on I/O and
therefore time to hibernate/thaw.
[rjw: Added hibernate=nocompress command line option instead of just
nocompress which would be confusing, fixed a couple of compiler
warnings, fixed kerneldoc comments, minor cleanups.]
Signed-off-by: Bojan Smojver <bojan@rexursive.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Allow drivers, that belong to subsystems which use the generic
runtime pm callbacks, not to define runtime pm suspend/resume handlers,
by implicitly assuming success in such cases.
This is needed to eliminate nop handlers that would otherwise be
necessary by drivers which enable runtime pm, but don't need
to do anything when their devices are runtime-suspended/resumed.
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Acked-by: Kevin Hilman <khilman@deeprootsystems.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
and it's used all over the place (including quite a few places where
we currently have sigprocmask(SIG_SETMASK, set, NULL), which is what
it's equivalent to). With that done, m32r doesn't use _BLOCKABLE
anywhere, so it got removed. And that chunk got picked when I'd been
reordering the queue to pull the arch-specific fixes in front.
Sorry."
Eric Paris [Fri, 15 Oct 2010 21:34:14 +0000 (14:34 -0700)]
types.h: define __aligned_u64 and expose to userspace
We currently have a kernel internal type called aligned_u64 which aligns
__u64's on 8 bytes boundaries even on systems which would normally align
them on 4 byte boundaries. This patch creates a new type __aligned_u64
which does the same thing but which is exposed to userspace rather than
being kernel internal.
[akpm: merge early as both the net and audit trees want this]
[akpm@linux-foundation.org: enhance the comment describing the reasons for using aligned_u64. Via Andreas and Andi.] Based-on-patch-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com> Cc: Jan Engelhardt <jengelh@medozas.de> Cc: David Miller <davem@davemloft.net> Cc: Andi Kleen <andi@firstfloor.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
FUJITA Tomonori [Fri, 15 Oct 2010 21:34:13 +0000 (14:34 -0700)]
uml: fix build
Fix a build error introduced by d6d1b650ae6acce73d55dd024 ("param: simple
locking for sysfs-writable charp parameters").
CC arch/um/kernel/trap.o
arch/um/drivers/hostaudio_kern.c: In function 'hostaudio_open':
arch/um/drivers/hostaudio_kern.c:204: error: '__param_dsp' undeclared (first use in this function)
arch/um/drivers/hostaudio_kern.c:204: error: (Each undeclared identifier is reported only once
arch/um/drivers/hostaudio_kern.c:204: error: for each function it appears in.)
arch/um/drivers/hostaudio_kern.c: In function 'hostmixer_open_mixdev':
arch/um/drivers/hostaudio_kern.c:265: error: '__param_mixer' undeclared (first use in this function)
arch/um/drivers/hostaudio_kern.c:272: error: '__param_dsp' undeclared (first use in this function)
Reported-by: Toralf Förster <toralf.foerster@gmx.de> Tested-by: Toralf Förster <toralf.foerster@gmx.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Takashi Iwai <tiwai@suse.de> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Dumazet [Fri, 15 Oct 2010 21:34:12 +0000 (14:34 -0700)]
sysctl: min/max bounds are optional
sysctl check complains with a WARN() when proc_doulongvec_minmax() or
proc_doulongvec_ms_jiffies_minmax() are used by a vector of longs (with
more than one element), with no min or max value specified.
This is unexpected, given we had a bug on this min/max handling :)
Reported-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: David Miller <davem@davemloft.net> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ohad Ben-Cohen [Wed, 13 Oct 2010 07:31:56 +0000 (09:31 +0200)]
mmc: sdio: fix SDIO suspend/resume regression
Fix SDIO suspend/resume regression introduced by 4c2ef25fe0b "mmc: fix
all hangs related to mmc/sd card insert/removal during suspend/resume":
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
Suspending console(s) (use no_console_suspend to debug)
pm_op(): platform_pm_suspend+0x0/0x5c returns -38
PM: Device pxa2xx-mci.0 failed to suspend: error -38
PM: Some devices failed to suspend
4c2ef25fe0b moved the card removal/insertion mechanism out of MMC's
suspend/resume path and into pm notifiers (mmc_pm_notify), and that
broke SDIO's expectation that mmc_suspend_host() will remove the card,
and squash the error, in case -ENOSYS is returned from the bus suspend
handler (mmc_sdio_suspend() in this case).
mmc_sdio_suspend() is using this whenever at least one of the card's SDIO
function drivers does not have suspend/resume handlers - in that case
it is agreed to force removal of the entire card.
This patch fixes this regression by trivially bringing back that part of
mmc_suspend_host(), which was removed by 4c2ef25fe0b.
Reported-and-tested-by: Sven Neumann <s.neumann@raumfeld.com> Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: <stable@kernel.org> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Chris Ball <cjb@laptop.org>
Linus Torvalds [Fri, 15 Oct 2010 16:49:43 +0000 (09:49 -0700)]
Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: Add Cando touch screen 15.6-inch product id
HID: Add MULTI_INPUT quirk for turbox/mosart touchscreen
HID: hidraw, fix a NULL pointer dereference in hidraw_write
HID: hidraw, fix a NULL pointer dereference in hidraw_ioctl
Tejun Heo [Fri, 15 Oct 2010 10:56:21 +0000 (12:56 +0200)]
ubd: fix incorrect sector handling during request restart
Commit f81f2f7c (ubd: drop unnecessary rq->sector manipulation)
dropped request->sector manipulation in preparation for global request
handling cleanup; unfortunately, it incorrectly assumed that the
updated sector wasn't being used.
ubd tries to issue as many requests as possible to io_thread. When
issuing fails due to memory pressure or other reasons, the device is
put on the restart list and issuing stops. On IO completion, devices
on the restart list are scanned and IO issuing is restarted.
ubd issues IOs sg-by-sg and issuing can be stopped in the middle of a
request, so each device on the restart queue needs to remember where
to restart in its current request. ubd needs to keep track of the
issue position itself because,
* blk_rq_pos(req) is now updated by the block layer to keep track of
_completion_ position.
* Multiple io_req's for the current request may be in flight, so it's
difficult to tell where blk_rq_pos(req) currently is.
Add ubd->rq_pos to keep track of the issue position and use it to
correctly restart io_req issue.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Richard Weinberger <richard@nod.at> Tested-by: Richard Weinberger <richard@nod.at> Tested-by: Chris Frey <cdfrey@foursquare.net> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Linus Torvalds [Thu, 14 Oct 2010 21:32:06 +0000 (14:32 -0700)]
Un-inline the core-dump helper functions
Tony Luck reports that the addition of the access_ok() check in commit 0eead9ab41da ("Don't dump task struct in a.out core-dumps") broke the
ia64 compile due to missing the necessary header file includes.
Rather than add yet another include (<asm/unistd.h>) to make everything
happy, just uninline the silly core dump helper functions and move the
bodies to fs/exec.c where they make a lot more sense.
dump_seek() in particular was too big to be an inline function anyway,
and none of them are in any way performance-critical. And we really
don't need to mess up our include file headers more than they already
are.
Reported-and-tested-by: Tony Luck <tony.luck@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
ehea: Fix a checksum issue on the receive path
net: allow FEC driver to use fixed PHY support
tg3: restore rx_dropped accounting
b44: fix carrier detection on bind
net: clear heap allocations for privileged ethtool actions
NET: wimax, fix use after free
ATM: iphase, remove sleep-inside-atomic
ATM: mpc, fix use after free
ATM: solos-pci, remove use after free
net/fec: carrier off initially to avoid root mount failure
r8169: use device model DMA API
r8169: allocate with GFP_KERNEL flag when able to sleep
Linus Torvalds [Thu, 14 Oct 2010 17:57:40 +0000 (10:57 -0700)]
Don't dump task struct in a.out core-dumps
akiphie points out that a.out core-dumps have that odd task struct
dumping that was never used and was never really a good idea (it goes
back into the mists of history, probably the original core-dumping
code). Just remove it.
Also do the access_ok() check on dump_write(). It probably doesn't
matter (since normal filesystems all seem to do it anyway), but he
points out that it's normally done by the VFS layer, so ...
[ I suspect that we should possibly do "vfs_write()" instead of
calling ->write directly. That also does the whole fsnotify and write
statistics thing, which may or may not be a good idea. ]
And just to be anal, do this all for the x86-64 32-bit a.out emulation
code too, even though it's not enabled (and won't currently even
compile)
Salman Qazi [Tue, 12 Oct 2010 14:25:19 +0000 (07:25 -0700)]
hrtimer: Preserve timer state in remove_hrtimer()
The race is described as follows:
CPU X CPU Y
remove_hrtimer
// state & QUEUED == 0
timer->state = CALLBACK
unlock timer base
timer->f(n) //very long
hrtimer_start
lock timer base
remove_hrtimer // no effect
hrtimer_enqueue
timer->state = CALLBACK |
QUEUED
unlock timer base
hrtimer_start
lock timer base
remove_hrtimer
mode = INACTIVE
// CALLBACK bit lost!
switch_hrtimer_base
CALLBACK bit not set:
timer->base
changes to a
different CPU.
lock this CPU's timer base
The bug was introduced with commit ca109491f (hrtimer: removing all ur
callback modes) in 2.6.29
[ tglx: Feed new state via local variable and add a comment. ]
Signed-off-by: Salman Qazi <sqazi@google.com> Cc: akpm@linux-foundation.org Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101012142351.8485.21823.stgit@dungbeetle.mtv.corp.google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org
Linus Torvalds [Wed, 13 Oct 2010 23:50:23 +0000 (16:50 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
ring-buffer: Fix typo of time extends per page
perf, MIPS: Support cross compiling of tools/perf for MIPS
perf: Fix incorrect copy_from_user() usage
Linus Torvalds [Wed, 13 Oct 2010 23:35:33 +0000 (16:35 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: relax ioremap prohibition (309caa9) for -final and -stable
ARM: 6440/1: ep93xx: DMA: fix channel_disable
cpuimx27: fix i2c bus selection
cpuimx27: fix compile when ULPI is selected
ARM: 6435/1: Fix HWCAP_TLS flag for ARM11MPCore/Cortex-A9
ARM: 6436/1: AT91: Fix power-saving in idle-mode on 926T processors
ARM: fix section mismatch warnings in Versatile Express
ARM: 6412/1: kprobes-decode: add support for MOVW instruction
ARM: 6419/1: mmu: Fix MT_MEMORY and MT_MEMORY_NONCACHED pte flags
ARM: 6416/1: errata: faulty hazard checking in the Store Buffer may lead to data corruption
Linus Torvalds [Wed, 13 Oct 2010 23:35:05 +0000 (16:35 -0700)]
Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
omap: iommu-load cam register before flushing the entry
Linus Torvalds [Wed, 13 Oct 2010 23:34:46 +0000 (16:34 -0700)]
Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms: Silent spurious error message
drm/radeon/kms: fix bad cast/shift in evergreen.c
drm/radeon/kms: make TV/DFP table info less verbose
drm/radeon/kms: leave certain CP int bits enabled
drm/radeon/kms: avoid corner case issue with unmappable vram V2
Linus Torvalds [Wed, 13 Oct 2010 23:34:23 +0000 (16:34 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, numa: For each node, register the memory blocks actually used
x86, AMD, MCE thresholding: Fix the MCi_MISCj iteration order
x86, mce, therm_throt.c: Fix missing curly braces in error handling logic
Dan Williams [Wed, 13 Oct 2010 22:43:10 +0000 (15:43 -0700)]
ioat2: fix performance regression
Commit 0793448 "DMAENGINE: generic channel status v2" changed the interface for
how dma channel progress is retrieved. It inadvertently exported an internal
helper function ioat_tx_status() instead of ioat_dma_tx_status(). The latter
polls the hardware to get the latest completion state, while the helper just
evaluates the current state without touching hardware. The effect is that we
end up waiting for completion timeouts or descriptor allocation errors before
the completion state is updated.
Breno Leitao [Thu, 7 Oct 2010 13:17:33 +0000 (13:17 +0000)]
ehea: Fix a checksum issue on the receive path
Currently we set all skbs with CHECKSUM_UNNECESSARY, even
those whose protocol we don't know. This patch just
add the CHECKSUM_COMPLETE tag for non TCP/UDP packets.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
J. Bruce Fields [Wed, 13 Oct 2010 18:46:17 +0000 (14:46 -0400)]
nfsd: fix BUG at fs/nfsd/nfsfh.h:199 on unlink
As of commit 43a9aa64a2f4330a9cb59aaf5c5636566bce067c "NFSD:
Fill in WCC data for REMOVE, RMDIR, MKNOD, and MKDIR", we sometimes call
fh_unlock on a filehandle that isn't fully initialized.
We should fix up the callers, but as a quick fix it is also sufficient
just to remove this assertion.
Reported-by: Marius Tolzmann <tolzmann@molgen.mpg.de> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Greg Ungerer [Mon, 11 Oct 2010 21:03:05 +0000 (21:03 +0000)]
net: allow FEC driver to use fixed PHY support
At least one board using the FEC driver does not have a conventional
PHY attached to it, it is directly connected to a somewhat simple
ethernet switch (the board is the SnapGear/LITE, and the attached
4-port ethernet switch is a RealTek RTL8305). This switch does not
present the usual register interface of a PHY, it presents nothing.
So a PHY scan will find nothing - it finds ID's of 0 for each PHY
on the attached MII bus.
After the FEC driver was changed to use phylib for supporting PHYs
it no longer works on this particular board/switch setup.
Add code support to use a fixed phy if no PHY is found on the MII bus.
This is based on the way the cpmac.c driver solved this same problem.
Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Mika Westerberg [Tue, 12 Oct 2010 09:37:59 +0000 (10:37 +0100)]
ARM: 6440/1: ep93xx: DMA: fix channel_disable
When channel_disable() is called, it disables per channel interrupts and
waits until channels state becomes STATE_STALL, and then disables the
channel. Now, if the DMA transfer is disabled while the channel is in
STATE_NEXT we will not wait anything and disable the channel immediately.
This seems to cause weird data corruption for example in audio transfers.
Fix is to wait while we are in STATE_NEXT or STATE_ON and only then
disable the channel.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi> Acked-by: Ryan Mallon <ryan@bluewatersys.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Steven Rostedt [Tue, 12 Oct 2010 16:06:43 +0000 (12:06 -0400)]
ring-buffer: Fix typo of time extends per page
Time stamps for the ring buffer are created by the difference between
two events. Each page of the ring buffer holds a full 64 bit timestamp.
Each event has a 27 bit delta stamp from the last event. The unit of time
is nanoseconds, so 27 bits can hold ~134 milliseconds. If two events
happen more than 134 milliseconds apart, a time extend is inserted
to add more bits for the delta. The time extend has 59 bits, which
is good for ~18 years.
Currently the time extend is committed separately from the event.
If an event is discarded before it is committed, due to filtering,
the time extend still exists. If all events are being filtered, then
after ~134 milliseconds a new time extend will be added to the buffer.
This can only happen till the end of the page. Since each page holds
a full timestamp, there is no reason to add a time extend to the
beginning of a page. Time extends can only fill a page that has actual
data at the beginning, so there is no fear that time extends will fill
more than a page without any data.
When reading an event, a loop is made to skip over time extends
since they are only used to maintain the time stamp and are never
given to the caller. As a paranoid check to prevent the loop running
forever, with the knowledge that time extends may only fill a page,
a check is made that tests the iteration of the loop, and if the
iteration is more than the number of time extends that can fit in a page
a warning is printed and the ring buffer is disabled (all of ftrace
is also disabled with it).
There is another event type that is called a TIMESTAMP which can
hold 64 bits of data in the theoretical case that two events happen
18 years apart. This code has not been implemented, but the name
of this event exists, as well as the structure for it. The
size of a TIMESTAMP is 16 bytes, where as a time extend is only
8 bytes. The macro used to calculate how many time extends can fit on
a page used the TIMESTAMP size instead of the time extend size
cutting the amount in half.
The following test case can easily trigger the warning since we only
need to have half the page filled with time extends to trigger the
warning:
Enabling the function tracer and then setting the filter to only trace
functions where the process id is negative (no events), then clearing
the trace buffer to ensure that we have nothing in the buffer,
then write to trace_marker to add an event to the beginning of a page,
sleep for 2 minutes (only 35 seconds is probably needed, but this
guarantees the bug), and then finally reading the trace which will
trigger the bug.
This patch fixes the typo and prevents the false positive of that warning.
Reported-by: Hans J. Koch <hjk@linutronix.de> Tested-by: Hans J. Koch <hjk@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Stable Kernel <stable@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Deng-Cheng Zhu [Tue, 12 Oct 2010 11:33:33 +0000 (19:33 +0800)]
perf, MIPS: Support cross compiling of tools/perf for MIPS
Changes:
v4: Fix the cosmetic issue of redundant dot-ops
v3: Change rmb() to use SYNC
v2: Include mips unistd.h and define rmb()/cpu_relax() in tools/perf/perf.h
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <ddaney@caviumnetworks.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jean Delvare [Fri, 8 Oct 2010 12:34:49 +0000 (14:34 +0200)]
drm/radeon/kms: Silent spurious error message
I see the following error message in my kernel log from time to time:
radeon 0000:07:00.0: ffff88007c334000 reserve failed for wait
radeon 0000:07:00.0: ffff88007c334000 reserve failed for wait
After investigation, it turns out that there's nothing to be afraid of
and everything works as intended. So remove the spurious log message.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Reported-by: Dave Gilbert <freedesktop@treblig.org> Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
Jerome Glisse [Tue, 10 Aug 2010 21:41:31 +0000 (17:41 -0400)]
drm/radeon/kms: avoid corner case issue with unmappable vram V2
We should not allocate any object into unmappable vram if we
have no means to access them which on all GPU means having the
CP running and on newer GPU having the blit utility working.
This patch limit the vram allocation to visible vram until
we have acceleration up and running.
Note that it's more than unlikely that we run into any issue
related to that as when acceleration is not woring userspace
should allocate any object in vram beside front buffer which
should fit in visible vram.
V2 use real_vram_size as mc_vram_size could be bigger than
the actual amount of vram
[airlied: fixup r700_cp_stop case]
Signed-off-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Paris [Mon, 11 Oct 2010 22:13:31 +0000 (18:13 -0400)]
fanotify: disable fanotify syscalls
This patch disables the fanotify syscalls by just not building them and
letting the cond_syscall() statements in kernel/sys_ni.c redirect them
to sys_ni_syscall().
It was pointed out by Tvrtko Ursulin that the fanotify interface did not
include an explicit prioritization between groups. This is necessary
for fanotify to be usable for hierarchical storage management software,
as they must get first access to the file, before inotify-like notifiers
see the file.
This feature can be added in an ABI compatible way in the next release
(by using a number of bits in the flags field to carry the info) but it
was suggested by Alan that maybe we should just hold off and do it in
the next cycle, likely with an (new) explicit argument to the syscall.
I don't like this approach best as I know people are already starting to
use the current interface, but Alan is all wise and noone on list backed
me up with just using what we have. I feel this is needlessly ripping
the rug out from under people at the last minute, but if others think it
needs to be a new argument it might be the best way forward.
Three choices:
Go with what we got (and implement the new feature next cycle). Add a
new field right now (and implement the new feature next cycle). Wait
till next cycle to release the ABI (and implement the new feature next
cycle). This is number 3.
Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Dumazet [Sun, 10 Oct 2010 19:55:52 +0000 (19:55 +0000)]
tg3: restore rx_dropped accounting
commit 511d22247be7 (tg3: 64 bit stats on all arches), overlooked the
rx_dropped accounting.
We use a full "struct rtnl_link_stats64" to hold rx_dropped value, but
forgot to report it in tg3_get_stats64().
Use an "unsigned long" instead to shrink "struct tg3" by 176 bytes, and
report this value to stats readers.
Increment rx_dropped counter for oversized frames.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Michael Chan <mchan@broadcom.com> CC: Matt Carlson <mcarlson@broadcom.com> Acked-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Fertser [Mon, 11 Oct 2010 22:45:35 +0000 (15:45 -0700)]
b44: fix carrier detection on bind
For carrier detection to work properly when binding the driver with a cable
unplugged, netif_carrier_off() should be called after register_netdev(),
not before.
Signed-off-by: Paul Fertser <fercerpav@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Before that commit, register_active_regions() is called for every SRAT memory
entry right away.
Use nodememblk_range[] instead of nodes[] in order to make sure we
capture the actual memory blocks registered with each node. nodes[]
contains an extended range which spans all memory regions associated
with a node, but that does not mean that all the memory in between are
included.
Reported-by: Russ Anderson <rja@sgi.com> Tested-by: Russ Anderson <rja@sgi.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4CB27BDF.5000800@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@kernel.org> 2.6.33 .34 .35 .36 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Kees Cook [Mon, 11 Oct 2010 19:23:25 +0000 (12:23 -0700)]
net: clear heap allocations for privileged ethtool actions
Several other ethtool functions leave heap uncleared (potentially) by
drivers. Some interfaces appear safe (eeprom, etc), in that the sizes
are well controlled. In some situations (e.g. unchecked error conditions),
the heap will remain unchanged in areas before copying back to userspace.
Note that these are less of an issue since these all require CAP_NET_ADMIN.
Cc: stable@kernel.org Signed-off-by: Kees Cook <kees.cook@canonical.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Slaby [Sun, 10 Oct 2010 23:26:57 +0000 (23:26 +0000)]
ATM: iphase, remove sleep-inside-atomic
Stanse found that ia_init_one locks a spinlock and inside of that it
calls ia_start which calls:
* request_irq
* tx_init which does kmalloc(GFP_KERNEL)
Both of them can thus sleep and result in a deadlock. I don't see a
reason to have a per-device spinlock there which is used only there
and inited right before the lock location. So remove it completely.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Slaby [Sun, 10 Oct 2010 22:46:34 +0000 (22:46 +0000)]
ATM: mpc, fix use after free
Stanse found that mpc_push frees skb and then it dereferences it. It
is a typo, new_skb should be dereferenced there.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Slaby [Sun, 10 Oct 2010 21:50:44 +0000 (21:50 +0000)]
ATM: solos-pci, remove use after free
Stanse found we do in console_show:
kfree_skb(skb);
return skb->len;
which is not good. Fix that by remembering the len and use it in the
function instead.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Chas Williams <chas@cmf.nrl.navy.mil> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 11 Oct 2010 17:19:24 +0000 (10:19 -0700)]
Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
kbuild: fix oldnoconfig to do the right thing
kconfig: Temporarily disable dependency warnings
kconfig: delay symbol direct dependency initialization
Linus Torvalds [Mon, 11 Oct 2010 17:05:05 +0000 (10:05 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86:
IPS driver: Fix limit clamping when reducing CPU power
[PATCH 2/2] IPS driver: disable CPU turbo
IPS driver: apply BIOS provided CPU limit if different from default
intel_ips -- ensure we do not enable gpu turbo mode without driver linkage
intel_ips: Print MCP limit exceeded values.
IPS driver: verify BIOS provided limits
IPS driver: don't toggle CPU turbo on unsupported CPUs
NULL pointer might be used in ips_monitor()
Release symbol on error-handling path of ips_get_i915_syms()
old_cpu_power is wrongly divided by 65535 in ips_monitor()
seqno mask of THM_ITV register is 16bit
Zachary Amsden [Fri, 20 Aug 2010 08:07:19 +0000 (22:07 -1000)]
KVM: x86: Move TSC reset out of vmcb_init
The VMCB is reset whenever we receive a startup IPI, so Linux is setting
TSC back to zero happens very late in the boot process and destabilizing
the TSC. Instead, just set TSC to zero once at VCPU creation time.
Why the separate patch? So git-bisect is your friend.
Borislav Petkov [Fri, 8 Oct 2010 10:08:34 +0000 (12:08 +0200)]
x86, AMD, MCE thresholding: Fix the MCi_MISCj iteration order
This fixes possible cases of not collecting valid error info in
the MCE error thresholding groups on F10h hardware.
The current code contains a subtle problem of checking only the
Valid bit of MSR0000_0413 (which is MC4_MISC0 - DRAM
thresholding group) in its first iteration and breaking out if
the bit is cleared.
But (!), this MSR contains an offset value, BlkPtr[31:24], which
points to the remaining MSRs in this thresholding group which
might contain valid information too. But if we bail out only
after we checked the valid bit in the first MSR and not the
block pointer too, we miss that other information.
The thing is, MC4_MISC0[BlkPtr] is not predicated on
MCi_STATUS[MiscV] or MC4_MISC0[Valid] and should be checked
prior to iterating over the MCI_MISCj thresholding group,
irrespective of the MC4_MISC0[Valid] setting.
Oskar Schirmer [Thu, 7 Oct 2010 02:30:30 +0000 (02:30 +0000)]
net/fec: carrier off initially to avoid root mount failure
with hardware slow in negotiation, the system did freeze
while trying to mount root on nfs at boot time.
the link state has not been initialised so network stack
tried to start transmission right away. this caused instant
retries, as the driver solely stated business upon link down,
rendering the system unusable.
notify carrier off initially to prevent transmission until
phylib will report link up.
Signed-off-by: Oskar Schirmer <oskar@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 861b4ea4 broke oldnoconfig when removed the oldnoconfig checks on
if (input_mode == nonint_oldconfig ||
input_mode == oldnoconfig) {
if (input_mode == nonint_oldconfig &&
sym->name &&
!sym_is_choice_value(sym)) {
to avoid oldnoconfig chugging through the else stanza.
Fix that to restore expected behaviour (which I've confirmed in the
Fedora kernel build that the configs end up looking the same.)
Signed-off-by: Kyle McMartin <kyle@redhat.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Michal Marek <mmarek@suse.cz>
Michal Marek [Fri, 8 Oct 2010 14:40:27 +0000 (16:40 +0200)]
kconfig: Temporarily disable dependency warnings
After fixing a use-after-free bug in kconfig, a 'make defconfig' or
'make allmodconfig' fills the screen with warnings that were not
detected before. Given that we are close to the release now, disable the
warnings temporarily and deal with them after 2.6.36.
Linus Torvalds [Sat, 9 Oct 2010 19:04:38 +0000 (12:04 -0700)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: add DMI to disable AML Vista compatibility on MSI GX723 Notebook
ACPI: Handle ACPI0007 Device in acpi_early_set_pdc
Linus Torvalds [Sat, 9 Oct 2010 19:03:46 +0000 (12:03 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: update issue_seq on cap grant
ceph: send cap release message early on failed revoke.
ceph: Update max_len with minimum required size
ceph: Fix return value of encode_fh function
ceph: avoid null deref in osd request error path
ceph: fix list_add usage on unsafe_writes list
Linus Torvalds [Sat, 9 Oct 2010 18:43:40 +0000 (11:43 -0700)]
Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel
* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel:
drm/i915: Prevent module unload to avoid random memory corruption
Linus Torvalds [Sat, 9 Oct 2010 18:43:18 +0000 (11:43 -0700)]
Merge branch 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung
* 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: SAMSUNG: Add a workaround for get_clock() for serial driver
ARM: S5P: Bug fix on errors of build with CONFIG_PREEMPT_NONE
ARM: SAMSUNG: Fix build warnings because of unused codes
Use DMA API as PCI equivalents will be deprecated. This change also
allow to allocate with GFP_KERNEL where possible.
Tested-by: Neal Becker <ndbecker2@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
r8169: allocate with GFP_KERNEL flag when able to sleep
We have fedora bug report where driver fail to initialize after
suspend/resume because of memory allocation errors:
https://bugzilla.redhat.com/show_bug.cgi?id=629158
To fix use GFP_KERNEL allocation where possible.
Tested-by: Neal Becker <ndbecker2@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Thu, 7 Oct 2010 10:03:48 +0000 (10:03 +0000)]
net: clear heap allocation for ETHTOOL_GRXCLSRLALL
Calling ETHTOOL_GRXCLSRLALL with a large rule_cnt will allocate kernel
heap without clearing it. For the one driver (niu) that implements it,
it will leave the unused portion of heap unchanged and copy the full
contents back to userspace.
Signed-off-by: Kees Cook <kees.cook@canonical.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 8 Oct 2010 17:21:22 +0000 (10:21 -0700)]
isdn: strcpy() => strlcpy()
setup.phone and setup.eazmsn are 32 character buffers.
rcvmsg.msg_data.byte_array is a 48 character buffer.
sc_adapter[card]->channel[rcvmsg.phy_link_no - 1].dn is 50 chars.
The rcvmsg struct comes from the memcpy_fromio() in receivemessage().
I guess that means it's data off the wire. I'm not very familiar with
this code but I don't see any reason to assume these strings are NULL
terminated.
Also it's weird that "dn" in a 50 character buffer but we only seem to
use 32 characters. In drivers/isdn/sc/scioc.h, "dn" is only a 49
character buffer. So potentially there is still an issue there.
The important thing for now is to prevent the memory corruption.
Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boaz Harrosh [Thu, 7 Oct 2010 17:37:51 +0000 (13:37 -0400)]
exofs: Fix double page_unlock BUG in write_begin/end
This BUG is there since the first submit of the code, but only triggered
in last Kernel. It's timing related do to the asynchronous object-creation
behaviour of exofs. (Which should be investigated farther)
Chris Wilson [Fri, 8 Oct 2010 12:40:27 +0000 (13:40 +0100)]
drm/i915: Prevent module unload to avoid random memory corruption
The i915 driver has quite a few module unload bugs, the known ones at
least have fixes that are targeting 2.6.37. However, in order to
maintain a stable kernel, we should prevent this known random memory
corruption following driver unload. This should have very low impact on
normal users who are unlikely to need to unload the i915 driver.
Suggested-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@kernel.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>