git.karo-electronics.de Git - karo-tx-linux.git/log

]> git.karo-electronics.de Git - karo-tx-linux.git/log

Jens Axboe [Wed, 17 Jul 2013 03:10:18 +0000 (21:10 -0600)]

Merge branch 'bcache-for-3.11' of git://evilpiepirate.org/~kent/linux-bcache into for-3.11/drivers

Kent writes:

Hey Jens - I've been busy torture testing and chasing bugs, here's the
fruits of my labors. These are all fairly small fixes, some of them
quite important.

commit | commitdiff | tree

Kent Overstreet [Thu, 11 Jul 2013 01:31:58 +0000 (18:31 -0700)]

bcache: Allocation kthread fixes

The alloc kthread should've been using try_to_freeze() - and also there
was the potential for the alloc kthread to get woken up after it had
shut down, which would have been bad.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

commit | commitdiff | tree

Kent Overstreet [Fri, 12 Jul 2013 02:43:21 +0000 (19:43 -0700)]

bcache: Fix GC_SECTORS_USED() calculation

Part of the job of garbage collection is to add up however many sectors
of live data it finds in each bucket, but that doesn't work very well if
it doesn't reset GC_SECTORS_USED() when it starts. Whoops.

This wouldn't have broken anything horribly, but allocation tries to
preferentially reclaim buckets that are mostly empty and that's not
gonna work with an incorrect GC_SECTORS_USED() value.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

commit | commitdiff | tree

Kent Overstreet [Fri, 12 Jul 2013 05:42:14 +0000 (22:42 -0700)]

bcache: Journal replay fix

The journal replay code starts by finding something that looks like a
valid journal entry, then it does a binary search over the unchecked
region of the journal for the journal entries with the highest sequence
numbers.

Trouble is, the logic was wrong - journal_read_bucket() returns true if
it found journal entries we need, but if the range of journal entries
we're looking for loops around the end of the journal - in that case
journal_read_bucket() could return true when it hadn't found the highest
sequence number we'd seen yet, and in that case the binary search did
the wrong thing. Whoops.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

commit | commitdiff | tree

Kent Overstreet [Thu, 11 Jul 2013 04:03:25 +0000 (21:03 -0700)]

bcache: Shutdown fix

Stopping a cache set is supposed to make it stop attached backing
devices, but somewhere along the way that code got lost. Fixing this
mainly has the effect of fixing our reboot notifier.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

commit | commitdiff | tree

Kent Overstreet [Thu, 11 Jul 2013 04:25:02 +0000 (21:25 -0700)]

bcache: Fix a sysfs splat on shutdown

If we stopped a bcache device when we were already detaching (or
something like that), bcache_device_unlink() would try to remove a
symlink from sysfs that was already gone because the bcache dev kobject
had already been removed from sysfs.

So keep track of whether we've removed stuff from sysfs.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

commit | commitdiff | tree

Kent Overstreet [Thu, 11 Jul 2013 01:44:40 +0000 (18:44 -0700)]

bcache: Advertise that flushes are supported

Whoops - bcache's flush/FUA was mostly correct, but flushes get filtered
out unless we say we support them...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

commit | commitdiff | tree

Dan Carpenter [Fri, 5 Jul 2013 06:05:46 +0000 (09:05 +0300)]

bcache: check for allocation failures

There is a missing NULL check after the kzalloc().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

commit | commitdiff | tree

Kent Overstreet [Thu, 11 Jul 2013 01:04:21 +0000 (18:04 -0700)]

bcache: Fix a dumb race

In the far-too-complicated closure code - closures can have destructors,
for probably dubious reasons; they get run after the closure is no
longer waiting on anything but before dropping the parent ref, intended
just for freeing whatever memory the closure is embedded in.

Trouble is, when remaining goes to 0 and we've got nothing more to run -
we also have to unlock the closure, setting remaining to -1. If there's
a destructor, that unlock isn't doing anything - nobody could be trying
to lock it if we're about to free it - but if the unlock _is needed...
that check for a destructor was racy. Argh.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

commit | commitdiff | tree

Jens Axboe [Tue, 2 Jul 2013 06:32:57 +0000 (08:32 +0200)]

Merge branch 'bcache-for-3.11' of git://evilpiepirate.org/~kent/linux-bcache into for-3.11/drivers

commit | commitdiff | tree

Jens Axboe [Tue, 2 Jul 2013 06:31:48 +0000 (08:31 +0200)]

Merge tag 'v3.10-rc7' into for-3.11/drivers

Linux 3.10-rc7

Pull this in early to avoid doing it with the bcache merge,
since there are a number of changes to bcache between my old
base (3.10-rc1) and the new pull request.

commit | commitdiff | tree

Kent Overstreet [Fri, 7 Jun 2013 01:15:57 +0000 (18:15 -0700)]

bcache: Use standard utility code

Some of bcache's utility code has made it into the rest of the kernel,
so drop the bcache versions.

Bcache used to have a workaround for allocating from a bio set under
generic_make_request() (if you allocated more than once, the bios you
already allocated would get stuck on current->bio_list when you
submitted, and you'd risk deadlock) - bcache would mask out __GFP_WAIT
when allocating bios under generic_make_request() so that allocation
could fail and it could retry from workqueue. But bio_alloc_bioset() has
a workaround now, so we can drop this hack and the associated error
handling.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kent Overstreet [Mon, 1 Jul 2013 19:19:55 +0000 (12:19 -0700)]

bcache: Update email address

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

commit | commitdiff | tree

Kent Overstreet [Thu, 16 May 2013 00:13:45 +0000 (17:13 -0700)]

bcache: Delete fuzz tester

This code has rotted and it hasn't been used in ages anyways.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

commit | commitdiff | tree

Kent Overstreet [Mon, 3 Jun 2013 20:04:56 +0000 (13:04 -0700)]

bcache: Document shrinker reserve better

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

commit | commitdiff | tree

Kent Overstreet [Thu, 27 Jun 2013 00:25:38 +0000 (17:25 -0700)]

bcache: FUA fixes

Journal writes need to be marked FUA, not just REQ_FLUSH. And btree node
writes have... weird ordering requirements.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Philipp Reisner [Tue, 25 Jun 2013 14:50:08 +0000 (16:50 +0200)]

drbd: Allow online change of al-stripes and al-stripe-size

Allow to change the AL layout with an resize operation. For that
the reisze command gets two new fields: al_stripes and al_stripe_size.

In order to make the operation crash save:
1) Lock out all IO and MD-IO
2) Write the super block with MDF_PRIMARY_IND clear
3) write the bitmap to the new location (all zeros, since
we allow only while connected)
4) Initialize the new AL-area
5) Write the super block with the restored MDF_PRIMARY_IND.
6) Unfreeze all IO

Since the AL-layout has no influence on the protocol, this operation
needs to be beforemed on both sides of a resource (if intended).

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philipp Reisner [Tue, 25 Jun 2013 14:50:07 +0000 (16:50 +0200)]

drbd: Constants should be UPPERCASE

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philipp Reisner [Tue, 25 Jun 2013 14:50:06 +0000 (16:50 +0200)]

drbd: Ignore the exit code of a fence-peer handler if it returns too late

In case the connection was established and lost again before
the a fence-peer handler returns, ignore the exit code of this
instance. (And use the exit code of the later started instance)

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Andreas Gruenbacher [Tue, 25 Jun 2013 14:50:05 +0000 (16:50 +0200)]

drbd: Fix rcu_read_lock balance on error path

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Wei Yongjun [Tue, 25 Jun 2013 14:50:04 +0000 (16:50 +0200)]

drbd: fix error return code in drbd_init()

Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Andreas Gruenbacher [Tue, 25 Jun 2013 14:50:03 +0000 (16:50 +0200)]

drbd: Do not sleep inside rcu

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Jens Axboe [Fri, 28 Jun 2013 14:01:14 +0000 (16:01 +0200)]

Merge branch 'stable/for-jens-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.11/drivers

Konrad writes:

It has the 'feature-max-indirect-segments' implemented in both backend
and frontend. The current problem with the backend and frontend is that the
segment size is limited to 11 pages. It means we can at most squeeze in 44kB per
request. The ring can hold 32 (next power of two below 36) requests, meaning we
can do 1.4M of outstanding requests. Nowadays that is not enough.

The problem in the past was addressed in two ways - but neither one went upstream.
The first solution to this proposed by Justin from Spectralogic was to negotiate
the segment size.  This means that the ‘struct blkif_sring_entry’ is now a variable size.
It can expand from 112 bytes (cover 11 pages of data - 44kB) to 1580 bytes
(256 pages of data - so 1MB). It is a simple extension by just making the array in the
request expand from 11 to a variable size negotiated. But it had limits: this extension
still limits the number of segments per request to 255 (as the total number must be
specified in the request, which only has an 8-bit field for that purpose).

The other solution (from Intel - Ronghui) was to create one extra ring that only has the
‘struct blkif_request_segment’ in them. The ‘struct blkif_request’ would be changed to have
an index in said ‘segment ring’. There is only one segment ring. This means that the size of
the initial ring is still the same. The requests would point to the segment and enumerate out
how many of the indexes it wants to use. The limit is of course the size of the segment.
If one assumes a one-page segment this means we can in one request cover ~4MB.

Those patches were posted as RFC and the author never followed up on the ideas on changing
it to be a bit more flexible.

There is yet another mechanism that could be employed  (which these patches implement) - and it
borrows from VirtIO protocol. And that is the ‘indirect descriptors’. This very similar to
what Intel suggests, but with a twist. The twist is to negotiate how many of these
'segment' pages (aka indirect descriptor pages) we want to support (in reality we negotiate
how many entries in the segment we want to cover, and we module the number if it is
bigger than the segment size).

This means that with the existing 36 slots in the ring (single page) we can cover:
32 slots * each blkif_request_indirect covers: 512 * 4096 ~= 64M. Since we ample space
in the blkif_request_indirect to span more than one indirect page, that number (64M)
can be also multiplied by eight = 512MB.

Roger Pau Monne took the idea and implemented them in these patches. They work
great and the corner cases (migration between backends with and without this extension)
work nicely. The backend has a limit right now off how many indirect entries
it can handle: one indirect page, and at maximum 256 entries (out of 512 - so  50% of the page
is used). That comes out to 32 slots * 256 entries in a indirect page * 1 indirect page
per request * 4096 = 32MB.

This is a conservative number that can change in the future. Right now it strikes
a good balance between giving excellent performance, memory usage in the backend, and
balancing the needs of many guests.

In the patchset there is also the split of the blkback structure to be per-VBD.
This means that the spinlock contention we had with many guests trying to do I/O and
all the blkback threads hitting the same lock has been eliminated.

Also there are bug-fixes to deal with oddly sized sectors, insane amounts on
th ring, and also a security fix (posted earlier).

commit | commitdiff | tree

Gabriel de Perthuis [Thu, 27 Jun 2013 00:12:07 +0000 (02:12 +0200)]

bcache: Refresh usage docs

Mention udev autoregistration, symlinks. Write down some sysfs paths.

Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Gabriel de Perthuis [Sat, 8 Jun 2013 22:54:48 +0000 (00:54 +0200)]

bcache: Send label uevents

Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Gabriel de Perthuis [Fri, 7 Jun 2013 21:27:01 +0000 (23:27 +0200)]

bcache: Send a uevent with a cached device's UUID

Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com>

commit | commitdiff | tree

Masanari Iida [Sun, 19 May 2013 15:04:35 +0000 (00:04 +0900)]

doc: Fix typo in documentation/bcache.txt

Correct spelling typo in documentation/bcache.txt

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kent Overstreet [Wed, 5 Jun 2013 13:24:39 +0000 (06:24 -0700)]

bcache: Write out full stripes

Now that we're tracking dirty data per stripe, we can add two
optimizations for raid5/6:

* If a stripe is already dirty, force writes to that stripe to
writeback mode - to help build up full stripes of dirty data

* When flushing dirty data, preferentially write out full stripes first
if there are any.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kent Overstreet [Wed, 5 Jun 2013 13:21:07 +0000 (06:21 -0700)]

bcache: Track dirty data by stripe

To make background writeback aware of raid5/6 stripes, we first need to
track the amount of dirty data within each stripe - we do this by
breaking up the existing sectors_dirty into per stripe atomic_ts

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kent Overstreet [Sun, 12 May 2013 00:07:26 +0000 (17:07 -0700)]

bcache: Initialize sectors_dirty when attaching

Previously, dirty_data wouldn't get initialized until the first garbage
collection... which was a bit of a problem for background writeback (as
the PD controller keys off of it) and also confusing for users.

This is also prep work for making background writeback aware of raid5/6
stripes.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kent Overstreet [Sat, 11 May 2013 22:59:37 +0000 (15:59 -0700)]

bcache: Improve lazy sorting

The old lazy sorting code was kind of hacky - rewrite in a way that
mathematically makes more sense; the idea is that the size of the sets
of keys in a btree node should increase by a more or less fixed ratio
from smallest to biggest.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kent Overstreet [Wed, 15 May 2013 03:33:16 +0000 (20:33 -0700)]

bcache: Rip out pkey()/pbtree()

Old gcc doesnt like the struct hack, and it is kind of ugly. So finish
off the work to convert pr_debug() statements to tracepoints, and delete
pkey()/pbtree().

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kent Overstreet [Fri, 26 Apr 2013 22:39:55 +0000 (15:39 -0700)]

bcache: Fix/revamp tracepoints

The tracepoints were reworked to be more sensible, and fixed a null
pointer deref in one of the tracepoints.

Converted some of the pr_debug()s to tracepoints - this is partly a
performance optimization; it used to be that with DEBUG or
CONFIG_DYNAMIC_DEBUG pr_debug() was an empty macro; but at some point it
was changed to an empty inline function.

Some of the pr_debug() statements had rather expensive function calls as
part of the arguments, so this code was getting run unnecessarily even
on non debug kernels - in some fast paths, too.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kent Overstreet [Thu, 25 Apr 2013 20:58:35 +0000 (13:58 -0700)]

bcache: Refactor btree io

The most significant change is that btree reads are now done
synchronously, instead of asynchronously and doing the post read stuff
from a workqueue.

This was originally done because we can't block on IO under
generic_make_request(). But - we already have a mechanism to punt cache
lookups to workqueue if needed, so if we just use that we don't have to
deal with the complexity of doing things asynchronously.

The main benefit is this makes the locking situation saner; we can hold
our write lock on the btree node until we're finished reading it, and we
don't need that btree_node_read_done() flag anymore.

Also, for writes, btree_write() was broken out into btree_node_write()
and btree_leaf_dirty() - the old code with the boolean argument was dumb
and confusing.

The prio_blocked mechanism was improved a bit too, now the only counter
is in struct btree_write, we don't mess with transfering a count from
struct btree anymore.

This required changing garbage collection to block prios at the start
and unblock when it finishes, which is cleaner than what it was doing
anyways (the old code had mostly the same effect, but was doing it in a
convoluted way)

And the btree iter btree_node_read_done() uses was converted to a real
mempool.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kent Overstreet [Thu, 25 Apr 2013 02:01:12 +0000 (19:01 -0700)]

bcache: Convert allocator thread to kthread

Using a workqueue when we just want a single thread is a bit silly.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Gabriel de Perthuis [Sat, 4 May 2013 10:19:41 +0000 (12:19 +0200)]

bcache: Warn when a device is already registered.

Signed-off-by: Gabriel de Perthuis <g2p.code+bcache@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kent Overstreet [Wed, 29 May 2013 04:53:19 +0000 (21:53 -0700)]

bcache: fix a spurious gcc complaint, use scnprintf

An old version of gcc was complaining about using a const int as the
size of a stack allocated array. Which should be fine - but using
ARRAY_SIZE() is better, anyways.

Also, refactor the code to use scnprintf().

Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Kumar Amit Mehta [Tue, 28 May 2013 07:31:15 +0000 (00:31 -0700)]

md: bcache: io.c: fix a potential NULL pointer dereference

bio_alloc_bioset returns NULL on failure. This fix adds a missing check
for potential NULL pointer dereferencing.

Signed-off-by: Kumar Amit Mehta <gmate.amit@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

commit | commitdiff | tree

Roger Pau Monne [Sat, 22 Jun 2013 07:59:17 +0000 (09:59 +0200)]

xen-blkback: check the number of iovecs before allocating a bios

With the introduction of indirect segments we can receive requests
with a number of segments bigger than the maximum number of allowed
iovecs in a bios, so make sure that blkback doesn't try to allocate a
bios with more iovecs than BIO_MAX_PAGES

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Jun 2013 19:47:31 +0000 (09:47 -1000)]

Linux 3.10-rc7

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Jun 2013 19:44:45 +0000 (09:44 -1000)]

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
"These are two fixes that came in this week, one for a regression we
  introduced in 3.10 in the GIC interrupt code, and the other one fixes
  a typo in newly introduced code"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  irqchip: gic: call gic_cpu_init() as well in CPU_STARTING_FROZEN case
  ARM: dts: Correct the base address of pinctrl_3 on Exynos5250

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Jun 2013 19:02:44 +0000 (09:02 -1000)]

Merge tag 'driver-core-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fix from Greg Kroah-Hartman:
"Here's a single patch for the firmware core that resolves a reported
oops in the firmware core that people have been hitting."

* tag 'driver-core-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
firmware loader: fix use-after-free by double abort

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Jun 2013 19:01:47 +0000 (09:01 -1000)]

Merge tag 'usb-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg Kroah-Hartman:
"Here are two USB patches for 3.10.

  One updates the Kconfig wording for CONFIG_USB_PHY to make it,
  hopefully, more obvious what this option is (I know you complained
  about this when it hit the tree.) The other is a new device id for a
  driver"

* tag 'usb-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: ti_usb_3410_5052: new device id for Abbot strip port cable
  usb: phy: Improve Kconfig help for CONFIG_USB_PHY

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Jun 2013 19:00:28 +0000 (09:00 -1000)]

Merge tag 'tty-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pul tty fixes from Greg Kroah-Hartman:
"Here are two tty core fixes that resolve some regressions that have
  been reported recently.  Both tiny fixes, but needed"

* tag 'tty-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: Fix transient pty write() EIO
  tty/vt: Return EBUSY if deallocating VT1 and it is busy

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Jun 2013 18:54:06 +0000 (08:54 -1000)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

Pull SCSI target fixes from Nicholas Bellinger:
"Included is the recent tcm_qla2xxx residual underrun length fix from
  Roland, along with Joern's iscsi-target patch for session_lock
  breakage within iscsit_stop_time2retain_timer() code.  Both are CC'ed
  to stable.

  The remaining two are specific to recent iscsi-target + iser
  conversion changes.  One drops some left-over debug noise, and Andy's
  patch fixes configfs attribute handling during an explicit network
  portal feature bit disable when iser-target is unsupported."

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Remove left over v3.10-rc debug printks
  target/iscsi: Fix op=disable + error handling cases in np_store_iser
  tcm_qla2xxx: Fix residual for underrun commands that fail
  target/iscsi: don't corrupt bh_count in iscsit_stop_time2retain_timer()

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Jun 2013 18:43:17 +0000 (08:43 -1000)]

Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
"Another set of fixes for Kernel 3.10.

  This series contain:
   - two Kbuild fixes for randconfig
   - a buffer overflow when using rtl28xuu with r820t tuner
   - one clk fixup on exynos4-is driver"

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] Fix build when drivers are builtin and frontend modules
  [media] s5p makefiles: don't override other selections on obj-[ym]
  [media] exynos4-is: Fix FIMC-IS clocks initialization
  [media] rtl28xxu: fix buffer overflow when probing Rafael Micro r820t tuner

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Jun 2013 18:42:20 +0000 (08:42 -1000)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
"Several fixes for bugs caught while looking through f_pos (ab)users"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  aout32 coredump compat fix
  splice: don't pass the address of ->f_pos to methods
  mconsole: we'd better initialize pos before passing it to vfs_read()...

commit | commitdiff | tree

Al Viro [Sat, 22 Jun 2013 07:01:38 +0000 (11:01 +0400)]

aout32 coredump compat fix

dump_seek() does SEEK_CUR, not SEEK_SET; native binfmt_aout
handles it correctly (seeks by PAGE_SIZE - sizeof(struct user),
getting the current position to PAGE_SIZE), compat one seeks
by PAGE_SIZE and ends up at PAGE_SIZE + already written...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Roger Pau Monne [Fri, 21 Jun 2013 10:56:54 +0000 (12:56 +0200)]

xen-blkfront: set blk_queue_max_hw_sectors correctly

Now that indirect segments are enabled blk_queue_max_hw_sectors must
be set to match the maximum number of sectors we can handle in a
request.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Felipe Franciosi <felipe.franciosi@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

commit | commitdiff | tree

Roger Pau Monne [Fri, 21 Jun 2013 10:56:53 +0000 (12:56 +0200)]

xen-blkback: workaround compiler bug in gcc 4.1

The code generat with gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54)
creates an unbound loop for the second foreach_grant_safe loop in
purge_persistent_gnt.

The workaround is to avoid having this second loop and instead
perform all the work inside the first loop by adding a new variable,
clean_used, that will be set when all the desired persistent grants
have been removed and we need to iterate over the remaining ones to
remove the WAS_ACTIVE flag.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Tom O'Neill <toneill@vmem.com>
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Jun 2013 16:33:48 +0000 (06:33 -1000)]

Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Peter Anvin:
"This series fixes a couple of build failures, and fixes MTRR cleanup
  and memory setup on very specific memory maps.

  Finally, it fixes triggering backtraces on all CPUs, which was
  inadvertently disabled on x86."

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Fix dummy variable buffer allocation
  x86: Fix trigger_all_cpu_backtrace() implementation
  x86: Fix section mismatch on load_ucode_ap
  x86: fix build error and kconfig for ia32_emulation and binfmt
  range: Do not add new blank slot with add_range_with_merge
  x86, mtrr: Fix original mtrr range get for mtrr_cleanup

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Jun 2013 16:33:06 +0000 (06:33 -1000)]

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm radeon fixes from Dave Airlie:
"One core fix, but mostly radeon fixes for s/r and big endian UVD
  support, and a fix to stop the GPU being reset for no good reason, and
  crashing people's machines."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: update lockup tracking when scheduling in empty ring
  drm/prime: Honor requested file flags when exporting a buffer
  drm/radeon: fix UVD on big endian
  drm/radeon: fix write back suspend regression with uvd v2
  drm/radeon: do not try to uselessly update virtual memory pagetable

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Jun 2013 16:31:10 +0000 (06:31 -1000)]

Merge tag 'acpi-3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:

- Fix for a regression causing a failure to turn on some devices on
   some systems during initialization introduced by a recent revert of
   an ACPI PM change that broke something else.  Fortunately, we know
   exactly what devices are affected, so we can add a fix just for them
   leaving everyone else alone.

- ACPI power resources initialization fix preventing a NULL pointer
   from being dereferenced in the acpi_add_power_resource() error code
   path.

- ACPI dock station driver fix that adds missing locking to
   write_undock().

- ACPI resources allocation fix changing the scope of an old workaround
   so that it doesn't affect systems that aren't actually buggy.  This
   was reported a couple of days ago to fix DMA problems on some new
   platforms so we need it in -stable.  From Mika Westerberg.

* tag 'acpi-3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / LPSS: Power up LPSS devices during enumeration
  ACPI / PM: Fix error code path for power resources initialization
  ACPI / dock: Take ACPI scan lock in write_undock()
  ACPI / resources: call acpi_get_override_irq() only for legacy IRQ resources

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Jun 2013 16:29:22 +0000 (06:29 -1000)]

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"Three one-line fixes for my first pull request; one for x86 host, one
  for x86 guest, one for PPC"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  x86: kvmclock: zero initialize pvclock shared memory area
  kvm/ppc/booke: Delay kvmppc_lazy_ee_enable
  KVM: x86: remove vcpu's CPL check in host-invoked XCR set

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Jun 2013 16:28:39 +0000 (06:28 -1000)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
"This fixes an unaligned crash in XTS mode when using aseni_intel"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: aesni_intel - fix accessing of unaligned memory

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Jun 2013 16:27:40 +0000 (06:27 -1000)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull Ceph fix from Sage Weil:
"This fixes a problem preventing the kernel and userland librbd
libraries from sharing data with the new format 2 images"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: use the correct length for format 2 object names

commit | commitdiff | tree

H. Peter Anvin [Fri, 21 Jun 2013 10:01:21 +0000 (03:01 -0700)]

Merge tag 'efi-urgent' into x86/urgent

* Don't leak random kernel memory to EFI variable NVRAM when attempting
to initiate garbage collection. Also, free the kernel memory when
we're done with it instead of leaking - Ben Hutchings

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

commit | commitdiff | tree

Ben Hutchings [Sun, 16 Jun 2013 20:27:12 +0000 (21:27 +0100)]

x86/efi: Fix dummy variable buffer allocation

1. Check for allocation failure
2. Clear the buffer contents, as they may actually be written to flash
3. Don't leak the buffer

Compile-tested only.

[ Tested successfully on my buggy ASUS machine - Matt ]

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Matt Fleming <matt.fleming@intel.com>

commit | commitdiff | tree

Nicholas Bellinger [Thu, 20 Jun 2013 23:36:17 +0000 (16:36 -0700)]

iscsi-target: Remove left over v3.10-rc debug printks

Reported-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit | commitdiff | tree

Andy Grover [Wed, 29 May 2013 19:05:59 +0000 (12:05 -0700)]

target/iscsi: Fix op=disable + error handling cases in np_store_iser

Writing 0 when iser was not previously enabled, so succeed but do
nothing so that user-space code doesn't need a try: catch block
when ib_isert logic is not available.

Also, return actual error from add_network_portal using PTR_ERR
during op=enable failure.

Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit | commitdiff | tree

Dave Airlie [Thu, 20 Jun 2013 22:52:19 +0000 (08:52 +1000)]

Merge branch 'drm-fixes-3.10' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

One user visible fix to stop misreport GPU hangs and subsequent resets.
* 'drm-fixes-3.10' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: update lockup tracking when scheduling in empty ring

commit | commitdiff | tree

Jerome Glisse [Wed, 19 Jun 2013 14:02:28 +0000 (10:02 -0400)]

drm/radeon: update lockup tracking when scheduling in empty ring

There might be issue with lockup detection when scheduling on an
empty ring that have been sitting idle for a while. Thus update
the lockup tracking data when scheduling new work in an empty ring.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Tested-by: Andy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Linus Torvalds [Thu, 20 Jun 2013 18:18:35 +0000 (08:18 -1000)]

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"Two smaller fixes - plus a context tracking tracing fix that is a bit
  bigger"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tracing/context-tracking: Add preempt_schedule_context() for tracing
  sched: Fix clear NOHZ_BALANCE_KICK
  sched/x86: Construct all sibling maps if smt

commit | commitdiff | tree

Linus Torvalds [Thu, 20 Jun 2013 18:17:36 +0000 (08:17 -1000)]

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Four fixes.  The mmap ones are unfortunately larger than desired -
  fuzzing uncovered bugs that needed perf context life time management
  changes to fix properly"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix broken PEBS-LL support on SNB-EP/IVB-EP
  perf: Fix mmap() accounting hole
  perf: Fix perf mmap bugs
  kprobes: Fix to free gone and unused optprobes

commit | commitdiff | tree

Linus Torvalds [Thu, 20 Jun 2013 18:16:07 +0000 (08:16 -1000)]

Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull cpu idle fixes from Thomas Gleixner:
- Add a missing irq enable. Fallout of the idle conversion
- Fix stackprotector wreckage caused by the idle conversion

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
idle: Enable interrupts in the weak arch_cpu_idle() implementation
idle: Add the stack canary init to cpu_startup_entry()

commit | commitdiff | tree

Linus Torvalds [Thu, 20 Jun 2013 18:15:13 +0000 (08:15 -1000)]

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
- Fix inconstinant clock usage in virtual time accounting
- Fix a build error in KVM caused by the NOHZ work
- Remove a pointless timekeeping duty assignment which breaks NOHZ
- Use a proper notifier return value to avoid random behaviour

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick: Remove useless timekeeping duty attribution to broadcast source
  nohz: Fix notifier return val that enforce timekeeping
  kvm: Move guest entry/exit APIs to context_tracking
  vtime: Use consistent clocks among nohz accounting

commit | commitdiff | tree

Linus Torvalds [Thu, 20 Jun 2013 18:08:46 +0000 (08:08 -1000)]

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull powerpc fix fro, Benjamin Herrenschmidt:
"We accidentally broke hugetlbfs on Freescale embedded processors which
use a slightly different page table layout than our server processors"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Fix bad pmd error with book3E config

commit | commitdiff | tree

Linus Torvalds [Thu, 20 Jun 2013 18:07:42 +0000 (08:07 -1000)]

Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile

Pull tilepro fix from Chris Metcalf:
"This change allows the older tilepro architecture to be correctly
  built by newer gccs, despite a change that caused gcc to start trying
  to use an out-of-line implementation for __builtin_ffsll().

  This should be inline again starting with gcc 4.7.4 and 4.8.2 or so,
  but meanwhile this change keeps things from breaking, with the only
  cost being a few bytes of code in the kernel to provide __ffsdi2 even
  for compilers that do inline it"

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tilepro: work around module link error with gcc 4.7

commit | commitdiff | tree

Linus Torvalds [Thu, 20 Jun 2013 18:06:48 +0000 (08:06 -1000)]

Merge tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64

Pull arm64 perf fix from Catalin Marinas:
"Perf fix (user-mode PC recording)"

* tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
perf: arm64: Record the user-mode PC in the call chain.

commit | commitdiff | tree

Al Viro [Thu, 20 Jun 2013 14:58:36 +0000 (18:58 +0400)]

splice: don't pass the address of ->f_pos to methods

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 20 Jun 2013 13:35:53 +0000 (10:35 -0300)]

[media] Fix build when drivers are builtin and frontend modules

There are a large number of reports that the media build is
not compiling when some drivers are compiled as builtin, while
the needed frontends are compiled as module.

On the last one of such reports:
From: kbuild test robot <fengguang.wu@intel.com>
Subject: saa7134-dvb.c:undefined reference to `zl10039_attach'

The .config file has:

CONFIG_VIDEO_SAA7134=y
CONFIG_VIDEO_SAA7134_DVB=y
# CONFIG_MEDIA_ATTACH is not set
CONFIG_DVB_ZL10039=m

And it produces all those errors:

   drivers/built-in.o: In function `set_type':
   tuner-core.c:(.text+0x2f263e): undefined reference to `tea5767_attach'
   tuner-core.c:(.text+0x2f273e): undefined reference to `tda9887_attach'
   drivers/built-in.o: In function `tuner_probe':
   tuner-core.c:(.text+0x2f2d20): undefined reference to `tea5767_autodetection'
   drivers/built-in.o: In function `av7110_attach':
   av7110.c:(.text+0x330bda): undefined reference to `ves1x93_attach'
   av7110.c:(.text+0x330bf7): undefined reference to `stv0299_attach'
   av7110.c:(.text+0x330c63): undefined reference to `tda8083_attach'
   av7110.c:(.text+0x330d09): undefined reference to `ves1x93_attach'
   av7110.c:(.text+0x330d33): undefined reference to `tda8083_attach'
   av7110.c:(.text+0x330d5d): undefined reference to `stv0297_attach'
   av7110.c:(.text+0x330dbe): undefined reference to `stv0299_attach'
   drivers/built-in.o: In function `tuner_attach_dtt7520x':
   ngene-cards.c:(.text+0x3381cb): undefined reference to `dvb_pll_attach'
   drivers/built-in.o: In function `demod_attach_lg330x':
   ngene-cards.c:(.text+0x33828a): undefined reference to `lgdt330x_attach'
   drivers/built-in.o: In function `demod_attach_stv0900':
   ngene-cards.c:(.text+0x3383d5): undefined reference to `stv090x_attach'
   drivers/built-in.o: In function `cineS2_probe':
   ngene-cards.c:(.text+0x338b7f): undefined reference to `drxk_attach'
   drivers/built-in.o: In function `configure_tda827x_fe':
   saa7134-dvb.c:(.text+0x346ae7): undefined reference to `tda10046_attach'
   drivers/built-in.o: In function `dvb_init':
   saa7134-dvb.c:(.text+0x347283): undefined reference to `mt352_attach'
   saa7134-dvb.c:(.text+0x3472cd): undefined reference to `mt352_attach'
   saa7134-dvb.c:(.text+0x34731c): undefined reference to `tda10046_attach'
   saa7134-dvb.c:(.text+0x34733c): undefined reference to `tda10046_attach'
   saa7134-dvb.c:(.text+0x34735c): undefined reference to `tda10046_attach'
   saa7134-dvb.c:(.text+0x347378): undefined reference to `tda10046_attach'
   saa7134-dvb.c:(.text+0x3473db): undefined reference to `tda10046_attach'
   drivers/built-in.o:saa7134-dvb.c:(.text+0x347502): more undefined references to `tda10046_attach' follow
   drivers/built-in.o: In function `dvb_init':
   saa7134-dvb.c:(.text+0x347812): undefined reference to `mt352_attach'
   saa7134-dvb.c:(.text+0x347951): undefined reference to `mt312_attach'
   saa7134-dvb.c:(.text+0x3479a9): undefined reference to `mt312_attach'
>> saa7134-dvb.c:(.text+0x3479c1): undefined reference to `zl10039_attach'

This is happening because a builtin module can't use directly a symbol
found on a module. By enabling CONFIG_MEDIA_ATTACH, the configuration
becomes valid, as dvb_attach() macro loads the module if needed, making
the symbol available to the builtin module.

While this bug started to appear after the patches that use IS_DEFINED
macro (like changeset 7b34be71db533f3e0cf93d53cf62d036cdb5418a), this
bug is a way ancient than that.

The thing is that, before the IS_DEFINED() patches, the logic used to be:

       && defined(MODULE))
struct dvb_frontend *zl10039_attach(struct dvb_frontend *fe,
u8 i2c_addr,
struct i2c_adapter *i2c);
static inline struct dvb_frontend *zl10039_attach(struct dvb_frontend *fe,
u8 i2c_addr,
struct i2c_adapter *i2c)
{
printk(KERN_WARNING "%s: driver disabled by Kconfig\n", __func__);
return NULL;
}

The above code, with the .config file used, was evoluting to FALSE
(instead of TRUE as it should be, as CONFIG_DVB_ZL10039 is 'm'),
and were adding the static inline code at saa7134-dvb, instead
of the external call. So, while it weren't producing any compilation
error, the code weren't working either.

So, as the overhead for using CONFIG_MEDIA_ATTACH is minimal, just
enable it, if MODULES is defined.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Shawn Guo [Wed, 12 Jun 2013 11:30:27 +0000 (19:30 +0800)]

irqchip: gic: call gic_cpu_init() as well in CPU_STARTING_FROZEN case

Commit c011470 (irqchip: gic: Perform the gic_secondary_init() call via
CPU notifier) moves gic_secondary_init() that used to be called in
.smp_secondary_init hook into a notifier call.  But it changes the
system behavior a little bit.  Before the commit, gic_cpu_init()
is called not only when kernel brings up the secondary cores but also
when system resuming procedure hot-plugs the cores back to kernel.
While after the commit, the function will not be called in the latter
case, where the 'action' will not be CPU_STARTING but
CPU_STARTING_FROZEN.  This behavior difference at least causes the
following suspend/resume regression on imx6q.

$ echo mem > /sys/power/state
PM: Syncing filesystems ... done.
PM: Preparing system for mem sleep
mmc1: card e624 removed
Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
PM: Entering mem sleep
PM: suspend of devices complete after 5.930 msecs
PM: suspend devices took 0.010 seconds
PM: late suspend of devices complete after 0.343 msecs
PM: noirq suspend of devices complete after 0.828 msecs
Disabling non-boot CPUs ...
CPU1: shutdown
CPU2: shutdown
CPU3: shutdown
Enabling non-boot CPUs ...
CPU1: Booted secondary processor
INFO: rcu_sched detected stalls on CPUs/tasks: { 1 2 3} (detected by 0, t=2102 jiffies, g=4294967169, c=4294967168, q=17)
Task dump for CPU 1:
swapper/1       R running      0     0      1 0x00000000
Backtrace:
[<bf895ff4>] (0xbf895ff4) from [<00000000>] (  (null))
Backtrace aborted due to bad frame pointer <8007ccdc>
Task dump for CPU 2:
swapper/2       R running      0     0      1 0x00000000
Backtrace:
[<8075dbdc>] (0x8075dbdc) from [<00000000>] (  (null))
Backtrace aborted due to bad frame pointer <00000002>
Task dump for CPU 3:
swapper/3       R running      0     0      1 0x00000000
Backtrace:
[<8075dbdc>] (0x8075dbdc) from [<00000000>] (  (null))

Fix the regression by checking 'action' being CPU_STARTING_FROZEN to
have gic_cpu_init() called for secondary cores when system resumes.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Joseph Lo <josephl@nvidia.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

commit | commitdiff | tree

Michel Lespinasse [Thu, 6 Jun 2013 11:41:15 +0000 (04:41 -0700)]

x86: Fix trigger_all_cpu_backtrace() implementation

The following change fixes the x86 implementation of
trigger_all_cpu_backtrace(), which was previously (accidentally,
as far as I can tell) disabled to always return false as on
architectures that do not implement this function.

trigger_all_cpu_backtrace(), as defined in include/linux/nmi.h,
should call arch_trigger_all_cpu_backtrace() if available, or
return false if the underlying arch doesn't implement this
function.

x86 did provide a suitable arch_trigger_all_cpu_backtrace()
implementation, but it wasn't actually being used because it was
declared in asm/nmi.h, which linux/nmi.h doesn't include. Also,
linux/nmi.h couldn't easily be fixed by including asm/nmi.h,
because that file is not available on all architectures.

I am proposing to fix this by moving the x86 definition of
arch_trigger_all_cpu_backtrace() to asm/irq.h.

Tested via: echo l > /proc/sysrq-trigger

Before the change, this uses a fallback implementation which
shows backtraces on active CPUs (using
smp_call_function_interrupt() )

After the change, this shows NMI backtraces on all CPUs

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1370518875-1346-1-git-send-email-walken@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

commit | commitdiff | tree

Jed Davis [Thu, 20 Jun 2013 03:07:14 +0000 (04:07 +0100)]

perf: arm64: Record the user-mode PC in the call chain.

With this change, we no longer lose the innermost entry in the user-mode
part of the call chain. See also the x86 port, which includes the ip,
and the corresponding change in arch/arm.

Signed-off-by: Jed Davis <jld@mozilla.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 20 Jun 2013 08:46:00 +0000 (05:46 -0300)]

[media] s5p makefiles: don't override other selections on obj-[ym]

The $obj-m/$obj-y vars should be adding new modules to build, not
overriding it. So, it should never use
$obj-y := foo.o
instead, it should use:
$obj-y += foo.o

Failing to do that is very bad, as it will suppress needed modules.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Aneesh Kumar K.V [Wed, 19 Jun 2013 06:34:26 +0000 (12:04 +0530)]

powerpc: Fix bad pmd error with book3E config

Book3E uses the hugepd at PMD level and don't encode pte directly
at the pmd level. So it will find the lower bits of pmd set
and the pmd_bad check throws error. Infact the current code
will never take the free_hugepd_range call at all because it will
clear the pmd if it find a hugepd pointer. Fix this by clearing
bad pmd only if it is not a hugepd pointer.

This is regression introduced by e2b3d202d1dba8f3546ed28224ce485bc50010be
"powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format"

Reported-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

commit | commitdiff | tree

Anders Hammarquist [Tue, 18 Jun 2013 23:45:48 +0000 (01:45 +0200)]

USB: serial: ti_usb_3410_5052: new device id for Abbot strip port cable

Add product id for Abbott strip port cable for Precision meter which
uses the TI 3410 chip.

Signed-off-by: Anders Hammarquist <iko@iko.pp.se>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit | commitdiff | tree

Rafael J. Wysocki [Tue, 18 Jun 2013 22:45:34 +0000 (00:45 +0200)]

ACPI / LPSS: Power up LPSS devices during enumeration

Commit 7cd8407 (ACPI / PM: Do not execute _PS0 for devices without
_PSC during initialization) introduced a regression on some systems
with Intel Lynxpoint Low-Power Subsystem (LPSS) where some devices
need to be powered up during initialization, but their device objects
in the ACPI namespace have _PS0 and _PS3 only (without _PSC or power
resources).

To work around this problem, make the ACPI LPSS driver power up
devices it knows about by using a new helper function
acpi_device_fix_up_power() that does all of the necessary
sanity checks and calls acpi_dev_pm_explicit_set() to put the
device into D0.

Reported-and-tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

commit | commitdiff | tree

Rafael J. Wysocki [Tue, 18 Jun 2013 22:44:45 +0000 (00:44 +0200)]

ACPI / PM: Fix error code path for power resources initialization

Commit 781d737 (ACPI: Drop power resources driver) introduced a
bug in the power resources initialization error code path causing
a NULL pointer to be referenced in acpi_release_power_resource()
if there's an error triggering a jump to the 'err' label in
acpi_add_power_resource(). This happens because the list_node
field of struct acpi_power_resource has not been initialized yet
at this point and doing a list_del() on it is a bad idea.

To prevent this problem from occuring, initialize the list_node
field of struct acpi_power_resource upfront.

Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Acked-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>

commit | commitdiff | tree

Rafael J. Wysocki [Sat, 15 Jun 2013 22:38:30 +0000 (00:38 +0200)]

ACPI / dock: Take ACPI scan lock in write_undock()

Since commit 3757b94 (ACPI / hotplug: Fix concurrency issues and
memory leaks) acpi_bus_scan() and acpi_bus_trim() must always be
called under acpi_scan_lock, but currently the following scenario
violating that requirement is possible:

write_undock()
  handle_eject_request()
   hotplug_dock_devices()
    dock_remove_acpi_device()
     acpi_bus_trim()

Fix that by making write_undock() acquire acpi_scan_lock before
calling handle_eject_request() as appropriate (begin_undock() is
under the lock too in analogy with acpi_dock_deferred_cb()).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>

commit | commitdiff | tree

Mika Westerberg [Mon, 20 May 2013 15:41:45 +0000 (15:41 +0000)]

ACPI / resources: call acpi_get_override_irq() only for legacy IRQ resources

acpi_get_override_irq() was added because there was a problem with
buggy BIOSes passing wrong IRQ() resource for the RTC IRQ. The
commit that added the workaround was 61fd47e0c8476 (ACPI: fix two
IRQ8 issues in IOAPIC mode).

With ACPI 5 enumerated devices there are typically one or more
extended IRQ resources per device (and these IRQs can be shared).
However, the acpi_get_override_irq() workaround forces all IRQs in
range 0 - 15 (the legacy ISA IRQs) to be edge triggered, active high
as can be seen from the dmesg below:

ACPI: IRQ 6 override to edge, high
ACPI: IRQ 7 override to edge, high
ACPI: IRQ 7 override to edge, high
ACPI: IRQ 13 override to edge, high

Also /proc/interrupts for the I2C controllers (INT33C2 and INT33C3) shows
the same thing:

7: 4 0 0 0 IO-APIC-edge INT33C2:00, INT33C3:00

The _CSR method for INT33C2 (and INT33C3) device returns following
resource:

Interrupt (ResourceConsumer, Level, ActiveLow, Shared,,, )
{
0x00000007,
}

which states that this is supposed to be level triggered, active low,
shared IRQ instead.

Fix this by making sure that acpi_get_override_irq() gets only called
when we are dealing with legacy IRQ() or IRQNoFlags() descriptors.

While we are there, correct pr_warning() to print the right triggering
value.

This change turns out to be necessary to make DMA work correctly on
systems based on the Intel Lynxpoint PCH (Platform Controller Hub).

[rjw: Changelog]
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

commit | commitdiff | tree

Paul Gortmaker [Wed, 19 Jun 2013 15:15:26 +0000 (11:15 -0400)]

x86: Fix section mismatch on load_ucode_ap

We are in the process of removing all the __cpuinit annotations.
While working on making that change, an existing problem was
made evident:

  WARNING: arch/x86/kernel/built-in.o(.text+0x198f2): Section mismatch
  in reference from the function cpu_init() to the function
  .init.text:load_ucode_ap()   The function cpu_init() references
  the function __init load_ucode_ap().  This is often because cpu_init
  lacks a __init annotation or the annotation of load_ucode_ap is wrong.

This now appears because in my working tree, cpu_init() is no longer
tagged as __cpuinit, and so the audit picks up the mismatch.  The 2nd
hypothesis from the audit is the correct one, as there was an incorrect
__init tag on the prototype in the header (but __cpuinit was used on
the function itself.)

The audit is telling us that the prototype's __init annotation took
effect and the function did land in the .init.text section.  Checking
with objdump on a mainline tree that still has __cpuinit shows that
the __cpuinit on the function takes precedence over the __init on the
prototype, but that won't be true once we make __cpuinit a no-op.

Even though we are removing __cpuinit, we temporarily align both
the function and the prototype on __cpuinit so that the changeset
can be applied to stable trees  if desired.

[ hpa: build fix only, no object code change ]

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: stable <stable@vger.kernel.org> # 3.9+
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1371654926-11729-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

commit | commitdiff | tree

David Daney [Mon, 17 Jun 2013 15:46:07 +0000 (08:46 -0700)]

mn10300: Fix include dependency in irqflags.h et al.

We need to pick up the definition of raw_smp_processor_id() from
asm/smp.h. For the !SMP case, we need to supply a definition of
raw_smp_processor_id().

Because of the include dependencies we cannot use smp_call_func_t in
asm/smp.h, but we do need linux/thread_info.h

Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Wed, 19 Jun 2013 16:24:50 +0000 (06:24 -1000)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc fixes from David Miller:
"Various sparc bug fixes, in particular:

  1) TSB hashes have to be flushed before TLB on sparc64, from Dave
     Kleikamp.

  2) LEON timer interrupts can get stuck, from Andreas Larsson.

  3) Sparc64 needs to handle lack of address-congruence devicetree
     property, from Bob Picco"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: tsb must be flushed before tlb
  sparc,leon: Convert to use devm_ioremap_resource
  sparc64 address-congruence property
  sparc32, leon: Enable interrupts before going idle to avoid getting stuck
  sparc32, leon: Remove separate "ticker" timer for SMP
  sparc: kernel: using strlcpy() instead of strcpy()
  arch: sparc: prom: looping issue, need additional length check in the outside looping
  sparc: remove inline marking of EXPORT_SYMBOL functions
  sparc: Switch to asm-generic/linkage.h

commit | commitdiff | tree

Linus Torvalds [Wed, 19 Jun 2013 16:23:56 +0000 (06:23 -1000)]

Merge branch 'parisc-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:
"This contains a kernel segfault fix when reading /proc/kpageflags or
  /proc/kpagecount, two fixes for the serial port and PCI graphic card
  support on C8000 workstations and a fix to use unshadowed registers
  for flushing D- and I-caches."

* 'parisc-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Use unshadowed index register for flush instructions in flush_dcache_page_asm and flush_icache_page_asm
  parisc: provide pci_mmap_page_range() for parisc
  parisc: fix serial ports on C8000 workstation
  parisc: fix kernel BUG at arch/parisc/include/asm/mmzone.h:50 (part 2)

commit | commitdiff | tree

James Hogan [Fri, 14 Jun 2013 09:31:01 +0000 (10:31 +0100)]

metag: fix mm/hugetlb.c build breakage

Commit 106c992a5ebe ("mm/hugetlb: add more arch-defined huge_pte
functions") added an include of <asm-generic/hugetlb.h> to each
architecture's <asm/hugetlb.h> (except s390).  Unfortunately metag was
missed which resulted in build errors when hugetlbfs is enabled (see
below).

Add the include for metag too to fix the build errors:

  mm/hugetlb.c In function 'make_huge_pte':
  mm/hugetlb.c +2250 : error: implicit declaration of function 'huge_pte_mkwrite'
  mm/hugetlb.c +2250 : error: implicit declaration of function 'huge_pte_mkdirty'
  ...

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Wed, 19 Jun 2013 16:19:46 +0000 (06:19 -1000)]

Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM fixes from Russell King:
"The larger changes this time are

   - "ARM: 7755/1: handle user space mapped pages in flush_kernel_dcache_page"
     which fixes more data corruption problems with O_DIRECT

   - "ARM: 7759/1: decouple CPU offlining from reboot/shutdown" which
     gets us back to working shutdown/reboot on SMP platforms

   - "ARM: 7752/1: errata: LoUIS bit field in CLIDR register is incorrect"
     which fixes a shutdown regression found in v3.10 on Versatile
     Express platforms.

  The remainder are the quite small, maybe one or two line changes"

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7759/1: decouple CPU offlining from reboot/shutdown
  ARM: 7756/1: zImage/virt: remove hyp-stub.S during distclean
  ARM: 7755/1: handle user space mapped pages in flush_kernel_dcache_page
  ARM: 7754/1: Fix the CPU ID and the mask associated to the PJ4B
  ARM: 7753/1: map_init_section flushes incorrect pmd
  ARM: 7752/1: errata: LoUIS bit field in CLIDR register is incorrect

commit | commitdiff | tree

Sylwester Nawrocki [Fri, 14 Jun 2013 13:44:30 +0000 (10:44 -0300)]

[media] exynos4-is: Fix FIMC-IS clocks initialization

The ISP clock register content is not preserved over the ISP power domain
off/on cycle. Instead of setting the clock frequencies once at probe time
the clock rates set up is moved to the runtime_resume handler, which is
invoked after the related power domain is already enabled, ensuring the
clocks are properly configured when the device is actively used.
This fixes the FIMC-IS malfunctions and STREAM ON timeout errors accuring
on some boards:
[ 59.860000] fimc_is_general_irq_handler:583 ISR_NDONE: 5: 0x800003e8, IS_ERROR_UNKNOWN
[ 59.860000] fimc_is_general_irq_handler:586 IS_ERROR_TIME_OUT

Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:52:21 +0000 (14:52 -0500)]

rsxx: Adding in debugfs entries.

Adding debugfs entries to help with debugging and testing and
testing code.

pci_regs:
        This entry will spit out all of the data stored on the BAR.

stats:
        This entry will display all of the driver stats for each
        DMA channel.

cram:
This will allow read/write ability to the CRAM address space
on our adapter's CPU.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:50:48 +0000 (14:50 -0500)]

rsxx: Fixes incorrect stats calculation.

Fixing incorrect stats calculation during read retries.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:49:48 +0000 (14:49 -0500)]

rsxx: Adding EEH check inside cregs timeout.

Unfortunaly, our CPU register path does not do any kind of
EEH error checking. So to fix this issue, an ioread32 was
added to the CPU register timeout code. This way, the
driver can check to see if the timeout was caused by an EEH
error or not. This is a dummy read.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:48:38 +0000 (14:48 -0500)]

rsxx: Adapter address space sanity check.

Adding a sanity check to guarentee that DMAs outside of the device's
address space will be errored out right away.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:46:04 +0000 (14:46 -0500)]

rsxx: Fixes DLPAR add kernel panic if partition still mounted.

A kernel panic would occur on a DLPAR add if there was a partition
still mounted during the DLPAR remove. This bug fix will allow the
user to unmount the partition and bring the driver back into a
good state after the DLPAR add.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:43:58 +0000 (14:43 -0500)]

rsxx: Changing the adapter name to the official name.

Changing the adapter name from FlashSystem-80 to the official
name: Flash Adapter 900GB Full Height.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:42:36 +0000 (14:42 -0500)]

rsxx: Adding in sync_start module paramenter.

Before, the partition table would have to be reread because our
card was attached before it transistioned out of it's 'starting'
state.

This change will cause the driver to wait to attach the device
until the adapter is ready.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:39:44 +0000 (14:39 -0500)]

rsxx: Allow block size to be determined by configuration.

Previously, the block size was determined by whether or not
our Hardware could handle 512 byte accesses. Now, all of our
Hardware can handle 512 and 4096 block sizes.

This fix allows it to be user configurable.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:38:26 +0000 (14:38 -0500)]

rsxx: Fixes soft-lockup issues during DMAs.

The workqueue mechanism has been reworked to prevent soft
lockup issues from occuring by adding in mutex sychronization.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:36:26 +0000 (14:36 -0500)]

rsxx: Restructured DMA cancel scheme.

Before, DMAs would never be cancelled if there was a data stall
or an EEH Permenant failure which would cause an unrecoverable
I/O hang.

The DMA cancellation mechanism has been modified to fix
these issues and allows DMAs to be cancelled during the
above mentioned events.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Philip J Kelleher [Tue, 18 Jun 2013 19:34:54 +0000 (14:34 -0500)]

rsxx: Individual workqueues for interruptible events.

Giving all interrupt based events their own workqueue to complete
tasks on. This fixes a bug that would cause creg commands to timeout
if too many are issued at once.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Steven Rostedt [Fri, 24 May 2013 19:23:40 +0000 (15:23 -0400)]

tracing/context-tracking: Add preempt_schedule_context() for tracing

Dave Jones hit the following bug report:

===============================
[ INFO: suspicious RCU usage. ]
3.10.0-rc2+ #1 Not tainted
-------------------------------
include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle!
other info that might help us debug this:
RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
2 locks held by cc1/63645:
  #0:  (&rq->lock){-.-.-.}, at: [<ffffffff816b39fd>] __schedule+0xed/0x9b0
  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff8109d645>] cpuacct_charge+0x5/0x1f0

CPU: 1 PID: 63645 Comm: cc1 Not tainted 3.10.0-rc2+ #1 [loadavg: 40.57 27.55 13.39 25/277 64369]
Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
  0000000000000000 ffff88010f78fcf8 ffffffff816ae383 ffff88010f78fd28
  ffffffff810b698d ffff88011c092548 000000000023d073 ffff88011c092500
  0000000000000001 ffff88010f78fd60 ffffffff8109d7c5 ffffffff8109d645
Call Trace:
  [<ffffffff816ae383>] dump_stack+0x19/0x1b
  [<ffffffff810b698d>] lockdep_rcu_suspicious+0xfd/0x130
  [<ffffffff8109d7c5>] cpuacct_charge+0x185/0x1f0
  [<ffffffff8109d645>] ? cpuacct_charge+0x5/0x1f0
  [<ffffffff8108dffc>] update_curr+0xec/0x240
  [<ffffffff8108f528>] put_prev_task_fair+0x228/0x480
  [<ffffffff816b3a71>] __schedule+0x161/0x9b0
  [<ffffffff816b4721>] preempt_schedule+0x51/0x80
  [<ffffffff816b4800>] ? __cond_resched_softirq+0x60/0x60
  [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
  [<ffffffff810ff3cc>] ftrace_ops_control_func+0x1dc/0x210
  [<ffffffff816be280>] ftrace_call+0x5/0x2f
  [<ffffffff816b681d>] ? retint_careful+0xb/0x2e
  [<ffffffff816b4805>] ? schedule_user+0x5/0x70
  [<ffffffff816b4805>] ? schedule_user+0x5/0x70
  [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
------------[ cut here ]------------

What happened was that the function tracer traced the schedule_user() code
that tells RCU that the system is coming back from userspace, and to
add the CPU back to the RCU monitoring.

Because the function tracer does a preempt_disable/enable_notrace() calls
the preempt_enable_notrace() checks the NEED_RESCHED flag. If it is set,
then preempt_schedule() is called. But this is called before the user_exit()
function can inform the kernel that the CPU is no longer in user mode and
needs to be accounted for by RCU.

The fix is to create a new preempt_schedule_context() that checks if
the kernel is still in user mode and if so to switch it to kernel mode
before calling schedule. It also switches back to user mode coming back
from schedule in need be.

The only user of this currently is the preempt_enable_notrace(), which is
only used by the tracing subsystem.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1369423420.6828.226.camel@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Linux kernel for KaRo TX COM modules

RSS Atom