git.karo-electronics.de Git - karo-tx-linux.git/log

bcache: Move keylist out of btree_op

Slowly working on pruning struct btree_op - the aim is for it to only
contain things that are actually necessary for traversing the btree.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Refactor journalling flow control

Making things less asynchronous that don't need to be - bch_journal()
only has to block when the journal or journal entry is full, which is
emphatically not a fast path. So make it a normal function that just
returns when it finishes, to make the code and control flow easier to
follow.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Refactor read request code a bit

More refactoring, and renaming.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Refactor request_write()

Try to improve some of the naming a bit to be more consistent, and also
improve the flow of control in request_write() a bit.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Clean up keylist code

More random refactoring.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Add explicit keylist arg to btree_insert()

Some refactoring - better to explicitly pass stuff around instead of
having it all in the "big bag of state", struct btree_op. Going to prune
struct btree_op quite a bit over time.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Convert btree_insert_check_key() to btree_insert_node()

This was the main point of all this refactoring - now,
btree_insert_check_key() won't fail just because the leaf node happened
to be full.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Insert multiple keys at a time

We'll often end up with a list of adjacent keys to insert -
because bch_data_insert() may have to fragment the data it writes.

Originally, to simplify things and avoid having to deal with corner
cases bch_btree_insert() would pass keys from this list one at a time to
btree_insert_recurse() - mainly because the list of keys might span leaf
nodes, so it was easier this way.

With the btree_insert_node() refactoring, it's now a lot easier to just
pass down the whole list and have btree_insert_recurse() iterate over
leaf nodes until it's done.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Add btree_insert_node()

The flow of control in the old btree insertion code was rather -
backwards; we'd recurse down the btree (in btree_insert_recurse()), and
then if we needed to split the keys to be inserted into the parent node
would be effectively returned up to btree_insert_recurse(), which would
notice there was more work to do and finish the insertion.

The main problem with this was that the full logic for btree insertion
could only be used by calling btree_insert_recurse; if you'd gotten to a
btree leaf some other way and had a key to insert, if it turned out that
node needed to be split you were SOL.

This inverts the flow of control so btree_insert_node() does _full_
btree insertion, including splitting - and takes a (leaf) btree node to
insert into as a parameter.

This means we can now _correctly_ handle cache misses - for cache
misses, we need to insert a fake "check" key into the btree when we
discover we have a cache miss - while we still have the btree locked.
Previously, if the btree node was full inserting a cache miss would just
fail.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Explicitly track btree node's parent

This is prep work for the reworked btree insertion code.

The way we set b->parent is ugly and hacky... the problem is, when
btree_split() or garbage collection splits or rewrites a btree node, the
parent changes for all its (potentially already cached) children.

I may change this later and add some code to look through the btree node
cache and find all our cached child nodes and change the parent pointer
then...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Remove unnecessary check in should_split()

Checking i->seq was redundant, because since ages ago we always
initialize the new bset when advancing b->written

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Stripe size isn't necessarily a power of two

Originally I got this right... except that the divides didn't use
do_div(), which broke 32 bit kernels. When I went to fix that, I forgot
that the raid stripe size usually isn't a power of two... doh

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Add on error panic/unregister setting

Works kind of like the ext4 setting, to panic or remount read only on
errors.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Use blkdev_issue_discard()

The old asynchronous discard code was really a relic from when all the
allocation code was asynchronous - now that allocation runs out of a
dedicated thread there's no point in keeping around all that complicated
machinery.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Fix for handling overlapping extents when reading in a btree node

btree_sort_fixup() was overly clever, because it was trying to avoid
pulling a key off the btree iterator in more than one place.

This led to a really obscure bug where we'd break early from the loop in
btree_sort_fixup() if the current key overlapped with keys in more than
one older set, and the next key it overlapped with was zero size.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Fix a shrinker deadlock

GFP_NOIO means we could be getting called recursively - mca_alloc() ->
mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then.
Whoops.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Fix a dumb CPU spinning bug in writeback

schedule_timeout() != schedule_timeout_uninterruptible()

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Fix a flush/fua performance bug

bch_journal_meta() was missing the flush to make the journal write
actually go down (instead of waiting up to journal_delay_ms)...

Whoops

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Fix a writeback performance regression

Background writeback works by scanning the btree for dirty data and
adding those keys into a fixed size buffer, then for each dirty key in
the keybuf writing it to the backing device.

When read_dirty() finishes and it's time to scan for more dirty data, we
need to wait for the outstanding writeback IO to finish - they still
take up slots in the keybuf (so that foreground writes can check for
them to avoid races) - without that wait, we'll continually rescan when
we'll be able to add at most a key or two to the keybuf, and that takes
locks that starves foreground IO. Doh.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Correct printf()-style format length modifier

drivers/md/bcache/btree.c: In function ‘bch_btree_node_read’:
drivers/md/bcache/btree.c:259: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 3 has type ‘size_t’

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Fix for when no journal entries are found

The journal replay code didn't handle this case, causing it to go into
an infinite loop...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Strip endline when writing the label through sysfs

sysfs attributes with unusual characters have crappy failure modes
in Squeeze (udev 164); later versions of udev are unaffected.

This should make these characters more unusual.

Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Fix a dumb journal discard bug

That switch statement was obviously wrong, leading to some sort of weird
spinning on rare occasion with discards enabled...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Allocation kthread fixes

The alloc kthread should've been using try_to_freeze() - and also there
was the potential for the alloc kthread to get woken up after it had
shut down, which would have been bad.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Fix GC_SECTORS_USED() calculation

Part of the job of garbage collection is to add up however many sectors
of live data it finds in each bucket, but that doesn't work very well if
it doesn't reset GC_SECTORS_USED() when it starts. Whoops.

This wouldn't have broken anything horribly, but allocation tries to
preferentially reclaim buckets that are mostly empty and that's not
gonna work with an incorrect GC_SECTORS_USED() value.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Journal replay fix

The journal replay code starts by finding something that looks like a
valid journal entry, then it does a binary search over the unchecked
region of the journal for the journal entries with the highest sequence
numbers.

Trouble is, the logic was wrong - journal_read_bucket() returns true if
it found journal entries we need, but if the range of journal entries
we're looking for loops around the end of the journal - in that case
journal_read_bucket() could return true when it hadn't found the highest
sequence number we'd seen yet, and in that case the binary search did
the wrong thing. Whoops.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Shutdown fix

Stopping a cache set is supposed to make it stop attached backing
devices, but somewhere along the way that code got lost. Fixing this
mainly has the effect of fixing our reboot notifier.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Fix a sysfs splat on shutdown

If we stopped a bcache device when we were already detaching (or
something like that), bcache_device_unlink() would try to remove a
symlink from sysfs that was already gone because the bcache dev kobject
had already been removed from sysfs.

So keep track of whether we've removed stuff from sysfs.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Advertise that flushes are supported

Whoops - bcache's flush/FUA was mostly correct, but flushes get filtered
out unless we say we support them...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: check for allocation failures

There is a missing NULL check after the kzalloc().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

bcache: Fix a dumb race

In the far-too-complicated closure code - closures can have destructors,
for probably dubious reasons; they get run after the closure is no
longer waiting on anything but before dropping the parent ref, intended
just for freeing whatever memory the closure is embedded in.

Trouble is, when remaining goes to 0 and we've got nothing more to run -
we also have to unlock the closure, setting remaining to -1. If there's
a destructor, that unlock isn't doing anything - nobody could be trying
to lock it if we're about to free it - but if the unlock _is needed...
that check for a destructor was racy. Argh.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

bcache: Use standard utility code

Some of bcache's utility code has made it into the rest of the kernel,
so drop the bcache versions.

Bcache used to have a workaround for allocating from a bio set under
generic_make_request() (if you allocated more than once, the bios you
already allocated would get stuck on current->bio_list when you
submitted, and you'd risk deadlock) - bcache would mask out __GFP_WAIT
when allocating bios under generic_make_request() so that allocation
could fail and it could retry from workqueue. But bio_alloc_bioset() has
a workaround now, so we can drop this hack and the associated error
handling.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Update email address

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Delete fuzz tester

This code has rotted and it hasn't been used in ages anyways.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Document shrinker reserve better

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: FUA fixes

Journal writes need to be marked FUA, not just REQ_FLUSH. And btree node
writes have... weird ordering requirements.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Refresh usage docs

Mention udev autoregistration, symlinks. Write down some sysfs paths.

Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Send label uevents

Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Send a uevent with a cached device's UUID

Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com>

doc: Fix typo in documentation/bcache.txt

Correct spelling typo in documentation/bcache.txt

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Write out full stripes

Now that we're tracking dirty data per stripe, we can add two
optimizations for raid5/6:

* If a stripe is already dirty, force writes to that stripe to
writeback mode - to help build up full stripes of dirty data

* When flushing dirty data, preferentially write out full stripes first
if there are any.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Track dirty data by stripe

To make background writeback aware of raid5/6 stripes, we first need to
track the amount of dirty data within each stripe - we do this by
breaking up the existing sectors_dirty into per stripe atomic_ts

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Initialize sectors_dirty when attaching

Previously, dirty_data wouldn't get initialized until the first garbage
collection... which was a bit of a problem for background writeback (as
the PD controller keys off of it) and also confusing for users.

This is also prep work for making background writeback aware of raid5/6
stripes.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Improve lazy sorting

The old lazy sorting code was kind of hacky - rewrite in a way that
mathematically makes more sense; the idea is that the size of the sets
of keys in a btree node should increase by a more or less fixed ratio
from smallest to biggest.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Rip out pkey()/pbtree()

Old gcc doesnt like the struct hack, and it is kind of ugly. So finish
off the work to convert pr_debug() statements to tracepoints, and delete
pkey()/pbtree().

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Fix/revamp tracepoints

The tracepoints were reworked to be more sensible, and fixed a null
pointer deref in one of the tracepoints.

Converted some of the pr_debug()s to tracepoints - this is partly a
performance optimization; it used to be that with DEBUG or
CONFIG_DYNAMIC_DEBUG pr_debug() was an empty macro; but at some point it
was changed to an empty inline function.

Some of the pr_debug() statements had rather expensive function calls as
part of the arguments, so this code was getting run unnecessarily even
on non debug kernels - in some fast paths, too.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Refactor btree io

The most significant change is that btree reads are now done
synchronously, instead of asynchronously and doing the post read stuff
from a workqueue.

This was originally done because we can't block on IO under
generic_make_request(). But - we already have a mechanism to punt cache
lookups to workqueue if needed, so if we just use that we don't have to
deal with the complexity of doing things asynchronously.

The main benefit is this makes the locking situation saner; we can hold
our write lock on the btree node until we're finished reading it, and we
don't need that btree_node_read_done() flag anymore.

Also, for writes, btree_write() was broken out into btree_node_write()
and btree_leaf_dirty() - the old code with the boolean argument was dumb
and confusing.

The prio_blocked mechanism was improved a bit too, now the only counter
is in struct btree_write, we don't mess with transfering a count from
struct btree anymore.

This required changing garbage collection to block prios at the start
and unblock when it finishes, which is cleaner than what it was doing
anyways (the old code had mostly the same effect, but was doing it in a
convoluted way)

And the btree iter btree_node_read_done() uses was converted to a real
mempool.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Convert allocator thread to kthread

Using a workqueue when we just want a single thread is a bit silly.

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Warn when a device is already registered.

Signed-off-by: Gabriel de Perthuis <g2p.code+bcache@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: fix a spurious gcc complaint, use scnprintf

An old version of gcc was complaining about using a const int as the
size of a stack allocated array. Which should be fine - but using
ARRAY_SIZE() is better, anyways.

Also, refactor the code to use scnprintf().

Signed-off-by: Kent Overstreet <koverstreet@google.com>

md: bcache: io.c: fix a potential NULL pointer dereference

bio_alloc_bioset returns NULL on failure. This fix adds a missing check
for potential NULL pointer dereferencing.

Signed-off-by: Kumar Amit Mehta <gmate.amit@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

Linux 3.10-rc7

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
"These are two fixes that came in this week, one for a regression we
  introduced in 3.10 in the GIC interrupt code, and the other one fixes
  a typo in newly introduced code"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  irqchip: gic: call gic_cpu_init() as well in CPU_STARTING_FROZEN case
  ARM: dts: Correct the base address of pinctrl_3 on Exynos5250

Merge tag 'driver-core-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fix from Greg Kroah-Hartman:
"Here's a single patch for the firmware core that resolves a reported
oops in the firmware core that people have been hitting."

* tag 'driver-core-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
firmware loader: fix use-after-free by double abort

Merge tag 'usb-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg Kroah-Hartman:
"Here are two USB patches for 3.10.

  One updates the Kconfig wording for CONFIG_USB_PHY to make it,
  hopefully, more obvious what this option is (I know you complained
  about this when it hit the tree.) The other is a new device id for a
  driver"

* tag 'usb-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: ti_usb_3410_5052: new device id for Abbot strip port cable
  usb: phy: Improve Kconfig help for CONFIG_USB_PHY

Merge tag 'tty-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pul tty fixes from Greg Kroah-Hartman:
"Here are two tty core fixes that resolve some regressions that have
  been reported recently.  Both tiny fixes, but needed"

* tag 'tty-3.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: Fix transient pty write() EIO
  tty/vt: Return EBUSY if deallocating VT1 and it is busy

Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

Pull SCSI target fixes from Nicholas Bellinger:
"Included is the recent tcm_qla2xxx residual underrun length fix from
  Roland, along with Joern's iscsi-target patch for session_lock
  breakage within iscsit_stop_time2retain_timer() code.  Both are CC'ed
  to stable.

  The remaining two are specific to recent iscsi-target + iser
  conversion changes.  One drops some left-over debug noise, and Andy's
  patch fixes configfs attribute handling during an explicit network
  portal feature bit disable when iser-target is unsupported."

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Remove left over v3.10-rc debug printks
  target/iscsi: Fix op=disable + error handling cases in np_store_iser
  tcm_qla2xxx: Fix residual for underrun commands that fail
  target/iscsi: don't corrupt bh_count in iscsit_stop_time2retain_timer()

Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
"Another set of fixes for Kernel 3.10.

  This series contain:
   - two Kbuild fixes for randconfig
   - a buffer overflow when using rtl28xuu with r820t tuner
   - one clk fixup on exynos4-is driver"

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] Fix build when drivers are builtin and frontend modules
  [media] s5p makefiles: don't override other selections on obj-[ym]
  [media] exynos4-is: Fix FIMC-IS clocks initialization
  [media] rtl28xxu: fix buffer overflow when probing Rafael Micro r820t tuner

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
"Several fixes for bugs caught while looking through f_pos (ab)users"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  aout32 coredump compat fix
  splice: don't pass the address of ->f_pos to methods
  mconsole: we'd better initialize pos before passing it to vfs_read()...

aout32 coredump compat fix

dump_seek() does SEEK_CUR, not SEEK_SET; native binfmt_aout
handles it correctly (seeks by PAGE_SIZE - sizeof(struct user),
getting the current position to PAGE_SIZE), compat one seeks
by PAGE_SIZE and ends up at PAGE_SIZE + already written...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Peter Anvin:
"This series fixes a couple of build failures, and fixes MTRR cleanup
  and memory setup on very specific memory maps.

  Finally, it fixes triggering backtraces on all CPUs, which was
  inadvertently disabled on x86."

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Fix dummy variable buffer allocation
  x86: Fix trigger_all_cpu_backtrace() implementation
  x86: Fix section mismatch on load_ucode_ap
  x86: fix build error and kconfig for ia32_emulation and binfmt
  range: Do not add new blank slot with add_range_with_merge
  x86, mtrr: Fix original mtrr range get for mtrr_cleanup

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm radeon fixes from Dave Airlie:
"One core fix, but mostly radeon fixes for s/r and big endian UVD
  support, and a fix to stop the GPU being reset for no good reason, and
  crashing people's machines."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: update lockup tracking when scheduling in empty ring
  drm/prime: Honor requested file flags when exporting a buffer
  drm/radeon: fix UVD on big endian
  drm/radeon: fix write back suspend regression with uvd v2
  drm/radeon: do not try to uselessly update virtual memory pagetable

Merge tag 'acpi-3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:

- Fix for a regression causing a failure to turn on some devices on
   some systems during initialization introduced by a recent revert of
   an ACPI PM change that broke something else.  Fortunately, we know
   exactly what devices are affected, so we can add a fix just for them
   leaving everyone else alone.

- ACPI power resources initialization fix preventing a NULL pointer
   from being dereferenced in the acpi_add_power_resource() error code
   path.

- ACPI dock station driver fix that adds missing locking to
   write_undock().

- ACPI resources allocation fix changing the scope of an old workaround
   so that it doesn't affect systems that aren't actually buggy.  This
   was reported a couple of days ago to fix DMA problems on some new
   platforms so we need it in -stable.  From Mika Westerberg.

* tag 'acpi-3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / LPSS: Power up LPSS devices during enumeration
  ACPI / PM: Fix error code path for power resources initialization
  ACPI / dock: Take ACPI scan lock in write_undock()
  ACPI / resources: call acpi_get_override_irq() only for legacy IRQ resources

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"Three one-line fixes for my first pull request; one for x86 host, one
  for x86 guest, one for PPC"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  x86: kvmclock: zero initialize pvclock shared memory area
  kvm/ppc/booke: Delay kvmppc_lazy_ee_enable
  KVM: x86: remove vcpu's CPL check in host-invoked XCR set

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
"This fixes an unaligned crash in XTS mode when using aseni_intel"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: aesni_intel - fix accessing of unaligned memory

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull Ceph fix from Sage Weil:
"This fixes a problem preventing the kernel and userland librbd
libraries from sharing data with the new format 2 images"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: use the correct length for format 2 object names

Merge tag 'efi-urgent' into x86/urgent

* Don't leak random kernel memory to EFI variable NVRAM when attempting
to initiate garbage collection. Also, free the kernel memory when
we're done with it instead of leaking - Ben Hutchings

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

x86/efi: Fix dummy variable buffer allocation

1. Check for allocation failure
2. Clear the buffer contents, as they may actually be written to flash
3. Don't leak the buffer

Compile-tested only.

[ Tested successfully on my buggy ASUS machine - Matt ]

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Matt Fleming <matt.fleming@intel.com>

iscsi-target: Remove left over v3.10-rc debug printks

Reported-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

target/iscsi: Fix op=disable + error handling cases in np_store_iser

Writing 0 when iser was not previously enabled, so succeed but do
nothing so that user-space code doesn't need a try: catch block
when ib_isert logic is not available.

Also, return actual error from add_network_portal using PTR_ERR
during op=enable failure.

Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

Merge branch 'drm-fixes-3.10' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

One user visible fix to stop misreport GPU hangs and subsequent resets.
* 'drm-fixes-3.10' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: update lockup tracking when scheduling in empty ring

drm/radeon: update lockup tracking when scheduling in empty ring

There might be issue with lockup detection when scheduling on an
empty ring that have been sitting idle for a while. Thus update
the lockup tracking data when scheduling new work in an empty ring.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Tested-by: Andy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"Two smaller fixes - plus a context tracking tracing fix that is a bit
  bigger"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tracing/context-tracking: Add preempt_schedule_context() for tracing
  sched: Fix clear NOHZ_BALANCE_KICK
  sched/x86: Construct all sibling maps if smt

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Four fixes.  The mmap ones are unfortunately larger than desired -
  fuzzing uncovered bugs that needed perf context life time management
  changes to fix properly"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix broken PEBS-LL support on SNB-EP/IVB-EP
  perf: Fix mmap() accounting hole
  perf: Fix perf mmap bugs
  kprobes: Fix to free gone and unused optprobes

Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull cpu idle fixes from Thomas Gleixner:
- Add a missing irq enable. Fallout of the idle conversion
- Fix stackprotector wreckage caused by the idle conversion

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
idle: Enable interrupts in the weak arch_cpu_idle() implementation
idle: Add the stack canary init to cpu_startup_entry()

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
- Fix inconstinant clock usage in virtual time accounting
- Fix a build error in KVM caused by the NOHZ work
- Remove a pointless timekeeping duty assignment which breaks NOHZ
- Use a proper notifier return value to avoid random behaviour

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick: Remove useless timekeeping duty attribution to broadcast source
  nohz: Fix notifier return val that enforce timekeeping
  kvm: Move guest entry/exit APIs to context_tracking
  vtime: Use consistent clocks among nohz accounting

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull powerpc fix fro, Benjamin Herrenschmidt:
"We accidentally broke hugetlbfs on Freescale embedded processors which
use a slightly different page table layout than our server processors"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Fix bad pmd error with book3E config

Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile

Pull tilepro fix from Chris Metcalf:
"This change allows the older tilepro architecture to be correctly
  built by newer gccs, despite a change that caused gcc to start trying
  to use an out-of-line implementation for __builtin_ffsll().

  This should be inline again starting with gcc 4.7.4 and 4.8.2 or so,
  but meanwhile this change keeps things from breaking, with the only
  cost being a few bytes of code in the kernel to provide __ffsdi2 even
  for compilers that do inline it"

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tilepro: work around module link error with gcc 4.7

Merge tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64

Pull arm64 perf fix from Catalin Marinas:
"Perf fix (user-mode PC recording)"

* tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
perf: arm64: Record the user-mode PC in the call chain.

splice: don't pass the address of ->f_pos to methods

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

[media] Fix build when drivers are builtin and frontend modules

There are a large number of reports that the media build is
not compiling when some drivers are compiled as builtin, while
the needed frontends are compiled as module.

On the last one of such reports:
From: kbuild test robot <fengguang.wu@intel.com>
Subject: saa7134-dvb.c:undefined reference to `zl10039_attach'

The .config file has:

CONFIG_VIDEO_SAA7134=y
CONFIG_VIDEO_SAA7134_DVB=y
# CONFIG_MEDIA_ATTACH is not set
CONFIG_DVB_ZL10039=m

And it produces all those errors:

   drivers/built-in.o: In function `set_type':
   tuner-core.c:(.text+0x2f263e): undefined reference to `tea5767_attach'
   tuner-core.c:(.text+0x2f273e): undefined reference to `tda9887_attach'
   drivers/built-in.o: In function `tuner_probe':
   tuner-core.c:(.text+0x2f2d20): undefined reference to `tea5767_autodetection'
   drivers/built-in.o: In function `av7110_attach':
   av7110.c:(.text+0x330bda): undefined reference to `ves1x93_attach'
   av7110.c:(.text+0x330bf7): undefined reference to `stv0299_attach'
   av7110.c:(.text+0x330c63): undefined reference to `tda8083_attach'
   av7110.c:(.text+0x330d09): undefined reference to `ves1x93_attach'
   av7110.c:(.text+0x330d33): undefined reference to `tda8083_attach'
   av7110.c:(.text+0x330d5d): undefined reference to `stv0297_attach'
   av7110.c:(.text+0x330dbe): undefined reference to `stv0299_attach'
   drivers/built-in.o: In function `tuner_attach_dtt7520x':
   ngene-cards.c:(.text+0x3381cb): undefined reference to `dvb_pll_attach'
   drivers/built-in.o: In function `demod_attach_lg330x':
   ngene-cards.c:(.text+0x33828a): undefined reference to `lgdt330x_attach'
   drivers/built-in.o: In function `demod_attach_stv0900':
   ngene-cards.c:(.text+0x3383d5): undefined reference to `stv090x_attach'
   drivers/built-in.o: In function `cineS2_probe':
   ngene-cards.c:(.text+0x338b7f): undefined reference to `drxk_attach'
   drivers/built-in.o: In function `configure_tda827x_fe':
   saa7134-dvb.c:(.text+0x346ae7): undefined reference to `tda10046_attach'
   drivers/built-in.o: In function `dvb_init':
   saa7134-dvb.c:(.text+0x347283): undefined reference to `mt352_attach'
   saa7134-dvb.c:(.text+0x3472cd): undefined reference to `mt352_attach'
   saa7134-dvb.c:(.text+0x34731c): undefined reference to `tda10046_attach'
   saa7134-dvb.c:(.text+0x34733c): undefined reference to `tda10046_attach'
   saa7134-dvb.c:(.text+0x34735c): undefined reference to `tda10046_attach'
   saa7134-dvb.c:(.text+0x347378): undefined reference to `tda10046_attach'
   saa7134-dvb.c:(.text+0x3473db): undefined reference to `tda10046_attach'
   drivers/built-in.o:saa7134-dvb.c:(.text+0x347502): more undefined references to `tda10046_attach' follow
   drivers/built-in.o: In function `dvb_init':
   saa7134-dvb.c:(.text+0x347812): undefined reference to `mt352_attach'
   saa7134-dvb.c:(.text+0x347951): undefined reference to `mt312_attach'
   saa7134-dvb.c:(.text+0x3479a9): undefined reference to `mt312_attach'
>> saa7134-dvb.c:(.text+0x3479c1): undefined reference to `zl10039_attach'

This is happening because a builtin module can't use directly a symbol
found on a module. By enabling CONFIG_MEDIA_ATTACH, the configuration
becomes valid, as dvb_attach() macro loads the module if needed, making
the symbol available to the builtin module.

While this bug started to appear after the patches that use IS_DEFINED
macro (like changeset 7b34be71db533f3e0cf93d53cf62d036cdb5418a), this
bug is a way ancient than that.

The thing is that, before the IS_DEFINED() patches, the logic used to be:

       && defined(MODULE))
struct dvb_frontend *zl10039_attach(struct dvb_frontend *fe,
u8 i2c_addr,
struct i2c_adapter *i2c);
static inline struct dvb_frontend *zl10039_attach(struct dvb_frontend *fe,
u8 i2c_addr,
struct i2c_adapter *i2c)
{
printk(KERN_WARNING "%s: driver disabled by Kconfig\n", __func__);
return NULL;
}

The above code, with the .config file used, was evoluting to FALSE
(instead of TRUE as it should be, as CONFIG_DVB_ZL10039 is 'm'),
and were adding the static inline code at saa7134-dvb, instead
of the external call. So, while it weren't producing any compilation
error, the code weren't working either.

So, as the overhead for using CONFIG_MEDIA_ATTACH is minimal, just
enable it, if MODULES is defined.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

irqchip: gic: call gic_cpu_init() as well in CPU_STARTING_FROZEN case

Commit c011470 (irqchip: gic: Perform the gic_secondary_init() call via
CPU notifier) moves gic_secondary_init() that used to be called in
.smp_secondary_init hook into a notifier call.  But it changes the
system behavior a little bit.  Before the commit, gic_cpu_init()
is called not only when kernel brings up the secondary cores but also
when system resuming procedure hot-plugs the cores back to kernel.
While after the commit, the function will not be called in the latter
case, where the 'action' will not be CPU_STARTING but
CPU_STARTING_FROZEN.  This behavior difference at least causes the
following suspend/resume regression on imx6q.

$ echo mem > /sys/power/state
PM: Syncing filesystems ... done.
PM: Preparing system for mem sleep
mmc1: card e624 removed
Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
PM: Entering mem sleep
PM: suspend of devices complete after 5.930 msecs
PM: suspend devices took 0.010 seconds
PM: late suspend of devices complete after 0.343 msecs
PM: noirq suspend of devices complete after 0.828 msecs
Disabling non-boot CPUs ...
CPU1: shutdown
CPU2: shutdown
CPU3: shutdown
Enabling non-boot CPUs ...
CPU1: Booted secondary processor
INFO: rcu_sched detected stalls on CPUs/tasks: { 1 2 3} (detected by 0, t=2102 jiffies, g=4294967169, c=4294967168, q=17)
Task dump for CPU 1:
swapper/1       R running      0     0      1 0x00000000
Backtrace:
[<bf895ff4>] (0xbf895ff4) from [<00000000>] (  (null))
Backtrace aborted due to bad frame pointer <8007ccdc>
Task dump for CPU 2:
swapper/2       R running      0     0      1 0x00000000
Backtrace:
[<8075dbdc>] (0x8075dbdc) from [<00000000>] (  (null))
Backtrace aborted due to bad frame pointer <00000002>
Task dump for CPU 3:
swapper/3       R running      0     0      1 0x00000000
Backtrace:
[<8075dbdc>] (0x8075dbdc) from [<00000000>] (  (null))

Fix the regression by checking 'action' being CPU_STARTING_FROZEN to
have gic_cpu_init() called for secondary cores when system resumes.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Joseph Lo <josephl@nvidia.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

x86: Fix trigger_all_cpu_backtrace() implementation

The following change fixes the x86 implementation of
trigger_all_cpu_backtrace(), which was previously (accidentally,
as far as I can tell) disabled to always return false as on
architectures that do not implement this function.

trigger_all_cpu_backtrace(), as defined in include/linux/nmi.h,
should call arch_trigger_all_cpu_backtrace() if available, or
return false if the underlying arch doesn't implement this
function.

x86 did provide a suitable arch_trigger_all_cpu_backtrace()
implementation, but it wasn't actually being used because it was
declared in asm/nmi.h, which linux/nmi.h doesn't include. Also,
linux/nmi.h couldn't easily be fixed by including asm/nmi.h,
because that file is not available on all architectures.

I am proposing to fix this by moving the x86 definition of
arch_trigger_all_cpu_backtrace() to asm/irq.h.

Tested via: echo l > /proc/sysrq-trigger

Before the change, this uses a fallback implementation which
shows backtraces on active CPUs (using
smp_call_function_interrupt() )

After the change, this shows NMI backtraces on all CPUs

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1370518875-1346-1-git-send-email-walken@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf: arm64: Record the user-mode PC in the call chain.

With this change, we no longer lose the innermost entry in the user-mode
part of the call chain. See also the x86 port, which includes the ip,
and the corresponding change in arch/arm.

Signed-off-by: Jed Davis <jld@mozilla.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

[media] s5p makefiles: don't override other selections on obj-[ym]

The $obj-m/$obj-y vars should be adding new modules to build, not
overriding it. So, it should never use
$obj-y := foo.o
instead, it should use:
$obj-y += foo.o

Failing to do that is very bad, as it will suppress needed modules.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

powerpc: Fix bad pmd error with book3E config

Book3E uses the hugepd at PMD level and don't encode pte directly
at the pmd level. So it will find the lower bits of pmd set
and the pmd_bad check throws error. Infact the current code
will never take the free_hugepd_range call at all because it will
clear the pmd if it find a hugepd pointer. Fix this by clearing
bad pmd only if it is not a hugepd pointer.

This is regression introduced by e2b3d202d1dba8f3546ed28224ce485bc50010be
"powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format"

Reported-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

USB: serial: ti_usb_3410_5052: new device id for Abbot strip port cable

Add product id for Abbott strip port cable for Precision meter which
uses the TI 3410 chip.

Signed-off-by: Anders Hammarquist <iko@iko.pp.se>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / LPSS: Power up LPSS devices during enumeration

Commit 7cd8407 (ACPI / PM: Do not execute _PS0 for devices without
_PSC during initialization) introduced a regression on some systems
with Intel Lynxpoint Low-Power Subsystem (LPSS) where some devices
need to be powered up during initialization, but their device objects
in the ACPI namespace have _PS0 and _PS3 only (without _PSC or power
resources).

To work around this problem, make the ACPI LPSS driver power up
devices it knows about by using a new helper function
acpi_device_fix_up_power() that does all of the necessary
sanity checks and calls acpi_dev_pm_explicit_set() to put the
device into D0.

Reported-and-tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

ACPI / PM: Fix error code path for power resources initialization

Commit 781d737 (ACPI: Drop power resources driver) introduced a
bug in the power resources initialization error code path causing
a NULL pointer to be referenced in acpi_release_power_resource()
if there's an error triggering a jump to the 'err' label in
acpi_add_power_resource(). This happens because the list_node
field of struct acpi_power_resource has not been initialized yet
at this point and doing a list_del() on it is a bad idea.

To prevent this problem from occuring, initialize the list_node
field of struct acpi_power_resource upfront.

Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Acked-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>

ACPI / dock: Take ACPI scan lock in write_undock()

Since commit 3757b94 (ACPI / hotplug: Fix concurrency issues and
memory leaks) acpi_bus_scan() and acpi_bus_trim() must always be
called under acpi_scan_lock, but currently the following scenario
violating that requirement is possible:

write_undock()
  handle_eject_request()
   hotplug_dock_devices()
    dock_remove_acpi_device()
     acpi_bus_trim()

Fix that by making write_undock() acquire acpi_scan_lock before
calling handle_eject_request() as appropriate (begin_undock() is
under the lock too in analogy with acpi_dock_deferred_cb()).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>

ACPI / resources: call acpi_get_override_irq() only for legacy IRQ resources

acpi_get_override_irq() was added because there was a problem with
buggy BIOSes passing wrong IRQ() resource for the RTC IRQ. The
commit that added the workaround was 61fd47e0c8476 (ACPI: fix two
IRQ8 issues in IOAPIC mode).

With ACPI 5 enumerated devices there are typically one or more
extended IRQ resources per device (and these IRQs can be shared).
However, the acpi_get_override_irq() workaround forces all IRQs in
range 0 - 15 (the legacy ISA IRQs) to be edge triggered, active high
as can be seen from the dmesg below:

ACPI: IRQ 6 override to edge, high
ACPI: IRQ 7 override to edge, high
ACPI: IRQ 7 override to edge, high
ACPI: IRQ 13 override to edge, high

Also /proc/interrupts for the I2C controllers (INT33C2 and INT33C3) shows
the same thing:

7: 4 0 0 0 IO-APIC-edge INT33C2:00, INT33C3:00

The _CSR method for INT33C2 (and INT33C3) device returns following
resource:

Interrupt (ResourceConsumer, Level, ActiveLow, Shared,,, )
{
0x00000007,
}

which states that this is supposed to be level triggered, active low,
shared IRQ instead.

Fix this by making sure that acpi_get_override_irq() gets only called
when we are dealing with legacy IRQ() or IRQNoFlags() descriptors.

While we are there, correct pr_warning() to print the right triggering
value.

This change turns out to be necessary to make DMA work correctly on
systems based on the Intel Lynxpoint PCH (Platform Controller Hub).

[rjw: Changelog]
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

x86: Fix section mismatch on load_ucode_ap

We are in the process of removing all the __cpuinit annotations.
While working on making that change, an existing problem was
made evident:

  WARNING: arch/x86/kernel/built-in.o(.text+0x198f2): Section mismatch
  in reference from the function cpu_init() to the function
  .init.text:load_ucode_ap()   The function cpu_init() references
  the function __init load_ucode_ap().  This is often because cpu_init
  lacks a __init annotation or the annotation of load_ucode_ap is wrong.

This now appears because in my working tree, cpu_init() is no longer
tagged as __cpuinit, and so the audit picks up the mismatch.  The 2nd
hypothesis from the audit is the correct one, as there was an incorrect
__init tag on the prototype in the header (but __cpuinit was used on
the function itself.)

The audit is telling us that the prototype's __init annotation took
effect and the function did land in the .init.text section.  Checking
with objdump on a mainline tree that still has __cpuinit shows that
the __cpuinit on the function takes precedence over the __init on the
prototype, but that won't be true once we make __cpuinit a no-op.

Even though we are removing __cpuinit, we temporarily align both
the function and the prototype on __cpuinit so that the changeset
can be applied to stable trees  if desired.

[ hpa: build fix only, no object code change ]

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: stable <stable@vger.kernel.org> # 3.9+
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1371654926-11729-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

mn10300: Fix include dependency in irqflags.h et al.

We need to pick up the definition of raw_smp_processor_id() from
asm/smp.h. For the !SMP case, we need to supply a definition of
raw_smp_processor_id().

Because of the include dependencies we cannot use smp_call_func_t in
asm/smp.h, but we do need linux/thread_info.h

Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc fixes from David Miller:
"Various sparc bug fixes, in particular:

  1) TSB hashes have to be flushed before TLB on sparc64, from Dave
     Kleikamp.

  2) LEON timer interrupts can get stuck, from Andreas Larsson.

  3) Sparc64 needs to handle lack of address-congruence devicetree
     property, from Bob Picco"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: tsb must be flushed before tlb
  sparc,leon: Convert to use devm_ioremap_resource
  sparc64 address-congruence property
  sparc32, leon: Enable interrupts before going idle to avoid getting stuck
  sparc32, leon: Remove separate "ticker" timer for SMP
  sparc: kernel: using strlcpy() instead of strcpy()
  arch: sparc: prom: looping issue, need additional length check in the outside looping
  sparc: remove inline marking of EXPORT_SYMBOL functions
  sparc: Switch to asm-generic/linkage.h

Merge branch 'parisc-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:
"This contains a kernel segfault fix when reading /proc/kpageflags or
  /proc/kpagecount, two fixes for the serial port and PCI graphic card
  support on C8000 workstations and a fix to use unshadowed registers
  for flushing D- and I-caches."

* 'parisc-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Use unshadowed index register for flush instructions in flush_dcache_page_asm and flush_icache_page_asm
  parisc: provide pci_mmap_page_range() for parisc
  parisc: fix serial ports on C8000 workstation
  parisc: fix kernel BUG at arch/parisc/include/asm/mmzone.h:50 (part 2)

metag: fix mm/hugetlb.c build breakage

Commit 106c992a5ebe ("mm/hugetlb: add more arch-defined huge_pte
functions") added an include of <asm-generic/hugetlb.h> to each
architecture's <asm/hugetlb.h> (except s390).  Unfortunately metag was
missed which resulted in build errors when hugetlbfs is enabled (see
below).

Add the include for metag too to fix the build errors:

  mm/hugetlb.c In function 'make_huge_pte':
  mm/hugetlb.c +2250 : error: implicit declaration of function 'huge_pte_mkwrite'
  mm/hugetlb.c +2250 : error: implicit declaration of function 'huge_pte_mkdirty'
  ...

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM fixes from Russell King:
"The larger changes this time are

   - "ARM: 7755/1: handle user space mapped pages in flush_kernel_dcache_page"
     which fixes more data corruption problems with O_DIRECT

   - "ARM: 7759/1: decouple CPU offlining from reboot/shutdown" which
     gets us back to working shutdown/reboot on SMP platforms

   - "ARM: 7752/1: errata: LoUIS bit field in CLIDR register is incorrect"
     which fixes a shutdown regression found in v3.10 on Versatile
     Express platforms.

  The remainder are the quite small, maybe one or two line changes"

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7759/1: decouple CPU offlining from reboot/shutdown
  ARM: 7756/1: zImage/virt: remove hyp-stub.S during distclean
  ARM: 7755/1: handle user space mapped pages in flush_kernel_dcache_page
  ARM: 7754/1: Fix the CPU ID and the mask associated to the PJ4B
  ARM: 7753/1: map_init_section flushes incorrect pmd
  ARM: 7752/1: errata: LoUIS bit field in CLIDR register is incorrect

[media] exynos4-is: Fix FIMC-IS clocks initialization

The ISP clock register content is not preserved over the ISP power domain
off/on cycle. Instead of setting the clock frequencies once at probe time
the clock rates set up is moved to the runtime_resume handler, which is
invoked after the related power domain is already enabled, ensuring the
clocks are properly configured when the device is actively used.
This fixes the FIMC-IS malfunctions and STREAM ON timeout errors accuring
on some boards:
[ 59.860000] fimc_is_general_irq_handler:583 ISR_NDONE: 5: 0x800003e8, IS_ERROR_UNKNOWN
[ 59.860000] fimc_is_general_irq_handler:586 IS_ERROR_TIME_OUT

Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

tracing/context-tracking: Add preempt_schedule_context() for tracing

Dave Jones hit the following bug report:

===============================
[ INFO: suspicious RCU usage. ]
3.10.0-rc2+ #1 Not tainted
-------------------------------
include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle!
other info that might help us debug this:
RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
2 locks held by cc1/63645:
  #0:  (&rq->lock){-.-.-.}, at: [<ffffffff816b39fd>] __schedule+0xed/0x9b0
  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff8109d645>] cpuacct_charge+0x5/0x1f0

CPU: 1 PID: 63645 Comm: cc1 Not tainted 3.10.0-rc2+ #1 [loadavg: 40.57 27.55 13.39 25/277 64369]
Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
  0000000000000000 ffff88010f78fcf8 ffffffff816ae383 ffff88010f78fd28
  ffffffff810b698d ffff88011c092548 000000000023d073 ffff88011c092500
  0000000000000001 ffff88010f78fd60 ffffffff8109d7c5 ffffffff8109d645
Call Trace:
  [<ffffffff816ae383>] dump_stack+0x19/0x1b
  [<ffffffff810b698d>] lockdep_rcu_suspicious+0xfd/0x130
  [<ffffffff8109d7c5>] cpuacct_charge+0x185/0x1f0
  [<ffffffff8109d645>] ? cpuacct_charge+0x5/0x1f0
  [<ffffffff8108dffc>] update_curr+0xec/0x240
  [<ffffffff8108f528>] put_prev_task_fair+0x228/0x480
  [<ffffffff816b3a71>] __schedule+0x161/0x9b0
  [<ffffffff816b4721>] preempt_schedule+0x51/0x80
  [<ffffffff816b4800>] ? __cond_resched_softirq+0x60/0x60
  [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
  [<ffffffff810ff3cc>] ftrace_ops_control_func+0x1dc/0x210
  [<ffffffff816be280>] ftrace_call+0x5/0x2f
  [<ffffffff816b681d>] ? retint_careful+0xb/0x2e
  [<ffffffff816b4805>] ? schedule_user+0x5/0x70
  [<ffffffff816b4805>] ? schedule_user+0x5/0x70
  [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
------------[ cut here ]------------

What happened was that the function tracer traced the schedule_user() code
that tells RCU that the system is coming back from userspace, and to
add the CPU back to the RCU monitoring.

Because the function tracer does a preempt_disable/enable_notrace() calls
the preempt_enable_notrace() checks the NEED_RESCHED flag. If it is set,
then preempt_schedule() is called. But this is called before the user_exit()
function can inform the kernel that the CPU is no longer in user mode and
needs to be accounted for by RCU.

The fix is to create a new preempt_schedule_context() that checks if
the kernel is still in user mode and if so to switch it to kernel mode
before calling schedule. It also switches back to user mode coming back
from schedule in need be.

The only user of this currently is the preempt_enable_notrace(), which is
only used by the tracing subsystem.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1369423420.6828.226.camel@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched: Fix clear NOHZ_BALANCE_KICK

I have faced a sequence where the Idle Load Balance was sometime not
triggered for a while on my platform, in the following scenario:

CPU 0 and CPU 1 are running tasks and CPU 2 is idle

CPU 1 kicks the Idle Load Balance
CPU 1 selects CPU 2 as the new Idle Load Balancer
CPU 2 sets NOHZ_BALANCE_KICK for CPU 2
CPU 2 sends a reschedule IPI to CPU 2

While CPU 3 wakes up, CPU 0 or CPU 1 migrates a waking up task A on CPU 2

CPU 2 finally wakes up, runs task A and discards the Idle Load Balance
task A quickly goes back to sleep (before a tick occurs on CPU 2)
CPU 2 goes back to idle with NOHZ_BALANCE_KICK set

Whenever CPU 2 will be selected as the ILB, no reschedule IPI will be sent
because NOHZ_BALANCE_KICK is already set and no Idle Load Balance will be
performed.

We must wait for the sched softirq to be raised on CPU 2 thanks to another
part the kernel to come back to clear NOHZ_BALANCE_KICK.

The proposed solution clears NOHZ_BALANCE_KICK in schedule_ipi if
we can't raise the sched_softirq for the Idle Load Balance.

Change since V1:

- move the clear of NOHZ_BALANCE_KICK in got_nohz_idle_kick if the ILB
can't run on this CPU (as suggested by Peter)

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1370419991-13870-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>