git.karo-electronics.de Git - linux-beck.git/log

NFS: Prevent garbage cinfo->ds from leaking out

This bugfix applies on top of the previous directio patches and
fixes a bug introduced in "NFS: create struct nfs_commit_info".

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: rewrite directio write to use async coalesce code

This also has the advantage that it allows directio to use pnfs.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: avoid some stat gathering for direct io

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: add dreq to nfs_commit_info

Need this to pass into nfs_commitdata_init, in order to keep data->dreq
accurate.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: create nfs_commit_completion_ops

Factors out the code that needs to change when directio
starts using these code paths.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: create struct nfs_commit_info

It is COMMIT that is handled the most differently between
the paged and direct paths. Create a structure that encapsulates
everything either path needs to know about the commit state.

We could use void to hide some of the layout driver stuff, but
Trond suggests pulling it out to ensure type checking, given the
huge changes being made, and the fact that it doesn't interfere
with other drivers.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: create nfs_generic_commit_list

Simple refactoring.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: rewrite directio read to use async coalesce code

This also has the advantage that it allows directio to use pnfs.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: prepare coalesce testing for directio

The coalesce code made assumptions that will no longer be true once
non-page aligned io occurs. This introduces no change in
current behavior, but allows for more general situations to come.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: remove unused wb_complete field from struct nfs_page

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: create completion structure to pass into page_init functions

Factors out the code that will need to change when directio
starts using these code paths. This will allow directio to use
the generic pagein and flush routines

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: merge _full and _partial write rpc_ops

Decouple nfs_pgio_header and nfs_write_data, and have (possibly
multiple) nfs_write_datas each take a refcount on nfs_pgio_header.

For the moment keeps nfs_write_header as a way to preallocate a single
nfs_write_data with the nfs_pgio_header. The code doesn't need this,
and would be prettier without, but given the amount of churn I am
already introducing I didn't want to play with tuning new mempools.

This also fixes a bug in pnfs_ld_handle_write_error. In the case of
desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing
the replay attempt to do nothing.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: merge _full and _partial read rpc_ops

Decouple nfs_pgio_header and nfs_read_data, and have (possibly
multiple) nfs_read_datas each take a refcount on nfs_pgio_header.

For the moment keeps nfs_read_header as a way to preallocate a single
nfs_read_data with the nfs_pgio_header. The code doesn't need this,
and would be prettier without, but given the amount of churn I am
already introducing I didn't want to play with tuning new mempools.

This also fixes a bug in pnfs_ld_handle_read_error. In the case of
desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing
the replay attempt to do nothing.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: create struct nfs_page_array

Both nfs_read_data and nfs_write_data contain several fields that
can be combined into a single shared struct.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: create common nfs_pgio_header for both read and write

In order to avoid duplicating all the data in nfs_read_data whenever we
split it up into multiple RPC calls (either due to a short read result
or due to rsize < PAGE_SIZE), we split out the bits that are the same
per RPC call into a separate "header" structure.

The goal this patch moves towards is to have a single header
refcounted by several rpc_data structures.  Thus, we want to always refer
from rpc_data to the header, and not the other way.  This patch comes
close to that ideal, but the directio code currently needs some
special casing, isolated in the nfs_direct_[read|write]hdr_release()
functions.  This will be dealt with in a future patch.
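
The ownership direction described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the names pgio_header, rpc_data, rpc_data_alloc, rpc_data_free and count_release are hypothetical, and a plain int stands in for the kernel's atomic reference counting.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the shared per-request header. */
struct pgio_header {
	int refcount;					/* one per attached rpc_data */
	void (*release)(struct pgio_header *hdr);	/* runs when the last reference drops */
};

/* Hypothetical per-RPC structure: it points at the header, never the reverse. */
struct rpc_data {
	struct pgio_header *header;
};

/* Attach a new rpc_data to the header, taking one reference on it. */
static struct rpc_data *rpc_data_alloc(struct pgio_header *hdr)
{
	struct rpc_data *data = malloc(sizeof(*data));

	if (!data)
		return NULL;
	hdr->refcount++;
	data->header = hdr;
	return data;
}

/* Drop the rpc_data and its reference; the last one releases the header. */
static void rpc_data_free(struct rpc_data *data)
{
	struct pgio_header *hdr = data->header;

	free(data);
	if (--hdr->refcount == 0)
		hdr->release(hdr);
}

/* Example release callback that only counts how often it ran. */
static int headers_released;

static void count_release(struct pgio_header *hdr)
{
	(void)hdr;
	headers_released++;
}
```

With two rpc_data structures attached to one header, the release callback runs only when the second one is freed, which is the behaviour the header split is meant to allow.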

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: use req_offset where appropriate

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: remove unnecessary casts of void pointers in nfs4filelayout.c

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: reverse arg order in nfs_initiate_[read|write]

Make it consistent with nfs_initiate_commit.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: dprintks in directio code were referencing task after put

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: add a struct nfs_commit_data to replace nfs_write_data in commits

Commits don't need the vectors of pages, etc. that writes do. Split out
a separate structure for the commit operation.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS4.1: Add lseg to struct nfs4_fl_commit_bucket

Also create a commit_info structure to hold the bucket array and push
it up from the lseg to the layout where it really belongs.

While we are at it, fix a refcounting bug due to an (incorrect)
implicit assumption that filelayout_scan_ds_commit_list always
completely emptied the src list.

This clarifies refcounting, removes the ugly find_only_write_lseg
functions, and pushes the file layout commit code along on the path to
supporting multiple lsegs.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS4.1: make pnfs_ld_[read|write]_done consistent

The two functions had diverged quite a bit, with the write function
being a bit more robust than the read.

However, these still break badly in the desc->pg_bsize < PAGE_CACHE_SIZE case,
as then there is nothing hanging on the data->pages list, and the resend
ends up doing nothing. This will be fixed in a patch later in the series.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: grab open context in direct read

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Remove unused function nfs_lookup_with_sec()

This fixes a compiler warning.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Honor the authflavor set in the clone mount data

The authflavor is set in an nfs_clone_mount structure and passed to the
xdev_mount() functions where it was promptly ignored. Instead, use it
to initialize an rpc_clnt for the cloned server.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Fix following referral mount points with different security

I create a new proc_lookup_mountpoint() to use when submounting an NFS
v4 share. This function returns an rpc_clnt to use for performing an
fs_locations() call on a referral's mountpoint.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Do secinfo as part of lookup

Whenever lookup sees wrongsec do a secinfo and retry the lookup to find
attributes of the file or directory, such as "is this a referral
mountpoint?". This also allows me to remove handling -NFS4ERR_WRONGSEC
as part of getattr xdr decoding.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Handle exceptions coming out of nfs4_proc_fs_locations()

We don't want to return -NFS4ERR_WRONGSEC to the VFS because it could
cause the kernel to oops.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Fix SECINFO_NO_NAME

I was using the same decoder function for SECINFO and SECINFO_NO_NAME,
so it was returning an error when it tried to decode an OP_SECINFO_NO_NAME
header as OP_SECINFO.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: traverse clients tree on PipeFS event

v2: recursion was replaced by loop

If a client is a clone, then its parent can not be in the list.
But the parent's PipeFS dentries have to be created and destroyed.

Note: an event skip helper for clients is introduced.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: set per-net PipeFS superblock before notification

There can be a case where, on a MOUNT event, an RPC client (after its dentries
were created) is no longer held by anyone except the notification callback.
I.e. on release this client will be destroyed, and its dentries have to be
destroyed as well, which in turn requires the per-net PipeFS superblock to be set.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: skip clients with program without PipeFS entries

1) This is sane.
2) Otherwise there will be soft lockup:

do {
  rpc_get_client_for_event (clnt->cl_dentry == NULL ==> choose)
  __rpc_pipefs_event (clnt->cl_program->pipe_dir_name == NULL ==> return)
} while (1)

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: skip dead but not buried clients on PipeFS events

These clients can't be safely dereferenced if their counter is 0.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

Avoid beyond bounds copy while caching ACL

When attempting to cache ACLs returned from the server, if the bitmap
size + the ACL size is greater than a PAGE_SIZE but the ACL size itself
is smaller than a PAGE_SIZE, we can read past the buffer page boundary.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

Avoid reading past buffer when calling GETACL

Bug noticed in commit
bf118a342f10dafe44b14451a1392c3254629a1f

When calling GETACL, if the size of the bitmap array, the length
attribute and the acl returned by the server is greater than the
allocated buffer(args.acl_len), we can Oops with a General Protection
fault at _copy_from_pages() when we attempt to read past the pages
allocated.

This patch allocates an extra PAGE for the bitmap and checks to see that
the bitmap + attribute_length + ACLs don't exceed the buffer space
allocated to it.
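
The kind of check described above can be sketched in userspace C. This is a minimal illustration, not the patch itself: acl_reply_fits and copy_acl are hypothetical names, and a flat buffer stands in for the kernel's page array.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 when the bitmap, the attribute-length
 * word, and the ACL body all fit in a reply buffer of buf_len bytes.
 * Written with subtractions so the sums cannot overflow size_t.
 */
static int acl_reply_fits(size_t bitmap_len, size_t attrlen_len,
			  size_t acl_len, size_t buf_len)
{
	if (bitmap_len > buf_len)
		return 0;
	buf_len -= bitmap_len;
	if (attrlen_len > buf_len)
		return 0;
	buf_len -= attrlen_len;
	return acl_len <= buf_len;
}

/*
 * Hypothetical copy routine: copies the ACL body out of the reply only
 * after the bounds check passes. Returns 0 on success, -1 if the reply
 * would run past the end of the buffer.
 */
static int copy_acl(char *dst, const char *reply, size_t reply_len,
		    size_t bitmap_len, size_t attrlen_len, size_t acl_len)
{
	if (!acl_reply_fits(bitmap_len, attrlen_len, acl_len, reply_len))
		return -1;
	memcpy(dst, reply + bitmap_len + attrlen_len, acl_len);
	return 0;
}
```

The point of the sketch is that the ACL length alone is not enough to validate: the leading bitmap and length word shift the ACL body further into the buffer, so all three have to be counted.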

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Jian Li <jiali@redhat.com>
[Trond: Fixed a size_t vs unsigned int printk() warning]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

fix page number calculation bug for block layout decode buffer

Signed-off-by: Jim Rees <rees@umich.edu>
Suggested-by: Andy Adamson <andros@netapp.com>
Suggested-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1 fix page number calculation bug for filelayout decode buffers

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

pnfs-obj: Remove unused variable from objlayout_get_deviceinfo()

Local variable 'sb' was not being used in objlayout_get_deviceinfo().

Signed-off-by: Sachin Bhamare <sbhamare@panasas.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs4: fix referrals on mounts that use IPv6 addrs

All referrals (IPv4 addr, IPv6 addr, and DNS) are broken on mounts of
IPv6 addresses, because validation code uses a path that is parsed
from the dev_name ("<server>:<path>") by splitting on the first colon and
colons are used in IPv6 addrs.
This patch ignores colons within IPv6 addresses that are escaped by '[' and ']'.
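
The parsing rule can be illustrated with a small userspace sketch. The function name devname_to_path is hypothetical and the code is not taken from the patch; it only shows how skipping to the closing bracket avoids splitting inside an IPv6 literal.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: given a device name of the form "<server>:<path>",
 * return a pointer to the path part. A server that starts with '[' is
 * treated as a bracketed IPv6 literal, so colons before the matching ']'
 * are skipped. Returns NULL when no separating colon is found.
 */
static const char *devname_to_path(const char *dev_name)
{
	const char *p = dev_name;

	if (*p == '[') {
		p = strchr(p, ']');
		if (!p)
			return NULL;	/* unterminated bracket */
	}
	p = strchr(p, ':');
	return p ? p + 1 : NULL;
}
```

For "[fe80::1]:/export" this yields "/export", whereas splitting on the first colon would have produced ":1]:/export".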

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

Merge tag 'nfs-for-3.4-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
- Fix NFSv4 infinite loops on open(O_TRUNC)
- Fix an Oops and an infinite loop in the NFSv4 flock code
- Don't register the PipeFS filesystem until it has been set up
- Fix an Oops in nfs_try_to_update_request
- Don't reuse NFSv4 open owners: fixes a bad sequence id storm.

* tag 'nfs-for-3.4-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Keep dropped state owners on the LRU list for a while
  NFSv4: Ensure that we don't drop a state owner more than once
  NFSv4: Ensure we do not reuse open owner names
  nfs: Enclose hostname in brackets when needed in nfs_do_root_mount
  NFS: put open context on error in nfs_flush_multi
  NFS: put open context on error in nfs_pagein_multi
  NFSv4: Fix open(O_TRUNC) and ftruncate() error handling
  NFSv4: Ensure that we check lock exclusive/shared type against open modes
  NFSv4: Ensure that the LOCK code sets exception->inode
  NFS: check for req==NULL in nfs_try_to_update_request cleanup
  SUNRPC: register PipeFS file system after pernet sybsystem

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from H. Peter Anvin.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x32, siginfo: Provide proper overrides for x32 siginfo_t
  asm-generic: Allow overriding clock_t and add attributes to siginfo_t
  x32: Check __ILP32__ instead of __LP64__ for x32
  x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C function from assembler
  ACPI: Convert wake_sleep_flags to a value instead of function
  x86, apic: APIC code touches invalid MSR on P5 class machines
  i387: ptrace breaks the lazy-fpu-restore logic
  x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id()
  x86, efi: Add dedicated EFI stub entry point
  x86/amd: Remove broken links from comment and kernel message
  x86, microcode: Ensure that module is only loaded on supported AMD CPUs
  x86, microcode: Fix sysfs warning during module unload on unsupported CPUs

Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86

Pull x86 platform driver fixes from Matthew Garrett:
"One annoyance fix (make intel_ips stop complaining unnecessarily) and
  one oops fix (unterminated list in dell-laptop).  Both have been in
  -next for a while with no complaints."

* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86:
  dell-laptop: Terminate quirks list properly
  intel_ips: Hush the i915 symbols message

mm: memcg: move pc lookup point to commit_charge()

None of the callsites actually need the page_cgroup descriptor
themselves, so just pass the page and do the look up in there.

We already had two bugs (6568d4a 'mm: memcg: update the correct soft
limit tree during migration' and 'memcg: fix Bad page state after
replace_page_cache') where the passed page and pc were not referring
to the same page frame.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: nobootmem: Correct alloc_bootmem semantics.

The comments above __alloc_bootmem_node() claim that the code will
first try the allocation using 'goal' and if that fails it will
try again but with the 'goal' requirement dropped.

Unfortunately, this is not what the code does, so fix it to do so.

This is important for nobootmem conversions to architectures such
as sparc where MAX_DMA_ADDRESS is infinity.

On such architectures all of the allocations done by generic spots,
such as the sparse-vmemmap implementation, will pass in:

__pa(MAX_DMA_ADDRESS)

as the goal, and with the limit given as "-1" this will always fail
unless we add the appropriate fallback logic here.
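
The fallback described above can be sketched with a toy allocator in userspace C. Everything here is an assumption made for illustration: ARENA_END, arena_next, toy_alloc_from and toy_alloc_with_goal are hypothetical and do not correspond to the nobootmem code.

```c
/*
 * Hypothetical toy allocator over the address range [0, ARENA_END):
 * hands out the lowest free address that is >= goal and whose end
 * stays within the arena, or 0 on failure. It is a bump allocator, so
 * nothing is ever freed.
 */
#define ARENA_END 0x10000UL

static unsigned long arena_next = 0x1000UL;	/* first free address */

static unsigned long toy_alloc_from(unsigned long size, unsigned long goal)
{
	unsigned long start = arena_next > goal ? arena_next : goal;

	if (start >= ARENA_END || size > ARENA_END - start)
		return 0;
	arena_next = start + size;
	return start;
}

/* Try at or above the goal first; on failure retry with the goal dropped. */
static unsigned long toy_alloc_with_goal(unsigned long size, unsigned long goal)
{
	unsigned long addr = toy_alloc_from(size, goal);

	if (!addr && goal)
		addr = toy_alloc_from(size, 0);
	return addr;
}
```

A goal far above the arena makes the first attempt fail, and the retry without a goal then succeeds. That mirrors the case of an effectively infinite MAX_DMA_ADDRESS.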

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes

Pull gfs2 fixes from Steven Whitehouse.

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
  GFS2: Instruct DLM to avoid queue convert slowdown

Merge tag 'hsi_fixes_for_3.4' of git://gitorious.org/kernel-hsi/kernel-hsi

Pull HSI fixes and ABI documentation from Carlos Chinea

* tag 'hsi_fixes_for_3.4' of git://gitorious.org/kernel-hsi/kernel-hsi:
  HSI: Add HSI ABI documentation
  HSI: hsi_char: Remove max_data_size from sysfs
  HSI: hsi: Rework hsi_event interface
  HSI: hsi: Remove controllers and ports from the bus
  HSI: hsi: Fix error path cleanup on client registration
  HSI: hsi: Rework hsi_controller release

GFS2: Instruct DLM to avoid queue convert slowdown

This patch instructs DLM to prevent an "in place" conversion, where the
lock just stays on the granted queue, and instead forces the conversion to
the back of the convert queue. This is done on upward conversions only.

This is useful in cases where, for example, a lock is frequently needed in
PR on one node, but another node needs it temporarily in EX to update it.
This may happen, for example, when the rindex is being updated by gfs2_grow.
The gfs2_grow needs to have the lock in EX, but the other nodes need to
re-read it to retrieve the updates. The glock is already granted in PR on
the non-growing nodes, so this prevents them from continually re-granting
the lock in PR, and forces the EX from gfs2_grow to go through.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bug fixes from Ted Ts'o:
"These are two low-risk bug fixes for ext4, fixing a compile warning
  and a potential deadlock."

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  super.c: unused variable warning without CONFIG_QUOTA
  jbd2: use GFP_NOFS for blkdev_issue_flush

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel

Pull Hexagon fixes from Richard Kuo:
"It's mostly compile fixes and the Hexagon portion of a CPU hotplug
  patch set."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel:
  hexagon: add missing cpu.h include
  hexagon/CPU hotplug: Add missing call to notify_cpu_starting()
  hexagon:  use renamed tick_nohz_idle_* functions
  Hexagon: misc compile warning/error cleanup due to missing headers

Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull build system failure fix from Michal Marek:
"This fixes build failure with newer gcc that adds some internal
  symbols that end in "__mod_*_device_table", but are not actually the
  tables themselves."

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  Fix modpost failures in fedora 17

super.c: unused variable warning without CONFIG_QUOTA

sb info is only checked with quota support.

fs/ext4/super.c: In function ‘parse_options’:
fs/ext4/super.c:1600:23: warning: unused variable ‘sbi’ [-Wunused-variable]

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

jbd2: use GFP_NOFS for blkdev_issue_flush

The flush request is issued in the transaction commit code path, so using
GFP_KERNEL to allocate memory for the flush request bio looks like it falls
into the classic deadlock issue. I saw btrfs and dm get it right, but ext4,
xfs and md are using GFP_KERNEL.

Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org

Merge tag 'md-3.4-fixes' of git://neil.brown.name/md

Pull a few more md bug fixes from NeilBrown:
"2 are tagged for -stable, one being for a fairly serious bug that can
  corrupt metadata and make it hard to recover an array.  The other is
  for a more recent regression since 3.3"

* tag 'md-3.4-fixes' of git://neil.brown.name/md:
  md: fix possible corruption of array metadata on shutdown.
  md: don't call ->add_disk unless there is good reason.
  DM RAID: Use safe version of rdev_for_each

Merge tag 'dlm-fixes-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm fixes from David Teigland:
"This includes one short patch fixing the behavior of the QUECVT flag,
which the gfs2 folks are waiting on."

* tag 'dlm-fixes-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: fix QUECVT when convert queue is empty

mm: fix s390 BUG by __set_page_dirty_no_writeback on swap

Mel reports a BUG_ON(slot == NULL) in radix_tree_tag_set() on s390
3.0.13: called from __set_page_dirty_nobuffers() when page_remove_rmap()
tries to transfer dirty flag from s390 storage key to struct page and
radix_tree.

That would be because of reclaim's shrink_page_list() calling
add_to_swap() on this page at the same time: first PageSwapCache is set
(causing page_mapping(page) to appear as &swapper_space), then
page->private set, then tree_lock taken, then page inserted into
radix_tree - so there's an interval before taking the lock when the
radix_tree slot is empty.

We could fix this by moving __add_to_swap_cache()'s spin_lock_irq up
before the SetPageSwapCache. But a better fix is simply to do what's
five years overdue: Ken Chen introduced __set_page_dirty_no_writeback()
(if !PageDirty TestSetPageDirty) for tmpfs to skip all the radix_tree
overhead, and swap is just the same - it ignores the radix_tree tag, and
does not participate in dirty page accounting, so should be using
__set_page_dirty_no_writeback() too.
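
The "(if !PageDirty TestSetPageDirty)" shorthand can be modelled in userspace with C11 atomics. This is a sketch under assumptions: toy_page and toy_set_page_dirty_no_writeback are hypothetical names, and only the dirty bit is modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the dirty bit is modelled. */
struct toy_page {
	atomic_bool dirty;
};

/*
 * Mark the page dirty without touching any tree tags or accounting.
 * Returns 1 if this call changed the page from clean to dirty, else 0.
 * The cheap load skips the atomic exchange when the page is already dirty.
 */
static int toy_set_page_dirty_no_writeback(struct toy_page *page)
{
	if (!atomic_load(&page->dirty))
		return !atomic_exchange(&page->dirty, true);
	return 0;
}
```

Because nothing but the flag is touched, there is no window in which a lookup structure is expected to hold an entry that has not been inserted yet.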

s390 testing now confirms that this does indeed fix the problem.

Reported-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ken Chen <kenchen@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

x32, siginfo: Provide proper overrides for x32 siginfo_t

Provide the proper override macros for x32 siginfo_t. The combination
of a special type here and an overall alignment constraint actually
ends up with all the types being properly aligned, but the hack is
needed to keep the substructures inside siginfo_t from adding padding.

Note: use __attribute__((aligned())) since __aligned() is not exported
to user space.

[ v2: fix stray semicolon ]

Reported-by: H.J. Lu <hjl.tools@gmail.com>
Cc: Bruce J. Beare <bruce.j.beare@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/CAMe9rOqF6Kh6-NK7oP0Fpzkd4SBAWU%2BG53hwBbSD4iA2UzyxuA@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

md: fix possible corruption of array metadata on shutdown.

commit c744a65c1e2d59acc54333ce8
md: don't set md arrays to readonly on shutdown.

removed the possibility of a 'BUG' when data is written to an array
that has just been switched to read-only, but also introduced the
possibility that the array metadata could be corrupted.

If, when md_notify_reboot gets the mddev lock, the array is
in a state where it is assembled but hasn't been started (as can
happen if the personality module is not available, or in other unusual
situations), then incorrect metadata will be written out making it
impossible to re-assemble the array.

So only call __md_stop_writes() if the array has actually been
activated.

This patch is needed for any stable kernel which has had the above
commit applied.

Cc: stable@vger.kernel.org
Reported-by: Christoph Nelles <evilazrael@evilazrael.de>
Signed-off-by: NeilBrown <neilb@suse.de>

md: don't call ->add_disk unless there is good reason.

Commit 7bfec5f35c68121e7b18

md/raid5: If there is a spare and a want_replacement device, start replacement.

caused md_check_recovery to call ->add_disk much more often.
Instead of only when the array is degraded, it is now called whenever
md_check_recovery finds anything useful to do, which includes
updating the metadata for clean<->dirty transition.
This causes unnecessary work, and causes info messages from ->add_disk
to be reported much too often.

So refine md_check_recovery to only do any actual recovery checking
(including ->add_disk) if MD_RECOVERY_NEEDED is set.

This fix is suitable for 3.3.y:

Cc: stable@vger.kernel.org
Reported-by: Jan Ceuleers <jan.ceuleers@computer.org>
Signed-off-by: NeilBrown <neilb@suse.de>

DM RAID: Use safe version of rdev_for_each

Fix segfault caused by using rdev_for_each instead of rdev_for_each_safe

Commit dafb20fa34320a472deb7442f25a0c086e0feb33 mistakenly replaced a safe
iterator with an unsafe one when making some macro changes.
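
Why the safe variant matters can be shown with a plain linked list in userspace C. The node type and the remove_matching and push helpers are hypothetical. The kernel macros operate on its own list type, which this sketch does not reproduce.

```c
#include <stdlib.h>

/* Hypothetical singly linked list node standing in for an rdev entry. */
struct node {
	int value;
	struct node *next;
};

/*
 * "Safe" traversal: the next pointer is saved before the body runs, so
 * the body may unlink and free the current node. An unsafe loop would
 * read cur->next after the free.
 */
static int remove_matching(struct node **head, int value)
{
	struct node **link = head;
	struct node *cur, *tmp;
	int removed = 0;

	for (cur = *head; cur; cur = tmp) {
		tmp = cur->next;		/* fetch before cur can be freed */
		if (cur->value == value) {
			*link = tmp;		/* unlink */
			free(cur);
			removed++;
		} else {
			link = &cur->next;
		}
	}
	return removed;
}

/* Helper for building a list by pushing at the head. */
static struct node *push(struct node *head, int value)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->value = value;
	n->next = head;
	return n;
}
```

The temporary pointer plays the same role as the extra cursor argument that safe list iterators take.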

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>

asm-generic: Allow overriding clock_t and add attributes to siginfo_t

For the particular issue of x32, which shares code with i386 in the
handling of compat_siginfo_t, the use of a 64-bit clock_t bumps the
sigchld structure out of alignment, which triggers a messy cascade of
padding.

This was already handled on the kernel compat side, but it needs
handling on the user space side, which uses the generic header. To
make that possible:

1. Allow __kernel_clock_t to be overridden in struct siginfo;
2. Allow there to be attributes added to struct siginfo.

Reported-by: H.J. Lu <hjl.tools@gmail.com>
Cc: Bruce J. Beare <bruce.j.beare@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/CAMe9rOqF6Kh6-NK7oP0Fpzkd4SBAWU%2BG53hwBbSD4iA2UzyxuA@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

x32: Check __ILP32__ instead of __LP64__ for x32

Checking __LP64__ isn't a reliable way to tell if we are compiling for x32
since __LP64__ isn't specified by the x86-64 psABI.  Not all x86-64
compilers define __LP64__, which was added in GCC 3.3. The updated x32
psABI:

https://sites.google.com/site/x32abi/documents

defines _ILP32 and __ILP32__ for x32.  GCC trunk and the 4.7 branch have
been updated to define _ILP32 and __ILP32__ for x32.  This patch
replaces the __LP64__ check with __ILP32__.
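
A preprocessor test along these lines might look as follows. The macro BUILDING_FOR_X32 and the helper data_model_name are hypothetical names used only for this sketch; the actual header change may be structured differently.

```c
/*
 * x32 is the x86-64 instruction set with 32-bit int, long, and pointers.
 * Testing for the absence of __LP64__ is unreliable, so test __ILP32__,
 * which the x32 psABI defines.
 */
#if defined(__x86_64__) && defined(__ILP32__)
# define BUILDING_FOR_X32 1
#else
# define BUILDING_FOR_X32 0
#endif

/* Hypothetical helper that reports what the preprocessor test detected. */
static const char *data_model_name(void)
{
	if (BUILDING_FOR_X32)
		return "x32";
	return sizeof(void *) == 8 ? "64-bit pointers" : "32-bit pointers, not x32";
}
```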

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C function from assembler

With commit a2ef5c4fd44ce3922435139393b89f2cce47f576
"ACPI: Move module parameter gts and bfs to sleep.c" the
wake_sleep_flags is required when calling acpi_enter_sleep_state.

The assembler code in wakeup_*.S did not do that. One solution
is to call it from assembler and stick the wake_sleep_flags on
the stack (for 32-bit) or in %esi (for 64-bit). hpa and rafael
both suggested however to create a wrapper function to call
acpi_enter_sleep_state and call said wrapper function
("acpi_enter_s3") from assembler.

For 32-bit, the acpi_enter_s3 ends up looking like so:

  push   %ebp
  mov    %esp,%ebp
  sub    $0x8,%esp
  movzbl 0xc1809314,%eax [wake_sleep_flags]
  movl   $0x3,(%esp)
  mov    %eax,0x4(%esp)
  call   0xc12d1fa0 <acpi_enter_sleep_state>
  leave
  ret

And 64-bit:

  movzbl 0x9afde1(%rip),%esi        [wake_sleep_flags]
  push   %rbp
  mov    $0x3,%edi
  mov    %rsp,%rbp
  callq  0xffffffff812e9800 <acpi_enter_sleep_state>
  leaveq
  retq

Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
[v2: Remove extra assembler operations, per hpa review]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1335150198-21899-3-git-send-email-konrad.wilk@oracle.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

ACPI: Convert wake_sleep_flags to a value instead of function

With commit a2ef5c4fd44ce3922435139393b89f2cce47f576
"ACPI: Move module parameter gts and bfs to sleep.c" the wake_sleep_flags
is required when calling acpi_enter_sleep_state, which means
that if there are functions outside the sleep.c code they
can't get the wake_sleep_flags values.

This converts the function into an exported value and converts
the module config operands to a function.

Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Lin Ming <ming.m.lin@intel.com>
[v2: Parameters can be turned on/off dynamically]
[v3: unsigned char -> u8]
[v4: val -> kp->arg]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1335150198-21899-2-git-send-email-konrad.wilk@oracle.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

hexagon: add missing cpu.h include

Signed-off-by: Richard Kuo <rkuo@codeaurora.org>

hexagon/CPU hotplug: Add missing call to notify_cpu_starting()

The scheduler depends on receiving the CPU_STARTING notification, without
which we end up in a lot of trouble. So add the missing call to
notify_cpu_starting() in the bringup code.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>

hexagon: use renamed tick_nohz_idle_* functions

Signed-off-by: Richard Kuo <rkuo@codeaurora.org>

Hexagon: misc compile warning/error cleanup due to missing headers

Fixed warnings/errors for EXPORT_SYMBOL, linux_binprm, elf related
defines

Signed-off-by: Richard Kuo <rkuo@codeaurora.org>

dlm: fix QUECVT when convert queue is empty

The QUECVT flag should not prevent conversions from
being granted immediately when the convert queue is
empty.

Signed-off-by: David Teigland <teigland@redhat.com>

HSI: Add HSI ABI documentation

Adds sysfs HSI framework documentation

Signed-off-by: Carlos Chinea <carlos.chinea@nokia.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>

HSI: hsi_char: Remove max_data_size from sysfs

Remove max_data_size sysfs entry. Otherwise it is possible
to have a buffer overrun if its value is increased after
the device is open.

Signed-off-by: Carlos Chinea <carlos.chinea@nokia.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>

HSI: hsi: Rework hsi_event interface

Remove custom hack and make use of the notifier chain interfaces for
delivering events from the ports to their associated clients.
Clients that want to receive port events need to register their callbacks
using hsi_register_port_event(). The callbacks can be called in interrupt
context. Use hsi_unregister_port_event() to undo the registration.

Signed-off-by: Carlos Chinea <carlos.chinea@nokia.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
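
The register/deliver/unregister pattern described above can be sketched in
userspace C. This is a generic stand-in for a notifier chain, not the HSI
or kernel notifier API: every type and function name below is hypothetical.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for a notifier chain. */
struct port_notifier {
    int (*call)(struct port_notifier *nb, unsigned long event);
    struct port_notifier *next;
};

struct port {
    struct port_notifier *head;   /* registered client callbacks */
};

static void port_register_event(struct port *p, struct port_notifier *nb)
{
    nb->next = p->head;
    p->head = nb;
}

static void port_unregister_event(struct port *p, struct port_notifier *nb)
{
    struct port_notifier **pp;

    for (pp = &p->head; *pp; pp = &(*pp)->next) {
        if (*pp == nb) {
            *pp = nb->next;       /* unlink the entry from the chain */
            nb->next = NULL;
            return;
        }
    }
}

/* Deliver an event to every registered client; returns callbacks run. */
static int port_event(struct port *p, unsigned long event)
{
    struct port_notifier *nb;
    int n = 0;

    for (nb = p->head; nb; nb = nb->next) {
        nb->call(nb, event);
        n++;
    }
    return n;
}

/* Example client callback: remembers the last event it saw. */
static unsigned long last_event;

static int record_event(struct port_notifier *nb, unsigned long event)
{
    (void)nb;
    last_event = event;
    return 0;
}
```

A client registers once, receives every port event while registered, and
stops receiving them after it unregisters.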

HSI: hsi: Remove controllers and ports from the bus

HSI controllers and ports do not belong to the HSI bus.
Those devices are not supposed to have a driver attached to them.

Signed-off-by: Carlos Chinea <carlos.chinea@nokia.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>

HSI: hsi: Fix error path cleanup on client registration

HSI client structure should be freed on error path after
calling device_registration by dropping a reference to it.

Signed-off-by: Carlos Chinea <carlos.chinea@nokia.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>

HSI: hsi: Rework hsi_controller release

Use the proper release mechanism for hsi_controller and
hsi_ports structures. Free the structures through their
associated device release callbacks.

Signed-off-by: Carlos Chinea <carlos.chinea@nokia.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>

Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM fixes from Russell King:
"Here's my usual Sunday push, just for one revert which PeterZ hollered
  about after last week's push.  Other than that, all seems strangely
  quiet as far as fixes go in non-platform ARM land at the moment."

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  Revert "ARM: 7359/2: smp_twd: Only wait for reprogramming on active cpus"

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull powerpc fixes from Benjamin Herrenschmidt:
"Here are a few fixes for powerpc.  Note the addition to the generic
  irq.h.  This is part of a 3-patches regression fix for mpic due to
  changes in how IRQ_TYPE_NONE is being handled.  Thomas agreed to the
  addition of the new IRQ_TYPE_DEFAULT constant, however he hasn't
  replied with an Ack to the actual patch yet.  I don't want to wait
  much longer with these patches tho."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/mpic: Properly set default triggers
  irq: Add IRQ_TYPE_DEFAULT for use by PIC drivers
  powerpc/mpic: Fix confusion between hw_irq and virq
  powerpc/pmac: Don't add_timer() twice
  powerpc/eeh: Fix crash caused by null eeh_dev
  powerpc/mpc85xx: add MPIC message dts node
  powerpc/mpic_msgr: fix offset error when setting mer register
  powerpc/mpic_msgr: add lock for MPIC message global variable
  powerpc/mpic_msgr: fix compile error when SMP disabled
  powerpc: fix build when CONFIG_BOOKE_WDT is enabled
  powerpc/85xx: don't call of_platform_bus_probe() twice

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix namespace init and cleanup in phonet to fix some oopses, from
    Eric W. Biederman.

2) Missing kfree_skb() in AF_KEY, from Julia Lawall.

3) Refcount leak and source address handling fix in l2tp from James
    Chapman.

4) Memory leak fix in CAIF from Tomasz Gregorek.

5) When routes are cloned from ipv6 addrconf routes, we don't process
    expirations properly.  Fix from Gao Feng.

6) Fix panic on DMA errors in atl1 driver, from Tony Zelenoff.

7) Only enable interrupts in 8139cp driver after we've registered the
    IRQ handler.  From Jason Wang.

8) Fix too many reads of KS_CIDER register in ks8851 during probe,
    fixing crashes on spurious interrupts.  From Matt Renzelmann.

9) Missing include in ath5k driver and missing iounmap on probe
    failure, from Jonathan Bither.

10) Fix RX packet handling in smsc911x driver, from Will Deacon.

11) Fix ixgbe WoL on fiber by leaving the laser on during shutdown.

12) ks8851 needs MAX_RECV_FRAMES increased otherwise the internal MAC
    buffers are easily overflown.  Fix from Davide Cimingahi.

13) Fix memory leaks in peak_usb CAN driver, from Jesper Juhl.

14) gred packet scheduler can dump in WRED mode when doing a netlink
    dump.  Fix from David Ward.

15) Fix MTU in USB smsc75xx driver, from Stephane Fillod.

16) Dummy device needs ->ndo_uninit handler to properly handle
    ->ndo_init failures.  From Hiroaki SHIMODA.

17) Fix TX fragmentation in ath9k driver, from Sujith Manoharan.

18) Missing RTNL lock in ixgbe PM resume, from Benjamin Poirier.

19) Missing iounmap in farsync WAN driver, from Julia Lawall.

20) With LRO/GRO, tcp_grow_window() is easily tricked into not growing
    the receive window properly, and this hurts performance.  Fix from
    Eric Dumazet.

21) Network namespace init failure can leak net_generic data, fix from
    Julian Anastasov.

22) Fix skb_over_panic due to mis-accounting in TCP for partially ACK'd
    SKBs.  From Eric Dumazet.

23) New IDs for qmi_wwan driver, from Bjørn Mork.

24) Fix races in ax25_exit(), from Eric W. Biederman.

25) IPV6 TCP doesn't handle TCP_MAXSEG socket option properly, copy over
    logic from the IPV4 side.  From Neal Cardwell.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits)
  tcp: fix TCP_MAXSEG for established IPv6 passive sockets
  drivers/net: Do not free an IRQ if its request failed
  drop_monitor: allow more events per second
  ks8851: Fix request_irq/free_irq mismatch
  net/hyperv: Adding cancellation to ensure rndis filter is closed
  ks8851: Fix mutex deadlock in ks8851_net_stop()
  net ax25: Reorder ax25_exit to remove races.
  icplus: fix interrupt for IC+ 101A/G and 1001LF
  net: qmi_wwan: support Sierra Wireless MC77xx devices in QMI mode
  bnx2x: off by one in bnx2x_ets_e3b0_sp_pri_to_cos_set()
  ksz884x: don't copy too much in netdev_set_mac_address()
  tcp: fix retransmit of partially acked frames
  netns: do not leak net_generic data on failed init
  net/sock.h: fix sk_peek_off kernel-doc warning
  tcp: fix tcp_grow_window() for large incoming frames
  drivers/net/wan/farsync.c: add missing iounmap
  davinci_mdio: Fix MDIO timeout check
  ipv6: clean up rt6_clean_expires
  ipv6: fix rt6_update_expires
  arcnet: rimi: Fix device name in debug output
  ...

powerpc/mpic: Properly set default triggers

This gets rid of the unused default senses array, and replaces the
incorrect use of IRQ_TYPE_NONE with the new IRQ_TYPE_DEFAULT for
the initial set_trigger() call when mapping an interrupt.

This in turn makes us read the HW state and update the irq desc
accordingly.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

irq: Add IRQ_TYPE_DEFAULT for use by PIC drivers

This is meant typically to allow a PIC driver's irq domain map() callback
to establish sane defaults for the interrupt (and make sure that the HW
and the irq_desc are in sync as far as the trigger is concerned).

The irq core may not call the set_trigger callback if it thinks the
trigger is already set to the right setting, so we need to ensure new
descriptors are properly synchronized with the hardware.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mpic: Fix confusion between hw_irq and virq

mpic_is_ipi() takes a virq and immediately converts it to a hw_irq.

However, one of the two call sites calls it with a ... hw_irq. The
other call site also happens to have the hw_irq at hand, so let's
change it to just take that as an argument. Also change mpic_is_tm()
for consistency.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/pmac: Don't add_timer() twice

If the interrupt and the timeout happen roughly at the same
time, we can get into a situation where the timer function
is run while the interrupt has already been processed. In
this case, the timer function might end up doing an add_timer
on an already pending timer, causing a BUG_ON() to trigger.

Instead, just skip the whole timeout operation if we see that
the timer is pending. The spinlock ensures that the only way
that happens is if we already started a new operation and thus
the timeout can be ignored.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
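
The check described above can be sketched in userspace C. This is an
illustration only: struct sketch_host and handle_timeout() are
hypothetical, and the timer_pending field stands in for the kernel's
timer_pending() test.

```c
#include <stdbool.h>

struct sketch_host {
    bool timer_pending;     /* stand-in for timer_pending(&timer) */
    int  timeouts_handled;  /* how many timeouts were actually processed */
};

/* Timeout handler body; the caller is assumed to hold the host lock. */
static bool handle_timeout(struct sketch_host *h)
{
    /*
     * A pending timer means a new operation already re-armed it,
     * so this timeout is stale and is skipped entirely.
     */
    if (h->timer_pending)
        return false;

    h->timeouts_handled++;
    return true;
}
```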

powerpc/eeh: Fix crash caused by null eeh_dev

The problem was reported by Anton Blanchard. When an EEH error
hit a PCI device that had no corresponding device driver, a
kernel crash was seen. I reproduced the problem on a Firebird-L
machine with the "errinjct" utility: the device driver for the
Emulex ethernet MAC was disabled in .config, and a data parity
error was forced on the Emulex ethernet MAC with "errinjct".
The kernel then crashed after issuing a couple of "lspci -v"
commands.

The root cause is that the PCI device, including the
reference to the corresponding eeh device, is removed
from the system while EEH does recovery. Afterwards, the
PCI device is probed again and added back into the system.
So it's not safe to retrieve the eeh device from the
corresponding PCI device after the PCI device has been removed
and not yet added again.

The patch fixes the issue by retrieving the eeh device from the OF
node instead of the PCI device after the PCI device has been removed.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Merge remote-tracking branch 'kumar/merge' into merge

tcp: fix TCP_MAXSEG for established IPv6 passive sockets

Commit f5fff5d forgot to fix TCP_MAXSEG behavior for IPv6 sockets, so IPv6
TCP server sockets that used TCP_MAXSEG would find that the advmss of
child sockets would be incorrect. This commit mirrors the advmss logic
from tcp_v4_syn_recv_sock in tcp_v6_syn_recv_sock. Eventually this
logic should probably be shared between IPv4 and IPv6, but this at
least fixes this issue.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
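
The commit message does not spell out the advmss logic. A minimal sketch of
the clamping it presumably refers to, under the assumption that a
user-supplied TCP_MAXSEG value only ever lowers the route-derived value,
could look like this. child_advmss() and its parameter names are
hypothetical.

```c
/*
 * Pick the advertised MSS for a child socket.
 * route_advmss: value derived from the route
 * user_mss:     value set via TCP_MAXSEG on the listener, 0 if unset
 */
static unsigned int child_advmss(unsigned int route_advmss,
                                 unsigned int user_mss)
{
    unsigned int advmss = route_advmss;

    /* Assumption: the user value can only reduce the advertised MSS. */
    if (user_mss && user_mss < advmss)
        advmss = user_mss;
    return advmss;
}
```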

Linux 3.4-rc4

drivers/net: Do not free an IRQ if its request failed

Refrain from attempting to free an interrupt line if the request
failed, since in that case there is no IRQ to free.

CC: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
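
The error-path rule above can be sketched in userspace C. The helpers
below are hypothetical stand-ins that only count calls; they are not
request_irq()/free_irq() and do not touch any hardware.

```c
#include <stdbool.h>

static int irqs_held;   /* number of stand-in IRQs currently acquired */

static int sketch_request_irq(bool succeed)
{
    if (!succeed)
        return -1;
    irqs_held++;
    return 0;
}

static void sketch_free_irq(void)
{
    irqs_held--;
}

/* Open path: the IRQ is released only on errors after it was acquired. */
static int sketch_open(bool irq_ok, bool later_step_ok)
{
    int err = sketch_request_irq(irq_ok);

    if (err)
        return err;          /* nothing was acquired, nothing to free */

    if (!later_step_ok) {
        sketch_free_irq();   /* undo the successful request */
        return -1;
    }
    return 0;
}
```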

sparc32,leon: add notify_cpu_starting()

Otherwise cpu_active_mask will not be set, which leads to other issues.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Signed-off-by: Konrad Eisele <konrad@gaisler.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drop_monitor: allow more events per second

It seems there is a logic error in trace_drop_common(), since we store
only 64 drops, even if they are from the same location.

This fix is a one liner, but we probably need more work to avoid useless
atomic dec/inc.

Now I can watch 1 Mpps drops through dropwatch...

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ks8851: Fix request_irq/free_irq mismatch

The dev_id parameter passed to free_irq needs to match the one passed
to the corresponding request_irq.

Signed-off-by: Matt Renzelmann <mjr@cs.wisc.edu>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
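
Why the dev_id must match can be sketched with a small userspace table.
This is only an illustration of lookup by (irq, dev_id) pair: the
sketch_request() and sketch_free() helpers are hypothetical and are not
the kernel's request_irq()/free_irq().

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_ACTIONS 4

struct sketch_action {
    int   irq;
    void *dev_id;
    bool  used;
};

static struct sketch_action actions[SKETCH_MAX_ACTIONS];

/* Record a handler for (irq, dev_id); fails when the table is full. */
static bool sketch_request(int irq, void *dev_id)
{
    for (size_t i = 0; i < SKETCH_MAX_ACTIONS; i++) {
        if (!actions[i].used) {
            actions[i] = (struct sketch_action){ irq, dev_id, true };
            return true;
        }
    }
    return false;
}

/* Remove a handler only when both irq and dev_id match. */
static bool sketch_free(int irq, void *dev_id)
{
    for (size_t i = 0; i < SKETCH_MAX_ACTIONS; i++) {
        if (actions[i].used && actions[i].irq == irq &&
            actions[i].dev_id == dev_id) {
            actions[i].used = false;
            return true;
        }
    }
    return false;
}
```

Freeing with a different pointer than the one used at request time finds
no entry, so the original handler stays registered.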

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull "ARM: SoC fixes" from Olof Johansson:
* at91, ux500, imx, omap and bcmring:
  - at91 fixes for =m driver build issues, irqdomain fixes and config
    dependency fixes
  - ux500 kconfig dependency fixes and a smp wakeup bugfix
  - imx idle bugfix and build fix due to irq domain changes
  - omap uart pinmux fixes, softreset regression revert and misc fixes
  - bcmring build error regression fix

* ux500 and imx had some small defconfig updates in this branch

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (27 commits)
  ARM: bcmring: fix UART declarations
  ARM: imx: Fix imx5 idle logic bug
  ARM: imx27-dt: Fix build due to removal of irq_domain_add_simple()
  ARM: imx_v4_v5_defconfig: Add support for CONFIG_REGULATOR_FIXED_VOLTAGE
  ARM: OMAP1: DMTIMER: fix broken timer clock source selection
  ARM: OMAP: serial: Fix the ocp smart idlemode handling bug
  ARM: OMAP2+: UART: Fix incorrect population of default uart pads
  ARM: OMAP: sram: fix BUG in dpll code for !PM case
  dmaengine: Kconfig: fix Atmel at_hdmac entry
  USB: gadget/at91_udc: add gpio_to_irq() function to vbus interrupt
  USB: ohci-at91: change annotations for probe/remove functions
  leds-atmel-pwm.c: Make pwmled_probe() __devinit
  ARM: at91: fix at91sam9261ek Ethernet dm9000 irq
  ARM: at91: fix rm9200ek flash size
  ARM: at91: remove empty at91_init_serial function
  ARM: at91: fix typo in at91_pmc_base assembly declaration
  ARM: at91: Export at91_matrix_base
  ARM: at91: Export at91_pmc_base
  ARM: at91: Export at91_ramc_base
  ARM: at91: Export at91_st_base
  ...

Merge tag 'mmc-fixes-for-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC fixes from Chris Ball:
- Build fix for omap_hsmmc with OF against 3.4-rc1.
- Fix CONFIG_MMC_UNSAFE_RESUME semantics regression against 3.3, which
   broke hotplug card detection when UNSAFE_RESUME is set.
- Fix a race condition in omap_hsmmc with runtime PM.
- Fix two libertas SDIO-powered-resume regressions.
- Small fixes for discard/sanitize, dw_mmc, cd-gpio and esdhc-imx.

* tag 'mmc-fixes-for-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: core: Do not pre-claim host in suspend
  mmc: dw_mmc: prevent NULL dereference for dma_ops
  mmc: unbreak sdhci-esdhc-imx on i.MX25
  mmc: cd-gpio: Include header to pickup exported symbol prototypes
  mmc: sdhci: refine non-removable card checking for card detection
  mmc: dw_mmc: Fix switch from DMA to PIO
  mmc: remove MMC bus legacy suspend/resume method
  mmc: omap_hsmmc: Get rid of of_have_populated_dt() usage
  mmc: omap_hsmmc: build fix for CONFIG_OF=y and CONFIG_MMC_OMAP_HS=m
  mmc: fixes for eMMC v4.5 sanitize operation
  mmc: fixes for eMMC v4.5 discard operation

Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
- Fixes a regression at DVB core when switching from DVB-S2 to DVB-S on
   Kaffeine (Fedora 16 Bugzilla #812895);
- Fixes a mutex unlock at an error condition at drx-k;
- Fix winbond-cir set mode;
- mt9m032: Fix a compilation breakage with some random Kconfig;
- mt9m032: fix two deadlocks;
- xc5000: don't require a special firmware (that won't be provided by
   the vendor) just because the xtal frequency is different;
- V4L DocBook: fix some typos at multi-plane formats description.

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] xc5000: support 32MHz & 31.875MHz xtal using the 41.024.5 firmware
  [media] V4L: mt9m032: fix compilation breakage
  [media] V4L: DocBook: Fix typos in the multi-plane formats description
  [media] V4L: mt9m032: fix two dead-locks
  [media] rc-core: set mode for winbond-cir
  [media] drxk: Does not unlock mutex if sanity check failed in scu_command()
  [media] dvb_frontend: Fix a regression when switching back to DVB-S

Merge tag 'mfd-for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6

Pull MFD fixes from Samuel Ortiz:
"We have 3 build fixes, a OMAP USB host PHY reset fix and the twl6040
  conversion to an i2c driver.  The latter may not sound like a fix but
  the twl6040 MFD driver won't probe without it, triggering an OMAP4
  audio regression."

* tag 'mfd-for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
  mfd: Fix modular builds of rc5t583 regulator support
  mfd: Fix asic3_gpio_to_irq
  ARM: OMAP3: USB: Fix the EHCI ULPI PHY reset issue
  mfd: Convert twl6040 to i2c driver, and separate it from twl core
  mfd : Fix dbx500 compilation error

net/hyperv: Adding cancellation to ensure rndis filter is closed

Although the network interface is down, the RX packet count
reported by ifconfig may keep on increasing.

This is because the WORK scheduled in netvsc_set_multicast_list()
may be executed after netvsc_close(). That means the rndis filter
may be re-enabled by do_set_multicast() even if it was closed by
netvsc_close().

Canceling any pending WORK before closing the rndis filter
prevents the issue from happening.

Signed-off-by: Wenqi Ma <wenqi_ma@trendmicro.com.cn>
Reviewed-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

ks8851: Fix mutex deadlock in ks8851_net_stop()

There is a potential deadlock scenario when the ks8851 driver
is removed. The interrupt handler schedules a workqueue which
acquires a mutex that ks8851_net_stop() also acquires before
flushing the workqueue. Previously lockdep wouldn't be able
to find this problem but now that it has the support we can
trigger this lockdep warning by rmmoding the driver after
an ifconfig up.

Fix the possible deadlock by disabling the interrupts in
the chip and then releasing the lock across the workqueue
flushing. The mutex is only there to protect the registers
anyway so this should be ok.

=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.21-00021-g8b33780-dirty #2911
-------------------------------------------------------
rmmod/125 is trying to acquire lock:
((&ks->irq_work)){+.+...}, at: [<c019e0b8>] flush_work+0x0/0xac

but task is already holding lock:
(&ks->lock){+.+...}, at: [<bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&ks->lock){+.+...}:
       [<c01b89c8>] __lock_acquire+0x940/0x9f8
       [<c01b9058>] lock_acquire+0x10c/0x130
       [<c083dbec>] mutex_lock_nested+0x68/0x3dc
       [<bf00bd48>] ks8851_irq_work+0x24/0x46c [ks8851]
       [<c019c580>] process_one_work+0x2d8/0x518
       [<c019cb98>] worker_thread+0x220/0x3a0
       [<c01a2ad4>] kthread+0x88/0x94
       [<c0107008>] kernel_thread_exit+0x0/0x8

-> #0 ((&ks->irq_work)){+.+...}:
       [<c01b7984>] validate_chain+0x914/0x1018
       [<c01b89c8>] __lock_acquire+0x940/0x9f8
       [<c01b9058>] lock_acquire+0x10c/0x130
       [<c019e104>] flush_work+0x4c/0xac
       [<bf00b858>] ks8851_net_stop+0x6c/0x138 [ks8851]
       [<c06b209c>] __dev_close_many+0x98/0xcc
       [<c06b2174>] dev_close_many+0x68/0xd0
       [<c06b22ec>] rollback_registered_many+0xcc/0x2b8
       [<c06b2554>] rollback_registered+0x28/0x34
       [<c06b25b8>] unregister_netdevice_queue+0x58/0x7c
       [<c06b25f4>] unregister_netdev+0x18/0x20
       [<bf00c1f4>] ks8851_remove+0x64/0xb4 [ks8851]
       [<c049ddf0>] spi_drv_remove+0x18/0x1c
       [<c0468e98>] __device_release_driver+0x7c/0xbc
       [<c0468f64>] driver_detach+0x8c/0xb4
       [<c0467f00>] bus_remove_driver+0xb8/0xe8
       [<c01c1d20>] sys_delete_module+0x1e8/0x27c
       [<c0105ec0>] ret_fast_syscall+0x0/0x3c

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ks->lock);
                               lock((&ks->irq_work));
                               lock(&ks->lock);
  lock((&ks->irq_work));

*** DEADLOCK ***

4 locks held by rmmod/125:
#0:  (&__lockdep_no_validate__){+.+.+.}, at: [<c0468f44>] driver_detach+0x6c/0xb4
#1:  (&__lockdep_no_validate__){+.+.+.}, at: [<c0468f50>] driver_detach+0x78/0xb4
#2:  (rtnl_mutex){+.+.+.}, at: [<c06b25e8>] unregister_netdev+0xc/0x20
#3:  (&ks->lock){+.+...}, at: [<bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]

Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

NFSv4: Keep dropped state owners on the LRU list for a while

To ensure that we don't reuse their identifiers.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Ensure that we don't drop a state owner more than once

Retest the RB_EMPTY_NODE() condition under the spin lock
to ensure that we don't call rb_erase() more than once on the
same state owner.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
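
The retest-under-lock idea above can be sketched in userspace C with C11
atomics. This is an illustration, not the NFS code: the spin lock, struct
sketch_owner and drop_owner() are hypothetical, in_tree stands in for
!RB_EMPTY_NODE(), and erase_count stands in for calls to rb_erase().

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag owner_lock = ATOMIC_FLAG_INIT;

static void owner_lock_acquire(void)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&owner_lock,
                                             memory_order_acquire))
        ;
}

static void owner_lock_release(void)
{
    atomic_flag_clear_explicit(&owner_lock, memory_order_release);
}

struct sketch_owner {
    atomic_bool in_tree;   /* true while the node is linked in the tree */
};

static int erase_count;    /* modified only while holding owner_lock */

static void drop_owner(struct sketch_owner *o)
{
    /* Unlocked check: may be stale by the time the lock is taken. */
    if (!atomic_load(&o->in_tree))
        return;

    owner_lock_acquire();
    /* Retest under the lock: another caller may have removed the node. */
    if (atomic_load(&o->in_tree)) {
        atomic_store(&o->in_tree, false);
        erase_count++;
    }
    owner_lock_release();
}
```

Dropping the same owner twice performs the removal once, because the
second call sees the node as already unlinked.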

kill mm argument of vm_munmap()

it's always current->mm

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

perfmon: kill some helpers and arguments

pfm_vm_munmap() is simply vm_munmap() and pfm_remove_smpl_mapping()
always gets current as the first argument.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

aio: don't bother with unmapping when aio_free_ring() is coming from exit_aio()

... since exit_mmap() is coming and it will munmap() everything anyway.
In all other cases aio_free_ring() has ctx->mm == current->mm; moreover,
all other callers of vm_munmap() have mm == current->mm, so this will
allow us to get rid of mm argument of vm_munmap().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>