Jeff Layton [Wed, 16 Jul 2014 10:52:19 +0000 (06:52 -0400)]
sunrpc: fix RCU handling of gc_ctx field
The handling of the gc_ctx pointer only seems to be partially RCU-safe.
The assignment and freeing are done using RCU, but many places in the
code seem to dereference that pointer without proper RCU safeguards.
Fix them to use rcu_dereference and to rcu_read_lock/unlock, and to
properly handle the case where the pointer is NULL.
Cc: Arnd Bergmann <arnd@arndb.de> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Jeff Layton [Wed, 16 Jul 2014 10:52:18 +0000 (06:52 -0400)]
sunrpc: remove __rcu annotation from struct gss_cl_ctx->gc_gss_ctx
Commit 5b22216e11f7 (nfs: __rcu annotations) added a __rcu annotation to
the gc_gss_ctx field. I see no rationale for adding that though, as that
field does not seem to be managed via RCU at all.
Cc: Arnd Bergmann <arnd@arndb.de> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
NFS: nfs4_do_open should add negative results to the dcache.
If you have an NFSv4 mounted directory which does not container 'foo'
and:
ls -l foo
ssh $server touch foo
cat foo
then the 'cat' will fail (usually, depending a bit on the various
cache ages). This is correct as negative looks are cached by default.
However with the same initial conditions:
cat foo
ssh $server touch foo
cat foo
will usually succeed. This is because an "open" does not add a
negative dentry to the dcache, while a "lookup" does.
This can have negative performance effects. When "gcc" searches for
an include file, it will try to "open" the file in every director in
the search path. Without caching of negative "open" results, this
generates much more traffic to the server than it should (or than
NFSv3 does).
The root of the problem is that _nfs4_open_and_get_state() will call
d_add_unique() on a positive result, but not on a negative result.
Compare with nfs_lookup() which calls d_materialise_unique on both
a positive result and on ENOENT.
This patch adds a call d_add() in the ENOENT case for
_nfs4_open_and_get_state() and also calls nfs_set_verifier().
With it, many fewer "open" requests for known-non-existent files are
sent to the server.
nfs3_list_one_acl(): check get_acl() result with IS_ERR_OR_NULL
There was a check for result being not NULL. But get_acl() may return
NULL, or ERR_PTR, or actual pointer.
The purpose of the function where current change is done is to "list
ACLs only when they are available", so any error condition of get_acl()
mustn't be elevated, and returning 0 there is still valid.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81111 Signed-off-by: Andrey Utkin <andrey.krieger.utkin@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixes: 74adf83f5d77 (nfs: only show Posix ACLs in listxattr if actually...) Cc: stable@vger.kernel.org # 3.14+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Sun, 3 Aug 2014 21:04:51 +0000 (17:04 -0400)]
Merge branch 'nfs-rdma' of git://git.linux-nfs.org/projects/anna/nfs-rdma into linux-next
* 'nfs-rdma' of git://git.linux-nfs.org/projects/anna/nfs-rdma: (916 commits)
xprtrdma: Handle additional connection events
xprtrdma: Remove RPCRDMA_PERSISTENT_REGISTRATION macro
xprtrdma: Make rpcrdma_ep_disconnect() return void
xprtrdma: Schedule reply tasklet once per upcall
xprtrdma: Allocate each struct rpcrdma_mw separately
xprtrdma: Rename frmr_wr
xprtrdma: Disable completions for LOCAL_INV Work Requests
xprtrdma: Disable completions for FAST_REG_MR Work Requests
xprtrdma: Don't post a LOCAL_INV in rpcrdma_register_frmr_external()
xprtrdma: Reset FRMRs after a flushed LOCAL_INV Work Request
xprtrdma: Reset FRMRs when FAST_REG_MR is flushed by a disconnect
xprtrdma: Properly handle exhaustion of the rb_mws list
xprtrdma: Chain together all MWs in same buffer pool
xprtrdma: Back off rkey when FAST_REG_MR fails
xprtrdma: Unclutter struct rpcrdma_mr_seg
xprtrdma: Don't invalidate FRMRs if registration fails
xprtrdma: On disconnect, don't ignore pending CQEs
xprtrdma: Update rkeys after transport reconnect
xprtrdma: Limit data payload size for ALLPHYSICAL
xprtrdma: Protect ia->ri_id when unmapping/invalidating MRs
...
SUNRPC: Enforce an upper limit on the number of cached credentials
In some cases where the credentials are not often reused, we may want
to limit their total number just in order to make the negative lookups
in the hash table more manageable.
Chuck Lever [Tue, 29 Jul 2014 21:26:12 +0000 (17:26 -0400)]
xprtrdma: Handle additional connection events
Commit 38ca83a5 added RDMA_CM_EVENT_TIMEWAIT_EXIT. But that status
is relevant only for consumers that re-use their QPs on new
connections. xprtrdma creates a fresh QP on reconnection, so that
event should be explicitly ignored.
Squelch the alarming "unexpected CM event" message.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
RPCRDMA_PERSISTENT_REGISTRATION was a compile-time switch between
RPCRDMA_REGISTER mode and RPCRDMA_ALLPHYSICAL mode. Since
RPCRDMA_REGISTER has been removed, there's no need for the extra
conditional compilation.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:25:55 +0000 (17:25 -0400)]
xprtrdma: Make rpcrdma_ep_disconnect() return void
Clean up: The return code is used only for dprintk's that are
already redundant.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:25:46 +0000 (17:25 -0400)]
xprtrdma: Schedule reply tasklet once per upcall
Minor optimization: grab rpcrdma_tk_lock_g and disable hard IRQs
just once after clearing the receive completion queue.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:25:38 +0000 (17:25 -0400)]
xprtrdma: Allocate each struct rpcrdma_mw separately
Currently rpcrdma_buffer_create() allocates struct rpcrdma_mw's as
a single contiguous area of memory. It amounts to quite a bit of
memory, and there's no requirement for these to be carved from a
single piece of contiguous memory.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:25:29 +0000 (17:25 -0400)]
xprtrdma: Rename frmr_wr
Clean up: Name frmr_wr after the opcode of the Work Request,
consistent with the send and local invalidation paths.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:25:20 +0000 (17:25 -0400)]
xprtrdma: Disable completions for LOCAL_INV Work Requests
Instead of relying on a completion to change the state of an FRMR
to FRMR_IS_INVALID, set it in advance. If an error occurs, a completion
will fire anyway and mark the FRMR FRMR_IS_STALE.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:25:12 +0000 (17:25 -0400)]
xprtrdma: Disable completions for FAST_REG_MR Work Requests
Instead of relying on a completion to change the state of an FRMR
to FRMR_IS_VALID, set it in advance. If an error occurs, a completion
will fire anyway and mark the FRMR FRMR_IS_STALE.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:25:03 +0000 (17:25 -0400)]
xprtrdma: Don't post a LOCAL_INV in rpcrdma_register_frmr_external()
Any FRMR arriving in rpcrdma_register_frmr_external() is now
guaranteed to be either invalid, or to be targeted by a queued
LOCAL_INV that will invalidate it before the adapter processes
the FAST_REG_MR being built here.
The problem with current arrangement of chaining a LOCAL_INV to the
FAST_REG_MR is that if the transport is not connected, the LOCAL_INV
is flushed and the FAST_REG_MR is flushed. This leaves the FRMR
valid with the old rkey. But rpcrdma_register_frmr_external() has
already bumped the in-memory rkey.
Next time through rpcrdma_register_frmr_external(), a LOCAL_INV and
FAST_REG_MR is attempted again because the FRMR is still valid. But
the rkey no longer matches the hardware's rkey, and a memory
management operation error occurs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:24:54 +0000 (17:24 -0400)]
xprtrdma: Reset FRMRs after a flushed LOCAL_INV Work Request
When a LOCAL_INV Work Request is flushed, it leaves an FRMR in the
VALID state. This FRMR can be returned by rpcrdma_buffer_get(), and
must be knocked down in rpcrdma_register_frmr_external() before it
can be re-used.
Instead, capture these in rpcrdma_buffer_get(), and reset them.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:24:45 +0000 (17:24 -0400)]
xprtrdma: Reset FRMRs when FAST_REG_MR is flushed by a disconnect
FAST_REG_MR Work Requests update a Memory Region's rkey. Rkey's are
used to block unwanted access to the memory controlled by an MR. The
rkey is passed to the receiver (the NFS server, in our case), and is
also used by xprtrdma to invalidate the MR when the RPC is complete.
When a FAST_REG_MR Work Request is flushed after a transport
disconnect, xprtrdma cannot tell whether the WR actually hit the
adapter or not. So it is indeterminant at that point whether the
existing rkey is still valid.
After the transport connection is re-established, the next
FAST_REG_MR or LOCAL_INV Work Request against that MR can sometimes
fail because the rkey value does not match what xprtrdma expects.
The only reliable way to recover in this case is to deregister and
register the MR before it is used again. These operations can be
done only in a process context, so handle it in the transport
connect worker.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:24:36 +0000 (17:24 -0400)]
xprtrdma: Properly handle exhaustion of the rb_mws list
If the rb_mws list is exhausted, clean up and return NULL so that
call_allocate() will delay and try again.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:24:28 +0000 (17:24 -0400)]
xprtrdma: Chain together all MWs in same buffer pool
During connection loss recovery, need to visit every MW in a
buffer pool. Any MW that is in use by an RPC will not be on the
rb_mws list.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:24:19 +0000 (17:24 -0400)]
xprtrdma: Back off rkey when FAST_REG_MR fails
If posting a FAST_REG_MR Work Reqeust fails, revert the rkey update
to avoid subsequent IB_WC_MW_BIND_ERR completions.
Suggested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:24:09 +0000 (17:24 -0400)]
xprtrdma: Unclutter struct rpcrdma_mr_seg
Clean ups:
- make it obvious that the rl_mw field is a pointer -- allocated
separately, not as part of struct rpcrdma_mr_seg
- promote "struct {} frmr;" to a named type
- promote the state enum to a named type
- name the MW state field the same way other fields in
rpcrdma_mw are named
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:24:01 +0000 (17:24 -0400)]
xprtrdma: Don't invalidate FRMRs if registration fails
If FRMR registration fails, it's likely to transition the QP to the
error state. Or, registration may have failed because the QP is
_already_ in ERROR.
Thus calling rpcrdma_deregister_external() in
rpcrdma_create_chunks() is useless in FRMR mode: the LOCAL_INVs just
get flushed.
It is safe to leave existing registrations: when FRMR registration
is tried again, rpcrdma_register_frmr_external() checks if each FRMR
is already/still VALID, and knocks it down first if it is.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:23:52 +0000 (17:23 -0400)]
xprtrdma: On disconnect, don't ignore pending CQEs
xprtrdma is currently throwing away queued completions during
a reconnect. RPC replies posted just before connection loss, or
successful completions that change the state of an FRMR, can be
missed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:23:34 +0000 (17:23 -0400)]
xprtrdma: Limit data payload size for ALLPHYSICAL
When the client uses physical memory registration, each page in the
payload gets its own array entry in the RPC/RDMA header's chunk list.
Therefore, don't advertise a maximum payload size that would require
more array entries than can fit in the RPC buffer where RPC/RDMA
headers are built.
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=248 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Tue, 29 Jul 2014 21:23:25 +0000 (17:23 -0400)]
xprtrdma: Protect ia->ri_id when unmapping/invalidating MRs
Ensure ia->ri_id remains valid while invoking dma_unmap_page() or
posting LOCAL_INV during a transport reconnect. Otherwise,
ia->ri_id->device or ia->ri_id->qp is NULL, which triggers a panic.
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=259 Fixes: ec62f40 'xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting' Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Yan Burman <yanb@mellanox.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Merge tag 'staging-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull more IIO driver fixes from Greg KH:
"Here are two IIO driver fixes for 3.16-rc6 that resolve some reported
issues"
* tag 'staging-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: mma8452: Use correct acceleration units.
iio:core: Handle error when mask type is not separate
Merge tag 'usb-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are two USB patches that resolve some reported issues, one with
an odd HUB, and one in the chipidea driver"
* tag 'usb-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: Check if port status is equal to RxDetect
usb: chipidea: udc: Disable auto ZLP generation on ep0
Merge tag 'driver-core-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core fix that reverts an older patch that has
been causing a number of reported problems with the platform devices.
This revert has been in linux-next for a while with no reported issues"
* tag 'driver-core-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
platform_get_irq: Revert to platform_get_resource if of_irq_get fails
Merge tag 'char-misc-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fix from Greg KH:
"Here's a single hyper-v driver fix for a reported issue"
* tag 'char-misc-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: hv_fcopy: fix a race condition for SMP guest
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull intel drm fixes from Dave Airlie:
"Intel fixes came in late, but since I debugged one of them I'll send
them on,
Two reverts, a quirk and one warn regression"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
Revert "drm/i915: reverse dp link param selection, prefer fast over wide again"
drm/i915: Track the primary plane correctly when reassigning planes
drm/i915: Ignore VBT backlight presence check on HP Chromebook 14
Revert "drm/i915: Don't set the 8to6 dither flag when not scaling"
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
"Four fixes, all discovered by Trinity"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: segv: Save regs only in case of a kernel mode fault
um: Fix hung task in fix_range_common()
um: Ensure that a stub page cannot get unmapped
Revert "um: Fix wait_stub_done() error handling"
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"We have two more fixes in my for-linus branch.
I was hoping to also include a fix for a btrfs deadlock with
compression enabled, but we're still nailing that one down"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: test for valid bdev before kobj removal in btrfs_rm_device
Btrfs: fix abnormal long waiting in fsync
Merge tag 'nfs-for-3.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client fixes from Trond Myklebust:
"Apologies for the relative lateness of this pull request, however the
commits fix some issues with the NFS read/write code updates in
3.16-rc1 that can cause serious Oopsing when using small r/wsize. The
delay was mainly due to extra testing to make sure that the fixes
behave correctly.
Highlights include;
- Stable fix for an NFSv3 posix ACL regression
- Multiple fixes for regressions to the NFS generic read/write code:
- Fix page splitting bugs that come into play when a small
rsize/wsize read/write needs to be sent again (due to error
conditions or page redirty)
- Fix nfs_wb_page_cancel, which is called by the "invalidatepage"
method
- Fix 2 compile warnings about unused variables
- Fix a performance issue affecting unstable writes"
* tag 'nfs-for-3.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Don't reset pg_moreio in __nfs_pageio_add_request
NFS: Remove 2 unused variables
nfs: handle multiple reqs in nfs_wb_page_cancel
nfs: handle multiple reqs in nfs_page_async_flush
nfs: change find_request to find_head_request
nfs: nfs_page should take a ref on the head req
nfs: mark nfs_page reqs with flag for extra ref
nfs: only show Posix ACLs in listxattr if actually present
Eric Sandeen [Mon, 7 Jul 2014 17:34:49 +0000 (12:34 -0500)]
btrfs: test for valid bdev before kobj removal in btrfs_rm_device
commit 99994cd btrfs: dev delete should remove sysfs entry
added a btrfs_kobj_rm_device, which dereferences device->bdev...
right after we check whether device->bdev might be NULL.
I don't honestly know if it's possible to have a NULL device->bdev
here, but assuming that it is (given the test), we need to move
the kobject removal to be under that test.
(Coverity spotted this)
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Chris Mason <clm@fb.com>
Liu Bo [Thu, 17 Jul 2014 08:08:36 +0000 (16:08 +0800)]
Btrfs: fix abnormal long waiting in fsync
xfstests generic/127 detected this problem.
With commit 7fc34a62ca4434a79c68e23e70ed26111b7a4cf8, now fsync will only flush
data within the passed range. This is the cause of the above problem,
-- btrfs's fsync has a stage called 'sync log' which will wait for all the
ordered extents it've recorded to finish.
In xfstests/generic/127, with mixed operations such as truncate, fallocate,
punch hole, and mapwrite, we get some pre-allocated extents, and mapwrite will
mmap, and then msync. And I find that msync will wait for quite a long time
(about 20s in my case), thanks to ftrace, it turns out that the previous
fallocate calls 'btrfs_wait_ordered_range()' to flush dirty pages, but as the
range of dirty pages may be larger than 'btrfs_wait_ordered_range()' wants,
there can be some ordered extents created but not getting corresponding pages
flushed, then they're left in memory until we fsync which runs into the
stage 'sync log', and fsync will just wait for the system writeback thread
to flush those pages and get ordered extents finished, so the latency is
inevitable.
This adds a flush similar to btrfs_start_ordered_extent() in
btrfs_wait_logged_extents() to fix that.
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com>
Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"The locking department delivers:
- A rather large and intrusive bundle of fixes to address serious
performance regressions introduced by the new rwsem / mcs
technology. Simpler solutions have been discussed, but they would
have been ugly bandaids with more risk than doing the right thing.
- Make the rwsem spin on owner technology opt-in for architectures
and enable it only on the known to work ones.
- A few fixes to the lockdep userspace library"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER
locking/mutex: Disable optimistic spinning on some architectures
locking/rwsem: Reduce the size of struct rw_semaphore
locking/rwsem: Rename 'activity' to 'count'
locking/spinlocks/mcs: Micro-optimize osq_unlock()
locking/spinlocks/mcs: Introduce and use init macro and function for osq locks
locking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overhead
locking/spinlocks/mcs: Rename optimistic_spin_queue() to optimistic_spin_node()
locking/rwsem: Allow conservative optimistic spinning when readers have lock
tools/liblockdep: Account for bitfield changes in lockdeps lock_acquire
tools/liblockdep: Remove debug print left over from development
tools/liblockdep: Fix comparison of a boolean value with a value of 2
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"Three patches addressing shortcomings in the ARM gic interrupt chip
driver"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: gic: Fix core ID calculation when topology is read from DT
irqchip: gic: Add binding probe for ARM GIC400
irqchip: gic: Add support for cortex a7 compatible string
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A single fix for a long standing issue in the alarm timer subsystem,
which was noticed recently when people finally started to use alarm
timers for serious work"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
alarmtimer: Fix bug where relative alarm timers were treated as absolute
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU fixes from Thomas Gleixner:
"Two RCU patches:
- Address a serious performance regression on open/close caused by
commit ac1bea85781e ("Make cond_resched() report RCU quiescent
states")
- Export RCU debug functions. Not a regression, but enablement to
address a serious recursion bug in the sl*b allocators in 3.17"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Reduce overhead of cond_resched() checks for RCU
rcu: Export debug_init_rcu_head() and and debug_init_rcu_head()
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A smaller set of fixes this week, and all regression fixes:
- a handful of issues fixed on at91 with common clock conversion
- a set of fixes for Marvell mvebu (SMP, coherency, PM)
- a clock fix for i.MX6Q.
- ... and a SMP/hotplug fix for Exynos"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: EXYNOS: Fix core ID used by platsmp and hotplug code
ARM: at91/dt: add missing clocks property to pwm node in sam9x5.dtsi
ARM: at91/dt: fix usb0 clocks definition in sam9n12 dtsi
ARM: at91: at91sam9x5: correct typo error for ohci clock
ARM: clk-imx6q: parent lvds_sel input from upstream clock gates
ARM: mvebu: Fix coherency bus notifiers by using separate notifiers
ARM: mvebu: Fix the operand list in the inline asm of armada_370_xp_pmsu_idle_enter
ARM: mvebu: fix SMP boot for Armada 38x and Armada 375 Z1 in big endian
Dave Airlie [Sat, 19 Jul 2014 06:48:38 +0000 (16:48 +1000)]
Merge tag 'drm-intel-fixes-2014-07-18' of git://anongit.freedesktop.org/drm-intel
But in any case nothing really shocking in
here, 2 reverts, 1 quirk and a regression fix a WARN.
* tag 'drm-intel-fixes-2014-07-18' of git://anongit.freedesktop.org/drm-intel:
Revert "drm/i915: reverse dp link param selection, prefer fast over wide again"
drm/i915: Track the primary plane correctly when reassigning planes
drm/i915: Ignore VBT backlight presence check on HP Chromebook 14
Revert "drm/i915: Don't set the 8to6 dither flag when not scaling"
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
"A couple of key fixes and a few less critical ones. The main ones
are:
- add a .bss section to the PE/COFF headers when building with EFI
stub
- invoke the correct paravirt magic when building the espfix page
tables
Unfortunately both of these areas also have at least one additional
fix each still in thie pipeline, but which are not yet ready to push"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Remove unused variable "polling"
x86/espfix/xen: Fix allocation of pages for paravirt page tables
x86/efi: Include a .bss section within the PE/COFF headers
efi: fdt: Do not report an error during boot if UEFI is not available
efi/arm64: efistub: remove local copy of linux_banner
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband/rdma fixes from Roland Dreier:
- cxgb4 hardware driver regression fixes
- mlx5 hardware driver regression fixes
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx5: Enable "block multicast loopback" for kernel consumers
RDMA/cxgb4: Call iwpm_init() only once
mlx5_core: Fix possible race between mr tree insert/delete
RDMA/cxgb4: Initialize the device status page
RDMA/cxgb4: Clean up connection on ARP error
RDMA/cxgb4: Fix skb_leak in reject_cr()
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"More fallout from module tests and code inspection.
Fixes to temperature limit write operations in adt7470 driver. Also,
dashes are not allowed in hwmon 'name' attributes. Fix drivers where
necessary"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (adt7470) Fix writes to temperature limit registers
hwmon: (da9055) Don't use dash in the name attribute
hwmon: (da9052) Don't use dash in the name attribute
Merge tag 'iommu-fixes-v3.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"A couple of fixes for the Freescale PAMU driver queued up:
- fix PAMU window size check.
- fix the device domain attach condition.
- fix the error condition during iommu group"
* tag 'iommu-fixes-v3.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/fsl: Fix the error condition during iommu group
iommu/fsl: Fix the device domain attach condition.
iommu/fsl: Fix PAMU window size check.
Merge tag 'pm+acpi-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These are a few recent regression fixes, a revert of the ACPI video
commit I promised, a system resume fix related to request_firmware(),
an ACPI video quirk for one more Win8-oriented BIOS, an ACPI device
enumeration documentation update and a few fixes for ARM cpufreq
drivers.
Specifics:
- Fix for a recently introduced NULL pointer dereference in the core
system suspend code occuring when platforms without ACPI attempt to
use the "freeze" sleep state from Zhang Rui.
- Fix for a recently introduced build warning in cpufreq headers from
Brian W Hart.
- Fix for a 3.13 cpufreq regression related to sysem resume that
triggers on some systems with multiple CPU clusters from Viresh
Kumar.
- Fix for a 3.4 regression in request_firmware() resulting in
WARN_ON()s on some systems during system resume from Takashi Iwai.
- Revert of the ACPI video commit that changed the default value of
the video.brightness_switch_enabled command line argument to 0 as
it has been reported to break existing setups.
- ACPI device enumeration documentation update to take recent code
changes into account and make the documentation match the code
again from Darren Hart.
- Fixes for the sa1110, imx6q, kirkwood, and cpu0 cpufreq drivers
from Linus Walleij, Nicolas Del Piano, Quentin Armitage, Viresh
Kumar.
- New ACPI video blacklist entry for HP ProBook 4540s from Hans de
Goede"
* tag 'pm+acpi-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: make table sentinel macros unsigned to match use
cpufreq: move policy kobj to policy->cpu at resume
cpufreq: cpu0: OPPs can be populated at runtime
cpufreq: kirkwood: Reinstate cpufreq driver for ARCH_KIRKWOOD
cpufreq: imx6q: Select PM_OPP
cpufreq: sa1110: set memory type for h3600
ACPI / video: Add use_native_backlight quirk for HP ProBook 4540s
PM / sleep: fix freeze_ops NULL pointer dereferences
PM / sleep: Fix request_firmware() error at resume
Revert "ACPI / video: change acpi-video brightness_switch_enabled default to 0"
ACPI / documentation: Remove reference to acpi_platform_device_ids from enumeration.txt
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"One nouveau deadlock fix, one qxl irq handling fix, and a set of
radeon pageflipping changes that fix regressions in pageflipping since
-rc1 along with a leak and backlight fix.
The pageflipping fixes are a bit bigger than I'd like, but there has
been a few people focused on testing them"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: Make classic pageflip completion path less racy.
drm/radeon: Add missing vblank_put in pageflip ioctl error path.
drm/radeon: Remove redundant fence unref in pageflip path.
drm/radeon: Complete page flip even if waiting on the BO fence fails
drm/radeon: Move pinning the BO back to radeon_crtc_page_flip()
drm/radeon: Prevent too early kms-pageflips triggered by vblank.
drm/radeon: set default bl level to something reasonable
drm/radeon: avoid leaking edid data
drm/qxl: return IRQ_NONE if it was not our irq
drm/nouveau/therm: fix a potential deadlock in the therm monitoring code
Merge tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull /dev/random fix from Ted Ts'o:
"Fix a BUG splat found by trinity"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: check for increase of entropy_count because of signed conversion
random: check for increase of entropy_count because of signed conversion
The expression entropy_count -= ibytes << (ENTROPY_SHIFT + 3) could
actually increase entropy_count if during assignment of the unsigned
expression on the RHS (mind the -=) we reduce the value modulo
2^width(int) and assign it to entropy_count. Trinity found this.
[ Commit modified by tytso to add an additional safety check for a
negative entropy_count -- which should never happen, and to also add
an additional paranoia check to prevent overly large count values to
be passed into urandom_read(). ]
Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
Tomasz Figa [Tue, 15 Jul 2014 17:59:18 +0000 (02:59 +0900)]
ARM: EXYNOS: Fix core ID used by platsmp and hotplug code
When CPU topology is specified in device tree, cpu_logical_map() does
not return core ID anymore, but rather full MPIDR value. This breaks
existing calculation of PMU register offsets on Exynos SoCs.
This patch fixes the problem by adjusting the code to use only core ID
bits of the value returned by cpu_logical_map() to allow CPU topology to
be specified in device tree on Exynos SoCs.
Signed-off-by: Tomasz Figa <t.figa@samsung.com> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Fri, 18 Jul 2014 21:40:17 +0000 (14:40 -0700)]
Merge tag 'imx-fixes-3.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
Merge "ARM: imx: fixes for 3.16, 2nd take" from Shawn Guo:
The i.MX fixes for 3.16, 2nd take:
It fixes a hard machine hang regression for boards where only pcie is
active but no sata, as the latest imx6-pcie driver is no longer enabling
the upstream clock directly but only lvds clk out.
* tag 'imx-fixes-3.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: clk-imx6q: parent lvds_sel input from upstream clock gates
Olof Johansson [Fri, 18 Jul 2014 21:38:28 +0000 (14:38 -0700)]
Merge tag 'mvebu-fixes-3.16-3' of git://git.infradead.org/linux-mvebu into fixes
Merge "mvebu fixes for v3.16 (round 3)" from Jason Cooper:
- Fix SMP boot on 38x/375 in big endian
- Fix operand list for pmsu on 370/XP
- Fix coherency bus notifiers
* tag 'mvebu-fixes-3.16-3' of git://git.infradead.org/linux-mvebu:
ARM: mvebu: Fix coherency bus notifiers by using separate notifiers
ARM: mvebu: Fix the operand list in the inline asm of armada_370_xp_pmsu_idle_enter
ARM: mvebu: fix SMP boot for Armada 38x and Armada 375 Z1 in big endian
Merge tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes
Pull gfs2 fixes from Steven Whitehouse:
"This patch set contains two minor docs/spelling fixes, some fixes for
flock, a change to use GFP_NOFS to avoid recursion on a rarely used
code path and a fix for a race relating to the glock lru"
* tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
GFS2: fs/gfs2/rgrp.c: kernel-doc warning fixes
GFS2: memcontrol: Spelling s/invlidate/invalidate/
GFS2: Allow caching of glocks for flock
GFS2: Allow flocks to use normal glock dq rather than dq_wait
GFS2: replace count*size kzalloc by kcalloc
GFS2: Use GFP_NOFS when allocating glocks
GFS2: Fix race in glock lru glock disposal
GFS2: Only wait for demote when last holder is dequeued
Merge tag 'dm-3.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
"Fix the dm-thinp and dm-cache targets to disallow changing the data
device's block size"
* tag 'dm-3.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache metadata: do not allow the data block size to change
dm thin metadata: do not allow the data block size to change
Merge tag 'upstream-3.16-rc6' of git://git.infradead.org/linux-ubifs
Pull UBI fixes from Artem Bityutskiy:
"Two UBI fastmap-related fixes for v3.16:
- fix UBI fastmap support which we broke in 3.16-rc1 by reversing the
volumes RB-tree sorting criteria.
- make sure that we scrub all PEBs where we see bit-flips - we were
missing some of them when the fastmap feature was enabled"
* tag 'upstream-3.16-rc6' of git://git.infradead.org/linux-ubifs:
UBI: fastmap: do not miss bit-flips
UBI: fix the volumes tree sorting criteria
Merge tag 'xfs-for-linus-3.16-rc5' of git://oss.sgi.com/xfs/xfs
Pull xfs fixes from Dave Chinner:
"Fixes for low memory perforamnce regressions and a quota inode
handling regression.
These are regression fixes for issues recently introduced - the change
in the stack switch location is fairly important, so I've held off
sending this update until I was sure that it still addresses the stack
usage problem the original solved. So while the commits in the xfs
tree are recent, it has been under tested for several weeks now"
* tag 'xfs-for-linus-3.16-rc5' of git://oss.sgi.com/xfs/xfs:
xfs: null unused quota inodes when quota is on
xfs: refine the allocation stack switch
Revert "xfs: block allocation work needs to be kswapd aware"
Bo Shen [Mon, 14 Jul 2014 03:08:14 +0000 (11:08 +0800)]
ARM: at91: at91sam9x5: correct typo error for ohci clock
Correct the typo error for the second "uhphs_clk".
Signed-off-by: Bo Shen <voice.shen@atmel.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Tomasz Figa [Thu, 17 Jul 2014 15:23:44 +0000 (17:23 +0200)]
irqchip: gic: Fix core ID calculation when topology is read from DT
Certain GIC implementation, namely those found on earlier, single
cluster, Exynos SoCs, have registers mapped without per-CPU banking,
which means that the driver needs to use different offset for each CPU.
Currently the driver calculates the offset by multiplying value returned
by cpu_logical_map() by CPU offset parsed from DT. This is correct when
CPU topology is not specified in DT and aforementioned function returns
core ID alone. However when DT contains CPU topology, the function
changes to return cluster ID as well, which is non-zero on mentioned
SoCs and so breaks the calculation in GIC driver.
This patch fixes this by masking out cluster ID in CPU offset
calculation so that only core ID is considered. Multi-cluster Exynos
SoCs already have banked GIC implementations, so this simple fix should
be enough.
Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Tomasz Figa <t.figa@samsung.com> Fixes: db0d4db22a78d ("ARM: gic: allow GIC to support non-banked setups") Cc: <stable@vger.kernel.org> # v3.3+ Link: https://lkml.kernel.org/r/1405610624-18722-1-git-send-email-t.figa@samsung.com Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Bob Peterson [Thu, 26 Jun 2014 14:47:48 +0000 (10:47 -0400)]
GFS2: Allow caching of glocks for flock
This patch removes the GLF_NOCACHE flag from the glocks associated with
flocks. There should be no good reason not to cache glocks for flocks:
they only force the glock to be demoted before they can be reacquired,
which can slow down performance and even cause glock hangs, especially
in cases where the flocks are held in Shared (SH) mode.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Thu, 26 Jun 2014 14:46:25 +0000 (10:46 -0400)]
GFS2: Allow flocks to use normal glock dq rather than dq_wait
This patch allows flock glocks to use a non-blocking dequeue rather
than dq_wait. It also reverts the previous patch I had posted regarding
dq_wait. The reverted patch isn't necessarily a bad idea, but I decided
this might avoid unforeseen side effects, and was therefore safer.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Normally GFP_KERNEL is ok here, but there is now a rarely used code path
relating to deallocation of unlinked inodes (in certain corner cases)
which if hit at times of memory shortage can cause recursion while
trying to free memory.
One solution would be to try and move the gfs2_glock_get() call so
that it is no longer called while another glock is held, but that
doesn't look at all easy, so GFP_NOFS is the best solution for the
time being.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
We must not leave items on the LRU list with GLF_LOCK set, since
they can be removed if the glock is brought back into use, which
may then potentially result in a hang, waiting for GLF_LOCK to
clear.
It doesn't happen very often, since it requires a glock that has
not been used for a long time to be brought back into use at the
same moment that the shrinker is part way through disposing of
glocks.
The fix is to set GLF_LOCK at a later time, when we already know
that the other locks can be obtained. Also, we now only release
the lru_lock in case a resched is needed, rather than on every
iteration.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Fri, 20 Jun 2014 13:36:41 +0000 (09:36 -0400)]
GFS2: Only wait for demote when last holder is dequeued
Function gfs2_glock_dq_wait is supposed to dequeue a glock and then
wait for the lock to be demoted. The problem is, if this is a shared
lock, its demote will depend on the other holders, which means you
might end up waiting forever because the other process is blocked.
This problem is especially apparent when dealing with nested flocks.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Lucas Stach [Thu, 17 Jul 2014 10:20:14 +0000 (12:20 +0200)]
ARM: clk-imx6q: parent lvds_sel input from upstream clock gates
The i.MX6 reference manual doesn't make a clear distinction
between the fixed clock divider and the enable gate for the
pcie and sata reference clocks. This lead to the lvds mux
inputs in the imx6q clk driver to be parented from the
ref clock (which is the divider) instead of the actual gate,
which in turn prevents the upstream clock to actually be
enabled when lvds clk out is active.
This fixes a hard machine hang regression in kernel 3.16 for
boards where only pcie is active but no sata, as with this
kernel version the imx6-pcie driver is no longer enabling
the upstream clock directly but only lvds clk out.
Reported-by: Arne Ruhnau <arne.ruhnau@target-sg.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Arne Ruhnau <arne.ruhnau@target-sg.com> Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Dexuan Cui [Wed, 16 Jul 2014 07:00:45 +0000 (00:00 -0700)]
Drivers: hv: hv_fcopy: fix a race condition for SMP guest
We should schedule the 5s "timer work" before starting the data transfer,
otherwise, the data transfer code may finish so fast on another
virtual cpu that when the code(fcopy_write()) trying to cancel the 5s
"timer work" can occasionally fail because the "timer work" may haven't
been scheduled yet and as a result the fcopy process will be aborted
wrongly by fcopy_work_func() in 5s.
Thank Liz Zhang <lizzha@microsoft.com> for the initial investigation
on the bug.
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=1118123
Tested-by: Liz Zhang <lizzha@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* pm-cpufreq:
cpufreq: make table sentinel macros unsigned to match use
cpufreq: move policy kobj to policy->cpu at resume
cpufreq: cpu0: OPPs can be populated at runtime
cpufreq: kirkwood: Reinstate cpufreq driver for ARCH_KIRKWOOD
cpufreq: imx6q: Select PM_OPP
cpufreq: sa1110: set memory type for h3600
Brian W Hart [Fri, 27 Jun 2014 21:09:39 +0000 (16:09 -0500)]
cpufreq: make table sentinel macros unsigned to match use
Commit 5eeaf1f18973 (cpufreq: Fix build error on some platforms that
use cpufreq_for_each_*) moved function cpufreq_next_valid() to a public
header. Warnings are now generated when objects including that header
are built with -Wsign-compare (as an out-of-tree module might be):
.../include/linux/cpufreq.h: In function ‘cpufreq_next_valid’:
.../include/linux/cpufreq.h:519:27: warning: comparison between signed
and unsigned integer expressions [-Wsign-compare]
while ((*pos)->frequency != CPUFREQ_TABLE_END)
^
.../include/linux/cpufreq.h:520:25: warning: comparison between signed
and unsigned integer expressions [-Wsign-compare]
if ((*pos)->frequency != CPUFREQ_ENTRY_INVALID)
^
Constants CPUFREQ_ENTRY_INVALID and CPUFREQ_TABLE_END are signed, but
are used with unsigned member 'frequency' of cpufreq_frequency_table.
Update the macro definitions to be explicitly unsigned to match their
use.
This also corrects potentially wrong behavior of clk_rate_table_iter()
if unsigned long is wider than usigned int.
Fixes: 5eeaf1f18973 (cpufreq: Fix build error on some platforms that use cpufreq_for_each_*) Signed-off-by: Brian W Hart <hartb@linux.vnet.ibm.com> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When using USB 3.0 pen drive with the [AMD] FCH USB XHCI Controller
[1022:7814], the second hotplugging will experience the USB 3.0 pen
drive is recognized as high-speed device. After bisecting the kernel,
I found the commit number 41e7e056cdc662f704fa9262e5c6e213b4ab45dd
(USB: Allow USB 3.0 ports to be disabled.) causes the bug. After doing
some experiments, the bug can be fixed by avoiding executing the function
hub_usb3_port_disable(). Because the port status with [AMD] FCH USB
XHCI Controlleris [1022:7814] is already in RxDetect
(I tried printing out the port status before setting to Disabled state),
it's reasonable to check the port status before really executing
hub_usb3_port_disable().
Fixes: 41e7e056cdc6 (USB: Allow USB 3.0 ports to be disabled.) Signed-off-by: Gavin Guo <gavin.guo@canonical.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dave Airlie [Thu, 17 Jul 2014 23:59:21 +0000 (09:59 +1000)]
Merge branch 'drm-fixes-3.16' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
A few more fixes for 3.16. The pageflipping fixes I dropped last week
have finally shaped up so this is mostly fixes for fallout from the
pageflipping code changes. Also fix a memory leak and a black screen
when restoring the backlight on console unblanking.
* 'drm-fixes-3.16' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: Make classic pageflip completion path less racy.
drm/radeon: Add missing vblank_put in pageflip ioctl error path.
drm/radeon: Remove redundant fence unref in pageflip path.
drm/radeon: Complete page flip even if waiting on the BO fence fails
drm/radeon: Move pinning the BO back to radeon_crtc_page_flip()
drm/radeon: Prevent too early kms-pageflips triggered by vblank.
drm/radeon: set default bl level to something reasonable
drm/radeon: avoid leaking edid data
Abbas Raza [Thu, 17 Jul 2014 11:34:31 +0000 (19:34 +0800)]
usb: chipidea: udc: Disable auto ZLP generation on ep0
There are 2 methods for ZLP (zero-length packet) generation:
1) In software
2) Automatic generation by device controller
1) is implemented in UDC driver and it attaches ZLP to IN packet if
descriptor->size < wLength
2) can be enabled/disabled by setting ZLT bit in the QH
When gadget ffs is connected to ubuntu host, the host sends
get descriptor request and wLength in setup packet is 255 while the
size of descriptor which will be sent by gadget in IN packet is
64 byte. So the composite driver sets req->zero = 1.
In UDC driver following code will be executed then
Case-A:
So in case of ubuntu host, UDC driver will attach a ZLP to the IN packet.
ubuntu host will request 255 byte in IN request, gadget will send 64 byte
with ZLP and host will come to know that there is no more data.
But hold on, by default ZLT=0 for endpoint 0 so hardware also tries to
automatically generate the ZLP which blocks enumeration for ~6 seconds due
to endpoint 0 STALL, NAKs are sent to host for any requests (OUT/PING)
Case-B:
In case when gadget ffs is connected to Apple device, Apple device sends
setup packet with wLength=64. So descriptor->size = 64 and wLength=64
therefore req->zero = 0 and UDC driver will not attach any ZLP to the
IN packet. Apple device requests 64 bytes, gets 64 bytes and doesn't
further request for IN data. But ZLT=0 by default for endpoint 0 so
hardware tries to automatically generate the ZLP which blocks enumeration
for ~6 seconds due to endpoint 0 STALL, NAKs are sent to host for any
requests (OUT/PING)
According to USB2.0 specs:
8.5.3.2 Variable-length Data Stage
A control pipe may have a variable-length data phase in which the
host requests more data than is contained in the specified data
structure. When all of the data structure is returned to the host,
the function should indicate that the Data stage is ended by
returning a packet that is shorter than the MaxPacketSize for the
pipe. If the data structure is an exact multiple of wMaxPacketSize
for the pipe, the function will return a zero-length packet to indicate
the end of the Data stage.
In Case-A mentioned above:
If we disable software ZLP generation & ZLT=0 for endpoint 0 OR if software
ZLP generation is not disabled but we set ZLT=1 for endpoint 0 then
enumeration doesn't block for 6 seconds.
In Case-B mentioned above:
If we disable software ZLP generation & ZLT=0 for endpoint then enumeration
still blocks due to ZLP automatically generated by hardware and host not needing
it. But if we keep software ZLP generation enabled but we set ZLT=1 for
endpoint 0 then enumeration doesn't block for 6 seconds.
So the proper solution for this issue seems to disable automatic ZLP generation
by hardware (i.e by setting ZLT=1 for endpoint 0) and let software (UDC driver)
handle the ZLP generation based on req->zero field.
Cc: stable@vger.kernel.org Signed-off-by: Abbas Raza <Abbas_Raza@mentor.com> Signed-off-by: Peter Chen <peter.chen@freescale.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'stable/for-linus-3.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen fixes from Konrad Rzeszutek Wilk:
"Two fixes found during migration of PV guests. David would be the one
doing this pull but he is on vacation.
Fixes:
- fix console deadlock when resuming PV guests
- fix regression hit when ballooning and resuming PV guests"
* tag 'stable/for-linus-3.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/balloon: set ballooned out pages as invalid in p2m
xen/manage: fix potential deadlock when resuming the console
Merge tag 'trace-fixes-v3.16-rc5-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"A few more fixes for ftrace infrastructure.
I was cleaning out my INBOX and found two fixes from zhangwei from a
year ago that were lost in my mail. These fix an inconsistency
between trace_puts() and the way trace_printk() works. The reason
this is important to fix is because when trace_printk() doesn't have
any arguments, it turns into a trace_puts(). Not being able to enable
a stack trace against trace_printk() because it does not have any
arguments is quite confusing. Also, the fix is rather trivial and low
risk.
While porting some changes to PowerPC I discovered that it still has
the function graph tracer filter bug that if you also enable stack
tracing the function graph tracer filter is ignored. I fixed that up.
Finally, Martin Lau, fixed a bug that would cause readers of the
ftrace ring buffer to block forever even though it was suppose to be
NONBLOCK"
This also includes the fix from an earlier pull request:
"Oleg Nesterov fixed a memory leak that happens if a user creates a
tracing instance, sets up a filter in an event, and then removes that
instance. The filter allocates memory that is never freed when the
instance is destroyed"
* tag 'trace-fixes-v3.16-rc5-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ring-buffer: Fix polling on trace_pipe
tracing: Add TRACE_ITER_PRINTK flag check in __trace_puts/__trace_bputs
tracing: Fix graph tracer with stack tracer on other archs
tracing: Add ftrace_trace_stack into __trace_puts/__trace_bputs
tracing: instance_rmdir() leaks ftrace_event_file->filter
Mario Kleiner [Thu, 17 Jul 2014 00:24:45 +0000 (02:24 +0200)]
drm/radeon: Make classic pageflip completion path less racy.
Need to protect mmio flip programming by event lock as well.
Need to also first enable pflip irq, then mmio program,
otherwise a flip completion may get unnoticed in the vblank
of actual completion if the flip is programmed, but
radeon_flip_work_func gets preempted immediately after
mmio programming and before vblank. In that case the
vblank irq handler wouldn't run radeon_crtc_handle_vblank()
with the completion check routine, miss the completed flip,
and only notice one vblank after actual completion, causing
a false/delayed report of flip completion.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Kleiner [Wed, 16 Jul 2014 23:37:53 +0000 (01:37 +0200)]
drm/radeon: Add missing vblank_put in pageflip ioctl error path.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Kleiner [Wed, 16 Jul 2014 23:27:25 +0000 (01:27 +0200)]
drm/radeon: Remove redundant fence unref in pageflip path.
Not needed anymore, as it is already unreffed within
radeon_flip_work_func() after its only use.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Mon, 14 Jul 2014 06:58:03 +0000 (15:58 +0900)]
drm/radeon: Complete page flip even if waiting on the BO fence fails
Otherwise the DRM core and userspace will be confused about which BO the
CRTC is scanning out.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Mon, 14 Jul 2014 06:48:42 +0000 (15:48 +0900)]
drm/radeon: Move pinning the BO back to radeon_crtc_page_flip()
As well as enabling the vblank interrupt. These shouldn't take any
significant amount of time, but at least pinning the BO has actually been
seen to fail in practice before, in which case we need to let userspace
know about it.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Kleiner [Thu, 3 Jul 2014 01:45:02 +0000 (03:45 +0200)]
drm/radeon: Prevent too early kms-pageflips triggered by vblank.
Since 3.16-rc1 we have this new failure:
When the userspace XOrg ddx schedules vblank events to
trigger deferred kms-pageflips, e.g., via the OML_sync_control
extension call glXSwapBuffersMscOML(), or if a glXSwapBuffers()
is called immediately after completion of a previous swapbuffers
call, e.g., in a tight rendering loop with minimal rendering,
it happens frequently that the pageflip ioctl() is executed
within the same vblank in which a previous kms-pageflip completed,
or - for deferred swaps - always one vblank earlier than requested
by the client app.
This causes premature pageflips and detection of failure by
the ddx, e.g., XOrg log warnings like...
... and error/invalid return values of glXWaitForSbcOML() and
Intel_swap_events extension.
Reason is the new way in which kms-pageflips are programmed
since 3.16.
This commit changes the time window in which the hw can
execute pending programmed pageflips. Before, a pending flip
would get executed anywhere within the vblank interval. Now
a pending flip only gets executed at the leading edge of
vblank (start of front porch), making sure that a invocation
of the pageflip ioctl() within a given vblank interval will
only lead to pageflip completion in the following vblank.
Tested to death on a DCE-4 card.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 15 Jul 2014 13:48:53 +0000 (09:48 -0400)]
drm/radeon: set default bl level to something reasonable
If the value in the scratch register is 0, set it to the
max level. This fixes an issue where the console fb blanking
code calls back into the backlight driver on unblank and then
sets the backlight level to 0 after the driver has already
set the mode and enabled the backlight.
Alex Deucher [Mon, 14 Jul 2014 21:57:19 +0000 (17:57 -0400)]
drm/radeon: avoid leaking edid data
In some cases we fetch the edid in the detect() callback
in order to determine what sort of monitor is connected.
If that happens, don't fetch the edid again in the get_modes()
callback or we will leak the edid.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Commit 3ab72f9156bb "dt-bindings: add GIC-400 binding" added the
"arm,gic-400" compatible string, but the corresponding IRQCHIP_DECLARE
was never added to the gic driver.
Therefore add the missing irqchip declaration for it.
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Removed additional empty line and adapted commit message to mark it
as fixing an issue. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Will Deacon <will.deacon@arm.com> Fixes: 3ab72f9156bb ("dt-bindings: add GIC-400 binding") Cc: <stable@vger.kernel.org> # v3.14+ Link: https://lkml.kernel.org/r/2621565.f5eISveXXJ@diego Signed-off-by: Jason Cooper <jason@lakedaemon.net>
cpufreq: move policy kobj to policy->cpu at resume
This is only relevant to implementations with multiple clusters, where clusters
have separate clock lines but all CPUs within a cluster share it.
Consider a dual cluster platform with 2 cores per cluster. During suspend we
start hot unplugging CPUs in order 1 to 3. When CPU2 is removed, policy->kobj
would be moved to CPU3 and when CPU3 goes down we wouldn't free policy or its
kobj as we want to retain permissions/values/etc.
Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev().
We will recover the old policy and update policy->cpu from 3 to 2 from
update_policy_cpu().
But the kobj is still tied to CPU3 and isn't moved to CPU2. We wouldn't create a
link for CPU2, but would try that for CPU3 while bringing it online. Which will
report errors as CPU3 already has kobj assigned to it.
This bug got introduced with commit 42f921a, which overlooked this scenario.
To fix this, lets move kobj to the new policy->cpu while bringing first CPU of a
cluster back. Also do a WARN_ON() if kobject_move failed, as we would reach here
only for the first CPU of a non-boot cluster. And we can't recover from this
situation, if kobject_move() fails.
Fixes: 42f921a6f10c (cpufreq: remove sysfs files for CPUs which failed to come back after resume) Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Reported-and-tested-by: Bu Yitian <ybu@qti.qualcomm.com> Reported-by: Saravana Kannan <skannan@codeaurora.org> Reviewed-by: Srivatsa S. Bhat <srivatsa@mit.edu> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>