Eric Sandeen [Tue, 21 Jan 2014 22:44:57 +0000 (16:44 -0600)]
xfs: clean up xfs_buftarg
Clean up the xfs_buftarg structure a bit:
- remove bt_bsize which is never used
- replace bt_sshift with bt_ssize; we only ever shift it back
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Ben Myers [Thu, 9 Jan 2014 22:03:18 +0000 (16:03 -0600)]
Merge branch 'xfs-extent-list-locking-fixes' into for-next
A set of fixes which makes sure we are taking the ilock whenever accessing the
extent list. This was associated with "Access to block zero" messages which
may result in extent list corruption.
Chuansheng Liu [Tue, 7 Jan 2014 08:53:34 +0000 (16:53 +0800)]
xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()
In case CONFIG_DEBUG_OBJECTS_WORK is defined, it is needed to
call destroy_work_on_stack() which frees the debug object to pair
with INIT_WORK_ONSTACK().
Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Jie Liu [Wed, 1 Jan 2014 11:28:03 +0000 (19:28 +0800)]
xfs: fix off-by-one error in xfs_attr3_rmt_verify
With CRC check is enabled, if trying to set an attributes value just
equal to the maximum size of XATTR_SIZE_MAX would cause the v3 remote
attr write verification procedure failure, which would yield the back
trace like below:
This patch fix it to check the remote EA size is greater than the
XATTR_SIZE_MAX rather than more than or equal to it, because it's
valid if the specified EA value size is equal to the limitation as
per VFS setxattr interface.
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
xfs: assert that we hold the ilock for extent map access
Make sure that xfs_bmapi_read has the ilock held in some way, and that
xfs_bmapi_write, xfs_bmapi_delay, xfs_bunmapi and xfs_iread_extents are
called with the ilock held exclusively.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Ben Myers [Fri, 6 Dec 2013 20:30:11 +0000 (12:30 -0800)]
xfs: reinstate the ilock in xfs_readdir
Although it was removed in commit 051e7cd44ab8, ilock needs to be taken in
xfs_readdir because we might have to read the extent list in from disk. This
keeps other threads from reading from or writing to the extent list while it is
being read in and is still in a transitional state.
This has been associated with "Access to block zero" messages on directories
with large numbers of extents resulting from excessive filesytem fragmentation,
as well as extent list corruption. Unfortunately no test case at this point.
Signed-off-by: Ben Myers <bpm@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
Dave Chinner [Thu, 12 Dec 2013 05:34:38 +0000 (16:34 +1100)]
xfs: abort metadata writeback on permanent errors
If we are doing aysnc writeback of metadata, we can get write errors
but have nobody to report them to. At the moment, we simply attempt
to reissue the write from io completion in the hope that it's a
transient error.
When it's not a transient error, the buffer is stuck forever in
this loop, and we cannot break out of it. Eventually, unmount will
hang because the AIL cannot be emptied and everything goes downhill
from them.
To solve this problem, only retry the write IO once before aborting
it. We don't throw the buffer away because some transient errors can
last minutes (e.g. FC path failover) or even hours (thin
provisioned devices that have run out of backing space) before they
go away. Hence we really want to keep trying until we can't try any
more.
Because the buffer was not cleaned, however, it does not get removed
from the AIL and hence the next pass across the AIL will start IO on
it again. As such, we still get the "retry forever" semantics that
we currently have, but we allow other access to the buffer in the
mean time. Meanwhile the filesystem can continue to modify the
buffer and relog it, so the IO errors won't hang the log or the
filesystem.
Now when we are pushing the AIL, we can see all these "permanent IO
error" buffers and we can issue a warning about failures before we
retry the IO. We can also catch these buffers when unmounting an
issue a corruption warning, too.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Thu, 12 Dec 2013 05:34:36 +0000 (16:34 +1100)]
xfs: swalloc doesn't align allocations properly
When swalloc is specified as a mount option, allocations are
supposed to be aligned to the stripe width rather than the stripe
unit of the underlying filesystem. However, it does not do this.
What the implementation does is round up the allocation size to a
stripe width, hence ensuring that all allocations span a full stripe
width. It does not, however, ensure that that allocation is aligned
to a stripe width, and hence the allocations can span multiple
underlying stripes and so still see RMW cycles for things like
direct IO on MD RAID.
So, if the swalloc mount option is set, change the allocation
alignment in xfs_bmap_btalloc() to use the stripe width rather than
the stripe unit.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
The xfsbdstrat helper is a small but useless wrapper for xfs_buf_iorequest that
handles the case of a shut down filesystem. Most of the users have private,
uncached buffers that can just be freed in this case, but the complex error
handling in xfs_bioerror_relse messes up the case when it's called without
a locked buffer.
Remove xfsbdstrat and opencode the error handling in the callers. All but
one can simply return an error and don't need to deal with buffer state,
and the one caller that cares about the buffer state could do with a major
cleanup as well, but we'll defer that to later.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Thu, 21 Nov 2013 23:41:16 +0000 (10:41 +1100)]
xfs: align initial file allocations correctly
The function xfs_bmap_isaeof() is used to indicate that an
allocation is occurring at or past the end of file, and as such
should be aligned to the underlying storage geometry if possible.
Commit 27a3f8f ("xfs: introduce xfs_bmap_last_extent") changed the
behaviour of this function for empty files - it turned off
allocation alignment for this case accidentally. Hence large initial
allocations from direct IO are not getting correctly aligned to the
underlying geometry, and that is cause write performance to drop in
alignment sensitive configurations.
Fix it by considering allocation into empty files as requiring
aligned allocation again.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit f9b395a8ef8f34d19cae2cde361e19c96e097fad)
Namjae Jeon [Sun, 8 Dec 2013 14:33:50 +0000 (23:33 +0900)]
MAINTAINERS: fix incorrect mail address of XFS maintainer
When I tried to send the patches to XFS Maintainers,
I got returned mail included delivery fail message for Dave's mail.
Maybe, Dave Chinner mail address is incorrect.
I try to fix it correctly.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit db10bddc7d4f412bcd8630fc479fa1eb009e325b)
Jie Liu [Tue, 26 Nov 2013 13:38:49 +0000 (21:38 +0800)]
xfs: fix infinite loop by detaching the group/project hints from user dquot
xfs_quota(8) will hang up if trying to turn group/project quota off
before the user quota is off, this could be 100% reproduced by:
# mount -ouquota,gquota /dev/sda7 /xfs
# mkdir /xfs/test
# xfs_quota -xc 'off -g' /xfs <-- hangs up
# echo w > /proc/sysrq-trigger
# dmesg
It's fine if we turn user quota off at first, then turn off other
kind of quotas if they are enabled since the group/project dquot
refcount is decreased to zero once the user quota if off. Otherwise,
those dquots refcount is non-zero due to the user dquot might refer
to them as hint(s). Hence, above operation cause an infinite loop
at xfs_qm_dquot_walk() while trying to purge dquot cache.
This problem has been around since Linux 3.4, it was introduced by:
[ b84a3a9675 xfs: remove the per-filesystem list of dquots ]
Originally we will release the group dquot pointers because the user
dquots maybe carrying around as a hint via xfs_qm_detach_gdquots().
However, with above change, there is no such work to be done before
purging group/project dquot cache.
In order to solve this problem, this patch introduces a special routine
xfs_qm_dqpurge_hints(), and it would release the group/project dquot
pointers the user dquots maybe carrying around as a hint, and then it
will proceed to purge the user dquot cache if requested.
Cc: stable@vger.kernel.org Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit df8052e7dae00bde6f21b40b6e3e1099770f3afc)
Jie Liu [Tue, 26 Nov 2013 13:38:34 +0000 (21:38 +0800)]
xfs: fix assertion failure at xfs_setattr_nonsize
For CRC enabled v5 super block, change a file's ownership can simply
trigger an ASSERT failure at xfs_setattr_nonsize() if both group and
project quota are enabled, i.e,
This fix adjust the assertion to check if the super block support both
quota inodes or not.
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 5a01dd54f4a7fb513062070c5acef20d13cad980)
We're hitting the ASSERT(XFS_IS_*QUOTA_ON(mp)) in xfs_qm_vop_create_dqattach(),
however the assertion itself is not right IMHO. While performing quota off, we
firstly clear the XFS_*QUOTA_ACTIVE bit(s) from struct xfs_mount without taking
any special locks, see xfs_qm_scall_quotaoff(). Hence there is no guarantee
that the desired quota is still active.
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 37eb9706ebf5b99d14c6086cdeef2c2f73f9c9fb)
Mark Tinguely [Sun, 6 Oct 2013 02:48:25 +0000 (21:48 -0500)]
xfs: fix memory leak in xfs_dir2_node_removename
Fix the leak of kernel memory in xfs_dir2_node_removename()
when xfs_dir2_leafn_remove() returns an error code.
Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit ef701600fd26cace9d513ee174688a2b83832126)
With the new iop_format scheme there is no need to have a temporary buffer
to format logged extents into, we can do so directly into the CIL. This
also allows to remove the shortcut for big endian systems that probably
hasn't gotten a lot of test coverage for a long time.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
xfs: format log items write directly into the linear CIL buffer
Instead of setting up pointers to memory locations in iop_format which then
get copied into the CIL linear buffer after return move the copy into
the individual inode items. This avoids the need to always have a memory
block in the exact same layout that gets written into the log around, and
allow the log items to be much more flexible in their in-memory layouts.
The only caveat is that we need to properly align the data for each
iovec so that don't have structures misaligned in subsequent iovecs.
Note that all log item format routines now need to be careful to modify
the copy of the item that was placed into the CIL after calls to
xlog_copy_iovec instead of the in-memory copy.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
Add a helper to abstract out filling the log iovecs in the log item
format handlers. This will allow us to change the way we do the log
item formatting more easily.
The copy in the name is a bit confusing for now as it just assigns a
pointer and lets the CIL code perform the copy, but that will change
soon.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
Dave Chinner [Thu, 21 Nov 2013 23:41:16 +0000 (10:41 +1100)]
xfs: align initial file allocations correctly
The function xfs_bmap_isaeof() is used to indicate that an
allocation is occurring at or past the end of file, and as such
should be aligned to the underlying storage geometry if possible.
Commit 27a3f8f ("xfs: introduce xfs_bmap_last_extent") changed the
behaviour of this function for empty files - it turned off
allocation alignment for this case accidentally. Hence large initial
allocations from direct IO are not getting correctly aligned to the
underlying geometry, and that is cause write performance to drop in
alignment sensitive configurations.
Fix it by considering allocation into empty files as requiring
aligned allocation again.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
Namjae Jeon [Sun, 8 Dec 2013 14:33:50 +0000 (23:33 +0900)]
MAINTAINERS: fix incorrect mail address of XFS maintainer
When I tried to send the patches to XFS Maintainers,
I got returned mail included delivery fail message for Dave's mail.
Maybe, Dave Chinner mail address is incorrect.
I try to fix it correctly.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
If we are using a large directory block size, and memory becomes
fragmented, we can get memory allocation failures trying to
kmem_alloc(64k) for a temporary buffer. However, there is not need
for a directory buffer sized allocation, as the end result ends up
in the inode literal area. This is, at most, slightly less than 2k
of space, and hence we don't need an allocation larger than that
fora temporary buffer.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
For V4 filesystems, we don't have a agfl header structure, and so
XFS_AGFL_SIZE() returns an entire sector's worth of entries, which
we then index from an offset into the sector. Hence: buffer overrun.
This problem was introduced in 3.10 by commit 77c95bba ("xfs: add
CRC checks to the AGFL") which changed the AGFL structure but failed
to update the growfs code to handle the different structures.
Fix it by using the correct offset into the buffer for both V4 and
V5 filesystems.
Cc: <stable@vger.kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit b7d961b35b3ab69609aeea93f870269cb6e7ba4d)
Jie Liu [Wed, 20 Nov 2013 08:08:53 +0000 (16:08 +0800)]
xfs: don't perform discard if the given range length is less than block size
For discard operation, we should return EINVAL if the given range length
is less than a block size, otherwise it will go through the file system
to discard data blocks as the end range might be evaluated to -1, e.g,
# fstrim -v -o 0 -l 100 /xfs7
/xfs7: 9811378176 bytes were trimmed
This issue can be triggered via xfstests/generic/288.
Also, it seems to get the request queue pointer via bdev_get_queue()
instead of the hard code pointer dereference is not a bad thing.
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit f9fd0135610084abef6867d984e9951c3099950d)
Jie Liu [Tue, 26 Nov 2013 13:38:49 +0000 (21:38 +0800)]
xfs: fix infinite loop by detaching the group/project hints from user dquot
xfs_quota(8) will hang up if trying to turn group/project quota off
before the user quota is off, this could be 100% reproduced by:
# mount -ouquota,gquota /dev/sda7 /xfs
# mkdir /xfs/test
# xfs_quota -xc 'off -g' /xfs <-- hangs up
# echo w > /proc/sysrq-trigger
# dmesg
It's fine if we turn user quota off at first, then turn off other
kind of quotas if they are enabled since the group/project dquot
refcount is decreased to zero once the user quota if off. Otherwise,
those dquots refcount is non-zero due to the user dquot might refer
to them as hint(s). Hence, above operation cause an infinite loop
at xfs_qm_dquot_walk() while trying to purge dquot cache.
This problem has been around since Linux 3.4, it was introduced by:
[ b84a3a9675 xfs: remove the per-filesystem list of dquots ]
Originally we will release the group dquot pointers because the user
dquots maybe carrying around as a hint via xfs_qm_detach_gdquots().
However, with above change, there is no such work to be done before
purging group/project dquot cache.
In order to solve this problem, this patch introduces a special routine
xfs_qm_dqpurge_hints(), and it would release the group/project dquot
pointers the user dquots maybe carrying around as a hint, and then it
will proceed to purge the user dquot cache if requested.
Cc: stable@vger.kernel.org Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Jie Liu [Tue, 26 Nov 2013 13:38:34 +0000 (21:38 +0800)]
xfs: fix assertion failure at xfs_setattr_nonsize
For CRC enabled v5 super block, change a file's ownership can simply
trigger an ASSERT failure at xfs_setattr_nonsize() if both group and
project quota are enabled, i.e,
Split out a xfs_setattr_time helper to share code between truncate and
regular setattr similar to xfs_setattr_mode. I might also have another
caller growing for this in the near future.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
We're hitting the ASSERT(XFS_IS_*QUOTA_ON(mp)) in xfs_qm_vop_create_dqattach(),
however the assertion itself is not right IMHO. While performing quota off, we
firstly clear the XFS_*QUOTA_ACTIVE bit(s) from struct xfs_mount without taking
any special locks, see xfs_qm_scall_quotaoff(). Hence there is no guarantee
that the desired quota is still active.
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
Jie Liu [Fri, 22 Nov 2013 16:15:43 +0000 (00:15 +0800)]
xfs: integrate xfs_quota_priv header file to xfs_qm
The xfs_quota_priv header file is only included by xfs_qm header and
there is no much users for its contents, hence we can move those stuff
to xfs_qm header file and kill it.
This patch also remove an unused macro DQFLAGTO_TYPESTR.
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
Jie Liu [Fri, 22 Nov 2013 06:04:00 +0000 (14:04 +0800)]
xfs: make quota metadata truncation behavior consistent to user space
In xfs_qm_scall_trunc_qfiles(), we ignore the error if failed to remove
the users quota metadata and proceed to remove groups and projects if
they are being there. However, in user space, the remove operation will
break and return if failed to remove any kind of quota.
Also for v5 super block, we can enabled both group and project quota at
the same time, in this case the current error handling will cover the
group error with projects but they might failed due to different reasons.
It seems we'd better the error handling consistent to the user space and
don't trying to remove another kind of quota metadata if the previous
operation is failed.
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
Mark Tinguely [Wed, 2 Oct 2013 12:51:12 +0000 (07:51 -0500)]
xfs: free the list of recovery items on error
Recovery builds a list of items on the transaction's
r_itemq head. Normally these items are committed and freed.
But in the event of a recovery error, these allocations
are leaked.
If the error occurs during item reordering, then reconstruct
the r_itemq list before deleting the list to avoid leaking
the entries that were on one of the temporary lists.
Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Currently notify_change directly updates i_version for size updates,
which not only is counter to how all other fields are updated through
struct iattr, but also breaks XFS, which need inode updates to happen
under its own lock, and synchronized to the structure that gets written
to the log.
Remove the update in the common code, and it to btrfs and ext4,
XFS already does a proper updaste internally and currently gets a
double update with the existing code.
IMHO this is 3.13 and -stable material and should go in through the XFS
tree.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Acked-by: Jan Kara <jack@suse.cz> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Ben Myers <bpm@sgi.com>
For V4 filesystems, we don't have a agfl header structure, and so
XFS_AGFL_SIZE() returns an entire sector's worth of entries, which
we then index from an offset into the sector. Hence: buffer overrun.
This problem was introduced in 3.10 by commit 77c95bba ("xfs: add
CRC checks to the AGFL") which changed the AGFL structure but failed
to update the growfs code to handle the different structures.
Fix it by using the correct offset into the buffer for both V4 and
V5 filesystems.
Cc: <stable@vger.kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Jie Liu [Wed, 20 Nov 2013 08:08:53 +0000 (16:08 +0800)]
xfs: don't perform discard if the given range length is less than block size
For discard operation, we should return EINVAL if the given range length
is less than a block size, otherwise it will go through the file system
to discard data blocks as the end range might be evaluated to -1, e.g,
# fstrim -v -o 0 -l 100 /xfs7
/xfs7: 9811378176 bytes were trimmed
This issue can be triggered via xfstests/generic/288.
Also, it seems to get the request queue pointer via bdev_get_queue()
instead of the hard code pointer dereference is not a bad thing.
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
Linus Torvalds [Fri, 29 Nov 2013 17:57:13 +0000 (09:57 -0800)]
Merge tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64
Pull ARM64 fixes from Catalin Marinas:
- Remove preempt_count modifications in the arm64 IRQ handling code
since that's already dealt with in generic irq_enter/irq_exit
- PTE_PROT_NONE bit moved higher up to avoid overlapping with the
hardware bits (for PROT_NONE mappings which are pte_present)
- Big-endian fixes for ptrace support
- Asynchronous aborts unmasking while in the kernel
- pgprot_writecombine() change to create Normal NonCacheable memory
rather than Device GRE
* tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
arm64: Move PTE_PROT_NONE higher up
arm64: Use Normal NonCacheable memory for writecombine
arm64: debug: make aarch32 bkpt checking endian clean
arm64: ptrace: fix compat registes get/set to be endian clean
arm64: Unmask asynchronous aborts when in kernel mode
arm64: dts: Reserve the memory used for secondary CPU release address
arm64: let the core code deal with preempt_count
Linus Torvalds [Fri, 29 Nov 2013 17:56:15 +0000 (09:56 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
"One performance improvement and a few bug fixes. Two of the fixes
deal with the clock related problems we have seen on recent kernels"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: handle asce-type exceptions as normal page fault
s390,time: revert direct ktime path for s390 clockevent device
s390/time,vdso: convert to the new update_vsyscall interface
s390/uaccess: add missing page table walk range check
s390/mm: optimize copy_page
s390/dasd: validate request size before building CCW/TCW request
s390/signal: always restore saved runtime instrumentation psw bit
Linus Torvalds [Fri, 29 Nov 2013 17:55:13 +0000 (09:55 -0800)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Some easy but needed fixes for i2c drivers since rc1"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: bcm2835: Linking platform nodes to adapter nodes
i2c: omap: raw read and write endian fix
i2c: i2c-bcm-kona: Fix module build
i2c: i2c-diolan-u2c: different usb endpoints for DLN-2-U2C
i2c: bcm-kona: remove duplicated include
i2c: davinci: raw read and write endian fix
Linus Torvalds [Fri, 29 Nov 2013 17:49:08 +0000 (09:49 -0800)]
Merge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
"This contains one important fix. The NUMA support added a while back
broke ordering guarantees on ordered workqueues. It was enforced by
having single frontend interface with @max_active == 1 but the NUMA
support puts multiple interfaces on unbound workqueues on NUMA
machines thus breaking the ordered guarantee. This is fixed by
disabling NUMA support on ordered workqueues.
The above and a couple other patches were sitting in for-3.12-fixes
but I forgot to push that out, so they ended up waiting a bit too
long. My aplogies.
Other fixes are minor"
* 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix pool ID allocation leakage and remove BUILD_BUG_ON() in init_workqueues
workqueue: fix comment typo for __queue_work()
workqueue: fix ordered workqueues in NUMA setups
workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY
Linus Torvalds [Fri, 29 Nov 2013 17:48:25 +0000 (09:48 -0800)]
Merge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"libata device removal path was removing parent device node before its
child, which is mostly harmless but triggers warning after recent
sysfs changes. Rafael's patch fixes the order.
Other than that, minor controller-specific fixes and device ID
additions"
* 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ATA: Fix port removal ordering
ahci: add Marvell 9230 to the AHCI PCI device list
ata: fix acpi_bus_get_device() return value check
pata_arasan_cf: add missing clk_disable_unprepare() on error path
ahci: add support for IBM Akebono platform device
Linus Torvalds [Fri, 29 Nov 2013 17:47:06 +0000 (09:47 -0800)]
Merge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
"Fixes for three issues.
- cgroup destruction path could swamp system_wq possibly leading to
deadlock. This actually seems to happen in the wild with memcg
because memcg destruction path adds nested dependency on system_wq.
Resolved by isolating cgroup destruction work items on its
dedicated workqueue.
- Possible locking context deadlock through seqcount reported by
lockdep
- Memory leak under certain conditions"
* 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix cgroup_subsys_state leak for seq_files
cpuset: Fix memory allocator deadlock
cgroup: use a dedicated workqueue for cgroup destruction
Linus Torvalds [Fri, 29 Nov 2013 17:36:42 +0000 (09:36 -0800)]
Merge tag 'sound-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Quite a few HD-Audio fixes, a WUSB audio fix and a fix for FireWire
audio. The HD-audio part contains a couple of fixes for the generic
parser, and these are the only intrusive fixes. The rest are mostly
device-specific fixes"
* tag 'sound-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add LFE chmap to ASUS ET2700
ALSA: hda - Initialize missing bass speaker pin for ASUS AIO ET2700
ALSA: hda - limit mic boost on Asus UX31[A,E]
ALSA: hda - Check leaf nodes to find aamix amps
ALSA: hda - Fix hp-mic mode without VREF bits
ALSA: hda - Create Headhpone Mic Jack Mode when really needed
ALSA: usb: use multiple packets per urb for Wireless USB inbound audio
ALSA: hda - Enable mute/mic-mute LEDs for more Thinkpads with Conexant codec
ALSA: hda - Drop bus->avoid_link_reset flag
ALSA: hda/realtek - Set pcbeep amp for ALC668
ALSA: hda/realtek - Add support of ALC231 codec
ALSA: firewire-lib: fix wrong value for FDF field as an empty packet
Linus Torvalds [Fri, 29 Nov 2013 17:27:19 +0000 (09:27 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs dentry reference count fix from Al Viro.
This fixes a possible inode_permission NULL pointer dereference (and
other problems) that were due to the root dentry count being decremented
too much. In commit 48a066e72d97 ("RCU'd vfsmounts") the placement of
clearing the LOOKUP_RCU bit changed, and we then returned failure of
incrementing the lockref on the parent dentry with LOOKUP_RCU cleared.
But that meant we needed to go through the same cleanup routines that
the later failures did wrt LOOKUP_ROOT and nd->root.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix bogus path_put() of nd->root after some unlazy_walk() failures
Catalin Marinas [Wed, 27 Nov 2013 16:59:27 +0000 (16:59 +0000)]
arm64: Move PTE_PROT_NONE higher up
PTE_PROT_NONE means that a pte is present but does not have any
read/write attributes. However, setting the memory type like
pgprot_writecombine() is allowed and such bits overlap with
PTE_PROT_NONE. This causes mmap/munmap issues in drivers that change the
vma->vm_pg_prot on PROT_NONE mappings.
This patch reverts the PTE_FILE/PTE_PROT_NONE shift in commit 59911ca4325d (ARM64: mm: Move PTE_PROT_NONE bit) and moves PTE_PROT_NONE
together with the other software bits.
Signed-off-by: Steve Capper <steve.capper@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Steve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> # 3.11+
Catalin Marinas [Fri, 29 Nov 2013 10:56:14 +0000 (10:56 +0000)]
arm64: Use Normal NonCacheable memory for writecombine
This provides better performance compared to Device GRE and also allows
unaligned accesses. Such memory is intended to be used with standard RAM
(e.g. framebuffers) and not I/O.
Al Viro [Fri, 29 Nov 2013 06:48:32 +0000 (01:48 -0500)]
fix bogus path_put() of nd->root after some unlazy_walk() failures
Failure to grab reference to parent dentry should go through the
same cleanup as nd->seq mismatch. As it is, we might end up with
caller thinking it needs to path_put() nd->root, with obvious
nasty results once we'd hit that bug enough times to drive the
refcount of root dentry all the way to zero...
Linus Torvalds [Thu, 28 Nov 2013 17:57:46 +0000 (09:57 -0800)]
Merge tag 'gpio-v3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Here us a bunch of patches for the v3.13 series. Most important stuff
is related to fixes and documentation for the new GPIO descriptor API.
If the diffstat is scary you'll notice most of it is to
Documentation/*:
- A big slew of documentation for the gpiod transition that happened
in the merge window, no semantic effect, but we should provide
proper documentation with the new API.
- Fix flags related to the new API.
- Fix to the find_chip_by_name() lookup function related to the new
API.
- Fix of_find_gpio() when not using device tree.
- Bug fix for the TB10x direction setting.
- Error path fixes from Dan Carpenter.
- Nasty IRQdomain bug relating to taking an unitialized spinlock.
- Minor fixes here and there"
* tag 'gpio-v3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: bcm281xx: Fix return value of bcm_kona_gpio_get()
gpio: pl061: move irqdomain initialization
gpio: ucb1400: Add MODULE_ALIAS
gpiolib: fix of_find_gpio() when OF not defined
gpio: fix memory leak in error path
gpio: rcar: NULL dereference on error in probe()
gpio: msm: make msm_gpio.summary_irq signed for error handling
gpio: mvebu: make mvchip->irqbase signed for error handling
gpiolib: use dedicated flags for GPIO properties
gpiolib: fix find_chip_by_name()
Documentation: gpiolib: document new interface
gpio: tb10x: Set output value before setting direction to output
Linus Torvalds [Thu, 28 Nov 2013 17:51:39 +0000 (09:51 -0800)]
Merge tag 'md/3.13-fixes' of git://neil.brown.name/md
Pull md fixes from Neil Brown:
"Three bug fixes for md in 3.13-rc
All recent regressions, one in 3.12 so marked for -stable"
* tag 'md/3.13-fixes' of git://neil.brown.name/md:
md/raid5: fix newly-broken locking in get_active_stripe.
md: test mddev->flags more safely in md_check_recovery.
md/raid5: fix new memory-reference bug in alloc_thread_groups.
Linus Torvalds [Thu, 28 Nov 2013 17:50:25 +0000 (09:50 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"SMB3 "validate negotiate" is needed to prevent certain types of
downgrade attacks.
Also changes SMB2/SMB3 copy offload from using the BTRFS copy ioctl
(BTRFS_IOC_CLONE) to a cifs specific ioctl (CIFS_IOC_COPYCHUNK_FILE)
to address Christoph's comment that there are semantic differences
between requesting copy offload in which copy-on-write is mandatory
(as in the BTRFS ioctl) and optional in the SMB2/SMB3 case. Also
fixes SMB2/SMB3 copychunk for large files"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
[CIFS] Do not use btrfs refcopy ioctl for SMB2 copy offload
Check SMB3 dialects against downgrade attacks
Removed duplicated (and unneeded) goto
CIFS: Fix SMB2/SMB3 Copy offload support (refcopy) for large files
Helge Deller [Thu, 28 Nov 2013 08:16:33 +0000 (09:16 +0100)]
kernel/extable: fix address-checks for core_kernel and init areas
The init_kernel_text() and core_kernel_text() functions should not
include the labels _einittext and _etext when checking if an address is
inside the .text or .init sections.
Takashi Iwai [Thu, 28 Nov 2013 14:21:21 +0000 (15:21 +0100)]
ALSA: hda - Initialize missing bass speaker pin for ASUS AIO ET2700
Add a fixup entry for the missing bass speaker pin 0x16 on ASUS ET2700
AiO desktop. The channel map will be added in the next patch, so that
this can be backported easily to stable kernels.
Takashi Iwai [Thu, 28 Nov 2013 10:05:28 +0000 (11:05 +0100)]
ALSA: hda - Check leaf nodes to find aamix amps
The current generic parser assumes blindly that the volume and mute
amps are found in the aamix node itself. But on some codecs,
typically Analog Devices ones, the aamix amps are separately
implemented in each leaf node of the aamix node, and the current
driver can't establish the correct amp controls. This is a regression
compared with the previous static quirks.
This patch extends the search for the amps to the leaf nodes for
allowing the aamix controls again on such codecs.
In this implementation, I didn't code to loop through the whole paths,
since usually one depth should suffice, and we can't search too
deeply, as it may result in the conflicting control assignments.
Florian Meier [Mon, 25 Nov 2013 08:01:50 +0000 (09:01 +0100)]
i2c: bcm2835: Linking platform nodes to adapter nodes
In order to find I2C devices in the device tree, the platform nodes
have to be known by the I2C core. This requires setting the
dev.of_node parameter of the adapter.
Signed-off-by: Florian Meier <florian.meier@koalo.de> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Linus Torvalds [Thu, 28 Nov 2013 05:06:01 +0000 (21:06 -0800)]
Merge tag 'tty-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some tty/serial driver fixes for reported issues in 3.13-rc2.
The n_gsm "fix" was reverted as it was found to not be correct.
Hopefully this will be resolved in a future pull request, but as
there's really only one user of this line setting, it's not a big
deal..."
* tag 'tty-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "n_gsm: race between ld close and gsmtty open"
n_tty: Protect minimum_to_wake reset for concurrent readers
tty: Reset hupped state on open
TTY: amiserial, add missing platform check
TTY: pmac_zilog, check existence of ports in pmz_console_init()
n_gsm: race between ld close and gsmtty open
tty/serial/8250: fix typo in help text
n_tty: Fix 4096-byte canonical reads
n_tty: Fix echo overrun tail computation
n_tty: Ensure reader restarts worker for next reader
Linus Torvalds [Thu, 28 Nov 2013 05:05:31 +0000 (21:05 -0800)]
Merge tag 'staging-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging fixes from Greg KH:
"Here are a number of staging, and IIO driver, fixes for 3.13-rc2 that
resolve issues that have been reported for 3.13-rc1. All of these
have been in linux-next for a bit this week"
* tag 'staging-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (25 commits)
Staging: tidspbridge: disable driver
staging: zsmalloc: Ensure handle is never 0 on success
staging/lustre/ptlrpc: fix ptlrpc_stop_pinger logic
staging: r8188eu: Fix AP mode
Staging: btmtk_usb: Add hdev parameter to hdev->send driver callback
Staging: go7007: fix up some remaining go->dev issues
staging: imx-drm: Fix modular build of DRM_IMX_IPUV3
staging: ft1000: fix use of potentially uninitialized variable
Revert "staging:media: Use dev_dbg() instead of pr_debug()"
Staging: zram: Fix memory leak by refcount mismatch
staging: vt6656: [BUG] Fix for TX USB resets from vendors driver.
staging: nvec: potential NULL dereference on error path
Staging: vt6655-6: potential NULL dereference in hostap_disable_hostapd()
staging: comedi: s626: fix value written by s626_set_dac()
Staging: comedi: pcl730: fix some bitwise vs logical AND bugs
staging: comedi: fix potentially uninitialised variable
iio:accel:kxsd9 fix missing mutex unlock
iio: adc: ti_am335x_adc: avoid double free of buffer.
staging:iio: Fix hmc5843 Kconfig dependencies
iio: Fix tcs3472 Kconfig dependencies
...
Linus Torvalds [Thu, 28 Nov 2013 05:04:37 +0000 (21:04 -0800)]
Merge tag 'driver-core-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are 3 patches for sysfs issues that have been reported. Well, 1
patch really, the first one is reverted as it's not really needed (the
correct fix is coming in through the different driver subsystems
instead)
But that 1 sysfs fix is needed, so this is still a good thing to pull
in now"
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tag 'driver-core-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Revert "sysfs: handle duplicate removal attempts in sysfs_remove_group()"
sysfs: use a separate locking class for open files depending on mmap
sysfs: handle duplicate removal attempts in sysfs_remove_group()
Linus Torvalds [Thu, 28 Nov 2013 04:41:54 +0000 (20:41 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- fix compat ioctl leak in uhid, by David Herrmann
- fix scheduling in atomic context (causing actual lockups in real
world) in hid-sony driver, by Sven Eckelmann
- revert patch introducing VID/PID conflict, by Jiri Kosina
- support from various new device IDs by Benjamin Tissoires and
KaiChung Cheng
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: uhid: fix leak for 64/32 UHID_CREATE
HID: kye: fix unresponsive keyboard
HID: kye: Add report fixup for Genius Manticore Keyboard
HID: multicouh: add PID VID to support 1 new Wistron optical touch device
HID: appleir: force input to be set
Revert "HID: wiimote: add LEGO-wiimote VID"
HID: sony: Send FF commands in non-atomic context
Linus Torvalds [Thu, 28 Nov 2013 04:40:35 +0000 (20:40 -0800)]
Merge tag 'pm+acpi-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
- Fix for a recent regression in the Tegra cpufreq driver causing
excess error messages to be printed from Stephen Warren
- ACPI-based device hotplug fix to prevent conflicting notify handlers
from being installed for PCI host bridge objects. From Toshi Kani
- ACPICA update to upstream version 20131115. This contains bug fixes
mostly (loop termination fix for the get AML length function, fixes
related to namespace node removal and debug output). From Bob Moore,
Tomasz Nowicki and Lv Zheng
- Removal of incorrect inclusions of internal ACPICA header files by
non-ACPICA code from Lv Zheng
- Fixes for the ACPI sysfs interface exposing tables to user space from
Daisuke Hatayama and Jeremy Compostella
- Assorted ACPI and cpufreq cleanups from Sachin Kamat and Al Stone
- cpupower tool fix and man page from Thomas Renninger
* tag 'pm+acpi-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: Clean up incorrect inclusions of ACPICA headers
tools: cpupower: fix wrong err msg not supported vs not available
tools: cpupower: Add cpupower-idle-set(1) manpage
ACPI / sysfs: Fix incorrect ACPI tables walk in acpi_tables_sysfs_init()
ACPI / sysfs: Set file size for each exposed ACPI table
ACPICA: Update version to 20131115.
ACPICA: Add support to delete all objects attached to the root namespace node.
ACPICA: Delete all attached data objects during namespace node deletion.
ACPICA: Resources: Fix loop termination for the get AML length function.
ACPICA: Tests: Add CHECKSUM_ABORT protection for test utilities.
ACPICA: Debug output: Do not emit function nesting level for kernel build.
ACPI / sleep: clean up compiler warning about uninitialized field
cpufreq: exynos: Remove unwanted EXPORT_SYMBOL
cpufreq: tegra: don't error target() when suspended
ACPI / hotplug: Fix conflicted PCI bridge notify handlers