Move the buffer locking into the callers as they need to do it
wether they call xfs_map_at_offset or not. Remove the b_bdev
assignment, which is already done by get_blocks. Remove the
duplicate extent type asserts in xfs_convert_page just before
calling xfs_map_at_offset.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
After the last patches the code for overwrites is the same as for
delayed and unwritten extents except that it doesn't need to call
xfs_map_at_offset. Take care of that fact to simplify
xfs_vm_writepage.
The buffer loop now first checks the type of buffer and checks/sets
the ioend type, or continues to the next buffer if it's not
interesting to us. Only after that we validate the iomap and
perform the block mapping if needed, all in common code for the
cases where we have to do work.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs_map_blocks always calls xfs_bmapi with the XFS_BMAPI_ENTIRE
entire flag, which tells it to not cap the extent at the passed in
size, but just treat the size as an minimum to map. This means
xfs_probe_cluster is entirely useless as we'll always get the whole
extent back anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
No need to lock the extent map exclusive when performing an
overwrite, we know the extent map must already have been loaded by
get_blocks. Apply the non-blocking inode semantics to all mapping
types instead of just delayed allocations. Remove the handling of
not yet allocated blocks for the IO_UNWRITTEN case - if an extent is
marked as unwritten allocated in the buffer it must already have an
extent on disk.
Add asserts to verify all the assumptions above in debug builds.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Opencode the xfs_iomap code in it's two callers. The overlap of
passed flags already was minimal and will be further reduced in the
next patch.
As a side effect the BMAPI_* flags for xfs_bmapi and the IO_* flags
for I/O end processing are merged into a single set of flags, which
should be a bit more descriptive of the operation we perform.
Also improve the tracing by giving each caller it's own type set of
tracepoints.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Remove passing the BMAPI_* flags to these helpers, in
xfs_iomap_write_direct the check BMAPI_DIRECT was always true, and
in the xfs_iomap_write_delay path is was never checked at all.
Remove the nmap return value as we never make use of it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: a few small tweaks for overwrites in xfs_vm_writepage
Don't trylock the buffer. We are the only one ever locking it for a
regular file address space, and trylock was only copied from the
generic code which did it due to the old buffer based writeout in
jbd. Also make sure to only write out the buffer if the iomap
actually is valid, because we wouldn't have a proper mapping
otherwise. In practice we will never get an invalid mapping here as
the page lock guarantees truncate doesn't race with us, but better
be safe than sorry. Also make sure we allocate a new ioend when
crossing boundaries between mappings, just like we do for delalloc
and unwritten extents. Again this currently doesn't matter as the
I/O end handler only cares for the boundaries for unwritten extents,
but this makes the code fully correct and the same as for
delalloc/unwritten extents.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
We'll never have BIO_EOPNOTSUPP set after calling submit_bio as this
can only happen for discards, and used to happen for barriers, none
of which is every submitted by xfs_submit_ioend_bio. Also remove
the loop around bio_alloc as it will never fail due to it's mempool
backing.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: improve mapping type check in xfs_vm_writepage
Currently we only refuse a "read-only" mapping for writing out
unwritten and delayed buffers, and refuse any other for overwrites.
Improve the checks to require delalloc mappings for delayed buffers,
and unwritten extent mappings for unwritten extents.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Merge the call to xlog_recover_reorder_trans and the loop over the
recovery items from xlog_recover_do_trans into xlog_recover_commit_trans,
and keep the switch statement over the log item types as a separate helper.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: remove leftovers of old buffer log items in recovery code
XFS used to support different types of buffer log items long time
ago. Remove the switch statements checking the log item type in
various buffer recovery helpers that were left over from those days
and the rather useless xlog_recover_do_buffer_pass2 wrapper.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
Samuel Kvasnica [Fri, 19 Nov 2010 13:38:49 +0000 (13:38 +0000)]
xfs: fix exporting with left over 64-bit inodes
We now support mounting and using filesystems with 64-bit inodes
even when not mounted with the inode64 option (which now only
controls if we allocate new inodes in that space or not). Make sure
we always use large NFS file handles when exporting a filesystem
that may contain 64-bit inodes. Note that this only affects newly
generated file handles, any outstanding 32-bit file handle is still
accepted.
[hch: the comment and commit log are mine, the rest is from a patch
snipplet from Samuel]
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: log timestamp changes to the source inode in rename
Now that we don't mark VFS inodes dirty anymore for internal
timestamp changes, but rely on the transaction subsystem to push
them out, we need to explicitly log the source inode in rename after
updating it's timestamps to make sure the changes actually get
forced out by sync/fsync or an AIL push.
We already account for the fourth inode in the log reservation, as a
rename of directories needs to update the nlink field, so just
adding the xfs_trans_log_inode call is enough.
This fixes the xfsqa 065 regression introduced by:
"xfs: don't use vfs writeback for pure metadata modifications"
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Dave Chinner [Tue, 30 Nov 2010 04:15:31 +0000 (15:15 +1100)]
xfs: only run xfs_error_test if error injection is active
Recent tests writing lots of small files showed the flusher thread
being CPU bound and taking a long time to do allocations on a debug
kernel. perf showed this as the prime reason:
samples pcnt function DSO
_______ _____ ___________________________ _________________
Walking btree blocks during allocation checking them requires each
block (a cache hit, so no I/O) call xfs_error_test(), which then
does a random32() call as the first operation. IOWs, ~50% of the
CPU is being consumed just testing whether we need to inject an
error, even though error injection is not active.
Kill this overhead when error injection is not active by adding a
global counter of active error traps and only calling into
xfs_error_test when fault injection is active.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Dave Chinner [Tue, 30 Nov 2010 04:15:46 +0000 (15:15 +1100)]
xfs: avoid moving stale inodes in the AIL
When an inode has been marked stale because the cluster is being
freed, we don't want to (re-)insert this inode into the AIL. There
is a race condition where the cluster buffer may be unpinned before
the inode is inserted into the AIL during transaction committed
processing. If the buffer is unpinned before the inode item has been
committed and inserted, then it is possible for the buffer to be
released and hence processthe stale inode callbacks before the inode
is inserted into the AIL.
In this case, we then insert a clean, stale inode into the AIL which
will never get removed by an IO completion. It will, however, get
reclaimed and that triggers an assert in xfs_inode_free()
complaining about freeing an inode still in the AIL.
This race can be avoided by not moving stale inodes forward in the AIL
during transaction commit completion processing. This closes the
race condition by ensuring we never insert clean stale inodes into
the AIL. It is safe to do this because a dirty stale inode, by
definition, must already be in the AIL.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Dave Chinner [Tue, 30 Nov 2010 04:16:02 +0000 (15:16 +1100)]
xfs: delayed alloc blocks beyond EOF are valid after writeback
There is an assumption in the parts of XFS that flushing a dirty
file will make all the delayed allocation blocks disappear from an
inode. That is, that after calling xfs_flush_pages() then
ip->i_delayed_blks will be zero.
This is an invalid assumption as we may have specualtive
preallocation beyond EOF and they are recorded in
ip->i_delayed_blks. A flush of the dirty pages of an inode will not
change the state of these blocks beyond EOF, so a non-zero
deeelalloc block count after a flush is valid.
The bmap code has an invalid ASSERT() that needs to be removed, and
the swapext code has a bug in that while it swaps the data forks
around, it fails to swap the i_delayed_blks counter associated with
the fork and hence can get the block accounting wrong.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Dave Chinner [Tue, 30 Nov 2010 04:16:16 +0000 (15:16 +1100)]
xfs: push stale, pinned buffers on trylock failures
As reported by Nick Piggin, XFS is suffering from long pauses under
highly concurrent workloads when hosted on ramdisks. The problem is
that an inode buffer is stuck in the pinned state in memory and as a
result either the inode buffer or one of the inodes within the
buffer is stopping the tail of the log from being moved forward.
The system remains in this state until a periodic log force issued
by xfssyncd causes the buffer to be unpinned. The main problem is
that these are stale buffers, and are hence held locked until the
transaction/checkpoint that marked them state has been committed to
disk. When the filesystem gets into this state, only the xfssyncd
can cause the async transactions to be committed to disk and hence
unpin the inode buffer.
This problem was encountered when scaling the busy extent list, but
only the blocking lock interface was fixed to solve the problem.
Extend the same fix to the buffer trylock operations - if we fail to
lock a pinned, stale buffer, then force the log immediately so that
when the next attempt to lock it comes around, it will have been
unpinned.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Dave Chinner [Tue, 30 Nov 2010 04:14:39 +0000 (15:14 +1100)]
xfs: fix failed write truncation handling.
Since the move to the new truncate sequence we call xfs_setattr to
truncate down excessively instanciated blocks. As shown by the testcase
in kernel.org BZ #22452 that doesn't work too well. Due to the confusion
of the internal inode size, and the VFS inode i_size it zeroes data that
it shouldn't.
But full blown truncate seems like overkill here. We only instanciate
delayed allocations in the write path, and given that we never released
the iolock we can't have converted them to real allocations yet either.
The only nasty case is pre-existing preallocation which we need to skip.
We already do this for page discard during writeback, so make the delayed
allocation block punching a generic function and call it from the failed
write path as well as xfs_aops_discard_page. The callers are
responsible for ensuring that partial blocks are not truncated away,
and that they hold the ilock.
Based on a fix originally from Christoph Hellwig. This version used
filesystem blocks as the range unit.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
On-list discussion seems to suggest that the robustness fixes for printk
make this unnecessary and DaveM has also agreed in person at Kernel Summit
and on list.
The main problem with this code is once we hit a lockdep splat we always
keep oops_in_progress set, the console layer uses oops_in_progress with KMS
to decide when it should be showing the oops and not showing X, so it causes
problems around suspend/resume time when a userspace resume can cause a console
switch away from X, only if oops_in_progress is set (which is what we want
if an oops actually is in progress, but not because we had a lockdep splat
2 days prior).
Cc: David S Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 29 Nov 2010 22:36:07 +0000 (14:36 -0800)]
Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
OMAP2+: PM/serial: hold console semaphore while OMAP UARTs are disabled
OMAP: UART: don't resume UARTs that are not enabled.
Matthew Garrett [Thu, 21 Oct 2010 21:42:40 +0000 (17:42 -0400)]
tpm: Autodetect itpm devices
Some Lenovos have TPMs that require a quirk to function correctly. This can
be autodetected by checking whether the device has a _HID of INTC0102. This
is an invalid PNPid, and as such is discarded by the pnp layer - however
it's still present in the ACPI code, so we can pull it out that way. This
means that the quirk won't be automatically applied on non-ACPI systems,
but without ACPI we don't have any way to identify the chip anyway so I
don't think that's a great concern.
Signed-off-by: Matthew Garrett <mjg@redhat.com> Acked-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com> Tested-by: Jiri Kosina <jkosina@suse.cz> Tested-by: Andy Isaacson <adi@hexapodia.org> Signed-off-by: James Morris <jmorris@namei.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (24 commits)
Btrfs: don't use migrate page without CONFIG_MIGRATION
Btrfs: deal with DIO bios that span more than one ordered extent
Btrfs: setup blank root and fs_info for mount time
Btrfs: fix fiemap
Btrfs - fix race between btrfs_get_sb() and umount
Btrfs: update inode ctime when using links
Btrfs: make sure new inode size is ok in fallocate
Btrfs: fix typo in fallocate to make it honor actual size
Btrfs: avoid NULL pointer deref in try_release_extent_buffer
Btrfs: make btrfs_add_nondir take parent inode as an argument
Btrfs: hold i_mutex when calling btrfs_log_dentry_safe
Btrfs: use dget_parent where we can UPDATED
Btrfs: fix more ESTALE problems with NFS
Btrfs: handle NFS lookups properly
btrfs: make 1-bit signed fileds unsigned
btrfs: Show device attr correctly for symlinks
btrfs: Set file size correctly in file clone
btrfs: Check if dest_offset is block-size aligned before cloning file
Btrfs: handle the space_cache option properly
btrfs: Fix early enospc because 'unused' calculated with wrong sign.
...
Eric Dumazet [Thu, 25 Nov 2010 04:11:39 +0000 (04:11 +0000)]
af_unix: limit recursion level
Its easy to eat all kernel memory and trigger NMI watchdog, using an
exploit program that queues unix sockets on top of others.
lkml ref : http://lkml.org/lkml/2010/11/25/8
This mechanism is used in applications, one choice we have is to have a
recursion limit.
Other limits might be needed as well (if we queue other types of files),
since the passfd mechanism is currently limited by socket receive queue
sizes only.
Add a recursion_level to unix socket, allowing up to 4 levels.
Each time we send an unix socket through sendfd mechanism, we copy its
recursion level (plus one) to receiver. This recursion level is cleared
when socket receive queue is emptied.
Reported-by: Марк Коренберг <socketpair@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiharu Okada [Mon, 29 Nov 2010 06:18:07 +0000 (06:18 +0000)]
pch_gbe driver: The wrong of initializer entry
The wrong of initializer entry was modified.
Signed-off-by: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com> Reported-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Li [Thu, 25 Nov 2010 23:29:58 +0000 (23:29 +0000)]
ucc_geth: fix ucc halt problem in half duplex mode
In commit 58933c64(ucc_geth: Fix the wrong the Rx/Tx FIFO size),
the UCC_GETH_UTFTT_INIT is set to 512 based on the recommendation
of the QE Reference Manual. But that will sometimes cause tx halt
while working in half duplex mode.
According to errata draft QE_GENERAL-A003(High Tx Virtual FIFO
threshold size can cause UCC to halt), setting UTFTT less than
[(UTFS x (M - 8)/M) - 128] will prevent this from happening
(M is the minimum buffer size).
The patch changes UTFTT back to 256.
Signed-off-by: Li Yang <leoli@freescale.com> Cc: Jean-Denis Boyer <jdboyer@media5corp.com> Cc: Andreas Schmitz <Andreas.Schmitz@riedel.net> Cc: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nagendra Tomar [Fri, 26 Nov 2010 14:26:27 +0000 (14:26 +0000)]
inet: Fix __inet_inherit_port() to correctly increment bsockets and num_owners
inet sockets corresponding to passive connections are added to the bind hash
using ___inet_inherit_port(). These sockets are later removed from the bind
hash using __inet_put_port(). These two functions are not exactly symmetrical.
__inet_put_port() decrements hashinfo->bsockets and tb->num_owners, whereas
___inet_inherit_port() does not increment them. This results in both of these
going to -ve values.
This patch fixes this by calling inet_bind_hash() from ___inet_inherit_port(),
which does the right thing.
'bsockets' and 'num_owners' were introduced by commit a9d8f9110d7e953c
(inet: Allowing more than 64k connections and heavily optimize bind(0))
Signed-off-by: Nagendra Singh Tomar <tomer_iisc@yahoo.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Mason [Mon, 29 Nov 2010 00:56:33 +0000 (19:56 -0500)]
Btrfs: deal with DIO bios that span more than one ordered extent
The new DIO bio splitting code has problems when the bio
spans more than one ordered extent. This will happen as the
generic DIO code merges our get_blocks calls together into
a bigger single bio.
This fixes things by walking forward in the ordered extent
code finding all the overlapping ordered extents and completing them
all at once.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Linus Torvalds [Sun, 28 Nov 2010 22:09:57 +0000 (14:09 -0800)]
Export 'get_pipe_info()' to other users
And in particular, use it in 'pipe_fcntl()'.
The other pipe functions do not need to use the 'careful' version, since
they are only ever called for things that are already known to be pipes.
The normal read/write/ioctl functions are called through the file
operations structures, so if a file isn't a pipe, they'd never get
called. But pipe_fcntl() is special, and called directly from the
generic fcntl code, and needs to use the same careful function that the
splice code is using.
Cc: Jens Axboe <jaxboe@fusionio.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 28 Nov 2010 21:56:09 +0000 (13:56 -0800)]
Rename 'pipe_info()' to 'get_pipe_info()'
.. and change it to take the 'file' pointer instead of an inode, since
that's what all users want anyway.
The renaming is preparatory to exporting it to other users. The old
'pipe_info()' name was too generic and is already used elsewhere, so
before making the function public we need to use a more specific name.
Cc: Jens Axboe <jaxboe@fusionio.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Filip Aben [Thu, 25 Nov 2010 03:40:50 +0000 (03:40 +0000)]
hso: fix disable_net
The HSO driver incorrectly creates a serial device instead of a net
device when disable_net is set. It shouldn't create anything for the
network interface.
Signed-off-by: Filip Aben <f.aben@option.com> Reported-by: Piotr Isajew <pki@ex.com.pl> Reported-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Slaby [Wed, 24 Nov 2010 13:54:54 +0000 (13:54 +0000)]
NET: wan/x25_asy, move lapb_unregister to x25_asy_close_tty
We register lapb when tty is created, but unregister it only when the
device is UP. So move the lapb_unregister to x25_asy_close_tty after
the device is down.
The old behaviour causes ldisc switching to fail each second attempt,
because we noted for us that the device is unused, so we use it the
second time, but labp layer still have it registered, so it fails
obviously.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-by: Sergey Lapin <slapin@ossfans.org> Cc: Andrew Hendry <andrew.hendry@gmail.com> Tested-by: Sergey Lapin <slapin@ossfans.org> Tested-by: Mikhail Ulyanov <ulyanov.mikhail@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
We were truncating the number of unicast and multicast MAC addresses
supported. Additionally, we were incorrectly computing the MAC Address
hash (a "1 << N" where we needed a "1ULL << N").
Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cyrill Gorcunov [Tue, 23 Nov 2010 11:43:44 +0000 (11:43 +0000)]
net, ppp: Report correct error code if unit allocation failed
Allocating unit from ird might return several error codes
not only -EAGAIN, so it should not be changed and returned
precisely. Same time unit release procedure should be invoked
only if device is unregistering.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net>
au1000_eth: fix invalid address accessing the MAC enable register
"aup->enable" holds already the address pointing to the MAC enable
register. The bug was introduced by commit d0e7cb:
"au1000-eth: remove volatiles, switch to I/O accessors".
CC: Florian Fainelli <florian@openwrt.org> Signed-off-by: Wolfgang Grandegger <wg@denx.de> Acked-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Tue, 23 Nov 2010 02:36:56 +0000 (02:36 +0000)]
dccp: fix error in updating the GAR
This fixes a bug in updating the Greatest Acknowledgment number Received (GAR):
the current implementation does not track the greatest received value -
lower values in the range AWL..AWH (RFC 4340, 7.5.1) erase higher ones.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Josef Bacik [Fri, 19 Nov 2010 19:59:15 +0000 (14:59 -0500)]
Btrfs: setup blank root and fs_info for mount time
There is a problem with how we use sget, it searches through the list of supers
attached to the fs_type looking for a super with the same fs_devices as what
we're trying to mount. This depends on sb->s_fs_info being filled, but we don't
fill that in until we get to btrfs_fill_super, so we could hit supers on the
fs_type super list that have a null s_fs_info. In order to fix that we need to
go ahead and setup a blank root with a blank fs_info to hold fs_devices, that
way our test will work out right and then we can set s_fs_info in
btrfs_set_super, and then open_ctree will simply use our pre-allocated root and
fs_info when setting everything up. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Josef Bacik [Tue, 23 Nov 2010 19:36:57 +0000 (19:36 +0000)]
Btrfs: fix fiemap
There are two big problems currently with FIEMAP
1) We return extents for holes. This isn't supposed to happen, we just don't
return extents for holes and then userspace interprets the lack of an extent as
a hole.
2) We sometimes don't set FIEMAP_EXTENT_LAST properly. This is because we wait
to see a EXTENT_FLAG_VACANCY flag on the em, but this won't happen if say we ask
fiemap to map up to the last extent in a file, and there is nothing but holes up
to the i_size. To fix this we need to lookup the last extent in this file and
save the logical offset, so if we happen to try and map that extent we can be
sure to set FIEMAP_EXTENT_LAST.
With this patch we now pass xfstest 225, which we never have before.
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Ian Kent [Mon, 22 Nov 2010 02:21:38 +0000 (02:21 +0000)]
Btrfs - fix race between btrfs_get_sb() and umount
When mounting a btrfs file system btrfs_test_super() may attempt to
use sb->s_fs_info, the btrfs root, of a super block that is going away
and that has had the btrfs root set to NULL in its ->put_super(). But
if the super block is going away it cannot be an existing super block
so we can return false in this case.
Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Josef Bacik [Mon, 22 Nov 2010 18:55:39 +0000 (18:55 +0000)]
Btrfs: make sure new inode size is ok in fallocate
We have been failing xfstest 228 forever, because we don't check to make sure
the new inode size is acceptable as far as RLIMIT is concerned. Just check to
make sure it's ok to create a inode with this new size and error out if not.
With this patch we now pass 228.
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Josef Bacik [Mon, 22 Nov 2010 18:50:32 +0000 (18:50 +0000)]
Btrfs: fix typo in fallocate to make it honor actual size
There is a typo in __btrfs_prealloc_file_range() where we set the i_size to
actual_len/cur_offset, and then just set it to cur_offset again, and do the same
with btrfs_ordered_update_i_size(). This fixes it back to keeping i_size in a
local variable and then updating i_size properly. Tested this with
xfs_io -F -f -c "falloc 0 1" -c "pwrite 0 1" foo
stat'ing foo gives us a size of 1 instead of 4096 like it was. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Linus Torvalds [Fri, 26 Nov 2010 22:30:30 +0000 (07:30 +0900)]
Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Ensure we return the dirent->d_type when it is known
NFS: Correct the array bound calculation in nfs_readdir_add_to_array
NFS: Don't ignore errors from nfs_do_filldir()
NFS: Fix the error handling in "uncached_readdir()"
NFS: Fix a page leak in uncached_readdir()
NFS: Fix a page leak in nfs_do_filldir()
NFS: Assume eof if the server returns no readdir records
NFS: Buffer overflow in ->decode_dirent() should not be fatal
Pure nfs client performance using odirect.
SUNRPC: Fix an infinite loop in call_refresh/call_refreshresult
Linus Torvalds [Fri, 26 Nov 2010 22:28:47 +0000 (07:28 +0900)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
dmar, x86: Use function stubs when CONFIG_INTR_REMAP is disabled
x86-64: Fix and clean up AMD Fam10 MMCONF enabling
x86: UV: Address interrupt/IO port operation conflict
x86: Use online node real index in calulate_tbl_offset()
x86, asm: Fix binutils 2.15 build failure
Linus Torvalds [Fri, 26 Nov 2010 22:28:17 +0000 (07:28 +0900)]
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf symbols: Remove incorrect open-coded container_of()
perf record: Handle restrictive permissions in /proc/{kallsyms,modules}
x86/kprobes: Prevent kprobes to probe on save_args()
irq_work: Drop cmpxchg() result
perf: Fix owner-list vs exit
x86, hw_nmi: Move backtrace_mask declaration under ARCH_HAS_NMI_WATCHDOG
tracing: Fix recursive user stack trace
perf,hw_breakpoint: Initialize hardware api earlier
x86: Ignore trap bits on single step exceptions
tracing: Force arch_local_irq_* notrace for paravirt
tracing: Fix module use of trace_bprintk()
Linus Torvalds [Fri, 26 Nov 2010 22:17:50 +0000 (07:17 +0900)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
cciss: fix build for PROC_FS disabled
block: fix amiga and atari floppy driver compile warning
blk-throttle: Fix calculation of max number of WRITES to be dispatched
ioprio: grab rcu_read_lock in sys_ioprio_{set,get}()
xen/blkfront: cope with backend that fail empty BLKIF_OP_WRITE_BARRIER requests
xen/blkfront: Implement FUA with BLKIF_OP_WRITE_BARRIER
xen/blkfront: change blk_shadow.request to proper pointer
xen/blkfront: map REQ_FLUSH into a full barrier
Linus Torvalds [Fri, 26 Nov 2010 22:14:00 +0000 (07:14 +0900)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix typo in comment of nilfs_dat_move function
nilfs2: nilfs_iget_for_gc() returns ERR_PTR
Takashi Iwai [Fri, 26 Nov 2010 16:11:18 +0000 (17:11 +0100)]
ALSA: hda - Use ALC_INIT_DEFAULT for really default initialization
When SKU assid gives no valid bits for 0x38, the driver didn't take
any action, so far. This resulted in the missing initialization for
external amps, etc, thus the silent output in the end.
Especially users hit this problem on ALC888 newly since 2.6.35,
where the driver doesn't force to use ALC_INIT_DEFAULT any more.
This patch sets the default initialization scheme to use
ALC_INIT_DEFAULT when no valid bits are set for SKU assid.
Peter Zijlstra [Fri, 26 Nov 2010 12:49:04 +0000 (13:49 +0100)]
perf: Fix the software context switch counter
Stephane noticed that because the perf_sw_event() call is inside the
perf_event_task_sched_out() call it won't get called unless we
have a per-task counter.
Don Zickus [Mon, 22 Nov 2010 21:55:23 +0000 (16:55 -0500)]
x86, perf, nmi: Disable perf if counters are not accessible
In a kvm virt guests, the perf counters are not emulated. Instead they
return zero on a rdmsrl. The perf nmi handler uses the fact that crossing
a zero means the counter overflowed (for those counters that do not have
specific interrupt bits). Therefore on kvm guests, perf will swallow all
NMIs thinking the counters overflowed.
This causes problems for subsystems like kgdb which needs NMIs to do its
magic. This problem was discovered by running kgdb tests.
The solution is to write garbage into a perf counter during the
initialization and hopefully reading back the same number. On kvm
guests, the value will be read back as zero and we disable perf as
a result.
Reported-by: Jason Wessel <jason.wessel@windriver.com> Patch-inspired-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1290462923-30734-1-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Thomas Gleixner [Wed, 24 Nov 2010 09:05:55 +0000 (10:05 +0100)]
perf: Fix inherit vs. context rotation bug
It was found that sometimes children of tasks with inherited events had
one extra event. Eventually it turned out to be due to the list rotation
no being exclusive with the list iteration in the inheritance code.
Cure this by temporarily disabling the rotation while we inherit the events.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Randy Dunlap [Mon, 22 Nov 2010 20:48:34 +0000 (12:48 -0800)]
dmar, x86: Use function stubs when CONFIG_INTR_REMAP is disabled
The stubs for CONFIG_INTR_REMAP disabled need to be functions
instead of values to eliminate build warnings.
arch/x86/kernel/apic/apic.c: In function 'lapic_suspend':
arch/x86/kernel/apic/apic.c:2060:3: warning: statement with no effect
arch/x86/kernel/apic/apic.c: In function 'lapic_resume':
arch/x86/kernel/apic/apic.c:2137:3: warning: statement with no effect
Reported-and-Tested-by: Fabio Comolli <fabio.comolli@gmail.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
LKML-Reference: <20101122124834.74429004.randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Axel Lin [Wed, 24 Nov 2010 14:24:01 +0000 (22:24 +0800)]
ASoC: Fix resource reclaim for osk5912
In current implementation, there are resources leak in the error path.
This patch properly reclaims the allocated resources in the error path.
Also adds a missing clk_put in osk_soc_exit.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Jarkko Nikula <jhnikula@gmail.com> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Axel Lin [Wed, 24 Nov 2010 14:40:59 +0000 (22:40 +0800)]
ASoC: tlv320aic3x - fix variable may be used uninitialized warning
If aic3x_read failed , val is used uninitialized.
Fix it by initializing val to 0.
This patch fixes below compile warning:
sound/soc/codecs/tlv320aic3x.c: In function 'aic3x_get_gpio':
sound/soc/codecs/tlv320aic3x.c:1183: warning: 'val' may be used uninitialized in this function
sound/soc/codecs/tlv320aic3x.c: In function 'aic3x_headset_detected':
sound/soc/codecs/tlv320aic3x.c:1211: warning: 'val' may be used uninitialized in this function
sound/soc/codecs/tlv320aic3x.c: In function 'aic3x_button_pressed':
sound/soc/codecs/tlv320aic3x.c:1219: warning: 'val' may be used uninitialized in this function
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Axel Lin [Thu, 25 Nov 2010 03:33:14 +0000 (11:33 +0800)]
ASoC: davinci-vcif - fix a memory leak
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Axel Lin [Thu, 25 Nov 2010 07:14:03 +0000 (15:14 +0800)]
ASoC: phycore-ac97: fix resource leak
Fix imx_phycore_init() error path and imx_phycore_exit() to properly free
allocated resources.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Axel Lin [Thu, 25 Nov 2010 07:13:09 +0000 (15:13 +0800)]
ASoC: imx-ssi: fix resource leak
Fix imx_ssi_probe() error path and imx_ssi_remove() to properly free
allocated resources.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Axel Lin [Thu, 25 Nov 2010 07:12:30 +0000 (15:12 +0800)]
ASoC: simone: fix resource leak in simone_init error path
Fix the error path to properly free allocated resources.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Mika Westerberg <mika.westerberg@iki.fi> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Axel Lin [Thu, 25 Nov 2010 07:11:03 +0000 (15:11 +0800)]
ASoC: sam9g20_wm8731: fix resource leak in at91sam9g20ek_init error path
Fix the error path to properly free allocated resources.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Axel Lin [Thu, 25 Nov 2010 02:44:59 +0000 (10:44 +0800)]
ASoC: snd-soc-afeb9260: remove unneeded platform_device_del in error path
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Axel Lin [Thu, 25 Nov 2010 07:08:31 +0000 (15:08 +0800)]
ASoC: pcm030-audio-fabric: fix resource leak in pcm030_fabric_init error path
Add missing platform_device_put() if platform_device_add() failed.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Axel Lin [Thu, 25 Nov 2010 07:07:25 +0000 (15:07 +0800)]
ASoC: efika-audio-fabric: fix resource leak in efika_fabric_init error path
Add missing platform_device_put() if platform_device_add() failed.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Axel Lin [Thu, 25 Nov 2010 09:23:55 +0000 (17:23 +0800)]
ASoC: Call snd_soc_unregister_dais instead of snd_soc_unregister_dai in sh4_soc_dai_remove
We call snd_soc_register_dais() in sh4_soc_dai_probe(),
thus we should call snd_soc_unregister_dais() in sh4_soc_dai_remove().
Otherwise, we got "too many arguments to function 'snd_soc_unregister_dai'"
error message.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Jan Glauber [Thu, 25 Nov 2010 08:52:46 +0000 (09:52 +0100)]
[S390] qdio: free indicator after reset is finished
The qdio device indicator is freed before the device is notified that
the indicator is reset. This sequence contains a race when the freed
indicator is used by a new device while the reset of the indicator is
still pending. Do the reset operation before freeing the indicator to
avoid that potential race.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Thu, 25 Nov 2010 08:52:45 +0000 (09:52 +0100)]
[S390] nmi: fix clock comparator revalidation
On each machine check all registers are revalidated. The save area for
the clock comparator however only contains the upper most seven bytes
of the former contents, if valid.
Therefore the machine check handler uses a store clock instruction to
get the current time and writes that to the clock comparator register
which in turn will generate an immediate timer interrupt.
However within the lowcore the expected time of the next timer
interrupt is stored. If the interrupt happens before that time the
handler won't be called. In turn the clock comparator won't be
reprogrammed and therefore the interrupt condition stays pending which
causes an interrupt loop until the expected time is reached.
On NOHZ machines this can result in unresponsive machines since the
time of the next expected interrupted can be a couple of days in the
future.
To fix this just revalidate the clock comparator register with the
expected value.
In addition the special handling for udelay must be changed as well.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The mixer nids passed to alc_auto_create_input_ctls are wrong: 0x15 is
a pin, and 0x09 is the ADC on both ALC660-VD/ALC861-VD. Thus with
current code, input playback volume/switches and input source mixer
controls are not created, and recording doesn't work. Select correct
mixers, 0x0b (input playback mixer) and 0x22 (capture source mixer).