Peter Hurley [Sat, 15 Jun 2013 14:21:17 +0000 (10:21 -0400)]
n_tty: Fix EOF push handling
In canonical mode, an EOF which is not the first character of the line
causes read() to complete and return the number of characters read so
far (commonly referred to as EOF push). However, if the previous read()
returned because the user buffer was full _and_ the next character
is an EOF not at the beginning of the line, read() must not return 0,
thus mistakenly indicating the end-of-file condition.
The TTY_PUSH flag is used to indicate an EOF was received which is not
at the beginning of the line. Because the EOF push condition is
evaluated by a thread other than the read(), multiple EOF pushes can
cause a premature end-of-file to be indicated.
Instead, discover the 'EOF push as first read character' condition
from the read() thread itself, and restart the i/o loop if detected.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 14:04:26 +0000 (10:04 -0400)]
n_tty: Process echoes in blocks
Byte-by-byte echo output is painfully slow, requiring a lock/unlock
cycle for every input byte.
Instead, perform the echo output in blocks of 256 characters, and
at least once per flip buffer receive. Enough space is reserved in
the echo buffer to guarantee a full block can be saved without
overrunning the echo output. Overrun is prevented by discarding
the oldest echoes until enough space exists in the echo buffer
to receive at least a full block of new echoes.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 14:04:24 +0000 (10:04 -0400)]
n_tty: Remove echo_lock
Adding data to echo_buf (via add_echo_byte()) is guaranteed to be
single-threaded, since all callers are from the n_tty_receive_buf()
path. Processing the echo_buf can be called from either the
n_tty_receive_buf() path or the n_tty_write() path; however, these
callers are already serialized by output_lock.
Publish cumulative echo_head changes to echo_commit; process echo_buf
from echo_tail to echo_commit; remove echo_lock.
On echo_buf overrun, claim output_lock to serialize changes to
echo_tail.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 14:04:22 +0000 (10:04 -0400)]
n_tty: Use separate head and tail indices for echo_buf
Instead of using a single index to track the current echo_buf position,
use a head index when adding to the buffer and a tail index when
consuming from the buffer. Allow these head and tail indices to wrap
at max representable value; perform modulo reduction via helper
functions when accessing the buffer.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:36:15 +0000 (09:36 -0400)]
tty: Fix unsafe vt paste_selection()
Convert the tty_buffer_flush() exclusion mechanism to a
public interface - tty_buffer_lock/unlock_exclusive() - and use
the interface to safely write the paste selection to the line
discipline.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:36:11 +0000 (09:36 -0400)]
tty: Only perform flip buffer flush from tty_buffer_flush()
Now that dropping the buffer lock is not necessary (as result of
converting the spin lock to a mutex), the flip buffer flush no
longer needs to be handled by the buffer work.
Simply signal a flush is required; the buffer work will exit the
i/o loop, which allows tty_buffer_flush() to proceed.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:36:09 +0000 (09:36 -0400)]
tty: Make driver-side flip buffers lockless
Driver-side flip buffer input is already single-threaded; 'publish'
the .next link as the last operation on the tail buffer so the
'consumer' sees the already-completed flip buffer.
The commit buffer index is already 'published' by driver-side functions.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:36:08 +0000 (09:36 -0400)]
tty: Track flip buffer memory limit atomically
Lockless flip buffers require atomically updating the bytes-in-use
watermark.
The pty driver also peeks at the watermark value to limit
memory consumption to a much lower value than the default; query
the watermark with new fn, tty_buffer_space_avail().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:36:07 +0000 (09:36 -0400)]
tty: Simplify flip buffer list with 0-sized sentinel
Use a 0-sized sentinel to avoid assigning the head ptr from
the driver side thread. This also eliminates testing head/tail
for NULL.
When the sentinel is first 'consumed' by the buffer work
(or by tty_buffer_flush()), it is detached from the list but not
freed nor added to the free list. Both buffer work and
tty_buffer_flush() continue to preserve at least 1 flip buffer
to which head & tail is pointed.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:36:06 +0000 (09:36 -0400)]
tty: Use lockless flip buffer free list
In preparation for lockless flip buffers, make the flip buffer
free list lockless.
NB: using llist is not the optimal solution, as the driver and
buffer work may contend over the llist head unnecessarily. However,
test measurements indicate this contention is low.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:36:04 +0000 (09:36 -0400)]
tty: Merge tty_buffer_find() into tty_buffer_alloc()
tty_buffer_find() implements a simple free list lookaside cache.
Merge this functionality into tty_buffer_alloc() to reflect the
more traditional alloc/free symmetry.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:36:02 +0000 (09:36 -0400)]
tty: Fix flip buffer free list
Since flip buffers are size-aligned to 256 bytes and all flip
buffers 512-bytes or larger are not added to the free list, the
free list only contains 256-byte flip buffers.
Remove the list search when allocating a new flip buffer.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:14:36 +0000 (09:14 -0400)]
n_tty: Queue buffer work on any available cpu
Scheduling buffer work on the same cpu as the read() thread
limits the parallelism now possible between the receive_buf path
and the n_tty_read() path.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Tue, 23 Jul 2013 12:47:30 +0000 (08:47 -0400)]
n_tty: Special case pty flow control
The pty driver forces ldisc flow control on, regardless of available
receive buffer space, so the writer can be woken whenever unthrottle
is called. However, this 'forced throttle' has performance
consequences, as multiple atomic operations are necessary to
unthrottle and perform the write wakeup for every input line (in
canonical mode).
Instead, short-circuit the unthrottle if the tty is a pty and perform
the write wakeup directly.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:14:27 +0000 (09:14 -0400)]
n_tty: Reset lnext if canonical mode changes
lnext escapes the next input character as a literal, and must
be reset when canonical mode changes (to avoid misinterpreting
a special character as a literal if canonical mode is changed
back again).
lnext is specifically not reset on a buffer flush so as to avoid
misinterpreting the next input character as a special character.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:14:26 +0000 (09:14 -0400)]
n_tty: Make N_TTY ldisc receive path lockless
n_tty has a single-producer/single-consumer input model;
use lockless publish instead.
Use termios_rwsem to exclude both consumer and producer while
changing or resetting buffer indices, eg., when flushing. Also,
claim exclusive termios_rwsem to safely retrieve the buffer
indices from a thread other than consumer or producer
(eg., TIOCINQ ioctl).
Note the read_tail is published _after_ clearing the newline
indicator in read_flags to avoid racing the producer.
Drop read_lock spinlock.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:14:25 +0000 (09:14 -0400)]
n_tty: Replace canon_data with index comparison
canon_data represented the # of lines which had been copied
to the receive buffer but not yet copied to the user buffer.
The value was tested to determine if input was available in
canonical mode (and also to force input overrun if the
receive buffer was full but a newline had not been received).
However, the actual count was irrelevent; only whether it was
non-zero (meaning 'is there any input to transfer?'). This
shared count is unnecessary and unsafe with a lockless algorithm.
The same check is made by comparing canon_head with read_tail instead.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:14:24 +0000 (09:14 -0400)]
n_tty: Access termios values safely
Use termios_rwsem to guarantee safe access to the termios values.
This is particularly important for N_TTY as changing certain termios
settings alters the mode of operation.
termios_rwsem must be dropped across throttle/unthrottle since
those functions claim the termios_rwsem exclusively (to guarantee
safe access to the termios and for mutual exclusion).
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:14:21 +0000 (09:14 -0400)]
n_tty: Don't wrap input buffer indices at buffer size
Wrap read_buf indices (read_head, read_tail, canon_head) at
max representable value, instead of at the N_TTY_BUF_SIZE. This step
is necessary to allow lockless reads of these shared variables
(by updating the variables atomically).
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:14:15 +0000 (09:14 -0400)]
tty: Make ldisc input flow control concurrency-friendly
Although line discipline receiving is single-producer/single-consumer,
using tty->receive_room to manage flow control creates unnecessary
critical regions requiring additional lock use.
Instead, introduce the optional .receive_buf2() ldisc method which
returns the # of bytes actually received. Serialization is guaranteed
by the caller.
In turn, the line discipline should schedule the buffer work item
whenever space becomes available; ie., when there is room to receive
data and receive_room() previously returned 0 (the buffer work
item stops processing if receive_buf2() returns 0). Note the
'no room' state need not be atomic despite concurrent use by two
threads because only the buffer work thread can set the state and
only the read() thread can clear the state.
Add n_tty_receive_buf2() as the receive_buf2() method for N_TTY.
Provide a public helper function, tty_ldisc_receive_buf(), to use
when directly accessing the receive_buf() methods.
Line disciplines not using input flow control can continue to set
tty->receive_room to a fixed value and only provide the receive_buf()
method.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:14:14 +0000 (09:14 -0400)]
tty: Simplify tty buffer/ldisc interface with helper function
Ldisc interface functions must be called with interrupts enabled.
Separating the ldisc calls into a helper function simplies the
eventual removal of the spinlock.
Note that access to the buf->head ptr outside the spinlock is
safe here because;
* __tty_buffer_flush() is prevented from running while buffer work
performs i/o,
* tty_buffer_find() only assigns buf->head if the flip buffer list
is empty (which is never the case in flush_to_ldisc() since at
least one buffer is always left in the list after use)
Access to the read index outside the spinlock is safe here for the
same reasons.
Update the buffer's read index _after_ the data has been received
by the ldisc.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 13:14:13 +0000 (09:14 -0400)]
tty: Don't change receive_room for ioctl(TIOCSETD)
tty_set_ldisc() is guaranteed exclusive use of the line discipline
by tty_ldisc_lock_pair_timeout(); shutting off input by resetting
receive_room is unnecessary.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 11:04:48 +0000 (07:04 -0400)]
tty: Replace ldisc locking with ldisc_sem
Line discipline locking was performed with a combination of
a mutex, a status bit, a count, and a waitqueue -- basically,
a rw semaphore.
Replace the existing combination with an ld_semaphore.
Fixes:
1) the 'reference acquire after ldisc locked' bug
2) the over-complicated halt mechanism
3) lock order wrt. tty_lock()
4) dropping locks while changing ldisc
5) previously unidentified deadlock while locking ldisc from
both linked ttys concurrently
6) previously unidentified recursive deadlocks
Adds much-needed lockdep diagnostics.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Hurley [Sat, 15 Jun 2013 11:04:47 +0000 (07:04 -0400)]
tty: Add lock/unlock ldisc pair functions
Just as the tty pair must be locked in a stable sequence
(ie, independent of which is consider the 'other' tty), so must
the ldisc pair be locked in a stable sequence as well.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'acpi-video-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI video support fixes from Rafael Wysocki:
"I'm sending a separate pull request for this as it may be somewhat
controversial. The breakage addressed here is not really new and the
fixes may not satisfy all users of the affected systems, but we've had
so much back and forth dance in this area over the last several weeks
that I think it's time to actually make some progress.
The source of the problem is that about a year ago we started to tell
BIOSes that we're compatible with Windows 8, which we really need to
do, because some systems shipping with Windows 8 are tested with it
and nothing else, so if we tell their BIOSes that we aren't compatible
with Windows 8, we expose our users to untested BIOS/AML code paths.
However, as it turns out, some Windows 8-specific AML code paths are
not tested either, because Windows 8 actually doesn't use the ACPI
methods containing them, so if we declare Windows 8 compatibility and
attempt to use those ACPI methods, things break. That occurs mostly
in the backlight support area where in particular the _BCM and _BQC
methods are plain unusable on some systems if the OS declares Windows
8 compatibility.
[ The additional twist is that they actually become usable if the OS
says it is not compatible with Windows 8, but that may cause
problems to show up elsewhere ]
Investigation carried out by Matthew Garrett indicates that what
Windows 8 does about backlight is to leave backlight control up to
individual graphics drivers. At least there's evidence that it does
that if the Intel graphics driver is used, so we've decided to follow
Windows 8 in that respect and allow i915 to control backlight (Daniel
likes that part).
The first commit from Aaron Lu makes ACPICA export the variable from
which we can infer whether or not the BIOS believes that we are
compatible with Windows 8.
The second commit from Matthew Garrett prepares the ACPI video driver
by making it initialize the ACPI backlight even if it is not going to
be used afterward (that is needed for backlight control to work on
Thinkpads).
The third commit implements the actual workaround making i915 take
over backlight control if the firmware thinks it's dealing with
Windows 8 and is based on the work of multiple developers, including
Matthew Garrett, Chun-Yi Lee, Seth Forshee, and Aaron Lu.
The final commit from Aaron Lu makes us follow Windows 8 by informing
the firmware through the _DOS method that it should not carry out
automatic brightness changes, so that brightness can be controlled by
GUI.
Hopefully, this approach will allow us to avoid using blacklists of
systems that should not declare Windows 8 compatibility just to avoid
backlight control problems in the future.
- Change from Aaron Lu makes ACPICA export a variable which can be
used by driver code to determine whether or not the BIOS believes
that we are compatible with Windows 8.
- Change from Matthew Garrett makes the ACPI video driver initialize
the ACPI backlight even if it is not going to be used afterward
(that is needed for backlight control to work on Thinkpads).
- Fix from Rafael J Wysocki implements Windows 8 backlight support
workaround making i915 take over bakclight control if the firmware
thinks it's dealing with Windows 8. Based on the work of multiple
developers including Matthew Garrett, Chun-Yi Lee, Seth Forshee,
and Aaron Lu.
- Fix from Aaron Lu makes the kernel follow Windows 8 by informing
the firmware through the _DOS method that it should not carry out
automatic brightness changes, so that brightness can be controlled
by GUI"
* tag 'acpi-video-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: no automatic brightness changes by win8-compatible firmware
ACPI / video / i915: No ACPI backlight if firmware expects Windows 8
ACPI / video: Always call acpi_video_init_brightness() on init
ACPICA: expose OSI version
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext[34] tmpfile bugfix from Ted Ts'o:
"Fix regression caused by commit af51a2ac36d1f which added ->tmpfile()
support (along with a similar fix for ext3)"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext3: fix a BUG when opening a file with O_TMPFILE flag
ext4: fix a BUG when opening a file with O_TMPFILE flag
Zheng Liu [Sun, 21 Jul 2013 02:03:20 +0000 (22:03 -0400)]
ext3: fix a BUG when opening a file with O_TMPFILE flag
When we try to open a file with O_TMPFILE flag, we will trigger a bug.
The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
this check always fails because we set ->i_nlink = 1 in
inode_init_always(). We can use the following program to trigger it:
Here we couldn't call clear_nlink() directly because in d_tmpfile() we
will call inode_dec_link_count() to decrease ->i_nlink. So this commit
tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk>
Zheng Liu [Sun, 21 Jul 2013 01:58:38 +0000 (21:58 -0400)]
ext4: fix a BUG when opening a file with O_TMPFILE flag
When we try to open a file with O_TMPFILE flag, we will trigger a bug.
The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
this check always fails because we set ->i_nlink = 1 in
inode_init_always(). We can use the following program to trigger it:
Here we couldn't call clear_nlink() directly because in d_tmpfile() we
will call inode_dec_link_count() to decrease ->i_nlink. So this commit
tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.
Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Tested-by: Darrick J. Wong <darrick.wong@oracle.com> Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Merge tag 'staging-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg KH:
"Here are a few iio driver fixes for 3.11-rc2. They are still spread
across drivers/iio and drivers/staging/iio so they are coming in
through this tree.
I've also removed the drivers/staging/csr/ driver as the developers
who originally sent it to me have moved on to other companies, and CSR
still will not send us the specs for the device, making the driver
pretty much obsolete and impossible to fix up. Deleting it now
prevents people from sending in lots of tiny codingsyle fixes that
will never go anywhere.
It also helps to offset the large lustre filesystem merge that
happened in 3.11-rc1 in the overall 3.11.0 diffstat. :)"
* tag 'staging-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: csr: remove driver
iio: lps331ap: Fix wrong in_pressure_scale output value
iio staging: fix lis3l02dq, read error handling
staging:iio:ad7291: add missing .driver_module to struct iio_info
iio: ti_am335x_adc: add missing .driver_module to struct iio_info
iio: mxs-lradc: Remove useless check in read_raw
iio: mxs-lradc: Fix misuse of iio->trig
iio: inkern: fix iio_convert_raw_to_processed_unlocked
iio: Fix iio_channel_has_info
iio:trigger: device_unregister->device_del to avoid double free
iio: dac: ad7303: fix error return code in ad7303_probe()
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"The sget() one is a long-standing bug and will need to go into -stable
(in fact, it had been originally caught in RHEL6), the other two are
3.11-only"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: constify dentry parameter in d_count()
livelock avoidance in sget()
allow O_TMPFILE to work with O_WRONLY
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 bugfixes from Ted Ts'o:
"Fixes for 3.11-rc2, sent at 5pm, in the professoinal style. :-)"
I'm not sure I like this new level of "professionalism".
9-5, people, 9-5.
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: call ext4_es_lru_add() after handling cache miss
ext4: yield during large unlinks
ext4: make the extent_status code more robust against ENOMEM failures
ext4: simplify calculation of blocks to free on error
ext4: fix error handling in ext4_ext_truncate()
Merge tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Fix a regression against NFSv4 FreeBSD servers when creating a new
file
- Fix another regression in rpc_client_register()
* tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Fix a regression against the FreeBSD server
SUNRPC: Fix another issue with rpc_client_register()
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next
Pull btrfs fixes from Josef Bacik:
"I'm playing the role of Chris Mason this week while he's on vacation.
There are a few critical fixes for btrfs here, all regressions and
have been tested well"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next:
Btrfs: fix wrong write offset when replacing a device
Btrfs: re-add root to dead root list if we stop dropping it
Btrfs: fix lock leak when resuming snapshot deletion
Btrfs: update drop progress before stopping snapshot dropping
Al Viro [Fri, 19 Jul 2013 23:13:55 +0000 (03:13 +0400)]
livelock avoidance in sget()
Eric Sandeen has found a nasty livelock in sget() - take a mount(2) about
to fail. The superblock is on ->fs_supers, ->s_umount is held exclusive,
->s_active is 1. Along comes two more processes, trying to mount the same
thing; sget() in each is picking that superblock, bumping ->s_count and
trying to grab ->s_umount. ->s_active is 3 now. Original mount(2)
finally gets to deactivate_locked_super() on failure; ->s_active is 2,
superblock is still ->fs_supers because shutdown will *not* happen until
->s_active hits 0. ->s_umount is dropped and now we have two processes
chasing each other:
s_active = 2, A acquired ->s_umount, B blocked
A sees that the damn thing is stillborn, does deactivate_locked_super()
s_active = 1, A drops ->s_umount, B gets it
A restarts the search and finds the same superblock. And bumps it ->s_active.
s_active = 2, B holds ->s_umount, A blocked on trying to get it
... and we are in the earlier situation with A and B switched places.
The root cause, of course, is that ->s_active should not grow until we'd
got MS_BORN. Then failing ->mount() will have deactivate_locked_super()
shut the damn thing down. Fortunately, it's easy to do - the key point
is that grab_super() is called only for superblocks currently on ->fs_supers,
so it can bump ->s_count and grab ->s_umount first, then check MS_BORN and
bump ->s_active; we must never increment ->s_count for superblocks past
->kill_sb(), but grab_super() is never called for those.
The bug is pretty old; we would've caught it by now, if not for accidental
exclusion between sget() for block filesystems; the things like cgroup or
e.g. mtd-based filesystems don't have anything of that sort, so they get
bitten. The right way to deal with that is obviously to fix sget()...
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
"Special thanks goes to Toralf Föster for continuously testing UML and
reporting issues!"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: remove dead code
um: siginfo cleanup
uml: Fix which_tmpdir failure when /dev/shm is a symlink, and in other edge cases
um: Fix wait_stub_done() error handling
um: Mark stub pages mapping with VM_PFNMAP
um: Fix return value of strnlen_user()
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"MIPS fixes for 3.11. Half of then is for Netlogic the remainder
touches things across arch/mips.
Nothing really dramatic and by rc1 standards MIPS will be in fairly
good shape with this applied. Tested by building all MIPS defconfigs
of which with this pull request four platforms won't build. And yes,
it boots also on my favorite test systems"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: kvm: Kconfig: Drop HAVE_KVM dependency from VIRTUALIZATION
MIPS: Octeon: Fix DT pruning bug with pip ports
MIPS: KVM: Mark KVM_GUEST (T&E KVM) as BROKEN_ON_SMP
MIPS: tlbex: fix broken build in v3.11-rc1
MIPS: Netlogic: Add XLP PIC irqdomain
MIPS: Netlogic: Fix USB block's coherent DMA mask
MIPS: tlbex: Fix typo in r3000 tlb store handler
MIPS: BMIPS: Fix thinko to release slave TP from reset
MIPS: Delete dead invocation of exception_exit().
Merge tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64
Pull arm64 fixes from Catalin Marinas:
- Post -rc1 update to the common reboot infrastructure.
- Fixes (user cache maintenance fault handling, !COMPAT compilation,
CPU online and interrupt hanlding).
* tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
arm64: use common reboot infrastructure
arm64: mm: don't treat user cache maintenance faults as writes
arm64: add '#ifdef CONFIG_COMPAT' for aarch32_break_handler()
arm64: Only enable local interrupts after the CPU is marked online
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"An update for the BFP jit to the latest and greatest, two patches to
get kdump working again, the random-abort ptrace extention for
transactional execution, the z90crypt module alias for ap and a tiny
cleanup"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: Alias for new zcrypt device driver base module
s390/kdump: Allow copy_oldmem_page() copy to virtual memory
s390/kdump: Disable mmap for s390
s390/bpf,jit: add pkt_type support
s390/bpf,jit: address randomize and write protect jit code
s390/bpf,jit: use generic jit dumper
s390/bpf,jit: call module_free() from any context
s390/qdio: remove unused variable
s390/ptrace: PTRACE_TE_ABORT_RAND
Stefan Behrens [Thu, 4 Jul 2013 14:14:23 +0000 (16:14 +0200)]
Btrfs: fix wrong write offset when replacing a device
Miao Xie reported the following issue:
The filesystem was corrupted after we did a device replace.
Steps to reproduce:
# mkfs.btrfs -f -m single -d raid10 <device0>..<device3>
# mount <device0> <mnt>
# btrfs replace start -rfB 1 <device4> <mnt>
# umount <mnt>
# btrfsck <device4>
The reason for the issue is that we changed the write offset by mistake,
introduced by commit 625f1c8dc.
We read the data from the source device at first, and then write the
data into the corresponding place of the new device. In order to
implement the "-r" option, the source location is remapped using
btrfs_map_block(). The read takes place on the mapped location, and
the write needs to take place on the unmapped location. Currently
the write is using the mapped location, and this commit changes it
back by undoing the change to the write address that the aforementioned
commit added by mistake.
Reported-by: Miao Xie <miaox@cn.fujitsu.com> Cc: <stable@vger.kernel.org> # 3.10+ Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Wed, 17 Jul 2013 23:30:20 +0000 (19:30 -0400)]
Btrfs: re-add root to dead root list if we stop dropping it
If we stop dropping a root for whatever reason we need to add it back to the
dead root list so that we will re-start the dropping next transaction commit.
The other case this happens is if we recover a drop because we will add a root
without adding it to the fs radix tree, so we can leak it's root and commit root
extent buffer, adding this to the dead root list makes this cleanup happen.
Thanks,
Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Mon, 15 Jul 2013 16:41:42 +0000 (12:41 -0400)]
Btrfs: fix lock leak when resuming snapshot deletion
We aren't setting path->locks[level] when we resume a snapshot deletion which
means we won't unlock the buffer when we free the path. This causes deadlocks
if we happen to re-allocate the block before we've evicted the extent buffer
from cache. Thanks,
Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Mon, 15 Jul 2013 15:57:06 +0000 (11:57 -0400)]
Btrfs: update drop progress before stopping snapshot dropping
Alex pointed out a problem and fix that exists in the drop one snapshot at a
time patch. If we decide we need to exit for whatever reason (umount for
example) we will just exit the snapshot dropping without updating the drop
progress. So the next time we go to resume we will BUG_ON() because we can't
find the extent we left off at because we never updated it. This patch fixes
the problem.
Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fix from Paolo Bonzini:
"This single patch fixes a regression caused by one of the
optimizations introduced in 3.11, which is generally visible only on
AMD processors"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: MMU: avoid fast page fault fixing mmio page fault
Merge tag 'pm+acpi-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"These are fixes collected over the last week, most importnatly two
cpufreq reverts fixing regressions introduced in 3.10, an autoseelp
fix preventing systems using it from crashing during shutdown and two
ACPI scan fixes related to hotplug.
Specifics:
- Two cpufreq commits from the 3.10 cycle introduced regressions.
The first of them was buggy (it did way much more than it needed to
do) and the second one attempted to fix an issue introduced by the
first one. Fixes from Srivatsa S Bhat revert both.
- If autosleep triggers during system shutdown and the shutdown
callbacks of some device drivers have been called already, it may
crash the system. Fix from Liu Shuo prevents that from happening
by making try_to_suspend() check system_state.
- The ACPI memory hotplug driver doesn't clear its driver_data on
errors which may cause a NULL poiter dereference to happen later.
Fix from Toshi Kani.
- The ACPI namespace scanning code should not try to attach scan
handlers to device objects that have them already, which may
confuse things quite a bit, and it should rescan the whole
namespace branch starting at the given node after receiving a bus
check notify event even if the device at that particular node has
been discovered already. Fixes from Rafael J Wysocki.
- New ACPI video blacklist entry for a system whose initial backlight
setting from the BIOS doesn't make sense. From Lan Tianyu.
- Garbage string output avoindance for ACPI PNP from Liu Shuo.
- Two Kconfig fixes for issues introduced recently in the s3c24xx
cpufreq driver (when moving the driver to drivers/cpufreq) from
Paul Bolle.
- Trivial comment fix in pm_wakeup.h from Chanwoo Choi"
* tag 'pm+acpi-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: ignore BIOS initial backlight value for Fujitsu E753
PNP / ACPI: avoid garbage in resource name
cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression
cpufreq: s3c24xx: fix "depends on ARM_S3C24XX" in Kconfig
cpufreq: s3c24xx: rename CONFIG_CPU_FREQ_S3C24XX_DEBUGFS
PM / Sleep: Fix comment typo in pm_wakeup.h
PM / Sleep: avoid 'autosleep' in shutdown progress
cpufreq: Revert commit a66b2e to fix suspend/resume regression
ACPI / memhotplug: Fix a stale pointer in error path
ACPI / scan: Always call acpi_bus_scan() for bus check notifications
ACPI / scan: Do not try to attach scan handlers to devices having them
Marc Zyngier [Thu, 11 Jul 2013 11:13:00 +0000 (12:13 +0100)]
arm64: use common reboot infrastructure
Commit 7b6d864b48d9 (reboot: arm: change reboot_mode to use enum
reboot_mode) changed the way reboot is handled on arm, which has a
direct impact on arm64 as we share the reset driver on the VE platform.
The obvious fix is to move arm64 to use the same infrastructure.
Will Deacon [Fri, 19 Jul 2013 14:37:12 +0000 (15:37 +0100)]
arm64: mm: don't treat user cache maintenance faults as writes
On arm64, cache maintenance faults appear as data aborts with the CM
bit set in the ESR. The WnR bit, usually used to distinguish between
faulting loads and stores, always reads as 1 and (slightly confusingly)
the instructions are treated as reads by the architecture.
This patch fixes our fault handling code to treat cache maintenance
faults in the same way as loads.
Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Chen Gang [Mon, 24 Jun 2013 09:27:49 +0000 (10:27 +0100)]
arm64: add '#ifdef CONFIG_COMPAT' for aarch32_break_handler()
If 'COMPAT' not defined, aarch32_break_handler() cannot pass compiling,
and it can work independent with 'COMPAT', so remove dummy definition.
The related error:
arch/arm64/kernel/debug-monitors.c:249:5: error: redefinition of ‘aarch32_break_handler’
In file included from arch/arm64/kernel/debug-monitors.c:29:0:
/root/linux-next/arch/arm64/include/asm/debug-monitors.h:89:12: note: previous definition of ‘aarch32_break_handler’ was here
Signed-off-by: Chen Gang <gang.chen@asianux.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
arm64: Only enable local interrupts after the CPU is marked online
There is a slight chance that (timer) interrupts are triggered before a
secondary CPU has been marked online with implications on softirq thread
affinity.
Markos Chandras [Tue, 11 Jun 2013 09:02:33 +0000 (09:02 +0000)]
MIPS: kvm: Kconfig: Drop HAVE_KVM dependency from VIRTUALIZATION
Virtualization does not always need KVM capabilities so drop the
dependency. The KVM symbol already depends on HAVE_KVM.
Fixes the following problem on a randconfig:
warning: (REMOTEPROC && RPMSG) selects VIRTUALIZATION which has unmet direct
dependencies (HAVE_KVM)
warning: (REMOTEPROC && RPMSG) selects VIRTUALIZATION which has unmet
direct dependencies (HAVE_KVM)
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Acked-by: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5443/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Currently we use both struct siginfo and siginfo_t.
Let's use struct siginfo internally to avoid ongoing
compiler warning. We are allowed to do so because
struct siginfo and siginfo_t are equivalent.
Signed-off-by: Richard Weinberger <richard@nod.at>
During the pruning of the device tree octeon_fdt_pip_iface() is called
for each PIP interface and every port up to the port count is removed
from the device tree. However, the count was set to the return value of
cvmx_helper_interface_enumerate() which doesn't actually return the
count but just returns zero on success. This effectively removed *all*
ports from the tree.
Use cvmx_helper_ports_on_interface() instead to fix this. This
successfully restores the 3 ports of my ERLite-3 and fixes the "kernel
assigns random MAC addresses" issue.
Signed-off-by: Faidon Liambotis <paravoid@debian.org> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5587/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
uml: Fix which_tmpdir failure when /dev/shm is a symlink, and in other edge cases
which_tmpdir did the wrong thing if /dev/shm was a symlink (e.g., to /run/shm),
if there were multiple mounts on top of each other, if the mount(s) were
obscured by a later mount, or if /dev/shm was a prefix of another mount point.
This fixes these cases. Applies to 3.9.6.
Signed-off-by: Tristan Schmelcher <tschmelcher@google.com> Signed-off-by: Richard Weinberger <richard@nod.at>
James Hogan [Fri, 12 Jul 2013 10:26:11 +0000 (10:26 +0000)]
MIPS: KVM: Mark KVM_GUEST (T&E KVM) as BROKEN_ON_SMP
Make KVM_GUEST depend on BROKEN_ON_SMP so that it cannot be enabled with
SMP.
SMP kernels use ll/sc instructions for an atomic section in the tlb fill
handler, with a tlbp instruction contained in the middle. This cannot be
emulated with trap & emulate KVM because the tlbp instruction traps and
the eret to return to the guest code clears the LLbit which makes the sc
instruction always fail.
Aaro Koskinen [Mon, 15 Jul 2013 07:21:57 +0000 (07:21 +0000)]
MIPS: tlbex: fix broken build in v3.11-rc1
Commit 6ba045f9fbdafb48da42aa8576ea7a3980443136 (MIPS: Move generated code
to .text for microMIPS) deleted tlbmiss_handler_setup_pgd_array, but some
references were not converted. Fix that to enable building a MIPS kernel.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Jayachandran C. <jchandra@broadcom.com> Acked-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5589/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Jayachandran C [Wed, 17 Jul 2013 10:27:26 +0000 (10:27 +0000)]
MIPS: Netlogic: Add XLP PIC irqdomain
Add a legacy irq domain for the XLP PIC interrupts. This will be used
when interrupts are assigned from the device tree. This change is required
after commit c5cdc67 "irqdomain: Remove temporary MIPS workaround code".
Signed-off-by: Jayachandran C <jchandra@broadcom.com> Cc: linux-mips@linux-mips.org Cc: Jayachandran C <jchandra@broadcom.com>
Patchwork: https://patchwork.linux-mips.org/patch/5597/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The on-chip USB controller on Netlogic XLP does not suppport
DMA beyond 32-bit physical address. Set the coherent_dma_mask
of the USB in its PCI fixup to support this.
Tony Wu [Thu, 18 Jul 2013 09:45:47 +0000 (09:45 +0000)]
MIPS: tlbex: Fix typo in r3000 tlb store handler
commit 6ba045f (MIPS: Move generated code to .text for microMIPS)
causes a panic at boot. The handler builder should test against
handle_tlbs_end, not handle_tlbs.
Signed-off-by: Tony Wu <tung7970@gmail.com> Acked-by: Jayachandran C. <jchandra@broadcom.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5600/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>