Jeff Layton [Fri, 21 Jun 2013 12:58:15 +0000 (08:58 -0400)]
locks: protect most of the file_lock handling with i_lock
Having a global lock that protects all of this code is a clear
scalability problem. Instead of doing that, move most of the code to be
protected by the i_lock instead. The exceptions are the global lists
that the ->fl_link sits on, and the ->fl_block list.
->fl_link is what connects these structures to the
global lists, so we must ensure that we hold those locks when iterating
over or updating these lists.
Furthermore, sound deadlock detection requires that we hold the
blocked_list state steady while checking for loops. We also must ensure
that the search and update to the list are atomic.
For the checking and insertion side of the blocked_list, push the
acquisition of the global lock into __posix_lock_file and ensure that
checking and update of the blocked_list is done without dropping the
lock in between.
On the removal side, when waking up blocked lock waiters, take the
global lock before walking the blocked list and dequeue the waiters from
the global list prior to removal from the fl_block list.
With this, deadlock detection should be race free while we minimize
excessive file_lock_lock thrashing.
Finally, in order to avoid a lock inversion problem when handling
/proc/locks output we must ensure that manipulations of the fl_block
list are also protected by the file_lock_lock.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Fri, 21 Jun 2013 12:58:14 +0000 (08:58 -0400)]
locks: encapsulate the fl_link list handling
Move the fl_link list handling routines into a separate set of helpers.
Also ensure that locks and requests are always put on global lists
last (after fully initializing them) and are taken off before unintializing
them.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jeff Layton [Fri, 21 Jun 2013 12:58:10 +0000 (08:58 -0400)]
cifs: use posix_unblock_lock instead of locks_delete_block
commit 66189be74 (CIFS: Fix VFS lock usage for oplocked files) exported
the locks_delete_block symbol. There's already an exported helper
function that provides this capability however, so make cifs use that
instead and turn locks_delete_block back into a static function.
Note that if fl->fl_next == NULL then this lock has already been through
locks_delete_block(), so we should be OK to ignore an ENOENT error here
and simply not retry the lock.
Cc: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Tue, 21 May 2013 22:22:44 +0000 (15:22 -0700)]
Don't pass inode to ->d_hash() and ->d_compare()
Instances either don't look at it at all (the majority of cases) or
only want it to find the superblock (which can be had as dentry->d_sb).
A few cases that want more are actually safe with dentry->d_inode -
the only precaution needed is the check that it hadn't been replaced with
NULL by rmdir() or by overwriting rename(), which case should be simply
treated as cache miss.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 16 Jun 2013 15:08:36 +0000 (19:08 +0400)]
fanotify: quit wanking with FASYNC in ->release()
... especially since there's no way to get that sucker
on the list fsnotify_fasync() works with - the only thing
adding to it is fsnotify_fasync() itself and it's never
called for fanotify files while they are opened.
David Howells [Thu, 13 Jun 2013 22:37:55 +0000 (23:37 +0100)]
SELinux: Institute file_path_has_perm()
Create a file_path_has_perm() function that is like path_has_perm() but
instead takes a file struct that is the source of both the path and the
inode (rather than getting the inode from the dentry in the path). This
is then used where appropriate.
This will be useful for situations like unionmount where it will be
possible to have an apparently-negative dentry (eg. a fallthrough) that is
open with the file struct pointing to an inode on the lower fs.
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 11 Jun 2013 04:34:36 +0000 (08:34 +0400)]
allow the temp files created by open() to be linked to
O_TMPFILE | O_CREAT => linkat() with AT_SYMLINK_FOLLOW and /proc/self/fd/<n>
as oldpath (i.e. flink()) will create a link
O_TMPFILE | O_CREAT | O_EXCL => ENOENT on attempt to link those guys
Al Viro [Thu, 16 May 2013 01:02:48 +0000 (21:02 -0400)]
[readdir] convert ext3
new helper: dir_relax(inode). Call when you are in location that will
_not_ be invalidated by directory modifications (block boundary, in case
of ext*). Returns whether the directory has survived (dropping i_mutex
allows rmdir to kill the sucker; if it returns false to us, ->iterate()
is obviously done)
New method - ->iterate(file, ctx). That's the replacement for ->readdir();
it takes callback from ctx->actor, uses ctx->pos instead of file->f_pos and
calls dir_emit(ctx, ...) instead of filldir(data, ...). It does *not*
update file->f_pos (or look at it, for that matter); iterate_dir() does the
update.
Note that dir_emit() takes the offset from ctx->pos (and eventually
filldir_t will lose that argument).
Al Viro [Wed, 15 May 2013 17:52:59 +0000 (13:52 -0400)]
[readdir] introduce iterate_dir() and dir_context
iterate_dir(): new helper, replacing vfs_readdir().
struct dir_context: contains the readdir callback (and will get more stuff
in it), embedded into whatever data that callback wants to deal with;
eventually, we'll be passing it to ->readdir() replacement instead of
(data,filldir) pair.