SUNRPC client: add interface for binding to a local address
In addition to binding to a local privileged port the NFS client should
allow binding to a specific local address. This is used by the server
for callbacks. The patch adds the necessary interface.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
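For illustration only, here is a user-space sketch of binding a socket to a specific local IPv4 address before connecting. It is not the SUNRPC interface this patch adds; the helper name bind_local_addr() is made up for the example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of the patch): bind fd to the given
 * local IPv4 address and let the kernel choose the port (port 0).
 * Returns 0 on success, -1 on a bad address string or bind() failure.
 */
int bind_local_addr(int fd, const char *local_ip)
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(0);
	if (inet_pton(AF_INET, local_ip, &sin.sin_addr) != 1)
		return -1;
	return bind(fd, (struct sockaddr *)&sin, sizeof(sin));
}
```

A caller would create the socket, call bind_local_addr(fd, "127.0.0.1") or similar, and then connect() to the server, so that outgoing traffic carries the chosen source address.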
NFSv4: Make the NFS state model work with the nosharedcache mount option
Consider the case where the user has mounted the remote filesystem
server:/foo on the two local directories /bar and /baz using the
nosharedcache mount option. The files /bar/file and /baz/file are
represented by different inodes in the local namespace, but refer to the
same file /foo/file on the server.
Consider the case where a process opens both /bar/file and /baz/file, then
closes /bar/file: because the nfs4_state is not shared between /bar/file
and /baz/file, the kernel will see that the nfs4_state for /bar/file is no
longer referenced, so it will send off a CLOSE RPC call. Unless the
open_owners differ, that CLOSE call will invalidate the open state on
/baz/file too.
Conclusion: we cannot share open state owners between two different
non-shared mount instances of the same filesystem.
Trond Myklebust [Wed, 16 May 2007 20:53:28 +0000 (16:53 -0400)]
NFS: Error when mounting the same filesystem with different options
Unless the user sets the NFS_MOUNT_NOSHAREDCACHE mount flag, we should
return EBUSY if the filesystem is already mounted on a superblock that
has set conflicting mount options.
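A minimal sketch of the rule described above, assuming a simplified option structure. The struct, the MNT_NOSHARE bit and the function name are stand-ins invented for the example, not the kernel's definitions.

```c
#include <errno.h>

/* Hypothetical stand-in for the NFS_MOUNT_NOSHAREDCACHE flag bit. */
#define MNT_NOSHARE 0x1u

/* Simplified, made-up subset of per-mount options. */
struct mount_opts {
	unsigned int flags;
	unsigned int rsize;
	unsigned int wsize;
};

/*
 * Decide what to do when the filesystem is already mounted:
 *   1      -> caller asked not to share, give it a private superblock
 *   0      -> options match, reuse the existing superblock
 *   -EBUSY -> options conflict and sharing was not disabled
 */
int check_mount_share(const struct mount_opts *existing,
		      const struct mount_opts *wanted)
{
	if (wanted->flags & MNT_NOSHARE)
		return 1;
	if (existing->flags != wanted->flags ||
	    existing->rsize != wanted->rsize ||
	    existing->wsize != wanted->wsize)
		return -EBUSY;
	return 0;
}
```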
Trond Myklebust [Wed, 16 May 2007 20:53:28 +0000 (16:53 -0400)]
NFS: Add the mount option "nosharecache"
Prior to David Howells' mount changes in 2.6.18, users who mounted
different directories which happened to be from the same filesystem on the
server would get different super blocks, and hence could choose different
mount options. As long as there were no hard linked files that crossed from
one subtree to another, this was quite safe.
After those changes, if the two directories are on the same filesystem (have
the same 'fsid'), they will share the same super block, and hence the same
mount options.
Add a flag to allow users to elect not to share the NFS super block with
another mount point, even if the fsids are the same. This will allow
users to set different mount options for the two different super blocks, as
was previously possible. It is still up to the user to ensure that there
are no cache coherency issues when doing this; however, the default
behaviour will be to share super blocks whenever two paths result in
the same fsid.
Chuck Lever [Sun, 1 Jul 2007 16:13:49 +0000 (12:13 -0400)]
NFS: Introduce generic mount client API
For NFSv2 and v3 mounts, the first step is to contact the server's MOUNTD
and request the file handle for the root of the mounted share. Add a
function to the NFS client that handles this operation.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sun, 1 Jul 2007 16:13:27 +0000 (12:13 -0400)]
NFS: Remake nfsroot_mount as a permanent part of NFS client
In preparation for supporting NFSv2 and NFSv3 mount option handling in the
kernel NFS client, convert mount_clnt.c to be a permanent part of the NFS
client, instead of built only when CONFIG_ROOT_NFS is enabled.
In addition, we also replace the "struct sockaddr_in *" argument with
something more generic, to help support IPv6 at some later point.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sun, 1 Jul 2007 16:13:22 +0000 (12:13 -0400)]
SUNRPC: Add a convenient default for the hostname when calling rpc_create()
A couple of callers just use a stringified IP address for the rpc client's
hostname. Move the logic for constructing this into rpc_create(), so it can
be shared.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
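As a user-space illustration of the idea (not the rpc_create() code itself), the default can be sketched as: use the caller's hostname if one was given, otherwise print the IPv4 address into a buffer. The function name pick_servername() is an assumption made for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return servername if the caller supplied one;
 * otherwise format the IPv4 address in sin as a dotted quad into buf
 * and return buf. Returns NULL if buf is too small.
 */
const char *pick_servername(const char *servername,
			    const struct sockaddr_in *sin,
			    char *buf, size_t buflen)
{
	if (servername != NULL)
		return servername;
	if (inet_ntop(AF_INET, &sin->sin_addr, buf, (socklen_t)buflen) == NULL)
		return NULL;
	return buf;
}
```

Centralising this in one place means each caller can simply pass NULL instead of building the string itself.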
Chuck Lever [Sun, 1 Jul 2007 16:13:12 +0000 (12:13 -0400)]
SUNRPC: Rename rpcb_getport_external routine
In preparation for handling NFS mount option parsing in the kernel,
rename rpcb_getport_external as rpcb_get_port_sync, and make it available
always (instead of only when CONFIG_ROOT_NFS is enabled).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sun, 1 Jul 2007 16:12:56 +0000 (12:12 -0400)]
NFS: Clean up nfs_validate_mount_data
Move error handling code out of the main code path. The switch statement
was also improperly indented, according to Documentation/CodingStyle. This
prepares nfs_validate_mount_data for the addition of option string parsing.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
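The pattern referred to here, keeping the success path straight and moving failures to labels at the end of the function, can be sketched as follows. The structure, the version limit and the error codes are invented for the example and are not those of nfs_validate_mount_data().

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_MAX_VERSION 6	/* made-up limit for the example */

/* Simplified, hypothetical mount data. */
struct mount_data {
	int version;
	const char *hostname;
};

int validate_mount_data(const struct mount_data *data)
{
	/* Main path: only checks, no error reporting inline. */
	if (data == NULL)
		goto out_no_data;
	if (data->version < 1 || data->version > SKETCH_MAX_VERSION)
		goto out_bad_version;
	if (data->hostname == NULL || data->hostname[0] == '\0')
		goto out_no_hostname;
	return 0;

	/* Error handling collected out of line. */
out_no_data:
	fprintf(stderr, "mount: missing data argument\n");
	return -EINVAL;
out_bad_version:
	fprintf(stderr, "mount: unsupported data version %d\n", data->version);
	return -EPROTONOSUPPORT;
out_no_hostname:
	fprintf(stderr, "mount: no server hostname\n");
	return -EINVAL;
}
```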
Chuck Lever [Sun, 1 Jul 2007 16:12:46 +0000 (12:12 -0400)]
NFS: Clean-up: Refactor IP address sanity checks in NFS client
NFS and NFSv4 mounts can now share server address sanity checking. This also
provides an easy mechanism for adding IPv6 address checking at some later
point.
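A sketch of what a shared check of this kind might look like, assuming the only rule is "IPv4 and not the wildcard address". The function name and the exact rule are assumptions for illustration; the switch on the address family is where an IPv6 case could be slotted in later.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical shared sanity check: returns 1 if addr looks like a
 * usable server address, 0 otherwise.
 */
int verify_server_address(const struct sockaddr *addr)
{
	switch (addr->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *sin =
			(const struct sockaddr_in *)addr;

		/* The wildcard address cannot name a server. */
		return sin->sin_addr.s_addr != htonl(INADDR_ANY);
	}
	/* An AF_INET6 case could be added here. */
	default:
		break;
	}
	return 0;
}
```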
Chuck Lever [Sun, 1 Jul 2007 16:12:40 +0000 (12:12 -0400)]
NFS: Clean-up: fix a compiler warning in fs/nfs/super.c
/home/cel/linux/fs/nfs/super.c: In function 'nfs_pseudoflavour_to_name':
/home/cel/linux/fs/nfs/super.c:270: warning: comparison between signed and unsigned
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
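The commit text does not show the fix itself. A common cause of this warning is a signed loop index compared against an unsigned array size; the sketch below, with a made-up table and function name, shows the warning-free form using an unsigned index.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative table only; not the one in fs/nfs/super.c. */
static const struct {
	unsigned int flavour;
	const char *name;
} sec_flavours[] = {
	{ 0, "null" },
	{ 1, "sys" },
	{ 390003, "krb5" },
};

const char *flavour_to_name(unsigned int flavour)
{
	/*
	 * ARRAY_SIZE() yields a size_t. Declaring i as "int" here would
	 * trigger "comparison between signed and unsigned" under
	 * -Wsign-compare; an unsigned index avoids it.
	 */
	size_t i;

	for (i = 0; i < ARRAY_SIZE(sec_flavours); i++)
		if (sec_flavours[i].flavour == flavour)
			return sec_flavours[i].name;
	return "unknown";
}
```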
NFSv4: Don't call OPEN if we already have an open stateid for a file
If we already have a stateid with the correct open mode for a given file,
then we can reuse that stateid instead of re-issuing an OPEN call without
violating the close-to-open caching semantics.
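A minimal sketch of the check, assuming the state tracks how many local opens exist per mode. The structure and function name are simplified stand-ins for illustration, not the kernel's nfs4_state handling.

```c
#include <stdbool.h>

#define FMODE_READ  0x1u
#define FMODE_WRITE 0x2u

/* Simplified, hypothetical per-file open state: open counts per mode. */
struct open_state {
	unsigned int n_rdonly;
	unsigned int n_wronly;
	unsigned int n_rdwr;
};

/*
 * Return true if an open with exactly this mode already exists, in
 * which case the cached stateid can be reused and no new OPEN call
 * needs to be sent.
 */
bool can_reuse_open_state(const struct open_state *st, unsigned int mode)
{
	switch (mode) {
	case FMODE_READ:
		return st->n_rdonly != 0;
	case FMODE_WRITE:
		return st->n_wronly != 0;
	case FMODE_READ | FMODE_WRITE:
		return st->n_rdwr != 0;
	default:
		return false;
	}
}
```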
NFSv4: Defer inode revalidation when setting up a delegation
Currently we force a synchronous call to __nfs_revalidate_inode() in
nfs_inode_set_delegation(). This not only ensures that we cannot call
nfs_inode_set_delegation from an asynchronous context, but it also slows
down any call to open().
There appear to be some rogue servers out there that issue multiple
delegations with different stateids for the same file. Ensure that when we
return delegations, we do so on a per-stateid basis rather than a per-file
basis.
NFSv4: set the delegation in nfs4_opendata_to_nfs4_state
This ensures that nfs4_open_release() and nfs4_open_confirm_release()
can now handle an eventual delegation that was returned with our open.
As such, it fixes a delegation "leak" when the user breaks out of an open
call.
The test for state->state == 0 does not tell you that the stateid is in the
process of being freed. It really tells you that the stateid is not yet
initialised...
Currently we do not check for the FMODE_EXEC flag as we should. For that
particular case, we need to perform an ACCESS call to the server in order
to check that the file is executable.
Trond Myklebust [Wed, 27 Jun 2007 18:29:04 +0000 (14:29 -0400)]
SUNRPC: Remove the tk_auth macro...
We should almost always be dereferencing the rpc_auth struct by means of the
credential's cr_auth field instead of the rpc_clnt->cl_auth anyway. Fix up
that historical mistake, and remove the macro that propagated it.
Trond Myklebust [Sat, 9 Jun 2007 19:41:42 +0000 (15:41 -0400)]
SUNRPC: Fix a memory leak in the auth credcache code
The leak only affects the RPCSEC_GSS caches, since they are the only ones
that are dynamically allocated...
Rename the existing rpcauth_free_credcache() to rpcauth_clear_credcache()
in order to better describe its role, then add a new function
rpcauth_destroy_credcache() that actually frees the cache in addition to
clearing it out.
Also move the call to destroy the credcache in gss_destroy() to come before
the rpc upcall pipe is unlinked.
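The clear/destroy split can be illustrated with a small user-space model. The structures and function names below are simplified stand-ins, not the SUNRPC ones: "clear" empties a cache that stays usable, while "destroy" also frees the dynamically allocated cache itself, which is the step whose absence leaks memory.

```c
#include <stdlib.h>

/* Simplified, hypothetical credential and cache types. */
struct cred {
	struct cred *next;
};

struct credcache {
	struct cred *head;
	unsigned int nr;
};

struct credcache *credcache_alloc(void)
{
	return calloc(1, sizeof(struct credcache));
}

int credcache_add(struct credcache *cache)
{
	struct cred *c = malloc(sizeof(*c));

	if (c == NULL)
		return -1;
	c->next = cache->head;
	cache->head = c;
	cache->nr++;
	return 0;
}

/* Free every entry but keep the cache itself allocated and usable. */
void credcache_clear(struct credcache *cache)
{
	while (cache->head != NULL) {
		struct cred *c = cache->head;

		cache->head = c->next;
		free(c);
	}
	cache->nr = 0;
}

/* Clear the entries, then free the cache and reset the pointer. */
void credcache_destroy(struct credcache **cachep)
{
	if (*cachep == NULL)
		return;
	credcache_clear(*cachep);
	free(*cachep);
	*cachep = NULL;
}
```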
Trond Myklebust [Thu, 7 Jun 2007 19:31:36 +0000 (15:31 -0400)]
SUNRPC: Add a downcall queue to struct rpc_inode
Currently, the downcall queue is tied to the struct gss_auth, which means
that different RPCSEC_GSS pseudoflavours must use different upcall pipes.
Add a list to struct rpc_inode that can be used instead.
Trond Myklebust [Thu, 7 Jun 2007 14:14:15 +0000 (10:14 -0400)]
SUNRPC: Always match an upcall message in gss_pipe_downcall()
It used to be possible for an rpc.gssd daemon to stuff the RPC credential
cache for any rpc client simply by creating RPCSEC_GSS contexts and then
doing downcalls. In practice, no daemons ever made use of this feature.
Remove this feature now, since it will be impossible to figure out which
mechanism a given context actually matches if we enable more
than one gss mechanism to use the same upcall pipe.
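The matching requirement can be sketched as follows: a downcall is accepted only if a pending upcall for the same uid exists. The types, the lookup by uid alone and the -ENOENT return are assumptions made for the example, not the gss_pipe_downcall() implementation.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified, hypothetical pending-upcall record. */
struct upcall_msg {
	unsigned int uid;
	int done;
	struct upcall_msg *next;
};

/* Find the first unanswered upcall for uid, or NULL if there is none. */
struct upcall_msg *find_upcall(struct upcall_msg *pending, unsigned int uid)
{
	struct upcall_msg *msg;

	for (msg = pending; msg != NULL; msg = msg->next)
		if (!msg->done && msg->uid == uid)
			return msg;
	return NULL;
}

/*
 * Accept a downcall only when it answers a pending upcall. Unsolicited
 * downcalls are rejected instead of being stuffed into the cache.
 */
int handle_downcall(struct upcall_msg *pending, unsigned int uid)
{
	struct upcall_msg *msg = find_upcall(pending, uid);

	if (msg == NULL)
		return -ENOENT;
	msg->done = 1;
	return 0;
}
```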
Trond Myklebust [Thu, 7 Jun 2007 14:14:14 +0000 (10:14 -0400)]
SUNRPC: Add a backpointer from the struct rpc_cred to the rpc_auth
Cleans up an issue whereby rpcsec_gss uses the rpc_clnt->cl_auth. If we want
to be able to add several rpc_auths to a single rpc_clnt, then this abuse
must go.
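A small model of why the backpointer helps, using simplified stand-in structures that borrow only the cr_auth and cl_auth field names: once each credential records the auth it was created under, code holding a credential no longer has to assume the client has exactly one auth.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the RPC structures. */
struct auth {
	const char *name;
};

struct client {
	struct auth *cl_auth;	/* the client's current default auth */
};

struct cred {
	struct auth *cr_auth;	/* backpointer: auth that owns this cred */
	unsigned int uid;
};

void cred_init(struct cred *cred, struct auth *auth, unsigned int uid)
{
	cred->cr_auth = auth;
	cred->uid = uid;
}

/* Reach the auth through the credential, never through the client. */
const char *cred_auth_name(const struct cred *cred)
{
	return cred->cr_auth->name;
}
```

With this layout, changing a client's cl_auth (or giving it several auths) does not change which auth an existing credential resolves to.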
Trond Myklebust [Thu, 14 Jun 2007 22:00:42 +0000 (18:00 -0400)]
SUNRPC: fix hang due to eventd deadlock...
Brian Behlendorf writes:
The root cause of the NFS hang we were observing appears to be a rare
deadlock between the kernel provided usermodehelper API and the linux NFS
client. The deadlock can arise because both of these services use the
generic linux work queues. The usermodehelper API runs the specified user
application in the context of the work queue. And NFS submits both cleanup
and reconnect work to the generic work queue for handling. Normally this
is fine but a deadlock can result in the following situation.
- NFS client is in a disconnected state
- [events/0] runs a usermodehelper app with an NFS dependent operation,
this triggers an NFS reconnect.
- NFS reconnect happens to be submitted to [events/0] work queue.
- Deadlock, the [events/0] work queue will never process the
reconnect because it is blocked on the previous NFS dependent
operation which will not complete.
The solution is simply to run reconnect requests on rpciod.
Trond Myklebust [Thu, 14 Jun 2007 21:08:36 +0000 (17:08 -0400)]
SUNRPC: Optimise rpciod_up()
Instead of taking the mutex every time we just need to increment/decrement
rpciod_users, we can optimise by using atomic_inc_not_zero and
atomic_dec_and_test.
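For readers unfamiliar with the two primitives, here is a user-space analogue written with C11 atomics. These are sketches of the semantics only, under the names inc_not_zero() and dec_and_test(); the kernel's atomic_inc_not_zero() and atomic_dec_and_test() are implemented differently.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment *v unless it is zero. Returns true if the increment
 * happened. A zero count means "not running", so the caller must fall
 * back to the slow path (take the mutex and start things up).
 */
bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Decrement *v and report whether it reached zero. Only the caller
 * that drops the last reference needs the slow path for teardown.
 */
bool dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}
```

The point of the optimisation is that the common cases (count already non-zero on the way up, count still non-zero on the way down) complete without touching the mutex.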
Trond Myklebust [Thu, 14 Jun 2007 20:40:32 +0000 (16:40 -0400)]
SUNRPC: Remove rpc_clnt->cl_count
The kref now does most of what cl_count + cl_user used to do. The only
remaining role for cl_count is to tell us if we are in a 'shutdown'
phase. We can provide that information using a single bit field instead
of a full atomic counter.
Also rename rpc_destroy_client() to rpc_close_client(), which reflects
better what its role is these days.