J. Bruce Fields [Thu, 22 Mar 2012 20:07:18 +0000 (16:07 -0400)]
nfsd4: allow numeric idmapping
Mimic the client side by providing a module parameter that turns off
idmapping in the auth_sys case, for backwards compatibility with NFSv2
and NFSv3.
Unlike in the client case, we don't have any way to negotiate, since the
client can return an error to us if it doesn't like the id that we
return to it in (for example) a getattr call.
However, it has always been possible for servers to return numeric id's,
and as far as we're aware clients have always been able to handle them.
Also, in the auth_sys case clients already need to have numeric id's the
same between client and server.
Therefore we believe it's safe to default this to on; but the module
parameter is available to return to previous behavior if this proves to
be a problem in some unexpected setup.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Jeff Layton [Wed, 21 Mar 2012 13:52:08 +0000 (09:52 -0400)]
nfsd: add notifier to handle mount/unmount of rpc_pipefs sb
In the event that rpc_pipefs isn't mounted when nfsd starts, we
must register a notifier to handle creating the dentry once it
is mounted, and to remove the dentry on unmount.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Jeff Layton [Wed, 21 Mar 2012 13:52:07 +0000 (09:52 -0400)]
nfsd: add the infrastructure to handle the cld upcall
...and add a mechanism for switching between the "legacy" tracker and
the new one. The decision is made by looking to see whether the
v4recoverydir exists. If it does, then the legacy client tracker is
used.
If it's not, then the kernel will create a "cld" pipe in rpc_pipefs.
That pipe is used to talk to a daemon for handling the upcall.
Most of the data structures for the new client tracker are handled on a
per-namespace basis, so this upcall should be essentially ready for
containerization. For now however, nfsd just starts it by calling the
initialization and exit functions for init_net.
I'm making the assumption that at some point in the future we'll be able
to determine the net namespace from the nfs4_client. Until then, this
patch hardcodes init_net in those places. I've sprinkled some "FIXME"
comments around that code to attempt to make it clear where we'll need
to fix that up later.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Jeff Layton [Wed, 21 Mar 2012 13:52:05 +0000 (09:52 -0400)]
nfsd: add a per-net-namespace struct for nfsd
Eventually, we'll need this when nfsd gets containerized fully. For
now, create a struct on a per-net-namespace basis that will just hold
a pointer to the cld_net structure. That struct will hold all of the
per-net data that we need for the cld tracker.
Eventually we can add other pernet objects to struct nfsd_net.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Jeff Layton [Wed, 21 Mar 2012 20:42:43 +0000 (16:42 -0400)]
nfsd: add nfsd4_client_tracking_ops struct and a way to set it
Abstract out the mechanism that we use to track clients into a set of
client name tracking functions.
This gives us a mechanism to plug in a new set of client tracking
functions without disturbing the callers. It also gives us a way to
decide on what tracking scheme to use at runtime.
For now, this just looks like pointless abstraction, but later we'll
add a new alternate scheme for tracking clients on stable storage.
Note too that this patch anticipates the eventual containerization
of this code by passing in struct net pointers in places. No attempt
is made to containerize the legacy client tracker however.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Jeff Layton [Wed, 21 Mar 2012 13:52:02 +0000 (09:52 -0400)]
nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
We'll need a way to flag the nfs4_client as already being recorded on
stable storage so that we don't continually upcall. Currently, that's
recorded in the cl_firststate field of the client struct. Using an
entire u32 to store a flag is rather wasteful though.
The cl_cb_flags field is only using 2 bits right now, so repurpose that
to a generic flags field. Rename NFSD4_CLIENT_KILL to
NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback
flags. Add a mask that we can use for existing checks that look to see
whether any flags are set, so that the new flags don't interfere.
Convert all references to cl_firstate to the NFSD4_CLIENT_STABLE flag,
and add a new NFSD4_CLIENT_RECLAIM_COMPLETE flag.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Vivek Trivedi [Thu, 15 Mar 2012 18:28:52 +0000 (23:58 +0530)]
NFS: fix sb->s_id in nfs debug prints
NFS bdi flush thread in ps output is printed like "flush-<major number
in decimal>:<minor number in decimal>"
For example:
$ ps aux | grep flush
2079 root 0 SW [flush-0:18]
^^^^
Sachin Bhamare [Tue, 20 Mar 2012 03:47:58 +0000 (20:47 -0700)]
pnfs-obj: autologin: Add support for protocol autologin
The pnfs-objects protocol mandates that we autologin into devices not
present in the system, according to information specified in the
get_device_info returned from the server.
The Protocol specifies two login hints.
1. An IP address:port combination
2. A string URI which is constructed as a URL with a protocol prefix
followed by :// and a string as address. For each protocol prefix
the string-address format might be different.
We only support the second option. The first option is just redundant
to the second one.
NOTE: The Kernel part of autologin does not parse the URI string. It
just channels it to a user-mode script. So any new login protocols should
only update the user-mode script which is a part of the nfs-utils package,
but the Kernel need not change.
We implement the autologin by using the call_usermodehelper() API.
(Thanks to Steve Dickson <steved@redhat.com> for pointing it out)
So there is no running daemon needed, and/or special setup.
We Add the osd_login_prog Kernel module parameters which defaults to:
/sbin/osd_login
Kernel try's to upcall the program specified in osd_login_prog. If the file is
not found or the execution fails Kernel will disable any farther upcalls, by
zeroing out osd_login_prog, Until Admin re-enables it by setting the
osd_login_prog parameter to a proper program.
Also add text about the osd_login program command line API to:
Documentation/filesystems/nfs/pnfs.txt
and documentation of the new osd_login_prog module parameter to:
Documentation/kernel-parameters.txt
TODO: Add timeout option in the case osd_login program gets
stuck
Trond Myklebust [Tue, 20 Mar 2012 13:22:00 +0000 (09:22 -0400)]
SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined
Stephen Rothwell reports:
net/sunrpc/rpcb_clnt.c: In function 'rpcb_enc_mapping':
net/sunrpc/rpcb_clnt.c:820:19: warning: unused variable 'task' [-Wunused-variable]
net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_getport':
net/sunrpc/rpcb_clnt.c:837:19: warning: unused variable 'task' [-Wunused-variable]
net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_set':
net/sunrpc/rpcb_clnt.c:860:19: warning: unused variable 'task' [-Wunused-variable]
net/sunrpc/rpcb_clnt.c: In function 'rpcb_enc_getaddr':
net/sunrpc/rpcb_clnt.c:892:19: warning: unused variable 'task' [-Wunused-variable]
net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_getaddr':
net/sunrpc/rpcb_clnt.c:914:19: warning: unused variable 'task' [-Wunused-variable]
fs/lockd/svclock.c:49:20: warning: 'nlmdbg_cookie2a' declared 'static' but never defined [-Wunused-function]
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 2 Mar 2012 22:13:50 +0000 (17:13 -0500)]
NFSD: Fix nfs4_verifier memory alignment
Clean up due to code review.
The nfs4_verifier's data field is not guaranteed to be u32-aligned.
Casting an array of chars to a u32 * is considered generally
hazardous.
We can fix most of this by using a __be32 array to generate the
verifier's contents and then byte-copying it into the verifier field.
However, there is one spot where there is a backwards compatibility
constraint: the do_nfsd_create() call expects a verifier which is
32-bit aligned. Fix this spot by forcing the alignment of the create
verifier in the nfsd4_open args structure.
Also, sizeof(nfs4_verifer) is the size of the in-core verifier data
structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd
verifier. The two are not interchangeable, even if they happen to
have the same value.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Trond Myklebust [Sun, 18 Mar 2012 18:07:42 +0000 (14:07 -0400)]
SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG
This allows us to turn on/off the dprintk() debugging interfaces for
those distributions that don't ship the 'rpcdebug' utility.
It also allows us to add Kbuild dependencies. Specifically, we already
know that dprintk() in general relies on CONFIG_SYSCTL. Now it turns out
that the NFS dprintks depend on CONFIG_CRC32 after we added support
for the filehandle hash.
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 17 Mar 2012 15:59:30 +0000 (11:59 -0400)]
NFS: Use cond_resched_lock() to reduce latencies in the commit scans
Ensure that we conditionally drop the inode->i_lock when it is safe
to do so in the commit loops.
We do so after locking the nfs_page, but before removing it from the
commit list. We can then use list_safe_reset_next to recover the loop
after the lock is retaken.
Trond Myklebust [Mon, 19 Mar 2012 20:17:18 +0000 (16:17 -0400)]
NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner
It is quite possible for the release_lockowner RPC call to race with the
close RPC call, in which case, we cannot dereference lsp->ls_state in
order to find the nfs_server.
Trond Myklebust [Mon, 19 Mar 2012 17:39:35 +0000 (13:39 -0400)]
SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()
The problem is that for the case of priority queues, we
have to assume that __rpc_remove_wait_queue_priority will move new
elements from the tk_wait.links lists into the queue->tasks[] list.
We therefore cannot use list_for_each_entry_safe() on queue->tasks[],
since that will skip these new tasks that __rpc_remove_wait_queue_priority
is adding.
Without this fix, rpc_wake_up and rpc_wake_up_status will both fail
to wake up all functions on priority wait queues, which can result
in some nasty hangs.
Reported-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
J. Bruce Fields [Mon, 19 Mar 2012 16:34:39 +0000 (12:34 -0400)]
nfsd: merge cookie collision fixes from ext4 tree
These changes fix readdir loops on ext4 filesystems with dir_index
turned on. I'm pulling them from Ted's tree as I'd like to give them
some extra nfsd testing, and expect to be applying (potentially
conflicting) patches to the same code before the next merge window.
Bernd Schubert [Mon, 19 Mar 2012 02:44:50 +0000 (22:44 -0400)]
nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)
Use 32-bit or 64-bit llseek() hashes for directory offsets depending on
the NFS version. NFSv2 gets 32-bit hashes only.
NOTE: This patch got rather complex as Christoph asked to set the
filp->f_mode flag in the open call or immediatly after dentry_open()
in nfsd_open() to avoid races.
Personally I still do not see a reason for that and in my opinion
FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it
follows directly after nfsd_open() without a chance of races.
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: J. Bruce Fields<bfields@redhat.com>
Fan Yong [Mon, 19 Mar 2012 02:44:40 +0000 (22:44 -0400)]
ext4: return 32/64-bit dir name hash according to usage type
Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
and telldir(). However, this causes problems if there are 32-bit hash
collisions, since the NFSv2 server can get stuck resending the same
entries from the directory repeatedly.
Allow ext4 to return a full 64-bit hash (both major and minor) for
telldir to decrease the chance of hash collisions. This still needs
integration on the NFS side.
Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
(blame me if something is not correct)
Signed-off-by: Fan Yong <yong.fan@whamcloud.com> Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Sachin Prabhu [Fri, 16 Mar 2012 19:25:52 +0000 (19:25 +0000)]
Try using machine credentials for RENEW calls
Using user credentials for RENEW calls will fail when the user
credentials have expired.
To avoid this, try using the machine credentials when making RENEW
calls. If no machine credentials have been set, fall back to using user
credentials as before.
Trond Myklebust [Fri, 16 Mar 2012 17:52:45 +0000 (13:52 -0400)]
NFSv4.1: Fix a few issues in filelayout_commit_pagelist
- Fix a race in which NFS_I(inode)->commits_outstanding could potentially
go to zero (triggering a call to nfs_commit_clear_lock()) before we're
done sending out all the commit RPC calls.
- If nfs_commitdata_alloc fails, there is no reason why we shouldn't
try to send off all the commits-to-ds.
- Simplify the error handling.
- Change pnfs_commit_list() to always return either
PNFS_ATTEMPTED or PNFS_NOT_ATTEMPTED.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
Trond Myklebust [Thu, 15 Mar 2012 21:16:40 +0000 (17:16 -0400)]
NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code
Move more pnfs-isms out of the generic commit code.
Bugfixes:
- filelayout_scan_commit_lists doesn't need to get/put the lseg.
In fact since it is run under the inode->i_lock, the lseg_put()
can deadlock.
- Ensure that we distinguish between what needs to be done for
commit-to-data server and what needs to be done for commit-to-MDS
using the new flag PG_COMMIT_TO_DS. Otherwise we may end up calling
put_lseg() on a bucket for a struct nfs_page that got written
through the MDS.
- Fix a case where we were using list_del() on an nfs_page->wb_list
instead of list_del_init().
- filelayout_initiate_commit needs to call filelayout_commit_release
on error instead of the mds_ops->rpc_release(). Otherwise it won't
clear the commit lock.
Cleanups:
- Let the files layout manage the commit lists for the pNFS case.
Don't expose stuff like pnfs_choose_commit_list, and the fact
that the commit buckets hold references to the layout segment
in common code.
- Cast out the put_lseg() calls for the struct nfs_read/write_data->lseg
into the pNFS layer from whence they came.
- Let the pNFS layer manage the NFS_INO_PNFS_COMMIT bit.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
Trond Myklebust [Thu, 15 Mar 2012 01:55:01 +0000 (21:55 -0400)]
NFS: Fix a compile error when !defined NFS_DEBUG
We should use the 'ifdebug' wrapper rather than trying to inline
tests of nfs_debug, so that the code compiles correctly when we
don't define NFS_DEBUG.
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
William Dauchy [Wed, 14 Mar 2012 11:32:04 +0000 (12:32 +0100)]
NFSv4: Rate limit the state manager for lock reclaim warning messages
Adding rate limit on `Lock reclaim failed` messages since it could fill
up system logs Signed-off-by: William Dauchy <wdauchy@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Boaz Harrosh [Wed, 14 Mar 2012 03:44:26 +0000 (20:44 -0700)]
pnfs-obj: Uglify objio_segment allocation for the sake of the principle :-(
At some past instance Linus Trovalds wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> commit a84a79e4d369a73c0130b5858199e949432da4c6 upstream.
>
> The size is always valid, but variable-length arrays generate worse code
> for no good reason (unless the function happens to be inlined and the
> compiler sees the length for the simple constant it is).
>
> Also, there seems to be some code generation problem on POWER, where
> Henrik Bakken reports that register r28 can get corrupted under some
> subtle circumstances (interrupt happening at the wrong time?). That all
> indicates some seriously broken compiler issues, but since variable
> length arrays are bad regardless, there's little point in trying to
> chase it down.
>
> "Just don't do that, then".
Since then any use of "variable length arrays" has become blasphemous.
Even in perfectly good, beautiful, perfectly safe code like the one
below where the variable length arrays are only used as a sizeof()
parameter, for type-safe dynamic structure allocations. GCC is not
executing any stack allocation code.
I have produced a small file which defines two functions main1(unsigned numdevs)
and main2(unsigned numdevs). main1 uses code as before with call to malloc
and main2 uses code as of after this patch. I compiled it as:
gcc -O2 -S see_asm.c
and here is what I get:
Dan Carpenter [Tue, 13 Mar 2012 17:18:48 +0000 (20:18 +0300)]
NFS: null dereference in dev_remove()
In commit 5ffaf85541 "NFS: replace global bl_wq with per-net one" we
made "msg" a pointer instead of a struct stored in stack memory. But we
forgot to change the memset() here so we're still clearing stack memory
instead clearing the struct like we intended. It will lead to a kernel
crash.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 12 Mar 2012 17:29:05 +0000 (13:29 -0400)]
SUNRPC: Don't use variable length automatic arrays in kernel code
Replace the variable length array in the RPCSEC_GSS crypto code with
a fixed length one. The size should be bounded by the variable
GSS_KRB5_MAX_BLOCKSIZE, so use that.
Bryan Schumaker [Mon, 12 Mar 2012 15:33:00 +0000 (11:33 -0400)]
NFS: Check return value from rpc_queue_upcall()
This function could fail to queue the upcall if rpc.idmapd is not running,
causing a warning message to be printed. Instead, I want to check the
return value and revoke the key if the upcall can't be run.
Trond Myklebust [Sun, 11 Mar 2012 19:22:54 +0000 (15:22 -0400)]
SUNRPC: Fix a few sparse warnings
net/sunrpc/svcsock.c:412:22: warning: incorrect type in assignment
(different address spaces)
- svc_partial_recvfrom now takes a struct kvec, so the variable
save_iovbase needs to be an ordinary (void *)
Make a bunch of variables in net/sunrpc/xprtsock.c static
Fix a couple of "warning: symbol 'foo' was not declared. Should it be
static?" reports.
Fix a couple of conflicting function declarations.
Trond Myklebust [Sun, 11 Mar 2012 17:11:00 +0000 (13:11 -0400)]
NFS: Fix a number of sparse warnings
Fix a number of "warning: symbol 'foo' was not declared. Should it be
static?" conditions.
Fix 2 cases of "warning: Using plain integer as NULL pointer"
fs/nfs/delegation.c:263:31: warning: restricted fmode_t degrades to integer
- We want to allow upgrades to a WRITE delegation, but should otherwise
consider servers that hand out duplicate delegations to be borken.
This queue is used for sleeping in kernel and it have to be per-net since we
don't want to wake any other waiters except in out network nemespace.
BTW, move wq to per-net data is easy. But some way to handle upcall timeouts
have to be provided. On message destroy in case of timeout, tasks, waiting for
message to be delivered, should be awakened. Thus, some data required to
located the right wait queue. Chosen solution replaces rpc_pipe_msg object with
new introduced bl_pipe_msg object, containing rpc_pipe_msg and proper wq.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 10 Mar 2012 16:23:15 +0000 (11:23 -0500)]
NFSv4.0: Re-establish the callback channel on NFS4ERR_CB_PATHDOWN
When the NFSv4.0 server tells us that it can no-longer talk to us
on the callback channel, we should attempt a new SETCLIENTID in
order to re-transmit the callback channel information.
Note that as long as we do not change the boot verifier, this is
a safe procedure; the server is required to keep our state.
Also move the function nfs_handle_cb_pathdown to fs/nfs/nfs4state.c,
and change the name in order to mark it as being specific to NFSv4.0.
J. Bruce Fields [Fri, 9 Mar 2012 22:02:28 +0000 (17:02 -0500)]
nfsd4: make sure set CB_PATH_DOWN sequence flag set
Make sure this is set whenever there is no callback channel.
If a client does not set up a callback channel at all, then it will get
this flag set from the very start. That's OK, it can just ignore the
flag if it doesn't care. If a client does care, I think it's better to
inform it of the problem as early as possible.
Reported-by: Rick Macklem <rmacklem@uoguelph.ca> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Trond Myklebust [Wed, 7 Mar 2012 21:39:06 +0000 (16:39 -0500)]
NFSv4: Return the delegation if the server returns NFS4ERR_OPENMODE
If a setattr() fails because of an NFS4ERR_OPENMODE error, it is
probably due to us holding a read delegation. Ensure that the
recovery routines return that delegation in this case.
J. Bruce Fields [Fri, 27 Jan 2012 21:26:02 +0000 (16:26 -0500)]
nfsd4: delay setting current filehandle till success
Compound processing stops on error, so the current filehandle won't be
used on error. Thus the order here doesn't really matter. It'll be
more convenient to do it later, though.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Benny Halevy [Fri, 24 Feb 2012 01:40:52 +0000 (17:40 -0800)]
nfsd41: free_session/free_client must be called under the client_lock
The session client is manipulated under the client_lock hence
both free_session and nfsd4_del_conns must be called under this lock.
This patch adds a BUG_ON that checks this condition in the
respective functions and implements the missing locks.
nfsd4_{get,put}_session helpers were moved to the C file that uses them
so to prevent use from external files and an unlocked version of
nfsd4_put_session is provided for external use from nfs4xdr.c
Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Benny Halevy [Tue, 21 Feb 2012 22:16:44 +0000 (14:16 -0800)]
nfsd41: refactor nfs4_open_deleg_none_ext logic out of nfs4_open_delegation
When a 4.1 client asks for a delegation and the server returns none
op_delegate_type is set to NFS4_OPEN_DELEGATE_NONE_EXT
and op_why_no_deleg is set to either WND4_CONTENTION or WND4_RESOURCE.
Or, if the client sent a NFS4_SHARE_WANT_CANCEL (which it is not supposed
to ever do until our server supports delegations signaling),
op_why_no_deleg is set to WND4_CANCELLED.
Note that for WND4_CONTENTION and WND4_RESOURCE, the xdr layer is hard coded
at this time to encode boolean FALSE for ond_server_will_push_deleg /
ond_server_will_signal_avail.
Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Dan Carpenter [Tue, 21 Feb 2012 07:28:04 +0000 (10:28 +0300)]
svcrdma: silence a Sparse warning
Sparse complains that the definition function definition and the
implementation aren't anotated the same way.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Jeff Layton [Mon, 5 Mar 2012 16:42:36 +0000 (11:42 -0500)]
nfsd4: fix recovery-dir leak on nfsd startup failure
The current code never calls nfsd4_shutdown_recdir if nfs4_state_start
returns an error. Also, it's better to go ahead and consolidate these
functions since one is just a trivial wrapper around the other.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
J. Bruce Fields [Tue, 6 Mar 2012 19:43:36 +0000 (14:43 -0500)]
nfsd4: purge stable client records with insufficient state
To escape having your stable storage record purged at the end of the
grace period, it's not sufficient to simply have performed a
setclientid_confirm; you also need to meet the same requirements as
someone creating a new record: either you should have done an open or
open reclaim (in the 4.0 case) or a reclaim_complete (in the 4.1 case).
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
J. Bruce Fields [Tue, 6 Mar 2012 19:35:16 +0000 (14:35 -0500)]
nfsd4: don't set cl_firststate on first reclaim in 4.1 case
We set cl_firststate when we first decide that a client will be
permitted to reclaim state on next boot. This happens:
- for new 4.0 clients, when they confirm their first open
- for returning 4.0 clients, when they reclaim their first open
- for 4.1+ clients, when they perform reclaim_complete
We also use cl_firststate to decide whether a reclaim_complete has
already been performed, in the 4.1+ case.
We were setting it on 4.1 open reclaims, which caused spurious
COMPLETE_ALREADY errors on RECLAIM_COMPLETE from an nfs4.1 client with
anything to reclaim.
Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Chuck Lever [Fri, 2 Mar 2012 22:14:31 +0000 (17:14 -0500)]
NFS: Fix nfs4_verifier memory alignment
Clean up due to code review.
The nfs4_verifier's data field is not guaranteed to be u32-aligned.
Casting an array of chars to a u32 * is considered generally
hazardous.
Fix this by using a __be32 array to generate a verifier's contents,
and then byte-copy the contents into the verifier field. The contents
of a verifier, for all intents and purposes, are opaque bytes. Only
local code that generates a verifier need know the actual content and
format. Everyone else compares the full byte array for exact
equality.
Also, sizeof(nfs4_verifer) is the size of the in-core verifier data
structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd
verifier. The two are not interchangeable, even if they happen to
have the same value.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:57 +0000 (18:13 -0500)]
NFSv4: Simplify the struct nfs4_stateid
Replace the union with the common struct stateid4 as defined in both
RFC3530 and RFC5661. This makes it easier to access the sequence id,
which will again make implementing support for parallel OPEN calls
easier.
Trond Myklebust [Sun, 4 Mar 2012 23:13:56 +0000 (18:13 -0500)]
NFSv4: Further clean-ups of delegation stateid validation
Change the name to reflect what we're really doing: testing two
stateids for whether or not they match according the the rules in
RFC3530 and RFC5661.
Move the code from callback_proc.c to nfs4proc.c
Trond Myklebust [Sun, 4 Mar 2012 23:13:56 +0000 (18:13 -0500)]
NFSv4.1: Fix matching of the stateids when returning a delegation
nfs41_validate_delegation_stateid is broken if we supply a stateid with
a non-zero sequence id. Instead of trying to match the sequence id,
the function assumes that we always want to error. While this is
true for a delegation callback, it is not true in general.
Also fix a typo in nfs4_callback_recall.
Reported-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 6 Mar 2012 00:56:44 +0000 (19:56 -0500)]
NFS: Properly handle the case where the delegation is revoked
If we know that the delegation stateid is bad or revoked, we need to
remove that delegation as soon as possible, and then mark all the
stateids that relied on that delegation for recovery. We cannot use
the delegation as part of the recovery process.
Also note that NFSv4.1 uses a different error code (NFS4ERR_DELEG_REVOKED)
to indicate that the delegation was revoked.
Finally, ensure that setlk() and setattr() can both recover safely from
a revoked delegation.
Chuck Lever [Thu, 1 Mar 2012 22:02:05 +0000 (17:02 -0500)]
NFS: Request fh_expire_type attribute in "server caps" operation
The fh_expire_type file attribute is a filesystem wide attribute that
consists of flags that indicate what characteristics file handles
on this FSID have.
Our client doesn't support volatile file handles. It should find
out early (say, at mount time) whether the server is going to play
shenanighans with file handles during a migration.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:01:57 +0000 (17:01 -0500)]
NFS: Introduce NFS_ATTR_FATTR_V4_LOCATIONS
The Linux NFS client must distinguish between referral events (which
it currently supports) and migration events (which it does not yet
support).
In both types of events, an fs_locations array is returned. But upper
layers, not the XDR layer, should make the distinction between a
referral and a migration. There really isn't a way for an XDR decoder
function to distinguish the two, in general.
Slightly adjust the FATTR flags returned by decode_fs_locations()
to set NFS_ATTR_FATTR_V4_LOCATIONS only if a non-empty locations
array was returned from the server. Then have logic in nfs4proc.c
distinguish whether the locations array is for a referral or
something else.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:01:31 +0000 (17:01 -0500)]
NFS: Add a client-side function to display NFS file handles
For debugging, introduce a simplistic function to print NFS file
handles on the system console. The main function is hooked into the
dprintk debugging facility, but you can directly call the helper,
_nfs_display_fhandle(), if you want to print a handle unconditionally.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:01:23 +0000 (17:01 -0500)]
NFS: Make clientaddr= optional
For NFSv4 mounts, the clientaddr= mount option has always been
required. Now we have rpc_localaddr() in the kernel, which was
modeled after the same logic in the mount.nfs command that constructs
the clientaddr= mount option. If user space doesn't provide a
clientaddr= mount option, the kernel can now construct its own.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:01:14 +0000 (17:01 -0500)]
SUNRPC: Add API to acquire source address
NFSv4.0 clients must send endpoint information for their callback
service to NFSv4.0 servers during their first contact with a server.
Traditionally on Linux, user space provides the callback endpoint IP
address via the "clientaddr=" mount option.
During an NFSv4 migration event, it is possible that an FSID may be
migrated to a destination server that is accessible via a different
source IP address than the source server was. The client must update
callback endpoint information on the destination server so that it can
maintain leases and allow delegation.
Without a new "clientaddr=" option from user space, however, the
kernel itself must construct an appropriate IP address for the
callback update. Provide an API in the RPC client for upper layer
RPC consumers to acquire a source address for a remote.
The mechanism used by the mount.nfs command is copied: set up a
connected UDP socket to the designated remote, then scrape the source
address off the socket. We are careful to select the correct network
namespace when setting up the temporary UDP socket.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>