NFS on NFS?

Tue Jul 17 20:19:57 UTC 2007

Rick Macklem wrote:
> 
> 
> On Tue, 17 Jul 2007, Eric Anderson wrote:
> 
>> Rick Macklem wrote:
>>
>> Is that really true?  It looked like the NFS handle was created by 
>> various file system goo, which could come up again some time in the 
>> future.  For instance, fill a file system's inode table, rm all the 
>> files, do it again (with different data in the files).  Wouldn't the 
>> NFS handle look the same to the client then, but be a different file?  
>> Or when we say 'file' do we mean 'inode' on a file system?
>>
> The file handle also has di_gen (the generation #) in it, which is there
> specifically to prevent the file handle from accidentally referring to a
> new file with the same i-node #. The server is expected to return ESTALE
> when a client tries to use a file handle after the file is deleted and
> this error is returned when the generation# in the file handle is not the
> same as di_gen in the i-node. (di_gen is incremented each time the i-node
> is re-used.) File systems that do not have the equivalent of di_gen cannot
> be exported via NFS correctly (but some people/systems do so anyhow). Ok
> if the file system is read-only.

I see.  That clears it up a bit.
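
To check my understanding, here is a rough sketch of the generation check as you describe it. It is a toy model only, not how any real server is written: ToyServer, Inode, FileHandle and their fields are names I made up for illustration, and gen just stands in for di_gen (incremented on each re-use, per your description).

```python
import errno
from dataclasses import dataclass


@dataclass
class Inode:
    number: int
    gen: int = 0          # stands in for di_gen
    in_use: bool = False


@dataclass(frozen=True)
class FileHandle:
    fsid: int
    ino: int
    gen: int              # generation # copied into the handle at creation


class ToyServer:
    """Hypothetical in-memory 'file system' with a fixed i-node table."""

    def __init__(self, fsid, n_inodes):
        self.fsid = fsid
        self.inodes = [Inode(i) for i in range(n_inodes)]

    def create(self):
        # Take the first free i-node and bump its generation on re-use.
        for ino in self.inodes:
            if not ino.in_use:
                ino.in_use = True
                ino.gen += 1
                return FileHandle(self.fsid, ino.number, ino.gen)
        raise OSError(errno.ENOSPC, "no free i-nodes")

    def remove(self, fh):
        self.lookup(fh).in_use = False

    def lookup(self, fh):
        # A handle is stale if it names another file system, a free i-node,
        # or an i-node whose generation no longer matches the handle.
        if fh.fsid != self.fsid or not (0 <= fh.ino < len(self.inodes)):
            raise OSError(errno.ESTALE, "stale file handle")
        ino = self.inodes[fh.ino]
        if not ino.in_use or ino.gen != fh.gen:
            raise OSError(errno.ESTALE, "stale file handle")
        return ino
```

So in my fill/rm/fill example, the second file gets the same i-node number but a different generation, and the old handle fails with ESTALE instead of silently pointing at the new file.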

>> Also, by 'T stable', does 'T' mean 'time' here?
> Yep. Capital T for a looonnngggg time.
> 
>> I'm not certain I completely understand why the clients would get 
>> confused. Wouldn't it look something like this:
>>
>> [File system->NFS server->NFS handle]
>>               |
>>               V
>> [NFS client->virtual file system->NFS server->NFS handle2]
>>               |
>>               V
>> [NFS Client->virtual file system->application]
>>
> So long as the intermediate server obeys all the rules, it can work:
> - File Handle is T-stable (recognized as ESTALE after the file is deleted)
>   and still works the same after server reboots, etc.
> - fsid in getattr remains the same throughout the file system, even after
>   server reboots, etc.
> - handles RPCs in an atomic way, so that they are either done or not
>   (can't leave things half created after a crash)
>   - NFSv2 and v3 clients don't expect servers to maintain any state
>     and don't know the server rebooted. They simply retry the RPC until
>     they get success or failure back from the server.
> 
> Where these schemes usually break down is when the intermediate server
> reboots and no longer does the same file handle translations or assigns
> a new, different fsid to the file system or crosses a mount point
> boundary and changes the fsid or ???

I see the point.

> Like I said, seems like a simple proxy that passes along the RPCs is
> easier to do. For NFSv3 (not v2) the intermediary can grow the size of
> the file handle (to a maximum of 64 bytes) so, if the real server creates
> file handles less than 64 bytes in size, it can add/remove stuff, but...

Ok, I understand, and see the utility..

> - it then becomes useful for only certain servers

Why?  Because some servers implement large NFS handles?  I've only ever 
seen 32 bytes, but..

> - it has to do lots of copying of args, since the size changes

You mean because you have to map the server's info to your new handle? 
or am I missing something?
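
Here is how I picture the handle growing, as a sketch only. It assumes the proxy simply prepends a small tag of its own to the real server's handle; the 4-byte tag, the export_id idea and the function names are all my invention, and only the 64-byte NFSv3 limit comes from your mail.

```python
NFS3_FHSIZE = 64   # maximum NFSv3 file handle size, in bytes
TAG_LEN = 4        # hypothetical: bytes the proxy adds in front of the handle


def wrap_handle(server_fh, export_id):
    """Build the handle the proxy gives its clients: tag + real handle."""
    if len(server_fh) + TAG_LEN > NFS3_FHSIZE:
        # The real server already uses too much of the 64 bytes, so this
        # trick only works with servers that hand out shorter handles.
        raise ValueError("server handle too large to wrap")
    return export_id.to_bytes(TAG_LEN, "big") + server_fh


def unwrap_handle(proxy_fh):
    """Recover (export_id, real server handle) from a client's handle."""
    if not (TAG_LEN < len(proxy_fh) <= NFS3_FHSIZE):
        raise ValueError("malformed proxy handle")
    export_id = int.from_bytes(proxy_fh[:TAG_LEN], "big")
    return export_id, proxy_fh[TAG_LEN:]
```

If that is about right, the mapping needs no table on the proxy, so it would survive a proxy reboot, but every RPC carrying a handle has to be rebuilt because the handle length changes. I guess that is the copying you mean.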

Thanks for the info.. (is there a good doc on this, besides an RFC?)

Eric