svn commit: r192256 - head/sys/fs/nfsserver

Wed May 20 22:12:29 UTC 2009

On Wed, 20 May 2009, Doug Rabson wrote:

[good stuff snipped]
>> Why do they need to be distinguished? The nfsv4 state subsystem handles
>> all conflicts between them, so they are just "nfsv4 locks".
>
> Why? It seems a shame to re-implement all the record locking logic of the 
> local lock manager in NFS.
>
The record locking would have been easy, but the nfsv4 RFC has a wonderful
couple of sentences under the description of Close that basically say
that, upon close, related byte range locks must be released or an error
returned. Those two sentences imply that there is a relationship between
Opens and Locks that the server must maintain, and it gets complicated.

For example, a client can:
(i) Lock foo with OpenStateID-0 and Lockowner-A
(ii) Lock foo with OpenStateID-1 and Lockowner-A
- now, the above Locks are done by the same Lockowner-A, so they don't
   conflict, even if they overlap.
BUT
- the server cannot simply lump them together, because when OpenStateID-0
   gets Closed, the (i) lock(s) must be released, but the (ii) lock(s)
   remain.
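
The example above can be sketched as a toy lock table. This is only an illustration, not the server's actual data structures: the names ByteRangeLock and FileLockState are made up, a lockowner is reduced to a plain string, and Close is assumed to take the "release the locks" option rather than returning an error.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ByteRangeLock:
    open_stateid: str   # the Open this lock was acquired under
    lockowner: str      # ClientID + opaque owner name, simplified to a string
    start: int          # first byte covered
    end: int            # last byte covered (inclusive)
    exclusive: bool = True


@dataclass
class FileLockState:
    """Hypothetical per-file lock table, for illustration only."""
    locks: list = field(default_factory=list)

    def conflicts(self, new):
        # Locks conflict only across different lockowners, on overlapping
        # ranges, when at least one of the two is exclusive.
        return any(
            held.lockowner != new.lockowner
            and held.start <= new.end
            and new.start <= held.end
            and (held.exclusive or new.exclusive)
            for held in self.locks
        )

    def lock(self, new):
        if self.conflicts(new):
            return False
        self.locks.append(new)
        return True

    def close(self, open_stateid):
        # Close releases only the locks acquired under this Open, which is
        # why each lock has to remember its Open as well as its lockowner.
        released = [l for l in self.locks if l.open_stateid == open_stateid]
        self.locks = [l for l in self.locks if l.open_stateid != open_stateid]
        return released


foo = FileLockState()
ok_first = foo.lock(ByteRangeLock("OpenStateID-0", "Lockowner-A", 0, 99))
ok_second = foo.lock(ByteRangeLock("OpenStateID-1", "Lockowner-A", 50, 149))
released = foo.close("OpenStateID-0")
```

Both Lock calls succeed because the lockowner is the same, yet after the Close only the lock taken under OpenStateID-1 is left. A table keyed on lockowner alone could not make that distinction.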

Then, there are locks that a client issues locally against a delegation,
but the above open/lock relationship must be maintained for them too,
because a delegation can be recalled at any time and they must be
correctly acquired against the server at that point...

Basically, the nfsv4 server has no choice but to keep track of all this
stuff and the locks/lock_owners end up all over it. Using the lock manager
to keep track of the one small piece that it can handle really wouldn't
make much difference. (Beyond that, the code was written to be portable to
the various BSDen over several years, so I avoided making assumptions about
what the system might provide that I could use.)

Maybe it would be beneficial to extract the state handling stuff and let
a CIFS/SMB server use it as well, but I know diddly about CIFS/SMB and
know it would be a bunch of work.

Does this clarify it? rick

>> 
>> An nfsv4 lockowner is a ClientID + up to 1024 bytes of opaque name and it
>> might not persist in the server beyond the point where no locks are
>> held and the associated OpenOwner no longer has any Opens. After this,
>> the same lockowner could be "re-incarnated" (i.e. create a new state
>> data structure in the server with the same ClientID + up to 1024 bytes)
>> when the client chooses to do more locking on it. If a pid is generated
>> sequentially, this second re-incarnation would end up with a different
>> pid although it is the same lockowner. (To ensure this doesn't happen,
>> the server would have to hold onto the lockowner state structure "forever"
>> and that obviously isn't practical.) Or a pid could be a 32-bit checksum
>> on the ClientID + up to 1024 bytes instead of sequential assignment. In 
>> that case the re-incarnation would get the same pid, but it wouldn't be
>> guaranteed to be unique across all different lockowners.
>> 
>> As such, the most an assigned pid could be is a "hint" that the lockowner
>> is different/same. Is there some benefit to this over "held by nfsv4",
>> which is what using one <l_sysid, l_pid> tuple gives you?
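
The two pid schemes discussed in the quoted text can be compared with a small sketch. Everything here is hypothetical: SequentialPids and checksum_pid are made-up names, CRC-32 is just one arbitrary choice of checksum, and the 8-bit width is used only so a collision shows up quickly.

```python
import itertools
import zlib


class SequentialPids:
    """Hands out the next integer each time lockowner state is created."""

    def __init__(self):
        self._next = itertools.count(1)
        self._live = {}

    def create(self, client_id, owner):
        pid = next(self._next)
        self._live[(client_id, owner)] = pid
        return pid

    def discard(self, client_id, owner):
        # The server drops the state once no locks or Opens remain.
        del self._live[(client_id, owner)]


def checksum_pid(client_id, owner, bits=32):
    """Derives the pid from the lockowner's bytes (CRC-32 is an assumption)."""
    return zlib.crc32(client_id + owner) & ((1 << bits) - 1)


owner = (b"client-1", b"lockowner-A")

# Sequential assignment: the re-incarnated lockowner gets a new pid.
seq = SequentialPids()
first = seq.create(*owner)
seq.discard(*owner)
second = seq.create(*owner)

# Checksum assignment: the pid survives re-incarnation...
stable = checksum_pid(*owner) == checksum_pid(*owner)

# ...but it is not unique. With only 8 bits, 257 distinct names must
# contain a collision (pigeonhole); 32 bits only pushes the problem out.
seen = {}
collision = None
for i in range(257):
    name = b"lockowner-%d" % i
    pid = checksum_pid(b"client-1", name, bits=8)
    if pid in seen:
        collision = (seen[pid], name)
        break
    seen[pid] = name
```

Either way, the pid can at best hint at whether two lockowners are the same or different, which is the point made in the quoted text.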