Re: support for pNFS with Linux as Data Servers
- Reply: Rick Macklem : "Re: support for pNFS with Linux as Data Servers"
- In reply to: Rick Macklem : "Re: support for pNFS with Linux as Data Servers"
Date: Tue, 13 May 2025 11:22:48 UTC
Thanks for the reply!

> I think the hard part may be implementation of the "fencing" that is
> required for the loosely coupled model. (For what is there now,
> permission handling is done by the DS(s), since the files on the DS(s)
> have exactly the same owner/group/mode/ACL as the MDS.)

That makes sense to me! Also I'm not sure what's required but handling
restarts and recovery looks potentially hairy.

> It has been a while since I read the RFC, so I cannot recall how the
> loosely coupled model supports permissions (fencing off files that the
> client does not have permission to access).

If I understand RFC 8435 correctly, the MDS uses the synthetic uid and gid
to allow access, and unilaterally prevents access by changing the owner uid
and gid on the DSs.

> Although it does not explicitly say so in the RFC, you want to use NFSv3
> RPCs to talk to the DS(s) from the MDS for the loosely coupled variant.
> (That avoids any stateid hassles. For NFSv4 DSs, the MDS would have to
> do Opens and keep open_stateids for the DS files.)

It makes sense to me that the MDS should tell clients to use NFSv3 to talk
to the DSs, e.g. to avoid stateid hassles. And ideally the MDS would talk to
the DSs using NFSv3 too, at least for the simplicity of the DSs only talking
NFSv3. But there seems to be plenty of existing code where the MDS is using
NFSv4 to proxy operations to the DS, and making that work with only NFSv3
instead looks non-trivial? So I wonder how bad it would be to leave that
existing code as NFSv4.

> > The other issue is clients will use the synthetic
> > uid/gid given by the MDS (currently 999/999), and this results in
> > access errors when the clients talk to the DSs.
> The NFSv3 Create RPC that creates the DS file would set it owned
> by the uid/gid and mode 0600, I think?

I didn't understand you here, but if you don't mind, let's gloss over it for
now to talk about fencing first...

> As I noted, I think "fencing" is where most of the work is.
> If I recall it correctly, it goes something like this:
> - Client does a Setattr of owner/group/mode/ACL on the MDS.
> --> Server must recall all layouts for the file via CB_RECALLLAYOUT
> callbacks and reply NFS4ERR_DELAY to the Setattr.
> --> Sometime later, the client retries the Setattr. If all layouts have been
> returned, it is done. If not, the server must either return NFS4ERR_DELAY
> again or change the mode on the file on the DS(s) so that clients cannot
> access it. I think the MDS must wait at least one lease duration (2min)
> after issuing the CB_RECALLLAYOUTs before doing this.

I see a few places where the RFCs mention that fencing should or must
happen: when a file's permissions or ACLs are changed, a client lease
expires, there's an admin revoke of a client, a client doesn't respond to
CB_LAYOUTRECALL, and probably I missed some others.

It makes sense to me that fencing should (and maybe must) happen in all of
these situations... except for the permissions changing case. That one
doesn't make sense to me yet, and it'd be great if you could enlighten me.

If the scenario is like:

1) Some user creates a file with mode 644, writes, closes it
2) Someone changes the permissions to 444 or changes the owner
3) The MDS doesn't fence the client even though it's supposed to
4) The user tries to open the file for writing.

The MDS will return an access error for this open, even without fencing.
So no problem here(?)

Or if instead the scenario is like:

1) Some user creates a file with mode 644, writes, does NOT close
2) Someone changes the permissions to 444 or changes the owner
3) The MDS doesn't fence the client even though it's supposed to
4) The user tries to write more to the file.

The additional writes are successful. This doesn't sound like a problem to
me either, as it's normal for writes to be successful in this case(?)

There must be some scenario where there's a bad effect unless the MDS
fences, but I don't know what it is.
Or maybe I'm thinking about it all wrong. Thanks!
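
P.S. To make sure I'm reading the Setattr flow quoted above the same way you are, here is a rough sketch of it as a tiny state machine. This is only my understanding, not the FreeBSD code: every name in it (FileState, handle_setattr, layout_return, LEASE_SECONDS) is made up for illustration, the 120 second lease is taken from the "2min" above, and "fencing" is reduced to setting a flag where a real MDS would change the mode or owner uid/gid of the file on the DS(s).

```python
from dataclasses import dataclass, field
from typing import Optional, Set

NFS4_OK = "NFS4_OK"
NFS4ERR_DELAY = "NFS4ERR_DELAY"
LEASE_SECONDS = 120.0  # assumption: the "2min" lease duration mentioned above


@dataclass
class FileState:
    """Hypothetical per-file state kept by the MDS."""
    mode: int
    layout_holders: Set[str] = field(default_factory=set)  # clients with layouts
    recall_sent_at: Optional[float] = None  # time CB_LAYOUTRECALL was issued
    fenced: bool = False  # True once DS access has been cut off


def layout_return(f: FileState, client: str) -> None:
    """A client returned its layout (LAYOUTRETURN)."""
    f.layout_holders.discard(client)


def handle_setattr(f: FileState, new_mode: int, now: float) -> str:
    """One Setattr attempt (or retry) reaching the MDS at time `now`."""
    if f.layout_holders:
        if f.recall_sent_at is None:
            # First attempt: recall all layouts, tell the client to retry.
            f.recall_sent_at = now
            return NFS4ERR_DELAY
        if now - f.recall_sent_at < LEASE_SECONDS:
            # Layouts still outstanding and the lease period has not passed.
            return NFS4ERR_DELAY
        # Lease period elapsed: fence the DS file so that the remaining
        # layout holders can no longer do I/O, then treat layouts as gone.
        f.fenced = True
        f.layout_holders.clear()
    # No layouts outstanding (returned or fenced): apply the change.
    f.mode = new_mode
    f.recall_sent_at = None
    return NFS4_OK


if __name__ == "__main__":
    f = FileState(mode=0o644, layout_holders={"clientA"})
    print(handle_setattr(f, 0o444, now=0.0))    # recall issued, client must retry
    print(handle_setattr(f, 0o444, now=60.0))   # still waiting on the layout
    print(handle_setattr(f, 0o444, now=121.0))  # lease passed: fence, then apply
    print(f.fenced, oct(f.mode))
```

If clientA calls layout_return() before the lease period is up, the retry succeeds without any fencing, which is the case where I'd expect nothing bad to happen anyway.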