Re: NFS exports of ZFS snapshots broken

From: Garrett Wollman <wollman_at_bimajority.org>
Date: Fri, 17 Nov 2023 22:35:04 UTC
<<On Fri, 17 Nov 2023 15:57:42 -0600, Mike Karels <mike@karels.net> said:

> I have not run into this, so I tried it just now.  I had no problem.
> The server is 13.2, fully patched, the client is up-to-date -current,
> and the mount is v4.

On my 13.2 client and 13-stable server, I see:

 25034 ls       CALL  open(0x237d32f9a000,0x120004<O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC>)
 25034 ls       NAMI  "/mnt/tools/.zfs/snapshot/weekly-2023-45"
 25034 ls       RET   open 4
 25034 ls       CALL  fcntl(0x4,F_ISUNIONSTACK,0x0)
 25034 ls       RET   fcntl 0
 25034 ls       CALL  getdirentries(0x4,0x237d32faa000,0x1000,0x237d32fa7028)
 25034 ls       RET   getdirentries -1 errno 5 Input/output error
 25034 ls       CALL  close(0x4)
 25034 ls       RET   close 0
 25034 ls       CALL  exit(0)

There is certainly a libc bug here, in that getdirentries(2) returning
[EIO] results in ls(1) exiting with EXIT_SUCCESS, but the [EIO] error
itself is consistent across both FreeBSD and Linux clients.
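
For illustration only, here is a minimal Python sketch of the behavior
one would expect from a directory lister: a failed directory read
surfaces as a non-zero status instead of being silently dropped.  This
is not how ls(1) is implemented, and list_directory is a hypothetical
name; it relies only on os.scandir() raising OSError when the directory
cannot be opened or read.

```python
import errno
import os
import sys


def list_directory(path, out=sys.stdout, err=sys.stderr):
    """Print the entries of `path`; return 0 on success, 1 on a read error."""
    try:
        # os.scandir() raises OSError if the directory cannot be opened,
        # and iterating over it raises OSError if a later read fails.
        with os.scandir(path) as entries:
            names = sorted(entry.name for entry in entries)
    except OSError as exc:
        # Report the symbolic errno name (e.g. EIO) alongside the message.
        code = errno.errorcode.get(exc.errno, "unknown")
        print(f"{path}: {exc.strerror} [{code}]", file=err)
        return 1
    for name in names:
        print(name, file=out)
    return 0
```

A caller would hand the return value to sys.exit(), so that a failure
like the [EIO] above is visible to scripts checking the exit status.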

Looking at this from the RPC side:

	(PUTFH, GETATTR, LOOKUP(snapshotname), GETFH, GETATTR)
		[NFS4_OK for all ops]
	(PUTFH, GETATTR)
		[NFS4_OK, NFS4_OK]
	(PUTFH, ACCESS(0x3f), GETATTR)
		[NFS4_OK, NFS4_OK, rights = 0x03, NFS4_OK]
	(PUTFH, GETATTR, LOOKUPP, GETFH, GETATTR)
		[NFS4_OK, NFS4_OK, NFS4ERR_NOFILEHANDLE]

and at this point the [EIO] is returned.
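
The last reply carries three statuses for five operations because an
NFSv4 server stops evaluating a COMPOUND at the first operation that
fails.  A toy Python model of that rule follows; it is not server code,
run_compound and the status table are hypothetical names, and the
statuses are simply hard-coded to mirror the trace above.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_NOFILEHANDLE = "NFS4ERR_NOFILEHANDLE"


def run_compound(ops, status_of):
    """Return per-op statuses, stopping after the first non-OK result."""
    results = []
    for op in ops:
        # Any op not listed in status_of is assumed to succeed.
        status = status_of.get(op, NFS4_OK)
        results.append(status)
        if status != NFS4_OK:
            break  # the remaining ops are never evaluated
    return results


# Hard-coded to mirror the observed trace: only LOOKUPP fails.
observed = {"LOOKUPP": NFS4ERR_NOFILEHANDLE}
reply = run_compound(["PUTFH", "GETATTR", "LOOKUPP", "GETFH", "GETATTR"], observed)
print(reply)  # ['NFS4_OK', 'NFS4_OK', 'NFS4ERR_NOFILEHANDLE']
```

So the trailing GETFH and GETATTR never run, and the client is left
with nothing but the LOOKUPP failure for that request.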

It seems that clients always do a LOOKUPP before calling READDIR, and
this is failing when the subject file handle is the snapshot.  The
client is perfectly able to *traverse into* the snapshot: if I try to
list a subdirectory I know exists in the snapshot, the client is able to
LOOKUP(dirname) just fine, but LOOKUPP still fails with
NFS4ERR_NOFILEHANDLE *on the subdirectory*.

-GAWollman