Re: zfs related panic

From: Bakul Shah <bakul_at_iitbombay.org>
Date: Fri, 15 Aug 2025 23:51:00 UTC

On Aug 15, 2025, at 3:51 PM, Konstantin Belousov <kostikbel@gmail.com> wrote:
> 
> On Fri, Aug 15, 2025 at 11:19:55AM -0700, Bakul Shah wrote:
>> Is this a known bug or maybe something specific to my machine?
>> If the latter, is there any way to "fsck" it? FYI, the zpool is a mirror
>> (two files on the host via nvme), built from commit hash c992ac621327
>> (which has other issues but they seem to be separate from this).
>> I saw the same panic when I booted from a day old snapshot.
>> 
>> Note that "ls /.zfs" panics but "ls /.zfs/snapshot" doesn't!
>> 
>> This is on a -current VM:
>> 
>> root@:/ # ls .zfs
>> VNASSERT failed: oresid == 0 || nresid != oresid || *(a)->a_eofflag == 1 not true at vnode_if.c:1824 (VOP_READDIR_APV)
> 
> Try this, untested.

Thanks for the quick patch! But I am afraid it didn't help. Let me know if you
want me to check things via gdb. [I have filed
	https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=288889
so we can continue debugging there]

On the console (single user, RO root):
# ls /.zfs
VNASSERT failed: oresid == 0 || nresid != oresid || *(a)->a_eofflag == 1 not true at vnode_if.c:1824 (VOP_READDIR_APV)
0xfffff800059546e0: type VDIR state VSTATE_CONSTRUCTED op 0xffffffff8272cfd0
    usecount 1, writecount 0, refcount 1 seqc users 0 mountedhere 0
    hold count flags ()
    flags ()
    lock type zfs: SHARED (count 1)
        name = .zfs
        parent_id = 0
        id = 1
panic: VOP_READDIR: eofflag not set
cpuid = 0
time = 1755276357
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0053f83af0
vpanic() at vpanic+0x136/frame 0xfffffe0053f83c20
panic() at panic+0x43/frame 0xfffffe0053f83c80
VOP_READDIR_APV() at VOP_READDIR_APV+0x205/frame 0xfffffe0053f83cd0
kern_getdirentries() at kern_getdirentries+0x228/frame 0xfffffe0053f83dd0
sys_getdirentries() at sys_getdirentries+0x29/frame 0xfffffe0053f83e00
amd64_syscall() at amd64_syscall+0x169/frame 0xfffffe0053f83f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0053f83f30
--- syscall (554, FreeBSD ELF64, getdirentries), rip = 0x331339f976aa, rsp = 0x33133631ade8, rbp = 0x33133631ae20 ---
KDB: enter: panic
[ thread pid 23 tid 100211 ]
Stopped at      kdb_enter+0x33: movq    $0,0x12313e2(%rip)
db>
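
For reference, here is a userland sketch of the condition named in the
VNASSERT message. It is only a model of the boolean expression as printed
above; readdir_postcondition_ok() is a hypothetical helper name, not a
kernel function, and the real check lives in the generated vnode_if.c.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Model of the expression printed by the VNASSERT:
 *   oresid == 0 || nresid != oresid || *(a)->a_eofflag == 1
 * oresid:  uio_resid before the VOP_READDIR call
 * nresid:  uio_resid after the call
 * eofflag: value the filesystem stored through a_eofflag
 * (hypothetical helper, for illustration only)
 */
static bool
readdir_postcondition_ok(size_t oresid, size_t nresid, int eofflag)
{
	/* Empty buffer, or some progress was made, or EOF was reported. */
	return (oresid == 0 || nresid != oresid || eofflag == 1);
}
```

With this model, (0, 0, 0), (4096, 4000, 0) and (4096, 4096, 1) all pass,
while (4096, 4096, 0) fails. So for the assertion to fire, the readdir must
have returned with a non-empty buffer, nothing copied out, and eofflag left
unset. The gdb trace below shows count=4096 for this call.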

Running gdb on the host (attached to tcp port):
#16 0xffffffff80b7992b in vpanic (
    fmt=0xffffffff812ddf30 "VOP_READDIR: eofflag not set",
    ap=ap@entry=0xfffffe0053f83c60)
    at /home/FreeBSD/current/sys/kern/kern_shutdown.c:962
#17 0xffffffff80b79793 in panic (
    fmt=0xffffffff81d9eab0 <cnputs_mtx> "\304\372\032\201\377\377\377\377")
    at /home/FreeBSD/current/sys/kern/kern_shutdown.c:887
#18 0xffffffff81195fd5 in VOP_READDIR_APV (vop=<optimized out>,
    a=a@entry=0xfffffe0053f83d30) at vnode_if.c:1824
#19 0xffffffff80c95e58 in VOP_READDIR (vp=0xfffff800059546e0,
    uio=0xfffffe0053f83d00, cred=<optimized out>, eofflag=0xfffffe0053f83d6c,
    ncookies=0x0, cookies=0x0) at ./vnode_if.h:972
#20 kern_getdirentries (td=0xfffff8007e7c3780, fd=<optimized out>,
    buf=0x4ea30d020000 "\001", count=4096,
    basep=basep@entry=0xfffffe0053f83df0, residp=residp@entry=0x0,
    bufseg=UIO_USERSPACE) at /home/FreeBSD/current/sys/kern/vfs_syscalls.c:4353
#21 0xffffffff80c96289 in sys_getdirentries (
    td=0xffffffff81d9eab0 <cnputs_mtx>, uap=0xfffff8007e7c3ba8)
    at /home/FreeBSD/current/sys/kern/vfs_syscalls.c:4287
#22 0xffffffff810ca8b9 in syscallenter (td=0xfffff8007e7c3780)
    at /home/FreeBSD/current/sys/amd64/amd64/../../kern/subr_syscall.c:193
#23 amd64_syscall (td=0xfffff8007e7c3780, traced=0)
    at /home/FreeBSD/current/sys/amd64/amd64/trap.c:1208


> 
> commit a97fc29bf2c03bbfc57b9c188ab3b24450d453bc
> Author: Konstantin Belousov <kib@FreeBSD.org>
> Date:   Sat Aug 16 01:50:42 2025 +0300
> 
>    zfs control dir: properly set eof
> 
> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c
> index 61d0bb26d1e5..725c02d47edf 100644
> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c
> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c
> @@ -1056,17 +1056,22 @@ zfsctl_snapdir_readdir(struct vop_readdir_args *ap)
>  	zfs_uio_t uio;
>  	int *eofp = ap->a_eofflag;
>  	off_t dots_offset;
> +	offset_t orig_resid;
>  	int error;
>  
>  	zfs_uio_init(&uio, ap->a_uio);
> +	orig_resid = uio.uio->uio_resid;
>  
>  	ASSERT3S(vp->v_type, ==, VDIR);
>  
>  	error = sfs_readdir_common(ZFSCTL_INO_ROOT, ZFSCTL_INO_SNAPDIR, ap,
>  	    &uio, &dots_offset);
>  	if (error != 0) {
> -		if (error == ENAMETOOLONG) /* ran out of destination space */
> +		if (error == ENAMETOOLONG) { /* ran out of destination space */
>  			error = 0;
> +			if (orig_resid == uio.uio->uio_resid && eofp != NULL)
> +				*eofp = 1;
> +		}
>  		return (error);
>  	}
>  
> @@ -1084,7 +1089,8 @@ zfsctl_snapdir_readdir(struct vop_readdir_args *ap)
>  		dsl_pool_config_exit(dmu_objset_pool(zfsvfs->z_os), FTAG);
>  		if (error != 0) {
>  			if (error == ENOENT) {
> -				if (eofp != NULL)
> +				if (orig_resid == uio.uio->uio_resid &&
> +				    eofp != NULL)
>  					*eofp = 1;
>  				error = 0;
>  			}
> @@ -1099,8 +1105,12 @@ zfsctl_snapdir_readdir(struct vop_readdir_args *ap)
>  		entry.d_reclen = sizeof (entry);
>  		error = vfs_read_dirent(ap, &entry, zfs_uio_offset(&uio));
>  		if (error != 0) {
> -			if (error == ENAMETOOLONG)
> +			if (error == ENAMETOOLONG) {
>  				error = 0;
> +				if (orig_resid == uio.uio->uio_resid &&
> +				    eofp != NULL)
> +					*eofp = 1;
> +			}
>  			zfs_exit(zfsvfs, FTAG);
>  			return (SET_ERROR(error));
>  		}
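
A userland sketch of the pattern the two ENAMETOOLONG hunks above apply,
as I read them: running out of destination space is turned into success,
and eof is set only when nothing at all was copied out. struct
readdir_state and finish_on_short_buffer() are hypothetical names made up
for this illustration; the real code works on zfs_uio_t and ap->a_eofflag.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the patch inspects. */
struct readdir_state {
	size_t orig_resid;	/* buffer space when the call started */
	size_t resid;		/* buffer space left now */
	int eof;		/* models *ap->a_eofflag */
};

/*
 * Models the ENAMETOOLONG branches: "ran out of destination space" is
 * not an error for the caller, but if no entry was copied out at all,
 * eof is set so the VOP_READDIR post-condition still holds.
 * Any other error is passed through unchanged.
 */
static int
finish_on_short_buffer(struct readdir_state *st, int error)
{
	if (error == ENAMETOOLONG) {
		error = 0;
		if (st->orig_resid == st->resid)
			st->eof = 1;
	}
	return (error);
}
```

Note that all three hunks are in zfsctl_snapdir_readdir(), whereas the
vnode printed in the panic is named .zfs, so it is possible the failing
call goes through a different readdir routine than the one patched here.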