[Bug 260884] [zfs] Panic in zfs_onexit_destroy
- Reply: bugzilla-noreply_a_freebsd.org: "[Bug 260884] [zfs] Panic in zfs_onexit_destroy"
- Reply: bugzilla-noreply_a_freebsd.org: "[Bug 260884] [zfs] Panic in zfs_onexit_destroy [fix available]"
Date: Sun, 02 Jan 2022 17:25:37 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260884
Bug ID: 260884
Summary: [zfs] Panic in zfs_onexit_destroy
Product: Base System
Version: 13.0-RELEASE
Hardware: Any
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: bugs@FreeBSD.org
Reporter: grembo@FreeBSD.org
I see this problem on multiple hosts running a couple of ZFS clone-based jails
(orchestrated by nomad/pot). As pot calls `zfs list` once per second per
running jail, this adds up to 10-30 calls to `zfs list` per second per node.
After a few days, all hosts consistently crash with a panic, which seems to
happen while `zfs` is running. This looks a lot like the bug reported for TrueNAS:
https://jira.ixsystems.com/browse/NAS-108891
It seems the underlying locking problem was already fixed in OpenZFS
upstream, but FreeBSD 13.0-RELEASE is using an older version. As far as I can
see, it would be very easy to apply the following fix as a potential erratum
and create 13.0-RELEASE-p6 from it:
https://github.com/openzfs/zfs/commit/f845b2dd1c60
You can find more context about my use case here:
https://github.com/pizzamig/pot/issues/195
Crashinfo output:
```
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address = 0x18
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80bffeca
stack pointer = 0x28:0xfffffe01e0bd5820
frame pointer = 0x28:0xfffffe01e0bd5830
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 91596 (zfs)
trap number = 12
panic: page fault
cpuid = 3
time = 1641116990
KDB: stack backtrace:
#0 0xffffffff80c40295 at kdb_backtrace+0x65
#1 0xffffffff80bf5d91 at vpanic+0x181
#2 0xffffffff80bf5b63 at panic+0x43
#3 0xffffffff810878f7 at trap_fatal+0x387
#4 0xffffffff81087966 at trap_pfault+0x66
#5 0xffffffff81086f8b at trap+0x2ab
#6 0xffffffff8105b808 at calltrap+0x8
#7 0xffffffff822cabb0 at zfs_onexit_destroy+0x20
#8 0xffffffff82146768 at zfsdev_close+0x58
#9 0xffffffff80a98347 at devfs_destroy_cdevpriv+0x97
#10 0xffffffff80a9bf64 at devfs_close_f+0x64
#11 0xffffffff80b98d2b at _fdrop+0x1b
#12 0xffffffff80b9c5e9 at closef+0x1d9
#13 0xffffffff80ba0697 at closefp_impl+0x77
#15 0xffffffff8105c12e at fast_syscall_common+0xf8
Uptime: 3d16h29m24s
Dumping 7555 out of 65271 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct
pcpu,
(kgdb) #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1 doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2 0xffffffff80bf59bb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:486
#3 0xffffffff80bf5e00 in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:919
#4 0xffffffff80bf5b63 in panic (fmt=<unavailable>)
at /usr/src/sys/kern/kern_shutdown.c:843
#5 0xffffffff810878f7 in trap_fatal (frame=0xfffffe01e0bd5760, eva=24)
at /usr/src/sys/amd64/amd64/trap.c:915
#6 0xffffffff81087966 in trap_pfault (frame=frame@entry=0xfffffe01e0bd5760,
usermode=false, signo=<optimized out>, signo@entry=0x0,
ucode=<optimized out>, ucode@entry=0x0)
at /usr/src/sys/amd64/amd64/trap.c:732
#7 0xffffffff81086f8b in trap (frame=0xfffffe01e0bd5760)
at /usr/src/sys/amd64/amd64/trap.c:398
#8 <signal handler called>
#9 _sx_xlock (sx=0x0, opts=opts@entry=0,
file=0xffffffff8239be7a
"/usr/src/sys/contrib/openzfs/module/zfs/zfs_onexit.c", line=line@entry=89) at
/usr/src/sys/kern/kern_sx.c:325
#10 0xffffffff822cabb0 in zfs_onexit_destroy (zo=0x0)
at /usr/src/sys/contrib/openzfs/module/zfs/zfs_onexit.c:89
#11 0xffffffff82146768 in zfsdev_close (data=0xfffff8000822c700)
at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/kmod_core.c:197
#12 0xffffffff80a98347 in devfs_destroy_cdevpriv (p=0xfffff8051eff9b40)
at /usr/src/sys/fs/devfs/devfs_vnops.c:197
#13 0xffffffff80a9bf64 in devfs_fpdrop (fp=0xfffff807882306e0)
at /usr/src/sys/fs/devfs/devfs_vnops.c:211
#14 devfs_close_f (fp=0xfffff807882306e0, td=<optimized out>)
at /usr/src/sys/fs/devfs/devfs_vnops.c:787
#15 0xffffffff80b98d2b in fo_close (fp=0xfffff807882306e0,
td=0xfffffe01e6a02300) at /usr/src/sys/sys/file.h:377
#16 _fdrop (fp=fp@entry=0xfffff807882306e0, td=td@entry=0xfffffe01e6a02300)
at /usr/src/sys/kern/kern_descrip.c:3510
#17 0xffffffff80b9c5e9 in closef (fp=fp@entry=0xfffff807882306e0,
td=td@entry=0xfffffe01e6a02300) at /usr/src/sys/kern/kern_descrip.c:2828
#18 0xffffffff80ba0697 in closefp_impl (fdp=0xfffffe01ef4134f0, fd=5,
fp=0xfffff807882306e0, td=0xfffffe01e6a02300, audit=true)
at /usr/src/sys/kern/kern_descrip.c:1271
#19 0xffffffff8108827e in syscallenter (td=<optimized out>)
at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189
#20 amd64_syscall (td=0xfffffe01e6a02300, traced=0)
at /usr/src/sys/amd64/amd64/trap.c:1156
#21 <signal handler called>
#22 0x00000008007bb40a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffe9c8
(kgdb)
```