persistent integer divide fault panic in zfs_rmnode

Peter Eriksson pen at lysator.liu.se
Wed Jan 27 21:00:10 UTC 2021


Have you tried with the OpenZFS port instead in case the problem is solved there?

(Might be easiest to just boot a FreeBSD 13 kernel with your existing 12.2 user land)

- Peter


> On 27 Jan 2021, at 19:15, Steven Schlansker <stevenschlansker at gmail.com> wrote:
> 
> Does anybody have any suggestions as to what I can try next regarding this
> panic?
> 
> At this point the only path forward I see is to declare the zpool corrupt
> and attempt to move all the data off, destroy, and migrate back, and hope
> the recreated pool does not tickle this bug.
> 
> That would be a pretty disappointing end to a long fatal-problem-free run
> with ZFS.
> 
> Thanks,
> Steven
> 
> On Fri, Jan 8, 2021 at 3:41 PM Steven Schlansker <stevenschlansker at gmail.com>
> wrote:
> 
>> Hi freebsd-fs,
>> 
>> I have an 8-way raidz2 system running FreeBSD 12.2-RELEASE-p1 GENERIC.
>> Approximately since upgrading to FreeBSD 12.2-RELEASE, I receive a nasty
>> panic when trying to unlink any of a large number of files.
>> 
>> Fatal trap 18: integer divide fault while in kernel mode
>> 
>> 
>> The pool reports as healthy:
>> 
>>   pool: universe
>>  state: ONLINE
>> status: One or more devices are configured to use a non-native block size.
>>         Expect reduced performance.
>> action: Replace affected devices with devices that support the configured
>>         block size, or migrate data to a properly configured pool.
>>   scan: resilvered 416M in 0 days 00:08:35 with 0 errors on Thu Jan  7 02:16:03 2021
>> 
>> When some files are unlinked, the system panics with a partial backtrace of:
>> 
>> #6 0xffffffff82a148ce at zfs_rmnode+0x5e
>> #7 0xffffffff82a35612 at zfs_freebsd_reclaim+0x42
>> #8 0xffffffff812482db at VOP_RECLAIM_APV+0x7b
>> #9 0xffffffff80c8e376 at vgonel+0x216
>> #10 0xffffffff80c8e9c5 at vrecycle+0x45
>> 
>> I captured a dump, and using kgdb extracted a full backtrace, and filed it
>> as https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250784
>> 
>> #8  0xffffffff82963725 in get_next_chunk (dn=0xfffff804325045c0,
>>     start=<optimized out>, minimum=0, l1blks=<optimized out>)
>>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:721
>> warning: Source file is more recent than executable.
>> 721             (roundup(*start, iblkrange) - (minimum / iblkrange * iblkrange)) /
>> (kgdb) list
>> 716          * L1 blocks in this range have data. If we can, we use this
>> 717          * worst case value as an estimate so we can avoid having to look
>> 718          * at the object's actual data.
>> 719          */
>> 720         uint64_t total_l1blks =
>> 721             (roundup(*start, iblkrange) - (minimum / iblkrange * iblkrange)) /
>> 722             iblkrange;
>> 723         if (total_l1blks <= maxblks) {
>> 724                 *l1blks = total_l1blks;
>> 725                 *start = minimum;
>> (kgdb) print iblkrange
>> $1 = 0
>> (kgdb) print minimum
>> $2 = 0
>> 
>> It looks like it is attempting to compute 0 / 0 (both iblkrange and
>> minimum are 0), which triggers the integer divide fault.
>> 
>> How can I restore my zpool to a working state?  Thank you for any
>> assistance.
>> 
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"


