Re: BLAKE3 unstability?
- In reply to: Evgeniy Khramtsov : "Re: BLAKE3 unstability?"
Date: Tue, 12 Jul 2022 18:33:47 UTC
On 7/12/22 1:41 AM, Evgeniy Khramtsov wrote:
>>>> I can reproduce via:
>>>>
>>>> $ truncate -s 10G /tmp/test
>>>> $ mdconfig -f /tmp/test -S 4096
>>>> $ zpool create test /dev/md1
>>>> $ zfs create -o checksum=blake3 test/b
>>>> $ dd if=/dev/random of=/test/b/noise bs=1M count=4096
>>>> $ sync
>>>> $ zpool scrub test
>>>> $ zpool status
>>>
>>> I cannot reproduce this on openzfs/zfs@cb01da68057 (the commit that was
>>> most recently merged) built out of tree on either stable/13 70fd40edb86
>>> or main 9aa02d5120a.
>>>
>>> I'll update a system and see if I can reproduce it with the in-tree ZFS.
>>>
>>> - Ryan
>>>
>> It did not reproduce for me with in-tree ZFS on main@3c9ad9398fcd either.
>>
>> Could you share sysctl kstat.zfs.misc.chksum_bench, maybe we are using
>> different implementations?
>> I do see that blake3 went in with only a Linux module parameter for the
>> implementation selection, so I'll have to fix that. For now we can at least
>> see which was fastest, which should be the one selected. You just won't be
>> able to manually change it to see if that helps.
>>
>> - Ryan
>
> I found the culprit (kernel and base from download.FreeBSD.org
> kernel.txz and base.txz respectively) (I forgot about local sysctl.conf...):
>
> kern.sched.steal_thresh=1
> kern.sched.preempt_thresh=121
>
> Then
>
> #!/bin/sh
>
> truncate -s 10G /tmp/test
> mdconfig -f /tmp/test -S 4096
> zpool create test /dev/md0
> zfs create -o checksum=blake3 test/b
> dd if=/dev/random of=/test/b/noise bs=1M count=4096
> sync
> zpool scrub test
> sleep 3
> zpool status
>
> zpool destroy test
> mdconfig -d -u 0
> rm /tmp/test
>
> As for ULE "tuning", these values give me fine desktop interactivity
> when building lang/rust when nice and idprio did not help, so I left
> them in sysctl.conf. Not sure if scheduling parameters are worthy of
> a ZFS PR, maybe something essential is preempted.

It could be a missing fpu_kern_enter()/fpu_kern_leave() pair that a lack
of preemption would otherwise cover over. I thought that omitting those
would panic the kernel, though, since FPU instructions (including vector
instructions) are disabled.

Maybe ZFS isn't using fpu_kern_enter() with FPU_KERN_NOCTX and is instead
trying to juggle contexts, and it has a bug in how it manages saved FPU
contexts and reuses a context? If so, I would just suggest that ZFS switch
to FPU_KERN_NOCTX, which runs all SSE-type code in a critical section to
disable preemption but avoids having to allocate and manage FPU contexts.

--
John Baldwin
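
To make the suggested bracket concrete, here is a rough userspace model of the pattern, not kernel code and not what ZFS actually does. Every model_* name and the MODEL_FPU_KERN_NOCTX value are hypothetical stand-ins so that the file builds anywhere; in a FreeBSD kernel the bracket would, as far as I know, be fpu_kern_enter(curthread, NULL, FPU_KERN_NOCTX) paired with fpu_kern_leave(curthread, NULL). The counter only models "FPU usable, preemption disabled"; it does not touch real FPU state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel flag; the value here is only illustrative. */
#define MODEL_FPU_KERN_NOCTX 0x1u

/* Nonzero while the modeled "no preemption, FPU usable" region is active. */
static int model_fpu_depth;

/* Hypothetical stand-in for fpu_kern_enter(curthread, NULL, FPU_KERN_NOCTX). */
static void
model_fpu_enter(unsigned flags)
{
	assert(flags == MODEL_FPU_KERN_NOCTX);
	model_fpu_depth++;
}

/* Hypothetical stand-in for fpu_kern_leave(curthread, NULL). */
static void
model_fpu_leave(void)
{
	assert(model_fpu_depth > 0);
	model_fpu_depth--;
}

/* Placeholder for a vectorized routine; it must only run inside the bracket. */
static uint32_t
model_simd_sum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	assert(model_fpu_depth > 0);
	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return (sum);
}

/* Every entry point that can reach SIMD code takes the bracket itself. */
static uint32_t
checksum_block(const uint8_t *buf, size_t len)
{
	uint32_t sum;

	model_fpu_enter(MODEL_FPU_KERN_NOCTX);
	sum = model_simd_sum(buf, len);
	model_fpu_leave();
	return (sum);
}
```

The point of the model is only that no saved-context object appears anywhere: the caller enters, runs the vector code, and leaves, and a path that reaches model_simd_sum() without the bracket trips the assertion instead of silently computing with clobbered registers.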