From: John Baldwin
Date: Tue, 12 Jul 2022 11:33:47 -0700
Subject: Re: BLAKE3 unstability?
To: Evgeniy Khramtsov, Ryan Moeller
Cc: FreeBSD-CURRENT@FreeBSD.org
Message-ID: <653d1a91-1468-ba15-5365-87a63ed0e2d1@FreeBSD.org>
In-Reply-To: <20220712084101.iqvwyfuhge6myteq@vax.khramtsov.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On 7/12/22 1:41 AM, Evgeniy Khramtsov wrote:
>>>> I can reproduce via:
>>>>
>>>> $ truncate -s 10G /tmp/test
>>>> $ mdconfig -f /tmp/test -S 4096
>>>> $ zpool create test /dev/md1
>>>> $ zfs create -o checksum=blake3 test/b
>>>> $ dd if=/dev/random of=/test/b/noise bs=1M count=4096
>>>> $ sync
>>>> $ zpool scrub test
>>>> $ zpool status
>>>
>>> I cannot reproduce this on openzfs/zfs@cb01da68057 (the commit that was
>>> most recently merged) built out of tree on either stable/13 70fd40edb86
>>> or main 9aa02d5120a.
>>>
>>> I'll update a system and see if I can reproduce it with the in-tree ZFS.
>>>
>>> - Ryan
>>>
>> It did not reproduce for me with in-tree ZFS on main@3c9ad9398fcd either.
>>
>> Could you share sysctl kstat.zfs.misc.chksum_bench, maybe we are using
>> different implementations?
>> I do see that blake3 went in with only a Linux module parameter for the
>> implementation selection, so I'll have to fix that. For now we can at least
>> see which was fastest, which should be the one selected. You just won't be
>> able to manually change it to see if that helps.
>>
>> - Ryan
>
> I found the culprit (kernel and base from download.FreeBSD.org
> kernel.txz and base.txz respectively) (I forgot about local sysctl.conf...):
>
> kern.sched.steal_thresh=1
> kern.sched.preempt_thresh=121
>
> Then
>
> #!/bin/sh
>
> truncate -s 10G /tmp/test
> mdconfig -f /tmp/test -S 4096
> zpool create test /dev/md0
> zfs create -o checksum=blake3 test/b
> dd if=/dev/random of=/test/b/noise bs=1M count=4096
> sync
> zpool scrub test
> sleep 3
> zpool status
>
> zpool destroy test
> mdconfig -d -u 0
> rm /tmp/test
>
> As for ULE "tuning", these values give me fine desktop interactivity
> when building lang/rust when nice and idprio did not help, so I left
> them in sysctl.conf. Not sure if scheduling parameters are worthy of
> a ZFS PR, maybe something essential is preempted.

It could be missing fpu_kern_enter/leave calls, which a lack of preemption
would cover over.  I thought that missing those would give a panic in the
kernel though, due to FPU instructions being disabled (including vector
instructions).  Maybe ZFS isn't using fpu_kern_enter() with FPU_KERN_NOCTX
and is instead trying to juggle contexts, and it has a bug in how it manages
saved FPU contexts and reuses a context?  If so, I would just suggest that
ZFS switch to using FPU_KERN_NOCTX instead, which runs all SSE type code in
a critical section to disable preemption but avoids having to allocate and
manage FPU contexts.

-- 
John Baldwin
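
For illustration, the FPU_KERN_NOCTX bracketing suggested above could look
roughly like the sketch below.  This is a userland mock, not kernel or ZFS
code: struct thread, curthread, and the two fpu_kern_* functions are
stand-in stubs that only track state, the flag value is illustrative, and
checksum_block() is a hypothetical placeholder for a vectorized checksum
routine.  Only the call order and the NULL context argument are the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userland stand-ins for kernel facilities (hypothetical mocks, not the
 * real implementations): they only track state so the pattern can run.
 */
#define FPU_KERN_NOCTX 0x0004	/* kernel flag name; value illustrative here */

struct fpu_kern_ctx;		/* opaque; unused with FPU_KERN_NOCTX */
struct thread {
	int td_critnest;	/* mock: > 0 means preemption is disabled */
	int td_fpu_ok;		/* mock: FPU/vector use currently permitted */
};

static struct thread thread0;
#define curthread (&thread0)

static void
fpu_kern_enter(struct thread *td, struct fpu_kern_ctx *ctx, unsigned int flags)
{
	/* With FPU_KERN_NOCTX no save area is supplied by the caller. */
	assert((flags & FPU_KERN_NOCTX) != 0 && ctx == NULL);
	td->td_critnest++;	/* mock of entering a critical section */
	td->td_fpu_ok = 1;
}

static int
fpu_kern_leave(struct thread *td, struct fpu_kern_ctx *ctx)
{
	assert(ctx == NULL && td->td_critnest > 0);
	td->td_fpu_ok = 0;
	td->td_critnest--;	/* mock of leaving the critical section */
	return (0);
}

/* Hypothetical routine standing in for a SIMD checksum implementation. */
static uint32_t
checksum_block(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	fpu_kern_enter(curthread, NULL, FPU_KERN_NOCTX);
	assert(curthread->td_fpu_ok);	/* vector code may only run in here */
	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	fpu_kern_leave(curthread, NULL);
	return (sum);
}
```

The trade-off in this style is that the bracketed region cannot be
preempted, so it should stay short; in exchange there is no FPU context to
allocate, save, or accidentally reuse.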