Re: STABLE-15: ZFS incorrectly orders metaslabs (?)

From: Volodymyr Kostyrko <arcade_at_b1t.name>
Date: Fri, 19 Sep 2025 05:45:04 UTC
13.09.25 17:45, Alan Somers:
> On Sat, Sep 13, 2025 at 12:21 AM Volodymyr Kostyrko <arcade@b1t.name> wrote:
> 
>     Hello.
> 
>     So I thought about trying 15 on my book. As I normally run STABLE
>     on my workplace and non-prod servers, this wasn't that scary, so I
>     just went ahead and compiled the kernel. Everything seemed to be
>     working fine, so I also updated world and the kernel modules (so I
>     can use the desktop). Again, everything was working fine, so I went
>     ahead and rebuilt packages. After a few dozen, the host just hung.
>     I rebooted, only to face an instant panic on boot, something like
>     this:
> 
>     https://t.me/freebsd_ua/18931
> 
>     I tried booting the old kernel (STABLE-14), and the host booted.
>     Then I tried repeating the steps under 15 to make sure it's a real
>     bug. After some disk activity the host hung again. This time,
>     however, 14 wasn't able to boot either:
> 
>     https://t.me/freebsd_ua/18948
> 
>     My setup:
> 
>     * Custom kernel, mostly based off MINIMAL.
>     * ZFS was NOT upgraded.
>     * A number of ZFS features were enabled, like checksums, big
>     blocks, dedup, etc.
> 
>     I'll try to boot GENERIC 15 on the pool to check.
> 
>     Hope that helps someone to debug the issue. Thanks.
> 
>     -- 
>     Sphinx of black quartz judge my vow.
> 
> 
> It's a known issue.  See https://github.com/openzfs/zfs/issues/15030

Big thanks, that was a really helpful read. Indeed, all of this was caused
by pre-existing issues related to dedup, and vfs.zfs.recover=1 helped me
mount the pool and work with it long enough to find the true cause.
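
For reference, a minimal sketch of how this tunable can be set at boot.
The message does not say where it was set; putting it in
/boot/loader.conf is an assumption here, and the one-off loader prompt
variant is shown in the comments. With the tunable enabled, ZFS turns
some otherwise fatal consistency errors into warnings, so it is meant
for diagnosis and data rescue rather than permanent use.

```
# /boot/loader.conf (assumed location; remove the line again once the
# pool has been dealt with)
vfs.zfs.recover="1"

# One-off alternative, typed at the loader prompt before booting:
#   set vfs.zfs.recover=1
#   boot
```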

The only good solution is to recreate the pool from scratch, as zfs
rewrite actually triggers the issue again when dedup entries are removed,
causing even more problems. Since then the host has been stable.

-- 
Sphinx of black quartz judge my vow.