Date: Tue, 2 Sep 2025 22:01:39 +0300 (MSK)
From: Dmitry Morozovsky <woozle@woozle.net>
To: Alexander Motin
Cc: freebsd-current@FreeBSD.org
Subject: Re: reviving ZFS in broken sm_start+sm_size state
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Alexander, nice to hear from ya!

On Tue, 2 Sep 2025, Alexander Motin wrote:

> Hi Dmitry,
>
> This is a space map corruption that could have happened even some time
> before the reboot. You should be able to import the pool read-only to
> evacuate the data,

ah, that mostly straightforward idea had somehow escaped me! And yes, I
confirm: `zpool import -o readonly=on -R /mnt` did not panic, and
`find -s /mnt` produced a reasonable result.

> since read-only import does not load space maps. Unfortunately, without
> any reproduction of the actual corruption we might not be able to
> understand how it happened. It might be either software or hardware, so
> unless you have ECC RAM, you may wish to test it. You may also try to use
> `zdb -emmmm ...` to dump the metaslabs on the pool and look for more
> corruptions and their patterns, hoping it gives any more ideas.

well, as I said, there's not much data to evacuate, and good enough backups
are in place, so I'd rather try to do something to locate and hopefully help
fix the underlying bug.

output from which commands would be useful?

thanks again!
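For the record, the zdb run Alexander suggests works against the exported
(un-imported) pool, so it can be done without triggering the allocation
panic. A sketch, assuming the pool is named `zroot` (substitute your pool
name):

```shell
# Dump metaslab and space map detail for an exported pool (-e).
# Repeating -m raises verbosity; four m's print raw space map entries,
# which should expose the bogus entry_offset from the panic message.
zdb -e -mmmm zroot > /var/tmp/zroot-metaslabs.txt

# Optionally traverse all block pointers and verify space accounting;
# slow, but another way to spot corruption patterns.
zdb -e -b zroot
```

Note `-e` here stands for "exported pool", not the `-emmmm` bundling above;
the combined spelling `zdb -emmmm zroot` is equivalent.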
> On 01.09.2025 11:22, Dmitry Morozovsky wrote:
> > Dear colleagues,
> >
> > after some (AFAIR clean) reboot of current with ZFS-on-root I had an
> > unbootable system with the following panic (OCRed from a mobile photo,
> > but hopefully good enough):
> >
> > --- 8< ---
> > panic: VERIFY3U(entry_offset, <, sm->sm_start + sm->sm_size) failed
> > (1847270282567680 < 92341796864)
> >
> > cpuid = 2
> > time = 1756738203
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0149c856d0
> > vpanic() at vpanic+0x136/frame 0xfffffe0149c85800
> > spl_panic() at spl_panic+0x3a/frame 0xfffffe0149c85860
> > space_map_iterate() at space_map_iterate+0x3b1/frame 0xfffffe0149c85920
> > space_map_load_length() at space_map_load_length+0x5f/frame 0xfffffe0149c85970
> > metaslab_load() at metaslab_load+0x529/frame 0xfffffe0149c85a40
> > metaslab_activate() at metaslab_activate+0x46/frame 0xfffffe0149c85a88
> > metaslab_alloc_dva_range() at metaslab_alloc_dva_range+0x7f9/frame 0xfffffe0149c85bb0
> > metaslab_alloc_range() at metaslab_alloc_range+0x2c2/frame 0xfffffe0149c85c70
> > metaslab_alloc() at metaslab_allo
> > zio_dva_allocate() at 0xfffffe0149c85cc0
> > zio_execute() at zio iraframe 0xfffffe0149c85e10/frame 0xfffffe0149c85e40
> > taskqueue_run_locked() at taskqueue_run_locked+0x1c2/frame 0xfffffe0149c85ec0
> > taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe0149c85ef0
> > fork_exit() at fork_exit+0x82/frame 0xfffffe0149c85f30
> > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0149c85f30
> > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> > KDB: enter: panic
> > [ thread pid 0 tid 101011 ]
> > Stopped at
> > --- 8< ---
> >
> > attempts to boot from the last snapshot and/or to boot from PRERELEASE
> > and zpool import lead to exactly the same result, even with different
> > '-F' options:
> >
> > the pool *seems* to be importable but actually isn't, due to the mad
> > entry_offset, as I can see
> > from the source
> >
> > any hints on how I could resolve this? the pool content itself is not
> > **very** important, but avoiding recreation would be nice

-- 
Sincerely,
D.Marck                                                        [MCK-RIPE]
[ FreeBSD committer: marck@FreeBSD.org ]
---------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- woozle@woozle.net ***
---------------------------------------------------------------------------