Re: [List] Re: zfs corruption at zroot/usr/home:<0x0>
Date: Thu, 13 Nov 2025 15:42:49 UTC
On Thu, Nov 13, 2025 at 2:06 PM Frank Leonhardt <freebsd-doc@fjl.co.uk> wrote:
> On 12/11/2025 19:20, Tomek CEDRO wrote:
>> Hmm, this is a brand new NVME drive, not really likely to fail. I have the same problem on a zraid0 (stripe) array: initially I saw the bad file name with 3 problems (vm image), but it has now turned into ztuff/vm:<0x482>. Charlie Foxtrot :-(
>
> NVME drives are known to fail early in their life if they're going to fail at all, otherwise they're quite reliable for a long time.
>
> Almost every time I've blamed ZFS in the past (and there have been quite a few occasions) it's turned out to be a hardware problem, even when it seemed okay. Testing subsequently confirmed a flaky drive or controller. A few times I haven't found conclusive proof one way or the other. I believe ZFS is just particularly good at detecting corruption - I've seen corrupted data on UFS2 over the years, but the OS doesn't notice.
>
> There's always the chance of a bug in the drivers, of course.

Hmm, I will try to boot a diagnostics ISO from the vendor to check the NVMe drive status.. but it was and still is working fine after several months; it uses the onboard controller and has a big heatsink installed. This is a Samsung PRO 9100 2TB NVMe with the latest firmware installed (I know Samsung can release NVMes that self-destruct because of faulty firmware).

I once noticed these early errors in the raidz2 with brand new WD Red HDDs, so I checked every single one of them with a destructive badblocks run, and one turned out to be faulty and was replaced quickly. That was the only time I had seen a ZFS error before. Since then I always run several iterations of read-write badblocks on every disk even before first use :-)

I have three ZFS pools: two are simple stripes (1x2TB for root, 2x2TB for data) and one is raidz2 (4x4TB for data). I am sure this was caused by the two kernel panics I triggered by hand during tests. Not sure if this is a "driver" bug, because I was the source of the problem, but if there is a place for improvement in ZFS then I just found one.. I would rather lose the last write than end up with an inconsistent filesystem at an unknown location afterwards :-P

Only the raidz2 is unaffected, because it had the additional redundancy to restore the content. Now I understand why the "lost" 8TB of space (the raidz2 parity) is required :D The other two pools were in active use during the panic, hence the data loss. I will replace the old ZFS stripe and add two disks to the raidz2 at the first occasion when some cash comes in :-)

With UFS2 I not only always had filesystem corruption after a kernel panic, but, as you say, there were hidden corruption problems that fsck could not catch. ZFS is like a dream here.. and look, the problem only happened in one known dataset, so I can restore just that dataset, not the whole disk :-)

> And this is why (as mentioned elsewhere) I do a last-ditch backup of files to tape using tar!
>
> ZFS is sold as a magic never-lose-data filing system. It's good, but it can't work miracles on flaky hardware. IME, when it goes, it goes.

Yes, I will now re-enable my automatic ZFS snapshots in cron, so that at least one month of snapshots is auto-created every week/day and then zfs exported (rough sketch below). I had this running, but got too confident and disabled it, and look, it would help now :-P

I am using Blu-ray disks for backups; these are bigger and faster (also EMP resistant?) than tape, but I really admire the tape approach :-) I just bought several Sony BD-RE XL 100GB (rewritable) and also have some BD-RE DL (50GB), but these are slow to write (2x 36Mbps).
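Something like this is what I mean by the snapshot + export flow (a rough sketch only: the dates, paths and the /dev/cd0 device node are examples, growisofs comes from the dvd+rw-tools port, and the % signs have to be escaped inside crontab):

  # crontab entry: recursive snapshot of the data pool every Sunday at 03:00
  0 3 * * 0  /sbin/zfs snapshot -r ztuff@auto-`date +\%Y\%m\%d`

  # export a snapshot to a compressed stream file, then burn it to BD-RE
  zfs send -R ztuff@auto-20251113 | xz > /backup/ztuff-20251113.zfs.xz
  growisofs -Z /dev/cd0 -R -J /backup/ztuff-20251113.zfs.xz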
BD-R and BD-R DL are a lot faster to write (i.e. 6..12x) but write-once only.. and I have BD-R XL 128GB with 4x write. I also got DVD-RAM (2..5x write speed, still slower than 6x DVD-RW), which in theory can store small portions of data quickly (good for logs), but FreeBSD's UDF support ends at 1.50 while 2.60 is required for true random access, and I did not manage to get udfclient to provide random read/write, so multisession is the only way for now. A good disk burner with firmware that supports these disks is also required - not all of them can even read them - and write speed matters too: it makes a difference whether a backup takes 12h or a quarter of that.

> Good luck with recovering the snapshot.
>
> Regards, Frank.

Thank you Frank!! For now I am backing up the current data to the disks; it takes some time. I will report back after a simple snapshot rollback :-)

Tomek

-- 
CeDeROM, SQ7MHZ, http://www.tomek.cedro.info
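PS. In case it helps anyone hitting the same thing, the rollback I am about to try looks roughly like this (the snapshot name is made up; I use ztuff/vm here only as an example of an affected dataset):

  zpool status -v ztuff                   # lists the datasets/objects with permanent errors
  zfs list -t snapshot -r ztuff/vm        # pick the last snapshot from before the panic
  zfs rollback -r ztuff/vm@before-panic   # -r also discards any newer snapshots
  zpool clear ztuff                       # reset the pool error counters
  zpool scrub ztuff                       # the error list should clear after a scrub or two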