Re: FreeBSD 13.2-STABLE can not boot from damaged mirror AND pool stuck in "resilver" state even without new devices.

From: Warner Losh <imp_at_bsdimp.com>
Date: Sun, 07 Jan 2024 18:34:06 UTC
On Sun, Jan 7, 2024 at 10:57 AM Lev Serebryakov <lev@freebsd.org> wrote:

> On 07.01.2024 16:38, Miroslav Lachman wrote:
>
> >>> ZFS: i/o error - all block copies unavailable
> >>> ZFS: can't read MOS of pool zroot
> >>>
> >>>     after that.
> >>   I've re-created pool from scratch
> >>
> >>   zpool create znewroot ada0p3 && zfs send zroot | zfs receive znewroot
> >>   && zpool destroy zroot && zpool attach znewroot ada0p3 ada1p3
> >>
> >>   but gptzfsboot still can not boot from it with same diagnostics :-(
>

I must have missed it. What were the diagnostics?


> > How large are your disks in a question?
>    2TB
>
> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
> ada0: <HGST HUS726020ALE610 APGNTD05> ACS-2 ATA SATA 3.x device
> ada0: Serial Number K5HPZZLD
> ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
> ada0: Command Queueing enabled
> ada0: 1907729MB (3907029168 512 byte sectors)
> ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
> ada1: <WDC WD2000FYYZ-01UL1B1 01.01K02> ATA8-ACS SATA 3.x device
> ada1: Serial Number WD-WMC1P0504169
> ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
> ada1: Command Queueing enabled
> ada1: 1907729MB (3907029168 512 byte sectors)
>

Anything under 4294967296 (2^32) sectors should be good, so these drives
shouldn't see this problem. The BIOS interfaces should have no trouble here.
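
To make the arithmetic concrete, here is a minimal sketch of that check. The threshold is the 2^32-sector figure from this message; the function names are made up for illustration and are not part of any FreeBSD tool.

```python
# Sketch only: threshold is the 2**32-sector figure quoted in this thread.
LBA32_LIMIT = 2**32  # 4294967296 sectors


def fits_bios_lba(sector_count: int) -> bool:
    """Return True if every sector of the disk is below the 2**32 limit."""
    return sector_count < LBA32_LIMIT


def capacity_bytes(sector_count: int, sector_size: int = 512) -> int:
    """Raw capacity in bytes for a given sector count and sector size."""
    return sector_count * sector_size


if __name__ == "__main__":
    sectors = 3907029168  # ada0 and ada1, from the dmesg output above
    print(fits_bios_lba(sectors))       # True
    print(capacity_bytes(sectors))      # 2000398934016
    print(capacity_bytes(LBA32_LIMIT))  # 2199023255552
```

With 512-byte sectors the limit works out to 2^41 bytes, which is where the 2TB figure comes from.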


> > As far as I search the internet it is caused by the boot code (later
> > stage which is in a file in /boot directory) was moved too far from the
> > beginning of the disk and some old BIOS cannot allow the system to continue
> > booting.
>    Oh, it is good hypothesis. It is Haswell-time MSI board (old Hetzner
> EX40 instance)...
>

Yes. If the drives are > 2TB you lose. BIOS is not for you... unless you
make special partitions that sit within the first 2TB of the drive and only
boot off of those. Also, if the drives have 4k sectors, you likely lose,
though it's hit or miss. Those are the hard limits of the BIOS ABI.
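
As a rough illustration of the "special partitions" workaround, the sketch below checks whether a partition ends inside the range the BIOS can address. It assumes 512-byte sectors and the 2^32-sector limit; the function name and the example partition geometries are hypothetical.

```python
# Sketch only: assumes 512-byte sectors and a 2**32-sector BIOS limit.
LBA32_LIMIT = 2**32


def partition_within_bios_range(start_sector: int, size_sectors: int) -> bool:
    """Return True if the partition's last sector is below the limit."""
    if start_sector < 0 or size_sectors <= 0:
        raise ValueError("start must be >= 0 and size must be > 0")
    last_sector = start_sector + size_sectors - 1
    return last_sector < LBA32_LIMIT


if __name__ == "__main__":
    GIB = 2**30 // 512  # sectors per GiB at 512 bytes per sector
    # Hypothetical small boot partition near the start of the disk.
    print(partition_within_bios_range(2048, 4 * GIB))     # True
    # Hypothetical 3 TiB partition starting at the same place.
    print(partition_within_bios_range(2048, 3072 * GIB))  # False
```

A partition that passes this check can still fail to boot for other reasons, such as the 4k-sector problems mentioned above.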

> It can also be avoided if your machine supports EFI boot, but my HP
> Microserver Gen 8 does not support it.
>    I'll try to switch to EFI, but it needs some luck to get to BIOS with
> provided KVM, it is very unstable :-)
>

BIOS booting is dying. It will be unsupportable in not too many more years
and the code removed. The rapid proliferation of ZFS crypto and compression
types is hastening the race to see who can use up the most space in the
boot loader. We can do marginal things to make it better wrt the 640k
limit, sure, but then we hit other limits like the 2TB address space, like
not being able to reliably support 4k drives, etc. BIOS booting likely will
support an increasingly small subset of all possible booting methods as we
go forward. The current crazy mix of different alternative firmwares makes
it hard to know what will survive, but as we hit these limitations, these
systems will get harder and harder to configure, deploy and manage.

The Linux on ZFS root pages, btw, recommend having two pools on two
partitions on the disk: one of a few GB that is the boot pool holding the
kernel, and the other, the rest of the disk, that is rpool, the root
pool. If people want to continue to support BIOS booting (or rather,
booting using the CSM interfaces), then somebody is going to need to step
up to the plate and implement a similar option in bsdinstall, bectl,
freebsd-update, etc.

Warner