ZFS i/o error on boot unable to start system

David Christensen dpchrist at holgerdanske.com
Mon Mar 2 18:57:56 UTC 2020

On 2020-03-02 07:28, mike tancsa wrote:
> On 2/28/2020 2:04 PM, David Christensen wrote:
>> The most likely explanation is that you broke rc.conf.
> I dont think he is getting that far. This looks like the kernel etc are
> not even loaded yet.

I agree.  I spent many hours yesterday battling the same error 
message again.  The message appears to be generated late in the boot 
process, when the (third-stage?) bootloader or kernel (?) is trying to 
mount root and/or other ZFS virtual devices.  Therefore, both root and 
/etc/rc.conf are unavailable when the message is printed to the console.

It appears that ZFS writes metadata to disk (disks?) that is read by 
(one or more stages of) the bootloader and/or kernel upon the next 
boot.  If conditions during the next boot are "too different" from the 
conditions that existed when the metadata was written, the error message 
is printed and the boot process stops.
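If I am right that the metadata in question is the cached pool 
configuration in /boot/zfs/zpool.cache (that path is an assumption based 
on technique 2 below), you can inspect what the loader will see with 
zdb(8); its -U flag points it at an alternate cache file.  A read-only 
sketch, guarded so it degrades gracefully on systems without zdb:

```shell
# Read-only check (cache file path is an assumption; adjust as needed).
# With no pool argument, zdb dumps every pool configuration found in
# the given cache file -- the same data the boot code will consume.
if command -v zdb >/dev/null 2>&1; then
    zdb -U /boot/zfs/zpool.cache
else
    echo "zdb not found (not a ZFS system?)"
fi
```

Comparing that output before and after a hardware change might show 
whether the cached device paths have gone stale.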

So far, I have found a few techniques for dealing with this problem:

1.  Power off, remove all data disks, and boot.  This sometimes works.

2.  Power off, remove all disks, and boot the FreeBSD USB installer into 
single user mode.  I believe this has always worked.  Once at the 
installer root prompt, remount the installer root filesystem read-write, 
import the problem disk bootpool, delete /boot/zfs/zpool.cache on the 
problem disk, export the problem disk bootpool, power off, unplug the 
installer media, and boot.  Again, sometimes.
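For reference, technique 2 might be sketched roughly as follows from the 
installer's single-user shell.  The pool name "bootpool" and the /mnt 
altroot are assumptions (adjust for your layout, and note the exact path 
to zpool.cache depends on how the boot dataset mounts); the echoes print 
the commands as a dry run instead of executing them:

```shell
# Dry-run sketch of technique 2 -- remove the echoes to run for real.
echo 'mount -u -w /'                        # remount installer root read-write
echo 'zpool import -f -R /mnt bootpool'     # import problem pool under /mnt
echo 'rm /mnt/boot/zfs/zpool.cache'         # delete the stale cache file
echo 'zpool export bootpool'                # export cleanly, then power off
```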

3.  Reverse the ordering of the SATA port connections, so that the 
system disk is at one end or the other -- e.g. if my motherboard has 
ports SATA0 through SATA5, put the system disk at SATA0 if it was at 
SATA5, or put the system disk at SATA5 if it was at SATA0.  (Finding the 
first and last ports gets more complex when you have multiple chips 
and/or HBA's, and can change when you add more disks.)  Again, sometimes.
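To see which disk the kernel actually enumerates first (rather than 
guessing from the port silkscreen), camcontrol(8) lists attached devices 
with their bus/target numbers.  A guarded sketch, since camcontrol is 
FreeBSD-only:

```shell
# List devices in kernel enumeration order, with bus/target/lun info.
if command -v camcontrol >/dev/null 2>&1; then
    camcontrol devlist
else
    echo "camcontrol not found (FreeBSD only)"
fi
```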

4.  If all else fails, move the system disk to another computer and 
start over.  (Or build a new one in the target computer.)

By executing permutations of the above techniques, I was eventually able 
to get both computers to boot with one bulk data pool in each.

The dimensions of this problem appear to be:

1.  Firmware -- BIOS (myself, OP?) vs. (U)EFI

2.  Partitioning -- MBR (myself) vs. GPT (OP?)

3.  SATA port ordering -- system disk on first SATA port (unknown) vs. 
system disk on last SATA port (unknown)

4.  Boot file system -- UFS vs. ZFS boot (myself, OP?)

5.  Root file system -- UFS vs. ZFS root (myself, OP?)

6.  Number of VDEV's -- multiple devices in one VDEV with everything 
(OP?) vs. one device with a boot partition/VDEV and a root 
partition/VDEV, plus multiple devices in yet another VDEV with bulk 
data (myself).

So, 2**6 = 64 combinations using just the dimensions and choices above 
(and there are more).  My case may have some overlap with the OP and/or 
the various respondents, but the differences make troubleshooting 
difficult, because what works for one combination may not apply to, or 
may not work for, another.

My next battle will be inserting and removing additional disks for 
rotating backups...


More information about the freebsd-questions mailing list