Crash on boot after moving root zpool

From: Mr Roooster <mrrooster_at_gmail.com>
Date: Wed, 23 Jun 2021 22:24:00 +0100
Hi All,

I appreciate the scenario below is a bit 'odd', so I'm not sure whether
it's worth raising a bug for.

I have my home server running FreeBSD 13-stable (amd64), specifically
'FreeBSD 13.0-STABLE #5 stable/13-n246071-659f77766031'.

It's an AMD Ryzen 8 core on an Asus X-370 Pro.

I have root on a SATA-attached SSD (a 1TB Samsung EVO), which was
installed from the FreeBSD install media. It boots via EFI.

I decided to move the root to a 2TB NVMe, so I installed the NVMe,
duplicated the partition layout (apart from a larger freebsd-zfs
partition), and used dd to copy the efi and freebsd-boot partitions
over.
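For reference, the duplication can be sketched roughly as below (the
da0/da1 device names are placeholders, not the actual devices on my
server; on a layout with a larger zfs partition the restored table
would need its last partition resized or re-added by hand):

```shell
# Copy the GPT layout from the old disk to the new one.
# -F destroys any existing partitioning on the target.
gpart backup da0 | gpart restore -F da1

# Copy the EFI system partition and the freebsd-boot
# partition over verbatim.
dd if=/dev/da0p1 of=/dev/da1p1 bs=1m
dd if=/dev/da0p2 of=/dev/da1p2 bs=1m
```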

I then rebooted and selected the new drive as the boot device to check
that the EFI partition and bootloader worked as expected; the system
rebooted fine.

The next step was to 'zpool add' the zfs partition on the new drive to
the zroot pool, then 'zpool remove' the existing partition located on
the SATA SSD (ada7p4, IIRC).

This all appeared to work, and I got a message telling me the data had
been migrated and that some memory was in use for the removed-device
mappings. The data seemed okay and the machine remained operational.

Then I rebooted. At this point I got stuck in an endless reboot loop;
a video of the incident suggests the panic happens somewhere around
spa_import_rootpool, but it's so quick I couldn't capture a full
stack trace.

I was able to boot off a USB stick and import the pool; the data looked
okay, so I then used zfs send to dump the data to a file. I have since
successfully zfs receive'd this onto a new zroot pool and all is well
again.
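For completeness, the recovery went roughly like this from the USB live
environment (the snapshot and file names here are illustrative, not the
exact ones I used):

```shell
# Import the broken pool under an alternate root so it
# doesn't shadow the live environment's own filesystems.
zpool import -f -R /mnt zroot

# Snapshot everything recursively and dump a replication
# stream to a file on some scratch storage.
zfs snapshot -r zroot@rescue
zfs send -R zroot@rescue > /scratch/zroot.send

# Later, after destroying and recreating a fresh zroot pool:
zfs receive -F zroot < /scratch/zroot.send
```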

But I am a little curious why it didn't work. The pool appeared to be
importable, and the data was fully retrievable via zfs send.

I have also managed to re-create this in a VM. Using Hyper-V, I
created a new FreeBSD VM with EFI, 4 CPUs, 4G of memory, and the
default-size hard disk, and installed from
FreeBSD-13.0-RELEASE-amd64-disc1.iso.

I then added another virtual SCSI disk (da1) and did the following:

<----------------------------------------------------------------------------------------------CUT
[root_at_vm ~]# gpart show da0
=>       40  266338224  da0  GPT  (127G)
         40     532480    1  efi  (260M)
     532520       1024    2  freebsd-boot  (512K)
     533544        984       - free -  (492K)
     534528    4194304    3  freebsd-swap  (2.0G)
    4728832  261607424    4  freebsd-zfs  (125G)
  266336256       2008       - free -  (1.0M)

[root_at_vm ~]# gpart create -s gpt da1
da1 created
[root_at_vm ~]# gpart add -t efi -b 40 -s 532480 -i 1 da1
da1p1 added
[root_at_vm ~]# gpart add -t freebsd-boot -b 532520  -s 1024 -i 2 da1
da1p2 added
[root_at_vm ~]# gpart add -t freebsd-swap -b 534528  -s 4194304 -i 3 da1
da1p3 added
[root_at_vm ~]# gpart add -t freebsd-zfs -b 4728832  -s 261607424 -i 4 da1
da1p4 added
[root_at_vm ~]# gpart show da1
=>       40  266338224  da1  GPT  (127G)
         40     532480    1  efi  (260M)
     532520       1024    2  freebsd-boot  (512K)
     533544        984       - free -  (492K)
     534528    4194304    3  freebsd-swap  (2.0G)
    4728832  261607424    4  freebsd-zfs  (125G)
  266336256       2008       - free -  (1.0M)

[root_at_vm ~]# dd if=/dev/da0p1 of=/dev/da1p1 bs=1m
260+0 records in
260+0 records out
272629760 bytes transferred in 0.124740 secs (2185582343 bytes/sec)
[root_at_vm ~]# dd if=/dev/da0p2 of=/dev/da1p2
1024+0 records in
1024+0 records out
524288 bytes transferred in 0.343932 secs (1524395 bytes/sec)
[root_at_vm ~]# zpool status zroot
  pool: zroot
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          da0p4     ONLINE       0     0     0

errors: No known data errors
[root_at_vm ~]# zpool add zroot da1p4
[root_at_vm ~]# zpool remove zroot da0p4
[root_at_vm ~]# zpool status zroot
  pool: zroot
 state: ONLINE
remove: Removal of vdev 0 copied 1013M in 0h0m, completed on Wed Jun
23 20:43:46 2021
    11.8K memory used for removed device mappings
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          da1p4       ONLINE       0     0     0

errors: No known data errors
<----------------------------------------------------------------------------------------------CUT

If at this point I shut down the VM, remove what was da0, and set da1
to be the only (and first) SCSI device, I get exactly the same reboot
loop.

Booting the VM off the install DVD allows me to zpool import the root
pool just fine. (Rebooting still fails after doing this, though.)

Worth a PR?

Cheers,

Ian