ZFS root mount regression

Garrett Wollman wollman at csail.mit.edu
Fri Jul 19 18:21:04 UTC 2019


I recently upgraded several file servers from 11.2 to 11.3.  All of
them boot from a ZFS pool called "tank" (the data is in a different
pool).  In a couple of instances (which caused me to have to take a
late-evening 140-mile drive to the remote data center where they are
located), the servers failed at the root mount phase.  In one case,
the kernel bailed out with error 5 (I believe that's [EIO]) to the
usual mountroot prompt.  In the second case, the kernel panicked
instead.

The root cause (no pun intended) on both servers was a disk, supplied
by the vendor, carrying a leftover ZFS label that claimed to be part
of the "tank" pool, and for some reason the 11.3 kernel was trying to
mount that (faulted) pool rather than the real one.  The disks and
pool configuration were unchanged from 11.2 (and probably 11.1 as
well), so I am puzzled.
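
For reference, a stray label like that should show up if you dump the
ZFS label straight off the device, e.g. (da9 here is just a stand-in
for the suspect disk):

    # Print the ZFS label(s) on the suspect disk; a stale label will
    # show the pool name "tank" with a pool_guid and txg that do not
    # match the real pool.
    zdb -l /dev/da9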

Other than laboriously running "zpool labelclear -f /dev/somedisk" for
every piece of media that comes into my hands, is there anything else
I could have done to avoid this?
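
For concreteness, the kind of cleanup I mean looks roughly like this
(device names are hypothetical, and labelclear is destructive, so be
sure they really are the newly arrived disks):

    # Wipe any leftover ZFS label from each incoming disk before it
    # goes into service (destructive: double-check the device names).
    for d in da9 da10 da11; do
        zpool labelclear -f /dev/$d
    done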

-GAWollman


