Getting ZFS pools back.

Willem Jan Withagen wjw at digiware.nl
Mon Apr 30 10:37:51 UTC 2018


On 29-4-2018 23:20, Willem Jan Withagen wrote:
> On 29/04/2018 20:21, Warner Losh wrote:
>>
>>
>> On Sun, Apr 29, 2018 at 11:57 AM, Jan Knepper
>> <jan at digitaldaemon.com> wrote:
>>
>>     On 04/29/2018 13:27, Willem Jan Withagen wrote:
>>
>>         Trouble started when I installed (freebsd-update) 11.1 over a
>>         running 10.4. Which is sort of scary?
>>
>>     This does sound 'scary' as I am planning to do this in the (near)
>>     future...
>>
>>     Has anyone else experienced issues like this?
>>
>>     Generally I do build the new system software on a running system,
>>     but then go to single user mode to perform the actual install.
>>
>>     I have done many upgrades like that over 18 or so years and never
>>     seen or heard of an issue like this.
>>
>>
>> 11.x binaries aren't guaranteed to work with a 10.x kernel. So that's 
>> a bit of a problem. freebsd-update shouldn't have let you do that either.
>>
>> However, most 11.x binaries work well enough to at least bootstrap / 
>> fix problems if booted on a 10.x kernel due to targeted forward 
>> compatibility. You shouldn't count on it for long, but it generally 
>> won't totally brick your box. In the past, and I believe this is still 
>> true, they work well enough to compile and install a new kernel after 
>> pulling sources. The 10.x -> 11.x syscall changes are such that you 
>> should be fine. At least if you are on UFS.
> 
> I have been doing this kind of thing for years and years. Even upgrading 
> over NFS and stuff. Sometimes it is a bit too close to the sun and 
> things burn. But never crash this bad.
> 
>> However, the ZFS ioctls and such are in the bag of 'don't specifically 
>> guarantee and also they change a lot' so that may be why you can't 
>> mount ZFS by UUID. I've not checked to see if there's specifically an 
>> issue here or not. The ZFS ABI is somewhat more fragile than other 
>> parts of the system, so you may have issues here.
>>
>> If all else fails, you may be able to PXE boot an 11 kernel, or boot 
>> off a USB memstick image to install a kernel.
> 
> Tried replacing just about everything, both in the boot partition (after 
> first growing it to fit a > 64K gptzfsboot) and in /boot, from the memstick.
> But the error never went away.
> 
> Never had ZFS die on me this bad, that I could not get it back.
> 
>> Generally, while we don't guarantee forward compatibility (running 
>> newer binaries on older kernels), we've generally built enough forward 
>> compat so that things work well enough to complete the upgrade. That's 
>> why you haven't hit an issue in 18 years of upgrading. However, the 
>> velocity of syscall additions has increased, and we've gone from 
>> fairly stable (stale?) ABIs for UFS to a more dynamic one for ZFS 
>> where backwards compat is a bit of a crap shoot and forward compat 
>> isn't really there at all. That's likely why you've hit a speed bump 
>> here.
> 
> Come to think of it, I did not do this step with freebsd-update, since I 
> was not at an official release yet. I was going to 11.1-RELEASE, to be 
> able to start using freebsd-update.
> 
> So I don't think I did just that... But I tried so much yesterday.
> Normally I would installkernel, reboot, installworld, mergemaster, 
> reboot for systems that are not up for freebsd-update.

Right,

The story gets even sadder .....
Took the "spare" disk home, and just connected it to an older SuperMicro 
server I had lying about for Ceph tests. And lo and behold, it just boots.

So that system got upgraded from: 10.2 -> 10.4 -> 11.1
No complaints about anything.

So now I'm inclined to point at older hardware with an old BIOS, which 
confused ZFS, or probably more precisely gptzfsboot.

From dmidecode:
System Information
         Manufacturer: Supermicro
         Product Name: H8SGL
         Version: 1234567890
BIOS Information
         Vendor: American Megatrends Inc.
         Version: 3.5
         Release Date: 11/25/2013
         Address: 0xF0000
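
For anyone wanting to compare BIOS details across several boxes, the trimmed 
dmidecode output above can be turned into a lookup table. This is only a 
minimal sketch: the function name parse_dmidecode is made up for illustration, 
and it assumes the input has the same shape as the excerpt above (an 
unindented section title followed by indented "Key: Value" lines). Full 
dmidecode output contains extra lines such as handle headers that this does 
not treat specially.

```python
def parse_dmidecode(text):
    """Group indented 'Key: Value' lines under the unindented title above them.

    Hypothetical helper, written for the trimmed excerpt shown in this mail.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if not line[0].isspace():
            # An unindented line starts a new section, e.g. "BIOS Information".
            current = sections.setdefault(line.strip(), {})
        elif current is not None and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.strip().partition(":")
            current[key.strip()] = value.strip()
    return sections


# The excerpt from this mail, used as sample input.
SAMPLE = """\
System Information
         Manufacturer: Supermicro
         Product Name: H8SGL
         Version: 1234567890
BIOS Information
         Vendor: American Megatrends Inc.
         Version: 3.5
         Release Date: 11/25/2013
         Address: 0xF0000
"""

info = parse_dmidecode(SAMPLE)
bios = info["BIOS Information"]
print(info["System Information"]["Product Name"],
      bios["Version"], bios["Release Date"])
```

Running the same function over the output saved from each machine gives one 
dictionary per box, which makes it easy to see which ones share a board and 
BIOS release.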

We only have one of those, so further investigation and/or tinkering on 
that hardware will not be possible.

--WjW




