zfs upgrade hang upgrading from v3 to v5

Teske, Devin Devin.Teske at fisglobal.com
Mon Jan 6 17:27:06 UTC 2014

On Jan 4, 2014, at 2:54 PM, Darren Pilgrim wrote:

> I'm upgrading a system from 8.3 to 9.2 using a fresh install onto a parallel set of filesystems on the ZFS pool.  The system is a root on ZFS configuration with GPT-labeled AHCI disks.  The zpool upgrade step worked fine.  When I did `zfs upgrade -a` it didn't return right away, but this system is a little smaller so I left it to work.
> An hour later, it's still not done.  Ctrl-T shows zfs upgrade is in tx->tx_sync_done_cv and using no CPU.  Normally I expect to see "runnable" and using some CPU.  I can still work in open SSH sessions, but other zfs commands hang.  New SSH logins don't work.  Console logins hang between me entering the username and it printing the password prompt.  Even though I know there are active processes on the system, there is no disk activity.  Networking is still fine--the machine acts as a router, and the LAN behind it hasn't lost Internet access.  The unbound instance running on it is also responsive, but it never touches the disk when running (it syslogs).
> Figuring it's livelocked on disk I/O, I tried to reboot, but neither Control-Alt-Delete nor the power button did anything.  I ended up hard-resetting the system.
> The system rebooted without issue.  Zfs upgrade showed a few of the v3 filesystems had been upgraded, but most hadn't.  Upgrading filesystems one by one got me most of the way there.  By dumb luck I got all the way to the base filesystem without anything hanging.  The base filesystem, however, did hang.
> I read Devin Teske's messages to freebsd-fs from Sept 20, 2013 about the same scenario.  Interestingly, the base filesystem on this box is the only one that has mountpoint=none.  Later today I'll try setting a mountpoint on it to see if the upgrade will succeed then.
> In the meantime, is this a known issue by now?  The only things I could find were the aforementioned emails from Devin, and no one answered him.

I can chime in with the ugly work-around that allowed us to migrate from v3 to v5.
Quite unceremoniously, we rsync'd all the data to a new v5 dataset and then
destroyed the existing v3 dataset, only to rebuild the pool from scratch.
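
As a rough illustration of that copy-and-replace approach, the sketch below only prints the commands it would run; it does not touch any pool. The function name, dataset names, and mount paths are all hypothetical placeholders, and it assumes a child dataset (not the pool's root dataset) and that datasets created on the upgraded system default to v5. It is not the exact procedure used above.

```shell
#!/bin/sh
# Sketch only: print (do not execute) the steps for copying a v3 dataset
# into a freshly created dataset and then removing the old one.
# Assumption: new datasets are created at v5 on the upgraded system.
# All names and paths passed in are hypothetical placeholders.
print_migration_plan() {
    old_ds=$1      # existing v3 dataset, e.g. tank/data
    new_ds=$2      # new dataset to create, e.g. tank/data5
    old_mnt=$3     # where the old dataset is mounted
    new_mnt=$4     # where the new dataset should be mounted

    # 1. Create the replacement dataset with an explicit mountpoint.
    echo "zfs create -o mountpoint=$new_mnt $new_ds"
    # 2. Copy the data, preserving attributes (-a) and hard links (-H).
    echo "rsync -aH $old_mnt/ $new_mnt/"
    # 3. Destroy the old dataset only after verifying the copy.
    echo "zfs destroy -r $old_ds"
}

print_migration_plan tank/data tank/data5 /data /data5
```

Review the printed commands and verify the copy before running anything destructive by hand.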

Certainly less than ideal; I'll be very interested in your testing to see if you can find
a way around the issue (which we still think is centered around datasets lacking
a mountpoint).
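
To check a pool for datasets matching that suspicion before upgrading, something like the following read-only filter may help. It is a minimal sketch: the function name is hypothetical, and it assumes tab-separated input of name, version, and mountpoint as produced by `zfs list -H -o name,version,mountpoint -t filesystem`.

```shell
#!/bin/sh
# Sketch only: list filesystems that are below v5 AND have mountpoint=none,
# i.e. the combination suspected above of triggering the hang.
# Reads tab-separated "name<TAB>version<TAB>mountpoint" lines on stdin.
find_suspect_datasets() {
    awk -F '\t' '$2 ~ /^[0-9]+$/ && $2 < 5 && $3 == "none" { print $1 }'
}

# Usage on a live system (read-only; does not modify the pool):
#   zfs list -H -o name,version,mountpoint -t filesystem | find_suspect_datasets
```

Any dataset it prints could be given a mountpoint first, or upgraded last, to test whether the missing mountpoint is really the trigger.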
