Re: ZFS operations hanging, but no visible errors?

From: Walter Cramer <wfc_at_mintsol.com>
Date: Fri, 05 Nov 2021 15:59:26 UTC

On Fri, 5 Nov 2021, Chris Ross wrote:

>
>
> On Nov 5, 2021, at 11:40, Chris Ross <cross+freebsd@distal.com> wrote:
>
> Hey there.  I have a server running FreeBSD 13.0-RELEASE, with a large
> ZFS zpool.  I have a UFS mirror on hardware raid, then a bunch of JBOD
> disks in a pool.  I recently added a new vdev to this pool, which may or
> may not be related.
>
> Today, I started an rsync of a large (100GB) file from the pool to
> another host.  After a while (7%), it seemed no progress was being made.
> I tried to kill the rsync, which didn’t exit, or suspend.  Now anything
> that touches the pool seems to hang.  But, the system is otherwise
> functional, console shows no issues, the controller (via out-of-band
> management interface) shows all disks as having no errors or issues.
>
> Any idea what I should be looking for, and if there’s any way to recover
> it without reboot?  Another operation I had running was in the middle of
> writing data I’d rather not lose progress on, which is why I haven’t
> rebooted yet.
>
>              - Chris
>
> Ps, apologies for the rapid secondary email, but I forgot to mention
> something important.  A “zpool status tank” is also hanging.  So, it’s
> not just FS operations that are stuck.
>

Any chance that you have output from an earlier `zpool status`, or even
`zpool list`, to give us a better sense of your situation?

Might `dmesg -a` show any relevant errors?

(FWIW, I usually have a daily cron job log a few things like `zpool
status` to /var/backups/zfs, to help with situations like this.)
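
For anyone wanting to set up something similar, here is a minimal sketch
of such a logging job.  It is not my actual script: the function name
`log_zfs_state`, the dated file naming, and the choice to also record
`zpool list` are assumptions for illustration.  Only the /var/backups/zfs
directory comes from the note above.

```shell
#!/bin/sh
# Hypothetical helper: append the current pool state to a dated log file.
# The default directory matches the note above; the file name pattern
# (zpool-YYYYMMDD.log) is an assumption.
log_zfs_state() {
    dir="${1:-/var/backups/zfs}"
    mkdir -p "$dir" || return 1
    out="$dir/zpool-$(date +%Y%m%d).log"
    {
        echo "=== $(date) ==="
        echo "--- zpool status ---"
        zpool status
        echo "--- zpool list ---"
        zpool list
    } >> "$out" 2>&1
}
```

A script file would end with a call to `log_zfs_state`, and could be run
from root's crontab with a line such as
`0 3 * * * /bin/sh /root/bin/zfs-state-log.sh` (the path is
hypothetical).  Note that if the pool is already wedged, `zpool status`
itself may hang, so the value is in the logs written before the problem
started.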
