kern/177536: [zfs] zfs livelock (deadlock) with high write-to-disk load
Martin Birgmeier
Martin.Birgmeier at aon.at
Sat Apr 27 09:50:04 UTC 2013
The following reply was made to PR kern/177536; it has been noted by GNATS.
From: Martin Birgmeier <Martin.Birgmeier at aon.at>
To: bug-followup at FreeBSD.org, Andriy Gapon <avg at FreeBSD.org>
Cc:
Subject: Re: kern/177536: [zfs] zfs livelock (deadlock) with high write-to-disk load
Date: Sat, 27 Apr 2013 11:40:16 +0200
So it happened again, on the same system (FreeBSD 9.1-RELEASE), except that
the kernel has been recompiled with options DDB, KDB, and STACK.
I ran "procstat -kk -a" twice. The output can be found at
http://members.aon.at/xyzzy/procstat.-kk.-a.1.gz and
http://members.aon.at/xyzzy/procstat.-kk.-a.2.gz, respectively. I also
started kgdb under script(1) and executed "thread apply all bt" in it. That
output can be found at http://members.aon.at/xyzzy/kgdb.thread.apply.all.bt.gz.
More info on the "test case":
- As described in the initial report, / is a UFS GPT partition on one of
6 SATA disks. A zpool "hal.1" occupies one (other) GPT partition on each
of these disks.
- VirtualBox is run by a user whose home dir is on one of the zfs file
systems.
- First, a big write load to another zfs file system of the same zpool
was started (160 GB copy from a remote machine).
- Then, 3 VBoxHeadless instances were started.
==> livelock on zfs
- procstat run twice, then script + kgdb
- copied output to another machine
- shut down the hung machine (via "shutdown -p")
==> "some processes would not die"
==> "syncing disks" runs until it shows all zeros; the system then just
sits there with continuous disk activity (obviously from zfs), and shutdown
does not proceed further
- hard reset
- on reboot: UFS file system check (no errors), ZFS starts fine and
seems mostly unaffected (except of course that the 160 GB copy is truncated)
An analysis would be appreciated, as would a hint on whether I should
switch to stable/9 instead.
Regards,
Martin