ZFS question

Jeremy Chadwick jdc at koitsu.org
Thu Mar 21 04:45:58 UTC 2013


(Please keep me CC'd as I'm not subscribed to -questions)


Lots to say about this.

1. freebsd-fs is the proper list for filesystem-oriented questions of
this sort, especially for ZFS.

2. The issue you've described is experienced by some and **not**
experienced by at least as many others, so please keep that in mind.
Each person's situation and environment has to be treated as unique.

3. You haven't provided any useful details, even in your follow-up post
here:

http://lists.freebsd.org/pipermail/freebsd-questions/2013-March/249958.html

All you've provided is a "general overview" with no technical details,
no actual data.  You need to provide that data verbatim.  You need to
provide:

- Contents of /boot/loader.conf
- Contents of /etc/sysctl.conf
- Output from "zpool status"
- Output from "zpool get all"
- Output from "zfs get all"
- Output from "dmesg" (probably the most important)
- Output from "sysctl vfs.zfs kstat.zfs"
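
The items above can be gathered in one pass.  What follows is only a
sketch, not an official tool: the output path /tmp/zfs-report.txt and
the helper names show_file and show_cmd are arbitrary choices made for
illustration.  A command that fails or is missing is noted in the
report rather than stopping the run.

```shell
#!/bin/sh
# Gather the requested diagnostics into one file for posting verbatim.
# The default output path is an assumption; pass another as $1.
out="${1:-/tmp/zfs-report.txt}"

# Print a header, then the file's contents (or a note if it is absent).
show_file() {
    printf '\n===== %s =====\n' "$1"
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo "(not present or unreadable)"
    fi
}

# Print a header, then the command's output, stderr included.
show_cmd() {
    printf '\n===== %s =====\n' "$*"
    "$@" 2>&1 || echo "(command failed or unavailable)"
}

{
    show_file /boot/loader.conf
    show_file /etc/sysctl.conf
    show_cmd zpool status
    show_cmd zpool get all
    show_cmd zfs get all
    show_cmd dmesg
    show_cmd sysctl vfs.zfs kstat.zfs
} > "$out"

echo "wrote $out"
```

Run it as root so dmesg and the sysctl trees are fully readable, then
attach or paste the resulting file unedited.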

I particularly tend to assist with disk-level problems, so if this turns
out to be a disk-level issue (and NOT a controller or controller driver
issue), I can help quite a bit with that.

4. I would **not** suggest rolling back to 9.0.  That recommendation
solves nothing -- if there is truly a bug/livelock issue, then that
needs to be tracked down.  By rolling back, if there is an issue, you're
effectively ensuring it'll never get investigated or fixed, which means
you can probably expect to see this in 9.2, 9.3, or even 10.x onward.

If you can't deal with the instability, or don't have the
time/cycles/interest to help track it down, that's perfectly okay too:
my recommendation is to go back to UFS (there's no shame in that).

Else, as always, I strongly recommend running stable/9 (keep reading).

5. stable/9 (a.k.a. FreeBSD 9.1-STABLE) just recently (~5 days ago)
MFC'd an Illumos ZFS feature solely to help debug/troubleshoot this
exact type of situation: introduction of the ZFS deadman thread.
Reference materials for what that is:

http://svnweb.freebsd.org/base?view=revision&revision=248369
http://svnweb.freebsd.org/base?view=revision&revision=247265
https://www.illumos.org/issues/3246

The purpose of this feature (enabled by default) is to induce a kernel
panic when ZFS I/O stalls/hangs for unexpectedly long periods of time
(configurable via vfs.zfs.deadman_synctime).

Once the panic happens, and assuming your system is configured with a
slice dedicated to swap (ZFS-backed swap = bad bad bad) and has
dumpdev="auto" in rc.conf, the system should extract the crash dump from
swap on reboot and save it into /var/crash.  At that point kernel
developers on the -fs list can tell you *exactly* what to do with
kgdb(1) to shed some light on what happened and where the issue may
lie.
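
As a rough sketch of the crash dump setup assumed here, the relevant
/etc/rc.conf lines would look something like the following.  The
dumpdir line only restates the usual default and can be left out, and
the vmcore.0 file name in the comment is an assumption about what the
first saved dump will be called.

```shell
# /etc/rc.conf fragment: use the first suitable swap device for dumps.
dumpdev="AUTO"
# Directory the dump is saved into on the next boot.
dumpdir="/var/crash"

# After a panic and reboot, the saved dump can be opened with:
#   kgdb /boot/kernel/kernel /var/crash/vmcore.0
# The deadman timeout currently in effect can be read with:
#   sysctl vfs.zfs.deadman_synctime
```

Swap needs to be at least large enough to hold the dump, otherwise
nothing will be saved.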

All that's assuming that the issue truly is ZFS waiting for I/O and not
something else (like ZFS internally spinning hard in its own code).

Good luck, and let us know how you want to proceed.

-- 
| Jeremy Chadwick                                   jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


