ZFS Hangs

Adam McDougall mcdouga9 at egr.msu.edu
Mon Nov 5 09:05:21 PST 2007

On Mon, Nov 05, 2007 at 10:24:14AM +0100, Kris Kennaway wrote:

> Thomas Sparrevohn wrote:
>> On Sunday 04 November 2007 15:00:50 Kris Kennaway wrote:
>>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
>> Oh my god - Overlooked that ;-) - funny that - It's a bit tricky as it
>> is not possible to dump a kernel when the swap is on ZFS - I did a test
>> with all debugging enabled and the problem did not show up - which makes
>> it somewhat nasty - I will check if I can reproduce it with only DDB
>> enabled
> You can still hook up a serial console, or at the very least take
> photographs of the screen with the relevant DDB information.  Or add
> another disk and dump on that.
I have some screenshots of ps in ddb from one of several zfs hangs I've had
on one amd64 system:


I didn't post every single screenful since I don't have a microSD reader handy,
and emailing the pictures off my phone is painful.  If I missed a screenshot of
one or more particular processes that might have a telling state, let me know.

I also have a gzipped kernel + dump from a forced panic when it was in this
state.  If a developer is interested in it, please let me know so I can post
it somewhere private, since the system is in NIS and likely has tables cached
in memory.

It is running a kernel from Oct 17.  I tried a kernel with WITNESS, INVARIANTS,
etc., but it hung the same way without any panic.  I completed a zpool scrub
this morning with no errors.  Lately zfs seems to wedge up every single night
when the rsyncs from remote servers run.  This is the only amd64 system I have
zfs on; the other two are i386, and the only problems on those systems have
been kmem panics, which so far have been avoidable.

I can help by checking somewhat specific things and running prescribed tests,
but right now I don't have time to tackle this problem on this system and learn
how to debug it entirely on my own starting with nothing more than a DDB guide
from the handbook.  It's not that I refuse to; I recognize it's difficult to
join remote skill with local hands for something this technical.

On Friday I replaced the motherboard/cpu just as a shot in the dark (since the
system had some strange instability in the past), but this didn't help zfs
(not surprised).  When zfs was hung Saturday morning, I tried to reboot it,
but reboot would not even get far enough to stop new ssh connections.
