I have a DDB session open to a crashed ZFS server

Andriy Gapon avg at FreeBSD.org
Tue Oct 16 16:29:42 UTC 2012


on 16/10/2012 19:15 John Baldwin said the following:
> On Tuesday, October 16, 2012 11:16:37 am Dennis Glatting wrote:
>> On Tue, 2012-10-16 at 08:44 -0400, John Baldwin wrote:
>>> On Monday, October 15, 2012 12:03:39 pm Dennis Glatting wrote:
>>>> FreeBSD/amd64 (mc) (ttyu0)
>>>>
>>>> login: NMI ... going to debugger
>>>> [ thread pid 11 tid 100003 ]
>>>
>>> You got an NMI, not a crash.  What happens if you just continue ('c' command) 
>>> from DDB?
>>>
>>
>> I hit the NMI button because of the "crash," which is a misword, to get
>> into DDB. 
> 
> Ah, I would suggest "hung" or "deadlocked" next time.  It certainly seems like
> a deadlock since all CPUs are idle.  Some helpful commands here might be
> 'show sleepchain' and 'show lockchain'.
> 
> Pick a "stuck" process (like find) and run:
> 
> 'show sleepchain <pid>'
> 
> In your case though it seems both 'find' and the various 'pbzip2' threads
> are stuck on a condition variable, so there isn't an easy way to identify
> an "owner" that is supposed to awaken these threads.  It could be a case
> of a missed wakeup perhaps, but you'll need to get someone more familiar
> with ZFS to identify where these codes should be awakened normally.
> 

I would also reiterate a suggestion that I made to Nikolay earlier:
http://article.gmane.org/gmane.os.freebsd.devel.file-systems/15981

BTW, in that case it turned out to be a genuine deadlock in the ZFS ARC's
handling of low-memory conditions.
'procstat -kk -a' is a great help for analyzing such situations, since it
prints the kernel stack of every thread in the system.
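
If the machine is still responsive enough to save that output to a file, a
small script can group threads by identical kernel stack, which makes it
easier to see what the stuck threads have in common.  The following is only
a rough sketch: the sample text is invented for illustration (it is not
output from Dennis's machine), and the helper name group_by_stack is
hypothetical.  It assumes every stack frame is printed as "function+0xoffset",
as -kk does.

```python
from collections import defaultdict

# Invented sample in the general shape of `procstat -kk -a` output.
# The PIDs, TIDs and offsets are made up for illustration only.
SAMPLE = """\
  PID    TID COMM             TDNAME           KSTACK
 1234 100101 find             -                mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x121 fork_trampoline+0xe
 1300 100202 pbzip2           -                mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x121 fork_trampoline+0xe
 1300 100203 pbzip2           -                mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x121 fork_trampoline+0xe
   11 100003 idle             idle: cpu0       mi_switch+0x186 sched_idletd+0x1a fork_exit+0x11 fork_trampoline+0xe
"""


def group_by_stack(procstat_output):
    """Map each distinct kernel stack to the threads that share it."""
    groups = defaultdict(list)
    for line in procstat_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row, whose first field is "PID".
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        pid, tid, comm = int(fields[0]), int(fields[1]), fields[2]
        # Thread names may contain spaces, so pick out the stack frames by
        # their "+0x" offset suffix rather than by column position.
        frames = tuple(f.split("+")[0] for f in fields[3:] if "+0x" in f)
        groups[frames].append((pid, tid, comm))
    return groups


groups = group_by_stack(SAMPLE)

# Print the most common stacks first; many threads parked in the same
# place is usually where to start looking.
for stack, threads in sorted(groups.items(), key=lambda kv: -len(kv[1])):
    names = ", ".join(f"{comm}({tid})" for _pid, tid, comm in threads)
    print(f"{len(threads)} thread(s): {names}")
    print("    " + " <- ".join(stack))
```

To use it on real data, replace SAMPLE with the contents of a file written by
'procstat -kk -a > stacks.txt'.  Grouping does not find the culprit by itself,
but it narrows down which stacks are worth reading closely.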

-- 
Andriy Gapon

