Process enters unkillable state and somewhat wedges zfs

John Baldwin jhb at freebsd.org
Sat Sep 27 12:55:32 UTC 2014


On Friday, September 26, 2014 02:47:58 PM Daniel Andersen wrote:
> Okay, we were finally able to collect a trace on this.  I took two, just in
> case.  basically did a tr <pid>, continue, go back into ddb and did another.
> 
> Tracing pid 89483 tid 101168 td 0xfffff8048e301920
> cpustop_handler() at cpustop_handler+0x28/frame 0xfffffe2f395e1ce0
> ipi_nmi_handler() at ipi_nmi_handler+0x3f/frame 0xfffffe2f395e1d00
> trap() at trap+0x42/frame 0xfffffe2f395e1f20
> nmi_calltrap() at nmi_calltrap+0x8/frame 0xfffffe2f395e1f20
> --- trap 0x13, rip = 0xffffffff808aada0, rsp = 0xfffffe2f395e1fe0, rbp = 0xfffffe2ffca2bf60 ---
> __lockmgr_args() at __lockmgr_args+0x30/frame 0xfffffe2ffca2bf60
> vop_stdlock() at vop_stdlock+0x3c/frame 0xfffffe2ffca2bf80
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9d/frame 0xfffffe2ffca2bfb0
> _vn_lock() at _vn_lock+0x43/frame 0xfffffe2ffca2c010
> zfs_lookup() at zfs_lookup+0x3a1/frame 0xfffffe2ffca2c0a0
> zfs_freebsd_lookup() at zfs_freebsd_lookup+0x6d/frame 0xfffffe2ffca2c1e0
> VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x92/frame 0xfffffe2ffca2c210
> vfs_cache_lookup() at vfs_cache_lookup+0xcf/frame 0xfffffe2ffca2c260
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x92/frame 0xfffffe2ffca2c290
> null_lookup() at null_lookup+0x8b/frame 0xfffffe2ffca2c300
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x92/frame 0xfffffe2ffca2c330
> lookup() at lookup+0x58b/frame 0xfffffe2ffca2c3b0
> namei() at namei+0x504/frame 0xfffffe2ffca2c480
> vn_open_cred() at vn_open_cred+0x232/frame 0xfffffe2ffca2c5d0
> vop_stdvptocnp() at vop_stdvptocnp+0x17d/frame 0xfffffe2ffca2c910
> null_vptocnp() at null_vptocnp+0x2b/frame 0xfffffe2ffca2c970
> VOP_VPTOCNP_APV() at VOP_VPTOCNP_APV+0x98/frame 0xfffffe2ffca2c9a0
> vn_vptocnp_locked() at vn_vptocnp_locked+0x113/frame 0xfffffe2ffca2ca20
> vn_fullpath1() at vn_fullpath1+0x1b2/frame 0xfffffe2ffca2ca80
> kern___getcwd() at kern___getcwd+0x115/frame 0xfffffe2ffca2cae0
> amd64_syscall() at amd64_syscall+0x357/frame 0xfffffe2ffca2cbf0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe2ffca2cbf0
> --- syscall (326, FreeBSD ELF64, sys___getcwd), rip = 0x2000cd8cda, rsp = 0x7fffffffd358, rbp = 0x7fffffffd440 ---
> 
> 
> 
> Tracing pid 89483 tid 101168 td 0xfffff8048e301920
> cpustop_handler() at cpustop_handler+0x28/frame 0xfffffe2f395e1ce0
> ipi_nmi_handler() at ipi_nmi_handler+0x3f/frame 0xfffffe2f395e1d00
> trap() at trap+0x42/frame 0xfffffe2f395e1f20
> nmi_calltrap() at nmi_calltrap+0x8/frame 0xfffffe2f395e1f20
> --- trap 0x13, rip = 0xffffffff80e6110e, rsp = 0xfffffe2f395e1fe0, rbp = 0xfffffe2ffca2bfb0 ---
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0xae/frame 0xfffffe2ffca2bfb0
> _vn_lock() at _vn_lock+0x43/frame 0xfffffe2ffca2c010
> zfs_lookup() at zfs_lookup+0x3a1/frame 0xfffffe2ffca2c0a0
> zfs_freebsd_lookup() at zfs_freebsd_lookup+0x6d/frame 0xfffffe2ffca2c1e0
> VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x92/frame 0xfffffe2ffca2c210
> vfs_cache_lookup() at vfs_cache_lookup+0xcf/frame 0xfffffe2ffca2c260
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x92/frame 0xfffffe2ffca2c290
> null_lookup() at null_lookup+0x8b/frame 0xfffffe2ffca2c300
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x92/frame 0xfffffe2ffca2c330
> lookup() at lookup+0x58b/frame 0xfffffe2ffca2c3b0
> namei() at namei+0x504/frame 0xfffffe2ffca2c480
> vn_open_cred() at vn_open_cred+0x232/frame 0xfffffe2ffca2c5d0
> vop_stdvptocnp() at vop_stdvptocnp+0x17d/frame 0xfffffe2ffca2c910
> null_vptocnp() at null_vptocnp+0x2b/frame 0xfffffe2ffca2c970
> VOP_VPTOCNP_APV() at VOP_VPTOCNP_APV+0x98/frame 0xfffffe2ffca2c9a0
> vn_vptocnp_locked() at vn_vptocnp_locked+0x113/frame 0xfffffe2ffca2ca20
> vn_fullpath1() at vn_fullpath1+0x1b2/frame 0xfffffe2ffca2ca80
> kern___getcwd() at kern___getcwd+0x115/frame 0xfffffe2ffca2cae0
> amd64_syscall() at amd64_syscall+0x357/frame 0xfffffe2ffca2cbf0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe2ffca2cbf0
> --- syscall (326, FreeBSD ELF64, sys___getcwd), rip = 0x2000cd8cda, rsp = 0x7fffffffd358, rbp = 0x7fffffffd440 ---

Hmm, if you are still in this state, can you ktrace the running process and
see whether it is logging any syscall events?  (I just want to make sure it is
stuck in the kernel rather than constantly making system calls due to a loop
in userland.)
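
Something like this should show it either way (pid 89483 taken from your
traces above; the sleep length is arbitrary, adjust as needed):

# ktrace -p 89483        (attach syscall tracing to the running process)
# sleep 30
# ktrace -C              (turn tracing off again)
# kdump | tail -50

If kdump shows no events, the thread never came back out to userland and
really is stuck in the kernel.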

Also, do you have a nullfs mount of a zfs path?  (It looks that way from your
stack trace).  I did find this thread:

https://www.mail-archive.com/freebsd-current@freebsd.org/msg147678.html
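
A quick way to confirm the nullfs-over-ZFS layering (just a sanity check on
my part) is to list the relevant mounts and see whether a nullfs mount's
lower (source) directory lives on a ZFS dataset:

# mount | egrep 'nullfs|zfs'

That layering would match the null_lookup() -> zfs_lookup() sequence in your
trace.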

Andriy (cc'd) probably has better insight into this than I.

> On 08/29/2014 11:24 AM, John Baldwin wrote:
> > On Wednesday, August 27, 2014 02:24:47 PM Daniel Andersen wrote:
> >> On 08/27/2014 01:39 PM, John Baldwin wrote:
> >>> These are all blocked in "zfs" then.  (For future reference, the 'mwchan'
> >>> field that you see as 'STATE' in top or via 'ps O mwchan' is more detailed
> >>> than the 'D' state.)
> >>> 
> >>> To diagnose this further, you would need to see which thread holds the
> >>> ZFS vnode lock these threads need.  I have some gdb scripts you can use to
> >>> do that at www.freebsd.org/~jhb/gdb/.  You would want to download 'gdb6*'
> >>> files from there and then do this as root:
> >>> 
> >>> # cd /path/to/gdb/files
> >>> # kgdb
> >>> (kgdb) source gdb6
> >>> (kgdb) sleepchain 42335
> >>> 
> >>> Where '42335' is the pid of some process stuck in "zfs".
> >> 
> >> I will keep this in mind the next time the machine wedges.  Another data
> >> point: the second procstat output I sent was the most recent.  All the
> >> processes listed there were after the fact.  The process that started the
> >> entire problem (this time) was sudo, and it only has this one entry in
> >> procstat:
> >> 
> >> 38003 102797 sudo             -                <running>
> >> 
> >> Of note, this does not appear to be blocked on zfs in any way.  'ps' showed
> >> it in 'R' state instead of 'D' (I will be sure to use mwchan in the
> >> future.)  It appeared to be pegging an entire CPU core at 100% usage, as
> >> well, and was only single-threaded.
> > 
> > Well, if it is spinning in some sort of loop in the kernel while holding a
> > ZFS vnode lock, that could be blocking all the other threads.  In that case,
> > you don't need to do what I asked for above.  Instead, we need to find out
> > what that thread is doing.  There are two ways of doing this.  One is to
> > force a panic via 'sysctl debug.kdb.panic=1' and then use kgdb on the
> > crashdump to determine what the running thread is doing.  Another option
> > is to break into the DDB debugger on the console (note that you will need
> > to build a custom kernel with DDB if you are on stable) and request a
> > stack trace of the running process via 'tr <pid>'.  Ideally you can do this
> > over a serial console so you can just cut and paste the output of the trace
> > into a mail.  Over a video console you can either transcribe it by hand or
> > take photos.
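
For reference, the crashdump route I mentioned above looks roughly like this
(assuming dumpdev is configured and savecore(8) writes the dump to /var/crash;
the vmcore number will vary):

# sysctl debug.kdb.panic=1
  (the machine panics, dumps core, and reboots; savecore(8) saves the dump)
# kgdb /boot/kernel/kernel /var/crash/vmcore.0
(kgdb) info threads
(kgdb) thread <n>        (select the thread that was spinning)
(kgdb) bt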

-- 
John Baldwin

