UPDATE: Server hanged on VFS lock problem

Thu Mar 29 19:20:45 UTC 2007

Andrea Venturoli wrote:

> Is there a way I can get these dumps automatically, without entering 
> DDB, since this is an unattended server?

I still don't know whether it's possible to get a dump and then keep 
running... I don't think so, actually...
Anyway, I found that setting debug.vfs_badlock_ddb=0 should allow this 
unattended box to continue working.
Now I just wonder what would happen if it did...
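
For reference, this is a minimal sketch of how that setting could be made 
persistent. It assumes a kernel built with DEBUG_VFS_LOCKS (as far as I 
can tell, that option is what provides this sysctl); I have not yet 
verified how the box behaves with it.

```
# /etc/sysctl.conf fragment (assumption: kernel has DEBUG_VFS_LOCKS)
# Do not drop into DDB when a vnode lock assertion fails.
debug.vfs_badlock_ddb=0
```

The same value can be applied to the running system with 
"sysctl debug.vfs_badlock_ddb=0"; the file entry only makes it survive a 
reboot.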

Furthermore, I got another dump like this, and in both cases I came to 
the conclusion that, on the userland side, cyrus-imapd is receiving a 
message which it has to forward to another host. This is probably 
irrelevant, but isn't it quite strange that, on a busy 
mailserver/fileserver/a-lot-of-other-things, both dumps come from 
exactly the same cronjob (logcheck, btw) sending a mail to the same 
address?

This is the backtrace (which I forgot in the original message):

> (kgdb) bt
> #0  doadump () at pcpu.h:172
> #1  0xffffffff80245a29 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2  0xffffffff802454bb in panic (fmt=0xffffffff803c5a09 "from debugger")
>     at /usr/src/sys/kern/kern_shutdown.c:565
> #3  0xffffffff8017bb12 in db_panic (addr=0, have_addr=0, count=0, modif=0x0)
>     at /usr/src/sys/ddb/db_command.c:438
> #4  0xffffffff8017c055 in db_command_loop () at /usr/src/sys/ddb/db_command.c:350
> #5  0xffffffff8017df4d in db_trap (type=-1471015248, code=0)
>     at /usr/src/sys/ddb/db_main.c:222
> #6  0xffffffff80262089 in kdb_trap (type=3, code=0, tf=0xffffffffa85217b0)
>     at /usr/src/sys/kern/subr_kdb.c:473
> #7  0xffffffff80384c84 in trap (frame=
>       {tf_rdi = 0, tf_rsi = -2139025408, tf_rdx = 1, tf_rcx = 1123776, tf_r8 = 1048064, tf_r9 = 10, tf_rax = 27, tf_rbx = -1099401716568, tf_rbp = -1471014800, tf_r10 = -1471015040, tf_r11 = 4294967255, tf_r12 = -2143248681, tf_r13 = 0, tf_r14 = 0, tf_r15 = -1471014064, tf_trapno = 3, tf_addr = 0, tf_flags = -1099401716568, tf_err = 0, tf_rip = -2144986273, tf_cs = 8, tf_rflags = 642, tf_rsp = -1471014800, tf_ss = 16})
>     at /usr/src/sys/amd64/amd64/trap.c:442
> #8  0xffffffff803709db in calltrap () at /usr/src/sys/amd64/amd64/exception.S:168
> #9  0xffffffff80261b5f in kdb_enter (msg=0x0) at cpufunc.h:63
> #10 0xffffffff802adb4d in assert_vop_elocked (vp=0xffffff00068d1ca8,
>     str=0xffffffff80409ed7 "VOP_WRITE") at /usr/src/sys/kern/vfs_subr.c:3436
> #11 0xffffffff803b3eae in VOP_WRITE_APV (vop=0x0, a=0xffffffffa8521a10)
>     at vnode_if.c:709
> #12 0xffffffff802b935c in vn_write (fp=0xffffff00130ecca8, uio=0xffffffffa8521b50,
>     active_cred=0x1, flags=0, td=0xffffff0023565000) at vnode_if.h:372
> #13 0xffffffff80271b37 in dofilewrite (td=0xffffff0023565000, fd=22,
>     fp=0xffffff00130ecca8, auio=0xffffffffa8521b50, offset=1048064, flags=0)
>     at file.h:252
> #14 0xffffffff80271e01 in kern_writev (td=0xffffff0023565000, fd=22,
>     auio=0xffffffffa8521b50) at /usr/src/sys/kern/sys_generic.c:402
> #15 0xffffffff80271efa in write (td=0x0, uap=0xffffffff80811000)
>     at /usr/src/sys/kern/sys_generic.c:326
> #16 0xffffffff803854a1 in syscall (frame=
>       {tf_rdi = 22, tf_rsi = 34429279984, tf_rdx = 1208, tf_rcx = 6557696, tf_r8 = -2143273848, tf_r9 = 140737488336808, tf_rax = 4, tf_rbx = 1208, tf_rbp = 34429279984, tf_r10 = 1, tf_r11 = 642, tf_r12 = 0, tf_r13 = 22, tf_r14 = 312, tf_r15 = 0, tf_trapno = 12, tf_addr = 6652216, tf_flags = 34384627961, tf_err = 2, tf_rip = 34384825260, tf_cs = 43, tf_rflags = 518, tf_rsp = 140737488336808, tf_ss = 35})
>     at /usr/src/sys/amd64/amd64/trap.c:792
> #17 0xffffffff80370b78 in Xfast_syscall ()
>     at /usr/src/sys/amd64/amd64/exception.S:270
> #18 0x00000008017ecbac in ?? ()

I'd still appreciate it if someone with more insight than me could 
comment on this.

  bye & Thanks
	av.