How to help debugging of lock-up
Doug White
dwhite at gumbysoft.com
Wed Jun 15 02:48:27 GMT 2005
On Tue, 14 Jun 2005, Jun Kuriyama wrote:
>
> I'm running current (minus ssouhlal@'s recent commit).
>
> This kernel usually locks up when the daily backup begins (invoked by the
> amanda server), but sometimes it locks up in other random situations.
>
> When it locks up, there is no ping response, but I can enter the debugger
> from the serial console.
>
> I'm not sure which process I should suspect. Is there something I can
> provide to help debug this?
The trace looks normal for something network- and disk-bound. Perhaps your
NIC's overloaded or hung? Where is the amanda backup going -- back to the
same system?
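
One rough way to see the network/disk split in a 'ps' listing like the one
quoted below is to group the sleeping processes by wait message (wmesg):
"biord" is a wait on a buffer read from disk, and "sbwait" is a wait on a
socket buffer. The following is only a sketch. The function name
group_by_wmesg and the regular expression are made up for illustration and
assume the exact column layout shown in this message; other DDB versions
print different columns.

```python
import re
from collections import defaultdict

# Matches one sleeping-process row of the 'ps' listing in this thread, e.g.
#   2488 c1648800 91 2484 2483 0004000 [SLPQ biord 0xc3a27da8][SLP] dump
# A leading "> " from mail quoting is tolerated. Rows without a sleep
# queue entry (such as "[IWAIT]" interrupt threads) are skipped on purpose.
ROW = re.compile(
    r"^\s*(?:>\s*)?(?P<pid>\d+)\s+[0-9a-f]+\s+\d+\s+\d+\s+\d+\s+[0-9a-f]+\s+"
    r"\[SLPQ\s+(?P<wmesg>\S+)\s+0x[0-9a-f]+\]\[\w+\]\s+(?P<cmd>.+?)\s*$"
)


def group_by_wmesg(ps_text):
    """Return {wmesg: [(pid, cmd), ...]} for every sleeping process."""
    groups = defaultdict(list)
    for line in ps_text.splitlines():
        m = ROW.match(line)
        if m:
            groups[m.group("wmesg")].append((int(m.group("pid")), m.group("cmd")))
    return dict(groups)


# A few rows copied from the listing in this thread.
SAMPLE = """\
> 2488 c1648800 91 2484 2483 0004000 [SLPQ biord 0xc3a27da8][SLP] dump
> 2451 c1b11000 91 2445 2440 0000000 [SLPQ sbwait 0xc186b9b0][SLP] dump
> 2443 c1645e00 91 2441 2440 0004000 [SLPQ sbwait 0xc1a7e638][SLP] gzip
> 362 c1648200 0 1 362 0000000 [SLPQ biord 0xc3a14508][SLP] syslogd
> 50 c133ee00 0 0 0 0000204 [IWAIT] swi0: sio
"""

if __name__ == "__main__":
    for wmesg, procs in sorted(group_by_wmesg(SAMPLE).items()):
        print(wmesg, procs)
```

Processes that share a wmesg are good candidates for 'trace <pid>', as was
done for pids 362 and 2451 in the quoted session.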
>
> Currently, compiled with INVARIANTS, INVARIANT_SUPPORT, WITNESS,
> WITNESS_SKIPSPIN, DEBUG_VFS_LOCKS and debug.mpsafevfs="0".
>
>
> -----
> KDB: enter: Break sequence on console
> [thread pid 12 tid 100005 ]
> Stopped at kdb_enter+0x2b: nop
> db> ps
> pid proc uid ppid pgrp flag stat wmesg wchan cmd
> 2491 c186d400 91 2489 2483 0004000 [SLPQ piperd 0xc16cba80][SLP] sed
> 2490 c1802e00 91 2489 2483 0004000 [SLPQ piperd 0xc183e000][SLP] restore
> 2489 c1648a00 91 2487 2483 0004000 [SLPQ wait 0xc1648a00][SLP] sh
> 2488 c1648800 91 2484 2483 0004000 [SLPQ biord 0xc3a27da8][SLP] dump
> 2487 c15e4a00 91 2484 2483 0000000 [SLPQ piperd 0xc16f1d80][SLP] sendbackup
> 2486 c1871c00 91 2484 2483 0004000 [SLPQ piperd 0xc16ca000][SLP] gzip
> 2484 c1b12000 91 1 2483 0004000 [SLPQ piperd 0xc16f1c00][SLP] sendbackup
> 2454 c1802a00 91 2451 2440 0000000 [SLPQ pause 0xc1802a34][SLP] dump
> 2453 c1b11c00 91 2451 2440 0000000 [SLPQ pipdwt 0xc16cbc00][SLP] dump
> 2452 c1871400 91 2451 2440 0000000 [SLPQ pause 0xc1871434][SLP] dump
> 2451 c1b11000 91 2445 2440 0000000 [SLPQ sbwait 0xc186b9b0][SLP] dump
> 2445 c1b12200 91 2441 2440 0004000 [SLPQ wait 0xc1b12200][SLP] dump
> 2444 c15e4200 91 2441 2440 0000000 [SLPQ pipewr 0xc16f1900][SLP] sendbackup
> 2443 c1645e00 91 2441 2440 0004000 [SLPQ sbwait 0xc1a7e638][SLP] gzip
> 2441 c1b11400 91 1 2440 0004000 [SLPQ piperd 0xc16cb600][SLP] sendbackup
> 1033 c186de00 1021 1032 1033 0004002 [SLPQ ttyin 0xc15b9410][SLP] zsh
> 1032 c186da00 1021 1030 1030 0000100 [SLPQ select 0xc075cda4][SLP] sshd
> 1030 c1800200 0 583 1030 0004100 [SLPQ sbwait 0xc186c480][SLP] sshd
> 717 c1648e00 0 1 717 0004002 [SLPQ ttyin 0xc1541010][SLP] getty
> 716 c1800a00 0 1 716 0004002 [SLPQ ttyin 0xc153cc10][SLP] getty
> 715 c1871a00 0 1 715 0004002 [SLPQ ttyin 0xc155a810][SLP] getty
> 714 c1800600 0 1 714 0004002 [SLPQ ttyin 0xc153b010][SLP] getty
> 700 c1800000 0 1 700 0000000 [SLPQ select 0xc075cda4][SLP] inetd
> 679 c1871600 0 1 679 0000000 [SLPQ select 0xc075cda4][SLP] moused
> 656 c1871e00 0 1 655 0000000 [SLPQ select 0xc075cda4][SLP] snmpd
> 637 c1800400 0 1 637 0000000 [SLPQ select 0xc075cda4][SLP] pptpd
> 607 c1645800 0 1 607 0000000 [SLPQ nanslp 0xc070faac][SLP] cron
> 595 c1645600 25 1 595 0000100 [SLPQ pause 0xc1645634][SLP] sendmail
> 589 c186d200 0 1 589 0000100 [SLPQ select 0xc075cda4][SLP] sendmail
> 583 c15e4600 0 1 583 0000100 [SLPQ select 0xc075cda4][SLP] sshd
> 565 c186d000 0 1 565 0000000 [SLPQ select 0xc075cda4][SLP] ntpd
> 511 c1648000 0 1 511 0000000 [SLPQ select 0xc075cda4][SLP] usbd
> 492 c15e4400 0 490 490 0000100 [SLPQ nfslockd 0xc07654c8][SLP] rpc.lockd
> 490 c15e4800 0 1 490 0000000 [SLPQ select 0xc075cda4][SLP] rpc.lockd
> 485 c15e5c00 0 1 485 0000000 [SLPQ select 0xc075cda4][SLP] rpc.statd
> 479 c1648600 0 478 478 0000000 [SLPQ - 0xc15cb800][SLP] nfsd
> 478 c1645a00 0 1 478 0000000 [SLPQ select 0xc075cda4][SLP] nfsd
> 476 c1645400 0 1 476 0000000 [SLPQ select 0xc075cda4][SLP] mountd
> 398 c1648400 0 1 398 0000000 [SLPQ select 0xc075cda4][SLP] ypbind
> 395 c1645200 0 1 395 0000000 [SLPQ select 0xc075cda4][SLP] rpcbind
> 362 c1648200 0 1 362 0000000 [SLPQ biord 0xc3a14508][SLP] syslogd
> 325 c1645000 0 1 325 0000000 [SLPQ select 0xc075cda4][SLP] devd
> 218 c15e4000 0 1 218 0000000 [SLPQ pause 0xc15e4034][SLP] adjkerntz
> 62 c15e4c00 0 0 0 0000204 [SLPQ - 0xc83cfd04][SLP] schedcpu
> 61 c15e4e00 0 0 0 0000204 [SLPQ - 0xc076526c][SLP] nfsiod 3
> 60 c15e5000 0 0 0 0000204 [SLPQ - 0xc0765268][SLP] nfsiod 2
> 59 c15e5200 0 0 0 0000204 [SLPQ - 0xc0765264][SLP] nfsiod 1
> 58 c15e5400 0 0 0 0000204 [SLPQ - 0xc0765260][SLP] nfsiod 0
> 57 c15e5600 0 0 0 0000204 [SLPQ syncer 0xc070f81c][SLP] syncer
> 56 c15e5800 0 0 0 0000204 [SLPQ vlruwt 0xc15e5800][SLP] vnlru
> 55 c133e400 0 0 0 0000204 [SLPQ psleep 0xc075d2ec][SLP] bufdaemon
> 54 c133e600 0 0 0 000020c [SLPQ pgzero 0xc076b704][SLP] pagezero
> 53 c133e800 0 0 0 0000204 [SLPQ psleep 0xc076b254][SLP] vmdaemon
> 52 c133ea00 0 0 0 0000204 [SLPQ psleep 0xc076b210][SLP] pagedaemon
> 51 c133ec00 0 0 0 0000204 [SLPQ m:w2 0xc15b5d00][SLP] g_mirror data
> 50 c133ee00 0 0 0 0000204 [IWAIT] swi0: sio
> 49 c13ad000 0 0 0 0000204 [SLPQ - 0xc14c5a3c][SLP] fdc0
> 48 c13ad200 0 0 0 0000204 [SLPQ tzpoll 0xc0882694][SLP] acpi_thermal
> 47 c13ad400 0 0 0 0000204 [SLPQ usbevt 0xc1525210][SLP] usb2
> db> trace 362
> Tracing pid 362 tid 100081 td 0xc15e7c00
> sched_switch(c15e7c00,0,1) at sched_switch+0x177
> mi_switch(1,0) at mi_switch+0x270
> sleepq_switch(c3a14508,cc6b99dc,c050da15,c3a14508,0) at sleepq_switch+0xe0
> sleepq_wait(c3a14508,0,0,c06aec19,e52) at sleepq_wait+0x30
> msleep(c3a14508,c075d3c0,4c,c06af341,0) at msleep+0x311
> bwait(c3a14508,4c,c06af341) at bwait+0x47
> bufwait(c3a14508,1,0,0,c16b2000) at bufwait+0x1a
> breadn(c1adedd0,0,0,800,0) at breadn+0x266
> bread(c1adedd0,0,0,800,0) at bread+0x20
> ffs_balloc_ufs2(c1adedd0,4f,0,57,c12f5a00) at ffs_balloc_ufs2+0xcbf
> ffs_write(cc6b9c40,c1a4a510,c1adedd0,cc6b9c8c,c0565fd6) at ffs_write+0x2b4
> VOP_WRITE_APV(c06f7600,cc6b9c40) at VOP_WRITE_APV+0x9b
> vn_write(c1a4a510,c1bac100,c12f5a00,0,c15e7c00) at vn_write+0x1ea
> kern_writev(c15e7c00,8,c1bac100,c1bac100,0) at kern_writev+0x8e
> writev(c15e7c00,cc6b9d04,3,39,292) at writev+0x30
> syscall(3b,3b,3b,8054cde,bfbfde70) at syscall+0x22f
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (121, FreeBSD ELF32, writev), eip = 0x280c9563, esp = 0xbfbfd8ac, ebp = 0xbfbfde98 ---
> db> trace 2451
> Tracing pid 2451 tid 100120 td 0xc1803600
> sched_switch(c1803600,0,1) at sched_switch+0x177
> mi_switch(1,0) at mi_switch+0x270
> sleepq_switch(c186b9b0,0,ccb6dbb4,c050da06,c186b9b0) at sleepq_switch+0xe0
> sleepq_wait_sig(c186b9b0,0,100,c06adec5,3f1) at sleepq_wait_sig+0xc
> msleep(c186b9b0,c186b97c,158,c06ae15c,0) at msleep+0x302
> sbwait(c186b964,c070f180,2,4,0) at sbwait+0x4b
> soreceive(c186b914,0,ccb6dc78,0,0) at soreceive+0x2da
> soo_read(c1a4a318,ccb6dc78,c1b01380,0,c1803600) at soo_read+0x41
> dofileread(c1803600,c1a4a318,12,bfbee028,4) at dofileread+0xad
> read(c1803600,ccb6dd04,3,449,292) at read+0x3b
> syscall(3b,3b,bfbe003b,12,bfbee028) at syscall+0x22f
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (3, FreeBSD ELF32, read), eip = 0x280c1e83, esp = 0xbfbedfdc, ebp = 0xbfbee008 ---
>
>
>
--
Doug White | FreeBSD: The Power to Serve
dwhite at gumbysoft.com | www.FreeBSD.org