Unkillable KSE threaded proc

Julian Elischer julian at elischer.org
Wed Sep 8 15:54:52 PDT 2004

Andrew Gallatin wrote:

>Julian Elischer writes:
> > it is possible. However you should try this on -current (please)
> > because I rewrote some of the exit code
> > and may have already fixed it..
> > 
> > a -current kernel can run a 5.3 userland in general so you may just need 
> > to recompile the kernel.
>OK, I built a -current kernel from CVS sources dated 8amPDT.
>And it is worse..
>The initial skill -9 -u gallatin seems to be ignored by the threaded
>process and it gets re-parented to init when skill takes out its
>parent (sh) and its parent's parent (csh and sshd):
># ps axwl | grep ping | grep -v grep
> 1387   607     1 591 132  0 18260 11480 -      R     p0-   5:18.18 tests/mx_pingpong -e 2 -M 2 -E 3000000 -d scream:0
>Logging in again and doing 'kill -9 607' results in other stuff
>starting to hang. (Can't ssh in again, kill never seems to return.)
>In the following ps, the shell that launched the second kill -9
>is pid 624 (^T also claims it's running)
>db> ps
>  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
>  624 c1a28c40 e6808000 1387   623   624 0004002 [CPU 0] csh
>  623 c1f24540 e8858000 1387   621   621 0000100 [SLPQ select 0xc06cb5c4][SLP] sshd
>  621 c1647a80 e52e3000    0   451   621 0000100 [SLPQ sbwait 0xc1990d40][SLP] sshd
>  607 c1a2d8c0 e680f000 1387     1   605 000c482 (threaded)  mx_pingpong
>   thread 0xc1f25960 ksegrp 0xc18808c0 [CPU 1]
>   thread 0xc1f2aaf0 ksegrp 0xc18808c0 [SUSP]
>   thread 0xc1f2a960 ksegrp 0xc18808c0 [RUNQ]
>   thread 0xc1f2a4b0 ksegrp 0xc1f282a0 [LOCK process lock c1b37bc0]
>db> tr 607
>sched_switch(c1f25960,c15b9000,c15b9000,ae1ed572,3db79502) at sched_switch+0xd8
>mi_switch(2,c15b9000,c15b9154,c15b9000,e884db50) at mi_switch+0x1c7
>maybe_preempt(c15b9000,82,0,c1568c40,c15b9000) at maybe_preempt+0x99
>sched_add(e884db70,46,c1f2a960,46,c18808c0) at sched_add+0x103
>resetpriority(e884db84,e680f000,46,46,c1a2d8c0) at resetpriority+0x62
>_end(c1f282a4,c1f25960,c1f2a970,c1f2a960,c1f2a988) at 0xc1f25960
>(null)(c1f282a0,c18808c4,c1f25960,c1f2a4b8,c1f2aaf0) at 0
>end(c1f28850,c1f28854,c1f25320,c1f25328,0) at 0xc1647a80
>end(c1880af0,c1880af4,c1a29af0,c1a29af8,0) at 0xc1a2d8c0
>_end(c1995000,c1995004,c187f7d0,c187f7d8,0) at 0xc1f24e00
><_end() is repeated quite a few times>
>Is there any way to get a trace of the other threads from ddb?


I think it is

show thread (address)
but if you can get a coredump it would be best..
in ddb do:
call doadump

in this case it looks like thread 0xc1f2aaf0 has called exit() and is 
waiting for the others to exit..
I wonder if the lock is the answer.. it would be good to follow the link 
in the mutex in the proc structure at 0xc1a2d8c0
to see which thread OWNS it..
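
As a rough illustration of what "following the link" involves: a kernel mutex
keeps its owner in a single lock word, which is the owning thread's address
with a few low flag bits mixed in. The sketch below decodes such a word in
plain userland C. The flag names and values are assumptions modeled on
sys/mutex.h, not copied from it, so check them against the tree you are
running before trusting the result. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed flag layout, modeled on sys/mutex.h. The names and values
 * here are illustrative; verify them against your own kernel sources.
 */
#define LOCK_RECURSED   ((uintptr_t)0x1)  /* owner holds the lock recursively */
#define LOCK_CONTESTED  ((uintptr_t)0x2)  /* other threads are waiting on it */
#define LOCK_UNOWNED    ((uintptr_t)0x4)  /* sentinel value: nobody owns it */
#define LOCK_FLAGMASK   (LOCK_RECURSED | LOCK_CONTESTED)

/* True if the lock word names an owner rather than the unowned sentinel. */
static bool
lock_is_owned(uintptr_t word)
{
	return ((word & ~LOCK_FLAGMASK) != LOCK_UNOWNED);
}

/* Address of the owning thread structure, or 0 if the lock is free. */
static uintptr_t
lock_owner(uintptr_t word)
{
	return (lock_is_owned(word) ? (word & ~LOCK_FLAGMASK) : 0);
}

/* True if at least one other thread is blocked waiting for this lock. */
static bool
lock_is_contested(uintptr_t word)
{
	return ((word & LOCK_CONTESTED) != 0);
}
```

Feed lock_owner() the lock word read out of the mutex from ddb or from the
dump. If the result matches one of the thread addresses in the ps output,
that is the thread to look at with show thread.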

