Unkillable KSE threaded proc

Andrew Gallatin gallatin at cs.duke.edu
Wed Sep 8 13:37:27 PDT 2004

Julian Elischer writes:
 > It is possible. However, you should try this on -current (please),
 > because I rewrote some of the exit code
 > and may have already fixed it.
 > A -current kernel can run a 5.3 userland in general, so you may just need
 > to recompile the kernel.

OK, I built a -current kernel from CVS sources dated 8am PDT,
and it is worse.

The initial skill -9 -u gallatin seems to be ignored by the threaded
process and it gets re-parented to init when skill takes out its
parent (sh) and its parent's parent (csh and sshd):

# ps axwl | grep ping | grep -v grep
 1387   607     1 591 132  0 18260 11480 -      R     p0-   5:18.18 tests/mx_pingpong -e 2 -M 2 -E 3000000 -d scream:0

Logging in again and doing 'kill -9 607' results in other stuff
starting to hang (can't ssh in again, and kill never seems to return).
In the following ps, the shell that launched the second kill -9
is pid 624 (^T also claims it's running):

db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  624 c1a28c40 e6808000 1387   623   624 0004002 [CPU 0] csh
  623 c1f24540 e8858000 1387   621   621 0000100 [SLPQ select 0xc06cb5c4][SLP] sshd
  621 c1647a80 e52e3000    0   451   621 0000100 [SLPQ sbwait 0xc1990d40][SLP] sshd
  607 c1a2d8c0 e680f000 1387     1   605 000c482 (threaded)  mx_pingpong
   thread 0xc1f25960 ksegrp 0xc18808c0 [CPU 1]
   thread 0xc1f2aaf0 ksegrp 0xc18808c0 [SUSP]
   thread 0xc1f2a960 ksegrp 0xc18808c0 [RUNQ]
   thread 0xc1f2a4b0 ksegrp 0xc1f282a0 [LOCK process lock c1b37bc0]

db> tr 607
sched_switch(c1f25960,c15b9000,c15b9000,ae1ed572,3db79502) at sched_switch+0xd8
mi_switch(2,c15b9000,c15b9154,c15b9000,e884db50) at mi_switch+0x1c7
maybe_preempt(c15b9000,82,0,c1568c40,c15b9000) at maybe_preempt+0x99
sched_add(e884db70,46,c1f2a960,46,c18808c0) at sched_add+0x103
resetpriority(e884db84,e680f000,46,46,c1a2d8c0) at resetpriority+0x62
_end(c1f282a4,c1f25960,c1f2a970,c1f2a960,c1f2a988) at 0xc1f25960
(null)(c1f282a0,c18808c4,c1f25960,c1f2a4b8,c1f2aaf0) at 0
end(c1f28850,c1f28854,c1f25320,c1f25328,0) at 0xc1647a80
end(c1880af0,c1880af4,c1a29af0,c1a29af8,0) at 0xc1a2d8c0
_end(c1995000,c1995004,c187f7d0,c187f7d8,0) at 0xc1f24e00
<_end() is repeated quite a few times>

Is there any way to get a trace of the other threads from ddb?


