Process hanging on 6.0-STABLE

Daniel O'Connor doconnor at
Wed Mar 22 07:25:49 UTC 2006

I work for a small company that makes radar systems for research
organisations and we use FreeBSD on the PCs for data acquisition and
processing. We recently moved to FreeBSD 6/amd64 and one machine in
particular is exhibiting a strange problem.

The acquisition process is a Tcl interpreter with a largish chunk of C
code which talks to the hardware (via RS485 and a custom PCI card). Once
the system is set up it streams data back via the PCI card and runs it
through various data processors (e.g. dumping raw data to disk, FFT,
winds, etc.).

The actual forking of processes is handled in Tcl and the C code only
gets involved to write the data out (to an FD the Tcl layer keeps).

The problem is that every now and then the process gets stuck and
becomes unkillable just after forking, i.e.:
eureka:~>ps -axwwwwl | grep Reco
19999   881     1  12  -8 -5 21716 15984 piperd I<s   ??  128:50.79 /usr/home/radar/skiymet/libexec/Recorder /usr/home/radar/skiymet/libexec/acquisition/sks.tcl /usr/home/radar/skiymet/etc/ud3
19999 80154   881  12  92 -5 21716    16 user m D<L   ??    0:00.00 /usr/home/radar/skiymet/libexec/Recorder /usr/home/radar/skiymet/libexec/acquisition/sks.tcl /usr/home/radar/skiymet/etc/ud3
19999 96464 96343   0  96  0   388   280 -      R+    p2    0:00.00 grep Reco
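
For reference, the STAT column in the listing above can be decoded along these lines. This is only a sketch based on the flag meanings documented in FreeBSD's ps(1); it covers just the letters that appear in this listing, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* First STAT character: the run state (only the ones seen in this report). */
static const char *state_name(char c)
{
    switch (c) {
    case 'D': return "uninterruptible wait";
    case 'I': return "idle (long sleep)";
    case 'R': return "runnable";
    default:  return "not covered by this sketch";
    }
}

/* Remaining STAT characters: modifier flags. */
static const char *flag_name(char c)
{
    switch (c) {
    case '<': return "raised scheduling priority";
    case 's': return "session leader";
    case 'L': return "pages locked in core";
    case '+': return "foreground process group";
    default:  return "not covered by this sketch";
    }
}

/* Write a readable description of a STAT string into buf; returns buf. */
char *describe_stat(const char *stat, char *buf, size_t len)
{
    assert(stat != NULL && buf != NULL && len > 0);
    buf[0] = '\0';
    if (stat[0] == '\0')
        return buf;
    snprintf(buf, len, "%s", state_name(stat[0]));
    for (const char *p = stat + 1; *p != '\0'; p++) {
        size_t used = strlen(buf);   /* always < len: snprintf truncates */
        snprintf(buf + used, len - used, ", %s", flag_name(*p));
    }
    return buf;
}
```

Read this way, the child (D<L) is in an uninterruptible wait with pages locked in core, which would fit with signals, including SIGKILL, not being acted on while it stays in that state.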

Looking at the original process is OK..

eureka:~>gdb $GSHOME/libexec/Recorder
(gdb) attach 881
(gdb) bt
#0  0x00000008009c395c in read () from /lib/
#1  0x000000080072f77f in TclpCreateProcess () from /usr/local/lib/
#2  0x0000000800717d25 in TclCreatePipeline () from /usr/local/lib/
#3  0x00000008007186d0 in Tcl_OpenCommandChannel () from /usr/local/lib/
#4  0x0000000800704af8 in Tcl_ExecObjCmd () from /usr/local/lib/
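
The read() in frame #0, together with the "piperd" wait channel in the ps output, is consistent with a common fork-and-exec idiom: the parent opens a pipe, marks the write end close-on-exec in the child, and blocks in read() until the child either execs (EOF) or reports an errno. The following is a minimal sketch of that idiom, on the assumption that TclpCreateProcess does something similar; it is not Tcl's source, and spawn_with_status_pipe is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork and exec argv[0]. Returns the child's pid once the exec has
 * succeeded, or -1 with *exec_errno set if pipe, fork or exec failed.
 */
pid_t spawn_with_status_pipe(char *const argv[], int *exec_errno)
{
    int fds[2];
    int err = 0;
    pid_t pid;
    ssize_t n;

    assert(argv != NULL && argv[0] != NULL && exec_errno != NULL);
    *exec_errno = 0;
    if (pipe(fds) == -1) {
        *exec_errno = errno;
        return -1;
    }

    pid = fork();
    if (pid == -1) {
        *exec_errno = errno;
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: a successful exec closes the write end automatically. */
        close(fds[0]);
        fcntl(fds[1], F_SETFD, FD_CLOEXEC);
        execvp(argv[0], argv);
        err = errno;                 /* only reached if the exec failed */
        if (write(fds[1], &err, sizeof err) == -1) {
            /* nothing more can be done in the child */
        }
        _exit(127);
    }

    /* Parent: block until the child execs (EOF) or reports its errno. */
    close(fds[1]);
    do {
        n = read(fds[0], &err, sizeof err);
    } while (n == -1 && errno == EINTR);
    close(fds[0]);

    if (n == (ssize_t)sizeof err) {
        *exec_errno = err;
        waitpid(pid, NULL, 0);       /* reap the failed child */
        return -1;
    }
    return pid;
}
```

With this pattern, a child that wedges after fork() but before exec neither closes the pipe nor writes to it, so the parent's read() never returns. That would match what is seen here: the parent is healthy but waiting, and the real problem is in the child.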

However the newly made one..
(gdb) attach 80154
Attaching to program: /usr/home/radar/skiymet/libexec/Recorder, process 80154
ptrace: Resource temporarily unavailable.

The original is killable..
eureka:~>kill 881
eureka:~>kill 881
881: No such process

But the new one is not..
eureka:~>kill 80154
eureka:~>kill 80154
eureka:~>kill -9 80154
eureka:~>kill -9 80154

I can fstat the new process and it shows a slew of open FDs (presumably
inherited from the old process), but I can't ktrace it..
eureka:~>ktrace -f 80154.ktr -p 80154
ktrace: 80154.ktr: Operation not permitted
eureka:~>sudo ktrace -f 80154.ktr -p 80154
ktrace: 80154.ktr: Operation not permitted

Or get a memory map..
eureka:~>dd if=/proc/80154/map bs=64k
dd: /proc/80154/map: Resource temporarily unavailable
0+0 records in
0+0 records out
0 bytes transferred in 0.000096 secs (0 bytes/sec)

Unfortunately the machine is at a very remote location and I have not
been able to replicate it locally (and I can't run, say, memtest remotely).

The custom PCI card has a driver which may be the cause of the problems
but it does not appear to be involved from what I can see.

Does anyone have any suggestions? The version of FreeBSD is a little 
after 6.0-RELEASE but not much.

Daniel O'Connor software and network engineer
for Genesis Software -
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C

More information about the freebsd-stable mailing list