7.0-stable: a hung process - scheduler bug?
Mikhail Teterin
mi+mill at aldan.algebra.com
Tue Sep 23 17:48:01 UTC 2008
Hello!
I was trying to build OpenOffice using all 4 of my CPUs. To be able to
do other work on the machine comfortably, I ran the build under nice,
and assigned real-time priority to the two Xorg processes.
The build started at about 23:10 last night, and hung at 23:46. The
procstat output for make's process group is:
  PID  PPID  PGID   SID  TSID THR LOGIN WCHAN EMUL          COMM
 8371  2425  8371  2425  2425   1 mi    wait  FreeBSD ELF64 make
12254  8371  8371  2425  2425   1 mi    wait  FreeBSD ELF64 sh
12255 12254  8371  2425  2425   1 mi    pause FreeBSD ELF64 tcsh
12262 12255  8371  2425  2425   1 mi    wait  FreeBSD ELF64 perl5.8.8
33010 12262  8371  2425  2425   1 mi    wait  FreeBSD ELF64 perl5.8.8
33011 33010  8371  2425  2425   1 mi    wait  FreeBSD ELF64 sh
33012 33011  8371  2425  2425   1 mi    wait  FreeBSD ELF64 dmake
37126 33012  8371  2425  2425   1 mi    -     FreeBSD ELF64 dmake
The last line worries me greatly... According to "procstat -t", there is
only one thread there:
  PID    TID COMM  TDNAME CPU PRI STATE WCHAN
37126 100724 dmake -        1 193 sleep -
Trying to run "ktrace -p 37126" fails, even as root and even from /tmp:
ktrace: ktrace.out: Operation not permitted
There are no problems ktrace-ing 33012, but nothing comes of it, as
that process simply waits for its child. My guess is that the child,
37126, was (v)forked to launch a compiler or some such and remains
stuck somewhere between (v)fork and exec...
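For anyone wanting to look for the same symptom on their own machine, here is a rough sketch of how the check done by eye above (a thread in state "sleep" whose WCHAN is "-") could be scripted. The function name stuck_threads is made up for illustration, and the filter assumes the eight whitespace-separated columns that "procstat -t" printed above, with no spaces inside COMM or TDNAME. It only flags candidates; it says nothing about why a thread is in that state.

```sh
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): reads "procstat -t"-style
# output on stdin and prints "PID TID COMM" for every thread that is
# asleep yet reports no wait channel.
# Assumed column order: PID TID COMM TDNAME CPU PRI STATE WCHAN.
stuck_threads() {
    # The header row is skipped naturally: its 7th field is "STATE".
    awk 'NF == 8 && $7 == "sleep" && $8 == "-" { print $1, $2, $3 }'
}

# Sample input copied from the report above. On a live system one
# would pipe real output instead, e.g.: procstat -t 37126 | stuck_threads
stuck_threads <<'EOF'
  PID    TID COMM  TDNAME CPU PRI STATE WCHAN
37126 100724 dmake -        1 193 sleep -
EOF
```

With the sample row from this report, the script prints "37126 100724 dmake".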
The OS is FreeBSD 7.0-STABLE/amd64 from Sat Jul 26, 2008, and the box is
otherwise perfectly functional. The scheduling-related options are set
as follows:
options SCHED_4BSD                  # 4BSD scheduler
options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
Let me know what else I can do to help fix this bug. I'm going to
reboot the machine tonight... Should I switch to SCHED_ULE as a
work-around? Thanks! Yours,
-mi