7.0-stable: a hung process - scheduler bug?

Paul B. Mahol onemda at gmail.com
Tue Sep 23 18:55:23 UTC 2008


On 9/23/08, Mikhail Teterin <mi+mill at aldan.algebra.com> wrote:
> Hello!
>
> I was trying to build OpenOffice using all of my 4 CPUs. To be able to
> do other work on the machine comfortably, I ran the build under nice,
> and assigned real-time priority to the two Xorg processes.
> The build started at about 23:10 last night, and hung at 23:46. The
> procstat output for the make's process group is:
>
>       PID  PPID  PGID   SID  TSID THR LOGIN    WCHAN     EMUL          COMM
>      8371  2425  8371  2425  2425   1 mi       wait      FreeBSD ELF64 make
>     12254  8371  8371  2425  2425   1 mi       wait      FreeBSD ELF64 sh
>     12255 12254  8371  2425  2425   1 mi       pause     FreeBSD ELF64 tcsh
>     12262 12255  8371  2425  2425   1 mi       wait      FreeBSD ELF64 perl5.8.8
>     33010 12262  8371  2425  2425   1 mi       wait      FreeBSD ELF64 perl5.8.8
>     33011 33010  8371  2425  2425   1 mi       wait      FreeBSD ELF64 sh
>     33012 33011  8371  2425  2425   1 mi       wait      FreeBSD ELF64 dmake
>     37126 33012  8371  2425  2425   1 mi       -         FreeBSD ELF64 dmake
>
> The last line worries me greatly... According to "procstat -t", there is
> only one thread there:
>
>       PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
>     37126 100724 dmake            -                  1  193 sleep   -
>
> And trying to "ktrace -p 37126" returns (even to root, even in /tmp):
>
>     ktrace: ktrace.out: Operation not permitted
>
> There are no problems ktrace-ing 33012, but nothing comes from there, as
> that process simply waits for its child. I guess, the child -- 37126 was
> (v)forked to launch a compiler or some such and remains stuck in between
> (v)fork and exec somewhere...
>
> The OS is: FreeBSD 7.0-STABLE/amd64 from Sat Jul 26, 2008 and the box is
> otherwise perfectly functional. The scheduling-related options are set
> as such:
>
>     options         SCHED_4BSD              # 4BSD scheduler
>     options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
>
> Let me know, what else I can do to help fix this bug -- I'm going to
> reboot the machine tonight... Should I switch to SCHED_ULE as a
> work-around?

SCHED_4BSD is suboptimal for 4 CPUs, and SCHED_ULE has replaced it as
the default scheduler on 7-STABLE.
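
A minimal sketch of that switch, assuming a custom kernel configuration
file that carries the options quoted above (every other line in the file
stays as it is):

    #options        SCHED_4BSD              # 4BSD scheduler
    options         SCHED_ULE               # ULE scheduler

The kernel has to be rebuilt, installed and booted before the change
takes effect. Afterwards "sysctl kern.sched.name" should report which
scheduler is running.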

