kern/67830: CPU affinity problem with forked child processes (SMP)
gemini at geminix.org
Fri Jun 11 13:10:25 GMT 2004
>Synopsis: CPU affinity problem with forked child processes (SMP)
>Arrival-Date: Fri Jun 11 13:10:03 GMT 2004
>Originator: Uwe Doering
>Release: FreeBSD 4.5-RELEASE i386
EscapeBox - Managed On-Demand UNIX Servers
System: FreeBSD geminix.org 4.5-RELEASE FreeBSD 4.5-RELEASE #3: Thu Jun 10 14:06:47 GMT 2004 root at localhost:/STABLE_Enhanced_Edition i386
In SMP kernels there is a problem with the 'struct proc' variables
'p_oncpu' and 'p_lastcpu' being uninitialized (zeroed) when a forked
child process is put onto the run queue near the end of fork1(). Other
kernel functions expect these variables to be set up properly even
before the new child process has had its first process switch.
What happens in the current implementation is that all forked child
processes get an initial affinity to CPU0, regardless of which CPU
fork1() ran on. I believe this is unintended and in violation of
the SMP design goals.
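
As a rough illustration, here is a user-space toy model of the current behavior (not kernel code). The names 'toy_proc' and 'toy_fork_zeroed' are hypothetical, and the one-byte field width is an assumption; only the field names 'p_oncpu' and 'p_lastcpu' come from the kernel.

```c
#include <string.h>

/* Hypothetical stand-in for the two 'struct proc' fields discussed here. */
struct toy_proc {
	unsigned char p_oncpu;   /* CPU currently running the process */
	unsigned char p_lastcpu; /* CPU the process last ran on */
};

/* Model of the unpatched behavior: the child's fields are left zeroed. */
static struct toy_proc
toy_fork_zeroed(const struct toy_proc *parent)
{
	struct toy_proc child;

	(void)parent;	/* the parent's CPU is never consulted */
	memset(&child, 0, sizeof(child));
	return child;
}
```

In this model a parent running on CPU3 still yields a child whose 'p_lastcpu' is 0, so the child looks as if it had last run on CPU0.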
The right thing to do, IMHO, would be to give the child process the
CPU affinity of the parent process. Since parent processes tend to
block shortly after returning from fork1(), in order to wait for some
event to happen, chooseproc() will most likely pick up the new child
process right away and make it the next process to be executed, without
switching CPUs.
On lightly loaded systems the change I propose makes no difference
since idle CPUs pick up processes "belonging" to their peers,
anyway (CPU migration). On busy systems, however, CPUs regularly
favor processes with a matching affinity, as long as there is
sufficient supply. In this situation the initial CPU affinity of
forked child processes _does_ matter if the goal is to spread the work
load evenly over multiple CPUs.
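
The affinity preference described above can be sketched with a simplified selection routine. This is not the actual chooseproc() code, only an assumed model of "prefer a matching process, otherwise migrate"; 'rq_entry' and 'toy_choose' are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical run queue entry; only the last-CPU field matters here. */
struct rq_entry {
	int pid;
	unsigned char p_lastcpu;
};

/*
 * Return the index of the entry a CPU would take from the run queue.
 * Prefer the first entry whose last CPU matches 'cpuid'; if there is
 * none, fall back to the head of the queue (migration).
 * Returns -1 if the queue is empty.
 */
static int
toy_choose(const struct rq_entry *queue, size_t n, unsigned char cpuid)
{
	size_t i;

	if (n == 0)
		return -1;
	for (i = 0; i < n; i++)
		if (queue[i].p_lastcpu == cpuid)
			return (int)i;
	return 0;
}
```

If every freshly forked child carries a last CPU of 0, then in this model CPU0 always finds a match, while the other CPUs obtain new children only through the migration fallback.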
Although I cannot provide hard evidence to prove it, I believe it is
likely that this change, or rather correction, will improve the overall
performance of SMP systems that deal with a lot of forks, such as busy email
servers or web servers with plenty of CGI invocations.
Since this is about a performance issue rather than a malfunction, there
is nothing to repeat. The algorithmic deficiency becomes apparent from
looking at the sources.
I suggest initializing 'p_oncpu' and 'p_lastcpu' of the forked child
process as if it had already been through one process switch.
chooseproc() and other kernel functions will then work as expected.
Please consider the following patch:
--- kern_fork.c.diff begins here ---
--- src/sys/kern/kern_fork.c.orig Wed Apr 21 09:23:06 2004
+++ src/sys/kern/kern_fork.c Thu Jun 10 16:05:03 2004
@@ -559,6 +559,10 @@
p2->p_acflag = AFORK;
s = splhigh();
+ p2->p_oncpu = 0xff; /* idle */
+ p2->p_lastcpu = p1->p_oncpu;
p2->p_stat = SRUN;
--- kern_fork.c.diff ends here ---
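
For readers who want to check the intended effect outside the kernel, the two assignments in the patch can be modeled in user space roughly as follows. 'model_proc', 'model_fork_init', 'model_is_on_cpu' and 'MODEL_CPU_NONE' are hypothetical names for this sketch, and the one-byte field width is an assumption; the 0xff value and the two field names are taken from the patch.

```c
#define MODEL_CPU_NONE 0xff	/* same value the patch stores in p_oncpu */

/* Hypothetical stand-in for the two 'struct proc' fields the patch sets. */
struct model_proc {
	unsigned char p_oncpu;   /* CPU currently running the process */
	unsigned char p_lastcpu; /* CPU the process last ran on */
};

/* Mirror of the patch: the child is on no CPU and inherits the parent's. */
static void
model_fork_init(struct model_proc *child, const struct model_proc *parent)
{
	child->p_oncpu = MODEL_CPU_NONE;
	child->p_lastcpu = parent->p_oncpu;
}

/* A process counts as running only if p_oncpu names a real CPU. */
static int
model_is_on_cpu(const struct model_proc *p)
{
	return p->p_oncpu != MODEL_CPU_NONE;
}
```

With a parent currently running on CPU2, the child ends up with 'p_lastcpu' set to 2 and is not considered to be on any CPU, which is the state a process would have after being switched out once.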