cvs commit: src/sys/kern sched_4bsd.c

David Xu davidxu at freebsd.org
Mon Nov 13 12:48:16 UTC 2006


On Monday 13 November 2006 17:58, Bruce Evans wrote:

> > It might not be a bug in NO_KSE. The problem is in sched_fork() and
> > sched_exit(): for a process which quickly fork()s a child that then
> > exits quickly, the parent's estcpu is doubled quickly too. This
> > "fairness" is really unfair.
>
> That can't be the problem, since there are no exits in the above.
>
I have tried the change I mentioned; with it, top runs quickly and the
system does not show the problem you described.

> > I think your example is that scenario; however, I don't know
> > why KSE works better. This might be fixed by remembering the inherited
> > estcpu in the child and decaying it every second. When the child
> > exits, it adds only the estcpu it really used to the parent. The code
> > looks like this:
> >
> > in sched_fork(), we remember inherited estcpu:
> > 	td->td_inherited_estcpu = parent->td_estcpu;
> > in schedcpu(), we decay it every second (should be fixed in sched_wakeup
> > too):
> >        td->td_inherited_estcpu = decay_cpu(loadfac, td->td_inherited_estcpu);
> > in sched_exit():
> >        parent->td_estcpu = ESTCPULIM(parent->td_estcpu +
> > 		childtd->td_estcpu - childtd->td_inherited_estcpu);
> >
> > This should fix the quick fork() and exit() problem for the parent process.
>
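The inherited-estcpu idea quoted above can be sketched as a small userland
model. This is only an illustration, not kernel code: struct model_thread,
the model_*() functions, EST_LIMIT and the FSCALE value are all assumptions
made for the sketch, and the decay merely has the same shape as the
scheduler's load-based decay.

```c
/* Userland model only; none of these names exist in the kernel. */
#define FSCALE    2048u  /* fixed-point scale; value assumed for this sketch */
#define EST_LIMIT 255u   /* arbitrary stand-in for the ESTCPULIM() clamp */

struct model_thread {
	unsigned estcpu;           /* decayed CPU usage estimate */
	unsigned inherited_estcpu; /* part of estcpu copied from the parent */
};

static unsigned
est_clamp(unsigned e)
{
	return (e < EST_LIMIT ? e : EST_LIMIT);
}

/* Scale cpu by loadfac / (loadfac + FSCALE), i.e. by a factor below 1. */
static unsigned
model_decay(unsigned loadfac, unsigned cpu)
{
	return ((loadfac * cpu) / (loadfac + FSCALE));
}

/* fork: the child starts with the parent's estimate and remembers it. */
static void
model_fork(const struct model_thread *parent, struct model_thread *child)
{
	child->estcpu = parent->estcpu;
	child->inherited_estcpu = parent->estcpu;
}

/* Once per second: decay both values at the same rate. */
static void
model_second(struct model_thread *td, unsigned loadfac)
{
	td->estcpu = model_decay(loadfac, td->estcpu);
	td->inherited_estcpu = model_decay(loadfac, td->inherited_estcpu);
}

/* exit: charge the parent only for what the child used itself. */
static void
model_exit(struct model_thread *parent, const struct model_thread *child)
{
	unsigned used = 0;

	if (child->estcpu > child->inherited_estcpu)
		used = child->estcpu - child->inherited_estcpu;
	parent->estcpu = est_clamp(parent->estcpu + used);
}
```

In this model, a parent at estcpu 100 whose child exits immediately stays
at 100, whereas adding the child's whole estcpu back would have pushed it
to 200.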
> I've known about this bug since Peter Dufault told me about it in late
> 1999, and now use the code at the end of this mail to avoid it.  However,
> I remembered it incorrectly and may have misdescribed it to you.  I
> thought I remembered actual doubling, with estcpu soon reaching
> "infinity", but the ESTCPULIM() clamp prevents it getting preposterously
> high now, and I couldn't find any version that let it reach "infinity".
> Versions before late 1999 had a bogus limit of UCHAR_MAX and that may
> have been responsible for shells appearing to hang because it was a
> better approximation to "infinity".
>
> I now use the following:
>
> % Index: sched_4bsd.c
> % ===================================================================
> % RCS file: /home/ncvs/src/sys/kern/sched_4bsd.c,v
> % retrieving revision 1.41
> % diff -u -2 -r1.41 sched_4bsd.c
> % --- sched_4bsd.c	21 Jun 2004 23:47:47 -0000	1.41
> % +++ sched_4bsd.c	8 Dec 2005 11:11:52 -0000
> % @@ -550,9 +641,20 @@
> %
> %  void
> % -sched_exit_ksegrp(struct ksegrp *kg, struct ksegrp *child)
> % +sched_exit_ksegrp(struct ksegrp *parent, struct ksegrp *child)
> %  {
> %
> %  	mtx_assert(&sched_lock, MA_OWNED);
> % -	kg->kg_estcpu = ESTCPULIM(kg->kg_estcpu + child->kg_estcpu);
> % +	/*
> % +	 * XXX adding all of the child's cpu to the parent's like we used to
> % +	 * do would be wrong, since we duplicate the parent's cpu at fork
> % +	 * time so adding it all back would give exponential growth.  In
> % +	 * practice, the growth would have been limited by ESTCPULIM, but that
> % +	 * would be wrong too since it is very nonlinear.  Splitting the cpu
> % +	 * at fork time would be better, but adding it all back here would
> % +	 * still give nonlinearities since multiple processes tend to
> % +	 * accumulate more cpu than single ones.
> % +	 */
> % +	if (parent->kg_estcpu < child->kg_estcpu)
> % +		parent->kg_estcpu = child->kg_estcpu;
> %  }
> %
>
> This seems to work well enough in practice.  It grows the parent's estcpu
> quite slowly if there are a lot of fork/exits.
>
Yes, I knew about that patch.
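To see why the two exit rules behave so differently, here is a minimal
userland sketch of a fork/exit loop. It is a model under stated
assumptions, not the kernel code: EST_LIMIT is an arbitrary stand-in for
the ESTCPULIM() clamp, simulate() and the exit_*() helpers are
hypothetical names, and the per-second decay is left out entirely.

```c
#define EST_LIMIT 255u	/* arbitrary stand-in for the ESTCPULIM() clamp */

static unsigned
clamp(unsigned e)
{
	return (e < EST_LIMIT ? e : EST_LIMIT);
}

/* Old rule: add all of the child's estcpu back to the parent. */
static unsigned
exit_sum(unsigned parent, unsigned child)
{
	return (clamp(parent + child));
}

/* Patched rule: the parent only rises to the child's level. */
static unsigned
exit_max(unsigned parent, unsigned child)
{
	return (parent < child ? child : parent);
}

/*
 * Run n fork/exit cycles.  Each child inherits the parent's estcpu at
 * fork time, accumulates child_ticks of its own, then exits.
 */
static unsigned
simulate(unsigned (*rule)(unsigned, unsigned), unsigned start,
    unsigned child_ticks, int n)
{
	unsigned parent = start;
	int i;

	for (i = 0; i < n; i++) {
		unsigned child = clamp(parent + child_ticks);

		parent = rule(parent, child);
	}
	return (parent);
}
```

Starting from estcpu 1 with one tick per child, simulate(exit_sum, 1, 1, 7)
already returns the clamp value 255, while simulate(exit_max, 1, 1, 7)
returns 8: the summing rule roughly doubles the parent on every cycle and
the max rule grows it by only the child's own usage.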

> Previous versions did something different on fork too.  Splitting or
> otherwise reducing estcpu on fork isn't such a good idea since it
> reduces the limit on the real resource hogs -- all the children, when
> there are lots of children that all want to run.  When the children
> don't exit, hacking on the parent's estcpu doesn't help, and doubling
> the child's estcpu on fork and halving it on exit is closer to being
> correct than the reverse.
>
> At least one of Peter Dufault's versions removed all explicit accesses
> to p_estcpu on fork and exit.  I think the change on fork is only
> cosmetic -- p_estcpu should have been automatically copied on fork.
>
> Anyway, this isn't the bug in non-KSE.  I didn't look hard for the
> reasons.  Top seemed to show the priorities of the hogs not decreasing
> (numerically increasing) fast enough.
>
I still cannot find the bug, although I have read all the changes several times.

> Bruce


