machdep.cpu_idle_hlt and SMP perf?

Andrew Gallatin gallatin at cs.duke.edu
Wed Feb 8 09:17:12 PST 2006


John Baldwin writes:
 > On Tuesday 07 February 2006 17:46, Andrew Gallatin wrote:
 > > John Baldwin writes:
 > >  > On Tuesday 07 February 2006 17:15, Andrew Gallatin wrote:
 > >  > > John Baldwin writes:
 > >  > >  > On Monday 06 February 2006 17:37, Andrew Gallatin wrote:
 > >  > >  > > John Baldwin writes:
 > >  > >  > >  > On Monday 06 February 2006 14:46, Andrew Gallatin wrote:
 > >  > >  > >  > > Andre Oppermann writes:
 > >  > >  > >  > >  > Andrew Gallatin wrote:
 > >  > >  > >  > >  > > Why does machdep.cpu_idle_hlt=1 drop my 10GbE network
 > >  > >  > >  > >  > > rx performance by a considerable amount (7.5Gb/s ->
 > >  > >  > >  > >  > > 5.5Gb/s)?
 > >  > >  > >  >
 > >  > >  > >  > You may be seeing problems because it might simply take a
 > >  > >  > >  > while for the CPU to wake up from HLT when an interrupt comes
 > >  > >  > >  > in.  The 4BSD scheduler tries to do IPIs to wake up any
 > >  > >  > >  > sleeping CPUs when it schedules a new thread, but that would
 > >  > >  > >  > add higher latency for ithreads than just preempting directly
 > >  > >  > >  > to the ithread.  Oh, you have to turn that on, it's off by
 > >  > >  > >  > default (kern.sched.ipiwakeup.enabled=1).
 > >  > >  > >
 > >  > >  > > Hmm..  It seems to be on by default.  Unfortunately, it does not
 > >  > >  > > seem to help.
 > >  > >  >
 > >  > >  > I'm not sure.
 > >  > >
 > >  > > One thing which really helps is disabling preemption.  If I do that,
 > >  > > I get 7.7Gb/sec with machdep.cpu_idle_hlt=1.  This is slightly better
 > >  > > than machdep.cpu_idle_hlt=0 and no PREEMPTION.
 > >  > >
 > >  > > BTW, net.isr.direct=1 in all testing.
 > >  >
 > >  > Do you have very little userland activity in this test?
 > >
 > > Essentially none.  netserver just sits in a loop, reading from the
 > > socket and throwing the data away.
 > 
 > If you disable preemption then in effect you are letting the idle CPUs pick up 
 > the ithread and not disturbing what is running on the non-idle CPU.  
 > sched_4bsd is supposed to be triggering the same behavior, except that it has 
 > to send an IPI to awaken the idle CPUs.  When you have idle_hlt=0, there are 
 > no idle CPUs, so 4bsd thinks they are all busy and preempts.  When you 
 > disable preemption, it just leaves the ithread on the runqueue until one of 
 > the idle CPUs notices the new thread in its idle loop and runs it.  When you 
 > have idle_hlt=1, then 4bsd doesn't preempt but sends an IPI.  It doesn't even 
 > try to preempt unless it thinks all CPUs are busy.
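
A rough sketch of the decision described in the quoted paragraph above,
written as a toy C model.  This is not the sched_4bsd code: the struct,
the enum, and the function name are all made up for illustration, and
the only behavior modeled is what the paragraph states.

```c
#include <stdbool.h>

/* Hypothetical toy model; names do not come from the FreeBSD kernel. */
enum action {
    ACT_IPI_WAKEUP,     /* send an IPI to a halted idle CPU */
    ACT_PREEMPT,        /* preempt the running thread for the ithread */
    ACT_LEAVE_ON_RUNQ   /* leave the ithread queued for an idle loop to find */
};

struct cfg {
    bool idle_hlt;      /* models machdep.cpu_idle_hlt */
    bool preemption;    /* models the PREEMPTION kernel option */
};

/*
 * Decide what happens when an ithread becomes runnable.
 * other_cpu_idle: some other CPU has nothing to run.
 */
static enum action
ithread_runnable(struct cfg c, bool other_cpu_idle)
{
    /*
     * With idle_hlt=0 the idle CPUs spin instead of halting, so the
     * scheduler never sees them as idle and treats every CPU as busy.
     */
    bool sees_idle = other_cpu_idle && c.idle_hlt;

    if (sees_idle)
        return ACT_IPI_WAKEUP;      /* no preemption attempt at all */
    if (c.preemption)
        return ACT_PREEMPT;         /* every CPU looks busy */
    return ACT_LEAVE_ON_RUNQ;       /* a spinning idle CPU picks it up */
}
```

In this model, idle_hlt=1 with an idle CPU always yields the IPI path,
idle_hlt=0 with PREEMPTION yields a preempt, and idle_hlt=0 without
PREEMPTION leaves the ithread on the run queue.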

I wish we had a lightweight way to watch all this stuff.  I can't 
wait for dtrace.

FWIW, if I use SCHED_ULE, performance sucks regardless of idle_hlt.

 > One thing disabling PREEMPTION does is that it enables some explicit 
 > FULL_PREEMPTION-like behavior in _mtx_unlock_sleep().  You might want to try 
 > #if 0'ing that code out to see if that is why having PREEMPTION off makes a 
 > difference.  (Ironically, having PREEMPTION on means _mtx_unlock_sleep() will 
 > preempt less often.)

Removing that code did not seem to matter.  I still get good
performance with SCHED_4BSD, PREEMPTION disabled, idle_hlt=1, and that
code removed.

Drew

