machdep.cpu_idle_hlt and SMP perf?
John Baldwin
jhb at freebsd.org
Thu Feb 9 08:14:00 PST 2006
On Wednesday 08 February 2006 12:17, Andrew Gallatin wrote:
> John Baldwin writes:
> > On Tuesday 07 February 2006 17:46, Andrew Gallatin wrote:
> > > John Baldwin writes:
> > > > On Tuesday 07 February 2006 17:15, Andrew Gallatin wrote:
> > > > > John Baldwin writes:
> > > > > > On Monday 06 February 2006 17:37, Andrew Gallatin wrote:
> > > > > > > John Baldwin writes:
> > > > > > > > On Monday 06 February 2006 14:46, Andrew Gallatin wrote:
> > > > > > > > > Andre Oppermann writes:
> > > > > > > > > > Andrew Gallatin wrote:
> > > > > > > > > > > > Why does machdep.cpu_idle_hlt=1 drop my 10GbE
> > > > > > > > > > > > network rx performance by a considerable amount
> > > > > > > > > > > > (7.5Gb/s -> 5.5Gb/s)?
> > > > > > > >
> > > > > > > > You may be seeing problems because it might simply take a
> > > > > > > > while for the CPU to wake up from HLT when an interrupt
> > > > > > > > comes in. The 4BSD scheduler tries to do IPIs to wakeup
> > > > > > > > any sleeping CPUs when it schedules a new thread, but
> > > > > > > > that would add higher latency for ithreads than just
> > > > > > > > preempting directly to the ithread. Oh, you have to turn
> > > > > > > > that on, it's off by default
> > > > > > > > (kern.sched.ipiwakeup.enabled=1).
> > > > > > >
> > > > > > > Hmm.. It seems to be on by default. Unfortunately, it does
> > > > > > > not seem to help.
> > > > > >
> > > > > > I'm not sure.
> > > > >
> > > > > One thing which really helps is disabling preemption. If I do
> > > > > that, I get 7.7Gb/sec with machdep.cpu_idle_hlt=1. This is
> > > > > slightly better than machdep.cpu_idle_hlt=0 and no PREEMPTION.
> > > > >
> > > > > BTW, net.isr.direct=1 in all testing.
> > > >
> > > > Do you have very little userland activity in this test?
> > >
> > > Essentially none. netserver just sits in a loop, reading from the
> > > socket and throwing the data away.
> >
> > If you disable preemption then in effect you are letting the idle CPUs
> > pick up the ithread and not disturbing what is running on the non-idle
> > CPU. sched_4bsd is supposed to be triggering the same behavior, except
> > that it has to send an IPI to awaken the idle CPUs. When you have
> > idle_hlt=0, there are no idle CPUs, so 4bsd thinks they are all busy and
> > preempts. When you disable preemption, it just leaves the ithread on
> > the runqueue until one of the idle CPUs notices the new thread in its
> > idle loop and runs it. When you have idle_hlt=1, then 4bsd doesn't
> > preempt but sends an IPI. It doesn't even try to preempt unless it
> > thinks all CPUs are busy.
>
> I wish we had a lightweight way to watch all this stuff. I can't
> wait for dtrace.

You can try using KTR with KTR_SCHED and then using schedgraph.py to look at
what happens. I'm not sure how lightweight that might be if you just have
KTR on and no other debug stuff.
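
A rough sketch of that setup, going by the usage notes at the top of
schedgraph.py (tools/sched in the source tree). The KTR_ENTRIES value and
the ktr.out file name are only illustrative, and option names may differ
between branches, so check sys/conf/NOTES for your tree:

```
# Kernel config additions (rebuild and boot the new kernel afterwards)
options 	KTR
options 	KTR_ENTRIES=32768        # illustrative size; larger keeps more history
options 	KTR_COMPILE=(KTR_SCHED)  # compile in only the scheduler trace points
options 	KTR_MASK=(KTR_SCHED)     # enable them at boot

# While the test is still running, stop tracing and dump the buffer
sysctl debug.ktr.mask=0
ktrdump -ct > ktr.out
python schedgraph.py ktr.out
```

Compiling in only KTR_SCHED keeps the other trace classes out of the
kernel, which should limit the overhead, though the cost under this
workload is not measured here.
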
> FWIW, if I use SCHED_ULE, performance sucks regardless of idle_hlt.

Hmmmm.

> > One thing disabling PREEMPTION does is that it enables some explicit
> > FULL_PREEMPTION-like behavior in _mtx_unlock_sleep(). You might want to
> > try #if 0'ing that code out to see if that is why having PREEMPTION off
> > makes a difference. (Ironically, having PREEMPTION on means
> > _mtx_unlock_sleep() will preempt less often.)
>
> Removing that code did not seem to matter. I still get good
> performance with SCHED_4BSD, PREEMPTION disabled, idle_hlt=1, and that
> code removed.

Ok. Hmmmmm.

--
John Baldwin <jhb at FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org