Call for performance evaluation: net.isr.direct (fwd)

Andrew Gallatin gallatin at cs.duke.edu
Fri Oct 14 07:54:43 PDT 2005


Poul-Henning Kamp writes:
 > In message <17231.43525.446450.161986 at grasshopper.cs.duke.edu>, Andrew Gallatin
 >  writes:
 > >
 > >Poul-Henning Kamp writes:
 > > > The best compromise solution therefore is to change the scheduler
 > > > to make decisions based on the TSC ticks (or equivalent on other
 > > > archs) and at regular intervals figure out how fast the CPU ran in
 > > > the last period and convert the TSC ticks accumulated to a time
 > > > unit suitable for resource accounting.
 > > > 
 > > > 
 > > > The bad solution is to try to do timekeeping based on hardware
 > > > counters which are unsuitable for the purpose, the TSC being
 > > > the primary suspect here, and we will not do that.
 > >
 > >I'll bet that nobody will want to touch the scheduler, so we'll
 > >continue to be stuck with inflated context switch times on SMP
 > >because we use such an expensive time source.
 > >
 > >What if somebody were to port the Linux TSC syncing code, and use it
 > >to decide whether or not to set kern.timecounter.smp_tsc=1?  Would
 > >you object to that?
 > 
 > Yes, I would object to that.
 > 
 > Even to this day new CPU chips come out where the TSC has flaws that
 > prevent it from being used as a timecounter, and we do not have (NDA)
 > access to the data that would allow us to build a list of safe
 > hardware.
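
For reference, the tick-based accounting proposed in the quoted text
could be sketched roughly as below.  This is only an illustration under
stated assumptions: the struct and function names (tick_acct,
acct_charge, acct_convert) are hypothetical and do not exist in FreeBSD,
and the periodic conversion assumes a trusted timecounter supplies the
length of the last period.

```c
#include <stdint.h>

/* Hypothetical per-thread accounting record; names are illustrative. */
struct tick_acct {
	uint64_t pending_ticks;	/* raw TSC ticks since the last conversion */
	uint64_t runtime_us;	/* accumulated run time in microseconds */
};

/* Hot path (context switch): charge raw ticks, no time conversion. */
static void
acct_charge(struct tick_acct *a, uint64_t tsc_start, uint64_t tsc_end)
{
	a->pending_ticks += tsc_end - tsc_start;
}

/*
 * Slow path, run at regular intervals: period_ticks is how many TSC
 * ticks elapsed during the last period, period_us is the length of that
 * period as measured by a trusted timecounter.  Their ratio is how fast
 * the CPU ran, and it converts the pending ticks to microseconds.
 * The product is assumed to fit in 64 bits, which holds for periods of
 * about a second on current clock rates.
 */
static void
acct_convert(struct tick_acct *a, uint64_t period_ticks, uint64_t period_us)
{
	if (period_ticks == 0)
		return;
	a->runtime_us += a->pending_ticks * period_us / period_ticks;
	a->pending_ticks = 0;
}
```

The point of the split is that the expensive part (reading a trusted
time source) happens once per period rather than on every switch.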

Bear in mind that I have no clue about timekeeping.  I got into this
just because I noticed using a TSC timecounter reduces context switch
latency by 40% or more on all the SMP platforms I have access to:

1.0GHz dual PIII : 50% reduction vs i8254
3.06GHz 1 HTT P4 : 55% vs ACPI-safe, 70% vs i8254
2.0GHz dual amd64: 43% vs ACPI-fast, 60% vs i8254

High context switch latency has been a problem in networking since
FreeBSD 5, because of the context switches needed for netisr and for
interrupt threads.  I'm sure it is a problem in other parts of the
system too.  I think it is pretty important, and I'd really like to
see it fixed.

Since I don't know much about timekeeping, all I can do is raise
awareness, and offer to port the Linux solution (which I think I might
be able to understand).  However, if the Linux solution is not
correct, then somebody with timekeeping knowledge needs to get
involved.  Is this you?

BTW, is an algorithm like Solaris' the "best compromise" you mention
above?  Or is it just keeping the TSC in sync?  They seem to maintain
a high-resolution timer based on the TSC, resync it every second, and
fix up both drift between CPUs and the TSC being reset after
suspend/resume:
http://cvs.opensolaris.org/source/xref/usr/src/uts/i86pc/os/timestamp.c
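
A rough sketch of that general idea (a scaled TSC that gets rebased
against a reference clock about once a second) might look like the
following.  This is not the Solaris code; the names (hrt_state,
hrt_resync, hrt_gethrtime) are made up for illustration, the TSC and
reference values are passed in rather than read from hardware, and
per-CPU drift correction is left out entirely.

```c
#include <stdint.h>

/* Hypothetical state for a TSC-backed high-resolution timer. */
struct hrt_state {
	uint64_t base_tsc;	/* TSC value at the last resync */
	uint64_t base_ns;	/* reference time (ns) at the last resync */
	uint64_t scale;		/* ns per TSC tick, 32.32 fixed point */
};

/*
 * Called about once a second with a fresh TSC reading and the time from
 * a trusted reference clock.  Recomputes the tick-to-ns scale from the
 * last interval, then rebases.  If the TSC went backwards (e.g. it was
 * reset by suspend/resume) the old scale is kept and only the base moves.
 * Intervals are assumed to be well under 4 seconds so that the shifted
 * value fits in 64 bits.
 */
static void
hrt_resync(struct hrt_state *s, uint64_t tsc_now, uint64_t ref_ns_now)
{
	if (tsc_now > s->base_tsc && ref_ns_now > s->base_ns) {
		uint64_t dtsc = tsc_now - s->base_tsc;
		uint64_t dns = ref_ns_now - s->base_ns;

		s->scale = (dns << 32) / dtsc;
	}
	s->base_tsc = tsc_now;
	s->base_ns = ref_ns_now;
}

/*
 * Cheap read path: base time plus scaled ticks since the last resync.
 * A TSC value below the base is treated as a reset and clamped to the
 * base time until the next resync.
 */
static uint64_t
hrt_gethrtime(const struct hrt_state *s, uint64_t tsc_now)
{
	if (tsc_now < s->base_tsc)
		return (s->base_ns);
	return (s->base_ns + (((tsc_now - s->base_tsc) * s->scale) >> 32));
}
```

With a 2GHz TSC and a resync after one second, 2000 further ticks
read back as 1000ns past the base time.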


Drew

