Network Stack Locking
Matthew Dillon
dillon at apollo.backplane.com
Mon May 24 20:39:41 PDT 2004
:On Mon, 24 May 2004, Eivind Eklund wrote:
:
:> On Fri, May 21, 2004 at 01:23:51PM -0400, Robert Watson wrote:
:> > The other concern I have is whether the message queues get deep or not:
:> > many of the benefits of message queues come when the queues allow
:> > coalescing of context switches to process multiple packets. If you're
:> > paying a context switch per packet passing through the stack each time you
:> > cross a boundary, there's a non-trivial operational cost to that.
:>
:> I don't know what Matt has done here, but at least with the design we
:> used for G2 (a private DFly-like project run by John Dyson, myself, and
:> a few other people who may or may not want to remain anonymous), this
:> should not be an issue. We used thread context passing with an API that
:> contained putmsg_and_terminate() and message ports that automatically
:> could spawn new handler threads. Effectively, a message-related context
:> switch turned into "assemble everything I care about in a small package,
:> reset the stack pointer, and go". The expectation was that this should
:> end up with less overhead than function calls, as we could drop the call
:> frames for "higher levels in the chain". We never got to the point
:> where we could measure if it worked out that way in practice, though.
:
:Sounds a lot like a lot of the Mach IPC optimizations, including their use
:of continuations during IPC to avoid a full context switch.
:
:Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
:robert at fledge.watson.org    Senior Research Scientist, McAfee Research
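For concreteness, a purely hypothetical sketch of the quoted
putmsg_and_terminate() idea (names and signatures are assumed; the real
G2 API was never published): package up what the next stage needs, hand
it to the target port, and never return, so the caller's call frames can
simply be dropped.

    struct msg;
    struct msgport;

    /* Assumed primitives, for illustration only. */
    void putmsg(struct msgport *dst, struct msg *msg);  /* queue msg; port may spawn a handler thread */
    _Noreturn void thread_exit(void);                   /* tear down the current thread               */

    static _Noreturn void
    putmsg_and_terminate(struct msgport *dst, struct msg *msg)
    {
            putmsg(dst, msg);   /* "assemble everything I care about" and pass it on */
            thread_exit();      /* no return: the higher call frames are discarded   */
    }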
Well, I like the performance aspects of a continuation mechanism, but
I really dislike the memory overhead.  Even a minimal stack is
expensive when you multiply it by potentially hundreds of thousands
of 'blocking' entities such as PCBs... say, a TCP output stream.
Because of this, the overhead and cache pollution generated by the
continuation mechanism increase, rather than decrease, as system
load increases.
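As a back-of-the-envelope illustration (the stack size and PCB count here
are assumed for the sake of argument, not measurements):

    #include <stdio.h>

    int
    main(void)
    {
            const unsigned long npcbs    = 100000UL;  /* blocked PCBs (assumed)      */
            const unsigned long stack_sz = 4096UL;    /* "minimal" stack size, bytes */

            /* ~390MB of kernel memory pinned just to park blocked contexts */
            printf("continuation stacks: %lu MB\n", (npcbs * stack_sz) >> 20);
            return (0);
    }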
Deep message queues aren't necessarily a problem and, in fact, having
one or two dozen messages backed up in a protocol thread's message
port is actually good because the thread can then process all the
messages in a tight loop (cpu and cache locality of reference). If
designed properly, this directly mitigates the cost of a thread switch
as system load increases. So message queueing has the opposite effect...
per-unit handling overhead *decreases* as system load increases.
(Also, DragonFly's thread scheduler is a much lighter-weight mechanism
than what you have in FBsd-4 or FBsd-5).
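A minimal sketch of that batching loop (hypothetical port API; these are
not DragonFly's actual lwkt interfaces): the protocol thread pays for one
switch when it wakes up and then drains everything queued on its port in
a tight loop before sleeping again.

    #include <stddef.h>

    struct netmsg;
    struct msgport;

    /* Assumed primitives, for illustration only. */
    struct netmsg  *port_wait(struct msgport *port);        /* blocks; one thread switch   */
    struct netmsg  *port_trydequeue(struct msgport *port);  /* non-blocking; NULL if empty */
    void            netmsg_process(struct netmsg *msg);     /* ~100ns of protocol work     */

    static void
    tcp_protocol_thread(struct msgport *port)
    {
            struct netmsg *msg;

            for (;;) {
                    msg = port_wait(port);          /* pay the ~1uS switch once          */
                    do {
                            netmsg_process(msg);    /* hot loop: icache/dcache stay warm */
                    } while ((msg = port_trydequeue(port)) != NULL);
            }
    }

Under light load the inner loop runs once per wakeup; under heavy load it
runs tens of times, which is where the amortization in the figures below
comes from.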
e.g.: let's say you have a context switch overhead of 1uS and a message
processing overhead of 100ns.

    light load:   100 messages/sec, 1 message in queue at context switch:
                  (1*100ns + 1uS) / 1     = 1.1uS/msg
    medium load:  1000 messages/sec, average 10 messages in queue at
                  context switch:
                  (10*100ns + 1uS) / 10   = 200ns/msg
    heavy load:   10000 msgs/sec, average 100 msgs in queue:
                  (100*100ns + 1uS) / 100 = 110ns/msg
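The same arithmetic as a tiny standalone sketch, plugging in the 1uS and
100ns figures from the example above:

    #include <stdio.h>

    int
    main(void)
    {
            const double switch_ns = 1000.0;          /* 1uS context switch overhead  */
            const double msg_ns    = 100.0;           /* 100ns per-message processing */
            const int    queued[]  = { 1, 10, 100 };  /* messages drained per wakeup  */

            for (int i = 0; i < 3; i++) {
                    int n = queued[i];

                    /* prints 1100, 200, 110 ns/msg */
                    printf("%3d msgs/switch -> %.0f ns/msg\n",
                        n, (n * msg_ns + switch_ns) / n);
            }
            return (0);
    }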
The reason a deep message queue is not a problem vs. other mechanisms
is simple... a message represents a unit of work.  The work must be
done regardless, and on the cpu it was told to be done on, no matter
whether you use a message, a continuation, or some other mechanism.
In other words, a deep message queue is actually an effect of the
problem, not a cause of it.  Solving the problem (if it actually is
a problem) does not involve dealing with the deep message queue; it
involves dealing with the set of circumstances that are causing that
deep message queue to occur.
Now, certainly end-to-end latency is an issue.  But when one is talking
about context switching, one is talking about nanoseconds and microseconds.
Turn-around latency just isn't an issue most of the time, and in those
extremely rare cases where it might be, one does the turn-around in the
driver interrupt anyway.
-Matt
Matthew Dillon
<dillon at backplane.com>