40% slowdown with dynamic /bin/sh
Matthew Dillon
dillon at apollo.backplane.com
Tue Nov 25 11:50:31 PST 2003
:IMHO, it makes more sense to write NSS modules that do their own
:proxying than to make things even more complicated in libc. Those
:that are lightweight don't carry extra baggage; those that do can
:implement proxying in the most efficient manner for that particular
:backend, e.g. some calls can be proxied while others are done in-process.
:You don't have to rewrite existing NSS modules so that they take into
:account that they are really serving multiple processes--- which
:usually means adding credentials management, locking, per-process
:state, and so forth to each NSS module. Or you could just create a
:whole new NSS API and call it something else and forget about support
:for existing NSS modules.
:
:Caching results (which is different than out-of-process
:implementations, the Linux nscd authors are just confused) does
:require a daemon, but this doesn't really complicate things. (I
:should get around to that someday :-)
:
:That said, I would not stand in the way of a well-thought out,
:well-written NSS implementation that attempts to proxy every get*()
:call over e.g. RPC. (Hmm, sounds like NIS to me. I guess that
:partially explains why PADL.com's NIS<->LDAP gateway is popular :-)
Well, here's the issue... where do you think the future is? I
believe the future is in large, extended clusters of machines which
either need to agree on their management resources or which need to
be capable of hierarchically accessing a global resource namespace.
Sure you can do this within the nsswitch framework, by writing
particular NSS modules that then go out and implement some other
proxy protocol. But most NSS modules are not going to be written with IPC
in mind so it would be a fairly difficult and involved job to create
an integrated framework capable of the above within NSS. Without doing
that you would be restricted to only those modules which are directly
capable of proxying *AND* you would have to contend with various
proxy-capable modules using different protocols. In other words, it's
a mess. It seems silly to waste your time on a framework that you are
just going to have to rip out again a year or two from now.
By using an IPC mechanism from the start the framework and centralization
issues go away. Poof, gone. No issue. A module written as an IPC
service doesn't know or care (other than for authentication purposes)
who is making requests of it. In DFly it is particularly important
because we are going for an SSI-capable result and you just can't do
that with NSS (at least not without devaluing the NSS mechanism so much
you might as well not have used it in the first place!).
The absolute worst case in an IPC framework is that the program trying
to access service X and not being able to find it would have to fork/exec
the service binary itself to create the IPC connection. This, of course,
would only occur under extraordinary circumstances (such as when you are
in single-user mode). But despite the overhead we are only talking about
two lines of code, really. fork() and exec(). Well, three... pipe(),
fork(), and exec().
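To make that concrete, here is a minimal sketch of that worst-case
fallback, assuming a hypothetical service binary path and taking
advantage of pipes being full-duplex on FreeBSD/DragonFly; the wire
protocol itself is not shown:

/*
 * Worst-case fallback: the service isn't running, so spawn it ourselves
 * and use a pipe as the IPC connection.  The path is hypothetical.
 */
#include <sys/types.h>
#include <unistd.h>

static int
spawn_service(const char *path)
{
        int fds[2];
        pid_t pid;

        if (pipe(fds) < 0)              /* full-duplex on FreeBSD/DragonFly */
                return (-1);
        pid = fork();
        if (pid < 0) {
                close(fds[0]);
                close(fds[1]);
                return (-1);
        }
        if (pid == 0) {
                /* child: talk to the parent via stdin/stdout */
                dup2(fds[1], STDIN_FILENO);
                dup2(fds[1], STDOUT_FILENO);
                close(fds[0]);
                close(fds[1]);
                execl(path, path, (char *)NULL);
                _exit(1);               /* exec failed */
        }
        /* parent: fds[0] is the IPC connection to the new service */
        close(fds[1]);
        return (fds[0]);
}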
In regards to caching... with an IPC mechanism the client can choose
to cache the results however it pleases. The IPC mechanism can simply
notify the client asynchronously if a cache invalidation is required.
That's what a real messaging/port protocol gives you the ability to do.
So generally, performance using the IPC mechanism is going to be as good
as or better than what we currently have with uncached flat files or uncached
databases.
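For illustration, a sketch of what the client side of that could look
like, with a made-up message type and cache; the only point is that the
service can push an 'invalidate' down the same connection at any time:

/*
 * Client-side caching with asynchronous invalidation (sketch).
 * The message layout and cache structure are hypothetical.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <stddef.h>

#define IPC_MSG_INVALIDATE      1

struct cache_entry;                     /* cached lookup results, omitted */
static struct cache_entry *cache;

static void
cache_flush(void)
{
        /* drop everything; the next lookup goes back to the service */
        cache = NULL;
}

/* Drain any pending out-of-band messages from the service. */
static void
check_invalidations(int ipc_fd)
{
        unsigned char type;

        while (recv(ipc_fd, &type, sizeof(type), MSG_DONTWAIT) == 1) {
                if (type == IPC_MSG_INVALIDATE)
                        cache_flush();
        }
}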
Which brings up yet another cool result ... when you use an IPC mechanism
you don't need to generate DBMs. The service process itself will simply
scan /etc/master.passwd, /etc/group, and so forth, and build its own
in-memory database. Being able to get rid of the DBMs is only part of
the equation, because it also means that files which never had DBMs in
the first place, such as /etc/services and /etc/group, enjoy the same benefit.
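As a sketch of that service-side behaviour (field handling deliberately
simplified, names made up), parsing /etc/group straight into memory
might look like:

/*
 * Build an in-memory group table from the flat file; no DBM involved.
 * Member lists and most error handling are omitted to keep this short.
 */
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct group_ent {
        char             *name;
        gid_t             gid;
        struct group_ent *next;
};

static struct group_ent *
load_groups(const char *path)
{
        FILE *fp;
        char line[1024];
        struct group_ent *head = NULL;

        if ((fp = fopen(path, "r")) == NULL)
                return (NULL);
        while (fgets(line, sizeof(line), fp) != NULL) {
                char *p = line;
                char *name = strsep(&p, ":");   /* group name */
                strsep(&p, ":");                /* skip password field */
                char *gidstr = strsep(&p, ":"); /* numeric gid */
                struct group_ent *g;

                if (name == NULL || gidstr == NULL || name[0] == '#')
                        continue;
                if ((g = malloc(sizeof(*g))) == NULL)
                        break;
                g->name = strdup(name);
                g->gid = (gid_t)strtoul(gidstr, NULL, 10);
                g->next = head;
                head = g;
        }
        fclose(fp);
        return (head);
}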
:Um, if you can't trust the authentication code, what can you trust?
:Furthermore, for many many many applications that use getpwnam(3) and
:so on, increased privileges are not needed or wanted.
Think out of the box. Consider a multi-layered approach. Take
access to master.passwd for example. Would you have
(A) the authentication code, integrated into the potentially buggy program,
be able to access the file directly, or would you rather have
(B) the authentication code access an IPC service which *ONLY* allows
challenge/response, does *NOT* give you direct access to the
encrypted contents of the password file, and which limits the challenge
rate to prevent dictionary attacks?
That's about the best example that I can come up with. Think about
it... the *ONLY* code that has access to the actual password file is
sufficiently limited in scope as to greatly reduce the potential for
bugs creating situations which expose the password file. That is a
damn sight better than an NSS module which needs to physically open
the password file.
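A rough sketch of what that narrow interface could look like inside the
service process (the names, wire format, and crude rate limit are all
made up; the only point is that the hash never leaves the service):

/*
 * Handle one authentication request inside the service.  The caller only
 * ever learns "yes" or "no"; the encrypted hash stays in this process.
 */
#include <sys/types.h>
#include <pwd.h>
#include <string.h>
#include <unistd.h>

static int
auth_verify(const char *user, const char *cleartext)
{
        struct passwd *pw;
        const char *hash;

        sleep(1);       /* crude rate limit against dictionary attacks */

        /* the service runs with enough privilege to see the real hash */
        pw = getpwnam(user);
        if (pw == NULL)
                return (0);
        hash = crypt(cleartext, pw->pw_passwd);
        return (hash != NULL && strcmp(hash, pw->pw_passwd) == 0);
}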
:And if you *are* really talking about authentication code (and not
:directory services), then you need to get PAM to work in a statically
:linked world, also. (You can compile PAM statically today, but that
:means no 3rd-party modules. The same holds for NSS, of course.)
... Or you can build an IPC mechanism that implements the PAM
functionality and then have programs which would otherwise use PAM
instead use the IPC mechanism. Which is the whole point of having
the IPC mechanism in the first place.
:> The other huge advantage that IPC has over DLL is that you can switch out
:> the backend mechanisms on a running system without killing and restarting
:> anything other than the one service you want to switch out, and if it
:> doesn't work right you can restart the old one or keep it live as a fallback.
:When using the current NSS implementation, there is no need to
:kill/restart anything when you update /etc/nsswitch.conf. New
:modules are dynamically loaded, and any no-longer-referenced ones are
:unloaded.
Sounds good, I guess... does it check the timestamp on /etc/nsswitch.conf
every time you try to do a lookup? With the IPC mechanism an IPC
request will either fail or get a 'reconnect' request, in which case
the client reconnects to the (updated) service.
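A sketch of that client-side reconnect handling, with hypothetical
status codes and helper routines:

/*
 * If a request fails or the service asks for a reconnect, drop the
 * connection, attach to the (possibly replaced) service, and retry.
 */
#include <unistd.h>

#define REPLY_OK         0
#define REPLY_RECONNECT  1

int ipc_connect(const char *service);                   /* assumed helper */
int ipc_request(int fd, const void *req, void *reply);  /* assumed helper */

static int
do_lookup(int *fdp, const char *service, const void *req, void *reply)
{
        int status = ipc_request(*fdp, req, reply);

        if (status < 0 || status == REPLY_RECONNECT) {
                close(*fdp);
                if ((*fdp = ipc_connect(service)) < 0)
                        return (-1);
                status = ipc_request(*fdp, req, reply);
        }
        return (status == REPLY_OK ? 0 : -1);
}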
:> the IPC model is so much better than the DLL model for this sort of thing
:> I don't understand why people are even arguing about it.
:
:Because the rest of us are stupid and lazy, remember? :-)
:
:Cheers,
:--
:Jacques Vidrine NTT/Verio SME FreeBSD UNIX Heimdal
Just not thinking out of the box, maybe.
-Matt
Matthew Dillon
<dillon at backplane.com>