>> Apple's experience has been somewhat to the contrary -- while the 
>> architecture varies some by OS release, one of the persisting performance 
>> problems they were seeing was the cost of fork()+execve() from applications 
>> with very large numbers of shared libraries, plugins, memory mappings, etc.
>>  Currently, they address this by having a process launch applications "by 
>> proxy" as a result of IPC requests instead of forking and execing, but you 
>> might reasonably argue that the problem is with the fork()+execve() model.
> Essentially the same regardless of libraries.  vfork is 5 times faster for 
> -static, 11 times faster for regular dynamic, and 20 times faster for extra 
> libraries.
> So, if something auto-detects posix_spawn(), which uses vfork(), it would 
> be a win compared to the usual fork()/exec().  A small win, but still a win. 
> It would have to do a lot of iterations to add up.
> Incidentally, this is why /usr/bin/make and /usr/bin/gcc are statically 
> linked.  /bin/sh used to be, but isn't so that ~user can use nsswitch.
> For amusement, think of kde and gnome with all their libraries.

Well, kdeinit already performs pre-linking to avoid repeated runtime linker 
costs -- kdeinit is basically a template process waiting to be filled in with 
the specifics of any particular application, but with all the shared libraries 
already mapped and linked.  I've never benchmarked it, but one might 
reasonably assume that the technique works or they wouldn't ship with it. 
It seems some systematic benchmarking and profiling is required to apportion 
the cost among forking complex address spaces, run-time linking, exec 
overhead, etc., in order to decide how much to pin the blame on 
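
As a starting point for that kind of measurement, here is a minimal sketch in 
C that times process launches via fork()+execv() against posix_spawn(). The 
helper names (launch_fork_exec, launch_posix_spawn, time_launches) are my own 
and purely illustrative. Whether posix_spawn() is actually built on vfork() 
depends on the platform's libc, so treat any difference it shows as 
system-specific. Note also that a small harness has a small address space; to 
see the effect discussed above, the calling process would first need to map a 
large number of libraries or regions.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

extern char **environ;

/* Wait for pid; return its exit code, or -1 if it did not exit normally. */
static int reap(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Launch argv[0] with the classic fork()+execv() pair. */
static int launch_fork_exec(char *const argv[])
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);             /* only reached if execv() failed */
    }
    return reap(pid);
}

/* Launch argv[0] with posix_spawn(), default attributes and file actions. */
static int launch_posix_spawn(char *const argv[])
{
    pid_t pid;

    if (posix_spawn(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;
    return reap(pid);
}

/*
 * Mean wall-clock seconds per launch over n iterations.
 * Returns a negative value if any launch fails or exits non-zero.
 */
static double time_launches(int (*launch)(char *const[]),
                            char *const argv[], int n)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < n; i++)
        if (launch(argv) != 0)
            return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return ((t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9) / n;
}
```

Calling time_launches(launch_fork_exec, argv, n) and 
time_launches(launch_posix_spawn, argv, n) with the same argv (for example 
/bin/sh -c "exit 0") gives two per-launch averages to compare. This measures 
only end-to-end launch time; separating run-time linking from exec overhead 
would need profiling on top of it.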

