sem_post() performance

Konstantin Belousov kostikbel at gmail.com
Wed Sep 24 14:31:11 UTC 2014


On Wed, Sep 24, 2014 at 10:11:51AM -0400, John Baldwin wrote:
> On Tuesday, September 23, 2014 11:20:00 PM Jilles Tjoelker wrote:
> > On Mon, Sep 22, 2014 at 03:53:13PM -0400, John Baldwin wrote:
> > > On Sunday, September 21, 2014 11:37:42 PM Jilles Tjoelker wrote:
> > > > It has been reported that POSIX semaphores are slow, in contexts such as
> > > > Python. Note that POSIX semaphores are the only synchronization objects
> > > > that support use by different processes in shared memory; this does not
> > > > work for mutexes and condition variables because their pthread types are
> > > > just pointers to the actual data structure.
> > > > 
> > > > In fact, sem_post() unconditionally performs an umtx system call.
> > > 
> > > *sigh*  I was worried that that might be the case.
> > > 
> > > > To avoid both lost wakeups and possible writes to a destroyed semaphore,
> > > > an uncontested sem_post() must check the _has_waiters flag atomically
> > > > with incrementing _count.
> > > > 
> > > > The proper way to do this would be to take one bit from _count and
> > > > use it for the _has_waiters flag; the definition of SEM_VALUE_MAX
> > > > permits this. However, this would require a new set of umtx
> > > > semaphore operations and will break ABI of process-shared semaphores
> > > > (things may break if an old and a new libc access the same semaphore
> > > > over shared memory).
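
To make this concrete, a rough sketch of the packed-count idea might look like
the following; SEM_WAITERS_BIT and sem_wake_kernel() are made-up names, and
this is not the posted diff:

#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* SEM_VALUE_MAX is 0x7fffffff, so the top bit of the count word is free. */
#define SEM_WAITERS_BIT 0x80000000u

int sem_wake_kernel(volatile uint32_t *);   /* hypothetical new umtx op */

static int
sem_post_packed(volatile uint32_t *countp)
{
    uint32_t old, new;

    do {
        old = *countp;
        if ((old & ~SEM_WAITERS_BIT) == SEM_VALUE_MAX) {
            errno = EOVERFLOW;
            return (-1);
        }
        new = old + 1;          /* the waiters bit is preserved */
    } while (!__atomic_compare_exchange_n(countp, &old, new, true,
        __ATOMIC_RELEASE, __ATOMIC_RELAXED));

    /* One 32-bit atomic op observed the waiters flag and bumped the count. */
    if (old & SEM_WAITERS_BIT)
        return (sem_wake_kernel(countp));
    return (0);
}

The catch is exactly the one above: the kernel's umtx semaphore ops would have
to learn about the flag bit.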
> > > > 
> > > > This diff only affects 32-bit aligned but 64-bit misaligned
> > > > semaphores on 64-bit systems, and changes _count and _has_waiters
> > > > atomically using a 64-bit atomic operation. It probably needs a
> > > > may_alias attribute for correctness, but <sys/cdefs.h> does not have
> > > > a wrapper for that.
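
And a rough sketch of the 64-bit variant, again with made-up names and not the
actual diff (which keeps the existing struct layout), might be:

#include <stdbool.h>
#include <stdint.h>

/*
 * View the two adjacent 32-bit words as one 64-bit quantity so that
 * observing _has_waiters and incrementing _count happen in a single
 * atomic step; that is what avoids both the lost wakeup and a write to
 * a semaphore that the last waiter may already have destroyed.  The
 * 64-bit view aliases the 32-bit fields, hence the talk of may_alias
 * above, and it only works when the pair is 64-bit aligned.
 */
union usem_view {
    struct {
        uint32_t _has_waiters;
        uint32_t _count;
    } w;
    uint64_t both;
};

int sem_wake_kernel2(union usem_view *);    /* hypothetical umtx wakeup wrapper */

static int
sem_post_wide(union usem_view *sem)
{
    union usem_view old, new;

    old.both = __atomic_load_n(&sem->both, __ATOMIC_RELAXED);
    do {
        new = old;
        new.w._count++;         /* EOVERFLOW check omitted for brevity */
    } while (!__atomic_compare_exchange_n(&sem->both, &old.both, new.both,
        true, __ATOMIC_RELEASE, __ATOMIC_RELAXED));

    /* Only the contested case needs the system call. */
    if (old.w._has_waiters != 0)
        return (sem_wake_kernel2(sem));
    return (0);
}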
> > > 
> > > It wasn't clear on first reading, but you are using aliasing to get
> > > around the need for new umtx calls by using a 64-bit atomic op to
> > > adjust two ints at the same time, yes?  Note that since a failing
> > > semaphore op calls into the kernel for the "hard" case, you might in
> > > fact be able to change the ABI without breaking process-shared
> > > semaphores.  That is, suppose you left 'has_waiters' as always true
> > > and reused the high bit of count for has_waiters.
> > > 
> > > Would old binaries always trap into the kernel?  (Not sure they will,
> > > especially in the case where an old binary creates the semaphore; a new
> > > binary would have to force has_waiters to true in every sem op, but
> > > even that might not be enough.)
> > 
> > I think that everything will break when a binary linked to old and new
> > libcs uses the same semaphore. If the new contested bit is set, the old
> > sem_getvalue() will return garbage, the old sem_trywait() will fail even
> > if the real count is greater than 0, the old sem_wait() and
> > sem_timedwait() may spin if the real count is greater than 0 and the old
> > sem_post() will fail with [EOVERFLOW].
> 
> Well, keep in mind you generally can't have a binary linked to both libcs
> (way too many things break if you do that, like passing the result of strdup()
> from one libc to free() of the other, etc.).  The real problem case is
> multiple binaries.  However, barring truly bizarre cases with people using
> funky LD_LIBRARY_PATH, at least one binary would have to be static to really
> get into trouble, since in all but the bizarre cases all binaries will be
> sharing the same libc.so.7.
> 
> However, it was more wishful thinking on my part that we could arrange things
> to force old binaries to still work (just slowly) while fixing things cleanly.
> 
> > > I think this looks ok, but it's a shame we can't fix it more cleanly.
> > 
> > I don't like at all that you have to either force an exotic alignment or
> > use unaligned atomic ops.
> > 
> > I saw another interesting sem_ implementation in musl libc. It also uses
> > a separate waiters word (but an int instead of a boolean), and is based
> > on the futex calls (which we already have as wait_uint/wake umtx calls).
> > It looks rather complicated but the musl libc people are usually good at
> > getting this kind of stuff right. I suppose this will also break subtly
> > when interoperating with old libc, and adding new umtx ops will simplify
> > the algorithm.
> > 
> > Consideration: just declare mixing process-shared semaphores with
> > sufficiently different libc unsupported, and change SEM_MAGIC to enforce
> > that? (This does not prevent running old binaries, as long as they're
> > dynamically linked to libc and you use a new libc.so.)
> 
> This only breaks old static binaries (note that we only recently got pshared
> semaphores in 9.x or so).  You could even allow partial compat for those if 
> needed by recognizing the old SEM_MAGIC when opening an existing shared 
> semaphore (so that compat would work so long as an old binary created the 
> semaphore).  That compat can even be added retroactively to libc if it turns
> out we need it.  I think that is probably ok, but it would be an ABI breakage
> that would need to be documented.  One thing that makes this slightly less
> painful is that new-style semaphores in 9.0+ are per-chroot, so if you have
> a jail that runs only old binaries, they will continue to work fine.
> 
> I'm cc'ing Konstantin to see what he thinks.
You may also get into trouble if you upgrade libc without a reboot, or use a
static binary built against a newer libc on an older host.

But SEM_MAGIC at least provides protection against mixing incompatible
implementations, and it seems that the checks are enough to ensure
that both bad cases (new/old and old/new) are handled.

Do we have a real use case where the old and new implementations have to be
mixed?  Was it the same situation when the sem.c -> sem_new.c transition was
made?  I think that the answers to those questions are no/yes, and we should
just change the magic and not worry about the issue until a real use case
appears.
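
The gate itself would be little more than the existing magic check; the layout
and values below are placeholders, not libc's actual ones:

#include <errno.h>
#include <stdint.h>

#define SEM_MAGIC_OLD   0x53454d31u     /* placeholder */
#define SEM_MAGIC_NEW   0x53454d32u     /* placeholder: bumped with the new layout */

struct ksem {
    uint32_t _magic;
    /* ... count and waiters words ... */
};

/*
 * A new libc rejects old-format semaphores here, and an old libc's own
 * magic check rejects new-format ones, so an incompatible mix fails
 * with EINVAL instead of silently corrupting the semaphore.
 */
static int
sem_check_magic(const struct ksem *sem)
{
    if (sem->_magic != SEM_MAGIC_NEW) {
        errno = EINVAL;
        return (-1);
    }
    return (0);
}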

> 
> > Consideration 2: use the new implementation only for process-private
> > semaphores and keep using the old slow one for process-shared
> > semaphores?
> 
> This is certainly safe, just requires extra work in the implementation.
> 
> -- 
> John Baldwin

