sem_post() performance

John Baldwin jhb at freebsd.org
Wed Sep 24 15:04:36 UTC 2014


On Wednesday, September 24, 2014 05:45:19 PM Konstantin Belousov wrote:
> On Wed, Sep 24, 2014 at 05:31:04PM +0300, Konstantin Belousov wrote:
> > On Wed, Sep 24, 2014 at 10:11:51AM -0400, John Baldwin wrote:
> > > On Tuesday, September 23, 2014 11:20:00 PM Jilles Tjoelker wrote:
> > > > On Mon, Sep 22, 2014 at 03:53:13PM -0400, John Baldwin wrote:
> > > > > On Sunday, September 21, 2014 11:37:42 PM Jilles Tjoelker wrote:
> > > > > > It has been reported that POSIX semaphores are slow, in contexts
> > > > > > such as
> > > > > > Python. Note that POSIX semaphores are the only synchronization
> > > > > > objects
> > > > > > that support use by different processes in shared memory; this
> > > > > > does not
> > > > > > work for mutexes and condition variables because they are pointers
> > > > > > to
> > > > > > the actual data structure.
> > > > > > 
> > > > > > In fact, sem_post() unconditionally performs an umtx system call.
> > > > > 
> > > > > *sigh*  I was worried that that might be the case.
> > > > > 
> > > > > > To avoid both lost wakeups and possible writes to a destroyed
> > > > > > semaphore,
> > > > > > an uncontested sem_post() must check the _has_waiters flag
> > > > > > atomically
> > > > > > with incrementing _count.
> > > > > > 
> > > > > > The proper way to do this would be to take one bit from _count and
> > > > > > use it for the _has_waiters flag; the definition of SEM_VALUE_MAX
> > > > > > permits this. However, this would require a new set of umtx
> > > > > > semaphore operations and will break ABI of process-shared
> > > > > > semaphores
> > > > > > (things may break if an old and a new libc access the same
> > > > > > semaphore
> > > > > > over shared memory).
> > > > > > 
> > > > > > This diff only affects 32-bit aligned but 64-bit misaligned
> > > > > > semaphores on 64-bit systems, and changes _count and _has_waiters
> > > > > > atomically using a 64-bit atomic operation. It probably needs a
> > > > > > may_alias attribute for correctness, but <sys/cdefs.h> does not
> > > > > > have
> > > > > > a wrapper for that.
> > > > > 
> > > > > It wasn't clear on first reading, but you are using aliasing to get
> > > > > around the need for new umtx calls by using a 64-bit atomic op to
> > > > > adjust two ints at the same time, yes?  Note that since a failing
> > > > > semaphore op calls into the kernel for the "hard" case, you might in
> > > > > fact be able to change the ABI without breaking process-shared
> > > > > semaphores.  That is, suppose you left 'has_waiters' as always true
> > > > > and reused the high bit of count for has_waiters.
> > > > > 
> > > > > Would old binaries always trap into the kernel?  (Not sure they
> > > > > will,
> > > > > especially the case where an old binary creates the semaphore, a new
> > > > > binary would have to force has_waiters to true in every sem op, but
> > > > > even that might not be enough.)
> > > > 
> > > > I think that everything will break when a binary linked to old and new
> > > > libcs use the same semaphore. If the new contested bit is set, the old
> > > > sem_getvalue() will return garbage, the old sem_trywait() will fail
> > > > even
> > > > if the real count is greater than 0, the old sem_wait() and
> > > > sem_timedwait() may spin if the real count is greater than 0 and the
> > > > old
> > > > sem_post() will fail with [EOVERFLOW].
> > > 
> > > Well, keep in mind you can't have a binary linked to both libc's
> > > generally
> > > (way too many things break if you do that, like passing the result of
> > > strdup() from one libc to free() of the other, etc.).  The real problem
> > > case is multiple binaries.  However, barring truly bizarre cases with
> > > people using funky LD_LIBRARY_PATH, at least one binary would have to
> > > be static to really get into trouble as in all but the bizarre cases
> > > all binaries will be sharing the same libc.so.7.
> > > 
> > > However, it was more wishful thinking on my part that we could arrange
> > > things to force old binaries to still work (just slowly) while fixing
> > > things cleanly.> > 
> > > > > I think this looks ok, but it's a shame we can't fix it more
> > > > > cleanly.
> > > > 
> > > > I don't like at all that you have to either force an exotic alignment
> > > > or
> > > > use unaligned atomic ops.
> > > > 
> > > > I saw another interesting sem_ implementation in musl libc. It also
> > > > uses
> > > > a separate waiters word (but an int instead of a boolean), and is
> > > > based
> > > > on the futex calls (which we already have as wait_uint/wake umtx
> > > > calls).
> > > > It looks rather complicated but the musl libc people are usually good
> > > > at
> > > > getting this kind of stuff right. I suppose this will also break
> > > > subtly
> > > > when interoperating with old libc, and adding new umtx ops will
> > > > simplify
> > > > the algorithm.
> > > > 
> > > > Consideration: just declare mixing process-shared semaphores with
> > > > sufficiently different libc unsupported, and change SEM_MAGIC to
> > > > enforce
> > > > that? (This does not prevent running old binaries, as long as they're
> > > > dynamically linked to libc and you use a new libc.so.)
> > > 
> > > This only breaks old static binaries (note that we only recently got
> > > pshared semaphores in 9.x or so).  You could even allow partial compat
> > > for those if needed by recognizing the old SEM_MAGIC when opening an
> > > existing shared semaphore (so that compat would work so long as an old
> > > binary created the semaphore).  That compat can even be added
> > > retroactively to libc if it turns out we need it.  I probably think
> > > that is ok, but it would be an ABI breakage that would need to be
> > > documented.  One thing that makes this slightly less painful is that
> > > new-style semaphores in 9.0+ are per-chroot, so if you have a jail that
> > > runs only old binaries, they will continue to work fine.
> > > 
> > > I'm cc'ing Konstantin to see what he thinks.
> > 
> > You may also get in trouble if you upgrade libc without reboot, or use
> > static binary built against newer libc on older host.

Yes, but we generally don't support those situations (you can do an 
installworld, but generally speaking you should reboot afterwards).

> > But SEM_MAGIC provides at least protection against mixing incompatible
> > implementations, and it seems that the checks are enough to ensure
> > that both bad cases (new/old and old/new) are handled.
> > 
> > Do we have a real use case when the old and new implementations have to
> > be mixed ?  Was it the same situation when the sem.c -> sem_new.c
> > transition was made ?  I think that the answer to the questions  are
> > no/yes, and we should just change magic and not worry about the issue
> > until real use case appear.

I think that is correct, yes.

> I think it is even worse now. If the application linked against FBSD_1.0
> (ksem) semaphores implementation runs on the modern host, it cannot
> share semaphore with modern binary.

Correct.  We changed sem_t's ABI hence the compat shims IIRC.

> Since this is not considered significant problem, we can avoid compat
> code there as well.

I think if we leave sem_t alone there is no reason to create new compat
shims for this change, but the existing FBSD_1.0 versions have to remain for 
people using old binaries, yes?

-- 
John Baldwin


More information about the freebsd-threads mailing list