libthr cleanup

David Xu davidxu at freebsd.org
Wed Mar 29 08:33:42 UTC 2006


Dag-Erling Smørgrav wrote:
> The attached patch brings libthr up to WARNS level 2.  There is also a
> small amount of style and whitespace changes mixed in, mostly because
> I'm so conditioned to style(9) that my fingers sometimes do these
> things automatically.
> 
I have bunch of patches in hand which compiles at level 3, if you commit
it, there will lots of conflict to my working copy.

> Parts of the patch go a little further than level 2, but we're nowhere
> near level 3 (or antything higher).  The major obstacle is the umtx
> interface, which in my eyes is fundamentally broken.
> The _umtx_op() syscall is intended to replace _umtx_lock() and
> _umtx_unlock(), and support other operations like sleep, wakeup etc.
> 
> This is the wrong way to go.  If we applied this kind of thinking
> universally, we'd just define all our system calls as macro wrappers
> for __syscall().
> 
> Beyond these purely philosophical aspects, _umtx_op() seems designed
> to encourage poor coding practices.  It's impossible to use in a type-
> safe manner: its first argument is supposed to be a struct umtx *, but
> I don't think there's a single instance in libthr where it is called
> with an actual struct umtx *.  Instead, libthr uses umtx_t, which is
> defined as long.  Sometimes, an umtx_t is passed as the third argument
> to _umtx_op() as well.  Various fields in struct pthread, struct
> pthread_barrier and struct pthread_cond are declared as umtx_t.  Some
> are used purely as cookies for passing to _umtx_op(); some are used as
> counters or state variables.
> 
> It's a miracle libthr works at all.  The kernel expects a pointer to a
> struct umtx; instead, it gets (mostly) a pointer to long.  Luckily,
> struct umtx (which contains a single pointer) is the same size as long
> on all our platforms.
> 
> _umtx_op() needs to be split into four system calls:
> 
> int _umtx_lock_timeout(struct umtx *mtx,
>     const struct timespec * __restrict timeout);
> int _umtx_unlock(struct umtx *mtx);
> int _umtx_wait_timeout(const void *cookie, struct umtx *mtx,
>     const struct timespec * __restrict timeout);
> int _umtx_wake(const void *cookie);
> 
> _umtx_unlock() already exists with the correct semantics; the rest are
> new.  Note that we can't just add a struct timespec to _umtx_lock(),
> as that would break the upgrade path from RELENG_5; hence the _timeout
> suffix.
> 
When libthr was redesigned, things were not clear, I have figured out
all semantics needed to implement libthr, the above functions are the
interfaces current internally implemented by libthr, and functions in
umtx.h is not used because it can generate bloat code than the
versions in libthr, current libthr shared library only has 64K size.
I really dont need struct umtx at all, an integer is enough, basic idea
is using atomic operations to test in userland and wait in kernel.
the above functions should be changed to:
int _umtx_lock_timeout(umtx_t *,
     const struct timespec * __restrict timeout);
int _umtx_unlock(umtx_t *mtx);
int _umtx_wait_timeout(umtx_t *, umtx_t expect,
     const struct timespec * __restrict timeout);
int _umtx_wake(umtx_t *i, int number);

the umtx_t could be an integer type, but to maintain binary 
compatibility, it has to be a long integer type.

> I'm not sure the current implementation of UMTX_OP_WAIT / UMTX_OP_WAKE
> in the kernel is correct.  Normally, a wait primitive (like msleep(),
> pthread_cond_wait() etc.) takes a cookie and a mutex, and unlocks the
> mutex while it's sleeping on the cookie.  It looks like the umtx code
> originally worked like this, but I'm not sure it does anymore; it's
> hard to unravel, partly because I haven't quite figured out the queues
> yet and partly because I haven't had breakfast.
> 
> DES
>
Well, I am not sure you know the history of umtx, the orignal work only
did a lock and unlock semantic, there is no general sleep and wait
semantic, orignal umtx code has obscure race, I think in early days, 
valgrind suffered from this bug. I am not sure you understand how a 
userland synchronization works, these two operations are used to 
implement compare and wait for a integer to be changed, you should
check source code before talking a lot, saying that you doubt the 
UMTX_OP_WAIT and UMTX_OP_WAKE's correctness is not professional, there 
are users using libthr in daily work and stress testing had been done.
I have put lots of work and time on libthr, I know the problems,
though the _umtx interface is a bit ulgly because it was unclear when it 
was being designed, but I don't think it really hurt you or other
people, it can be fixed in few days, I just was hesitating to add the 
new interfaces.

David Xu



More information about the freebsd-threads mailing list