Greetings... a patch I would like your comments on...

Jilles Tjoelker jilles at stack.nl
Fri Jan 22 17:25:28 UTC 2010


On Fri, Jan 22, 2010 at 07:55:47AM -0800, Randall Stewart wrote:
> On Jan 22, 2010, at 7:33 AM, Ivan Voras wrote:

> > On 01/22/10 16:10, Ed Schouten wrote:
> >> * Ivan Voras<ivoras at freebsd.org>  wrote:
> >>> This is a good and useful addition! I think Windows has  
> >>> implemented a
> >>> generalization of this (called "wait objects" or something like  
> >>> that),
> >>> which effectively allows a select()- (or in this case kqueue())-like
> >>> syscall to wait on both file descriptors and condvars (as well as
> >>> probably other MS-style objects). It's useful for multiplexing  
> >>> events
> >>> for dissimilar sources.

> >> NtWaitForSingleObject(), NtWaitForMultipleObjects(), etc. :-)

> > Yes, I was thinking about WaitForMultipleObjects() - I sometimes  
> > wished I had it in FreeBSD :)

> > I think the hackers@ side of the thread is missing the original link  
> > to the patch file offered for review, so here it is:

> > http://people.freebsd.org/~rrs/kque_umtx.patch

Cool.

> > My kqueue-fu level is too low to be really useful here but from what  
> > I've read it looks like a logical and even reasonably clean way of  
> > doing it.

> thanks it made sense to me ;-)

> > If I read the comment at filt_umtxattach() correctly, in the best  
> > case you would need an extension to the kevent structure to add more  
> > fields like data & udata (for passing values back and forth between  
> > userland and kernel). I agree with this - it would be very  
> > convenient for some future purposes (like file modification  
> > notification) if the kernel filter could both accept and return a  
> > struct of data from/to the userland.

> Yeah, more arguments inside the kevent would allow me to add the  
> COND_CV_WAIT* where a lock and condition are passed
> in as well... But I was hesitant to add more than was already there  
> since doing
> so would cause ABI ripples that I did not want to face.

I don't think passing the lock is needed.

Traditional way (error handling omitted):

pthread_mutex_lock(&M);
while (!P)
	pthread_cond_wait(&C, &M);
do_work();
pthread_mutex_unlock(&M);

The thread must be registered as a waiter before unlocking the mutex,
otherwise a wakeup could be lost.

Possible kqueue way (error handling omitted):

kq = kqueue();
pthread_mutex_lock(&M);
while (!P)
{
	pthread_cond_kqueue_register_wait_np(&C, kq);
	pthread_mutex_unlock(&M);
	nevents = kevent(kq, NULL, 0, events, 1, NULL);
	... handle other events ...
	pthread_mutex_lock(&M);
}
do_work();
pthread_mutex_unlock(&M);
close(kq);

To avoid lost wakeup, the kqueue must be registered as a waiter before
unlocking the mutex. Once it has been registered, it is safe to unlock
the mutex as the kqueue will store the fact that the condition has been
signalled. Hence, pthread_cond_kqueue_register_wait_np() or however it
will be named does not need the mutex explicitly.

kevent() needs to return some sort of identifier of C, possibly via
kevent.udata. In that case the hack with fflags and data remains needed.

Registering multiple condvars in one function call could be nasty, as it
requires all the mutexes to be locked simultaneously.

On the other hand if you do want registration and waiting in the same
function, like kqueue can do with file descriptors, then it is needed to
pass all the locks so they can be unlocked after registering and before
waiting. I think this results in a rather complicated API though.

Adding a mutex to a kqueue probably means that mutexes are used wrongly:
they should not be locked for so long that it is desirable to wait for
one together with other objects (pthread_mutex_timedlock() is meant as a
fail-safe).

Semaphores could also be waited for via kqueue, probably via a version
of sem_trywait() that registers the kqueue as a waiter. Once this is
signalled, the thread can call the function again to obtain the
semaphore or register again if another thread obtained the semaphore
first.

-- 
Jilles Tjoelker


More information about the freebsd-hackers mailing list