Managing userland data pointers in kqueue/kevent

Paul LeoNerd Evans leonerd at leonerd.org.uk
Mon Nov 15 23:22:49 UTC 2010


On Mon, Nov 15, 2010 at 12:51:57PM -0800, Julian Elischer wrote:
> "keep more information associated with each kevent and use the user
> cookie to match them"  this is what it was for.
> it's a tool, not an answer. Given this tool you should be able to
> get what you want.
> how you do it is your job.

OK. Then I am not seeing it. I would love to see an example, if you
or anyone else could provide one, of how I am supposed to use this
feature. That would be great... but please read below first.

> It's not the kernel's job to keep application specific data for
> you.. but it gives you a way to do it yourself and keep track of it
> trivially.
> It's expected that for every event the user gives to the kernel, he
> has some matching information about that event in userland.

Sure. The information I keep in userland is in the structure that the
udata pointer points to.


Since you claim it to be so trivial then, I would like to ask you to
explain it. It should be quite a simple task:

---
  Show me a program that, on receipt of -any- event from the kqueue
  file descriptor, prints "FREE\n" whenever the kernel has dropped its
  side of the watcher for that event.

  Specifically, it has to print "FREE\n" in any of the following four
  conditions:

    1. After a final event, such as EVFILT_PROC,NOTE_EXIT

    2. After any event that had been registered with EV_ONESHOT

    3. After the user has submitted EV_SET(..., EV_DELETE,...) for it
       via kevent()

    4. After calling close(fd) on a filehandle that has been registered
       under EVFILT_READ or EVFILT_WRITE
---

I am claiming that such a program cannot be written using the current
kqueue interface, if user code is simply allowed to call EV_SET however
it likes and put its own pointers in it. If I read your assertion of
triviality correctly, you are claiming that such a program is indeed
possible. I would therefore invite you to demonstrate such a program
for me.


If this does indeed prove to be impossible, I would instead like you to
demonstrate a program having all the above properties, but where you
are allowed to wrap the kqueue API arbitrarily: store extra data in my
structures, or hook extra information around EV_SET calls.

I have already demonstrated -a- way to solve this, by storing data in
the event udata structure to answer 2, and storing a full mapping from
ident+filter to udata pointer, to answer 3. I declare 1 trivial by
inspection of the results in the returned kevent. I declare 4 to be
impossible short of such hackery as LD_PRELOAD around the actual close()
libc function.
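
For concreteness, the ident+filter mapping might be sketched roughly as
below. This is only an illustration: struct watch, watch_remember() and
watch_forget() are hypothetical names of my own, not part of kqueue, and
a linked list stands in for whatever lookup structure a real wrapper
would use. The wrapper would call watch_remember() alongside EV_ADD and
watch_forget() alongside EV_DELETE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userland registry: one entry per (ident, filter) pair
 * that has been handed to the kernel. Not part of the kqueue API. */
struct watch {
  uintptr_t     ident;   /* same value as kevent.ident */
  short         filter;  /* same value as kevent.filter */
  void         *udata;   /* same pointer as kevent.udata */
  struct watch *next;
};

static struct watch *watches = NULL;

/* Record a watcher when it is registered with EV_ADD.
 * Returns 0 on success, -1 if out of memory. */
static int watch_remember(uintptr_t ident, short filter, void *udata)
{
  struct watch *w;

  /* NULL is reserved as the "not found" result of watch_forget() */
  assert(udata != NULL);

  w = malloc(sizeof *w);
  if(!w)
    return -1;
  w->ident  = ident;
  w->filter = filter;
  w->udata  = udata;
  w->next   = watches;
  watches   = w;
  return 0;
}

/* Drop a watcher when it is removed with EV_DELETE. Returns the udata
 * pointer so the caller can free it, or NULL if no entry matches. */
static void *watch_forget(uintptr_t ident, short filter)
{
  struct watch **p;

  for(p = &watches; *p; p = &(*p)->next) {
    if((*p)->ident == ident && (*p)->filter == filter) {
      struct watch *w = *p;
      void *udata = w->udata;
      *p = w->next;
      free(w);
      return udata;
    }
  }
  return NULL;
}
```

Nothing in this sketch ever hears about a close(fd), which is why
condition 4 stays out of reach.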

In short, I claim that a solution to all parts 1-4 is impossible. It is
possible to solve 1-3 only, by storing a full mapping from ident+filter
to udata pointer, in userland. But then by doing that why bother giving
the pointer to the kernel in the first place?


There is a further complication with problem 1, however, for a wrapping
library that tries to provide a generic interface around kqueue. Right
now, the following function could be said to implement problem 1:

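 #include <sys/types.h>
 #include <sys/event.h>

 /* Returns nonzero if this is the last event the watcher will deliver */
 int is_final(struct kevent *ev)
 {
   switch(ev->filter) {
     case EVFILT_READ:
     case EVFILT_WRITE:
       return (ev->flags & EV_EOF) != 0;
     case EVFILT_VNODE:
       /* I'm only guessing on this one from reading the docs, I'm not
        * 100% sure */
       return (ev->fflags & (NOTE_DELETE|NOTE_REVOKE)) != 0;
     case EVFILT_PROC:
       /* compare against 0 so that a high flag bit cannot be lost in
        * the conversion to int */
       return (ev->fflags & NOTE_EXIT) != 0;
     default:
       return 0;
   }
 }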

And in fact even this code isn't perfect, because the kqueue(2) manpage
also points out that EV_EOF on a pipe/fifo isn't final: you can pass
EV_CLEAR to reset the EOF condition and wait again. So maybe this code
ought to read:

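   case EVFILT_READ:
   case EVFILT_WRITE:
     {
       struct stat st;  /* needs <sys/stat.h> */
       /* if fstat() fails, fall back to treating EV_EOF as final */
       if(fstat((int)ev->ident, &st) != 0)
         return (ev->flags & EV_EOF) != 0;
       return (ev->flags & EV_EOF) && !S_ISFIFO(st.st_mode);
     }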

And so now we suddenly have to make an fstat() call -every- time we
receive an event on a read/write filter?

OK, well, clearly not. We'd in fact do that once at EV_ADD time, and
store whether it's a FIFO in our extended udata structure, so as to know
if EV_EOF is final. But then we're having to use that udata structure to
store data internal to the purposes of this kqueue interface, and not
the overall user data.
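
A minimal sketch of that once-at-EV_ADD-time check, assuming a
hypothetical wrapper structure (struct watch_data and watch_data_new()
are made-up names for illustration, not part of kqueue):

```c
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical wrapper that would be stored in kevent.udata in place
 * of the caller's raw pointer. */
struct watch_data {
  int   eof_is_final;  /* 0 for a FIFO, where EV_EOF can be reset */
  void *user;          /* the caller's own data */
};

/* Call once, before EV_ADD on an EVFILT_READ/EVFILT_WRITE watcher.
 * Returns NULL if fstat() or malloc() fails. */
static struct watch_data *watch_data_new(int fd, void *user)
{
  struct stat st;
  struct watch_data *wd;

  if(fstat(fd, &st) != 0)
    return NULL;

  wd = malloc(sizeof *wd);
  if(!wd)
    return NULL;

  wd->eof_is_final = !S_ISFIFO(st.st_mode);
  wd->user = user;
  return wd;
}
```

is_final() could then test ((struct watch_data *)ev->udata)->eof_is_final
instead of calling fstat() on every event, at the cost of exactly the
bookkeeping complained about above.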


Are you still going to claim to me that this is trivial?


Please compare this solution to:

   if(ev->flags & EV_FREEWATCH)
     free(ev->udata);

I would call that solution "trivial". And I claim it would be fairly
easy to implement.

-- 
Paul "LeoNerd" Evans

leonerd at leonerd.org.uk
ICQ# 4135350       |  Registered Linux# 179460
http://www.leonerd.org.uk/