kevent behavior
Konstantin Belousov
kostikbel at gmail.com
Fri Mar 27 13:27:01 UTC 2015
On Thu, Mar 26, 2015 at 11:58:27PM +0100, Jilles Tjoelker wrote:
> On Thu, Mar 26, 2015 at 11:45:51PM +0200, Konstantin Belousov wrote:
> > On Wed, Mar 25, 2015 at 11:35:30PM +0100, Jilles Tjoelker wrote:
>
> > > POSIX does not say anything about kevent(), including whether it should
> > > be a cancellation point or not. Given that kevent() may block for a long
> > > time, making it a cancellation point seems to make sense.
>
> > But POSIX language is very explicit for its own set of interfaces, and
> > that makes sense. This is why I think that cancellability, after the
> > 15+ years of kqueue existence, should be opt in.
>
> Hmm, I think most code written is cancel-unsafe. It is unlikely to have
> cancel-safe code using kevent() and relying on it not being a
> cancellation point, but still possible.
Ok, I re-considered my opinion after re-reading your responses.
>
> This also means we shouldn't wait too long with adding missing
> cancellation points like ppoll().
>
> > > The kevent() cancellation point implementation would be much like most
> > > other cancellation points such as pselect(), with the difference that
> > > the call may have committed even if it fails, in the case that
> > > nchanges > nevents and in the case that nchanges > 0 && errno == EINTR.
> > > If cancellation is requested while blocking in the latter case, libthr
> > > will have to return -1/EINTR to indicate to the application that the
> > > changes were applied successfully while allowing the thread to be
> > > cancelled at the next cancellation point, even though there may not be
> > > any signal handler.
I do not quite understand this.
If any error (except things like EFAULT) occured while processing the
changelist, kevent(2) returns error count, and not the -1, regarless of
the length of eventlist. In this case, we do not cancel.
Syscall cannot return n/EINTR, where n >= 0.
If -1/EINTR is returned, this means that the call blocked and sleep
was interrupted. The changes from the changelist were applied,
do you suggested that we must not cancel if nchanges > 0 ?
>
> > > If nevents == 0, kevent() does not block and need not be a cancellation
> > > point. This special case may simplify things slightly for application
> > > programmers.
>
> > > A kqueue() flag for cancellation does not seem very useful: since such a
> > > flag would be a kernel thing and pthread cancellation is a libthr thing,
> > > requesting the flag from the kernel would slow down all kevent() calls.
>
> > Yes, I thought about adding some (small) userspace table which would
> > track cancellable kqueues.
>
> I think the interaction with dup() and similar calls and with exec makes
> this too nasty.
Hm, yes.
>
> > But if we make the cancellation state per-call, and not a state for kqueue
> > descriptor, we might indicate the cancellability by some event type, which
> > is not passed to the kernel at all. The libthr wrapper for kevent(2) would
> > need to scan the changelist and zero-out the cancellation indicator.
>
> This seems a bit hackish. However, enabling cancellation per-call seems
> better than per-kqueue.
Patch below always considers kevent(2) as cancellable, unless nevents == 0.
I will correct the call to thr_cancel_leave() if I misunderstood you.
diff --git a/lib/libc/include/libc_private.h b/lib/libc/include/libc_private.h
index e4bf4a6..ceaa2a3 100644
--- a/lib/libc/include/libc_private.h
+++ b/lib/libc/include/libc_private.h
@@ -221,6 +221,7 @@ enum {
INTERPOS__pthread_mutex_init_calloc_cb,
INTERPOS_spinlock,
INTERPOS_spinunlock,
+ INTERPOS_kevent,
INTERPOS_MAX
};
@@ -293,6 +294,7 @@ void * __sys_freebsd6_mmap(void *, __size_t, int, int, int, int, __off_t);
struct aiocb;
struct fd_set;
struct iovec;
+struct kevent;
struct msghdr;
struct pollfd;
struct rusage;
@@ -315,6 +317,8 @@ int __sys_fsync(int);
__pid_t __sys_fork(void);
int __sys_ftruncate(int, __off_t);
int __sys_gettimeofday(struct timeval *, struct timezone *);
+int __sys_kevent(int, const struct kevent *, int, struct kevent *,
+ int, const struct timespec *);
__off_t __sys_lseek(int, __off_t, int);
void *__sys_mmap(void *, __size_t, int, int, int, __off_t);
int __sys_msync(void *, __size_t, int);
diff --git a/lib/libc/sys/Makefile.inc b/lib/libc/sys/Makefile.inc
index 0edf644b..7745b2a 100644
--- a/lib/libc/sys/Makefile.inc
+++ b/lib/libc/sys/Makefile.inc
@@ -51,6 +51,7 @@ INTERPOSED = \
fcntl \
fsync \
fork \
+ kevent \
msync \
nanosleep \
open \
diff --git a/lib/libc/sys/interposing_table.c b/lib/libc/sys/interposing_table.c
index 0fd6c75..4290bc6 100644
--- a/lib/libc/sys/interposing_table.c
+++ b/lib/libc/sys/interposing_table.c
@@ -75,6 +75,7 @@ interpos_func_t __libc_interposing[INTERPOS_MAX] = {
SLOT(_pthread_mutex_init_calloc_cb, _pthread_mutex_init_calloc_cb_stub),
SLOT(spinlock, __libc_spinlock_stub),
SLOT(spinunlock, __libc_spinunlock_stub),
+ SLOT(kevent, __sys_kevent),
};
#undef SLOT
diff --git a/lib/libc/sys/kevent.c b/lib/libc/sys/kevent.c
new file mode 100644
index 0000000..5f84ef8
--- /dev/null
+++ b/lib/libc/sys/kevent.c
@@ -0,0 +1,53 @@
+/*
+ * Copyright (c) 2015 The FreeBSD Foundation.
+ * All rights reserved.
+ *
+ * Portions of this software were developed by Konstantin Belousov
+ * under sponsorship from the FreeBSD Foundation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice(s), this list of conditions and the following disclaimer as
+ * the first lines of this file unmodified other than the possible
+ * addition of one or more copyright notices.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice(s), this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <sys/types.h>
+#include <sys/event.h>
+#include <sys/time.h>
+#include "libc_private.h"
+
+__weak_reference(__sys_kevent, __kevent);
+
+#pragma weak kevent
+int
+kevent(int kq, const struct kevent *changelist, int nchanges,
+ struct kevent *eventlist, int nevents, const struct timespec *timeout)
+{
+
+ return (((int (*)(int, const struct kevent *, int,
+ struct kevent *, int, const struct timespec *))
+ __libc_interposing[INTERPOS_kevent])(kq, changelist, nchanges,
+ eventlist, nevents, timeout));
+}
diff --git a/lib/libc/sys/kqueue.2 b/lib/libc/sys/kqueue.2
index 93223f1..fa76c54 100644
--- a/lib/libc/sys/kqueue.2
+++ b/lib/libc/sys/kqueue.2
@@ -24,7 +24,7 @@
.\"
.\" $FreeBSD$
.\"
-.Dd July 18, 2014
+.Dd March 27, 2015
.Dt KQUEUE 2
.Os
.Sh NAME
@@ -41,7 +41,7 @@
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
-.Fn EV_SET "&kev" ident filter flags fflags data udata
+.Fn EV_SET "kev" ident filter flags fflags data udata
.Sh DESCRIPTION
The
.Fn kqueue
@@ -550,6 +550,16 @@ On return,
.Va fflags
contains the users defined flags in the lower 24 bits.
.El
+.Sh CANCELLATION BEHAVIOUR
+If
+.Fa nevents
+is non-zero, i.e. the function is potentially blocking, the call
+is the cancellation point.
+Otherwise, i.e. if
+.Fa nevents
+is zero, the call is not cancellable.
+Cancellation can only occur when the call was blocked and no changes
+to the queue were requested.
.Sh RETURN VALUES
The
.Fn kqueue
diff --git a/lib/libthr/thread/thr_syscalls.c b/lib/libthr/thread/thr_syscalls.c
index 10fbad4..e71bf4a 100644
--- a/lib/libthr/thread/thr_syscalls.c
+++ b/lib/libthr/thread/thr_syscalls.c
@@ -341,6 +341,29 @@ __thr_pselect(int count, fd_set *rfds, fd_set *wfds, fd_set *efds,
return (ret);
}
+static int
+__thr_kevent(int kq, const struct kevent *changelist, int nchanges,
+ struct kevent *eventlist, int nevents, const struct timespec *timeout)
+{
+ struct pthread *curthread;
+ int ret;
+
+ if (nevents == 0) {
+ /*
+ * No blocking, do not make the call cancellable.
+ */
+ return (__sys_kevent(kq, changelist, nchanges, eventlist,
+ nevents, timeout));
+ }
+ curthread = _get_curthread();
+ _thr_cancel_enter(curthread);
+ ret = __sys_kevent(kq, changelist, nchanges, eventlist, nevents,
+ timeout);
+ _thr_cancel_leave(curthread, ret == -1 && nchanges == 0);
+
+ return (ret);
+}
+
/*
* Cancellation behavior:
* Thread may be canceled at start, but if the system call got some data,
@@ -599,6 +622,7 @@ __thr_interpose_libc(void)
SLOT(writev);
SLOT(spinlock);
SLOT(spinunlock);
+ SLOT(kevent);
#undef SLOT
*(__libc_interposing_slot(
INTERPOS__pthread_mutex_init_calloc_cb)) =
More information about the freebsd-hackers
mailing list