IPv6 -> IPv4 fallback broken in serf, kernel bug?
Don Lewis
truckman at FreeBSD.org
Wed Jul 27 07:42:57 UTC 2016
On 26 Jul, Karl Denninger wrote:
> On 7/26/2016 10:59, Don Lewis wrote:
>> Serf has some code to fall back from IPv6 to IPv4, and more generally
>> to try different addresses on multi-homed servers if connection attempts
>> fail, but it does not work properly on recent versions of FreeBSD. I've
>> tested both recent FreeBSD 10.3-STABLE and HEAD.
>>
>> The way that it is supposed to work is that serf creates a socket, sets
>> it non-blocking, calls connect(), and then passes the fd to poll(). When
>> the connection attempt fails, it expects to see a POLLERR event. The
>> POLLERR event handler will then call getsockopt(fd, SOL_SOCKET,
>> SO_ERROR, &error, ...). If the returned error is ECONNREFUSED or one of
>> a couple of other errors, then serf will move on to the next address.
>>
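The intended sequence described above can be sketched in C roughly as
follows. This is only an illustration of the technique, not serf's
actual code: should_try_next(), try_connect(), and connect_any() are
hypothetical names, and the set of retryable errors (ECONNREFUSED,
ENETUNREACH, EHOSTUNREACH) is an assumption based on the errors named in
this thread. The sketch reads SO_ERROR as soon as poll() reports
anything on the socket instead of calling read() first, so it does not
depend on which of POLLIN, POLLERR, or POLLHUP gets set.

```c
#define _POSIX_C_SOURCE 200112L  /* for getaddrinfo() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: errors after which the next address is worth trying.
 * The exact set that serf uses is an assumption here. */
static int should_try_next(int err)
{
    return err == ECONNREFUSED || err == ENETUNREACH || err == EHOSTUNREACH;
}

/* Non-blocking connect to a single address.
 * Returns 0 and stores the fd on success, otherwise an errno value. */
static int try_connect(const struct addrinfo *ai, int timeout_ms, int *fd_out)
{
    assert(fd_out != NULL);

    int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
    if (fd < 0)
        return errno;

    int fl = fcntl(fd, F_GETFL, 0);
    if (fl < 0 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0) {
        int e = errno;
        close(fd);
        return e;
    }

    int err = 0;
    if (connect(fd, ai->ai_addr, ai->ai_addrlen) != 0) {
        if (errno != EINPROGRESS) {
            err = errno;                 /* failed immediately */
        } else {
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            int n = poll(&pfd, 1, timeout_ms);
            if (n == 0) {
                err = ETIMEDOUT;
            } else if (n < 0) {
                err = errno;
            } else {
                /* Whatever events fired, ask the socket for the result
                 * of the connection attempt before doing any I/O. */
                socklen_t len = sizeof(err);
                if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) != 0)
                    err = errno;
            }
        }
    }

    if (err != 0) {
        close(fd);
        return err;
    }
    *fd_out = fd;
    return 0;
}

/* Walk the getaddrinfo() results, moving on only for retryable errors. */
static int connect_any(const char *host, const char *port,
                       int timeout_ms, int *fd_out)
{
    struct addrinfo hints, *res, *ai;
    int err = EHOSTUNREACH;

    *fd_out = -1;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;         /* both IPv6 and IPv4 */
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return EHOSTUNREACH;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        err = try_connect(ai, timeout_ms, fd_out);
        if (err == 0 || !should_try_next(err))
            break;
    }
    freeaddrinfo(res);
    return err;
}
```

A caller would use something like connect_any(host, "80", 500, &fd) and
treat a nonzero return as the errno of the last address tried.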
>> Instead what happens is that serf also(?) sees POLLIN set, which it
>> processes first by calling read(), which returns an ECONNREFUSED error.
>> That is not a documented error return from read().
>>
>> An easy way to test this is to truss svn and attempt to do an http
>> checkout from a host that has both IPv6 and IPv4 addresses, but is not
>> listening on port 80. The only connection attempt will be to the IPv6
>> address.
>>
>> socket(PF_INET6,SOCK_STREAM|SOCK_CLOEXEC,6) = 4 (0x4)
>> fcntl(4,F_GETFL,) = 2 (0x2)
>> fcntl(4,F_SETFL,O_NONBLOCK|0x2) = 0 (0x0)
>> setsockopt(0x4,0x6,0x1,0x7fffffffdda4,0x4) = 0 (0x0)
>> gettimeofday({ 1469515046.979461 },0x0) = 0 (0x0)
>> connect(4,{ AF_INET6 [xxxx:xxxx:xxxx:xxxx::xxxx]:80 },28) ERR#36 'Operation now in progress'
>> gettimeofday({ 1469515046.979614 },0x0) = 0 (0x0)
>> kevent(3,{ 4,EVFILT_READ,EV_ADD,0x0,0x0,0x805491300 },1,0x0,0,0x0) = 0 (0x0)
>> kevent(3,{ 4,EVFILT_WRITE,EV_ADD,0x0,0x0,0x805491300 },1,0x0,0,0x0) = 0 (0x0)
>> kevent(3,0x0,0,{ 4,EVFILT_READ,EV_EOF,NOTE_LOWAT|0x3c,0x0,0x805491300 4,EVFILT_WRITE,EV_EOF,NOTE_LOWAT|0x3c,0x8000,0x805491300 },32,{ 0.500000000 }) = 2 (0x2)
>> read(4,0x80549c064,8000) ERR#61 'Connection refused'
>> kevent(3,{ 4,EVFILT_READ,EV_DELETE,0x0,0x0,0x0 },1,0x0,0,0x0) = 0 (0x0)
>> kevent(3,{ 4,EVFILT_WRITE,EV_DELETE,0x0,0x0,0x0 },1,0x0,0,0x0) = 0 (0x0)
>> kevent(3,{ 4,EVFILT_READ,EV_DELETE,0x0,0x0,0x0 },1,0x0,0,0x0) ERR#2 'No such file or directory'
>> kevent(3,{ 4,EVFILT_WRITE,EV_DELETE,0x0,0x0,0x0 },1,0x0,0,0x0) ERR#2 'No such file or directory'
>> close(4) = 0 (0x0)
>> close(3) = 0 (0x0)
>> svn: E170013: Unable to connect to a repository at URL ...
>>
>>
>> It looks like it should be possible to patch serf to handle this, but:
>> * Should POLLIN be set for this event?
>>
>> * What errno value should read() return in this case? If it is
>> ECONNREFUSED, then that should be documented.
>>
>>
> This is kinda serious in that the above manifestation in svn effectively
> disables it for those of us that are on IPv4 connections and have no
> provider capability for IPv6 at the present time. When I was running
> 10.2 this was not a problem but as soon as I rolled forward to 11.x it
> showed up.
Try the following apr patch. It works for me with svn, but I'm getting
a crash in another application that uses apr.
--- apr-1.5.2/poll/unix/kqueue.c.orig 2015-03-20 01:34:07 UTC
+++ apr-1.5.2/poll/unix/kqueue.c
@@ -25,21 +25,40 @@
#ifdef HAVE_KQUEUE
-static apr_int16_t get_kqueue_revent(apr_int16_t event, apr_int16_t flags)
+static apr_int16_t get_kqueue_revent(apr_int16_t event, apr_int16_t flags,
+ int fflags, intptr_t data)
{
apr_int16_t rv = 0;
- if (event == EVFILT_READ)
- rv |= APR_POLLIN;
- else if (event == EVFILT_WRITE)
- rv |= APR_POLLOUT;
- if (flags & EV_EOF)
- rv |= APR_POLLHUP;
- /* APR_POLLPRI, APR_POLLERR, and APR_POLLNVAL are not handled by this
- * implementation.
+ /* APR_POLLPRI and APR_POLLNVAL are not handled by this implementation.
* TODO: See if EV_ERROR + certain system errors in the returned data field
* should map to APR_POLLNVAL.
*/
+ if (event == EVFILT_READ) {
+ if (data > 0 || fflags == 0)
+ rv |= APR_POLLIN;
+ else
+ rv |= APR_POLLERR;
+ /*
+ * Don't return POLLHUP if connect fails. Apparently Linux
+ * does not, and this is expected by serf in order for IPv6 to
+ * IPv4 or multihomed host fallback to work.
+ *
+ * ETIMEDOUT is ambiguous here since we don't know if a
+ * connection was established. We don't want to return
+ * POLLHUP here if the connection attempt timed out, but
+ * we do if the connection was successful but later dropped.
+ * For now, favor the latter.
+ */
+ if ((flags & EV_EOF) != 0 && fflags != ECONNREFUSED &&
+ fflags != ENETUNREACH && fflags != EHOSTUNREACH)
+ rv |= APR_POLLHUP;
+ } else if (event == EVFILT_WRITE) {
+ if (data > 0 || fflags == 0)
+ rv |= APR_POLLOUT;
+ else
+ rv |= APR_POLLERR;
+ }
return rv;
}
@@ -290,7 +309,9 @@ static apr_status_t impl_pollset_poll(ap
pollset->p->result_set[j] = fd;
pollset->p->result_set[j].rtnevents =
get_kqueue_revent(pollset->p->ke_set[i].filter,
- pollset->p->ke_set[i].flags);
+ pollset->p->ke_set[i].flags,
+ pollset->p->ke_set[i].fflags,
+ pollset->p->ke_set[i].data);
j++;
}
}
@@ -471,7 +492,9 @@ static apr_status_t impl_pollcb_poll(apr
apr_pollfd_t *pollfd = (apr_pollfd_t *)(pollcb->pollset.ke[i].udata);
pollfd->rtnevents = get_kqueue_revent(pollcb->pollset.ke[i].filter,
- pollcb->pollset.ke[i].flags);
+ pollcb->pollset.ke[i].flags,
+ pollcb->pollset.ke[i].fflags,
+ pollcb->pollset.ke[i].data);
rv = func(baton, pollfd);
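
As a rough sanity check, the mapping in the patched get_kqueue_revent()
can be replayed against the two events in the truss output above. The
sketch below is a standalone copy of that logic, not the APR source: all
SK_* constants are stand-ins so that it builds on any platform. The
filter, EV_EOF, and errno values are assumed to match the FreeBSD
headers, and the SK_POLL* bits are arbitrary. If I am reading the trace
correctly, truss prints fflags as NOTE_LOWAT|0x3c, which is 0x1|0x3c =
61, the same number as the ERR#61 'Connection refused' from read().

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in constants; values assumed to match FreeBSD's headers. */
enum { SK_EVFILT_READ = -1, SK_EVFILT_WRITE = -2 };
enum { SK_EV_EOF = 0x8000 };
enum { SK_ENETUNREACH = 51, SK_ECONNREFUSED = 61, SK_EHOSTUNREACH = 65 };

/* Arbitrary bits standing in for the APR_POLL* flags. */
enum { SK_POLLIN = 0x01, SK_POLLOUT = 0x04, SK_POLLERR = 0x10, SK_POLLHUP = 0x20 };

/* Same decision logic as the patched get_kqueue_revent(). */
static int sketch_revent(int filter, unsigned flags, int fflags, intptr_t data)
{
    int rv = 0;

    if (filter == SK_EVFILT_READ) {
        if (data > 0 || fflags == 0)
            rv |= SK_POLLIN;
        else
            rv |= SK_POLLERR;
        /* No POLLHUP when the connection attempt itself failed. */
        if ((flags & SK_EV_EOF) != 0 && fflags != SK_ECONNREFUSED &&
            fflags != SK_ENETUNREACH && fflags != SK_EHOSTUNREACH)
            rv |= SK_POLLHUP;
    } else if (filter == SK_EVFILT_WRITE) {
        if (data > 0 || fflags == 0)
            rv |= SK_POLLOUT;
        else
            rv |= SK_POLLERR;
    }
    return rv;
}

/* The two events from the trace: { filter, flags, fflags, data }. */
static int trace_read_event(void)
{
    return sketch_revent(SK_EVFILT_READ, SK_EV_EOF, 0x1 | 0x3c, 0x0);
}

static int trace_write_event(void)
{
    return sketch_revent(SK_EVFILT_WRITE, SK_EV_EOF, 0x1 | 0x3c, 0x8000);
}
```

Under these assumptions the read event maps to the error bit alone,
with no POLLIN and no POLLHUP, so the caller should reach its
SO_ERROR path. The write event still maps to POLLOUT because its data
field is 0x8000, which is greater than zero.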