kern/181416: socket timeout rounding issue

Bruce Evans brde at optusnet.com.au
Tue Aug 20 10:11:42 UTC 2013


On Tue, 20 Aug 2013, Vitja Makarov wrote:

>> Description:
> Recently I was playing with small socket timeouts. setsockopt(2)
> SO_RCVTIMEO and found a problem with it: if timeout is small enough
> read(2) may return before timeout is actually expired.
>
> I was unable to reproduce this on linux box.
>
> I found that kernel uses a timer with 1/HZ precision so it converts
> time in microseconds to ticks that's ok linux does it as well. The
> problem is in details: freebsd uses floor() approach while linux uses
> ceil():
>
> from FreeBSD's sys/kern/uipc_socket.c:
> val = (u_long)(tv.tv_sec * hz) + tv.tv_usec / tick;

This is actually an off-by-2 error in most cases.  ceil() isn't high enough
either, since for example with hz = 100 and tv = 25 msec, the ceil() of 3
ticks is 2 full ticks plus a fractional tick which may be only 1 nsec long.
At least with old timeout code.
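
To make the arithmetic concrete, here is a minimal sketch in plain C.  The
helper names are made up for illustration and none of this is kernel code.
It assumes the old timeout behaviour described above, where the tick in
progress when the timeout is armed may be almost over, so a timeout of n
ticks only guarantees a sleep of just over n - 1 full ticks.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not kernel code. */

/* Round down, as the quoted uipc_socket.c expression does for tv_usec. */
static long
ticks_floor(long usec, long tick_usec)
{
	assert(usec >= 0 && tick_usec > 0);
	return (usec / tick_usec);
}

/* Round up to the next whole tick. */
static long
ticks_ceil(long usec, long tick_usec)
{
	assert(usec >= 0 && tick_usec > 0);
	return ((usec + tick_usec - 1) / tick_usec);
}

/*
 * Lower bound on the sleep for a timeout of nticks, assuming the first
 * tick may be arbitrarily short (the actual sleep is slightly longer).
 */
static long
min_wait_usec(long nticks, long tick_usec)
{
	assert(tick_usec > 0);
	return (nticks > 0 ? (nticks - 1) * tick_usec : 0);
}
```

With hz = 100 (tick_usec = 10000) and a 25000 usec timeout, rounding down
gives 2 ticks (guaranteed sleep just over 10 msec), rounding up gives 3
ticks (just over 20 msec), and only rounding up plus 1, i.e. 4 ticks,
guarantees at least the requested 25 msec.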

> if (val == 0 && tv.tv_usec != 0)
>     val = 1; /* at least one tick if tv > 0 */

This does the ceil() in the special case where tv < 1 tick.  This is a
waste of timeout, at least with old timeout code, since callout_reset()
used to add 1.  That addition seems to have been lost, breaking old callers
that depended on it.  Current timeout code tries to be more accurate, but
that means that it is less accurate if the caller is broken and rounds down.
Maybe your bug can only be seen with the increased accuracy.

tvtohz() should always be used to convert timevals to ticks.  It rounds
up and adds 1, and handles overflow.  The conversion in uipc_socket.c
isn't even short.  It takes 15 lines for its own overflow handling.  It
seems to check the SHRT_MAX limit twice.
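
For illustration, a conversion with the properties listed above (round up,
add 1, saturate on overflow) could look roughly like the following.  This
is a simplified sketch and not the FreeBSD tvtohz() source.  The name
sketch_tvtohz, its plain (sec, usec, hz) arguments, and the choice to
saturate at INT_MAX are assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-in for a tvtohz()-style conversion; NOT the FreeBSD
 * implementation.  sec and usec are the timeval fields, hz is the number
 * of ticks per second.
 */
static int
sketch_tvtohz(long sec, long usec, int hz)
{
	long long tick_usec, ticks;

	assert(hz > 0 && hz <= 1000000);
	assert(sec >= 0 && usec >= 0 && usec < 1000000);

	tick_usec = 1000000 / hz;

	/* Saturate early so that sec * hz below cannot overflow. */
	if (sec > INT_MAX / hz)
		return (INT_MAX);

	/* Whole seconds, plus usec rounded up to ticks, plus 1 extra tick. */
	ticks = (long long)sec * hz + (usec + tick_usec - 1) / tick_usec + 1;

	return (ticks > INT_MAX ? INT_MAX : (int)ticks);
}
```

With hz = 100, a 25 msec timeval becomes 4 ticks here, compared with the 2
ticks produced by the rounding-down expression quoted above.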

If uipc_socket.c called tvtohz(), then it would still have to check
that the result fits in a short.  Its error handling when it doesn't
fit seems wrong.  EDOM is documented as a domain error for math
software.  setsockopt() isn't math software, and EDOM isn't a documented
errno for it.  EINVAL and EOVERFLOW are more usual kernel errors for
unrepresentable values.
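
A range check along those lines might be sketched as follows.  The helper
name and signature are hypothetical, and returning EINVAL rather than
EOVERFLOW is just one of the two options mentioned above, picked for the
example.

```c
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical helper: store a tick count in a short-sized timeout field.
 * Returns 0 on success or an errno value if the count does not fit.
 */
static int
store_timeout_ticks(int ticks, short *out)
{
	if (ticks < 0 || ticks > SHRT_MAX)
		return (EINVAL);	/* assumption: EINVAL instead of EDOM */
	*out = (short)ticks;
	return (0);
}
```

On failure the destination is left unchanged, so the previous timeout
stays in effect and the caller gets an explicit error.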

Grepping for ' / tick' in /sys shows no other home-made tvtohz()s.

> from Linux's net/core/sock.c:
> *timeo_p = tv.tv_sec*HZ + (tv.tv_usec+(1000000/HZ-1))/(1000000/HZ);

The conversion is much simpler when HZ is hard-coded.  Linux has some
bounds checking before this, but the error handling in at least
Linux-2.6.10 is to ignore invalid tv's and return success without
changing the timeout.

> So, for instance, we have a freebsd system running with kern.hz set to
> 100 and set receive timeout to 25ms that is converted to 2 ticks which
> is 20ms. In my test program read(2) returns with EAGAIN set in
> 0.019ms.

Bruce


