hacking - aio_sendfile()

Konstantin Belousov kostikbel at gmail.com
Thu Jul 11 09:36:42 UTC 2013


On Thu, Jul 11, 2013 at 01:37:19AM -0700, Adrian Chadd wrote:
> Hiya,
> 
> I'm more interested in the API than the implementation at the moment.
> 
> Yes, you're right - it should eventually be driven using disk io
> completion upcalls which trigger the push of data into the socket
> buffer. I totally agree.
> 
> I'm hacking up some libevent-ish looking thing that uses kqueue and
> wraps aio, read, write, and other event types into something I can
> easily shoehorn this stuff into. I'll then thoroughly test it (and
> other options) out. You're right, it's likely going to end up with a
> whole lot of aio threads sitting there waiting for disk IO to complete
> - and at that point, I'll start hacking at sendfile() to split it into
> two halves and have it driven by a completion call from g_up or
> wherever, triggering the socket write side of things.
> 
> There are some other questions too - like whether the IO completion
> should just queue socket IO (and have it potentially block in the TCP
> code) or whether it should funnel completions into a per-CPU aio
> completion thread which does the socket write bit. That way disk IO
> completion isn't going to be blocked by longer-held locks in the
> networking stack.

No, it is not disk I/O which is problematic there. It is socket I/O,
e.g. the wait for the socket buffer low-water mark in kern_sendfile(),
which causes an unbounded sleep. Look for the sbwait() calls, both in
kern_sendfile() itself and in the pru_send methods of the protocols,
e.g. in sosend_generic(). The length of the wait is controlled by the
other side of the connection, which allows it to completely block the
aio subsystem.

Disk I/O is supposed to finish in finite time.
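
To make the socket-side stall concrete, here is a rough userland sketch,
not kernel code. It assumes a local socketpair() stands in for a TCP
connection whose peer never reads; fill_until_full() is a hypothetical
helper name. With O_NONBLOCK set, send() fails with EAGAIN once the
buffers are full. That is the point where a blocking sender would instead
sleep until the peer drains data, i.e. for as long as the peer likes.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: write to a non-blocking stream socket whose peer
 * never reads, until the kernel refuses to accept more data.
 * Returns the number of bytes accepted (or -1 on setup failure) and
 * stores the errno that ended the run in *err.
 */
static long
fill_until_full(int *err)
{
	int sv[2], flags;
	char chunk[4096];
	long total = 0;
	ssize_t n;

	assert(err != NULL);
	*err = 0;

	/* sv[1] plays the remote side; nothing is ever read from it. */
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
		*err = errno;
		return (-1);
	}

	flags = fcntl(sv[0], F_GETFL, 0);
	if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1) {
		*err = errno;
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}

	memset(chunk, 'x', sizeof(chunk));
	for (;;) {
		n = send(sv[0], chunk, sizeof(chunk), 0);
		if (n == -1) {
			/* Buffers full: a blocking socket would sleep here. */
			*err = errno;
			break;
		}
		total += n;
	}

	close(sv[0]);
	close(sv[1]);
	return (total);
}
```

Called as fill_until_full(&err), this should return a positive byte count
with err set to EAGAIN. The exact count depends on the system's socket
buffer sizes. The refusal is lifted only when the other end reads, which
is why a worker that sleeps at this point can be held indefinitely.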