git: 1ecbc1d8e9d3 - main - cxgbe tom: Don't queue AIO requests on listen sockets.

John Baldwin jhb at FreeBSD.org
Wed Sep 15 17:32:25 UTC 2021


On 9/15/21 8:47 AM, Alan Somers wrote:
> On Wed, Sep 15, 2021 at 9:21 AM John Baldwin <jhb at freebsd.org> wrote:
> 
>> On 9/14/21 1:53 PM, Alan Somers wrote:
>>> On Tue, Sep 14, 2021 at 2:46 PM John Baldwin <jhb at freebsd.org> wrote:
>>>
>>>> The branch main has been updated by jhb:
>>>>
>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=1ecbc1d8e9d3fbcd8e68fc68f0a32944a12ddb1e
>>>>
>>>> commit 1ecbc1d8e9d3fbcd8e68fc68f0a32944a12ddb1e
>>>> Author:     John Baldwin <jhb at FreeBSD.org>
>>>> AuthorDate: 2021-09-14 20:46:14 +0000
>>>> Commit:     John Baldwin <jhb at FreeBSD.org>
>>>> CommitDate: 2021-09-14 20:46:14 +0000
>>>>
>>>>       cxgbe tom: Don't queue AIO requests on listen sockets.
>>>>
>>>>       This is similar to the fixes in 141fe2dceeae.  One difference is that
>>>>       TOE sockets do not change states (listen vs non-listen) once created,
>>>>       so no lock is needed for SOLISTENING().
>>>>
>>>>       Sponsored by:   Chelsio Communications
>>>>
>>>
>>> I've always wondered: what's the point to using AIO with sockets?  Can't
>>> everything socket-related be done better with non-blocking read/write and
>>> kqueue?
>>
>> Zero-copy operation with TOE is why TOE uses AIO.  Zero-copy of user buffers
>> can't really work with the non-AIO APIs because the user buffer is free to
>> be reused immediately after write(2) (and on the read side you don't know
>> the buffer in advance to allow the NIC to write directly into the user
>> buffer).
>>
>> In theory we could support zero-copy using mb_ext_pgs for aio_write() for
>> the non-TOE case similar to what sendfile() does.
>>
>> --
>> John Baldwin
>>
> 
> Interesting.  Do you know of any common applications that include this
> optimization?  I've been working on the AIO ecosystem for Rust.  It would
> be good to ensure that this use case works, especially if zero-copy ever
> works for non-TOE.

I do not.  To test it, I rely on patches I merged upstream into netperf (the
-a and -A flags).  I believe some FreeBSD downstreams may have proprietary
bits that make use of this.
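
For reference, the buffer-ownership contract discussed above can be sketched
with plain POSIX AIO.  This is only an illustration, not the TOE code path or
the netperf patches: the helper names aio_write_and_wait() and
aio_socket_roundtrip() are made up for the example, and a local socketpair(2)
stands in for a TCP connection.  The point it shows is that the caller must
leave the buffer alone between aio_write() and completion, which is what
leaves room for a zero-copy implementation underneath.

```c
#include <sys/types.h>
#include <sys/socket.h>

#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: queue a single aio_write() on a socket and wait
 * for it to finish.  Until aio_error() stops returning EINPROGRESS the
 * buffer belongs to the request and must not be modified or freed.
 * Returns the number of bytes written, or -1 with errno set.
 */
static ssize_t
aio_write_and_wait(int sock, void *buf, size_t len)
{
	struct aiocb cb;
	const struct aiocb *list[1];
	int error;

	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = sock;
	cb.aio_buf = buf;
	cb.aio_nbytes = len;
	cb.aio_sigevent.sigev_notify = SIGEV_NONE;	/* we poll below */

	if (aio_write(&cb) != 0)
		return (-1);

	/* Sleep until the request completes; retry if interrupted. */
	list[0] = &cb;
	while ((error = aio_error(&cb)) == EINPROGRESS)
		(void)aio_suspend(list, 1, NULL);

	if (error != 0) {
		(void)aio_return(&cb);	/* release the request */
		errno = error;
		return (-1);
	}
	/* Only now may the caller reuse buf. */
	return (aio_return(&cb));
}

/*
 * Hypothetical demo: send a short message over a socketpair with the
 * helper above and read it back.  Returns 0 on success, -1 on failure.
 */
int
aio_socket_roundtrip(void)
{
	char msg[] = "hello";
	char in[sizeof(msg)];
	ssize_t n;
	int sv[2], rv;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return (-1);

	rv = -1;
	n = aio_write_and_wait(sv[0], msg, sizeof(msg));
	if (n == (ssize_t)sizeof(msg)) {
		n = read(sv[1], in, sizeof(in));
		if (n == (ssize_t)sizeof(msg) &&
		    memcmp(msg, in, sizeof(msg)) == 0)
			rv = 0;
	}
	close(sv[0]);
	close(sv[1]);
	return (rv);
}
```

Calling aio_socket_roundtrip() from a test program should return 0.  A real
consumer would more likely use kqueue or another completion notification
instead of blocking in aio_suspend().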

-- 
John Baldwin

