Re: aio_read2() and aio_write2()

From: Alan Somers <asomers_at_freebsd.org>
Date: Wed, 31 Jan 2024 18:19:21 UTC

On Sun, Jan 14, 2024 at 12:07 PM Vinícius dos Santos Oliveira
<vini.ipsmaker@gmail.com> wrote:
>
> On Sun, Jan 14, 2024 at 15:23, Alan Somers
> <asomers@freebsd.org> wrote:
> > I think you're using the term "green threading" too broadly.  Golang
> > uses green threads, but Rust does not.  The difference is that in Rust
> > with async/await, the task-switching boundaries are explicit (the
> > .await syntax).  So Rust uses explicit concurrency, not green
> > threading.  I can't speak to the other languages you mention.
>
> Still, we might have async IO if the implementation permits.
>
> > You propose an extension that would essentially create asynchronous
> > (and racy) versions of read, write, readv, and writev.  But what
> > about copy_file_range and fspacectl?  Or for that matter all the
> > dozens of control-path syscalls like open, stat, chmod, and truncate?
>
> They would block the thread, obviously. Look, I've been playing with
> async IO for most of my career. I'm not asking for esoteric APIs. I
> just want a non-blocking version for read(). What's so hard about
> that? From what I understand from FreeBSD source code, I can already
> "unofficially" do that (offset is ignored if the concrete type is not
> a real file).

Oh, are you not actually concerned about real files?  aio_read and
aio_write already have special handling for sockets.

>
> Very very few OSes actually implement async versions for anything
> beyond the typical read()/write(). Even open() could block. For
> anything beyond read()/write(), you just create a thread and live with
> that. From a userspace perspective, it's expected that filesystem
> operations such as file-move, directory-listing, etc will block the
> thread. It's already expected. However you don't expect that for the
> basic read()/write().
>
> Again: Linux and Windows already allow that and it works fine on them.
>
> And again: I ask why FreeBSD is special here. I've been answering your
> questions, but you've been avoiding my question every time. Why is
> FreeBSD special here? Linux and Windows work just fine with this
> design. Why suddenly does it become special for FreeBSD? It's the very
> same application.

The only sense in which FreeBSD is "special" is that we're better at
finding the best solutions, rather than the quickest and hackiest.
That's why we have kqueue instead of epoll, and ifconfig instead of
ifconfig/iwconfig/wpa_supplicant/ip.

>
> > This flag that you propose is not a panacea that will eliminate all
> > blocking file operations.  There will still be a need for things that
> > block.  Rust (via the Tokio library) still uses a thread pool for such
> > things.  It even uses the thread pool for the equivalent of read() and
> > write() (but not pread and pwrite).
>
> Nothing new here. I use thread pools to perform DNS queries. I allow
> my user to create threads to perform blocking filesystem operations
> (move, directory listing, etc). I know what I'm asking for: a read()
> that won't block. I'm not asking for a competitor to io_uring. I'm
> just asking for a read() that will never block my thread.
>
> > My point is that if you want fully asynchronous file I/O that never
> > blocks you can't achieve that by adding one additional flag to POSIX
> > AIO.
>
> It's just a read() that won't block the thread. Easy.
>
> Do you have concrete points for the design? What needs to change in
> the design for it to become acceptable to you? What are the
> criteria? If the implementation fulfills all these points, will it be
> acceptable to you?

I would like to see a design that:
* Is extensible to most file system and networking syscalls, even if
  it doesn't include them right now.  At a minimum, it should be able
  to include fspacectl, copy_file_range, truncate, and
  posix_fallocate.  Probably open too.
* Is reviewed by kib and Thomas Munro.
* Has completion notification delivered by kqueue.
* Is race-resistant.

>
> > Instead, all operations would
> > either specify the offset (as with pwrite, pread) or operate only at
> > EoF as if O_APPEND were used.
>
> I strongly disagree here. Async APIs should just achieve the same
> semantics one *already* has when it creates threads and performs
> blocking calls. Do *not* create new semantics. The initial patch
> follows this principle. Your proposal does not.

Shared state between asynchronous tasks is fundamentally racy if
unsynchronized.  And if synchronized, it fundamentally imposes a
performance cost.  I even recall reading a paper demonstrating that
the need to assign file descriptors sequentially necessarily created
shared state between threads.  The authors were able to improve the
performance of open() by assigning file descriptors using some kind of
thread-local data structure instead, and that way could open() millions
of files per second.  That's what a good asynchronous API looks like:
it's resistant to races without requiring extra synchronization.
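
A minimal sketch of the explicit-offset idea, written in Python only for
brevity.  It assumes a POSIX system where os.pread and os.pwrite are
available; the helper names (write_block, read_block, run_demo) and the
block size are made up for illustration and do not come from any patch.
Each worker names its own offset, so no worker depends on the
descriptor's shared file position and no lock is needed around it.

```python
import os
import tempfile
import threading

BLOCK = 4096        # illustrative block size, not from the thread
NUM_WORKERS = 8     # illustrative worker count


def write_block(fd, index):
    # Explicit offset: the descriptor's shared file position is
    # neither read nor modified by pwrite.
    data = bytes([ord("A") + index]) * BLOCK
    os.pwrite(fd, data, index * BLOCK)


def read_block(fd, index, results):
    # Same for pread: the offset travels with the request.
    results[index] = os.pread(fd, BLOCK, index * BLOCK)


def run_in_threads(target, arg_lists):
    threads = [threading.Thread(target=target, args=args) for args in arg_lists]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def run_demo():
    fd, path = tempfile.mkstemp()
    try:
        run_in_threads(write_block, [(fd, i) for i in range(NUM_WORKERS)])
        results = [None] * NUM_WORKERS
        run_in_threads(read_block, [(fd, i, results) for i in range(NUM_WORKERS)])
        # Query the shared file position without moving it.
        position = os.lseek(fd, 0, os.SEEK_CUR)
        return results, position
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    blocks, position = run_demo()
    print("blocks read:", len(blocks), "shared offset:", position)
```

With plain read() and write() the same workers would all advance one
shared offset, so which bytes each of them sees would depend on
scheduling unless the callers add their own synchronization.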

>
>
> --
> Vinícius dos Santos Oliveira
> https://vinipsmaker.github.io/