Re: aio_read2() and aio_write2()

From: Konstantin Belousov <kib_at_freebsd.org>
Date: Thu, 01 Feb 2024 01:19:48 UTC
On Wed, Jan 31, 2024 at 11:19:21AM -0700, Alan Somers wrote:
> On Sun, Jan 14, 2024 at 12:07 PM Vinícius dos Santos Oliveira
> <vini.ipsmaker@gmail.com> wrote:
> >
> > On Sun, Jan 14, 2024 at 15:23, Alan Somers
> > <asomers@freebsd.org> wrote:
> > > I think you're using the term "green threading" too broadly.  Golang
> > > uses green threads, but Rust does not.  The difference is that in Rust
> > > with async/await, the task-switching boundaries are explicit (the
> > > .await syntax).  So Rust uses explicit concurrency, not green
> > > threading.  I can't speak to the other languages you mention.
> >
> > Still, we might have async IO if the implementation permits.
> >
> > > You propose an extension that would essentially create asynchronous
> > > (and racy) versions of read, write, readv, and writev.  But what
> > > about copy_file_range and fspacectl?  Or for that matter all the
> > > dozens of control-path syscalls like open, stat, chmod, and truncate?
> >
> > They would block the thread, obviously. Look, I've been playing with
> > async IO for most of my career. I'm not asking for esoteric APIs. I
> > just want a non-blocking version for read(). What's so hard about
> > that? From what I understand from FreeBSD source code, I can already
> > "unofficially" do that (offset is ignored if the concrete type is not
> > a real file).
> 
> Oh, are you not actually concerned about real files?  aio_read and
> aio_write already have special handling for sockets.
> 
> >
> > Very very few OSes actually implement async versions for anything
> > beyond the typical read()/write(). Even open() could block. For
> > anything beyond read()/write(), you just create a thread and live with
> > that. From a userspace perspective, it's expected that filesystem
> > operations such as file-move, directory-listing, etc will block the
> > thread. It's already expected. However you don't expect that for the
> > basic read()/write().
> >
> > Again: Linux and Windows already allow that and it works fine on them.
> >
> > And again: I ask why FreeBSD is special here. I've been answering your
> > questions, but you've been avoiding my question every time. Why is
> > FreeBSD special here? Linux and Windows work just fine with this
> > design. Why suddenly does it become special for FreeBSD? It's the very
> > same application.
> 
> The only sense in which FreeBSD is "special" is that we're better at
> finding the best solutions, rather than the quickest and hackiest.
> That's why we have kqueue instead of epoll, and ifconfig instead of
> ifconfig/iwconfig/wpa_supplicant/ip.
> 
> >
> > > This flag that you propose is not a panacea that will eliminate all
> > > blocking file operations.  There will still be a need for things that
> > > block.  Rust (via the Tokio library) still uses a thread pool for such
> > > things.  It even uses the thread pool for the equivalent of read() and
> > > write() (but not pread and pwrite).
> >
> > Nothing new here. I use thread pools to perform DNS queries. I allow
> > my user to create threads to perform blocking filesystem operations
> > (move, directory listing, etc). I know what I'm asking for: a read()
> > that won't block. I'm not asking for a competitor to io_uring. I'm
> > just asking for a read() that will never block my thread.
> >
> > > My point is that if you want fully asynchronous file I/O that never
> > > blocks you can't achieve that by adding one additional flag to POSIX
> > > AIO.
> >
> > It's just a read() that won't block the thread. Easy.
> >
> > Do you have concrete points for the design? What does it need to
> > change in the design so it becomes acceptable to you? What are the
> > criteria? If the implementation fulfills all these points, will it be
> > acceptable for you?
> 
> I would like to see a design that:
> * Is extensible to most file system and networking syscalls, even if
> it doesn't include them right now.  At a minimum, it should be able to
> include fspacectl, copy_file_range, truncate, and posix_fallocate.
> Probably open too.
> * Is reviewed by kib and Thomas Munro.
> * Has completion notification delivered by kqueue.
> * Is race-resistant.
I think that the request was much more modest.  It is only about getting
https://reviews.freebsd.org/D43448 committed.  And indeed I do not
see a reason to block the review from landing.

I added arch@ to give this discussion more visibility.
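
For readers joining from arch@, here is a minimal sketch of what the
existing POSIX AIO read looks like, where the caller must always supply
an explicit offset in aio_offset.  It is not taken from D43448; the
helper name aio_read_at() is hypothetical, and it waits with
aio_suspend() only to keep the example short (a real consumer would
more likely ask for kqueue notification).

```c
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: queue one POSIX AIO read
 * at an explicit offset and wait for it to complete.
 * Returns the byte count reported by aio_return(), or -1 with errno set.
 */
static ssize_t
aio_read_at(int fd, void *buf, size_t len, off_t off)
{
	struct aiocb cb;
	const struct aiocb *list[1];
	int error;

	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = fd;
	cb.aio_buf = buf;
	cb.aio_nbytes = len;
	cb.aio_offset = off;	/* the request carries its own offset */
	cb.aio_sigevent.sigev_notify = SIGEV_NONE;

	/* Queue the request; this call does not wait for the data. */
	if (aio_read(&cb) != 0)
		return (-1);

	/* The caller could do other work here instead of waiting. */
	list[0] = &cb;
	while (aio_error(&cb) == EINPROGRESS)
		(void)aio_suspend(list, 1, NULL);

	/* aio_return() must be called exactly once per completed request. */
	error = aio_error(&cb);
	if (error != 0) {
		(void)aio_return(&cb);
		errno = error;
		return (-1);
	}
	return (aio_return(&cb));
}
```

Calling aio_read_at(fd, buf, 5, 6) on a regular file reads up to five
bytes starting at byte 6, regardless of where the descriptor's own
file offset currently points.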
> 
> >
> > > Instead, all operations would
> > > either specify the offset (as with pwrite, pread) or operate only at
> > > EoF as if O_APPEND were used.
> >
> > I strongly disagree here. Async APIs should just achieve the same
> > semantics one *already* has when it creates threads and performs
> > blocking calls. Do *not* create new semantics. The initial patch
> > follows this principle. Your proposal does not.
> 
> Shared state between asynchronous tasks is fundamentally racy if
> unsynchronized.  And if synchronized, it fundamentally imposes a
> performance cost.  I even recall reading a paper demonstrating that
> the need to assign file descriptors sequentially necessarily created
> shared state between threads.  The authors were able to improve the
> performance of open() by assigning file descriptors using some kind of
> thread-local data structure instead, and that way could open() millions
> of files per second.  That's what a good asynchronous API looks like:
> it's resistant to races without requiring extra synchronization.
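
To make the shared-state point concrete, a small sketch using only
standard POSIX calls: descriptors obtained with dup() share one file
offset, so a read() through either of them moves the position seen by
the other, while pread() with an explicit offset leaves it alone.  The
function name offset_is_shared() is hypothetical and exists only for
this illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo, for illustration only.  Expects fd to be a regular
 * file positioned at offset 0 with at least one byte of data.
 * Returns 1 if pread() on a dup()ed descriptor left fd's offset alone
 * while read() on the same dup()ed descriptor advanced it by one byte,
 * 0 if not, and -1 on error.
 */
static int
offset_is_shared(int fd)
{
	off_t before, after_pread, after_read;
	char c;
	int fd2;

	fd2 = dup(fd);
	if (fd2 < 0)
		return (-1);

	before = lseek(fd, 0, SEEK_CUR);

	/* Explicit offset: the shared file position is not touched. */
	if (pread(fd2, &c, 1, 0) != 1) {
		close(fd2);
		return (-1);
	}
	after_pread = lseek(fd, 0, SEEK_CUR);

	/* Implicit offset: the read through fd2 moves fd's position too. */
	if (read(fd2, &c, 1) != 1) {
		close(fd2);
		return (-1);
	}
	after_read = lseek(fd, 0, SEEK_CUR);

	close(fd2);
	return (after_pread == before && after_read == before + 1);
}
```

With two tasks issuing implicit-offset reads on the same open file, the
order in which they consume that shared position is what has to be
synchronized; with explicit offsets there is nothing to race on.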
> 
> >
> >
> > --
> > Vinícius dos Santos Oliveira
> > https://vinipsmaker.github.io/