Re: bio re-ordering

From: Peter Jeremy <peterj_at_freebsd.org>
Date: Fri, 18 Feb 2022 08:36:18 UTC
On 2022-Feb-17 17:48:14 -0800, John-Mark Gurney <jmg@funkthat.com> wrote:
>Peter Jeremy wrote this message on Sat, Feb 05, 2022 at 20:50 +1100:
>> I've raised https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261731 to
>> make geom_gate support BIO_ORDERED.  Exposing the BIO_ORDERED flag to
>> userland is quite easy (once a decision is made as to how to do that).
>> Enhancing the geom_gate clients to correctly implement BIO_ORDERED is
>> somewhat harder.
>
>The clients are single threaded wrt IOs, so I don't think updating them
>is required.

ggatec(8) and ggated(8) will not reorder I/Os.  I'm not sure about hastd(8).

>I do have patches to improve things by making ggated multithreaded to
>improve IOPs, and so making this improvement would allow those patches
>to be useful.

Likewise, I found ggatec and ggated to be too slow for my purposes and
so I've implemented my own variant (not network API compatible) that
can/does reorder requests.  That was when I noticed that BIO_ORDERED
wasn't implemented.

>I do have a question though, what are the exact semantics of _ORDERED?

I can't authoritatively answer this, sorry.

>And right now, the ggate protocol (from what I remember) doesn't have
>a way to know when the remote kernel has received notification that an
>IO is complete.

A G_GATE_CMD_START write request will be sent to the remote system and
issued as a pwrite(2), then an acknowledgement packet will be returned
and passed back to the local kernel via G_GATE_CMD_DONE.  There's no
support for BIO_FLUSH or BIO_ORDERED, so the acknowledgement only means
that pwrite(2) has returned: there's no way for the local kernel to know
when the data has actually been written to non-volatile storage.
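
As a rough illustration of the remote side of that path (a sketch only,
not the actual ggated code: the struct layouts and the function name are
made up, and the real wire format differs), the acknowledgement is built
as soon as pwrite(2) returns, with no fsync(2) in between:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical request/ack layouts, NOT the real ggate wire format. */
struct sketch_write_req {
	uint64_t seq;    /* request id, echoed back in the ack */
	off_t    offset; /* byte offset on the backing store */
	size_t   length; /* number of bytes to write */
};

struct sketch_ack {
	uint64_t seq;
	int      error;  /* 0 on success, otherwise an errno value */
};

/*
 * Serve one write request: issue it as pwrite(2) and build the ack
 * that would be sent back to the client.
 */
static struct sketch_ack
sketch_serve_write(int fd, const struct sketch_write_req *req, const void *data)
{
	struct sketch_ack ack = { .seq = req->seq, .error = 0 };
	const char *p = data;
	size_t done = 0;

	assert(req->length == 0 || data != NULL);
	while (done < req->length) {
		/* pwrite(2) may write fewer bytes than asked, so loop. */
		ssize_t n = pwrite(fd, p + done, req->length - done,
		    req->offset + (off_t)done);
		if (n <= 0) {
			ack.error = (n < 0) ? errno : EIO;
			break;
		}
		done += (size_t)n;
	}
	/*
	 * No fsync(2) here: a successful ack only says that pwrite(2)
	 * returned, not that the data has reached stable storage.
	 */
	return (ack);
}
```

Passing a flush through would presumably mean an extra request type that
maps to fsync(2) on the remote side, but that is an assumption about one
possible design rather than anything that exists today.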

>> I've done some experiments and OpenZFS doesn't generate BIO_ORDERED
>> operations so I've also raised https://github.com/openzfs/zfs/issues/13065
>> I haven't looked into how difficult that would be to fix.

Unrelated to the above, but for completeness:  when ordering is
important, OpenZFS avoids the need for BIO_ORDERED by not issuing
further I/Os until the previous I/Os have been retired.  (It does rely
on BIO_FLUSH.)
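
A userland analogy of that pattern, as a sketch only (this is not OpenZFS
code; the function names are made up, and fsync(2) merely stands in for
BIO_FLUSH): the dependent write is not issued until the earlier write
has completed and been flushed, so no ordering flag is needed.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes at off.  Returns 0 or an errno value. */
static int
sketch_write_all(int fd, const void *buf, size_t len, off_t off)
{
	const char *p = buf;
	size_t done = 0;

	while (done < len) {
		ssize_t n = pwrite(fd, p + done, len - done, off + (off_t)done);
		if (n <= 0)
			return ((n < 0) ? errno : EIO);
		done += (size_t)n;
	}
	return (0);
}

/*
 * Write a data block, then a commit record that must not reach stable
 * storage before the data.  Ordering comes from waiting and flushing,
 * not from tagging either write as ordered.
 */
static int
sketch_ordered_update(int fd, const void *data, size_t dlen, off_t doff,
    const void *commit, size_t clen, off_t coff)
{
	int error;

	/* 1. Issue the data write and wait for it to complete. */
	if ((error = sketch_write_all(fd, data, dlen, doff)) != 0)
		return (error);
	/* 2. Flush, so the data is stable before anything depends on it. */
	if (fsync(fd) != 0)
		return (errno);
	/* 3. Only now issue the write that depends on the data. */
	if ((error = sketch_write_all(fd, commit, clen, coff)) != 0)
		return (error);
	/* 4. Flush again so the commit record itself is stable. */
	if (fsync(fd) != 0)
		return (errno);
	return (0);
}
```

The cost is that the second write cannot be in flight at the same time
as the first one.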

-- 
Peter Jeremy