Re: git: 9ac6eda6c6a3 - main - pipe: try to skip locking the pipe if a non-blocking fd is used

From: Mateusz Guzik <mjguzik_at_gmail.com>
Date: Wed, 17 Aug 2022 17:07:26 UTC
On 8/17/22, Gleb Smirnoff <glebius@freebsd.org> wrote:
> On Wed, Aug 17, 2022 at 06:21:33PM +0200, Mateusz Guzik wrote:
> M> I don't understand this whatsoever. I patched the read side, not write.
> M>
> M> So in particular, should you find there is free space in the pipe to
> M> write to, there is literally no change.
> M>
> M> At worst, if a non-blocking read was racing a write, there is a bigger
> M> window where the reader will conclude there is nothing to get and
> M> leave.
> M>
> M> I see the following races:
> M> 1. non-blocking reader arrived before the writer managed to busy the
> M> pipe, in which case the outcome is the same in both old and new code
> M> 2. non-blocking reader arrived after the writer managed to unbusy the
> M> pipe, in which case the outcome is once again the same -- the write
> M> was completed and reader sees it
> M> 3. non-blocking reader arrived after the writer managed to busy the
> M> pipe, but before it got unbusied. here the old code would always wait,
> M> while the new will or will not wait depending on whether the writer
> M> managed to bump data counters or not.
> M>
> M> Ultimately there is no *new* outcome for any of it.
>
> The problem is symmetrical for both read and write.  I'm sorry that
> I used write scenario to describe a problem.
>
> So, let's look into third case. We got event dispatcher reporting
> to us that next read(2) would be successful:
>
>      POLLRDNORM     Normal data may be read without blocking.
>
> or
>
>      EVFILT_READ         Takes a descriptor as the identifier, and returns
>                          whenever there is data available to read.  The
>                          behavior of the filter is slightly different
>                          depending on the descriptor type.
>                          Fifos, Pipes
>                              Returns when the there is data to read; data
>                              contains the number of bytes available.
>
> But next read(2) would return EAGAIN. This is broken contract.
>

Once more I don't see it, can you show a simple graph how this is
supposed to happen all while not being possible on the previous
kernel? See below for an example.

Say there is only one thread which sees the reader side and that
thread performs the poll/select/kqueue call. Once it gets the
notification, it proceeds to do the read which of course works fine.

Say there are more threads or even multiple processes. If userspace
took no measures to synchronize read calls vs poll/select/kqueue calls
(that is some threads just read while others poll and only then read,
for example), it runs into races which are the same for both old and
new code (with one slight distinction of what happens when the write
is in progress)

thread1    thread2
poll
           write
  ret
read

thread1 finds data no problem.

thread1    thread2    thread3
poll
           write
  ret
                      read
read

thread1 got scooped and gets EAGAIN on both old and new kernels

-- 
Mateusz Guzik <mjguzik gmail.com>