Re: git: 9ac6eda6c6a3 - main - pipe: try to skip locking the pipe if a non-blocking fd is used

Reply: Gleb Smirnoff : "Re: git: 9ac6eda6c6a3 - main - pipe: try to skip locking the pipe if a non-blocking fd is used"
In reply to: Gleb Smirnoff : "Re: git: 9ac6eda6c6a3 - main - pipe: try to skip locking the pipe if a non-blocking fd is used"
Go to: [ bottom of page ] [ top of archives ] [ this month ]

From: Mateusz Guzik <mjguzik_at_gmail.com>
Date: Wed, 17 Aug 2022 16:21:33 UTC

On 8/17/22, Gleb Smirnoff <glebius@freebsd.org> wrote:
>   Mateusz,
>
> On Wed, Aug 17, 2022 at 02:23:51PM +0000, Mateusz Guzik wrote:
> M> +	 * Try to avoid locking the pipe if we have nothing to do.
> M> +	 *
> M> +	 * There are programs which share one pipe amongst multiple processes
> M> +	 * and perform non-blocking reads in parallel, even if the pipe is
> M> +	 * empty.  This in particular is the case with BSD make, which when
> M> +	 * spawned with a high -j number can find itself with over half of the
> M> +	 * calls failing to find anything.
> M> +	 */
> M> +	if ((fp->f_flag & FNONBLOCK) != 0 && !mac_pipe_check_read_enabled()) {
> M> +		if (__predict_false(uio->uio_resid == 0))
> M> +			return (0);
> M> +		if ((atomic_load_short(&rpipe->pipe_state) & PIPE_EOF) != 0)
> M> +			return (0);
> M> +		if (atomic_load_int(&rpipe->pipe_buffer.cnt) == 0 &&
> M> +		    atomic_load_int(&rpipe->pipe_pages.cnt) == 0)
> M> +			return (EAGAIN);
> M> +	}
>
> While this change definitely can optimize some programs it also can heavily
> pessimize others. An application can share pipes and actually expect that
> kernel will do proper serialization of writes by different threads. With
> this change, the threads that previously were sleeping on the pipe would
> enter a busy loop consisting write + kevent/poll syscalls.
>
> This also breaks API. Although not explicitly narrated by POSIX or our
> documentation, but there is a strong contract that, if application had
> received a non-blocking fd from event dispatcher (kevent, poll, select)
> as writable, then it is possible to write thre non-zero bytes with next
> syscall.
>
> This is a useful behavior, but by no means should be the default. It
> should be explicitly enabled by a new fcntl() and make(1) can do that.
>

I don't understand this whatsoever. I patched the read side, not write.

So in particular, should you find there is free space in the pipe to
write to, there is literally no change.

At worst, if a non-blocking read was racing a write, there is a bigger
window where the reader will conclude there is nothing to get and
leave.

I see the following races:
1. non-blocking reader arrived before the writer managed to busy the
pipe, in which case the outcome is the same in both old and new code
2. non-blocking reader arrived after the writer managed to unbusy the
pipe, in which case the outcome is once again the same -- the write
was completed and reader sees it
3. non-blocking reader arrived after the writer managed to busy the
pipe, but before it got unbusied. here the old code would always wait,
while the new will or will not wait depending on whether the writer
managed to bump data counters or not.

Ultimately there is no *new* outcome for any of it.

-- 
Mateusz Guzik <mjguzik gmail.com>