bin/164947: tee looses data when writing to non-blocking file descriptors

Fri Feb 10 21:10:14 UTC 2012

The following reply was made to PR bin/164947; it has been noted by GNATS.

From: Martin Cracauer <cracauer at cons.org>
To: Diomidis Spinellis <dds at aueb.gr>
Cc: Martin Cracauer <cracauer at cons.org>, freebsd-gnats-submit at freebsd.org
Subject: Re: bin/164947: tee looses data when writing to non-blocking file descriptors
Date: Fri, 10 Feb 2012 16:03:02 -0500

 Diomidis Spinellis wrote on Fri, Feb 10, 2012 at 10:32:02PM +0200: 
 > On 10/02/2012 21:17, Martin Cracauer wrote:
 > >Diomidis Spinellis wrote on Fri, Feb 10, 2012 at 07:04:41AM +0000:
 > >>
 > >>>Number:         164947
 > [...]
 > 
 > >>>How-To-Repeat:
 > >>Run the following:
 > >>#!/usr/local/bin/bash
 > >># bash needed for the >(...) functionality
 > >># ssh apparently sets O_NONBLOCK
 > >># Remove the 2>/dev/null to see tee complaining
 > >>dd count=100000 if=/dev/zero |
 > >>tee >(ssh localhost dd of=/dev/null) 2>/dev/null |
 > >>(ssh localhost dd of=/dev/null)
 > >
 > >I don't think it is ssh that is causing this. If you use a named pipe
 > >explicitly and hook ssh up to that the error doesn't appear.  Seems to
 > >be something that bash is doing there.
 > 
 > I think the named pipe isolates the write fd from the ssh end.  If you 
 > use cat or dd instead of ssh the problem goes away.

 Do you happen to know what bash does there, exactly? I was assuming it
 is creating a named pipe behind the user's back.

 I noticed that if you do ssh on the "tee part" and something else on
 the end of the regular pipe then things also fail.  On the other hand
 if you put the "tee part" on something else and the regular pipe on
 ssh things never seem to fail.

 tee treats both fds the same, and obviously ssh is always setting up
 its input the same way, so the difference must be in what bash is
 doing there with that "pipe emulation".
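
 One way to narrow this down is to check which descriptors actually
 carry O_NONBLOCK. The sketch below is Python rather than tee's C, and
 the helper name is_nonblocking is made up for illustration. It relies
 only on fcntl(F_GETFL). O_NONBLOCK belongs to the open file
 description, not to the descriptor number, so every process sharing
 that description sees the flag. Whether that is what happens with
 bash's >(...) is not verified here.

```python
import fcntl
import os


def is_nonblocking(fd):
    """Return True if O_NONBLOCK is set on the open file description behind fd.

    is_nonblocking is a hypothetical helper name, not part of tee or bash.
    """
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    return bool(flags & os.O_NONBLOCK)


def demo_shared_flag():
    """Show that a dup()ed descriptor shares the flag with the original."""
    r, w = os.pipe()
    w2 = os.dup(w)               # same open file description as w
    before = is_nonblocking(w)
    os.set_blocking(w2, False)   # set O_NONBLOCK through the duplicate
    after = is_nonblocking(w)    # visible through the original as well
    for fd in (r, w, w2):
        os.close(fd)
    return before, after
```

 Calling is_nonblocking(1) from a process started in place of tee would
 show whether its stdout arrives non-blocking or only becomes so later.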

 > >That doesn't mean I am opposed to handling EAGAIN.
 > >
 > >The way I normally do it is a simple retry loop, not using select.
 > >I'm aware of the tradeoffs, so far I was always better off not
 > >investing a second system call into every retry.
 > 
 > I agree this can be cheaper for many cases, but it can become very 
 > expensive for long waits.
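
 For reference, the plain retry loop mentioned above can be sketched as
 follows. This is Python, not tee's C, and the name write_all_retry and
 its pause parameter are assumptions made for illustration only. It
 retries immediately on EAGAIN without any extra system call, which is
 cheap for short stalls and spins the CPU on long ones.

```python
import os
import time


def write_all_retry(fd, data, pause=0.0):
    """Write all of data to fd, retrying on EAGAIN without select().

    Hypothetical helper. Returns the number of EAGAIN retries, so the
    cost of spinning can be observed.
    """
    view = memoryview(data)
    retries = 0
    while len(view) > 0:
        try:
            written = os.write(fd, view)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: the descriptor is non-blocking and
            # cannot take more data right now. Try again.
            retries += 1
            if pause > 0:
                time.sleep(pause)  # optional back-off, an assumption
            continue
        # A short write is normal here; keep only the unwritten tail.
        view = view[written:]
    return retries
```

 The returned retry count gives a rough idea of how much spinning a
 slow reader causes.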

 I'd like to understand what exactly is special about the way bash
 implements that feature so that we can make a more educated decision
 about the tradeoff of using select or not.
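
 To make the select side of that tradeoff concrete, here is a minimal
 sketch in Python (not tee's C). The name write_all_select is made up
 for illustration. On EAGAIN it sleeps in select() until the descriptor
 is writable again, so each stall costs one additional system call but
 no busy spinning.

```python
import os
import select


def write_all_select(fd, data):
    """Write all of data to fd, waiting in select() after each EAGAIN.

    Hypothetical helper. Returns how many times select() was called,
    i.e. the number of extra system calls paid for not spinning.
    """
    view = memoryview(data)
    waits = 0
    while len(view) > 0:
        try:
            written = os.write(fd, view)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: block until fd is reported writable.
            select.select([], [fd], [])
            waits += 1
            continue
        # Short writes are expected; continue with the remainder.
        view = view[written:]
    return waits
```

 The returned count is the number of stalls, which is the figure that
 decides whether the extra system call per stall is worth it.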

 Martin
 -- 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 Martin Cracauer <cracauer at cons.org>   http://www.cons.org/cracauer/