stdio/sh behaviour guaranteed?

Sat Jul 8 02:49:23 UTC 2006

On Sat, 8 Jul 2006, Giorgos Keramidas wrote:

> On 2006-07-07 20:23, Jan Grant <jan.grant at bristol.ac.uk> wrote:
>> Consider the following snippet.
>>
>> [[[
>> #!/bin/sh
>>
>> echo one > t
>> while read line
>> do
>>         echo $line
>>         case $line in
>>         one) echo two >> t
>>                 ;;
>>         two) echo three >> t
>>                 ;;
>>         esac
>> done <t
>> ]]]
>>
>> This produces three lines of output on FreeBSD: which is what
>> I'd intuitively expect and it's certainly useful behaviour.
>>
>> I'm just trying to determine if that behaviour is one that I
>> can rely on - in other words, I guess, if stdio performs a
>> "short read" that fails to fill a buffer, and the underlying
>> file is then extended outside the process, will another attempt
>> to read from the FILE* (or a test of feof, say) honour the new,
>> longer file contents?
>
> I think that /bin/sh is not absolutely required to use stdio.h
> for input (which could pre-read some text and cause the above to
> fail).  Having said that, I think it makes sense to assume that
> input is line-buffered, otherwise stuff like this could fail:
>
>    cmd | while read line ; do echo "$line" ; done

Using stdio for shell input from pipelines is all but impossible,
since stdio normally buffers its input, and any read-ahead breaks
pipelines: there is no way to unread characters from a pipe.  E.g.:

     printf "%s\n%s\n" foo bar |
       sh -c 'read x; echo $$: $x; sh -c "read y; echo \$$: \$y"'

This reads "foo\n" in the first shell and "bar\n" in the second shell,
and prints the results (after discarding the newlines) with process
numbers so that you can see that the reads were done by separate
processes.  For this to work, the first shell must not read beyond
"foo\n".  Line buffering might work, but stdio's line buffering only
applies to output, and to do input line buffering stdio would have to
do the same as shells -- it would have to read a character at a time
up to a newline.  I think stdio never does this, and shells use their
own one-char-at-a-time input routine for pipes.  I think some shells
optimize the case of input from seekable files by using buffered reads
for such files only, with lseek() to unread input.  Stdio might be
usable for this.  This depends on the file not changing too much
underneath.  Changes to the file in advance of the current EOF should
work if the implementation doesn't do anything stupid.
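
A minimal sketch of the seekable-file case, for comparison with the
pipe example above.  It is not from the original message: the helper
name two_readers is made up for illustration, and mktemp is assumed to
be available (it is common but not required by POSIX).  The idea is
that if the first shell reads ahead in a regular file, it has to put
the shared file offset back just past "foo\n" before the second shell
reads, so both forms should hand "bar" to the child.

```sh
#!/bin/sh
# Hypothetical helper: the outer shell reads one line from stdin, then
# starts a second shell that reads the next line from the same stdin.
two_readers() {
    sh -c 'read x; echo "parent: $x"; sh -c "read y; echo \"child: \$y\""'
}

# Assumption: mktemp exists on this system.
tmp=$(mktemp) || exit 1
printf '%s\n%s\n' foo bar > "$tmp"

# Seekable input: the shell may buffer, but must leave the file offset
# just past the line it consumed.
from_file=$(two_readers < "$tmp")

# Pipe input: nothing can be unread, so the shell must not read ahead.
from_pipe=$(printf '%s\n%s\n' foo bar | two_readers)

rm -f "$tmp"

printf 'file:\n%s\n' "$from_file"
printf 'pipe:\n%s\n' "$from_pipe"
```

If the two results differ on some shell, that shell is reading ahead
without restoring the offset, which is the failure mode described
above.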

>> And in particular, is the idiom above blessed by appropriate
>> posix standards?

Stuff like this has to work to satisfy at least de facto standards.

Bruce