misc/91606: sha1 /dev is suspended

Thu Jan 19 05:50:14 PST 2006

The following reply was made to PR bin/91606; it has been noted by GNATS.

From: Bruce Evans <bde at zeta.org.au>
To: Kris Kennaway <kris at obsecurity.org>
Cc: freebsd-gnats-submit at FreeBSD.org
Subject: Re: misc/91606: sha1 /dev is suspended
Date: Fri, 20 Jan 2006 00:42:55 +1100 (EST)

 On Thu, 19 Jan 2006, Kris Kennaway wrote:

 > On Tue, Jan 10, 2006 at 05:52:31PM +0000, Dominik Karczmarski wrote:
 > > >Description:
 > > When I run (on FreeBSD 5.4)
 > >  > sha1 /dev=20
 > > or
 > >  > md5 /dev
 > >=20
 > > checksum is NOT calculated and is suspended.
 > > there is no problem on FreeBSD 4.X.
 >
 > What do you expect this to do?  There are no files in /dev suitable
 > for checksumming.

 This shows that:
 - md5 doesn't detect EOF properly
 - devfs doesn't implement EOF properly
 - neither of these bugs have been merged into FreeBSD-4.x.

 Variants of this show that md5 doesn't read pipes or special files
 properly.  Most disk files in /dev are suitable for checksumming.

 md5 was broken in rev.1.14 of libmd/mdXhl.c (4.x just missed the bug --
 it is at 1.13):

 % Index: mdXhl.c
 % ===================================================================
 % RCS file: /home/ncvs/src/lib/libmd/mdXhl.c,v
 % retrieving revision 1.13
 % retrieving revision 1.14
 % diff -u -2 -r1.13 -r1.14
 % --- mdXhl.c	28 Aug 1999 00:05:07 -0000	1.13
 % +++ mdXhl.c	17 Mar 2001 10:00:50 -0000	1.14
 % @@ -7,9 +7,10 @@
 %   * ----------------------------------------------------------------------------
 %   *
 % - * $FreeBSD: src/lib/libmd/mdXhl.c,v 1.13 1999/08/28 00:05:07 peter Exp $
 % + * $FreeBSD: src/lib/libmd/mdXhl.c,v 1.14 2001/03/17 10:00:50 phk Exp $
 %   *
 %   */
 % 
 %  #include <sys/types.h>
 % +#include <sys/stat.h>
 %  #include <fcntl.h>
 %  #include <unistd.h>
 % @@ -44,17 +45,38 @@
 %  MDXFile(const char *filename, char *buf)
 %  {
 % +    return MDXFileChunk(filename, buf, 0, 0);
 % +}
 % +
 % +char *
 % +MDXFileChunk(const char *filename, char *buf, off_t ofs, off_t len)
 % +{
 %      unsigned char buffer[BUFSIZ];
 %      MDX_CTX ctx;
 % -    int f,i,j;
 % +    struct stat stbuf;
 % +    int f, i, e;
 % +    off_t n;
 % 
 %      MDXInit(&ctx);
 % -    f = open(filename,O_RDONLY);
 % +    f = open(filename, O_RDONLY);
 %      if (f < 0) return 0;
 % -    while ((i = read(f,buffer,sizeof buffer)) > 0) {
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 MDXFile() used to detect EOF properly -- it used an almost(*)-sufficiently
 large buffer and stopped reading after reading 0 in the above.

 % -	MDXUpdate(&ctx,buffer,i);
 % -    }
 % -    j = errno;
 % +    if (fstat(f, &stbuf) < 0) return 0;
 % +    if (ofs > stbuf.st_size)
 % +	ofs = stbuf.st_size;
 % +    if ((len == 0) || (len > stbuf.st_size - ofs))
 % +	len = stbuf.st_size - ofs;

 Rev.14 adds an fstat() to bogusly determine the size.  The size can be
 given by fstat can be wrong for various reasons, e.g.:
 - the size of a regular file or directory can grow or shrink.
    Shrinkage should be still detected by reading EOF (as above), but
    that is broken too (see below).
 - the size in stbuf.st_size is likely to be wrong for other file types,
    especially for pipes.
 - some files (mainly ones in procfs) claim to be regular although they
    are very irregular.  Determining their current actual size is
    problematic and not done on every stat().
 - all directories in devfs claim to have size 512 but few have that size,
    since directories in devfs are irregular like the irregular regular
    files in procfs with less excuse since the directory contents don't
    change often.  devfs just reports a nominal st_size and it takes
    reading to the end to determine the actual size.  E.g., "wc /dev" on
    sledge now shows that the size of /dev is 1116.  It is not even a
    multiple of 512 so other programs may be confused.  md5 is especially
    confused, as follows:
    - it first tries to read the nominal size (512) (it would try up to 1024
      if the nominal size were larger).
    - devfs read() returns 504 since it doesn't return partial directory
      entries.
    - md5 now tries to read only what it thinks is the residual size (8).
      This is too small for the next directory entry for the same reason
      that 512 was too small for another entry, so read() returns 0.  This
      result is very irregular (it's a bug in devfs) since EOF has not
      been reached.
    - md5 doesn't understand the irregular EOF, so it doesn't handle the
      problem by expanding the read, and it doesn't understand ordinary
      EOF so it isn't confused into mishandling the problem by quitting
      before it reaches real EOF, so it loops doing the previous step
      forever.

 % +    if (lseek(f, ofs, SEEK_SET) < 0) return 0;

 Rev.1.14 also tries to break pipes here.  However, "cat | md5" still
 works because the top level of md5 doesn't call here -- it uses
 MDFilter() which uses fread() which has non-broken detection of EOF.
 "cat | md5 /dev/stdin" fails here and then the top level warn()'s about
 the illegal seek.  For non-pipes, the above isn't even wrong, since
 lseek() succeeds on unseekable files that aren't pipes, e.g. for most
 device files.  st_size is also garbage for most device files.  This
 breaks things like "md5 /dev/ad0" (md5 reads 0 bytes so /dev/ad0
 appears to be the same as /dev/null).

 (*) I think the fstat() in the above is good only for determining the
 size of a sufficiently-large buffer (st_blksize (**)).  E.g, for
 directory entries on file systems that like devf only return non-truncated
 entries, it is essential for the read size to be at least as large as
 the largest directory entry.  Using st_blksize should (**) give this
 automatically, but using an arbitrary buffer size like BUFSIZ might not.
 stdio tries to do the right thing by using st_blksize in most cases, but
 some cases didn't work right and now all cases don't work right (**).

 (**):
 - st_blksize used to work right for regular files, at least above the
    level of individual filesystems.  Filesystems return their preferred
    block size in va_blocksize, and vn_stat() used to actually return
    va_blocksize to userland.  This is unbroken in FreeBSD-4.
 - st_blksize used to work right for directories, just like for files.
    For ffs, va_blocksize is the same for directories as for files (it
    is fs_bsize).  This is unbroken in FreeBSD-3.  In FreeBSD-4, it was
    broken misconverting the default of a case statement in rev.1.81
    of vfs_vnops.c.
 - st_blocksize used to work not-so right for some special files (mainly
    disks).  Drivers were given a chance to set a good size in si_bsize_best.
    A few disk drivers actually did so.
 - -current ignores va_blocksize and doesn't have si_bsize_best.  It sets
    st_blksize to PAGE_SIZE in all cases.
 - stdio attempts to use st_blksize for everything except ttys.  This used
    to work right for disks.  It last worked in FreeBSD-3.  Stdio doesn't
    know of any good way to determine which devices are ttys.  It just
    classifies all cdevs as "could be tty".  Since block disk devices were
    lost in FreeBSD-4, stdio hasn't used st-blksize for disks since FreeBSD-3,
    and the partially working cases for disks in vn_stat() in FreeBSD-4 were
    never actually used by stdio.

 % +    n = len;
 % +    while (n > 0) {
 % +	if (n > sizeof(buffer))
 % +	    i = read(f, buffer, sizeof(buffer));
 % +	else
 % +	    i = read(f, buffer, n);
 % +	if (i < 0) break;
     	^^^^^^^^^^^^^^^^^

 The new function MDXFileChunk() (and thus MDXFile() too) loops forever
 if it unexpectedly hits EOF.

 For reasons given above, the fix is not just to change the break
 condition to (i <= 0):
 - the buffer size should not be reduced to n, so that irregular EOFs
    can't happen.  Then the MD5File() case would just work like it used
    to provided the lseek() and the fstat() are also removed in that case.
    In the nondegenerate MD5FileChunk() case, the read might read past
    (offset + len), but that wouldn't cause any new problems since we
    close the file so our file pointer is irrelevant.
 - fixing unseekable files is harder, since lseek() doesn't tell you if
    a file is seekable (it only fails for pipes, and thos can be classified
    just as easily using the result of the fstat()).  The degenerate case
    shouldn't seek, and the nondengenerate case should try harder to abort
    for unseekable files.  dd(1) has a fair amount of code to guess
    seekability.
 - I think using st_size should work for regular files and directories
    only, but the case of procfs shows that it doesn't actually work for
    "regular" files, and the case of devfs shows that it does't actually
    work for directories.  I'm not sure if it is really needed.  All of
    its current uses are dubious:
    - it is just harmful for the degenerate case.
    - if it is larger than the offset, then we reduce the offset.  Why not
      an error?
    - if the size from the offset to st_size is < len, we reduce len?  Why
      not an error?  Without errors, there is no way for the caller to
      tell if the chunk read was anywhere near the chunk requested.  Without
      fixups to avoid the errors, the errors are easy to detect by just
      seeking to `offset' and trying to read `len' bytes, provided lseek()
      works.

 % +	MDXUpdate(&ctx, buffer, i);
 % +	n -= i;
 % +    } 
 % +    e = errno;
 %      close(f);
 % -    errno = j;
 % +    errno = e;
 %      if (i < 0) return 0;
 %      return MDXEnd(&ctx, buf);

 A workaround for pipes, devices and directories is to only use stdin
 (not by name).

 Bruce