Inconsistent behavior with dd(1)

Sat Aug 16 17:34:47 UTC 2014

Alan Somers wrote this message on Fri, Aug 15, 2014 at 10:42 -0600:
> On Thu, Aug 14, 2014 at 11:55 PM, William Orr <will at worrbase.com> wrote:
> > Hey,
> >
> > I found some inconsistent behavior with dd(1) when it comes to specifying arguments in -CURRENT.
> >
> >  [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=18446744073709551616
> > dd: count: Result too large
> >  [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=18446744073709551617
> > dd: count: Result too large
> >  [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=18446744073709551615
> > dd: count cannot be negative
> >  [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=-18446744073709551615
> > 1+0 records in
> > 1+0 records out
> > 512 bytes transferred in 0.000373 secs (1373071 bytes/sec)
> >  [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=-1
> > dd: count cannot be negative
> >
> > ???
> >
> > Any chance someone has the time and could take a look? https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191263
> >
> > Thanks,
> > William Orr
> >
> > ???
> 
> 
> IMHO, this is a bug in strtouq(3), not in dd(1).  Why should it parse
> negative numbers at all, when there is stroq(3) for that purpose?  The
> standard is clear that it must, though.  Oddly enough, stroq would
> probably not accept -18446744073709551615, even though strtouq does.
> Specific comments on your patch below:
> 
> 
> >
> > Here???s the patch:
> >
> > Index: bin/dd/args.c
> > ===================================================================
> > --- bin/dd/args.c       (revision 267712)
> > +++ bin/dd/args.c       (working copy)
> > @@ -186,46 +186,31 @@
> >  static void
> >  f_bs(char *arg)
> >  {
> > -       uintmax_t res;
> > -
> > -       res = get_num(arg);
> > -       if (res < 1 || res > SSIZE_MAX)
> > -               errx(1, "bs must be between 1 and %jd", (intmax_t)SSIZE_MAX);
> > -       in.dbsz = out.dbsz = (size_t)res;
> > +       in.dbsz = out.dbsz = get_num(arg);
> > +       if (in.dbsz < 1 || out.dbsz < 1)
> 
> Why do you need to check both in and out?  Aren't they the same?
> Also, you eliminated the check for overflowing SSIZE_MAX.  That's not
> ok, because these values get passed to places that expect signed
> numbers, for example in dd.c:303.

The type of dbsz is size_t, so really:

> > +               errx(1, "bs must be between 1 and %ju", (uintmax_t)-1);

This should be SIZE_MAX, except there isn't a define for this?  So maybe
the code really should be:
  (uintmax_t)(size_t)-1

to get the correct value for SIZE_MAX...

Otherwise on systems that uintmax_t is >32bits and size_t is 32bits,
the error message will be wrong...

> >  }
> >
> >  static void
> >  f_cbs(char *arg)
> >  {
> > -       uintmax_t res;
> > -
> > -       res = get_num(arg);
> > -       if (res < 1 || res > SSIZE_MAX)
> > -               errx(1, "cbs must be between 1 and %jd", (intmax_t)SSIZE_MAX);
> > -       cbsz = (size_t)res;
> > +       cbsz = get_num(arg);
> > +       if (cbsz < 1)
> > +               errx(1, "cbs must be between 1 and %ju", (uintmax_t)-1);
> >  }
> 
> Again, you eliminated the check for SSIZE_MAX, but cbsz must be signed.

What do you mean by this?  cbsz is size_t which is unsigned...

Again, the cast above is wrong...  Maybe we should add a SIZE_MAX
define so we don't have to see the double cast...

> >  static void
> >  f_count(char *arg)
> >  {
> > -       intmax_t res;
> > -
> > -       res = (intmax_t)get_num(arg);
> > -       if (res < 0)
> > -               errx(1, "count cannot be negative");
> > -       if (res == 0)
> > -               cpy_cnt = (uintmax_t)-1;
> 
> This is a special case.  See dd_in().  I think that eliminating this
> special case will have the unintended effect of breaking count=0.
> 
> > -       else
> > -               cpy_cnt = (uintmax_t)res;
> > +       cpy_cnt = get_num(arg);
> >  }
> >
> >  static void
> >  f_files(char *arg)
> >  {
> > -

Don't eliminate these blank lines.. they are intentional per style(9):
             /* Insert an empty line if the function has no local variables. */

> >         files_cnt = get_num(arg);
> >         if (files_cnt < 1)
> > -               errx(1, "files must be between 1 and %jd", (uintmax_t)-1);
> > +               errx(1, "files must be between 1 and %ju", (uintmax_t)-1);
> 
> Good catch.
> 
> >  }
> >
> >  static void
> > @@ -241,14 +226,10 @@
> >  static void
> >  f_ibs(char *arg)
> >  {
> > -       uintmax_t res;
> > -
> >         if (!(ddflags & C_BS)) {
> > -               res = get_num(arg);
> > -               if (res < 1 || res > SSIZE_MAX)
> > -                       errx(1, "ibs must be between 1 and %jd",
> > -                           (intmax_t)SSIZE_MAX);
> > -               in.dbsz = (size_t)res;
> > +               in.dbsz = get_num(arg);
> > +               if (in.dbsz < 1)
> > +                       errx(1, "ibs must be between 1 and %ju", (uintmax_t)-1);
> 
> Again, you eliminated the check for SSIZE_MAX, but dbsz must be signed.

If dbsz must be signed, we should change it's definition to ssize_t
instead of size_t...  Can you point to the line that says this?

In investigating this, it looks like we may have a bug in ftruncate in
that out.offset * out.dbsz may overflow and return incorrect results...
We should probably check that the output (cast to off_t) is greater than
both the inputs before calling ftruncate...  This is safe as both are
unsigned...

> >         }
> >  }
> >
> > @@ -262,14 +243,10 @@
> >  static void
> >  f_obs(char *arg)
> >  {
> > -       uintmax_t res;
> > -
> >         if (!(ddflags & C_BS)) {
> > -               res = get_num(arg);
> > -               if (res < 1 || res > SSIZE_MAX)
> > -                       errx(1, "obs must be between 1 and %jd",
> > -                           (intmax_t)SSIZE_MAX);
> > -               out.dbsz = (size_t)res;
> > +               out.dbsz = get_num(arg);
> > +               if (out.dbsz < 1)
> > +                       errx(1, "obs must be between 1 and %ju", (uintmax_t)-1);
> >         }
> >  }
> 
> Again, you eliminated the check for SSIZE_MAX, but dbsz must be signed.
> 
> >
> > @@ -378,11 +355,14 @@
> >         uintmax_t num, mult, prevnum;
> >         char *expr;
> >
> > +       if (val[0] == '-')
> > +               errx(1, "%s: cannot be negative", oper);
> > +
> 
> In general, I like this part of the diff.  Every user of get_num
> checks for negative values, so why not move the check into get_num
> itself?  But you changed it from a numeric check to a text check, and
> writing text parsers makes me nervous.  I can't see any problems,
> though.
> 
> >         errno = 0;
> >         num = strtouq(val, &expr, 0);
> >         if (errno != 0)                         /* Overflow or underflow. */
> >                 err(1, "%s", oper);
> > -
> > +
> >         if (expr == val)                        /* No valid digits. */
> >                 errx(1, "%s: illegal numeric value", oper);
> >
> > Index: bin/dd/dd.c
> > ===================================================================
> > --- bin/dd/dd.c (revision 267712)
> > +++ bin/dd/dd.c (working copy)
> > @@ -284,8 +284,6 @@
> >
> >         for (;;) {
> >                 switch (cpy_cnt) {
> > -               case -1:                        /* count=0 was specified */
> > -                       return;
> 
> Again, I don't think this will do what you want it to do.  Previously,
> leaving count unspecified resulted in cpy_cnt being 0, and specifying
> count=0 set cpy_cnt to -1.  With your patch, setting count=0 will have
> the same effect as leaving it unspecified.
> 
> >                 case 0:
> >                         break;
> >                 default:
> >
> >

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."