svn commit: r273734 - head/bin/dd

Bruce Evans brde at optusnet.com.au
Mon Oct 27 16:06:53 UTC 2014


On Mon, 27 Oct 2014, Kurt Jaeger wrote:

> Log:
>  bin/dd: Fix incorrect casting of arguments
>
>  dd(1) casts many of its numeric arguments from uintmax_t to intmax_t
>  and back again to detect whether or not the original arguments were
>  negative. This caused wrong behaviour in some boundary cases:
>
>  $ dd if=/dev/zero of=/dev/null count=18446744073709551615
>  dd: count cannot be negative
>
>  After the fix:
>
>  $ dd if=/dev/zero of=/dev/null count=18446744073709551615
>  dd: count: Result too large

Both of these work correctly in my version (with a relatively small patch
and no breakage of other cases).  (I actually typed large values as -1
and 11111111111111111111111111.  -1 means (uintmax_t)-1 although this
is undocumented and now broken).

>
>  PR:		191263
>  Submitted by:	will at worrbase.com
>  Approved by:	cognet@

I couldn't review the PR since bugzilla doesn't accept mail responses.
I didn't expect it to be this bad.

> Modified: head/bin/dd/args.c
> ==============================================================================
> --- head/bin/dd/args.c	Mon Oct 27 11:21:47 2014	(r273733)
> +++ head/bin/dd/args.c	Mon Oct 27 11:38:17 2014	(r273734)
> @@ -41,6 +41,7 @@ __FBSDID("$FreeBSD$");
>
> #include <sys/types.h>
>
> +#include <ctype.h>
> #include <err.h>
> #include <errno.h>
> #include <inttypes.h>
> @@ -171,8 +172,7 @@ jcl(char **argv)
> 	 */
> 	if (in.offset > OFF_MAX / (ssize_t)in.dbsz ||
> 	    out.offset > OFF_MAX / (ssize_t)out.dbsz)
> -		errx(1, "seek offsets cannot be larger than %jd",
> -		    (intmax_t)OFF_MAX);
> +		errx(1, "seek offsets cannot be larger than %jd", OFF_MAX);
> }

This used to be correct.  Now it assumes that off_t == intmax_t.  Both
just happen to be int64_t.  This will break when intmax_t is expanded
(off_t is unlikely to be expanded).  The bug would be detected at compile
time now if the type of intmax_t had maximal rank (long long) but int64_t
and off_t remain as long on 64-bit arches.  C99 doesn't seem to require
intmax_t to have maximal rank.  This seems to be a bug in C99, but FreeBSD
exploits it to avoid using the long long abomination on 64-bit arches.

>
> static int
> @@ -186,37 +186,30 @@ c_arg(const void *a, const void *b)
> static void
> f_bs(char *arg)
> {
> -	uintmax_t res;
>
> -	res = get_num(arg);

This used to be correct.  Though I don't like unsigned types, the API
of read() and write() encourages use of size_t instead of ssize_t for
buffer sizes.  That was done, but a limit of SSIZE_MAX was applied,
partly to avoid problems detecting errors from read() and write()
and partly to distinguish overflowing cases.

get_num() returns an unsigned type (uintmax_t), so it is suitable for
handling the size_t args here.  It was used.  However, size_t might be
smaller than uintmax_t, so it cannot always hold the result of get_num().
So a temporary variable was used to hold the value before checking it.

> -	if (res < 1 || res > SSIZE_MAX)
> -		errx(1, "bs must be between 1 and %jd", (intmax_t)SSIZE_MAX);

get_num() returns UINTMAX_MAX if the result is too large.  This may or may
not be an error.  But on all arches, UINTMAX_MAX exceeds SSIZE_MAX, so
the overflowing case gives the same error as a non-overflowing but too
large value.

> -	in.dbsz = out.dbsz = (size_t)res;
> +	in.dbsz = out.dbsz = get_num(arg);

This breaks the error checking.  Blind assignment may corrupt the value
before it can be checked.

> +	if (out.dbsz < 1 || out.dbsz > SSIZE_MAX)
> +		errx(1, "bs must be between 1 and %jd", SSIZE_MAX);
> }

Printf format error, as above.

>
> static void
> f_cbs(char *arg)
> {
> -	uintmax_t res;
>
> -	res = get_num(arg);
> -	if (res < 1 || res > SSIZE_MAX)
> -		errx(1, "cbs must be between 1 and %jd", (intmax_t)SSIZE_MAX);
> -	cbsz = (size_t)res;
> +	cbsz = get_num(arg);
> +	if (cbsz < 1 || cbsz > SSIZE_MAX)
> +		errx(1, "cbs must be between 1 and %jd", SSIZE_MAX);
> }

This just breaks the range checking and the printf format, as above.

>
> static void
> f_count(char *arg)
> {
> -	intmax_t res;
>
> -	res = (intmax_t)get_num(arg);
> -	if (res < 0)
> -		errx(1, "count cannot be negative");

This was correct too.

Though get_num() returns an unsigned type, it is based on strtoul()
which handles negative values as correctly as possible.  (The
multipliers in get_num() don't handle negative values as correctly as
possible.)  dd also has a get_off_t() function.  This is supposed to
handle negative values more carefully, but is so badly implemented
that it has many more bugs than blindly casting get_num() to off_t.
The above cast is safer since it converts to the signed type
corresponding to the return type.  In 2's complement, the result is
predictable and correct except in unrepresentable cases.  E.g., an arg
of -1 is returned as UINTMAX_MAX and casting it gives -1 again.

My version implements get_off_t() using get_num().  Of course it doesn't
blindly convert to an off_t.  It does range checking on the uintmax_t
before converting:

@ static off_t
@ get_off_t(const char *val)
@ {
@ 	uintmax_t num;
@ 
@ 	num = get_num(val);
@ 	if (num != (uintmax_t)(off_t)num)
@ 		errx(1, "%s: offset too large", oper);
@ 	return ((off_t)num);
@ }

> -	if (res == 0)
> -		cpy_cnt = (uintmax_t)-1;
> -	else
> -		cpy_cnt = (uintmax_t)res;

This function doesn't want an off_t variable, so get_off_t() is unusable
for it.  Though I don't like unsigned types, it is natural for this one
to use one, and this helps avoid proliferation of conversion functions.
So I changed the variable type to uintmax_t so that get_num() can be
used directly.  I also removed the magic -1 case and fixed the broken
count = 0 case.  A count of 0 should mean 0.  Actually, the variable
type was already uintmax_t, so using intmax_t and -1 here is more garbage
magic than might first appear.  I use the following patches for bogus
counting (relative to an old version, and possibly not all enclosed here):

@diff -u2 args.c~ args.c
@--- args.c~	Wed Apr  7 20:20:58 2004
@+++ args.c	Wed Apr  7 20:20:59 2004
@@@ -204,13 +206,10 @@
@ f_count(char *arg)
@ {
@-	intmax_t res;
@ 
@-	res = (intmax_t)get_num(arg);
@-	if (res < 0)
@-		errx(1, "count cannot be negative");
@-	if (res == 0)
@-		cpy_cnt = (uintmax_t)-1;
@-	else
@-		cpy_cnt = (uintmax_t)res;
@+	cpy_cnt = get_num(arg);
@+#if 0
@+	if (cpy_cnt == 0)
@+		terminate(0);
@+#endif
@ }
@ 
@@@ -220,6 +219,4 @@
@ 
@ 	files_cnt = get_num(arg);
@-	if (files_cnt < 1)
@-		errx(1, "files must be between 1 and %jd", (uintmax_t)-1);
@ }
@ 
@diff -u2 dd.c~ dd.c
@--- dd.c~	Wed Apr  7 20:20:48 2004
@+++ dd.c	Wed Apr  7 20:20:49 2004
@@@ -272,14 +267,6 @@
@ 
@ 	for (;;) {
@-		switch (cpy_cnt) {
@-		case -1:			/* count=0 was specified */
@+		if (cpy_cnt != 0 && st.in_full + st.in_part >= cpy_cnt)
@ 			return;
@-		case 0:
@-			break;
@-		default:
@-			if (st.in_full + st.in_part >= (uintmax_t)cpy_cnt)
@-				return;
@-			break;
@-		}
@ 
@ 		/*

> +	cpy_cnt = get_num(arg);

Seems to be correct.

> +	if (cpy_cnt == SIZE_MAX)
> +		errc(1, ERANGE, "%s", oper);

Bogus.  Counts aren't sizes, and no useful restriction on the count can
prevent total counts overflowing.  Only the non-useful restriction to
counts of 0 and 1 prevents overflow.  A count of 2 combined with sizes
near the maximum adds up to almost twice the maximum.

> +	if (cpy_cnt == 0)
> +		cpy_cnt = -1;
> }

Bogus.  -1 really means UINTMAX_MAX (at least, that is what it becomes in
cpy_cnt).  On 64-bit arches, SIZE_MAX has the same value and the previous
check keeps that value as magic.  On 32-bit arches, it doesn't even do that.

>
> static void
> @@ -225,7 +218,7 @@ f_files(char *arg)
>
> 	files_cnt = get_num(arg);
> 	if (files_cnt < 1)
> -		errx(1, "files must be between 1 and %jd", (uintmax_t)-1);
> +		errx(1, "files must be between 1 and %ju", SIZE_MAX);
> }

The usual printf format error.

>
> static void
> @@ -241,14 +234,11 @@ f_fillchar(char *arg)
> static void
> f_ibs(char *arg)
> {
> -	uintmax_t res;
>
> 	if (!(ddflags & C_BS)) {
> -		res = get_num(arg);
> -		if (res < 1 || res > SSIZE_MAX)
> -			errx(1, "ibs must be between 1 and %jd",
> -			    (intmax_t)SSIZE_MAX);
> -		in.dbsz = (size_t)res;
> +		in.dbsz = get_num(arg);
> +		if (in.dbsz < 1 || in.dbsz > SSIZE_MAX)
> +			errx(1, "ibs must be between 1 and %ju", SSIZE_MAX);
> 	}
> }

The usual range checking errors.

More than the usual printf format error.  SSIZE_MAX has a signed type, but
is now printed using an unsigned format.

>
> @@ -262,14 +252,11 @@ f_if(char *arg)
> static void
> f_obs(char *arg)
> {
> -	uintmax_t res;
>
> 	if (!(ddflags & C_BS)) {
> -		res = get_num(arg);
> -		if (res < 1 || res > SSIZE_MAX)
> -			errx(1, "obs must be between 1 and %jd",
> -			    (intmax_t)SSIZE_MAX);
> -		out.dbsz = (size_t)res;
> +		out.dbsz = get_num(arg);
> +		if (out.dbsz < 1 || out.dbsz > SSIZE_MAX)
> +			errx(1, "obs must be between 1 and %jd", SSIZE_MAX);
> 	}
> }

The usual range checking errors.

The usual printf format error.

>
> @@ -378,11 +365,17 @@ get_num(const char *val)
> 	uintmax_t num, mult, prevnum;
> 	char *expr;
>
> +	while (isspace(val[0]))

isspace() is undefined on negative characters.

> +		val++;
> +

Style bug (extra blank line).


> +	if (val[0] == '-')
> +		errx(1, "%s: cannot be negative", oper);

There is no particular reason to enforce this.  It breaks using get_num()
to implement get_off_t().  Any check like this should be external to
the function.  I don't know of any useful use of negative numbers
in dd, but I want get_num() to be more generally useful, and used.  It
can easily handle negative args but either it or the caller would
need a sign check like the above to determine if the unsigned result
actually is just a representation of a signed result.  There is only
an ambiguity for signed results above INTMAX_MAX.

Another interesting case is an arg of -1*-1.  get_num() allowed both
the -1's, but turned them into UINTMAX_MAX.  The overflow checking is
too careful, so -1*-1 is rejected as overflow instead of giving 1.

> +

More of this style bug.

> 	errno = 0;
> -	num = strtouq(val, &expr, 0);
> +	num = strtoull(val, &expr, 0);

get_num() returns uintmax_t, but still used strtouq() internally.  This
changes it to use a long long abomination instead of the correct function
strtoumax().

My fixes for get_num() are relatively small: patch relative to an old
version:

@ @@ -334,5 +331,5 @@
@  }
@ 
@ -/*
@ +/*-
@   * Convert an expression of the following forms to a uintmax_t.
@   * 	1) A positive decimal number.
@ @@ -341,6 +338,6 @@
@   *	4) A positive decimal number followed by a m (mult by 1 << 20).
@   *	5) A positive decimal number followed by a g (mult by 1 << 30).
@ - *	5) A positive decimal number followed by a w (mult by sizeof int).
@ - *	6) Two or more positive decimal numbers (with/without [bkmgw])
@ + *	6) A positive decimal number followed by a w (mult by sizeof int).
@ + *	7) Two or more positive decimal numbers (with/without [bkmgw])
@   *	   separated by x (also * for backwards compatibility), specifying
@   *	   the product of the indicated values.

I didn't fix the spammed error in the comment.  The errors start with
"positive".  This function accepts negative numbers and returns them
as positive numbers.  Except this commit breaks that.  The errors
continue with "decimal".  This function actually takes decimal, octal
or hexadecimal numbers...


@ @@ -349,13 +346,12 @@
@  get_num(const char *val)
@  {
@ -	uintmax_t num, mult, prevnum;
@ +	uintmax_t mult, num, prevnum;
@  	char *expr;
@ 
@  	errno = 0;
@ -	num = strtouq(val, &expr, 0);
@ -	if (errno != 0)				/* Overflow or underflow. */
@ +	num = strtoumax(val, &expr, 0);
@ +	if (errno == ERANGE)			/* Overflow. */

"Underflow" can only occur for FP expressions.  It is not going negative.


@  		err(1, "%s", oper);
@ - 
@ -	if (expr == val)			/* No valid digits. */
@ +	if (expr == val)			/* No digits. */
@  		errx(1, "%s: illegal numeric value", oper);
@ 
@ @@ -369,8 +365,8 @@
@  		break;
@  	case 'm':
@ -		mult = 1 << 20;
@ +		mult = 1UL << 20;
@  		break;

Don't assume 32-bit ints.

@  	case 'g':
@ -		mult = 1 << 30;
@ +		mult = 1UL << 30;
@  		break;
@  	case 'w':
@ @@ -378,7 +374,6 @@
@  		break;
@  	default:
@ -		;
@ +		break;
@  	}
@ -
@  	if (mult != 0) {
@  		prevnum = num;
@

> Modified: head/bin/dd/conv.c
> ==============================================================================
> --- head/bin/dd/conv.c	Mon Oct 27 11:21:47 2014	(r273733)
> +++ head/bin/dd/conv.c	Mon Oct 27 11:38:17 2014	(r273734)
> @@ -133,7 +133,7 @@ block(void)
> 	 */
> 	ch = 0;
> 	for (inp = in.dbp - in.dbcnt, outp = out.dbp; in.dbcnt;) {
> -		maxlen = MIN(cbsz, in.dbcnt);
> +		maxlen = MIN(cbsz, (size_t)in.dbcnt);
> 		if ((t = ctab) != NULL)
> 			for (cnt = 0; cnt < maxlen && (ch = *inp++) != '\n';
> 			    ++cnt)
> @@ -146,7 +146,7 @@ block(void)
> 		 * Check for short record without a newline.  Reassemble the
> 		 * input block.
> 		 */
> -		if (ch != '\n' && in.dbcnt < cbsz) {
> +		if (ch != '\n' && (size_t)in.dbcnt < cbsz) {
> 			(void)memmove(in.db, in.dbp - in.dbcnt, in.dbcnt);
> 			break;
> 		}

Now you need casts to recover from changing the mixture of unsigned and
signed types.  If some block size variables are signed, then all should
be, but cbsz is still size_t.

> @@ -228,7 +228,7 @@ unblock(void)
> 	 * translation has to already be done or we might not recognize the
> 	 * spaces.
> 	 */
> -	for (inp = in.db; in.dbcnt >= cbsz; inp += cbsz, in.dbcnt -= cbsz) {
> +	for (inp = in.db; (size_t)in.dbcnt >= cbsz; inp += cbsz, in.dbcnt -= cbsz) {
> 		for (t = inp + cbsz - 1; t >= inp && *t == ' '; --t)
> 			;
> 		if (t >= inp) {
>

The excessive casts give 2 more style bugs here (expansion of the line
beyond 80 columns and no re-wrapping to fix this).

> Modified: head/bin/dd/dd.c
[More type poisoning]
> ...
> Modified: head/bin/dd/dd.h
> ==============================================================================
> --- head/bin/dd/dd.h	Mon Oct 27 11:21:47 2014	(r273733)
> +++ head/bin/dd/dd.h	Mon Oct 27 11:38:17 2014	(r273734)
> @@ -38,10 +38,9 @@
> typedef struct {
> 	u_char		*db;		/* buffer address */
> 	u_char		*dbp;		/* current buffer I/O address */
> -	/* XXX ssize_t? */

ISTR asking the author of this comment to clean up some sign-related bugs.
I didn't like the results.  They included changing minor sign hacks to
broken overflow handling in get_off_t().

> -	size_t		dbcnt;		/* current buffer byte count */
> -	size_t		dbrcnt;		/* last read byte count */
> -	size_t		dbsz;		/* block size */
> +	ssize_t		dbcnt;		/* current buffer byte count */
> +	ssize_t		dbrcnt;		/* last read byte count */
> +	ssize_t		dbsz;		/* block size */
>
> #define	ISCHR		0x01		/* character device (warn on short) */
> #define	ISPIPE		0x02		/* pipe-like (see position.c) */
> @@ -57,13 +56,13 @@ typedef struct {
> } IO;
>
> typedef struct {
> -	uintmax_t	in_full;	/* # of full input blocks */
> -	uintmax_t	in_part;	/* # of partial input blocks */
> -	uintmax_t	out_full;	/* # of full output blocks */
> -	uintmax_t	out_part;	/* # of partial output blocks */
> -	uintmax_t	trunc;		/* # of truncated records */
> -	uintmax_t	swab;		/* # of odd-length swab blocks */
> -	uintmax_t	bytes;		/* # of bytes written */

All of these were correct.  Counters aren't related to sizes, and shouldn't
be limited to 32 bits on 32-bit arches.  Especially since they used to
be 64 bits on 32-bit arches.  They were only u_long in 4.4BSD, but went
through several rounds of churning in 1999 to reach the above.

> +	size_t	in_full;	/* # of full input blocks */
> +	size_t	in_part;	/* # of partial input blocks */
> +	size_t	out_full;	/* # of full output blocks */
> +	size_t	out_part;	/* # of partial output blocks */
> +	size_t	trunc;		/* # of truncated records */
> +	size_t	swab;		/* # of odd-length swab blocks */
> +	size_t	bytes;		/* # of bytes written */

The byte count is most broken.  Now it overflows on 32-bit arches after
copying just 0.1% of a 4TB disk.

> 	struct timespec	start;		/* start time of dd */
> } STAT;
>
>
> Modified: head/bin/dd/position.c
> ==============================================================================
> --- head/bin/dd/position.c	Mon Oct 27 11:21:47 2014	(r273733)
> +++ head/bin/dd/position.c	Mon Oct 27 11:38:17 2014	(r273734)
> @@ -178,7 +178,7 @@ pos_out(void)
> 			n = write(out.fd, out.db, out.dbsz);
> 			if (n == -1)
> 				err(1, "%s", out.name);
> -			if ((size_t)n != out.dbsz)
> +			if (n != out.dbsz)
> 				errx(1, "%s: write failure", out.name);
> 		}
> 		break;

This is about the only simplification that you get from using ssize_t for
dbsz, etc.  The values were restricted to SSIZE_MAX instead of to SIZE_MAX
mainly so that the error checking can be simpler here (it is too simple
to be correct -- dd is clueless about short i/o's).  But the compiler
doesn't know about the restriction, so might warn.  The cast prevents
that and expresses correctness of the conversion.

This isn't really a simplification.  Another compiler might warn about
the implicit conversion of out.dbsz when passed to write() (since the
compiler doesn't know that the value is >= 0).  Then you would need a cast
to prevent that and express the correctness of the conversion.

Bruce

