svn commit: r298671 - head/sys/geom/part

Wed Apr 27 09:35:50 UTC 2016

On Tue, 26 Apr 2016, Conrad E. Meyer wrote:

> Log:
>  g_part_bsd64: Check for valid on-disk npartitions value
>
>  This value is u32 on disk, but assigned to an int in memory.  After we do the
>  implicit conversion via assignment, check that the result is at least one[1]
>  (non-negative[2]).
>
>  1. The subsequent for-loop iterates from gpt_entries minus one, down, until
>     reaching zero.  A negative or zero initial index results in undefined signed
>     integer overflow.
>  2. It is also used to index into arrays later.
>
>  In practice, we expected non-malicious disks to contain small positive values.

It is a common programming error to use unsigned types to contain small
positive values.  This gives [un]sign extension/overflow bugs like (1)
by making it difficult to use subtraction operators.  (The loop doesn't
actually have bug (1) -- it correctly uses only ints in the loop -- see
below near the end.  Signed integer overflow mostly doesn't occur either.
Only an initial value if INT_MIN would give it for subtracting 1.  The
first undefined behaviour for a negative initial index is actually for
the array indexing in crc32(... d_partitions[basetable->gpt_entries]).)

struct g_part_table doesn't have as many instances of this bug as struct
disklabel or struct disklabel64, so conversions tend to cause [un]sign
extension/overflow bugs like [0].  Using unsigned types just gives wrong
values with no overflow if the target type is too small.

>  Reported by:	Coverity
>  CID:		1223202
>  Sponsored by:	EMC / Isilon Storage Division
>
> Modified:
>  head/sys/geom/part/g_part_bsd64.c
>
> Modified: head/sys/geom/part/g_part_bsd64.c
> ==============================================================================
> --- head/sys/geom/part/g_part_bsd64.c	Tue Apr 26 22:01:07 2016	(r298670)
> +++ head/sys/geom/part/g_part_bsd64.c	Tue Apr 26 22:30:54 2016	(r298671)
> @@ -509,7 +509,8 @@ g_part_bsd64_read(struct g_part_table *b
>
> 	dlp = (struct disklabel64 *)buf;
> 	basetable->gpt_entries = le32toh(dlp->d_npartitions);

Overflow still occurs here if the source value actually uses all 32 of its
bits (assuming 32-bit ints -- otherwise the analysis is even more complicated.
POSIX now requires >= 32-bit ints, and FreeBSD never supported anything else,
so this assumption is safe for now, but stating it complicates the analysis).

> -	if (basetable->gpt_entries > MAXPARTITIONS64)
> +	if (basetable->gpt_entries > MAXPARTITIONS64 ||
> +	    basetable->gpt_entries < 1)
> 		goto invalid_label;

This check is still bogus, since it checks for overflow after overflow
has caused implementation-defined behaviour.  This requires a complicated
analysis to justify even for the usual behaviour of 2's complement
wrapping.  I've never seen any implementation that documuments this.

The correct check doesn't need any complicated analysis.  It just
checks the source value before clobbering it by assigning it to a
possibly-different type.  It is still not so easy to know the correct
type to use, since the source value is encoded and
__typeof(source_value) is unportable.  But we already had to use
knowledge of its type to hard-code the decoding method as le32toh().
So we may as well hard-code the type too:

 	uint32_t npart;		/* XXX: assume that dlp_npartitions is u32 */
 	...
 	npart = le32toh(dlp->d_npartitions);
 	if (npart < 1 || npart > MAXPARTITIONS64)
 		goto invalid_label;
 	basetable->gpt_entries = npart;

Old struct disklabel has the same design bug, but its d_npartitions has
type uint16_t.  With 32-bit ints, uint16_t promotes to plain int, so its
unsigned features are hard to use even if you want them.  Expressions
like le16toh(dlp->d_npartitions) - 1 worked as intended but rarely as
wanted after C90 standardised the bad "value-preserving" sign extension
rules: when the source value is 0, the value of this subtraction is -1
with a 16-bit source but 0xFFFFFFFFU with a 32-bit source (assuming a
32-bit int target -- the general case is more complicated).

This makes loops difficult to write.  If you use an index variable i
then it should have type int, but might need to be uint32_t depending
on the source type and compiler warnings.  If you use the entry in
the struct more directly, then counting down from -1 to 0 just won't
work if the type in the struct is unsigned, except in some contexts
where this type is smaller than int and the "value-preserving" rule
gives the normally-unwanted value of -1.  The decoding step prevents
direct use in most cases.  Thus it is very natural to fix unsigned
design errors in standard structs -- you have to check bounds when
converting, and the bounds usually don't go above INT16_MAX, so you
can use signed types in your own structs.

gpt->gpt_entries and the index variable ('index') for the loop that
counts it down actually have the correct type (int), so point (1)
is incorrect -- a negative or zero initial index just results in signed
0 being correctly counted down to -1 so that the loop iterates 0 times.

Also, even with unsigned types, there would be no undefined behaviour
-- 0U would be counted down to UINT_MAX, and assigning this to 'int
index' would only cause implementation-defined behaviour.  Nothing bad
seems to happen for zero entries when the loop does nothing.  Zero
entries might even be a valid format; however, all disklabels are
supposed to have a 'c' partition which takes 3 entries to reach, so
1 or 2 entries are just as invalid as ones with 0 entries.

Bruce