[RFC] Consistent numeric range for "expr" on all architectures
Bruce Evans
brde at optusnet.com.au
Wed Jun 29 01:10:01 UTC 2011
[I changed developers to standards instead of removing it]
On Tue, 28 Jun 2011, Stefan Esser wrote:
> Am 28.06.2011 13:02, schrieb Poul-Henning Kamp:
>> In message <4E09AF8E.5010509 at freebsd.org>, "Stefan Esser" writes:
>>
>>> Due to (false, according to BDE) considerations for POSIX compliance,
>>> the 64bit code was made conditional on a command line option in 2002.
>>
>> I think 64bit is the wrong thing to focus on, shouldn't it be
>> "intmax_t" so we will not have to revisit this again ?
>
> Well, actually it already *is* intmax_t, which happens to be 64bit
> on all architectures I checked ;-)
>
> My proposal is just to not produce overflows when easily avoidable.
> This takes little effort, simplifies the code and makes scripts more
> portable across architectures.
>
> Are there any supported architectures with intmax_t smaller than 64bit?
There cannot be, since C99 requires long long to be at least 64 bits
(counting the sign bit) and it requires intmax_t to be capable of
representing any value of any signed integer type.
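
The two C99 requirements above can be written down as checks. This is only a sketch: the function names are made up for illustration, and the compile-time form assumes a C11 compiler for _Static_assert.

```c
#include <limits.h>
#include <stdint.h>

/* Compile-time form (assumes C11 for _Static_assert). */
_Static_assert(sizeof(long long) * CHAR_BIT >= 64,
    "C99 requires long long to have at least 64 bits");
_Static_assert(INTMAX_MAX >= LLONG_MAX && INTMAX_MIN <= LLONG_MIN,
    "intmax_t must represent every long long value");

/* Hypothetical helper: run-time form of the same range check. */
int
intmax_covers_long_long(void)
{
	return (INTMAX_MAX >= LLONG_MAX && INTMAX_MIN <= LLONG_MIN);
}

/* Hypothetical helper: storage size of intmax_t in bits. */
int
intmax_storage_bits(void)
{
	return ((int)(sizeof(intmax_t) * CHAR_BIT));
}
```

If either static assertion fired, the implementation would not conform to C99, so an arch with intmax_t narrower than 64 bits should not exist.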
While checking this, I noticed that:
- preprocessor arithmetic is done using intmax_t or uintmax_t. This causes
portability problems similar to the ones for expr -- expressions like
ULONG_MAX + ULONG_MAX suddenly started in 1999 giving twice ULONG_MAX
instead of ULONG_MAX-1, but only on arches where ULONG_MAX < UINTMAX_MAX.
(I use unsigned values in this example to give defined behaviour on
overflow, so that the expression ULONG_MAX + ULONG_MAX is not just a bug.
expr doesn't have this complication.)
- C99 doesn't require intmax_t to be the logically longest type. Thus it
permits FreeBSD's rather bizarre implementation of intmax_t being plain
long which is logically shorter than long long.
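
The preprocessor point can be seen with a small test. This is a sketch: the macro PP_SUM_WRAPS and the function runtime_sum() are names invented for the example.

```c
#include <limits.h>
#include <stdint.h>

/*
 * In #if, C99 evaluates unsigned operands as uintmax_t.  Where
 * ULONG_MAX == UINTMAX_MAX the sum wraps to ULONG_MAX - 1 and the
 * comparison is false.  Where ULONG_MAX < UINTMAX_MAX the sum is
 * twice ULONG_MAX and the comparison is true.
 */
#if ULONG_MAX + ULONG_MAX > ULONG_MAX
#define PP_SUM_WRAPS 0
#else
#define PP_SUM_WRAPS 1
#endif

/* Outside #if the arithmetic is done in unsigned long and always wraps. */
unsigned long
runtime_sum(void)
{
	unsigned long m = ULONG_MAX;

	return (m + m);		/* ULONG_MAX - 1 on every arch */
}
```

So the same expression can give different answers in #if and in ordinary code, but only on arches where unsigned long is narrower than uintmax_t.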
Other points:
- `expr -e 10000000000000000000 + 0' (19 zeros) gives "Result too large",
but it is the arg that is too large, not the result.
This message is strerror(ERANGE) after strtoimax() sets errno to ERANGE.
`expr -e 1000000000000000000 \* 10' gives "overflow". This message is
correct, but it is in a different style to strerror() (uncapitalized,
and more concise).
- `expr 10000000000000000000' (19 or even 119 zeros) gives no error. It
is documented that the arg is parsed as a string in this case, and the
documentation for -e doesn't clearly say that -e changes this. And -e
doesn't change this if the arg clearly isn't a number
(e.g., if it is 10000000000000000000mumble), or even if it is a non-decimal
number (e.g., if it is 010, 0x10 or 10.0). If the arg isn't a decimal integer,
then (except for -e on decimal integers), there is an error irrespective
of -e when arithmetic is attempted (e.g., adding 0). The error message
for this bogusly says "non-numeric argument" when the arg is numeric but
not a decimal integer.
- POSIX requires brokenness for bases other than 10, but I wonder if an
arg like 0x10 invokes undefined behaviour and thus can be made to
work. (I wanted to use a hex number since I can never remember what
INTMAX_MAX is in decimal and wanted to type it in hex for checking
the range and overflow errors.) Allowing hex args causes fewer
problems than allowing decimal args larger than INT32_MAX, since
they are obviously unportable. Some FreeBSD utilities, e.g., dd,
support hex args and don't worry about POSIX restricting them.
- POSIX unfortunately requires args larger than INT32_MAX to be unportable
(to work if longs are longer than 32 bits, else to give undefined (?)
behaviour). For portability there could be a -p switch that limits args
to INT32_MAX even if longs are longer than 32 bits.
- I hope POSIX doesn't require benign overflow. Thus treating all overflows
as errors is good for portability and doesn't require any switch.
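
To make the arg-versus-result distinction concrete, here is a rough sketch of how the two failures could be told apart. It is not expr's actual code: the enum, parse_decimal() and checked_mul() are hypothetical names, and accepting leading zeros (so that "010" reads as decimal 10) is a choice of this sketch only.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>

/* Hypothetical status codes, not taken from expr's source. */
enum parse_status { PARSE_OK, PARSE_NOT_DECIMAL, PARSE_OUT_OF_RANGE };

/* Parse s as an optionally negative decimal integer that fits intmax_t. */
enum parse_status
parse_decimal(const char *s, intmax_t *out)
{
	const char *p = s;
	char *end;

	if (*p == '-')
		p++;
	if (*p < '0' || *p > '9')
		return (PARSE_NOT_DECIMAL);
	errno = 0;
	*out = strtoimax(s, &end, 10);
	if (*end != '\0')
		return (PARSE_NOT_DECIMAL);	/* "0x10", "10.0", "10mumble" */
	if (errno == ERANGE)
		return (PARSE_OUT_OF_RANGE);	/* the arg is too large */
	return (PARSE_OK);
}

/* Multiply without overflowing; returns 1 on success, 0 on overflow. */
int
checked_mul(intmax_t a, intmax_t b, intmax_t *res)
{
	if (a > 0) {
		if (b > 0) {
			if (a > INTMAX_MAX / b)
				return (0);
		} else if (b < INTMAX_MIN / a)
			return (0);
	} else if (b > 0) {
		if (a < INTMAX_MIN / b)
			return (0);
	} else if (a != 0 && b < INTMAX_MAX / a)
		return (0);
	*res = a * b;
	return (1);	/* the result fits */
}
```

With a 64-bit intmax_t, parse_decimal() on 10000000000000000000 reports the arg as out of range, while checked_mul() on 1000000000000000000 and 10 reports that the result overflows, so each case could get its own message. Treating every checked_mul() failure as an error needs no switch.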
Bruce