Implementation errors in strtol()
Joerg Wunsch
freebsd-current at uriah.heep.sax.de
Thu Jan 20 14:30:08 PST 2005
As Andrey Chernov wrote:
> Errno may be set in case of error with not documented errno. Thats
> how I read it, but I may miss something.
I read that a bit differently.
> > Still, my major point was that "0x" sequences are falsely rejected as
> It clearly should be rejected with EINVAL in case base == 16,
> because 0 alone is not valid HEX sequence
Nope. "0" alone is a completely valid hexadecimal number,
representing the value 0. Conversion has to start at the 0 (as it is
not invalid), and to stop at the x. The string "0x" simply means
there is *no* optional 0x prefix, but just a number 0 without a prefix
(followed by a letter that cannot be converted, so it has to be passed
as final string).
> > conversion errors, and that strings consisting solely of a plus or
> > minus sign should not throw an error either, as I read the C standard.
> +- may produce EINVAL, as POSIX says.
Where?
Again, I'd value the C standard higher than Posix. C says they form a
valid subject sequence. Thus, no error may be flagged.
I don't have Posix at hand, but SUSPv2 completely follows the C
standard, with the only addition that EINVAL might be flagged in the
case of a conversion error. A subject sequence consisting of a
sign only (+ or -) does not constitute an empty subject sequence, so
no conversion error is permissible. This implies EINVAL must not be
set.
> In general please don't forget that strtol(), atol() etc. supposed
> to parse user input and _detect_ syntax errors, it is their
> purpose. If they not do it or do it in half, each program forced to
> use its own parser instead.
This is no excuse for violating standards. If a user feels that a
single sign must not be interpreted as a valid 0, they indeed have to
apply their own checks. It would be no use to them if FreeBSD's
strtoul (erroneously) flagged it as an error, while any
standard-compliant implementation would not fulfill their expectations
anyway.
As a demonstration, consider this test program:
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

const char *s[] = {
    "", "+", "-", "0x", "-0x"
};

int
main(void)
{
    size_t i;
    char *p;
    long l;

    for (i = 0; i < sizeof s / sizeof s[0]; i++) {
        errno = 0;      /* clear it so a stale value is not reported */
        l = strtol(s[i], &p, 16);
        /* p - s[i] is a ptrdiff_t; cast it so it matches %d */
        printf("\"%s\" -> %ld, len %d, errno %d\n",
            s[i], l, (int)(p - s[i]), errno);
    }
    return 0;
}
Below are the results for Solaris 8, FreeBSD 5, Linux 2.x, and HP-UX
10.20, in that order.
helios% ./foo
"" -> 0, len 0, errno 0
"+" -> 0, len 0, errno 0
"-" -> 0, len 0, errno 0
"0x" -> 0, len 1, errno 0
"-0x" -> 0, len 2, errno 0
j at uriah 1259% ./foo
"" -> 0, len 0, errno 22
"+" -> 0, len 0, errno 22
"-" -> 0, len 0, errno 22
"0x" -> 0, len 0, errno 22
"-0x" -> 0, len 0, errno 22
j at lux 344% ./foo
"" -> 0, len 0, errno 0
"+" -> 0, len 0, errno 0
"-" -> 0, len 0, errno 0
"0x" -> 0, len 1, errno 0
"-0x" -> 0, len 2, errno 0
j at king 105% ./foo
"" -> 0, len 0, errno 0
"+" -> 0, len 0, errno 0
"-" -> 0, len 0, errno 0
"0x" -> 0, len 0, errno 0
"-0x" -> 0, len 0, errno 0
It's quite obvious that every other system differs from FreeBSD here.
(OK, HP-UX doesn't throw EINVAL at all, even for clearly inconvertible
strings. But then, it's a pretty old system, more than ten years old.)
As the Posix/SUSPv2 standards say ``may set to EINVAL'', I'd even go
with the majority and not set EINVAL for an empty string even though
it technically constitutes a conversion error according to the C
standard. It's quite pointless to handle a single plus or minus sign
differently than an empty string, and as both the C and Posix/SUSP
standards mandate that +/- must not cause a conversion error, just
don't flag the error for the empty string either.
--
cheers, J"org .-.-. --... ...-- -.. . DL8DTL
http://www.sax.de/~joerg/ NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)