Float problen running i386 inary on amd64
Bruce Evans
brde at optusnet.com.au
Sat Nov 17 02:01:42 PST 2007
On Fri, 16 Nov 2007, Peter Jeremy wrote:
> I've Cc'd bde@ because this relates to the FPU initialisation - which
> he is the expert on.
>
> On Thu, Nov 15, 2007 at 12:54:29PM +0000, Pete French wrote:
>>> On Fri, Nov 02, 2007 at 10:04:48PM +0000, Pete French wrote:
>>>> int
>>>> main(int argc, char *argv[])
>>>> {
>>>> if(atof("3.2") == atof("3.200"))
>>>> puts("They are equal");
>>>> else
>>>> puts("They are NOT equal!");
>>>> return 0;
>>>> }
>>>
>>> Since the program as defined above does not include any prototype for
>>> atof(), its return value is assumed to be int. The i386 code for the
>>> comparison is therefore:
>>
>> Sorry, I didn't bother sticking the include lines in when I sent it
>> to the mailing list as I assumed it would be ovious that you need
>> to include the prototypes!
>
> OK, sorry for the confusion.
>
>> Interestingly, if you recode like this:
>>
>> double x = atof("3.2");
>> double y = atof("3.200");
>> if(x == y)
>> puts("They are equal");
>> else
>> puts("They are NOT equal!");
>>
>> Then the problem goes away! Glancing at the assembly code they both appear to
>> be doing the same thing as regards the comparison.
Glance more closely.
Behaviour like this should be expected on i386 but not on amd64. It
gives the well-known property of the sin() function, that sin(x) != sin(x)
for almost all x (!). It happens because expressions _may_ be evaluated
in extra precision (this is perfectly standard), so identical expressions
may sometimes be evaluated in different precisions even, or especially,
if they are on the same line. atof(s) and sin(x) are expressions, so
they may or may not be evaluated in extra precision. Certainly they
may be evaluated in extra precision internally. Then when they return
a result, C99 doesn't require discarding any extra precision. (It only
requires a conversion if the type of the expression being returned is
different from the return type. Then it requires a conversion as if by
assignment, and such conversions _are_ required to discard any extra
precision. This gives the bizarre behaviour that, if a functon returning
double uses long double internally until the return statement so as to
get extra precision, then it can only return double precision, since the
return statement discards the extra precision, while if it uses double
precision internally then it may return extra precision and the extra
bits may even be correct.)
The actual behaviour depends on implementation details and bugs.
Programmers are supposed to be get almost deterministic behaviour (with
no _may_'s) by using casts or assignments to discard any extra precision.
E.g., in functions that are declared as double, to actually return
only double precision, use "return ((double)(x + y))" instead of "return
(x + y)", or assign the result to a double (maybe "x += y; return (x);").
However, this is completely broken for gcc on i386's. For gcc on i386's,
casts and assignments _may_ actually work as required by C99. The
-ffloat-store hack is often recommended for fixing problems in this
area, but it only works for assignments; casts remain broken, and the
results of expressions remain unpredictable and dependent on the
optimization level because intermediate values _may_ retain extra
precision depending on whether they are spilled to memory and perhaps
on other things (spilling certainly removes extra precision). This
has been intentionally broken for about 20 years now. It is hard to
fix without pessimizing almost everything in much the same way as
-ffloat-store. The pessimization is larger than it was 20 years ago
since memory is relatively slower (though the stores now normally go
to L1 caches which are very fast, they add a relatvely large amount
to pipeline latency) and register allocation is better. It is hard
to write code that avoids the pessimization, since only code that uses
very long expressions with no assignments to even register variables
can avoid the stores. (Store+load to discard the extra precision is
another implementation detail. It is the fastest way, even if a value
with extra precision is in a register.)
To work around the gcc bugs, something like *(volatile double *)&x
must be used to reduce "double x;" to actually be a double.
The actual behaviour is fairly easy to describe for (f(x) == f(x)):
amd64:
if f() returns float, then the value is returned in the low
quarter of an XMM register, so extra precision is automatically
discarded and the results are equal except in exceptional cases
(if f(x) is a NaN or varies due to internals in the function).
Assignment of the result(s) to variables of any type work
correctly and don't change the values since float is the lowest
precision.
if f() returns double, similarly except the value is returned in
the low half of an XMM register, and assignment of the result(s)
to variable(s) of type float would work correctly and reduce to
float precision.
if f() returns long double, then the value is returned on the
top of the FPU stack. Since the result has maximal precision,
neither return statment may discard precision, and since the
declared type is the actual type, the compiler cannot discard
any extra precision accidentally. Assignment of the result(s)
to variable(s) of type float or double would work correctly and
reduce to the precision of the variable(s).
i386:
The value is always returned on the top of the FPU stack. It
_may_ have extra precision if the return type is float or double.
If f() returns float, then the value is very likely to have
extra precision from something like "return (x + y)" at the
end of f(). Then since the ABI requires spilling the result
of the first f(x) before calling f() again, and since gcc
doesn't know that f(x) has extra precision, the first f(x) is
spilled to a termporary variable of type float and thus loses
all of its extra precision. Then the second f(x) is very
likely to return extra precision. With optimization, the
second f(x) is never spilled -- it is just compared with the
spilled first f(x) so it is very likely to compare unequal.
Without optimization, or maybe with -ffloat-store, the second
f(x) _may_ be stored to memory too, so the results may compare
equal. Explicit assignments to variables of type float have
no effect on any of this due to the compiler bugs, unless the
variables are volatile.
If f() returns double, then under FreeBSD extra precision is
normally discarded before returning, since FreeBSD still uses
double precision for evaluating expressions so as to avoid
bugs in this area. ("return (x + y)" returns double precision,
possibly with extra exponent range, since "x + y" is rounded to
double precision.) However, functions like sqrt() and sin()
return long double precision due to the implementation detail
that they are evaluated in hardware in extra precision and the
bugfeature that this extra precision is not discarded.) Then
the behaviour is similar to the float case -- due to the
compiler bugs, the extra precision cannot be discarded by
assigning the results to variables of type double unless the
variables are volatile, etc. I don't know how atof() could
return extra precision.
If f() returns long double, then the behaviour is the same as
on amd64, except long double is almost completely useless,
since most expressions are evaluated in double precision.
> The underlying problem is that the amd64 FPU is initialised to 64-bit
> precision mode, whilst the i386 FPU is initialised to 53-bit precision
> mode (__INITIAL_FPUCW__ in amd64/include/fpu.h vs __INITIAL_NPXCW__ in
> i386/include/npx.h).
This is more of a feature than a problem, especially on amd64 where
64-bit precision works and doesn't interfere with 53-bit precision.
(However, 64-bit precision is very slow, especially for vectors since
it cannot use SSE, and the amd64 ABI defeats the main point of having
extra precision, which is to automatically evaluate most expressions in
extra precision so as to protect naive programmers from most precision
bugs and make the precision bugs very hard to understand and/or work
around when they occur. This ABI is actually more of a problem for
floats than for doubles -- for doubles, the behaviour is almost the
same as with FreeBSD-i386's reduced-precision hack, and problems are rare
because double precision is enough for most things, but float precision
isn't enough for most things.)
> It looks like the FPU is initialised during the
> machine-dependent CPU initialisation and then inherited by subsequent
> processes as they are fork()d. The fix is probably to explicitly
> initialise the FPU for legacy mode processes on the amd64.
>
> A work-around would be to call fpsetprec(FP_PD) (see <machine/ieeefp.h>)
> at the start of main().
This would avoid most non-bugs in this area in the same way as
FreeBSD-i386's reduced precision hack does accidentally (the hack is
only intended to avoid the compiler bugs, at least now). I still
don't know how atof() could return extra precision on amd64.
Long-term changes to the FP environment are dangerous for various reasons:
- setjmp()/longjmp() still don't support SSE (mxcsr is missing in jmp_buf;
IIRC, this affects fpsetround() but not fpsetprec()).
- library functions have not been tested much in non-default environments.
Reducing the precision on amd64 should work unsurprisingly. Increasing
the precision on i386 probably breaks at least trig arg reduction (for
double and maybe even for float precision) due to the compiler bugs.
Breaking atof("3.2") might be expected too, since 0.2 is not exactly
representable and, although atof (gdtoa) is careful about precision, it
is careful code that is most broken by the compiler bugs. However, I
would expect gdtoa to just work with the default precision on amd64.
Bruce
More information about the freebsd-stable
mailing list