Float problen running i386 inary on amd64

Sat Nov 17 02:01:42 PST 2007

On Fri, 16 Nov 2007, Peter Jeremy wrote:

> I've Cc'd bde@ because this relates to the FPU initialisation - which
> he is the expert on.
>
> On Thu, Nov 15, 2007 at 12:54:29PM +0000, Pete French wrote:
>>> On Fri, Nov 02, 2007 at 10:04:48PM +0000, Pete French wrote:
>>>> 	int
>>>> 	main(int argc, char *argv[])
>>>> 	{
>>>>        	if(atof("3.2") == atof("3.200"))
>>>>                	puts("They are equal");
>>>>        	else
>>>>                	puts("They are NOT equal!");
>>>>        	return 0;
>>>> 	}
>>>
>>> Since the program as defined above does not include any prototype for
>>> atof(), its return value is assumed to be int.  The i386 code for the
>>> comparison is therefore:
>>
>> Sorry, I didn't bother sticking the include lines in when I sent it
>> to the mailing list as I assumed it would be ovious that you need
>> to include the prototypes!
>
> OK, sorry for the confusion.
>
>> Interestingly, if you recode like this:
>>
>>        double x = atof("3.2");
>>        double y = atof("3.200");
>>        if(x == y)
>>                puts("They are equal");
>>        else
>>                puts("They are NOT equal!");
>>
>> Then the problem goes away! Glancing at the assembly code they both appear to
>> be doing the same thing as regards the comparison.

Glance more closely.

Behaviour like this should be expected on i386 but not on amd64.  It
gives the well-known property of the sin() function, that sin(x) != sin(x)
for almost all x (!).  It happens because expressions _may_ be evaluated
in extra precision (this is perfectly standard), so identical expressions 
may sometimes be evaluated in different precisions even, or especially,
if they are on the same line.  atof(s) and sin(x) are expressions, so
they may or may not be evaluated in extra precision.  Certainly they
may be evaluated in extra precision internally.  Then when they return
a result, C99 doesn't require discarding any extra precision.  (It only
requires a conversion if the type of the expression being returned is
different from the return type.  Then it requires a conversion as if by
assignment, and such conversions _are_ required to discard any extra
precision.  This gives the bizarre behaviour that, if a functon returning
double uses long double internally until the return statement so as to
get extra precision, then it can only return double precision, since the
return statement discards the extra precision, while if it uses double
precision internally then it may return extra precision and the extra
bits may even be correct.)

The actual behaviour depends on implementation details and bugs.
Programmers are supposed to be get almost deterministic behaviour (with
no _may_'s) by using casts or assignments to discard any extra precision.
E.g., in functions that are declared as double, to actually return
only double precision, use "return ((double)(x + y))" instead of "return
(x + y)", or assign the result to a double (maybe "x += y; return (x);").
However, this is completely broken for gcc on i386's. For gcc on i386's,
casts and assignments _may_ actually work as required by C99.  The
-ffloat-store hack is often recommended for fixing problems in this
area, but it only works for assignments; casts remain broken, and the
results of expressions remain unpredictable and dependent on the
optimization level because intermediate values _may_ retain extra
precision depending on whether they are spilled to memory and perhaps
on other things (spilling certainly removes extra precision).  This
has been intentionally broken for about 20 years now.  It is hard to
fix without pessimizing almost everything in much the same way as
-ffloat-store.  The pessimization is larger than it was 20 years ago
since memory is relatively slower (though the stores now normally go
to L1 caches which are very fast, they add a relatvely large amount
to pipeline latency) and register allocation is better.  It is hard
to write code that avoids the pessimization, since only code that uses
very long expressions with no assignments to even register variables
can avoid the stores.  (Store+load to discard the extra precision is
another implementation detail.  It is the fastest way, even if a value
with extra precision is in a register.)

To work around the gcc bugs, something like *(volatile double *)&x
must be used to reduce "double x;" to actually be a double.

The actual behaviour is fairly easy to describe for (f(x) == f(x)):

amd64:
 	if f() returns float, then the value is returned in the low
 	quarter of an XMM register, so extra precision is automatically
 	discarded and the results are equal except in exceptional cases
 	(if f(x) is a NaN or varies due to internals in the function).
 	Assignment of the result(s) to variables of any type work
 	correctly and don't change the values since float is the lowest
 	precision.

 	if f() returns double, similarly except the value is returned in
 	the low half of an XMM register, and assignment of the result(s)
 	to variable(s) of type float would work correctly and reduce to
 	float precision.

 	if f() returns long double, then the value is returned on the
 	top of the FPU stack.  Since the result has maximal precision,
 	neither return statment may discard precision, and since the
 	declared type is the actual type, the compiler cannot discard
 	any extra precision accidentally.  Assignment of the result(s)
 	to variable(s) of type float or double would work correctly and
 	reduce to the precision of the variable(s).

i386:
 	The value is always returned on the top of the FPU stack.  It
 	_may_ have extra precision if the return type is float or double.

 	If f() returns float, then the value is very likely to have
 	extra precision from something like "return (x + y)" at the
 	end of f().  Then since the ABI requires spilling the result
 	of the first f(x) before calling f() again, and since gcc
 	doesn't know that f(x) has extra precision, the first f(x) is
 	spilled to a termporary variable of type float and thus loses
 	all of its extra precision.  Then the second f(x) is very
 	likely to return extra precision.  With optimization, the
 	second f(x) is never spilled -- it is just compared with the
 	spilled first f(x) so it is very likely to compare unequal.
 	Without optimization, or maybe with -ffloat-store, the second
 	f(x) _may_ be stored to memory too, so the results may compare
 	equal.  Explicit assignments to variables of type float have
 	no effect on any of this due to the compiler bugs, unless the
 	variables are volatile.

 	If f() returns double, then under FreeBSD extra precision is
 	normally discarded before returning, since FreeBSD still uses
 	double precision for evaluating expressions so as to avoid
 	bugs in this area.  ("return (x + y)" returns double precision,
 	possibly with extra exponent range, since "x + y" is rounded to
 	double precision.)  However, functions like sqrt() and sin()
 	return long double precision due to the implementation detail
 	that they are evaluated in hardware in extra precision and the
 	bugfeature that this extra precision is not discarded.)  Then
 	the behaviour is similar to the float case -- due to the
 	compiler bugs, the extra precision cannot be discarded by
 	assigning the results to variables of type double unless the
 	variables are volatile, etc.  I don't know how atof() could
 	return extra precision.

 	If f() returns long double, then the behaviour is the same as
 	on amd64, except long double is almost completely useless,
 	since most expressions are evaluated in double precision.

> The underlying problem is that the amd64 FPU is initialised to 64-bit
> precision mode, whilst the i386 FPU is initialised to 53-bit precision
> mode (__INITIAL_FPUCW__ in amd64/include/fpu.h vs __INITIAL_NPXCW__ in
> i386/include/npx.h).

This is more of a feature than a problem, especially on amd64 where
64-bit precision works and doesn't interfere with 53-bit precision.
(However, 64-bit precision is very slow, especially for vectors since
it cannot use SSE, and the amd64 ABI defeats the main point of having
extra precision, which is to automatically evaluate most expressions in
extra precision so as to protect naive programmers from most precision
bugs and make the precision bugs very hard to understand and/or work
around when they occur.  This ABI is actually more of a problem for
floats than for doubles -- for doubles, the behaviour is almost the
same as with FreeBSD-i386's reduced-precision hack, and problems are rare
because double precision is enough for most things, but float precision
isn't enough for most things.)

> It looks like the FPU is initialised during the
> machine-dependent CPU initialisation and then inherited by subsequent
> processes as they are fork()d.  The fix is probably to explicitly
> initialise the FPU for legacy mode processes on the amd64.
>
> A work-around would be to call fpsetprec(FP_PD) (see <machine/ieeefp.h>)
> at the start of main().

This would avoid most non-bugs in this area in the same way as
FreeBSD-i386's reduced precision hack does accidentally (the hack is
only intended to avoid the compiler bugs, at least now).  I still
don't know how atof() could return extra precision on amd64.

Long-term changes to the FP environment are dangerous for various reasons:
- setjmp()/longjmp() still don't support SSE (mxcsr is missing in jmp_buf;
   IIRC, this affects fpsetround() but not fpsetprec()).
- library functions have not been tested much in non-default environments.
   Reducing the precision on amd64 should work unsurprisingly.  Increasing
   the precision on i386 probably breaks at least trig arg reduction (for
   double and maybe even for float precision) due to the compiler bugs.
   Breaking atof("3.2") might be expected too, since 0.2 is not exactly
   representable and, although atof (gdtoa) is careful about precision, it
   is careful code that is most broken by the compiler bugs.  However, I
   would expect gdtoa to just work with the default precision on amd64.

Bruce