amd64/156464: fpsetprec does not work

Mon Apr 18 20:02:07 UTC 2011

On Mon, 18 Apr 2011, Michirou & wrote:

>> Description:
>
> In default, fpgetprec() returns FP_PE, but results show FP_PD.
> if fpsetprec(FP_PE) is called, results are never changed.

amd64 uses SSE except for long doubles, so fpsetprec() has no effect
on the results except for long doubles.  Since the precision defaults to
FP_PE on amd64, fpsetprec() can only be used to break long doubles
on amd64, while on i386 the precision defaults to FP_PD and fpsetprec()
is needed to unbreak this.  fpsetprec() on i386 can also be used to:
- break doubles by setting the precision to FP_PS
- reduce the precision for floats by setting the precision to FP_PS.
   This is sometimes useful for getting the same precision for floats
   as on other arches like amd64, to test that nothing depends on the
   extra precision without being ifdefed for this.
- give increased precision for floats and doubles by setting the
   precision to FP_PE.  This may be useful, but is difficult to
   program.  It requires almost never actually using floats or
   doubles, except for converting them to and from long double on
   input and output.
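A minimal sketch of the last style, for illustration only: sum_extended()
is a hypothetical helper, not something from the PR.  The running total
lives in a long double and is converted to double only once, on output.

```c
#include <stddef.h>

/*
 * Hypothetical helper: sum doubles, keeping the running total in long
 * double and converting back to double only on output.
 */
double
sum_extended(const double *x, size_t n)
{
	long double acc = 0.0L;
	size_t i;

	for (i = 0; i < n; i++)
		acc += x[i];	/* x[i] is converted to long double first */

	/*
	 * Nominally rounds to double.  On i386 the value may stay in an
	 * i387 register with extra precision until something stores it.
	 */
	return ((double)acc);
}
```

With FP_PE on i386, or on amd64 where long double already has full
precision, the intermediate sums carry the extra bits.  With FP_PD the
long double buys nothing.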

> This is not happen on FreeBSD8.2-RELEASE i386 version.

amd64 behaviour in this area hasn't changed.

>> How-To-Repeat:
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <machine/ieeefp.h>
> int main()
> {
>    double a, b, c, d;

This only uses doubles, so fpsetprec() has no effect on it.

>
>    printf("fpgetprec %d\n",   fpgetprec()); // 3 on amd64, 2 on i386
>
>    a = 10.0;
>    b = 2.718281810;
>    c = a / (b * b);
>    printf("%20.16e\n",   c);  // 1.3533528507465618e+00 on both
>
>    fpsetprec(FP_PE);

It is still 3 on amd64, but is not used for doubles.  It was changed
from 2 to 3 on i386.

>    a = 10.0;
>    b = 2.718281810;
>    c = a / (b * b);
>    printf("%20.16e\n",   c);
>              // 1.3533528507465618e+00 on amd64
> 	      // 1.3533528507465620e+00 on i386

So the result is more accurate on i386, but this behaviour is fragile and
requires more care to program than the above in general.  With FP_PE
on i386, b*b is evaluated in extra precision, but there is nothing
to prevent it being stored to memory, which would lose its extra
precision, especially since gcc doesn't understand precision stuff.
In practice, gcc won't store to memory in the middle of a simple
expression like the above, even with -O0, so the above works like
you want.  The careful version is:

 	a = 10.0;
 	b = 2.718281810;

 	long double la, lb;

 	la = a;
 	lb = b;
 	c = la / (lb * lb);	/* compiler bugs -- extra precision not lost
 				 * yet unless there is an accidental or
 				 * forced store (-ffloat-store) */
 	printf("%20.16e\n",   c);  /* ABI gives a store which loses the bugs
 				    * so we see only double precision for
 				    * the result */

An even more careful version, which avoids the compiler bugs by forcing a
store for this variable only, is:

 	...
 	volatile double vc;

 	vc = la / (lb * lb);
 	c = vc;			/* c reduced to double prec -- now ready for
 				 * output, but probably not useful for
 				 * further calculations */

-ffloat-store should never be used since it pessimizes speed and precision
globally.
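
The two fragments above can be folded into one self-contained function.
This is only a sketch, and div_by_square() is a made-up name.  It does not
call fpsetprec() itself, so the long double arithmetic gets whatever
precision is currently set.

```c
/*
 * Hypothetical helper: compute a / (b * b) in long double, then force
 * the result through a volatile double so that it really is stored
 * with double precision.
 */
double
div_by_square(double a, double b)
{
	long double la = a, lb = b;
	volatile double vc;

	vc = la / (lb * lb);	/* the store rounds to double */
	return (vc);		/* reloaded as a double, no extra bits left */
}
```

Calling div_by_square(10.0, 2.718281810) corresponds to the computation
in the test program.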

>
>    exit(0);
> }

fpsetprec() is very unportable due to its only affecting the i387 register
set.  Even on i386, you can break its effect on doubles by using '-msse2
-mfpmath=sse'.  This bug is the default for clang.
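
A rough way to see which evaluation mode the compiler thinks it is using
is the C99 FLT_EVAL_METHOD macro from <float.h>.  The sketch below is an
illustration with hypothetical helper names.  The macro is fixed at compile
time, so it says nothing about later fpsetprec() calls, and its value
depends on the compiler and flags.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: describe a FLT_EVAL_METHOD value. */
const char *
eval_method_name(int m)
{
	switch (m) {
	case 0:
		return ("each type evaluated in its own precision");
	case 1:
		return ("float and double evaluated as double");
	case 2:
		return ("everything evaluated as long double");
	default:
		return ("indeterminable or implementation-defined");
	}
}

/* Print what the compiler claims about intermediate precision. */
void
report_eval_method(void)
{
	printf("FLT_EVAL_METHOD = %d (%s)\n", (int)FLT_EVAL_METHOD,
	    eval_method_name((int)FLT_EVAL_METHOD));
	printf("DBL_MANT_DIG = %d, LDBL_MANT_DIG = %d\n",
	    DBL_MANT_DIG, LDBL_MANT_DIG);
}
```

Building the same file with and without '-msse2 -mfpmath=sse' on i386 and
comparing what report_eval_method() prints is one way to check which case
applies.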

Bruce