long double broken on i386?

Bruce Evans brde at optusnet.com.au
Mon Oct 1 02:56:38 PDT 2007

On Fri, 28 Sep 2007, Steve Kargl wrote:

> So, in my never ending crusade to implement missing
> C99 math functions, I decided to tackle sinl, cosl,
> and tanl.  I have implementations that use argument
> reduction from k_rem_pio2.c and Chebyshev interpolation
> for the trig functions over the range [0,pi/4].
> On amd64, these routines give <= 1/2 ULP for the various
> values I've tested in the reduced range.  Now, if I test

I doubt this :-).  Getting <= 1 ULP is hard; getting <= 1/2 + 0.001
ULPs is harder, and getting <= 1/2 ULPs without using an enormous
amount of extra precision is a research problem.

> these on i386, I see between 400 and 1200 ULP.  I spent
> a week or so trying to understand the descrepancy and "fix"
> the problem using a double-double precision arithemetic.
> I finally found the problem! /usr/include/float.h on i386
> is lying about its numerical representation of long double.

This problem is well known.  I must have referred to it many times
in old mail to you.  FreeBSD uses my old workaround for compiler
bugs in double precision, of using a rounding precision of double for
everything.  Unfortunately, this is still necessary because the
compiler bugs are still there.  This mostly breaks long double
precision, and it doesn't fix float precision, but double precision
is more important.  libm, in particular k_rem_pio2f.c (which is no
longer used) needs tricky fixes to work around the bugs for float
precision.  The fixes have mostly not been merged to the double
precision functions -- we just depend on double precision mostly

> In particular, at least these 3 values appear to be wrong
> #define LDBL_MANT_DIG   64
> #define LDBL_EPSILON    1.0842021724855044340E-19L
> #define LDBL_DIG        18

These were broken in rev.1.9 in 2002.  The "slight rounding issues with
long doubles" referred to in the log message are that long doubles are
rounded to double precision when they are operated on, unless you change
the rounding precision to long double using fpsetprec() or fesetprec().
Even if you change the rounding precision, long doubles can only appear
to work, but not actually work, unless you fix the compiler bugs or
work around them in libraries and in your application.  It isn't clear
if the "appear to work" in the log messages is with or without the change
to the rounding precision.  As mentioned in the log message, printf() in
2002 didn't support printing long doubles at all, so it was hard to see
loss of precision in printf output.  Now printf supports long doubles and
%a, so it is easy to see lost precision using it.  E.g.:

#include <math.h>
#include <stdio.h>

 	printf("    sin(1) = %La\n", (long double)(double)sin(1));
 	printf("2 * sin(1) = %La\n", (long double)(double)(2 * sin(1)));
 	return (0);

Output on i386:

     sin(1) = 0xd.76aa47848677021p-4
2 * sin(1) = 0xd.76aa47848677p-3

This exhibits a library bug, several compiler bugs, and the lost precision
from my rounding precision hack:

o library bug: sin() returns extra precision, despite its prototype saying
   that it returns double.  Well, I used to think that this is a bug.  I
   only leared a few months ago that it is a feature, since it matches
   the corresponding feature for a function implemented in C.  On i386,
   sin() is implemented in asm.  It just uses the hardware to evaluate
   sin() in long double precision, and returns the long double result of
   that.  The corresponding hardware feature is that in a C function that
   does similar calculations is permitted to return extra precision:

     double x, y;
     long double foo() { return x+y; }

   This is permitted to do the following:
   - evaluate x+y in extra precision
   - return the extra precision, since unlike casts and assignments, the
     return operation is _not_ required to discard any extra precision.
   For gcc on i386, this is what it would do without my rounding precision
   hack, except possibly when compiled with -O0 and/or if the result of
   x+y is spilled to memory (-O0 makes spilling quite likely).  The return
   statement must be written as "return (double)(x+y);" to ensure removing
   any extra precision _before_ returning.

   Now I wonder if the implicit casts for function args with prototyped
   functions are permitted to be as magic as the non-casts for return
   statments.  Keeping extra precision in intermediate calculations is
   good, and you don't really want to lose it unconditionally at function
   call boundaries, and it's confusing to lose it on calls but not on

   I'm using sin(1) mainly to generate a long double with nonzero in its
   11 least significant mantissa bits.  Note that initializing a long
   double to apparently-the-same value at compile time would not work:

     long double x = 0xd.76aa47848677021p-4L;  /* loses extra precision */

   This is because gcc on FreeBSD supports my precision hack to a fault,
   and has bugs in the fault.  gcc now tries to do compile-time arithmetic
   in the same precision as the runtime arithmetic, i.e., in only double
   precision on FreeBSD.  This is wrong if there is no arithmetic involved,
   as in the above initializer.  The runtime rounding precision only applies
   to operations -- it doesn't prevent loading, storing or non-arithmetic
   (function) operations on long doubles with full precision.  The above
   initializer corresponds to a store of a value that has somehow been
   calculated in full precision.

   Note that this bug makes it more difficult to implement functions in
   long double precision -- you cannot use tables of long double constants.
   OTOH, loads of long double constants are so slow on i386 that such
   constants should almost never be used.  Instead represent long doubles
   as the sum of 2 doubles if possible.  Loading 2 doubles and adding them
   is faster than loading 1 long double, at least if the 2 extra operations
   can be pipelined effectively.

o compiler bug (?).  Casts are required to remove any extra precision.  I
   cast sin(1) to double to do this.  gcc elides this cast since it thinks
   that sin(1) has no extra precision since it has type double, and it
   knows that sin(1) is in a long double register, so it thinks that nothing
   needs to be done for the combined cast (long double)(double).  However,
   having type double doesn't imply no extra precision in general, and
   after a return statement is one class of cases where it never does.
   A compiler can only tell if there is extra precision by looking inside
   the function near its return statement, or perhaps using an attribute.

o compiler bug.  The casts in "(long double)(double)(2 * sin(1)))" are
   elided similarly.  This is not visible in the printf output, since
   my rounding precision hack actually makes a difference here: ...

o ... lost precision.  The hack results in the multiplication not
   producing any extra precision, so the printf output is correct despite
   the compiler bug.

Variations on the above:
- with 1*sin(1) instead of 2*sin(1), optimizing away the multiplication
   is valid and done with -O, so the extra precision is retained.
- with sin(1) == sin(1) instead of 2*sin(1), and %d for the format specifier,
   the precision bugs combined with spilling bugs result in the value
   0.  Similarly, sin(x) != sin(x) for almost all x.  This behaviour
   is probably permitted, but it is surprising, and it isn't intentional.
   C99 apparently permits sin(x) to be evaluated _and_ returned in extra
   precision, and I don't know of any requirement that the return value
   always has the same amount of extra precision, and there are no casts
   in sight to clip the precision.  Different amounts always result in
   practice, accidentally as follows: gcc doesn't really understand
   exztra precision, especially for spilling intermediate values.  It
   just spills one of the sin(1)'s and thus loses the extra precision
   for only one of them.  Loss of precision from spilling is normal
   near function calls because of the ABI, but uncommon (but much
   harder to predict) otherwise.

> #include <stdio.h>
> int
> main (void)
> {
>  int i;
>  long double a;
>  a = 1.L;
>  for (i = 1; i < 64; i++)
>    {
>      a /= 2;
>      if (1.L - a == 1.L) break;
>    }
>  printf("%d %Le\n", i, a);
>  return 0;
> }
> amd64:sgk[206] ./z
> 64 1.084202e-19
> i386:kargl[205] ./z
> 54 5.551115e-17
> The above results for i386 are consistent with DBL_MANT_DIG
> and DBL_EPSILON.  Is this intentional, and should float.h be
> fixed?

Note that the compiler bugs make use of the epsilons fragile.
The delicate casts or assignments needed to give the lower-precision
epsilons a chance of working don't actually work unless you use
hacks like -ffloat store and volatile variables.

LDBL_MIN_* and LDBL_MAX_* were fixed in 2002.  My precision hack only
affects precision (only of +-*/ operations).  Long doubles still have
extra exponent range, so LDBL_MAX != DBL_MAX, etc.  This was wrong
before 2002.


More information about the freebsd-standards mailing list