long double broken on i386?

Bruce Evans brde at optusnet.com.au
Mon Oct 1 19:25:06 PDT 2007


On Mon, 1 Oct 2007, Steve Kargl wrote:

> On Mon, Oct 01, 2007 at 07:56:28PM +1000, Bruce Evans wrote:
>> On Fri, 28 Sep 2007, Steve Kargl wrote:

>>> On amd64, these routines give <= 1/2 ULP for the various
>>> values I've tested in the reduced range.  Now, if I test
>>
>> I doubt this :-).  Getting <= 1 ULP is hard; getting <= 1/2 + 0.001
>> ULPs is harder, and getting <= 1/2 ULPs without using an enormous
>> amount of extra precision is a research problem.
>
> It appears that I was not clear in what I had intended to write.
> Let's try again.  I've implemented the necessary argument
> reduction using k_rem_pio2.c for a 64-bit long double.  I'm
> making no claims about its accuracy and the ULP.  The comments
> in k_rem_pio2.c do not discuss the accuracy.  It is noted
> that giving the reduced argument to one of my algorithms
> does not sudden recover any lost precision.

I think it is accurate to within one bit provided it is set up
correctly.  Setup includes passing it a large table of bits of
2/pi.  I don't know if the existing tables with 1584 bits are
large enough for long double precision.  If long doubles have
64 bits, then the recipe in k_rem_pio_2 returns 106 mantissa
bits.  I would expect a max relative error of 2^-105.

> However, given a reduced value in the range [0,pi/4], the
> algorithms that I have for sinl(), cosl(), and tanl() over
> this range give <= 1/2 ULP "for the various values I've
> tested."  Note, I did not claim to test *all* values in
> [0,pi/4].  I've only tested the algorithms with several 10
> of millions of randomly generated values.

If you start with a relative error of 2^-105, then it is not
very hard to get a relative error of 2^-104 in the result, but it
is not very easy to get this either without having very slow,
large and/or tricky code.

k_cos.c only wants a relative error of about 2^-60, so it only needs
a small amount of tricky code to be fast (it uses essentially cos(x+y)
~= cos(x) - x*y; getting a smaller relative error would require many
more terms; everything should really be evaluated in double-double
precision, but k_cos.c uses tricks so ordinary double precision error
can be used to get a relative error of strictly less than 2^-54 (half
an ulp) if the final addition were in infinite precision.  Rounding
then gives a final relative error of strictly less than 1 ulp.

If you can limit the relative error to < 2^-104 before rounding to
long double precision, then (with i386 long doubles but of course
not with sparc64 long doubles) the final result has a very large
chance of being perfectly rounded, but since the result space is huge
there is also a very large chance of wrong rounding in a huge number
of cases.  (Chance of wrong rounding ~= 2^-40 in an arg space of
size 2^64 gives about 2^20 cases incorrectly rounded.  The arg space
is so large that you will probably never find any incorrectly rounded
cases by blind searching.)

> It should also be noted that I'm using Goldberg's definition
> of ULP where the "exact" result is computed via MPFR with
> 512 bits of precision and then the mpfr value is converted
> to long double.

If you can limit the relative error to < 2^-512 before the final
rounding step, then you probably have perfect rounding in all cases
but no one can prove it because the arg space is too large.  (Chance
of wrong rounding ~= 2^-448 in an arg space of size 2^64 gives about
2^-384 cases incorrectly rounded).

I think doing any calculations in 512-bit precision would be too slow
to be useful for long double precision, except for verifying results.

> /*
> * y is the approximate value.  z is the long double result
> * of converting the 512 bit MPFR number in round-to-nearest
> * mode.
> */
> long double
> ulpl(long double y, long double z)
> {
> 	int n;
> 	long double f;
> 	f = frexpl(y, &n);
> 	f = fabsl(f - ldexpl(z, -n));
> 	f = ldexpl(f, LDBL_MANT_DIG - 1);
> 	return (f);
> }

You should try to use all the bits in the MFPR number, to verify that
when the rounded results differ, it is because the result is near a
half-way case (as indicated by lots of 1's bits in the MFPR number
after the rounding point).

>> [rounding precision hack]
> Sorry about the lack of trimming the above passage, but I was afraid
> I might lose context if I deleted text.

All gone now :-).

> Anyway, the above looks like a catch-22 for i386.  I can wrap

In 1988 I thought it would be fixed by now.  Now It looks like i386
will be obsolete first.

>>   Note that this bug makes it more difficult to implement functions in
>>   long double precision -- you cannot use tables of long double constants.
>
> Ah ha!  This explains why my Chebyshev polynomial approximations
> (ie nearly mimimax approximations) can have between 10 and 100
> ULP of errors.

And not all 11 bits (2048 ulps) are normally lost since the constants
are usually exact for leading terms.

>>   OTOH, loads of long double constants are so slow on i386 that such
>>   constants should almost never be used.  Instead represent long doubles
>>   as the sum of 2 doubles if possible.  Loading 2 doubles and adding them
>>   is faster than loading 1 long double, at least if the 2 extra operations
>>   can be pipelined effectively.
>
> Looks like I spend some time this week splitting the coefficients
> into high and low parts.

I use pari scripts to generate coeffiencts, and they handle the splitting
with minor adjustements.

Bruce


More information about the freebsd-standards mailing list