Use of C99 extra long double math functions after r236148

Thu Jul 26 02:19:38 UTC 2012

On Wed, 25 Jul 2012, Stephen Montgomery-Smith wrote:

> On 07/25/12 12:31, Steve Kargl wrote:
>> On Wed, Jul 25, 2012 at 12:27:43PM -0500, Stephen Montgomery-Smith wrote:
>>> Just as a point of comparison, here is the answer computed using
>>> Mathematica:
>>> 
>>> N[Exp[2], 50]
>>> 7.3890560989306502272304274605750078131803155705518
>>> 
>>> As you can see, the expl solution has only a few digits more accuracy
>>> than exp.
>> 
>> Unless you are using sparc64 hardware.
>> 
>> flame:kargl[204] ./testl -V 2
>> ULP = 0.2670 for x = 2.000000000000000000000000000000000e+00
>> mpfr exp: 7.389056098930650227230427460575008e+00
>> libm exp: 7.389056098930650227230427460575008e+00
>
> Yes.  It would be nice if long on the Intel was as long as the sparc64.

You want it to be as slow as sparc64?  (About 300 times slower, after
scaling the CPU clock rates.  Doubles on sparc64 are less than 2 times
slower.)

What I forgot to mention in a previous reply is that expl has only a few
more decimal digits of accuracy than exp because the extra precision
on x86 wasn't designed to give much more accuracy.  It was designed
to give more chance of full double precision accuracy in naive code.
It was designed in ~1980 when bits were expensive and the extra 11
provided by the 8087 were considered the best tradeoff between cost
and accuracy.  They only provide 2-3 extra decimal digits of accuracy.
They are best thought of as guard bits.  Floating point uses 1 or 2
guard bits internally.  11 extends that significantly and externalizes
it, but is far from doubling the number of bits.  Their use to provide
extra precision was mostly defeated in C by bad C bindings and
implementations.  This was consolidated by my not using the extra bits
for the default rounding precision in FreeBSD.  This has been further
consolidated by SSE not supporting extended precision.  Now the naive
code that uses doubles never gets the extra precision on amd64.  Mixing
of long doubles with doubles is much slower with SSE+i387 than with
i387, since the long doubles are handled in different registers and
must be translated with SSE+i387, while with i387, using long doubles
is almost free (it actually has a negative cost in non-naive code since
it allows avoiding extra precision in software).  Thus SSE also inhibits
using the extra precision intentionally.

Bruce