long double broken on i386?
Steve Kargl
sgk at troutmask.apl.washington.edu
Thu Nov 15 12:17:19 PST 2007
On Sat, Nov 03, 2007 at 03:53:19PM +1100, Bruce Evans wrote:
> This thread is old, so I quoted too much.
I quote very little, but it remains an old thread. I'm
sure I have attributions wrong.
> % Index: msun/src/s_cosl.c
> % ...
> % +/*
> % + * Table of constants for 2/pi, 396 Hex digits (476 decimal) of 2/pi.
> % + * The table was taken from e_rem_pio2.c.
> % + */
> % +static const int32_t two_over_pi[] = {
> % +0xA2F983, 0x6E4E44, 0x1529FC, 0x2757D1, 0xF534DD, 0xC0DB62,
> % +0x95993C, 0x439041, 0xFE5163, 0xABDEBB, 0xC561B7, 0x246E3A,
>
> This gives far too many copies of this table (instead of just too many).
> The float and double versions avoid having a copy for each of cos/sin/tan
> by doing arg reduction in e_rem_pio2[f].c. The extra layer for this file
> is a pessimization but seems to be worth it to avoid the duplication.
>
> Are 396 hex digits enough? It can't be right for all of the float, double
> and long double cases. I hope it is enough for 128-bit long doubles, and
> the float case is just sloppy in using the same table as the double case.
> If 396 hex digits is enough for all cases, then the table should be in
> k_rem_pio2.c.
I started to rework my sinl(), cosl(), and tanl() (and __kernel_*)
routines. In doing so, I've taken, for example, __kernel_cos() and
changed it to __kernel_cosl() keeping the "extra precision" algorithm.
For reference, see code below. Testing shows that I have ULPs exceeding
38 for cosl() with this algorithm. If the "extra precision" algorithm
is correct, then perhaps your question "Are 396 hex digits enough?" is
relevant.
So, I poked around and found
http://docs.sun.com/app/docs/doc/801-7639/6i1ucu1uh?a=view
which contains the statement:
libm and libsunmath trigonometric functions use an "infinitely" precise
pi for argument reduction. The value 2/pi is computed to 916 hexadecimal
digits and stored in a lookup table to use during argument reduction.
I'm looking into this now.
--
Steve
static const long double xxx[10] = {
-0x8.000000000000000p-4L, /* The - here, changes the sign of hz. */
0xa.aaaaaaaaaaaaaabp-8L,
-0x5.b05b05b05b05b05p-12L,
0x1.a01a01a01a019dbp-16L,
-0x4.9f93edde27cbed7p-24L,
0x8.f76c77fc4bbff37p-32L,
-0xc.9cba54236b4e97fp-40L,
0xd.73f9aa92060766ap-48L,
-0xb.4105ba69deeed1ap-56L,
0x7.7e10d7644cea922p-64L,
};
#define C xxx
long double
__kernel_cosl(long double x, long double y)
{
long double hz, r, z, w;
#if 1
/*
* XXX - Although y supposely carries extra precision to allow
* better accuracy, the following lines are needed to get <= 1 ULP.
* Note, very simply testing with arg in [-2000:2000] shows ULPs
* that exceed 38 without this.
*/
x += y;
y = 0;
#endif
z = x * x;
r = z * (C[1] + z * (C[2] + z * (C[3] + z * (C[4] + z * (C[5] +
z * (C[6] + z * (C[7] + z * (C[8] + z * C[9]))))))));
hz = (float)C[0] * z;
w = 1 + hz;
return (w + (((1 - w) + hz) + (z * r - x * y)));
}
More information about the freebsd-standards
mailing list