long double broken on i386?

Wed Dec 5 12:36:26 PST 2007

On Wed, Dec 05, 2007 at 01:51:32PM -0500, David Schultz wrote:
> On Mon, Dec 03, 2007, Steve Kargl wrote:
> > On Mon, Dec 03, 2007 at 02:44:08AM -0500, David Schultz wrote:
> > > Is the latest version of the patch the one you sent to the list?
> > > If cosl and friends produce accurate answers within a reasonable
> > > domain, it's worth committing; the whole business about how
> > > accurate cosl(1000000000) is can be dealt with later.
> > 
> > No, I've rewritten most of the code (numerous times in fact :-).
> > 
> > First, I followed Bruce's recommendation to compute the
> > ULP in the higher precision available in GMP/MPFR instead 
> > of the long double type.  The old method appears to 
> > have been using a chop rounding because I would get
> > only 0.5, 1.0, 1.5, ... ULPs (ie, I never saw anything like
> > 0.6312 ULP).  With the new method, I see ULPs in the 0.6
> > to 0.75 range.  I haven't had time to implement a full
> > blown Remes algorithm to see if I can reduce the ULP. 
> > 
> > Second, I've implemented the algorithms found in 
> > s_cos.c, s_sin.c, and s_tan.c  that use the extra
> > precision from the argument reduction.  If I use
> > these algorithms, then the ULPs go up!  For example,
> > the ULP for sinl() goes up to about 1.5.  Bruce
> > had suggested that the 396 hex digits in the table
> > for 2/pi may need to be checked.  I found a paper
> > by K.C. Ng at David Hough's website that (if I read it
> > correctly) suggests that we need a much larger
> > table.  So, I need to check how well argument reduction
> > is working.
> 
> Having a version that works for machines with the 128-bit floating
> point format is pretty important. (These would probably be two
> separate files; I've been thinking we should add a msun/ieee96
> directory and a msun/ieee128 directory or something.)

This is probably a good idea if freebsd wants to maintain the
same algorithm across, say, sinf, sin, and sinl.  I can produce
code for the ieee128 case, but I have no way to test it.

An alternative may be to use table-driven implementations where
the intervals are defined by exactly representable values in ieee96
(i.e., 64-bit significand), which are then also exactly representable
in ieee128.  I haven't investigated how many intervals one would
need nor the best interpolation scheme.

> But honestly, I've tried to wrestle with the argument reduction
> stuff before, and my advice is to not kill yourself over it.
(large arg discussion elided)

> That's not to say that we need no argument reduction at all. For
> instance, cosl(5*pi/3) should still give an accurate answer. But
> when the input is so huge that the exponent is bigger than the
> number of significant bits in the mantissa, you lose so much
> precision in the input that it's not as important anymore. That's
> probably why Intel decided to make its trig instructions only work
> up to 2^63 before requiring explicit argument reduction.

This is what I need to check.  For reasonable args, does the
arg reduction provide extra precision and is it useful?  It
seems that you may have already checked this.  

I hope to revisit some of this on Saturday.

-- 
Steve