i386/67469: src/lib/msun/i387/s_tan.S gives incorrect results for large inputs

David Schultz das at FreeBSD.ORG
Sun Feb 6 01:30:22 PST 2005


The following reply was made to PR i386/67469; it has been noted by GNATS.

From: David Schultz <das at FreeBSD.ORG>
To: Bruce Evans <bde at zeta.org.au>
Cc: FreeBSD-gnats-submit at FreeBSD.ORG, freebsd-i386 at FreeBSD.ORG
Subject: Re: i386/67469: src/lib/msun/i387/s_tan.S gives incorrect results for large inputs
Date: Sun, 6 Feb 2005 04:21:51 -0500

 On Sun, Feb 06, 2005, Bruce Evans wrote:
 > > [... too much detail to reply to :-)]
 > > Your suggestion of effectively special-casing large inputs and
 > > inputs close to multiples of pi is probably the right way to fix
 > > the inaccuracy.  However, I'm worried that it would wipe away
 > > any performance benefit of using fptan, if there is a benefit
 > > at all.  How about punting and removing s_tan.S instead?
 > 
 > The problem affects at least sin() and cos() too, so I think throwing
 > away the optimized versions is too extreme.  Perhaps a single range
 > check to let the MD version handle only values near 0 ([-pi/2+eps,
 > pi/2-eps]? would be efficient enough.
 > 
 > > Other unanswered questions (ENOTIME right now):
 > > - What about sin() and cos()?
 > > - What about the float versions of these?
 > 
 > I tested sin().  It misbehaves similarly (with identical results for
 > args (2 * n * M_PI).
 
 My question was more along the lines of, ``Is there an actual
 performance benefit for sin() and cos()?''  If there is little or
 no benefit, then there is no reason to keep the assembly routines
 and worry about how to fix them.  I'll try to investigate this
 aspect in more detail at some point, unless you beat me to it...
 
 > I don't run -current and hadn't seen your code for the MD float versions...
 > They are buggier:
 > (1) the exponent can be up to 127, so more than half of the representable
 >     values exceed 2^63 in magnitude and thus need range reduction.  Results
 >     for tan(0x1p64) using various methods:
 > 
 >       buggy MD tanf:   0x1p+64                 1.84467e+19
 >       buggy MD tan:   -0x1.8467b926c834bp-2   -0.379302
 >       fdlibm MI tanf: -0x1.82bee6p-6          -0.0236051
 
 Whups.  FWIW, they're not mine; I got them from NetBSD.  Another
 one of the float functions in the NetBSD repository is buggy,
 too, but I forget which one.  I only imported the ones that seemed
 to be correct and faster than fdlibm for input bit patterns chosen
 uniformly at random, but I guess I missed that problem.
 
 > (2) they return extra bits of precision.  The MD double precision versions
 >     have the same bug; the bug is just clearer for the float case, and
 >     cannot be fixed by FreeBSD's rounding precision hack.
 > 
 > BTW, I don't trust the range reduction for floating point pi/2 but
 > never finished debugging or fixing it.  According to my old comment,
 > it is off by pi/2 and very slow for many values between 32768 and
 > 65535.  However, I couldn't duplicate this misbehaviour last time I
 > looked at it.  I used to fix it using the double version.
 
 Yeah, there seem to be many values for which __ieee754_rem_pio2f()
 returns the wrong quotient:
 
 -0x1.8009c6p+8       (input)
         d = -244     (return value of __ieee754_rem_pio2())
         f = -4       (return value of __ieee754_rem_pio2f())
 
 0x1.389ee4p+87
         d = 4
         f = 3
 
 -0x1.70bca6p+16
         d = -60095
         f = -7
 
 It also gets the remainder completely wrong sometimes:
 
 0x1.389ee4p+87
         rd0 = -0x1.6ad286p-4       (y[0] from rem_pio2(), rounded to float)
         rf0 = 0x1.7b728cp+0        (y[0] from rem_pio2f())
 
 0x1.4d23ecp+95
         rd0 = -0x1.ee68f8p-2
         rf0 = 0x1.168578p+0
 
 -0x1.5a172p+63
         rd0 = 0x1.52f2aap-1
         rf0 = -0x1.d14ccp-1
 
 Also, rem_pio2f() is probably not much more efficient than
 rem_pio2().  The former would be better if it used a single double
 instead of two floats for increased accuracy.  (The double version
 has two use two doubles for accuracy because there's no ``double
 double'' type.)
 
 Results for MI tan() differ from MI tanf() by >2 ulp for many
 inputs, including:
 
 	0x1.00008ep+15
 	0x1.0000ap+15
 	0x1.0000a4p+15
 	0x1.0000aep+15
 	[...]
 
 Perhaps this is due to the problem with rem_pio2f().


More information about the freebsd-i386 mailing list