i386/67469: src/lib/msun/i387/s_tan.S gives incorrect results for large inputs

Bruce Evans bde at zeta.org.au
Sat Feb 5 23:32:43 PST 2005


On Sat, 5 Feb 2005, David Schultz wrote:

> On Sat, Feb 05, 2005, Bruce Evans wrote:
> > Better call the MI tan() to do all this.  It won't take significantly
> > longer, and shouldn't be reached in most cases anyway.
>
> Yeah, but s_tan.S overrides s_tan.c, so that would require extra
> machinery.  Also, if fptan had worked as advertised and correctly
> identified the cases where range reduction was necessary, calling
> rem_pio2() directly would have avoided superfluous range checking.

I hacked the file (but not the machinery) easily enough for testing.

fptan advertises to work? :-)  At least the amd64 manual (instruction
reference) only claims that it sets C2 if the arg is outside the range
[-2^63,2^63].

> > I think it is because fptan is inherently inaccurate.  [...]
>
> As you mention, it gets the wrong answer for M_PI.  In general, it
> seems to do pretty badly on numbers close to multiples of pi.
> Granted, one could argue that the answer returned is going to be
> even *further* from the answer the programmer expected (namely, 0),
> so it won't make much difference. ;-)

Too bad the correct answer is a little further from 0.

> ...
> - The Intel software developer's guide says that fptan has an error
>   of at most 1 ulp, but as we've seen, they're lying.

The amd64 manual (application refrence) says "x87 computations are carried
out in double extended precision format, so that the transcendental
functions [are accurate to 1 ulp]".  The "so that" part doesn't follow.
We are seeing something like naive extended precision computations and how
inaccurate they can be even when only a double precision result is wanted.

> [... too much detail to reply to :-)]
> Your suggestion of effectively special-casing large inputs and
> inputs close to multiples of pi is probably the right way to fix
> the inaccuracy.  However, I'm worried that it would wipe away
> any performance benefit of using fptan, if there is a benefit
> at all.  How about punting and removing s_tan.S instead?

The problem affects at least sin() and cos() too, so I think throwing
away the optimized versions is too extreme.  Perhaps a single range
check to let the MD version handle only values near 0 ([-pi/2+eps,
pi/2-eps]? would be efficient enough.

> Other unanswered questions (ENOTIME right now):
> - What about sin() and cos()?
> - What about the float versions of these?

I tested sin().  It misbehaves similarly (with identical results for
args (2 * n * M_PI).

It's interesting that there is fptan and not ftan, but fsin and not fpsin.
Both are partial.

I don't run -current and hadn't seen your code for the MD float versions...
They are buggier:
(1) the exponent can be up to 127, so more than half of the representable
    values exceed 2^63 in magnitude and thus need range reduction.  Results
    for tan(0x1p64) using various methods:

      buggy MD tanf:   0x1p+64                 1.84467e+19
      buggy MD tan:   -0x1.8467b926c834bp-2   -0.379302
      fdlibm MI tanf: -0x1.82bee6p-6          -0.0236051
      bc s(x)/c(x) (scale=40):                 -.02360508353334969937000...
      bc s(x)/c(x) (default scale=20):         -.02358521765210826916

    It looks like fdlibm is perfect and scale=20 in bc is not quite enough.
    sinf(0x1p64) would be 0x1p64 too.  This is preposterous for sin().
(2) they return extra bits of precision.  The MD double precision versions
    have the same bug; the bug is just clearer for the float case, and
    cannot be fixed by FreeBSD's rounding precision hack.

BTW, I don't trust the range reduction for floating point pi/2 but
never finished debugging or fixing it.  According to my old comment,
it is off by pi/2 and very slow for many values between 32768 and
65535.  However, I couldn't duplicate this misbehaviour last time I
looked at it.  I used to fix it using the double version.

Bruce


More information about the freebsd-i386 mailing list