Complex arg-trig functions
Stephen Montgomery-Smith
stephen at missouri.edu
Mon Sep 17 22:50:28 UTC 2012
OK, I am struggling a bit with the latest suggestions.
First, I have completely removed all the code related to when |z| is
small. I have just lost it all. So I didn't perform any changes
related to that code. If you want me to put it back with appropriate
"#if 0", can you email those code segments back to me?
On 09/17/2012 12:07 PM, Bruce Evans wrote:
> @ @@ -321,30 +334,28 @@
> @ }
> @ @ - if (isinf(x) || isinf(y))
> @ - return (cpackf(copysignf(0, x), copysignf(m_pi_2 + tiny, y)));
> @ + /* Raise inexact unless z == 0; return for z == 0 as a side
> effect. */
> @ + if ((x == 0 && y == 0) || (int)(1 + tiny) != 1)
> @ + return (z);
I'm not too sure where this code is meant to be. It looks like it
should be part of testing |z| small, but it seems to be placed where |z|
is large. When |z| is large, z=0 will never happen.
> cacos*() and casin*() should benefit even more from an up-front raising
> of inexact, since do_hard_work() has 7 magic statements to raise inexact
> where sum_squares has only 1.
Where is the code that raises inexact up-front?
> There are no magic expressions (int)(1+tiny) left except the new up-front
> one. There are still not-so- magic expressions (m_pi_2 + tiny). BTW,
> most or all of the recent fixes to use the latter expressions don't
> have a comment about raising inexact in catrig.c, while most or all
> older expressions for setting inexact do have such a comment.
I put the comments in.
> Previous optimization not turned off for debugging. It is simpler now
> that it can depend on inexact being raised up-front.
Ditto. Which code turns on inexact up front?
> @ } else
> @ rx = log1pf(4*ax / sum_squares(ax-1, ay)) / 4;
> @ @ - if (ax == 1) {
> @ - if (ay==0)
> @ - ry = 0;
> @ - else
> @ - ry = atan2f(2, -ay) / 2;
> @ - } else if (ay < FOUR_SQRT_MIN) {
> @ - if ((int)ay==0)
> @ - ry = atan2f(2*ay, (1-ax)*(1+ax)) / 2;
> @ - } else
> @ + if (ax == 1)
> @ + ry = atan2f(2, ay) / 2;
> @ + else if (ay < FOUR_SQRT_MIN)
> @ + ry = atan2f(2*ay, (1-ax)*(1+ax)) / 2;
> @ + else
> @ ry = atan2f(2*ay, (1-ax)*(1+ax) - ay*ay) / 2;
> @
>
> Remove the special case for (ax == 1, ay == 0). The general case gives
> the same result.
I don't think your code works. It should be ry = atan2f(2, -ay) / 2,
not ry = atan2f(2, ay) / 2.
In your tests, you should include cases where x or y is equal or close
to 1. These are important special cases that I think your test code is
very unlikely to hit. These are difficult edge cases for all the
arc-trig functions.
> Remove negation of ay for ax == 1. The sign will be copied into the result
> later for all cases, so it doesn't matter in the arg. I didn't check the
> branch cut details for this, but runtime tests passed.
See above.
> ... I now understand what the threshold should be. You have
> filtered out ax == 1. This makes 1 - ax*ax at least ~2*EPSILON, so
> ay*ay can be dropped if ay is less than sqrt(2*EPSILON*EPSILON) *
> 2**-GUARD_DIGITS = EPSILON * 2**-5 say. SQRT_MIN is way smaller
> than that, so FOUR_SQRT_MIN works too. We should use a larger
> threshold for efficiency, or avoid the special case for ax == 1.
> Testing shows that this analysis is off by a factor of about
> sqrt(EPSILON), since a threshold of EPSILON * 2**7 is optimal.
> The optimization made no difference to speed; it is just an
> optimization for understanding. Maybe the special case for ax == 1
> can be avoided, or folded together with the same special case for
> evaluation of the real part. This special case is similar to the
> one in clog(), but easier.
This was one of the clever ideas in the paper by Hull et al, which I
only understood recently. Their code was closer to your approach, I think.
Let me think about what you wrote some more.
>
> Further optimization: in sum_squares(), y is always ay >= 0, so there
> is no need to apply fabs*() to it. I think the compiler does this
> optimization. It can see that y == ay via the inline.
Well spotted.
More information about the freebsd-numerics
mailing list