cpow(3) implementations.

Thu Aug 9 03:19:14 UTC 2012

On 08/08/2012 02:49 PM, Stephen Montgomery-Smith wrote:
> On 08/08/2012 01:11 PM, Bruce Evans wrote:
>> On Tue, 7 Aug 2012, Stephen Montgomery-Smith wrote:
>>
>>> On 08/07/2012 07:03 PM, Peter Jeremy wrote:
>>>> The C99 standard (at least WG14/N1256) is quite liberal as regards
>>>> cpow() and specifically allows a conforming implementation to just do:
>>>>    cpow(z, w) = cexp(w * clog(z))
>>>>
>>>> The downside of this approach is that log() inherently loses precision
>>>> by pushing (most of) the exponent bits into the fraction, displacing
>>>> original fraction bits.  I've therefore been looking at how to
>>>> implement cpow(3) with a precision similar to pow(3).  The following
>>>> are some thoughts, together with questions.
>>>>
>>>> In the following:
>>>>    w = a + I*b
>>>>    z = c + I*d
>>>>    cis(r) = cos(r) + I*sin(r)
>>>>    t = u + I*v = clog(c + I*d)
>>>>                = log(hypot(c, d)) + I*atan2(d, c)
>>>>
>>>>    cpow(z, w) = cexp(w * clog(z))
>>>>               = cpow(c + I*d, a + I*b)
>>>>               = cexp((a + I*b) * clog(c + I*d))
>>>>               = cexp((a + I*b) * (u + I*v))
>>>>               = cexp((a*u - b*v) + I*(a*v + b*u))
>>>>               = exp(a*u - b*v) * cis(a*v + b*u)
>>>
>>> I wouldn't regard errors in a*u-b*v as catastrophic cancellation.
>>> This is because exp(0) = 1.  So if the error in computing a*u-b*v is
>>> approx DBL_EPSILON, and a*u-b*v approx zero, even though the relative
>>> error in computing a*u-b*v is going to large (perhaps even infinite),
>>> nevertheless the error in exp(a*u-b*v) may still be bounded by 1 or 2
>>> ulp.
>>
>> Indeed.  It's sort of the opposite of catastrophic cancelation.  We
>> might have a*u-b*v off by a factor of 2, with value DBL_EPSILON when
>> it should be 2*DBL_EPSILON.  That is about 2**52 ulps in double
>> precision.  But then adding 1 turns it into an error of just 1 ulp.
>>
>> Even more so for the cos() part of cis().  Now the small term with the
>> large error gets squared before adding 1.
>>
>> But for the sin() part of cis(), the final error is the one that the
>> cancelation gave.  It may be small absolute for cis(), but on
>> multiplication by large exp() it will become large absolute (though
>> no larger relative, and always small relative to the cos() part which
>> is about 1).
>>
>>> More generally, as a mathematician, I would be far more concerned that
>>> cpow(z,w) return accurate answers when w is real, and especially when
>>> w is a small integer.  Real life applications of cpow(z,y) when w is
>>> not real are very few and far between.
>>
>> This is easy when z is also real, by using pow().  Then there is the
>> issue of continuity -- how close is cpow(x + I*eps, y) to pow(x, y)?
>> Similarly for cpow(x, y + I*eps).
>>
>>> I would be pleased if cpow(x,y) made special provisions for when y is
>>> a small integer, and so, for example, cpow(z,2) was computed as z*z =
>>> (x+y)*(x-y) + 2x*y*I.
>>
>> x^2 - y^2 is easier than x^2 + y^2 - 1 for clog().
>>
>>> For cpow(z,3), you are going to have a hard time avoiding large
>>> relative errors when x^2 = 3y^2 (i.e. z is parallel to a cube root of
>>> 1). Frankly I see that as somewhat unavoidable.
>>
>> x^2 - 3y^2 is still easier than for clog().  But go to z^4, and the real
>> part is x^4 - 6 x^2 y^2 + y^4, so it's much harder than clog().  Seems
>> to need quadrupled precision for the sum of the 4th powers, then a trick
>> to avoid maybe octupled precision :-).
>
> How about doing the real part of z^4 as
> ((x-y)*(x+y)-2*x*y)*((x-y)*(x+y)+2*x*y) ?
> That is what my suggested process would do.

I tested it.  It performs very poorly.

I wrote a cpow function, just for integers.  cpow(x+I*y,n) does very 
poorly when x/y approx tan(Pi/(2n)), that is, x+I*y is parallel to an 
nth root of -1.  Only the cases n=0,1,2,-1,-2 do well.

As a mathematician, I wouldn't expect any better.  In fact I would be 
pleasantly surprised if it got the case n=2 correct.

Just to point out:

tan(Pi/4) = 1
tan(Pi/8) = sqrt(2) - 1
tan(Pi/16) = (1 + Sqrt(2)) (sqrt(4 - 2 sqrt(2)) - 1)

if a_n = tan(Pi/(2n)), then
a_{n+1} = (sqrt(a_n^2+1) - 1) / a_n