j0 (and y0) in the range 2 <= x < (p/2)*log(2)

Bruce Evans brde at optusnet.com.au
Sat Sep 8 12:09:47 UTC 2018

On Fri, 7 Sep 2018, Steve Kargl wrote:

> On Sat, Sep 08, 2018 at 12:07:26AM +1000, Bruce Evans wrote:
>> ...
>> In fact, this must more or less work, and the difficulty is to make it work
>> more than less.  Start with any reasonable formula like (u*cc-v*ss).  (This
>> is not specific to j0, except it gives an example of a nontrivial formula.)
>> cc and ss are held fixed.  The best choice of u is not known, but we if we
>> can get fairly close to it (perhaps even off by a factor of 2), then we can
>> throw away any previous approximation for v and solve for v (in high
>> precision), then approximate that.  There must be no singularities for this
>> to work in high precision.  We expect to avoid the singularities by some
>> combination of starting with u close enough, keeping the interval small,
>> and special handling of zeros.
>> ...
> I don't see how the above works.  We have
> j0(x) = sqrt(1/pi) * (cc*u-ss*v) / sqrt(x).
> If we take u to be accurate (with some amount of error from the
> asymptotic series) and assume v as an unknown, we can solve for
> v to high accuracy.  But, u is still inaccurate.  So, in the

u should have a small error, but suppose we replace it by 0.  This
also loses the ss factor.  To make things even worse, replace (-ss)
and the sqrt's by 1.  Then the approximation reduces to v alone, and
we know that we can do this fairly accurately.  Actually doing this,
I get an absolute accuracy of 53.652 bits for approximating plain j0
on [2, 2.857] using an even polynomial of degree 16.  A rational
function of order 16 (with degree 8 and 8 polynomials?) should work
slightly better.  Getting relative accuracy near the zero will lose
some of the absolute acuracy, but I expect a degree/order of 2 higher
would be enough to compensate.

But we keep u and all the other terms in the expecation that they will
allow better approximations with lower orders for u.  They actually
seem to require much lower orders of u get only slightly worse
approximations.  pzero(x) = 1 + r(z) / s(z) where z = 1/(x*x) has
degree 5 for p and degree 4 for q.  This has order 2 * (5 + 4) = 18
in x.  qzero(x) = (-0.125 + r(z) / s(z)) / x has degree 1 higher for
s and thus order 20 in x for the r/s part, or 19 for qzero.  Add another
12 or 13 for the order of cos or sin.  Add further inefficiencies and
inaccuracies for the divisions in this.

I get a large unimprovement in the simple polynomial approximation by
approximating j0(x) * sqrt(x) instead of j0(x).  This compensates for
keeping the 1/sqrt(x) factor from the asymptotic formula.  It breaks
the evenness of the function being approximated, so using an even
polynomial of degree 18 gives an accuracy of only 26.754 bits.  For
odd polys, it takes degree 13 to get 53+ bits of accuracy.  For even
polys, it takes degree 38 to get 53+ bits of accuracy.

It now seems clear that the asympotic approximation is just garbage near
x = 2.  Especially the 1/sqrt(x) factor in it.  This destroys the evenness
of the function so it needs a much higher order of rational function to
compensate.  Probably similarly for x <= 8.  It is only for x > 8 that we
are forced to use the asymptotic expansion with its inconvenient 1/sqrt(x)

> interval [2,2.857] we accept whatever the truncated asymptotic
> series gives for u and adjust v to compensate.  In the next
> interval, [2.857,4.547], we flip the roles of u and v based on
> the behavior of cc and ss.  Eventually, we get to the interval
> [8,inf).  The divergent asymptotic series in 8 <= x < p/2*log(2)
> give u and v, which yield poor values for j0(x).  There are
> 4 zeros of j0(x) in this range, and cc and ss have 4 and 3 zeros,
> respectfully.

I didn't notice at first that you were stating my idea in different
words here.  I think you did see.

> There are some subtleties in your approach that I don't
> understand, yet!

The only subtlety seems to be that we didn't understand the full
awfulness of using the asympotic formula where it doesn't apply.
The subtleties mostly go away if we use the simple method of a
different poly approx on each subinterval.

I don't like rational function approximations (with a nontrivial
denominator), and don't fully understand why they are often more
accurate.  Now I suspect that their main advantage is reducing the
number of zeros in the approximating function, and that this advantage
is especially small here since we use small intervals to avoid zeros
in the function being approximated.  Higher degree polys have a higher
number of zeros, and these zeros must all be outside of the interval
except ones matching the zeros of the function being approximated.
Using a rational function reduces the degree, by using poles instead
of zeros.  The poles must also be outside of the interval, but this is
apparently easier to arrange.

> I'll probably try a different approach tomorrow.  j0(x),
> cc, and ss are smoothly varying functions of x.  u and v
> appear to be slowly varying (in small intervals), so we can
> have
> j0(x)   = sqrt(1/pi) * (cc(x)  *u-ss(x)  *v) / sqrt(x).
> j0(x+e) = sqrt(1/pi) * (cc(x+e)*u-ss(x+e)*v) / sqrt(x+e).
> where e is some small perturbution.  Two equation and two
> unknowns is easily solved.  Simply need to do this across
> the interval to generate u(x) and v(x).

I will reply to your other mail about this.  Better throw away the
asymptotic expansion except for the interval at infinity, but that
would require reorganizing almost everything.  I would first try
using the simple poly approx in the interval before the infinite one.
The asymptotic formula should work better then (but pzero and qzero
sill have high order, perhaps because the destruction of parity by
the 1/sqrt(x), u and v factors is almost as bad as for the interval
starting at 2).

>> I noticed that this method cannot work for jn, since u and v depend on n
>> so there is no way to generate before runtime.  jn actually uses a different
>> method and I suspect that it is much more inaccurate.  My tests don't
>> cover jn since its n arg is an integer and I only have coverage for 2
>> real args or 1 complex arg.
> jn with n > 1 only uses the large argument asymptotic expansion for
> the case of very large x where p0(x) = 1 and q0(x) = 0.  This then
> gives (from the comment in e_jn.c)
>     *     Jn(x) = cos(x-(2n+1)*pi/4)*sqrt(2/x*pi)
> The work is in computing cos().  Elsewhere jn is computed via
> upward or downward recursion.

I fear that the recursion is very inaccurate.  It seems to do about n
multiplications, so must have an inaccuracy of >= n/2 ulps.

>>> I'm beginning to like my use of the product formula.  It
>>> allows one to remove the zero from an interval.  I did some
>> I haven't seen that.  Is it for the final evaluation or for determining
>> the approximation?  I don't know of any better way determining approximations
>> near zeros except to multiply by a linear factor to remove the zero.
>> It is becoming clearer why a separate interval is needed for each zero.
>> Polynomials can only give good approximations for one zero at a time
>> except in special cases.
> I haven't sent the code to you as its a work in progress.  I did
> send you some of my testing results for 2 <= x < 4 a couple weeks
> ago.  For j0(x), the infinite product is (https://dlmf.nist.gov/10.21)
> j0(x) = prod(1 - (x/zj)**2), j = 0, 1, 2, ...
> with z0 the first zero of j0(x), etc.  This can be rewritten and
> approximated by
> j0(x) / ((x-z0)*(x+z0)) = R(x) / S(x)
> While developing the rational approximation, on the LHS we use
> z0 = z0hi + z0lo.  Both z0hi and z0lo have p-bits of precision.
> The final result is
> j0(x) = (x - z0hi - z0lo) * (x + z0) * R(x) / S(x)

I see.  I think this can only work well near one zero, and reduces to
my idea for [2, 2.857].  For simplicity, I used j0(x) = P(x) and omitted
the details for handling the zero.  Your formula gives a good way to handle
the zero.  I doubt that the rational function works much better than the
polynomial.  It is difficult to evaluate the multiplicatons accurately
and the division gives further inaccuracies.  The expression should be
written as:

     lower terms + (x - z0hi - z0lo) * (x + z0) * P0.

This still has an inaccuracy of about 1.5 ulps.  To do better, everything
in this expression would have to be done in extra precision, starting with
P0 = P0hi + P0lo, but this is too hard.

It obviously can't work on an infinite interval, since there are an infinite
number of zeros there, but only a finite number pf zeros in R(x) or P(x), and
poles in 1/S(x).  It's surprising that the asymptotic expansion seems to
find all the zeros of j0f with errors of less than 4 million ulps (according
to consistency checks between j0f and j0).


More information about the freebsd-numerics mailing list