standards/142803: j0 Bessel function inaccurate near zeros of the function

Bruce Evans brde at optusnet.com.au
Fri Jan 15 00:40:03 UTC 2010


The following reply was made to PR standards/142803; it has been noted by GNATS.

From: Bruce Evans <brde at optusnet.com.au>
To: "Steven G. Kargl" <kargl at troutmask.apl.washington.edu>
Cc: Bruce Evans <brde at optusnet.com.au>, FreeBSD-gnats-submit at FreeBSD.org,
        freebsd-standards at FreeBSD.org
Subject: Re: standards/142803: j0 Bessel function inaccurate near zeros of
 the function
Date: Fri, 15 Jan 2010 11:38:39 +1100 (EST)

 On Thu, 14 Jan 2010, Steven G. Kargl wrote:
 
 > Bruce Evans wrote:
 >> On Wed, 13 Jan 2010, Steven G. Kargl wrote:
 >>
 >>>> Description:
 >>>
 >>> The j0 Bessel function supplied by libm is fairly inaccurate at
 >>> ...
 >>
 >> This is a very hard and relatively unimportant problem.
 >
 > Yes, it is very hard, but apparently you do not use Bessel
 > functions in your everyday life. :)
 >
 > I only discovered this issue because I need Bessel functions
 > of complex arguments, and I found that my routines have issues
 > in the vicinity of zeros.  So, I decided to look at the
 > libm routines.
 
 It is interesting that the often-poor accuracy of almost every system's
 libm matters in real life.
 
 Complex args are another interesting problem, since even complex
 multiplication is hard to do accurately (it may have large cancellation
 errors).
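 
 To make that concrete, here is a minimal sketch (mine, not anything in
 libm) of where the cancellation comes from, and of Kahan's standard fix
 for it: the real part a*c - b*d loses most of its bits when the two
 products are nearly equal, and fma() can recover the rounding error of
 one product so that the small difference survives.
 
 %%%
 #include <math.h>
 #include <stdio.h>
 
 /*
  * Naive real part of (a+b*I)*(c+d*I).  When a*c and b*d are nearly
  * equal, the subtraction cancels almost all significant bits.
  */
 static double
 re_naive(double a, double b, double c, double d)
 {
 	return (a * c - b * d);
 }
 
 /*
  * Kahan's compensated version: the rounding error of b*d is recovered
  * exactly with fma() and added back, giving a result accurate to a
  * few ulps even when the naive version cancels to garbage.
  */
 static double
 re_kahan(double a, double b, double c, double d)
 {
 	double w = b * d;
 	double e = fma(-b, d, w);	/* exact rounding error of b*d */
 	double f = fma(a, c, -w);	/* a*c - w with a single rounding */
 
 	return (f + e);
 }
 
 int
 main(void)
 {
 	/*
 	 * a*c = 1 + 2**-26 + 2**-54 and b*d = 1 + 2**-26 exactly, so
 	 * the true answer is 2**-54; the naive version returns 0.
 	 */
 	double a = 1 + 0x1p-27, c = 1 + 0x1p-27;
 	double b = 1 + 0x1p-26, d = 1;
 
 	printf("naive: %.17g\n", re_naive(a, b, c, d));
 	printf("kahan: %.17g\n", re_kahan(a, b, c, d));
 	return (0);
 }
 %%%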
 
 >> Anyway, if you can get anywhere near < 10 ulp error near all zeros using
 >> only an asymptotic method, then that would be good.  Then the asymptotic
 >> method would also be capable of locating the zeros very accurately.  But
 >> I would be very surprised if this worked.  I know of nothing similar for
 >> reducing mod Pi for trigonometric functions, which seems a simpler problem.
 >> I would expect it to at best involve thousands of binary digits in the
 >> tables for the asymptotic method, and corresponding thousands of digits
 >> of precision in the calculation (4000, as for mpfr, enough for the 2**100th
 >> zero?).
 >
 > The 4000-bit setting for mpfr was a holdover from testing mpfr_j0
 > against my ascending series implementation of j0 with mpfr
 > primitives.  As few as 128 bits is sufficient to achieve the
 > following:
 >
 >>>    x        my j0f(x)     libm j0f(x)    MPFR j0        my err  libm err
 >    2.404825  5.6434398E-08  5.9634296E-08  5.6434400E-08      0.05 152824.59
 >    5.520078  2.4476657E-08  2.4153294E-08  2.4476659E-08      0.10  18878.52
 >    8.653728  1.0355303E-07  1.0359805E-07  1.0355306E-07      0.86   1694.47
 >   11.791534 -3.5291243E-09 -3.5193941E-09 -3.5301714E-09     75.93    781.53
 
 Wonder why this jumps.
 
 >   14.930918 -6.4815082E-09 -6.3911618E-09 -6.4815052E-09      0.23   6722.88
 >   18.071064  5.1532352E-09  5.3149818E-09  5.1532318E-09      0.23  10910.50
 >   21.211637 -1.5023349E-07 -1.5002509E-07 -1.5023348E-07      2.70  56347.01
 > ...
 >
 > As I suspected, adding additional terms to the asymptotic
 > approximation and performing all computations in double
 > precision reduces 'my err' (5th column).  The value at
 > x=11.7... is the best I can get.  The asymptotic approximations
 > contain divergent series, so additional terms do not help.
 
 The extra precision is almost certainly necessary.  Whether double
 precision is nearly enough is unclear, but the error near 11.7 suggests
 that it is nearly enough except there.  The large error might be caused
 by that zero alone (among small zeros) being very close to a representable
 value.
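 
 For reference, here is a rough sketch of the leading terms of the
 standard large-x expansion (Abramowitz & Stegun 9.2.5), which I assume
 is the asymptotic approximation being discussed.  It is only a sketch:
 it does nothing about the two hard parts, accurate reduction of
 x - Pi/4 and the cancellation near the zeros.
 
 %%%
 #include <math.h>
 
 /*
  * Leading terms of the standard asymptotic expansion:
  *
  *	J0(x) ~ sqrt(2/(Pi*x)) * (P0(x)*cos(x - Pi/4) - Q0(x)*sin(x - Pi/4))
  *	P0(x) ~ 1 - 9/(128*x^2) + 3675/(32768*x^4) - ...
  *	Q0(x) ~ -1/(8*x) + 75/(1024*x^3) - ...
  *
  * The series are divergent (asymptotic), so for a fixed x there is an
  * optimal number of terms beyond which accuracy gets worse, which is
  * the limitation described above.
  */
 static double
 j0_asym_sketch(double x)
 {
 	double z = 1 / (x * x);
 	double p = 1 + z * (-9.0 / 128 + z * (3675.0 / 32768));
 	double q = -1 / (8 * x) * (1 + z * (-75.0 / 128));
 
 	return (sqrt(2 / (M_PI * x)) *
 	    (p * cos(x - M_PI / 4) - q * sin(x - M_PI / 4)));
 }
 %%%
 
 (IIRC the libm j0 does essentially this for largish x, with fixed
 rational approximations for P0 and Q0 instead of the raw series.)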
 
 I forgot to mention that the error table in my previous mail is for amd64,
 comparing float precision functions with double precision ones, assuming
 that the latter are correct, which they aren't, but they are hopefully
 correct enough for this comparison.  The errors on i386 are much larger,
 due to i386 still using the i387 hardware trigonometric functions, which
 are extremely inaccurate near zeros, starting at the first zero.  Here are
 both tables (a rough sketch of this kind of comparison follows them):
 
 amd64:
 %%%
 j0:    max_er = 0x7fffffffffdf5e07 17179869183.9960, avg_er = 5.581, #>=1:0.5 = 1593961230:1839722934
 j1:    max_er = 0x7fffffffffbcd2a1 17179869183.9918, avg_er = 4.524, #>=1:0.5 = 1597678928:1856295142
 lgamma:max_er = 0x135a0b77e00000 10145883.7461, avg_er = 0.252, #>=1:0.5 = 44084256:331444835
 y0:    max_er = 0x7fffffffffbcd2a0 17179869183.9918, avg_er = 2.379, #>=1:0.5 = 837057577:1437331064
 y1:    max_er = 0x7fffffffffdf5e07 17179869183.9960, avg_er = 3.761, #>=1:0.5 = 865063612:1460264955
 %%%
 
 i386:
 %%%
 j0:    max_er = 0x6f8686c5c9f26 3654467.3863, avg_er = 0.418, #>=1:0.5 = 671562746:1332944948
 j1:    max_er = 0x449026e9c286f 2246675.4566, avg_er = 0.425, #>=1:0.5 = 674510414:1347770568
 lgamma:max_er = 0xe4cf242400000 7497618.0703, avg_er = 0.274, #>=1:0.5 = 70033452:508702222
 y0:    max_er = 0x93a2340c00000 4837658.0234, avg_er = 0.380, #>=1:0.5 = 594207097:1303826516
 y1:    max_er = 0x7ffa2068256f72ab 17176789825.1699, avg_er = 5.188, #>=1:0.5 = 459137173:1213136103
 %%%
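 
 For completeness, here is a rough sketch of the kind of float-vs-double
 comparison behind these tables (not my actual test program, whose ulp
 counting differs from this): evaluate j0f() and j0() at the same points
 and express the difference in ulps of the float result.
 
 %%%
 #include <math.h>
 #include <stdio.h>
 #include <stdlib.h>
 
 /*
  * Difference between the float function and its double counterpart,
  * in ulps of the float result.  This assumes the double function is
  * close enough to correct to serve as the reference, which is the
  * assumption stated above.
  */
 static double
 ulp_err(float got, double want)
 {
 	double ulp;
 
 	if (want == 0)
 		return (got == 0 ? 0 : HUGE_VAL);
 	ulp = ldexp(1.0, ilogb(want) - 23);	/* float spacing near want */
 	return (fabs((double)got - want) / ulp);
 }
 
 int
 main(void)
 {
 	double e, maxerr = 0, x;
 	float xf;
 	int i;
 
 	srand(1);
 	for (i = 0; i < 10000000; i++) {
 		/*
 		 * Random args in (0, 100); a real test would instead be
 		 * exhaustive over all floats in the range.
 		 */
 		x = 100.0 * rand() / RAND_MAX;
 		xf = (float)x;
 		e = ulp_err(j0f(xf), j0((double)xf));
 		if (e > maxerr) {
 			maxerr = e;
 			printf("x = %.9g: %.2f ulps\n", (double)xf, e);
 		}
 	}
 	printf("max = %.2f ulps\n", maxerr);
 	return (0);
 }
 %%%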
 
 Unfortunately, most of these i386 errors, and the amd64 error for y1(),
 are misreported.  The huge max_er of 0x7ff... (16 hex digits) is
 actually a misreported sign error.  Sign errors are bad enough.  They
 always occur at a simple zero z0 (f'(z0) != 0) when the approximation
 is so inaccurate that it cannot tell on which side of the infinite-precision
 z0 the argument lies.  Methods involving a table of zeros will not have
 any of these, provided the table is accurate to within 1 ulp, but other
 methods can easily have them, depending on the other method's ability
 to locate the zeros to within 1 ulp by solving f~(z) ~= 0, where f~ is
 the approximate f (a sketch of the table-of-zeros idea follows).
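 
 To make the table-of-zeros idea concrete: since j0'(x) = -j1(x), near a
 simple zero z0 we have j0(x) = -j1(z0)*(x - z0) + O((x - z0)^2), so if
 z0 is carried in extra precision (hi + lo doubles, generated with mpfr),
 the factor (x - z0) is computed essentially exactly and the sign comes
 out on the correct side automatically.  A minimal sketch for the first
 zero (with the lo part left at 0, since I don't have the extra digits
 at hand):
 
 %%%
 #include <math.h>
 #include <stdio.h>
 
 /*
  * First zero of j0, split into hi + lo doubles.  z0_lo should be an
  * mpfr-generated correction term; it is left at 0 here, so this table
  * entry only locates the zero to within half an ulp of a double.
  */
 static const double z0_hi = 2.404825557695773;
 static const double z0_lo = 0.0;
 
 static double
 j0_near_first_zero(double x)
 {
 	double d = (x - z0_hi) - z0_lo;	/* essentially exact near z0 */
 
 	/*
 	 * Leading term of the expansion about the zero; a real
 	 * implementation would add a short polynomial in d for the
 	 * higher-order terms.
 	 */
 	return (-j1(z0_hi) * d);
 }
 
 int
 main(void)
 {
 	double x = 2.4048256;		/* just above the first zero */
 
 	printf("libm j0(x) = %.9e\n", j0(x));
 	printf("sketch     = %.9e\n", j0_near_first_zero(x));
 	return (0);
 }
 %%%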
 
 Having no sign errors across the whole range seems too good to believe
 for the amd64 functions.  All of the i387 hardware trig functions have
 sign errors.
 
 Bruce

