dead code in lgamma_r[f]?
Steve Kargl
sgk at troutmask.apl.washington.edu
Sun Dec 8 18:33:46 UTC 2013
On Sun, Dec 08, 2013 at 03:21:04PM +1100, Bruce Evans wrote:
> On Sat, 7 Dec 2013, Steve Kargl wrote:
>
> > I really need to stop looking at fdlibm code. The threshold(s) in
> > e_lgammaf.c could be chosen better (and actually documented). In
> > the float code
> >
> > /* 8.0 <= x < 2**58 */
> > } else if (ix < 0x4e800000) { // (ix < 0x5c800000) {
> > t = __ieee754_logf(x);
> > z = one/x;
> > y = z*z;
> > w = w0+z*(w1+y*(w2+y*(w3+y*(w4+y*(w5+y*w6)))));
> > r = (x-half)*(t-one)+w;
> > } else
> > /* 2**58 <= x <= inf */
> > r = x*(__ieee754_logf(x)-one);
> >
> > the 2**58 is much too large. Asymptotically, we have
>
> That's Cygnus code, not fdlibm :-).
>
> > lgamma(x)~(x-0.5)log(x)-x = x*(log(x)-1) - 0.5*log(x)
> >
> > where the 0.5*log(x) is insignificant when compared to 2**58.
> > I suspect that 2**58 is a sloppy threshold for double, which
> > can be reduced to 2**30 (or smaller) for float. I suppose I
> > could look for a tighter bound, but 2**(p+slop) with p, the
> > precision, probably is sufficient.
>
> The Cygnus translation to float didn't look closely enough at
> the thresholds.
It also did not recompute minimax coefficients as there are too
many terms in the approximation. Looks like Cygnus simply rounded
the double coefficients to float.
> gamma() was the first part of libm that I looked closely at (even
> before trig functions in 2004/5). I didn't understand it at first and
> tried to write my own. My first idea was to use the asymptotic
> expansion as a kernel for all cases, especially for complex args.
> To use the asymptotic expansion, the real part of the arg must be
> pushed up to about 16 or greater using repeated multiplications.
> It is difficult to do these multiplications accurately enough,
> but the float case should be very easy using double precision
> multiplication. I tried extra precision in software, but this
> was too slow. This method also works poorly for args with negative
> real part, especially for lgamma() near its zeros. Even using
> doubled double precision didn't get near 1 ulp of accuracy in float
> precision for lgamma() near its zeros, and the only fix that I
> know of is to use a special approximation near each zero down
> to about real part -70 (70 cases). fdlibm is also bad for lgamma()
> near its zeros.
Interesting idea. For x in [2,8), fdlibm code (as you probably
know) uses recursion to reduce the evaluation to a log of a
product and lgamma(2+s) with s in [0,1). One can then show that
lgamma(2+s) = s * (1 - E) + s**2 * P(s) where E is Euler's constant
and P(s) is a polynomial in s. But, instead of using a minimax
polynomial approximation fdlibm code uses a rational approximation
lgamma(2+s) = 0.5*s + p(s) / q(s). This is the origin of the
s0, ..., s6, r1, ..., r6 coefficients.
> Here is a pari version of this (the C version is harder to understand
> due to its technical details and less suitable language; pari syntax
> is worse than C syntax for most things (including things that pari
> should be able to do well) but slightly better for this use:
I suppose one of these days, I'll learn how to use pari.
A quick scan of n1256.pdf shows that C99 does not require
lgamma of a complex argument, but it does reserve the
names clgamma[fl].
> % lg(z) =
> % {
> % local(bol, rnew);
> %
> % if (abs(z - 1) < 1.e-1, return (lngamma(z)););
> % if (abs(z - 2) < 1.e-1, return (lngamma(z)););
>
> Near 1 and 2, I planned to use a poly approximation, but that is not
> so easy to do in pari so I call the pari builtin lgamma() to fake it.
fdlibm's lgammaf_r does not appear to have an issue near 1 or 2.
Starting at the lower bound of each interval and using nextafterf
to scan up to the upper bound, I'm seeing
% make testf && ./testf
Interval: Time ULP Value
[4.7683716e-07, 2.0000000e+00): 0.04590 0.98491 1.231645e+00
[2.0000000e+00, 8.0000000e+00): 0.08754 0.82825 4.012439e+00
[8.0000000e+00, 2.8823038e+17): 0.05841 1.43789 8.942273e+06
[2.8823038e+17, 2.6584560e+36): 0.05095 0.50000 1.491249e+23
where the reference value is from lgamma_r. The different
intervals test specific branches in lgammaf_r. The upper
bound of 2.6584560e36 is 0x1p121. Time is the average value
for all calls in the interval in usec/call.
> % bol = 0;
> % r = 1;
> % while (abs(z) < M,
> % rnew = r * z;
> % if (imag(r) > 0 && imag(rnew) < 0, bol += 1);
> % if (imag(r) < 0 && imag(rnew) > 0, bol -= 1);
> % r = rnew;
> % z = z + 1;
> % );
>
> This pushes up z, almost as above.
>
> % bol = 0;
> % return (lgb(z) - log(r) - 2 * Pi * bol);
>
> Then use the asymptotic expansion.
>
> % }
> %
> % lgb(z) = (z - 0.5) * log(z) - z + 0.5 * log(2 * Pi) + \
> % sum(n = 1, N, b[n] / z^(2 * n - 1))
>
> This is the asymptotic expansion. Pari syntax is only perfect for
> numeric things like this (and probably for some number theory things
> which I don't know well). b[n] is a table of coefficients.
>
> I had forgotten the details of even the first term in this. You
> mentioned using an approximation of just the first term or two above
> a huge threshold. It might be better to go much lower using another
> term or two (but not as many as medium-sized args).
Well, I was only interested in the poorly chosen 2**58 threshold.
In the [8,2**58) interval the asymptotic expansion is used, but
its written in such a way that the
>% + 0.5 * log(2 * Pi) + sum(n = 1, N, b[n] / z^(2 * n - 1))
portion of your above code is approximated by a minimax polynomial.
/* 8.0 <= x < 2**58 */
} else if (ix < 0x5c800000) {
t = __ieee754_logf(x);
z = one/x;
y = z*z;
w = w0+z*(w1+y*(w2+y*(w3+y*(w4+y*(w5+y*w6)))));
r = (x-half)*(t-one)+w;
} else
fdlibm use only the first 2 terms for the x >= 2**58.
> Cephes has a very complicated lgamma() or gamma(), with many special
> poly approximations for 0 < x < 8 or similar.
fdlibm's approximations in the [0,2) range are mostly black magic
to me.
--
Steve
More information about the freebsd-numerics
mailing list