Patches for s_expl.c

Thu May 30 20:19:30 UTC 2013

On Thu, 30 May 2013, Steve Kargl wrote:

> OK, I've restored whitespace to hopefully match your expectations.
> Removed excess digits in exponents (e.g., 1.234e08 --> 1.234e8).
> Restored XXX comments.
> Removed (unnecessary?) blank lines.
> Restored the order of computing r = r1 + r2 in ld128.
> Moved the |x| < 0x1p-113 if-block back into the [T1:T3] interval.

I like the ld80 version now.  My diffs for the ld128 version are below.

> Final questions.  What is your preference for committing expm1l?
> Should it be included in s_expl.c or should I use 'svn cp' to
> copy s_expl.c to s_expm1l.c and add the implementation of
> expm1l to the copied version?

I prefer it in the same file.  The big table is hard to manage in a
separate file (if the functions are split, then the table should be
too, since it is the largest component), and some constants would have
to be made public or duplicated.  Accesses to public tables and scalars
cannot be optimized (by the compiler) as much as static ones.  But when
you implement exp() so that it works as well as expl(), the table should
be shared in the ld80 case, so at least the table should be split then.

@ --- z22/s_expl.c	Fri May 31 04:31:30 2013
@ +++ s_expl.c	Fri May 31 05:32:51 2013
@ @@ -70,7 +70,13 @@
@ 
@ +/*
@ + * XXX values in hex in comments have been lost (or were never present)
@ + * from here.
@ + */

This patch fixes just a few.  All the double precision coeffs are in a
standad format now.

@  static const long double
@  /*
@ - * Domain [-0.002708, 0.002708], range ~[-2.4011e-38, 2.4244e-38]:
@ + * Domain [-0.002708, 0.002708], range ~[-2.4021e-38, 2.4234e-38]:

Checking the range showed that it is not quite the claimed one.  I
think the old values are from a previous check, but I improved the
checking program so the new values are hopefully more accurate.

Oops, I'm not quite happy with the ld80 version, since the checker
says that its B range is much more different than claimed than this
range.

@   * |exp(x) - p(x)| < 2**-124.9
@   * (0.002708 is ln2/(2*INTERVALS) rounded up a little).
@ + *
@ + * XXX the coeffs aren't very carefully rounded, and I get 2.3 more bits.
@   */

Perhaps the coeffs are rounded carefully enough now.  They can be chosen
better.

@ @@ -83,8 +89,25 @@
@  static const double
@ -A7  =  1.9841269841269471e-4,
@ -A8  =  2.4801587301585284e-5,
@ -A9  =  2.7557324277411234e-6,
@ -A10 =  2.7557333722375072e-7;
@ +A7  =  1.9841269841269470e-4,		/*  0x1.a01a01a019f91p-13 */
@ +A8  =  2.4801587301585286e-5,		/*  0x1.71de3ec75a967p-19 */
@ +A9  =  2.7557324277411235e-6,		/*  0x1.71de3ec75a967p-19 */
@ +A10 =  2.7557333722375069e-7;		/*  0x1.27e505ab56259p-22 */

Act on an old reminder to fix things and round the values properly
(just re-print the values given by the C declarations).  Also add
comments.

@ 
@  static const struct {
@ +	/*
@ +	 * hi must be rounded to at most 106 bits so that multiplication
@ +	 * by r1 in expm1l() is exact, but it is rounded to 88 bits due to
@ +	 * historical accidents.

Keep this part of the comment.

@ +	 *
@ +	 * XXX it is wasteful to use long double for both hi and lo.  ld128
@ +	 * exp2l() uses only float for lo (in a very differently organized
@ +	 * table; ld80 exp2l() is different again.  It uses 2 doubles in a
@ +	 * table organized like this one.  1 double and 1 float would
@ +	 * suffice).  There are different packing/locality/alignment/caching
@ +	 * problems with these methods.
@ +	 *
@ +	 * XXX C's bad %a format makes the bits unreadable.  They happen
@ +	 * to all line up for the hi values 1 before the point and 88
@ +	 * in 22 nybbles, but for the low values the nybbles are shifted
@ +	 * randomly.
@ +	 */

Reminders of things to fix.

In a development version, I need hi to have only about 56 bits.  It is
easy to re-split hi+lo for testing this.  A 24-bit or 53-bit hi is
sufficient and would give this automatically.

@  	long double	hi;
@ @@ -311,5 +336,11 @@
@   * Split the interval [T1:T2] into two intervals [T1:T3] and [T3:T2].
@ - * Setting T3 to 0 would require the |x| < 0x1p-113  condition to appear
@ + * Setting T3 to 0 would require the |x| < 0x1p-113 condition to appear
@   * in both subintervals, so set T3 = 2**-5, which places the condition
@   * into the [T1:T3] interval.
@ + *
@ + * XXX we now do this more to (partially) balance the number of terms
@ + * in the C and D polys than to avoid checking the conditon in both
@ + * intervals.
@ + *
@ + * XXX these micro-optimizations are excessive.
@   */
@ @@ -319,7 +350,25 @@
@  /*
@ - * XXX Estimated range is for absolute error.
@ - * Domain [-0.1659, 0.03125], range ~[-1.8933e-38, 1.8943e-38]:
@ - * |(exp(x)-1-x-x**2/2)/x**3 - p(x)| < 2**-125.3
@ + * Domain [-0.1659, 0.03125], range ~[2.9134e-44, 1.8404e-37]:
@ + * |(exp(x)-1-x-x**2/2)/x - p(x)| < 2**-122.03

The relative error should be the one documented.

@ + *
@ + * XXX the coeffs aren't very carefully rounded.  I got 10.3 more bits with
@ + * the old version for [-0.1659, -0.03125].  Now T3 is better balanced, and
@ + * I would expect only 7-8 extra bits.
@ + *
@ + * XXX the number of terms can be reduced by 1.  Then I get a few more bits
@ + * with the same number of doubles (5), and 0.7 more bits with 8 doubles.
@ + * This much accuracy is hard to explain, and it isn't clear that reduction
@ + * of x to double is valid at the same point that reduction of the coeffs to
@ + * double.  With C10 double, the absolute errors from rounding it are up to
@ + * about 2**-53 * 0.1659**10/10! ~= 2**-100.8.  Remes apparently improves
@ + * this to 2**-122.1.
@   */

Better polynomials should be used someday, but I want you to generate them.
After fixing the generator to minimize the relative error instead of the
absolute error, you should get ones like mine.

@  static const long double
@ +/*
@ + * XXX none of the long double C or D coeffs except C10 is correctly printed.
@ + * If you re-print their values in %.35Le format, the result is always
@ + * different.  For example, the last 2 digits in C3 should be 59, not 67.
@ + * 67 is apparently from rounding an extra-precision value to 36 decimal
@ + * places.
@ + */
@  C3  =  1.66666666666666666666666666666666667e-1L,

I didn't fix these.

@ @@ -337,3 +386,3 @@
@  static const double
@ -C14 =  1.1470745580491932e-11,		/*  0x1.93974a81dae3p-37 */
@ +C14 =  1.1470745580491932e-11,		/*  0x1.93974a81dae30p-37 */
@  C15 =  7.6471620181090468e-13,		/*  0x1.ae7f3820adab1p-41 */
@ @@ -344,5 +393,17 @@
@  /*
@ - * XXX Estimated range is for absolute error.
@ - * Domain [0.03125, 0.1659], range ~[-2.7597e-38, 2.7602e-38]:
@ - * |(exp(x)-1-x-x**2/2)/x**3 - p(x)| < 2**-124.8
@ + * Domain [0.03125, 0.1659], range ~[-2.7676e-37, -1.0367e-38]:
@ + * |(exp(x)-1-x-x**2/2)/x - p(x)| < 2**-121.44
@ + *
@ + * XXX the coeffs aren't very carefully rounded. I get 5.2 more bits with
@ + * the old version for [-0.03125, 0.1659].  Now T3 is better balanced, and
@ + * I would expect 7-8 extra bits.
@ + *
@ + * XXX the number of terms can be reduced by 1.  Then I get a few more bits
@ + * with the same number of doubles (4), and 1.1 more bits with 6 doubles.
@ + * This much accuracy is hard to explain, etc., as above.  With D11 double,
@ + * the absolute errors from rounding it are up to about
@ + * 2**-53 * 0.1659**11/11! ~= 2**-106.8.
@ + *
@ + * Note that with my coeffs, although this side needs 1 fewer term, it needs
@ + * 1 more long double term, so it is probably actually slower on sparc64.
@   */

It's painful to have separate polys C and D for Tang's B.

@ @@ -403,3 +466,2 @@
@  	if (T1 < x && x < T2) {
@ -
@  		x2 = x * x;

Bruce