svn commit: r241755 - head/lib/msun/src

Mon Oct 22 03:08:52 UTC 2012

Feel free to fix them however.  I added the comments because the algorithms weren't quite the same...  If you have a better way, feel free to back my stuff out on the way to it.

Warner

On Oct 19, 2012, at 11:43 PM, Bruce Evans wrote:

> On Fri, 19 Oct 2012, Warner Losh wrote:
> 
>> Log:
>> Document the methods used to compute logf. Taken and edited from the
>> double version, with adaptations for the differences between it and
>> the float version.
> 
> Please back this out.
> 
> This was intentionally left out.  It is painful to maintain large
> comments that should be identical for all precisions.  Double precision
> is primary, and only comments that depend on the precision should be
> given in other precisions, except for short comments to to right of
> coulde whose non-duplication would just give larger diffs.  If you
> want to duplicate them, then they would often have to be quadruplicated
> in 4 files for the 4 supported precisions:
> - double precision
> - float precision
> - long double with MANT_DIG = 64 and 16 other bits
> - long double with MANT_DIG = 113 and 15 other bits
> Only 4 now, but there could easily be variations for long doubles.
> 
> log* will be replaced by my version soon.  One of the organizational
> things that I'm currently struggling with is how not to duplicate the
> comments in them.  Currently I duplicate then in 4 files, and don't
> quite duplicate then in a 5th file for optimized float precision.
> Maintaining them has been very painful.  They are about 3 times as large
> as the comments here, and have more and more-delicate precision-dependent
> details, so trimming them is harder than here.  They do cover all of
> log*(), log1p*(), log2*() and log10*() in the same file, so there is
> not as much duplication as for the different e_log*.c files.
> 
> Another organizational problem is how to organize the functions.  I
> currently have 4 primary interfaces per file (should have a couple
> more for kernels), and 1 file for each precision, and currently use
> inlines and ifdefs to avoid these inlines, but the inlines are
> not properly optimized for all cases of interest, and break debugging
> in most cases, so I plan to switch to more macro hackery (like multiple
> #include __FILE__'s with ifdefs to select what is compiled for each
> #include) .  Even more macro hackery would let me put support for all
> precisions in the same files, so duplicated comments would be even less
> useful, but the precision-dependent ones would be even harder to write
> and read.
> 
> We are developing inverse trig functions, and handle the comments and
> other things by generating code for all precisions from the double
> precision case.  Large comments are stripped as part of the generation.
> This is easier to write but not so good to read.  This is only feasible
> because the code is not very precision-dependent.
> 
>> Modified: head/lib/msun/src/e_logf.c
>> ==============================================================================
>> --- head/lib/msun/src/e_logf.c	Fri Oct 19 22:21:01 2012	(r241754)
>> +++ head/lib/msun/src/e_logf.c	Fri Oct 19 22:46:48 2012	(r241755)
>> @@ -19,6 +19,57 @@ __FBSDID("$FreeBSD$");
>> #include "math.h"
>> #include "math_private.h"
>> 
>> +/* __ieee754_log(x)
> 
> Here the comment doesn't even match the code.  This is __ieee754_logf(x),
> not __ieee754_log().  Technically, this is a comment that depends on the
> precision, so it should be given separately, but the differences for type
> suffixes are especially painful to maintain in comments and especially
> useless to document in comments.
> 
>> + * Return the logrithm of x
> 
> This duplicates e_log.c to a fault.  Fixing this spelling error would
> require touching multiple files.
> 
>> + *
>> + * Method :
>> + *   1. Argument Reduction: find k and f such that
>> + *                      x = 2^k * (1+f),
>> + *         where  sqrt(2)/2 < 1+f < sqrt(2) .
>> + *
>> + *   2. Approximation of log(1+f).
>> + *      Let s = f/(2+f) ; based on log(1+f) = log(1+s) - log(1-s)
>> + *               = 2s + 2/3 s**3 + 2/5 s**5 + .....,
>> + *               = 2s + s*R
> 
> Nothing new here.  I don't want to see it again.
> 
>> + *      We use a special Reme algorithm on [0,0.1716] to generate
>> + *      a polynomial of degree 8 to approximate R The maximum error
>> + *      of this polynomial approximation is bounded by 2**-34.24. In
>> + *      other words,
> 
> Lots more spelling and grammar errors.  I'm not sure if the correct spelling
> is Remez or Remes, but I'm sure it isn't Reme.  One sentence break is
> missing a "." and the others are consistently too short.
> 
> The 2**-34.24 number is precision-dependent and is documented in more
> completely below using my automatically generated comment that gives a
> consistent format.  (I haven't merged this improvement back into e_log.c.)
> Now there is duplication even within the same file :-(.
> 
>> + *                      2      4      6      8
>> + *          R(z) ~ Lg1*s +Lg2*s +Lg3*s +Lg4*s
> 
> The number of coeffs is indeed precision-dependent, but uninteresting.
> My constent format for poly coeffs doesn't mention them in comments.
> It is enough to spell the constants for them in a consistent way
> (consistent across all sources, not just log*).
> 
>> + *      (the values of Lg1 to Lg7 are listed in the program)
> 
> Here the comment doesn't even match the code.  There is no Lg5 to
> Lg7 for float precision.
> 
>> + *      and
>> + *          |      2          8           |     -34.24
>> + *          | Lg1*s +...+Lg4*s    -  R(z) | <= 2
>> + *          |                             |
> 
> Another excessively verbose comment duplicated.  It has the magic number
> -34.24 again, so this number is now triplicated.
> 
>> + *      Note that 2s = f - s*f = f - hfsq + s*hfsq, where hfsq = f*f/2.
>> + *      In order to guarantee error in log below 1ulp, we compute log
>> + *      by
>> + *              log(1+f) = f - s*(f - R)        (if f is not too large)
>> + *              log(1+f) = f - (hfsq - s*(hfsq+R)).     (better accuracy)
>> + *
>> + *      3. Finally,  log(x) = k*ln2 + log(1+f).
>> + *                          = k*ln2_hi+(f-(hfsq-(s*(hfsq+R)+k*ln2_lo)))
>> + *         Here ln2 is split into two floating point number:
>> + *                      ln2_hi + ln2_lo,
> 
> Nothing new here.  I don't want to see it again.
> 
>> + *         where n*ln2_hi is always exact for |n| < 2000.
> 
> Now just wrong.  2000 is for double precision.  2048 would be better there.
> It is what we arrange for, to go slightly above 1024.)  Here we need to
> go slightly above 128, and arrange for < 256.  This is one of the points
> documented better in my log*.  Its documentation belongs closer to the
> declaration of ln2_hi where you can see the bits being left out of it.
> But my comments for this grew too large.  They describe complications like
> using the same splitting for double precision as for long double precision
> (go slightly above 16384 instead of slightly abive 1024, at an insignificant
> cost of precision), and sharing ln2_hi/lo with a table value where even
> more bits sometimes have to be left out of ln2_hi, depending on complicated
> options and the precision.  They described the details of why the number
> of bits chosen is needed, and gave the acceptable ranges but have been
> reduced a bit closer to saying something like "ln2_hi is rounded to a
> number of bits that works [to satisfy constraints explained elsewhere".
> 
>> + *
>> + * Special cases:
>> + *      log(x) is NaN with signal if x < 0 (including -INF) ;
>> + *      log(+INF) is +INF; log(0) is -INF with signal;
>> + *      log(NaN) is that NaN with no signal.
> 
> Nothing new here.  I don't want to see it again.
> 
>> + *
>> + * Accuracy:
>> + *      according to an error analysis, the error is always less than
>> + *      1 ulp (unit in the last place).
> 
> Actually, expf() has been exhaustively tested, so its error is known
> much more precisely than the error analysis can give.  The details are
> too MD and unimportant to give here.
> 
> e_log.c is not the place to describe what an ulp is.  Garbage should
> be removed before duplicating things.
> 
>> + *
>> + * Constants:
>> + * The hexadecimal values are the intended ones for the following
>> + * constants. The decimal values may be used, provided that the
>> + * compiler will convert from decimal to binary accurately enough
>> + * to produce the hexadecimal values shown.
>> + */
> 
> More non-useful boilerplate.  This comment is FUD about pre-C90 compilers,
> since compilers couldn't be trusted to round correctly long ago.  After
> C99, we could use hex constants to reduce rounding problems, but I don't
> want to churn the sources for this, and prefer decimal constants in most
> cases, with hex values in comments (mainly for debugging -- hex constants
> tend to be more usefull iff you are looking at hex values in a register).
> 
>> +
>> static const float
>> ln2_hi =   6.9313812256e-01,	/* 0x3f317180 */
>> ln2_lo =   9.0580006145e-06,	/* 0x3717f7d1 */
>> 
> 
> This still uses fdlibm's old format in which the hex values are given as
> pseudo-literals in hex in comments.  The couldn't use C99 hex literals
> because C99 didn't exist.  They are more readable in this form, since
> C99 hex literals handle exponents poorly, so it is hard to see what they
> are in bits and you just can't express the special cases where hex is
> most needed (Infs and NaNs) as C99 hex literals.  But when I optimized
> and otherwise improved the LgN* contants, I had not settled on a format
> and didn't even translate to the above format like I did in some other
> places, so my LgN* declarations are of the form '<C99 literal in hex>;
> /* <decimal constant> */'.
> 
> I was also sloppy with the -34.24 constant when I did that.  It seems
> to be rounded in the wrong direction.  I now get -34.22 (rounded to
> nearest) and round it in a safe direction to -34.2.  Note that this
> constant doesn't quite allow the accuracy in the poly to be read off
> as 34.2 bits.  It is scaled relative to s, but the accuracy in bits
> should be scaled relative to |log(1+s)-log(1-s)| rounded down to a
> power of 2.  This is ~2*s.  Normally for such estimates, scaling by
> s is off by a factor of at most 2 so we can read off an accuracy of
> 33.2 bits, but here the factor of 2 in 2*s means that we can read off
> an accuracy of 33.2 +- 1 bits.  Anyway, the -34.24 has lots of spare
> bits, so its minor inaccuracies don't matter and can be be found by
> exhaustive checking.  But it is strictly wrong when its value is given
> in constants.  e_log.c gives the related bound of 2**-58.45.  This is
> scaled by s, so it takes minor interpretation too.  It seems to be
> correctly rounded.
> 
> Bruce