cvs commit: src/lib/msun/src s_cbrt.c s_cbrtf.c

Sun Dec 18 13:46:48 PST 2005

bde         2005-12-18 21:46:47 UTC

  FreeBSD src repository

  Modified files:
    lib/msun/src         s_cbrt.c s_cbrtf.c 
  Log:
  Fixed code to match comments and the algorithm:
  - in preparing for the third approximation, actually make t larger in
    magnitude than cbrt(x).  After chopping, t must be incremented by 2
    ulps to make it larger, not 1 ulp since chopping can reduce it by
    almost 1 ulp and it might already be up to half a different-sized-ulp
    smaller than cbrt(x).  I have not found any cases where this is
    essential, but the think-time error bound depends on it.  The relative
    smallness of the different-sized-ulp limited the bug.  If there are
    cases where this is essential, then the final error bound would be
    5/6+epsilon instead of of 4/6+epsilon ulps (still < 1).
  - in preparing for the third approximation, round more carefully (but
    still sloppily to avoid branches) so that the claimed error bound of
    0.667 ulps is satisfied in all cases tested for cbrt() and remains
    satisfied in all cases for cbrtf().  There isn't enough spare precision
    for very sloppy rounding to work:
    - in cbrt(), even with the inadequate increment, the actual error was
      0.6685 in some cases, and correcting the increment increased this
      a little.  The fix uses sloppy rounding to 25 bits instead of very
      sloppy rounding to 21 bits, and starts using uint64_t instead of 2
      words for bit manipulation so that rounding more bits is not much
      costly.
    - in cbrtf(), the 0.667 bound was already satisfied even with the
      inadequate increment, but change the code to almost match cbrt()
      anyway.  There is not enough spare precision in the Newton
      approximation to double the inadequate increment without exceeding
      the 0.667 bound, and no spare precision to avoid this problem as
      in cbrt().  The fix is to round using an increment of 2 smaller-ulps
      before chopping so that an increment of 1 ulp is enough.  In cbrt(),
      we essentially do the same, but move the chop point so that the
      increment of 1 is not needed.

  Fixed comments to match code:
  - in cbrt(), the second approximation is good to 25 bits, not quite 26 bits.
  - in cbrt(), don't claim that the second approximation may be implemented
    in single precision.  Single precision cannot handle the full exponent
    range without minor but pessimal changes to renormalize, and although
    single precision is enough, 25 bit precision is now claimed and used.

  Added comments about some of the magic for the error bound 4/6+epsilon.
  I still don't understand why it is 4/6+ and not 6/6+ ulps.

  Indent comments at the right of code more consistently.

  Revision  Changes    Path
  1.12      +27 -12    src/lib/msun/src/s_cbrt.c
  1.13      +16 -9     src/lib/msun/src/s_cbrtf.c