cvs commit: src/lib/msun/src s_cbrt.c s_cbrtf.c

Bruce Evans bde at
Tue Dec 13 12:17:24 PST 2005

bde         2005-12-13 20:17:24 UTC

  FreeBSD src repository

  Modified files:
    lib/msun/src         s_cbrt.c s_cbrtf.c 
  Log:
  Optimize by not doing excessive conversions for handling the sign bit.
  This gives a speedup of between 9% and 22% on Athlons (largest
  for cbrt() on amd64 -- from 205 to 159 cycles).
  
  Previously we extracted the sign bit, worked with |x|, and restored the
  sign bit as the last step.  We avoided branches to a fault by accessing
  FP values as bits to clear and restore the sign bit.  Avoiding
  branches is usually good, but the bit access macros are not so good
  (especially for setting FP values), and here they always caused pipeline
  stalls on Athlons.  Even using branches would be faster except on args
  that give perfect branch misprediction, since only mispredicted branches
  cause stalls, but it is possible to avoid touching the sign bit in FP
  values at all (except to preserve it in conversions from bits to FP
  not related to the sign bit).  Do this.  The results are identical
  except in 2 of the 3 unsupported rounding modes, since all the
  approximations use odd rational functions so they work right on strictly
  negative values, and the special case of -0 doesn't use an approximation.
  
  Revision  Changes    Path
  1.10      +5 -7      src/lib/msun/src/s_cbrt.c
  1.12      +4 -8      src/lib/msun/src/s_cbrtf.c
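
The two ways of handling the sign described in the log can be sketched
roughly as follows.  This is an illustrative reconstruction, not the
committed code.  The helper names (high_word, with_high_word,
from_high_word, refine) are hypothetical, the refinement is a plain
Newton iteration rather than the rational approximation used in msun,
and the bias constant B1 is quoted from memory of fdlibm-derived cbrt
code, so treat it as an assumption.  Only normal finite arguments are
handled properly; subnormals would need extra scaling.

```c
#include <stdint.h>
#include <string.h>

/* Assumed bias for the rough estimate: high_word/3 + B1 ~ high word of cbrt. */
#define B1 715094163u

/* Hypothetical helper: read the high 32 bits of a double. */
static uint32_t high_word(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)(bits >> 32);
}

/* Hypothetical helper: replace the high 32 bits of an existing FP value. */
static double with_high_word(double x, uint32_t hi)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = ((uint64_t)hi << 32) | (bits & 0xffffffffu);
    memcpy(&x, &bits, sizeof bits);
    return x;
}

/* Hypothetical helper: build a double from a high word, low word zero. */
static double from_high_word(uint32_t hi)
{
    uint64_t bits = (uint64_t)hi << 32;
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Newton steps for t^3 = x.  Negating both t and x negates the result,
 * so the iteration works unchanged on strictly negative values. */
static double refine(double t, double x)
{
    for (int i = 0; i < 5; i++)
        t -= (t * t * t - x) / (3.0 * t * t);
    return t;
}

/* Old style: clear the sign in the FP value, work on |x|, restore it last. */
double cbrt_strip_sign(double x)
{
    uint32_t hx = high_word(x);
    uint32_t sign = hx & 0x80000000u;
    hx ^= sign;
    if (hx >= 0x7ff00000u || x == 0.0)
        return x + x;                        /* NaN, Inf, +-0 */
    double ax = with_high_word(x, hx);       /* FP write: clear sign */
    double t = refine(from_high_word(hx / 3 + B1), ax);
    return with_high_word(t, high_word(t) | sign);  /* FP write: restore */
}

/* New style: the sign only rides along in the one bits-to-FP conversion
 * that is needed anyway for the initial estimate. */
double cbrt_keep_sign(double x)
{
    uint32_t hx = high_word(x);
    uint32_t sign = hx & 0x80000000u;
    hx ^= sign;                              /* integer-only manipulation */
    if (hx >= 0x7ff00000u || x == 0.0)
        return x + x;                        /* NaN, Inf, +-0 */
    return refine(from_high_word(sign | (hx / 3 + B1)), x);
}
```

In the second variant the sign bit of an existing FP value is never
cleared or set.  The estimate is simply created with the right sign,
and the odd iteration carries that sign through to the result.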
