cvs commit: src/lib/msun/src e_rem_pio2f.c s_cosf.c s_sinf.c s_tanf.c

Bruce Evans bde at
Sat Nov 19 02:38:28 GMT 2005

bde         2005-11-19 02:38:27 UTC

  FreeBSD src repository

  Modified files:
    lib/msun/src         e_rem_pio2f.c s_cosf.c s_sinf.c s_tanf.c 
  Log:
  Moved all the optimizations for |x| <= 9pi/2 from
  __ieee754_rem_pio2f() to its 3 callers and manually inlined them.
  On Athlons, with favourable compiler flags and optimizations and
  favourable pipeline conditions, this gives a speedup of 30-40 cycles
  for cosf(), sinf() and tanf() on the range pi/4 < |x| <= 9pi/4, so
  these functions are now significantly faster than the hardware trig
  functions in many cases.  E.g., in a benchmark with uniformly distributed
  x in [-2pi, 2pi], A64 hardware fcos took 72-129 cycles and cosf() took
  37-55 cycles.  Out-of-order execution is needed to get both of these
  times.  The optimizations in this commit apparently work more by
  removing 1 serialization point than by reducing latency.

  Revision  Changes    Path
  1.17      +0 -55     src/lib/msun/src/e_rem_pio2f.c
  1.10      +33 -2     src/lib/msun/src/s_cosf.c
  1.10      +41 -4     src/lib/msun/src/s_sinf.c
  1.10      +31 -6     src/lib/msun/src/s_tanf.c
