i386/67469: src/lib/msun/i387/s_tan.S gives incorrect results for large inputs

Bruce Evans bde at zeta.org.au
Sun Feb 13 11:38:21 PST 2005


On Sun, 13 Feb 2005, David Schultz wrote:

> On Mon, Feb 14, 2005, Bruce Evans wrote:
> > >...
> > I did a quick test of some other functions:
> > - hardware sqrt is much faster
> > - hardware exp is slightly faster on the range [1,100]
> > - hardware atan is slower on the range [0,1.5]
> > - hardware acos is much slower (139 nsec vs 57 nsec!) on the range [0,1.0].
>
> sqrt isn't transcendental, so it should be faster and correctly
> rounded on every hardware platform.  I found similar results to

I don't know if we can trust the hardware for that.  ISTR checking that
hardware sqrtf gives the same result as fdlibm for all possible args
(there are only 2^32 of them).  Such an exhaustive check is of course
impossible for double sqrt.

> yours for atan() and acos() when writing amd64 math routines, but
> of course amd64 has the overhead of switching between the SSE and
> i387 units.  Maybe they should go away, too...

The amd64 ones are easier to decide (for now) because there are no old
amd64 CPUs to worry about.

I fixed the bug that gave unbelievable cycle counts:

%%%
--- r.c~	Mon Feb 14 02:19:34 2005
+++ r.c	Mon Feb 14 02:22:21 2005
@@ -45,4 +47,5 @@
 	tmax = 0;
 	tmin = INT_MAX;
+	total = 0;
 	for (i = 0; i < ITER; i++) {
 		if (fabs(avg - t[i]) <= sd * 2) {
%%%

With this fix, on Athlon XPs the cpuid instructions only disturb the
cycle counts in a small and almost deterministic way (by about 59 cycles
on every run).

Bruce

