i386/67469: src/lib/msun/i387/s_tan.S gives incorrect results for large inputs

Bruce Evans bde at zeta.org.au
Sun Feb 13 11:40:21 PST 2005


The following reply was made to PR i386/67469; it has been noted by GNATS.

From: Bruce Evans <bde at zeta.org.au>
To: David Schultz <das at FreeBSD.org>
Cc: FreeBSD-gnats-submit at FreeBSD.org, freebsd-i386 at FreeBSD.org,
	bde at FreeBSD.org
Subject: Re: i386/67469: src/lib/msun/i387/s_tan.S gives incorrect results
 for large inputs
Date: Mon, 14 Feb 2005 06:38:16 +1100 (EST)

 On Sun, 13 Feb 2005, David Schultz wrote:
 
 > On Mon, Feb 14, 2005, Bruce Evans wrote:
 > > >...
 > > I did a quick test of some other functions:
 > > - hardware sqrt is much faster
 > > - hardware exp is slightly faster on the range [1,100]
 > > - hardware atan is slower on the range [0,1.5]
 > > - hardware acos is much slower (139 nsec vs 57 nsec!) on the range [0,1.0].
 >
 > sqrt isn't transcendental, so it should be faster and correctly
 > rounded on every hardware platform.  I found similar results to
 
 I don't know if we can trust the hardware for that.  ISTR checking that
 hardware sqrtf gives the same result as fdlibm for all possible arguments
 of sqrtf.  An exhaustive check like that is of course infeasible for
 double sqrt, since there are far too many arguments to try.
 
 > yours for atan() and acos() when writing amd64 math routines, but
 > of course amd64 has the overhead of switching between the SSE and
 > i387 units.  Maybe they should go away, too...
 
 These are easier to decide (for now) because there are no old CPUs.
 
 I fixed the bug that gave unbelievable cycle counts:
 
 %%%
 --- r.c~	Mon Feb 14 02:19:34 2005
 +++ r.c	Mon Feb 14 02:22:21 2005
 @@ -45,4 +47,5 @@
  	tmax = 0;
  	tmin = INT_MAX;
 +	total = 0;
  	for (i = 0; i < ITER; i++) {
  		if (fabs(avg - t[i]) <= sd * 2) {
 %%%
 
 With this fix, on Athlon XPs the cpuid instructions disturb the cycle
 counts only in a small and almost deterministic way (by about 59 cycles
 per run).
 
 Bruce

