(2nd time) tweaks to erff() threshold values

Fri Aug 23 16:57:46 UTC 2013

On Fri, Aug 23, 2013 at 09:12:33PM +1000, Bruce Evans wrote:
> On Thu, 22 Aug 2013, Steve Kargl wrote:
> 
> > (Having problems with freebsd-numerics list.  Hopefully, not
> > a duplicate.)
> 
> It was a duplicate :-).

Yep.  The original finally showed up.  hub.freebsd.org must
have ben swamped. 

> > I would like to apply the patch, which follows my .sig,
> > to msun/src/s_erff.c.  It tweaks the threshold values
> > to those appropriate for erff().
> > ...
> > Index: src/s_erff.c
> > ===================================================================
> > --- src/s_erff.c	(revision 1358)
> > +++ src/s_erff.c	(working copy)
> > @@ -107,10 +107,10 @@
> > 	}
> >
> > 	if(ix < 0x3f580000) {		/* |x|<0.84375 */
> > -	    if(ix < 0x31800000) { 	/* |x|<2**-28 */
> > -	        if (ix < 0x04000000)
> > +	    if(ix < 0x39000000) { 	/* |x|<2**-13 */
> 
> This increases the number of incorrectly rounded cases slightly on i386,
> but 2**-14 decreases it slightly.  This is because i386 has extra
> precision for floats which the nonlinear return expression can actually
> use for approx. 2**-14 < |x| < 2**-13.

I suppose I should update my test programs to count the number
of results that fall within ULP bins.  I typically only want to
know do I make the worse case ULP better or worse.  2**-14 is
certainly better than 2**-28, which is the double precision
threshold. 

> > +	        if (ix < 0x04000000)	/* |x|<0x1p-119 */
> > 		    /*avoid underflow */
> > -		    return (float)0.125*((float)8.0*x+efx8*x);
> > +		    return (8*x+efx8*x)/8;
> 
> Also put the comment back on the same line as the return statement now
> that it fits again (see the double version).

OK.

> Also add "spurious" to the comment.  Underflow still occurs for most denormal
> x.  This is non-spurious.  We just avoid it for |x| larger than about 8
> times the largest denormal, by multiplying efx by 8 so that it is larger
> than 1 (efx8 = 8(efx)).

OK.

> Some modified lines:
> 
> @ 	if(ix < 0x3f580000) {		/* |x|<0.84375 */
> @ 	    if(ix < 0x38800000) { 	/* |x|<2**-14 */
> @ 	        if (ix < 0x04000000)	/* |x|<0x1p-119 */
> @ 		    return (8*x+efx8*x)/8;  /*avoid spurious underflow */
> @ 		}
> @ 		return x + efx*x;
> @ 	    }

OK.  I'll update my patch to use the above.  Did you see my
suggested changes to erfcf()?

> The whole erf implementation is probably not the best way for float precision,
> but is good for understanding higher precisions.

I'm fairly certain that the implementation of erff() is not very
efficient.  The polynomials, used in the rational approximations,
are the same order as those used in the double precision approximation.
I'll check the polys when update my P(x)/Q(x) remes algorithm for
erfl and erfcl.

-- 
Steve