Implementing C99's roundf(), round(), and roundl()

Mon Jan 19 22:10:45 PST 2004

On Mon, 19 Jan 2004, David Schultz wrote:

> On Tue, Dec 02, 2003, Bruce Evans wrote:

> > I thought of a good way after righting the above: roundf() is supposed to
> > be equivalent to rintf() with certain rounding, so set the rounding mode
> > and call rintf() to determine the correct value.  Unfortunately there
> > might not be a correct rounding mode (what does round to nearest do
> > for integers?  I think it rounds to even, but roundf() requires rounding
> > up half-way cases).
>
> You're right on the money.  The C99 standard says:
>
> 	The double version of round behaves as though implemented by
>
> 	#include <math.h>
> 	#include <fenv.h>
> 	#pragma STDC FENV_ACCESS ON
> 	double round(double x)
> 	{
> 		double result;
> 		fenv_t save_env;
> 		feholdexcept(&save_env);
> 		result = rint(x);
> 		if (fetestexcept(FE_INEXACT)) {
> 			fesetround(FE_TOWARDZERO);
> 			result = rint(copysign(0.5 + fabs(x), x));
> 		}
> 		feupdateenv(&save_env);
> 		return result;
> 	}
>
> 	The round functions may, but are not required to, raise the
> 	inexact exception for noninteger numeric arguments, as this
> 	implementation does.

I think getting FE_INEXACT from rint() is the usual case (if the
implementation actually raises it), so we gain little or negative from
sometimes avoiding the more complicated rint() call.  Also, if we
depend on the implemenation to set FE_INEXACT, then we may be able to
depend on the implementation having FE_UPWARD and FE_DOWNWARD, which
seems to permit a simple implementation that I had in mind.

> This should be a faster implementation in that it only has one
> branch.  The one with ceil() has a branch for the inf/NaN test,
> another that I added to avoid spurious inexact exceptions, and a
> third to test the sign of the input.

The fe access functions probably cost a lot more than branches.
Certainly on i386's, especially with an implementation like the
current one where all feset functions set the whole FP enviroment.
The above wants to save and restore the whole FP enviorment anyway.

> However, I think we may be able to use floor() instead of
> fesetround()/rint()/feupdateenv() and do something similar without
> changing the rounding mode:
>
> 	double
> 	round(double x)
> 	{
> 		double result;
> 		fp_except_t fe;
>
> 		fe = fpresetsticky(0);
> 		result = rint(x);
> 		if (fpgetsticky() & FP_X_IMP) {
> 			result = copysign(floor(fabs(x) + 0.5), x);
> 			fe |= FP_X_IMP;
> 		}
> 		fpresetsticky(fe);
> 		return (result);
> 	}
>
> Does this seem reasonable?

rint() is mostly in hardware on i386's (frndint instruction), so it is
faster than floor(), at least on i386's.  But anything that avoids
accessing the FP environment is good.

> By the way, the man pages refer to fpresetsticky(3), but not
> fpsetsticky(3).  The MI ieee.h header declares only fpsetsticky(),
> but some architectures override these with equivalent macros for
> both.  If you have a good idea of how this is supposed to be, it
> would be great if you could fix it.

I once thought that fpresetsticky() was standard and fpsetsticky() was
intentionally undocumented because it is nonstandard, but this seems
to be backwards -- it is fpresetsticky() that is the mistake.  Look
them up on the web.  There are more hits for fpsetsticky, and google
even suggests I wanted fpsetsticky when I seach for fpresetsticky.

The C99 interfaces should make these moot.  It has equivalents to
both, and also feraiseexcept().

> Another option is to just implement the relevant parts of fenv.h.
> They are fairly straightforward, and there's no sense in hiding
> behind the brokenness of gcc's i386 backend forever.

Is "hiding" not bothering to implement this because gcc's bugs make
it moot?  I think fenv.h can't be implemented until there is
"#pragma STDC FENV_ACCESS ON" or some other mechanism to turn off
both the bugs and (non-buggy) optimizations.

BTW, I've been using "#define const volatile const" in lib/msun for
10 years to turn off some constant folding.  I'm not sure exactly what
C99 says about it, but I think most FP constant folding at compile
time is a valid optimization unless FENV_ACCESS is ON.  To avoid
depending on implementation details that aren't even implemented, msun
probably needs large changes to use FENV_ACCESS a lot, and/or to use
feraisexcept() instead of things like:

e_exp.c:	    if(x > o_threshold) return huge*huge; /* overflow */

[This needs to raise FE_OVERFLOW, and depends on huge*huge doing it,
but huge is just "static const double huge 1.0e+300;" and gcc -O1
folds huge*huge to +Inf at compile time and doesn't raise FE_OVERFLOW.
My volatile hack forces this to be done at runtime.]

> (By the way,
> if we did implement fenv.h, does anyone know if there are any
> legal issues or attribution requirements involved in using the
> exact round() code proposed by the C standard?)

I don't know, but think that we would at last have to get permission,
and it would be easier to reimplement it.

Bryce