kern/133583: [libm] fma(3) does not respect rounding mode using extended precision

Fri Dec 3 17:16:08 UTC 2010

On Sat, Dec 04, 2010, Bruce Evans wrote:
> >- The only supported architecture that can have this problem due to
> > dynamic precision changes is i386, and even then only for non-SSE2
> > builds.
> 
> SSE2 makes little difference to this problem for i386, except for clang
> it makes it worse.  The ABI requires using the FPU for at least returning
> values, and gcc keeps using the FPU for operations too.

I wasn't going to get into that, but yes, SSE2 just makes things
worse.  When gcc uses some weird combination of the i387 and SSE2,
you can't predict what kind of precision you'll get.

> >- The cost and complexity associated with making every function in
> > libm detect and adapt to dynamic precision changes is prohibitive.
> 
> Same as for dynamic rounding direction changes.  Actually, much lower
> cost and complexity than for rounding direction.  For rounding direction,
> it is actually useful to keep the caller's mode, and supporting this
> would require making sure every step of every function works right in
> every mode.

Many of the multiprecision tricks only work in FE_TONEAREST.
(See, e.g., the fesetround() in s_fma.c.)  We have to jump
through these hoops for all the IEEE-754R functions where
exact results are required.  (Are there any where we don't?)
For transcendental functions, it seems far less important,
since correct rounding isn't guaranteed anyway.

> For rounding precision, we can just switch to a mode that
> works for every function that needs it, and most don't need it except
> for bizarre environments (like forcing single precision and calling
> extended precision functions and expecting them to return any particular
> precision).

For float and double routines, you could, for instance, write a
wrapper library that checks the precision and changes it if needed
before calling the underlying libm function.  The vast majority of
apps don't change the precision, though, so it seems more
appropriate to tell programmers that they have to set the FPU
precision back to the default before calling library functions.

Making the long double functions work correctly is harder, since
gcc butchers all the long double constants on i386 (it rounds them
to the default 53-bit precision).