fmod nan_mix usage

Tue Jul 24 06:19:25 UTC 2018

On Mon, 23 Jul 2018, Steve Kargl wrote:

> On Tue, Jul 24, 2018 at 07:41:17AM +1000, Bruce Evans wrote:
>> ...
>> clang normally evaluates this at compile time, so it doesn't test the library.
>> This is arguably a bug in clang, since it doesn't set the exception flags.
>> #pragma STDC FENV_ACCESS ON should control this, but it is hard to use and rarely
>> works.

["This" is fmod*(3, 0).]

>> The test data needs to be non-literal and perhaps even volatile to prevent
>> the compiler evaluating it at compile time.
>
> Whoops.  I should know better!  I have -fno-builtin hardcoded
> in my development trees and completely forgot about constant
> folding.

I just realised that testing should be done with all combinations of
builtin flags, or at least global -fbuiltin and -fno-builtin.  clang
might inline all fmod calls.

clang recently started inlining all fmin and fmax calls, and the result
is different than the library -- the library is careful to order -0.0
before +0.0, but clang doesn't distinguish between these values so it
produces one depending on the order of the args and other details.  C99
footnote 192 explicitly says that the sloppy comparison is allowed, so
this is only a quality of implementation bug.

Both gcc and clang have always inlined fabs calls and have almost always
inlined sqrt calls.

For efficiency testing, I rename functions by copying their file and editing
the copy, and rebuild them with the CFLAGS being tested, so that the main
part of the function is independent of the library (including the CFLAGS that
the library was built with) and of builtins.  fabs and sqrt are only renamed
when testing these functions themselves.  The function call overhead for small
functions like fabs is about 10 cycles on modern x86, except in long double
precision, where it is about 30 cycles.
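
A rough single-file approximation of the renaming approach, not the actual
setup described above.  test_fabs is a hypothetical renamed copy (here a
simple sign-bit clear, assuming a 64-bit IEEE double, rather than the
library source), and time_calls is an assumed helper that calls it through
a volatile function pointer so that the call is neither inlined nor
replaced by a builtin.  clock() is coarse; a cycle counter would be needed
to resolve differences of a few cycles.

```c
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical renamed copy of fabs(): clear the sign bit of the double. */
static double
test_fabs(double x)
{
	uint64_t bits;

	memcpy(&bits, &x, sizeof(bits));
	bits &= ~(UINT64_C(1) << 63);
	memcpy(&x, &bits, sizeof(x));
	return (x);
}

/*
 * Return the average CPU time in seconds per call of f over n calls.
 * The function pointer, argument and result are volatile so that every
 * iteration performs a real call with a real load and store.
 */
static double
time_calls(double (*f)(double), long n)
{
	double (*volatile fp)(double) = f;
	volatile double arg = -1.5;
	volatile double sink = 0.0;
	clock_t start, end;
	long i;

	start = clock();
	for (i = 0; i < n; i++)
		sink = fp(arg);
	end = clock();
	(void)sink;
	return ((double)(end - start) / CLOCKS_PER_SEC / (double)n);
}
```

The measured time includes the loop and the volatile accesses, so it is
only useful for comparing two builds of the same function, e.g. with
different CFLAGS.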

For accuracy testing, it is the function that will normally be used that
should usually be tested.  This is the builtin if there is one, or the
library function.  However, the builtins should be turned off sometimes,
to get an idea of what the non-builtin function will do with other compilers/
arches where it is not a builtin.  Similarly for optimized MD versions.  My
efficiency tests usually turn off the x86-optimized versions.  Non-i386
arches don't have so many optimized MD versions, so just testing the library
functions on them finds most differences that don't show up for the optimized
MD versions.

Bruce