[Bug 254911] lib/msun/ctrig_test fails if compiled with AVX (-mavx) or any CPUSET enabling AVX
bugzilla-noreply at freebsd.org
Fri Apr 9 21:58:45 UTC 2021
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254911
--- Comment #3 from Dimitry Andric <dim at FreeBSD.org> ---
Hmm, it seems we have a case here similar to the one described
here:
https://stackoverflow.com/questions/63125919/how-to-avoid-floating-point-exceptions-in-unused-simd-lanes
The gist is that clang indeed uses the vdivps (Divide Packed
Single-Precision) instruction by default, so the two calculations,
(beta * rho * s) / denom and t / denom, are emitted as:
#DEBUG_VALUE: ctanhf:denom <- $xmm2
.loc 1 77 35 is_stmt 1 # src/lib/msun/src/s_ctanhf.c:77:35
vmulss %xmm1, %xmm3, %xmm1
.loc 1 77 41 is_stmt 0 # src/lib/msun/src/s_ctanhf.c:77:41
vmulss %xmm1, %xmm0, %xmm0
.loc 1 77 46 # src/lib/msun/src/s_ctanhf.c:77:46
vinsertps $16, -80(%rbp), %xmm0, %xmm0 # 16-byte Folded Reload
# xmm0 = xmm0[0],mem[0],xmm0[2,3]
vmovsldup %xmm2, %xmm1 # xmm1 = xmm2[0,0,2,2]
vdivps %xmm1, %xmm0, %xmm0
Now the problem with vdivps is apparently that the unused 'lanes' of the SIMD
registers can still result in floating point exception bits being set, such as
FE_INVALID (in this case probably because the unused lanes have zero in them,
giving 0/0).
That stackoverflow article suggests using clang's
-ffp-exception-behavior=maytrap option (documented at
<https://releases.llvm.org/11.0.1/tools/clang/docs/UsersManual.html#cmdoption-ffp-exception-behavior>),
meaning "The compiler avoids transformations that may raise exceptions that
would not have been raised by the original code". It is supported from clang 10
onwards.
In practice, this indeed avoids using vdivps, and uses vdivss (Divide Scalar
Single-Precision) instead, and the assembly for line 77 then looks like:
#DEBUG_VALUE: ctanhf:denom <- $xmm1
.loc 1 77 35 is_stmt 1 # src/lib/msun/src/s_ctanhf.c:77:35
vmulss %xmm2, %xmm4, %xmm2
.loc 1 77 41 is_stmt 0 # src/lib/msun/src/s_ctanhf.c:77:41
vmulss %xmm0, %xmm2, %xmm0
.loc 1 77 46 # src/lib/msun/src/s_ctanhf.c:77:46
vdivss %xmm1, %xmm0, %xmm2
vmovss -80(%rbp), %xmm0 # 4-byte Reload
# xmm0 = mem[0],zero,zero,zero
#DEBUG_VALUE: ctanhf:t <- $xmm0
.loc 1 77 57 # src/lib/msun/src/s_ctanhf.c:77:57
vdivss %xmm1, %xmm0, %xmm0
And indeed, in this case the FE_INVALID is gone, and the tests succeed.
I guess it may be good to use the -ffp-exception-behavior=maytrap flag for the
whole of lib/msun, as many of its functions rely on the compiler not
introducing spurious floating-point exceptions. The flag does not seem to be
required for gcc.