bin/43299: march=pentium4 miscompiles msun/src/e_pow.c

David Schultz das at FreeBSD.ORG
Tue May 13 11:04:20 PDT 2003


On Tue, May 13, 2003, Mikhail Teterin wrote:
> The following reply was made to PR bin/43299; it has been noted by GNATS.
> 
> From: Mikhail Teterin <Mikhail.Teterin at murex.com>
> To: freebsd-gnats-submit at FreeBSD.org
> Cc: bde at FreeBSD.org
> Subject: Re: bin/43299: march=pentium4 miscompiles msun/src/e_pow.c
> Date: Mon, 12 May 2003 14:26:35 -0400
> 
>  The problem with our libm (msun) vs. gcc-3 is reproducible on Linux:
>  
>  	(Note, FreeBSD's /usr is mounted as /misha on the Linux machine)
>  
>  mteterin at nylinux:lib/msun/src (982) cc -v
>  Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2/specs
>  Configured with: ../configure --prefix=/usr --mandir=/usr/share/man 
>  --infodir=/usr/share/info --enable-shared --enable-threads=posix 
>  --disable-checking --host=i386-redhat-linux --with-system-zlib 
>  --enable-__cxa_atexit
>  Thread model: posix
>  gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
>  mteterin at nylinux:lib/msun/src (983) cc -O -march=pentium3 -I- -I. 
>  -I/misha/src/include -I/misha/src/sys -I/misha/obj/misha/src/i386/usr/include 
>  e_pow.c t.c e_sqrt.c -D__generic___ieee754_sqrt=__ieee754_sqrt
>  mteterin at nylinux:lib/msun/src (984) ./a.out                                     
>  2^2.1 is 4.28709
>  mteterin at nylinux:lib/msun/src (985) cc -O -march=pentium4 -I- -I. 
>  -I/misha/src/include -I/misha/src/sys -I/misha/obj/misha/src/i386/usr/include 
>  e_pow.c t.c e_sqrt.c -D__generic___ieee754_sqrt=__ieee754_sqrt
>  mteterin at nylinux:lib/msun/src (986) ./a.out                                     
>  2^2.1 is 0
>  
>  As can be seen above, using pentium3 produces the correct result, while
>  pentium4 produces the incorrect 0. We can, pretty much, rule out a kernel
>  problem in handling MMX/SSE. Which is it -- our __ieee754_pow or the gcc?

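gcc is using SSE2 instructions (movd/movsd on %xmm0) in the Pentium 4
case.  Diffing the generated assembly, with the pentium3 output as "-"
and the pentium4 output as "+":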

[...]
-       fxch    %st(1)
-       fstpl   -56(%ebp)
-       movl    -56(%ebp), %ecx
-       movl    %ecx, -56(%ebp)
-       movl    %edi, -52(%ebp)
-       fldl    -56(%ebp)
+       movd    %edi, %xmm0
+       movsd   %xmm0, -64(%ebp)
+       fldl    -64(%ebp)
[...]

It's possible that gcc got something wrong with respect to alignment,
or has a bug in its generation of SSE instructions.
__ieee754_pow() looks okay in that it doesn't seem to do anything
naughty with type punning, use uninitialized values, etc.  It
might be useful to construct a simpler test case so the specific
offending assembly can be identified.  Given that most of the
__ieee754_pow() code handles special cases for NaNs and infinities,
it shouldn't be too hard to iteratively pare it down to the
line(s) that cause the discrepancy between the pentium3 and
pentium4 builds.
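
As a starting point for such a reduced test case, here is a minimal
sketch.  It assumes the trouble lies in the pattern the x87 sequence in
the excerpt performs: store a double, overwrite its upper 32 bits from
an integer register, and reload it.  That is only a guess at where to
look first, not a confirmed diagnosis.  The union and all function names
below are hypothetical and are not copied from msun; a little-endian
(i386) layout is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the word-access helpers in msun; the names
 * are made up for this sketch.  Little-endian (i386) layout is assumed.
 */
typedef union {
	double value;
	struct {
		uint32_t lsw;	/* low 32 bits of the mantissa */
		uint32_t msw;	/* sign, exponent, top of the mantissa */
	} parts;
} double_words;

/* Overwrite the high 32 bits of d, keeping the low word as it was. */
static double
set_high_word(double d, uint32_t hi)
{
	double_words u;

	assert(sizeof(double) == 2 * sizeof(uint32_t));
	u.value = d;
	u.parts.msw = hi;
	return (u.value);
}

/* Same operation done with integer arithmetic on a copied bit pattern. */
static double
set_high_word_ref(double d, uint32_t hi)
{
	uint64_t bits;

	memcpy(&bits, &d, sizeof(bits));
	bits = (bits & 0xffffffffULL) | ((uint64_t)hi << 32);
	memcpy(&d, &bits, sizeof(d));
	return (d);
}

/* Returns 0 if both versions produce the same bits, nonzero otherwise. */
int
high_word_mismatch(double d, uint32_t hi)
{
	/* volatile keeps the compiler from folding the inputs away */
	volatile double vd = d;
	volatile uint32_t vhi = hi;
	double a = set_high_word(vd, vhi);
	double b = set_high_word_ref(vd, vhi);

	printf("union: %.17g  reference: %.17g\n", a, b);
	return (memcmp(&a, &b, sizeof(a)) != 0);
}
```

Calling high_word_mismatch(2.1, 0x40000000) from a main() and building
it twice, once with -O -march=pentium3 and once with -O -march=pentium4,
would show whether this pattern alone is enough to trigger the problem.
If it is not, the remaining candidate lines from e_pow.c can be added
back one at a time.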

