cvs commit: src/lib/msun/i387 Makefile.inc e_atan2.S e_atan2f.S s_atan.S

David Schultz das at FreeBSD.ORG
Tue Feb 22 20:18:17 GMT 2005


On Tue, Feb 22, 2005, Nate Lawson wrote:
> David Schultz wrote:
> >By the way, here are some other results for the Pentium 4, all
> >without SSE.  SSE makes things a bit worse, probably because the
> >x87 and SSE registers are shared, and the Pentium 4 imposes a
> >large penalty for switching between the two sets.
> 
> I don't believe this is correct.  MMX and x87 use the same register 
> context (hence emms), however the XMM registers (SSE*) are separate. 
> It's possible gcc is generating MMX instructions though with your SSE 
> command line switch.

Yep, you're right, I was thinking of the MMX register set.  I
compared the code generated by gcc with and without SSE/SSE2, and
found that the only thing it uses SSE2 for is converting from
floating point->integer and back (e.g. CVTTSD2SI instead of i387
control word frobbing and FISTL).  There was also one place where
gcc just got confused and juggled around a bunch of registers on
the i387 stack, but I don't think that accounts for the
difference.  I wonder if CVTTSD2SI and friends are slower than an
OR/MOV/FLDCW/FISTL/FLDCW sequence on the Pentium 4 for some
bizarre reason, or if I missed something else significant while
scanning the diff.
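
For anyone who wants to look at the two code sequences themselves, here is a minimal sketch of the kind of C that triggers them. The function name trunc_sum is made up for illustration and is not taken from msun. The assumption is that compiling it for a 32-bit x86 target with something like `gcc -O2 -S`, once with and once without `-msse2`, shows the cast as either CVTTSD2SI or the control-word save/modify/restore around the integer store. The exact output depends on the compiler version and flags.

```c
#include <stddef.h>

/*
 * Hypothetical test function: truncate each double toward zero and
 * sum the results.  The (int) cast is the float->integer conversion
 * discussed above; C requires it to truncate, which is why i387 code
 * has to switch the rounding mode in the control word around the
 * integer store, while SSE2 code can use a single truncating convert.
 */
long trunc_sum(const double *v, size_t n)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += (int)v[i];   /* truncation toward zero */
    return sum;
}
```

Calling this in a loop over a large array and timing both builds would be one rough way to check whether the conversion alone accounts for the difference.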

