Optimized copy&move (was: Re: [PATCH] Maintaining turnstile aligned to 128 bytes in i386 CPUs)

Ivan Voras ivoras at fer.hr
Wed Jan 17 20:09:24 UTC 2007


Bruce Evans wrote:

> And MMX/XMM registers are not needed to get movnt on machines with SSE2,
> since movnti is part of SSE2.  This reduces the advantages of using MMX/XMM
> registers on P4's and A64's in 32-bit mode to the non-nt parts of the
> above (fully cached case), which I think are less important than the nt
> parts.

Hmm, I'm looking at i386/i386/support.s and there are several versions
of bcopy and bmove functions, including some that optimize by using FPU
registers (large_i586_bcopy_loop), and a version that uses movnti
(sse2_pagezero), but I can't find the bit of magic which glues them to
the bzero() call.
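
For reference, the kind of glue I would expect is an indirect call
through a function pointer that gets set once at startup, after CPU
feature detection. The sketch below is only a guess at the general
pattern, not what support.s actually does; every name in it
(zero_vector, select_zero, fancy_zero, my_bzero) is made up for
illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; none are taken from support.s. */
typedef void (*zero_fn)(void *buf, size_t len);

/* Plain fallback that works on every CPU. */
void generic_zero(void *buf, size_t len)
{
	memset(buf, 0, len);
}

/* Stand-in for a CPU-specific routine (e.g. one built around movnti).
 * Here it is just a byte loop so the sketch stays portable. */
void fancy_zero(void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len-- > 0)
		*p++ = 0;
}

/* The "glue": callers always go through this pointer. */
zero_fn zero_vector = generic_zero;

/* Called once at startup, after the CPU features are known. */
void select_zero(int cpu_has_sse2)
{
	zero_vector = cpu_has_sse2 ? fancy_zero : generic_zero;
}

/* What the rest of the code would call. */
void my_bzero(void *buf, size_t len)
{
	zero_vector(buf, len);
}
```

With something like this, callers never name a specific variant; they
call my_bzero() and get whichever routine select_zero() picked.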

Also, as far as I can tell by the comments, the FPU version works by
manually saving context... why is this possible (i.e. won't something
preempt it?)

