[patch] Adding optimized kernel copying support - Part III
asmrookie at gmail.com
Wed May 31 16:32:24 PDT 2006
2006/6/1, Bruce Evans <bde at zeta.org.au>:
> >> Does that mean it won't work with SMP and PREEMPTION?
> > Yes, it will work (though I think it needs more testing), but it may
> > perform worse with SMP|PREEMPTION because of the extra memory/cache
> > traffic. For this I was planning to use non-temporal instructions
> > (obviously benchmarks would be very much appreciated).
> Er, isn't its main point to fix some !SMP assumptions made in the old
> copying-through-the-FPU code? (The old code is messy due to its avoidance
> of global changes. It wants to preserve the FPU state on the stack, but
> this doesn't quite work so it does extra things (still mostly locally)
> that only work in the !SMP && (!SMPng even with UP) case. Patching this
> approach to work with SMP || SMPng cases would make it messier.)
> The new code wouldn't behave much differently under SMP. It just might
> be a smaller optimization because more memory pressure for SMP causes
> more cache misses for everything and there are no benefits from copying
> through MMX/XMM unless nontemporal writes are used. All (?) CPUs with
> MMX or SSE* can saturate main memory using 32-bit instructions. On
> 32-bit CPUs, the benefits of using MMX/XMM come from being able to
> saturate the L1 cache on some CPUs (mainly Athlons and not P[2-4]),
> and from being able to use nontemporal writes on some CPUs (at least
> AthlonXP via SSE extensions, and all CPUs with SSE2).
I was just speaking about the copying routine itself, not about the
SSE2 environment-preserving mechanism. It remains untouched under SMP.
However, I have to say you were right when you suggested that I merge
everything into support.s, since it has a more coherent design.
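
For readers unfamiliar with the non-temporal instructions mentioned above,
here is a minimal user-space sketch in C of a copy loop that uses SSE2
non-temporal stores. It is not the code from the patch: the function name
copy_nontemporal, the 16-byte alignment requirement and the plain memcpy
fallback are all assumptions made for illustration. In the kernel, the
FPU/SSE state would also have to be saved and restored around such a
routine, which is the part discussed in the quoted text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper, not taken from the patch: copy len bytes from src
 * to dst, using non-temporal (cache-bypassing) stores when SSE2 is
 * available at compile time.
 *
 * Assumptions: dst and src are 16-byte aligned, len is a multiple of 16,
 * and the buffers do not overlap.
 */
static void
copy_nontemporal(void *dst, const void *src, size_t len)
{
	assert(((uintptr_t)dst & 15) == 0);
	assert(((uintptr_t)src & 15) == 0);
	assert((len & 15) == 0);

#if defined(__SSE2__)
	__m128i *d = dst;
	const __m128i *s = src;
	size_t i, n = len / 16;

	for (i = 0; i < n; i++) {
		/* Ordinary aligned load; the source may stay in the cache. */
		__m128i v = _mm_load_si128(s + i);
		/* Non-temporal store; hints the CPU not to cache the destination. */
		_mm_stream_si128(d + i, v);
	}
	/* Order the streaming stores before any later stores. */
	_mm_sfence();
#else
	/* No SSE2 at compile time: fall back to an ordinary copy. */
	memcpy(dst, src, len);
#endif
}
```

Whether this beats a plain 32-bit copy depends on the CPU and on the copy
size, so it would need the benchmarks asked for above before drawing any
conclusion.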
Peace can only be achieved by understanding - A. Einstein