Improved multiprocessor usage on amd64
Dan Nelson
dnelson at allantgroup.com
Tue Sep 16 04:32:30 UTC 2008
In the last episode (Sep 15), Stephen Montgomery-Smith said:
> Stephen Montgomery-Smith wrote:
> > Steve Kargl wrote:
> >> On Mon, Sep 15, 2008 at 07:36:04PM -0500, Stephen Montgomery-Smith wrote:
> >>> ... and each thread is a loop of the form
> >>>
> >>> while (1) {
> >>> wait until told to start;
> >>> do massive amounts of floating point arithmetic (only additions and
> >>> multiplications) on large arrays;
> >>> tell the master process that you are done;
> >>> }
> >>>
> >>>> Do you have about as many threads as processor or more?
> >>> Both ways. The time difference between the two approaches is
> >>> negligible.
> >>>
> >>
> >> Are you using ULE? With my MPI applications, if the number of
> >> launched processes exceeds the number of cpus by 1, ULE falls
> >> through the floor. I have a nagging feeling that there is a problem
> >> with cpu affinity.
> >>
> >> http://lists.freebsd.org/pipermail/freebsd-current/2008-July/086917.html
>
> Let me say a little bit more.
>
> I have this gut feeling that the problem has a lot to do with cache
> management. My program has each thread doing, in effect, huge matrix
> multiplications, each one working on their own little bit. If a CPU
> core changes from one thread to another, it then has to flush out the
> cache to RAM, and read in a whole bunch of other RAM into cache.
You can try the new cpuset functions in HEAD and 7-STABLE (see
cpuset_setaffinity(2)) to pin particular threads to specific CPUs.
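
Something along these lines might be a starting point. It is only a
rough sketch, not tested against your program: pin_self_to_cpu() and
worker() are names I made up for illustration, and the non-FreeBSD
branch (sched_setaffinity) is there only so the same file builds on
Linux. On FreeBSD it assumes a kernel new enough to have
cpuset_setaffinity(2).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* Linux only: exposes CPU_SET and friends */
#endif
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/cpuset.h>
typedef cpuset_t cpu_mask_t;
#else
#include <sched.h>
typedef cpu_set_t cpu_mask_t;
#endif

/*
 * Hypothetical helper: restrict the calling thread to one CPU.
 * Returns 0 on success, -1 on failure with errno set.
 */
int pin_self_to_cpu(int cpu)
{
    cpu_mask_t mask;

    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    assert(CPU_ISSET(cpu, &mask));
#ifdef __FreeBSD__
    /* An id of -1 with CPU_WHICH_TID means "the current thread". */
    return cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
        sizeof(mask), &mask);
#else
    /* A pid of 0 means "the calling thread" on Linux. */
    return sched_setaffinity(0, sizeof(mask), &mask);
#endif
}

/*
 * Hypothetical thread start routine: pin first, then do the work.
 * arg points to an int holding the CPU number for this thread.
 */
void *worker(void *arg)
{
    int cpu = *(int *)arg;

    if (pin_self_to_cpu(cpu) != 0)
        perror("pin_self_to_cpu");
    /* ... this thread's share of the floating point work goes here ... */
    return NULL;
}
```

Each thread would be started with pthread_create(&tid, NULL, worker,
&cpu_ids[i]), with one distinct CPU number per thread. If cache
thrashing from migration is the problem, the pinned run should show
it; if the timings don't change, the cause is probably elsewhere.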
--
Dan Nelson
dnelson at allantgroup.com