Improved multiprocessor usage on amd64
Stephen Montgomery-Smith
stephen at math.missouri.edu
Wed Sep 17 21:24:34 UTC 2008
Dan Nelson wrote:
> In the last episode (Sep 15), Stephen Montgomery-Smith said:
>> Stephen Montgomery-Smith wrote:
>>> Steve Kargl wrote:
>>>> On Mon, Sep 15, 2008 at 07:36:04PM -0500, Stephen Montgomery-Smith wrote:
>>>>> ... and each thread is a loop of the form
>>>>>
>>>>> while (1) {
>>>>>     wait until told to start;
>>>>>     do massive amounts of floating point arithmetic (only additions and
>>>>>     multiplications) on large arrays;
>>>>>     tell the master process that you are done;
>>>>> }
>>>>>
>>>>>> Do you have about as many threads as processors, or more?
>>>>> Both ways. The time difference between the two approaches is
>>>>> negligible.
>>>>>
>>>> Are you using ULE? With my MPI applications, if the number of
>>>> launched processes exceeds the number of cpus by 1, ULE falls
>>>> through the floor. I have a nagging feeling that there is a problem
>>>> with cpu affinity.
>>>>
>>>> http://lists.freebsd.org/pipermail/freebsd-current/2008-July/086917.html
>> Let me say a little bit more.
>>
>> I have this gut feeling that the problem has a lot to do with cache
>> management. My program has each thread doing, in effect, huge matrix
>> multiplications, each one working on their own little bit. If a CPU
>> core changes from one thread to another, it then has to flush out the
>> cache to RAM, and read in a whole bunch of other RAM into cache.
>
> You can try playing with the new cpuset functions in HEAD and 7-STABLE
> to lock particular threads on certain CPUs.
>
It was an excellent suggestion, but locking the threads to particular CPUs didn't make any difference.
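
A minimal sketch of the kind of pinning being discussed, not the code actually used in this thread. It assumes pthread_setaffinity_np() and pthread_getaffinity_np() are available (through <pthread_np.h> on FreeBSD versions that ship the cpuset interface, or through _GNU_SOURCE on Linux). The names pin_self_to_cpu, nth_allowed_cpu, worker_main and run_pinned_workers are hypothetical, and the small summation loop only stands in for the real floating point work.

```c
#define _GNU_SOURCE             /* Linux: exposes the *_np affinity calls */
#include <assert.h>
#include <pthread.h>
#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/cpuset.h>
#include <pthread_np.h>
typedef cpuset_t cpu_mask_t;
#else
#include <sched.h>
typedef cpu_set_t cpu_mask_t;
#endif

#define MAX_WORKERS 64          /* arbitrary limit chosen for this sketch */

struct worker {
    pthread_t tid;
    int cpu;                    /* CPU this worker should be locked to */
    int pin_err;                /* 0 if pinning succeeded, else an errno value */
    double sum;                 /* result of the stand-in computation */
};

/* Lock the calling thread to a single CPU; returns 0 or an errno value. */
static int pin_self_to_cpu(int cpu)
{
    cpu_mask_t mask;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
}

/* Return the n-th CPU (wrapping around) the calling thread may use, or -1. */
static int nth_allowed_cpu(int n)
{
    cpu_mask_t mask;
    int cpu, count = 0;

    if (pthread_getaffinity_np(pthread_self(), sizeof(mask), &mask) != 0)
        return -1;
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &mask))
            count++;
    if (count == 0)
        return -1;
    n %= count;
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &mask) && n-- == 0)
            return cpu;
    return -1;
}

/* Worker: pin first, then do the (stand-in) arithmetic on the chosen CPU. */
static void *worker_main(void *arg)
{
    struct worker *w = arg;
    double s = 0.0;

    w->pin_err = pin_self_to_cpu(w->cpu);
    for (int i = 1; i <= 1000; i++)
        s += (double)i * 0.5;
    w->sum = s;
    return NULL;
}

/*
 * Start nthreads workers, one per allowed CPU (wrapping if there are more
 * threads than CPUs). Returns how many were pinned, or -1 on setup failure.
 */
static int run_pinned_workers(int nthreads, double *total)
{
    struct worker w[MAX_WORKERS];
    int started = 0, pinned = 0, failed = 0;

    if (nthreads < 1 || nthreads > MAX_WORKERS)
        return -1;
    for (int i = 0; i < nthreads; i++) {
        /* Chosen here, before the worker narrows its own affinity mask. */
        w[i].cpu = nth_allowed_cpu(i);
        if (w[i].cpu < 0 ||
            pthread_create(&w[i].tid, NULL, worker_main, &w[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    *total = 0.0;
    for (int i = 0; i < started; i++) {
        pthread_join(w[i].tid, NULL);
        *total += w[i].sum;
        if (w[i].pin_err == 0)
            pinned++;
    }
    return failed ? -1 : pinned;
}
```

Calling run_pinned_workers(4, &total) starts four workers and reports how many of them were successfully locked to a CPU; build with cc -pthread. Each worker pins itself before touching its data, so the arrays it works on are first brought into the cache of the CPU it will stay on.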