Scaling and performance issues with FreeBSD 9 (& 10) on 4 socket systems

David Xu davidxu at freebsd.org
Fri Jun 14 02:05:11 UTC 2013


On 2013/06/13 20:01, Remy Nonnenmacher wrote:
>
> On 06/13/13 13:32, Mark Felder wrote:
>> On Wed, 12 Jun 2013 17:58:49 -0500, David O'Brien <obrien at freebsd.org>
>> wrote:
>>
>>> We found FreeBSD 8.4 to perform better than FreeBSD 9.1, and Linux
>>> considerably better than both on the same machine.
>>
>> http://svnweb.freebsd.org/base?view=revision&revision=241246
>>
>> The above link is likely why 8.4 is better than 9.1 on the same machine.
>>
>>> We've tried various things and haven't been able to explain why FreeBSD
>>> isn't scaling on the new hardware.  Nor why it performs so much worse
>>> than FreeBSD on the older "M2" machines.
>>
>> The CPUs between those machines are quite different. I'm sure we're
>> looking at different cache sizes, different behavior for the
>> hyperthreading, etc. I'm sure others would be greatly interested in you
>> providing the same benchmark results for a recent snapshot of HEAD as
>> well.
>
> We had the same problem on 4x12-core (AMD) machines. After investigating
> with hwpmc, it appeared that performance was killed by a scheduler
> function trying to find the "least used CPU", which unfortunately works on
> contended structures (i.e., lots of cores are fighting to get work). A
> solution was found by allowing artificially long queues of stuck processes
> (steal_thresh bumped to over 8) and by crafting CPU affinity.
>
> This was a year ago and is from memory. I guess you could give it a try
> to see if it helps.
>
> Disregard this if a scheduler specialist contradicts it.
>
> Thanks.
>

AMD's cache hierarchy is very different from Intel's. AFAIK, before
Bulldozer, AMD's L3 was an exclusive cache. With Bulldozer, AMD describes
the L3 cache as a “non-inclusive victim cache”, which is still different
from Intel's L3, which is inclusive.

"- In sched_pickcpu() change general logic of CPU selection. First
look for idle CPU, sharing last level cache with previously used one,
skipping SMT CPU groups. If none found, search all CPUs for the least loaded
one, where the thread with its priority can run now. If none found, search
just for the least loaded CPU."

With an exclusive cache, the L3 holds second-hand (evicted) data, not hot
data. Migrating a thread to a CPU that shares only the L3 with the previous
one therefore has a negative effect: the thread's hot data is lost.
I'd prefer to search for an idle CPU sharing L2 first, then L3.
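
A rough sketch of the search order I mean, written as a toy userland model
in C and not as the real sched_pickcpu() code. The names struct cpu,
find_idle_sharing() and pick_cpu() are made up for illustration, and the
model ignores SMT groups and thread priority, which the commit message
above does take into account:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy topology, not the kernel's own data structures. */
struct cpu {
	int l2_id;	/* CPUs with equal l2_id share an L2 cache */
	int l3_id;	/* CPUs with equal l3_id share an L3 cache */
	int load;	/* runnable threads queued; 0 means idle */
};

enum share_level { SHARE_L2, SHARE_L3 };

/* Return the first idle CPU sharing the given cache level with prev, or -1. */
static int
find_idle_sharing(const struct cpu *cpus, int ncpu, int prev,
    enum share_level lvl)
{
	for (int i = 0; i < ncpu; i++) {
		int shares = (lvl == SHARE_L2)
		    ? cpus[i].l2_id == cpus[prev].l2_id
		    : cpus[i].l3_id == cpus[prev].l3_id;
		if (shares && cpus[i].load == 0)
			return (i);
	}
	return (-1);
}

/* Pick a CPU for a thread that last ran on CPU prev. */
int
pick_cpu(const struct cpu *cpus, int ncpu, int prev)
{
	int best, c;

	assert(cpus != NULL && ncpu > 0 && prev >= 0 && prev < ncpu);

	/* Assumption: if the previous CPU is idle, staying there is best. */
	if (cpus[prev].load == 0)
		return (prev);

	/* Closest cache first: hot data may still sit in a shared L2. */
	c = find_idle_sharing(cpus, ncpu, prev, SHARE_L2);
	if (c >= 0)
		return (c);

	/* Next, an idle CPU behind the same L3. */
	c = find_idle_sharing(cpus, ncpu, prev, SHARE_L3);
	if (c >= 0)
		return (c);

	/* Fall back to the least loaded CPU anywhere (lowest index on ties). */
	best = 0;
	for (int i = 1; i < ncpu; i++)
		if (cpus[i].load < cpus[best].load)
			best = i;
	return (best);
}
```

The point is only the ordering: an idle CPU behind the same L2 is tried
before one that merely shares the L3, and the global least-loaded search
is the last resort.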



