ULE vs. 4BSD in RELENG_7
Bruce Evans
brde at optusnet.com.au
Wed Oct 24 09:15:29 PDT 2007
On Tue, 23 Oct 2007, Kris Kennaway wrote:
> Josh Carroll wrote:
>> Anyway, in summary, ULE is about 5-6 % slower than 4BSD for two
>> workloads that I am sensitive to: building world with -j X, and ffmpeg
>> -threads X. Other benchmarks seem to indicate relatively equal
>> performance between the two. MySQL, on the other hand, is
>> significantly faster in ULE.
5-6% is a lot. ULE has some tuning for makeworld in -current, which
for me reduced it to less than 1% slower than 4BSD (down from 5-10%
slower), for the case of makeworld -j4 over nfs on a 2-CPU system with
the sources pre-cached on the server and objects on a local file system,
and extensive local tuning of makeworld, nfs and network drivers. I
think the tuning in ULE was mainly for a 2-CPU system, because makeworld
seemed to be very bad under ULE only with 2 CPUs. Apparently, it is also
very bad with more CPUs. There are sysctls to modify the ULE tuning.
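
For anyone wanting to look at those knobs, a minimal sketch follows. It assumes a FreeBSD box running ULE, where the scheduler's sysctls live under the kern.sched tree. The helper names sysctl_argv and read_sched_tree are made up for illustration. Take tunable names from the listing on your own system, since the set varies between branches.

```python
import subprocess


def sysctl_argv(oid, value=None):
    """Build the sysctl(8) command line for reading or setting one OID.

    With value=None the OID (or subtree) is only read; otherwise the
    argument has the form "name=value", which needs root to take effect.
    """
    return ["sysctl", oid if value is None else f"{oid}={value}"]


def read_sched_tree():
    """Return the text printed by `sysctl kern.sched` (FreeBSD only)."""
    result = subprocess.run(
        sysctl_argv("kern.sched"),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Lists the active scheduler's name and its tunables.
    print(read_sched_tree())
```

Record the values before changing anything, so that benchmark runs can be compared against a known baseline.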
>> I'm trying to understand why ffmpeg and buildworld are slower in ULE
>> than 4BSD, since it seems to me that ULE was supposed to be the better
>> scaling scheduler.
Makeworld is slower because any scheduling is bad for it. More context
switches take more time in themselves and cost even more by reducing affinity.
>> Does anyone have any additional performance tests I can run that might
>> help indicate where the deficiency is in the ULE scheduler? MySQL
>> performance is excellent, so I'm wondering if it was tuned to that
>> particular workload?
I think it was.
> One major difference is that your workload is 100% user. Also you were
> reporting ULE had more idle time, which looks like a bug since I would expect
> it to be basically 0% idle on such a workload.
No, at least buildworld, while being mainly user-CPU-bound by the gcc
hog, does some disk accesses and a significant number of syscalls. I
have to work very hard to reduce its idle time to about 5% for UP on
local disks and to 11% for 2-way SMP over nfs.
More idle time for ULE at least used to be a feature. ULE sometimes wants
to avoid switching to another thread immediately, in the hope of finding
a thread with better affinity than the currently runnable ones. It
waited far too long (in its idle threads) for makeworld with 2 CPUs.
Waiting has a better chance of being best if there are many CPUs.
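
The tradeoff can be put as a toy cost model. This is only a sketch of the reasoning above, not ULE's actual logic. The function names and all the microsecond figures are invented for illustration, not measured.

```python
def cost_of_running_now(cold_penalty_us):
    """Time lost if the idle CPU immediately takes a thread with poor affinity.

    The only cost in this model is the cache-refill penalty.
    """
    return cold_penalty_us


def cost_of_waiting(max_wait_us, cold_penalty_us, warm_arrival_us=None):
    """Time lost if the idle CPU waits up to max_wait_us for a warm thread.

    warm_arrival_us is when a thread with good affinity becomes runnable,
    or None if no such thread turns up.
    """
    if warm_arrival_us is not None and warm_arrival_us <= max_wait_us:
        # The CPU idled until the warm thread arrived, then ran it.
        return warm_arrival_us
    # The CPU idled for the whole window, then ran a cold thread anyway.
    return max_wait_us + cold_penalty_us


if __name__ == "__main__":
    penalty = 50  # hypothetical cache-refill cost, in microseconds
    print(cost_of_running_now(penalty))        # run the cold thread at once
    print(cost_of_waiting(1000, penalty, 20))  # warm thread shows up quickly
    print(cost_of_waiting(1000, penalty))      # nothing shows up: pure loss
```

In this model waiting only wins when a warm thread arrives sooner than the cold-cache penalty would have cost, so a long wait window with nothing to show for it is the "waited far too long" case.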
Bruce