ULE vs. 4BSD in RELENG_7

Bruce Evans brde at optusnet.com.au
Wed Oct 24 09:15:29 PDT 2007

On Tue, 23 Oct 2007, Kris Kennaway wrote:

> Josh Carroll wrote:
>> Anyway, in summary, ULE is about 5-6 % slower than 4BSD for two
>> workloads that I am sensitive to: building world with -j X, and ffmpeg
>> -threads X. Other benchmarks seem to indicate relatively equal
>> performance between the two. MySQL, on the other hand, is
>> significantly faster in ULE.

5-6% is a lot.  ULE has some tuning for makeworld in -current, which
for me made ULE less than 1% slower than 4BSD (down from 5-10%
slower), for the case of makeworld -j4 over nfs on a 2-CPU system with
the sources pre-cached on the server and objects on a local file system,
and extensive local tuning of makeworld, nfs and network drivers.  I
think the tuning in ULE was mainly for a 2-CPU system, because makeworld
seemed to be very bad under ULE only with 2 CPUs.  Apparently, it is also
very bad with more CPUs.  There are sysctls to modify the ULE tuning.
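
A minimal sketch for inspecting those knobs, assuming the scheduler
settings are exposed under the kern.sched sysctl subtree (kern.sched.name
reports the scheduler in use).  The helper names are hypothetical, and the
set of knobs differs between branches, so no other knob is named here.

```python
import subprocess

def parse_sysctl(text):
    """Turn 'name: value' lines, as printed by sysctl(8), into a dict."""
    knobs = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # ignore lines that are not of the form 'name: value'
            knobs[name.strip()] = value.strip()
    return knobs

def read_sched_knobs():
    """Run 'sysctl kern.sched' (FreeBSD only) and parse what it prints."""
    result = subprocess.run(["sysctl", "kern.sched"],
                            capture_output=True, text=True, check=True)
    return parse_sysctl(result.stdout)
```

Calling read_sched_knobs() before and after changing a knob gives a
record of what was tuned for each benchmark run.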

>> I'm trying to understand why ffmpeg and buildworld are slower in ULE
>> than 4BSD, since it seems to me that ULE was supposed to be the better
>> scaling scheduler.

Makeworld is slower because any rescheduling is bad for it.  Extra context
switches take time in themselves and cost more still by reducing affinity.

>> Does anyone have any additional performance tests I can run that might
>> help indicate where the deficiency is in the ULE scheduler? MySQL
>> performance is excellent, so I'm wondering if it was tuned to that
>> particular workload?

I think it was.

> One major difference is that your workload is 100% user.  Also you were
> reporting ULE had more idle time, which looks like a bug since I would expect
> it to be basically 0% idle on such a workload.

No, at least buildworld, while being mainly user-CPU-bound by the gcc
hog, does some disk accesses and a significant number of syscalls.  I
have to work very hard to reduce its idle time to about 5% for UP on
local disks and to 11% for 2-way SMP over nfs.
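
One way to estimate such an idle percentage from time(1) output is
sketched below.  It assumes idle time is simply the wall-clock capacity
not accounted for by user and system time, so interrupt time is ignored.
The function name is hypothetical, and the figures in the usage note are
made up for illustration, not measurements from this thread.

```python
def idle_fraction(real, user, system, ncpu=1):
    """Fraction of total CPU capacity (ncpu * real seconds) left unused."""
    capacity = ncpu * real
    if capacity <= 0:
        raise ValueError("real time and ncpu must be positive")
    return 1.0 - (user + system) / capacity
```

For example, idle_fraction(1000, 1600, 180, ncpu=2) is about 0.11: a
2-CPU run of 1000 seconds has 2000 CPU-seconds available, of which 1780
were used.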

More idle time for ULE at least used to be a feature.  ULE sometimes wants
to avoid switching to another thread immediately, in the hope of finding
a thread with better affinity than the currently runnable ones.  It
waited far too long (in its idle threads) for makeworld with 2 CPUs.
Waiting has a better chance of being best if there are many CPUs.
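
The trade-off can be put as a break-even check.  This is a toy model
under stated assumptions, not ULE's actual logic: an idle CPU either runs
a cache-cold thread now and pays cold_penalty, or idles for wait time
units and then gets a thread with good affinity with probability
p_affine.  All names and units are hypothetical.

```python
def waiting_pays_off(wait, p_affine, cold_penalty):
    """True if idling for `wait` beats running a cache-cold thread now.

    Running now costs cold_penalty.  Waiting costs the idle time, plus
    the same penalty if no affine thread turns up (probability
    1 - p_affine).
    """
    cost_now = cold_penalty
    cost_wait = wait + (1.0 - p_affine) * cold_penalty
    return cost_wait < cost_now
```

In this model waiting wins only while wait < p_affine * cold_penalty, so
a wait that is long relative to the penalty loses however likely the
affine thread is.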

