RELENG_4 -> 5 -> 6: significant performance regression
Dmitry Pryanishnikov
dmitry at atlantis.dp.ua
Fri May 12 13:26:13 PDT 2006
Hello!
On Fri, 28 Apr 2006, Kris Kennaway wrote:
>>>> makeoptions CONF_CFLAGS=-fno-builtin
> I don't know, it needs to be tested in your particular case.
I've built another kernel, adding back
makeoptions CONF_CFLAGS=-fno-builtin
options QUOTA
The results are almost the same as without these two options. So the
following overhead difference:
>>>>                    %Sys  %Intr  %Idl
>>>> RELENG_6 + rl0      45    40     15
>>>> RELENG_6 + fxp0     45    35     20
>
>>                    %Sys  %Intr  %Idl  "time md5 -t" wall clock time
>> RELENG_6 + rl0      34    24     42    1:43
>> RELENG_6 + fxp0     30    20     50    1:40
is caused by just these:
options INVARIANTS
options INVARIANT_SUPPORT
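
To put a number on that difference, here is a minimal sketch that takes the
%Sys/%Intr/%Idl figures from the two tables above and subtracts the busy
percentages. It assumes the first table was measured with INVARIANTS and
INVARIANT_SUPPORT enabled and the second without them, as described in this
message; the function and variable names are only illustrative.

```python
# (%Sys, %Intr, %Idl) figures copied from the two tables quoted above.
# Assumption: first table = kernel with INVARIANTS + INVARIANT_SUPPORT,
# second table = kernel without them.
with_invariants = {"rl0": (45, 40, 15), "fxp0": (45, 35, 20)}
without_invariants = {"rl0": (34, 24, 42), "fxp0": (30, 20, 50)}


def busy(sample):
    """Busy CPU percentage: system time plus interrupt time."""
    sys_pct, intr_pct, _idle_pct = sample
    return sys_pct + intr_pct


def invariants_overhead(nic):
    """Percentage points of CPU attributed to the two debugging options."""
    return busy(with_invariants[nic]) - busy(without_invariants[nic])


for nic in ("rl0", "fxp0"):
    print(nic,
          busy(with_invariants[nic]),
          busy(without_invariants[nic]),
          invariants_overhead(nic))
```

With these figures the debugging options account for roughly 27 (rl0) and
30 (fxp0) percentage points of CPU.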
>> (I'll try to find out which one of these takes which % of overhead when I
>> get free time), but still much worse than under RELENG_4, where this
>> particular (I'd say "quite common") usage pattern takes 24-28% of CPU time,
>> while under RELENG_5 / 6 it takes >= 50% ;(
>
> Thanks. Silly question: the data transfer rate is the same on both
> 4.x and 6.x, right? i.e. the data transfer itself takes the same
> time?
Yes. I'm transferring a large file (ISO image) from another (much faster,
lightly loaded) machine over a 10Mbit/s Ethernet link, so the transfer itself
is limited only by the wire speed (the actual transfer rate is very close to
1000 KBytes/sec according to the ftp client and the 'systat -vm 1' disk
transfer rate in every measurement).
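
As a rough sanity check of the wire-speed claim, the sketch below compares
the observed rate with the raw line rate of a 10 Mbit/s link. Whether
"KBytes" here means 1000 or 1024 bytes is an assumption, so both are shown.
The raw figure ignores Ethernet framing and TCP/IP header overhead, so the
achievable payload rate is somewhat lower than it.

```python
LINK_BITS_PER_SEC = 10_000_000   # nominal 10 Mbit/s Ethernet
OBSERVED_KBYTES_PER_SEC = 1000   # rate reported in this message


def raw_limit_bytes_per_sec(link_bits_per_sec):
    """Raw line rate in bytes per second, before any protocol overhead."""
    return link_bits_per_sec / 8


def utilization(observed_kbytes, bytes_per_kbyte,
                link_bits_per_sec=LINK_BITS_PER_SEC):
    """Fraction of the raw line rate used by the observed transfer."""
    observed_bytes = observed_kbytes * bytes_per_kbyte
    return observed_bytes / raw_limit_bytes_per_sec(link_bits_per_sec)


# Assumption: the unit size is unknown, so try both interpretations.
for bytes_per_kbyte in (1000, 1024):
    pct = utilization(OBSERVED_KBYTES_PER_SEC, bytes_per_kbyte) * 100
    print(bytes_per_kbyte, round(pct, 1))
```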
> The next step is for you to run some profiling tests to see
> where the kernel is spending time, e.g. with hwpmc.
I have to familiarize myself with this new (for me) feature first... Also,
hwpmc doesn't exist in RELENG_4, so it'll be impossible to compare results
with RELENG_4. It's a pity, because my tests clearly show that the main loss
of performance (growth of overhead) occurred during the RELENG_4 -> 5
transition. And last, but not least: my test system (Transcend TS-ABX31A
motherboard based on the Intel BX chipset) does not provide an APIC. Will
hwpmc be useful in this situation?
> Also, when you are trying to quantify performance differences, you
> need to run many copies of the test (at least 10) under identical
> conditions to account for possible variations. The ministat tool
> (/usr/src/tools/tools/ministat) is good for performing statistically
> meaningful comparisons of data sets when you have them.
As my transfer takes a long time (say 10 minutes), I've observed the CPU
usage percentages many times during the transfer - they don't vary by more
than +/- 2-3% during the main transfer phase (when the transfer speed is
stable). My "time md5 -t" runs were used only as a confirmation that systat's
numbers are trustworthy - they simply confirm that there are _much_ fewer CPU
cycles available for applications under RELENG_5/6 than under RELENG_4 (under
an identical load pattern). I ran "time md5 -t" several (3-5) times just to
confirm my assumptions, and the results didn't vary by more than 3%. So I
suppose that ministat isn't necessary in my tests.
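
For reference, the kind of check discussed here (run-to-run spread, and a
Student's t comparison of two sets of timings) can be sketched as follows.
This is only a rough analogue of what ministat reports, not its
implementation; the pooled-variance form of the t statistic is an assumption,
and the sample values are placeholders, NOT measurements from these tests.

```python
import math
import statistics


def spread_percent(samples):
    """Largest deviation from the mean, as a percentage of the mean."""
    mean = statistics.mean(samples)
    return max(abs(x - mean) for x in samples) / mean * 100.0


def student_t(a, b):
    """Pooled-variance Student's t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a) +
                  (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    diff = statistics.mean(a) - statistics.mean(b)
    return diff / math.sqrt(pooled_var * (1 / na + 1 / nb))


# Placeholder wall-clock times in seconds (hypothetical, for illustration).
runs_a = [100.0, 102.0, 101.0, 99.0, 98.0]
runs_b = [60.0, 61.0, 59.0, 62.0, 58.0]

print(spread_percent(runs_a))     # run-to-run spread of one data set
print(student_t(runs_a, runs_b))  # large |t| suggests a real difference
```

When the spread within each set is a few percent and the difference between
the sets is tens of percent, the t statistic is very large, which is the
situation described above.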
> Kris
Sincerely, Dmitry
--
Atlantis ISP, System Administrator
e-mail: dmitry at atlantis.dp.ua
nic-hdl: LYNX-RIPE