idprio processes slowing down system

Peter Jeremy peterjeremy at acm.org
Mon Dec 6 21:39:29 UTC 2010


On 2010-Nov-28 02:24:21 -0600, Adam Vande More <amvandemore at gmail.com> wrote:
>On Sun, Nov 28, 2010 at 1:26 AM, Peter Jeremy <peterjeremy at acm.org> wrote:
>> Since all the boinc processes are running at i31, why are they impacting
>> a buildkernel that runs with 0 nicety?
>
>With the setup you presented you're going to have a lot of context switches
>as the buildworld is going to give plenty of opportunities for boinc
>processes to get some time.

Agreed.

>  When it does switch out, the CPU cache is
>invalidated, then invalidated again when the buildworld preempts back.

Not quite.  The amd64 uses physically addressed caches (see [1] 7.6.1)
so there's no need to flush the caches on a context switch.  (Though
the TLB _will_ need to be flushed since it does virtual-to-physical
mapping (see [1] 5.5)).  OTOH, whilst the boinc code is running, it
will occupy space in the caches, thus reducing the effective cache
size and presumably reducing the effective cache hit rate.

>  This is what makes it slow.

Unfortunately, I don't think this explains the difference.  My system
doesn't have hyperthreading so any memory stalls will block the
affected core and the stall time will be added to the currently
running process.  My timing figures show that the user and system times
are unaffected by boinc - which is inconsistent with the slowdown being
due to the impact of boinc on caching.

I've done some further investigations following a suggestion from a
friend.  In particular, an idprio process should only be occupying
idle time so the time used by boinc and the system idle task whilst
boinc is running should be the same as the system idle time whilst
boinc is not running.  Re-running the tests and additionally monitoring
process times gives me the following idle time stats:

x /tmp/boinc_running
+ /tmp/boinc_stopped
+------------------------------------------------------------------------+
| +  +        +           +                                       xx  x x|
||__________A_M_______|                                           |__AM| |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4         493.3        507.78        501.69       499.765     6.3722759
+   4        332.35        392.08        361.84       356.885     26.514364
Difference at 95.0% confidence
        -142.88 +/- 33.364
        -28.5894% +/- 6.67595%
        (Student's t, pooled s = 19.2823)

The numbers represent seconds of CPU time charged to [idle] alone with
boinc stopped (+), or to [idle] plus all boinc processes with boinc
running (x).  If boinc only consumed idle time, the two would match.
Instead, boinc is using about 143 seconds that would not otherwise be
idle - which isn't what idprio processes should be doing.
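
For anyone wanting to sanity-check the ministat figures, the difference
and its confidence interval can be recomputed from the two summary rows
alone.  This is only a rough sketch: the t critical value (2.447 for 6
degrees of freedom) is taken from a standard table rather than computed,
and the variable names are mine.

```python
import math

# Summary rows copied from the ministat output above.
n_run, avg_run, sd_run = 4, 499.765, 6.3722759      # x: [idle] + boinc, boinc running
n_stop, avg_stop, sd_stop = 4, 356.885, 26.514364   # +: [idle] only, boinc stopped

# Two-sided 95% Student's t critical value for n_run + n_stop - 2 = 6
# degrees of freedom.  Assumed from a standard t table, not computed here.
T_CRIT_DF6 = 2.447


def pooled_stddev(n1, s1, n2, s2):
    """Pooled standard deviation of two samples with equal assumed variance."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


pooled = pooled_stddev(n_run, sd_run, n_stop, sd_stop)

# ministat reports the second data set (+) relative to the first (x).
diff = avg_stop - avg_run
half_width = T_CRIT_DF6 * pooled * math.sqrt(1.0 / n_run + 1.0 / n_stop)

# The same quantities expressed as a percentage of the x average.
pct = 100.0 * diff / avg_run
pct_half_width = 100.0 * half_width / avg_run

print(f"pooled s   = {pooled:.4f}")
print(f"difference = {diff:.2f} +/- {half_width:.3f}")
print(f"           = {pct:.4f}% +/- {pct_half_width:.5f}%")
```

To within rounding this should agree with the "Difference at 95.0%
confidence" lines, so the gap is well outside the noise for 4 runs each.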

My suspicion is that idprio processes are not being preempted
immediately a higher priority process becomes ready but are being
allowed to continue to run for a short period (possibly until their
current timeslice expires).  Unfortunately, I haven't yet worked out
how to prove or disprove this.
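
To make the suspicion concrete, here is a toy model of what delayed
preemption would look like from the build's point of view.  It is not a
model of the real scheduler: the function name, the single-core
assumption and every number in it are invented purely for illustration.
It only shows that a fixed delay on each wakeup would inflate wall-clock
time while leaving the build's own CPU time unchanged, which is the
pattern I'm seeing.

```python
def build_times(cpu_s, blocked_s, wakeups, preempt_delay_s):
    """Return (cpu_time, wall_time) in seconds for one job on one core.

    Hypothetical toy model: each time the job becomes runnable it waits
    preempt_delay_s before the idprio process is actually switched out.
    The job is not charged CPU time while it waits.
    """
    cpu_time = cpu_s
    wall_time = cpu_s + blocked_s + wakeups * preempt_delay_s
    return cpu_time, wall_time


# All inputs below are made-up illustrative values, not measurements.
CPU_S = 1000.0       # user + system time needed by the build
BLOCKED_S = 300.0    # time the build spends blocked (e.g. on I/O)
WAKEUPS = 50000      # number of times the build becomes runnable again
DELAY_S = 0.002      # assumed delay before the idprio process is preempted

cpu_ideal, wall_ideal = build_times(CPU_S, BLOCKED_S, WAKEUPS, 0.0)
cpu_delayed, wall_delayed = build_times(CPU_S, BLOCKED_S, WAKEUPS, DELAY_S)

# Time the idprio process ran while the build was already runnable.
non_idle_time_used = wall_delayed - wall_ideal

print(f"CPU time:  {cpu_ideal:.1f}s immediate, {cpu_delayed:.1f}s delayed")
print(f"Wall time: {wall_ideal:.1f}s immediate, {wall_delayed:.1f}s delayed")
print(f"Non-idle time used by idprio process: {non_idle_time_used:.1f}s")
```

If something like this is happening, the extra wall-clock time should
scale with the number of wakeups rather than with the build's CPU time,
which might be one way to test it.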

I was hoping that someone more familiar with the scheduler behaviour
would comment.

[1] "AMD64 Architecture Programmer's Manual Volume 2: System Programming"
    http://support.amd.com/us/Processor_TechDocs/24593.pdf

-- 
Peter Jeremy