more data: SCHED_ULE+PREEMPTION is the problem

George Mitchell george+freebsd at m5p.com
Tue Apr 17 14:12:47 UTC 2018


On 04/07/18 10:18, Peter wrote:
> Hi all,
> [...]
Thanks for all the investigation!
> 3. kern.sched.preempt_thresh
> 
> I could make the problem disappear by changing kern.sched.preempt_thresh
>  from the default 80 to either 11 (i5-3570T) or 7 (p3) or smaller. This
> seems to correspond to the disk interrupt threads, which run at intr:12
> (i5-3570T) or intr:8 (p3).
> [...]

More data.  With SCHED_4BSD on FreeBSD 10.4-RELEASE-p8 #0 r331984:
kern.sched.runq_fuzz: 1
kern.sched.ipiwakeup.useloop: 0
kern.sched.ipiwakeup.usemask: 1
kern.sched.ipiwakeup.delivered: 376139898
kern.sched.ipiwakeup.requested: 376137875
kern.sched.ipiwakeup.enabled: 1
kern.sched.slice: 12
kern.sched.quantum: 94488
kern.sched.name: 4BSD
kern.sched.preemption: 1
kern.sched.cpusetsize: 8
With dnetc running on a 6-core AMD CPU from a few years back,
"time make buildworld" yields:

6640.224u 828.874s 2:14:37.73 92.4%     28525+494k 31633+431554io 33192pf+0w

I then switched to a GENERIC kernel (SCHED_ULE), FreeBSD 10.4-RELEASE-p8 #0 r332560:
kern.sched.topology_spec: <groups>
 <group level="1" cache-level="0">
  <cpu count="6" mask="3f">0, 1, 2, 3, 4, 5</cpu>
  <children>
   <group level="2" cache-level="2">
    <cpu count="6" mask="3f">0, 1, 2, 3, 4, 5</cpu>
   </group>
  </children>
 </group>
</groups>

kern.sched.steal_thresh: 2
kern.sched.steal_idle: 1
kern.sched.balance_interval: 127
kern.sched.balance: 1
kern.sched.affinity: 1
kern.sched.idlespinthresh: 157
kern.sched.idlespins: 10000
kern.sched.static_boost: 152
kern.sched.preempt_thresh: 80
kern.sched.interact: 30
kern.sched.slice: 12
kern.sched.quantum: 94488
kern.sched.name: ULE
kern.sched.preemption: 1
kern.sched.cpusetsize: 8

I stupidly typed "make buildworld" without the "time" command, but the
build log started at Mon Apr 16 13:49:12 EDT 2018 and completed at
Tue Apr 17 00:22:23 EDT 2018.  You read that right: about 2 1/4 hours
under SCHED_4BSD vs. just over 10 1/2 hours under SCHED_ULE!
So I set "sysctl kern.sched.preempt_thresh=5" (a wild guess on my part)
and started another "time make buildworld".  It's still going now, but
subjectively it's still running like molasses.  I'll post more results
later after trying sysctl kern.sched.preempt_thresh=0.
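
For anyone who wants to check the arithmetic, here is a minimal sketch
that recomputes the two wall-clock times from the figures quoted above.
The variable names are just illustrative; the only inputs are the two
build-log timestamps (both EDT, so no time-zone handling is needed) and
the elapsed time from the SCHED_4BSD "time" line.

```python
from datetime import datetime, timedelta

# SCHED_ULE run: start and end timestamps copied from the build log.
ule_start = datetime(2018, 4, 16, 13, 49, 12)
ule_end = datetime(2018, 4, 17, 0, 22, 23)
ule_elapsed = ule_end - ule_start

# SCHED_4BSD run: elapsed field of the "time make buildworld" line (2:14:37.73).
bsd_elapsed = timedelta(hours=2, minutes=14, seconds=37.73)

# Dividing two timedeltas gives a plain float ratio.
slowdown = ule_elapsed / bsd_elapsed

print("ULE elapsed: ", ule_elapsed)
print("4BSD elapsed:", bsd_elapsed)
print("ULE/4BSD wall-clock ratio: %.1f" % slowdown)
```

This only compares wall-clock time for one run under each scheduler; it
says nothing about run-to-run variation.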

By the way, over the years that this discussion has been going on, I've
*never* had a response to my question: "What is the workload for which
SCHED_ULE outperforms SCHED_4BSD?"                            -- George




More information about the freebsd-stable mailing list