From nobody Thu Jul 15 01:03:04 2021
Subject: Re: Periodic rant about SCHED_ULE
To: freebsd-hackers@freebsd.org
References: <13445948-7804-20b4-4ae6-aaac14d11e87@m5p.com> <20210708101907.0be3a3c2@rimwks.local> <20210714164745.0128ea15@gumby.homeunix.com>
From: Dewayne Geraghty
Message-ID: <8239e474-fc36-b8aa-93b7-39197534cd30@heuristicsystems.com.au>
Date: Thu, 15
 Jul 2021 11:03:04 +1000
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
Content-Type: text/plain; charset=utf-8

On 15/07/2021 1:47 am, RW via freebsd-hackers wrote:
> On Thu, 8 Jul 2021 10:19:07 +0300
> Rozhuk Ivan wrote:
>
>> and sysctl tunings on desktop only:
>>
>> # SCHEDULER
>> kern.sched.steal_thresh=1
>> kern.sched.balance=0
>> kern.sched.balance_interval=1000
>> kern.sched.affinity=10000
>
> You missed out
>
> kern.sched.preempt_thresh=224
>
> (perhaps because it's so well known).
>
> In my experience this makes a big difference for desktop use. If I set
> that and build on tmpfs, to minimise the effect of I/O contention, I
> don't see any discernible effect on Xfce when building world with -j4.
>
> This is on a bottom of the range i5 from 9 years ago. It's not
> particularly fast.
>
> I think the default only allows preemption by real-time and kernel
> threads.
>

Hi RW,

Note the PRI(ority) column when you run /usr/bin/top.  Processes with a
PRI below the default kern.sched.preempt_thresh=80 (i.e. nice -n 8) may
pre-empt other processes or send interprocessor interrupts (IPIs) to
other CPUs.  A top started with idprio 0 is assigned a starting PRI of
124; so on SCHED_ULE, such processes will receive CPU time (even at
idprio 31) but won't pre-empt others.

If you really want all processes to pre-empt others, enabling
FULL_PREEMPTION achieves the same goal as 224.  I don't have a use case
for no pre-emption.  Anyone?

As to why kern.sched.preempt_thresh=224 helps desktop users, I can only
speculate: with a high threshold, more IPIs are sent to other CPU cores
so they can be kept busy (?).  Refer to /usr/src/sys/kern/sched_ule.c

--
Returning to the topic.  It's a very hard choice between schedulers.  I
did a lot of testing between them, and tuning, to see if one excelled on
my humble Xeon-E3.  I couldn't see a significant difference between
workloads - though next time (and a hint for others) I'll disable SMT
and set dev.cpu.0.freq to disable turbo behaviour.

For now, sched_4bsd appears to be more efficient in terms of code
complexity, and people with high CPU workloads have preferred sched_4bsd
in the past, while sched_ule has a lot of things to tweak and is
recommended by the FreeBSD project.
Otherwise it wouldn't be the default.

Looking at
https://github.com/freebsd/freebsd-src/tree/main/sys/kern/sched_*.c
their histories show that both are tweaked a couple of times a year, so
I wouldn't rule sched_4bsd out of contention just yet.

FWIW, my servers modify only:
kern.sched.affinity=7
kern.sched.interact=0
kern.sched.slice=128
while firewalls:
kern.sched.balance=0
kern.sched.interact=0

A loadable scheduler has been discussed here a few times - I vaguely
recall it being considered inefficient (complexity) and unnecessary
(you'll settle on one scheduler and, unless testing, are unlikely to
change).  Further in the past, sched_4bsd was to be removed, but some
people demonstrated that it performed better for their workloads.

Cheerio.
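
The effect of kern.sched.preempt_thresh discussed earlier in this
message can be illustrated with a toy model.  This is only a sketch
under stated assumptions, not the kernel's code: the function name
may_preempt and the example priorities 130 and 180 are made up for
illustration, the two thresholds (80 and 224) are the values quoted in
this thread, and the real decision in sched_ule.c has more cases (idle
threads and remote CPUs, for instance) that are left out here.  Lower
numbers mean higher priority.

```python
# Toy model of a preemption threshold; NOT the code from sched_ule.c.
# ASSUMPTION: lower number = higher priority, and a threshold of 0
# disables preemption in this model.

DEFAULT_THRESH = 80    # default kern.sched.preempt_thresh quoted in the thread
DESKTOP_THRESH = 224   # value suggested in the thread for desktop use


def may_preempt(new_pri: int, running_pri: int, thresh: int) -> bool:
    """Return True if a newly runnable thread may preempt the running one."""
    if new_pri >= running_pri:
        # The new thread is not more urgent than the running one.
        return False
    if thresh == 0:
        # Preemption is switched off in this model.
        return False
    # Only threads at or below the threshold value are allowed to preempt.
    return new_pri <= thresh


if __name__ == "__main__":
    # Hypothetical case: a thread at priority 130 wakes up while a
    # thread at priority 180 is running.
    for thresh in (DEFAULT_THRESH, DESKTOP_THRESH):
        print(thresh, may_preempt(130, 180, thresh))
```

With the lower threshold the waking thread has to wait for the running
one to yield or exhaust its slice; with the higher threshold it is
allowed to preempt immediately, which is one plausible reading of why
the setting feels better on a desktop.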