Consistently "high" CPU load on 10.0-STABLE

Adrian Chadd adrian at freebsd.org
Sun Jul 20 07:13:06 UTC 2014


Hi,

I don't know how to do this with dtrace, but take a look at
tools/sched/schedgraph.py and enable KTR to get some trace records.

KTR logs the scheduler activity -and- the load average along with it.
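
A minimal sketch of the kernel config side, based on the usage notes in
the header of schedgraph.py. Check the header of the copy in your own
tree, since the details may differ between branches. The KTR_ENTRIES
value is only the example size suggested there, not a tuned number:

```
# Add to the kernel config file, then rebuild and boot the new kernel.
options 	KTR
options 	KTR_ENTRIES=262144	# larger ring buffer = longer trace window
options 	KTR_COMPILE=(KTR_SCHED)	# compile in scheduler trace points
options 	KTR_MASK=(KTR_SCHED)	# enable them at boot
```

Once the box has been running long enough to show the load pattern, the
same notes say to stop tracing with "sysctl debug.ktr.mask=0", dump the
buffer with "ktrdump -ct > ktr.out", and view it with
"python schedgraph.py ktr.out". The script needs the Tkinter bindings
for Python installed.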


-a


On 19 July 2014 23:24, Jeremy Chadwick <jdc at koitsu.org> wrote:
> (Please keep me CC'd as I'm not subscribed to freebsd-stable@)
>
> Today I took the liberty of upgrading my main home server from
> 9.3-STABLE (r268785) to 10.0-STABLE (r268894).  The upgrade consisted of
> doing a fresh install of 10.0-STABLE on a brand new unused SSD.  Most
> everything went as planned, barring a couple ports-related anomalies,
> and I was fairly impressed by the fact that buildworld times had
> dropped to 27 minutes and buildkernel to 4 minutes with clang (something
> I'd been avoiding like the plague for a long while).  Kudos.
>
> But after an hour or so, I noticed a consistent (i.e. reproducible)
> trend: the system load average tends to hang around 0.10 to 0.15 all the
> time.  There are times where the load drops to 0.03 or 0.04 but then
> something kicks it back up to 0.15 or 0.20 and then it slowly levels out
> again (over the course of a few minutes) then repeats.
>
> Obviously this is normal behaviour for a system when something is going
> on periodically.  So I figured it might have been a userland process
> behaving differently under 10.x than 9.x.  I let top -a -S -s 1 run and
> paid very very close attention to it for several minutes.  Nothing.  It
> doesn't appear to be something userland -- it appears to be something
> kernel-level, but nothing in top -S shows up as taking up any CPU time
> other than "[idle]" so I have no idea what might be doing it.
>
> The box isn't doing anything like routing network traffic/NAT, it's pure
> IPv4 (IPv6 disabled in world and kernel, and my home network does
> basically no IPv6) and sits idle most of the time fetching mail.  It
> does use ZFS, but not for /, swap, /var, /tmp, or /usr.
>
> vmstat -i doesn't particularly show anything awful.  All the cpuX:timer
> entries tend to fluctuate in rate, usually 120-200 or so; I'd expect an
> interrupt storm to be showing something in the 1000+ range.
>
> The only thing I can think of is the fact that the SSD being used has no
> 4K quirk entry in the kernel (and its ATA IDENTIFY responds with 512
> logical, 512 physical, even though we know it's 4K).  The partitions are
> all 1MB-aligned regardless.
>
> This is all bare-metal, by the way -- no virtualisation involved.
>
> I do have DTrace enabled/built on this box but I have absolutely no clue
> how to go about profiling things.  For example maybe output of this sort
> would be helpful (but I've no idea how to get it):
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2014-July/079276.html
>
> I'm certain I didn't see this behaviour in 9.x so I'd be happy to try
> and track it down if I had a little bit of hand-holding.
>
> I've put all the things I can think of that might be relevant to "system
> config/tuning bits" up here:
>
> http://jdc.koitsu.org/freebsd/releng10_perf_issue/
>
> I should note my kernel config is slightly inaccurate: I've removed some
> stuff from the file in an attempt to rebuild, but building world prior to
> kernel failed due to r268896 breaking world.  Anyone subscribed here
> has already seen the Jenkins job for that ;-)
>
> Thanks.
>
> --
> | Jeremy Chadwick                                   jdc at koitsu.org |
> | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> | Making life hard for others since 1977.             PGP 4BD6C0CB |
