From nobody Tue Aug 23 09:17:22 2022
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 266000] noticeable higher i/o and cpu usage in 13.1 zfs on root (virtualized)
Date: Tue, 23 Aug 2022 09:17:22 +0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266000

            Bug ID: 266000
           Summary: noticeable higher i/o and cpu usage in 13.1 zfs on
                    root (virtualized)
           Product: Base System
           Version: 13.1-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: olle@dalnix.se

I've noticed, on a couple of Digital Ocean VMs, that after upgrading to
13.1-RELEASE-p1 the hypervisor graphs show much higher I/O and CPU usage than
normal. So far I've upgraded a couple of boxes. All are out-of-the-box ZFS on
root installs.

prison03:
It's a webserver with about a dozen jails for websites, one jail for a db, and
one for a proxy server. The jail roots and "web" data are on separate block
storage volumes. The storage volumes are also using ZFS.

First I noticed it was behaving a bit sluggish when doing simple tasks such as
running find. Second, I noticed the hypervisor graphs had much higher CPU usage
than normal. I talked to DO support a bit, tried moving it to another
hypervisor, etc. Didn't help. Then I rebooted to the "old" kernel, and cpu and
i/o went down (although I couldn't test this thoroughly, since it's a
production box, and pf wouldn't work when booting the old kernel). But running
find etc. was snappy as before.

I have some annotated screenshots of the hypervisor graphs here:
https://nextcloud.dalnix.se/index.php/s/8C9yrQqgGbSoQ37

After downgrading, it's snappy fast and the graphs are back to normal.

prison04:
Same here, much higher I/O after the upgrade. Graphs:
https://nextcloud.dalnix.se/index.php/s/r2A8JXcRJF97rZW
Only hosts one website. Normally not doing much.

prison08:
Sluggish. The graphs are way off for what it actually does. It's an old web
server with no traffic. The only things running are normal system stuff and
offloading some ZFS snapshots.
Since this is a box scheduled for destruction, I never noticed the high cpu,
so I have no before and after graphs.
https://nextcloud.dalnix.se/index.php/s/dRiF94ED2oYDCmt

last pid: 14617;  load averages: 3.24, 2.96, 2.85    up 30+21:30:47  09:02:23

Very weird, it doesn't do anything, still busy =D.

*******01db03:
This is a dedicated database server. With this one, the i/o and cpu got so bad
it made the server unusable when people actually started to use it. It's a DB
server for a GIS type app. Normally it doesn't have *that* much load. But the
(I'm guessing) i/o wait caused the DB server to stop responding.

I did some troubleshooting on the DB level and "fixed" the thing that was
causing it. I looked at slow queries, and one took longer and longer, to the
point of no return. So I added an index to a table, and that "fixed" it. But,
obviously, this application error wasn't a problem before 13.1. Graph
screenshots available here:
https://nextcloud.dalnix.se/index.php/s/9pNsyaJa62wRaiW

I have a bunch of physical hardware servers as well, but they do not appear to
have any issues. Also two upgraded (physical hardware) storage servers. They
all seem fine. No increased load.

Is this an OpenZFS 2.1.4 + kvm (I *think* DO uses kvm) bug?

-- 
You are receiving this mail because:
You are the assignee for the bug.
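
For anyone trying to reproduce the "sluggish find" symptom above, a minimal,
repeatable metadata-walk check could look like the sketch below. This is only
a sketch: the directory and file pattern are illustrative, not taken from the
affected boxes; any reasonably large tree on the ZFS root will do.

```shell
#!/bin/sh
# Sketch: time a metadata-heavy directory walk, to compare the old
# (13.0) and new (13.1) kernels on the same box. DIR is illustrative.
DIR="${1:-/usr/local}"

# Walk the tree and time it. On FreeBSD, /usr/bin/time with -l also
# reports resource usage (block input/output operations, max RSS),
# which is useful for comparing I/O between the two kernels.
/usr/bin/time find "$DIR" -type f -name '*.conf' > /dev/null

# While the walk runs, pool-level I/O can be watched from another
# terminal with standard tools, e.g.:
#   zpool iostat -v 5
#   gstat
```

Running the same walk once on each kernel should give directly comparable
wall-clock and I/O numbers, independent of the hypervisor's graphs.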