Subject: Re: An attempted test of main's "git: 2ad756a6bbb3" "merge openzfs/zfs@95f71c019" that did not go as planned
From: Mark Millard <marklmi@yahoo.com>
Date: Tue, 5 Sep 2023 00:00:51 -0700
To: Alexander Motin
Cc: dev-commits-src-main@freebsd.org, Current FreeBSD
Message-Id: <6990186B-E85B-4D64-8202-4B9ECBD876FD@yahoo.com>
In-Reply-To: <4E888EC8-4B06-4139-90DF-59AFBA700A5F@yahoo.com>

On Sep 4, 2023, at 22:06, Mark Millard wrote:

> On Sep 4, 2023, at 18:39, Mark Millard wrote:
>
>> On Sep 4, 2023, at 10:05, Alexander Motin wrote:
>>
>>> On 04.09.2023 11:45, Mark Millard wrote:
>>>> On Sep 4, 2023, at 06:09, Alexander Motin wrote:
>>>>> per_txg_dirty_frees_percent is directly related to the delete
>>>>> delays we see here. You are forcing ZFS to commit transactions
>>>>> every 5% of the dirty ARC limit, which is 5% of 10% of memory
>>>>> size. I haven't looked at that code recently, but I guess
>>>>> setting it too low can make ZFS commit transactions too often,
>>>>> increasing write inflation for the underlying storage. I would
>>>>> propose you restore the default and try again.
>>>> While this machine is different, the original problem was worse than
>>>> the issue here: the load average was less than 1 for most of the
>>>> parallel bulk build when 30 was used. The fraction of time spent
>>>> waiting was much larger than with 5. If I understand right, both too
>>>> high and too low a value for a given type of context can lead to
>>>> increased elapsed time, and getting it set to a near optimal value
>>>> is a non-obvious exploration.
>>>
>>> IIRC this limit was modified several times since it was originally
>>> implemented. Maybe it could benefit from another look, if the
>>> default 30% is not good. It would be good if generic ZFS issues
>>> like this were reported to OpenZFS upstream to be visible to a
>>> wider public. Unfortunately I have several other projects I must
>>> work on, so if it is not a regression I can't promise I'll take it
>>> right now, so anybody else is welcome.
>>
>> As I understand it, there are contexts where 5 is inappropriate
>> and 30 works fairly well: no good single answer as to what
>> value range will avoid problems.
>>
>>>> An overall point for the goal of my activity is: what makes a
>>>> good test context for checking if ZFS is again safe to use?
>>>> Maybe other tradeoffs make, say, 4 hardware threads more
>>>> reasonable than 32.
>>>
>>> Thank you for your testing. The best test is one that nobody else
>>> runs. It also correlates with the topic of "safe to use", which
>>> also depends on what it is used for. :)
>>
>> Looks like use of an M.2 Samsung SSD 960 PRO 1TB with a
>> non-debug FreeBSD build is suitable for the bulk -a -J128
>> test (no ALLOW_MAKE_JOBS variants enabled, USE_TMPFS=no in
>> use) on the 32 hardware thread system. (The swap partition
>> in use is from the normal environment's PCIe Optane media.)
>> The %idle, load average, and %user figures stayed reasonable
>> in a preliminary test. One thing it does introduce is trim
>> management (both available and potentially useful). (Optane
>> media does not support or need it.) No
>> per_txg_dirty_frees_percent adjustment involved (still 5).
>>
>> I've learned not to use ^T for fear of /bin/sh aborting
>> and messing up poudriere's context. So I now monitor with:
>>
>> # poudriere status -b
>>
>> in a separate ssh session.
>>
>> I'll note that I doubt I'd try for a complete bulk -a .
>> I'd probably stop it if I noticed the number of active
>> builders dropping off for a notable time (normal waiting
>> for prerequisites appearing to be the cause).
>
> So much for that idea. It has reached a state of staying
> under 3500 w/s and up to 4.5 ms/w (normally above 2 ms/w).
> %busy is wandering in the range 85% to 101%. Lots of top
> output showing tx->tx. There is some read and other activity
> as well. Of course the kBps figures are larger than in the
> earlier USB3 context (larger kB figures).
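An aside, in case anyone wants to experiment with the knob being
discussed: on FreeBSD it is exposed as a sysctl, so inspecting and
changing it at runtime looks like the sketch below. This is just my
understanding of the usual way to poke at it; the value shown is
only illustrative.

# sysctl vfs.zfs.per_txg_dirty_frees_percent
vfs.zfs.per_txg_dirty_frees_percent: 5
# sysctl vfs.zfs.per_txg_dirty_frees_percent=30

That makes comparing values across bulk runs fairly cheap; no
reboot is involved, so far as I know.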
>
> It reached about 1350 port->package builds over the first
> hour after the 2nd "Builders started".
>
> autotrim is off. Doing a "zpool trim -w zamd64" leads to
> . . . larger w/s figures during the process. So more
> exploring to do at some point. Possibly:
>
> autotrim
> per_txg_dirty_frees_percent
>
> For now, I'm just running "zpool trim -w zamd64" once
> in a while so the process continues better.
>
> Still no evidence of deadlocks. No evidence of builds
> failing from corruption.
>
> . . . At around the end of the 2nd hour: 2920 or so built.
>
> Still no evidence of deadlocks. No evidence of builds
> failing from corruption.
>
> . . . I've turned on autotrim without stopping the bulk
> build.
>
> . . . At around the end of the 3rd hour: 4080 or so built.
>
> Still no evidence of deadlocks. No evidence of builds
> failing from corruption.
>
> Looks like the %idle has been high for a significant
> time. I think I'll stop this specific test and clean
> things out.
>
> Looks like lang/guile* are examples of not respecting
> the lack of ALLOW_MAKE_JOBS use.
>
> Hmm. The ^C ended up with:
>
> ^C[03:41:07] Error: Signal SIGINT caught, cleaning up and exiting
> [main-amd64-bulk_a-default] [2023-09-04_17h49m28s] [sigint:] Queued: 34588 Built: 4331 Failed: 12 Skipped: 303 Ignored: 335 Fetched: 0 Tobuild: 29607 Time: 03:41:05
> [03:41:07] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-09-04_17h49m28s
> [03:41:19] [95] [00:06:59] Finished databases/php80-pdo_sqlite | php80-pdo_sqlite-8.0.30: Failed: build-depends
> [03:41:20] [60] [00:10:53] Finished www/php82-session | php82-session-8.2.9: Success
> [03:41:20] [19] [00:06:42] Finished databases/pecl-memcache@php80 | php80-pecl-memcache-8.2: Failed: build-depends
> [03:41:23] [62] [00:06:40] Finished net-mgmt/icingaweb2-module-incubator@php81 | icingaweb2-module-incubator-php81-0.20.0: Success
> [03:41:24] Cleaning up
> exit: cannot open ./var/run/116_nohang.pid: No such file or directory
> exit: cannot open ./var/run/93_nohang.pid: No such file or directory
>
> and JID 253's processes are stuck in the STOP state, along with the
> related JID 0 processes. These are for x11-toolkits/libXaw. I do
> not know how long it might have been stuck.
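For reference: stack listings in the style quoted below are the kind
of thing procstat can produce. A minimal example of how one might
gather such a listing for a single process, using the bulk.sh pid
from below purely as an illustration (the exact flags are my habit,
not the only option):

# procstat -kk 3896

Each line shows the pid, tid, command, thread name, and the kernel
call chain as symbol+offset entries.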
>
> /usr/local/libexec/poudriere/sh -e /usr/local/share/poudriere/bulk.sh -J128 -jmain-amd64-bulk_a -a
>
> is stuck in the killpg state:
>
> 3896 101784 sh - mi_switch+0xbb _sx_xlock_hard+0x3e1 killpg1+0x193 kern_kill+0x23c amd64_syscall+0x109 fast_syscall_common+0xf8
>
> Most recent process birth time listed first, for the STOPped ones:
>
> 86826 634820 sh - mi_switch+0xbb thread_suspend_switch+0xc3 cursig+0x7b9 ast_sig+0x1df ast_handler+0x88 ast+0x20 doreti_ast+0x1c
> 86825 321240 sh - mi_switch+0xbb thread_suspend_check+0x260 sig_intr+0x4c fork1+0x186 sys_fork+0x54 amd64_syscall+0x109 fast_syscall_common+0xf8
> 86550 308525 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 pipe_read+0x420 dofileread+0x86 sys_read+0xbe amd64_syscall+0x109 fast_syscall_common+0xf8
> 3998 637224 gmake - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
> 3976 298566 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
> 3974 309336 gmake - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
> 3962 577230 gmake - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
> 3958 635598 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
> 3915 668148 make - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
> 3912 638108 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
> 46142 212631 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_timedwait_sig+0x12 _cv_timedwait_sig_sbt+0x10c seltdwait+0x75 kern_select+0x95e sys_select+0x57 amd64_syscall+0x109 fast_syscall_common+0xf8

So I tried 30 for per_txg_dirty_frees_percent in 2 contexts:
autotrim on vs. autotrim off, where there was 100 GiByte+ to
delete after a poudriere bulk run.

autotrim on: takes a fair amount of time to delete even 1 GiByte
of the 100 GiByte+,

vs.

autotrim off: takes less time to delete more.

The difference is very visible via "gstat -spod" use.

autotrim on likely makes things less concurrent, somewhat like
USB3 storage having only one command outstanding to the device
at a time under FreeBSD. autotrim on seems to prevent the larger
unit of work from being an effective way to decrease the time
required, instead possibly increasing the time required.

That may be an example of the context dependency of what value
of per_txg_dirty_frees_percent to use to avoid wasting much time.

===
Mark Millard
marklmi at yahoo.com
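P.S. In case anyone wants to repeat the autotrim on vs. autotrim
off comparison: it amounts to something like the sketch below. The
pool name zamd64 is from my context; substitute your own. Timings
depend heavily on the specific device, so treat this as an outline
rather than a recipe.

# zpool get autotrim zamd64
# zpool set autotrim=off zamd64   # or autotrim=on for the other case
# ... delete the 100 GiByte+ of data ...
# gstat -spod                     # watch w/s, ms/w, and %busy while the frees drain
# zpool trim -w zamd64            # for autotrim=off: explicit trim, waiting for completion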