List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Subject: Re: ZFS deadlock in 14
From: Mark Millard
Date: Thu, 24 Aug 2023 12:22:00 -0700
To: Alexander Motin
Cc: Current FreeBSD
Message-Id: <40D0C681-C28B-47C2-B913-90A56CFD69D4@yahoo.com>
In-Reply-To: <1AC87B79-6B65-402B-B65F-CCFFCC503861@yahoo.com>
References: <4FFAE432-21FE-4462-9162-9CC30A5D470A.ref@yahoo.com>
 <4FFAE432-21FE-4462-9162-9CC30A5D470A@yahoo.com>
 <1AC87B79-6B65-402B-B65F-CCFFCC503861@yahoo.com>

On Aug 23, 2023, at 13:37, Mark Millard wrote:

> On Aug 23, 2023, at 11:40, Alexander Motin wrote:
>
>> On 22.08.2023 14:24, Mark Millard wrote:
>>> Alexander Motin wrote on
>>> Date: Tue, 22 Aug 2023 16:18:12 UTC :
>>>> I am waiting for final test results from George Wilson and then will
>>>> request quick merge of both to zfs-2.2-release branch. Unfortunately
>>>> there are still not many reviewers for the PR, since the code is not
>>>> trivial, but at least with the test reports Brian Behlendorf and Mark
>>>> Maybee seem to be OK to merge the two PRs into 2.2. If somebody else
>>>> have tested and/or reviewed the PR, you may comment on it.
>>> I had written to the list that when I tried to test the system
>>> doing poudriere builds (initially with your patches) using
>>> USE_TMPFS=no so that zfs had to deal with all the file I/O, I
>>> instead got only one builder that ended up active, the others
>>> never reaching "Builder started":
>>
>>> Top was showing lots of "vlruwk" for the cpdup's. For example:
>>> . . .
>>>   362  0 root  40  0 27076Ki 13776Ki CPU19  19  4:23  0.00% cpdup -i0 -o ref 32
>>>   349  0 root  53  0 27076Ki 13776Ki vlruwk 22  4:20  0.01% cpdup -i0 -o ref 31
>>>   328  0 root  68  0 27076Ki 13804Ki vlruwk  8  4:30  0.01% cpdup -i0 -o ref 30
>>>   304  0 root  37  0 27076Ki 13792Ki vlruwk  6  4:18  0.01% cpdup -i0 -o ref 29
>>>   282  0 root  42  0 33220Ki 13956Ki vlruwk  8  4:33  0.01% cpdup -i0 -o ref 28
>>>   242  0 root  56  0 27076Ki 13796Ki vlruwk  4  4:28  0.00% cpdup -i0 -o ref 27
>>> . . .
>>> But those processes did show CPU?? on occasion, as well as
>>> *vnode less often. None of the cpdup's was stuck in
>>> Removing your patches did not change the behavior.
>>
>> Mark, to me "vlruwk" looks like a limit on number of vnodes. I was
>> not deep in that area at least recently, so somebody with more
>> experience there could try to diagnose it. At very least it does not
>> look related to the ZIL issue discussed in this thread, at least with
>> the information provided, so I am not surprised that the mentioned
>> patches do not affect it.
>
> Thanks for the information. Good to know. I'll redirect this to be a
> different discussion.

Mateusz Guzik had me revert 138a5dafba31 (which is for
sys/kern/vfs_subr.c), which was enough to allow me to run
bulk -a with USE_TMPFS=no usefully. (There is now a new
sys/kern/vfs_subr.c patch for me to try instead.)

So I used the reverted context to test without your patches,
to see if I'd get a deadlock from a bulk -a with USE_TMPFS=no.
It is past 9200 finished after about 18 hrs of building.
No deadlock. (I do not plan on letting the bulk -a run to
completion.)

The 3 load averages are normally over 100 and the MaxObs
figures for the 3 are currently: 349.68, 264.30, 243.16
(for a 32 hardware-thread system).

So it looks like, when I try again with Mateusz's new patch,
trying with your patches as well would not be much of a test
of preventing deadlocks in this context. It would be more of
a cross check on whether other types of issues show up. It is
not clear how useful such testing might be.

It might be that the high load average bulk -a style makes
the deadlocks in question less likely for some reason.

===
Mark Millard
marklmi at yahoo.com