From: Doug Ambrisko <ambrisko@ambrisko.com>
To: freebsd-current@freebsd.org
Date: Mon, 18 Apr 2022 16:32:38 -0700
Subject: nullfs and ZFS issues
List-Archive: https://lists.freebsd.org/archives/freebsd-current

I've switched my laptop to use nullfs and ZFS.  Previously, I used
localhost NFS mounts instead of nullfs when nullfs would complain that
it couldn't mount.  Since that check has been removed, I've switched to
nullfs only.  However, every so often my laptop would get slow and the
ARC evict and prune threads would consume two cores at 100% until I
rebooted.  I had a 1G max ARC and have increased it to 2G now.  Looking
into this has uncovered some issues:

  -  nullfs would prevent vnlru_free_vfsops from doing anything when
     called from ZFS arc_prune_task
  -  nullfs would hang onto a bunch of vnodes unless mounted with
     nocache
  -  nullfs and nocache would break untar.  This has been fixed now.

With nullfs, nocache and setting the max vnodes to a low number, I can
keep the ARC around its maximum without evict and prune consuming 100%
of two cores.  This doesn't seem like the best solution, but it is
better than when the ARC starts spinning.

Looking into this issue with bhyve and an md drive for testing, I
create a brand new zpool mounted as /test and then nullfs mount /test
to /mnt.  I loop through untarring the Linux kernel into the nullfs
mount, rm -rf it, and repeat (a rough sketch of the loop is included
below).  I set the ARC to the smallest value I can.  Untarring the
Linux kernel was enough to get the ARC evict and prune threads to spin,
since they couldn't evict/prune anything.

Looking at vnlru_free_vfsops called from ZFS arc_prune_task, I see:

  static int
  vnlru_free_impl(int count, struct vfsops *mnt_op, struct vnode *mvp)
  {
          ...
          for (;;) {
                  ...
                  vp = TAILQ_NEXT(vp, v_vnodelist);
                  ...

                  /*
                   * Don't recycle if our vnode is from different type
                   * of mount point.  Note that mp is type-safe, the
                   * check does not reach unmapped address even if
                   * vnode is reclaimed.
                   */
                  if (mnt_op != NULL && (mp = vp->v_mount) != NULL &&
                      mp->mnt_op != mnt_op) {
                          continue;
                  }
                  ...

The vp ends up being on the nullfs mount and then hits the continue,
even though the passed-in mvp is on ZFS.  If I do a hack to comment out
the continue, then I see the ARC, the nullfs vnodes and the ZFS vnodes
grow.  When the ARC calls arc_prune_task, which calls
vnlru_free_vfsops, the vnode counts now go down for both nullfs and
ZFS, and the ARC cache usage goes down as well.  Then they increase
again until the ARC gets full, and then they go down again.  So with
this hack I don't need nocache passed to nullfs and I don't need to
limit the max vnodes.  Doing multiple untars in parallel over and over
doesn't seem to cause any issues in this test.

I'm not saying commenting out the continue is the fix; it is just a
simple proof-of-concept test.  It appears that when ZFS asks for cached
vnodes to be freed, nullfs also needs to free some up, so that they are
freed at the VFS level.  It seems that vnlru_free_impl should allow
some of the related nullfs vnodes to be freed so that the ZFS ones can
be freed and the size of the ARC reduced.
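To be explicit, the proof-of-concept hack is nothing more than
disabling that mount-type check, roughly like this (the exact context
in sys/kern/vfs_subr.c may differ a bit from the excerpt above):

                  if (mnt_op != NULL && (mp = vp->v_mount) != NULL &&
                      mp->mnt_op != mnt_op) {
                          /*
                           * XXX proof of concept only, not a proposed
                           * fix: don't skip vnodes that belong to a
                           * different mount type, so nullfs vnodes
                           * stacked over ZFS can be recycled too when
                           * arc_prune_task asks for vnodes to be
                           * freed.
                           */
                          /* continue; */
                  }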
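For reference, the untar loop described earlier is roughly the
following; the md size, the kernel tarball name and the ARC value are
illustrative rather than exactly what I ran:

  # rough sketch of the bhyve/md test setup; adjust sizes/paths to taste
  mdconfig -a -t swap -s 10g              # prints the md unit, e.g. md0
  zpool create -m /test test md0          # brand new pool mounted on /test
  mount_nullfs /test /mnt                 # nullfs view of the pool
  sysctl vfs.zfs.arc_max=67108864         # shrink the ARC as far as it will let me

  while true; do
          tar xf linux.tar.xz -C /mnt     # untar a Linux kernel through nullfs
          rm -rf /mnt/linux-*             # and remove it again
  done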
BTW, I also hacked the kernel and mount to show the vnodes used per
mount, i.e. mount -v:

  test on /test (zfs, NFS exported, local, nfsv4acls, fsid 2b23b2a1de21ed66, vnodes: count 13846 lazy 0)
  /test on /mnt (nullfs, NFS exported, local, nfsv4acls, fsid 11ff002929000000, vnodes: count 13846 lazy 0)

Now I can easily see how the vnodes are used without going into ddb.

On my laptop I have various vnet jails and nullfs mount my home
directory into them, so pretty much everything goes through nullfs to
ZFS.  I'm limping along with nullfs nocache and a small number of
vnodes, but it would be nice not to need that.

Thanks,

Doug A.