ZFS leaking vnodes (sort of)
Pawel Jakub Dawidek
pjd at FreeBSD.org
Mon Jul 9 00:27:31 UTC 2007
On Sat, Jul 07, 2007 at 02:26:17PM +0100, Doug Rabson wrote:
> I've been testing ZFS recently and I noticed some performance issues
> while doing large-scale port builds on a ZFS mounted /usr/ports tree.
> Eventually I realised that virtually nothing ever ended up on the vnode
> free list. This meant that when the system reached its maximum vnode
> limit, it had to resort to reclaiming vnodes from the various
> filesystems' active vnode lists (via vlrureclaim). Since those lists
> are not sorted in LRU order, this led to pessimal cache performance
> after the system got into that state.
>
> I looked a bit closer at the ZFS code and poked around with DDB and I
> think the problem was caused by a couple of extraneous calls to vhold
> when creating a new ZFS vnode. On FreeBSD, getnewvnode returns a vnode
> which is already held (not on the free list) so there is no need to
> call vhold again.
Whoa! Nice catch... The patch works here - I did some pretty heavy
tests, so please commit it ASAP.
I also wonder if this can help with some of those 'kmem_map too small'
panics. I had observed that the ARC cannot reclaim memory, and this may be
because all vnodes, and thus their associated data, are being held.
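To illustrate the hold-count problem described in the quoted text, here is a
toy userland model in C. It is only a sketch and not FreeBSD kernel code: the
toy_* names and the two-field struct are hypothetical, and the real vnode
lifecycle (use counts, interlocks, the actual free list) is considerably more
involved. The model assumes only what the quoted text states: a new vnode
comes back already held, and a vnode can go on the free list only once its
last hold is dropped.

```c
/* Toy model of vnode hold counting. NOT kernel code; all names hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_vnode {
	int  holdcnt;      /* number of outstanding holds */
	bool on_freelist;  /* set once the last hold has been dropped */
};

/* Models the quoted behaviour: a new vnode is returned already held. */
void
toy_getnewvnode(struct toy_vnode *vp)
{
	vp->holdcnt = 1;
	vp->on_freelist = false;
}

/* Take an additional hold; a held vnode is never on the free list. */
void
toy_vhold(struct toy_vnode *vp)
{
	vp->holdcnt++;
	vp->on_freelist = false;
}

/* Drop one hold; the vnode becomes free only when the count reaches zero. */
void
toy_vdrop(struct toy_vnode *vp)
{
	assert(vp->holdcnt > 0);
	if (--vp->holdcnt == 0)
		vp->on_freelist = true;
}

/*
 * Create a vnode, optionally take the redundant extra hold, then release
 * the single hold the creator legitimately owns. Returns true if the vnode
 * ended up on the free list.
 */
bool
toy_reaches_freelist(bool extra_hold)
{
	struct toy_vnode vn;

	toy_getnewvnode(&vn);
	if (extra_hold)
		toy_vhold(&vn);   /* the unbalanced hold */
	toy_vdrop(&vn);
	return (vn.on_freelist);
}
```

In this model, toy_reaches_freelist(false) returns true, while
toy_reaches_freelist(true) returns false: the unbalanced hold leaves the count
at one, so the vnode never becomes free.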
To ZFS users having problems with performance and/or stability of ZFS:
Can you test the patch and see if it helps?
> This patch appears to fix the problem (only very lightly tested):
>
> Index: zfs_vnops.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c,v
> retrieving revision 1.22
> diff -u -r1.22 zfs_vnops.c
> --- zfs_vnops.c 28 May 2007 02:37:43 -0000 1.22
> +++ zfs_vnops.c 7 Jul 2007 13:01:41 -0000
> @@ -3493,7 +3493,7 @@
> rele = 0;
> vp->v_data = NULL;
> ASSERT(vp->v_holdcnt > 1);
> - vdropl(vp);
> + VI_UNLOCK(vp);
> if (!zp->z_unlinked && rele)
> VFS_RELE(zfsvfs->z_vfs);
> return (0);
> Index: zfs_znode.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c,v
> retrieving revision 1.8
> diff -u -r1.8 zfs_znode.c
> --- zfs_znode.c 6 May 2007 19:05:37 -0000 1.8
> +++ zfs_znode.c 7 Jul 2007 13:17:32 -0000
> @@ -115,7 +115,6 @@
> ASSERT(error == 0);
> zp->z_vnode = vp;
> vp->v_data = (caddr_t)zp;
> - vhold(vp);
> vp->v_vnlock->lk_flags |= LK_CANRECURSE;
> vp->v_vnlock->lk_flags &= ~LK_NOSHARE;
> } else {
> @@ -601,7 +600,6 @@
> ASSERT(err == 0);
> vp = ZTOV(zp);
> vp->v_data = (caddr_t)zp;
> - vhold(vp);
> vp->v_vnlock->lk_flags |= LK_CANRECURSE;
> vp->v_vnlock->lk_flags &= ~LK_NOSHARE;
> vp->v_type = IFTOVT((mode_t)zp->z_phys->zp_mode);
--
Pawel Jakub Dawidek http://www.wheel.pl
pjd at FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!