From nobody Sun Oct 24 16:59:25 2021 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id C581018178B5; Sun, 24 Oct 2021 16:59:25 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Hckp55CpMz4c4J; Sun, 24 Oct 2021 16:59:25 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 8F84018ECD; Sun, 24 Oct 2021 16:59:25 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 19OGxPUL020995; Sun, 24 Oct 2021 16:59:25 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 19OGxPkf020994; Sun, 24 Oct 2021 16:59:25 GMT (envelope-from git) Date: Sun, 24 Oct 2021 16:59:25 GMT Message-Id: <202110241659.19OGxPkf020994@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: "Jason A. Harmening" Subject: git: fd8ad2128dcd - main - unionfs: implement vnode-based cache lookup List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-main@freebsd.org X-BeenThere: dev-commits-src-main@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: jah X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: fd8ad2128dcd4fe7b9e52e181bfe4e2a24b0ab3a Auto-Submitted: auto-generated X-ThisMailContainsUnwantedMimeParts: N The branch main has been updated by jah: URL: https://cgit.FreeBSD.org/src/commit/?id=fd8ad2128dcd4fe7b9e52e181bfe4e2a24b0ab3a commit fd8ad2128dcd4fe7b9e52e181bfe4e2a24b0ab3a Author: Jason A. Harmening AuthorDate: 2021-10-17 02:16:25 +0000 Commit: Jason A. Harmening CommitDate: 2021-10-24 17:05:50 +0000 unionfs: implement vnode-based cache lookup unionfs uses a per-directory hashtable to cache subdirectory nodes. Currently this hashtable is looked up using the directory name, but since unionfs nodes aren't removed from the cache until they're reclaimed, this poses some problems. For example, if a directory is created on a unionfs mount shortly after deleting a previous directory with the same path, the cache may end up reusing the node for the previous directory, including its upper/lower FS vnodes. Operations against those vnodes with then likely fail because the vnodes represent deleted files; for example UFS will reject VOP_MKDIR() against such a vnode because its effective link count is 0. This may then manifest as e.g. mkdir(2) or open(2) returning ENOENT for an attempt to create a file under the re-created directory. While it would be possible to fix this by explicitly managing the name-based cache during delete or rename operations, or by rejecting cache hits if the underlying FS vnodes don't match those passed to unionfs_nodeget(), it seems cleaner to instead hash the unionfs nodes based on their underlying FS vnodes. Since unionfs prefers to operate against the upper vnode if one is present, the lower vnode will only be used for hashing as long as the upper vnode is NULL. This should also make hashing faster by eliminating string traversal and using the already-computed hash index stored in each vnode. While here, fix a couple of other cache-related issues: --Remove 8 bytes of unnecessary baggage from each unionfs node by getting rid of the stored hash mask field. The mask is knowable at compile time. --When a matching node is found in the cache, reference its vnode using vrefl() while still holding the vnode interlock. Previously unionfs_nodeget() would vref() the vnode after the interlock was dropped, but the vnode may be reclaimed during that window. This caused intermittent panics from vn_lock(9) during unionfs stress testing. Reviewed by: kib, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D32533 --- sys/fs/unionfs/union.h | 1 - sys/fs/unionfs/union_subr.c | 148 +++++++++++++++++++++++--------------------- 2 files changed, 79 insertions(+), 70 deletions(-) diff --git a/sys/fs/unionfs/union.h b/sys/fs/unionfs/union.h index 79469a5957f4..fe611dea1461 100644 --- a/sys/fs/unionfs/union.h +++ b/sys/fs/unionfs/union.h @@ -96,7 +96,6 @@ struct unionfs_node { STAILQ_ENTRY(unionfs_node) un_rele; /* deferred release list */ }; - u_long un_hashmask; /* bit mask */ char *un_path; /* path */ int un_pathlen; /* strlen of path */ int un_flag; /* unionfs node flag */ diff --git a/sys/fs/unionfs/union_subr.c b/sys/fs/unionfs/union_subr.c index 380a61819f1a..326483bcc400 100644 --- a/sys/fs/unionfs/union_subr.c +++ b/sys/fs/unionfs/union_subr.c @@ -64,6 +64,7 @@ #include #define NUNIONFSNODECACHE 16 +#define UNIONFSHASHMASK (NUNIONFSNODECACHE - 1) static MALLOC_DEFINE(M_UNIONFSHASH, "UNIONFS hash", "UNIONFS hash table"); MALLOC_DEFINE(M_UNIONFSNODE, "UNIONFS node", "UNIONFS vnode private part"); @@ -129,69 +130,83 @@ unionfs_deferred_rele(void *arg __unused, int pending __unused) } static struct unionfs_node_hashhead * -unionfs_get_hashhead(struct vnode *dvp, char *path) +unionfs_get_hashhead(struct vnode *dvp, struct vnode *lookup) { struct unionfs_node *unp; - int count; - char hash; - hash = 0; unp = VTOUNIONFS(dvp); - if (path != NULL) { - for (count = 0; path[count]; count++) - hash += path[count]; - } - return (&(unp->un_hashtbl[hash & (unp->un_hashmask)])); + return (&(unp->un_hashtbl[vfs_hash_index(lookup) & UNIONFSHASHMASK])); } /* - * Get the cached vnode. + * Attempt to lookup a cached unionfs vnode by upper/lower vp + * from dvp, with dvp's interlock held. */ static struct vnode * -unionfs_get_cached_vnode(struct vnode *uvp, struct vnode *lvp, - struct vnode *dvp, char *path) +unionfs_get_cached_vnode_locked(struct vnode *lookup, struct vnode *dvp) { - struct unionfs_node_hashhead *hd; struct unionfs_node *unp; + struct unionfs_node_hashhead *hd; struct vnode *vp; - KASSERT((uvp == NULLVP || uvp->v_type == VDIR), - ("unionfs_get_cached_vnode: v_type != VDIR")); - KASSERT((lvp == NULLVP || lvp->v_type == VDIR), - ("unionfs_get_cached_vnode: v_type != VDIR")); + hd = unionfs_get_hashhead(dvp, lookup); - VI_LOCK(dvp); - hd = unionfs_get_hashhead(dvp, path); LIST_FOREACH(unp, hd, un_hash) { - if (!strcmp(unp->un_path, path)) { + if ((unp->un_uppervp == lookup) || + (unp->un_lowervp == lookup)) { vp = UNIONFSTOV(unp); VI_LOCK_FLAGS(vp, MTX_DUPOK); - VI_UNLOCK(dvp); vp->v_iflag &= ~VI_OWEINACT; if (VN_IS_DOOMED(vp) || ((vp->v_iflag & VI_DOINGINACT) != 0)) { VI_UNLOCK(vp); vp = NULLVP; - } else + } else { + vrefl(vp); VI_UNLOCK(vp); + } return (vp); } } - VI_UNLOCK(dvp); return (NULLVP); } + +/* + * Get the cached vnode. + */ +static struct vnode * +unionfs_get_cached_vnode(struct vnode *uvp, struct vnode *lvp, + struct vnode *dvp) +{ + struct vnode *vp; + + KASSERT((uvp == NULLVP || uvp->v_type == VDIR), + ("unionfs_get_cached_vnode: v_type != VDIR")); + KASSERT((lvp == NULLVP || lvp->v_type == VDIR), + ("unionfs_get_cached_vnode: v_type != VDIR")); + + vp = NULLVP; + VI_LOCK(dvp); + if (uvp != NULLVP) + vp = unionfs_get_cached_vnode_locked(uvp, dvp); + else if (lvp != NULLVP) + vp = unionfs_get_cached_vnode_locked(lvp, dvp); + VI_UNLOCK(dvp); + + return (vp); +} + /* * Add the new vnode into cache. */ static struct vnode * unionfs_ins_cached_vnode(struct unionfs_node *uncp, - struct vnode *dvp, char *path) + struct vnode *dvp) { struct unionfs_node_hashhead *hd; - struct unionfs_node *unp; struct vnode *vp; KASSERT((uncp->un_uppervp==NULLVP || uncp->un_uppervp->v_type==VDIR), @@ -199,29 +214,20 @@ unionfs_ins_cached_vnode(struct unionfs_node *uncp, KASSERT((uncp->un_lowervp==NULLVP || uncp->un_lowervp->v_type==VDIR), ("unionfs_ins_cached_vnode: v_type != VDIR")); + vp = NULLVP; VI_LOCK(dvp); - hd = unionfs_get_hashhead(dvp, path); - LIST_FOREACH(unp, hd, un_hash) { - if (!strcmp(unp->un_path, path)) { - vp = UNIONFSTOV(unp); - VI_LOCK_FLAGS(vp, MTX_DUPOK); - vp->v_iflag &= ~VI_OWEINACT; - if (VN_IS_DOOMED(vp) || - ((vp->v_iflag & VI_DOINGINACT) != 0)) { - LIST_INSERT_HEAD(hd, uncp, un_hash); - VI_UNLOCK(vp); - vp = NULLVP; - } else - VI_UNLOCK(vp); - VI_UNLOCK(dvp); - return (vp); - } + if (uncp->un_uppervp != NULL) + vp = unionfs_get_cached_vnode_locked(uncp->un_uppervp, dvp); + else if (uncp->un_lowervp != NULL) + vp = unionfs_get_cached_vnode_locked(uncp->un_lowervp, dvp); + if (vp == NULLVP) { + hd = unionfs_get_hashhead(dvp, (uncp->un_uppervp != NULLVP ? + uncp->un_uppervp : uncp->un_lowervp)); + LIST_INSERT_HEAD(hd, uncp, un_hash); } - - LIST_INSERT_HEAD(hd, uncp, un_hash); VI_UNLOCK(dvp); - return (NULLVP); + return (vp); } /* @@ -233,13 +239,13 @@ unionfs_rem_cached_vnode(struct unionfs_node *unp, struct vnode *dvp) KASSERT((unp != NULL), ("unionfs_rem_cached_vnode: null node")); KASSERT((dvp != NULLVP), ("unionfs_rem_cached_vnode: null parent vnode")); - KASSERT((unp->un_hash.le_prev != NULL), - ("unionfs_rem_cached_vnode: null hash")); VI_LOCK(dvp); - LIST_REMOVE(unp, un_hash); - unp->un_hash.le_next = NULL; - unp->un_hash.le_prev = NULL; + if (unp->un_hash.le_prev != NULL) { + LIST_REMOVE(unp, un_hash); + unp->un_hash.le_next = NULL; + unp->un_hash.le_prev = NULL; + } VI_UNLOCK(dvp); } @@ -281,7 +287,7 @@ unionfs_nodeget_cleanup(struct vnode *vp, void *arg) if (unp->un_lowervp != NULLVP) vput(unp->un_lowervp); if (unp->un_hashtbl != NULL) - hashdestroy(unp->un_hashtbl, M_UNIONFSHASH, unp->un_hashmask); + hashdestroy(unp->un_hashtbl, M_UNIONFSHASH, UNIONFSHASHMASK); free(unp->un_path, M_UNIONFSPATH); free(unp, M_UNIONFSNODE); } @@ -302,6 +308,7 @@ unionfs_nodeget(struct mount *mp, struct vnode *uppervp, struct unionfs_mount *ump; struct unionfs_node *unp; struct vnode *vp; + u_long hashmask; int error; int lkflags; enum vtype vt; @@ -322,10 +329,9 @@ unionfs_nodeget(struct mount *mp, struct vnode *uppervp, path = NULL; /* check the cache */ - if (path != NULL && dvp != NULLVP && vt == VDIR) { - vp = unionfs_get_cached_vnode(uppervp, lowervp, dvp, path); + if (dvp != NULLVP && vt == VDIR) { + vp = unionfs_get_cached_vnode(uppervp, lowervp, dvp); if (vp != NULLVP) { - vref(vp); *vpp = vp; goto unionfs_nodeget_out; } @@ -352,9 +358,12 @@ unionfs_nodeget(struct mount *mp, struct vnode *uppervp, if (lowervp != NULLVP) vref(lowervp); - if (vt == VDIR) + if (vt == VDIR) { unp->un_hashtbl = hashinit(NUNIONFSNODECACHE, M_UNIONFSHASH, - &(unp->un_hashmask)); + &hashmask); + KASSERT(hashmask == UNIONFSHASHMASK, + ("unexpected unionfs hash mask 0x%lx", hashmask)); + } unp->un_vnode = vp; unp->un_uppervp = uppervp; @@ -395,12 +404,12 @@ unionfs_nodeget(struct mount *mp, struct vnode *uppervp, unionfs_nodeget_cleanup(vp, unp); return (ENOENT); } - if (path != NULL && dvp != NULLVP && vt == VDIR) - *vpp = unionfs_ins_cached_vnode(unp, dvp, path); + + if (dvp != NULLVP && vt == VDIR) + *vpp = unionfs_ins_cached_vnode(unp, dvp); if (*vpp != NULLVP) { unionfs_nodeget_cleanup(vp, unp); vp = *vpp; - vref(vp); } else { if (uppervp != NULL) VOP_UNLOCK(uppervp); @@ -457,7 +466,7 @@ unionfs_noderem(struct vnode *vp, struct thread *td) if (uvp != NULLVP) VOP_UNLOCK(uvp); - if (dvp != NULLVP && unp->un_hash.le_prev != NULL) + if (dvp != NULLVP) unionfs_rem_cached_vnode(unp, dvp); if (lockmgr(vp->v_vnlock, LK_EXCLUSIVE, VI_MTX(vp)) != 0) @@ -474,7 +483,7 @@ unionfs_noderem(struct vnode *vp, struct thread *td) } if (unp->un_hashtbl != NULL) { - for (count = 0; count <= unp->un_hashmask; count++) { + for (count = 0; count <= UNIONFSHASHMASK; count++) { hd = unp->un_hashtbl + count; LIST_FOREACH_SAFE(unp_t1, hd, un_hash, unp_t2) { LIST_REMOVE(unp_t1, un_hash); @@ -482,7 +491,7 @@ unionfs_noderem(struct vnode *vp, struct thread *td) unp_t1->un_hash.le_prev = NULL; } } - hashdestroy(unp->un_hashtbl, M_UNIONFSHASH, unp->un_hashmask); + hashdestroy(unp->un_hashtbl, M_UNIONFSHASH, UNIONFSHASHMASK); } LIST_FOREACH_SAFE(unsp, &(unp->un_unshead), uns_list, unsp_tmp) { @@ -777,6 +786,7 @@ static void unionfs_node_update(struct unionfs_node *unp, struct vnode *uvp, struct thread *td) { + struct unionfs_node_hashhead *hd; struct vnode *vp; struct vnode *lvp; struct vnode *dvp; @@ -799,16 +809,16 @@ unionfs_node_update(struct unionfs_node *unp, struct vnode *uvp, vn_lock(uvp, LK_EXCLUSIVE | LK_CANRECURSE | LK_RETRY); /* - * cache update + * Re-cache the unionfs vnode against the upper vnode */ - if (unp->un_path != NULL && dvp != NULLVP && vp->v_type == VDIR) { - static struct unionfs_node_hashhead *hd; - + if (dvp != NULLVP && vp->v_type == VDIR) { VI_LOCK(dvp); - hd = unionfs_get_hashhead(dvp, unp->un_path); - LIST_REMOVE(unp, un_hash); - LIST_INSERT_HEAD(hd, unp, un_hash); - VI_UNLOCK(dvp); + if (unp->un_hash.le_prev != NULL) { + LIST_REMOVE(unp, un_hash); + hd = unionfs_get_hashhead(dvp, uvp); + LIST_INSERT_HEAD(hd, unp, un_hash); + } + VI_UNLOCK(unp->un_dvp); } }