From nobody Sun Nov 14 02:57:00 2021 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id F3EF818524AF; Sun, 14 Nov 2021 02:57:00 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4HsH6N6YCyz4vLl; Sun, 14 Nov 2021 02:57:00 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id C0CCF1ECFC; Sun, 14 Nov 2021 02:57:00 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 1AE2v0qS097283; Sun, 14 Nov 2021 02:57:00 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 1AE2v0h5097282; Sun, 14 Nov 2021 02:57:00 GMT (envelope-from git) Date: Sun, 14 Nov 2021 02:57:00 GMT Message-Id: <202111140257.1AE2v0h5097282@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: Rick Macklem Subject: git: 00ec7c6ab526 - stable/13 - nfscl: Fix race between Lookup and Setattr of size List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-all@freebsd.org X-BeenThere: dev-commits-src-all@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: rmacklem X-Git-Repository: src X-Git-Refname: refs/heads/stable/13 X-Git-Reftype: branch X-Git-Commit: 00ec7c6ab526e0a8b10f57c578f423d647c4d93b Auto-Submitted: auto-generated X-ThisMailContainsUnwantedMimeParts: N The branch stable/13 has been updated by rmacklem: URL: https://cgit.FreeBSD.org/src/commit/?id=00ec7c6ab526e0a8b10f57c578f423d647c4d93b commit 00ec7c6ab526e0a8b10f57c578f423d647c4d93b Author: Rick Macklem AuthorDate: 2021-10-30 23:35:02 +0000 Commit: Rick Macklem CommitDate: 2021-11-14 02:52:26 +0000 nfscl: Fix race between Lookup and Setattr of size PR#259071 provides a test program that fails for the NFS client. Testing with it, there appears to be a race between Lookup and VOPs like Setattr-of-size, where Lookup ends up loading stale attributes (including what might be the wrong file size) into the NFS vnode's attribute cache. The race occurs when the modifying VOP (which holds a lock on the vnode), blocks the acquisition of the vnode in Lookup, after the RPC (with now potentially stale attributes). Here's what seems to happen: Child Parent does stat(), which does VOP_LOOKUP(), doing the Lookup RPC with the directory vnode locked, acquiring file attributes valid at this point in time blocks waiting for locked file does ftruncate(), which vnode does VOP_SETATTR() of Size, changing the file's size while holding an exclusive lock on the file's vnode releases the vnode lock acquires file vnode and fills in now stale attributes including the old wrong Size does a read() which returns wrong data size This patch fixes the problem by saving a timestamp in the NFS vnode in the VOPs that modify the file (Setattr-of-size, Allocate). Then lookup/readdirplus compares that timestamp with the time just before starting the RPC after it has acquired the file's vnode. If the modifying RPC occurred during the Lookup, the attributes in the RPC reply are discarded, since they might be stale. With this patch the test program works as expected. Note that the test program does not fail on a July stable/12, although this race is in the NFS client code. I suspect a fairly recent change to the name caching code exposed this bug. PR: 259071 (cherry picked from commit 2be417843a04f25e631e99d5188eb2652b13d80d) --- sys/fs/nfsclient/nfs_clrpcops.c | 29 +++++++++++++-- sys/fs/nfsclient/nfs_clvnops.c | 78 +++++++++++++++++++++++++++++++++++++---- sys/fs/nfsclient/nfsnode.h | 1 + 3 files changed, 99 insertions(+), 9 deletions(-) diff --git a/sys/fs/nfsclient/nfs_clrpcops.c b/sys/fs/nfsclient/nfs_clrpcops.c index 94f7496378c5..6489569f8acf 100644 --- a/sys/fs/nfsclient/nfs_clrpcops.c +++ b/sys/fs/nfsclient/nfs_clrpcops.c @@ -3384,7 +3384,8 @@ nfsrpc_readdirplus(vnode_t vp, struct uio *uiop, nfsuint64 *cookiep, nfsattrbit_t attrbits, dattrbits; size_t tresid; u_int32_t *tl2 = NULL, rderr; - struct timespec dctime; + struct timespec dctime, ts; + bool attr_ok; KASSERT(uiop->uio_iovcnt == 1 && (uiop->uio_resid & (DIRBLKSIZ - 1)) == 0, @@ -3569,6 +3570,7 @@ nfsrpc_readdirplus(vnode_t vp, struct uio *uiop, nfsuint64 *cookiep, *tl = txdr_unsigned(NFSV4OP_GETATTR); (void) nfsrv_putattrbit(nd, &dattrbits); } + nanouptime(&ts); error = nfscl_request(nd, vp, p, cred, stuff); if (error) return (error); @@ -3726,6 +3728,7 @@ nfsrpc_readdirplus(vnode_t vp, struct uio *uiop, nfsuint64 *cookiep, ncookie.lval[1]; if (nfhp != NULL) { + attr_ok = true; if (NFSRV_CMPFH(nfhp->nfh_fh, nfhp->nfh_len, dnp->n_fhp->nfh_fh, dnp->n_fhp->nfh_len)) { VREF(vp); @@ -3755,12 +3758,32 @@ nfsrpc_readdirplus(vnode_t vp, struct uio *uiop, nfsuint64 *cookiep, if (!error) { newvp = NFSTOV(np); unlocknewvp = 1; + /* + * If n_localmodtime >= time before RPC, + * then a file modification operation, + * such as VOP_SETATTR() of size, has + * occurred while the Lookup RPC and + * acquisition of the vnode happened. As + * such, the attributes might be stale, + * with possibly an incorrect size. + */ + NFSLOCKNODE(np); + if (timespecisset( + &np->n_localmodtime) && + timespeccmp(&np->n_localmodtime, + &ts, >=)) { + NFSCL_DEBUG(4, "nfsrpc_readdirplus:" + " localmod stale attributes\n"); + attr_ok = false; + } + NFSUNLOCKNODE(np); } } nfhp = NULL; if (newvp != NULLVP) { - error = nfscl_loadattrcache(&newvp, - &nfsva, NULL, NULL, 0, 0); + if (attr_ok) + error = nfscl_loadattrcache(&newvp, + &nfsva, NULL, NULL, 0, 0); if (error) { if (unlocknewvp) vput(newvp); diff --git a/sys/fs/nfsclient/nfs_clvnops.c b/sys/fs/nfsclient/nfs_clvnops.c index 0ab02af9e5e8..eab6eed82830 100644 --- a/sys/fs/nfsclient/nfs_clvnops.c +++ b/sys/fs/nfsclient/nfs_clvnops.c @@ -1012,6 +1012,7 @@ nfs_setattr(struct vop_setattr_args *ap) struct vattr *vap = ap->a_vap; int error = 0; u_quad_t tsize; + struct timespec ts; #ifndef nolint tsize = (u_quad_t)0; @@ -1106,11 +1107,18 @@ nfs_setattr(struct vop_setattr_args *ap) NFSUNLOCKNODE(np); } error = nfs_setattrrpc(vp, vap, ap->a_cred, td); - if (error && vap->va_size != VNOVAL) { - NFSLOCKNODE(np); - np->n_size = np->n_vattr.na_size = tsize; - vnode_pager_setsize(vp, tsize); - NFSUNLOCKNODE(np); + if (vap->va_size != VNOVAL) { + if (error == 0) { + nanouptime(&ts); + NFSLOCKNODE(np); + np->n_localmodtime = ts; + NFSUNLOCKNODE(np); + } else { + NFSLOCKNODE(np); + np->n_size = np->n_vattr.na_size = tsize; + vnode_pager_setsize(vp, tsize); + NFSUNLOCKNODE(np); + } } return (error); } @@ -1167,7 +1175,7 @@ nfs_lookup(struct vop_lookup_args *ap) struct nfsfh *nfhp; struct nfsvattr dnfsva, nfsva; struct vattr vattr; - struct timespec nctime; + struct timespec nctime, ts; *vpp = NULLVP; if ((flags & ISLASTCN) && (mp->mnt_flag & MNT_RDONLY) && @@ -1271,6 +1279,7 @@ nfs_lookup(struct vop_lookup_args *ap) newvp = NULLVP; NFSINCRGLOBAL(nfsstatsv1.lookupcache_misses); + nanouptime(&ts); error = nfsrpc_lookup(dvp, cnp->cn_nameptr, cnp->cn_namelen, cnp->cn_cred, td, &dnfsva, &nfsva, &nfhp, &attrflag, &dattrflag, NULL); @@ -1337,6 +1346,22 @@ nfs_lookup(struct vop_lookup_args *ap) if (error) return (error); newvp = NFSTOV(np); + /* + * If n_localmodtime >= time before RPC, then + * a file modification operation, such as + * VOP_SETATTR() of size, has occurred while + * the Lookup RPC and acquisition of the vnode + * happened. As such, the attributes might + * be stale, with possibly an incorrect size. + */ + NFSLOCKNODE(np); + if (timespecisset(&np->n_localmodtime) && + timespeccmp(&np->n_localmodtime, &ts, >=)) { + NFSCL_DEBUG(4, "nfs_lookup: rename localmod " + "stale attributes\n"); + attrflag = 0; + } + NFSUNLOCKNODE(np); if (attrflag) (void) nfscl_loadattrcache(&newvp, &nfsva, NULL, NULL, 0, 1); @@ -1396,6 +1421,22 @@ nfs_lookup(struct vop_lookup_args *ap) if (error) return (error); newvp = NFSTOV(np); + /* + * If n_localmodtime >= time before RPC, then + * a file modification operation, such as + * VOP_SETATTR() of size, has occurred while + * the Lookup RPC and acquisition of the vnode + * happened. As such, the attributes might + * be stale, with possibly an incorrect size. + */ + NFSLOCKNODE(np); + if (timespecisset(&np->n_localmodtime) && + timespeccmp(&np->n_localmodtime, &ts, >=)) { + NFSCL_DEBUG(4, "nfs_lookup: localmod " + "stale attributes\n"); + attrflag = 0; + } + NFSUNLOCKNODE(np); if (attrflag) (void) nfscl_loadattrcache(&newvp, &nfsva, NULL, NULL, 0, 1); @@ -2613,7 +2654,9 @@ nfs_lookitup(struct vnode *dvp, char *name, int len, struct ucred *cred, struct componentname cn; int error = 0, attrflag, dattrflag; u_int hash; + struct timespec ts; + nanouptime(&ts); error = nfsrpc_lookup(dvp, name, len, cred, td, &dnfsva, &nfsva, &nfhp, &attrflag, &dattrflag, NULL); if (dattrflag) @@ -2674,6 +2717,22 @@ printf("replace=%s\n",nnn); if (error) return (error); newvp = NFSTOV(np); + /* + * If n_localmodtime >= time before RPC, then + * a file modification operation, such as + * VOP_SETATTR() of size, has occurred while + * the Lookup RPC and acquisition of the vnode + * happened. As such, the attributes might + * be stale, with possibly an incorrect size. + */ + NFSLOCKNODE(np); + if (timespecisset(&np->n_localmodtime) && + timespeccmp(&np->n_localmodtime, &ts, >=)) { + NFSCL_DEBUG(4, "nfs_lookitup: localmod " + "stale attributes\n"); + attrflag = 0; + } + NFSUNLOCKNODE(np); } if (!attrflag && *npp == NULL) { if (newvp == dvp) @@ -3628,11 +3687,14 @@ nfs_allocate(struct vop_allocate_args *ap) struct thread *td = curthread; struct nfsvattr nfsva; struct nfsmount *nmp; + struct nfsnode *np; off_t alen; int attrflag, error, ret; + struct timespec ts; attrflag = 0; nmp = VFSTONFS(vp->v_mount); + np = VTONFS(vp); mtx_lock(&nmp->nm_mtx); if (NFSHASNFSV4(nmp) && nmp->nm_minorvers >= NFSV42_MINORVERSION && (nmp->nm_privflag & NFSMNTP_NOALLOCATE) == 0) { @@ -3652,6 +3714,10 @@ nfs_allocate(struct vop_allocate_args *ap) if (error == 0) { *ap->a_offset += alen; *ap->a_len -= alen; + nanouptime(&ts); + NFSLOCKNODE(np); + np->n_localmodtime = ts; + NFSUNLOCKNODE(np); } else if (error == NFSERR_NOTSUPP) { mtx_lock(&nmp->nm_mtx); nmp->nm_privflag |= NFSMNTP_NOALLOCATE; diff --git a/sys/fs/nfsclient/nfsnode.h b/sys/fs/nfsclient/nfsnode.h index b34e362a8522..833737654f6b 100644 --- a/sys/fs/nfsclient/nfsnode.h +++ b/sys/fs/nfsclient/nfsnode.h @@ -129,6 +129,7 @@ struct nfsnode { struct nfsv4node *n_v4; /* extra V4 stuff */ struct ucred *n_writecred; /* Cred. for putpages */ struct nfsclopen *n_openstateid; /* Cached open stateid */ + struct timespec n_localmodtime; /* Last local modify */ }; #define n_atim n_un1.nf_atim