From nobody Fri Aug 01 21:56:45 2025 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4bv0Cd3VLDz639nT; Fri, 01 Aug 2025 21:56:45 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4bv0Cd2q00z3LVH; Fri, 01 Aug 2025 21:56:45 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1754085405; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=9gTr37DfmpK9S8A/ubIXYthYfqOkig5hew8VviEoVpY=; b=PiDZdAknFQmQIvavYI7znQJ709JjWSKS/aMAdqkEqrSLNvalVE+mPIQuW2eTrQPdYFULzW y4mthg7BiJQiEQIwrLYxx7WVWPmiYObQYysP2G88vHDiDlr8Zl65teFEk6llvkPxISohF3 rqJwM6Wr4DWicMZNNAwIDP2bKifqDZZcdkDlSmzYinBNOImXb6sSOR5bVorez6Ql8ya1pB OtD0Af6nYfiA72EHwmFgpPtTiub2VH7Zc1b43tGtNv5uOKD+pXFhg7GKs7hQbpiJlMr6hV /EKUt5NIWB7Ps4Hcp+o4EQzQfZEbt5Z1TLyx9l2LSruwqXrs3Yw6b2gmPBzpig== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1754085405; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=9gTr37DfmpK9S8A/ubIXYthYfqOkig5hew8VviEoVpY=; b=fMpKhsaZ1gvGeeY/1wO8hh4/u9xvENRh1DK4GJbZr+4Rf/RO0K5NOH6OJRLhT3uJmuL5OV +OfyGmtYHjchOQzL7oCse5ZkyYTDQju3SWlRPQcvCGlMoLs74IR6LHDEy8lR52SOUoc+sz +FZhcr/4lWI3AIcplcChZBKnqUL887HjnGzB7EumQATSgZ5UtOuFEBueYqApd9QeLN701S 3M2u7g254cj4Fv2h5MlDZT4o2kyzWLkyMXl7V+J2TtbnwBaYIEgbTPLvsm7pfRICvrM37o UxLJ6TGdvc4PZmP9kanbY0Akz6ebqKOD5zZZYOsqCqijoBHDtvQek7ODTMM0wQ== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1754085405; a=rsa-sha256; cv=none; b=sewb9yzXscffHhc0tNJvIGGUTx/muWY1DO4oT8m6HjQdsewDe0vBUK7cuQbWLn/D+QcNAc 5XjT5e3WtoonTQNVMqOywLdXdUpW9JsczgXpiGbb4p5Y0jWd5u5qay1ZEW6yKfMIFUanis zRMbw9LaV+wLtTPO3rpPFWirUqY3fxJiXYT9bnn951RsOc5eXuMhmU7UWVozcxBhyFjQOF PsoNRgUKMDW6WpfV3IoTuzrMW6qJSrkXi16SnkAplCq4X+8THPFxubIQ1RWOQn6N6sGmWM jcJ/hJfv9MdrJnI6l+ArbG9X/Gu3UT0DL5xL9cxdEqjLmduA9IFR2A1MrpJkqA== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4bv0Cd2BJtz14vH; Fri, 01 Aug 2025 21:56:45 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.18.1/8.18.1) with ESMTP id 571LujwZ003463; Fri, 1 Aug 2025 21:56:45 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.18.1/8.18.1/Submit) id 571LujDE003461; Fri, 1 Aug 2025 21:56:45 GMT (envelope-from git) Date: Fri, 1 Aug 2025 21:56:45 GMT Message-Id: <202508012156.571LujDE003461@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: Mark Johnston Subject: git: c2230ad3b121 - main - inotify: Avoid calling vrele() with a namecache mutex held List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-main@freebsd.org Sender: owner-dev-commits-src-main@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: markj X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: c2230ad3b121bd1e57fdb2c466f5b826aad5f730 Auto-Submitted: auto-generated The branch main has been updated by markj: URL: https://cgit.FreeBSD.org/src/commit/?id=c2230ad3b121bd1e57fdb2c466f5b826aad5f730 commit c2230ad3b121bd1e57fdb2c466f5b826aad5f730 Author: Mark Johnston AuthorDate: 2025-08-01 18:16:17 +0000 Commit: Mark Johnston CommitDate: 2025-08-01 21:44:25 +0000 inotify: Avoid calling vrele() with a namecache mutex held In cache_vop_inotify(), we call inotify_log() with a namecache hash lock held. inotify_log() looks at all watches registered with the vnode to see if any of them are interested in the event. In some cases, we have to detach and free the watch after logging the event. This means we must vrele() the watched vnode, and this must not be done while a non-sleepable lock held. Previously, I deferred the vrele() to until the inotify softc and vnode pollinfo locks were dropped. However, this is not enough since we may still be holding the aforementioned namecache lock. Go further and use a taskqueue thread to release vnode references. Introduce a set of detached watches, and queue a threaded task which releases the vnode reference. Reported by: syzbot+c128f121cb22df95559b@syzkaller.appspotmail.com Reviewed by: kib Fixes: f1f230439fa4 ("vfs: Initial revision of inotify") Differential Revision: https://reviews.freebsd.org/D51685 --- sys/kern/vfs_inotify.c | 71 ++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 52 insertions(+), 19 deletions(-) diff --git a/sys/kern/vfs_inotify.c b/sys/kern/vfs_inotify.c index d3cd0d1f9832..746a5a39208e 100644 --- a/sys/kern/vfs_inotify.c +++ b/sys/kern/vfs_inotify.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include @@ -168,6 +169,8 @@ struct inotify_softc { size_t nbpending; /* bytes available to read */ uint64_t ino; /* unique identifier */ struct inotify_watch_tree watches; /* active watches */ + TAILQ_HEAD(, inotify_watch) deadwatches; /* watches pending vrele() */ + struct task reaptask; /* task to reap dead watches */ struct selinfo sel; /* select/poll/kevent info */ struct ucred *cred; /* credential ref */ }; @@ -374,6 +377,13 @@ inotify_unlink_watch_locked(struct inotify_softc *sc, struct inotify_watch *watc vn_irflag_unset(vp, VIRF_INOTIFY); } +static void +inotify_free_watch(struct inotify_watch *watch) +{ + vrele(watch->vp); + free(watch, M_INOTIFY); +} + /* * Assumes that the watch has already been removed from its softc. */ @@ -389,9 +399,24 @@ inotify_remove_watch(struct inotify_watch *watch) mtx_lock(&vp->v_pollinfo->vpi_lock); inotify_unlink_watch_locked(sc, watch); mtx_unlock(&vp->v_pollinfo->vpi_lock); + inotify_free_watch(watch); +} - vrele(vp); - free(watch, M_INOTIFY); +static void +inotify_reap(void *arg, int pending) +{ + struct inotify_softc *sc; + struct inotify_watch *watch; + + sc = arg; + mtx_lock(&sc->lock); + while ((watch = TAILQ_FIRST(&sc->deadwatches)) != NULL) { + TAILQ_REMOVE(&sc->deadwatches, watch, vlink); + mtx_unlock(&sc->lock); + inotify_free_watch(watch); + mtx_lock(&sc->lock); + } + mtx_unlock(&sc->lock); } static int @@ -403,6 +428,7 @@ inotify_close(struct file *fp, struct thread *td) sc = fp->f_data; + /* Detach watches from their vnodes. */ mtx_lock(&sc->lock); (void)chginotifycnt(sc->cred->cr_ruidinfo, -1, 0); while ((watch = RB_MIN(inotify_watch_tree, &sc->watches)) != NULL) { @@ -411,6 +437,17 @@ inotify_close(struct file *fp, struct thread *td) inotify_remove_watch(watch); mtx_lock(&sc->lock); } + + /* Make sure that any asynchronous vrele() calls are done. */ + mtx_unlock(&sc->lock); + taskqueue_drain(taskqueue_thread, &sc->reaptask); + mtx_lock(&sc->lock); + KASSERT(RB_EMPTY(&sc->watches), + ("%s: watches not empty in %p", __func__, sc)); + KASSERT(TAILQ_EMPTY(&sc->deadwatches), + ("%s: deadwatches not empty in %p", __func__, sc)); + + /* Drop pending events. */ while (!STAILQ_EMPTY(&sc->pending)) { rec = inotify_dequeue(sc); if (rec != &sc->overflow) @@ -456,6 +493,8 @@ inotify_create_file(struct thread *td, struct file *fp, int flags, int *fflagsp) sc->nextwatch = 1; /* Required for compatibility. */ STAILQ_INIT(&sc->pending); RB_INIT(&sc->watches); + TAILQ_INIT(&sc->deadwatches); + TASK_INIT(&sc->reaptask, 0, inotify_reap, sc); mtx_init(&sc->lock, "inotify", NULL, MTX_DEF); knlist_init_mtx(&sc->sel.si_note, &sc->lock); sc->cred = crhold(td->td_ucred); @@ -560,17 +599,16 @@ inotify_queue_record(struct inotify_softc *sc, struct inotify_record *rec) return (true); } -static int +static void inotify_log_one(struct inotify_watch *watch, const char *name, size_t namelen, int event, uint32_t cookie) { struct inotify_watch key; struct inotify_softc *sc; struct inotify_record *rec; - int relecount; bool allocfail; - relecount = 0; + mtx_assert(&watch->vp->v_pollinfo->vpi_lock, MA_OWNED); sc = watch->sc; rec = inotify_alloc_record(watch->wd, name, namelen, event, cookie, @@ -599,20 +637,22 @@ inotify_log_one(struct inotify_watch *watch, const char *name, size_t namelen, /* * Remove the watch, taking care to handle races with - * inotify_close(). + * inotify_close(). The thread that removes the watch is + * responsible for freeing it. */ key.wd = watch->wd; if (RB_FIND(inotify_watch_tree, &sc->watches, &key) != NULL) { RB_REMOVE(inotify_watch_tree, &sc->watches, watch); inotify_unlink_watch_locked(sc, watch); - free(watch, M_INOTIFY); - /* Defer vrele() to until locks are dropped. */ - relecount++; + /* + * Defer the vrele() to a sleepable thread context. + */ + TAILQ_INSERT_TAIL(&sc->deadwatches, watch, vlink); + taskqueue_enqueue(taskqueue_thread, &sc->reaptask); } } mtx_unlock(&sc->lock); - return (relecount); } void @@ -620,25 +660,18 @@ inotify_log(struct vnode *vp, const char *name, size_t namelen, int event, uint32_t cookie) { struct inotify_watch *watch, *tmp; - int relecount; KASSERT((event & ~(IN_ALL_EVENTS | IN_ISDIR | IN_UNMOUNT)) == 0, ("inotify_log: invalid event %#x", event)); - relecount = 0; mtx_lock(&vp->v_pollinfo->vpi_lock); TAILQ_FOREACH_SAFE(watch, &vp->v_pollinfo->vpi_inotify, vlink, tmp) { KASSERT(watch->vp == vp, ("inotify_log: watch %p vp != vp", watch)); - if ((watch->mask & event) != 0 || event == IN_UNMOUNT) { - relecount += inotify_log_one(watch, name, namelen, event, - cookie); - } + if ((watch->mask & event) != 0 || event == IN_UNMOUNT) + inotify_log_one(watch, name, namelen, event, cookie); } mtx_unlock(&vp->v_pollinfo->vpi_lock); - - for (int i = 0; i < relecount; i++) - vrele(vp); } /*