From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 258208] [zfs] locks up when using rollback or destroy on both 13.0-RELEASE & sysutils/openzfs port
Date: Sat, 25 Sep 2021 15:19:50 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258208

--- Comment #7 from Mark Johnston <markj@FreeBSD.org> ---
I am not sure how best to fix this.  To elaborate a bit more, the deadlock
occurs because a rollback does a suspend/resume of the target fs.  This
involves taking the teardown write lock; one thing we do with the lock held is
call zfs_rezget() on all vnodes associated with the filesystem, which among
other things throws away all data cached in the page cache.  This requires
pages to be busied with the ZFS teardown write lock held, so I am inclined to
think that zfs_freebsd_getpages() should be responsible for breaking the
deadlock, as it does in
https://cgit.freebsd.org/src/commit/?id=cd32b4f5b79c97b293f7be3fe9ddfc9024f7d734 .

zfs_freebsd_getpages() could perhaps trylock and, upon failure, return some
EAGAIN-like status to ask the fault handler to retry, but I don't see a way to
do that: vm_fault_getpages() squashes the error and does not allow the pager
to return KERN_RESOURCE_SHORTAGE.

Alternately, zfs_freebsd_getpages() could perhaps wire and unbusy the page upon
a trylock failure.
Once it successfully acquires the teardown read lock, it could look up the
fault page again and compare or re-insert the wired page if necessary.

OTOH I cannot see how this is handled on Linux.  In particular, I do not see
how their zfs_rezget() invalidates the page cache.