From nobody Mon Mar 31 09:06:35 2025 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4ZR4xl64LGz5rhYQ; Mon, 31 Mar 2025 09:06:35 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4ZR4xl3j5Bz3YW3; Mon, 31 Mar 2025 09:06:35 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1743411995; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=jzGECOOrA8Wt6NRATXCL1pSn39aioS08EYHXVXCIIhM=; b=S/q7i+IYNoI6OSxlMgZUvlHkiipdChokDKtRw086OFAj+9CsEq97Iicetr4mJX7Br68lNZ VjYS0TFG4C5bDtuaOBjAIE0V49Egr8KyfjzivOuuM0D/M2hC5nIoZ9/LTznesXvxRX/451 7aaLSAI1DUZJZrlJkGK2palQ7ej9Ans6omVq8p9N/UE8iHs2aXN0JXbNHBG3YTQTUkcSte UtTdEeoH8WnoVzv6MO7HsSBo4yQBe/cARyTwWudiECxE5ogKwGH56yqU421NIpgUrBGfLJ nEBCFrEd4060ehqw8MTOtaMTO12D4BP6p6kSXRlztsldaiaqg+GCv5AL+aXNFQ== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1743411995; a=rsa-sha256; cv=none; b=BNcRKS3OzcgOILk5U/aXZPYVwDWJ11mZ/DIB7naJHhB4Yz6fvp83kw31S2kRhY4svxyJcs Cv/Nw85ZQFu0zVro8vd0n2TrVKRdZ4SrW0U6pQDbkOwsLnTrn18th/J5fN3Wk6cWdCZ8hA vLG2RQq/+JY/jHewc5mt7iJx/+pL3/cK2QG3/e5qbbGJtZ/xVBU03uqB1KO8n0sIUVUzyc FHraMN60DmI3YIJX7m+PCd5iJaDKFswBmt+3hfFaiZ1QLEd+Ams7rF0tmpHmiY0c7Xz5Xb 4zXLnQXMMRU/cIDBPMGgduWBvCp8/H5USZ+K/Oimp6MFx+tYP8y2L4zoHuWQaw== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1743411995; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=jzGECOOrA8Wt6NRATXCL1pSn39aioS08EYHXVXCIIhM=; b=GAfaz8tmUGQnxzDaVHgEpdTOGv5PXrFz4gPIouumy9YN2e/jsVG0W8aVuuhxYFwOB20lnv QGzzfZOMMIgS/w3SzlShaX7e0S4k9ZocD3VkNTP48MXCltgKOQq+ee6duQEdnzCCl0riCs eqWVSAffPnhWASs4ZAMRNd4fHzP6fRtXs17HfQ3jbU4YAhnH+Qu0is8IErHXOm5fB5ubqi jvmsJwP+5bgXIowAqBhQwH6IJQiZTxE75HNVoeAkOXAhb/AdrwRH1tK8D0iq/tGv1Dcvmg ES5VRdyZBIYLHIZHDO64Lp24GplEtzyklQR/eUJt6T+tRZEP4y4K86DR5q+wqA== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4ZR4xl3JLRzfrm; Mon, 31 Mar 2025 09:06:35 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.18.1/8.18.1) with ESMTP id 52V96ZT5078840; Mon, 31 Mar 2025 09:06:35 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.18.1/8.18.1/Submit) id 52V96ZJV078837; Mon, 31 Mar 2025 09:06:35 GMT (envelope-from git) Date: Mon, 31 Mar 2025 09:06:35 GMT Message-Id: <202503310906.52V96ZJV078837@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: Mark Johnston Subject: git: fe7fe3b175b6 - main - rangelock: Fix handling of trylocks List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-main@freebsd.org Sender: owner-dev-commits-src-main@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: markj X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: fe7fe3b175b626dd1402cd06745b1e3f070c3edd Auto-Submitted: auto-generated The branch main has been updated by markj: URL: https://cgit.FreeBSD.org/src/commit/?id=fe7fe3b175b626dd1402cd06745b1e3f070c3edd commit fe7fe3b175b626dd1402cd06745b1e3f070c3edd Author: Mark Johnston AuthorDate: 2025-03-29 08:57:52 +0000 Commit: Mark Johnston CommitDate: 2025-03-31 09:01:09 +0000 rangelock: Fix handling of trylocks When inserting a queue entry, i.e., locking a range, there are two points where a trylock operation may fail, one before the new entry is inserted, one after. In the latter case, rl_(r|w)_validate() would mark the entry and rangelock_lock_int() would free it. However, this is of course incorrect, since the entry is visible to other threads, which will eventually attempt to remove it and free it again. Factor out conflict handling in rl_(r|w)_validate() to a common function as they are functionally the same. Then, introduce a new result which indicates that a trylock failed but that the queue entry must not be cleaned up. While here, assert that a conflicting range isn't owned by the current thread, as that would indicate a bug in the consumer. Reviewed by: olce, kib Reported by: syzkaller Fixes: 5badbeeaf061 ("Re-implement rangelocks part 2") Differential Revision: https://reviews.freebsd.org/D49438 --- sys/kern/kern_rangelock.c | 122 ++++++++++++++++++++++++++++++---------------- 1 file changed, 79 insertions(+), 43 deletions(-) diff --git a/sys/kern/kern_rangelock.c b/sys/kern/kern_rangelock.c index 59112acfb03d..3854ffbeec29 100644 --- a/sys/kern/kern_rangelock.c +++ b/sys/kern/kern_rangelock.c @@ -462,6 +462,10 @@ rl_insert_sleep(struct rangelock *lock) smr_enter(rl_smr); } +/* + * Try to insert an entry into the queue. Return true if successful, otherwise + * false. + */ static bool rl_q_cas(struct rl_q_entry **prev, struct rl_q_entry *old, struct rl_q_entry *new) @@ -517,15 +521,60 @@ again: enum RL_INSERT_RES { RL_TRYLOCK_FAILED, + RL_TRYLOCK_FAILED_MARKED, RL_LOCK_SUCCESS, RL_LOCK_RETRY, }; +/* + * Handle a possible lock conflict between cur and e. "inserted" is true if e + * is already inserted into the queue. + */ +static enum RL_INSERT_RES +rl_conflict(struct rangelock *lock, struct rl_q_entry *cur, struct rl_q_entry *e, + bool trylock, bool inserted) +{ + sleepq_lock(&lock->sleepers); + if (rl_e_is_marked(rl_q_load(&cur->rl_q_next))) { + sleepq_release(&lock->sleepers); + return (RL_LOCK_SUCCESS); /* no conflict after all */ + } + + /* + * Make sure we're not conflicting with one of our own locks. This + * scenario could conceivably happen for one of two reasons: a bug in + * the rangelock consumer (if "inserted" is true), or a bug in the + * rangelock implementation itself (if "inserted" is false). + */ + KASSERT(cur->rl_q_owner != curthread, + ("%s: conflicting range is locked by the current thread", + __func__)); + + if (inserted) + rangelock_unlock_int(lock, e); + if (trylock) { + sleepq_release(&lock->sleepers); + + /* + * If the new entry e has been enqueued and is thus visible to + * other threads, it cannot be safely freed. + */ + return (inserted ? RL_TRYLOCK_FAILED_MARKED: RL_TRYLOCK_FAILED); + } + rl_insert_sleep(lock); + return (RL_LOCK_RETRY); +} + +/* + * Having inserted entry e, verify that no conflicting write locks are present; + * clean up dead entries that we encounter along the way. + */ static enum RL_INSERT_RES rl_r_validate(struct rangelock *lock, struct rl_q_entry *e, bool trylock, struct rl_q_entry **free) { struct rl_q_entry *cur, *next, **prev; + enum RL_INSERT_RES res; again: prev = &e->rl_q_next; @@ -550,28 +599,23 @@ again: cur = rl_e_unmark_unchecked(rl_q_load(prev)); continue; } - if (!rl_e_is_marked(rl_q_load(&cur->rl_q_next))) { - sleepq_lock(&lock->sleepers); - if (rl_e_is_marked(rl_q_load(&cur->rl_q_next))) { - sleepq_release(&lock->sleepers); - continue; - } - rangelock_unlock_int(lock, e); - if (trylock) { - sleepq_release(&lock->sleepers); - return (RL_TRYLOCK_FAILED); - } - rl_insert_sleep(lock); - return (RL_LOCK_RETRY); - } + + res = rl_conflict(lock, cur, e, trylock, true); + if (res != RL_LOCK_SUCCESS) + return (res); } } +/* + * Having inserted entry e, verify that no conflicting locks are present; + * clean up dead entries that we encounter along the way. + */ static enum RL_INSERT_RES rl_w_validate(struct rangelock *lock, struct rl_q_entry *e, bool trylock, struct rl_q_entry **free) { struct rl_q_entry *cur, *next, **prev; + enum RL_INSERT_RES res; again: prev = (struct rl_q_entry **)&lock->head; @@ -596,20 +640,10 @@ again: cur = rl_e_unmark_unchecked(rl_q_load(prev)); continue; } - sleepq_lock(&lock->sleepers); - /* Reload after sleepq is locked */ - next = rl_q_load(&cur->rl_q_next); - if (rl_e_is_marked(next)) { - sleepq_release(&lock->sleepers); - goto again; - } - rangelock_unlock_int(lock, e); - if (trylock) { - sleepq_release(&lock->sleepers); - return (RL_TRYLOCK_FAILED); - } - rl_insert_sleep(lock); - return (RL_LOCK_RETRY); + + res = rl_conflict(lock, cur, e, trylock, true); + if (res != RL_LOCK_SUCCESS) + return (res); } } @@ -653,19 +687,19 @@ again: prev = &cur->rl_q_next; cur = rl_q_load(prev); } else if (r == 0) { - sleepq_lock(&lock->sleepers); - if (__predict_false(rl_e_is_marked(rl_q_load( - &cur->rl_q_next)))) { - sleepq_release(&lock->sleepers); - continue; - } - if (trylock) { - sleepq_release(&lock->sleepers); - return (RL_TRYLOCK_FAILED); + enum RL_INSERT_RES res; + + res = rl_conflict(lock, cur, e, trylock, false); + switch (res) { + case RL_LOCK_SUCCESS: + /* cur does not conflict after all. */ + break; + case RL_LOCK_RETRY: + /* e is still valid. */ + goto again; + default: + return (res); } - rl_insert_sleep(lock); - /* e is still valid */ - goto again; } else /* r == 1 */ { e->rl_q_next = cur; if (rl_q_cas(prev, cur, e)) { @@ -697,10 +731,12 @@ rangelock_lock_int(struct rangelock *lock, bool trylock, vm_ooffset_t start, smr_enter(rl_smr); res = rl_insert(lock, e, trylock, &free); smr_exit(rl_smr); - if (res == RL_TRYLOCK_FAILED) { + if (res == RL_TRYLOCK_FAILED || res == RL_TRYLOCK_FAILED_MARKED) { MPASS(trylock); - e->rl_q_free = free; - free = e; + if (res == RL_TRYLOCK_FAILED) { + e->rl_q_free = free; + free = e; + } e = NULL; } rangelock_free_free(free);