Re: kernel svc_rpc_gss_update_seq missing 2017 Coverity fix + TOCTOU with check_replay
Date: Sun, 12 Apr 2026 12:23:46 UTC
On Sun, Apr 12, 2026 at 12:08 AM Stanislav Fort
<stanislav.fort@aisle.com> wrote:
>
> Hi Rick,
>
> Thanks for picking up the first fix. Could you please include a Reported by: tag for me in the commit? (Stanislav Fort of Aisle Research <stanislav.fort@aisle.com>)
I was going to list you as author, but if you prefer
reported by, I can do that.
>
> For the TOCTOU issue, I went ahead and wrote a minimal patch = just a bounds check in update_seq rather than widening the lock scope, so it shouldn't affect the validate/nextverf path at all:
>
> --- a/sys/rpc/rpcsec_gss/svc_rpcsec_gss.c
> +++ b/sys/rpc/rpcsec_gss/svc_rpcsec_gss.c
> @@ -1387,6 +1387,8 @@
> } else {
> offset = client->cl_seqlast - seq;
> + if (offset >= SVC_RPC_GSS_SEQWINDOW)
> + goto out;
> word = offset / 32;
> bit = offset % 32;
> client->cl_seqmask[word] |= (1 << bit);
> @@ -1392,6 +1394,8 @@
> }
> +out:
> sx_xunlock(&client->cl_lock);
> }
>
> The idea: if another thread advanced cl_seqlast past the window between check_replay and update_seq, the sequence number is already too old to track, so we just skip the write. This avoids the OOB on cl_seqmask without changing the locking around validate/nextverf.
>
> Happy to also put this up on Phabricator if that's easier for review.
>
> Best wishes,
> Stanislav Fort
> Aisle Research
>
> On Sun, Apr 12, 2026 at 2:24 AM Rick Macklem <rick.macklem@gmail.com> wrote:
>>
>> On Sat, Apr 11, 2026 at 4:59 PM Rick Macklem <rick.macklem@gmail.com> wrote:
>> >
>> > On Sat, Apr 11, 2026 at 3:42 PM Stanislav Fort <stanislav.fort@aisle.com> wrote:
>> > >
>> > > Hi there,
>> > >
>> > > I think there might be two bugs in sys/rpc/rpcsec_gss/svc_rpcsec_gss.c in svc_rpc_gss_update_seq():
>> > >
>> > > 1. Missing port of lib/librpcsec_gss fix from 11bc2c1ca77e
>> > >
>> > > The userspace lib/librpcsec_gss/svc_rpcsec_gss.c was fixed in 2017 (commit 11bc2c1ca77e, Coverity CID 1198859) to change "while (offset > 32)" to "while (offset >= 32)" and guard the subsequent bit-shift loop with "if (offset > 0)". The kernel copy still has the old pattern:
>> > >
>> > > while (offset > 32) { /* should be >= 32 */
>> > > /* ... word shift ... */
>> > > offset -= 32;
>> > > }
>> > > /* missing: if (offset > 0) { */
>> > > carry = 0;
>> > > for (i = 0; i < SVC_RPC_GSS_SEQWINDOW / 32; i++) {
>> > > newcarry = client->cl_seqmask[i] >> (32 - offset);
>> > > client->cl_seqmask[i] =
>> > > (client->cl_seqmask[i] << offset) | carry; /* UB when offset==32 */
>> > > carry = newcarry;
>> > > }
>> > >
>> > > When offset is exactly 32, the while loop doesn't execute and the shift loop does << 32 on a uint32_t, which is undefined behavior(?).
>> Btw, I tried "s << 32", where s is uint32_t on amd64 with clang and it
>> does not change value of s.
>>
>> Therefore, the loop ends up setting extra bits (or of the adjacent
>> seqmask with the next seqmask) and, if I
>> recall RPCSEC_GSS correctly, that will result in RPCs
>> which should be valid being discarded as retries.
>> (This window of seq#s is meant to subvert replay attempts.)
>>
>> rick
>>
>> > The fix is to match what lib/librpcsec_gss already does: use >= 32 and wrap the shift loop in if (offset > 0).
>> >
>> > This makes sense. I will commit the patch.
>> > I think the most this bug can do is result in a bogus
>> > retry of an RPC. Since kerberized NFS isn't used a
>> > lot and almost always over TCP (where the sequence#
>> > increments by 1 for each RPC and they generally stay
>> > in order), I don't think this bug introduced a serious issue.
>> > (The seq# in RPCSEC_GSS is basically for UDP.)
>> >
>> > >
>> > > 2. TOCTOU between check_replay and update_seq
>> > >
>> > > In svc_rpc_gss(), svc_rpc_gss_check_replay() and svc_rpc_gss_update_seq() each independently acquire and release cl_lock. Between the two calls, svc_rpc_gss_validate() and svc_rpc_gss_nextverf() run without the lock held. Since svc_rpc_gss_find_client() matches clients by opaque handle and doesn't bind them to a specific transport, two connections presenting the same GSS handle can drive concurrent threads through svc_rpc_gss() on the same client struct.
>> > >
>> > > If thread A passes check_replay for a low sequence number, then thread B advances cl_seqlast significantly before thread A enters update_seq, the else branch in update_seq computes:
>> > >
>> > > offset = client->cl_seqlast - seq;
>> > > word = offset / 32;
>> > > bit = offset % 32;
>> > > client->cl_seqmask[word] |= (1 << bit);
>> > >
>> > > Word can exceed SVC_RPC_GSS_SEQWINDOW/32 - 1 (i.e., >= 4), causing an OOB write to cl_seqmask. This requires authenticated access (GSS MIC verification still gates both paths) and winning a race, so practical impact is limited.
>> > >
>> > > A simple fix would be to hold cl_lock across both check and update, or to re-validate offset < SVC_RPC_GSS_SEQWINDOW inside update_seq before writing.
>> >
>> > I'll look at fixing this one later, but as you note, it is also unlikely
>> > to cause serious trouble, imho.
>> >
>> > rick
>> >
>> > >
>> > > Let me know what you think.
>> > >
>> > > Best wishes,
>> > > Stanislav Fort
>> > > Aisle Research