Re: 66eb78377bf1 - main - libc/amd64: fix overread conditions in stpncpy()

From: Robert Clausecker <fuz_at_fuz.su>
Date: Tue, 16 Dec 2025 21:58:16 UTC
Am Tue, Dec 16, 2025 at 10:54:33PM +0100 schrieb Robert Clausecker:
> Hi Ravi,
> 
> The bts (bit test and set) instruction takes a bit index and an integer and
> sets the indicated bit in the register.  I.e. bts a, b does a |= 1 << b.

That was supposed to be "bts a, b does b |= 1 << a" of course (in AT&T syntax).

> It also returns the previous value of the bit in the carry flag, but that's
> not really important here.  R8 is the 64-bit variant of r8d.  I switched to
> it for this instruction as bts masks its index to the operand size (i.e. to
> 5 bits for a range of 0--31 for the 32 bit case).  However, due to the
> bug fix, the case r8 == 32 is now possible, which led to wrong behaviour
> (that is, the least-significant bit is set).  This is avoided by switching to
> 64-bit operand size, instead setting the 32nd bit.  test does not care about
> these bits, so 32 bit operand size is fine.  Who does care is the tzcnt
> instruction further down, and that's probably the bug I added.  Will have
> to investigate further and then submit a patch.
> 
> I'm sorry for the continued errors.  It's a bit of a stressful time for me.
> 
> Yours,
> Robert Clausecker
> 
> Am Tue, Dec 16, 2025 at 02:33:32PM -0500 schrieb Ravi Pokala:
> > I know ~nothing of assembly, but one thing that seems odd to me is that the 'bts' line
> > originally used '%r8d', but the new version uses '%r8'. And yet, the new 'test' line
> > uses '%r8d'. Again, knowing nothing about this, it seems to me that the 'bts' and 'test'
> > should probably be talking about the same thing...?
> > 
> > -Ravi (rpokala@)
> > 
> > -----Original Message-----
> > From: <owner-src-committers@freebsd.org <mailto:owner-src-committers@freebsd.org>> on behalf of Cy Schubert <Cy.Schubert@cschubert.com <mailto:Cy.Schubert@cschubert.com>>
> > Reply-To: Cy Schubert <Cy.Schubert@cschubert.com <mailto:Cy.Schubert@cschubert.com>>
> > Date: Tuesday, December 16, 2025 at 11:00
> > To: Cy Schubert <Cy.Schubert@cschubert.com <mailto:Cy.Schubert@cschubert.com>>
> > Cc: Robert Clausecker <fuz@FreeBSD.org <mailto:fuz@FreeBSD.org>>, <src-committers@FreeBSD.org <mailto:src-committers@FreeBSD.org>>, <dev-commits-src-all@FreeBSD.org <mailto:dev-commits-src-all@FreeBSD.org>>, <dev-commits-src-main@FreeBSD.org <mailto:dev-commits-src-main@FreeBSD.org>>
> > Subject: Re: git: 66eb78377bf1 - main - libc/amd64: fix overread conditions in stpncpy()
> > 
> > 
> > In message <20251216135709.46D25203@slippy.cwsent.com <mailto:20251216135709.46D25203@slippy.cwsent.com>>, Cy Schubert writes:
> > > In message <693ee0f1.3662d.650a5e21@gitrepo.freebsd.org <mailto:693ee0f1.3662d.650a5e21@gitrepo.freebsd.org>>, Robert Clausecker 
> > > wri
> > > tes:
> > > > The branch main has been updated by fuz:
> > > >
> > > > URL: https://cgit.FreeBSD.org/src/commit/?id=66eb78377bf109af1d9e25626bf254 <https://cgit.FreeBSD.org/src/commit/?id=66eb78377bf109af1d9e25626bf254>
> > > b4
> > > > 369436ec
> > > >
> > > > commit 66eb78377bf109af1d9e25626bf254b4369436ec
> > > > Author: Robert Clausecker <fuz@FreeBSD.org <mailto:fuz@FreeBSD.org>>
> > > > AuthorDate: 2025-12-10 20:45:18 +0000
> > > > Commit: Robert Clausecker <fuz@FreeBSD.org <mailto:fuz@FreeBSD.org>>
> > > > CommitDate: 2025-12-14 16:06:05 +0000
> > > >
> > > > libc/amd64: fix overread conditions in stpncpy()
> > > > 
> > > > Due to incorrect unit test design, two overread conditions went
> > > > undetected in the amd64 baseline stpncpy() implementation.
> > > > For buffers of 1--16 and 32 bytes that do not contain nul bytes
> > > > and end exactly at a page boundary, the code would incorrectly
> > > > read 16 bytes from the next page, possibly crossing into an
> > > > unmapped page and crashing the program. If the next page was
> > > > mapped, the code would then proceed with the expected behaviour
> > > > of the stpncpy() function.
> > > > 
> > > > Three changes were made to fix the bug:
> > > > 
> > > > - an off-by-one error is fixed in the code deciding whether to
> > > > enter the runt case or not, entering it for 0<n<=32 bytes
> > > > instead of 0<n<32 bytes as it was before.
> > > > - in the runt case, the logic to skip reading a second 16-byte
> > > > chunk if the buffer ends in the first chunk was fixed to
> > > > account for buffers that end at a 16-byte boundary but do not
> > > > hold a nul byte.
> > > > - in the runt case, the logic to transform the location of the
> > > > end of the input buffer into a bit mask was fixed to allow
> > > > the case of n==32, which was previously impossible due to the
> > > > incorrect logic for entering said case.
> > > > 
> > > > The performance impact should be minimal.
> > > > 
> > > > PR: 291359
> > > > See also: D54169
> > > > Reported by: Collin Funk <collin.funk1@gmail.com <mailto:collin.funk1@gmail.com>>
> > > > Reviewed by: getz
> > > > Approved by: markj (mentor)
> > > > MFC after: 1 week
> > > > Fixes: 90253d49db09a9b1490c448d05314f3e4bbfa468 (D42519)
> > > > Differential Revision: https://reviews.freebsd.org/D54170 <https://reviews.freebsd.org/D54170>
> > > > ---
> > > > lib/libc/amd64/string/stpncpy.S | 7 ++++---
> > > > 1 file changed, 4 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/lib/libc/amd64/string/stpncpy.S b/lib/libc/amd64/string/stpncp
> > > y.
> > > > S
> > > > index 5ce0dd093a9e..df22bb9f0c53 100644
> > > > --- a/lib/libc/amd64/string/stpncpy.S
> > > > +++ b/lib/libc/amd64/string/stpncpy.S
> > > > @@ -100,7 +100,7 @@ ARCHENTRY(__stpncpy, baseline)
> > > > movdqa (%rsi), %xmm0 # load head
> > > > and $0xf, %ecx # offset from alignment
> > > > mov $-1, %r9d
> > > > - lea -32(%rcx), %rax # set up overflow-proof compari
> > > > son rdx+rcx<=32
> > > > + lea -33(%rcx), %rax # set up overflow-proof compari
> > > > son rdx+rcx<=32
> > > > shl %cl, %r9d # mask of bytes belonging to th
> > > > e string
> > > > sub %rcx, %rdi # adjust RDI to correspond to R
> > > > SI
> > > > pxor %xmm1, %xmm1
> > > > @@ -223,8 +223,9 @@ ARCHENTRY(__stpncpy, baseline)
> > > > 
> > > > /* 1--32 bytes to copy, bounce through the stack */
> > > > .Lrunt: movdqa %xmm1, bounce+16(%rsp) # clear out rest of on-
> > > > stack copy
> > > > - bts %r10d, %r8d # treat end of buffer as end of
> > > > string
> > > > - and %r9w, %r8w # end of string within first bu
> > > > ffer?
> > > > + bts %r10, %r8 # treat end of buffer as end of
> > > > string
> > > > + and %r9d, %r8d # mask out head before string
> > > > + test $0x1ffff, %r8d # end of string within first ch
> > > > unk or right after?
> > > > jnz 0f # if yes, do not inspect second
> > > > buffer
> > > > 
> > > > movdqa 16(%rsi), %xmm0 # load second chunk of input
> > > >
> > >
> > > I've opened PR/291720 regarding a significant regression caused by this 
> > > commit. It affects my older machines, resulting in enviornment (getenv) 
> > > corruption. It does not affect my newer (and with more RAM) machines.
> > 
> > 
> > BTW, this brought my postfix server effectively offline. I have not been 
> > able to send or receive email until this commit was reverted locally.
> > 
> > 
> > 
> > 
> > -- 
> > Cheers,
> > Cy Schubert <Cy.Schubert@cschubert.com <mailto:Cy.Schubert@cschubert.com>>
> > FreeBSD UNIX: <cy@FreeBSD.org <mailto:cy@FreeBSD.org>> Web: https://FreeBSD.org <https://FreeBSD.org>
> > NTP: <cy@nwtime.org <mailto:cy@nwtime.org>> Web: https://nwtime.org <https://nwtime.org>
> > 
> > 
> > e**(i*pi)+1=0
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> 
> -- 
> ()  ascii ribbon campaign - for an encoding-agnostic world
> /\  - against html email  - against proprietary attachments

-- 
()  ascii ribbon campaign - for an encoding-agnostic world
/\  - against html email  - against proprietary attachments