Re: 66eb78377bf1 - main - libc/amd64: fix overread conditions in stpncpy()

From: Robert Clausecker <fuz_at_fuz.su>
Date: Tue, 16 Dec 2025 21:54:33 UTC
Hi Ravi,

The bts (bit test and set) instruction takes a bit index and an integer and
sets the indicated bit in the register.  I.e. bts a, b does a |= 1 << b.
It also returns the previous value of the bit in the carry flag, but that's
not really important here.  R8 is the 64-bit variant of r8d.  I switched to
it for this instruction as bts masks its index to the operand size (i.e. to
5 bits for a range of 0--31 for the 32 bit case).  However, due to the
bug fix, the case r8 == 32 is now possible, which led to wrong behaviour
(that is, the least-significant bit is set).  This is avoided by switching to
64-bit operand size, instead setting the 32nd bit.  test does not care about
these bits, so 32 bit operand size is fine.  Who does care is the tzcnt
instruction further down, and that's probably the bug I added.  Will have
to investigate further and then submit a patch.

I'm sorry for the continued errors.  It's a bit of a stressful time for me.

Yours,
Robert Clausecker

Am Tue, Dec 16, 2025 at 02:33:32PM -0500 schrieb Ravi Pokala:
> I know ~nothing of assembly, but one thing that seems odd to me is that the 'bts' line
> originally used '%r8d', but the new version uses '%r8'. And yet, the new 'test' line
> uses '%r8d'. Again, knowing nothing about this, it seems to me that the 'bts' and 'test'
> should probably be talking about the same thing...?
> 
> -Ravi (rpokala@)
> 
> -----Original Message-----
> From: <owner-src-committers@freebsd.org <mailto:owner-src-committers@freebsd.org>> on behalf of Cy Schubert <Cy.Schubert@cschubert.com <mailto:Cy.Schubert@cschubert.com>>
> Reply-To: Cy Schubert <Cy.Schubert@cschubert.com <mailto:Cy.Schubert@cschubert.com>>
> Date: Tuesday, December 16, 2025 at 11:00
> To: Cy Schubert <Cy.Schubert@cschubert.com <mailto:Cy.Schubert@cschubert.com>>
> Cc: Robert Clausecker <fuz@FreeBSD.org <mailto:fuz@FreeBSD.org>>, <src-committers@FreeBSD.org <mailto:src-committers@FreeBSD.org>>, <dev-commits-src-all@FreeBSD.org <mailto:dev-commits-src-all@FreeBSD.org>>, <dev-commits-src-main@FreeBSD.org <mailto:dev-commits-src-main@FreeBSD.org>>
> Subject: Re: git: 66eb78377bf1 - main - libc/amd64: fix overread conditions in stpncpy()
> 
> 
> In message <20251216135709.46D25203@slippy.cwsent.com <mailto:20251216135709.46D25203@slippy.cwsent.com>>, Cy Schubert writes:
> > In message <693ee0f1.3662d.650a5e21@gitrepo.freebsd.org <mailto:693ee0f1.3662d.650a5e21@gitrepo.freebsd.org>>, Robert Clausecker 
> > wri
> > tes:
> > > The branch main has been updated by fuz:
> > >
> > > URL: https://cgit.FreeBSD.org/src/commit/?id=66eb78377bf109af1d9e25626bf254 <https://cgit.FreeBSD.org/src/commit/?id=66eb78377bf109af1d9e25626bf254>
> > b4
> > > 369436ec
> > >
> > > commit 66eb78377bf109af1d9e25626bf254b4369436ec
> > > Author: Robert Clausecker <fuz@FreeBSD.org <mailto:fuz@FreeBSD.org>>
> > > AuthorDate: 2025-12-10 20:45:18 +0000
> > > Commit: Robert Clausecker <fuz@FreeBSD.org <mailto:fuz@FreeBSD.org>>
> > > CommitDate: 2025-12-14 16:06:05 +0000
> > >
> > > libc/amd64: fix overread conditions in stpncpy()
> > > 
> > > Due to incorrect unit test design, two overread conditions went
> > > undetected in the amd64 baseline stpncpy() implementation.
> > > For buffers of 1--16 and 32 bytes that do not contain nul bytes
> > > and end exactly at a page boundary, the code would incorrectly
> > > read 16 bytes from the next page, possibly crossing into an
> > > unmapped page and crashing the program. If the next page was
> > > mapped, the code would then proceed with the expected behaviour
> > > of the stpncpy() function.
> > > 
> > > Three changes were made to fix the bug:
> > > 
> > > - an off-by-one error is fixed in the code deciding whether to
> > > enter the runt case or not, entering it for 0<n<=32 bytes
> > > instead of 0<n<32 bytes as it was before.
> > > - in the runt case, the logic to skip reading a second 16-byte
> > > chunk if the buffer ends in the first chunk was fixed to
> > > account for buffers that end at a 16-byte boundary but do not
> > > hold a nul byte.
> > > - in the runt case, the logic to transform the location of the
> > > end of the input buffer into a bit mask was fixed to allow
> > > the case of n==32, which was previously impossible due to the
> > > incorrect logic for entering said case.
> > > 
> > > The performance impact should be minimal.
> > > 
> > > PR: 291359
> > > See also: D54169
> > > Reported by: Collin Funk <collin.funk1@gmail.com <mailto:collin.funk1@gmail.com>>
> > > Reviewed by: getz
> > > Approved by: markj (mentor)
> > > MFC after: 1 week
> > > Fixes: 90253d49db09a9b1490c448d05314f3e4bbfa468 (D42519)
> > > Differential Revision: https://reviews.freebsd.org/D54170 <https://reviews.freebsd.org/D54170>
> > > ---
> > > lib/libc/amd64/string/stpncpy.S | 7 ++++---
> > > 1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/lib/libc/amd64/string/stpncpy.S b/lib/libc/amd64/string/stpncp
> > y.
> > > S
> > > index 5ce0dd093a9e..df22bb9f0c53 100644
> > > --- a/lib/libc/amd64/string/stpncpy.S
> > > +++ b/lib/libc/amd64/string/stpncpy.S
> > > @@ -100,7 +100,7 @@ ARCHENTRY(__stpncpy, baseline)
> > > movdqa (%rsi), %xmm0 # load head
> > > and $0xf, %ecx # offset from alignment
> > > mov $-1, %r9d
> > > - lea -32(%rcx), %rax # set up overflow-proof compari
> > > son rdx+rcx<=32
> > > + lea -33(%rcx), %rax # set up overflow-proof compari
> > > son rdx+rcx<=32
> > > shl %cl, %r9d # mask of bytes belonging to th
> > > e string
> > > sub %rcx, %rdi # adjust RDI to correspond to R
> > > SI
> > > pxor %xmm1, %xmm1
> > > @@ -223,8 +223,9 @@ ARCHENTRY(__stpncpy, baseline)
> > > 
> > > /* 1--32 bytes to copy, bounce through the stack */
> > > .Lrunt: movdqa %xmm1, bounce+16(%rsp) # clear out rest of on-
> > > stack copy
> > > - bts %r10d, %r8d # treat end of buffer as end of
> > > string
> > > - and %r9w, %r8w # end of string within first bu
> > > ffer?
> > > + bts %r10, %r8 # treat end of buffer as end of
> > > string
> > > + and %r9d, %r8d # mask out head before string
> > > + test $0x1ffff, %r8d # end of string within first ch
> > > unk or right after?
> > > jnz 0f # if yes, do not inspect second
> > > buffer
> > > 
> > > movdqa 16(%rsi), %xmm0 # load second chunk of input
> > >
> >
> > I've opened PR/291720 regarding a significant regression caused by this 
> > commit. It affects my older machines, resulting in enviornment (getenv) 
> > corruption. It does not affect my newer (and with more RAM) machines.
> 
> 
> BTW, this brought my postfix server effectively offline. I have not been 
> able to send or receive email until this commit was reverted locally.
> 
> 
> 
> 
> -- 
> Cheers,
> Cy Schubert <Cy.Schubert@cschubert.com <mailto:Cy.Schubert@cschubert.com>>
> FreeBSD UNIX: <cy@FreeBSD.org <mailto:cy@FreeBSD.org>> Web: https://FreeBSD.org <https://FreeBSD.org>
> NTP: <cy@nwtime.org <mailto:cy@nwtime.org>> Web: https://nwtime.org <https://nwtime.org>
> 
> 
> e**(i*pi)+1=0
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 

-- 
()  ascii ribbon campaign - for an encoding-agnostic world
/\  - against html email  - against proprietary attachments