git: 66eb78377bf1 - main - libc/amd64: fix overread conditions in stpncpy()

From: Robert Clausecker <fuz_at_FreeBSD.org>
Date: Sun, 14 Dec 2025 16:08:17 UTC
The branch main has been updated by fuz:

URL: https://cgit.FreeBSD.org/src/commit/?id=66eb78377bf109af1d9e25626bf254b4369436ec

commit 66eb78377bf109af1d9e25626bf254b4369436ec
Author:     Robert Clausecker <fuz@FreeBSD.org>
AuthorDate: 2025-12-10 20:45:18 +0000
Commit:     Robert Clausecker <fuz@FreeBSD.org>
CommitDate: 2025-12-14 16:06:05 +0000

    libc/amd64: fix overread conditions in stpncpy()
    
    Due to incorrect unit test design, two overread conditions went
    undetected in the amd64 baseline stpncpy() implementation.
    For buffers of 1--16 and 32 bytes that do not contain nul bytes
    and end exactly at a page boundary, the code would incorrectly
    read 16 bytes from the next page, possibly crossing into an
    unmapped page and crashing the program.  If the next page was
    mapped, the code would then proceed with the expected behaviour
    of the stpncpy() function.
    
    Three changes were made to fix the bug:
    
     - an off-by-one error is fixed in the code deciding whether to
       enter the runt case or not, entering it for 0<n<=32 bytes
       instead of 0<n<32 bytes as it was before.
     - in the runt case, the logic to skip reading a second 16-byte
       chunk if the buffer ends in the first chunk was fixed to
       account for buffers that end at a 16-byte boundary but do not
       hold a nul byte.
     - in the runt case, the logic to transform the location of the
       end of the input buffer into a bit mask was fixed to allow
       the case of n==32, which was previously impossible due to the
       incorrect logic for entering said case.
    
    The performance impact should be minimal.
    
    PR:             291359
    See also:       D54169
    Reported by:    Collin Funk <collin.funk1@gmail.com>
    Reviewed by:    getz
    Approved by:    markj (mentor)
    MFC after:      1 week
    Fixes:          90253d49db09a9b1490c448d05314f3e4bbfa468 (D42519)
    Differential Revision:  https://reviews.freebsd.org/D54170
---
 lib/libc/amd64/string/stpncpy.S | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/libc/amd64/string/stpncpy.S b/lib/libc/amd64/string/stpncpy.S
index 5ce0dd093a9e..df22bb9f0c53 100644
--- a/lib/libc/amd64/string/stpncpy.S
+++ b/lib/libc/amd64/string/stpncpy.S
@@ -100,7 +100,7 @@ ARCHENTRY(__stpncpy, baseline)
 	movdqa		(%rsi), %xmm0		# load head
 	and		$0xf, %ecx		# offset from alignment
 	mov		$-1, %r9d
-	lea		-32(%rcx), %rax		# set up overflow-proof comparison rdx+rcx<=32
+	lea		-33(%rcx), %rax		# set up overflow-proof comparison rdx+rcx<=32
 	shl		%cl, %r9d		# mask of bytes belonging to the string
 	sub		%rcx, %rdi		# adjust RDI to correspond to RSI
 	pxor		%xmm1, %xmm1
@@ -223,8 +223,9 @@ ARCHENTRY(__stpncpy, baseline)
 
 	/* 1--32 bytes to copy, bounce through the stack */
 .Lrunt:	movdqa		%xmm1, bounce+16(%rsp)	# clear out rest of on-stack copy
-	bts		%r10d, %r8d		# treat end of buffer as end of string
-	and		%r9w, %r8w		# end of string within first buffer?
+	bts		%r10, %r8		# treat end of buffer as end of string
+	and		%r9d, %r8d		# mask out head before string
+	test		$0x1ffff, %r8d		# end of string within first chunk or right after?
 	jnz		0f			# if yes, do not inspect second buffer
 
 	movdqa		16(%rsi), %xmm0		# load second chunk of input