git: 207f3b2b25ea - main - libmd: Fix amd64 AVX2 SHA-1 transcription errors

From: Jessica Clarke <jrtc27_at_FreeBSD.org>
Date: Tue, 03 Jun 2025 01:51:44 UTC
The branch main has been updated by jrtc27:

URL: https://cgit.FreeBSD.org/src/commit/?id=207f3b2b25eaa0f9d32699e664b139e5e40e5450

commit 207f3b2b25eaa0f9d32699e664b139e5e40e5450
Author:     Jessica Clarke <jrtc27@FreeBSD.org>
AuthorDate: 2025-06-03 01:46:57 +0000
Commit:     Jessica Clarke <jrtc27@FreeBSD.org>
CommitDate: 2025-06-03 01:46:57 +0000

    libmd: Fix amd64 AVX2 SHA-1 transcription errors
    
    This source was manually transcribed from Go's assembly syntax into
    FreeBSD's. Some intended differences exist, e.g. around stack frame
    allocation, and some upstream LEAL instructions were replaced with
    ADDL here because getting the 64-bit super-register of a 32-bit
    register isn't so doable, unlike in Go. A few errors also crept in.
    Fix these. They were found by comparing post-processed disassembly[1]
    of a built copy of the corresponding Go source against ours. The
    post-processing handles the ADDL difference above, and the fact that
    Go's assembler does not optimise VP[X]OR encoding by commuting
    operands when doing so would give rise to a 2-byte VEX prefix.
    
    [1] In Vim:
        %g/\<vpx\?or\>/s/\(%ymm\([89]\|1[0-5]\)\), %ymm\([0-7]\), %ymm/%ymm\3, \1, %ymm/g
        (to commute the VP[X]OR operands as LLVM does)
        %s/\<leal\>\([[:space:]]\+\)(%r\(..\),%r\(..\)), %e\2/addl\1%e\3, %e\2/
        (to convert LEAL to ADDL in the cases we do)
        %s/%e12\>/%r12d/g
        (as the previous conversion turns %r12 into %e12 not %r12d)
    
    Fixes:  8b4684afcde3 ("lib/libmd: add optimised SHA1 implementations for amd64")
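
For readers who would rather script the normalisation than run it in Vim, the three substitutions in footnote [1] can be sketched in Python roughly as follows. This is not part of the commit; the function name `normalize_disassembly` is hypothetical, and the sketch assumes the same textual disassembly format the Vim patterns expect (`, ` between operands, no space inside the LEAL memory operand).

```python
import re


def normalize_disassembly(line: str) -> str:
    """Apply the three substitutions from footnote [1] to one line."""
    # 1. On VPOR/VPXOR lines only, swap a high first source (%ymm8-15)
    #    with a low second source (%ymm0-7), mirroring the Vim :g command.
    if re.search(r"\bvpx?or\b", line):
        line = re.sub(
            r"(%ymm(?:[89]|1[0-5])), %ymm([0-7]), %ymm",
            r"%ymm\2, \1, %ymm",
            line,
        )
    # 2. leal (%rA,%rB), %eA  ->  addl %eB, %eA
    #    (first match per line only, as the Vim command has no /g flag).
    line = re.sub(
        r"\bleal\b(\s+)\(%r(..),%r(..)\), %e\2",
        r"addl\1%e\3, %e\2",
        line,
        count=1,
    )
    # 3. Step 2 turns %r12 into the non-existent %e12; spell it %r12d.
    return re.sub(r"%e12\b", "%r12d", line)
```

Applied line by line to the disassembly of the Go build, this should leave only genuine differences when diffed against the disassembly of sha1block.S.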
---
 lib/libmd/amd64/sha1block.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/libmd/amd64/sha1block.S b/lib/libmd/amd64/sha1block.S
index 0307dcdece32..f1291ef2647a 100644
--- a/lib/libmd/amd64/sha1block.S
+++ b/lib/libmd/amd64/sha1block.S
@@ -1220,7 +1220,7 @@ END(_libmd_sha1block_scalar)
 
 .macro	calc116
 	calc_f2_pre	0x130, %eax, %edx, %edi
-	precalc37	%ymm5
+	precalc36	%ymm5
 	calc_f2_post	%eax, %ecx, %ebx, %edi
 .endm
 
@@ -1354,7 +1354,7 @@ END(_libmd_sha1block_scalar)
 .endm
 
 .macro	calc139
-	calc_f2_pre	0x1cc, %edx, %ecx, %eax
+	calc_f2_pre	0x1dc, %edx, %ecx, %eax
 	precalc35	%ymm14
 	calc_f2_post	%edx, %ebx, %esi, %eax
 .endm
@@ -1586,7 +1586,7 @@ ENTRY(_libmd_sha1block_avx2)
 
 	add		$128, %r10		// move to the next even-64-byte block
 	cmp		%r11, %r10		// is the current block the last one?
-	cmovae		%r10, %r8		// signal the last iteration smartly
+	cmovae		%r8, %r10		// signal the last iteration smartly
 
 	calc60
 	calc61