Re: git: 32a2fed6e71f - stable/13 - openssl: Fix detection of ARMv7 and ARM64 CPU features
Date: Wed, 24 Nov 2021 08:30:16 UTC
On Tue, 23 Nov 2021 20:36:40 +0100 (CET)
freebsd@oldach.net (Helge Oldach) wrote:

> Allan Jude wrote on Tue, 23 Nov 2021 20:14:53 +0100 (CET):
> > On 11/23/2021 5:00 AM, Helge Oldach wrote:
> > > Allan Jude wrote on Mon, 22 Nov 2021 19:14:13 +0100 (CET):
> > > Hmmm. On a RPi4/8G:
> > >
> > > Before (FreeBSD 13.0-STABLE (GENERIC) #366 stable/13-n248173-d16fbc488e6):
> > > | type          16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes 16384 bytes
> > > | aes-256-gcm  35791.98k   38533.57k   39986.77k   41397.59k   39840.43k   39638.36k
> > >
> > > After (FreeBSD 13.0-STABLE (GENERIC) #367 stable/13-n248176-f085bb0e621)
> > >
> > > | type          16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes 16384 bytes
> > > | aes-256-gcm  21277.62k   23226.64k   23613.90k   23687.51k   23892.93k   23947.95k
> > >
> > > It seems that AES throughput is actually cut by almost half?
> >
> > Do you know which of the CPU optimizations your RPi4 supports?
>
> Is this what you need?
>
> Instruction Set Attributes 0 = <CRC32>

So there are no AES or PMULL instructions on the RPi4; I guess that
openssl uses them for aes-gcm.
I wonder what it was using before the commit that gave it this boost.

On my rockpro64 I do see the improvement, btw:

root@generic:~ # cpuset -l 4,5 openssl speed -evp aes-256-gcm
...
aes-256-gcm  122861.59k  337938.39k  565408.44k  661223.09k  709175.19k  712327.25k

root@generic:~ # cpuset -l 4,5 env OPENSSL_armcap=0 openssl speed -evp aes-256-gcm
...
aes-256-gcm   34068.11k   38068.62k   39435.24k   39818.75k   39905.34k   39922.35k

Both runs were on the big cores at max frequency.
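To put rough numbers on the two comparisons above, here is a minimal Python sketch. The throughput values are copied from the 16384-byte column of the quoted openssl speed output; the function and variable names are only illustrative, not anything from OpenSSL or FreeBSD.

```python
# Throughput in 1000s of bytes per second, 16384-byte column,
# copied from the openssl speed output quoted in this thread.
rpi4_before = 39638.36    # RPi4, build #366 (before the commit)
rpi4_after = 23947.95     # RPi4, build #367 (after the commit)
rockpro64_hw = 712327.25  # rockpro64, default capability detection
rockpro64_sw = 39922.35   # rockpro64, OPENSSL_armcap=0


def relative_change(old, new):
    """Fractional change going from old to new (negative means slower)."""
    return (new - old) / old


def speedup(slow, fast):
    """How many times faster the fast run is than the slow one."""
    return fast / slow


print(f"RPi4 change: {relative_change(rpi4_before, rpi4_after):+.1%}")
print(f"rockpro64 speedup: {speedup(rockpro64_sw, rockpro64_hw):.1f}x")
# prints:
# RPi4 change: -39.6%
# rockpro64 speedup: 17.8x
```

So on these figures the RPi4 loses about 40% at large block sizes (a bit less than "almost half"), while the rockpro64 gains close to 18x with the crypto extensions enabled.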
> Instruction Set Attributes 1 = <>
> Processor Features 0 = <AdvSIMD,FP,EL3 32,EL2 32,EL1 32,EL0 32>
> Processor Features 1 = <>
> Memory Model Features 0 = <TGran4,TGran64,SNSMem,BigEnd,16bit ASID,16TB PA>
> Memory Model Features 1 = <8bit VMID>
> Memory Model Features 2 = <32bit CCIDX,48bit VA>
> Debug Features 0 = <DoubleLock,2 CTX BKPTs,4 Watchpoints,6 Breakpoints,PMUv3,Debugv8>
> Debug Features 1 = <>
> Auxiliary Features 0 = <>
> Auxiliary Features 1 = <>
> AArch32 Instruction Set Attributes 5 = <CRC32,SEVL>
> AArch32 Media and VFP Features 0 = <FPRound,FPSqrt,FPDivide,DP VFPv3+v4,SP VFPv3+v4,AdvSIMD>
> AArch32 Media and VFP Features 1 = <SIMDFMAC,FPHP DP Conv,SIMDHP SP Conv,SIMDSP,SIMDInt,SIMDLS,FPDNaN,FPFtZ>
>
> > You can set the environment variable OPENSSL_armcap to override
> > OpenSSL's detection.
> >
> > Try: env OPENSSL_armcap=0 openssl speed -evp aes-256-gcm
>
> On FreeBSD 13.0-STABLE (GENERIC) #367 stable/13-n248176-f085bb0e621 again (i.e. after this commit):
>
> hmo@p48 ~ $ env OPENSSL_armcap=0 openssl speed -evp aes-256-gcm
> Doing aes-256-gcm for 3s on 16 size blocks: 6445704 aes-256-gcm's in 3.08s
> Doing aes-256-gcm for 3s on 64 size blocks: 1861149 aes-256-gcm's in 3.00s
> Doing aes-256-gcm for 3s on 256 size blocks: 479664 aes-256-gcm's in 3.01s
> Doing aes-256-gcm for 3s on 1024 size blocks: 122853 aes-256-gcm's in 3.04s
> Doing aes-256-gcm for 3s on 8192 size blocks: 15181 aes-256-gcm's in 3.00s
> Doing aes-256-gcm for 3s on 16384 size blocks: 7796 aes-256-gcm's in 3.07s
> OpenSSL 1.1.1l-freebsd 24 Aug 2021
> built on: reproducible build, date unspecified
> options:bn(64,64) rc4(int) des(int) aes(partial) idea(int) blowfish(ptr)
> compiler: clang
> The 'numbers' are in 1000s of bytes per second processed.
> type          16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes 16384 bytes
> aes-256-gcm  33504.57k   39704.51k   40825.01k   41394.83k   41454.25k   41601.52k
> hmo@p48 ~ $ openssl speed -evp aes-256-gcm
> Doing aes-256-gcm for 3s on 16 size blocks: 4066201 aes-256-gcm's in 3.00s
> Doing aes-256-gcm for 3s on 64 size blocks: 1087387 aes-256-gcm's in 3.00s
> Doing aes-256-gcm for 3s on 256 size blocks: 280110 aes-256-gcm's in 3.03s
> Doing aes-256-gcm for 3s on 1024 size blocks: 70412 aes-256-gcm's in 3.04s
> Doing aes-256-gcm for 3s on 8192 size blocks: 8762 aes-256-gcm's in 3.00s
> Doing aes-256-gcm for 3s on 16384 size blocks: 4402 aes-256-gcm's in 3.02s
> OpenSSL 1.1.1l-freebsd 24 Aug 2021
> built on: reproducible build, date unspecified
> options:bn(64,64) rc4(int) des(int) aes(partial) idea(int) blowfish(ptr)
> compiler: clang
> The 'numbers' are in 1000s of bytes per second processed.
> type          16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes 16384 bytes
> aes-256-gcm  21686.41k   23197.59k   23656.30k   23725.04k   23926.10k   23916.23k
> hmo@p48 ~ $
>
> Kind regards,
> Helge

-- 
Emmanuel Vadot <manu@bidouilliste.com> <manu@freebsd.org>