Re: git: 32a2fed6e71f - stable/13 - openssl: Fix detection of ARMv7 and ARM64 CPU features

From: Emmanuel Vadot <manu_at_bidouilliste.com>
Date: Wed, 24 Nov 2021 08:30:16 UTC
On Tue, 23 Nov 2021 20:36:40 +0100 (CET)
freebsd@oldach.net (Helge Oldach) wrote:

> Allan Jude wrote on Tue, 23 Nov 2021 20:14:53 +0100 (CET):
> > On 11/23/2021 5:00 AM, Helge Oldach wrote:
> > > Allan Jude wrote on Mon, 22 Nov 2021 19:14:13 +0100 (CET):
> > > Hmmm. On a RPi4/8G:
> > > 
> > > Before (FreeBSD 13.0-STABLE (GENERIC) #366 stable/13-n248173-d16fbc488e6):
> > > | type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
> > > | aes-256-gcm      35791.98k    38533.57k    39986.77k    41397.59k    39840.43k    39638.36k
> > > 
> > > After (FreeBSD 13.0-STABLE (GENERIC) #367 stable/13-n248176-f085bb0e621)
> > > 
> > > | type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
> > > | aes-256-gcm      21277.62k    23226.64k    23613.90k    23687.51k    23892.93k    23947.95k
> > > 
> > > It seems that AES throughput is actually cut by almost half?
> > 
> > Do you know which of the CPU optimizations your RPi4 supports?
> 
> Is this what you need?
> 
>  Instruction Set Attributes 0 = <CRC32>

 So the RPi4 has neither the AES nor the PMULL instructions; I guess that
OpenSSL uses them for aes-gcm.
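
 For reference, OPENSSL_armcap is a bit mask, so the individual features can
be switched on and off rather than just setting everything to 0. A rough
sketch for reading such a mask, assuming the bit assignments from
crypto/arm_arch.h in OpenSSL 1.1.1 (NEON = bit 0, TICK = bit 1, AES = bit 2,
SHA1 = bit 3, SHA256 = bit 4, PMULL = bit 5, SHA512 = bit 6). decode_armcap
is a made-up helper name, not something OpenSSL ships:

```sh
# Hypothetical helper: print the feature names set in an OPENSSL_armcap mask.
# Bit order is assumed from OpenSSL 1.1.1 crypto/arm_arch.h.
decode_armcap() {
    mask=$1
    out=""
    i=0
    for name in NEON TICK AES SHA1 SHA256 PMULL SHA512; do
        # Test bit i of the mask and collect the matching name.
        if [ $(( (mask >> i) & 1 )) -eq 1 ]; then
            out="${out:+$out }$name"
        fi
        i=$((i + 1))
    done
    echo "${out:-none}"
}

decode_armcap 0     # prints: none
decode_armcap 37    # prints: NEON AES PMULL
```

 If that layout holds, something like env OPENSSL_armcap=1 (NEON only) on the
RPi4 might help narrow down which code path is the slow one.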

 I wonder what it used before that gave it this boost.

 On my rockpro64 I do see the improvement, btw:
root@generic:~ # cpuset -l 4,5 openssl speed -evp aes-256-gcm
...
aes-256-gcm     122861.59k   337938.39k   565408.44k   661223.09k   709175.19k   712327.25k
root@generic:~ # cpuset -l 4,5 env OPENSSL_armcap=0 openssl speed -evp aes-256-gcm
...
aes-256-gcm      34068.11k    38068.62k    39435.24k    39818.75k    39905.34k    39922.35k

 Running on the big cores at max freq.
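
 To put numbers on it, here is a small sketch that divides the 16384-byte
figures quoted in this mail (my two rockpro64 runs, and Helge's two RPi4 runs
after the commit). ratio is just a hypothetical helper around awk:

```sh
# Hypothetical helper: print a / b with two decimals.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# rockpro64: detected features vs OPENSSL_armcap=0
ratio 712327.25 39922.35    # prints: 17.84

# RPi4 after the commit: detected features vs OPENSSL_armcap=0
ratio 23916.23 41601.52     # prints: 0.57
```

 So roughly an 18x gain with the crypto extensions on the rockpro64, against
a drop to a bit under 60% on the RPi4.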

>  Instruction Set Attributes 1 = <>
>          Processor Features 0 = <AdvSIMD,FP,EL3 32,EL2 32,EL1 32,EL0 32>
>          Processor Features 1 = <>
>       Memory Model Features 0 = <TGran4,TGran64,SNSMem,BigEnd,16bit ASID,16TB PA>
>       Memory Model Features 1 = <8bit VMID>
>       Memory Model Features 2 = <32bit CCIDX,48bit VA>
>              Debug Features 0 = <DoubleLock,2 CTX BKPTs,4 Watchpoints,6 Breakpoints,PMUv3,Debugv8>
>              Debug Features 1 = <>
>          Auxiliary Features 0 = <>
>          Auxiliary Features 1 = <>
> AArch32 Instruction Set Attributes 5 = <CRC32,SEVL>
> AArch32 Media and VFP Features 0 = <FPRound,FPSqrt,FPDivide,DP VFPv3+v4,SP VFPv3+v4,AdvSIMD>
> AArch32 Media and VFP Features 1 = <SIMDFMAC,FPHP DP Conv,SIMDHP SP Conv,SIMDSP,SIMDInt,SIMDLS,FPDNaN,FPFtZ>
> 
> > You can set the environment variable OPENSSL_armcap to override 
> > OpenSSL's detection.
> > 
> > Try: env OPENSSL_armcap=0 openssl speed -evp aes-256-gcm
> 
> On FreeBSD 13.0-STABLE (GENERIC) #367 stable/13-n248176-f085bb0e621 again (i.e. after this commit):
> 
> hmo@p48 ~ $ env OPENSSL_armcap=0 openssl speed -evp aes-256-gcm
> Doing aes-256-gcm for 3s on 16 size blocks: 6445704 aes-256-gcm's in 3.08s
> Doing aes-256-gcm for 3s on 64 size blocks: 1861149 aes-256-gcm's in 3.00s
> Doing aes-256-gcm for 3s on 256 size blocks: 479664 aes-256-gcm's in 3.01s
> Doing aes-256-gcm for 3s on 1024 size blocks: 122853 aes-256-gcm's in 3.04s
> Doing aes-256-gcm for 3s on 8192 size blocks: 15181 aes-256-gcm's in 3.00s
> Doing aes-256-gcm for 3s on 16384 size blocks: 7796 aes-256-gcm's in 3.07s
> OpenSSL 1.1.1l-freebsd  24 Aug 2021
> built on: reproducible build, date unspecified
> options:bn(64,64) rc4(int) des(int) aes(partial) idea(int) blowfish(ptr)
> compiler: clang
> The 'numbers' are in 1000s of bytes per second processed.
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
> aes-256-gcm      33504.57k    39704.51k    40825.01k    41394.83k    41454.25k    41601.52k
> hmo@p48 ~ $ openssl speed -evp aes-256-gcm
> Doing aes-256-gcm for 3s on 16 size blocks: 4066201 aes-256-gcm's in 3.00s
> Doing aes-256-gcm for 3s on 64 size blocks: 1087387 aes-256-gcm's in 3.00s
> Doing aes-256-gcm for 3s on 256 size blocks: 280110 aes-256-gcm's in 3.03s
> Doing aes-256-gcm for 3s on 1024 size blocks: 70412 aes-256-gcm's in 3.04s
> Doing aes-256-gcm for 3s on 8192 size blocks: 8762 aes-256-gcm's in 3.00s
> Doing aes-256-gcm for 3s on 16384 size blocks: 4402 aes-256-gcm's in 3.02s
> OpenSSL 1.1.1l-freebsd  24 Aug 2021
> built on: reproducible build, date unspecified
> options:bn(64,64) rc4(int) des(int) aes(partial) idea(int) blowfish(ptr)
> compiler: clang
> The 'numbers' are in 1000s of bytes per second processed.
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
> aes-256-gcm      21686.41k    23197.59k    23656.30k    23725.04k    23926.10k    23916.23k
> hmo@p48 ~ $
> 
> Kind regards,
> Helge


-- 
Emmanuel Vadot <manu@bidouilliste.com> <manu@freebsd.org>