RPi4B buildworld buildkernel times for already installed system being -mcpu=cortex-a72 vs. -mcpu=cortex-a53 based

Mark Millard marklmi at yahoo.com
Wed Sep 23 03:17:04 UTC 2020



On 2020-Sep-20, at 18:40, Mark Millard <marklmi at yahoo.com> wrote:

> On 2020-Sep-20, at 18:32, Mark Millard <marklmi at yahoo.com> wrote:
> 
>> The following are from scratch buildworld buildkernel rebuilds
>> on a RPi4B (head -r363590 context).
>> 
>> ENVIRONMENT: -mcpu=cortex-a72 based world and kernel running already, RPi4B @ 2G Hz,
>> Restricted to 3 GiByte RAM, -j3:
>> 
>> World built in 37469 seconds, ncpu: 4, make -j3
>> Kernel(s)  GENERIC-NODBG built in 2474 seconds, ncpu: 4, make -j3
>> 
>> ENVIRONMENT: -mcpu=cortex-a53 based kernel running, RPi4B @ 2G Hz,
>> Restricted to 3 GiByte RAM, -j3:
>> 
>> World built in 44034 seconds, ncpu: 4, make -j3
>> Kernel(s)  GENERIC-NODBG built in 2895 seconds, ncpu: 4, make -j3
>> 
>> So a little under 11.1 hr total vs. a little over 13.0 hr total,
>> a somewhat over 50 min improvement.
> 
> "a somewhat over 1hr 50 min improvement" is what I should have
> managed to type.
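
For reference, the corrected figure can be rechecked from the four reported build times. The following is only a small sketch of that arithmetic in Python; the variable and function names are illustrative assumptions, and only the four second counts come from the reports quoted above.

```python
# Build times in seconds, copied from the quoted "World built in" and
# "Kernel(s) GENERIC-NODBG built in" lines. Names here are illustrative.
a72 = {"world": 37469, "kernel": 2474}
a53 = {"world": 44034, "kernel": 2895}


def total_seconds(times):
    """Sum of the buildworld and buildkernel times for one environment."""
    return times["world"] + times["kernel"]


a72_total = total_seconds(a72)  # 39943 seconds
a53_total = total_seconds(a53)  # 46929 seconds
saved = a53_total - a72_total   # 6986 seconds

# Split the saving into whole hours and whole minutes.
saved_hr, remainder = divmod(saved, 3600)
saved_min = remainder // 60

print(f"cortex-a72 total: {a72_total / 3600:.2f} hr")
print(f"cortex-a53 total: {a53_total / 3600:.2f} hr")
print(f"saved: {saved_hr} hr {saved_min} min "
      f"({100 * saved / a53_total:.1f}% of the cortex-a53 total)")
```

This agrees with "a little under 11.1 hr" vs. "a little over 13.0 hr" and gives a difference of 1 hr 56 min, i.e. somewhat over 1 hr 50 min.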

Some experiments indicate that the faster result may be rather
dependent on clang 10 -O vs. clang 11 -O vs. -O2 use, as well as
on use of -mcpu=cortex-a72 . Jumping from clang 10 -O to clang 11
-O2 for the FreeBSD kernel build in use looks like it might revert
to more like the older times for buildworld buildkernel. (clang 11
-O is -O1 instead of the historical -O2 .) But I've not rerun any
build tests to know for sure.

clang 11's use of -O to mean -O1 is causing FreeBSD kernel build
problems when DEBUG is defined, from lack of inlining in some
environments. FreeBSD may switch to use of -O2 explicitly for all
platforms.

(I build non-debug [no witness and such] with DEBUG=-g forced.
My context currently forces -O2 because powerpc64 has the
inlining problem and I'm checking whether the uniform setting
works across everything I have access to.)

>> (A xhci patch finally allowed me to boot -mcpu=cortex-a72
>> based kernel builds on the RPi4B: The xhci event ring
>> initialization code was missing a usb_bus_mem_flush_all
>> call previously.)
>> 
>> 
>> Supporting details:
>> 
>> (e-mail based spacing changes expected below)
>> 
>> # diff -u ~/src.configs/src.conf.cortexA72-clang-bootstrap.aarch64-host ~/src.configs/src.conf.cortexA53-clang-bootstrap.aarch64-host
>> --- /root/src.configs/src.conf.cortexA72-clang-bootstrap.aarch64-host	2020-03-13 22:29:25.470155000 -0700
>> +++ /root/src.configs/src.conf.cortexA53-clang-bootstrap.aarch64-host	2020-03-13 22:29:25.469455000 -0700
>> @@ -49,9 +49,9 @@
>> # Use of the .clang 's here avoids
>> # interfering with other C<?>FLAGS
>> # usage, such as ?= usage.
>> -CFLAGS.clang+= -mcpu=cortex-a72
>> -CXXFLAGS.clang+= -mcpu=cortex-a72
>> -CPPFLAGS.clang+= -mcpu=cortex-a72
>> -ACFLAGS.arm64cpuid.S+=  -mcpu=cortex-a72+crypto
>> -ACFLAGS.aesv8-armx.S+=  -mcpu=cortex-a72+crypto
>> -ACFLAGS.ghashv8-armx.S+=        -mcpu=cortex-a72+crypto
>> +CFLAGS.clang+= -mcpu=cortex-a53
>> +CXXFLAGS.clang+= -mcpu=cortex-a53
>> +CPPFLAGS.clang+= -mcpu=cortex-a53
>> +ACFLAGS.arm64cpuid.S+=  -mcpu=cortex-a53+crypto
>> +ACFLAGS.aesv8-armx.S+=  -mcpu=cortex-a53+crypto
>> +ACFLAGS.ghashv8-armx.S+=        -mcpu=cortex-a53+crypto
>> 
>> 
>> The .amd64-host files are similar for doing cross builds.
>> 
>> I also use += in secure/lib/libcrypto/Makefile :
>> 
>> # svnlite diff /usr/src/secure/lib/libcrypto/Makefile
>> Index: /usr/src/secure/lib/libcrypto/Makefile
>> ===================================================================
>> --- /usr/src/secure/lib/libcrypto/Makefile	(revision 365919)
>> +++ /usr/src/secure/lib/libcrypto/Makefile	(working copy)
>> @@ -20,7 +20,7 @@
>> SRCS+=	o_str.c o_time.c threads_pthread.c uid.c
>> .if defined(ASM_aarch64)
>> SRCS+=	arm64cpuid.S armcap.c
>> -ACFLAGS.arm64cpuid.S=	-march=armv8-a+crypto
>> +ACFLAGS.arm64cpuid.S+=	-march=armv8-a+crypto
>> .elif defined(ASM_amd64)
>> SRCS+=	x86_64cpuid.S
>> .elif defined(ASM_arm)
>> @@ -35,7 +35,7 @@
>> SRCS+=	aes_cbc.c aes_cfb.c aes_ecb.c aes_ige.c aes_misc.c aes_ofb.c aes_wrap.c
>> .if defined(ASM_aarch64)
>> SRCS+=	aes_core.c aesv8-armx.S vpaes-armv8.S
>> -ACFLAGS.aesv8-armx.S=	-march=armv8-a+crypto
>> +ACFLAGS.aesv8-armx.S+=	-march=armv8-a+crypto
>> .elif defined(ASM_amd64)
>> SRCS+=	aes_core.c aesni-mb-x86_64.S aesni-sha1-x86_64.S aesni-sha256-x86_64.S
>> SRCS+=	aesni-x86_64.S vpaes-x86_64.S
>> @@ -242,7 +242,7 @@
>> SRCS+=	ofb128.c wrap128.c xts128.c
>> .if defined(ASM_aarch64)
>> SRCS+=	ghashv8-armx.S
>> -ACFLAGS.ghashv8-armx.S=	-march=armv8-a+crypto
>> +ACFLAGS.ghashv8-armx.S+=	-march=armv8-a+crypto
>> .elif defined(ASM_amd64)
>> SRCS+=	aesni-gcm-x86_64.S ghash-x86_64.S
>> .elif defined(ASM_arm)
>> 
>> The RPi4B is using:
>> 
>> over_voltage=6
>> arm_freq=2000
>> 
>> and was booted via uefi/ACPI.
>> 
>> I have not repeated the -j4 or other -jN comparisons that
>> I reported in the past. The -mcpu=cortex-a53 figures are
>> from the past.
> 



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


