freebsd 6x slower than Raspbian 10 buster on RPi 4b
Mark Millard
marklmi at yahoo.com
Mon Feb 8 22:05:59 UTC 2021
On 2021-Feb-8, at 13:51, Mark Millard <marklmi at yahoo.com> wrote:
> On 2021-Feb-8, at 13:42, Mark Millard <marklmi at yahoo.com> wrote:
>
>> On 2021-Feb-8, at 12:12, Elwood Downey <elwood.downey at gmail.com> wrote:
>>
>>> Hello all!
>>>
>>> Just wanted to share a comparison I did between freebsd and raspbian on the
>>> same RPi 4b with 1 GB RAM. I wrote a tiny C++ program that creates
>>> pthreads, each of which mallocs an array and spins filling it with sqrtf of
>>> the array index. Setting it to 3 threads (the hw has 4 cores), I found
>>> freebsd takes consistently 6.5x wall-clock time longer than with raspbian.
>>> Below are the sessions for each showing pertinent details. Attached is the
>>> program itself (if it doesn't make it through the newsgroup, mail me direct
>>> for a copy). One good news is the thread overhead for freebsd is about 100x
>>> smaller so kudos to the scheduler.
>>>
>>> This is surprising and disappointing. Any comments welcome, especially what
>>> I'm doing wrong here. Thank you for your time.
>>>
>>> Elwood Downey
>>> Tucson AZ
>>>
>>>
>>>
>>> *Raspbian:*
>>>
>>> pi at hamclock:~$ uname -a
>>> Linux hamclock 5.4.83-v7l+ #1379 SMP Mon Dec 14 13:11:54 GMT 2020 armv7l
>>> GNU/Linux
>>> pi at hamclock:~$ g++ --version
>>> g++ (Raspbian 8.3.0-6+rpi1) 8.3.0
>>> Copyright (C) 2018 Free Software Foundation, Inc.
>>> This is free software; see the source for copying conditions. There is NO
>>> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>>> pi at hamclock:~$ g++ -Wall -o pthread_bench{,.cpp} -lpthread -lm
>>> pi at hamclock:~$ ./pthread_bench 10000 3
>>> tot thr : 4.917360
>>> mean thr: 1.639120
>>> tot wall: 1.726206
>>> thr gain: 2.84865
>>> overhead: 5.04494 %
>>>
>>>
>>> *Freebsd:*
>>>
>>> [ecdowney at freebsdpi ~]$ uname -a
>>> FreeBSD freebsdpi 13.0-CURRENT FreeBSD 13.0-CURRENT #0
>>> main-c255641-gf2b794e1e90: Thu Jan 7 08:00:13 UTC 2021
>>> root at releng1.nyi.freebsd.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC
>>> arm64
>>> [ecdowney at freebsdpi ~]$ g++ --version
>>> g++ (FreeBSD Ports Collection) 10.2.0
>>> Copyright (C) 2020 Free Software Foundation, Inc.
>>> This is free software; see the source for copying conditions. There is NO
>>> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>>> [ecdowney at freebsdpi ~]$ g++ -Wall -o pthread_bench{,.cpp} -lpthread -lm
>>> [ecdowney at freebsdpi ~]$ sysctl dev.cpu.0.freq
>>> dev.cpu.0.freq: 1500
>>> [ecdowney at freebsdpi ~]$ ./pthread_bench 10000 3
>>> tot thr : 33.810808
>>> mean thr: 11.270269
>>> tot wall: 11.277030
>>> thr gain: 2.9982
>>> overhead: 0.0599537 %
>>> <pthread_bench.cpp>
>>> _______________________________________________
>>> freebsd-arm at freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-arm
>>> To unsubscribe, send any mail to "freebsd-arm-unsubscribe at freebsd.org"
>>>
>>
>> One issue is default optimization level vs. using a
>> specific controlled level:
>>
>> # g++10 -Wall -o pthread_bench pthread_bench.cpp -lpthread -lm
>> # ./pthread_bench 10000 3
>> tot thr : 25.900658
>> mean thr: 8.633552
>> tot wall: 8.633356
>> thr gain: 3.00007
>> overhead: -0.00227026 %
>>
>> # g++10 -Wall -O2 -o pthread_bench pthread_bench.cpp -lpthread -lm
>> # ./pthread_bench 10000 3
>> tot thr : 1.133682
>> mean thr: 0.377894
>> tot wall: 0.376152
>> thr gain: 3.01389
>> overhead: -0.463111 %
>>
>> (I'm not certain that the FreeBSD gcc port and the Linux
>> g++ were configured the same way when built, or that they
>> use the same default optimizations.)
>>
>> # g++10 -v
>> Using built-in specs.
>> COLLECT_GCC=g++10
>> COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc10/gcc/aarch64-portbld-freebsd14.0/10.2.0/lto-wrapper
>> Target: aarch64-portbld-freebsd14.0
>> Configured with: /wrkdirs/usr/ports/lang/gcc10/work/gcc-10.2.0/configure --disable-multilib --disable-bootstrap --disable-nls --enable-gnu-indirect-function --enable-plugin --libdir=/usr/local/lib/gcc10 --libexecdir=/usr/local/libexec/gcc10 --program-suffix=10 --with-as=/usr/local/bin/as --with-gmp=/usr/local --with-gxx-include-dir=/usr/local/lib/gcc10/include/c++/ --with-ld=/usr/local/bin/ld --with-pkgversion='FreeBSD Ports Collection' --with-system-zlib --enable-languages=c,c++,objc,fortran --prefix=/usr/local --localstatedir=/var --mandir=/usr/local/man --infodir=/usr/local/share/info/gcc10 --build=aarch64-portbld-freebsd14.0
>> Thread model: posix
>> Supported LTO compression algorithms: zlib
>> gcc version 10.2.0 (FreeBSD Ports Collection)
>>
>>
>> Another issue is g++ and libstdc++ vs. clang++ (system c++)
>> and (system) libc++. So trying system clang and libc++:
>>
>> # c++ -Wall -o pthread_bench pthread_bench.cpp -lpthread -lm
>> # ./pthread_bench 10000 3
>> tot thr : 2.525239
>> mean thr: 0.841746
>> tot wall: 0.849135
>> thr gain: 2.9739
>> overhead: 0.87018 %
>>
>> # c++ -Wall -O2 -o pthread_bench pthread_bench.cpp -lpthread -lm
>> # ./pthread_bench 10000 3
>> tot thr : 0.000000
>> mean thr: 0.000000
>> tot wall: 0.000369
>> thr gain: 0
>> overhead: 100 %
>>
>> That last is because the compiler optimized run(. . .) down
>> to just:
>>
>> 0000000000400a24 <_Z3runPv> mov x0, xzr
>> 0000000000400a28 <_Z3runPv+0x4> ret
>>
>> The source code needs to do something to prevent
>> the compiler from optimizing out currently unused
>> computations.
>>
>> Enabling more compiler warnings also
>> produces notices like:
>>
>> g++:
>> pthread_bench.cpp: In function 'void* run(void*)':
>> pthread_bench.cpp:18:18: warning: unused parameter 'dummy' [-Wunused-parameter]
>> 18 | void *run (void *dummy)
>> | ~~~~~~^~~~~
>> pthread_bench.cpp: In function 'int main(int, char**)':
>> pthread_bench.cpp:41:19: warning: ISO C++ forbids variable length array 'tid' [-Wvla]
>> 41 | pthread_t tid[n_th];
>> | ^~~
>>
>> clang++:
>> pthread_bench.cpp:18:18: warning: unused parameter 'dummy' [-Wunused-parameter]
>> void *run (void *dummy)
>> ^
>> pthread_bench.cpp:41:22: warning: variable length arrays are a C99 feature [-Wvla-extension]
>> pthread_t tid[n_th];
>> ^
>>
>>
>> FYI: Here is a mix of using g++10 but with the FreeBSD
>> system libc++ instead of gcc's libstdc++ :
>>
>> . . .
I messed that up: it was still using libstdc++ when I
checked with ldd. Trying again, this time with the required
linker-related command-line options as well:
# g++10 -Wno-psabi -nostdinc -nostdinc++ -I/usr/include/c++/v1 -I/usr/include -mno-outline-atomics -nodefaultlibs -lc++ -lcxxrt -lthr -lm -lc -lgcc_s -Wl,-rpath=/usr/local/lib/gcc10 -flto -Wall -O2 -o pthread_bench pthread_bench.cpp -lpthread -lm
# ldd pthread_bench
pthread_bench:
libc++.so.1 => /usr/lib/libc++.so.1 (0x4047f000)
libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x40579000)
libthr.so.3 => /lib/libthr.so.3 (0x405c8000)
libm.so.5 => /lib/libm.so.5 (0x40624000)
libc.so.7 => /lib/libc.so.7 (0x40690000)
libgcc_s.so.1 => /usr/local/lib/gcc10/libgcc_s.so.1 (0x40aae000)
# ./pthread_bench 10000 3
tot thr : 1.131916
mean thr: 0.377305
tot wall: 0.376065
thr gain: 3.00989
overhead: -0.32973 %
>> My FreeBSD context on the RPi4B is based on non-debug
>> builds of main (14-CURRENT at this point):
>>
>> # ~/fbsd-based-on-what-freebsd-main.sh
>> merge-base: 847dfd2803f6c8b077e3ebc68e35adff2c79a65f
>> merge-base: CommitDate: 2021-02-03 21:24:22 +0000
>> 325d7069b027 (HEAD -> mm-src) mm-src snapshot for mm's patched build in git context.
>> 847dfd2803f6 (freebsd/main, freebsd/HEAD, pure-src, main) readelf: do not trucate section name with -W
>> FreeBSD RPi4B 14.0-CURRENT FreeBSD 14.0-CURRENT mm-src-n244624-325d7069b027 GENERIC-NODBG arm64 aarch64 1400003 1400003
>>
>> It is a tailored build for cortex-a72 via -mcpu=
>> use. The RPi4B's config.txt has:
>>
>> over_voltage=6
>> arm_freq=2000
>> arm_freq_min=2000
>> sdram_freq_min=3200
>>
>> FYI:
>>
>> # sysctl hw.physmem
>> hw.physmem: 8465969152
>>
>
> I forgot to note that the RPi4B has heatsinks and
> a fan and has a good 5.1A 3.5A power supply.
Trying again: 5.1V 3.5A
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)