Comparing the OverDrive 1000 (A57) vs. MACCHIATObin Double Shot (A72) for buildworld and via a CPU/cache/RAM tradeoff-exploring benchmark

Mark Millard marklmi at
Mon Dec 2 22:15:33 UTC 2019

It looks like the OverDrive 1000 vs. MACCHIATObin Double
Shot comparison ends up being an example of memory
access making the difference for the specific workload:
-j4 buildworld for head -r355027 (building itself
from scratch).

buildworld times (not needing a llvm bootstrap build):

OverDrive 1000:           13895 sec (about 3.86 hrs)
MACCHIATObin Double Shot: 16561 sec (about 4.60 hrs)

So a little under 45 min difference when the mean
and geometric mean are both a little over 4.2 hrs.
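
The arithmetic behind those figures can be rechecked
with a short sketch. The two times are the ones
reported above. The function name summarize and the
dictionary layout are just illustrative choices.

```python
import math

# buildworld wall-clock times in seconds, as reported above
times_sec = {
    "OverDrive 1000": 13895,
    "MACCHIATObin Double Shot": 16561,
}


def summarize(times):
    """Return per-system hours, the gap in minutes, and both means in hours."""
    values = list(times.values())
    hours = {name: sec / 3600 for name, sec in times.items()}
    diff_min = (max(values) - min(values)) / 60
    mean_hr = sum(values) / len(values) / 3600
    # Geometric mean computed via logs: exp(mean(log(x)))
    geo_mean_sec = math.exp(sum(math.log(v) for v in values) / len(values))
    return hours, diff_min, mean_hr, geo_mean_sec / 3600


hours, diff_min, mean_hr, geo_mean_hr = summarize(times_sec)
for name, hr in hours.items():
    print(f"{name}: {hr:.2f} hrs")
print(f"difference:     {diff_min:.1f} min")
print(f"mean:           {mean_hr:.2f} hrs")
print(f"geometric mean: {geo_mean_hr:.2f} hrs")
```

With only two samples the mean and geometric mean
land close together, which is why both read as a
little over 4.2 hrs.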

SSD ufs file systems: One with Samsung 860 Pro, the
other with Samsung 850 Pro. I do not expect that I/O
made much of a difference, but I did nothing to measure
such for the buildworld activity.

OverDrive RAM:     8GiByte, half in each of the 2 slots
MACCHIATObin RAM: 16GiByte, all in its 1 slot.

MACCHIATObin: jumpers set for the fastest CPU/RAM
speed for the Double Shot.

A comparison graph from exploring single threaded
and multi-threaded CPU/cache and RAM limited
performance (a variation on the old HINT serial
and pthread benchmarks) is shown at:

There are curves for the various types involved:
double (d), unsigned long long (ull), unsigned
long (ul), unsigned int (ui). Since ull and ul
are the same size in this context, how closely
their curves match provides some evidence of the
run-to-run variability observed.
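
As a side check on why the ull and ul curves are
expected to coincide, the sizes of the four types
can be listed with a small sketch. It reports the
sizes for whatever host runs it, which is an
assumption standing in for the two boards. On an
LP64 system such as FreeBSD on aarch64, ul and ull
are both 8 bytes. The function name type_sizes is
purely illustrative.

```python
import ctypes


def type_sizes():
    """Return the byte size of each benchmarked C type on the current host."""
    return {
        "d": ctypes.sizeof(ctypes.c_double),
        "ull": ctypes.sizeof(ctypes.c_ulonglong),
        "ul": ctypes.sizeof(ctypes.c_ulong),
        "ui": ctypes.sizeof(ctypes.c_uint),
    }


sizes = type_sizes()
for label, nbytes in sizes.items():
    print(f"{label:>3}: {nbytes} bytes")

# When ul and ull have the same size, differences between
# their curves reflect measurement variability, not the type.
print("ul and ull same size:", sizes["ul"] == sizes["ull"])
```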

(The OverDrive and MACCHIATObin were not benchmarked
for the graph at the same version of head: -r352341
based vs. -r355027 based.)

(I did not set things such that the benchmark run
would explore paging getting involved. Thus there
is basically no I/O considered in the comparison.)

The MACCHIATObin clearly wins single threaded and
its memory subsystem was well matched to the single
threaded use when the same involved types are
compared. (Single threaded are the blueish curves,
MACCHIATObin having the lighter colors.)

For multi-threaded in the range where RAM access
limits things, the two systems are a close match.
(Greenish colors, right side of plot, upper

The range where the OverDrive 1000 is clearly faster
is part of the middle of the multi-threaded curves.
(This might be tied to whatever is done with the
dual RAM slot structure or to the amount of caching,
or some such, I do not know the details.)

I would expect "-j1 buildworld" would take less time
on the MACCHIATObin than on the OverDrive, but I'm
not planning on measuring that.

A more historical comparison, old PowerMac11,2
(2 sockets, 2 cores each) vs. the MACCHIATObin,
both having 16 GiBytes of RAM:

For analogous benchmark graphs (matching types),
the MACCHIATObin single threaded is faster than
the old PowerMac11,2 single threaded and is
usually faster than that 11,2's multi-threaded
benchmark data as well. Multi-threaded, the
MACCHIATObin is faster across the range the
benchmark explores.

I expect that this is interesting for the likely
difference in power usage during the benchmarking.
(Not that I've measured the power usage.)

(The FreeBSD head vintages are not the same in
the graph: -r355027 based vs. -r352341 based.)

Mark Millard
marklmi at
( went
away in early 2018-Mar)
