Comparing the OverDrive 1000 (A57) vs. MACCHIATObin Double Shot (A72) for buildworld and via a CPU/cache/RAM tradeoff-exploring benchmark (links corrected, again)
Mark Millard
marklmi at yahoo.com
Mon Dec 2 23:07:17 UTC 2019
[Maybe this time I'll get working links in place . . .]
On 2019-Dec-2, at 14:56, Mark Millard <marklmi at yahoo.com> wrote:
> [Just correcting the links to be to .png files
> and correcting some PowerMac11,2 related wording.]
>
> On 2019-Dec-2, at 14:15, Mark Millard <marklmi at yahoo.com> wrote:
>
>> It looks like the OverDrive 1000 vs. MACCHIATObin Double
>> Shot comparison ends up being an example of memory
>> access making the difference for the specific workload:
>> -j4 buildworld for head -r355027 (building itself
>> from scratch).
>>
>> buildworld times (not needing an llvm bootstrap build):
>>
>> OverDrive 1000: 13895 sec (about 3.86 hrs)
>> MACCHIATObin Double Shot: 16561 sec (about 4.60 hrs)
>>
>> So a little under 45 min difference when the mean
>> and geometric mean are both a little over 4.2 hrs.
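As a quick check of the arithmetic above, here is a minimal Python sketch. The variable and function names are illustrative only and are not part of any tool used for the measurements; the two inputs are just the buildworld times quoted above.

```python
import math

# Wall-clock buildworld times quoted above, in seconds.
overdrive_sec = 13895
macchiatobin_sec = 16561


def summarize(a_sec, b_sec):
    """Return (difference in minutes, mean in hours, geometric mean in hours)."""
    diff_min = abs(b_sec - a_sec) / 60
    mean_hrs = (a_sec + b_sec) / 2 / 3600
    geo_mean_hrs = math.sqrt(a_sec * b_sec) / 3600
    return diff_min, mean_hrs, geo_mean_hrs


diff_min, mean_hrs, geo_mean_hrs = summarize(overdrive_sec, macchiatobin_sec)
print(f"difference: {diff_min:.1f} min")
print(f"mean: {mean_hrs:.2f} hrs, geometric mean: {geo_mean_hrs:.2f} hrs")
print(f"MACCHIATObin / OverDrive time ratio: {macchiatobin_sec / overdrive_sec:.2f}")
```

The ratio on the last line is an extra convenience figure, not something stated in the original message.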
>>
>> SSD ufs file systems: One with Samsung 860 Pro, the
>> other with Samsung 850 Pro. I do not expect that I/O
>> made much of a difference, but I did nothing to measure
>> such for the buildworld activity.
>>
>> OverDrive RAM: 8GiByte, half in each of the 2 slots
>> MACCHIATObin RAM: 16GiByte, all in its 1 slot.
>>
>> MACCHIATObin: jumpers set for the fastest CPU/RAM
>> speed for the Double Shot.
>>
>> A comparison graph from exploring single threaded
>> and multi-threaded CPU/cache and RAM limited
>> performance (a variation on the old HINT serial
>> and pthread benchmarks) is shown at:
Corrected link (2nd try):
https://github.com/markmi/acpphint/blob/master/acpphint_example_data/acpphint-OverDrive_1000_MacchDblShot-threads_4-LP64-g%2B%2B_9_O3-libc%2B%2B-DSIZE_large_fast_types-RAM.png
>> There are curves for various involved types:
>> double (d), unsigned long long (ull), unsigned
>> long (ul), unsigned int (ui). Because ull and ul
>> are the same size for LP64, how closely their curves
>> match gives some evidence of the variability observed.
>>
>> (The OverDrive and MACCHIATObin were not benchmarked
>> for the graph at the same version of head: -r352341
>> based vs. -r355027 based.)
>>
>> (I did not set things such that the benchmark run
>> would explore paging getting involved. Thus there
>> is basically no I/O considered in the comparison
>> graph.)
>>
>> The MACCHIATObin clearly wins single threaded and
>> its memory subsystem was well matched to the single
>> threaded use when the same-involved-types are
>> compared. (Single threaded are the blueish curves,
>> MACCHIATObin having the lighter colors.)
>>
>> For multi-threaded in the range where RAM access
>> limits things, the two systems are a close match.
>> (Greenish colors, right side of plot, upper
>> curves.)
>>
>> The range where the OverDrive 1000 is clearly faster
>> is part of the middle of the multi-threaded curves.
>> (This might be tied to whatever is done with the
>> dual RAM slot structure or to the amount of caching,
>> or some such, I do not know the details.)
>>
>> I would expect "-j1 buildworld" would take less time
>> on the MACCHIATObin than on the OverDrive, but I'm
>> not planning on measuring that.
>>
>>
>>
>> A more historical comparison, old PowerMac11,2
>> (2 sockets, 2 cores each) vs. the MACCHIATObin,
>> both having 16 GiBytes of RAM:
>>
>> For analogous benchmark graphs (matching types),
>> the MACCHIATObin single threaded is faster than
>> the old PowerMac11,2 single threaded and also is
>> usually faster than that 11,2's multi-threaded
>> benchmark data as well.
>
> I should have pointed out that the MACCHIATObin
> single threaded and PowerMac11,2 multi-threaded
> results are similar where memory access limits
> things, with use of double (d) being a little
> slower on the MACCHIATObin in this region.
>
>> Multi-threaded, the
>> MACCHIATObin is faster for the exploration by
>> the benchmark.
>
Corrected link (2nd try):
https://github.com/markmi/acpphint/blob/master/acpphint_example_data/acpphint-MacchDblShot_PowerMac11%2C2-threads_4-LP64-g%2B%2B_9_O3-libc%2B%2B-DSIZE_large_fast_types-RAM.png
>> I expect that this is interesting for the likely
>> difference in power usage during the benchmarking.
>> (Not that I've measured the power usage.)
>>
>> (The FreeBSD head vintages are not the same in
>> the graph: -r355027 based vs. -r352341 based.)
>>
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)