CACHE_LINE_SIZE on x86
Jim Harris
jim.harris at gmail.com
Thu Nov 1 00:50:54 UTC 2012
On Thu, Oct 25, 2012 at 2:40 PM, Jim Harris <jim.harris at gmail.com> wrote:
> On Thu, Oct 25, 2012 at 2:32 PM, John Baldwin <jhb at freebsd.org> wrote:
> >
> > It would be good to know though if there are performance benefits from
> > avoiding sharing across paired lines in this manner. Even if it has
> > its own MOESI state, there might still be negative effects from sharing
> > the pair.
>
> On 2S, I do see further benefits by using 128 byte padding instead of
> 64. On 1S, I see no difference. I've been meaning to turn off
> prefetching on my system to see if it has any effect in the 2S case -
> I can give that a shot tomorrow.
>
>
So tomorrow turned into next week, but I have some data finally.
I've updated to HEAD from today, including all of the mtx_padalign
changes. I tested 64- vs. 128-byte alignment on 2S amd64 (SNB Xeon). My
BIOS also has a knob to disable adjacent-line prefetching (the MLC spatial
prefetcher), so I ran both the 64b and 128b cases with this specific
prefetcher enabled and disabled.
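For reference, the padded flavor is just the lock wrapped in a struct
that is aligned (and thereby sized) to the full 128 bytes - roughly the
shape below, using the kernel's CACHE_LINE_SIZE constant. The struct and
field names here are illustrative only, not the actual sys/mutex.h
definitions:

#include <sys/param.h>		/* CACHE_LINE_SIZE (128 on amd64) */
#include <sys/lock.h>
#include <sys/mutex.h>

/*
 * Illustrative sketch, not the real mtx_padalign. __aligned() also
 * rounds sizeof() up to the alignment, so each lock occupies its own
 * 128-byte adjacent-line prefetch pair and no two locks can share one.
 */
struct lock_pad {
	struct mtx	lock;
} __aligned(CACHE_LINE_SIZE);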
MLC prefetcher enabled: 3-6% performance improvement and a 1-5% decrease
in CPU utilization from using 128b padding instead of 64b.
MLC prefetcher disabled: performance and CPU utilization differences are in
the noise - anywhere from -0.2% to +0.5%. Performance here matches extremely
closely (within 1%) the 128b-padding case with the MLC prefetcher enabled.
I think it's safe to say that the 128b pad/alignment is worth keeping for
multi-socket x86, and that the benefit is almost certainly due to the MLC
spatial prefetcher. I still see no measurable difference between 64b and
128b padding on 1S, but that's only with my benchmark.
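If anyone wants to see the effect from userland, a toy false-sharing
test along these lines should show the same pattern (a sketch only -
the names, PAD values, and iteration count are arbitrary; build with
cc -std=c11 -pthread and compare wall-clock times under time(1)):

#include <pthread.h>
#include <stdalign.h>
#include <stdio.h>

#ifndef PAD
#define PAD 128			/* try 8, 64, and 128 */
#endif

#define ITERS 100000000UL

/*
 * alignas() pads each counter out to PAD bytes. With PAD=8 both
 * counters share one cache line; with 64 they get separate lines but
 * the same adjacent-line prefetch pair; with 128 they get separate
 * pairs.
 */
struct counter {
	alignas(PAD) volatile unsigned long val;
};

static struct counter counters[2];

static void *
worker(void *arg)
{
	struct counter *c = arg;
	unsigned long i;

	for (i = 0; i < ITERS; i++)
		c->val++;
	return (NULL);
}

int
main(void)
{
	pthread_t t[2];

	pthread_create(&t[0], NULL, worker, &counters[0]);
	pthread_create(&t[1], NULL, worker, &counters[1]);
	pthread_join(t[0], NULL);
	pthread_join(t[1], NULL);
	printf("%lu %lu\n", counters[0].val, counters[1].val);
	return (0);
}

Pinning the two threads to different sockets (e.g. with cpuset(1))
should exaggerate the cost; on 1S I'd expect 64 and 128 to look the
same, matching what I see above.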
Thanks,
-Jim