amd opteron NUMA support
grehan at freebsd.org
Fri Sep 19 15:07:57 UTC 2008
> Is anyone looking at trying to add specific support for the
> hyper-transport based numa AMD systems?
You don't have to do anything. The penalty for remote access is hidden
by caching. It's only very large working sets that would show the
difference, and for those, the memory is probably set up interleaved to
average out the worst-case behaviour.
At least that was the NetApp quad-socket experience with the older 8xx
Opterons and up to 2-hop trips for slow DDR2 RAM. These days, the 8xxx
provide a cross-link, so you only have to deal with a single hop in a
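To put rough numbers on the caching and interleaving argument, here is a
back-of-the-envelope sketch. Every latency figure in it is an assumption
picked for illustration, not a measurement, and the two hop tables are
hypothetical layouts for a quad-socket box: one where the far socket is
2 hops away, and one with a cross-link so every remote socket is 1 hop.

```python
# All latencies are illustrative assumptions, not measured values.
CACHE_NS = 5.0    # assumed cost of a cache hit
LOCAL_NS = 70.0   # assumed cost of a local DRAM access
HOP_NS = 40.0     # assumed extra cost per HyperTransport hop

# Hop counts from socket 0 to each socket's memory (hypothetical layouts).
SQUARE = [0, 1, 1, 2]        # older quad-socket: far corner is 2 hops
CROSS_LINKED = [0, 1, 1, 1]  # with a cross-link: every remote is 1 hop


def dram_latency(hops):
    """DRAM access cost for memory that is `hops` links away."""
    return LOCAL_NS + hops * HOP_NS


def average_latency(hit_rate, hops_per_node):
    """Mean access cost with pages interleaved evenly over the nodes.

    Passing a single-entry list models memory that all lives on one node.
    """
    dram = sum(dram_latency(h) for h in hops_per_node) / len(hops_per_node)
    return hit_rate * CACHE_NS + (1.0 - hit_rate) * dram


if __name__ == "__main__":
    for hit_rate in (0.98, 0.50):
        print("hit rate %.2f" % hit_rate)
        print("  all local      %.1f ns" % average_latency(hit_rate, [0]))
        print("  interleaved    %.1f ns" % average_latency(hit_rate, SQUARE))
        print("  all 2 hops     %.1f ns" % average_latency(hit_rate, [2]))
        print("  cross-linked   %.1f ns" % average_latency(hit_rate, CROSS_LINKED))
```

With a high assumed hit rate the cases land within a couple of ns of each
other; only when the hit rate drops (a working set far larger than cache)
does placement start to matter, and interleaving sits between the
all-local and all-remote extremes.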
> with each processor having memory associated with it,
> and a penalty for accessing memory associated with other CPUs,
> several things come to mind, including:
> Obviously, doing a lot of work to stop threads from migrating around.
> Page replacement of pages that are 'far away' with closer ones over time.
> CPU or die specific memory allocators.
That is certainly possible, though an experiment I witnessed with
putting pcpu data in per-CPU pages made no difference, since caches do
> Multiple copies of read-only segments (so that each cpu has its own
> copy of the /bin/sh text segment for example).
Try dealing with the debugger issues :)
> Servicing interrupts on CPUs most closely associated with the IO channels.
Once again, little to no difference. Any extra latency incurred with
the HT interrupt packet traversing an extra hop is lost in the noise.
> Now I know SOME work has been done on some of this
> but it would be good to know if anyone is focusing on this.
Linux has APIs for NUMA memory allocation. SGI did a lot of work for
Irix on their large NUMA machines.
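For a flavour of what the Linux side looks like, here is a minimal sketch
that pokes libnuma through ctypes. The libnuma calls (numa_available,
numa_alloc_onnode, numa_free) are real; the alloc_on_node() wrapper and
its return convention are made up for this example, and it assumes
nothing about whether the machine actually has libnuma or NUMA support.

```python
import ctypes
import ctypes.util


def alloc_on_node(size, node):
    """Try to allocate `size` bytes on NUMA node `node` via libnuma.

    Hypothetical helper for illustration. Returns True if the allocation
    succeeded, False if libnuma returned NULL, and None if libnuma or
    NUMA support is not available on this machine.
    """
    path = ctypes.util.find_library("numa")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None

    # numa_available() returns a negative value when NUMA is unusable.
    lib.numa_available.restype = ctypes.c_int
    if lib.numa_available() < 0:
        return None

    # void *numa_alloc_onnode(size_t size, int node);
    lib.numa_alloc_onnode.restype = ctypes.c_void_p
    lib.numa_alloc_onnode.argtypes = [ctypes.c_size_t, ctypes.c_int]
    # void numa_free(void *start, size_t size);
    lib.numa_free.restype = None
    lib.numa_free.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

    ptr = lib.numa_alloc_onnode(size, node)
    if not ptr:
        return False
    lib.numa_free(ptr, size)
    return True


if __name__ == "__main__":
    print("4 KiB on node 0:", alloc_on_node(4096, 0))
```

On a box without libnuma this just prints None, which is the point of the
fallback: the sketch shows the shape of the API, nothing more.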
More information about the freebsd-current