AMD Opteron NUMA support

Fri Sep 19 15:07:57 UTC 2008

Hi Julian,

> Is anyone looking at trying to add specific support for the
> hyper-transport based numa AMD systems?

  You don't have to do anything. The penalty for remote access is hidden
by caching. Only very large working sets would show the difference, and
for those the memory is probably set up interleaved to average out the
worst-case behaviour.
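
  To put rough numbers on that argument, here is a back-of-the-envelope
model of average access time. It is only a sketch: the function name
avg_access_ns and every latency figure in it (5 ns cache, 70 ns local
RAM, 110 ns remote RAM) are made-up illustrative assumptions, not
measurements from any Opteron.

```python
def avg_access_ns(hit_rate, cache_ns, local_ns, remote_ns, remote_fraction):
    """Average memory access time for a simple one-level cache model.

    hit_rate        -- fraction of accesses served from cache (0..1)
    remote_fraction -- fraction of cache misses that go to a remote node
    All latencies are in nanoseconds and are hypothetical inputs.
    """
    # Average cost of a miss, weighted by where the memory lives.
    miss_ns = (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns
    return hit_rate * cache_ns + (1.0 - hit_rate) * miss_ns


# Hypothetical latencies, chosen only to illustrate the shape of the effect.
CACHE_NS, LOCAL_NS, REMOTE_NS = 5.0, 70.0, 110.0

if __name__ == "__main__":
    for hit_rate in (0.98, 0.50):  # cache-friendly vs. very large working set
        local = avg_access_ns(hit_rate, CACHE_NS, LOCAL_NS, REMOTE_NS, 0.0)
        remote = avg_access_ns(hit_rate, CACHE_NS, LOCAL_NS, REMOTE_NS, 1.0)
        # Interleaving over 4 nodes: 3 of every 4 misses are remote.
        inter = avg_access_ns(hit_rate, CACHE_NS, LOCAL_NS, REMOTE_NS, 0.75)
        print(f"hit={hit_rate:.2f} local={local:.1f} "
              f"interleaved={inter:.1f} remote={remote:.1f}")
```

  With a high hit rate the all-local and all-remote cases differ by less
than a nanosecond in this model; the gap only opens up once the hit rate
collapses, and interleaving lands in between the two extremes.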

  At least that was the NetApp quad-socket experience with the older 8xx
Opterons, where an access could take up to two hops to reach slow DDR2
RAM. These days the 8xxx parts provide a cross-link, so you only have to
deal with a single hop in a quad system.
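
  The hop counts can be checked with a small breadth-first search. The
two link layouts below are my assumption of what is meant: a plain
square of four sockets for the older parts, and the same square with
both diagonals added for the cross-linked ones. The names max_hops,
SQUARE and CROSS_LINKED are made up for this sketch, and it assumes
every socket is reachable from every other.

```python
from collections import deque


def max_hops(sockets, links):
    """Worst-case number of hops between any two sockets.

    sockets -- iterable of socket ids
    links   -- iterable of (a, b) pairs, each a bidirectional link
    Assumes the topology is connected.
    """
    adjacent = {s: set() for s in sockets}
    for a, b in links:
        adjacent[a].add(b)
        adjacent[b].add(a)

    worst = 0
    for start in adjacent:
        # Plain BFS from each socket, recording hop distance.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in adjacent[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


SOCKETS = [0, 1, 2, 3]
# Assumed older layout: four sockets in a square, no diagonals.
SQUARE = [(0, 1), (1, 2), (2, 3), (3, 0)]
# Assumed newer layout: the same square plus both diagonal cross-links.
CROSS_LINKED = SQUARE + [(0, 2), (1, 3)]

if __name__ == "__main__":
    print("square:", max_hops(SOCKETS, SQUARE))
    print("cross-linked:", max_hops(SOCKETS, CROSS_LINKED))
```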

> with each processor having memory associated with it,
> and a penalty for accessing memory associated with other CPUs,
> several things come to mind, including:
> 
> Obviously, doing a lot of work to stop threads from migrating around.
> Page replacement of pages that are 'far away' with closer ones over time.
  ...
> CPU or die specific memory allocators.

  That is certainly possible, though an experiment I witnessed with 
putting pcpu data in per-CPU pages made no difference, since caches do 
their job.

> Multiple copies of read-only segments (so that each cpu has its own
> copy of the /bin/sh text segment for example).

  Try dealing with the debugger issues :)

> Servicing interrupts on CPUs most closely associated with the IO channels.

  Once again, little to no difference. Any extra latency incurred by
the HT interrupt packet traversing an extra hop is lost in the noise.

> Now I know SOME work has been done on some of this
> but it would be good to know if anyone is focusing on this.

  Linux has APIs for NUMA memory allocation. SGI did a lot of work for
IRIX on their large NUMA machines.
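
  For a feel of what the Linux side looks like, here is a rough sketch
that calls libnuma through ctypes. The libnuma functions used
(numa_available, numa_max_node, numa_alloc_onnode, numa_free) are real;
the wrapper numa_alloc_demo and its status strings are made up for
illustration. It assumes a Linux box and falls back to a message when
libnuma or NUMA support is missing.

```python
import ctypes
import ctypes.util


def numa_alloc_demo(size=4096, node=0):
    """Try to allocate `size` bytes on NUMA node `node` via libnuma.

    Returns a short status string instead of raising, so it can be run
    on machines without libnuma or without NUMA support.
    """
    path = ctypes.util.find_library("numa")
    if path is None:
        return "libnuma not installed"
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return "libnuma could not be loaded"

    # Declare the C signatures so pointers and sizes are not truncated.
    lib.numa_available.restype = ctypes.c_int
    lib.numa_max_node.restype = ctypes.c_int
    lib.numa_alloc_onnode.restype = ctypes.c_void_p
    lib.numa_alloc_onnode.argtypes = [ctypes.c_size_t, ctypes.c_int]
    lib.numa_free.restype = None
    lib.numa_free.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

    # numa_available() returns -1 when the NUMA API cannot be used.
    if lib.numa_available() < 0:
        return "NUMA API not available"
    if node < 0 or node > lib.numa_max_node():
        return "node out of range"

    ptr = lib.numa_alloc_onnode(size, node)
    if not ptr:
        return "allocation failed"
    # Touch the memory so the pages are actually faulted in.
    ctypes.memset(ptr, 0, size)
    lib.numa_free(ptr, size)
    return "allocated and freed"


if __name__ == "__main__":
    print(numa_alloc_demo())
```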

later,

Peter.