About Transparent Superpages and Non-transparent Superpages

Adrian Chadd adrian at freebsd.org
Tue Sep 24 00:16:47 UTC 2013


On 23 September 2013 14:30, Sebastian Kuzminsky <S.Kuzminsky at f5.com> wrote:

> On Sep 23, 2013, at 15:24 , Adrian Chadd wrote:
>
> > On 20 September 2013 08:20, Sebastian Kuzminsky <S.Kuzminsky at f5.com>
> wrote:
> >
> > It's transparent for the kernel: all of UMA and
> kmem_malloc()/kmem_free() is backed by 1 gig superpages.
> >
> > .. not entirely true, as I've found out at work. :(
>
> Can you expand on this, Adrian?
>
> Did you compile & boot the github branch I pointed to, and run into a
> situation where kmem_malloc() returned memory not backed by 1 gig pages, on
> hardware that supports it?
>
>
I haven't done that yet, sorry.

So the direct map is backed by 1GB pages, except when it can't be:

* the first 1GB - because of the memory hole(s)
* the 4th GB - because of the PCI IO hole(s)
* the end of RAM - because of the memory remapping done so you don't lose
hundreds of megabytes of RAM behind said memory/IO/ROM holes, the end of
RAM isn't on a 1GB boundary

So, those regions seem to get mapped by smaller pages.
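
To make that concrete, here's a quick userland sketch (nothing kernel-specific
in it, and the segment addresses are made-up examples) showing how much of a
physical segment the direct map could cover with 1GB pages versus the
unaligned head/tail that has to fall back to 2MB/4KB mappings:

#include <stdint.h>
#include <stdio.h>

#define PG_1G   (1ULL << 30)

static void
report_1g_coverage(uint64_t start, uint64_t end)
{
        uint64_t head = (start + PG_1G - 1) & ~(PG_1G - 1); /* round start up */
        uint64_t tail = end & ~(PG_1G - 1);                 /* round end down */

        if (head >= tail) {
                printf("%#jx-%#jx: no 1GB pages possible\n",
                    (uintmax_t)start, (uintmax_t)end);
                return;
        }
        printf("%#jx-%#jx: %ju x 1GB pages; %ju head + %ju tail bytes need "
            "2MB/4KB pages\n", (uintmax_t)start, (uintmax_t)end,
            (uintmax_t)((tail - head) / PG_1G),
            (uintmax_t)(head - start), (uintmax_t)(end - tail));
}

int
main(void)
{
        report_1g_coverage(0x100000, 0xc0000000);      /* below the PCI hole */
        report_1g_coverage(0x100000000, 0xfbfc00000);  /* 4GB to remapped end of RAM */
        return (0);
}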

I'm still tinkering with this; I'd like to hack things up to (a) get all
the VM structures into the last gig of 1GB-aligned RAM, so they fall inside
a single 1GB direct-mapped page, and (b) prefer that 1GB page for kernel
allocations, so things like mbufs, vm_page entries, etc. all end up coming
from the same 1GB direct map page.

I _think_ I have an idea of what to do - I'll create a couple of 1GB-sized
freelists covering the last two 1GB direct-mapped regions at the end of
RAM, then hack up the vm_phys allocator to prefer allocating from those.
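
Roughly the shape I have in mind, as a userland sketch rather than the actual
vm_phys code - the freelist indices, the fake_page struct and the helper names
below are all invented for illustration, not FreeBSD's real interface (the
real change would live in vm_phys.c, of course):

#include <stddef.h>
#include <stdint.h>

#define PG_1G   (1ULL << 30)

/* Hypothetical freelist indices: "pref" covers the last two 1GB regions. */
enum { FREELIST_PREF, FREELIST_DEFAULT, NFREELISTS };

struct fake_page {
        struct fake_page *next;
        uint64_t         phys;
};

static struct fake_page *freelists[NFREELISTS];

/* When a page is added/freed: steer it onto the preferred list if it sits
 * inside the preferred 1GB-backed window. */
static int
pick_freelist(uint64_t pa, uint64_t pref_start, uint64_t pref_end)
{
        return (pa >= pref_start && pa < pref_end ?
            FREELIST_PREF : FREELIST_DEFAULT);
}

static void
free_page(struct fake_page *p, uint64_t pref_start, uint64_t pref_end)
{
        int fl = pick_freelist(p->phys, pref_start, pref_end);

        p->next = freelists[fl];
        freelists[fl] = p;
}

/* At allocation time: drain the 1GB-backed window first, then fall back
 * to the default list. */
static struct fake_page *
alloc_page_prefer_1g(void)
{
        static const int order[] = { FREELIST_PREF, FREELIST_DEFAULT };
        struct fake_page *p;
        size_t i;

        for (i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
                if ((p = freelists[order[i]]) != NULL) {
                        freelists[order[i]] = p->next;
                        return (p);
                }
        }
        return (NULL);
}

int
main(void)
{
        /* Pretend the preferred window is the last two 1GB regions of 64GB. */
        uint64_t pref_start = 62ULL * PG_1G, pref_end = 64ULL * PG_1G;
        struct fake_page a = { NULL, 1ULL * PG_1G };   /* low memory */
        struct fake_page b = { NULL, 63ULL * PG_1G };  /* preferred window */

        free_page(&a, pref_start, pref_end);
        free_page(&b, pref_start, pref_end);
        return (alloc_page_prefer_1g() == &b ? 0 : 1);
}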

The VM structures stuff is a bit more annoying - they get allocated from
the top of RAM early on during boot, so unless the last region of RAM on
your machine falls exactly on a 1GB boundary, they'll be backed by 4KB/2MB
pages. I tested this out by setting hw.physmem to force things to be
rounded to a 1GB boundary, and that helped for a while. Unfortunately,
because everything else gets allocated from random places in physical
memory, I'm still thrashing the TLB - there are only four 1GB TLB slots on
a Sandy Bridge Xeon, and with 64GB of RAM I'm seeing a 10-12% miss load
when serving lots of traffic from SSD (all of it mbuf and vm structure
allocations).
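
For reference, the hw.physmem experiment was just rounding usable RAM down
to a 1GB boundary and feeding the result in as a byte count via
/boot/loader.conf; the arithmetic is trivial (the RAM figure below is made
up):

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
        uint64_t ram = 68451041280ULL;         /* e.g. ~63.75GB usable */
        uint64_t gig = 1ULL << 30;
        uint64_t rounded = ram & ~(gig - 1);   /* round down to a 1GB boundary */

        printf("hw.physmem=\"%ju\"\n", (uintmax_t)rounded);
        return (0);
}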

So, if I can remove that 10% of CPU cycles taken walking the page tables,
I'll be happy.

Note: I'm a newbie here in the physical mapping code. :-)




-adrian

