hw.physmem/hw.realmem question

Chris Torek torek at torek.net
Tue Jul 2 20:15:45 UTC 2013


>for example, this host has 32G of physical memory ...
>[snip - dmesg:]
>real memory  = 34359738368 (32768 MB)
>avail memory = 32191340544 (30700 MB)
>[snip]
>and from sysctl:
>hw.physmem: 34284916736
>hw.usermem: 32964923392
>hw.realmem: 36507222016
>
>after setting
>	hw.physmem=16G
>
>from dmesg:
>real memory  = 34359738368 (32768 MB)
>avail memory = 13999382528 (13350 MB)
>
>and sysctl:
>
>	hw.physmem: 14957563904
>	hw.usermem: 10094678016
>	hw.realmem: 17179869184
>
>from the numbers, I can assume that realmem is the real physical memory,
>(or whatever is set in hw.physmem),

I think there are three numbers (though one is not always a single
number) of interest, which I believe were the original intent of
the three sysctl values ("real", "phys", and "user" memory,
specially called out in sysctl.h under the "HW" section).  These
are:

  - The RAM present in hardware.  This would be 32 GB on
    your particular system.

    On some machines, RAM is not even close to physically
    contiguous, e.g., there might be 4 GB of RAM starting at
    address 8 TB, then 32 GB of RAM starting at address 16 TB,
    then 1 TB of RAM starting at address 24 TB, on a machine where
    each DIMM slot is simply located at each 8-TB position (with
    "architectural room" for up to 8 TB per slot), and "slot 0"
    reserved for boot ROMs (hence physical space starting at 8
    TB).  These enormous "holes" in the physical space should (my
    opinion) be reflected in whatever interface gets you the RAM
    map.  That makes this one not a "single number".  Of course,
    if you can't or won't show the holes, you can make up a single
    number.

    (On x86 systems you don't have holes this big, although you do
    have the historical "I/O space" windows, e.g., that annoying
    gap at 640kB :-) .  Worse, there's always a half-GB gap at
    3.5GB, although some memory controllers work around this.
    See:

  http://lists.freebsd.org/pipermail/freebsd-amd64/2005-August/005849.html

    for example.)

  - The amount of "usable" RAM for the OS.  This subtracts off
    any spaces reserved by boot ROMs / BIOSes / what-have-you, and
    in the case of (e.g.) FreeBSD-9 on amd64, the 1 TB direct-map
    limit (which you must take care of manually with the loader's
    hw.physmem setting).

    This is what "phys mem" should be and mostly is.  If you boot
    a machine with 1.5 TB of RAM but the OS is limited to 1 TB,
    hw.physmem should be 1 TB minus a bit for the BIOS, etc.

  - The amount of memory left after subtracting more or less fixed
    kernel resources, such as kernel text and data (including loaded
    modules), page table pages, "vm_page" array data structures, and
    so on.  This will shift over time.

    This is what "user mem" is, more or less -- to be exact, it's:

	ctob(physmem - cnt.v_wire_count)

    where "ctob" is "clicks to bytes" which is really "pages to
    bytes", as these things count in terms of pages (4K at a time,
    on the x86).  The "wire count" is the number of pages that are
    not page-in/out-able, so subtracting that from physmem gives
    you the number of pageable, hence user-use-able, pages.
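
As a rough check of the "user mem" formula, here is a small sketch in C.
It assumes 4K pages (as on x86).  The names ctob_sketch, btoc_sketch and
usermem_bytes are made up for illustration; they are stand-ins, not the
kernel's own macros.

```c
#include <stdint.h>

/* Assumption: 4K pages, as on x86.  Illustrative stand-ins only,
 * not the kernel's own ctob()/btoc() macros. */
#define SKETCH_PAGE_SHIFT 12

/* "clicks to bytes": pages -> bytes */
static uint64_t ctob_sketch(uint64_t pages)
{
	return pages << SKETCH_PAGE_SHIFT;
}

/* bytes -> pages, assuming the byte count is page-aligned */
static uint64_t btoc_sketch(uint64_t bytes)
{
	return bytes >> SKETCH_PAGE_SHIFT;
}

/* hw.usermem as described above: pageable pages, converted to bytes */
static uint64_t usermem_bytes(uint64_t physmem_pages, uint64_t wire_count)
{
	return ctob_sketch(physmem_pages - wire_count);
}
```

With the first set of quoted sysctl values, hw.physmem (34284916736)
is 8370341 pages, and the difference from hw.usermem (32964923392)
implies 322264 wired pages at the moment of that snapshot, so
usermem_bytes(8370341, 322264) gives back 32964923392.  The wire count
here is inferred from the two quoted numbers, not measured.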

There's a fourth number, which is not really very useful, but which
I need to describe for what follows:

  - The highest useable physical address in the system (or
    really, the address just after it -- the same way a machine
    counting four things goes "0 1 2 3" and the count is 4).
    This is called "Maxmem" in amd64/amd64/machdep.c (and pmap.c).
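
To make the difference between "RAM present" and "highest address plus
one" concrete, here is a sketch using the made-up sparse machine from
the first item (4 GB at 8 TB, 32 GB at 16 TB, 1 TB at 24 TB).  The
struct and function names are hypothetical, and the sketch works in
bytes for readability; it is not how machdep.c stores its map.

```c
#include <stddef.h>
#include <stdint.h>

#define GB (1ULL << 30)
#define TB (1ULL << 40)

/* Hypothetical RAM-map entry: one contiguous run of RAM. */
struct ram_seg {
	uint64_t start;	/* first byte address */
	uint64_t len;	/* length in bytes */
};

/* The sparse example machine described above. */
static const struct ram_seg example_map[] = {
	{  8 * TB,  4 * GB },
	{ 16 * TB, 32 * GB },
	{ 24 * TB,  1 * TB },
};
static const size_t example_nsegs =
    sizeof(example_map) / sizeof(example_map[0]);

/* Total RAM present: the sum of the segment lengths. */
static uint64_t ram_present(const struct ram_seg *map, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += map[i].len;
	return total;
}

/* "Maxmem"-style value: one past the highest valid address, in bytes. */
static uint64_t highest_addr_plus_one(const struct ram_seg *map, size_t n)
{
	uint64_t end = 0;

	for (size_t i = 0; i < n; i++)
		if (map[i].start + map[i].len > end)
			end = map[i].start + map[i].len;
	return end;
}
```

On this example, ram_present() comes to 1060 GB while
highest_addr_plus_one() comes to 25 TB, which shows how much a single
number can hide on such a machine.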

Note that the printf output:

    real memory = <number> (<number> MB)

generally comes from the amount reported by the BIOS, which is
different yet again!  My box with 8GB says:

    real memory  = 8589934592 (8192 MB)

but on my machine, Maxmem is 8.5 GB, which also shows up in
sysctl (more on that in a moment):

    hw.realmem: 9126805504

I have the Intel memory controller that "moves up" the shadowed
(by PCI hole) RAM at 3.5 GB. If I were to set hw.physmem to 8
GB in my loader.conf, I would actually give up half a gigabyte.

So, on to this:

>if so, where did almost 2G go? (realmem - physmem) ...
>what is physmem and realmem, and what's the relationship - if any
>- between them?

"hw.realmem" is a snapshot of the value of Maxmem, before a few
adjustments are made.  If you have 32 GB of physical RAM and you
have not limited it (and you don't have the remapping noted
below), "realmem" will be 32 GB.  If you *have* limited it,
realmem captures the limit (which I think is wrong -- the snapshot
should happen earlier, at the least).

"hw.physmem" results from counting up useable pages.  After
finding Maxmem, which gives you "maximum valid address plus 1",
the machdep.c code goes through all the segments -- segments being
stuff like "64k to 640k", with the first 64k off limits because
BIOSes tend to munch on it and the 640k limit due to the ISA hole
-- and checks each page in that segment for useability.  If
the page is good, it's added to physmem.
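
The counting step can be sketched roughly as follows.  This is not the
machdep.c code: the segment struct, the page-test callbacks, and
count_usable_pages are all invented for illustration, and the real
page test is more involved than a simple callback.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption: 4K pages, as on x86 */

/* Hypothetical segment: [start, end) in bytes, page-aligned. */
struct phys_seg {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical per-page test; returns nonzero if the page is usable. */
typedef int (*page_ok_fn)(uint64_t pa);

/* Walk every page of every segment and count the ones that pass.
 * The result plays the role of "physmem" (a page count). */
static uint64_t count_usable_pages(const struct phys_seg *segs, size_t n,
    page_ok_fn page_ok)
{
	uint64_t physmem = 0;

	for (size_t i = 0; i < n; i++)
		for (uint64_t pa = segs[i].start; pa < segs[i].end;
		    pa += SKETCH_PAGE_SIZE)
			if (page_ok(pa))
				physmem++;
	return physmem;
}

/* Example segment: the "64k to 640k" range mentioned above. */
static const struct phys_seg low_seg = { 0x10000, 0xA0000 };

/* Example tests: every page good, or one (arbitrary) bad page. */
static int every_page_ok(uint64_t pa) { (void)pa; return 1; }
static int all_but_one_ok(uint64_t pa) { return pa != 0x20000; }
```

The 64k-to-640k segment is 576k, or 144 pages, so it adds 144 to the
count if every page passes and 143 if a single page fails the test.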

Your 2 GB (542555 pages, to be exact) is space eaten up by your
BIOS and architectural holes, including the large PCI hole.  Your
BIOS and motherboard etc may (or may not) allow you to remap some
of your "hidden" or "shadowed" RAM (out of the PCI hole),
increasing the boot-time value of Maxmem, and hence also
increasing both "hw.realmem" and actual, useable pages.
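
The page count comes straight from the first (unlimited) set of quoted
sysctl values.  A small sketch of the arithmetic, assuming 4K pages;
the function and constant names are made up for illustration:

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption: 4K pages */

/* Values from the first (unlimited) sysctl output quoted above. */
static const uint64_t quoted_realmem = 36507222016ULL;
static const uint64_t quoted_physmem = 34284916736ULL;

/* Bytes below Maxmem that did not end up as usable pages. */
static uint64_t missing_bytes(uint64_t realmem, uint64_t physmem)
{
	return realmem - physmem;
}

/* The same gap, expressed in pages. */
static uint64_t missing_pages(uint64_t realmem, uint64_t physmem)
{
	return missing_bytes(realmem, physmem) / SKETCH_PAGE_SIZE;
}
```

That is 36507222016 - 34284916736 = 2222305280 bytes, or 542555
pages of 4096 bytes each.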

Note: the x86's architectural holes still use up some dedicated
kernel memory, even with shadowed-memory remapping: if addresses
from zero to 8.5 GB are valid (as they are on my box), the kernel
allocates enough "vm_page" data structures to have one for all 8.5
GB, even though there's a .5 GB PCI hole with no RAM behind it.
Those pages are marked "not here, never use these" -- but they
still take sizeof(struct vm_page) bytes (120 bytes) to represent.
512 MB of hole = 512*1024*1024 / 4096 = 131072 pages, which means
the kernel is using 15728640 bytes (15 MB) to track this empty
area.  So my remapping hardware gains me 512 MB and then the
kernel loses 15, for a net of 497 MB recovered.
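
The same bookkeeping as a small sketch.  It assumes 4K pages and takes
the 120-byte sizeof(struct vm_page) figure from above as given; that
size may well differ on other kernel versions.  The names are made up
for illustration.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE   4096ULL	/* assumption: 4K pages */
#define SKETCH_VM_PAGE_SZ  120ULL	/* sizeof(struct vm_page) quoted above */

/* Number of page frames covered by a hole of the given size. */
static uint64_t hole_pages(uint64_t hole_bytes)
{
	return hole_bytes / SKETCH_PAGE_SIZE;
}

/* Kernel memory spent on vm_page structures for that hole. */
static uint64_t hole_tracking_bytes(uint64_t hole_bytes)
{
	return hole_pages(hole_bytes) * SKETCH_VM_PAGE_SZ;
}
```

For a 512 MB hole, hole_pages() is 131072 and hole_tracking_bytes()
is 15728640, i.e. 15 MB.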

Chris

