mmap performance and memory use
Wojciech Puchar
wojtek at wojtek.tensor.gdynia.pl
Fri Oct 7 00:46:52 UTC 2011
>> page. how much memory is used to manage this?
> I am not sure how deep the enumeration you want to know, but the first
> approximation will be:
> one struct vm_map_entry
> one struct vm_object
> one pv_entry
Actually I don't need a precise answer, just the algorithms.
>
> Page table structures need four pages for directories and page table proper.
>>
>> 2) suppose we have 1TB file on disk without holes and 100000 processes
>> mmaps this file to it's address space. are just pages shared or can
>> pagetables be shared too? how much memory is used to manage such
>> situation?
> Only pages are shared. Pagetables are not.
This is what I really asked, thank you for the answer. My example was
rather extreme, but datasets of tens of gigabytes would be used.
> superpages are due to more efficient use of TLB.
Actually this was not really working, at least when I tested it a while
ago (but already on FreeBSD 8). Even with a 1GB squid process and no
swapping, superpages were not allocated very often.
Even in the working case it probably will not help much here unless
absolutely all data is in RAM, and the following explains why.
> accurate tracking of the accesses and writes, which can result in better
> pageout performance.
>
> For the situation 1TB/100000 processes, you will probably need to tune
> the amount of pv entries, see sysctl vm.pmap.pv*.
So there is a workaround, but it causes lots of soft page faults, as there
would be no more than a few hundred instructions between touching
different pages.
What I want to do is a database library (but no SQL!). It will be
something like CA-Clipper/Harbour (but definitely not the same and NOT
compatible), with higher performance, and meant to be usable in heavy
cases too.
With this system one user is one process, one thread. If used for
WWW or something similar, there will be this plus some other program doing
the WWW interface, but still one logged-in user = exactly one process.
As properly planned database tables should not be huge, I assume most of
them (possibly excluding parts that are rarely used) will be kept in
memory by the VM subsystem. So hard faults and disk I/O will not be the
deciding factor.
To avoid system calls I just want to mmap the tables and indexes. All
semaphores can be done from userspace too, and I already know how to
avoid lock contention well.
Using indexes means doing lots of memory reads from different pages, but
any single process will usually touch only a small subset of the pages,
not all of them.
So it MAY work well this way, or it may end with 95% system CPU time
spent mostly doing soft faults.
But a question for the future: is anything planned in FreeBSD for this
case? I think I am not the only one who cares about it. Not everyone on
earth uses computers for a few processes or personal usage, and there are
IMHO many cases where programs need to share a huge dataset using mmap
while doing heavy timesharing.
I understand that mmap works the way it does because a file may be mapped
at different addresses, and even parts of a single file may be mapped in
different places, as this is what mmap allows.
But would it be possible to make a different mmap in the kernel, something
like
mmap_fullfile(fd, maxsize)
which (taking the amd64 case) would map the file at a 2MB boundary if
maxsize <= 2MB, at a 1GB boundary if maxsize <= 1GB, and at a 512GB
boundary otherwise (with subsequent multiples of 512GB address blocks if
needed), sharing everything?
It is no problem at all that things like madvise from one process would
clobber the madvise setting of another process, or other such issues, as
only one type of program, aware of this behavior, would use it.
This way there would be practically no page-table mapping overhead, and
actually simpler/faster OS duties.
I don't really know exactly how the VM subsystem works under FreeBSD, but
if it is not hard I may do this with some help from you.
And no, I don't want to use any popular database system, for good
reasons.