superpages for UMA
Alexander V. Chernikov
melifaro at FreeBSD.org
Mon Aug 18 15:03:23 UTC 2014
Hello list.
Currently UMA(9) uses PAGE_SIZE kegs to store items in.
This seems fine for most usage scenarios; however, there are some where
a very large number of items is required.
I've run into this problem while using ipfw tables (radix-based) with
~50k records. This is what
`pmcstat -TS DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK -w1` shows:
PMC: [DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK] Samples: 2359 (100.0%) , 0 unresolved

%SAMP  IMAGE    FUNCTION   CALLERS
 28.7  kernel   rn_match   ipfw_lookup_table:21.7 rtalloc_fib_nolock:7.0
 25.5  ipfw.ko  ipfw_chk   ipfw_check_hook
  6.0  kernel   rn_lookup  ipfw_lookup_table
Some numbers: a table entry occupies 128 bytes, so we can store no more
than ~30 records in a single page-sized keg, and 50k records require
more than 1500 kegs.
As far as I understand, the second-level TLB on modern Intel CPUs may
have 256 or 512 entries (for 4K pages), so touching such a large number
of pages results in TLB misses happening constantly.
Other examples:
Route tables (in the current implementation): struct rte occupies more
than 128 bytes, and storing a full view (> 500k routes) would result in
TLB misses happening all of the time.
Various kinds of stateful packet processing: a modern SLB/firewall can
have millions of states. Regardless of state size, PAGE_SIZE'd kegs are
not the best choice.
All of these can be addressed:
ipfw tables/ipfw dynamic state allocation code can (and will) be
rewritten to use UMA + uma_zone_set_allocf() (suggested by glebius),
and radix should simply be changed to a different lookup algorithm (as
is happening in ipfw tables).
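A rough sketch of the uma_zone_set_allocf() direction, for discussion (uma_zone_set_allocf() and contigmalloc() are real KPIs, but the allocator body, the malloc type, the zone name, and the error handling here are illustrative assumptions, not a buildable module):

```c
/* Kernel-only sketch; error handling and teardown omitted. */
#include <sys/param.h>
#include <sys/malloc.h>
#include <vm/uma.h>

/*
 * Hypothetical backing allocator that hands UMA physically
 * contiguous, 2M-aligned chunks, so a whole keg can be covered
 * by a single superpage TLB entry.
 */
static void *
superpage_alloc(uma_zone_t zone, int bytes, uint8_t *pflag, int wait)
{

	*pflag = UMA_SLAB_KERNEL;
	return (contigmalloc(bytes, M_TEMP, wait, 0, ~(vm_paddr_t)0,
	    2 * 1024 * 1024, 0));
}

static uma_zone_t table_zone;

static void
table_zone_init(void)
{

	/* 128-byte items, as in the ipfw table example above. */
	table_zone = uma_zcreate("ipfw_table_ent", 128, NULL, NULL,
	    NULL, NULL, UMA_ALIGN_PTR, 0);
	uma_zone_set_allocf(table_zone, superpage_alloc);
}
```

Whether UMA would actually request large enough slabs from such an allocator (rather than PAGE_SIZE ones) is exactly the part that would need the new flag discussed below.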
However, we may consider adding another UMA flag to allocate
2M/1G-sized kegs per request.
(Additionally, the Intel Haswell arch has 512 STLB entries shared
between 4K/2M pages, so it should help the former.)
What do you think?