Superpages on amd64 FreeBSD 7.2-STABLE

krad kraduk at googlemail.com
Thu Nov 26 17:04:37 UTC 2009


2009/11/26 Linda Messerschmidt <linda.messerschmidt at gmail.com>

> We have a squid proxy process with very large memory requirements (10
> - 20 GB) on a machine with 24GB of RAM.
>
> Unfortunately, we have to rotate the logs of this process once per
> day.  When we do, it fork()s and exec()s about 16-20 child processes
> as helpers.  Since it's got this multi-million-entry page table,
> that's a disaster, because it has to copy all those page table entries
> for each child, then throw them out.  This takes a couple of minutes
> of 100% CPU usage, during which time the machine is pretty much
> unresponsive.
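>
> (Purely as an illustration of that pattern, and not squid's actual
> code: the helper path and count below are made up.  The point is
> that every fork() has to duplicate the parent's page-table entries
> even though the child replaces its address space with exec() a
> moment later.)
>
>   #include <sys/types.h>
>   #include <sys/wait.h>
>   #include <err.h>
>   #include <unistd.h>
>
>   int
>   main(void)
>   {
>           for (int i = 0; i < 16; i++) {
>                   pid_t pid = fork();     /* copies every page-table entry */
>
>                   if (pid == -1)
>                           err(1, "fork");
>                   if (pid == 0) {
>                           /* hypothetical helper path */
>                           execl("/usr/local/libexec/squid/logfile-daemon",
>                               "logfile-daemon", (char *)NULL);
>                           _exit(127);     /* exec failed */
>                   }
>           }
>           while (wait(NULL) > 0)          /* just for the sketch; real */
>                   ;                       /* helpers keep running */
>           return (0);
>   }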
>
> Someone on the squid list suggested we try the new superpages feature
> (vm.pmap.pg_ps_enabled) in 7.2.  We did, and after some tuning, we got
> it to work.
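>
> (In case it is useful for monitoring: a minimal sketch of reading the
> same counters from a program via sysctlbyname(3) instead of running
> sysctl(8).  The OID names are the ones shown in the listings below;
> the value width is probed first because it differs between counters.)
>
>   #include <sys/types.h>
>   #include <sys/sysctl.h>
>   #include <err.h>
>   #include <stdint.h>
>   #include <stdio.h>
>
>   static uint64_t
>   read_counter(const char *name)
>   {
>           size_t len = 0;
>
>           /* First call asks the kernel how large the value is. */
>           if (sysctlbyname(name, NULL, &len, NULL, 0) == -1)
>                   err(1, "%s", name);
>           if (len == sizeof(uint32_t)) {
>                   uint32_t v32;
>
>                   if (sysctlbyname(name, &v32, &len, NULL, 0) == -1)
>                           err(1, "%s", name);
>                   return (v32);
>           }
>           uint64_t v64 = 0;
>           if (sysctlbyname(name, &v64, &len, NULL, 0) == -1)
>                   err(1, "%s", name);
>           return (v64);
>   }
>
>   int
>   main(void)
>   {
>           const char *names[] = {
>                   "vm.pmap.pg_ps_enabled",
>                   "vm.pmap.pv_entry_count",
>                   "vm.pmap.pde.promotions",
>                   "vm.pmap.pde.demotions",
>                   "vm.pmap.pde.p_failures",
>                   "vm.pmap.pde.mappings",
>           };
>
>           for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++)
>                   printf("%s: %ju\n", names[i],
>                       (uintmax_t)read_counter(names[i]));
>           return (0);
>   }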
>
> Here's some "sysctl vm.pmap" for a similar machine with 16GB of RAM
> that does NOT have this setting enabled:
>
> vm.pmap.pv_entry_count: 2307899
> vm.pmap.pde.promotions: 0
> vm.pmap.pde.p_failures: 0
> vm.pmap.pde.mappings: 0
> vm.pmap.pde.demotions: 0
> vm.pmap.pv_entry_max: 4276871
> vm.pmap.pg_ps_enabled: 0
>
> Now, here is the machine that does have it, just prior to the daily
> rotation mentioned above:
>
> vm.pmap.pv_entry_count: 61361
> vm.pmap.pde.promotions: 23123
> vm.pmap.pde.p_failures: 327946
> vm.pmap.pde.mappings: 1641
> vm.pmap.pde.demotions: 17848
> vm.pmap.pv_entry_max: 7330186
> vm.pmap.pg_ps_enabled: 1
>
> So obviously this feature makes a huge difference and is a
> brilliant idea. :-)
>
> My (limited) understanding is that one of the primary benefits of this
> feature is to help situations like ours: since each 2 MB superpage on
> amd64 covers 512 4 KB pages, a page table that's 512x smaller can be
> copied 512x faster.  However, in practice this doesn't happen.  It's
> like fork() breaks the squid process back up into 4 KB pages.  Here's
> the same machine's entries just after rotation:
>
> vm.pmap.pv_entry_count: 1908056
> vm.pmap.pde.promotions: 23212
> vm.pmap.pde.p_failures: 413171
> vm.pmap.pde.mappings: 1641
> vm.pmap.pde.demotions: 21470
> vm.pmap.pv_entry_max: 7330186
> vm.pmap.pg_ps_enabled: 1
>
> So some 3,600 superpages (demotions went from 17,848 to 21,470)
> spontaneously turned into roughly 1,850,000 4 KB pages.
>
> Once this happens, squid seems reluctant to use more superpages until
> it's restarted.  We get a lot of p_failures and a slow-but-steady
> stream of demotions.  Here's the same machine just now:
>
> vm.pmap.pv_entry_count: 2022786
> vm.pmap.pde.promotions: 25281
> vm.pmap.pde.p_failures: 996027
> vm.pmap.pde.mappings: 1641
> vm.pmap.pde.demotions: 21683
> vm.pmap.pv_entry_max: 7330186
> vm.pmap.pg_ps_enabled: 1
>
> And a few minutes later:
>
> vm.pmap.pv_entry_count: 2021556
> vm.pmap.pde.promotions: 25331
> vm.pmap.pde.p_failures: 1001773
> vm.pmap.pde.mappings: 1641
> vm.pmap.pde.demotions: 21684
> vm.pmap.pv_entry_max: 7330186
> vm.pmap.pg_ps_enabled: 1
>
> (There were *no* p_failures or demotions in the several hours prior to
> rotation.)
>
> This trend continues... the pv_entry_count bounces up and down even
> though memory usage is increasing, so it's like it's trying to recover
> and convert things back (promotions), but it's having a lot of trouble
> (p_failures).
>
> It's not clear to me if this might be a problem with the superpages
> implementation, or if squid does something particularly horrible to
> its memory when it forks to cause this, but I wanted to ask about it
> on the list in case somebody who understands it better might know
> what's going on. :-)
>
> This is on FreeBSD-STABLE 7.2 amd64 r198976M.
>
> Thanks!


I'm sure you will get a lot of lovely answers to this, but it's best to keep
things simple.  Why not just syslog it off to another server and offload all
the compression to that box?  You could even back it with ZFS and do on-the-fly
gzip compression at the filesystem level, or use syslog-ng to do it.  If you
are worried about ZFS on FreeBSD, use (Open)Solaris or another filesystem with
inline compression.
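
A rough sketch of that setup (the hostname, syslog facility, and ZFS
dataset name are made up, and squid would also need to be told to log
via syslog rather than to a local file):

  # /etc/syslog.conf on the squid box: ship the log lines to another host
  local4.*                                        @loghost.example.com

  # on the log host: let ZFS do the gzip compression on the fly
  zfs create -o compression=gzip tank/squid-logs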

