dirhash and dynamic memory allocation

Jeremy Chadwick freebsd at jdc.parodius.com
Fri Oct 21 16:20:27 UTC 2011


On Fri, Oct 21, 2011 at 05:38:43PM +0200, Miroslav Lachman wrote:
> Hi, I am back on this topic...
> 
> Ivan Voras wrote:
> >On 14/10/2011 11:20, Miroslav Lachman wrote:
> >>Hi all,
> >>
> >>I tried some tuning of dirhash on our servers and after googling a bit, I
> >>found an old GSoC project wiki page about Dynamic Memory Allocation for
> >>Dirhash: http://wiki.freebsd.org/DirhashDynamicMemory
> >>Is there any reason not to use it / not commit it to HEAD?
> >
> >AFAIK it's sort-of already present. In 8-stable and recent kernels you
> >can give huge amounts of memory to dirhash via vfs.ufs.dirhash_maxmem
> >(but except in really large edge cases I don't think you *need* more
> >than 32 MB), and the kernel will scale-down or free the memory if not
> >needed.
> >
> >In effect, vfs.ufs.dirhash_maxmem is the upper limit - the kernel will
> >use less and will free the allocated memory in low memory situations
> >(which I've tried and it works).
> 
> So the current behavior is that on 7.3+ and 8.x we have a smaller
> average dirhash buffer (by default) than we had initially 10 years
> ago: it started as a fixed 2MB allocation, whereas now 2MB is only
> the maximum, which is lowered further in low-memory situations...
> and sometimes it is set to 0MB!
> 
> I caught this 2 days ago:
> 
> root at rip ~/# sysctl vfs.ufs
> vfs.ufs.dirhash_reclaimage: 5
> vfs.ufs.dirhash_lowmemcount: 36953
> vfs.ufs.dirhash_docheck: 0
> vfs.ufs.dirhash_mem: 0
> vfs.ufs.dirhash_maxmem: 8388608
> vfs.ufs.dirhash_minsize: 2560
> 
> I set maxmem to 8MB in sysctl.conf to increase performance, and a
> dirhash_mem of 0 is a really bad surprise!

Actually, the "bad surprise" is dirhash_lowmemcount of 36953.  Your
increasing dirhash_maxmem is fine -- what you're seeing is that your
machine keeps running out of memory, or that your directories are
filled with so many files that you're repeatedly exhausting the
dirhash.
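
If you want to confirm that, watch the counters over time.  A quick
loop like this (untested sketch; the sysctl names are the ones from
your own output above) will show dirhash_mem collapsing whenever the
kernel's lowmem handler fires:

    #!/bin/sh
    # Log dirhash memory usage and lowmem events once a minute so
    # you can correlate drops in dirhash_mem with memory pressure.
    while :; do
        date
        sysctl vfs.ufs.dirhash_mem vfs.ufs.dirhash_maxmem \
            vfs.ufs.dirhash_lowmemcount
        sleep 60
    done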

I'm going to be blunt and just ask: why does that happen?  Is the
machine genuinely under memory pressure, or do you have a filesystem
with an absurdly high number of files in a single directory?  If the
former, ignore the next paragraph.

I've harped on this before on the mailing list: one of the first things
I learned as a system administrator was that you Do Not(tm) fill
directories with tens of thousands of files.  Split them up into
subdirs.  Even caching daemons (squid, etc.) support this kind of
thing: a file named "aj1j11hsfkqXaj21" should really be stored as
aj/1j/11hsfkqXaj21.  You get the idea.  DNS/BIND administrators of
systems hosting tens of thousands of domains are quite familiar with
this scenario too.
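
For illustration, the fan-out can be done in plain sh.  This is just
a sketch (the two-character, two-level split is one common
convention, not something squid or BIND mandates):

    #!/bin/sh
    # Move a flat file into a two-level hashed directory layout,
    # e.g. aj1j11hsfkqXaj21 -> aj/1j/11hsfkqXaj21
    f="aj1j11hsfkqXaj21"
    d1=$(printf '%s' "$f" | cut -c1-2)   # first two chars: "aj"
    d2=$(printf '%s' "$f" | cut -c3-4)   # next two chars:  "1j"
    mkdir -p "$d1/$d2"
    mv "$f" "$d1/$d2/${f#????}"          # strip the 4 chars used as dirs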

> I am worried about bad performance when the dirhash is emptied in
> situations where the server is already running at maximum load
> (there is some memory-hungry process, the system may start swapping
> to disk, and on top of that dirhash is effectively disabled)
>
> I found a PR kern/145246
> http://www.freebsd.org/cgi/query-pr.cgi?pr=145246
> 
> Is it possible to add some dirhash_minmem limit so that the dirhash
> memory is never cleared entirely?
> So I could set dirhash_minmem=2MB and dirhash_maxmem=16MB, and then
> dirhash_mem would always stay between these two limits?

dirhash isn't being "disabled"; it's that memory pressure from other
things takes priority over the dirhash -- but I understand what you
mean.  This is quite evident from dirhash_lowmemcount being so high.

I understand what you want, and maybe there is a way to get it (with
little effort), but I am strongly inclined to say you need to figure
out what is causing such memory pressure on your system and solve
that.  Honestly: try to solve the real problem rather than dancing
around it.  If you have a process whose RSS/RES usage skyrockets due
to a memory leak or an out-of-control design (such as a daemonised
perl script which blindly uses .= to append data to a scalar, or
blindly keeps appending data to the end of a list), then fix that
problem.
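
On FreeBSD, spotting the culprit is usually a one-liner; off the top
of my head, something like:

    # Show the top memory consumers, sorted by resident set size:
    ps -axo pid,rss,vsz,command | sort -k2 -rn | head

Watching that over time will also tell you whether something is
slowly leaking rather than just spiking.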

Basically I'm trying to say that it shouldn't be the responsibility of
dirhash to "work around" other problems happening on the system that
diminish or exhaust available memory.  You end up with a kernel design
that has tons of one-offs in it and that does nothing but bite you in
the butt down the road.  (Linux has been through this many times over.)
 
-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |


