UFS Subdirectory limit.

Robert Watson rwatson at FreeBSD.org
Sun Mar 27 07:32:09 PST 2005


On Sat, 26 Mar 2005, David Malone wrote:

> > Also, the more important
> > concern is that large directories simply don't scale in UFS.  Lookups
> > are a linear operation, and while DIRHASH helps, it really doesn't scale
> > well to 150k entries.
> 
> It seems to work passably well actually, not that I've benchmarked it
> carefully at this size. My junkmail maildir has 164953 entries at the
> moment, and is pretty much continuously appended to without creating any
> problems for the machine it lives on. Dirhash doesn't care whether the
> entries are subdirectories or files. 
> 
> If the directory entries are largely static, the name cache should do
> all the work, and it is well capable of dealing with lots of files. 
> 
> We should definitely look at what sort of filesystem features we're
> likely to need in the future, but I just wanted to see if we can offer
> people a solution that doesn't mean waiting for FreeBSD 6 or 7. 
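
A minimal sketch, not from the original thread, of one way to check this
for yourself: populate a directory with a comparable number of entries and
time random lookups. If DIRHASH and the name cache are doing their job, the
per-lookup cost stays roughly flat as the directory grows. The path and
counts below are illustrative assumptions.

/*
 * Sketch: create NENTRIES files in one directory, then time a batch of
 * random stat() lookups against it.
 */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define DIR_PATH	"/tmp/bigdir"	/* assumed test location */
#define NENTRIES	150000		/* roughly the size discussed above */

int
main(void)
{
	char name[64];
	struct stat sb;
	struct timespec t0, t1;
	int i, fd;

	mkdir(DIR_PATH, 0755);
	for (i = 0; i < NENTRIES; i++) {
		snprintf(name, sizeof(name), DIR_PATH "/f%07d", i);
		fd = open(name, O_CREAT | O_WRONLY, 0644);
		if (fd >= 0)
			close(fd);
	}

	/* Time a batch of random lookups. */
	srandom(getpid());
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < 10000; i++) {
		snprintf(name, sizeof(name), DIR_PATH "/f%07ld",
		    random() % NENTRIES);
		(void)stat(name, &sb);
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("10000 lookups in %.3f s\n",
	    (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
	return (0);
}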

FWIW, I regularly use directories with several hundred thousand files in
them, and it works quite well for the set of operations I perform
(typically, I only append new entries to the directory).  This is with a
Cyrus server hosting fairly large shared folders -- in Cyrus, a
maildir-like format is used.  For example, the lists.linux.kernel
directory references 430,000 individual files.  Between UFS_DIRHASH and
Cyrus's use of a cache file, opening the folder primarily consists of
mmap'ing the cache file and then doing lookups, which occur quite quickly. 
My workload doesn't currently require large numbers of subdirectories under
a single parent directory, but based on the results I've had with large
numbers of files, it would likely work fine, subject to UFS being able to
express that many entries (the subdirectory limit under discussion). 
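
A minimal sketch of the cache-file approach described above -- my
illustration, not actual Cyrus code: map the per-folder cache once and do
lookups against the mapping, rather than opening each of the hundreds of
thousands of message files at folder-open time. The file name and the lack
of any record parsing are assumptions for the example.

/*
 * Sketch: mmap a folder's cache file read-only; subsequent lookups are
 * offsets into the mapping instead of per-message open()/read() calls.
 */
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	const char *path = argc > 1 ? argv[1] : "cyrus.cache";	/* assumed name */
	struct stat sb;
	char *base;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0 || fstat(fd, &sb) < 0) {
		perror(path);
		return (1);
	}

	/* One mapping replaces per-message I/O at folder-open time. */
	base = mmap(NULL, sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
	if (base == MAP_FAILED) {
		perror("mmap");
		return (1);
	}

	/* Real code would now walk cache records inside the mapping. */
	printf("mapped %jd bytes of cache from %s\n",
	    (intmax_t)sb.st_size, path);

	munmap(base, sb.st_size);
	close(fd);
	return (0);
}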

Robert N M Watson
