(no subject)

Nick Barkas nick.barkas at gmail.com
Wed Nov 19 05:55:46 PST 2008


On Wed, Nov 19, 2008 at 06:24, Dan <dan-freebsd-fs at ourbrains.org> wrote:
> A recent question came up about huge numbers of files in one directory.
> Well, some people actually have to deal with it on the job:
>
> http://leaf.dragonflybsd.org/mailarchive/kernel/2008-11/msg00070.html
>
> An FS doesn't have to be designed such that file look-ups take a very
> long time to search when directories are large. When a nice hash is used
> as part of the FS design, the time to search for 1 in a 100 files or 2
> billion is the same. I view it as a feature. I can imagine a few cases
> where a large, non-human-readable directory is used to store many files.
> When developers know they have this feature at hand, they might as well
> use it. FS-based databases, image/sound editing, etc.

I'm not sure if this is what you're looking for, but FreeBSD does
have some provisions to avoid too much performance degradation with
large directories. The VFS name cache speeds up repeated look-ups of
the same file names in a directory of any size, and it is
filesystem independent. Specific to UFS2 there
is dirhash, which was implemented by Ian Dowse and David Malone. It
speeds up more types of operations involving large directories. They
wrote a paper about it you can find here:
http://www.usenix.org/events/usenix02/tech/freenix/dowse.html
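
To make the difference concrete, here is a toy Python sketch of a
linear directory scan versus an in-memory hash built on top of the
same entries. This is only an illustration of the general idea, not
how dirhash is actually implemented; the Directory class and its
method names are made up for the example.

```python
class Directory:
    """Toy directory: entries kept in an unsorted list, as in a linear format."""

    def __init__(self):
        self.entries = []   # list of (name, inode_number) pairs
        self.index = None   # in-memory hash, built on first hashed look-up

    def add(self, name, inode):
        self.entries.append((name, inode))
        if self.index is not None:
            # Keep an already-built index in sync with the entry list.
            self.index[name] = len(self.entries) - 1

    def lookup_linear(self, name):
        """Scan entries in order; return (inode, entries_examined)."""
        for examined, (entry_name, inode) in enumerate(self.entries, start=1):
            if entry_name == name:
                return inode, examined
        return None, len(self.entries)

    def lookup_hashed(self, name):
        """Use the hash to jump to the entry; return (inode, entries_examined)."""
        if self.index is None:
            # One full pass to build the index, paid once per directory.
            self.index = {n: i for i, (n, _) in enumerate(self.entries)}
        slot = self.index.get(name)
        if slot is None:
            return None, 0
        return self.entries[slot][1], 1


d = Directory()
for i in range(100_000):
    d.add(f"file{i:06d}", 1000 + i)

print(d.lookup_linear("file099999"))   # (100999, 100000)
print(d.lookup_hashed("file099999"))   # (100999, 1)
```

The hashed look-up examines one entry no matter how large the
directory is, at the cost of one pass to build the index and the
memory to hold it.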

More recently I've done a little bit of work on dirhash as well that
might further speed things up. It's not committed to SVN yet, but is
in Perforce. I sent out patches to this list a little while back but
have not received any reports from testers. My patches might need to
be updated to apply on the latest -CURRENT, and I'll try to update the
wiki page (http://wiki.freebsd.org/DirhashDynamicMemory) if I find
that to be the case.

I am hoping to find the time in the next few months to start working
on on-disk directory indexing for UFS2 so that linear searching
through directory entries is never necessary. You are correct in that
filesystems don't have to be designed such that searches are slow for
large directories, but UFS was designed quite a long time ago. It is
not trivial to change disk formats for directories now, especially
given that we want to remain backwards compatible and be able to work
properly with softupdates. I hope I can help make it happen though :)

Nick

