find vs ls performance for walking folders, are there any faster options?

Bruce Evans brde at optusnet.com.au
Wed Dec 12 14:51:01 UTC 2012


On Wed, 12 Dec 2012, Olav Grønås Gjerde wrote:

> I'm working on scanning filesystems to build a file search engine and
> came across something interesting.
>
> I can walk through 300 000 folders in ~19.5 seconds with this command:
> ls -Ra | grep -e "./.*:" | sed "s/://"
>
> With find, it surprisingly takes ~50.5 seconds:
> find . -type d

This is because 'find' with '-type' lstats all the files.  It doesn't
use DT_DIR from dirent for some reason.  ls can be slowed down similarly
using -F.
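
The difference between the two approaches can be sketched as follows.
This is an illustration in Python, not what find or ls do internally:
os.scandir() exposes the entry type that readdir() returns (d_type) where
the file system supplies it, so is_dir() usually needs no extra system
call, while the second function calls lstat() on every entry.  The names
walk_dirs_dtype and walk_dirs_lstat are made up for this example.

```python
import os
import stat


def walk_dirs_dtype(root):
    """Yield directories under root, using the dirent type where available."""
    stack = [root]
    while stack:
        path = stack.pop()
        yield path
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    # Uses d_type from readdir() when the file system fills
                    # it in; falls back to lstat() only if the type is unknown.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            continue  # unreadable directory: skip its children


def walk_dirs_lstat(root):
    """Yield directories under root, calling lstat() on every entry."""
    stack = [root]
    while stack:
        path = stack.pop()
        yield path
        try:
            names = os.listdir(path)
        except OSError:
            continue  # unreadable directory: skip its children
        for name in names:
            child = os.path.join(path, name)
            try:
                st = os.lstat(child)  # one system call per entry
            except OSError:
                continue
            if stat.S_ISDIR(st.st_mode):
                stack.append(child)
```

Both functions should return the same set of directories and neither
follows symlinks; only the number of lstat() calls differs.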

> My results are based on five runs of each command to warm up the disk cache.
> I've tried this with both UFS and ZFS, and both filesystems show
> the same speed difference.

I get almost exactly the same ratio of speeds on an old version of FreeBSD.
All the data was cached, and there were only 7 symlinks.  The file system
was mounted with -noatime, so the cache actually worked.

> On a modern Linux distribution (Ubuntu 12.10 with ext4), ls is just
> slightly faster than find (about 15-20%).

Apparently lstat() is much slower in FreeBSD than in Linux.  It only takes
5 usec here, but that is a lot for converting cached data (getpid()
takes 0.2 usec).  A file system mounted without noatime might be much
slower, because it has to write directory timestamps (the sync of the
timestamps is delayed, but it is a very heavyweight operation).
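
A rough way to compare per-call costs like these on a given machine is
sketched below.  It is only an approximation: the helper usec_per_call is
made up for this example, and Python's interpreter overhead is added to
both measurements, so the difference between the two numbers is more
meaningful than either number alone.

```python
import os
import time


def usec_per_call(fn, n=100_000):
    """Return the average wall-clock time of fn() in microseconds."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - start) / n * 1e6


if __name__ == "__main__":
    # lstat() on a path whose metadata should already be cached.
    print("lstat('.'): %.2f usec" % usec_per_call(lambda: os.lstat(".")))
    # getpid() as a baseline for a near-trivial system call.
    print("getpid():   %.2f usec" % usec_per_call(os.getpid))
```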

> Is there a faster way to walk folders on FreeBSD? Are there any
> options (sysctl) I could tune to improve the performance?

Nothing much faster than find without -type.  Whatever fts(3) gives.

Bruce


More information about the freebsd-performance mailing list