find vs ls performance for walking folders, are there any faster options?
Bruce Evans
brde at optusnet.com.au
Wed Dec 12 14:51:01 UTC 2012
On Wed, 12 Dec 2012, Olav Grønås Gjerde wrote:
> I'm working on scanning filesystems to build a file search engine and
> came over something interesting.
>
> I can walk through 300 000 folders in ~19.5seconds with this command:
> ls -Ra | grep -e "./.*:" | sed "s/://"
>
> With find, it surprisingly takes ~50.5 seconds.:
> find . -type d
This is because 'find' with '-type' lstats all the files. It doesn't
use DT_DIR from dirent for some reason. ls can be slowed down similarly
using -F.
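
To illustrate the point about DT_DIR, here is a minimal sketch of a directory walk that relies on the type field returned with each directory entry instead of calling lstat() on every name. It is written in Python rather than C for brevity; os.scandir() exposes the d_type information from readdir(), and is_dir(follow_symlinks=False) only falls back to a stat call when the file system reports an unknown type. The function name walk_dirs and the policy of silently skipping unreadable directories are assumptions made for this example, not anything find or ls does.

```python
import os

def walk_dirs(root):
    """Yield root and every directory below it, without following symlinks."""
    stack = [root]
    while stack:
        path = stack.pop()
        yield path
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    # Normally answered from the dirent type field (DT_DIR),
                    # so no lstat() is issued unless the type is unknown.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Assumption for this sketch: skip directories we cannot read.
            continue

if __name__ == "__main__":
    for d in walk_dirs("."):
        print(d)
```

Whether this is actually faster than ls -R on a given system would have to be measured; the sketch only shows where the per-file lstat() can be avoided.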
> My results are based on five runs of each command to warm up the disk cache.
> I've tried both this with both UFS and ZFS, and both filesystems shows
> the same speed difference.
I get almost exactly the same ratio of speeds on an old version of FreeBSD.
All the data was cached, and there were only 7 symlinks. The file system
was mounted with -noatime, so the cache actually worked.
> On a modern Linux distribution(Ubuntu 12.10 with EXT4), ls is just
> slight faster than find(about 15-20%).
Apparently lstat() is relatively much slower in FreeBSD. It only takes
5 usec here, but that is a lot for converting cached data (getpid()
takes 0.2 usec). A file system mounted with -atime might be much
slower, for writing directory timestamps (the sync of the timestamps
is delayed, but it is a very heavyweight operation).
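
A rough way to compare the per-call cost of lstat() and getpid() on a particular machine is to time each in a loop. The sketch below does that in Python; the helper name usec_per_call and the loop count are assumptions for the example, and the figures include interpreter overhead, so they are upper bounds on the syscall cost rather than the numbers quoted above.

```python
import os
import time

def usec_per_call(func, *args, n=200_000):
    """Return the average wall-clock cost of func(*args) in microseconds."""
    start = time.perf_counter()
    for _ in range(n):
        func(*args)
    return (time.perf_counter() - start) / n * 1e6

if __name__ == "__main__":
    # "." is already cached after the first call, so this times the
    # cached-metadata path rather than disk I/O.
    print("lstat :", usec_per_call(os.lstat, "."))
    print("getpid:", usec_per_call(os.getpid))
```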
> Are there a faster way to walk folders on FreeBSD? Are there some
> options(sysctl) I could tune to improve the performance?
Nothing much faster than find without -type. Whatever fts(3) gives.
Bruce