Improving ZFS performance for large directories
Kevin Day
toasty at dragondata.com
Tue Jan 29 23:20:19 UTC 2013
I'm trying to improve performance when using ZFS in large (>60000 files) directories. A common activity is to use "getdirentries" to enumerate all the files in the directory, then "lstat" on each one to get information about it. Doing an "ls -l" in a large directory like this can take 10-30 seconds to complete. Trying to figure out why, I did:
ktrace ls -l /path/to/large/directory
kdump -R |sort -rn |more
to see which system calls were taking the most time, and ended up with:
69247 ls 0.190729 STRU struct stat {dev=846475008, ino=46220085, mode=-rw-r--r-- , nlink=1, uid=0, gid=0, rdev=4294967295, atime=1333196714, stime=1201004393, ctime=1333196714.547566024, birthtime=1333196714.547566024, size=30784, blksize=31232, blocks=62, flags=0x0 }
69247 ls 0.180121 STRU struct stat {dev=846475008, ino=46233417, mode=-rw-r--r-- , nlink=1, uid=0, gid=0, rdev=4294967295, atime=1333197088, stime=1209814737, ctime=1333197088.913571042, birthtime=1333197088.913571042, size=3162220, blksize=131072, blocks=6409, flags=0x0 }
69247 ls 0.152370 RET getdirentries 4088/0xff8
69247 ls 0.139939 CALL stat(0x800d8f598,0x7fffffffcca0)
69247 ls 0.130411 RET __acl_get_link 0
69247 ls 0.121602 RET __acl_get_link 0
69247 ls 0.105799 RET getdirentries 4064/0xfe0
69247 ls 0.105069 RET getdirentries 4068/0xfe4
69247 ls 0.096862 RET getdirentries 4028/0xfbc
69247 ls 0.085012 RET getdirentries 4088/0xff8
69247 ls 0.082722 STRU struct stat {dev=846475008, ino=72941319, mode=-rw-r--r-- , nlink=1, uid=0, gid=0, rdev=4294967295, atime=1348686155, stime=1348347621, ctime=1348686155.768875422, birthtime=1348686155.768875422, size=6686225, blksize=131072, blocks=13325, flags=0x0 }
69247 ls 0.070318 STRU struct stat {dev=846475008, ino=46211679, mode=-rw-r--r-- , nlink=1, uid=0, gid=0, rdev=4294967295, atime=1333196475, stime=1240230314, ctime=1333196475.038567672, birthtime=1333196475.038567672, size=829895, blksize=131072, blocks=1797, flags=0x0 }
69247 ls 0.068060 RET getdirentries 4048/0xfd0
69247 ls 0.065118 RET getdirentries 4088/0xff8
69247 ls 0.062536 RET getdirentries 4096/0x1000
69247 ls 0.061118 RET getdirentries 4020/0xfb4
69247 ls 0.055038 STRU struct stat {dev=846475008, ino=46220358, mode=-rw-r--r-- , nlink=1, uid=0, gid=0, rdev=4294967295, atime=1333196720, stime=1274282669, ctime=1333196720.972567345, birthtime=1333196720.972567345, size=382344, blksize=131072, blocks=773, flags=0x0 }
69247 ls 0.054948 STRU struct stat {dev=846475008, ino=75025952, mode=-rw-r--r-- , nlink=1, uid=0, gid=0, rdev=4294967295, atime=1351071350, stime=1349726805, ctime=1351071350.800873870, birthtime=1351071350.800873870, size=2575559, blksize=131072, blocks=5127, flags=0x0 }
69247 ls 0.054828 STRU struct stat {dev=846475008, ino=65021883, mode=-rw-r--r-- , nlink=1, uid=0, gid=0, rdev=4294967295, atime=1335730367, stime=1332843230, ctime=1335730367.541567371, birthtime=1335730367.541567371, size=226347, blksize=131072, blocks=517, flags=0x0 }
69247 ls 0.053743 STRU struct stat {dev=846475008, ino=46222016, mode=-rw-r--r-- , nlink=1, uid=0, gid=0, rdev=4294967295, atime=1333196765, stime=1257110706, ctime=1333196765.206574132, birthtime=1333196765.206574132, size=62112, blksize=62464, blocks=123, flags=0x0 }
69247 ls 0.052015 RET getdirentries 4060/0xfdc
69247 ls 0.051388 RET getdirentries 4068/0xfe4
69247 ls 0.049875 RET getdirentries 4088/0xff8
69247 ls 0.049156 RET getdirentries 4032/0xfc0
69247 ls 0.048609 RET getdirentries 4040/0xfc8
69247 ls 0.048279 RET getdirentries 4032/0xfc0
69247 ls 0.048062 RET getdirentries 4064/0xfe0
69247 ls 0.047577 RET getdirentries 4076/0xfec
(snip)
The STRU lines are the returns from calling lstat().
It looks like both getdirentries and lstat are taking quite a while to return. The shortest return for any lstat() call is 0.000004 seconds, the maximum is 0.190729 seconds, and the average is around 0.0004 seconds. Across 60,000+ files, that makes lstat() alone account for over 20 seconds of the "ls" run.
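(For anyone who wants to reproduce the arithmetic: the STRU deltas can be summed straight out of the kdump output with something like the following, assuming the field layout shown above, i.e. pid, command, relative time, record type:)

kdump -R | awk '$4 == "STRU" { total += $3; n++ } END { printf "%d lstat returns, %.2f s total, %.6f s avg\n", n, total, total / n }'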
I'm prepared to try an L2ARC cache device (with secondarycache=metadata), but I'm having trouble determining how big a device I'd need. We've got >30M inodes on this filesystem now, including some files with extremely long names. Is there some way to determine the amount of metadata on a ZFS filesystem?
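Something along these lines might give a rough estimate ("tank/fs" is just a placeholder for the dataset name here, and zdb -bb walks every block, so it would take a while on a pool this size):

zfs set secondarycache=metadata tank/fs
zdb -bb tank
sysctl kstat.zfs.misc.arcstats | grep meta

If I understand it right, zdb -bb prints a per-object-type breakdown (ZFS directory, DMU dnode, etc.), so summing the non-"ZFS plain file" rows would approximate the on-disk metadata, while the arcstats show how much metadata the ARC is holding at the moment.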