Improving ZFS performance for large directories

Peter Jeremy peter at rulingia.com
Wed Feb 20 08:28:47 UTC 2013


On 2013-Feb-19 14:10:47 -0600, Kevin Day <toasty at dragondata.com> wrote:
>Timing doing an "ls" in large directories 20 times, the first is the
>slowest, then all subsequent listings are roughly the same.

OK.  My testing was on large files rather than large amounts of metadata.

>Thinking I'd make the primary cache metadata only, and the secondary
>cache "all" would improve things,

This won't work as expected.  L2ARC is only fed by data being evicted
from ARC, so by setting ARC to cache metadata only, there's never any
"data" in ARC and hence none is ever evicted from ARC into L2ARC.
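
For what it's worth, the property settings involved look like this;
the dataset name is hypothetical, and keeping primarycache=all is
what lets blocks pass through ARC and become eligible for L2ARC:

    # leave ARC caching both data and metadata so evictions feed L2ARC
    zfs set primarycache=all tank/data

    # restricting the L2ARC side to metadata is fine, if that's the goal
    zfs set secondarycache=metadata tank/data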

> I wiped the device (SATA secure erase to make sure)

That's not necessary.  L2ARC doesn't survive reboots because all the
L2ARC "metadata" is held in ARC only.  This does mean that it takes
quite a while for L2ARC to warm up following a reboot.
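
You can watch the warm-up after a reboot via the kstats; this assumes
the stock FreeBSD sysctl names:

    # bytes currently held in L2ARC; grows slowly from zero after boot
    sysctl kstat.zfs.misc.arcstats.l2_size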

>Before adding the SSD, an "ls" in a directory with 65k files would
>take 10-30 seconds, it's now down to about 0.2 seconds.

That sounds quite good.

> There are roughly 29M files, growing at about 50k files/day. We
>recently upgraded, and are now at 96 3TB drives in the pool. 

That number of files isn't really excessive, but it sounds like your
workload has very low locality.  At this stage, my suggestions are:
1) Disable atime if you don't need it & haven't already.
   Otherwise every file access triggers a metadata update
   (commands for 1) and 2) are sketched after this list).
2) Increase vfs.zfs.arc_meta_limit.
   You're still getting more metadata misses than data misses.
3) Increase your ARC size (more RAM).
   Your pool is quite large compared to your RAM.
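
A minimal sketch of the knobs for 1) and 2); the dataset name is a
placeholder, and the limit shown is an example only (size it to your
RAM):

    # 1) stop atime-driven metadata writes on the affected dataset
    zfs set atime=off tank/data

    # 2) raise the metadata share of ARC, in bytes, via /boot/loader.conf
    #    (16 GiB shown purely as an example)
    vfs.zfs.arc_meta_limit="17179869184"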

>It's a 250G drive, and only 22G is being used, and there's still a
>~66% miss rate.

Keep in mind that the 66% is measured over only the requests that
already missed in ARC, not over all requests, so the overall miss
rate across both caches is far lower.
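
To see where the misses are landing, the ARC kstats break the traffic
down; a sketch using the stock FreeBSD sysctl names:

    # overall ARC traffic
    sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses

    # demand misses split by data vs. metadata
    sysctl kstat.zfs.misc.arcstats.demand_data_misses
    sysctl kstat.zfs.misc.arcstats.demand_metadata_misses

    # L2ARC's view of the requests that fell through ARC
    sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses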

> Is there any way to tell why more metadata isn't
>being pushed to the L2ARC?

ZFS treats writing to L2ARC very much as an afterthought.  L2ARC writes
are rate limited by vfs.zfs.l2arc_write_{boost,max} and will be aborted
if they might interfere with a read.  I'm not sure how to improve it.
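
If you want to experiment with the throttle, the two tunables are
sysctls; values are in bytes, the sizes below are examples only, and
on some releases you may need to set them from /boot/loader.conf
instead:

    # bytes L2ARC may write per feed interval (default is 8 MiB)
    sysctl vfs.zfs.l2arc_write_max=67108864

    # extra bytes allowed per feed while the device is still warming up
    sysctl vfs.zfs.l2arc_write_boost=134217728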

Since this is all generic ZFS, you might like to try asking on
zfs at lists.illumos.org as well.  Some of the experts there might have
some ideas.

-- 
Peter Jeremy