Improving ZFS performance for large directories

Matthew Ahrens mahrens at delphix.com
Tue Jan 29 23:42:30 UTC 2013


On Tue, Jan 29, 2013 at 3:20 PM, Kevin Day <toasty at dragondata.com> wrote:

> I'm prepared to try an L2arc cache device (with secondarycache=metadata),


You might first see how long it takes when everything is cached, e.g. by
repeating the operation in the same directory several times.  This will
give you a lower bound on the time it will take (or, put another way, an
upper bound on the improvement available from a cache device).
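
For example, something like the following, assuming the slow operation is
listing the directory (the path here is just a placeholder):

    time ls -l /tank/bigdir > /dev/null
    time ls -l /tank/bigdir > /dev/null
    time ls -l /tank/bigdir > /dev/null

The first run pays for the disk reads; once later runs stop getting
faster, you're looking at the fully-cached time.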


> but I'm having trouble determining how big of a device I'd need. We've got
> >30M inodes now on this filesystem, including some files with extremely
> long names. Is there some way to determine the amount of metadata on a ZFS
> filesystem?


For a specific filesystem, nothing comes to mind, but I'm sure you could
cobble something together with zdb.  There are several tools to determine
the amount of metadata in a ZFS storage pool:

 - "zdb -bbb <pool>"
     but this is unreliable on pools that are in use (a sketch of reading
     its output follows this list)
 - "zpool scrub <pool>; <wait for scrub to complete>;
    echo '::walk spa|::zfs_blkstats' | mdb -k"
     the scrub is slow, but this can be mitigated by setting the global
     variable zfs_no_scrub_io to 1.  If you don't have mdb or equivalent
     debugging tools on FreeBSD, you can manually look at
     <spa_t>->spa_dsl_pool->dp_blkstats.
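
If you go the zdb route, something along these lines should be enough to
eyeball it; the exact layout of the block-statistics table varies between
releases, and "tank" here is just a stand-in for your pool name:

    zdb -bbb tank | less

Read off the LSIZE column for the metadata rows (e.g. "DMU dnode",
"ZFS directory", and the indirect-block levels), as opposed to the L0
"ZFS plain file" blocks, which are the file data itself.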

In either case, the "LSIZE" is the size that's required for caching (in
memory or on an L2ARC cache device).  At a minimum you will need 512 bytes
for each file, to cache the dnode_phys_t.
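
As a rough floor for your case: with ~30M files, the dnodes alone come
to about

    echo "30000000 * 512" | bc
    15360000000

i.e. roughly 15GB, before counting anything else (directory blocks,
indirect blocks, and so on).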

--matt

