ZFS and large directories - caveat report
Ivan Voras
ivoras at freebsd.org
Thu Jul 21 15:46:38 UTC 2011
I'm writing this mostly for future reference / archiving, and also in
case someone has an idea on how to improve the situation.
A web server I maintain was hit by a DoS attack, which caused more than 4
million PHP session files to be created. The session files are sharded
into 32 directories at a single level - which is normally more than enough
for this web server, as the number of users is only a couple of thousand.
With the DoS, the number of files per shard directory rose to about 130,000.
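For reference, the shard selection looks roughly like the following sketch. The actual server code is PHP; the CRC-based hash, the base path and the session ID below are my illustrative assumptions, not the real implementation:

```shell
#!/bin/sh
# Hypothetical sketch of picking one of 32 shard directories for a
# session file (hash function, path and session ID are assumptions).
sessid="abc123def456"
# Hash the session ID (POSIX cksum emits a CRC) and reduce it mod 32:
crc=$(printf '%s' "$sessid" | cksum | awk '{print $1}')
shard=$(( crc % 32 ))
echo "/var/sessions/$(printf '%02d' "$shard")/sess_$sessid"
```

Any stable hash works here; the only requirement is that the same session ID always maps to the same shard directory.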
The problem is: ZFS has proven horribly inefficient with such large
directories. I have other, more heavily loaded servers with similarly
bad / large directories on UFS where the problem is not nearly as serious
as here (probably thanks to the large dirhash). On this system, any
operation which touches even just the parent of these 32 shards (e.g.
"ls") takes seconds, and a simple "find | wc -l" on one of the shards
did not finish within 30 minutes, at which point I stopped it. Another
symptom is that SIGINT-ing such a find process takes 10-15 seconds to
take effect (which likely means the kernel-side operation is
uninterruptible for that long).
This wouldn't be a problem by itself, but operations on such directories
eat IOPS - clearly visible with the "find" test case - making the rest of
the services on the server suffer as collateral damage. Apparently a huge
amount of seeking is being done, even though I would expect all the data
to be cached for read operations - and somehow the seeking from this
operation takes priority over / livelocks other operations on the same
ZFS pool.
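For anyone who wants to reproduce the test case, here is a minimal, self-contained version of the "find | wc -l" measurement on a synthetic shard directory (scaled down to 1,000 files; on the real server each shard held about 130,000, so scale up accordingly):

```shell
#!/bin/sh
# Build a synthetic shard directory full of empty "session files" and
# count its entries the same way as on the real server. Wrap the find
# in time(1) on a real filesystem to measure the traversal latency.
dir=$(mktemp -d)
i=0
while [ "$i" -lt 1000 ]; do
    : > "$dir/sess_$i"      # create an empty session file
    i=$(( i + 1 ))
done
count=$(find "$dir" -type f | wc -l)
echo "$count"               # prints 1000
rm -rf "$dir"
```

On the affected pool, running this with a six-figure file count while watching gstat(8) in another terminal should show the seek storm described above.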
This is on a fresh 8-STABLE AMD64, pool version 28 and zfs version 5.
Is there an equivalent of the UFS dirhash memory setting for ZFS (i.e. a
tunable for the size of the metadata cache)?
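I don't know of a direct counterpart, but for comparison, these are the knobs I have been looking at (sysctl names as I understand them on 8-STABLE; treat the ZFS ones as an assumption on my part):

```shell
# UFS: memory available to dirhash (the setting I mean above):
sysctl vfs.ufs.dirhash_maxmem
# ZFS: the closest thing I have found is the ARC metadata limit,
# which caps how much of the ARC may hold metadata:
sysctl vfs.zfs.arc_meta_limit
sysctl kstat.zfs.misc.arcstats.arc_meta_used
```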