Millions of small files: best filesystem / best options
Bruce Evans
brde at optusnet.com.au
Tue May 29 07:35:52 UTC 2012
On Mon, 28 May 2012, Doug Barton wrote:
> On 5/28/2012 10:01 AM, Alessio Focardi wrote:
>> So in my case I would have to use -b 4096 -f 512
>>
>> It's an improvement, but it still is not ideal: still a big waste
>> with 200-byte files!
>
> Are all of the files exactly 200 bytes? If so, that's likely the best
> you can do.
It is easy to do better by using a file system that supports small block
sizes. This might be slow, but it reduces the wastage. Possible file
systems:
- msdosfs has a minimum block size of 512 and handles caching for this
  fairly well for a small number of files, but is probably even slower
  than ffs for a large number of files, especially when directories
  are involved.
- ext2fs has a minimum block size of 1024 and handles caching for this
fairly poorly.
- it is easy to fix ffs to support a minimum block size of 512 (by
  reducing its gratuitous limit of MINBSIZE and fixing the few things
  that break; see the sketch after the notes below):
% magic 19540119 (UFS2) time Tue May 29 16:16:20 2012
% superblock location 65536 id [ 4fc46886 2007c27b ]
% ncg 4 size 1200 blocks 947
% bsize 512 shift 9 mask 0xfffffe00
% fsize 512 shift 9 mask 0xfffffe00
% frag 1 shift 0 fsbtodb 0
% minfree 8% optim time symlinklen 120
% maxbsize 512 maxbpg 64 maxcontig 256 contigsumsize 16
% nbfree 944 ndir 2 nifree 75 nffree 0
% bpg 301 fpg 301 ipg 20
% nindir 64 inopb 2 maxfilesize 136353791
% sbsize 1536 cgsize 512 csaddr 171 cssize 512
% sblkno 144 cblkno 160 iblkno 161 dblkno 171
% cgrotor 0 fmod 0 ronly 0 clean 1
% avgfpdir 64 avgfilesize 16384
% flags none
% fsmnt /mnt
% volname swuid 0
Note that sbsize is now larger than bsize. Most of the things that
break involve wrong checks that sbsize <= bsize. sbsize is not
limited by bsize in either direction, since the super block is
accessed in DEV_BSIZE-blocks, not bsize-blocks, and the upper limit
on its size is not the same as the upper limit on bsize.
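To give the flavour of the fix (an untested sketch only; MINBSIZE
lives in sys/ufs/ffs/fs.h, and the exact set of broken checks depends
on the source version):

/* sys/ufs/ffs/fs.h: relax the gratuitous lower limit on block size. */
#define	MINBSIZE	512		/* was 4096 */

/*
 * Checks of roughly this shape, scattered around the ffs code, are
 * the things that break once bsize can be smaller than sbsize:
 */
/* wrong: the superblock size is not limited by the fs block size */
if (fs->fs_sbsize > fs->fs_bsize)
	return (EINVAL);
/* right: the superblock is read in DEV_BSIZE units, so its real
 * upper limit is SBLOCKSIZE */
if (fs->fs_sbsize > SBLOCKSIZE)
	return (EINVAL);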
> The good news is that it's a big improvement (I've done similar
> stuff in the past). You'll also want to tweak the -i (inode) value to
> ensure that you have sufficient inodes for the number of files you plan
> to store. The default is not likely to be adequate for your needs.
Big is relative. 4K blocks with 200-byte files give a wastage factor
of about 20 (4096 / 200 = 20.48). Metadata adds to that: the inode
alone is 256 bytes with ffs2, only 128 bytes with ffs1, and only 32
bytes (a directory entry) with msdosfs.
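For concreteness, the per-file cost in each case can be tabulated (a
throwaway userland program; it counts one allocation unit plus the
per-file metadata above and ignores directory and indirect-block
overheads):

#include <stdio.h>

int
main(void)
{
	const int payload = 200;	/* bytes per file */
	const struct {
		const char *name;
		int alloc;		/* smallest allocation unit */
		int meta;		/* inode or directory entry */
	} fs[] = {
		{ "ffs2, 4K blocks", 4096, 256 },
		{ "ffs1, 512 frags",  512, 128 },
		{ "msdosfs, 512",     512,  32 },
	};

	for (size_t i = 0; i < sizeof(fs) / sizeof(fs[0]); i++)
		printf("%-16s: %4d bytes on disk, wastage factor %.1f\n",
		    fs[i].name, fs[i].alloc + fs[i].meta,
		    (double)(fs[i].alloc + fs[i].meta) / payload);
	return (0);
}

This prints a factor of about 21.8 for ffs2 with 4K blocks, 3.2 for
ffs1 with 512-byte frags, and 2.7 for msdosfs.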
> ...
But I expect using a file system would be so slow for lots of really
small files that I wouldn't try it. Caching is already poor for
4K files, and a factor of 20 loss won't improve it. If you don't want
to use a database, maybe you can use tar.[gz] files. These at least
reduce the wastage, unless they are compressed (uncompressed they
still waste about twice as much as msdosfs with 512-byte blocks,
since each member costs a 512-byte header plus padding to a 512-byte
boundary). I think there are ways to treat tar files as file systems
and to avoid reading the whole file to find files in it (zip format
is better for this, since it has a central directory).
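E.g., libarchive in the base system can do the sequential walk; a
sketch along these lines (libarchive 3 API; error handling mostly
omitted) finds one member without extracting the rest, although,
unlike zip with its central directory, it still reads every header up
to the match:

#include <archive.h>
#include <archive_entry.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int
main(int argc, char **argv)
{
	struct archive *a;
	struct archive_entry *entry;

	if (argc != 3) {
		fprintf(stderr, "usage: tarfind file.tar member\n");
		return (1);
	}
	a = archive_read_new();
	archive_read_support_format_tar(a);
	if (archive_read_open_filename(a, argv[1], 10240) != ARCHIVE_OK) {
		fprintf(stderr, "%s\n", archive_error_string(a));
		return (1);
	}
	while (archive_read_next_header(a, &entry) == ARCHIVE_OK) {
		if (strcmp(archive_entry_pathname(entry), argv[2]) == 0) {
			printf("%s: %jd bytes\n", argv[2],
			    (intmax_t)archive_entry_size(entry));
			break;
		}
		archive_read_data_skip(a);	/* skip this member's body */
	}
	archive_read_free(a);
	return (0);
}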
Bruce