4.8 ffs_dirpref problem
Ken Marx
kmarx at vicor.com
Wed Oct 29 17:25:58 PST 2003
Don Lewis wrote:
> On 28 Oct, Ken Marx wrote:
>
>>
>>Kirk McKusick wrote:
>
>
>>>It does look like the hash function is having some trouble.
>>>It has been completely revamped in 5.0, but is still using
>>>a "power-of-2" hashing scheme in 4.X. I highly recommend
>>>trying a scheme with non-power-of-2 base. Perhaps something
>>>as simple as changing the hashing to use modulo rather than
>>>logical & (e.g., in bufhash change from & bufhashmask to
>>>% bufhashmask).
>>>
>>> Kirk McKusick
>>>
>>>
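For illustration only, here is a minimal sketch of the difference between
masking and taking a modulus when the table mask has the form 2^n - 1. The
helper names and the example keys are hypothetical; this is not the bufhash
code from vfs_bio.c.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not the 4.X bufhash code.
 * "mask" is assumed to be 2^n - 1 (for example 1023).
 */

/* Power-of-2 scheme: only the low n bits of the key select the bucket. */
static uint32_t
mask_index(uint32_t key, uint32_t mask)
{
	return key & mask;
}

/*
 * Modulo scheme: every bit of the key influences the result, because
 * 2^n is congruent to 1 modulo (2^n - 1). The index "mask" itself is
 * never produced, so one bucket of a (mask + 1)-entry table stays unused.
 */
static uint32_t
mod_index(uint32_t key, uint32_t mask)
{
	return key % mask;
}
```

With mask = 1023, the keys 1024 and 5120 both land in bucket 0 under
mask_index(), while mod_index() puts them in buckets 1 and 5.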
>
>
>>We have a sample 'fix' for the hashtable in vfs_bio.c
>>that uses all the blkno bits. It's in the diff link above.
>>Use as you see fit. However, it too doesn't really address
>>our symptoms significantly. Darn.
>>Bogging down to 1Mb/sec and > 90% system seen.
>
>
> A Fibonacci hash, like I implemented in the kern/kern_mtxpool.c 1.8,
> might be a good choice here, since it tends to distribute the keys
> fairly uniformly. I think this is a secondary issue, though.
>
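As a rough sketch of the general technique (not the kern_mtxpool.c code,
which is not reproduced here), a 32-bit Fibonacci hash multiplies the key by
2^32 divided by the golden ratio and keeps the top bits of the product. The
function names and the 10-bit table size in the note below are assumptions
made for the example.

```c
#include <stdint.h>

/* 2^32 / golden ratio, rounded to the nearest odd integer (0x9E3779B9). */
#define FIB_HASH_MULT 2654435769u

/*
 * Hypothetical helper: map a 32-bit key to one of 2^bits buckets.
 * Requires 1 <= bits <= 31. The multiplication wraps modulo 2^32 and
 * the top "bits" bits of the product become the bucket index.
 */
static uint32_t
fib_hash(uint32_t key, unsigned bits)
{
	return (uint32_t)(key * FIB_HASH_MULT) >> (32 - bits);
}

/* Plain power-of-2 masking, for comparison. Same range for bits. */
static uint32_t
low_bits_hash(uint32_t key, unsigned bits)
{
	return key & ((1u << bits) - 1);
}
```

With bits = 10, keys that differ only above bit 11, such as 1 << 12 and
2 << 12, both map to bucket 0 under low_bits_hash() but to different buckets
under fib_hash().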
> I think the real problem is the following code in ffs_dirpref():
>
>     avgifree = fs->fs_cstotal.cs_nifree / fs->fs_ncg;
>     avgbfree = fs->fs_cstotal.cs_nbfree / fs->fs_ncg;
>     avgndir = fs->fs_cstotal.cs_ndir / fs->fs_ncg;
> [snip]
>     maxndir = min(avgndir + fs->fs_ipg / 16, fs->fs_ipg);
>     minifree = avgifree - fs->fs_ipg / 4;
>     if (minifree < 0)
>         minifree = 0;
>     minbfree = avgbfree - fs->fs_fpg / fs->fs_frag / 4;
>     if (minbfree < 0)
>         minbfree = 0;
> [snip]
>     prefcg = ino_to_cg(fs, pip->i_number);
>     for (cg = prefcg; cg < fs->fs_ncg; cg++)
>         if (fs->fs_cs(fs, cg).cs_ndir < maxndir &&
>             fs->fs_cs(fs, cg).cs_nifree >= minifree &&
>             fs->fs_cs(fs, cg).cs_nbfree >= minbfree) {
>             if (fs->fs_contigdirs[cg] < maxcontigdirs)
>                 return ((ino_t)(fs->fs_ipg * cg));
>         }
>     for (cg = 0; cg < prefcg; cg++)
>         if (fs->fs_cs(fs, cg).cs_ndir < maxndir &&
>             fs->fs_cs(fs, cg).cs_nifree >= minifree &&
>             fs->fs_cs(fs, cg).cs_nbfree >= minbfree) {
>             if (fs->fs_contigdirs[cg] < maxcontigdirs)
>                 return ((ino_t)(fs->fs_ipg * cg));
>         }
>
> If the file system is more than 75% full, minbfree will be zero, which
> will allow new directories to be created in cylinder groups that have no
> free blocks for either the directory itself, or for any files created in
> that directory. If this happens, allocating the blocks for the
> directory and its files will require ffs_alloc() to do an expensive
> search across the cylinder groups for each block. It looks to me like
> minbfree needs to be equal to, or at least a lot closer to, avgbfree.
>
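To make the arithmetic concrete, here is a small standalone sketch of the
minbfree computation quoted above. The function name, the use of int64_t, and
the example geometry (80000 fragments per group, 8 fragments per block) are
assumptions for illustration, not values from the affected file system.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the quoted ffs_dirpref() logic:
 *   minbfree = avgbfree - fs_fpg / fs_frag / 4, clamped at 0.
 * fpg / frag is the number of full blocks per cylinder group, so the
 * threshold reaches 0 once the average free block count drops to a
 * quarter of a group or less.
 */
static int64_t
minbfree_for(int64_t avgbfree, int64_t fpg, int64_t frag)
{
	int64_t minbfree = avgbfree - fpg / frag / 4;

	if (minbfree < 0)
		minbfree = 0;
	return minbfree;
}
```

With fpg = 80000 and frag = 8 there are 10000 blocks per group. An average
of 6000 free blocks gives minbfree = 3500, but an average of 2000 free blocks
(roughly 80% full) gives minbfree = 0, so the cs_nbfree >= minbfree test
accepts a cylinder group with no free blocks at all.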
> A similar situation exists with minifree. Please note that the fallback
> algorithm uses the condition:
> fs->fs_cs(fs, cg).cs_nifree >= avgifree
>
>
>
Interesting. We (Vicor) will defer to experts here, but are very willing to
test anything you come up with.
thanks,
k
--
Ken Marx, kmarx at vicor-nb.com
I insist that we do the right thing and be accountable for the realistic
goals.
- http://www.bigshed.com/cgi-bin/speak.cgi