kern/117954: dirhash on very large directories blocks the machine for tens of seconds
martin at nowhere.com
Sat Nov 10 00:10:02 PST 2007
>Synopsis: dirhash on very large directories blocks the machine for tens of seconds
>Arrival-Date: Sat Nov 10 08:10:00 UTC 2007
>Originator: Martin Birgmeier
>Release: FreeBSD 6.2-RELEASE i386
MBi at home
System: FreeBSD gandalf.xyzzy 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Sat Jan 13 20:23:55 CET 2007 root at gandalf.xyzzy:/d/14.1/OBJ/FreeBSD/RELENG_6_2_0_RELEASE/src/sys/XYZZY i386
This machine is about 7 years old, with an Athlon 800 MHz processor. The disks can sustain about 10 Mbyte/sec both read and write.
I am mirroring the KDE subversion repository via rsync. KDE is currently at rev. 734839, meaning that there are two subdirectories (revs and revprops) holding 734840 files each. For this to work at all, I have enabled dirhash and set the hashing area to 32 MB via vfs.ufs.dirhash_maxmem=33554432 in sysctl.conf.
The problem is that whenever the hashing is done (i.e., when these directories are accessed again after not having been in the kernel for some time), they are read in by the dirhash algorithm. While doing this, it consumes a lot of processor time (my xload jumps to 8+ all at once) and, as far as I can make out in such a situation, all (or at least most) of the available disk bandwidth.
For my machine the behavior is so bad that for about a minute the X Window system freezes completely (including the cursor). (Note that in fact it is more like 2 x 30 secs, evidently one freeze for each of the two directories involved.) The xload spike only becomes visible after this. Also, as I am using pppoa (ADSL over USB, basically), the buffers allotted to this are exhausted, as shown by log messages on the console. To me this looks as if even interrupts are no longer being serviced.
Enter a directory with more than 250k entries after it has not been accessed for a long time.
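A rough sketch of how such a test directory could be set up and a lookup timed. This is only an illustration: the helper names make_big_dir and time_first_lookup are made up for this example, the files are empty (unlike real svn revision files), and the script cannot by itself make the directory "cold". For that, the file system would have to be unmounted and remounted, or the machine left alone long enough, between creating the directory and timing the lookup.

```python
import os
import time


def make_big_dir(path, count):
    """Create `count` empty files named 0..count-1 in `path` (hypothetical helper)."""
    os.makedirs(path, exist_ok=True)
    for i in range(count):
        # Numeric names roughly mimic the layout of the revs/revprops directories.
        with open(os.path.join(path, str(i)), "w"):
            pass


def time_first_lookup(path, name):
    """Return the wall-clock seconds taken by one stat() of `name` inside `path`."""
    start = time.monotonic()
    os.stat(os.path.join(path, name))
    return time.monotonic() - start


# Example use (slow, and needs enough free inodes):
#   make_big_dir("/tmp/bigdir", 250001)
#   ... remount the file system or wait, so the directory is no longer cached ...
#   print(time_first_lookup("/tmp/bigdir", "250000"))
```

On a warm cache the reported time will be tiny; the stall described above should only show up on the first access after the directory has fallen out of the kernel's memory.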
I assume that the fix involves modifying the dirhash algorithm such that it obeys standard process scheduling behavior, especially with regard to relinquishing the CPU according to the process's scheduling parameters.
This probably means that the syscall in question can no longer be implemented as a single atomic operation (which it currently seems to be).
Since I am no expert in this area, please take those ideas with a grain of salt!
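To illustrate what I mean by relinquishing the CPU, here is a userland analogy only. It is not kernel code and not how dirhash is actually implemented; the function build_dir_hash, the batch size, and the use of time.sleep(0) as a stand-in for a yield are all assumptions made for this sketch. The idea is simply that the hash is built in batches, with a chance for other work to run between batches.

```python
import time


def build_dir_hash(names, batch=1024, yield_cpu=lambda: time.sleep(0)):
    """Map each entry name to its position, yielding the CPU after every `batch` entries.

    Returns the table and the number of times the CPU was yielded.
    """
    table = {}
    yields = 0
    for offset, name in enumerate(names):
        table[name] = offset
        # After each full batch, give other runnable work a chance to proceed.
        if (offset + 1) % batch == 0:
            yield_cpu()
            yields += 1
    return table, yields
```

For a directory of 734840 entries and a batch size of 1024 this would yield several hundred times during the build instead of holding the CPU for the whole operation. Whether something similar is feasible inside the kernel, given the locking involved, I cannot judge.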
Please note that the e-mail address given above is not valid, as I am paranoid about spam. Simply reply by adding to the PR; I'll monitor it regularly.