defrag

Fri Aug 29 00:42:05 UTC 2008

>
> Essentially, the UFS file system (and its close relatives) is
> intentionally fragmented in a controlled way as the files are written,

exactly. that was invented over 20 years ago and it still works perfectly.

> at sort-of-random locations all over the disk, rather than starting at

it's definitely NOT "sort of random".

it divides the disk into "cylinder groups". it puts new files in the same
cylinder group as the other files in the same directory, BUT when a file
grows large (like over 1MB) it FORCES fragmentation by switching to
another cylinder group.

the reason is simple - having a file fragmented every few megs makes no
noticeable speed difference, while it keeps any single cylinder group
from filling up.
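
a toy sketch of that policy in Python, just to show the idea. it is NOT
the real allocator: the function name, the 1MB switch threshold (taken
from the "like over 1MB" above) and the simple "next group" choice are all
assumptions made for illustration. the real code chooses the next
cylinder group by its own rules.

```python
def assign_groups(file_size, start_group, num_groups, switch_every=1 << 20):
    """Split a file into (cylinder_group, bytes) chunks.

    Hypothetical model: the file starts in its directory's group and is
    forced into the next group after every `switch_every` bytes.
    """
    chunks = []
    group = start_group
    remaining = file_size
    while remaining > 0:
        n = min(remaining, switch_every)
        chunks.append((group, n))
        remaining -= n
        # assumption: plain round-robin; only here to keep the sketch short
        group = (group + 1) % num_groups
    return chunks


small = assign_groups(300 * 1024, start_group=3, num_groups=8)
large = assign_groups(2 * 2**20 + 512 * 1024, start_group=3, num_groups=8)
print(small)   # a small file stays in one group
print(large)   # a large file is spread over several groups
```

a small file stays in its directory's group, while a 2.5MB file ends up
in three groups, so no single group gets filled by one big file.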

for small files there will almost always be space available in the same
cyl group. seek time within a cylinder group is on the order of 2-3ms at
most.

UFS was optimized for rotational delay from the beginning too, by dividing
tracks into multiple "angle zones", so if it has to fragment within a
cylinder group, it chooses space in the zone that gives the shortest
possible rotational delay after the head movement.
same for seeking between an inode and its file data.

unfortunately - modern drives hide their real geometry, so this
optimization doesn't work any more. this is quite a large loss: for a
7200rpm drive one rotation is about 8.3 ms and the average rotational
delay about 4.2 ms, which could be half that or less if such optimization
were possible.
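
the rotation figures follow directly from the spindle speed. a minimal
Python sketch of the arithmetic (function names are made up for this
example; the average delay assumes the wanted sector is equally likely to
be anywhere on the track, i.e. half a rotation on average):

```python
def rotation_ms(rpm):
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_delay_ms(rpm):
    """Average wait for a sector at a random angle: half a rotation."""
    return rotation_ms(rpm) / 2


for rpm in (5400, 7200, 10000, 15000):
    print(f"{rpm:>5} rpm: {rotation_ms(rpm):.2f} ms/rotation, "
          f"{average_rotational_delay_ms(rpm):.2f} ms average delay")
```

a 7200rpm drive comes out at roughly 8.3 ms per rotation and roughly
4.2 ms average rotational delay.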

UFS does not just prevent fragmentation, it manages it (as an
unavoidable thing) to make its effect as small as possible.

all of this worked fine and efficiently on roughly 1 MIPS computers like
the VAX. UFS has changed a lot since then, but this basic mechanism is
still the same.

except in extreme cases there is never a need to defragment a UFS
filesystem!!!

> the remaining space as efficiently as possible, at the cost of speed.

it can still keep fragmentation quite low with much less free space
available (unless it's really close to 0%); the "low speed" mostly means
higher CPU load when selecting blocks to allocate. on modern machines
(1 GHz or more) it's difficult to see any difference.

> large files, you can adjust some of the parameters (e.g. using
> tunefs(8)) so the filesystem will handle large files more efficiently,
> at the expense of wasting space on small files.

rather by newfs, by making huge blocks like -b 65536 -f 8192, and making
MUCH fewer inodes (like -i 1048576)

still - it will then lose about as much space as FAT32 with 8kB clusters,
which is AFAIK the default for FAT32 on large drives.
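
the comparison holds because the smallest allocation unit is the same in
both cases (8kB fragment vs. 8kB cluster). a rough Python sketch of the
loss per file, assuming every file is simply rounded up to a whole number
of units and ignoring metadata and any finer allocation rules (function
names are hypothetical):

```python
def allocated_bytes(file_size, unit=8192):
    """Space taken on disk when the file is rounded up to whole units."""
    units = -(-file_size // unit)   # ceiling division
    return units * unit


def slack_bytes(file_size, unit=8192):
    """Bytes allocated but not used by the file's data."""
    return allocated_bytes(file_size, unit) - file_size


for size in (1, 8192, 10_000, 1_000_000):
    print(size, allocated_bytes(size), slack_bytes(size))
```

the loss per file is always less than one unit, so it only matters when
there are very many small files.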

with huge files, such settings may not only speed things up a bit, but
actually save space by not reserving so much for inodes and bitmaps.
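
to get a feeling for the inode part, here is a rough Python estimate.
assumptions, not facts from this thread: a 256-byte on-disk inode (UFS2),
a hypothetical 500 GiB filesystem, and 8192 bytes per inode as the
"dense" value to compare against. real newfs rounds per cylinder group,
so the true numbers will differ somewhat.

```python
UFS2_INODE_BYTES = 256   # assumption: on-disk size of one UFS2 inode


def inode_count(fs_bytes, bytes_per_inode):
    """Approximate number of inodes for a given -i density."""
    return fs_bytes // bytes_per_inode


def inode_table_bytes(fs_bytes, bytes_per_inode, inode_bytes=UFS2_INODE_BYTES):
    """Approximate space reserved for inodes."""
    return inode_count(fs_bytes, bytes_per_inode) * inode_bytes


fs = 500 * 2**30                          # hypothetical 500 GiB filesystem
dense = inode_table_bytes(fs, 8192)       # assumed dense setting
sparse = inode_table_bytes(fs, 1048576)   # -i 1048576 from above

print(f"dense:  {dense / 2**30:.2f} GiB reserved for inodes")
print(f"sparse: {sparse / 2**20:.0f} MiB reserved for inodes")
```

under these assumptions the sparse setting reserves a small fraction of
what the dense one does, at the price of allowing far fewer files.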