File system blocks alignment

Fri Dec 25 14:03:08 UTC 2009

Poul-Henning Kamp wrote:
> In message <4B349ABF.2070800 at FreeBSD.org>, Alexander Motin writes:
>> The difference is quite significant. An unaligned RAID0 access involves
>> two disks in its handling, while an aligned one leaves one of the disks
>> free for another request, doubling performance.
> 
> You will find RAID5 writes to be an even better test:  Optimal filesystem
> block-size is a RAID5 stripe width, and if you do not get the offset
> right you instantly lose at least 50% of your write bandwidth.  My
> practical experience says often more like 75% is lost.

Sure, I just had no trusted RAID5 nearby to benchmark.

With RAID5 the situation is even more complicated, as there are
actually two optimal transaction sizes:
- The first is the stripe size - the amount of data written sequentially
to one disk. If you are not aligned with it, you get the same results as
I have just shown.
- The second is the row size - stripe size * number of data disks. You
may freely read less than a full row, but a short write forces the RAID
to do a read-modify-write. If you have 3 disks and no battery-backed
cache, you will definitely lose. But if there are 15 disks and a good
cache, I believe the ability to execute multiple requests independently
in parallel will compensate for the penalty. Also, with 15 disks it would
be impractical to increase the FS block size to a full row, as in that
case the OS would have to do the read-modify-write instead of the
controller, and you may lose even more.
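
To make the two sizes concrete, here is a rough Python sketch. It is not
taken from any RAID implementation: the function names are made up for
illustration, parity rotation is ignored, and the 64K stripe and the
63-sector (32256 byte) partition offset are only assumed example values.

```python
def stripes_touched(offset, length, stripe_size):
    """Count the stripes (per-disk chunks) that a request overlaps.

    offset and length are in bytes, relative to the start of the array.
    Hypothetical helper; parity placement is ignored.
    """
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return last - first + 1


def is_full_row_write(offset, length, stripe_size, data_disks):
    """True if a write covers whole rows only, so no read-modify-write
    is needed to recompute parity (simplified model)."""
    row_size = stripe_size * data_disks
    return length > 0 and offset % row_size == 0 and length % row_size == 0


STRIPE = 64 * 1024  # assumed stripe size in bytes

# An aligned stripe-sized request stays on one disk ...
aligned = stripes_touched(0, STRIPE, STRIPE)
# ... while the same request shifted by 63 sectors spans two.
shifted = stripes_touched(63 * 512, STRIPE, STRIPE)
```

With 2 data disks (a 3-disk RAID5), is_full_row_write(0, 2 * STRIPE,
STRIPE, 2) is true, while a single-stripe write at the same offset is
not and would need the read-modify-write.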

With RAID5 I think the best practice would be to align the FS to the
stripe size and instruct it to write data in maximal bursts, in the best
case a full row at a time.
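
As a small illustration of the alignment part, the sketch below rounds a
partition start up to the next stripe boundary. The helper name and the
numbers (512-byte sectors, 64K stripe, a start at sector 63) are
assumptions for the example, not values from any particular setup.

```python
def align_up(offset, alignment):
    """Round offset up to the nearest multiple of alignment (bytes)."""
    return ((offset + alignment - 1) // alignment) * alignment


SECTOR = 512          # assumed sector size in bytes
STRIPE = 64 * 1024    # assumed stripe size in bytes

# A partition starting at sector 63 is moved to the next stripe boundary.
start_bytes = align_up(63 * SECTOR, STRIPE)
start_sector = start_bytes // SECTOR
```

Here the start moves from sector 63 to sector 128, so FS blocks that
divide the stripe size evenly no longer straddle two disks.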

-- 
Alexander Motin