svn commit: r201658 - head/sbin/geom/class/stripe

Bruce Evans brde at optusnet.com.au
Wed Jan 6 20:01:31 UTC 2010


On Wed, 6 Jan 2010, Ivan Voras wrote:

> 2010/1/6 Alexander Motin <mav at freebsd.org>:
>> Ivan Voras wrote:
>
>>> I think there was one more reason - though I'm not sure if it is still
>>> valid because of your current and future work - the MAXPHYS
>>> limitation. If MAXPHYS is 128k, with 64k stripes data would only be
>>> read from a maximum of 2 drives. With 4k stripes it would be read
>>> from 128/4=32 drives, though I agree 4k is too low in any case
>>> nowadays. I usually choose 16k or 32k for my setups.
>>
>> While you are right about the MAXPHYS influence, and I hope we can
>> raise it in the not so distant future, IMHO it is the file system's
>> business to manage deep enough read-ahead/write-back to keep all
>> drives busy, independently of the MAXPHYS value. With a small MAXPHYS
>> value the FS should just generate more requests in advance. Except
>> for some RAID3/5/6 cases, where short writes are ineffective, the
>> MAXPHYS value should only affect processing overhead.
>
> Yes, my experience which led to the post was mostly on UFS which,
> while it does do read-ahead AFAIK, still does it serially (I think
> this is implied by your experiments with NCQ and ZFS vs UFS) - so in
> any case only 2 drives are hit with a 64k stripe size at any moment in
> time.
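
To make the arithmetic in the quoted text concrete, here is a small
stand-alone sketch (mine, not from the thread; the 128k MAXPHYS value and
the 8-disk stripe are assumptions, and offset alignment is ignored) of how
many member disks a single MAXPHYS-sized request can touch, and roughly
how many requests a file system would need to keep in flight to cover
every disk:

#include <stdio.h>

/*
 * Rough model of a single striped read: a request of 'reqsize' bytes
 * starting at a stripe-aligned offset touches reqsize / stripesize
 * consecutive stripe units, i.e. that many different disks, capped at
 * the number of disks in the stripe.
 */
static unsigned
disks_touched(unsigned reqsize, unsigned stripesize, unsigned ndisks)
{
	unsigned n = (reqsize + stripesize - 1) / stripesize;

	return (n < ndisks ? n : ndisks);
}

int
main(void)
{
	const unsigned maxphys = 128 * 1024;	/* assumed MAXPHYS of 128k */
	const unsigned ndisks = 8;		/* hypothetical 8-disk stripe */
	const unsigned stripesizes[] = { 4096, 16384, 32768, 65536 };
	unsigned i, ss, touched, inflight;

	for (i = 0; i < sizeof(stripesizes) / sizeof(stripesizes[0]); i++) {
		ss = stripesizes[i];
		touched = disks_touched(maxphys, ss, ndisks);
		/* requests the FS must keep outstanding to cover all disks */
		inflight = (ndisks + touched - 1) / touched;
		printf("stripe %6uk: one %uk request touches %u disk(s); "
		    "~%u such requests keep all %u disks busy\n",
		    ss / 1024, maxphys / 1024, touched, inflight, ndisks);
	}
	return (0);
}

With those assumed numbers, a single 128k request covers every disk at
4k-16k stripes but only 2 disks at 64k, so roughly 4 requests would have
to be in flight to keep the whole array busy.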

ffs has no significant knowledge of read-ahead.  Normally it uses vfs
read clustering, which under the most favourable circumstances reduces
to read-ahead of a maximum of MAXPHYS (less the initial size).  If
read clustering is disabled, then ffs does old-style read-ahead of
a whole block (16K).  Most file systems in FreeBSD are similar or
worse (some support a block size of 512 and reading ahead by that
amount gives interestingly slow behaviour).
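
As a rough, stand-alone illustration of that ceiling (the 128k MAXPHYS,
the 16k initial read and the 64k stripe are assumed values; this is only
the arithmetic described above, not the kernel's clustering code):

#include <stdio.h>

/* Assumed, illustrative values -- not taken from the kernel sources. */
#define	MAXPHYS_BYTES	(128 * 1024)	/* assumed MAXPHYS */
#define	INITIAL_READ	(16 * 1024)	/* the read that triggers read-ahead */
#define	STRIPE_SIZE	(64 * 1024)	/* a 64k gstripe stripe */

static void
report(const char *name, unsigned ahead)
{
	printf("%-26s: read-ahead %6u bytes, spans %u stripe unit(s)\n",
	    name, ahead, (ahead + STRIPE_SIZE - 1) / STRIPE_SIZE);
}

int
main(void)
{
	/* best case with vfs clustering: MAXPHYS less the initial read */
	report("vfs clustering (best case)", MAXPHYS_BYTES - INITIAL_READ);
	/* old-style read-ahead: one more file system block */
	report("old-style, 16k block", 16 * 1024);
	report("old-style, 512-byte block", 512);
	return (0);
}

Even the best case spans only two 64k stripe units, and old-style
one-block read-ahead stays within the first.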

Bruce

