ZFS guidelines - preparing for future storage expansion

James R. Van Artsdalen james-freebsd-fs2 at jrv.org
Tue Dec 1 00:31:52 UTC 2009


Andrew Snow wrote:
> Currently there is no "read-ahead" for scrubbing and resilvering, so
> it only talks to one disk at a time and proceeds using only about
> half the I/O capacity of your disks (or less). Read-ahead is one of
> the planned features for ZFS next year.

gstat sometimes shows multiple outstanding I/O requests to a drive
during a scrub, and all of the disks are lit up at the same time: there
is no one-disk-at-a-time behavior.

I see roughly 500 MB/sec during a scrub, which is around 50% of the
theoretical bandwidth of both the disk-to-HBA links and the
HBA-to-system slot in my case.  I hope to be able to fix both this
spring and see if I can reach gigabyte-per-second levels, especially for
userland reads (I've seen 420 MB/s so far).

(Each vdev in my case is a 2-way mirror, so 500 MB/s of disk reads is
250 MB/s of user data.)
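
For anyone who wants to redo that arithmetic for a different layout,
here is a minimal Python sketch. The function names are made up for
illustration, and it assumes a scrub reads every copy on every side of
the mirror, so the unique-data rate is the raw disk rate divided by the
mirror width.

```python
def user_data_rate(disk_rate_mb_s, mirror_ways):
    """Unique data rate when a scrub reads each block from every side of the mirror."""
    return disk_rate_mb_s / mirror_ways


def implied_ceiling(observed_mb_s, fraction_of_theoretical):
    """Theoretical bandwidth implied by an observed rate and its share of the limit."""
    return observed_mb_s / fraction_of_theoretical


# Figures from this message: 500 MB/s of disk reads, 2-way mirrors, ~50% of the limit.
print(user_data_rate(500, 2))      # 250.0
print(implied_ceiling(500, 0.5))   # 1000.0
```

The second number is where the hope for gigabyte-per-second levels
comes from: 500 MB/s at about half the link limit implies a ceiling of
about 1000 MB/s.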

> Also, when your disks are 98% or more full and you are doing any
> writes at all ZFS spends a long time looking for free blocks with an
> inefficient algorithm.  An improved "disk full" algorithm is also
> planned for next year.

As the disk approaches 100% capacity the free space list(s) become
shorter, not longer.  It's fragmentation, or the need to search a long
time for a large block in the right area, that is likely the problem. 
If you can accept the block at the head of the list there is no search
at all.
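
To illustrate the point, here is a toy first-fit search over a free
list in Python. This is not ZFS's allocator, only a sketch under the
assumption that free space is a simple list of (offset, length)
extents; the names are hypothetical. It counts how many entries get
examined before a request is satisfied.

```python
def first_fit(free_list, size):
    """Return (extent, entries_examined) for the first free extent of at least `size`."""
    for examined, (offset, length) in enumerate(free_list, start=1):
        if length >= size:
            return (offset, length), examined
    # Nothing fits: the whole list was walked.
    return None, len(free_list)


# A fragmented pool: 1000 small holes of 8 units, then one hole of 128 units.
fragmented = [(i * 16, 8) for i in range(1000)] + [(16000, 128)]

print(first_fit(fragmented, 8))     # ((0, 8), 1)
print(first_fit(fragmented, 128))   # ((16000, 128), 1001)
```

A small request is satisfied by the head of the list with no search,
while a large request has to walk past every small fragment. The cost
comes from fragmentation and the size wanted, not from the list being
long because the pool is full.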

A quick gstat snapshot during a scrub:

dT: 1.006s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0    238    238  30525    4.4      0      0    0.0   34.8| ada2
    0    235    235  30016    3.1      0      0    0.0   26.3| ada3
    0    274    274  35104    3.7      0      0    0.0   35.4| ada4
    0    277    277  35485    4.0      0      0    0.0   40.5| ada5
    0    273    273  34976    2.9      0      0    0.0   29.4| ada6
    4    270    270  34474    7.2      0      0    0.0   53.4| ada7
    0    271    271  34722    3.2      0      0    0.0   32.4| ada8
    5    270    270  34410    3.4      0      0    0.0   34.0| ada9
    7    268    268  34277    5.8      0      0    0.0   43.6| ada10
    0    267    267  34213    4.3      0      0    0.0   32.1| ada11
    4    269    269  34468    5.7      0      0    0.0   41.6| ada12
    7    268    268  34277    4.7      0      0    0.0   33.4| ada13
    0    277    277  35421    5.1      0      0    0.0   36.5| ada14
    4    270    270  34595    5.4      0      0    0.0   37.5| ada15
    0    269    269  34468    6.3      0      0    0.0   43.9| ada16
    0    275    275  35167    6.2      0      0    0.0   44.9| ada17
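
The per-disk figures above can be added up to check the aggregate
number. This is a small Python sketch that parses the rows as pasted
here; the column positions are taken from the header line above, and
treating 1 MB as 1000 kB is an assumption about the units.

```python
GSTAT_ROWS = """\
    0    238    238  30525    4.4      0      0    0.0   34.8| ada2
    0    235    235  30016    3.1      0      0    0.0   26.3| ada3
    0    274    274  35104    3.7      0      0    0.0   35.4| ada4
    0    277    277  35485    4.0      0      0    0.0   40.5| ada5
    0    273    273  34976    2.9      0      0    0.0   29.4| ada6
    4    270    270  34474    7.2      0      0    0.0   53.4| ada7
    0    271    271  34722    3.2      0      0    0.0   32.4| ada8
    5    270    270  34410    3.4      0      0    0.0   34.0| ada9
    7    268    268  34277    5.8      0      0    0.0   43.6| ada10
    0    267    267  34213    4.3      0      0    0.0   32.1| ada11
    4    269    269  34468    5.7      0      0    0.0   41.6| ada12
    7    268    268  34277    4.7      0      0    0.0   33.4| ada13
    0    277    277  35421    5.1      0      0    0.0   36.5| ada14
    4    270    270  34595    5.4      0      0    0.0   37.5| ada15
    0    269    269  34468    6.3      0      0    0.0   43.9| ada16
    0    275    275  35167    6.2      0      0    0.0   44.9| ada17
"""


def parse_row(line):
    """Pick the device name, read kBps, and %busy out of one gstat row."""
    fields = line.split()
    return {
        "name": fields[-1],
        "read_kbps": int(fields[3]),               # first kBps column (reads)
        "busy_pct": float(fields[8].rstrip("|")),  # %busy has a trailing '|'
    }


rows = [parse_row(line) for line in GSTAT_ROWS.splitlines() if line.strip()]
total_kbps = sum(row["read_kbps"] for row in rows)

print(len(rows), "disks,", total_kbps, "kBps total")
print("approx MB/s:", total_kbps / 1000)
```

The total comes to 546598 kBps across the 16 disks, which is in line
with the roughly 500 MB/sec figure, and every disk is reading at about
the same rate in the same one-second sample.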


