graid3
Ivan Voras
ivoras at freebsd.org
Sun Jul 27 00:55:51 UTC 2008
Wojciech Puchar wrote:
> i read the graid3 manual and http://www.acnc.com/04_01_03.html to make
> sure i know what's RAID3 and i don't understand few things.
>
> 1)
>
> "The number of components must be equal to 3, 5, 9, 17, etc.
> (2^n + 1)."
>
> why it can't be say 5 disks+parity?
The reason is in the definition of "RAID 3", which says that updates to
the RAID device must be atomic. In some ideal universe, RAID 3 is
implemented in hardware and operates on individual bytes, but here we
cannot write to the drives in units other than the sector size, and the
sector size is 512 bytes.
Parity needs to be calculated for each sector, so at the sector level
the minimum is three sectors: two for data and one for parity. This
means the high-level atomic sectorsize is 2*512=1024 bytes. If you
inspect your RAID 3 devices, you'll see just that:
    # diskinfo -v /dev/raid3/homes
    /dev/raid3/homes
            1024            # sectorsize
            107374181376    # mediasize in bytes (100G)
            104857599       # mediasize in sectors
But each drive has a normal sectorsize of 512:
    # diskinfo -v /dev/ad4
    /dev/ad4
            512             # sectorsize
            80026361856     # mediasize in bytes (75G)
            156301488       # mediasize in sectors
Sector sizes cannot be arbitrary for various reasons, mostly dealing
with how memory pages and virtual memory are managed. In short, they
need to be powers of two. This restricts us to high-level ("big") sector
sizes that are exactly one of the following values: 1024, 2048, 4096,
8192, etc. Since drive sectors are fixed at 512 bytes, this means that
the number of *data* drives must also be a power of two: 2, 4, 8, 16,
etc. Add one more drive for the parity and you get the sequence from the
manual: 3, 5, 9, 17. With 5 data disks plus parity, the array sector
would be 5*512 = 2560 bytes, which is not a power of two.
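To make the arithmetic above concrete, here is a small Python sketch. It
is only an illustration of the rule as described in this message, not
code from graid3; the function names and the 512-byte constant are my
own assumptions.

```python
DRIVE_SECTOR = 512  # bytes; physical sector size assumed in this message


def raid3_geometry(components):
    """Return (data_drives, array_sectorsize) for a component count.

    Raises ValueError when the number of data drives is not a power of
    two, i.e. when the count is not of the form 2^n + 1.
    """
    data = components - 1  # one component holds parity
    # A power of two has exactly one bit set, so data & (data - 1) == 0.
    if data < 2 or data & (data - 1) != 0:
        raise ValueError(
            "%d components gives %d data drives and a %d-byte sector, "
            "which is not a power of two"
            % (components, data, data * DRIVE_SECTOR)
        )
    return data, data * DRIVE_SECTOR


def valid_component_counts(limit):
    """List every acceptable component count up to and including limit."""
    counts = []
    data = 2
    while data + 1 <= limit:
        counts.append(data + 1)
        data *= 2
    return counts


if __name__ == "__main__":
    print(valid_component_counts(33))
    print(raid3_geometry(3))
    print(raid3_geometry(17))
```

Calling raid3_geometry(6), the "5 disks+parity" case from the question,
raises ValueError because 5*512 = 2560 is not a power of two.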
In practice, this means that if you have 17 drives in RAID3, the
sectorsize of the array itself will be 16*512 = 8192. Each write to the
array will update all 17 drives before returning (one sector on each
drive, ensuring an atomic operation). Note that a file system created
on such an array will also have its parameters adjusted to match the
sector size (the fragment size will be the sector size).
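As a rough sketch of what "one sector on each drive" means, the Python
below splits one array sector into 512-byte pieces, adds an XOR parity
piece, and rebuilds a lost piece from the others. The exact on-disk
layout graid3 uses is not described here, so the contiguous split and
all names are assumptions made for illustration.

```python
from functools import reduce

DRIVE_SECTOR = 512  # bytes per physical sector, as in the text


def xor_blocks(blocks):
    """XOR equally sized byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def split_with_parity(array_sector, data_drives):
    """Split one array sector into per-drive sectors plus a parity sector.

    Assumption: drive i receives the i-th contiguous 512-byte piece.
    """
    if len(array_sector) != data_drives * DRIVE_SECTOR:
        raise ValueError("array sector must be data_drives * 512 bytes")
    pieces = [
        array_sector[i * DRIVE_SECTOR:(i + 1) * DRIVE_SECTOR]
        for i in range(data_drives)
    ]
    return pieces + [xor_blocks(pieces)]  # the last element is parity


def rebuild(stripes, missing):
    """Recompute the sector of one failed component from the survivors."""
    survivors = [s for i, s in enumerate(stripes) if i != missing]
    return xor_blocks(survivors)


if __name__ == "__main__":
    sector = bytes(range(256)) * 4           # one 1024-byte array sector
    stripes = split_with_parity(sector, 2)   # 2 data drives + 1 parity
    assert rebuild(stripes, 0) == stripes[0]
    assert b"".join(stripes[:2]) == sector
```

Because every component gets exactly one physical sector per array
sector, a single write touches all drives at once, which is the atomic
update described above.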
> 2) "-r Use parity component for reading in round-robin fashion.
> "Without this option the parity component is not used at
> all for reading operations when the device is in a complete state.
> With this option specified random I/O read operations are even 40% faster
> , but sequential reads are slower. One cannot use this option if the -w
> option is also specified."
>
>
> how parity disk could speed up random I/O?
It will work well only when the number of drives is small (i.e. three
drives), by using the parity drive as a valid source of data, which
avoids seeking on every drive for every read. I think that,
theoretically, you can save at most one third (1/3) of all seeks - I
don't know where the 40% number comes from.
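The 1/3 estimate can be checked with a toy model in Python. It assumes
that with -r each read skips one component in strict rotation and
reconstructs the skipped piece from parity; that is my reading of
"round-robin", not something taken from the graid3 source, and the
function names are made up.

```python
from collections import Counter
from fractions import Fraction


def round_robin_reads(components, requests):
    """Count reads per component when each request skips one component.

    Assumption: request r skips component r % components; the other
    components (data and/or parity) are read and the skipped piece is
    reconstructed with XOR.
    """
    reads = Counter()
    for r in range(requests):
        skipped = r % components
        for c in range(components):
            if c != skipped:
                reads[c] += 1
    return reads


def saved_fraction(components):
    """Fraction of requests each drive gets to sit out in this model."""
    requests = components * 100
    reads = round_robin_reads(components, requests)
    return Fraction(requests - reads[0], requests)


if __name__ == "__main__":
    for n in (3, 5, 9, 17):
        print(n, saved_fraction(n))
```

In this model each drive sits out 1/n of the requests, so the saving is
1/3 with three components and shrinks quickly as components are added,
which is consistent with the remark about small arrays.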