PERFORCE change 123662 for review
Ulf Lilleengen
lulf at FreeBSD.org
Fri Jul 20 12:35:27 UTC 2007
On Thu, Jul 19, 2007 at 11:01:20PM -0500, Eric Anderson wrote:
> On 07/17/07 16:09, Ulf Lilleengen wrote:
> > http://perforce.freebsd.org/chv.cgi?CH=123662
> > Change 123662 by lulf at lulf_carrot on 2007/07/17 21:08:43
> > - Initial implementation of growing RAID-5 arrays. This is done by
> > splitting the offset calculation into one for read and one for write
> > operations. We distinguish between subdisks that were added after
> > the plex was no longer newborn and subdisks that were added at
> > creation/tasting time. When a BIO write comes, the write will go to
> > the whole plex, but read operations will only be done on subdisks that
> > do not have the GV_SD_GROW flag set. The bad thing with this is that
> > we must ensure that new subdisks are added to a later plexoffset
> > (which we should force, to make it easier for us, since there is not a
> > good reason why the user should be able to set the plexoffset in this
> > operation). The implementation will probably change a bit.
> > - Add another state called RESIZING, and a flag called GV_PLEX_GROWING
> > to indicate that a plex is in growing operation.
> > - Make sure obvious parts of the code respect this flag. Will need to
> > look over this more though.
>
>
>
> Hi -
>
> So far, I'm very excited about your gvinum work - great work so far!
>
> I'm curious how you are growing a RAID5. Can you describe this method a bit
> more? Where did you see how to do this?
>
>
Hi,
Well, what I do is attach/create the new subdisk as usual, but since it's a
RAID-5 array that I know is operational, I give the subdisk a flag and set the
plex to a resizing state. Then, in the RAID-5 code, I modify gv_raid5_offset
(which basically computes offsets within a subdisk based on the number of
subdisks and the stripesize). Instead of including all subdisks in the
calculation, I only count those that do not have the GROW flag when reading,
and I count all subdisks when writing.
This means that if I create a gv_grow_plex function that reads (stripesize x
sdcount) bytes (from the subdisks that do not have the GROW flag) and writes
that data back to the plex (including all subdisks), I sort of overwrite
the old data, but the data ends up spread out over the new subdisks as well.
I'm sorry if this seems a bit complex; just ask more questions if anything is
unclear.
Actually, I didn't read this anywhere. I sort of thought this out myself :P
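To make the read/write split more concrete, here is a rough sketch in plain
C. It is not the actual gv_raid5_offset code: the struct, the function names
(r5_offset, r5_map_io) and the parity rotation used here are all made up for
illustration, and the real gvinum layout may differ. It only shows the idea
that the same plex offset is mapped with the old subdisk count on reads and
with the full subdisk count on writes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type; not a gvinum structure. */
struct r5_map {
	int		data_sd;	/* subdisk index holding the data */
	int		parity_sd;	/* subdisk index holding the parity */
	uint64_t	sd_off;		/* byte offset within the subdisk */
};

/*
 * Map a plex byte offset onto a simplified RAID-5 layout with nsd
 * subdisks.  Assumption: parity rotates forward by one subdisk per
 * stripe row; real gvinum may rotate differently.
 */
static struct r5_map
r5_offset(uint64_t plex_off, uint64_t stripesize, int nsd)
{
	struct r5_map m;
	uint64_t blk, row;

	assert(nsd >= 3 && stripesize > 0);

	blk = plex_off / stripesize;		/* data block number in plex */
	row = blk / (uint64_t)(nsd - 1);	/* stripe row (nsd-1 data blocks each) */
	m.parity_sd = (int)(row % (uint64_t)nsd);
	m.data_sd = (int)(blk % (uint64_t)(nsd - 1));
	if (m.data_sd >= m.parity_sd)
		m.data_sd++;			/* skip over the parity subdisk */
	m.sd_off = row * stripesize + plex_off % stripesize;
	return (m);
}

/*
 * Reads only count subdisks without the GROW flag (nsd_old); writes
 * count every subdisk (nsd_old + nsd_grow).
 */
static struct r5_map
r5_map_io(uint64_t plex_off, uint64_t stripesize, int nsd_old, int nsd_grow,
    int is_write)
{
	int nsd;

	nsd = is_write ? nsd_old + nsd_grow : nsd_old;
	return (r5_offset(plex_off, stripesize, nsd));
}
```

With a 64k stripesize, three old subdisks and one new one, plex offset 256k
maps to subdisk 0 at offset 128k on a read, but to subdisk 2 at offset 64k on
a write (under the made-up rotation above). A grow pass that reads a chunk
through the read mapping and writes it back through the write mapping would
thus move the data into the wider layout.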
--
Ulf Lilleengen