svn commit: r216984 - projects/graid/head/sys/geom/raid

Pawel Jakub Dawidek pjd at FreeBSD.org
Wed Jan 5 13:46:29 UTC 2011


On Wed, Jan 05, 2011 at 02:41:34PM +0200, Alexander Motin wrote:
> On 05.01.2011 10:39, Pawel Jakub Dawidek wrote:
> >On Wed, Jan 05, 2011 at 12:19:40AM +0000, Warner Losh wrote:
> >>Author: imp
> >>Date: Wed Jan  5 00:19:40 2011
> >>New Revision: 216984
> >>URL: http://svn.freebsd.org/changeset/base/216984
> >>
> >>Log:
> >>   First pass at error recovery: if the first disk that we get errors on
> >>   has a problem, try from the second one.  Note info about possible bad
> >>   sector remap attempt through write, and some ideas on when to eject
> >>   the subdisk from the disk.
> >
> >My ideas on what to do on I/O error mostly match yours:
> >- On read error, read from the other disk, write the data back to the
> >   first disk.  Before you return the data up, you must wait for the
> >   write to complete.  If you don't wait, you can lose a race with a new
> >   write request going into the same area and overwrite the new data
> >   with the old.
> 
> In the design document we have planned a range locking mechanism for
> use here and during synchronization/rebuild.

Range locking is definitely a good idea. It is a must-have for
RAID4/RAID5, but also for RAID1 when you synchronize.
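
To illustrate the read-error case combined with range locking, here is a
minimal user-space sketch in Python. It is not graid code: RangeLock,
Disk, mirror_read and mirror_write are hypothetical names, and the "bad
sector is fixed by rewriting it" behaviour is only a model of a remap on
write. The point is that the write-back finishes before the range is
released, so a new write to the same area cannot be overwritten by old
data.

```python
import threading


class RangeLock:
    """Hypothetical range lock: overlapping byte ranges are serialized."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = []  # list of (start, end) ranges currently locked

    def acquire(self, offset, length):
        start, end = offset, offset + length
        with self._cond:
            # Wait until no held range overlaps [start, end).
            while any(s < end and start < e for s, e in self._held):
                self._cond.wait()
            self._held.append((start, end))

    def release(self, offset, length):
        with self._cond:
            self._held.remove((offset, offset + length))
            self._cond.notify_all()


class Disk:
    """Toy disk; offsets listed in `bad` fail on read until rewritten."""

    def __init__(self, size, bad=()):
        self.data = bytearray(size)
        self.bad = set(bad)

    def read(self, offset, length):
        if any(o in self.bad for o in range(offset, offset + length)):
            raise IOError("read error")
        return bytes(self.data[offset:offset + length])

    def write(self, offset, buf):
        self.data[offset:offset + len(buf)] = buf
        # Assumption: a successful write remaps the bad sector.
        self.bad -= set(range(offset, offset + len(buf)))


def mirror_read(lock, first, second, offset, length):
    lock.acquire(offset, length)
    try:
        try:
            return first.read(offset, length)
        except IOError:
            data = second.read(offset, length)
            # The write-back completes before we return and release the range.
            first.write(offset, data)
            return data
    finally:
        lock.release(offset, length)


def mirror_write(lock, disks, offset, buf):
    lock.acquire(offset, len(buf))
    try:
        for disk in disks:
            disk.write(offset, buf)
    finally:
        lock.release(offset, len(buf))
```

Because mirror_write has to take the same range lock, it cannot slip in
between the recovery read and the write-back.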

> >- On write error you want to mark disk as broken immediately, as from
> >   now on it has stale data and can't be trusted.
> 
> Right. As further steps we have discussed the idea of keeping such
> disks as part of the array, marking them as dirty and avoiding reads
> from them. If the main disk suddenly fails, a partially broken disk is
> probably better than nothing.

I agree that this is more intuitive and easier for the user to observe
which disk exactly broke and why.
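
A rough sketch of that policy in Python, with hypothetical names
(Subdisk, on_write_error, pick_read_source) that do not come from the
graid sources: a write error only marks the subdisk dirty, and reads fall
back to a dirty subdisk only when no clean one is left.

```python
class Subdisk:
    """Hypothetical per-disk state kept by the array."""

    def __init__(self, name):
        self.name = name
        self.dirty = False   # a write failed, contents may be stale
        self.failed = False  # the disk is gone entirely


def on_write_error(subdisk):
    # Keep the subdisk in the array, but stop trusting its data.
    subdisk.dirty = True


def pick_read_source(subdisks):
    alive = [d for d in subdisks if not d.failed]
    clean = [d for d in alive if not d.dirty]
    if clean:
        return clean[0]
    # No clean disk left: a dirty one is better than nothing.
    return alive[0] if alive else None
```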

> >How do you plan to detect if there was unclean shutdown and you need to
> >synchronize the disks?
> 
> It depends on the metadata format. Intel metadata, according to the
> Linux sources, seems to have some flags related to this case. I have
> planned to implement the logic used by gmirror (dirty on first write and
> clean on close or after a timeout) using those flags and the metadata
> sequence numbers.

I was also thinking about flash-friendly resync. Currently gmirror
synchronizes the entire thing by reading data from one component and
writing to the other one. Flash-friendly synchronization will read data
from both components and write only if they differ.
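
As a rough illustration of the idea (not gmirror code), here is a small
Python sketch. The function name flash_friendly_resync, the chunk size
and the use of in-memory bytearrays of equal length in place of real
components are all assumptions made for the example.

```python
def flash_friendly_resync(source, target, chunk_size=4):
    """Copy source to target chunk by chunk, writing only chunks that differ.

    Both arguments are bytearrays of equal length standing in for the two
    components. Returns the number of chunks actually written.
    """
    writes = 0
    for off in range(0, len(source), chunk_size):
        good = source[off:off + chunk_size]
        if target[off:off + chunk_size] != good:
            target[off:off + chunk_size] = good
            writes += 1
    return writes
```

The cost is an extra read of the second component for every chunk, in
exchange for skipping writes wherever the data is already identical.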

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
pjd at FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!