HAST + ZFS: no action on drive failure
Mikolaj Golub
trociny at freebsd.org
Sun Jul 3 15:55:13 UTC 2011
On Sat, 2 Jul 2011 14:43:15 -0700 Timothy Smith wrote:
TS> Hello Mikolaj,
TS> So, just to be clear, if a local drive fails in my pool, but the
TS> corresponding remote drive remains available, then hastd will both write to
TS> and read from the remote drive? That's really very cool!
Yes.
TS> I looked more closely at the hastd(8) man page. There is some indication of
TS> what you say, but not so clear:
TS> "Read operations (BIO_READ) are handled locally unless I/O error occurs or local
TS> version of the data is not up-to-date yet (synchronization is in progress)."
This is about READ operations, and for WRITE we have just above:
Every write, delete and flush operation (BIO_WRITE,
BIO_DELETE, BIO_FLUSH) is sent to local component and synchronously
replicated to the remote (secondary) node if it is available.
There might be things that should be improved in the documentation, but I don't
feel capable of doing this :-)
TS> Perhaps this can be modified a bit? Adding, "or the local disk is
TS> unavailable. In such a case, the I/O operation will be handled by the remote
TS> resource."
TS> It does make sense however, since HAST is based on the idea of RAID. This
TS> feature increases the redundancy of the system greatly. My boss will be
TS> very impressed, as am I!
TS> I did notice however that when the pulled drive is reinserted, I need to
TS> change the associated hast resource to init, then back to primary to allow
TS> hastd to once again use it (perhaps the same if the secondary drive is
TS> failed?). Unless it will do this on its own after some time? I did not wait
TS> more than a few minutes. But this is easy enough to script or to monitor the
TS> log and present a notification to admin at such a time.
When you reinsert the drive, the resource should be in the init state.
Remember, some data was updated on the secondary only, so the right sequence of
operations could be:
1) Failover (switch primary to init and secondary to primary).
2) Fix the disk issue.
3) If this is a new drive, recreate HAST metadata on it with the hastctl utility.
4) Switch the repaired resource to secondary and wait until the new primary
connects to it and updates metadata. After this, synchronization starts.
5) You can switch to the previous primary before the synchronization is
complete -- it will continue in the right direction, but then you should expect
performance degradation until the synchronization is complete -- the READ
requests will go to the remote node. So it might be better to wait until the
synchronization is complete before switching back.
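
The sequence above could be scripted roughly as follows. This is only a
sketch under assumptions: the resource name "disk0", the RES and DRY_RUN
variables and the function names are made up for illustration, each function
has to be run on the node named in its comment, and any ZFS pool handling
(export/import) around the failover is not shown. With DRY_RUN=1 (the default
here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the recovery sequence for a single HAST resource.
# RES is a hypothetical resource name; adjust it to match hast.conf.
RES="${RES:-disk0}"
# DRY_RUN=1 prints the commands instead of running them (assumed safe default).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1, on the node with the failed disk (the old primary).
demote_failed_node() {
    run hastctl role init "$RES"
}

# Step 1, on the other node: take over as primary.
promote_peer_node() {
    run hastctl role primary "$RES"
}

# Steps 3 and 4, on the repaired node, after the disk issue is fixed (step 2).
# Pass "new" as the first argument if the drive was replaced, so that the
# HAST metadata is recreated before the resource rejoins as secondary.
rejoin_repaired_node() {
    if [ "${1:-}" = "new" ]; then
        run hastctl create "$RES"
    fi
    run hastctl role secondary "$RES"
}

# Step 5: inspect the resource to judge whether synchronization has finished
# before switching the roles back.
show_status() {
    run hastctl status "$RES"
}
```

For a replaced drive one would call demote_failed_node on the broken node,
promote_peer_node on the other one, then rejoin_repaired_node new on the
repaired node, and check show_status on the new primary before switching back.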
--
Mikolaj Golub