Weird issue with hastd(8)

Sat Jun 25 14:54:14 UTC 2011

On Fri, Jun 3, 2011 at 11:26 AM, Maxim Sobolev <sobomax at freebsd.org> wrote:
>
> I would also like to get your input on my two other patches - randomization
> of the synchronization pattern and ad-hoc asynchronous mode. Hastd appears
> extremely useful to synchronize large virtual disks over slow links without
> taking live virtual machine offline.

To me, the idea of sending updates to the secondary only via the
synchronization thread, started periodically, looks
interesting. Sure, it should not be a replacement for "real"
async mode, but having something like this in HAST alongside the
other synchronization modes might be useful.

Compared with the "real" async mode described in the manual, it
has the following advantages:

1) It is much easier to implement.

2) If you have frequent updates of the same blocks, "real" async
will send them all, while with the sync thread approach we will
skip many intermediate updates.
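
To illustrate point 2, here is a minimal sketch of why tracking
dirty extents coalesces repeated writes. This is not HAST code:
EXTENT_SIZE, NEXTENTS, struct dirty_map and the function names are
made up for the example. However many times the same extent is
written between two runs of the sync thread, it occupies one bit
and is sent once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen only for this example. */
#define	EXTENT_SIZE	(2 * 1024 * 1024)	/* bytes covered by one bit */
#define	NEXTENTS	64

struct dirty_map {
	uint8_t	bits[NEXTENTS / 8];
};

static void
dirty_map_init(struct dirty_map *dm)
{
	memset(dm->bits, 0, sizeof(dm->bits));
}

/*
 * Record a write.  Writing the same extent again just sets a bit
 * that is already set, so intermediate updates cost nothing extra.
 */
static void
mark_write(struct dirty_map *dm, uint64_t offset, uint64_t length)
{
	uint64_t ext, first, last;

	assert(length > 0);
	first = offset / EXTENT_SIZE;
	last = (offset + length - 1) / EXTENT_SIZE;
	assert(last < NEXTENTS);
	for (ext = first; ext <= last; ext++)
		dm->bits[ext / 8] |= (uint8_t)(1u << (ext % 8));
}

/* Number of extents the sync thread would have to send. */
static size_t
count_dirty(const struct dirty_map *dm)
{
	size_t ext, n = 0;

	for (ext = 0; ext < NEXTENTS; ext++) {
		if (dm->bits[ext / 8] & (1u << (ext % 8)))
			n++;
	}
	return (n);
}
```

A thousand 512-byte writes at the same offset leave count_dirty()
at 1, where "real" async would have sent a thousand requests.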

Even if we don't run the sync thread very frequently and HAST
fails over, it may still sync the dirty buffers from the previous
master.

It might be useful for backing up volumes over a WAN, instead of
rsync or zfs send.

There is a disadvantage -- instead of sending only one dirty
block we synchronize the whole extent (see below for how this may
be improved, though).

But let me describe the problems with your patch:

http://sobomax.sippysoft.com/primary.c.diff

In your approach you still pass the requests to the send thread
but mark them there as failed, so they are not actually sent and
the extent is marked as needing sync.

You don't start the sync thread. In your case it starts only after
reconnecting to the secondary. You have frequent reconnects for
the following reason. Because there are requests in the send
thread, it does not send keepalive requests (it sends them only
when it is idle), but the requests are not actually sent either,
so the secondary, not receiving any data from the primary, exits
on timeout. Frequent reconnects are certainly bad.
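
The effect can be shown with a toy model. This is not the real
send thread: enum wire, send_step() and silent_steps() are
invented for the illustration, and one "step" stands for one pass
of the loop. With a queue that is never empty and requests marked
as failed, nothing ever reaches the wire.

```c
#include <assert.h>
#include <stdbool.h>

/* What one pass of the toy send loop puts on the wire. */
enum wire { WIRE_NOTHING, WIRE_KEEPALIVE, WIRE_DATA };

static enum wire
send_step(bool queue_empty, bool mark_failed)
{
	if (queue_empty)
		return (WIRE_KEEPALIVE);	/* idle: keepalive goes out */
	if (mark_failed)
		return (WIRE_NOTHING);		/* dequeued, "failed", not sent */
	return (WIRE_DATA);			/* normal case: data goes out */
}

/*
 * Longest run of passes during which the secondary receives
 * nothing, for a primary whose queue is never empty.
 */
static int
silent_steps(int nsteps, bool mark_failed)
{
	int i, longest = 0, run = 0;

	assert(nsteps >= 0);
	for (i = 0; i < nsteps; i++) {
		if (send_step(false, mark_failed) == WIRE_NOTHING) {
			run++;
			if (run > longest)
				longest = run;
		} else {
			run = 0;
		}
	}
	return (longest);
}
```

Under steady write load silent_steps() grows without bound when
mark_failed is true, which is exactly the silence that makes the
secondary time out.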

Also, the problem you described in the "randomization" thread
looks like it is only possible with your patch. As the request
"fails" in the send thread, the extent is marked as needing sync;
if the sync thread is running at that time, you may observe the
same frequently updated extent being resent again and again.
Without your patch an extent may be marked as needing sync only
when the connection to the secondary is lost, so synchronization
is not running at that moment.

I think the right approach could be:

1) Don't put the request to the send thread at all.

2) When the request is returned to the kernel, it still remains
dirty in memmap.

3) Periodically, the dirty (in memmap) extents are marked as
needing sync and the sync thread is woken up.
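
The three steps can be sketched roughly as follows. This is only
an illustration and not the patch itself: struct extmaps and the
functions write_done(), promote_dirty() and sync_done() are
hypothetical, the maps are plain arrays, and locking is left out.
The sketch also ignores writes that arrive while an extent is
being synchronized, which real code would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	NEXTENTS	16	/* hypothetical, small for the example */

struct extmaps {
	bool	memmap[NEXTENTS];	/* dirty: written locally */
	bool	syncmap[NEXTENTS];	/* needs sync: to be sent */
};

/*
 * Steps 1 and 2: the local write is done and the request goes
 * back to the kernel.  Nothing is handed to the send thread; the
 * extent simply stays dirty.
 */
static void
write_done(struct extmaps *m, size_t ext)
{
	assert(ext < NEXTENTS);
	m->memmap[ext] = true;
}

/*
 * Step 3: called periodically.  Returns how many extents were
 * newly marked as needing sync; the caller wakes up the sync
 * thread when the result is non-zero.
 */
static size_t
promote_dirty(struct extmaps *m)
{
	size_t ext, n = 0;

	for (ext = 0; ext < NEXTENTS; ext++) {
		if (m->memmap[ext] && !m->syncmap[ext]) {
			m->syncmap[ext] = true;
			n++;
		}
	}
	return (n);
}

/* Sync thread side: the extent has reached the secondary. */
static void
sync_done(struct extmaps *m, size_t ext)
{
	assert(ext < NEXTENTS);
	m->syncmap[ext] = false;
	m->memmap[ext] = false;
}
```

In a real daemon promote_dirty() would run under the map lock and
be followed by a condition variable signal to the sync thread.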

Here is the patch that implements it:

http://people.freebsd.org/~trociny/hast.async.patch

The patch cannot be considered complete because:

1) I think this mode should not be called async, because people
would expect from it the behavior known from the man page (and
how it works in DRBD, I suppose). Also, "real" async might be
implemented in the future too. Some other name should be thought
up.

2) The synchronization thread is woken up in the guard thread
every HAST_KEEPALIVE seconds. I think it should be less frequent
and configurable.
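
For point 2, a separate timer with its own interval would be
enough. A minimal sketch, assuming the interval comes from some
configuration setting that does not exist yet; struct sync_timer
and sync_due() are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

struct sync_timer {
	time_t	interval;	/* seconds between wakeups (assumed config value) */
	time_t	last;		/* time of the previous wakeup */
};

/*
 * Called from the periodic loop on every pass.  Returns true,
 * and restarts the timer, only when a full interval has elapsed
 * since the previous wakeup.
 */
static bool
sync_due(struct sync_timer *t, time_t now)
{
	assert(t->interval > 0);
	if (now - t->last < t->interval)
		return (false);
	t->last = now;
	return (true);
}
```

The guard thread could keep running at its usual pace and wake up
the sync thread only when sync_due() returns true.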

It can be improved but I would like to know Pawel's opinion
first. He might know why this is completely wrong :-)

Now about sending the whole extent when only a small part of it
is updated. It might be improved with checksum-based
synchronization. I have a patch that implements it -- when
synchronizing an extent, before a chunk of MAXPHYS size is sent,
its checksum is sent, and if it matches the chunk is not
sent. It is supposed to be useful when one needs to resync disks,
e.g. after split brain, when most of the blocks on the nodes match.
But apparently it should improve things in this case too.
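
The idea, sketched under several assumptions: this is not the
patch, both copies of the extent live in local memory instead of
on two nodes, CHUNK_SIZE is a tiny stand-in for MAXPHYS, and
FNV-1a is used only because it fits in a few lines. chunk_sum()
and sync_extent() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	CHUNK_SIZE	8	/* stand-in for MAXPHYS */

/* 32-bit FNV-1a, used here as a placeholder checksum. */
static uint32_t
chunk_sum(const unsigned char *p, size_t len)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return (h);
}

/*
 * Synchronize one extent chunk by chunk.  A chunk is copied only
 * when the checksums differ.  Returns the number of chunks that
 * actually had to be transferred.
 */
static size_t
sync_extent(const unsigned char *local, unsigned char *remote,
    size_t extlen)
{
	size_t len, off, sent = 0;

	assert(local != NULL && remote != NULL);
	for (off = 0; off < extlen; off += CHUNK_SIZE) {
		len = extlen - off < CHUNK_SIZE ? extlen - off : CHUNK_SIZE;
		if (chunk_sum(local + off, len) ==
		    chunk_sum(remote + off, len))
			continue;	/* checksums match: skip the data */
		memcpy(remote + off, local + off, len);
		sent++;
	}
	return (sent);
}
```

Equal checksums do not prove equal data, so a real implementation
would want a strong checksum; a weak one risks skipping a chunk
that actually differs.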

--
Mikolaj Golub