hast vs ggate+gmirror synchronisation speed

Mikolaj Golub to.my.trociny at gmail.com
Sat Oct 30 12:26:03 UTC 2010


On Thu, 28 Oct 2010 22:08:54 +0300 Mikolaj Golub wrote to Pawel Jakub Dawidek:

 PJD>> I looked at the code and the keepalive packets are sent from another
 PJD>> thread. Could you try turning them off in primary.c and see if that
 PJD>> helps?

 MG> At first I set RETRY_SLEEP to 1 sec to have more keepalive packets. The errors
 MG> started to appear frequently:

 MG> Oct 28 21:35:53 bolek hastd[1709]: [storage] (secondary) Unable to receive request header: RPC version wrong.
 MG> Oct 28 21:35:54 bolek hastd[1632]: [storage] (secondary) Worker process exited ungracefully (pid=1709, exitcode=75).
 MG> Oct 28 21:36:12 bolek hastd[1722]: [storage] (secondary) Unable to receive request header: RPC version wrong.
 MG> Oct 28 21:36:12 bolek hastd[1632]: [storage] (secondary) Worker process exited ungracefully (pid=1722, exitcode=75).
 MG> ...

 MG> Now I have been running synchronization for more than half an hour with
 MG> keepalive_send disabled and have not seen any error.

So :-) What do you think about sending keepalives from remote_send_thread() to
avoid this problem, and sending them only when the connection is idle (there
does not seem to be much use in sending them all the time)? Something like the
patch below (it works for me).
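
To illustrate the idea, here is a minimal sketch of the decision the sender
thread could make each time it wakes up. It is not the attached patch and not
hastd code: next_send_action(), struct send_state and KEEPALIVE_IDLE_SECS are
names made up for this example, and the 10 second interval is an arbitrary
assumption. The point is that only one thread ever writes to the connection,
so a keepalive presumably cannot end up in the middle of a request, and a
keepalive goes out only after the connection has been quiet for a while.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical idle interval in seconds; not a value taken from hastd. */
#define KEEPALIVE_IDLE_SECS 10

enum send_action { SEND_REQUEST, SEND_KEEPALIVE, SEND_NOTHING };

/* Hypothetical per-connection state kept by the single sender thread. */
struct send_state {
	time_t last_sent;	/* when anything was last written to the peer */
};

/*
 * Decide what the sender thread should do on this wakeup.
 * queued is the number of requests waiting in the send queue and
 * now is the current time.  Real data always takes priority; a
 * keepalive is chosen only if the queue is empty and nothing has
 * been sent for at least KEEPALIVE_IDLE_SECS seconds.
 */
static enum send_action
next_send_action(struct send_state *st, size_t queued, time_t now)
{
	assert(st != NULL);

	if (queued > 0) {
		/* Regular traffic already shows the peer we are alive. */
		st->last_sent = now;
		return (SEND_REQUEST);
	}
	if (now - st->last_sent >= KEEPALIVE_IDLE_SECS) {
		/* Connection has been idle long enough. */
		st->last_sent = now;
		return (SEND_KEEPALIVE);
	}
	return (SEND_NOTHING);
}
```

In a real sender loop the wakeup would come either from a new request being
queued or from a timed wait on the queue expiring, and the returned action
would select what to write to the connection.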

-- 
Mikolaj Golub

-------------- next part --------------
A non-text attachment was scrubbed...
Name: hastd.keepalive.patch
Type: text/x-patch
Size: 3934 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20101030/5de23b22/hastd.keepalive.bin

