hast vs ggate+gmirror synchronisation speed
Mikolaj Golub
to.my.trociny at gmail.com
Wed Oct 27 19:05:29 UTC 2010
On Tue, 26 Oct 2010 17:01:01 +0100 Pete French wrote:
PF> Actually, I just looked in dmesg on the secondary - it is full
PF> of messages thus:
PF> Oct 26 15:44:59 serpentine-passive hastd[10394]: [serp0] (secondary) Unable to receive request header: RPC version wrong.
PF> Oct 26 15:45:00 serpentine-passive hastd[782]: [serp0] (secondary) Worker process exited ungracefully (pid=10394, exitcode=75).
PF> Oct 26 15:46:59 serpentine-passive hastd[10421]: [serp0] (secondary) Unable to receive request header: RPC version wrong.
PF> Oct 26 15:47:04 serpentine-passive hastd[782]: [serp0] (secondary) Worker process exited ungracefully (pid=10421, exitcode=75).
I saw this too, but only sporadically, so I forgot about it and did not
investigate at the time :-).
Now, running synchronization, I see the messages again (still only
sporadically). After adding an assert(0) at the version check and looking at
the received header:
(gdb) list
309			goto fail;
310
311		if (hdr.version != HAST_PROTO_VERSION) {
312			assert(0);
313			errno = ERPCMISMATCH;
314			goto fail;
315		}
316
317		hdr.size = le32toh(hdr.size);
318
(gdb) p/x hdr
$2 = {version = 0x9, size = 0x65657266}
So it looks like garbage.
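One quick way to look at such a value is to print its bytes as characters. The helper below is only an illustrative sketch (the name u32_wire_bytes is made up, it is not part of hastd), and it assumes the value gdb printed was read on a little-endian host before le32toh() was applied, as in the listing above. Under that assumption the bytes of 0x65657266 in memory order are 66 72 65 65, i.e. the ASCII string "free", which would be consistent with payload bytes being read where a header was expected.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical debugging helper, not part of hastd.
 * Copy the four bytes of a 32-bit value into buf, least significant
 * byte first (memory order on a little-endian host), replacing
 * non-printable bytes with '.'. buf must hold at least 5 bytes.
 */
static void
u32_wire_bytes(uint32_t v, char buf[5])
{
	for (size_t i = 0; i < 4; i++) {
		unsigned char c = (v >> (8 * i)) & 0xff;

		buf[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
	}
	buf[4] = '\0';
}
```

Calling u32_wire_bytes(0x65657266, buf) fills buf with "free".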
In hast_proto_send() we send the header and then the data. Could it be that
the remote_send and sync threads interfere and their packets get mixed? Maybe
some synchronization is needed here?
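To illustrate the kind of synchronization meant here, below is a minimal sketch, not the actual hastd code. It assumes both threads write to the same descriptor and that a header followed by its data must reach the peer as one unit. The names send_lock, write_all and send_message are hypothetical, and plain write(2) stands in for proto_send().

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical lock shared by every thread that sends on the connection. */
static pthread_mutex_t send_lock = PTHREAD_MUTEX_INITIALIZER;

/* Write exactly len bytes to fd, retrying on short writes and EINTR. */
static int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		p += n;
		len -= (size_t)n;
	}
	return (0);
}

/*
 * Send header and payload as one unit: the lock is held across both
 * writes, so another thread's header cannot land between them.
 */
static int
send_message(int fd, const void *hdr, size_t hdrlen,
    const void *data, size_t datalen)
{
	int error;

	pthread_mutex_lock(&send_lock);
	error = write_all(fd, hdr, hdrlen);
	if (error == 0 && datalen > 0)
		error = write_all(fd, data, datalen);
	pthread_mutex_unlock(&send_lock);
	return (error);
}
```

With something like this, every sender would call send_message() instead of issuing the two sends separately. Whether a single lock per connection is the right granularity for hastd is an open question; the sketch only shows the idea.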
I put a sleep(1) in hast_proto_send() between proto_send(header) and
proto_send(data). The error then started to occur frequently, which suggests
that another send can slip in between the two calls.
--
Mikolaj Golub