Weird issue with hastd(8)

Mikolaj Golub trociny at freebsd.org
Sun May 29 11:38:16 UTC 2011


On Wed, 25 May 2011 11:21:04 -0700 Maxim Sobolev wrote:

 MS> Hi Pawel,

 MS> I am observing strange errors while synchronizing the data between
 MS> primary and secondary. I keep getting the following error messages:

 MS> May 25 11:09:19 eights hastd[10113]: [test] (secondary) Unable to
 MS> receive request header: Socket is not connected.
 MS> May 25 11:09:24 eights hastd[37571]: [test] (secondary) Worker process
 MS> exited ungracefully (pid=10113, exitcode=75).
 MS> May 25 11:10:17 eights hastd[12109]: [test] (secondary) Unable to
 MS> receive request header: Socket is not connected.
 MS> May 25 11:10:18 eights hastd[37571]: [test] (secondary) Worker process
 MS> exited ungracefully (pid=12109, exitcode=75).
 MS> May 25 11:10:39 eights hastd[14685]: [test] (secondary) Unable to
 MS> receive request header: Socket is not connected.
 MS> May 25 11:10:44 eights hastd[37571]: [test] (secondary) Worker process
 MS> exited ungracefully (pid=14685, exitcode=75).

 MS> The synchronization still proceeds, but it's slow due to the need to
 MS> re-negotiate and re-spawn the secondary worker. I have tried to ktrace
 MS> both server and client at the same time. For some reason the client
 MS> gets a 0-byte read from recvfrom at some point, while the primary
 MS> keeps sending more data. This is 8-STABLE code on both ends.

 MS> Any ideas of what could be wrong here are appreciated.

This might be the MSG_WAITALL issue I described on net@ (look for the thread
"recv() with MSG_WAITALL might stuck when receiving more than rcvbuf", and
also kern/154504).

Could you please try the patch?

http://people.freebsd.org/~trociny/uipc_socket.c.patch
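
For illustration only, here is a minimal userland sketch (not hastd code and
not the patch above) of receiving a fixed-size message by looping on plain
recv() instead of passing MSG_WAITALL. The helper name recv_exact() is
hypothetical, and mapping a 0-byte read to ENOTCONN is an assumption chosen
to mirror the "Socket is not connected" text in the log. Whether such a loop
would avoid the problem in this setup is untested.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: read exactly len bytes from fd without MSG_WAITALL.
 * Returns 0 on success, -1 on error (errno set) or if the peer closed the
 * connection before len bytes arrived.
 */
static int
recv_exact(int fd, void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = recv(fd, p, len, 0);

		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return (-1);
		}
		if (n == 0) {
			/* Assumption: report EOF as ENOTCONN. */
			errno = ENOTCONN;
			return (-1);
		}
		p += n;			/* advance past the bytes received */
		len -= (size_t)n;
	}
	return (0);
}
```

Each recv() call here returns as soon as some data is available, so the
caller never asks the kernel to hold a single request open until the whole
buffer has arrived.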

-- 
Mikolaj Golub

