HAST: primary might get stuck when there are connectivity problems with secondary

Fri Apr 23 06:29:56 UTC 2010

On Wed, Apr 21, 2010 at 12:02:16PM +0300, Mikolaj Golub wrote:
> Restoring network after this changes nothing. Primary gets stuck. No messages
> in the log and "dirty" in status output does not change:
> 
> [root at hasta ~]# hastctl status
> storage:
>   role: primary
>   provname: storage
>   localpath: /dev/ad4
>   extentsize: 2097152
>   keepdirty: 64
>   remoteaddr: 172.20.66.202
>   replication: memsync
>   status: complete
>   dirty: 2097152 bytes
> 
> On the secondary we have all this time:
> 
> tcp4       0      0 172.20.66.202.8457     172.20.66.201.57596    ESTABLISHED
> tcp4       0      0 172.20.66.202.8457     172.20.66.201.41841    ESTABLISHED
> 
> The last messages in log:
> 
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411bc0) Request received from the kernel: READ(13565952, 65536).
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411bc0) Moving request to the send queue.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: Taking free request.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80) Got free request.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80) Waiting for request from the kernel.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) local_send: (0x28411bc0) Got request.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) local_send: (0x28411bc0) Moving request to the done queue.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) local_send: Taking request.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_send: (0x28411bc0) Got request.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_send: (0x28411bc0) Moving request to the free queue.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_send: Taking request.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80) Request received from the kernel: READ(1812529152, 65536).
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80) Moving request to the send queue.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: Taking free request.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b00) Got free request.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b00) Waiting for request from the kernel.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) local_send: (0x28411b80) Got request.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) local_send: (0x28411b80) Moving request to the done queue.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) local_send: Taking request.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_send: (0x28411b80) Got request.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_send: (0x28411b80) Moving request to the free queue.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_send: Taking request.
> 
> The backtrace of the stuck hastd is in the attachment.
> 
> I interpret this in the following way. Although the network is down,
> hast_proto_send() in remote_send_thread() returns success (the sent data are
> stored in the kernel buffer). Then the kernel tries to send the data,
> eventually fails after a timeout, and closes the socket. hastd is not aware
> of this: remote_send_thread() is blocked in "Taking request" at this time,
> and the sync thread is waiting for status from the secondary about the sent
> data, but the secondary does not send it because it did not receive any data.
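
The point in the quoted paragraph that a send can report success while the
data sits only in a kernel buffer can be shown in isolation. The sketch below
is not hastd code: send_to_silent_peer() is a hypothetical helper, and a local
socketpair() stands in for the TCP connection to the secondary. The peer end
never reads, yet send() still returns the full length.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not part of hastd): send a short message on a
 * connected stream socket whose peer never calls recv(), and return what
 * send() reported.
 */
static ssize_t
send_to_silent_peer(const char *msg)
{
	int sv[2];
	ssize_t done;

	assert(msg != NULL);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	/* sv[1] plays the peer; nothing ever reads from it. */
	done = send(sv[0], msg, strlen(msg), 0);
	close(sv[0]);
	close(sv[1]);
	return (done);
}
```

A successful return here only says the kernel accepted the bytes for
delivery; it says nothing about whether the other side received them.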

"Taking request" only means that HAST is waiting for a request. In the
case of the ggate_send thread, it means it is waiting for an I/O request
from the kernel. It stops here because there is no new request.
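
For illustration, a thread parked in "Taking request" behaves like a consumer
sleeping on an empty queue. The sketch below is an assumption-laden
simplification, not hastd's queue code: struct slot_queue and the queue_*()
functions are hypothetical names, and the queue holds a single item so that
the blocking logic stays visible.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical single-slot queue; hastd's real queues are not shown here. */
struct slot_queue {
	pthread_mutex_t	lock;
	pthread_cond_t	nonempty;
	void		*item;	/* NULL means no request is queued */
};

static void
queue_init(struct slot_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->nonempty, NULL);
	q->item = NULL;
}

/* Producer side: store a request and wake up a waiting consumer. */
static void
queue_insert(struct slot_queue *q, void *item)
{
	assert(item != NULL);
	pthread_mutex_lock(&q->lock);
	q->item = item;	/* simplification: overwrites an untaken item */
	pthread_cond_signal(&q->nonempty);
	pthread_mutex_unlock(&q->lock);
}

/* Consumer side ("Taking request"): sleep until a request is queued. */
static void *
queue_take(struct slot_queue *q)
{
	void *item;

	pthread_mutex_lock(&q->lock);
	while (q->item == NULL)
		pthread_cond_wait(&q->nonempty, &q->lock);
	item = q->item;
	q->item = NULL;
	pthread_mutex_unlock(&q->lock);
	return (item);
}
```

A thread sitting in queue_take() with nothing queued is idle, not hung; it
resumes as soon as another thread calls queue_insert().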

> Restarting hastd on the secondary usually helps. A workaround is to set
> net.inet.tcp.keepidle to some small value (e.g. 300 sec) on the
> secondary. Then the secondary will notice much earlier that the peer has
> closed the connection and will restart the worker itself:
> 
> Apr 21 11:52:21 hastb hastd: [storage] (secondary) Unable to receive request header: Connection reset by peer.
> Apr 21 11:52:21 hastb hastd: [storage] (secondary) Worker process (pid=1398) exited ungracefully: status=19200.

What you are observing happens because there is currently no keep-alive
mechanism in hastd, so if there are no I/O requests, the primary won't
notice that the secondary is down until the next I/O request arrives.
I consider this confusing rather than critical. Note that when the
secondary goes down and there are no I/O requests to the primary, there
is nothing to synchronize, so a fast reconnect isn't crucial in this case.
One I/O request has to reach the primary for it to discover that the
secondary is down and start the reconnect loop.
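
As a rough illustration of what a keep-alive buys, here is a minimal sketch of
turning on TCP-level keep-alive probes for one socket. This is not hastd code
and not the application-level mechanism discussed above: enable_keepalive()
is a hypothetical helper, and the per-socket TCP_KEEPIDLE option (idle time in
seconds) is assumed to exist only on systems whose headers define it, hence
the #ifdef.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, not part of hastd: ask the kernel to probe an idle
 * TCP connection so that a dead peer is noticed without waiting for
 * application traffic.
 */
static int
enable_keepalive(int fd, int idle_secs)
{
	int on = 1;

	assert(idle_secs > 0);
	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
		return (-1);
#ifdef TCP_KEEPIDLE
	/* Applied only where the per-socket idle timer is available. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs,
	    sizeof(idle_secs)) == -1)
		return (-1);
#endif
	return (0);
}
```

With probes enabled, an idle connection to a vanished peer eventually fails
on its own rather than staying ESTABLISHED until the next request is sent.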

Could you confirm this is what you are observing? If not, which FreeBSD
version are you using? If it is HEAD, do you have r205738?

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
pjd at FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!