HAST instability

Daniel Kalchev daniel at digsys.bg
Fri Jun 3 16:18:41 UTC 2011


Well, apparently my HAST joy was short-lived. On a second run, I got stuck with

Jun  3 19:08:16 b1a hastd[1900]: [data2] (primary) Unable to receive reply header: Operation timed out.

on the primary. No messages on the secondary.

On primary:

# netstat -an | grep 8457

tcp4       0      0 10.2.101.11.42659      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.62058      10.2.101.12.8457       CLOSE_WAIT
tcp4       0      0 10.2.101.11.34646      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.11419      10.2.101.12.8457       CLOSE_WAIT
tcp4       0      0 10.2.101.11.37773      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.21911      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.40169      10.2.101.12.8457       CLOSE_WAIT
tcp4       0  97749 10.2.101.11.44360      10.2.101.12.8457       CLOSE_WAIT
tcp4       0      0 10.2.101.11.8457       *.*                    LISTEN

On secondary:

# netstat -an | grep 8457

tcp4       0      0 10.2.101.12.8457       10.2.101.11.42659      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.62058      FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457       10.2.101.11.34646      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.11419      FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457       10.2.101.11.37773      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.21911      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.40169      FIN_WAIT_2
tcp4   66415      0 10.2.101.12.8457       10.2.101.11.44360      FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457       *.*                    LISTEN
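
The one connection that stands out in the listings above is port 44360: the primary shows 97749 bytes in Send-Q in CLOSE_WAIT, and the secondary shows 66415 bytes in Recv-Q in FIN_WAIT_2. To pick such entries out of a longer listing, something like the following Python sketch might help. It is my own helper, not part of HAST or netstat; the function name summarize is made up for illustration, and it assumes the six-column layout shown here (proto, Recv-Q, Send-Q, local address, foreign address, state).

```python
from collections import Counter

def summarize(netstat_text):
    """Count TCP states and collect connections with data left in a queue."""
    states = Counter()
    stuck = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: proto recv-q send-q local foreign state
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        proto, recv_q, send_q, local, foreign, state = fields
        states[state] += 1
        if int(recv_q) or int(send_q):
            stuck.append((local, foreign, state, int(recv_q), int(send_q)))
    return states, stuck

# Three lines copied from the primary's listing.
sample = """\
tcp4       0      0 10.2.101.11.42659      10.2.101.12.8457       FIN_WAIT_2
tcp4       0  97749 10.2.101.11.44360      10.2.101.12.8457       CLOSE_WAIT
tcp4       0      0 10.2.101.11.8457       *.*                    LISTEN
"""

states, stuck = summarize(sample)
print(dict(states))
print(stuck)
```

Feeding it the full output of netstat -an | grep 8457 from both machines gives a per-state count and the list of connections with non-empty queues on each side.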

On primary:

# hastctl status
data0:
   role: primary
   provname: data0
   localpath: /dev/gpt/data0
   extentsize: 2097152 (2.0MB)
   keepdirty: 64
   remoteaddr: 10.2.101.12
   sourceaddr: 10.2.101.11
   replication: fullsync
   status: complete
   dirty: 0 (0B)
data1:
   role: primary
   provname: data1
   localpath: /dev/gpt/data1
   extentsize: 2097152 (2.0MB)
   keepdirty: 64
   remoteaddr: 10.2.101.12
   sourceaddr: 10.2.101.11
   replication: fullsync
   status: complete
   dirty: 0 (0B)
data2:
   role: primary
   provname: data2
   localpath: /dev/gpt/data2
   extentsize: 2097152 (2.0MB)
   keepdirty: 64
   remoteaddr: 10.2.101.12
   sourceaddr: 10.2.101.11
   replication: fullsync
   status: complete
   dirty: 6291456 (6.0MB)
data3:
   role: primary
   provname: data3
   localpath: /dev/gpt/data3
   extentsize: 2097152 (2.0MB)
   keepdirty: 64
   remoteaddr: 10.2.101.12
   sourceaddr: 10.2.101.11
   replication: fullsync
   status: complete
   dirty: 0 (0B)
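
Only data2 reports dirty data: 6291456 bytes, which is three extents of 2097152 bytes each. For watching whether that number ever changes, a small Python sketch along these lines could be used. It is only an illustration: dirty_resources is a made-up name, and the parsing assumes the hastctl status layout exactly as pasted above (resource name at the start of a line followed by a colon, indented "dirty:" lines); other FreeBSD versions may print it differently.

```python
import re

def dirty_resources(status_text):
    """Return {resource: dirty_bytes} for resources with a non-zero dirty count."""
    result = {}
    current = None
    for line in status_text.splitlines():
        # A resource header is an unindented line ending in ':' (e.g. "data2:").
        if line and not line[0].isspace() and line.rstrip().endswith(":"):
            current = line.rstrip()[:-1]
            continue
        # Indented "dirty: <bytes> (<human readable>)" line of the current resource.
        m = re.match(r"\s+dirty:\s+(\d+)", line)
        if m and current is not None and int(m.group(1)) > 0:
            result[current] = int(m.group(1))
    return result

# Abbreviated excerpt of the status output above.
sample = """\
data1:
   role: primary
   keepdirty: 64
   dirty: 0 (0B)
data2:
   role: primary
   keepdirty: 64
   dirty: 6291456 (6.0MB)
"""

print(dirty_resources(sample))
```

Running it repeatedly against fresh hastctl status output would show whether the dirty count on data2 is really frozen or still draining slowly.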

It has been sitting in this state for over 10 minutes.

Unfortunately, there is no KDB in the kernel. Any ideas what else to look for?

Daniel


More information about the freebsd-stable mailing list