HAST - detect failure and restore avoiding an outage?

Mikolaj Golub trociny at FreeBSD.org
Thu Feb 21 22:00:49 UTC 2013


On Wed, Feb 20, 2013 at 02:54:54PM -0600, Chad M Stewart wrote:
> 
> I built a 2-node cluster for testing HAST.  Each node is an older HP server with 6 SCSI disks.  Each disk is configured as RAID 0 in the RAID controller, since I wanted a JBOD presented to FreeBSD 9.1 x86.  I allocated a single disk for the OS and the other 5 disks for HAST.
> 
> node2# zpool status
>   pool: scsi-san
>  state: ONLINE
>   scan: scrub repaired 0 in 0h27m with 0 errors on Tue Feb 19 17:38:55 2013
> config:
> 
> 	NAME            STATE     READ WRITE CKSUM
> 	scsi-san        ONLINE       0     0     0
> 	  raidz1-0      ONLINE       0     0     0
> 	    hast/disk1  ONLINE       0     0     0
> 	    hast/disk2  ONLINE       0     0     0
> 	    hast/disk3  ONLINE       0     0     0
> 	    hast/disk4  ONLINE       0     0     0
> 	    hast/disk5  ONLINE       0     0     0
> 
> 
>   pool: zroot
>  state: ONLINE
>   scan: none requested
> config:
> 
> 	NAME         STATE     READ WRITE CKSUM
> 	zroot        ONLINE       0     0     0
> 	  gpt/disk0  ONLINE       0     0     0
> 
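For reference, a layout like this corresponds to a hast.conf along the
following lines on both nodes (only a sketch: the node names are from
your description, the local device paths are assumptions):

  resource disk1 {
          on node1 {
                  local /dev/da1
                  remote node2
          }
          on node2 {
                  local /dev/da1
                  remote node1
          }
  }
  # ...and likewise for disk2 through disk5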
> 
> 
> Yesterday I physically pulled disk2 (from node1) out to simulate a
> failure.  ZFS didn't see anything wrong, expected.  hastd did see
> the problem, expected.  'hastctl status' didn't show me anything
> unusual or indicate any problem that I could see on either node.  I
> saw hastd reporting problems in the logs, but otherwise everything
> looked fine.  Is there a way to detect a failed disk from hastd
> besides the log?  camcontrol showed the disk had failed, and
> obviously I'll be monitoring with it as well.

It looks like logs are currently the only way to detect errors on the
hastd side.  Here is a patch that adds local i/o error statistics,
accessible via hastctl:

http://people.freebsd.org/~trociny/hast.stat_error.1.patch

hastctl output:

  role: secondary
  provname: test
  localpath: /dev/md102
  extentsize: 2097152 (2.0MB)
  keepdirty: 0
  remoteaddr: kopusha:7771
  replication: memsync
  status: complete
  dirty: 0 (0B)
  statistics:
    reads: 0
    writes: 366
    deletes: 0
    flushes: 0
    activemap updates: 0
    local i/o errors: 269
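
With the counter exposed there, a periodic check can watch it instead
of grepping the logs.  A minimal sh sketch, assuming the patch above
is applied and the field keeps the name "local i/o errors" (resource
names as in your setup):

  #!/bin/sh
  # Warn via syslog when any HAST resource reports local i/o errors.
  for res in disk1 disk2 disk3 disk4 disk5; do
          errs=$(hastctl status $res | \
              awk -F: '/local i\/o errors/ {print $2 + 0}')
          if [ "${errs:-0}" -gt 0 ]; then
                  logger -p daemon.warning \
                      "HAST resource $res: $errs local i/o errors"
          fi
  done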

Pawel, what do you think about this patch?

> For recovery I installed a new disk in the same slot.  To protect
> the data reliability the safest way I can think of to recover is to
> do the following:
> 
> 1 - node1 - stop the apps
> 2 - node1 - export pool
> 3 - node1 - hastctl create disk2
> 4 - node1 - for D in 1 2 3 4 5; do hastctl role secondary disk${D}; done
> 5 - node2 - for D in 1 2 3 4 5; do hastctl role primary disk${D}; done
> 6 - node2 - import pool
> 7 - node2 - start the apps

> At step 5 the hastd will start to resynchronize node2:disk2 ->
> node1:disk2.  I've been trying to think of a way to re-establish the
> mirror without having to restart/move the pool _and_ not pose
> additional risk of data loss.
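
Just for completeness, the switchover above could be scripted roughly
like this (only a sketch: the pool name is taken from your zpool
output, and stopping/starting the applications is left as a
placeholder):

  # node1 (current primary): stop the applications using the pool, then
  zpool export scsi-san
  hastctl role init disk2     # resource must be inactive before create
  hastctl create disk2
  for D in 1 2 3 4 5; do hastctl role secondary disk${D}; done

  # node2 (new primary):
  for D in 1 2 3 4 5; do hastctl role primary disk${D}; done
  zpool import scsi-san
  # start the applications; hastd resynchronizes disk2 back to node1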
>
> To avoid an application outage I suppose the following would work:
>
> 1 - insert new disk in node1
> 2 - hastctl role init disk2
> 3 - hastctl create disk2
> 4 - hastctl role primary disk2
>
> At that point ZFS would have seen a disk failure and then started
> resilvering the pool. No application outage, but now only 4 disks
> contain the data (assuming changing bits on the pool, not static
> content).  Using the previous steps there is an application outage,
> but a healthy pool is maintained at all times.
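
For the no-outage variant the commands on node1 would be roughly as
below (again only a sketch: whether ZFS picks the recreated provider
up by itself depends on how it noticed the failure, so a zpool online
or zpool replace may be needed to start the resilver):

  # node1, after inserting the replacement disk:
  hastctl role init disk2
  hastctl create disk2
  hastctl role primary disk2
  # if ZFS does not start resilvering on its own:
  zpool online scsi-san hast/disk2
  # or, if the old vdev is still shown as UNAVAIL/FAULTED:
  # zpool replace scsi-san hast/disk2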

> Is there another scenario I haven't thought of where both data health
> and no application outage could be achieved?
>

-- 
Mikolaj Golub

