HAST + ZFS + NFS + CARP

Thu Jun 30 15:42:13 UTC 2016

> On 30 Jun 2016, at 17:37, Julien Cigar <julien at perdition.city> wrote:
> 
>> On Thu, Jun 30, 2016 at 05:28:41PM +0200, Ben RUBSON wrote:
>> 
>>> On 30 Jun 2016, at 17:14, InterNetX - Juergen Gotteswinter <jg at internetx.com> wrote:
>>> 
>>> 
>>> 
>>>> Am 30.06.2016 um 16:45 schrieb Julien Cigar:
>>>> Hello,
>>>> 
>>>> I'm still in the process of setting up redundant low-cost storage for
>>>> our (small, ~30 people) team here.
>>>> 
>>>> I have read quite a lot of articles, documentation, etc. and I plan to use HAST
>>>> with ZFS for the storage, CARP for the failover and the "good old NFS"
>>>> to mount the shares on the clients.
>>>> 
>>>> The hardware is 2xHP Proliant DL20 boxes with 2 dedicated disks for the
>>>> shared storage.
>>>> 
>>>> Assuming the following configuration:
>>>> - MASTER is the active node and BACKUP is the standby node.
>>>> - two disks in each machine: ada0 and ada1.
>>>> - two interfaces in each machine: em0 and em1
>>>> - em0 is the primary interface (with CARP setup)
>>>> - em1 is dedicated to the HAST traffic (crossover cable)
>>>> - FreeBSD is properly installed in each machine.
>>>> - a HAST resource "disk0" for ada0p2.
>>>> - a HAST resource "disk1" for ada1p2.
>>>> - a pool is created on MASTER with
>>>> "zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1"
>>>> 
>>>> A couple of questions I am still wondering about:
>>>> - If a disk dies on the MASTER I guess that zpool will not see it and
>>>> will transparently use the one on BACKUP through the HAST resource.
>>> 
>>> that's right, as long as writes on $anything have been successful, HAST is
>>> happy and won't start whining
>>> 
>>>> is it a problem? 
>>> 
>>> imho yes, at least from management view
>>> 
>>>> could this lead to some corruption?
>>> 
>>> probably, I have never heard of anyone who has used that for a long time in
>>> production
>>> 
>>>> At this stage the
>>>> common sense would be to replace the disk quickly, but imagine the
>>>> worst case scenario where ada1 on MASTER dies, zpool will not see it 
>>>> and will transparently use the one from the BACKUP node (through the 
>>>> "disk1" HAST resource), later ada0 on MASTER dies, zpool will not
>>>> see it and will transparently use the one from the BACKUP node
>>>> (through the "disk0" HAST resource). At this point on MASTER the two
>>>> disks are broken but the pool is still considered healthy ... What if 
>>>> after that we unplug the em0 network cable on BACKUP? Storage is
>>>> down..
>>>> - Under heavy I/O the MASTER box suddenly dies (for some reason),
>>>> thanks to CARP the BACKUP node will switch from standby -> active and
>>>> execute the failover script which runs "hastctl role primary" for
>>>> the resources and a zpool import. I wondered if there are any
>>>> situations where the pool couldn't be imported (= data corruption)?
>>>> For example what if the pool hasn't been exported on the MASTER before
>>>> it dies?
>>>> - Is it a problem if the NFS daemons are started at boot on the standby
>>>> node, or should they only be started in the failover script? What
>>>> about stale files and active connections on the clients?
>>> 
>>> sometimes stale mounts recover, sometimes not, sometimes clients even
>>> need reboots
>>> 
>>>> - A catastrophic power failure occurs and MASTER and BACKUP are suddenly
>>>> powered down. Later the power returns, is it possible that some
>>>> problem occurs (split-brain scenario?) regarding the order in which the
>>> 
>>> sure, you need an exact procedure to recover
>>> 
>>>> two machines boot up?
>>> 
>>> best practice should be to keep everything down after boot
>>> 
>>>> - Other things I have not thought of?
>>>> 
>>> 
>>> 
>>> 
>>>> Thanks!
>>>> Julien
>>>> 
>>> 
>>> 
>>> imho:
>>> 
>>> leave HAST where it is and go for ZFS replication. avoiding this fragile
>>> combination will save your butt sooner or later
>> 
>> I was also replying, and finishing with this:
>> Why don't you set your slave as an iSCSI target and simply do ZFS mirroring?
> 
> Yes, that's another option, so a zpool with two mirrors (local +
> exported iSCSI)?

Yes, you would then have a real-time replication solution (like HAST), unlike ZFS send/receive, which is not real-time.
Depends on what you need :)
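
As a rough illustration of the layout discussed here (each mirror pairing a local disk with an iSCSI-exported disk from the other box), the pool creation command could be assembled as below. The pool name zhast and the partitions ada0p2/ada1p2 come from the original post; the device names da0 and da1 for the iSCSI-attached disks, and the helper name zpool_create_cmd, are assumptions. The sketch only builds and prints the command, it does not run it.

```python
import shlex


def zpool_create_cmd(pool, pairs):
    """Build a 'zpool create' argv with one mirror vdev per (local, remote) pair."""
    cmd = ["zpool", "create", pool]
    for local_dev, remote_dev in pairs:
        cmd += ["mirror", local_dev, remote_dev]
    return cmd


# ada0p2/ada1p2 are taken from the thread; da0/da1 are assumed names for
# the iSCSI disks as they would appear on the master (the initiator).
cmd = zpool_create_cmd("zhast", [("ada0p2", "da0"), ("ada1p2", "da1")])
print(shlex.join(cmd))
```

With this layout ZFS itself sees every disk, so a failed local or remote disk shows up as a degraded mirror instead of being hidden behind another layer.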

> 
>> ZFS would then know as soon as a disk is failing.
>> And if the master fails, you only have to import the pool on the slave (certainly with -f in case of a master power failure).
>> 
>> Ben
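
The failover steps mentioned in the thread can be sketched as an ordered command list. This is only a minimal sketch under assumptions: the helper name failover_cmds is hypothetical, the resource names disk0/disk1 and the pool name zhast are from the original post, and a real failover script would also need error handling and service (NFS) management. In the HAST variant each resource is switched to primary before the import; in the iSCSI variant only the import is needed. The -f flag is included by default because the pool will not have been exported if the master died.

```python
import shlex


def failover_cmds(pool, hast_resources=(), force=True):
    """Return the commands to run on the standby node, in order."""
    # HAST variant: promote every resource before touching the pool.
    cmds = [["hastctl", "role", "primary", res] for res in hast_resources]
    # Both variants: import the pool, forcing it if it was not exported.
    imp = ["zpool", "import"]
    if force:
        imp.append("-f")
    imp.append(pool)
    cmds.append(imp)
    return cmds


# HAST variant from the original post (dry run: commands are only printed).
for c in failover_cmds("zhast", ["disk0", "disk1"]):
    print(shlex.join(c))

# iSCSI mirror variant: no HAST resources, just the forced import.
for c in failover_cmds("zhast"):
    print(shlex.join(c))
```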