HAST + ZFS + NFS + CARP

Julien Cigar julien at perdition.city
Thu Jun 30 16:35:47 UTC 2016


On Thu, Jun 30, 2016 at 05:42:04PM +0200, Ben RUBSON wrote:
> 
> 
> > On 30 Jun 2016, at 17:37, Julien Cigar <julien at perdition.city> wrote:
> > 
> >> On Thu, Jun 30, 2016 at 05:28:41PM +0200, Ben RUBSON wrote:
> >> 
> >>> On 30 Jun 2016, at 17:14, InterNetX - Juergen Gotteswinter <jg at internetx.com> wrote:
> >>> 
> >>> 
> >>> 
> >>>> Am 30.06.2016 um 16:45 schrieb Julien Cigar:
> >>>> Hello,
> >>>> 
> >>>> I'm still in the process of setting up redundant low-cost storage for
> >>>> our (small, ~30 people) team here.
> >>>> 
> >>>> I read quite a lot of articles/documentation/etc. and I plan to use HAST
> >>>> with ZFS for the storage, CARP for the failover and the "good old NFS"
> >>>> to mount the shares on the clients.
> >>>> 
> >>>> The hardware is two HP ProLiant DL20 boxes, each with 2 dedicated disks
> >>>> for the shared storage.
> >>>> 
> >>>> Assuming the following configuration:
> >>>> - MASTER is the active node and BACKUP is the standby node.
> >>>> - two disks in each machine: ada0 and ada1.
> >>>> - two interfaces in each machine: em0 and em1
> >>>> - em0 is the primary interface (with CARP setup)
> >>>> - em1 is dedicated to the HAST traffic (crossover cable)
> >>>> - FreeBSD is properly installed in each machine.
> >>>> - a HAST resource "disk0" for ada0p2.
> >>>> - a HAST resource "disk1" for ada1p2.
> >>>> - the pool is created on MASTER with:
> >>>>   zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1
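
(For reference, a minimal sketch of what the configuration above could look
like; the hostnames/addresses below are only placeholders, and the "on"
entries must match each node's actual hostname:)

    # /etc/hast.conf, identical on both nodes
    resource disk0 {
            on master {
                    local /dev/ada0p2
                    remote 172.16.0.2       # BACKUP, over em1
            }
            on backup {
                    local /dev/ada0p2
                    remote 172.16.0.1       # MASTER, over em1
            }
    }
    resource disk1 {
            on master {
                    local /dev/ada1p2
                    remote 172.16.0.2
            }
            on backup {
                    local /dev/ada1p2
                    remote 172.16.0.1
            }
    }

    # on both nodes: "hastctl create disk0", "hastctl create disk1",
    # then start hastd; afterwards, on MASTER only:
    hastctl role primary disk0
    hastctl role primary disk1
    zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1
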
> >>>> 
> >>>> A couple of questions I am still wondering about:
> >>>> - If a disk dies on the MASTER I guess that zpool will not see it and
> >>>> will transparently use the one on BACKUP through the HAST resource..
> >>> 
> >>> that's right, as long as writes to $anything have been successful HAST is
> >>> happy and won't start whining
> >>> 
> >>>> is it a problem? 
> >>> 
> >>> imho yes, at least from a management point of view
> >>> 
> >>>> could this lead to some corruption?
> >>> 
> >>> probably, I never heard of anyone who has used that in production for a
> >>> long time
> >>> 
> >>>> At this stage the common sense would be to replace the disk quickly, but
> >>>> imagine the worst case scenario where ada1 on MASTER dies: zpool will not
> >>>> see it and will transparently use the one from the BACKUP node (through
> >>>> the "disk1" HAST resource); later ada0 on MASTER dies, zpool will not
> >>>> see it and will transparently use the one from the BACKUP node
> >>>> (through the "disk0" HAST resource). At this point on MASTER the two
> >>>> disks are broken but the pool is still considered healthy ... What if
> >>>> after that we unplug the em0 network cable on BACKUP? Storage is
> >>>> down..
> >>>> - Under heavy I/O the MASTER box suddenly dies (for some reason);
> >>>> thanks to CARP the BACKUP node will switch from standby -> active and
> >>>> execute the failover script, which does a "hastctl role primary" for
> >>>> the resources and a zpool import. I wondered if there are any
> >>>> situations where the pool couldn't be imported (= data corruption)?
> >>>> For example, what if the pool hasn't been exported on the MASTER before
> >>>> it dies?
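
(To make that question concrete, the failover script I have in mind, triggered
on the CARP state change, would be roughly along these lines; just a sketch,
without any error handling:)

    #!/bin/sh
    # "become master" action (sketch), using the resource/pool names above

    hastctl role primary disk0
    hastctl role primary disk1

    # wait for the HAST providers to appear
    while [ ! -e /dev/hast/disk0 ] || [ ! -e /dev/hast/disk1 ]; do
            sleep 1
    done

    zpool import -f zhast

    # bring NFS up on the new master
    service mountd onerestart
    service nfsd onestart
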
> >>>> - Is it a problem if the NFS daemons are started at boot on the standby
> >>>> node, or should they only be started in the failover script? What
> >>>> about stale files and active connections on the clients?
> >>> 
> >>> sometimes stale mounts recover, sometimes not, and sometimes clients even
> >>> need reboots
> >>> 
> >>>> - A catastrophic power failure occurs and MASTER and BACKUP are suddenly
> >>>> powered down. Later the power returns; is it possible that some
> >>>> problem occurs (split-brain scenario?) regarding the order in which the
> >>> 
> >>> sure, you need an exact procedure to recover
> >>> 
> >>>> two machines boot up?
> >>> 
> >>> best practice should be to keep everything down after boot
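
(i.e. something like this in rc.conf on both nodes, and let the recovery
procedure start things explicitly? Only a sketch of what I understand:)

    # /etc/rc.conf (sketch): nothing HA-related starts automatically at boot
    hastd_enable="NO"           # started by the recovery/failover procedure
    nfs_server_enable="NO"
    mountd_enable="NO"
    rpcbind_enable="NO"
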
> >>> 
> >>>> - Other things I have not thought of?
> >>>> 
> >>> 
> >>> 
> >>> 
> >>>> Thanks!
> >>>> Julien
> >>>> 
> >>> 
> >>> 
> >>> imho:
> >>> 
> >>> leave HAST where it is and go for ZFS replication. It will save your butt
> >>> sooner or later if you avoid this fragile combination
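
(If I understand the ZFS replication suggestion correctly, it would be
periodic snapshot-based send/receive, something like this run from cron; the
pool and host names are placeholders, and the previous snapshot name has to
be tracked between runs:)

    # on MASTER, run periodically; "zdata" / "backup" are placeholder names
    SNAP=repl_$(date +%Y%m%d%H%M%S)
    zfs snapshot -r zdata@${SNAP}
    # ${PREV} is the snapshot sent by the previous run
    zfs send -R -i zdata@${PREV} zdata@${SNAP} | \
        ssh backup zfs receive -duF zdata
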
> >> 
> >> I was also replying, and finishing with this:
> >> why don't you set your slave up as an iSCSI target and simply do ZFS
> >> mirroring?
> > 
> > Yes, that's another option, so a zpool with two mirrors (local +
> > exported iSCSI disks)?
> 
> Yes, you would then have a real-time replication solution (like HAST), as
> opposed to ZFS send/receive, which is not real-time.
> Depends on what you need :)

It's more a real-time replication solution that I need, in fact ... :)
Do you have any resource which summarizes all the pros and cons of HAST
vs iSCSI? I have found a lot of articles on ZFS + HAST but not that much
on ZFS + iSCSI ..
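
By the way, for the iSCSI variant, what I picture is roughly the following,
with ctld on the BACKUP and the initiator on the MASTER; the IQNs, addresses
and device names are only placeholders:

    # /etc/ctl.conf on BACKUP (no authentication, placeholder names)
    portal-group pg0 {
            discovery-auth-group no-authentication
            listen 172.16.0.2               # em1, the crossover link
    }

    target iqn.2016-06.local.backup:disk0 {
            auth-group no-authentication
            portal-group pg0
            lun 0 {
                    path /dev/ada0p2
            }
    }
    target iqn.2016-06.local.backup:disk1 {
            auth-group no-authentication
            portal-group pg0
            lun 0 {
                    path /dev/ada1p2
            }
    }

    # on MASTER, attach the two exported disks and mirror local + remote
    # (assuming they show up as da0 and da1):
    iscsictl -A -p 172.16.0.2 -t iqn.2016-06.local.backup:disk0
    iscsictl -A -p 172.16.0.2 -t iqn.2016-06.local.backup:disk1
    zpool create zdata mirror ada0p2 da0 mirror ada1p2 da1
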

> 
> > 
> >> ZFS would then know as soon as a disk is failing.
> >> And if the master fails, you only have to import the pool (-f certainly, in
> >> case of a master power failure) on the slave.
> >> 
> >> Ben
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"

-- 
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11  6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.