HAST + ZFS + NFS + CARP
InterNetX - Juergen Gotteswinter
jg at internetx.com
Thu Jun 30 15:24:11 UTC 2016
On 30.06.2016 at 16:45, Julien Cigar wrote:
> Hello,
>
> I'm still in the process of setting up redundant low-cost storage for
> our (small, ~30 people) team here.
>
> I read quite a lot of articles/documentation/etc. and I plan to use HAST
> with ZFS for the storage, CARP for the failover and the "good old NFS"
> to mount the shares on the clients.
>
> The hardware is 2xHP Proliant DL20 boxes with 2 dedicated disks for the
> shared storage.
>
> Assuming the following configuration:
> - MASTER is the active node and BACKUP is the standby node.
> - two disks in each machine: ada0 and ada1.
> - two interfaces in each machine: em0 and em1
> - em0 is the primary interface (with CARP setup)
> - em1 is dedicated to the HAST traffic (crossover cable)
> - FreeBSD is properly installed in each machine.
> - a HAST resource "disk0" for ada0p2.
> - a HAST resource "disk1" for ada1p2.
> - a pool "zhast" created on MASTER with:
> zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1
>
> A couple of questions I am still wondering about:
> - If a disk dies on the MASTER I guess that zpool will not see it and
> will transparently use the one on BACKUP through the HAST resource..
that's right: as long as writes on $anything have been successful, HAST is
happy and won't start whining
> is it a problem?
imho yes, at least from a management point of view
> could this lead to some corruption?
probably; I have never heard of anyone using that in production for a
long time
> At this stage the
> common sense would be to replace the disk quickly, but imagine the
> worst case scenario where ada1 on MASTER dies, zpool will not see it
> and will transparently use the one from the BACKUP node (through the
> "disk1" HAST resource), later ada0 on MASTER dies, zpool will not
> see it and will transparently use the one from the BACKUP node
> (through the "disk0" HAST resource). At this point on MASTER the two
> disks are broken but the pool is still considered healthy ... What if
> after that we unplug the em0 network cable on BACKUP? Storage is
> down..
> - Under heavy I/O the MASTER box suddenly dies (for some reason),
> thanks to CARP the BACKUP node will switch from standby -> active and
> execute the failover script which does some "hastctl role primary" for
> the resources and a zpool import. I wondered if there are any
> situations where the pool couldn't be imported (= data corruption)?
> For example what if the pool hasn't been exported on the MASTER before
> it dies?
> - Is it a problem if the NFS daemons are started at boot on the standby
> node, or should they only be started in the failover script? What
> about stale files and active connections on the clients?
sometimes stale mounts recover, sometimes not; sometimes clients even
need a reboot
> - A catastrophic power failure occurs and MASTER and BACKUP are suddenly
> powered down. Later the power returns; is it possible that some
> problem occurs (split-brain scenario?) regarding the order in which the
sure, you need an exact procedure to recover
> two machines boot up?
best practice should be to keep everything down after boot
> - Other things I have not thought of?
>
> Thanks!
> Julien
>
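For reference, the failover step described above ("hastctl role primary" for each resource, then a zpool import) could be sketched roughly as follows. The resource names disk0/disk1 and the pool name zhast come from the mail; the DRY_RUN switch, the helper names, the use of "zpool import -f" and the NFS start line are assumptions made for illustration, not a tested production script.

```shell
#!/bin/sh
# Sketch only: promote this node once CARP has made it the active one.
# From the mail: HAST resources disk0/disk1, pool zhast.
# Assumptions: DRY_RUN switch, run/become_primary helper names,
# and starting nfsd through service(8) with "onestart".

RESOURCES="${RESOURCES:-disk0 disk1}"
POOL="${POOL:-zhast}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

become_primary() {
    # Switch every HAST resource to primary; stop at the first failure.
    for res in $RESOURCES; do
        run hastctl role primary "$res" || return 1
    done
    # The dead MASTER never exported the pool, so a plain import
    # would be refused; -f forces it.
    run zpool import -f "$POOL" || return 1
    # Start NFS only after the pool is imported (assumed policy).
    run service nfsd onestart
}
```

With DRY_RUN=1 (the default here), calling become_primary only prints the commands it would run, which makes it easy to review the order before trusting it with real disks. A real script would also have to wait for the /dev/hast/* devices to appear before importing.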
imho:
leave HAST where it is and go for ZFS replication. Avoiding this fragile
combination will save your butt sooner or later
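
A minimal sketch of what snapshot-based ZFS replication between the two boxes could look like. Everything specific here is an assumption for illustration: the dataset name tank/data, the peer hostname backup, the snapshot names, the DRY_RUN switch and the helper names. It only shows the usual snapshot, send, receive pattern and leaves out error handling and snapshot cleanup.

```shell
#!/bin/sh
# Sketch only: incremental ZFS replication from the active box to the standby.
# Assumptions (not from the mail): dataset tank/data, peer host "backup",
# snapshot names passed in by the caller, DRY_RUN switch.

DATASET="${DATASET:-tank/data}"
PEER="${PEER:-backup}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        sh -c "$*"        # command lines contain a pipe, so run via sh
    fi
}

# replicate PREV NEW
#   PREV: last snapshot already present on the peer ("" for the first run)
#   NEW:  snapshot to create now and send
replicate() {
    prev="$1"
    new="$2"
    run "zfs snapshot -r ${DATASET}@${new}" || return 1
    if [ -z "$prev" ]; then
        # First run: full replication stream.
        run "zfs send -R ${DATASET}@${new} | ssh ${PEER} zfs receive -F ${DATASET}"
    else
        # Later runs: only the changes between PREV and NEW.
        run "zfs send -R -i ${DATASET}@${prev} ${DATASET}@${new} | ssh ${PEER} zfs receive -F ${DATASET}"
    fi
}
```

Run from cron with DRY_RUN=0, e.g. "replicate repl-1 repl-2", the standby lags behind by at most one interval, and each box keeps its own independent pool, so a dead disk shows up in zpool status on the machine where it died.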
More information about the freebsd-fs
mailing list