HAST + ZFS + NFS + CARP
Julien Cigar
julien at perdition.city
Thu Jun 30 15:30:32 UTC 2016
On Thu, Jun 30, 2016 at 05:14:08PM +0200, InterNetX - Juergen Gotteswinter wrote:
>
>
> Am 30.06.2016 um 16:45 schrieb Julien Cigar:
> > Hello,
> >
> > I'm always in the process of setting a redundant low-cost storage for
> > our (small, ~30 people) team here.
> >
> > I read quite a lot of articles/documentations/etc and I plan to use HAST
> > with ZFS for the storage, CARP for the failover and the "good old NFS"
> > to mount the shares on the clients.
> >
> > The hardware is 2xHP Proliant DL20 boxes with 2 dedicated disks for the
> > shared storage.
> >
> > Assuming the following configuration:
> > - MASTER is the active node and BACKUP is the standby node.
> > - two disks in each machine: ada0 and ada1.
> > - two interfaces in each machine: em0 and em1
> > - em0 is the primary interface (with CARP setup)
> > - em1 is dedicated to the HAST traffic (crossover cable)
> > - FreeBSD is properly installed in each machine.
> > - a HAST resource "disk0" for ada0p2.
> > - a HAST resource "disk1" for ada1p2.
> > - a zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1 is created
> > on MASTER
> >
> > A couple of questions I am still wondering:
> > - If a disk dies on the MASTER I guess that zpool will not see it and
> > will transparently use the one on BACKUP through the HAST ressource..
>
> thats right, as long as writes on $anything have been successful hast is
> happy and wont start whining
>
> > is it a problem?
>
> imho yes, at least from management view
>
> > could this lead to some corruption?
>
> probably, i never heard about anyone who uses that for long time in
> production
>
> At this stage the
> > common sense would be to replace the disk quickly, but imagine the
> > worst case scenario where ada1 on MASTER dies, zpool will not see it
> > and will transparently use the one from the BACKUP node (through the
> > "disk1" HAST ressource), later ada0 on MASTER dies, zpool will not
> > see it and will transparently use the one from the BACKUP node
> > (through the "disk0" HAST ressource). At this point on MASTER the two
> > disks are broken but the pool is still considered healthy ... What if
> > after that we unplug the em0 network cable on BACKUP? Storage is
> > down..
> > - Under heavy I/O the MASTER box suddently dies (for some reasons),
> > thanks to CARP the BACKUP node will switch from standy -> active and
> > execute the failover script which does some "hastctl role primary" for
> > the ressources and a zpool import. I wondered if there are any
> > situations where the pool couldn't be imported (= data corruption)?
> > For example what if the pool hasn't been exported on the MASTER before
> > it dies?
> > - Is it a problem if the NFS daemons are started at boot on the standby
> > node, or should they only be started in the failover script? What
> > about stale files and active connections on the clients?
>
> sometimes stale mounts recover, sometimes not, sometimes clients need
> even reboots
>
> > - A catastrophic power failure occur and MASTER and BACKUP are suddently
> > powered down. Later the power returns, is it possible that some
> > problem occur (split-brain scenario ?) regarding the order in which the
>
> sure, you need an exact procedure to recover
>
> > two machines boot up?
>
> best practice should be to keep everything down after boot
>
> > - Other things I have not thought?
> >
>
>
>
> > Thanks!
> > Julien
> >
>
>
> imho:
>
> leave hast where it is, go for zfs replication. will save your butt,
> sooner or later if you avoid this fragile combination
Do you mean a $> zfs snapshot followed by a $> zfs send ... | ssh zfs
receive ... ?
--
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11 6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20160630/71dc124d/attachment.sig>
More information about the freebsd-fs
mailing list