Fail-over SAN setup: ZFS, NFS, and ...?

Elliot Finley efinleywork at efinley.com
Thu Jun 25 00:04:12 UTC 2009


Why not take a look at gluster?

Freddie Cash wrote:
> [Not exactly sure which ML this belongs on, as it's related to both
> clustering and filesystems.  If there's a better spot, let me know and I'll
> update the CC:/reply-to.]
> 
> We're in the planning stages for building a multi-site, fail-over SAN setup
> which will provide redundant storage for a virtual machine environment.
> The layout will be like so:
>    [Server Room 1]      .      [Server Room 2]
>   -----------------     .    -------------------
>                         .
>   [storage server]      .     [storage server]
>           |             .             |
>           |             .             |
>    [storage switch]     .      [storage switch]
>                  \----fibre----/      |
>                         .             |
>                         .             |
>                         .   [storage aggregator]
>                         .             |
>                         .             |
>                         .     /---[switch]---\
>                         .     |       |      |
>                         .     |   [VM box]   |
>                         .     |       |      |
>                         .  [VM box]   |      |
>                         .     |       |  [VM box]
>                         .     |       |      |
>                         .     [network switch]
>                         .             |
>                         .             |
>                         .         [internet]
> 
> Server room 1 and server room 2 are on opposite ends of town (about 3 km)
> with a dedicated, direct-link, fibre link between them.  There will be a set
> of VM boxes at each site that use the shared storage and act as fail-over
> for each other.  In theory, only one server room would ever be
> active at a time, although we may end up migrating VMs between the two sites
> for maintenance purposes.
> 
> We've got the storage server side of things figured out (5U rackmounts with
> 24 drive bays, using FreeBSD 7.x and ZFS).  We've got the storage switches
> picked out (HP ProCurve 2800 or 2900, depending on whether we go with 1 GbE
> or 10 GbE fibre links between them).  We're stuck on the storage aggregator.
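
For reference, the storage-server side might look roughly like the sketch
below.  The pool layout, device names, zvol size, and the choice of iSCSI
target port (net/iscsi-target here) are all illustrative assumptions, not
part of the original plan:

    # On each storage server: build a pool across the drive bays
    # (layout and device names are examples only)
    zpool create tank raidz2 da0 da1 da2 da3 da4 da5

    # Carve out a zvol to hand to the aggregator as a single iSCSI LUN
    zfs create -V 2T tank/vol0

    # /usr/local/etc/iscsi/targets (net/iscsi-target port), roughly:
    #   extent0   /dev/zvol/tank/vol0   0   2TB
    #   target0   rw   extent0   10.0.0.0/24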
> 
> For a single aggregator box setup, we'd use FreeBSD 7.x with ZFS.  The
> storage servers would each export a single zvol using iSCSI.  The storage
> aggregator would use ZFS to create a pool using a mirrored vdev.  To expand
> the pool, we put in two more storage servers, and add another mirrored vdev
> to the pool.  No biggie.  The storage aggregator then uses NFS and/or iSCSI
> to make storage available to the VM boxes.  This is the easy part.
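
A minimal sketch of that aggregator pool, assuming the two iSCSI LUNs show up
locally as da10 and da11 via iscontrol(8), and that the VM network is
10.0.1.0/24 (both assumptions):

    # Mirror the two remote LUNs (one from each server room)
    zpool create vmpool mirror da10 da11

    # Later expansion: two more storage servers, one more mirrored vdev
    zpool add vmpool mirror da12 da13

    # Export a filesystem to the VM boxes over NFS
    zfs create vmpool/vms
    zfs set sharenfs="-maproot=root -network 10.0.1.0/24" vmpool/vms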
> 
> However, we'd like to remove the single-point-of-failure that the storage
> aggregator represents, and have a duplicate of it running at Server Room 1.
>  Right now, we can do this using cold spares that rsync from the live box
> every X hours/days.  We'd like this to be a live, fail-over spare, though.
>  And this is where we're stuck.
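
The cold-spare arrangement described above amounts to something like this
/etc/crontab entry on the spare (the interval, host name, and paths are
made-up examples):

    # Pull the live aggregator's data every 4 hours (illustrative only)
    0 */4 * * *   root   rsync -aH --delete aggr-live:/vmpool/vms/ /vmpool/vms/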
> 
> What can we use to do this?  CARP?  Heartbeat?  ggate?  Should we look at
> Linux with DRBD or linux-ha or cluster-nfs or similar?  Perhaps Red Hat
> Cluster Suite?  (We'd prefer not to, as then storage management becomes a
> nightmare again, requiring mdadm, LVM, and more.)  Would a cluster
> filesystem be needed?  AFS or similar?
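
Of the options listed, CARP only covers the IP-failover half of the problem;
keeping the two aggregators' pools in sync is the hard part.  For what it's
worth, the CARP side on FreeBSD 7.x looks roughly like this (addresses, vhid,
and password are placeholders):

    # /etc/rc.conf on the primary aggregator (needs carp(4) loaded or in the kernel)
    cloned_interfaces="carp0"
    ifconfig_carp0="vhid 1 pass examplepass 10.0.1.10/24"

    # The standby aggregator uses the same vhid and password, plus an advskew:
    # ifconfig_carp0="vhid 1 advskew 100 pass examplepass 10.0.1.10/24"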
> 
> We have next to no experience with high-availability, fail-over clustering.
> Any pointers to things to read online, or
> tips, or even "don't do that, you're insane" comments greatly appreciated.
>  :)

