HAST + ZFS + NFS + CARP

Ben RUBSON ben.rubson at gmail.com
Thu Jul 14 15:50:39 UTC 2016


> On 12 Jul 2016, at 15:15, Jan Bramkamp <crest at rlwinm.de> wrote:
> 
> On 04/07/16 19:55, Jordan Hubbard wrote:
>> 
>>> On Jul 3, 2016, at 11:05 PM, Ben RUBSON <ben.rubson at gmail.com> wrote:
>>> 
>>> Of course Jordan, in this topic, we (well at least me :) make the following assumption :
>>> one iSCSI target/disk = one real physical disk (a SAS disk, a SSD disk...), from a server having its own JBOD, no RAID adapter or whatever, just what ZFS likes !
>> 
>> I certainly wouldn’t make that assumption.  Once you allow iSCSI to be the back-end in any solution, end-users will avail themselves of the flexibility to also export arbitrary or synthetic devices (like zvols / RAID devices) as “disks”.  You can’t stop them from doing so, so you might as well incorporate that scenario into your design.  Even if you could somehow enforce the 1:1 mapping of LUN to disk, iSCSI itself is still going to impose a serialization / performance / reporting (iSCSI LUNs don’t report SMART status) penalty that removes a lot of the advantages of having direct physical access to the media, so one might also ask what you’re gaining by imposing those restrictions.
> 
> 
> How about 3way ZFS mirrors spread over three SAS JBODs with dual-ported expanders connected to two FreeBSD servers with SAS HBAs and a *reliable* arbiter to the disks. This could either be an external locking server e.g. consul/etcd/zookeeper and/or SCSI reservations. If more than two head servers are to share the disks a pair of SAS switches should do the job.

It would be nice if it could work without a third server, so one important thing to test would be SCSI reservations: make sure that when the pool is imported on the MASTER, the SLAVE can no longer use the disks.
(This is the case with iSCSI: when the SLAVE exports its disks through CTL, it can't import them using ZFS, as CTL locks them as soon as it is started.)
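A minimal sketch of such a reservation test, assuming a shared disk shows up as da0 on both heads (hypothetical device name and key) and a camcontrol(8) recent enough to have the "persist" subcommand (FreeBSD 10.1+); see camcontrol(8) for the exact reservation type names:

```shell
# On MASTER: register a reservation key, then take a Write Exclusive
# SCSI-3 persistent reservation on the shared disk.
camcontrol persist da0 -o register -K 0xdeadbeef
camcontrol persist da0 -o reserve -k 0xdeadbeef -T wr_ex

# On SLAVE: the reservation should now be visible, and writes
# (hence a pool import) should fail with a reservation conflict.
camcontrol persist da0 -i read_reservation
zpool import tank   # should be refused while MASTER holds the reservation
```

If the import on the SLAVE still succeeds, the disks/expanders are not honoring persistent reservations and an external arbiter would be needed.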

> If N-1 disk redundancy is enough two JBODs and 2way mirrors would work as well.

Or if we only have 2 JBODs (for whatever reason), we could (and certainly should :) use 4-way mirrors, so that if one JBOD dies the pool still has redundancy.
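A sketch of that layout, assuming hypothetical device names with da0..da3 sitting in JBOD 0 and da4..da7 in JBOD 1:

```shell
# Each 4-way mirror vdev takes two disks from each JBOD, so losing an
# entire JBOD still leaves every vdev with two healthy copies.
zpool create tank \
    mirror da0 da1 da4 da5 \
    mirror da2 da3 da6 da7
```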

> While you can't prevent stupid operators from blowing their feet off, it doesn't offer the same "flexibility" as iSCSI, if only because you can't conveniently hook up everything talking Ethernet and offering itself as an iSCSI target. That is, until someone implements a SAS target with CTL and a suitable HBA in FreeBSD ;-).

Why would you prefer a SAS target over an iSCSI target?
How would it fit?

> This kind of setup should also preserve all assumptions ZFS has regarding disks.

Yep, although AFAIR no one has demonstrated that ZFS suffers from iSCSI :) (devs on #openzfs stated it does not)

Anyway, this is a nice SAS-only setup which avoids an additional protocol, a very good reason to go with it.
One good reason for iSCSI is that it allows servers to be in different racks (well, there are long SAS cables) / different rooms / buildings.
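For reference, the one-LUN-per-physical-disk mapping discussed earlier is a small ctl.conf(5) fragment on the exporting server; the IQN and device name below are hypothetical:

```
# /etc/ctl.conf sketch: one physical disk exported as one LUN,
# nothing synthetic (no zvol, no RAID device) in between.
portal-group pg0 {
    discovery-auth-group no-authentication
    listen 0.0.0.0:3260
}

target iqn.2016-07.org.example:jbod0-da0 {
    auth-group no-authentication
    portal-group pg0
    lun 0 {
        path /dev/da0
    }
}
```

(Real deployments would of course use CHAP rather than no-authentication.)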

> I have the required spare hardware to build a two JBOD test setup [1] and could run some tests if anyone is interested in such a setup.
> 
> 
> [1]: Test setup
> 
>    +-----------+    +-----------+
>    | MASTER    |    | SLAVE     |
>    |           |    |           |
>    | HBA0 HBA1 |    | HBA0 HBA1 |
>    +--+----+---+    +--+----+---+
>       ^    ^           ^    ^
>       |    |           |    |
>       |    |           |    +------+
>       |    |           |           |
>       |    |           +----+      |
>       |    |                |      |
>       |    +-----------+    |      |
>       |                |    |      |
>       v                v    v      |
>    +--+--------+    +--+----+---+  |
>    | JBOD 0    |    | JBOD 1    |  |
>    +-------+---+    +-----------+  |
>            ^                       |
>            |                       |
>            +-----------------------+
