HAST + ZFS + NFS + CARP

Borja Marcos borjam at sarenet.es
Thu Aug 11 11:22:10 UTC 2016


> On 11 Aug 2016, at 13:02, Julien Cigar <julien at perdition.city> wrote:
> 
> On Thu, Aug 11, 2016 at 12:15:39PM +0200, Julien Cigar wrote:
>> On Thu, Aug 11, 2016 at 11:24:40AM +0200, Borja Marcos wrote:
>>> 
>>>> On 11 Aug 2016, at 11:10, Julien Cigar <julien at perdition.city> wrote:
>>>> 
>>>> As I said in a previous post, I tested the zfs send/receive approach (with
>>>> zrep) and it works (more or less) perfectly, so I concur with everything you
>>>> said, especially about off-site replication and synchronous replication.
>>>> 
>>>> Out of curiosity I'm also testing a ZFS + iSCSI + CARP setup at the moment.
>>>> I'm in the early tests and haven't done any heavy writes yet, but ATM it
>>>> works as expected: I haven't managed to corrupt the zpool.
>>> 
>>> I must be too old school, but I don’t quite like the idea of using an essentially unreliable transport
>>> (Ethernet) for low-level filesystem operations.
>>> 
>>> If something went wrong, that approach could risk corrupting a pool. Although, frankly,
> 
> Now I'm thinking of the following scenario:
> - filer1 is the MASTER, filer2 the BACKUP
> - on filer1 a zpool data mirror over loc1, loc2, rem1, rem2 (where rem1 
> and rem2 are iSCSI disks)
> - the pool is mounted on MASTER
> 
> Now imagine that the replication interface corrupts packets silently,
> but data are still written to rem1 and rem2. Will ZFS immediately detect
> that the blocks written to rem1 and rem2 are corrupted?

As far as I know, ZFS does not read back after a write. It can detect silent corruption when reading a file
or a metadata block, but that happens only when the file is actually requested, when the metadata is
needed, or during a scrub. It doesn’t do preemptive read-after-write verification, as far as I recall.
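
If you don’t want to wait for an application read to trip over the damage, a scrub forces the check
over every allocated block. A minimal sketch, assuming the pool is named "data" as in your scenario:

    # Walk all allocated blocks and verify them against their
    # checksums; damage that redundancy can cover is repaired.
    zpool scrub data

    # The CKSUM column shows per-device checksum errors found so far.
    zpool status -v data

On a mirror, a block that fails its checksum on rem1 or rem2 would be rewritten from a good copy on
loc1 or loc2, and the error charged to the offending device.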

ZFS can overcome silent corruption as long as there isn’t too much of it. In my case with the
evil HBA it was roughly one block operation error per hour of intensive I/O. In normal operation it could
be one block error per week or so. At that error rate, the chance of random I/O errors corrupting the
same block on three different devices (it’s a raidz2 vdev, so any two per stripe are survivable) is really remote.
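
To put a number on "really remote": a rough back-of-envelope, assuming (hypothetically) independent
errors and about one bad block per 10^9 block operations per device, in line with the
one-per-week figure above:

    # Illustrative only: odds that three independent errors land on
    # the same stripe (raidz2 survives any two per stripe).
    awk 'BEGIN { p = 1e-9; printf "p^3 = %.1g\n", p * p * p }'

which prints p^3 = 1e-27, i.e. vanishingly unlikely even over years of operation.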

But, again, I won’t push harder at the risk of annoying you to death. Just bear in mind that your I/O
throughput will be bound by your network and iSCSI performance anyway ;)
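
To make that bound concrete: assuming (hypothetically) a single gigabit replication link, the
theoretical ceiling is 1 Gbit/s before iSCSI/TCP overhead, and every write is mirrored to both
rem1 and rem2 over the same wire:

    # 1 GbE ceiling, ignoring protocol overhead; mirrored writes to
    # rem1 and rem2 each consume their own share of the link.
    awk 'BEGIN { printf "%.0f MB/s link, %.1f MB/s effective write\n", 1e9/8/1e6, 1e9/8/1e6/2 }'

so sustained writes would top out somewhere below 62.5 MB/s, well under what the local disks alone could do.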




Borja.


P.S.: I forgot to reply to this before:

>> Yeah.. although you could get silent data corruption with any broken
>> hardware too. Some years ago I suffered silent data corruption due to
>> a broken RAID card and had to restore from backups..

Ethernet hardware is designed with the assumption that losing a packet is not such a big deal.
Shit happens on SAS and other specialized storage networks too, of course, but you should expect it
at least a bit less often. ;)



