Re: Odd behaviour of two identical ZFS servers mirroring via rsync

From: andy thomas <andy_at_time-domain.co.uk>
Date: Thu, 17 Nov 2022 14:43:38 UTC
On Thu, 17 Nov 2022, Freddie Cash wrote:

> Now that you have it working with rsync, you should look into using ZFS
> send/recv as an alternative. You should find it finishes a lot quicker than
> rsync, although it does require a bit more scripting know-how (especially if
> you want to use restartable/interruptible transfers, or use a transport
> other than SSH for better throughput).
> 
> ZFS send/recv works "below" the filesystem layer that rsync works at. ZFS
> knows which individual blocks on disk have changed between snapshots and
> only transfers those blocks. There are no file comparisons or hash
> computations to work out between the hosts.
> 
> Transferring the initial snapshot takes a long time, though, as it has to
> transfer the entire filesystem across. Transferring individual snapshots
> after that takes very little time. It's similar to doing a "full" backup,
> and then "incrementals".
> 
> When transferring data between ZFS pools with similar filesystem
> hierarchies, you really should consider send/recv.
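
As a rough illustration of the "full, then incrementals" pattern described
above, the following shell sketch only prints the commands it would run
instead of executing them. The dataset name tank/data, the snapshot names
day1/day2 and the host name mirror are hypothetical placeholders, and it
assumes SSH as the transport with the same dataset name on both servers.

```shell
#!/bin/sh
# Dry-run sketch: prints ZFS send/recv pipelines rather than running them.
# All names below are hypothetical placeholders, not taken from the thread.
DATASET="tank/data"     # assumed dataset name, identical on both servers
MIRROR="mirror"         # assumed SSH host name of the destination server

# Initial transfer: sends the whole dataset as of snapshot $1 (slow, one-off).
full_send_cmd() {
    printf 'zfs send %s@%s | ssh %s zfs recv -F %s\n' \
        "$DATASET" "$1" "$MIRROR" "$DATASET"
}

# Later transfers: sends only the blocks changed between snapshots $1 and $2.
incr_send_cmd() {
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs recv -F %s\n' \
        "$DATASET" "$1" "$DATASET" "$2" "$MIRROR" "$DATASET"
}

# Print the sequence: snapshot, full send, next snapshot, incremental send.
printf 'zfs snapshot %s@%s\n' "$DATASET" "day1"
full_send_cmd "day1"
printf 'zfs snapshot %s@%s\n' "$DATASET" "day2"
incr_send_cmd "day1" "day2"
```

Printing the pipelines first makes it easy to review them on a pair of test
servers before anything touches live data. The -F on the receiving side
rolls the destination back to its most recent snapshot before receiving,
which is only appropriate if nothing else writes to the mirror.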

Point taken! Three days ago, one of our HPC users who has ~9TB of data 
stored on our server decided to rename a subdirectory containing ~4TB of 
experimental data stored as many millions of relatively small files within 
a lot of subdirectories. As a result, rsync on the destination (mirror) 
server is still deleting his old folder and its contents and hasn't even 
started mirroring the renamed folder.

Since our servers have been up for 5.5 years and are both well overdue for 
an O/S upgrade from FBSD 11.3 to 13.x anyway, I think this would be a good 
opportunity to switch from rsync to ZFS send/recv. I was planning to do 
the O/S update over the upcoming Christmas vacation when HPC demand here 
traditionally falls to a very low level - I will set up a pair of test 
servers in the next day or two, play around with this and get some 
experience of this before upgrading the 'live' servers.

cheers, Andy

> Typos due to smartphone keyboard.
> 
> On Thu., Nov. 17, 2022, 12:50 a.m. andy thomas, <andy@time-domain.co.uk>
> wrote:
>       I thought I would report back that changing my rsync options from
>       '-Wav --delete' to '-av --inplace --no-whole-file --delete' has made
>       a significant difference, with mirrored directory sizes on the slave
>       server now falling and approaching the original sizes on the master.
>       The only downside is that since whole-file replication is obviously
>       a lot faster than updating the changed parts of individual files,
>       mirroring is now taking longer than 24 hours so this will be changed
>       to every few days or even weekly when more is known about user
>       behaviour on the master server.
>
>       Andy
>
>       On Sun, 13 Nov 2022, Bob Friesenhahn wrote:
>
>       > On Sun, 13 Nov 2022, Mark Saad wrote:
>       >>>
>       >> Bob are you saying when the target is zfs --inplace --no-whole-file
>       >> helps or just in general when you have large files?  Also have you
>       >> tried using --delete-during / --delete-after?
>       >
>       > The '--inplace --no-whole-file' options update the file blocks if
>       > they have changed (comparing the origin blocks with the existing
>       > mirror blocks) rather than creating a new copy of the file and
>       > moving it into place when it is complete. ZFS does not check if
>       > data content has been changed while it is being written so a write
>       > of the same data will result in a fresh allocation based on its
>       > Copy On Write ("COW") design.  Writing a whole new file obviously
>       > significantly increases the number of blocks which are written.
>       > Requesting that rsync only write to the file for the blocks which
>       > have changed reduces the total number of blocks which get written.
>       >
>       > The above helps quite a lot when using snapshots since then fewer
>       > blocks are in the snapshots.
>       >
>       > I have never tried --delete-during so I can't comment on that.
>       >
>       > Bob
>       > --
>       > Bob Friesenhahn
>       > bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
>       > GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
>       > Public Key,    http://www.simplesystems.org/users/bfriesen/public-key.txt
>       >
>       >
> 
>
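
The rsync option change quoted above can be sketched as follows. This is a
dry run that only prints the two command lines for comparison. The source
and destination paths are hypothetical, since the thread does not give the
real ones, and it assumes the mirror pulls from the master over SSH.

```shell
#!/bin/sh
# Dry-run sketch: prints the two rsync invocations discussed in the thread.
# SRC and DST are hypothetical; the thread does not give the real paths.
SRC="master:/export/data/"   # assumed source (trailing slash: copy contents)
DST="/export/data/"          # assumed destination directory on the mirror

# Old options: -W (--whole-file) transfers every changed file in full.
OLD_OPTS="-Wav --delete"
# New options: update only the changed blocks inside the existing file.
NEW_OPTS="-av --inplace --no-whole-file --delete"

# Builds one rsync command line from an option string ($1).
rsync_cmd() {
    printf 'rsync %s %s %s\n' "$1" "$SRC" "$DST"
}

rsync_cmd "$OLD_OPTS"
rsync_cmd "$NEW_OPTS"
```

Without --inplace, rsync builds a new temporary copy of each changed file
and renames it into place, so on ZFS every block of that file is freshly
allocated and any snapshot keeps the old blocks. With --inplace, only the
changed regions of the existing destination file are rewritten.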