Re: Odd behaviour of two identical ZFS servers mirroring via rsync
- In reply to: andy thomas : "Re: Odd behaviour of two identical ZFS servers mirroring via rsync"
Date: Fri, 18 Nov 2022 00:47:12 UTC
Take the time to figure out send/recv; it is a killer app of ZFS. Note that
your initial sync will have to send the entire filesystem; there is no way
to start with an rsync-ed copy due to the nature of send/recv.
Also note you cannot modify the receive side and then update the backup; as
such you should typically set it to be read-only. (Otherwise you will have
to roll back to the synchronized snapshot before updating). You can still
recv into a read-only zfs filesystem, as the read-only is a statement of
“read only through the posix layer”.
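For example, a minimal sketch (assuming a dataset tank/data on the live
server, a pool called backup on the mirror, and ssh as the transport;
adjust the names to your own layout):

    # one-time: recursive snapshot, then send the whole filesystem
    zfs snapshot -r tank/data@base
    zfs send -R tank/data@base | ssh mirror zfs recv -u backup/data

    # stop the mirror being modified through the POSIX layer
    ssh mirror zfs set readonly=on backup/data

    # routine updates: snapshot again, send only the blocks that have
    # changed since the last common snapshot
    zfs snapshot -r tank/data@sync1
    zfs send -R -i @base tank/data@sync1 | ssh mirror zfs recv -u backup/data

    # if the mirror ever does get modified, roll it back to the last
    # common snapshot before the next incremental recv (recv -F does
    # this rollback for you)
    ssh mirror zfs rollback -r backup/data@base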
- Eric
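P.S. On the restartable/interruptible transfers Freddie mentions below:
OpenZFS (including the version shipped with FreeBSD 13.x) supports
resumable receives. Another rough sketch, reusing the hypothetical
backup/data target from above:

    # pass -s so an interrupted receive leaves a resume token behind
    zfs send -i @base tank/data@sync1 | ssh mirror zfs recv -s -u backup/data

    # after an interruption, read the token back from the mirror...
    token=$(ssh mirror zfs get -H -o value receive_resume_token backup/data)

    # ...and restart the stream from where it left off
    zfs send -t "$token" | ssh mirror zfs recv -s -u backup/data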
On Thu, Nov 17, 2022 at 8:44 AM andy thomas <andy@time-domain.co.uk> wrote:
> On Thu, 17 Nov 2022, Freddie Cash wrote:
>
> > Now that you have it working with rsync, you should look into using ZFS
> > send/recv as an alternative. You should find it finishes a lot quicker
> > than rsync, although it does require a bit more scripting know-how
> > (especially if you want to use restartable/interruptible transfers, or
> > use a transport other than SSH for better throughput).
> >
> > ZFS send/recv works "below" the filesystem layer that rsync works at.
> > ZFS knows which individual blocks on disk have changed between snapshots
> > and only transfers those blocks. There are no file comparisons or hash
> > computations to work out between the hosts.
> >
> > Transferring the initial snapshot takes a long time, though, as it has to
> > transfer the entire filesystem across. Transferring individual snapshots
> > after that takes very little time. It's similar to doing a "full" backup,
> > and then "incrementals".
> >
> > When transferring data between ZFS pools with similar filesystem
> > hierarchies, you really should consider send/recv.
>
> Point taken! Three days ago, one of our HPC users who has ~9TB of data
> stored on our server decided to rename a subdirectory containing ~4TB of
> experimental data stored as many millions of relatively small files within
> a lot of subdirectories. As a result, rsync on the destination (mirror)
> server is still deleting his old folder and its contents and hasn't even
> started mirroring the renamed folder.
>
> Since our servers have been up for 5.5 years and are both well overdue for
> an O/S upgrade from FBSD 11.3 to 13.x anyway, I think this would be a good
> opportunity to switch from rsync to ZFS send/recv. I was planning to do
> the O/S update over the upcoming Christmas vacation when HPC demand here
> traditionally falls to a very low level - I will set up a pair of test
> servers in the next day or two, play around with this and get some
> experience before upgrading the 'live' servers.
>
> cheers, Andy
>
> > Typos due to smartphone keyboard.
> >
> > On Thu., Nov. 17, 2022, 12:50 a.m. andy thomas, <andy@time-domain.co.uk>
> > wrote:
> > I thought I would report back that changing my rsync options from
> > '-Wav --delete' to '-av --inplace --no-whole-file --delete' has made a
> > significant difference, with mirrored directory sizes on the slave
> > server now falling and approaching the original sizes on the master.
> > The only downside is that since whole-file replication is obviously a
> > lot faster than updating the changed parts of individual files,
> > mirroring is now taking longer than 24 hours, so this will be changed
> > to every few days or even weekly when more is known about user
> > behaviour on the master server.
> >
> > Andy
> >
> > On Sun, 13 Nov 2022, Bob Friesenhahn wrote:
> >
> > > On Sun, 13 Nov 2022, Mark Saad wrote:
> > >>>
> > >> Bob are you saying when the target is zfs, --inplace --no-whole-file
> > >> helps or just in general when you have large files? Also have you
> > >> tried using --delete-during / --delete-after?
> > >
> > > The '--inplace --no-whole-file' options update the file blocks if
> > > they have changed (comparing the origin blocks with the existing
> > > mirror blocks) rather than creating a new copy of the file and moving
> > > it into place when it is complete. ZFS does not check if data content
> > > has been changed while it is being written, so a write of the same
> > > data will result in a fresh allocation based on its Copy On Write
> > > ("COW") design. Writing a whole new file obviously significantly
> > > increases the number of blocks which are written. Requesting that
> > > rsync only write the blocks which have changed reduces the total
> > > number of blocks which get written.
> > >
> > > The above helps quite a lot when using snapshots, since then fewer
> > > blocks are in the snapshots.
> > >
> > > I have never tried --delete-during so I can't comment on that.
> > >
> > > Bob
> > > --
> > > Bob Friesenhahn
> > > bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> > > GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
> > > Public Key, http://www.simplesystems.org/users/bfriesen/public-key.txt
> > >
> > >
> >
> >