Re: Odd behaviour of two identical ZFS servers mirroring via rsync

From: andy thomas <andy_at_time-domain.co.uk>
Date: Sat, 12 Nov 2022 08:24:27 UTC
Thank you for the suggestions, I'll set up a pair of test servers and 
experiment with adjusting the rsync block size to match the 128k ZFS 
record size, noting disk usage on both for varying file sizes.
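
For the disk usage comparison, something along these lines could be run 
on both test servers after each rsync pass. This is only a rough sketch: 
the function name tree_usage is my own invention, and it assumes a POSIX 
system where st_blocks is reported in 512-byte units. It totals the 
apparent file sizes and the space actually allocated, which is roughly 
the difference between 'du -A' (or 'du --apparent-size' on Linux) and 
plain 'du'.

```python
import os
import stat


def tree_usage(root):
    """Return (apparent_bytes, allocated_bytes) for regular files under root.

    tree_usage is a hypothetical helper, not part of rsync or ZFS.
    Assumes POSIX semantics: st_blocks counts 512-byte units.
    """
    apparent = 0
    allocated = 0
    seen = set()  # (device, inode) pairs, so hard links are counted once
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished while we were scanning
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, devices and so on
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            apparent += st.st_size
            allocated += st.st_blocks * 512
    return apparent, allocated


# Example use (the path is a placeholder, not a real dataset):
#   apparent, allocated = tree_usage("/path/to/mirrored/folder")
#   print(apparent, allocated)
```

With compression enabled on the dataset the allocated figure can be 
smaller than the apparent one, so the interesting number is how the two 
servers differ from each other rather than either value on its own.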

Buffering could well be an issue here, with data on the server being 
mirrored constantly changing within the HPC cluster it supports (30 Linux 
compute nodes with up to 700 simultaneous user jobs), and this might be 
something I will just have to live with.

Andy

On Fri, 11 Nov 2022, Bob Friesenhahn wrote:

> On Fri, 11 Nov 2022, andy thomas wrote:
>> 
>> It seems almost as if ZFS is not freeing up blocks when rsync has deleted 
>> or shrunk files, leaving unwanted blocks lurking around in the folder that 
>> 'du' then discovers and adds to its tally when it works out the space usage 
>> of that folder!
>
> This would be completely expected behavior if zfs snapshots are used.
>
> The rsync block sizes can be adjusted to be a better match for zfs block 
> sizes (e.g. 128k).  For example, zfs will do a 'sync' to write new data to 
> disk and it will help if all of the data in a new/updated zfs block is 
> provided at the time of the 'sync' (rather than 1/4 or 1/2 of the block).
>
> Network buffering can also be a factor since it affects the timing of data 
> delivery to the backup server.  If the sending side tends to stall, then add 
> more buffering.
>
> Bob
> -- 
> Bob Friesenhahn
> bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
> Public Key,     http://www.simplesystems.org/users/bfriesen/public-key.txt
>
>


----------------------------
Andy Thomas,
Time Domain Systems

Tel: +44 (0)7866 556626
http://www.time-domain.co.uk