Re: Odd behaviour of two identical ZFS servers mirroring via rsync

From: Karl Denninger <karl_at_denninger.net>
Date: Fri, 11 Nov 2022 20:51:34 UTC
Are you sure there are no snapshots that are holding blocks?  If there 
are, those blocks are retained by copy-on-write and will not be released 
even if the file in question is deleted, as the snapshot copy is still 
there and valid.
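
One way to check this is to compare the space each dataset uses for live 
data against the space pinned by its snapshots.  The following is only a 
rough sketch, assuming 'zfs list' is run with the -H and -p flags and the 
used, usedbydataset and usedbysnapshots properties; the helper names and 
the pool name 'tank' are made up for illustration.

```python
import subprocess

# Properties requested from `zfs list`, in column order.
COLUMNS = ["name", "used", "usedbydataset", "usedbysnapshots"]


def to_bytes(field):
    """Convert a parsable `zfs list -p` value to int; '-' means not applicable."""
    return 0 if field == "-" else int(field)


def parse_zfs_space(output):
    """Parse tab-separated `zfs list -Hp -o name,used,...` output into dicts."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        name, used, by_dataset, by_snaps = line.split("\t")
        rows.append({
            "name": name,
            "used": to_bytes(used),
            "usedbydataset": to_bytes(by_dataset),
            "usedbysnapshots": to_bytes(by_snaps),
        })
    return rows


def snapshot_heavy(rows, min_bytes=1):
    """Return datasets whose snapshots pin at least min_bytes of space."""
    return [r for r in rows if r["usedbysnapshots"] >= min_bytes]


def query_pool(pool):
    """Run `zfs list` for every filesystem under the given pool (hypothetical name)."""
    cmd = ["zfs", "list", "-Hp", "-r", "-t", "filesystem",
           "-o", ",".join(COLUMNS), pool]
    return subprocess.run(cmd, check=True, capture_output=True,
                          text=True).stdout
```

On the mirror server, something like 
snapshot_heavy(parse_zfs_space(query_pool("tank"))) would then list the 
datasets where snapshots, rather than live files, account for space.  If 
that list is empty, snapshots are not the explanation.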

On 11/11/2022 15:20, andy thomas wrote:
> Yes, I can confirm the rsync --delete option is being used and in 
> fact, 'du' reports some of the mirrored folders as having identical 
> sizes on both servers, mainly those containing only small amounts of 
> data.
>
> It seems almost as if ZFS is not freeing up blocks when rsync has 
> deleted or shrunk files, leaving unwanted blocks lurking around in the 
> folder that 'du' then discovers and adds to its tally when it works 
> out the space usage of that folder!
>
> I suppose I could always destroy the ZFS dataset on the mirror server 
> and resync the whole thing, but that will take days to complete even 
> over a 10 Gbit/s network link (the servers ought to be upgraded to 
> FreeBSD 13.1 as well).
>
> Andy
>
> On Fri, 11 Nov 2022, Mehmet Erol Sanliturk wrote:
>
>>
>> On Fri, Nov 11, 2022 at 8:42 PM andy thomas <andy@time-domain.co.uk> wrote:
>>       I have two identical servers, called clustor2 and clustor-backup,
>>       each with a ZFS RAIDZ-1 pool containing 9 SAS hard disks plus one
>>       spare and two SSDs for the ZIL and ARC functions. clustor2 stores
>>       user data from an HPC while clustor2-backup uses rsync to mirror
>>       all the data from clustor2 every 24 hours.
>>
>>       However, the disk usage on the mirror server is considerably more
>>       than on the other server - attached is a screenshot showing the
>>       two servers side by side, with the mirror server on the right, and
>>       displaying the contents of the same subdirectory chosen at random
>>       (named 'ratio_10.0' in this instance); as you can see, the sizes
>>       of the files within each of the folders are identical but 'du'
>>       reports very different space usages for each folder and
>>       'zpool list' also reports a significant difference in ZFS pool
>>       size.
>>
>>       I'm not sure if this is relevant but both servers have ZFS pools
>>       with no compression although lz4 compression is enabled on the ZFS
>>       filesystems & both run FreeBSD 11.3 with ZFS version 5.
>>
>>       Perhaps using zfs send/receive instead of rsync for mirroring
>>       might solve this disparity?
>>
>>       Thanks in advance for any suggestions,
>>
>>       Andy
>>
>>
>>
>>
>> From your question, I understand the following points.
>>
>>
>>
>> I am using rsync on Fedora Linux.
>>
>> rsync has parameters such as
>>
>>  --delete
>>
>> to delete files from the destination drive when they do not exist on the
>> source drive.
>>
>>
>> Please review the rsync parameters carefully and use the ones suitable
>> for your application.
>>
>>
>> If a parameter like --delete is not used, rsync copies new files from
>> the source drive and does not delete any files from the destination
>> drive.
>>
>>
>> With my best wishes to all.
>>
>>
>> Mehmet Erol Sanliturk
>>
>
>
> ----------------------------
> Andy Thomas,
> Time Domain Systems
>
> Tel: +44 (0)7866 556626
> http://www.time-domain.co.uk
-- 
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/