rsync and moving files [Re: backup w/ snapshots]

Svein Halvor Halvorsen svein-freebsd-questions at
Tue Aug 30 07:32:11 GMT 2005

* Greg Barniskis [2005-08-29 11:45 -0500]
>  Eh? Bad assumptions about snapshots, I think. If a snapshot occupied even a
>  tenth of the space of the data that it represented, we would quickly fill all
>  our disks and the snapshot technology would be almost as painful as useful.
>  A snapshot is essentially only an index of occupied disk space, not a copy of
>  the actual data, and a snapshot is therefore much, much, much, much smaller
>  than the data files that have changed. Read the relevant man pages and
>  handbook sections again, and test your assumptions by measuring the actual
>  change in snapshot size. I don't think your perceived problem really exists.

Yes, that's correct! But let's say I keep more than one snapshot around. I
may not have mentioned this, but that is the sole purpose of using
snapshots: to have several full backups lying around.

Suppose I change the disk a lot between snapshots, e.g. I rsync moved files
(yes, within the same fs). This results in a lot of file deletion and
creation. Then, when I make the next snapshot, a new list of occupied disk
space is made, all of these blocks are marked "in use", and they therefore
take up a lot of disk space.

In reality, the information didn't change much at all between the two
snapshots, but the effect remains: my disk can no longer store two
snapshots (unless the backup disk is twice as large, which it is not).

The solution: somehow, I need to mirror all the move operations on the
remote system before doing the rsync. This could probably be done by
building a hash table of inode/filename pairs (or triplets, etc.) each time
I sync. The next time, I could compare the old table with the new one to
find out which files are the same but have new names, find the old names on
the remote system, rename them to the new ones, and then run rsync. If
inodes are recycled for brand-new files between syncs, I don't think that
would be a problem: the following rsync job would recognize the differences
and sync them, which it would have done anyway for a new file.
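
To make the idea concrete, here is a rough Python sketch of the table
comparison. It is only an illustration under assumptions: the function
names (build_inode_table, find_renames, apply_renames) are made up for this
example, the mirror is treated as a locally reachable directory rather than
a real remote host, and hard links are not handled properly (the last path
seen for an inode wins).

```python
import os
import stat


def build_inode_table(root):
    """Return {(device, inode): relative path} for every regular file under root."""
    table = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if stat.S_ISREG(st.st_mode):
                # Hard links share an inode; the last path seen wins here.
                table[(st.st_dev, st.st_ino)] = os.path.relpath(full, root)
    return table


def find_renames(old_table, new_table):
    """Return sorted (old_path, new_path) pairs for inodes whose path changed."""
    renames = []
    for key, new_path in new_table.items():
        old_path = old_table.get(key)
        if old_path is not None and old_path != new_path:
            renames.append((old_path, new_path))
    return sorted(renames)


def apply_renames(mirror_root, renames):
    """Replay renames under mirror_root, skipping files the mirror lacks.

    Files are first parked under temporary names so that swaps and chains
    (a -> b while b -> c) do not overwrite each other.
    """
    parked = []
    for i, (old_path, new_path) in enumerate(renames):
        src = os.path.join(mirror_root, old_path)
        if os.path.isfile(src):
            # Hypothetical temporary name; assumed not to clash with real files.
            tmp = os.path.join(mirror_root, ".rename-tmp-%d" % i)
            os.rename(src, tmp)
            parked.append((tmp, os.path.join(mirror_root, new_path)))
    for tmp, dst in parked:
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        os.rename(tmp, dst)
```

The table from the previous sync would be saved somewhere (e.g. pickled to
a file), compared against a fresh one with find_renames, and the resulting
list replayed on the backup side before rsync runs. A recycled inode would
show up here as a bogus rename, but as noted above, the rsync run that
follows would detect the content difference and transfer the file anyway.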

What do you think?
