rsync and moving files [Re: backup w/ snapshots]
Svein Halvor Halvorsen
svein-freebsd-questions at theloosingend.net
Tue Aug 30 07:32:11 GMT 2005
* Greg Barniskis [2005-08-29 11:45 -0500]
> Eh? Bad assumptions about snapshots, I think. If a snapshot occupied even a
> tenth of the space of the data that it represented, we would quickly fill all
> our disks and the snapshot technology would be almost as painful as useful.
>
> A snapshot is essentially only an index of occupied disk space, not a copy of
> the actual data, and a snapshot is therefore much, much, much, much smaller
> than the data files that have changed. Read the relevant man pages and
> handbook sections again, and test your assumptions by measuring the actual
> change in snapshot size. I don't think your perceived problem really exists.
Yes, that's correct! But let's say I keep more than one snapshot around. I
may not have mentioned it, but that is the sole purpose of using snapshots
for me: to have several full backups lying around.
Suppose I change the disk a lot between snapshots, e.g. I rsync moved files
(yes, within the same fs). This results in a lot of file deletion and
creation. Next, when I make the snapshot, a new list of occupied disk space
will be made, and all of these blocks will be marked "in use", and
therefore take up a lot of disk space.
In reality, the information didn't change much at all between the two
snapshots, but the effect remains: my disk can no longer store two
snapshots (unless the backup disk is twice as large, which it is not).
The solution: somehow, I need to mirror all the move operations on the
remote system before doing the rsync. This could probably be done by making
a hash table of inode/filename pairs (or triplets, etc.) each time I sync.
Then the next time, I could compare the old table with the new one to find
out which files are the same, only with new names, then find those names on
the remote system, change them to the new ones, and then run rsync. If
inodes are recycled for brand-new files between syncs, I don't think that
would be a problem: the following rsync job would recognize the diffs and
sync them, which it would have done anyway if the file were new.
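
To make the idea concrete, here is a rough Python sketch of the table
comparison. It is only an illustration under some assumptions: the function
names build_table and find_moves are made up, hard-linked files are skipped
because one inode would then map to several names, and everything is assumed
to live on a single filesystem so that inode numbers are comparable.

```python
import os
import stat


def build_table(root):
    """Return a dict mapping inode number -> path relative to root.

    Only regular files with a single link are recorded (assumption:
    hard-linked files are left for rsync to handle the usual way).
    """
    table = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if not stat.S_ISREG(st.st_mode) or st.st_nlink > 1:
                continue
            table[st.st_ino] = os.path.relpath(full, root)
    return table


def find_moves(old_table, new_table):
    """Return sorted (old_path, new_path) pairs for inodes whose name changed."""
    moves = []
    for inode, new_path in new_table.items():
        old_path = old_table.get(inode)
        if old_path is not None and old_path != new_path:
            moves.append((old_path, new_path))
    return sorted(moves)
```

The table from the previous sync would have to be saved somewhere and loaded
again next time. The resulting pairs could then be replayed as renames on the
remote side before rsync runs. Chained or swapped renames (a becomes b while
b becomes c) would need extra care with ordering or temporary names, which
this sketch does not attempt.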
What do you think?