rsync and moving files [Re: backup w/ snapshots]
nalists at scls.lib.wi.us
Tue Aug 30 14:27:19 GMT 2005
Svein Halvor Halvorsen wrote:
> * Greg Barniskis [2005-08-29 11:45 -0500]
>> Eh? Bad assumptions about snapshots, I think. If a snapshot occupied even a
>> tenth of the space of the data that it represented, we would quickly fill all
>> our disks and the snapshot technology would be almost as painful as useful.
>> A snapshot is essentially only an index of occupied disk space, not a copy of
>> the actual data, and a snapshot is therefore much, much, much, much smaller
>> than the data files that have changed. Read the relevant man pages and
>> handbook sections again, and test your assumptions by measuring the actual
>> change in snapshot size. I don't think your perceived problem really exists.
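One way to run the measurement suggested above is to compare a snapshot file's apparent size with the space actually allocated to it. This is only a minimal sketch: the snapshot path in the usage comment is hypothetical, and it assumes `st_blocks` counts 512-byte units, as it does on FreeBSD and Linux.

```python
import os

def disk_usage(path):
    """Return (apparent_size, allocated_bytes) for a single file.

    Assumption: st_blocks is reported in 512-byte units.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512

# Hypothetical usage: call this before and after a large rsync run and
# compare the second value to see how much the snapshot really grew.
# apparent, allocated = disk_usage("/backup/.snap/daily.0")
```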
> Yes, that's correct! But let's say I keep more than one snapshot around. I
> may not have mentioned this, but this is the sole purpose of using
> snapshots: for me to have more full backups lying around.
Ah. That does change things a bit, I guess. A previous post
indicated file renames and replication followed by taking a new
snapshot, and I thought it was implied your older snapshots were
> If I change the disk a lot between snapshots, e.g. I rsync moved files (yes,
> within the same fs), this will result in a lot of file deletion and
> creation. Next, when I make the snapshot, a new list of occupied disk space
> will be made, and all of these blocks will be marked "in use", and
> therefore take up a lot of disk space.
> In reality the information didn't change much at all between the two
> snapshots, but the effect remains: my disk can no longer store two
> snapshots (unless the backup disk is twice as large, which it is not).
> The solution: Somehow, I need to mirror all the move ops on the remote
> system before doing the rsync. This could probably be done by making a
> hash table of inode/filename pairs (or triplets, etc.) each time I sync.
> Then the next time, I could compare the old table with the new to find
> out which files are the same, only with new names, then find those names on
> the remote system, change them to the new ones, and then rsync. If the
> inodes are recycled for brand new files between syncs, I don't think that
> would be a problem. The following rsync job would recognize the diffs and
> sync them, which it would have done anyway if the file were new.
> What do you think?
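One possible reading of the inode-table idea quoted above, as a minimal Python sketch rather than a finished tool. The function names are hypothetical. Two things are assumptions not found in the proposal: the sketch skips hard-linked files (the "triplets" case), and it also requires size and mtime to match, as a guard against recycled inodes. For simplicity the renames are applied to a locally mounted mirror directory; a truly remote system would need the same moves issued over ssh or similar.

```python
import os
import stat

def scan_inodes(root):
    """Map (device, inode) -> (relative path, size, mtime_ns) under root."""
    table = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            # Assumption: only plain, singly linked files are tracked.
            if not stat.S_ISREG(st.st_mode) or st.st_nlink != 1:
                continue
            rel = os.path.relpath(full, root)
            table[(st.st_dev, st.st_ino)] = (rel, st.st_size, st.st_mtime_ns)
    return table

def detect_renames(old, new):
    """Return sorted (old_path, new_path) pairs for files that kept their inode."""
    renames = []
    for key, (old_path, size, mtime) in old.items():
        entry = new.get(key)
        if entry is None:
            continue  # file was deleted; rsync will handle it
        new_path, new_size, new_mtime = entry
        # Same inode, different name, unchanged size and mtime: treat as a move.
        if new_path != old_path and (new_size, new_mtime) == (size, mtime):
            renames.append((old_path, new_path))
    return sorted(renames)

def apply_renames(mirror_root, renames):
    """Replay the detected moves inside the mirror before rsync runs."""
    for old_path, new_path in renames:
        src = os.path.join(mirror_root, old_path)
        dst = os.path.join(mirror_root, new_path)
        # Skip anything surprising and leave it for rsync to sort out.
        if not os.path.exists(src) or os.path.exists(dst):
            continue
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        os.rename(src, dst)
```

The table from `scan_inodes` would have to be saved after each sync and compared with a fresh scan before the next one. Swapped names and chains of renames (A to B while B moves to C) are not handled here; those files are simply left for rsync to transfer as usual.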
This is admittedly beyond my ken, at least within the limited number
of brain cycles I can offer to the problem. Hopefully someone else
will provide clues for you. Personally, I think you're violating the
KISS principle unless there's a really compelling need to keep your
previous file system states accessible online. Dumping older states
to offline media and reclaiming that space would be my first order
of business, but that's just me. Or just buy some whopping big disks
appropriate to the task, since that's generally cheaper than admin
time to create workarounds (unless you just consider this fun =).
Greg Barniskis, Computer Systems Integrator
South Central Library System (SCLS)
Library Interchange Network (LINK)
<gregb at scls.lib.wi.us>, (608) 266-6348