FreeBSD UFS2 snapshots, and math ...

Oliver Fromme olli at lurza.secnetix.de
Fri Oct 21 10:18:48 PDT 2005


user <user at dhp.com> wrote:
 > Let's say I have a filesystem, and on that filesystem I create a snapshot
 > every single night, and every night I delete the snapshot from 5 nights
 > ago.  This means that at all times, I have four snapshots running on that
 > filesystem, one from 1 day ago, one from 2 days ago, one from 3 days ago,
 > and one from 4 days ago.
 > 
 > Let's also assume that the percent change of the filesystem is 5% (every
 > day 5% of the blocks in the filesystem are either changed or deleted).
 > 
 > Does this mean that if that 5% change is a different 5% every day, that
 > the one day ago snapshot will be size 5%_of_filesystem, and that the 2 day
 > ago snapshot will be size 10%_of_filesystem, day 3 15% and day 4 20%, for
 > a total of 50% of the total filesystem taken up with snapshot data ?

No.  Each day's changes should cost about 5% of the filesystem,
and they cost it only once.  Only the data that is modified
requires new space, no matter how many snapshots exist.

In other words:  If you have five snapshots, and you modify
a file, should the original content of the file be copied to
every snapshot, i.e. five times?  That would be terribly
inefficient.  That's _not_ how snapshots work.  Instead,
when you modify the file, the new data will be written to
new disk blocks, and the blocks containing the original data
are assigned to the snapshots.  There is no copying involved,
and the data exists only once on the disk.

Therefore, when you change 5%, the size requirement for the
snapshots grows by 5%, no matter how many snapshots you have
and how old they are.
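
To make the arithmetic concrete, here is a small Python sketch of
the model described above.  It is a toy, not the UFS2 implementation:
the function name, the per-block version counter and the 1000-block
filesystem are all assumptions made for illustration.  It compares
the space needed when preserved blocks are shared between snapshots
with the sum you get when every snapshot is charged separately.

```python
def snapshot_overhead(total_blocks, changed_per_day, days, keep, same_blocks):
    """Toy model: return (shared, naive) counts of preserved old blocks.

    shared = distinct old block versions held by any live snapshot
    naive  = the same blocks counted once per snapshot that refers to them
    """
    current = [0] * total_blocks      # version number of each live block
    snapshots = []                    # each snapshot is a frozen copy of `current`
    for day in range(days):
        # Nightly: take a snapshot, keep only the newest `keep` of them.
        snapshots.append(list(current))
        snapshots = snapshots[-keep:]
        # Following day: rewrite either the same blocks or a fresh range.
        start = 0 if same_blocks else (day * changed_per_day) % total_blocks
        for i in range(start, start + changed_per_day):
            current[i % total_blocks] += 1
    # For every snapshot, the (block, version) pairs that differ from live data.
    per_snapshot = [
        {(i, v) for i, v in enumerate(snap) if v != current[i]}
        for snap in snapshots
    ]
    shared = len(set().union(*per_snapshot))
    naive = sum(len(s) for s in per_snapshot)
    return shared, naive


# 1000 blocks, 5% changed per day, four snapshots kept.
print(snapshot_overhead(1000, 50, 10, 4, same_blocks=False))  # (200, 500)
print(snapshot_overhead(1000, 50, 10, 4, same_blocks=True))   # (200, 200)
```

With a different 5% every day, the per-snapshot sum is 500 blocks
(the 50% from the question), but only 200 distinct blocks (20%) are
actually held.  With the same 5% every day, both figures are 200.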

 > If the 5% data changed per day is the _same_ 5% every day (perhaps
 > changing the same table in a DB every day,

That depends on the DB.  For PostgreSQL the WAL files will
almost always occupy new (different) disk space, even if
you only modify the same table over and over again.  That's
a feature, not a bug.  ;-)

 > or perhaps changing the same
 > block of lines in a text file every day)

That depends on the editor.  Some editors write a completely
new file and rename(2) it over the original one.
You can check for this case by watching the inode number of
the file ("ls -li").  If it has changed after editing, then the
editor wrote a new file, which occupies different blocks
on the filesystem.
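
The same check can be scripted.  The following Python sketch imitates
both editor behaviours in a temporary directory and compares the
inode number before and after; the function name and the file names
are made up for the example.

```python
import os
import tempfile


def inode_after_edit(replace_file):
    """Return (inode_before, inode_after) for a simulated edit."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "notes.txt")
        with open(path, "w") as f:
            f.write("old\n")
        before = os.stat(path).st_ino
        if replace_file:
            # Editor style 1: write a new file, then rename it over the old one.
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                f.write("new\n")
            os.replace(tmp, path)
        else:
            # Editor style 2: overwrite the existing file in place.
            with open(path, "r+") as f:
                f.write("new\n")
        after = os.stat(path).st_ino
        return before, after


before, after = inode_after_edit(replace_file=True)
print("rename:   inode changed?", before != after)
before, after = inode_after_edit(replace_file=False)
print("in place: inode changed?", before != after)
```

A changed inode only tells you that a new file was written.  It says
nothing about how the filesystem allocates blocks for an in-place
write.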

 > does that mean that every day
 > simply represents 5%_of_filesystem, for a total of 20% of the total
 > filesystem in use at all times for snapshot data ?

That should happen in all cases, no matter what data you
modify, whether it's the same as the previous night or not.

 > Finally, are there any snapshot diag tools at all ?  Like, something that
 > reports snapshot sizes

Well, it's not easy to define "snapshot size".  Of course
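it has a virtual size, which is what "ls -l" shows (a snapshot
file appears as large as the filesystem itself), and a physical
size, the blocks actually allocated to it, which du(1) or
"ls -s" shows.  But the data of the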
snapshots consists of regular filesystem blocks which have
not been modified yet, and blocks of original content that
has been modified -- but these might be shared between
multiple snapshots, so how would you account for them?

 > percent of disk used for snapshots,

Well, that should be easy to calculate from df(1).  I think
there's already a tool in the ports collection which does
that.
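
As a rough sketch of such a calculation in Python: add up the blocks
allocated to the snapshot files and divide by the filesystem size,
the same total that df(1) prints.  The function name and the paths in
the usage comment are hypothetical, and the result assumes that the
block count of each snapshot file is a fair measure of what it
occupies, which the sharing question above makes debatable.

```python
import os


def snapshot_percent(mount_point, snapshot_paths):
    """Percentage of the filesystem allocated to the given snapshot files."""
    vfs = os.statvfs(mount_point)
    total_bytes = vfs.f_frsize * vfs.f_blocks   # filesystem size, as in df(1)
    if total_bytes == 0:
        return 0.0
    # st_blocks counts 512-byte units actually allocated to the file.
    used_bytes = sum(os.stat(p).st_blocks * 512 for p in snapshot_paths)
    return 100.0 * used_bytes / total_bytes


# Hypothetical usage; the paths are placeholders, not real snapshot names:
# print(snapshot_percent("/usr", ["/usr/.snap/nightly.1", "/usr/.snap/nightly.2"]))
```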

 > and maybe even
 > a way for me to actually calculate what the percent change for time period
 > X is for a particular filesystem ?

Basically, it depends only on the amount of data that
you modify.  When you modify or delete <X> blocks that
have not been modified since the last snapshot was
created, the space requirement of the snapshot data will
grow by <X> blocks.  The number of snapshots in existence
does not matter.
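
That rule can be written down as a small Python sketch.  The class
and its method names are invented for the example, and it models only
the accounting described above, not what the kernel does: a block
costs snapshot space the first time it is touched after the most
recent snapshot, and nothing on later writes.

```python
class SnapshotSpace:
    """Toy accounting of blocks preserved on behalf of snapshots."""

    def __init__(self):
        self.have_snapshot = False
        self.touched = set()   # blocks already modified since the last snapshot
        self.preserved = 0     # old blocks kept for snapshots so far

    def take_snapshot(self):
        self.have_snapshot = True
        self.touched.clear()

    def modify(self, blocks):
        """Modify or delete blocks; return the growth in snapshot space."""
        if not self.have_snapshot:
            return 0           # nothing to preserve without a snapshot
        fresh = set(blocks) - self.touched
        self.touched |= fresh
        self.preserved += len(fresh)
        return len(fresh)


space = SnapshotSpace()
space.take_snapshot()
print(space.modify(range(0, 100)))    # 100 blocks touched for the first time
print(space.modify(range(50, 150)))   # 50, the other 50 were already touched
space.take_snapshot()
print(space.modify(range(0, 10)))     # 10, counted again after a new snapshot
```

To estimate the percent change for a period, divide the growth over
that period by the number of blocks in the filesystem.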

Best regards
   Oliver

-- 
Oliver Fromme,  secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

"When your hammer is C++, everything begins to look like a thumb."
        -- Steve Haflich, in comp.lang.c++

