Storing revisions of large files using ZFS snapshots

Jeremy Chadwick freebsd at jdc.parodius.com
Wed Jun 1 00:52:23 UTC 2011


On Tue, May 31, 2011 at 10:42:37PM +0200, Per von Zweigbergk wrote:
> I'm currently looking at the option of using a FreeBSD server using ZFS to store offsite backups.
> 
> The primary backup product used (Veeam Backup & Replication) stores its backups in what's called reverse-incremental mode. Basically, this means storing backups as a huge VBK file (one for each job) containing a deduplicated and compressed dump of all the virtual machine files being backed up. The system will also store what are known as "reverse incrementals", i.e. anything it overwrites on a backup pass will be preserved in a file, similar to a traditional incremental backup, except in the other direction.
> 
> Since this product does not have any real solution for offsite backup replication, after weighing a few different options I'm seriously considering using a combination of ZFS snapshots and rsync.
> 
> Basically, every night after the backup completes, rsync would be run, synchronizing the differences between the new synthetic full backup and the previous day's copy. Historic copies of the full backup images as synchronized by rsync would be kept using ZFS snapshots. After our retention window closes, I'd just nuke the oldest snapshots from the server.
>
> [snip]
>
> Second, are there any other caveats that I'm likely to run into as I go down this path for storing backups?
> 
> Obviously, I'd prefer just trucking over plain old incremental backups, and doing a consolidation job off-site, but the backup software doesn't have any image management software that could consolidate a full backup plus its incrementals into a synthetic full backup. It'll only do it as part of a backup job. Grmbl. But then I wouldn't get to play with the idea of actually storing full backup images for every restore point using filesystem level snapshots. :)

Speaking strictly about rsync (in any situation):

Please be aware that rsync will destroy your atimes (on both the source
and the destination).  The --atimes patch doesn't fix this problem
either; it copies the atimes from the source to the destination, but
still destroys the atimes on the source, since rsync has to read every
source file to copy it, and reading a file updates its atime.

Why this matters: if you have any software that relies on atimes -- the
big one being classic UNIX mail spools, where atime is used for new-mail
detection -- your atimes will be lost, and your shell users who use
mailx/elm/pine/mutt/alpine will start complaining that they had new mail
but weren't told about it.
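
If you want to see the mechanism for yourself: a mail reader decides
"new mail has arrived" by checking whether the spool's mtime is newer
than its atime.  On FreeBSD, stat(1) will show you both (the path below
is just the conventional spool location):

$ stat -f 'atime=%a mtime=%m' /var/mail/$USER

Once rsync reads the spool, its atime jumps ahead of its mtime, and the
"you have new mail" indication is gone until the next delivery.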

Just something to keep in mind.  We live with this problem in our
production infrastructure regardless.
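
One more rsync note specific to your plan: since you'll be rsyncing one
huge VBK file into a filesystem you then snapshot, look hard at rsync's
--inplace flag.  By default, rsync writes a complete temporary copy of
the file and renames it over the old one, so every ZFS snapshot ends up
pinning an entire extra copy of the VBK; with --inplace, rsync updates
the changed blocks within the existing file, and successive snapshots
differ only by those blocks.  A sketch (host and paths invented):

$ rsync -a --inplace --stats \
      backuphost:/veeam/job1.vbk /tank/backups/job1/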

Speaking strictly about ZFS snapshots:

The mentality of ZFS snapshots is very similar to that of UFS
snapshots: the design/model seems to be oriented towards "bare-metal"
restoration.

That generally works okay for administrators (opinions vary), but
depending on your demographic, it almost certainly won't work for end
users.
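
To illustrate the "bare-metal" orientation: the native ZFS replication
workflow operates on whole datasets -- snapshot, send a full stream
once, then send incrementals between successive snapshots.  A sketch
(pool and dataset names invented):

# zfs snapshot tank/backups@2011-05-30
# zfs send tank/backups@2011-05-30 | ssh offsite zfs recv dr/backups
# zfs snapshot tank/backups@2011-05-31
# zfs send -i @2011-05-30 tank/backups@2011-05-31 | \
      ssh offsite zfs recv dr/backups

That's great for rebuilding an entire filesystem somewhere else; it is
not what a user reaches for when they want one file back.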

We've found that in most cases, a user will overwrite or rm a file they
didn't mean to, and will want to restore just that file -- quickly and
easily.  Neither ZFS nor UFS snapshots make this easy for them (though
ZFS is absolutely easier than UFS in this regard).
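
(To be fair, the hidden .zfs/snapshot directory is what makes ZFS the
easier of the two: set snapdir=visible on the dataset, or just cd into
the hidden path, and each snapshot appears as a read-only copy of the
tree.  Names below are invented:

$ cd /home/.zfs/snapshot/daily-2011-05-30/myaccount/public_html
$ cp mywebdocument.html ~/public_html

Even so, users find a per-dataset .zfs directory harder to discover
than a single /backups mount like the one below.)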

In our case, we use rsnapshot (which relies on rsync) to perform
backups of systems on a dedicated network, and each system NFS-mounts
the backup server (the mount is read-only).  Users totally understand
this:

$ cd /backups
$ ls -l
total 60
drwxr-xr-x   16 root      wheel     22 May 31 03:59 daily.0/
drwxr-xr-x   16 root      wheel     22 May 30 04:02 daily.1/
drwxr-xr-x   16 root      wheel     22 May 21 03:57 daily.10/
drwxr-xr-x   16 root      wheel     22 May 20 03:59 daily.11/
drwxr-xr-x   16 root      wheel     22 May 19 03:58 daily.12/
drwxr-xr-x   16 root      wheel     22 May 18 03:59 daily.13/
drwxr-xr-x   16 root      wheel     22 May 17 03:57 daily.14/
drwxr-xr-x   16 root      wheel     22 May 16 03:59 daily.15/
drwxr-xr-x   16 root      wheel     22 May 15 03:58 daily.16/
drwxr-xr-x   16 root      wheel     22 May 14 03:58 daily.17/
drwxr-xr-x   16 root      wheel     22 May 13 03:57 daily.18/
drwxr-xr-x   16 root      wheel     22 May 12 03:57 daily.19/
drwxr-xr-x   16 root      wheel     22 May 29 03:58 daily.2/
drwxr-xr-x   16 root      wheel     22 May 11 03:59 daily.20/
drwxr-xr-x   16 root      wheel     22 May 10 03:59 daily.21/
drwxr-xr-x   16 root      wheel     22 May  9 04:02 daily.22/
drwxr-xr-x   16 root      wheel     22 May  8 03:58 daily.23/
drwxr-xr-x   16 root      wheel     22 May  7 03:56 daily.24/
drwxr-xr-x   16 root      wheel     22 May  6 03:58 daily.25/
drwxr-xr-x   16 root      wheel     22 May  5 03:59 daily.26/
drwxr-xr-x   16 root      wheel     22 May  4 03:56 daily.27/
drwxr-xr-x   16 root      wheel     22 May  3 04:00 daily.28/
drwxr-xr-x   16 root      wheel     22 May  2 04:02 daily.29/
drwxr-xr-x   16 root      wheel     22 May 28 03:58 daily.3/
drwxr-xr-x   16 root      wheel     22 May 27 03:59 daily.4/
drwxr-xr-x   16 root      wheel     22 May 26 03:58 daily.5/
drwxr-xr-x   16 root      wheel     22 May 25 03:59 daily.6/
drwxr-xr-x   16 root      wheel     22 May 24 03:58 daily.7/
drwxr-xr-x   16 root      wheel     22 May 23 04:01 daily.8/
drwxr-xr-x   16 root      wheel     22 May 22 03:57 daily.9/
$ cd daily.9
$ cd home/myaccount/public_html
$ cp mywebdocument.html ~/public_html

I spent almost 2 months, off and on, trying to explain to one of our
customers how to use "restore -i" effectively.  The above method is
simple and "makes sense", and said customer has used it many, MANY
times -- without my assistance.
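
For the curious, the relevant bits of the rsnapshot.conf behind the
layout above look roughly like this (hostnames and paths invented;
note that rsnapshot demands tabs, not spaces, between fields):

snapshot_root   /backups/
interval        daily   30
backup          root@somehost:/home/    somehost/
backup          root@somehost:/etc/     somehost/

rsnapshot rotates daily.29 out each night and hard-links files that
haven't changed between daily.0 and daily.1, which is why 30 days of
full trees don't cost 30x the disk.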

I would also advocate that you review the freebsd-fs and freebsd-stable
archives since, say, January of this year, looking for problem reports
or issues with ZFS snapshots or zfs send/recv.  This is not FUD or an
attempt to spread misinformation -- every incident/situation is
different, yes -- but it will give you an idea of what you *could*
encounter, so you can weigh the pros and cons.

Finally, if you plan on using ZFS at all, please run RELENG_8 rather
than sticking with the present -RELEASE branch.  There are lots of
fixes in RELENG_8 that won't appear in a release until 8.3-RELEASE.
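
If you aren't already tracking RELENG_8, the stock supfile approach
works; substitute a real cvsup mirror for the placeholder host, then
follow the usual rebuild procedure from the Handbook:

*default host=CHANGE_THIS.FreeBSD.org
*default base=/var/db
*default prefix=/usr
*default release=cvs tag=RELENG_8
*default delete use-rel-suffix compress
src-all

# csup /root/stable-supfile
# cd /usr/src
# make buildworld && make kernel
(reboot, then installworld and mergemaster, per the Handbook)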

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |


