Backup solution suggestions

Johan Ström johan at stromnet.se
Tue Jan 15 05:00:50 PST 2008


First of all, thanks for your extensive answer!

On Jan 15, 2008, at 13:34 , Jeremy Chadwick wrote:

> On Tue, Jan 15, 2008 at 10:52:56AM +0100, Johan Ström wrote:
>> I'm looking to invest in some new hardware for backup, probably some kind
>> of NAS (a 4-disk 1U NAS or something in that size). The thing is that I
>> won't be the only one with access to this box, thus I would like to secure
>> my data.
>
> In my experience, your best bet when it comes to backups like what you
> want (1U box with 4 disks, or a 2U box with 8 or more) is to simply buy
> a server with the specifications you want, and run FreeBSD on it.  I
> cannot recommend commercial products for something of this "scale" (e.g.
> small/medium).
>
> I could list off all the reasons why [as a small hosting provider] I
> avoid proprietary backup solutions, but the list is quite long.  The
> two main reasons:
>
> 1) Proprietary solutions often use proprietary hardware.  How do you
> know what's inside of that mystery box?  What if it uses a SATA
> controller you know has h/w-level bugs in it?  What if something in the
> device fails; are you going to be charged an arm and a leg for a
> replacement part?  Does it even HAVE user-serviceable parts?  etc...
>
> I feel much more confident relying on hardware that I'm familiar with,
> e.g. I know what motherboard is in the server I buy or build, I know who
> makes it, I know if it's compatible with FreeBSD or Linux, I know the
> SATA controller works and isn't flaky, I know the SATA backplane
> actually works properly and supports hot-swapping, and I know if I need
> replacement parts I can get them promptly.  Also, if the h/w I buy turns
> out to have compatibility problems or performance issues, I can always
> return it, get my money back, and try other h/w; with a proprietary
> solution you're "stuck with it", and if something's broken about it
> which the vendor can't/won't fix, you're screwed.
>
> 2) Proprietary solutions also mean proprietary software.  This is
> pretty much guaranteed regardless of what h/w is used.  What if the
> volume manager used for your array has a bug and your data is
> corrupt?  You have no way of really "knowing" this until it's too late,
> and you only have one person to turn to: the vendor.

All good points there; I cannot argue against that. Certainly something
to think about before making any purchases. The only thing against
that right now is size (we've got "cheap" access to a rack with
limited depth); I haven't really found any good 1U chassis that aren't
too deep. Admittedly I haven't spent very much time looking yet, but.. :)

>
> I prefer to have freedom of choice when it comes to backup methods.
> "Hmm, dump/restore isn't working out very well, so maybe I'll try ZFS,
> or bacula, or tar over NFS, or rsync, or...".
>
>> What I would like is encryption both for the transfer to the box, and
>> encrypted on disk. The data on disk should not be readable by anyone but me
>> (ie the other user(s) of the box should not be able to read it, at least
>> not without a big effort).
>
> I'm curious what the reason is for on-disk encryption?  Is it necessary
> for something *only you* will have access to?  What's the concern here?

I think I wrote that I *won't* be the only one with access to the box.
Sorry if that wasn't clear.

It will be shared with a friend of mine (or rather his company). I do
trust him, but to keep some level of security I don't want him (or
rather, someone with access to his box) to be able to read my files
(and the other way around for his files).

>
>> So, I'm wondering what the best solution might be.. Tar'balling all my
>> stuff and encrypt it with GPG or something and just dump it there with NFS
>> would be the easiest solution, but maybe not the best. I've been thinking
>> about running a GELI image on my box, and store that on the NAS over NFS..
>> would that be doable/secure/stable?
>
> I would recommend avoiding NFS unless the machine you're running
> nfsd/mountd/portmap on has no direct way to talk to the Internet.  It's
> impossible to get NFS-related daemons to bind solely to one IP/interface
> on FreeBSD, which imposes a security risk.  If the machine is behind
> NAT, you're very likely safe (unless the public has some way of
> accessing another machine on that NAT network).  Thus, if you choose to
> go the NFS route, have it on a segregated network.

The box will be on a separate LAN only accessible by our two boxes.
No Internet connectivity. The client boxes of course have Internet
connectivity, but those would only be NFS clients, not servers.

>
> That said -- what we use in our production environment is dump/restore
> over SSH over a dedicated LAN.  I wrote a series of scripts that do
> this, using SSH keys for the SSH portion.  Incrementals are done 6 days
> a week, with fulls done once a week.

I use a similar scheme now, using BackupPC. However, that is to my box
at home, which is not a very good solution due to bandwidth
limitations (5 Mbit/s only). The first copy takes ages, the incremental
ones not as much. It's around 20-30 GB of data currently. The NAS/backup
box would be located on a 100 Mbit/1000 Mbit unmetered link.
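
A rough back-of-the-envelope estimate of what those link speeds mean for a
full copy. This is only a sketch: it assumes decimal units (1 GB = 8000 Mbit),
a fully used link and no protocol overhead, and the function name
transfer_seconds is made up for illustration.

```shell
#!/bin/sh
# transfer_seconds SIZE_GB LINK_MBIT
# Prints the ideal transfer time in whole seconds.
# Assumption: 1 GB = 8000 Mbit, link fully saturated, no overhead.
transfer_seconds() {
    echo $(( $1 * 8000 / $2 ))
}

transfer_seconds 30 5      # 48000 s, about 13.3 hours on the home link
transfer_seconds 30 100    # 2400 s, 40 minutes on a 100 Mbit link
```

Real transfers will be slower than this, but the ratio between the two
links is what matters here.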

>
> Does it work?  Yes.  Have I had to restore from it?  Yes, twice.  Did it
> work OK?  Yes, but was not as simple as "restore the backup to this
> disk, throw the disk in the server, and voila FreeBSD is back up and
> running".  It's more of "replace the disk, install FreeBSD on it,
> configure the box like before, then restore the user data..."
>
> Once all of our systems are running RELENG_7, I plan on utilising ZFS
> heavily.  ZFS offers backup/restore capability, including over a
> network, and it's very fast.  Now if only installing FreeBSD onto ZFS
> was made simple, ditto with booting off of ZFS...
>
> Now, on a personal level -- I do backups at home too.  My home system
> has 4 disks in it -- one for the OS (UFS2), one for backups (UFS2), and
> two for a ZFS RAID-0-like volume.
>
> For the OS disk and filesystems (e.g. / /var /usr /tmp /home), I use
> rsync.  For the ZFS volume, I use ZFS snapshots in an incremental
> fashion (6 days of incrementals, 1 day of full) and do "zfs send
> {volume} > /backup_disk/volume.X" to do the backups.
>
> In case you're wondering about how long they all take and how much data
> is backed up, here's some times of full level 0 backups:
>
> ==> Backing up / to /backups/rootfs/ (method: rsync)
> ==> Start time: Sun Jan 13 02:45:01 PST 2008
> ==> End time:   Sun Jan 13 02:45:01 PST 2008
> ==> Backing up /var to /backups/var/ (method: rsync)
> ==> Start time: Sun Jan 13 02:45:01 PST 2008
> ==> End time:   Sun Jan 13 02:45:06 PST 2008
> ==> Backing up /usr to /backups/usr/ (method: rsync)
> ==> Start time: Sun Jan 13 02:45:06 PST 2008
> ==> End time:   Sun Jan 13 02:46:03 PST 2008
> ==> Backing up /home to /backups/home/ (method: rsync)
> ==> Start time: Sun Jan 13 02:46:03 PST 2008
> ==> End time:   Sun Jan 13 02:46:03 PST 2008
> ==> Backing up storage to /backups/storage.zfs.%%% (method: zfs)
> ==> Start time: Sun Jan 13 02:46:03 PST 2008
> ==> End time:   Sun Jan 13 03:29:33 PST 2008
>
> Filesystem   1024-blocks      Used     Avail Capacity  Mounted on
> /dev/ad8s1a       507630    211410    255610    45%    /
> /dev/ad8s1d      8122126    108502   7363854     1%    /var
> /dev/ad8s1e      4058062       420   3732998     0%    /tmp
> /dev/ad8s1f     32494668   2023282  27871814     7%    /usr
> /dev/ad8s1g    139955812     11640 128747708     0%    /home
> /dev/ad10s1d   473009638 146843210 288325658    34%    /backups
> storage        957526016 124001408 833524608    13%    /storage
>
> And here's what you see on /backups:
>
> total 144005480
> drwxr-xr-x    6 root      wheel              512 16 Oct 10:08 home/
> drwxr-xr-x   24 root      wheel              512 13 Jan 23:49 rootfs/
> -rw-r--r--    1 root      wheel     126996957624 13 Jan 03:29 storage.zfs.0
> -rw-r--r--    1 root      wheel           747136 14 Jan 02:46 storage.zfs.1
> -rw-r--r--    1 root      wheel        541937432 15 Jan 02:45 storage.zfs.2
> -rw-r--r--    1 root      wheel       4408684056  9 Jan 02:46 storage.zfs.3
> -rw-r--r--    1 root      wheel       4716827040 10 Jan 02:47 storage.zfs.4
> -rw-r--r--    1 root      wheel       5362108640 11 Jan 02:47 storage.zfs.5
> -rw-r--r--    1 root      wheel       5362108640 12 Jan 02:47 storage.zfs.6
> drwxr-xr-x   17 root      wheel              512  1 Dec 09:06 usr/
> drwxr-xr-x   23 root      wheel              512  6 Jan 01:36 var/
>
> For the ZFS incremental storage.zfs.2 (541MB of data), the time was very
> quick (9 seconds)
>
> ==> Backing up storage to /backups/storage.zfs.%%% (method: zfs)
> ==> Start time: Tue Jan 15 02:45:26 PST 2008
> ==> End time:   Tue Jan 15 02:45:35 PST 2008
>
> I have dump/restore on UFS2 via ssh times if you want them as well.
> They're not pretty.


ZFS is indeed very nice; I'm running it at home on a not-so-important
server, and I love it! It has been working without a single hiccup
since I started using it (end of November).
We've been thinking of using a FreeBSD machine with ZFS, but the
ZFS-based dump/restore scheme wouldn't help us since the machines being
backed up don't run ZFS (it didn't exist on FreeBSD, or wasn't stable
enough, when those were set up). So relying on ZFS's dump/restore for
the backupee->backup box transfer is, I'm afraid, not an option.
However, the snapshots could of course be usable on the backup box,
i.e. copying the files the first time, creating a snapshot, rsyncing
new versions, taking a new snapshot, doing a new rsync and so on, if
I've understood the snapshots correctly (I haven't played with them
very much yet).
However, this won't work either, or at least probably not very
effectively, since the data should be encrypted and not in plaintext.
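
To make the rsync-plus-snapshot idea concrete, here is a minimal sketch of
one cycle as it might run on the backup box. All names are hypothetical
(source "client1:/home/", dataset "tank/backup/client1"), and it assumes the
dataset is mounted at its default mountpoint, /tank/backup/client1. It runs
in dry-run mode by default and only prints the commands. Note that it stores
the files in plaintext, which is exactly the problem mentioned above.

```shell
#!/bin/sh
# One backup cycle: pull new file versions with rsync, then freeze the
# result in a ZFS snapshot. DRY_RUN=1 (the default) only prints commands.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backup_cycle SRC DATASET STAMP
#   SRC      rsync source, e.g. client1:/home/   (hypothetical)
#   DATASET  ZFS dataset, e.g. tank/backup/client1 (hypothetical)
#   STAMP    snapshot label, e.g. a date
backup_cycle() {
    src=$1
    dataset=$2
    stamp=$3
    # Assumes the dataset is mounted at /DATASET (the ZFS default).
    run rsync -a --delete "$src" "/$dataset/"
    run zfs snapshot "$dataset@$stamp"
}

backup_cycle client1:/home/ tank/backup/client1 2008-01-15
```

Each run leaves the live copy up to date, while older versions stay
reachable through the earlier snapshots.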

>
>> Another idea would be to go with some regular 1U box running some FBSD,
>> doing scp to the box and geli local on the box but that would require me to
>> have the encryption keys on that box (which would be shared so thus no good
>> idea).
>
> I would recommend going this route, at least in regards to the 1U box
> running FreeBSD.  See above comment about GELI.  scp to the box would be
> fine; why does this part worry you?

Well, as explained above, I *won't* be the only one with access to it.

>
>> Any other ideas? Being able to rsync to the backup storage instead of just
>> sending big encrypted tarballs would be very nice (and I guess that would
>> be possible with geli version)
>
> See above, re: why is encryption needed?
>

Above again.


Again, thank you very much for all your time and thoughts, very much
appreciated!

--
Johan


More information about the freebsd-stable mailing list