continuous backup solution for FreeBSD

Matthew Dillon dillon at apollo.backplane.com
Fri Oct 10 01:14:23 UTC 2008


:The fact that I still would need to take full backups once in a while
:if I do this and Linux users do not have to because the CDP software
:on Linux does not need to do this. The software expires the old data
:automatically and you only need a full backup at first run only.

    You have to do this anyway.  Nobody in their right mind trusts a
    single storage solution for all of their backups.  CDP is a brute-force
    block-level continuous backup mechanism and while it works, to a point,
    it also has all the drawbacks that ANY block level backup system has...
    it is discontinuous from the high level filesystem structure and while
    you can pretty much guarantee that it is possible to recover from a
    disaster eventually, the filesystem you choose to run on top of that
    block device still matters a lot.  On top of that, being a client-server
    solution, CDP is going to have some significant bottlenecks too.

    Even more telling is the fact that block level storage solutions tend
    to migrate corruption instead of detecting it early.  Big oopses can wind
    up stealthily winding their way through all your backups when you use
    a block-level solution.  I stopped using block-level solutions almost 10
    years ago... that's how little I trust them.

    Only a 'modern' Linux filesystem (ext3 or the upcoming ext4, reiser,
    xfs, and a few others), or something like ZFS or HAMMER, really has
    the ability to reliably recover from a point-in-time block level
    snapshot.  Filesystems such as ZFS and HAMMER also give you insanely
    good snapshotting solutions that are far, FAR more flexible than what
    CDP gives you.  You can upgrade between EXT filesystems without having
    to copy, but if you decided the best filesystem for you was one of the
    other many Linux filesystems, such as Reiser4 or XFS, and you were
    running EXT3, then you would have to copy.  There are massive lists of
    pluses and minuses to each of the Linux filesystem choices.

    Data expiration is a non-issue.  You have to think about it either way.
    You have to test that the backup system actually works.  You have to
    carefully control the backup policy and in particular not allow heavy
    disk I/O (such as a hacker DD'ing to your filesystem for 24 hours)
    to blow out your ability to recover the system.  It requires time and
    effort no matter how well automated it is.
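
    A minimal sketch of what "thinking about expiration" can look like,
    assuming snapshots are identified by their creation time.  The function
    name, the 7-day and 60-day windows and the keep-one-per-day rule are
    all hypothetical choices for illustration, not the behaviour of CDP,
    ZFS or HAMMER.  Retention here depends on snapshot age only, not on
    how much data was written.

```python
from datetime import datetime, timedelta


def snapshots_to_expire(snapshots, now,
                        keep_all_for=timedelta(days=7),
                        keep_daily_for=timedelta(days=60)):
    """Return the snapshot timestamps that may be deleted.

    Hypothetical policy: keep every snapshot newer than keep_all_for,
    keep only the newest snapshot of each calendar day up to
    keep_daily_for, and expire everything else.
    """
    expire = []
    kept_days = set()
    # Walk newest-first so the newest snapshot of each day is the one kept.
    for ts in sorted(snapshots, reverse=True):
        age = now - ts
        if age <= keep_all_for:
            continue
        if age <= keep_daily_for and ts.date() not in kept_days:
            kept_days.add(ts.date())
            continue
        expire.append(ts)
    return sorted(expire)
```

    Whatever policy is chosen, it still has to be exercised against real
    restores; the sketch only decides what is eligible for deletion.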

:The bigger problem is that I have to convert all my filesystems to
:ZFS. Can one convert UFS2 to ZFS easily even? I dont wanna spend a
:large part of the year doing such job while Linux users can just do
:this on 'any' filesystem they use. How am I suppose to compete with
:companies which use Linux otherwise if I am doing this sort of tasks
:all the time?

    I converted our main developer machine from UFS to HAMMER in about
    12 hours.  99% of that time was simply waiting for the cpdup to finish
    copying a few hundred gigabytes from the old filesystem to the new.  

    I think we're talking a few days here... time enough to learn how the
    filesystem works and play with it, make a few test copies (as you would
    with ANY new system that you did not have previous experience with),
    and then do it for real.
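
    One way to check a test copy before doing it for real is to compare
    file contents between the old and new trees.  The sketch below is an
    assumption-laden illustration, not what cpdup does: the function names
    are hypothetical, it only looks at regular files, and it ignores
    ownership, permissions and timestamps.

```python
import hashlib
import os


def tree_digests(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # this sketch only checks regular files
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Read in 1 MB chunks so large files do not fill memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests


def compare_trees(old_root, new_root):
    """Return (missing, extra, changed) relative paths, each sorted."""
    old, new = tree_digests(old_root), tree_digests(new_root)
    missing = sorted(set(old) - set(new))   # in old tree only
    extra = sorted(set(new) - set(old))     # in new tree only
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return missing, extra, changed
```

    Three empty lists from compare_trees(old, new) mean every regular file
    in the old tree has an identical counterpart in the new one; anything
    else is a reason to keep the old data around a while longer.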

    Linux users cannot just do this with a flip of a switch either.
    The new filesystem has to be constructed and the old data copied over.
    The old data set has to be retained, you cannot convert in-place, it's
    just too dangerous.  It takes a certain amount of time to copy the
    data no matter what OS you are running, based on the amount of data
    you have.  Some filesystem transitions such as going from ext2 to
    ext3 or ext3 to ext4 (if I remember right) are forwards compatible
    and do not require copying, but that sort of transition is NOT to a
    new filesystem, it's to a newer version of the same filesystem.  With
    storage capacities increasing so quickly and mandating the replacement
    of whole disk subsystems (for running and electricity costs more
    than anything else, plus more convenience and less maintenance) it
    is a small convenience at best if you are going to copy the data anyway.

    Frankly, even if I were upgrading from ext2 to ext3 the last thing I
    would do is run it in-place.  There's just too high a chance of software
    bugs creeping in.  I would want a fresh ext3 and I would want my old 
    ext2 data sitting on a shelf for a few days in case something were
    to go terribly wrong.  I'd copy there too, just for safety.  What is
    a few extra hours compared to blowing up the life-blood of your company?

:I am not really looking for alternatives because there is none. You
:cant just expect commercial companies to convert to a new filesystem
:to add a feature which other OSes manage without going to such
:measures. Can you imagine the monetary cost if all FreeBSD users had
:to convert to ZFS (or another filesystem) to take near cdp level
:backups? This simply would make people think 'I wish I used Linux from
:the beginning'.
:
:Thanks,
:Evren

    Commercial companies like to farm things out and they certainly like
    turnkey solutions.  There are many more Linux companies than BSD
    companies which offer turnkey solutions for various narrow bands of
    problems.  I'm not knocking it, it's how the world works.  Linux has
    a great deal of momentum despite the fracturing of the distributions.
    Linux has many other things going for it including a far better package
    updating scheme than any of the BSDs, but there are also downsides such
    as that huge ssh public key mess that was basically one programmer
    committing a change to a piece of code he was clueless about.

    The BSDs don't have the vendor support that Linux has.  There are many
    reasons for this, and it's really too bad that it turned out that way,
    but if you are a sysop in a company that doesn't have infinite $$ to
    spend, and you can't afford the high costs of turnkey solutions, then
    rolling your own on the core platforms (Linux or BSD) requires only having
    a good head on your shoulders to turn into a success.  It will take
    about the same amount of time either way.  You might be pleasantly 
    surprised!  I know a lot of people who tear their hair out on Linux
    (but unfortunately the same people mostly think BSD's upsides don't
    make up for its downsides, sigh).

    But be careful not to confuse turnkey solutions with flexibility.
    Turnkey solutions, things like CDP, are narrowly focused
    and tend to be very inflexible when it comes to integration beyond
    their basic design.  If something breaks in the middle of that black
    box you will be in 'Houston, we have a problem!' mode and it won't be
    fixed quickly.  These sorts of companies expect to be paid, most
    of them with ongoing support contracts.  Turnkey solutions are not
    going to be inexpensive.

    Even for open-source software, if you intend to manage both sides of
    the turnkey equation yourself you are essentially going to be devoting
    the same amount of time to it as someone rolling their own solution
    based on lower level (but still substantial) building blocks, such
    as the native features of ZFS.

    I can give you nightmare stories about turnkey backup solutions which
    either fail unexpectedly and f*ck up a company, or worse:  become obsolete
    while the programmers have so little visibility into the turnkey 'solution'
    that they are unable to upgrade it.  Lots of people have the latter problem.
    Turnkey can be good if carefully managed, and a nightmare if not.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>


More information about the freebsd-hackers mailing list