Curious about SCM choice

Sat Jun 28 08:05:47 UTC 2008

On Sat, 28 Jun 2008, Henrik Brix Andersen wrote:

> On Fri, Jun 27, 2008 at 06:26:54PM -0700, Milo Hyson wrote:
>> Can not the "decentralized" systems like Mercurial and GIT be used in a 
>> centralized fashion? Our internal experiments certainly show them to be 
>> every bit as capable as Subversion in this regard. Has your experience been 
>> different?
>
> They _can_ be used in centralized fashion, but they do not enforce it. 
> Subversion enforces a centralized development model.

Well, tools like svk bring the benefits of decentralized development to svn, 
and many decentralized tools can easily piggy-back on the changeset stream for 
svn, so I'm not sure I'd use the word "enforce".  I think the driving factors 
really come down to some more practical concerns:

- The migration to svn from cvs is an incremental step, which means that the
   conversion is (was) less likely to seriously interrupt development.

- The import and export from cvs is well-understood and well-honed; Peter
   still had to do a month of work to get it to import, and that helps capture
   what a massive task even that was.  Export is really important too, as we
   will continue to support cvsup of a consistently structured repository for
   the foreseeable future using CVS and cvsup.

- Certain key features are precluded *by design* in many of the decentralized
   tools.  In particular, the use of cryptographic hashes over changesets to
   construct version identifiers means that the tools would have to be rather
   seriously modified to support obliteration, and we consider obliteration a
   functional requirement.  It's not as easy in svn as in cvs, but it is
   possible, and that is key.  We'd love to see the DVCS systems that currently
   prevent obliteration grow some sort of support for it, and I can't imagine
   that we are alone in this requirement.

- We make extensive use of $FreeBSD$ version identifiers when debugging user
   problems, and cryptographic hash values used by some systems for version
   numbers don't meet our requirements.  With hash-based identifiers, for
   example, access to a repository is required to answer the question "Do you
   have foo.h revision X or greater".

- There was a strong desire to support partial check-outs of our tree without
   subdividing the tree, which would lose the benefit of atomic changesets, not
   to mention our generally unified approach to build and revision control.
   Interestingly, many of the DVCS tools don't allow you to say "just check out
   the kernel" without organizing repositories around "the kernel" as an
   administrative boundary.  There's a pretty long thread discussing whether
   git could be used by the KDE folks for this reason, who also have a very
   large repository.  The conclusion appeared to be that they'd need to break
   it into many smaller repositories.  There are other similar scalability
   concerns with several other pieces of DVCS software.
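
To make the two hash-related points above concrete, here is a minimal Python
sketch.  It is an illustration under stated assumptions, not how any
particular tool works: chain_ids() and at_least() are hypothetical helpers,
the changeset strings and revision numbers are made up, and real DVCS tools
hash far more than this (trees, metadata, and so on).  The only property
modelled is that each identifier covers its parent's identifier.

```python
import hashlib


def chain_ids(changes):
    """Return one identifier per change (hypothetical helper).

    Each identifier is a SHA-1 over the parent's identifier plus the
    change text, so it depends on everything that came before it.
    """
    ids = []
    parent = ""
    for change in changes:
        data = (parent + "\n" + change).encode("utf-8")
        digest = hashlib.sha1(data).hexdigest()
        ids.append(digest)
        parent = digest
    return ids


def at_least(have, want):
    """Sequential revision numbers can be ordered by plain comparison."""
    return have >= want


# Made-up history with one change we later want to obliterate.
history = ["add foo.h", "import tainted file", "fix foo.h"]
before = chain_ids(history)

# Obliterate the second change and recompute the identifiers.
after = chain_ids([history[0], history[2]])

# The change before the removed one keeps its identifier ...
ids_stable = before[0] == after[0]
# ... but the change after it is renamed, although its content is the same.
descendant_renamed = before[2] != after[1]

print(ids_stable, descendant_renamed)

# With sequential numbers, "revision X or greater" needs no repository.
print(at_least(180432, 180000))

# Two hash identifiers, by contrast, carry no ordering on their own:
# deciding which came first means walking the history that contains them.
```

In this toy model, removing one changeset changes the identifier of every
later changeset, which is why obliteration would need serious changes to
such tools, and why ordering two hash identifiers requires the history
itself rather than the identifiers alone.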

I think it's also worth pointing out that the SVN and SVK folk have been 
incredibly helpful during the migration; and, of course, it shouldn't be 
ignored that SVN provides many of the features we *do* require, especially 
with 1.5, and does them efficiently and well.  SVN also makes it easier for 
developers or third parties who *do* want to use a DVCS to do so: instead of 
exporting CVS meta-data, we're now exporting a consistent changeset stream 
with merging meta-data, etc., which is a much better thing to import into a 
DVCS.

As the transition was a complex and very long-running project (and still is 
running!), there's a lot more to it than can be easily captured in a bullet 
list, and I'm certain I've missed at least a few interesting points.  Perhaps 
Peter will chime in and mention some more :-).

Robert N M Watson
Computer Laboratory
University of Cambridge