using git for freebsd development

Andriy Gapon avg at icyb.net.ua
Wed May 14 21:47:54 UTC 2008


I finally found some time and inspiration to write about some things I
did about a month ago. I decided to use GIT for local FreeBSD
development, so I looked for tools to do the initial conversion of the
FreeBSD CVS src repository to GIT (I wanted to get as much history as
possible) and also to do one-way syncing from CVS to GIT for subsequent
updates.
Please note that I just wanted to achieve my goal; I didn't attempt to
do objective comparisons, benchmarking, etc. So whatever performance
data I give below is very imprecise.

The following can be considered as a followup to the excellent 
FreeBSD/GIT wiki page:
http://wiki.freebsd.org/GitConversion

So my first task was to do the initial conversion.
My research showed that the most popular options were:
git-cvsimport, which is a part of the GIT suite
parsecvs http://gitweb.freedesktop.org/?p=users/keithp/parsecvs.git
fromcvs/togit http://www.selenic.com/mercurial/wiki/index.cgi/fromcvs
tailor http://progetti.arstecnica.it/tailor/browser/README.rst

All of the tools either required the source CVS repository to be
available locally or worked much faster in that case, so the first thing
to do was to get src-all from my local cvsup mirror. Easy.

The first tool I tried was git-cvsimport because it comes with git. It
failed. After working for a short while it went into an infinite loop on
some file: it first complained that version X is before Y on branch Z,
then that version Y is before X on the same branch, and so on and on.
Moving that file away didn't help as there were more troublesome ones.
Maybe it had to do with repocopying.

The next one was parsecvs. I think that this is the best one for the
initial import. It worked for about 6 hours on a very old machine: 512MB
RAM, 450 MHz Pentium III. The resulting GIT repo took about 8G of space.
A subsequent git repack took about 12 hours and reduced the size to
~500M. Quite nice.
I should note that during the whole conversion parsecvs did not use more
than 300M of RAM, which makes it by far the most conservative of all the
tools that I tried (and that worked).
There were some warning messages during conversion.
Unfortunately parsecvs does not provide any option to control keyword
handling and it doesn't expand any keywords. There are reasons to prefer
this behavior, but personally I would prefer them to be expanded. I
think that this is something that should be very easy to tweak in the
parsecvs source code.
Also, quite unfortunately, parsecvs can only do a full repository
conversion and doesn't support incremental import.
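
To illustrate what the expanded and unexpanded keyword forms look like,
here is a minimal Python sketch that reduces expanded RCS keywords (e.g.
"$FreeBSD: path,v 1.1 ... $") back to the bare "$FreeBSD$" form. It is
not part of parsecvs or any other tool mentioned here; the function name
and the keyword list are assumptions made for illustration only.

```python
import re

# Keyword names assumed for this sketch: the standard RCS keywords plus
# the FreeBSD-specific one. "$Log$" is deliberately left out because it
# expands to multiple lines.
KEYWORDS = ("FreeBSD", "Id", "Header", "Revision", "Date", "Author",
            "Source", "RCSfile", "State", "Locker", "Name")

# Matches an expanded keyword such as "$Id: foo.c,v 1.2 ... $" on one line
# and captures only the keyword name.
_EXPANDED = re.compile(r"\$(" + "|".join(KEYWORDS) + r"):[^$\n]*\$")


def collapse_keywords(text):
    """Return text with expanded keywords reduced to the bare "$Name$" form."""
    return _EXPANDED.sub(r"$\1$", text)
```

Something like this could be useful when comparing a tree with expanded
keywords against one without them, since the keyword lines would
otherwise show up as differences.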

Because of the above, although I already had a converted git FreeBSD
repo, I decided to give some other tools a try, thinking that maybe
using the same tool for both tasks would be somehow better.
Thus I tried fromcvs/togit. It required me to install a couple of Ruby
packages available via ports and two custom ones: rcsparse and fromcvs.
It was quite easy to set up and run. This time I executed the conversion
on a modern system with an Athlon XP 4800+ dual-core processor and 2GB
of RAM.
And that was needed: fromcvs worked for about two hours and peak memory
usage was around 1.5G.
There were some warning messages during conversion.
Unfortunately, detailed examination showed that there were some issues
with the conversion. Some files that were never changed on some branches
in CVS were missing from the corresponding GIT branches.
What's strange is that when I tried to convert only the sys/
subdirectory everything went very well, no issues. The problem happened
only with the complete src repository.
The author of fromcvs (Simon 'corecode' Schubert) is aware of the issue
and has encountered it himself, so I hope it will be resolved soon.
But so far it's a no go.
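
One way to look for files that went missing on a branch is to compare
the file list of a CVS checkout of that branch with the file list of the
corresponding GIT branch. The Python sketch below is only an
illustration of that idea, not the procedure actually used for this
report; the function names are made up, and it assumes a CVS checkout
directory and a git repository that has a branch of the given name.

```python
import os
import subprocess


def list_checkout(root):
    """Relative paths of all files in a CVS checkout, skipping CVS/ admin dirs."""
    paths = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune the CVS administrative directories in place.
        dirnames[:] = [d for d in dirnames if d != "CVS"]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            paths.add(rel.replace(os.sep, "/"))
    return paths


def list_git_branch(repo, branch):
    """Paths of all files on a git branch, via 'git ls-tree -r --name-only -z'."""
    out = subprocess.run(
        ["git", "ls-tree", "-r", "--name-only", "-z", branch],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    # -z gives NUL-terminated, unquoted paths.
    return set(p for p in out.split("\0") if p)


def missing_in_git(cvs_paths, git_paths):
    """Sorted list of paths present in the CVS checkout but absent from git."""
    return sorted(set(cvs_paths) - set(git_paths))
```

For example, missing_in_git(list_checkout(checkout_dir),
list_git_branch(repo_dir, "RELENG_7")) would list the candidates to
inspect, where checkout_dir and repo_dir are placeholders for local
paths.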

Then I decided to try tailor. I must admit that I had some difficulties
understanding its documentation and that's probably the cause of what
happened next.
I gave tailor what I thought were good options and it generated its
config file. Then I ran it with that config. It worked for about two
hours and was the most resource hungry of everything I tried, going into
swap on the mentioned 2GB machine. Then it produced an error that looked
like a complaint about a configuration problem, and I gave up.

Summary: only parsecvs worked well enough for me.

Part two, doing incremental updates.
I updated my copy of FreeBSD CVS repository with cvsup and proceeded.
BTW, csup supported only checkout mode, so it could not be used instead 
of cvsup.

By tradition I tried git-cvsimport first. It went into an infinite loop
again (maybe not infinite, but too long for me). This time it didn't
produce any errors; it just consumed 100% CPU and didn't make any system
calls at all (ktrace as witness). I waited for about 3 hours (on the
modern machine).

parsecvs, as I said before, doesn't do incremental imports.

As for tailor, I had already given up on it.

So I finally tried fromcvs and it worked, and it worked fast and it
worked well. At least, so far I do not see any issues in the incremental
updates that it performs.

So my conclusion is that at this time parsecvs is the best tool for the
initial import and fromcvs is the best tool for incremental imports.
One small quirk is that parsecvs imported keywords unexpanded, but
fromcvs expands them in incremental updates.
Another small quirk: a couple of commit messages in CVS contain extended
Latin characters from ISO8859-1. It seems that parsecvs copied them as
is to the GIT log history; I think they should have been converted to
UTF-8. E.g. "Hörnquist Åstrand" in the history of Makefile.inc1.
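
For reference, the conversion meant here amounts to the following small
Python sketch. It is not something parsecvs does or offers; the function
name is made up, and treating every message that is not valid UTF-8 as
ISO8859-1 is an assumption that happens to fit this repository.

```python
def to_utf8(raw):
    """Re-encode the bytes of a commit message as UTF-8.

    Bytes that already decode as UTF-8 are returned unchanged; anything
    else is assumed (for this sketch) to be ISO8859-1.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1").encode("utf-8")
```

Plain ASCII messages pass through untouched, and running the function
twice on the same message gives the same result as running it once.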

As a concluding word: I decided to clone the converted repository and to
create my topic, "integration" and other branches in the clone. Some of
the branches are tracking branches, so that it is easy, for example, to
synchronize my RELENG_7-specific changes with what is going on in CVS.
So I first get CVS updates with cvsup.
Then I do a fromcvs incremental import into the "pristine" GIT
repository.
And then I do 'git pull' in the working GIT repo.
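
The three steps above could be chained by a small driver that stops at
the first failure, so that a broken cvsup run never reaches the import
or the pull. The Python sketch below only illustrates that idea. The
paths and the import wrapper script in EXAMPLE_STEPS are hypothetical
placeholders (the exact fromcvs command line is not reproduced here),
and nothing in the sketch runs them by itself.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (label, argv, cwd) steps in order and stop at the first failure.

    Returns the list of labels of the steps that completed successfully.
    """
    done = []
    for label, argv, cwd in steps:
        result = subprocess.run(argv, cwd=cwd)
        if result.returncode != 0:
            print("step %r failed with exit code %d"
                  % (label, result.returncode), file=sys.stderr)
            break
        done.append(label)
    return done


# Hypothetical step list; every path and the wrapper script are placeholders.
EXAMPLE_STEPS = [
    ("cvsup", ["cvsup", "/path/to/supfile"], None),
    ("import", ["/path/to/fromcvs-import-wrapper.sh"], None),
    ("pull", ["git", "pull"], "/path/to/working-repo"),
]
```

Calling run_steps(EXAMPLE_STEPS) after replacing the placeholders would
perform one full sync; the returned list shows how far it got.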


Hope that this will be interesting and/or useful to the community.

-- 
Andriy Gapon

