distributed scm+freebsd svn?

Giorgos Keramidas keramida at freebsd.org
Mon Aug 3 01:56:39 UTC 2009


On Sun, 26 Jul 2009 16:15:34 -0700, Alfred Perlstein <alfred at freebsd.org> wrote:
> Hello hackers,
>
> Does anyone here use one of the distributed SCMs to manage
> contributions to FreeBSD in an easy manner?

Hi Alfred,
Yes, I do that.

> Any pointers to a setup you have?
>
> I thought "git" was supposed to make this easy, but going over the
> docs leaves me with a lot of questions.

Git is a wonderful system but its "UI" and documentation often make me
want to scream bad things.  My own suggestion is to go with Mercurial,
because its command set looks a *lot* like CVS or Subversion, it's often
as fast or even faster than Git, and it doesn't seem as 'confusing' as Git.

More details below...

> I'm hoping to be able to basically:
>   sync into my "distributed repo".
>   allow a third party access to it.
>   easily commit upstream back into svn from a branch
>     in my distributed scm.

I use a local Mercurial repository for my own patches.  It seems to
support most of the things I want to do, i.e.:

  * Keep a clean `/hg/bsd/head' workspace and pull full changesets into
    that from our svn repository

  * Support incremental updates of `/hg/bsd/head'.

  * Easily clone my `/hg/bsd/head' to one or more `feature' branches.

  * Allow others to pull from `head' as a read-only source over http or
    ssh.
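
As a rough sketch of the last two items in the list above: cloning and
serving need nothing beyond stock Mercurial commands.  The helper names,
the `feature-x' path and the port number below are made up for
illustration and are not part of the setup described here.

```shell
#!/bin/sh
# Sketch only: new_feature, serve_readonly, the example paths and the
# port are hypothetical names chosen for this illustration.

# Clone the clean 'head' repository into a new feature workspace.
# usage: new_feature /hg/bsd/head /hg/bsd/feature-x
new_feature() {
    hg clone "$1" "$2"
}

# Publish a repository over http with Mercurial's built-in web server.
# Pushing through 'hg serve' is refused unless it is explicitly enabled
# in the configuration, so by default others can only pull.
# usage: serve_readonly /hg/bsd/head 8000
serve_readonly() {
    hg serve --repository "$1" --port "${2:-8000}"
}
```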

The /head branch has a huge history that I don't really want to keep
around in every clone.  So I started my conversion from 2007-12-31 and I
keep updating it with the `hg convert ...' command wrapped in a small
shell script:

    $ cat /hg/bsd/pull-head.sh
    #!/bin/sh

    set -e
    hg convert \
            --config convert.svn.startrev='175021' \
            --config convert.svn.trunk='head' \
            --config convert.svn.branches='' \
            --config convert.svn.tags='' \
            file:///home/svn/base/ \
            /hg/bsd/head

You can use the WebDAV URL http://svn.freebsd.org/base/ or an SSH-tunneled
URI to access the Subversion repository, but I keep a local mirror of the
Subversion repository too, so I prefer that.
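
If you want to switch between a local mirror and a remote URL without
editing the script, one possible variation is to make the source and the
destination overridable.  This is only a sketch: the SVN_URL and HG_HEAD
variable names and the pull_head function are assumptions for this
example; the `hg convert' options are the same ones used in the script.

```shell
#!/bin/sh
# Sketch: same conversion as in pull-head.sh, with the Subversion URL and
# the Mercurial destination taken from the environment when set.
# SVN_URL, HG_HEAD and pull_head are hypothetical names.

pull_head() {
    svn_url="${SVN_URL:-file:///home/svn/base/}"
    hg_head="${HG_HEAD:-/hg/bsd/head}"

    hg convert \
        --config convert.svn.startrev='175021' \
        --config convert.svn.trunk='head' \
        --config convert.svn.branches='' \
        --config convert.svn.tags='' \
        "$svn_url" "$hg_head"
}

# Example: SVN_URL=http://svn.freebsd.org/base/ pull_head
```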


Typical Mercurial-based Workflow
================================

  1. Pull subversion commits into the 'head' workspace.

  2. Pull these changes from 'head' to my working tree.

  3. Merge the changes with the local patches of the working tree.

  4. Extract one or more patches for committing to Subversion

  5. Rinse, lather, repeat...

Pulling the latest commits from Subversion
------------------------------------------

The first step is the easiest bit.  I just run `/hg/bsd/pull-head.sh'.

This requires an installed copy of the Python bindings of Subversion
[devel/py-subversion] and the `convert' extension enabled in my ~/.hgrc
file with:

    [extensions]
    convert =
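
A quick way to confirm that an extension is really enabled before the
first conversion run is to look at the effective configuration.  The
has_extension helper below is a hypothetical name; it only relies on
`hg showconfig', which prints one `section.key=value' line per setting.

```shell
#!/bin/sh
# Sketch: succeed if the named extension appears in the [extensions]
# section, spelled either 'convert =' or the older 'hgext.convert ='.
# has_extension is a hypothetical helper name.
# usage: has_extension convert && echo "convert is enabled"
has_extension() {
    hg showconfig extensions | grep -Eq "^extensions\.(hgext\.)?$1="
}
```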

A sample run of `pull-head.sh' looks like this:

    keramida at kobe:/hg/bsd$ time ./pull-head.sh
    scanning source...
    sorting...
    converting...
    1 Many network stack subsystems use a single global data structure to hold
    0 Add padding to struct inpcb, missed during our padding sweep earlier in
            3.306 real      1.809 user      0.619 sys
    keramida at kobe:/hg/bsd$

This is reasonably fast, but it does come with an important caveat.
It's not terribly important for my own work, but it *may* be for yours:

    The Python bindings of Subversion do not support svn:keywords, so
    all our manually configured '$FreeBSD$' stuff is unexpanded in the
    converted tree.  Mergemaster may cause various levels of "fun" and
    "amusement" if you mix, match and alternate between svn-based and
    mercurial-based workspaces often!

At this point, after "pull-head.sh" has finished running, the most
recent commit in the head/.hg/ workspace state is the last commit by
rwatson:

    keramida at kobe:/hg/bsd/head$ hg log --limit 1
    changeset:   12589:8ce7c7a0b804
    branch:      head
    tag:         tip
    user:        rwatson
    date:        Sun Aug 02 22:47:08 2009 +0000
    summary:     Add padding to struct inpcb, missed during our padding sweep earlier in

    keramida at kobe:/hg/bsd/head$

This clone/workspace is my 'clean' slate, and it only contains an `.hg'
data store.  No checkout or other workspace contents:

    keramida at kobe:/hg/bsd/head$ ls -la
    total 6
    drwxr-xr-x  3 keramida  users  - 512 Nov 10  2008 .
    drwxr-xr-x  8 keramida  users  - 512 Aug  3 02:36 ..
    drwxr-xr-x  3 keramida  users  - 512 Aug  3 02:36 .hg
    keramida at kobe:/hg/bsd/head$ du -sh .
    243M    .
    keramida at kobe:/hg/bsd/head$

It does, however, contain separate changesets for each subversion commit,
so I can browse the (local only now) history with a fair amount of speed.


Pulling these changes into my personal workspace
------------------------------------------------

The second step is to pull the latest changesets into my personal workspace
at `/hg/bsd/src':

    keramida at kobe:/home/keramida$ cd /hg/bsd/src
1)  keramida at kobe:/hg/bsd/src$ hg incoming --style compact ../head
    comparing with /hg/bsd/head
    searching for changes
    12588   6b04ed36e454   2009-08-02 19:43 +0000   rwatson
      Many network stack subsystems use a single global data structure to
      hold

    12589[tip]   8ce7c7a0b804   2009-08-02 22:47 +0000   rwatson
      Add padding to struct inpcb, missed during our padding sweep earlier
      in

2)  keramida at kobe:/hg/bsd/src$ hg pull ../head
    pulling from /hg/bsd/head
    searching for changes
    adding changesets
    adding manifests
    adding file changes
    added 2 changesets with 16 changes to 16 files (+1 heads)
    (run 'hg heads' to see heads, 'hg merge' to merge)
    keramida at kobe:/hg/bsd/src$

    Comments:
    ---------

    (1) Just look at what would be pulled from 'head' but don't actually
        make any changes to the local workspace or branch/data store.

    (2) Pull the changes, creating a 'fork' in the history of the local
        workspace at the place where the changes were grafted on top of
        the previous svn-based changeset.

At this point the history of my personal workspace has "forked".  The
subversion changesets have been grafted on top of the previous svn
commit.  This is easier to show with a picture/graph of the local
history, so here it is (from the 'graphlog' extension of Mercurial):

    keramida at kobe:/hg/bsd/src$ hg glog --limit 5
    o  12785[tip]   8ce7c7a0b804   2009-08-02 22:47 +0000   rwatson
    |    Add padding to struct inpcb, missed during our padding sweep earlier in
    |
    o  12784:12782   6b04ed36e454   2009-08-02 19:43 +0000   rwatson
    |    Many network stack subsystems use a single global data structure to hold
    |
    | @  12783:12762,12782   c50060fec1db   2009-08-02 21:06 +0300   keramida
    |/|    Merge from head
    | |
    o |  12782   8e4fc85e5aa3   2009-08-02 16:59 +0000   julian
    | |    Stop uuidgen(2) from crashing in vimage kerenels.
    | |
    o |  12781   bf9c3383d680   2009-08-02 14:28 +0000   attilio
    | |    Make the newbus subsystem Giant free by adding the new newbus sxlock.
    | |
    keramida at kobe:/hg/bsd/src$

The last time I pulled from Subversion was after 16:59 UTC (this is the
buildworld I am still running in the background).  My current workspace
contains a checkout of revision 12783/c50060fec1db (hence the '@' in the
relevant node of the history graph).

Merging the New Changesets
--------------------------

To prepare for my next `merge with head', I first look at the two
'heads' of the history.  My last successful merge and the new 'head'
created by importing from Subversion:

    keramida at kobe:/hg/bsd/src$ hg heads --style compact
    12785[tip]   8ce7c7a0b804   2009-08-02 22:47 +0000   rwatson
      Add padding to struct inpcb, missed during our padding sweep earlier in

    12783:12762,12782   c50060fec1db   2009-08-02 21:06 +0300   keramida
      Merge from head

    keramida at kobe:/hg/bsd/src$

Then I verify that I do *not* have local uncommitted stuff:

    keramida at kobe:/hg/bsd/src$ hg status
    keramida at kobe:/hg/bsd/src$

This takes a few seconds, but I don't want to throw away any changes I
may have been working on without noticing.

Since this is a clean checkout of revision 12783:c50060fec1db, I don't
necessarily need the next step, but it's safe to do it.  I check out a
clean copy of the local merge head:

    keramida at kobe:/hg/bsd/src$ time hg update --clean c50060fec1db
    0 files updated, 0 files merged, 0 files removed, 0 files unresolved
            4.684 real      1.691 user      2.558 sys

Being a clean checkout already and having the filesystem cache 'hot'
from the last pull I just did, this is often fast enough.

Finally, I merge the 'remote' Subversion-based commits I just pulled:

    keramida at kobe:/hg/bsd/src$ hg merge
    merging sys/netinet/in_gif.c
    merging sys/netinet/ip_var.h
    14 files updated, 2 files merged, 0 files removed, 0 files unresolved
    (branch merge, don't forget to commit)
    You have new mail in /home/keramida/Mailbox
    keramida at kobe:/hg/bsd/src$ hg commit -m 'Merge from head'
    keramida at kobe:/hg/bsd/src$

That's all.  Now I have my own local changes 'merged' with the latest
changeset from Subversion.  There are no conflicts to resolve manually,
because the subversion changesets I pulled do not affect any of the
files with local-only changes, so this merge was relatively pain-free,
quite fast and most importantly required no manual input at all :)
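
The pull, check and merge steps described above could be wrapped in a
small helper along these lines.  Treat it as a sketch under a few
assumptions: merge_from_head is a made-up name, it must be run from
inside the personal workspace, and it stops instead of guessing whenever
the workspace is not clean or the merge does not finish on its own.

```shell
#!/bin/sh
# Sketch of one "merge from head" cycle.  merge_from_head is a
# hypothetical helper; the default '../head' path follows the layout
# used in this message.
merge_from_head() {
    head_repo="${1:-../head}"

    # Refuse to continue if 'hg status' prints anything at all.  Note
    # that untracked files count as output here too.
    if [ -n "$(hg status)" ]; then
        echo "workspace is not clean, aborting" >&2
        return 1
    fi

    hg pull "$head_repo" || return 1

    # 'hg merge' exits non-zero when there is nothing to merge or when
    # files are left unresolved, so commit only after a clean merge.
    if hg merge; then
        hg commit -m 'Merge from head'
    else
        echo "nothing to merge, or conflicts need manual work" >&2
        return 1
    fi
}
```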

Looking at the history with graphlog again shows something like this:

    keramida at kobe:/hg/bsd/src$ hg glog --limit 5
    @    12786[tip]:12783,12785   368efb2b98b9   2009-08-03 03:00 +0300   keramida
    |\     Merge from head
    | |
    | o  12785   8ce7c7a0b804   2009-08-02 22:47 +0000   rwatson
    | |    Add padding to struct inpcb, missed during our padding sweep earlier in
    | |
    | o  12784:12782   6b04ed36e454   2009-08-02 19:43 +0000   rwatson
    | |    Many network stack subsystems use a single global data structure to hold
    | |
    o |  12783:12762,12782   c50060fec1db   2009-08-02 21:06 +0300   keramida
    |\|    Merge from head
    | |
    | o  12782   8e4fc85e5aa3   2009-08-02 16:59 +0000   julian
    | |    Stop uuidgen(2) from crashing in vimage kerenels.
    | |
    keramida at kobe:/hg/bsd/src$

Merging with 'head' often means that you can then publish this workspace
so others can pull from it, and publish their own patches for the same
workspace.  How often you merge with each other is up to you.  In the
Greek documentation project we often merge after many days or several
weeks.  With projects that have a higher local change rate merging every
day might be nicer (and result in far fewer conflicts).


Extracting Patches from the Local Workspace
===========================================

Keeping local changes means you may eventually want to push those
changes towards the /head of Subversion.  You'll have to extract patches
for one or more files then.  Since you have the tools to look at the
local history, you can use "hg diff" against the last subversion commit,
e.g. with the history shown above:

    keramida at kobe:/hg/bsd/src$ hg glog --limit 5
    @    12786[tip]:12783,12785   368efb2b98b9   2009-08-03 03:00 +0300   keramida
    |\     Merge from head
    | |
    | o  12785   8ce7c7a0b804   2009-08-02 22:47 +0000   rwatson
    | |    Add padding to struct inpcb, missed during our padding sweep earlier in
    | |
    | o  12784:12782   6b04ed36e454   2009-08-02 19:43 +0000   rwatson
    | |    Many network stack subsystems use a single global data structure to hold
    | |
    o |  12783:12762,12782   c50060fec1db   2009-08-02 21:06 +0300   keramida
    |\|    Merge from head
    : :

You can run "hg diff" between changesets 12785 and 12786 (tip):

    keramida at kobe:/hg/bsd/src$ hg diff -r 12785:tip | diffstat -p1
     contrib/mg/Makefile            |   29 +
     contrib/mg/README              |   74 ++
     contrib/mg/autoexec.c          |  111 ++++
     ... snip ...
     usr.bin/yacc/reader.c          |    2
     usr.sbin/chown/chgrp.1         |   10
     usr.sbin/chown/chown.8         |   12
     usr.sbin/chown/chown.c         |   17
     76 files changed, 18355 insertions(+), 44 deletions(-)
    keramida at kobe:/hg/bsd/src$

My own local workspace includes an import of OpenBSD's mg(1) editor.
This shows as a single patch in the diff command I used above, along
with *every* other file that differs in my local 'branch' of FreeBSD
head.  Let's assume, for example's sake, that you don't really care
about mg(1) patches, and you want to look at everything else.  The
--include and --exclude options of "hg diff" help a lot there (short
names -I and -X):

    keramida at kobe:/hg/bsd/src$ hg diff -r 12785:tip -X contrib/mg -X usr.bin/mg | diffstat -p1
     contrib/top/top.X              |    5 -
     contrib/top/top.c              |    2
     etc/Makefile                   |    6 +
     etc/mtree/BSD.games.dist       |   16 +++
     etc/mtree/BSD.usr.dist         |    2
     etc/mtree/BSD.var.dist         |    2
     libexec/rtld-elf/rtld.c        |    2
     share/mk/bsd.own.mk            |    2
     sys/amd64/conf/KOBE            |  185 ++++++++++++++++++++++++++++++++++++++
     sys/boot/common/interp.c       |    2
     sys/boot/common/interp_forth.c |    2
     sys/conf/newvers.sh            |   26 +++++
     sys/i386/conf/KOBE             |  195 +++++++++++++++++++++++++++++++++++++++++
     usr.bin/Makefile               |    5 +
     usr.bin/truss/amd64-fbsd.c     |    4
     usr.bin/truss/amd64-fbsd32.c   |    3
     usr.bin/truss/amd64-linux32.c  |    4
     usr.bin/truss/i386-fbsd.c      |    4
     usr.bin/truss/i386-linux.c     |    4
     usr.bin/truss/ia64-fbsd.c      |    4
     usr.bin/truss/powerpc-fbsd.c   |    4
     usr.bin/truss/sparc64-fbsd.c   |    4
     usr.bin/yacc/reader.c          |    2
     usr.sbin/chown/chgrp.1         |   10 +-
     usr.sbin/chown/chown.8         |   12 +-
     usr.sbin/chown/chown.c         |   17 ++-
     26 files changed, 480 insertions(+), 44 deletions(-)
    keramida at kobe:/hg/bsd/src$

If you want to commit *all* the local changes as a single patch to
Subversion, you can keep refining the --exclude/--include patterns until
the final patch looks and smells "right".

If you know the directories that you want to diff, i.e. you want to
commit all the local `usr.sbin/chown' changes in one subversion
changeset, you can use a directory with "hg diff" too (which works quite
unsurprisingly like "svn diff"):

    keramida at kobe:/hg/bsd/src$ hg diff -r 12785:tip usr.sbin/chown | diffstat -p1
     usr.sbin/chown/chgrp.1 |   10 +++++++---
     usr.sbin/chown/chown.8 |   12 ++++++++----
     usr.sbin/chown/chown.c |   17 +++++++++++------
     3 files changed, 26 insertions(+), 13 deletions(-)
    keramida at kobe:/hg/bsd/src$
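
Once the include/exclude patterns look right, the same "hg diff"
invocation can write the patch to a file instead of feeding diffstat.
The extract_patch helper and the output file names below are
hypothetical; the function only passes its extra arguments through to
"hg diff".

```shell
#!/bin/sh
# Sketch: save the difference between a Subversion-based changeset and
# the local tip as a patch file.  extract_patch is a hypothetical name.
# usage: extract_patch BASE OUTFILE [hg diff arguments...]
#   e.g. extract_patch 12785 /tmp/chown.diff usr.sbin/chown
#        extract_patch 12785 /tmp/local.diff -X contrib/mg -X usr.bin/mg
extract_patch() {
    base="$1"
    out="$2"
    shift 2
    hg diff -r "$base:tip" "$@" > "$out"
}
```

The resulting file uses the same a/ and b/ path prefixes that the
diffstat -p1 runs above strip, so it is meant to be applied with
`patch -p1' from the top of a Subversion checkout.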

Extracting Local Changesets as Patches
======================================

Another option is to extract only the very specific local commit that
affected a file, i.e. my local `usr.sbin/chown/chown.c' changes.  First
you'd have to look at the local history of the file:

    keramida at kobe:/hg/bsd/src$ hg log --style compact usr.sbin/chown/chown.c
    12010:11998,12009   7f4fa839afed   2009-06-20 00:50 +0300   keramida
      Merge from head

    12002   96e04082ef3f   2009-06-19 15:58 +0000   brooks
      In preparation for raising NGROUPS and NGROUPS_MAX, change base

    7782   141cd5ffef80   2009-01-22 07:18 +0200   keramida
      Add a new -x option to chown and chgrp, to inhibit file system

    0   dd5ed0412a8b   2007-12-31 22:03 +0000   jhb
      Actually declare the kern.features sysctl node.

    keramida at kobe:/hg/bsd/src$

From this output it's obvious my local changes were made on 2009-01-22
07:18 +0200, and they are available as changeset 7782:141cd5ffef80.  You
can extract this changeset only with "hg export":

    keramida at kobe:/hg/bsd/src$ hg export 141cd5ffef80
    # HG changeset patch
    # User Giorgos Keramidas <keramida at ceid.upatras.gr>
    # Date 1232601528 -7200
    # Branch head
    # Node ID 141cd5ffef80ff979627d8898500c92984287426
    # Parent  e8506b2ac7aefbfb875f0def0de8dd6441885a40
    Add a new -x option to chown and chgrp, to inhibit file system
    mount point traversal.  The -x option is documented, like -v, as
    a non-standard option in the COMPATIBILITY manpage sections.

    diff -r e8506b2ac7ae -r 141cd5ffef80 usr.sbin/chown/chgrp.1
    --- a/usr.sbin/chown/chgrp.1    Wed Jan 21 21:31:33 2009 +0200
    +++ b/usr.sbin/chown/chgrp.1    Thu Jan 22 07:18:48 2009 +0200
    @@ -31,7 +31,7 @@
     .\"     @(#)chgrp.1    8.3 (Berkeley) 3/31/94
     .\" $FreeBSD$
     .\"
    -.Dd April 25, 2003
    +.Dd January 22, 2009
     .Dt CHGRP 1
     .Os
     .Sh NAME
    @@ -39,7 +39,7 @@
     .Nd change group
     .Sh SYNOPSIS
     .Nm
    -.Op Fl fhv
    +.Op Fl fhvx
     .Oo
     .Fl R
    ... snip ...

This patch is _directly_ usable with `patch -p1' in a checkout of /head
from Subversion, but it *may* require a bit of `svn merge' work if you
have fast-forwarded chown/chgrp to its latest 'head' version.  It is not
a diff of the *latest* chown/chgrp from /head but a _precise_ copy of
the past changeset, as it was committed.
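
A cautious way to carry such a changeset over to a Subversion checkout
is to test the patch before applying it.  The sketch below makes some
assumptions: apply_changeset is a made-up helper, the checkout path is
whatever you pass in, and your patch(1) must accept --dry-run (GNU patch
and current BSD patch do; older versions may spell it -C or --check).

```shell
#!/bin/sh
# Sketch: export one Mercurial changeset and apply it to a Subversion
# checkout.  apply_changeset is a hypothetical helper name.
# usage: apply_changeset REV SVN_CHECKOUT   (run inside the hg workspace)
apply_changeset() {
    rev="$1"
    checkout="$2"
    patchfile="${TMPDIR:-/tmp}/hg-export-$rev.diff"

    # The exported file is kept afterwards so it can be inspected.
    hg export "$rev" > "$patchfile" || return 1

    # Dry run first, so a failing hunk leaves the checkout untouched.
    (
        cd "$checkout" || exit 1
        patch -p1 --dry-run < "$patchfile" > /dev/null || exit 1
        patch -p1 < "$patchfile"
    )
}
```

After that, `svn diff' in the checkout shows what would be committed.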

More...
=======

  * You can browse the 'clean' head/ copy I am using at

        http://hg.hellug.gr/freebsd/head/

    Note: this only has the head history since 2007-12-31.  For older head
    commits, you will have to use a new "convert" run and it will change
    all the commit/changeset hashes even if their patch data is identical.

  * You can find a compressed 'bundle' file called 'head.hg' in my home
    directory at freefall.  This can be used to 'seed' the initial copy
    of your local 'head', e.g. by pulling directly from the bundle:

        % cd /var/tmp
        % scp freefall.freebsd.org:'~keramida/head.hg' .

        % cd ~
        % mkdir -p ~/work/freebsd/head
        % hg init ~/work/freebsd/head
        % cd ~/work/freebsd/head
        % hg pull /var/tmp/head.hg

  * If you plan to use the incremental conversion script I described
    earlier in this message, you will also need a "SHA map" file that
    maps Subversion changesets to Mercurial changeset hashes.  This is
    also available at freefall as `~keramida/head.shamap' and you have
    to copy it to your `head/.hg/shamap' file.  Then either run the "hg
    convert" extension manually or use a small shell wrapper like the
    one I pasted here.

  * For more information about some of the extensions I've mentioned it
    is always a good idea to check the Mercurial Wiki at:

        http://www.selenic.com/mercurial/
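
The bundle and SHA map steps from the list above can be combined into
one helper.  This is only a sketch: seed_head is a hypothetical name and
the three paths are whatever you downloaded and wherever you want the
new 'head' to live.

```shell
#!/bin/sh
# Sketch: create a local 'head' from a bundle file and install the SHA
# map so that incremental 'hg convert' runs can continue from it.
# seed_head is a hypothetical helper name.
# usage: seed_head /var/tmp/head.hg /var/tmp/head.shamap ~/work/freebsd/head
seed_head() {
    bundle="$1"
    shamap="$2"
    dest="$3"

    mkdir -p "$dest" || return 1
    hg init "$dest" || return 1
    hg --repository "$dest" pull "$bundle" || return 1

    # 'hg convert' keeps its revision map in <dest>/.hg/shamap unless
    # told otherwise, so that is where the downloaded map has to go.
    cp "$shamap" "$dest/.hg/shamap"
}
```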

I hope all this helps a bit...

If you need more help with publishing local workspaces over http and/or
extracting patches (these are often two of the points where help is
necessary and welcome), please feel free to ask.  I've been using this
sort of workflow for local changesets for quite some time and I know enough
about Mercurial to help where needed.


