Subversion documentation for the FreeBSD project?

Wed Jun 4 12:56:30 UTC 2008

On Wed, Jun 4, 2008 at 2:51 PM, Brooks Davis <brooks at freebsd.org> wrote:
> On Wed, Jun 04, 2008 at 08:32:53AM +0200, Ulrich Spoerlein wrote:
>> This is not entirely true, the cvs@ archive details almost all CVS
>> repo copies. There are also lots of forced commits to denote repo
>> copies. Yes, it would have to be a manual process, where you gather
>> (old, new, revision) tuples for the time of the repo copy (and perhaps
>> the branch?)
>> This file could then augment the conversion process and handle the CVS
>> files more intelligently.
>>
>> I'm not volunteering and am happy with what's been done anyway. I'm
>> just against the "this can never ever be fixed, because the
>> information is totally lost" attitude.
>
> Some of the information exists scattered across the archive, much of it
> probably does not since at one point committers had direct access to the
> repo and used it.  The forced commit rule has been forgotten many times.
> A partial reconstruction might be possible if someone wanted to waste a
> few months of their life.

Ok, I'm not that familiar with the RCS format, but couldn't this
algorithm catch 97% of the affected files?

- Grab the content of rev 1.1 of each file and compute its MD5 sum
- Files whose rev 1.1 is identical have probably been repocopied
- The copy happened at the point in time where file A is no longer
  committed to and file B gets its first commit that is not also in
  file A
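
A rough Python sketch of those steps, just to make the idea concrete. It
assumes the rev 1.1 contents and the per-file commit logs have already
been pulled out of the ,v files by some other tool; the function names
and the (timestamp, commit id) log layout are made up for illustration
and are not taken from any existing converter.

```python
import hashlib
from collections import defaultdict


def repocopy_candidates(first_revs):
    """Group files whose rev 1.1 content is byte-identical.

    first_revs: dict mapping path -> bytes of rev 1.1 (hypothetical input,
    assumed to be extracted from the RCS files beforehand).
    Returns a list of sorted path lists, one per group of 2+ files.
    """
    groups = defaultdict(list)
    for path, content in first_revs.items():
        groups[hashlib.md5(content).hexdigest()].append(path)
    return sorted(sorted(paths) for paths in groups.values() if len(paths) > 1)


def copy_window(log_a, log_b):
    """Estimate when file A was repocopied to file B.

    log_a, log_b: lists of (timestamp, commit_id) tuples sorted by time
    (hypothetical layout). Returns (last commit time of A, time of the
    first commit in B that is not also in A); either part may be None.
    """
    last_a = log_a[-1][0] if log_a else None
    seen = set(log_a)
    first_b_only = next((ts for ts, cid in log_b if (ts, cid) not in seen), None)
    return last_a, first_b_only


if __name__ == "__main__":
    revs = {"old/foo.c": b"int x;\n", "new/foo.c": b"int x;\n", "bar.c": b"int y;\n"}
    print(repocopy_candidates(revs))
    shared = [(100, "c1"), (200, "c2")]
    print(copy_window(shared, shared + [(350, "c3")]))
```

Empty files and boilerplate that starts out identical in many places
would show up as false positives, so the candidate groups would still
need a sanity check against the copy window before being trusted.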

I think tools like fromcvs/tohg do a pretty good job at capturing
these instances.

Cheers,
Uli