Service disruption: git converter currently down

Ed Maste emaste at freebsd.org
Thu Sep 26 14:26:59 UTC 2019


On Wed, 25 Sep 2019 at 15:50, Warner Losh <imp at bsdimp.com> wrote:
>
> git log always requires added care. There's not actually 9000 commits there. The tree looks fine topologically. It's purely an artifact of git log.

This seems to be getting into a philosophical discussion of what it
means for a commit to exist. But, given the constraints in the way git
represents commits, the history crafted by the svn-git exporter does
indeed show thousands of "phantom" commits. The converter should (and,
with uqs's tweaks, would) represent the offending commit here as a
cherry-pick, not a merge.
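
For illustration, a minimal sketch of the distinction (the hash
abc1234 is made up): a cherry-pick records a single parent and at most
a note in the commit message, whereas a merge records a second parent
and thereby pulls that parent's entire history into git log.

    # Cherry-pick: one parent, source noted only in the message
    git cherry-pick -x abc1234
    git log -1 --format='%P%n%B'   # one parent hash; message ends with
                                   # "(cherry picked from commit abc1234...)"

    # Merge instead: two parents, so everything behind abc1234 becomes
    # reachable and shows up in git log
    git merge abc1234
    git log -1 --format='%P'       # two parent hashes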

To represent this correctly we would need to add metadata to git that
tracks file operations. Recording that path d1/f1 was copied from
d2/f2 at some hash would let us represent this case properly, as well
as renames/moves.
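
As a point of comparison (a sketch only, reusing the hypothetical
paths above): Subversion stores the copy explicitly at commit time,
while git can only infer it heuristically after the fact.

    # Subversion: the copy source is recorded in the repository
    svn copy d2/f2 d1/f1 && svn commit -m "copy f2"
    svn log -v -l 1                # shows "A /d1/f1 (from /d2/f2:REV)"

    # Git: no copy record exists; detection happens at display time
    git show --find-copies-harder --summary HEAD
                                   # may show "copy d2/f2 => d1/f1 (100%)"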

>> git log --first-parent isn't really a solution here either, because
>> there are cases where one legitimately does want history from both
>> parents, especially working in downstream projects.
>
> I'm pretty sure it would be fine, even in that case.

It's not fine, because it omits the commits I want to see.
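
As a concrete illustration (assuming a downstream branch that has
merged FreeBSD head at some point):

    git log --oneline | wc -l                  # full history, both parents
    git log --oneline --first-parent | wc -l   # far fewer commits: anything
                                               # reachable only via a merge's
                                               # second parent disappears

Those hidden commits are exactly the upstream changes a downstream
consumer often needs to inspect.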

>> > I'd offer the opinion that needing to know about things like git log --first-parent vs having to rebase every single downstream fork,
>>
>> We won't need to rebase every fork - in no case should the path
>> forward be worse than uqs's suggestion of a merge from both old/new
>> conversions.
>
> IMHO, uqs suggestion is a complete non-starter, at least the "git diff | git patch" one. It destroys all local history, commit messages, etc. Except for the most trivial cases, it's not really going to fly with our users. His other, followup ones might be workable into scripts.

diff | patch is not the suggestion; the suggestion is to perform a
merge from the "new" conversion. Other options (e.g. some sort of
scripted commit replay) are at least no worse than that base case.

> I'm not sure you can merge, as there's no common ancestor that's recent enough to give it a chance at succeeding (since the different exports would have different hashes starting fairly early in our history). My experience with qemu is that long-lived merge-updated branches become quite difficult to cope with after a while. It took me three weeks to sort out that relatively simple repo.

In fact, the merge works fine, even with completely unrelated
histories. You can try this by merging 'svn_head' (from git svn) into
'master' (from svn2git), using `git merge --allow-unrelated-histories
origin/svn_head`. The resulting history has two copies of every
commit, but the file contents are unchanged across the merge.
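
A minimal sketch of that experiment (assuming both conversions are
branches on the same remote, as above):

    git checkout master                          # svn2git conversion
    git merge --allow-unrelated-histories origin/svn_head
    git diff HEAD^ HEAD                          # empty: tree unchanged
    git log --oneline | wc -l                    # commit count roughly doubles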

If you try this in a tree with changes (i.e., applying it to a
long-running merge-based branch), every modified file will produce a
conflict, but the conflicts can be trivially resolved in favour of the
first version. From that point on, merging from the "new" conversion
will work as expected.
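
A sketch of that resolution, with 'newrepo' as a placeholder remote
name for the new conversion:

    # Resolve conflicting hunks in favour of our side automatically:
    git merge --allow-unrelated-histories -X ours newrepo/master

    # Or, after a conflicted merge, keep our version of every unmerged file:
    git diff --name-only --diff-filter=U | xargs git checkout --ours --
    git add -A && git commit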

> A rebase has a chance of working for people following a 'rebase' work flow.

Indeed, for a rebase workflow it's fairly straightforward.
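
For example (a sketch with placeholder names: 'old' and 'new' are the
two conversions, 'mywork' is a local branch based on the old one):

    base=$(git merge-base old/master mywork)
    git rebase --onto new/master "$base" mywork

The local commits are replayed as patches onto the new conversion, so
the differing hashes in the underlying history don't matter.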

> However, for people like CHERIBSD who follow a 'merge from upstream' model which never rebases (since that would be anti-social to their down streams), I'm having trouble understanding how that could work. At work, we basically do the merge from upstream with collapse model, which I'm having trouble seeing how to move from old hashes to new. I'd like to know what the plan for that would be and would happily test any solution there with a copy of our repo. I'd even be happy to run experiments in advance of there being something more public available to see what options do or don't work.

Could you expand on the "merge from upstream with collapse" model -
specifically, can you provide an example of the commands used when
merging from FreeBSD?

