Monitoring commits on all branches
Marc Branchaud
marcnarc at gmail.com
Thu Nov 19 22:50:17 UTC 2020
On 2020-11-19 12:16 p.m., Warner Losh wrote:
>
> Thanks Marc! This is great advice... more comments below...
>
> On Thu, Nov 19, 2020 at 9:16 AM Marc Branchaud <marcnarc at gmail.com> wrote:
>
> On 2020-11-18 8:49 p.m., Dan Langille wrote:
> > How can a repo be monitored for commits on all branches?
> >
> > I know how to ask a given branch: do you have any commits after foo_hash?
> >
> > How do I:
> >
> > * get a list of all commits since foo_hash
>
> A quick note about Warner's reply:
>
> > git log $hash..HEAD
>
> "HEAD" is just a git nickname for "whatever you have currently
> checked-out" (which can be a branch, a tag, or "detached" commit SHA
> ID).
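Since HEAD only covers the current checkout, something along these lines might be closer to what Dan is after. It's only a sketch: the function name commits_since is made up for illustration, and it assumes every ref you care about is present in the local clone. "--all" walks every ref, "^$1" excludes everything reachable from the given commit, and "--source" prints the ref through which each commit was reached. A commit reachable from several refs is listed only once, under whichever ref git reached it from first.

```shell
# Hypothetical helper: list every commit, on any ref, that is not
# reachable from the commit given as $1, along with the ref it was
# reached from.
commits_since() {
    git log --all --source --oneline "^$1"
}
```

Running "commits_since foo_hash" after each fetch would then show what's new since foo_hash across all branches.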
>
> > * know which branch each of those commits was on (e.g. master, branches/2020Q4)
>
> Unfortunately you'll find most normal git advice to be a bit frustrating
> with the FreeBSD repos, because FreeBSD doesn't work the way most people
> use git. Specifically, the FreeBSD project does not ever merge branches
> (in the git sense of the word "merge"). Things would be very, very much
> easier if the FreeBSD project were to use git-style merging. I believe
> there are discussions underway about adjusting the whole MFC process for
> the git world. I admit that part of my motivation in writing this
> message is to provide grist for that mill.
>
>
> FreeBSD src will be doing cherry-picks. There's only pain and suffering
> from merge commits in this environment. Git's tools are adequate to cope
> with individual and squashed cherry picks.
Fair enough. I'm also sure that the git community would welcome patches
that help make FreeBSD's workflow a bit smoother.
> Fortunately even without git-merged branches, there are still git tools
> that help, though they're not as precise as one would like.
>
>
> They are for src. I suspect for ports they might not be.
>
> Let's look at a concrete example with the beta ports git repo (which I
> just cloned), and compare the 2020Q4 and main branches. I'll start with
> some overall exploration, then address your specific question.
>
> There are 299 commits in the 2020Q4 branch. I know this because
> git merge-base origin/main origin/branches/2020Q4
> tells me where 2020Q4 branched off of main: commit 5dbe4e5f775ea2. And
> git rev-list 5dbe4e5f775ea2..origin/branches/2020Q4 | wc -l
> says "299". (The "rev-list" command is a bare-bones version of "log"
> that only lists commit SHA IDs.)
>
> Meanwhile there have been 4538 commits to the main branch since commit
> 5dbe4e5f775ea2.
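Both counts can also be had in one go, without looking up the merge base by hand. This is a small sketch (the function name divergence is made up for illustration): the three-dot form selects commits reachable from either side but not both, and "--left-right --count" reports how many of them belong to each side.

```shell
# Hypothetical helper: print "<commits only in $1><TAB><commits only in $2>".
divergence() {
    git rev-list --left-right --count "$1...$2"
}
```

"divergence origin/main origin/branches/2020Q4" should report the same two counts as the separate commands above.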
>
> As far as git is concerned, those 299 commits in 2020Q4 are *different*
> from anything in main. Even though most of them made the exact same
> code changes, they were created at different times, often by different
> authors, and they have different commit messages.
>
>
> True.
>
> But you can still ask git to look at the code-change level to see which
> 2020Q4 commits exactly replicated the code change from main:
>
> git cherry -v origin/main origin/branches/2020Q4
>
> This little piece of magic looks at the 299 commits in 2020Q4 that are
> not in main and compares their code changes to the 4538 commits in main
> that are not in 2020Q4. It prints out the 299 2020Q4 commit SHA IDs,
> prefixed with either a "- " or a "+ ". The -v appends the commit
> message's first line:
>
> - 394d9746e5eea73f56334b2e7ddbdc8f686d6541 MFH: r550869
> + 1ac9571956759c91d852ee92859a12e52dcbde48 MFH: r550885 r550886
> - fd411bdfda55488b84de75e6b043c513a281abf0 MFH: r551209
> - 533cdaa97457b3318aebcc53f7a1a46ea66721da MFH: r551236
> ......
>
> A "-" means that the commit matches the code change made by a commit in
> main, while a "+" means that the commit's code change does not *exactly*
> match any main commit since commit 5dbe4e5f775ea2.
>
> So
> git cherry -v origin/main origin/branches/2020Q4 | grep ^-
> shows us the 234 2020Q4 commits that made the exact same change as a
> commit in main.
>
> And
> git cherry -v origin/main origin/branches/2020Q4 | grep ^+
> shows us that there are 41 not-exactly-the-same-change commits in
> 2020Q4. Mostly these are ones that combined two or more MFH's into one
> commit (e.g. 2020Q4 commit 1ac95719567), or that changed a file in a
> slightly different way (see the first patch hunk of 2020Q4 commit
> cbd002878f2, compared to its counterpart in main: commit a5d21ea16b6).
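If all you want is the tally, the two greps can be folded into one pass. A rough sketch, with a made-up function name (cherry_summary); it just counts the "-" and "+" markers that "git cherry" prints:

```shell
# Hypothetical helper: summarize "git cherry $1 $2" as two counts.
#   exact     = commits in $2 whose code change also exists in $1 ("-")
#   differing = commits in $2 with no exact counterpart in $1 ("+")
cherry_summary() {
    git cherry "$1" "$2" | awk '
        $1 == "-" { exact++ }
        $1 == "+" { differing++ }
        END { printf "exact=%d differing=%d\n", exact, differing }'
}
```

For example: cherry_summary origin/main origin/branches/2020Q4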
>
>
> Yes. These sorts of issues are why merge commits aren't always the
> right way to go: we're not merging the entire history together
> (doing a join), but rather just small subsets of it. How to cope with
> our ports tree, which is mostly lots of small, similar files, when
> git's guessing does a poor job on such a tree, is an interesting
> problem to solve. Merge commits can help with some of the issues, but
> they can create other issues as well when done incorrectly....
I admit I don't quite follow you there, but I'm particularly ignorant of
the ports tree. I have some quite-likely-stupid ideas after having
played with it for 10 minutes while composing my earlier message, but
even if the ideas are somehow clever I suspect they'd entail too much
workflow change to be palatable.
> Even so, great hints for how to find cherry picked items. I suspect
> we'll need to have some tooling that embeds hash(es) into the commit
> message in some stylized way to allow tracking the non-trivial patch
> changes that sometimes happen: squashing several cherry picks, necessary
> differences due to branch drift, etc. It's unclear how we should do
> this, though, in a way that works well, is reliable and doesn't add
> undue friction to the process...
It's traditional when doing a cherry-pick to add a
Cherry-picked-from: <SHA ID>
line to the commit message. The "cherry-pick" command even has a -x
option that automatically appends a similar line, "(cherry picked from
commit <SHA ID>)", to the new commit's message.
(There's also a "git interpret-trailers" command that is a
general-purpose tool for manipulating "Foo: blah blah" lines in commit
messages.)
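To make that a bit more concrete, here's a sketch of how tooling could read and write such lines. Both function names (picked_from, add_origin_trailer) are made up for illustration. The first relies on the "(cherry picked from commit <SHA ID>)" line that "git cherry-pick -x" appends; the second uses "git interpret-trailers --trailer" to append a "Cherry-picked-from:" line to a message read from stdin.

```shell
# Hypothetical helper: print the SHA ID(s) recorded by "git cherry-pick -x"
# in the message of commit $1 (prints nothing if there are none).
picked_from() {
    git log -1 --format=%B "$1" |
        sed -n 's/^(cherry picked from commit \([0-9a-f]*\))$/\1/p'
}

# Hypothetical helper: read a commit message on stdin and write it to
# stdout with a "Cherry-picked-from: $1" trailer appended.
add_origin_trailer() {
    git interpret-trailers --trailer "Cherry-picked-from: $1"
}
```

A squashed MFC could record several origins by adding one trailer per original commit, though how that should look is exactly the kind of thing the project would need to settle.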
"git cherry-pick" might actually lead people away from squashing
together multiple changes into one commit, because you have to make a
bit of an effort to get cherry-pick to squash things up. I personally
think the project would benefit from discouraging squashed-together MFC's.
> Now to your specific question: Given a commit, how can we tell which
> branches contain that code change? Let's look at main commit
> 6a9a8389d609 which I've determined, through manual spelunking, matches
> 2020Q4's commit 02eba4048564.
>
> At a basic level, "git cherry" can tell us that *something* in 2020Q4
> made the same change as commit 6a9a8389d609. Here I reversed the order
> of the branch names in the command:
> git cherry origin/branches/2020Q4 origin/main | grep 6a9a8389d609
> This outputs:
> - 6a9a8389d609ca0370c8c6eb8f993c1aa4071681
> and the "-" tells me that 6a9a8389d609's code change is *somewhere* in
> 2020Q4's 299 unique commits.
>
> Unfortunately there's no convenient git command that'll tell you *which*
> 2020Q4 commit replicated commit 6a9a8389d609. For that, we need to do a
> bit of scripting:
>
> -----8<-----8<-----8<-----8<-----
>
> #!/bin/sh
>
> # Commit on main whose code change we want to find in 2020Q4.
> TARGET="6a9a8389d609"
>
> # Where 2020Q4 branched off of main.
> BASE=$(git merge-base origin/branches/2020Q4 origin/main)
>
> # A patch ID identifies a commit's code change, ignoring its metadata.
> TARGET_PATCH_ID=$(git show -p "$TARGET" | git patch-id --stable | cut -f 1 -d ' ')
>
> # Compare every commit unique to 2020Q4 against the target's patch ID.
> for REV in $(git rev-list "$BASE..origin/branches/2020Q4"); do
>     PATCH_ID=$(git show -p "$REV" | git patch-id --stable | cut -f 1 -d ' ')
>     if [ "$PATCH_ID" = "$TARGET_PATCH_ID" ]; then
>         echo "Found a commit that replicated target commit $TARGET:"
>         echo
>         git show -s "$REV"
>         exit 0
>     fi
> done
>
> echo "Did not find any commit that exactly replicated $TARGET."
> exit 1
>
> ----->8----->8----->8----->8-----
>
> This only looks at the 2020Q4 branch, but it's easily adapted to look at
> a user-specified branch, or multiple branches. (In the above I used
> "git patch-id", which is what "git cherry" uses internally to identify a
> commit's code changes.)
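For what it's worth, here's roughly how the multi-branch adaptation might look. Treat it as a sketch rather than a tested tool: the function name find_replicas is made up, and it leans on "git patch-id" being able to read a whole "git log -p" stream and print one "<patch ID> <commit ID>" pair per commit, which avoids running "git show" once per commit.

```shell
# Hypothetical helper: find_replicas TARGET REF...
# For each REF, print "<ref> <commit>" for every commit unique to that ref
# (relative to TARGET) whose patch ID equals TARGET's patch ID.
# Returns 0 if at least one replica was found, 1 if none, 2 on bad TARGET.
find_replicas() {
    target=$1
    shift
    target_id=$(git show -p "$target" | git patch-id --stable | cut -d ' ' -f 1)
    [ -n "$target_id" ] || return 2
    status=1
    for ref in "$@"; do
        # Only commits made since the ref and the target diverged matter.
        base=$(git merge-base "$target" "$ref") || continue
        matches=$(git log -p "$base..$ref" | git patch-id --stable |
            grep "^$target_id " | cut -d ' ' -f 2)
        for rev in $matches; do
            echo "$ref $rev"
            status=0
        done
    done
    return $status
}
```

To check every remote branch at once, something like this should do:
find_replicas 6a9a8389d609 $(git for-each-ref --format='%(refname)' refs/remotes/origin)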
>
> I hope all this helps a bit!
>
>
> It does. I thought I'd had my head deep into git, but hadn't stumbled
> upon this.
I've been using git for over 10 years, and I still discover new things.
This "git cherry" stuff, for example, I've only started using a little
bit in the last few months.
> It looks useful enough I'll try to add a section to my FAQ.
I'm honoured!
M.