Removing build metadata, for reproducible kernel builds

Ian Lepore ian at
Thu Dec 3 21:59:29 UTC 2015

On Thu, 2015-12-03 at 18:11 -0330, Jonathan Anderson wrote:
> On 3 Dec 2015, at 17:45, Ian Lepore wrote:
> > I'm curious why anyone wants this enabled by default, like... are we
> > missing something?  Does it improve freebsd-update behavior maybe?
> There is value in being able to reproduce the things you run,
> especially if you download them from somebody else (like releases or
> binary packages). It's not a panacea (see "Reflections on Trusting
> Trust"), but it’s helpful, even if you don't always do the
> reproduction work. The very fact that someone *can* check a binary
> release for naughtiness is a strong incentive for many adversaries not
> to try their hand.
> > If it's just for some general "reproducibility is good" philosophy
> > then I would counter with "information is even better, so don't
> > throw it away without a good reason."
> When you're building your own stuff, sure, it might help to know that
> this is the kernel you built on "this machine" at "that time". When
> running 10.2-RELEASE-p7, however, it’s not very useful to know that it
> was built on, or that the source tree was located at /usr/src. It
> *might* be useful to know that {set of people} all got kernels that
> hash to {some bit pattern} when they reproduced the build (like
> Certificate Transparency). Or, more interestingly, that {people using
> some configuration} got a different result. Again, like Certificate
> Transparency. :)
> > Reproducibility is good for some people, and completely useless for
> > others, and the people who need it aren't going to mind turning on a
> > knob or two to get what they want.
> Possibly. I don't have any strong opinions on whether the default is
> "reproducible" or "full of information that helps me identify busted
> kernels", just so long as "reproducible" is available and easy to turn
> on. And my personal opinion is that it should be turned on for public
> releases: I think that being able to validate the kernel is more
> important than knowing what machine it was built on.
> > > Yea I was reading things backwards.
> > > 
> > > In the review, I suggested that if you've modified the tree (which
> > > the SCM will tell you), then do the old format to preserve useful
> > > metadata that's really really needed and if not to use the shorter
> > > version. When you've modified the tree, reproducible builds aren't
> > > a concern at all.
> > > 
> > 
> > How are you going to determine what constitutes a modified tree?
> > What you think of as modifications may be what I call my baseline
> > version.
> Since we host our code in Subversion and have an official Git mirror,
> how about svn status || git status? If you're basing your code off of
> anything other than an official mirror, you get to deal with the
> reproducibility problem yourself, but it sounds like many people in
> this camp would prefer the more verbose version string anyway.

By "we" you must mean "The FreeBSD Project" but surely you also realize
that the universe of freebsd users is much larger than just the
project, and not all of them use subversion or git to check out freebsd
and/or manage their local copies of it.

For a company building products based on freebsd, reproducibility is
important, but they're quite likely to be using something other than
subversion or git to manage the source.  They're also quite likely to
have local modifications that they consider to be part of their
baseline even if they appear to be modifications from the project's
repo at the same svn revision number.  Either way, these folks are
going to want to set some control that enforces reproducibility
regardless of any build system heuristics about what to default to.

For other companies or end users the important factor might be the
ability to reproduce an official release, which one presumes would
start with checking out the official sources using one of the official
SCMs, and then a whole other set of "what constitutes a modification"
criteria would apply.

As someone who works for one of those "not-svn, not-git" companies I
just want to make sure there's a "do what I say" knob that overrides
any attempts to be smart about detecting modifications.
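
A rough sketch of the precedence I have in mind, in Python purely for
illustration (the real logic would live in the build scripts).  The
knob name FORCE_REPRODUCIBLE and both function names are made up for
this example, and only git is probed; an svn status check would slot in
the same way.

```python
import os
import subprocess
from typing import Optional


def tree_is_modified(src_dir: str) -> Optional[bool]:
    """Ask git whether src_dir has local changes.

    Returns True or False when git can answer, and None when it cannot
    (git not installed, or src_dir is not a git checkout).
    """
    try:
        result = subprocess.run(
            ["git", "-C", src_dir, "status", "--porcelain"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    # Any output at all (including untracked files) counts as modified.
    return bool(result.stdout.strip())


def use_reproducible_version(override: Optional[str],
                             modified: Optional[bool]) -> bool:
    """Decide whether to emit the short, reproducible version string."""
    # The explicit knob always wins: "do what I say".
    if override is not None:
        return override.strip().lower() in ("1", "yes", "true")
    # Heuristic fallback: be reproducible only for a tree known to be
    # pristine; an unknown or modified tree keeps the verbose metadata.
    return modified is False


if __name__ == "__main__":
    # FORCE_REPRODUCIBLE is a hypothetical knob name for this sketch.
    knob = os.environ.get("FORCE_REPRODUCIBLE")
    print(use_reproducible_version(knob, tree_is_modified("/usr/src")))
```

With the knob unset, the heuristic decides; with it set to yes or no,
the SCM is never consulted for the decision, which is the behavior a
"not-svn, not-git" shop needs.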

-- Ian

More information about the freebsd-arch mailing list