Removing build metadata, for reproducible kernel builds
Jonathan Anderson
jonathan at FreeBSD.org
Thu Dec 3 21:42:14 UTC 2015
On 3 Dec 2015, at 17:45, Ian Lepore wrote:
> I'm curious why anyone wants this enabled by default, like... are we
> missing something? Does it improve freebsd-update behavior maybe?
There is value in being able to reproduce the things you run, especially
if you download them from somebody else (like releases or binary
packages). It's not a panacea (see "Reflections on Trusting Trust"), but
it's helpful, even if you don't always do the reproduction work. The
very fact that someone *can* check a binary release for naughtiness is a
strong incentive for many adversaries not to try their hand.
> If it's just for some general "reproducibility is good" philosophy then
> I would counter with "information is even better, so don't throw it
> away without a good reason."
When you're building your own stuff, sure, it might help to know that
this is the kernel you built on "this machine" at "that time". When
running 10.2-RELEASE-p7, however, it's not very useful to know that it
was built on amd64-builder.daemonology.net, or that the source tree was
located at /usr/src. It *might* be useful to know that {set of people}
all got kernels that hash to {some bit pattern} when they reproduced the
build (like Certificate Transparency). Or, more interestingly, that
{people using some configuration} got a different result. Again, like
Certificate Transparency. :)
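
To make the comparison concrete, here is a minimal sketch of how reports
from independent rebuilds could be grouped by hash. Everything in it is
an assumption made for illustration: the builder names, the
(builder, kernel bytes) report format, the function names, and the choice
of SHA-256. It is not an existing FreeBSD tool.

```python
import hashlib
from collections import defaultdict


def kernel_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a kernel image's bytes."""
    return hashlib.sha256(data).hexdigest()


def group_by_digest(reports):
    """Map each digest to the sorted list of builders that reported it.

    `reports` is an iterable of (builder_name, kernel_bytes) pairs;
    this report format is hypothetical.
    """
    groups = defaultdict(list)
    for builder, data in reports:
        groups[kernel_digest(data)].append(builder)
    return {digest: sorted(names) for digest, names in groups.items()}


def dissenters(groups):
    """Return the builders whose digest differs from the most common one.

    On a tie, the digest seen first is treated as the majority.
    """
    if not groups:
        return []
    majority = max(groups, key=lambda d: len(groups[d]))
    return sorted(
        name
        for digest, names in groups.items()
        if digest != majority
        for name in names
    )


if __name__ == "__main__":
    # Made-up reports: two builders agree, one got a different kernel.
    reports = [
        ("alice", b"kernel-A"),
        ("bob", b"kernel-A"),
        ("carol", b"kernel-B"),
    ]
    groups = group_by_digest(reports)
    for digest, names in groups.items():
        print(digest[:16], names)
    print("differs from majority:", dissenters(groups))
```

With reproducible builds, everyone building the same source with the same
configuration should land in one group; a builder in a different group is
the interesting case worth investigating.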
> Reproducibility is good for some people, and completely useless for
> others, and the people who need it aren't going to mind turning on a
> knob or two to get what they want.
Possibly. I don't have any strong opinions on whether the default is
"reproducible" or "full of information that helps me identify busted
kernels", just so long as "reproducible" is available and easy to turn
on. And my personal opinion is that it should be turned on for public
releases: I think that being able to validate the kernel is more
important than knowing what machine it was built on.
>> Yea I was reading things backwards.
>>
>> In the review, I suggested that if you've modified the tree (which the SCM
>> will tell you), then do the old format to preserve useful metadata that's
>> really really needed and if not to use the shorter version. When you've
>> modified the tree, reproducible builds aren't a concern at all.
>>
>
> How are you going to determine what constitutes a modified tree? What
> you think of as modifications may be what I call my baseline version.
Since we host our code in Subversion and have an official Git mirror,
how about svn status || git status? If you're basing your code off of
anything other than an official mirror, you get to deal with the
reproducibility problem yourself, but it sounds like many people in this
camp would prefer the more verbose version string anyway.
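
A rough sketch of that check, under some loudly stated assumptions: the
checkout is identified by a top-level .svn or .git entry, untracked files
(status lines starting with "?") do not count as modifications, and the
"verbose" / "reproducible" labels are placeholders rather than anything
the build actually emits. The function names are hypothetical too.

```python
import os
import subprocess


def is_modified(status_output: str) -> bool:
    """Return True if SCM status output lists any tracked change.

    Works on `svn status` and `git status --porcelain` output, both of
    which print nothing for a clean tree. Lines starting with "?" are
    untracked files and are ignored (an assumption; a stricter policy
    could count them as modifications).
    """
    for line in status_output.splitlines():
        if line.strip() and not line.startswith("?"):
            return True
    return False


def tree_status(src_dir: str):
    """Return status output for src_dir, or None if it cannot be determined."""
    if os.path.exists(os.path.join(src_dir, ".svn")):
        cmd = ["svn", "status"]
    elif os.path.exists(os.path.join(src_dir, ".git")):
        cmd = ["git", "status", "--porcelain"]
    else:
        return None  # not a checkout we recognise
    try:
        result = subprocess.run(
            cmd, cwd=src_dir, capture_output=True, text=True, check=False
        )
    except OSError:
        return None  # tool missing or directory unreadable
    return result.stdout if result.returncode == 0 else None


def version_style(status_output) -> str:
    """Pick a (placeholder) version-string style from the status output."""
    if status_output is None or is_modified(status_output):
        return "verbose"  # unknown or modified tree: keep the metadata
    return "reproducible"  # pristine tree: drop host, path and timestamp


if __name__ == "__main__":
    print(version_style(tree_status("/usr/src")))
```

Falling back to the verbose style when the tree's state is unknown keeps
the failure mode conservative: nobody loses metadata by accident.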
Jon
--
Jonathan Anderson
jonathan at FreeBSD.org
More information about the freebsd-arch
mailing list