Re: Migrating to LLVM binutils tools (ar, nm, addr2line, etc.)

From: Shawn Webb <shawn.webb_at_hardenedbsd.org>
Date: Mon, 02 Aug 2021 13:11:32 UTC
On Mon, Jul 05, 2021 at 11:09:18AM -0400, Ed Maste wrote:
> FreeBSD migrated from GNU binutils to versions from ELF Tool Chain,
> starting in 2014. At that time there were no usable LLVM versions of
> those tools, but they have been developing rapidly since then. Now I
> think it may be prudent to migrate to the LLVM tools where they exist,
> for both functionality and maintainability reasons.
> 
> I'd like to allow use of link-time optimization (LTO) in the FreeBSD
> base system. LTO runs optimization passes over the entire executable
> (or library) at link time and thus allows for more effective
> optimization than when performed on individual compilation units.
> 
> When using LTO, object files (.o), including those contained in static
> library archives (.a), contain LLVM IR bitcode rather than target
> object code. This means that utilities that operate on object files
> need to support LLVM IR; we currently use a number of bespoke tools
> and ones obtained from ELF Tool Chain that do not have this support.
> 
> Alex Richardson also pointed out that asan (address sanitizer)
> produces a useful backtrace only if addr2line is llvm-symbolizer.
> 
> Like ELF Tool Chain the LLVM tools aim for command line and output
> format compatibility with GNU binutils, although there are a few minor
> differences. Where these cause a material issue (breaking a port or
> eliminating required functionality) we can submit LLVM bugs and work
> on patches.
> 
> In the past we provided build knobs to control individual utilities
> (e.g. WITH_LLD_IS_LD); I'd like to avoid perpetuating that here. It
> seems individual knobs (WITH_LLVM_AR_IS_AR, WITH_LLVM_NM_IS_NM,
> WITH_LLVM_SYMBOLIZER_IS_ADDR2LINE etc.) will introduce extra
> complexity without adding much value.
> 
> Alex is working on a patch now and will follow up shortly, but I
> wanted to email the list as a heads-up, and see if there are any
> comments or concerns.
> 
> Potential next steps are:
> - Introduce new build knob
> - Iterate on exp-runs and call for testing
> - Switch to LLVM tools by default
> - Major release (14.0)
> - Retire knob, leaving only the LLVM implementation.

Hey Ed,

As background for anyone curious, HardenedBSD switched to using
llvm-ar, llvm-nm, and llvm-objdump by default years ago as part of the
work to start integrating Cross-DSO CFI.

We've noticed one small, but important, issue with llvm-ar (which is
also the same underlying program as llvm-ranlib) in some behavior that
doesn't match ELF Toolchain's ar/ranlib (which I'll call elftc-ar).

In most cases, when elftc-ar fails, it still exits with a zero exit
code. This tricks the ports tree into continuing to build a port
where elftc-ar actually errored.

llvm-ar does the right thing in exiting with a non-zero exit code on
error.

However, due to this discrepancy in behavior, certain ports that cause
an error condition when calling ar/ranlib continue to build when
elftc-ar is used, but fail to build when llvm-ar is used.
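
To illustrate why the exit code alone decides the outcome, here is a
minimal Python sketch of a build driver that stops at the first step
reporting failure. It is not the ports framework; build(), run_step(),
and fake_tool() are hypothetical names, and fake_tool() merely stands
in for an archiver that hits the same error but exits with 0 (as
elftc-ar does in most cases) or non-zero (as llvm-ar does).

```python
import subprocess
import sys


def run_step(cmd):
    """Run one build step; report success only if it exited with status 0."""
    result = subprocess.run(cmd)
    return result.returncode == 0


def build(steps):
    """Run steps in order and stop at the first one that reports failure.

    Returns (names of completed steps, name of the failed step or None).
    """
    completed = []
    for name, cmd in steps:
        if not run_step(cmd):
            return completed, name
        completed.append(name)
    return completed, None


def fake_tool(exit_code):
    """Hypothetical stand-in for an archiver: exits with the given status."""
    return [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]


# Same underlying error in the "archive" step, different exit codes.
steps_silent = [("archive", fake_tool(0)), ("link", fake_tool(0))]
steps_strict = [("archive", fake_tool(1)), ("link", fake_tool(0))]

if __name__ == "__main__":
    print(build(steps_silent))
    print(build(steps_strict))
```

With the zero-exit stand-in the driver runs every step, so the error
goes unnoticed; with the non-zero stand-in it stops at "archive".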

I'm thinking that I'll report this same issue to the ELF Toolchain
folks since elftc-ar really should exit with a non-zero exit code on
failure.

I've just now hacked llvm-ar to behave the same as elftc-ar[0] and
will do a poudriere bulk run soon.

I'll report back on both the ELF Toolchain notification and the
poudriere run as soon as I have more info.

[0]:
https://git.hardenedbsd.org/hardenedbsd/HardenedBSD/-/commit/5bdcc54a23f05883f55e895da49726955fa8b07b

Thanks,

-- 
Shawn Webb
Cofounder / Security Engineer
HardenedBSD

https://git.hardenedbsd.org/hardenedbsd/pubkeys/-/raw/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc