Re: llvm10 build failure on Rpi3

From: bob prohaska <fbsd_at_www.zefox.net>
Date: Wed, 23 Jun 2021 22:28:38 UTC
On Wed, Jun 23, 2021 at 02:03:42PM -0700, Mark Millard wrote:
> On 2021-Jun-23, at 10:43, bob prohaska <fbsd at www.zefox.net> wrote:
> 
> > On Wed, Jun 23, 2021 at 01:34:55AM -0700, Mark Millard wrote:
> >> 
> >> Not that it helps much, but: 2779096485 == 0xA5A5A5A5
> >> 
> >> It appears that this value was somehow involved in, or generated by:
> >> 
> >> [ 24% 1326/5364] cd /wrkdirs/usr/ports/devel/llvm10/work/.build && /wrkdirs/usr/ports/devel/llvm10/work/.build/bin/llvm-tblgen -gen-global-isel -I /wrkdirs/usr/ports/devel/llvm10/work/llvm-10.0.1.src/lib/Target/AMDGPU -I /wrkdirs/usr/ports/devel/llvm10/work/llvm-10.0.1.src/include -I /wrkdirs/usr/ports/devel/llvm10/work/llvm-10.0.1.src/lib/Target /wrkdirs/usr/ports/devel/llvm10/work/llvm-10.0.1.src/lib/Target/AMDGPU/AMDGPUGISel.td --write-if-changed -o lib/Target/AMDGPU/AMDGPUGenGlobalISel.inc -d lib/Target/AMDGPU/AMDGPUGenGlobalISel.inc.d
> >> 
> >> and that led to the commented out notation in the output, with the "@2779096485" listed in the comment as well.
> >> 
> > 
> > A Pi4 doing a bulk build of chromium, lxqt and apache has gone far past that
> > point building llvm10, suggesting the fault lies somewhere in my setup.
> 
> I'm not so sure of that for the 0xA5A5A5A5u value. You run
> main [so: 14 at this point]. Is it a debug build? Or a
> non-debug build? I expect that 0xA5A5A5A5u has some specific
> debug-build potential meaning.
> 
The kernel in use is 
FreeBSD www.zefox.org 14.0-CURRENT FreeBSD 14.0-CURRENT #1 main-n247405-8fa5c577de3: Fri Jun 18 17:03:19 PDT 2021     bob@www.zefox.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC-MMCCAM  arm64
and it can invoke the debugger using [enter]-tilde-control-b.

> For example, 0xA5u byte values might be the value that newly
> allocated memory is initialized to. Looking . . . man jemalloc
> (the memory allocator implementation used by FreeBSD) reports:
> 
>        opt.junk (const char *) r- [--enable-fill]
>            Junk filling. If set to "alloc", each byte of uninitialized
>            allocated memory will be initialized to 0xa5. If set to "free", all
>            deallocated memory will be initialized to 0x5a. If set to "true",
>            both allocated and deallocated memory will be initialized, and if
>            set to "false", junk filling be disabled entirely. This is intended
>            for debugging and will impact performance negatively. This option
>            is "false" by default unless --enable-debug is specified during
>            configuration, in which case it is "true" by default.
> 
> So, if you have junk filling enabled, I expect that you ran
> into a legitimate defect in the llvm-tblgen in use. Having
> Junk Filling disabled might be a workaround.
> 
> There is /etc/malloc.conf as a way of controlling the behavior:
> 
> ln -s 'junk:false' /usr/local/poudriere/poudriere-system/etc/malloc.conf
> 
> I suggest you retry building after getting the above in place.
> If it does not get the 0xA5A5A5A5u value, that would be
> more evidence of an uninitialized-memory defect in the llvm-tblgen
> involved.
>
Done and running now. In the interim I tried building llvm10 using
make in /usr/ports, but it failed with another python conflict.
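
For reference, the link between jemalloc's 0xa5 junk byte and the decimal
value in the log can be checked with a few lines of Python. This is only a
minimal sketch of the arithmetic, assuming the bad value came from a 32-bit
field read out of junk-filled memory; it is not taken from llvm-tblgen, and
the variable names are made up for illustration.

```python
import struct

# Four bytes of jemalloc's "alloc" junk pattern (0xa5), i.e. what an
# uninitialized 32-bit field would hold if junk filling were enabled.
# (Assumption: the field in question is 32 bits wide.)
junk = bytes([0xA5] * 4)

# Every byte is identical, so byte order makes no difference.
little = struct.unpack("<I", junk)[0]
big = struct.unpack(">I", junk)[0]

print(little, hex(little))
```

Both byte orders give the same number, which matches the "@2779096485"
seen in the generated output.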
 
> I do not normally run debug builds and so would not have
> run into 0xA5A5A5A5u from Junk Filling of memory allocations.
> 
> I'm not sure when I can set up and do a junk filling experiment
> (in a debug main build?). But it looks like some independent
> compare/contrast activity might be appropriate.
> 
> > The instructions you gave for setting up poudriere seemed to work perfectly
> > initially, but since that time both world and kernel have been updated
> > along with ports. Is it necessary or advisable to alter /usr/local/poudriere,
> > either by update commands or complete replacement? 
> 
> I will note that your log file reports:
> 
> Host OSVERSION: 1400023
> Jail OSVERSION: 1400019
> 
> So your jail's OSVERSION is older than the environment
> that it is running in. (Unlikely to contribute to the
> 0xA5A5A5A5u as far as I can tell.) In other words, you
> have not updated your:
> 
> /usr/local/poudriere/poudriere-system/
> 
> to 1400023 as far as I can tell.
>

After one of the world/kernel rebuilds I attempted to repeat your
poudriere setup instructions, thinking it would update the setup.
IIRC both commands were refused, not with an error, but more like
a "don't do that" sort of message. I fumbled for a while with
poudriere ports -u, but couldn't get the syntax right. Then I
noticed a reference to null-mounting /usr/ports, which strongly
suggested any updates to ports would be picked up by default. 

> Separately from that, for poudriere itself:
>
 
> I do not know if you are using ports-mgmt/poudriere-devel vs.
> ports-mgmt/poudriere . 

Poudriere version reports 3.3.6. I believe it's _not_ the -devel version.

> But, whichever, it is a port and is
> one of the ports that should be built when it has updated
> when you update /usr/ports content and should then have its
> install be updated via pkg like the other ports.
>

I've yet to master getting pkg to actually work from a local repository.
The handbook says to create 
/usr/local/etc/pkg/repos/FreeBSD.conf
containing
FreeBSD: {
        enabled: no
}

Hopefully, using 
pkg install -r /usr/local/poudriere/data/packages/main-default/All [pkgname]
will do the trick. Any cautionary tales would be much appreciated. 
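
One thing to check here: pkg's -r option takes a repository name defined in
a repos configuration file, not a directory path. A minimal sketch of a
local repository definition follows. The file name local.conf and the
repository name "local" are hypothetical, and the path assumes poudriere's
packages live under the main-default directory mentioned above.

```
# /usr/local/etc/pkg/repos/local.conf
# (hypothetical file name; "local" is an arbitrary repository name)
local: {
        url: "file:///usr/local/poudriere/data/packages/main-default",
        enabled: yes
}
```

With something like that in place, the install command would presumably be
pkg install -r local [pkgname]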

> I list ports-mgmt/poudriere-devel in the file with the other
> ports that I list in ~/origins/CA72-origins.txt and I use
> that file via -f in the bulk command.
> 

If there's a guide to using poudriere/pkg in a self-hosting situation
it would be very useful. The existing docs have a very different focus.

Thanks again for reading and replying!

bob prohaska