Re: llvm10 build failure on Rpi3

From: Mark Millard via freebsd-ports <>
Date: Sun, 04 Jul 2021 00:43:51 UTC
On 2021-Jul-3, at 14:54, bob prohaska <fbsd at> wrote:

> On Sat, Jul 03, 2021 at 01:15:19PM -0700, Mark Millard wrote:
>> So you still have not tried an artifacts or snapshot kernel+world?
> Not yet. 
>>> Eventually I resorted to running make in devel/llvm10, to my surprise it
>>> ran to completion.
>> Interesting.
>> Was this -j4? -j1? -j2? Any other interesting characteristics
>> for how it was run?
> Nothing special was done. IIRC, it was make -DBATCH > make.log in
> the background. From top's screen it looked like -j4. 
>> It would be interesting to see if building in a chroot
>> in that make style also worked (or a non-poudriere jail).
> Can you point me to instructions for doing the experiment?

I'll deal with this in a separate reply.

>>> It also ran make package successfully. Again I tried to
>>> build just devel/llvm10 using poudriere, again getting "expected expression". 
>>> At that point I resized the swap partitions to 1 GB each and tried poudriere
>>> on devel/llvm10. That got rid of the excessive swap warnings, but didn't help.
>>> Finally I placed 
>>> in /usr/local/etc/poudriere.d/make.conf and tried again. That still failed,
>>> still with "expected expression". 
>> I'll note that the running build shows Load Averages
>> of under 3. So the MAKE_JOBS_NUMBER=2 seems to be working.
>>> Since devel/llvm10 had created a package successfully, I tried slipping a copy
>>> into poudriere's package directory, hoping it would find and use the package
>>> to make further progress. Unfortunately, poudriere seems to remember the failure
>>> and won't use the proffered package. 
> [large snip which convinced me to give up on tricking poudriere into
> using a package constructed by make] 
>> Going in a different direction, one way to force a build to
>> start over after a failure is to: rm -fr PATH/.building
>> before starting a new bulk build. This might be appropriate
> I'm missing something here: what does PATH represent? There's
> nothing called .building under /usr/local/poudriere, at least
> after the run finishes. 

Part of how this works is that .building/ is initially
populated with a shadow copy of the already existing
.latest/ mostly via use of hard links, with some top
level files actually copied.

If the status of the bulk run reaches stopped:done: then the
.building/ is mv'd (renamed) to be of the form .real_*/
with a new match for the * and then the links are adjusted
to point to the new .real_*/ and the old .real_*/ is
removed. In your context, this happens inside:


So, yes, your run that reached stopped:done: no longer
has a .building/
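
As a rough illustration of the rename-and-relink sequence described
above, here is a toy model that runs entirely in a scratch directory.
It is a sketch, not poudriere's actual code: the names demo,
pkg-1.0.pkg, pkg-2.0.pkg, .real_1000 and .real_2000 are made up, and
it assumes .latest is a symbolic link to the current .real_*/
directory.

```shell
# Toy model of the package-directory swap; every name is hypothetical.
demo=$(mktemp -d)

# Starting state: one real directory, with .latest linking to it.
mkdir -p "$demo/.real_1000/All"
echo "old package" > "$demo/.real_1000/All/pkg-1.0.pkg"
ln -s .real_1000 "$demo/.latest"

# Start of a bulk run: shadow the existing contents into .building/
# with hard links, so no package data is copied.
mkdir -p "$demo/.building/All"
ln "$demo/.latest/All/pkg-1.0.pkg" "$demo/.building/All/pkg-1.0.pkg"

# The run adds a newly built package.
echo "new package" > "$demo/.building/All/pkg-2.0.pkg"

# stopped:done: case: rename .building/, repoint the link,
# then remove the old tree.
mv "$demo/.building" "$demo/.real_2000"
rm "$demo/.latest"
ln -s .real_2000 "$demo/.latest"
rm -rf "$demo/.real_1000"

ls -a "$demo" "$demo/.latest/All"
```

After this runs there is no .building/ left and .latest/All/ holds
both files. The old package survives the removal of .real_1000/
because the hard link in the new tree still refers to it.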

By contrast, say you ^C the bulk run or that it reaches the
stopped:crashed: state instead of stopped:done: . Then the
.building/ would still be present, as would the pre-existing
.real_*/ and the links that use it. This is the
context for the next bulk run reporting:

"Using packages from previously failed build: ${PACKAGES}/.building"
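
A small check along these lines can tell you whether a previous run
left a .building/ behind before you decide to remove it. The function
name leftover_building is made up for illustration, and the packages
directory is whatever ${PACKAGES} is in your setup; it is passed in
rather than assumed.

```shell
# Report whether a leftover .building/ exists under a packages directory.
# Returns 0 if one is found, 1 otherwise.
leftover_building() {
    pkgdir="$1"
    if [ -d "$pkgdir/.building" ]; then
        echo "leftover: $pkgdir/.building"
        return 0
    fi
    echo "none"
    return 1
}
```

Calling it with the packages directory as its only argument prints
either the leftover path or "none", so the rm -fr can be made
conditional on its exit status.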

>> if one suspects a problem of a kind that did not stop a
>> build but produced something for a build that fails to operate
>> correctly.
> Such as a corrupt llvm-tblgen?

Yep, possibly via it depending on something else that
has problems.

>> So lang/rust finished. That is interesting because it includes an
>> llvm build internally.
> Does that build invoke the same llvm-tblgen?

Every devel/llvm* build builds its own llvm-tblgen .
lang/rust would build its own too. And the system
llvm support builds its own as well.

> [snip] 
>> Again, poudriere does not control memory initialization in
>> the processes in the builders.
> For some reason I got the idea that whatever asked for memory to use
> was responsible for initializing it.

Part of the point of having memory management libraries
have a way to be told to fill in things like 0xA5u bytes is
to get hints about contexts that end up with memory not
explicitly initialized by the requesting program.

That is why I had you try the contrasting junk:false
case in /etc/malloc.conf . The results showed the values
that the memory allocation library itself filled in, rather
than anything specific to the code requesting the allocation.
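
For reference, the setting lives in the target string of the
/etc/malloc.conf symbolic link, so it can be read back with readlink.
A minimal sketch: the function name show_malloc_conf is made up, and
the commented-out ln line assumes the jemalloc-style option string
used above.

```shell
# Print the option string stored in a malloc.conf-style symbolic link,
# or "(unset)" if no such link exists.
show_malloc_conf() {
    link="${1:-/etc/malloc.conf}"
    if [ -L "$link" ]; then
        readlink "$link"
    else
        echo "(unset)"
    fi
}

# Setting it (as root) would look like this; left commented out on purpose:
#   ln -s 'junk:false' /etc/malloc.conf
```

Run with no argument it reports on /etc/malloc.conf itself; removing
the link returns the allocator to its built-in defaults.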

> Certainly not the kernel.....

The kernel fills in bytes in some user-space memory
as part of doing various requested operations. In such
cases it is potentially possible for the kernel to not
have filled in the memory as it should have.

It is also possible for the kernel to overwrite bytes in
user-space memory that it should not touch.
There is an example on-going issue with this for the
32-bit powerpc kernels that cover using old PowerMacs.

>>> The fact that the stoppage reported looks like
>>> a syntax error specific to devel/llvm10 which is unaffected by swap pressure
>>> makes it seem unrelated to kernel or swap constraints. 
>> The files with the syntax errors are ones generated by llvm-tblgen
>> during the build and it is the output of llvm-tblgen that is corrupt,
>> showing evidence of having used memory not initialized like it should
>> have been.
> Wouldn't that point suspicion at llvm-tblgen, of whatever version
> LLVM is actually doing the work? 

It points at llvm-tblgen and/or something(s) that llvm-tblgen
depends on. Either way, the observed failure is from the
llvm-tblgen output being incorrect and later complained about.

devel/llvm10 builds its own llvm-tblgen for its own use. Each
devel/llvm* does. (As does the system's llvm*.)

There is also the variability in which llvm-tblgen output is
messed up: it is always some example of:


but which value for the *'s tends to vary from build attempt
to build attempt. It suggests that some sort of race condition
is involved.

>>> AIUI, the hardware of the Pi4 is considerably different from the Pi3 in terms
>>> of memory management, noted from an interview with Eben Upton on YouTube.
>> Why would Eben Upton be talking about FreeBSD's memory management?
> He was talking about the Pi4 hardware and how it differed from the Pi3

Which is not memory management as such.

>> I suspect that the talk is not about what you think it is about,
>> but some narrower aspects than the overall memory management.
> I thought it had something to do with added DMA capability. The video is at
> In light of the discussion about llvm-tblgen I'm doubtful it's relevant,
> but it's not the worst way to waste an hour.
>>> Is there any sort of sanity test for the poudriere system? If I delete and
>>> re-create the existing jail can the existing package library be preserved
>>> and re-used? If not, that's OK, I'd just like to know beforehand.
>> # poudriere jail -jNAME -d
>> # poudriere jail -c -jNAME -m null -M /WORLDPATH -S /SRCPATH -v 14.0-CURRENT
>> should work fine. But really all that you are
>> doing (using an example from my environment)
>> is deleting and rewriting a few very small files
>> in a directory with the jail's name:
> So, in my case /usr/local/poudriere/poudriere-system? 

After the delete would be:

poudriere jail -c -jNAME -m null -M /usr/local/poudriere/poudriere-system -S /usr/src -v 14.0-CURRENT

Same as in your:

> (using the nomenclature in your sample instructions).
> That would leave /usr/local/poudriere/data intact....

Yep. The delete does have an option (-C ???) for causing
more to be deleted under /usr/local/poudriere/data/ .

(Despite documentation claims otherwise, it did not
seem to delete packages when requested.)

> I'm starting to understand why you think it unlikely
> to help.
>> The deletion/replacement of timestamp may have rebuild
>> consequences from appearing to have changed (or just
>> being missing).
> If timestamps guide decisions on what to make and when,
> that might be significant. Not sure how I might've screwed
> them up, but in my hands anything is possible 8-)

I took a quick look and did not notice any timestamp
comparisons controlling anything.

>> Nothing about any of those is going to change how memory
>> initialization is working in llvm-tblgen's operation
>> for generating any * files, other than
>> if the timestamp forces some sort of rebuild from scratch
>> of some build dependencies first.
> Maybe this should be obvious, but which llvm-tblgen is in 
> action? the one from the system, (12.0.1) or something
> else?

devel/llvm10 builds its own llvm-tblgen and uses it.
Every devel/llvm* build builds its own llvm-tblgen .

Looking in the .log file for a build there are lines
containing commands that start out with (from my
example devel/llvm10 build context):


Before any of those, there are commands associated with
building that bin/llvm-tblgen .
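
To find those lines quickly, something like the following works on a
build log. A sketch only: tblgen_lines is a made-up helper name, and
the log path is whatever your build wrote; none is assumed here.

```shell
# Show the first N (default 5) log lines that mention llvm-tblgen,
# each prefixed with its line number in the log.
tblgen_lines() {
    log="$1"
    count="${2:-5}"
    grep -n 'llvm-tblgen' "$log" | head -n "$count"
}
```

The line numbers make it easy to jump to the matches in the log and
see which come from building bin/llvm-tblgen and which from running
it.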

Mark Millard
marklmi at
( went
away in early 2018-Mar)