Re: llvm10 build failure on Rpi3

From: Mark Millard via freebsd-ports <freebsd-ports_at_freebsd.org>
Date: Sat, 03 Jul 2021 20:55:46 UTC
On 2021-Jul-3, at 13:15, Mark Millard <marklmi at yahoo.com> wrote:

> On 2021-Jul-3, at 11:25, bob prohaska <fbsd@www.zefox.net> wrote:
> 
>>>>> On 2021-Jul-2, at 19:23, Mark Millard <marklmi at yahoo.com> wrote:
>>>> 
>>>>>> Side note:
>>>>>> 
>>>>>> It looks like http://www.zefox.org/~bob/swaplogs/poudrierellvm10.log
>>>>>> shows that you tried with:
>>>>>> 
>>>>>> Device          1K-blocks     Used    Avail Capacity
>>>>>> /dev/da0s2b       1048576    25784  1022792     2%
>>>>>> /dev/mmcsd0s2b    1048576    25124  1023452     2%
>>>>>> Total             2097152    50908  2046244     2%
>>>>>> 
>> [hope the quotes are right!]
>> 
>> That's correct. The sequence of experiments ran something like this:
>> 
>> The Pi3 was configured with a pair of ~3 GB swap partitions, one on
>> microSD, the other on the 1 TB mechanical hard disk. Make was not limited
>> in the number of jobs it could run in parallel. OOMA was restrained by putting
>> vm.pageout_oom_seq="4096"
>> vm.pfault_oom_attempts="20"
>> in /boot/loader.conf. The usual "excessive swap" warnings were presented
>> during boot and ignored by me. 
>> 
>> Worlds and kernels built without trouble, so I tried building www/chromium
>> using poudriere. It stopped in devel/llvm10 with the "expected expression"
>> error and continued to stop there despite updating /usr/ports several times. 
>> At no time were there any hints of swap problems. Resorting to a GENERIC
>> self-hosted kernel made no difference. /usr/src was not tampered with. 
> 
> So you still have not tried an artifacts or snapshot kernel+world?
> 
>> Eventually I resorted to running make in devel/llvm10, to my surprise it
>> ran to completion.
> 
> Interesting.
> 
> Was this -j4? -j1? -j2? Any other interesting characteristics
> for how it was run?
> 
> It would be interesting to see if building in a chroot
> in that make style also worked (or a non-poudriere jail).
> 
>> It also ran make package successfully. Again I tried to
>> build just devel/llvm10 using poudriere, again getting "expected expression". 
>> 
>> At that point I resized the swap partitions to 1 GB each and tried poudriere
>> on devel/llvm10. That got rid of the excessive swap warnings, but didn't help.
>> Finally I placed 
>> MAKE_JOBS_NUMBER=2 
>> in /usr/local/etc/poudriere.d/make.conf and tried again. That still failed,
>> still with "expected expression". 
> 
> I'll note that the running build shows Load Averages
> of under 3. So the MAKE_JOBS_NUMBER=2 seems to be working.
> 
>> Since devel/llvm10 had created a package successfully, I tried slipping a copy
>> into poudriere's package directory, hoping it would find and use the package
>> to make further progress. Unfortunately, poudriere seems to remember the failure
>> and won't use the proffered package. 
> 
> After things build correctly, things tend to look something like
> (using an example):
> 
> 2# ls -FTla /usr/local/poudriere/data/packages/main-CA53-default/
> total 12
> drwxr-xr-x  3 root  wheel  512 Jul  3 07:19:32 2021 ./
> drwxr-xr-x  4 root  wheel  512 Jul  1 19:25:44 2021 ../
> lrwxr-xr-x  1 root  wheel   18 Jun 28 04:32:43 2021 .buildname@ -> .latest/.buildname
> lrwxr-xr-x  1 root  wheel   20 Jun 28 04:32:43 2021 .jailversion@ -> .latest/.jailversion
> lrwxr-xr-x  1 root  wheel   16 Jul  3 07:19:32 2021 .latest@ -> .real_1625321972
> drwxr-xr-x  4 root  wheel  512 Jul  3 07:19:32 2021 .real_1625321972/
> lrwxr-xr-x  1 root  wheel   11 Jun 28 04:32:43 2021 All@ -> .latest/All
> lrwxr-xr-x  1 root  wheel   14 Jun 28 04:32:43 2021 Latest@ -> .latest/Latest
> lrwxr-xr-x  1 root  wheel   17 Jun 28 04:32:43 2021 meta.conf@ -> .latest/meta.conf
> lrwxr-xr-x  1 root  wheel   16 Jun 28 04:32:43 2021 meta.txz@ -> .latest/meta.txz
> lrwxr-xr-x  1 root  wheel   23 Jun 28 04:32:43 2021 packagesite.txz@ -> .latest/packagesite.txz
> 
> But, if a bulk is in progress or has finished after some package
> had a build failure, there is also a:
> 
> .building/
> 
> in there. That is what the message:
> 
> Using packages from previously failed build: ${PACKAGES}/.building
> 
> is about when starting poudriere bulk again. This is how
> poudriere avoids rebuilding what successfully built --but
> without adjusting the prior successful bulk build (if any).
> 
> So poudriere would have expected the file for devel/llvm10 's
> build to be in that .building/ directory instead of down under
> the .real_*/ directory.
> 
> (I've not checked if there is other record keeping in .building/
> about the materials as well.)
> 
> Going in a different direction, one way to force a build to
> start over after a failure is to: rm -fr PATH/.building
> before starting a new bulk build. This might be appropriate
> if one suspects a problem of a kind that did not stop a
> build but produced something for a build that fails to operate
> correctly.
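
A minimal sketch of that reset step, assuming a plain sh environment.
The function names has_building and reset_building are made up for
illustration, and the package directory is whatever your own
jail-ports-set combination uses (the main-CA53-default path in the
listing above is just my example). It only removes .building/ and
leaves the .real_*/ trees alone:

```shell
#!/bin/sh
# Hypothetical helpers (not part of poudriere): check for, and optionally
# remove, the .building/ tree left by a failed or interrupted bulk run.

# Succeeds if the given package directory contains a .building/ directory.
has_building() {
    [ -d "$1/.building" ]
}

# Removes .building/ under the given package directory, if present,
# and reports what was done.
reset_building() {
    pkgdir=$1
    if has_building "$pkgdir"; then
        echo "removing $pkgdir/.building"
        rm -fr "$pkgdir/.building"
    else
        echo "no .building under $pkgdir"
    fi
}

# Example use (path is an assumption; substitute your own):
#   reset_building /usr/local/poudriere/data/packages/main-CA53-default
```

Run it only when no bulk build is active, since a running bulk is
writing into that same directory.
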
> 
>> It's still running, on lang/spidermonkey78.  
> 
> So lang/rust finished. That is interesting because it includes an
> llvm build internally.
> 
> Also: had you updated to pick up the workaround for the rust
> build failures on aarch64? I doubt it because it was
> committed on 2021-July-02. See:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256864#c18
> 
> So that you did not get the process crash/core-dump during
> lang/rust 's build is interesting.
> 
>> There were no reboots between experiments.
>> 
>> My first suspicion is that I've somehow screwed up the poudriere setup, perhaps
>> by a fumbled execution of poudriere jail -u, which I mistakenly thought was
>> needed after updating /usr/ports.
> 
> Again, poudriere does not control memory initialization in
> the processes in the builders.
> 
>> The fact that the stoppage reported looks like
>> a syntax error specific to devel/llvm10 which is unaffected by swap pressure
>> makes it seem unrelated to kernel or swap constraints. 
> 
> The files with the syntax errors are ones generated by llvm-tblgen
> during the build and it is the output of llvm-tblgen that is corrupt,
> showing evidence of having used memory not initialized like it should
> have been.
> 
>> AIUI, the hardware of the Pi4 is considerably different from the Pi3 in terms
>> of memory management, noted from an interview with Eben Upton on YouTube.
> 
> Why would Eben Upton be talking about FreeBSD's memory management?
> 
> I suspect that the talk is not about what you think it is about,
> but about some narrower aspects than the overall memory management.
> 
>> He 
>> didn't go into any detail.  Whether that's relevant is unclear to me, but it 
>> does suggest the Pi4, even with restricted memory, won't behave like a Pi3.
> 
> Various reserved memory areas and such will vary but FreeBSD
> uses the same general memory management code, not completely
> separate code.
> 
>> Is there any sort of sanity test for the poudriere system? If I delete and
>> re-create the existing jail can the existing package library be preserved
>> and re-used? If not, that's OK, I'd just like to know beforehand.
>> 
> 
> # poudriere jail -jNAME -d
> # poudriere jail -c -jNAME -m null -M /WORLDPATH -S /SRCPATH -v 14.0-CURRENT
> 
> should work fine. But really all that you are
> doing (using an example from my environment)
> is deleting and rewriting a few very small files
> in a directory with the jail's name:
> 
> # ls -FTla /usr/local/etc/poudriere.d/jails/main-CA53/
> total 36
> drwxr-xr-x  2 root  wheel  512 Jul  2 21:03:23 2021 ./
> drwxr-xr-x  3 root  wheel  512 Jul  2 21:03:23 2021 ../
> -rw-r--r--  1 root  wheel   14 Jul  2 21:03:23 2021 arch
> -rw-r--r--  1 root  wheel    5 Jul  2 21:03:23 2021 method
> -rw-r--r--  1 root  wheel   33 Jul  2 21:03:23 2021 mnt
> -rw-r--r--  1 root  wheel    2 Jul  2 21:03:23 2021 pkgbase
> -rw-r--r--  1 root  wheel   14 Jul  2 21:03:23 2021 srcpath
> -rw-r--r--  1 root  wheel   11 Jul  2 21:03:23 2021 timestamp
> -rw-r--r--  1 root  wheel   13 Jul  2 21:03:23 2021 version
> 
> # cat /usr/local/etc/poudriere.d/jails/main-CA53/arch
> arm64.aarch64
> 
> # cat /usr/local/etc/poudriere.d/jails/main-CA53/method
> null
> 
> # cat /usr/local/etc/poudriere.d/jails/main-CA53/mnt
> /usr/obj/DESTDIRs/main-CA53-poud
> 
> # cat /usr/local/etc/poudriere.d/jails/main-CA53/pkgbase 
> 0
> 
> # cat /usr/local/etc/poudriere.d/jails/main-CA53/srcpath 
> /usr/main-src
> 
> # cat /usr/local/etc/poudriere.d/jails/main-CA53/timestamp 
> 1625285003
> 
> # cat /usr/local/etc/poudriere.d/jails/main-CA53/version 
> 14.0-CURRENT
> 
> The deletion/replacement of timestamp may have rebuild
> consequences from appearing to have changed (or just
> being missing).
> 
> Nothing about any of those is going to change how memory
> initialization is working in llvm-tblgen's operation
> for generating any *GenGlobalISel.inc files, other than
> if the timestamp forces some sort of rebuild from scratch
> of some build dependencies first.

I'll note that the poudriere ports binding is similar for
deletion and creation: just a few small files in a directory
for the name used (default here):

# ls -FTla /usr/local/etc/poudriere.d/ports/default/
total 24
drwxr-xr-x  2 root  wheel  512 Apr 18 02:05:47 2021 ./
drwxr-xr-x  3 root  wheel  512 Apr 18 02:05:47 2021 ../
-rw-r--r--  1 root  wheel    2 Apr 18 02:05:47 2021 created_fs
-rw-r--r--  1 root  wheel    5 Apr 18 02:05:47 2021 method
-rw-r--r--  1 root  wheel   11 Apr 18 02:05:47 2021 mnt
-rw-r--r--  1 root  wheel   11 Apr 18 02:05:47 2021 timestamp

# cat /usr/local/etc/poudriere.d/ports/default/created_fs 
0

# cat /usr/local/etc/poudriere.d/ports/default/method 
null

# cat /usr/local/etc/poudriere.d/ports/default/mnt
/usr/ports

# cat /usr/local/etc/poudriere.d/ports/default/timestamp 
1618736747
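
For looking at either kind of directory in one step, here is a small
sketch. The function name dump_meta is hypothetical, not something
poudriere provides; it just prints each regular file's name and
contents, which is the same information as the ls and cat sequences
above:

```shell
#!/bin/sh
# Hypothetical helper (not part of poudriere): print "name: contents"
# for every regular file in a poudriere.d jails/NAME or ports/NAME
# directory. Assumes the files are small, single-line text files, as in
# the listings above.
dump_meta() {
    dir=$1
    for f in "$dir"/*; do
        # Skip anything that is not a regular file (and the unexpanded
        # glob when the directory is empty).
        [ -f "$f" ] || continue
        printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Example use (paths taken from my environment; adjust to yours):
#   dump_meta /usr/local/etc/poudriere.d/ports/default
#   dump_meta /usr/local/etc/poudriere.d/jails/main-CA53
```

Saving that output before a jail -d and comparing it after the jail -c
is one way to confirm that only the timestamp changed.
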


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)