ports head -r484652: lang/ruby24 fails to amd64 -> armv7 cross build: qemu: uncaught target signal 11 (2 of them) [armv7 native build worked]

Mark Millard marklmi at yahoo.com
Thu Nov 15 19:47:01 UTC 2018


[While the poudriere-devel/qemu-arm-static/nxb-bin/ amd64 -> armv7
cross build failed, a native armv7 build worked. The difference
that matters is likely the use of -O2 vs. -O. More on that
below.]

On 2018-Nov-10, at 23:29, Mark Millard <marklmi at yahoo.com> wrote:

> Poudriere-devel reported:
> 
> [00:18:32] [07] [00:02:56] Saved lang/ruby24 | ruby-2.4.5,1 wrkdir to: /usr/local/poudriere/data/wrkdirs/FBSDFSSDjailArmV7-default/default/ruby-2.4.5,1.tbz
> [00:18:32] [07] [00:02:56] Finished lang/ruby24 | ruby-2.4.5,1: Failed: build
> 
> The log showed:
> 
> --- miniruby ---
> linking miniruby
> --- .rbconfig.time ---
> --- encdb.h ---
> generating encdb.h
> --- .rbconfig.time ---
> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> Segmentation fault
> *** [.rbconfig.time] Error code 139
> 
> make[1]: stopped in /wrkdirs/usr/ports/lang/ruby24/work/ruby-2.4.5
> --- encdb.h ---
> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> Segmentation fault
> *** [encdb.h] Error code 139
> 
> make[1]: stopped in /wrkdirs/usr/ports/lang/ruby24/work/ruby-2.4.5
> 2 errors
> 
> 
> Despite how the above looks, I find only one .core file in the
> tar archive produced for the failure:
> 
> # find /wrkdirs/usr/ports/lang/ruby/ -name "*.core" -print
> /wrkdirs/usr/ports/lang/ruby/work/ruby-2.4.5/qemu_miniruby.core
> 
> Apparently qemu does not allow for separate files for distinct
> processes.
> 
> For that .core file I find (libexec/gdb):
> 
> # chroot /usr/obj/DESTDIRs/clang-armv7-installworld-poud
> # cd /wrkdirs/usr/ports/lang/ruby/work/ruby-2.4.5/
> # /usr/libexec/gdb miniruby qemu_miniruby.core 
> . . .
> (gdb) bt
> #0  0x00113f84 in rb_gc_writebarrier_unprotect (obj=4104601600) at gc.c:1119
> 1119	    return RVALUE_WB_UNPROTECTED_BITMAP(obj) != 0;
> [New Thread f4b5d000 (LWP 100638/<unknown>)]
> [New LWP 61684]
> Current language:  auto; currently minimal
> (gdb) bt
> #0  0x00113f84 in rb_gc_writebarrier_unprotect (obj=4104601600) at gc.c:1119
> #1  0x000c3fc8 in rb_include_class_new (module=4104569400, super=<value optimized out>) at ruby.h:1456
> #2  0x000c4424 in include_modules_at (klass=4104602160, c=4104602160, module=4104569400, search_super=<value optimized out>) at class.c:913
> #3  0x000c41f0 in rb_include_module (klass=4104602160, module=4104569400) at class.c:870
> #4  0x001f6dec in Init_String () at string.c:10021
> #5  0x00129398 in rb_call_inits () at inits.c:28
> #6  0x00103bac in ruby_setup () at eval.c:60
> #7  0x00103be8 in ruby_init () at eval.c:76
> #8  0x000a3300 in main (argc=11, argv=0x9fffe41c) at main.c:35
> (gdb) up
> #1  0x000c3fc8 in rb_include_class_new (module=4104569400, super=<value optimized out>) at ruby.h:1456
> 1456	    rb_gc_writebarrier_unprotect(x);
> (gdb) up
> #2  0x000c4424 in include_modules_at (klass=4104602160, c=4104602160, module=4104569400, search_super=<value optimized out>) at class.c:913
> 913		iclass = rb_include_class_new(module, RCLASS_SUPER(c));
> (gdb) up
> #3  0x000c41f0 in rb_include_module (klass=4104602160, module=4104569400) at class.c:870
> 870	    changed = include_modules_at(klass, RCLASS_ORIGIN(klass), module, TRUE);
> (gdb) up
> #4  0x001f6dec in Init_String () at string.c:10021
> 10021	    rb_include_module(rb_cString, rb_mComparable);
> (gdb) up
> #5  0x00129398 in rb_call_inits () at inits.c:28
> 28	    CALL(String);
> (gdb) up
> #6  0x00103bac in ruby_setup () at eval.c:60
> 60		rb_call_inits();
> (gdb) up
> #7  0x00103be8 in ruby_init () at eval.c:76
> 76	    int state = ruby_setup();
> (gdb) up
> #8  0x000a3300 in main (argc=11, argv=0x9fffe41c) at main.c:35
> 35		ruby_init();
> 
> (I'm not familiar with what details libexec/gdb gets
> right vs. wrong. But the call chain seems coherent.)
> 
> Host environment:
> 
> # uname -apKU
> FreeBSD FBSDFSSD 13.0-CURRENT FreeBSD 13.0-CURRENT #0 r340287M: Fri Nov  9 08:37:01 PST 2018     markmi at FBSDFSSD:/usr/obj/amd64_clang/amd64.amd64/usr/src/amd64.amd64/sys/GENERIC-NODBG  amd64 amd64 1300003 1300003
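
The "Error code 139" lines in the quoted log are consistent with
the signal 11 that qemu reports, assuming the usual shell
convention of reporting 128 plus the signal number for a command
killed by a signal. A minimal sketch of that arithmetic (the
function name is hypothetical, used only for illustration):

```python
import signal

def signal_from_exit_code(code):
    """Return the signal number implied by a shell-style exit code, or None."""
    # Assumption: exit codes above 128 mean "killed by signal (code - 128)".
    return code - 128 if code > 128 else None

signo = signal_from_exit_code(139)
# Look up the symbolic name for the signal number on this platform.
print(signo, signal.Signals(signo).name)
```

So both failing make targets look like the same kind of
failure: a SIGSEGV in the emulated miniruby.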

A prior example, x11/pixman, fails for native armv7 builds
but works for poudriere-devel/qemu-arm-static/nxb-bin/
(native cross tools based) amd64 -> armv7 cross builds:
the cross builds are fine but a link fails during native
armv7 builds.

It turned out that the cross builds, with the host-native
cross tools involved, were using -O2 where the native builds
were using -O: the code in share/mk/sys.mk that is designed
to use -O for arm fails to do so in the cross builds and
uses -O2 instead.

(MACHINE_ARCH temporarily looks to be amd64, which
gets a -O2 put in CFLAGS instead of -O .)
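
As a rough illustration of that effect, here is a simplified
model of a MACHINE_ARCH based choice of the default optimization
level. It is not the actual share/mk/sys.mk text: the function
name and the prefix test are assumptions made only for this
sketch.

```python
# Simplified, hypothetical model of an architecture-based default
# optimization level. This is NOT the real sys.mk logic.

def default_opt_flag(machine_arch):
    """Return the default optimization flag for a MACHINE_ARCH value."""
    # Assumption: arm* and mips* default to -O, everything else to -O2.
    if machine_arch.startswith(("arm", "mips")):
        return "-O"
    return "-O2"

# Native armv7 build: the selection sees the target architecture.
native = default_opt_flag("armv7")
# Cross build with host-native tools: the selection sees amd64 instead.
cross = default_opt_flag("amd64")
print(native, cross)  # prints "-O -O2"
```

The point is only that the same selection yields a different
flag when it is handed the host's architecture instead of the
target's.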

ruby seems to go the other direction: with -O2 involved
something builds that fails to run during the build.
With -O involved instead ruby builds fine and produces
a ruby that works.

(I've not done any analysis to see if the -O2 based
build failure is because the code makes assumptions
that are not guaranteed or because the compiler/linker
is producing something bad from well-defined code.)


Bryan Drewery is now aware of the odd -O2 vs. -O
behavior under poudriere-devel/qemu-arm-static/nxb-bin/
amd64 -> armv7 cross builds and likely it will be
fixed at some point.

But the existing behavior means that official armv6 and armv7
port builds that use that poudriere-devel/qemu-arm-static/nxb-bin/
amd64 -> armv7 cross build structure have been using
-O2 for a long time. This may challenge the use of
-O by default in CFLAGS for armv6 and armv7, in that -O2
has been under an implicit test for as long as the
cross build structure has been used with share/mk/sys.mk
having the MACHINE_ARCH based selection of -O2 vs. -O .

mips* may have similar issues to arm* based on what
share/mk/sys.mk does for -O2 vs. -O .

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


