Re: git: b2127b6f1ae2 - stable/13 - Install unwind.h into /usr/include

From: Dimitry Andric <dim_at_FreeBSD.org>
Date: Fri, 04 Mar 2022 18:25:06 UTC
On 3 Mar 2022, at 21:30, John Baldwin <jhb@freebsd.org> wrote:
> 
> On 3/3/22 10:08 AM, Dimitry Andric wrote:
>> I can't even get to building libreoffice, as all the openjdk ports fail to build, with DTrace errors:
>> dtrace: failed to compile script /wrkdirs/share/dim/ports/java/openjdk12/work/openjdk-jdk12u-jdk-12.0.2-10-4/build/bsd-x86_64-server-release/hotspot/variant-server/support/dtrace/hotspot.h.d: "/u
>> /mbuf.d", line 1: failed to copy type of 'm_data': Type information is in parent and unavailable
>> * For target hotspot_variant-server_gensrc_dtracefiles_hotspot_jni.h:
>> dtrace: failed to compile script /wrkdirs/share/dim/ports/java/openjdk12/work/openjdk-jdk12u-jdk-12.0.2-10-4/build/bsd-x86_64-server-release/hotspot/variant-server/support/dtrace/hotspot_jni.h.d:
>> race/mbuf.d", line 1: failed to copy type of 'm_data': Type information is in parent and unavailable
>> * For target hotspot_variant-server_gensrc_dtracefiles_hs_private.h:
>> dtrace: failed to compile script /wrkdirs/share/dim/ports/java/openjdk12/work/openjdk-jdk12u-jdk-12.0.2-10-4/build/bsd-x86_64-server-release/hotspot/variant-server/support/dtrace/hs_private.h.d:
>> ace/mbuf.d", line 1: failed to copy type of 'm_data': Type information is in parent and unavailable
>> It seems there is no way to disable DTrace in the openjdk ports, so I'm a little stuck on this.
>> Maybe it is possible to build libreoffice without Java support? But then I don't know if I will get the same error as you're getting.
>> -Dimitry
> 
> It might also be helpful to not re-use the same PR for multiple issues.  (When I
> first started reading the PR it was hard to understand why unwind.h had broken
> openjpeg.)
> 
> If I have read it correctly, some build tool (gengal?) is now segfaulting during
> the build.  My initial guess is that the build tool decided to alter its behavior
> based on a configure-type check finding unwind.h and enabling some bit of
> functionality that previously was not enabled.

So here's what appears to be happening:

Core was generated by `/wrkdirs/share/dim/ports/editors/libreoffice/work/libreoffice-7.3.0.3/instdir/pr'.
Program terminated with signal SIGSEGV, Segmentation fault.
Address not mapped to object.
#0  std::type_info::name (this=0x0) at /usr/include/c++/v1/typeinfo:318
318           return __impl::__type_name_to_string(__type_name);

this=NULL, that's never good. :)

(gdb) bt
#0  std::type_info::name (this=0x0) at /usr/include/c++/v1/typeinfo:318
#1  gcc3::deleteException (pExc=0x87b5aff00)
    at bridges/source/cpp_uno/gcc3_linux_x86-64/except.cxx:139
#2  0x000000082c2e92e7 in __cxa_free_exception (
    thrown_exception=0x87b5aff00)
    at /share/dim/src/freebsd/llvm-14-update/contrib/libcxxrt/exception.cc:627
#3  0x000000082a4e6a15 in FileExists (rURL=...)
    at svx/source/gallery2/galmisc.cxx:214
#4  0x000000082a4f5871 in GalleryBinaryStorageLocations::ImplGetURLIgnoreCase (rURL=...)
    at svx/source/gallery2/gallerybinarystoragelocations.cxx:28
#5  0x000000082a4f4ac7 in GalleryBinaryEngineEntry::CreateUniqueURL (
    rBaseURL=..., aURL=...)
    at svx/source/gallery2/gallerybinaryengineentry.cxx:56
#6  0x000000082a4e2543 in GalleryThemeEntry::GalleryThemeEntry (
    this=0x87b56b4a0, bCreateUniqueURL=true, rBaseURL=..., rName=...,
    _bReadOnly=false, _bNewFile=false, _nId=0,
    _bThemeNameFromResource=<optimized out>)
    at svx/source/gallery2/gallery1.cxx:123
#7  0x000000082a4e4c20 in Gallery::CreateTheme (this=0x87a2fffc0,
    rThemeName=...) at svx/source/gallery2/gallery1.cxx:582
#8  0x0000000000207105 in createTheme (aThemeName=...,
    aGalleryURL=..., aDestDir=..., rFiles=...,
    bRelativeURLs=<optimized out>) at svx/source/gengal/gengal.cxx:72
#9  (anonymous namespace)::GalApp::Main (
    this=0x20db98 <vclmain::createApplication()::aGalApp>)
    at svx/source/gengal/gengal.cxx:294
#10 0x000000082e571d3e in ImplSVMain ()
    at vcl/source/app/svmain.cxx:199
#11 0x000000082e5730aa in SVMain () at vcl/source/app/svmain.cxx:231
#12 0x000000000020a5df in sal_main ()
    at vcl/source/salmain/salmain.cxx:34
#13 main (argc=<optimized out>, argv=<optimized out>)
    at vcl/source/salmain/salmain.cxx:29
(gdb) up
#1  gcc3::deleteException (pExc=0x87b5aff00)
    at bridges/source/cpp_uno/gcc3_linux_x86-64/except.cxx:139
139         OUString unoName( toUNOname( header->exceptionType->name() ) );
(gdb) print *header
$7 = {referenceCount = 0, exceptionType = 0x0,
  exceptionDestructor = 0x825fc9ad0 <typeinfo for com::sun::star::ucb::InteractiveAugmentedIOException>,
  unexpectedHandler = 0x875b3a220 <gcc3::deleteException(void*)>,
  terminateHandler = 0x82c2e9510 <std::terminate()>,
  nextException = 0x83041e0e0 <abort>, handlerCount = 0,
  handlerSwitchValue = 0, actionRecord = 0x0,
  languageSpecificData = 0x82a216508 "\003}\004",
  catchTemp = 0x82a2164cc, adjustedPtr = 0x0, unwindHeader = {
    exception_class = 5138137972254386944,
    exception_cleanup = 0x82c2e9990 <exception_cleanup(_Unwind_Reason_Code, _Unwind_Exception*)>, private_1 = 0, private_2 = 34912978464}}

The problem is that header->exceptionType is NULL. This appears to be
because the offset for 'header' is 8 bytes off; there is an assert in
except.cxx that checks this, but of course it gets compiled out for
release mode:

    assert(header->exceptionDestructor == &deleteException);

(gdb) print header->exceptionDestructor
$8 = (void (*)(void *)) 0x825fc9ad0 <typeinfo for com::sun::star::ucb::InteractiveAugmentedIOException>
(gdb) print &gcc3::deleteException
$10 = (void (*)(void *)) 0x875b3a220 <gcc3::deleteException(void*)>

So those definitely don't match: the header is in the wrong spot. If you
shift it 8 bytes up, it starts looking better:

(gdb) print *(const __cxa_exception *)((const char *)header + 8)
$11 = {referenceCount = 0,
  exceptionType = 0x825fc9ad0 <typeinfo for com::sun::star::ucb::InteractiveAugmentedIOException>,
  exceptionDestructor = 0x875b3a220 <gcc3::deleteException(void*)>,
  unexpectedHandler = 0x82c2e9510 <std::terminate()>,
  terminateHandler = 0x83041e0e0 <abort>, nextException = 0x0,
  handlerCount = 0, handlerSwitchValue = 0,
  actionRecord = 0x82a216508 "\003}\004",
  languageSpecificData = 0x82a2164cc "\377\233M\001\061\067\021\243\003\aP\t\373\002\aY\025\334\002\az\003\261\002\t\211\001\003\251\002\t\310\001\020\271\002\a\356\002\003\363\002\t\215\003\003\233\003\t\220\003Z", catchTemp = 0x0, adjustedPtr = 0x87b5aff00, unwindHeader = {
    exception_class = 35100989840, exception_cleanup = 0x0,
    private_1 = 34912978464, private_2 = 36429320480}}

That is, at that offset the exceptionDestructor field exactly matches
the address of gcc3::deleteException.
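
As an aside on the assert in except.cxx mentioned earlier: the reason it
offers no protection in a release build is that, with NDEBUG defined,
assert(expr) expands to a no-op and expr is not even evaluated. A minimal
sketch of that behaviour follows; destructor_matches() and run_check() are
hypothetical names for illustration, not anything from libreoffice.

```cpp
#include <cassert>

// Counts how often the checked expression was actually evaluated.
int g_evaluations = 0;

// Hypothetical stand-in for the comparison done in except.cxx.
bool destructor_matches() {
    ++g_evaluations;
    return true;
}

// True when this translation unit was compiled with assertions active.
constexpr bool assertions_active() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}

// Returns how many times the asserted expression ran: once in a debug
// build, zero times when NDEBUG is defined (assert becomes ((void)0)).
int run_check() {
    g_evaluations = 0;
    assert(destructor_matches());
    return g_evaluations;
}
```

Built with -DNDEBUG, run_check() should return 0, which is why a
mismatching header goes unnoticed until the NULL dereference.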

Interestingly, in except.cxx there is a large block with an explanatory
comment, which explains how this can happen, but it is *only* enabled
for libc++abi, apparently:

#if defined _LIBCPPABI_VERSION // detect libc++abi
    // First, the libcxxabi commit
    // <http://llvm.org/viewvc/llvm-project?view=revision&revision=303175>
    // "[libcxxabi] Align unwindHeader on a double-word boundary" towards
    // LLVM 5.0 changed the size of __cxa_exception by adding
    //
    //   __attribute__((aligned))
    //
    // to the final member unwindHeader, on x86-64 effectively adding a hole of
    // size 8 in front of that member (changing its offset from 88 to 96,
    // sizeof(__cxa_exception) from 120 to 128, and alignof(__cxa_exception)
    // from 8 to 16); the "header1" hack below to dynamically determine whether we run against a
    // LLVM 5 libcxxabi is to look at the exceptionDestructor member, which must
    // point to this function (the use of __cxa_exception in fillUnoException is
    // unaffected, as it only accesses members towards the start of the struct,
    // through a pointer known to actually point at the start).  The libcxxabi commit
    // <https://github.com/llvm/llvm-project/commit/9ef1daa46edb80c47d0486148c0afc4e0d83ddcf>
    // "Insert padding before the __cxa_exception header to ensure the thrown" in LLVM 6
    // removes the need for this hack, so the "header1" hack can be removed again once we can be
    // sure that we only run against libcxxabi from LLVM >= 6.
    //
    // Second, the libcxxabi commit
    // <https://github.com/llvm/llvm-project/commit/674ec1eb16678b8addc02a4b0534ab383d22fa77>
    // "[libcxxabi] Insert padding in __cxa_exception struct for compatibility" in LLVM 10 changed
    // the layout of the start of __cxa_exception to
    //
    //  [8 byte  void *reserve]
    //   8 byte  size_t referenceCount
    //
    // so the "header2" hack below to dynamically determine whether we run against a LLVM >= 10
    // libcxxabi is to look whether the exceptionDestructor (with its known value) has increased its
    // offset by 8.  As described in the definition of __cxa_exception
    // (bridges/source/cpp_uno/gcc3_linux_x86-64/share.hxx), the "header2" hack (together with the
    // "#if 0" in the definition of __cxa_exception and the corresponding hack in fillUnoException)
    // can be dropped once we can be sure that we only run against new libcxxabi that has the
    // reserve member.
    if (header->exceptionDestructor != &deleteException) {
        auto const header1 = reinterpret_cast<__cxa_exception const *>(
            reinterpret_cast<char const *>(header) - 8);
        if (header1->exceptionDestructor == &deleteException) {
            header = header1;
        } else {
            auto const header2 = reinterpret_cast<__cxa_exception const *>(
                reinterpret_cast<char const *>(header) + 8);
            if (header2->exceptionDestructor == &deleteException) {
                header = header2;
            } else {
                assert(false);
            }
        }
    }
#endif
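
The numbers in the first part of that comment (offset 88 to 96, size 120
to 128, alignment 8 to 16) can be reproduced with a small mock. This is
only a sketch: the struct names are made up, the field list is copied from
the gdb output above, an LP64 target is assumed, and alignas(16) is used as
a stand-in for the bare __attribute__((aligned)) of the real commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mock of _Unwind_Exception: four 8-byte fields, as in the gdb output.
struct MockUnwindException {
    std::uint64_t exception_class;
    void (*exception_cleanup)(int, MockUnwindException*);
    std::uint64_t private_1;
    std::uint64_t private_2;
};

// Mock of __cxa_exception without extra alignment on the last member.
struct MockCxaException {
    std::size_t referenceCount;
    void* exceptionType;
    void (*exceptionDestructor)(void*);
    void (*unexpectedHandler)();
    void (*terminateHandler)();
    MockCxaException* nextException;
    int handlerCount;
    int handlerSwitchValue;
    const char* actionRecord;
    const char* languageSpecificData;
    void* catchTemp;
    void* adjustedPtr;
    MockUnwindException unwindHeader;
};

// Same fields, but the final member is forced to 16-byte alignment.
struct MockCxaExceptionAligned {
    std::size_t referenceCount;
    void* exceptionType;
    void (*exceptionDestructor)(void*);
    void (*unexpectedHandler)();
    void (*terminateHandler)();
    MockCxaExceptionAligned* nextException;
    int handlerCount;
    int handlerSwitchValue;
    const char* actionRecord;
    const char* languageSpecificData;
    void* catchTemp;
    void* adjustedPtr;
    alignas(16) MockUnwindException unwindHeader;
};

constexpr std::size_t plain_offset() {
    return offsetof(MockCxaException, unwindHeader);
}
constexpr std::size_t aligned_offset() {
    return offsetof(MockCxaExceptionAligned, unwindHeader);
}

#if defined(__LP64__)
// The alignment request opens an 8-byte hole in front of unwindHeader.
static_assert(plain_offset() == 88, "unwindHeader at 0x58");
static_assert(sizeof(MockCxaException) == 120, "unpadded size");
static_assert(aligned_offset() == 96, "hole of 8 bytes added");
static_assert(sizeof(MockCxaExceptionAligned) == 128, "padded size");
static_assert(alignof(MockCxaExceptionAligned) == 16, "alignment raised");
#endif
```

Code that walks back from the thrown object by a fixed sizeof lands 8
bytes off when it and the runtime disagree about which layout is in use.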

I remember that we have had an earlier issue which was related to this
shifting up and down by 8 bytes, as there is a part in libcxxrt's
exception.cc about this:

#ifdef __LP64__
/**
 * There's an ABI bug in __cxa_exception: unwindHeader requires 16-byte
 * alignment but it was broken by the addition of the referenceCount.
 * The unwindHeader is at offset 0x58 in __cxa_exception.  In order to keep
 * compatibility with consumers of the broken __cxa_exception, explicitly add
 * padding on allocation (and account for it on free).
 */
static const int exception_alignment_padding = 8;
#else
static const int exception_alignment_padding = 0;
#endif
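
To illustrate what "add padding on allocation (and account for it on
free)" means, here is a rough sketch. It is not libcxxrt's actual code:
the mock_* functions are hypothetical, the header size of 120 and the
unwindHeader offset of 88 are assumptions taken from the x86-64 figures
quoted above, and the allocator is assumed to return 16-byte aligned
memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

constexpr std::size_t kPadding = 8;        // exception_alignment_padding on LP64
constexpr std::size_t kHeaderSize = 120;   // assumed sizeof(__cxa_exception)
constexpr std::size_t kUnwindOffset = 88;  // 0x58, per the comment above

// Buffer layout: [padding][header][thrown object].
// Returns a pointer to the thrown object, as __cxa_allocate_exception does.
void* mock_allocate_exception(std::size_t thrown_size) {
    char* buffer = static_cast<char*>(
        std::calloc(1, kPadding + kHeaderSize + thrown_size));
    if (buffer == nullptr) {
        return nullptr;
    }
    return buffer + kPadding + kHeaderSize;
}

// Undo both the header and the padding to get back the malloc'ed pointer.
void mock_free_exception(void* thrown) {
    if (thrown == nullptr) {
        return;
    }
    std::free(static_cast<char*>(thrown) - kHeaderSize - kPadding);
}

// Checks that the unwindHeader inside the header is 16-byte aligned.
bool unwind_header_is_aligned(const void* thrown) {
    auto header = reinterpret_cast<std::uintptr_t>(thrown) - kHeaderSize;
    return (header + kUnwindOffset) % 16 == 0;
}
```

With a 16-byte aligned buffer, the header starts at offset 8 and the
unwindHeader at 8 + 88 = 96, which is a multiple of 16. The consumer still
finds the header directly in front of the thrown object, so the struct
layout itself does not change.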

However, previously with libcxxrt's own unwind.h headers installed,
libreoffice didn't need to do anything special to get at the correct
header address.

For some reason, with the libunwind headers instead, it now gets shifted
by 8 bytes. I think we are having a case of kludges upon kludges upon
hacks, which is now falling apart... :-)

In any case, if I unconditionally enable the "#if defined
_LIBCPPABI_VERSION // detect libc++abi" block in except.cxx, it does
detect the exception header successfully, the gengal.bin command does
not crash anymore, and the whole libreoffice build succeeds. The
libreoffice binaries even run OK, though I only did light testing, like
opening a doc file and messing around with it a little bit.

But I'm not sure if it is the solution we want. Maybe we should again
put in some sort of compat hack, to make the libunwind/libcxxrt
combination work for this scenario? (I think libreoffice is one of the
few applications that calls __cxa_throw directly, with its own deleter
function...)
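
For readers unfamiliar with that pattern, throwing through the ABI entry
points by hand looks roughly like the sketch below. It is a simplified
illustration, not libreoffice's bridge code: count_deleter and the two
helper functions are hypothetical, and it assumes a <cxxabi.h> that
declares abi::__cxa_allocate_exception and abi::__cxa_throw with the
usual Itanium ABI signatures.

```cpp
#include <cassert>
#include <cxxabi.h>
#include <new>
#include <typeinfo>

int g_deleter_calls = 0;

// Hypothetical deleter. The runtime stores it in exceptionDestructor and
// calls it with a pointer to the thrown object when the exception dies.
void count_deleter(void* /*thrown*/) {
    ++g_deleter_calls;
}

// Throws an int without using a throw expression.
[[noreturn]] void throw_int_manually(int value) {
    void* storage = abi::__cxa_allocate_exception(sizeof(int));
    new (storage) int(value);
    abi::__cxa_throw(storage,
                     const_cast<std::type_info*>(&typeid(int)),
                     &count_deleter);
}

// Catches the manually thrown int like any other C++ exception.
int catch_manual_throw(int value) {
    try {
        throw_int_manually(value);
    } catch (int caught) {
        return caught;
    }
}
```

Because the deleter address is known to the caller, code like except.cxx
can compare header->exceptionDestructor against it to check whether it is
looking at the right place in memory.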

-Dimitry