Re: llvm & RTTI over shared libraries

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 25 Apr 2022 23:22:12 UTC

On 2022-Apr-25, at 15:39, Mark Millard <marklmi@yahoo.com> wrote:

> <jbo_at_insane.engineer> wrote on
> Date: Mon, 25 Apr 2022 13:01:48 UTC :
> 
>> I've created a small minimal test case which reproduces the problem (attached).
>> The key points here are:
>> - CMake based project consisting of:
>> - The header-only interface for the plugin and the types (test-interface).
>> - The main executable that loads the plugin (test-core).
>> - A plugin implementation (plugin-one).
>> - Compiles out-of-the-box on FreeBSD 13/stable with both lang/gcc11 and devel/llvm14.
>> - It uses the exact mechanism I use to load the plugins in my actual application.
>> 
>> stdout output when compiling with lang/gcc11:
>> 
>> t is type int
>> t is type string
>> done.
>> 
>> 
>> stdout output when compiling with devel/llvm14:
>> 
>> could not cast t
>> could not cast t
>> done.
>> 
>> 
>> Unfortunately, I could not yet figure out which compiler/linker flags llvm requires to implement the same behavior as GCC does. I understand that eventually I'd be better off rewriting the necessary parts to eliminate that problem, but this is not a quick job.
>> 
>> Could somebody lend me a hand in figuring out which compiler/linker flags are necessary to get this to work with llvm?
> 
> The GCC default behavior is technically wrong. GCC allows being configured to
> do the correct thing --at the cost of ABI mismatches vs. what they originally
> did. (At least that is how I understand what I read in the code.)
> 
> To my knowledge LLVM does not allow clang++ being configured to do the wrong
> thing: it never had the ABI messed up and so did not face the self-compatibility
> question. (Bug-for-bug clang++ vs. g++ compatibility has not been the major
> goal.)

Looks like I may have got that wrong, although, like gcc11,
it is more of a when-building-the-C++-library time frame
operation than a when-compiling/linking-to-use-the-C++-library
time frame thing. Also, nothing says clang++ and g++ would
use the same strings, so the comparisons might not agree
across toolchains even when string comparisons are used.

. . ./contrib/llvm-project/libcxx/include/typeinfo has the
following material for !defined(_LIBCPP_ABI_MICROSOFT):


// ========================================================================== //
//                           Implementations
// ========================================================================== //
// ------------------------------------------------------------------------- //
//                               Unique
//               (_LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION = 1)
// ------------------------------------------------------------------------- //
// This implementation of type_info assumes a unique copy of the RTTI for a
// given type inside a program. This is a valid assumption when abiding to the
// Itanium ABI (http://itanium-cxx-abi.github.io/cxx-abi/abi.html#vtable-components).
// Under this assumption, we can always compare the addresses of the type names
// to implement equality-comparison of type_infos instead of having to perform
// a deep string comparison.
// -------------------------------------------------------------------------- //
//                             NonUnique
//               (_LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION = 2)
// -------------------------------------------------------------------------- //
// This implementation of type_info does not assume there is always a unique
// copy of the RTTI for a given type inside a program. For various reasons
// the linker may have failed to merge every copy of a types RTTI
// (For example: -Bsymbolic or llvm.org/PR37398). Under this assumption, two
// type_infos are equal if their addresses are equal or if a deep string
// comparison is equal.
// -------------------------------------------------------------------------- //
//                          NonUniqueARMRTTIBit
//               (_LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION = 3)
// -------------------------------------------------------------------------- //
// This implementation is specific to ARM64 on Apple platforms.
//
// This implementation of type_info does not assume always a unique copy of
// the RTTI for a given type inside a program. When constructing the type_info,
// the compiler packs the pointer to the type name into a uintptr_t and reserves
// the high bit of that pointer, which is assumed to be free for use under that
// ABI. If that high bit is set, that specific copy of the RTTI can't be assumed
// to be unique within the program. If the high bit is unset, then the RTTI can
// be assumed to be unique within the program.
//
// When comparing type_infos, if both RTTIs can be assumed to be unique, it
// suffices to compare their addresses. If both the RTTIs can't be assumed to
// be unique, we must perform a deep string comparison of the type names.
// However, if one of the RTTIs is guaranteed unique and the other one isn't,
// then both RTTIs are necessarily not to be considered equal.
//
// The intent of this design is to remove the need for weak symbols. Specifically,
// if a type would normally have a default-visibility RTTI emitted as a weak
// symbol, it is given hidden visibility instead and the non-unique bit is set.
// Otherwise, types declared with hidden visibility are always considered to have
// a unique RTTI: the RTTI is emitted with linkonce_odr linkage and is assumed
// to be deduplicated by the linker within the linked image. Across linked image
// boundaries, such types are thus considered different types.

// This value can be overriden in the __config_site. When it's not overriden,
// we pick a default implementation based on the platform here.


> I have a nearly-minimalist change to your example that makes it result in:
> 
> # ./test-core
> t is type_int
> t is type_string
> done.
> 
> under clang. I pasted a diff -ruN in the message later below but that may
> lead to white space not being fully preserved. (I could send it to you in
> another form if it proved needed.)
> 
> Basically I avoid inline definitions of:
> 
>        virtual ~type_base();
>        virtual ~type_int();
>        virtual ~type_string();
> 
> Also, these are deliberately(!) the first non-inline virtual
> member functions in the 3 types. Where the implementations
> are placed controls where the type_info is put for the 3 types.
> (Not a language definition issue but a fairly common
> implementation technique.)
> 
> I also make the place with the implementation be a tiny .so
> that both test-core and libplugin-one.so are bound to. This
> makes them use the same type_info definitions instead of
> having multiple competing ones around, sort of a form of
> one-definition-rule (unique addresses in the process).
> With the one definition rule followed, RTTI works just
> fine.
> 
> I do warn that this is the first direct adjustment of cmake
> material that I've ever done. So if anything looks odd for
> how I did the cmake aspects, do not be surprised. I'm not
> cmake literate.
> 
> For reference:
> 
> # find clang_test_dist_m_m/ -print
> clang_test_dist_m_m/
> clang_test_dist_m_m/plugins
> clang_test_dist_m_m/plugins/CMakeLists.txt
> clang_test_dist_m_m/plugins/plugin_one
> clang_test_dist_m_m/plugins/plugin_one/CMakeLists.txt
> clang_test_dist_m_m/plugins/plugin_one/plugin.cpp
> clang_test_dist_m_m/shared_types_impl
> clang_test_dist_m_m/shared_types_impl/CMakeLists.txt
> clang_test_dist_m_m/shared_types_impl/types_impl.cpp
> clang_test_dist_m_m/core
> clang_test_dist_m_m/core/dlclass.hpp
> clang_test_dist_m_m/core/CMakeLists.txt
> clang_test_dist_m_m/core/main.cpp
> clang_test_dist_m_m/CMakeLists.txt
> clang_test_dist_m_m/interface
> clang_test_dist_m_m/interface/plugin.hpp
> clang_test_dist_m_m/interface/types.hpp
> clang_test_dist_m_m/interface/CMakeLists.txt
> 
> where the diff -ruN is . . .
> 
> diff -ruN clang_test_dist/ clang_test_dist_m_m/ | more
> diff -ruN clang_test_dist/CMakeLists.txt clang_test_dist_m_m/CMakeLists.txt
> --- clang_test_dist/CMakeLists.txt      2022-04-19 13:38:59.000000000 -0700
> +++ clang_test_dist_m_m/CMakeLists.txt  2022-04-25 12:51:03.448582000 -0700
> @@ -5,4 +5,5 @@
> 
> add_subdirectory(core)
> add_subdirectory(interface)
> +add_subdirectory(shared_types_impl)
> add_subdirectory(plugins)
> diff -ruN clang_test_dist/core/CMakeLists.txt clang_test_dist_m_m/core/CMakeLists.txt
> --- clang_test_dist/core/CMakeLists.txt 2022-04-19 13:38:59.000000000 -0700
> +++ clang_test_dist_m_m/core/CMakeLists.txt     2022-04-25 13:18:52.539921000 -0700
> @@ -19,9 +19,12 @@
>     PRIVATE
>         test-interface
>         dl
> +    PUBLIC
> +       shared-types-impl
> )
> 
> add_dependencies(
>     ${TARGET}
> +    shared-types-impl
>     plugin-one
> )
> diff -ruN clang_test_dist/interface/types.hpp clang_test_dist_m_m/interface/types.hpp
> --- clang_test_dist/interface/types.hpp 2022-04-19 13:38:59.000000000 -0700
> +++ clang_test_dist_m_m/interface/types.hpp     2022-04-25 14:48:52.534159000 -0700
> @@ -7,18 +7,20 @@
> 
>     struct type_base
>     {
> -        virtual ~type_base() = default;
> +        virtual ~type_base();
>     };
> 
>     struct type_int :
>         type_base
>     {
> +        virtual ~type_int();
>         int data;
>     };
> 
>     struct type_string :
>         type_base
>     {
> +        virtual ~type_string();
>         std::string data;
>     };
> 
> diff -ruN clang_test_dist/plugins/plugin_one/CMakeLists.txt clang_test_dist_m_m/plugins/plugin_one/CMakeLists.txt
> --- clang_test_dist/plugins/plugin_one/CMakeLists.txt   2022-04-19 13:38:59.000000000 -0700
> +++ clang_test_dist_m_m/plugins/plugin_one/CMakeLists.txt       2022-04-25 13:19:20.188778000 -0700
> @@ -12,3 +12,14 @@
>     PRIVATE
>         plugin.cpp
> )
> +
> +target_link_libraries(
> +    ${TARGET}
> +    PUBLIC
> +        shared-types-impl
> +)
> +
> +add_dependencies(
> +    ${TARGET}
> +    shared-types-impl
> +)
> diff -ruN clang_test_dist/shared_types_impl/CMakeLists.txt clang_test_dist_m_m/shared_types_impl/CMakeLists.txt
> --- clang_test_dist/shared_types_impl/CMakeLists.txt    1969-12-31 16:00:00.000000000 -0800
> +++ clang_test_dist_m_m/shared_types_impl/CMakeLists.txt        2022-04-25 12:55:29.760985000 -0700
> @@ -0,0 +1,15 @@
> +set(TARGET shared-types-impl)
> +add_library(${TARGET} SHARED)
> +
> +target_compile_features(
> +    ${TARGET}
> +    PRIVATE
> +        cxx_std_20
> +)
> +
> +target_sources(
> +    ${TARGET}
> +    PRIVATE
> +        types_impl.cpp
> +)
> +
> diff -ruN clang_test_dist/shared_types_impl/types_impl.cpp clang_test_dist_m_m/shared_types_impl/types_impl.cpp
> --- clang_test_dist/shared_types_impl/types_impl.cpp    1969-12-31 16:00:00.000000000 -0800
> +++ clang_test_dist_m_m/shared_types_impl/types_impl.cpp        2022-04-25 14:49:23.599440000 -0700
> @@ -0,0 +1,5 @@
> +#include "../interface/types.hpp"
> +
> +interface::type_base::~type_base()     {}
> +interface::type_int::~type_int()       {}
> +interface::type_string::~type_string() {}
> 
> 
> That is all there is to the changes.
> 
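For reference, the pattern from the diff above can be sketched in a
single file. This is only an illustration under stated assumptions: it
collapses types.hpp and types_impl.cpp into one translation unit, so it
shows the out-of-line destructor technique but cannot by itself
reproduce the cross-library behavior. The describe() helper and the
"= 0" initializer are hypothetical additions, not part of the test case.

```cpp
#include <cassert>
#include <string>

namespace interface {
    struct type_base {
        virtual ~type_base();    // declared only: no inline body
    };
    struct type_int : type_base {
        virtual ~type_int();     // first non-inline virtual member
        int data = 0;
    };
    struct type_string : type_base {
        virtual ~type_string();  // first non-inline virtual member
        std::string data;
    };
}

// In the modified test case these three definitions sit in
// types_impl.cpp, built into the shared-types-impl library that both
// test-core and plugin-one link against.
interface::type_base::~type_base()     {}
interface::type_int::~type_int()       {}
interface::type_string::~type_string() {}

// Hypothetical helper: reports which derived type a reference has.
const char* describe(const interface::type_base& t) {
    if (dynamic_cast<const interface::type_int*>(&t))    return "int";
    if (dynamic_cast<const interface::type_string*>(&t)) return "string";
    return "unknown";
}
```

Calling describe() on a type_int or a type_string object takes the
matching dynamic_cast branch; a plain type_base falls through to
"unknown".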



===
Mark Millard
marklmi at yahoo.com