Re: llvm & RTTI over shared libraries

From: Mark Millard <marklmi_at_yahoo.com>
Date: Fri, 15 Apr 2022 17:55:28 UTC

On 2022-Apr-14, at 23:25, Mark Millard <marklmi@yahoo.com> wrote:

> From: <jbo_at_insane.engineer> wrote on
> Date: Thu, 14 Apr 2022 16:36:24 +0000 :
> (I've line-split the text.)
> 
>> I'm in the middle of moving to FreeBSD as my primary development platform (desktop wise).
>> As such, I am currently building various software tools I've written over the years on
>> FreeBSD for the first time. Most of those were developed on either Linux+GCC or on
>> Windows+Mingw (MinGW -> GCC).
>> 
>> Today I found myself debugging a piece of software which runs fine on FreeBSD when
>> compiled with gcc11 but not so much when compiling with clang14.
>> I managed to track down the problem but I lack the deeper understanding to resolve
>> this properly - so here we are.
>> 
>> The software in question is written in C++20 and consists of:
>>  - An interface library (just a bunch of header files).
>>  - A main executable.
>>  - A bunch of plugins which the executable loads via dlopen().
>> 
>> The interface headers provide several types. Let's call them A, B, C and D, where B,
>> C and D inherit from A.
>> The plugins use std::dynamic_pointer_cast() to cast an std::shared_ptr<A> (received
>> via the plugin interface) to the derived classes such as std::shared_ptr<B>.
>> This is where the trouble begins.
>> 
>> If everything (the main executable and the plugins) is compiled using gcc11, everything
>> works "as I expect it".
>> However, when compiling everything with clang14, the main executable is able to load the
>> plugins successfully but those std::dynamic_pointer_cast() calls within the plugins
>> always return nullptr.
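
One way to narrow down a failure like this is to print what each
side of the boundary believes about the dynamic type. The sketch
below is only a diagnostic aid, not a fix. The helper names
describe and checked_cast_to_B are made up for illustration, and
A and B are placeholders for the interface types described above.
If the executable and a plugin report the same name but different
type_info addresses for the same class, the two sides are probably
using separate copies of the RTTI data, which is one situation in
which a dynamic cast can fail with some C++ runtimes.

```cpp
#include <cstdio>
#include <memory>
#include <string>
#include <typeinfo>

// Placeholder types standing in for the interface types A and B.
struct A { virtual ~A() = default; };
struct B : A { int v = 0; };

// Hypothetical helper: report the name and address of a std::type_info
// object as seen by the module that calls it.
std::string describe(const std::type_info& ti)
{
    char buf[512];
    std::snprintf(buf, sizeof buf, "name=%s type_info@%p",
                  ti.name(), static_cast<const void*>(&ti));
    return buf;
}

// Hypothetical helper: log the dynamic type of *p and this module's view of
// typeid(B) to stderr, then attempt the cast.
std::shared_ptr<B> checked_cast_to_B(const std::shared_ptr<A>& p)
{
    const A& ref = *p; // p is assumed to be non-null
    std::fprintf(stderr, "dynamic type: %s\n", describe(typeid(ref)).c_str());
    std::fprintf(stderr, "typeid(B)   : %s\n", describe(typeid(B)).c_str());
    return std::dynamic_pointer_cast<B>(p);
}
```

Calling describe(typeid(B)) once from the executable and once from
inside a plugin, and comparing the two lines, shows whether both
modules refer to the same type_info object.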
>> 
>> After some research I seem to understand that the way that RTTI is handled over shared
>> library boundaries is different between GCC and LLVM.
>> This is where my understanding starts to get less solid.
>> 
>> I read the manual page of dlopen(3). It would seem like the flag RTLD_GLOBAL would be
>> potentially interesting to me: "Symbols from this shared object [...] of needed objects
>> will be available for resolving undefined references from all other shared objects."
>> The software (which "works as intended" when compiled with GCC) was so far only calling
>> dlopen(..., RTLD_LAZY).
>> I'm not even sure whether this applies to my situation. My gut feeling tells me that I'm
>> heading down the wrong direction here. After all, the main executable is able to load
>> the plugins and to call the plugin's function which receives an std::shared_ptr<A>
>> as a parameter just fine, also when compiled with LLVM.
>> Is the problem I'm experiencing related to the way that the plugin (shared library) is
>> loaded or the way that the symbols are being exported?
>> In the current state, the plugins do not explicitly export any symbols.
>> 
>> Here's a heavily simplified version of my scenario:
> 
> The simplified example was not designed to compile and test.
> So I made guesses and made my own. The .cpp files have
> comments on the compile/link commands used and there are
> examples of c++ and g++11 compile/link/run sequences
> after the source code. The code is not well commented. Nor
> does it deal with error handling or the like. But it is
> fairly short overall.
> 
> # more base_plugin.h
> #include  <memory>
> 
> // For its own libbase_plugin.so file, load time bound, no dlopen used for it:
> 
> struct base
> {
>        virtual ~base();
> };
> struct base_plugin
> {
>        virtual std::shared_ptr<base> create_data_instance()             = 0;
>        virtual void                  action(std::shared_ptr<base> data) = 0;
>        virtual ~base_plugin();
> };
> 
> extern "C" // for each derived plugin .so file:
> {
>        using plugin_instance_creator= base_plugin* (*)();
>        const char plugin_instance_creator_name[] = "create_plugin_instance"; // Lookup via dlsym.
> 
>        using plugin_instance_destroyer= void (*)(base_plugin*);
>        const char plugin_instance_destroyer_name[] = "destroy_plugin_instance"; // Lookup via dlsym.
> }
> 
> # more base_plugin.cpp
> // c++   -std=c++20 -O0 -g -fPIC -lc++    -olibbase_plugin.so -shared base_plugin.cpp
> // g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -olibbase_plugin.so -shared base_plugin.cpp
> 
> #include "base_plugin.h"
> 
> base::~base()               {}
> base_plugin::~base_plugin() {}
> 
> # more main_using_plugin.cpp 
> // c++   -std=c++20 -O0 -g -fPIC -lc++    -L. -lbase_plugin -Wl,-rpath=. \
> //                                       -omain_using_plugin main_using_plugin.cpp
> // g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -L. -lbase_plugin -Wl,-rpath=. \
> //       -Wl,-rpath=/usr/local/lib/gcc11 -omain_using_plugin main_using_plugin.cpp
> 
> #include "base_plugin.h"
> #include <dlfcn.h>
> 
> int main()
> {
>        auto dl= dlopen("./libsharedlib_plugin.so",RTLD_LAZY); // hardcoded .so path for the example
> 
>        union { void* as_voidptr; plugin_instance_creator as_plugin_instance_creator; } creator_plugin_func;
>        creator_plugin_func.as_voidptr= dlsym(dl,plugin_instance_creator_name);
> 
>        union { void* as_voidptr; plugin_instance_destroyer as_plugin_instance_destroyer; } destroyer_plugin_func;
>        destroyer_plugin_func.as_voidptr= dlsym(dl,plugin_instance_destroyer_name);
> 
>        auto plugin= (creator_plugin_func.as_plugin_instance_creator)();
> 
>        { // Local scope for data
>                std::shared_ptr<base> data{plugin->create_data_instance()};
>                plugin->action(data);
>        } // Presume for the example that nothing requires the plugin after here.
> 
>        (destroyer_plugin_func.as_plugin_instance_destroyer)(plugin);
>        destroyer_plugin_func.as_voidptr= nullptr;
> 
>        dlclose(dl);
> }
> 
> NOTE: So, other than the dlopen, the above has no direct tie to
> the specific dynamically loaded plugin. The base_plugin is in a
> .so but is load-time bound instead of using dlopen. That .so
> would be used by all the plugins found via dlopen. (I only
> made one example.)
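
Since the example skips error handling, here is a minimal sketch of
how the dlopen/dlsym steps could be checked. The helper names
open_library and load_function are made up for illustration and are
not part of the test that was run. The flags parameter is there so
that something like RTLD_LAZY | RTLD_GLOBAL can be tried without
touching the rest of the code. The reinterpret_cast from void* to a
function pointer is only conditionally supported by the C++
standard, but POSIX systems rely on it for dlsym.

```cpp
#include <dlfcn.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: dlopen with an error check. Throws with the dlerror()
// text when the library cannot be loaded.
void* open_library(const char* path, int flags = RTLD_LAZY)
{
    void* handle = dlopen(path, flags);
    if (!handle) {
        const char* msg = dlerror();
        throw std::runtime_error(msg ? msg : "dlopen failed");
    }
    return handle;
}

// Hypothetical helper: look up a symbol and convert it to the requested
// function-pointer type, throwing if the lookup fails.
template <typename Fn>
Fn load_function(void* handle, const char* name)
{
    dlerror(); // clear any stale error state before the lookup
    void* sym = dlsym(handle, name);
    if (const char* msg = dlerror())
        throw std::runtime_error(msg);
    if (!sym)
        throw std::runtime_error(std::string("symbol is null: ") + name);
    return reinterpret_cast<Fn>(sym);
}
```

In main_using_plugin.cpp this could stand in for the two unions,
for example load_function<plugin_instance_creator>(dl,
plugin_instance_creator_name) and likewise for the destroyer.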
> 
> As for the .so used via dlopen/dlsym/dlclose . . .
> 
> # more sharedlib_plugin.h
> #include "base_plugin.h"
> 
> // For its own libsharedlib_plugin.so file, where dlopen is used to find it:
> 
> struct sharedlib : base { int v; };
> struct sharedlib_plugin : base_plugin
> {
>        std::shared_ptr<base> create_data_instance()             override;
>        void                  action(std::shared_ptr<base> base) override;
> };
> 
> # more sharedlib_plugin.cpp
> // c++   -std=c++20 -O0 -g -fPIC -lc++    -olibsharedlib_plugin.so -shared sharedlib_plugin.cpp
> // g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -olibsharedlib_plugin.so -shared sharedlib_plugin.cpp
> 
> #include "sharedlib_plugin.h"
> #include <iostream>
> 
> std::shared_ptr<base> sharedlib_plugin::create_data_instance()
> {
>        std::cout << "create_data_instance in use from dlopen'd .so\n";
>        return std::static_pointer_cast<base>(std::make_shared<sharedlib>());
> }
> 
> void sharedlib_plugin::action(std::shared_ptr<base> b)
> {
>        std::cout << "action in use from dlopen'd .so class\n";
>        auto separate_share = std::dynamic_pointer_cast<sharedlib>(b);
>        if (separate_share->v || 1 < separate_share.use_count())
>                std::cout << "separate_share is not nullptr (would crash otherwise)\n";
> }
> 
> extern "C" base_plugin* create_plugin_instance()
> {
>        std::cout << "create_plugin_instance in use from dlopen'd .so\n";
>        return new sharedlib_plugin();
> }
> 
> extern "C" void destroy_plugin_instance(base_plugin* plugin)
> {
>        std::cout << "destroy_plugin_instance in use from dlopen'd .so\n";
>        delete plugin;
> }
> 
> # c++ -std=c++20 -O0 -g -fPIC -lc++ -olibbase_plugin.so -shared base_plugin.cpp
> # c++ -std=c++20 -O0 -g -fPIC -lc++ -L. -lbase_plugin -Wl,-rpath=. \
>      -omain_using_plugin main_using_plugin.cpp
> # c++ -std=c++20 -O0 -g -fPIC -lc++ -olibsharedlib_plugin.so -shared sharedlib_plugin.cpp
> # ./main_using_plugin
> create_plugin_instance in use from dlopen'd .so
> create_data_instance in use from dlopen'd .so
> action in use from dlopen'd .so class
> separate_share is not nullptr (would crash otherwise)
> destroy_plugin_instance in use from dlopen'd .so
> 
> For reference:
> 
> # ldd main_using_plugin
> main_using_plugin:
>        libc++.so.1 => /lib/libc++.so.1 (0x819d0000)
>        libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x82735000)
>        libbase_plugin.so => ./libbase_plugin.so (0x8328d000)
>        libm.so.5 => /lib/libm.so.5 (0x83c47000)
>        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x85861000)
>        libc.so.7 => /lib/libc.so.7 (0x848f9000)
> 
> # ldd ./libsharedlib_plugin.so
> ./libsharedlib_plugin.so:
>        libc++.so.1 => /lib/libc++.so.1 (0x3b69aeeb6000)
>        libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x3b69af6f2000)
>        libm.so.5 => /lib/libm.so.5 (0x3b69afd1f000)
>        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x3b69b0303000)
>        libc.so.7 => /lib/libc.so.7 (0x3b69aafdb000)
> 
> 
> As for g++11 use . . .
> 
> Testing with g++11 does involve additional/adjusted command line
> options:
>    -Wl,-rpath=/usr/local/lib/gcc11/ ( for main_using_plugin.cpp )
>    -lstdc++ (for all 3 .cpp files)
> 
> (FreeBSD's libgcc_s.so.1 does not cover everything needed for
> all architectures for g++11's code generation. I was working
> in a context where using /usr/local/lib/gcc11//libgcc_s.so.1
> was important.)
> 
> 
> # g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -olibbase_plugin.so -shared base_plugin.cpp
> # g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -L. -lbase_plugin -Wl,-rpath=. \
>      -Wl,-rpath=/usr/local/lib/gcc11 -omain_using_plugin main_using_plugin.cpp
> # g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -olibsharedlib_plugin.so -shared sharedlib_plugin.cpp
> # ./main_using_plugin
> create_plugin_instance in use from dlopen'd .so
> create_data_instance in use from dlopen'd .so
> action in use from dlopen'd .so class
> separate_share is not nullptr (would crash otherwise)
> destroy_plugin_instance in use from dlopen'd .so
> 
> For reference:
> 
> # ldd main_using_plugin
> main_using_plugin:
>        libstdc++.so.6 => /usr/local/lib/gcc11//libstdc++.so.6 (0x83a00000)
>        libbase_plugin.so => ./libbase_plugin.so (0x8213d000)
>        libm.so.5 => /lib/libm.so.5 (0x82207000)
>        libgcc_s.so.1 => /usr/local/lib/gcc11//libgcc_s.so.1 (0x82c66000)
>        libc.so.7 => /lib/libc.so.7 (0x849c4000)
> 
> # ldd ./libsharedlib_plugin.so
> ./libsharedlib_plugin.so:
>        libstdc++.so.6 => /usr/local/lib/gcc11/libstdc++.so.6 (0x1c2a7b800000)
>        libm.so.5 => /lib/libm.so.5 (0x1c2a7bb1c000)
>        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x1c2a7c416000)
>        libc.so.7 => /lib/libc.so.7 (0x1c2a780e8000)
> 
> 
> Overall:
> 
> Looks to me like both the system clang/llvm and g++11 contexts
> are working. (The platform context was aarch64 main [so: 14],
> in case it matters.)

It also works for clang++14 from devel/llvm14 :

# clang++14 -std=c++20 -O0 -g -fPIC -lc++ -olibbase_plugin.so -shared base_plugin.cpp
# clang++14 -std=c++20 -O0 -g -fPIC -lc++ -L. -lbase_plugin -Wl,-rpath=. -omain_using_plugin main_using_plugin.cpp
# clang++14 -std=c++20 -O0 -g -fPIC -lc++ -olibsharedlib_plugin.so -shared sharedlib_plugin.cpp
# ./main_using_plugin
create_plugin_instance in use from dlopen'd .so
create_data_instance in use from dlopen'd .so
action in use from dlopen'd .so class
separate_share is not nullptr (would crash otherwise)
destroy_plugin_instance in use from dlopen'd .so
# ldd main_using_plugin
main_using_plugin:
        libc++.so.1 => /lib/libc++.so.1 (0x81ceb000)
        libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x82a4f000)
        libbase_plugin.so => ./libbase_plugin.so (0x833a5000)
        libm.so.5 => /lib/libm.so.5 (0x84fcb000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x83b37000)
        libc.so.7 => /lib/libc.so.7 (0x84739000)
# ldd ./libsharedlib_plugin.so
./libsharedlib_plugin.so:
        libc++.so.1 => /lib/libc++.so.1 (0x2a87e9ab1000)
        libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x2a87e86dc000)
        libm.so.5 => /lib/libm.so.5 (0x2a87e9020000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x2a87eace1000)
        libc.so.7 => /lib/libc.so.7 (0x2a87e45a9000)

I've also tried -O2 and builds without -g. All the combinations
I tried worked.
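
One detail of the test above may not match the original project:
base and base_plugin declare their destructors in the header but
define them in base_plugin.cpp. Under the Itanium C++ ABI, which
both clang and GCC follow here, the first non-inline, non-pure
virtual function of a class is its "key function", and the vtable
and type_info are emitted only in the object file that defines it,
in this case libbase_plugin.so. A header-only interface, as in the
original description, has no key function, so every executable or
shared object that needs the RTTI data gets its own weak copy.
Whether that difference explains the original failure is only a
guess. The sketch below contrasts the two forms; all type names in
it are made up for illustration.

```cpp
#include <memory>

// Form 1: everything inline. There is no key function, so every object file
// that needs the vtable or type_info emits its own weak copy.
struct header_only_iface
{
    virtual ~header_only_iface() = default;
};

// Form 2: the destructor is declared here but defined out of line, which
// makes it the key function of the class.
struct anchored_iface
{
    virtual ~anchored_iface();
};

// In a real project this definition would live in exactly one .cpp file,
// built into one shared library that the executable and all plugins link
// against. It is placed here only to keep the sketch in a single file.
anchored_iface::~anchored_iface() {}

// Derived types for trying out casts. Both have implicit inline destructors
// and therefore no key function of their own.
struct derived_a : header_only_iface { int v = 1; };
struct derived_b : anchored_iface    { int v = 2; };
```

Within a single program both forms behave the same. The difference
only shows up once the types cross a shared library boundary.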

===
Mark Millard
marklmi at yahoo.com