Re: c++: dynamic_cast woes

From: Dimitry Andric <dim_at_FreeBSD.org>
Date: Thu, 10 Aug 2023 17:38:26 UTC
On 10 Aug 2023, at 00:25, Dimitry Andric <dim@FreeBSD.org> wrote:
> 
> On 9 Aug 2023, at 00:29, Christoph Moench-Tegeder <cmt@burggraben.net> wrote:
>> 
>> ## Dimitry Andric (dim@FreeBSD.org):
>> 
>>> Yes, this is a typical problem when type info is replicated across
>>> dynamic library boundaries. The best thing to prevent this is to
>>> ensure that the key functions for a class (typically constructors
>>> and destructors) are only in one translation unit (object file),
>>> and that object file is only in one .so file.
>> 
>> As FreeBSD is basically unsupported from upstream, this sounds
>> like I'm in for quite a bit of fun here. Well.
> 
> FWIW, it took quite a while (kicad has LOTS of dependencies!), but I
> built kicad and it looks like I can reproduce the original problem,
> after disabling the static-cast-patch. So I will investigate it a bit
> further.

It looks like KiCad is violating the One Definition Rule (ODR) all over
the place... So it will not really be trivial to fix, unfortunately.

The problem is that C++ virtual tables and type information get copied
into *both* the kicad main binaries (kicad/kicad-cli), and the plugins
(_*.kiface). This is due to the way the binaries and plugins get linked,
namely to a bunch of static libraries containing the common kicad parts
(libcommon.a, libscripting.a, etc).

Classes like KICAD_SETTINGS or JOB_EXPORT_SCH_PDF are declared in header
files, and usually their vtables and type_info get emitted into one
particular object file, so for instance the vtable and type_info for
KICAD_SETTINGS get emitted into kicad_settings.cpp.o.
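
To illustrate how the compiler picks that one object file: on Itanium
C++ ABI platforms (which includes FreeBSD), the vtable and type_info
follow the "key function", i.e. the first non-pure virtual function that
is not defined inline in the class. A minimal sketch, with made-up class
names (SETTINGS_BASE and DEMO_SETTINGS are not KiCad classes):

```cpp
// Hypothetical header content. Names are illustrative only.
class SETTINGS_BASE {
public:
    virtual ~SETTINGS_BASE();   // declared only: this is the key function
    virtual const char* Name() const { return "base"; }  // inline, not key
};

class DEMO_SETTINGS : public SETTINGS_BASE {
public:
    ~DEMO_SETTINGS() override;  // key function of DEMO_SETTINGS
    const char* Name() const override { return "demo"; }
};

// Hypothetical demo_settings.cpp content. The object file that holds
// these out-of-line definitions is also the one that receives the
// vtables and type_info for both classes.
SETTINGS_BASE::~SETTINGS_BASE() = default;
DEMO_SETTINGS::~DEMO_SETTINGS() = default;

// Inside a single binary there is exactly one type_info per class, so
// this cast behaves as expected.
bool is_demo(SETTINGS_BASE* p) {
    return dynamic_cast<DEMO_SETTINGS*>(p) != nullptr;
}
```

As long as that object file ends up in exactly one binary or shared
library, there is only one copy of each type_info to compare against.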

However, kicad_settings.cpp.o gets archived into libcommon.a, and this
libcommon.a gets linked into both the main binaries *and* the .kiface
plugins! It means there are now multiple copies of the same virtual
table and type_info, and that is an explicit violation of the ODR.

The behavior at runtime is now officially undefined, but in practice it
depends on the implementation: some, like libstdc++, work around it by
also comparing the mangled type names stored in type_info when casting
dynamically. Others, like libcxxrt and libc++abi, only compare the
addresses of the different copies of the same type_info, and in such
cases the comparison will fail.
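
The difference between the two strategies can be sketched in a single
file. This is only a model: FakeTypeInfo is a made-up stand-in for
std::type_info, not the real ABI layout, and the two objects simulate
the copies that would live in the main binary and in a plugin:

```cpp
#include <cstring>

// Hypothetical stand-in for std::type_info (not the real layout).
struct FakeTypeInfo {
    const char* name;  // mangled type name
};

// Two separate copies of the "same" type_info and of its name string,
// as if one came from the main binary and one from a .kiface plugin.
static const char name_in_binary[] = "14KICAD_SETTINGS";
static const char name_in_plugin[] = "14KICAD_SETTINGS";
static const FakeTypeInfo ti_in_binary{name_in_binary};
static const FakeTypeInfo ti_in_plugin{name_in_plugin};

// Address-only comparison: distinct copies never compare equal.
bool same_type_by_address(const FakeTypeInfo& a, const FakeTypeInfo& b) {
    return &a == &b;
}

// Name-based comparison: falls back to comparing the name strings, so
// duplicated copies still compare equal.
bool same_type_by_name(const FakeTypeInfo& a, const FakeTypeInfo& b) {
    return &a == &b || std::strcmp(a.name, b.name) == 0;
}
```

Here same_type_by_address(ti_in_binary, ti_in_plugin) is false, while
same_type_by_name() on the same pair is true, which mirrors why the cast
fails with one runtime and appears to work with another.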

The correct way to fix this is by making sure the common code gets put
into one shared library, which explicitly exports all the required
symbols (so using __declspec(dllexport) on Windows, and non-hidden
visibility on ELF platforms). Only then can you make sure the ODR is not
violated.
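
A rough sketch of what such an export annotation usually looks like. The
macro names EXAMPLE_COMMON_API and EXAMPLE_COMMON_BUILD and the class
EXAMPLE_SETTINGS are assumptions for illustration, not anything KiCad
defines:

```cpp
// Normally set by the build system, and only while compiling the
// shared library itself. Defined here to keep the sketch standalone.
#define EXAMPLE_COMMON_BUILD 1

// Hypothetical export macro for a single shared "common" library.
#if defined(_WIN32)
#  if defined(EXAMPLE_COMMON_BUILD)
#    define EXAMPLE_COMMON_API __declspec(dllexport)
#  else
#    define EXAMPLE_COMMON_API __declspec(dllimport)
#  endif
#else
   // ELF: keep the class visible even when building with
   // -fvisibility=hidden, so its vtable and type_info are exported.
#  define EXAMPLE_COMMON_API __attribute__((visibility("default")))
#endif

class EXAMPLE_COMMON_API EXAMPLE_SETTINGS {
public:
    virtual ~EXAMPLE_SETTINGS();  // defined out of line, in one .cpp only
    virtual int Version() const;
};

// These definitions would live in one source file of the shared
// library, so every binary and plugin resolves to the same copy.
EXAMPLE_SETTINGS::~EXAMPLE_SETTINGS() = default;
int EXAMPLE_SETTINGS::Version() const { return 1; }
```

The main binaries and plugins would then link against that shared
library instead of each pulling in their own copy from a static archive.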

But this would entail a pretty big refactoring for KiCad... :)

-Dimitry