Re: port binary dumping core on recent head in poudriere

From: Guido Falsi <mad_at_madpilot.net>
Date: Sun, 24 Nov 2024 17:07:19 UTC
On 23/11/24 15:56, Guido Falsi wrote:
> On 23/11/24 15:34, Guido Falsi wrote:
>> On 21/11/24 18:33, Guido Falsi wrote:
>>> On 21/11/24 18:27, Dimitry Andric wrote:
>>>> On 21 Nov 2024, at 18:17, Guido Falsi <mad@madpilot.net> wrote:
>>>>>
>>>>> On 20/11/24 23:50, Guido Falsi wrote:
>>>>>> On 20/11/24 22:14, Dimitry Andric wrote:
>>>>>>> On 20 Nov 2024, at 18:32, Guido Falsi <mad@madpilot.net> wrote:
>>>>>>>> I've noticed that recently some ports are dumping core during 
>>>>>>>> builds of dependencies in head in poudriere.
>>>>>>>>
>>>>>>>> I'm seeing this for example with sassc crashing while trying to 
>>>>>>>> build x11-themes/greybird-theme.
>>>>>>>>
>>>>>>>> My first suspect was the llvm upgrade in head, but forcing sassc 
>>>>>>>> and libsass to build with older clang via USES=llvm:max=18 is 
>>>>>>>> not helping.
>>>>>>>>
>>>>>>>> I did recompile the offending programs with debug and tried a 
>>>>>>>> backtrace and got this:
>>>>>>>>
>>>>>>>> ```
>>>>>>>> (lldb) bt
>>>>>>>> * thread #1, name = 'sassc', stop reason = signal SIGSEGV: invalid permissions for mapped object (fault address: 0x82374a000)
>>>>>>>>    * frame #0: 0x000000082374a000 libsass.so.1
>>>>>>>>      frame #1: 0x0000000823865a86 libsass.so.1`_GLOBAL__sub_I_ast.cpp [inlined] double std::__1::__math::acos[abi:se190102]<int, 0>(__x=-1) at inverse_trigonometric_functions.h:40:10
>>>>>>>>      frame #2: 0x0000000823865a81 libsass.so.1`_GLOBAL__sub_I_ast.cpp [inlined] __cxx_global_var_init at units.hpp:11:21
>>>>>>>>      frame #3: 0x0000000823865a81 libsass.so.1`_GLOBAL__sub_I_ast.cpp at ast.cpp:0
>>>>>>>>      frame #4: 0x00001eac6e3f078d ld-elf.so.1
>>>>>>>>      frame #5: 0x00001eac6e3ef349 ld-elf.so.1
>>>>>>>>      frame #6: 0x00001eac6e3ec099 ld-elf.so.1`___lldb_unnamed_symbol27 + 25
>>>>>>>> ```
>>>>>>>>
>>>>>>>> which points me to this upstream line of code:
>>>>>>>> https://github.com/sass/libsass/blob/7037f03fabeb2b18b5efa84403f5a6d7a990f460/src/units.hpp#L11
>>>>>>>>
>>>>>>>> I could change the way it derives PI, but I'm not sure this is 
>>>>>>>> the correct fix.
>>>>>>>
>>>>>>> At first sight this looks like some sort of initialization order 
>>>>>>> fiasco, but without a full backtrace and some indications on what 
>>>>>>> it is exactly segfaulting on it is hard to say. Is it reproducible?
>>>>>> It is fully reproducible here by just compiling the sassc port and 
>>>>>> trying to run it. It segfaults on startup.
>>>>>
>>>>> I'm following up to myself to note that I'm observing the same 
>>>>> issue in textproc/opensp if trying to run anything linked with the 
>>>>> library, for example its own binary "osx".
>>>>>
>>>>> I noticed it because it is required by libosp and then by gnucash 
>>>>> which I use and maintain. libosp fails during configure due to a 
>>>>> test binary compiled by configure script dumping core.
>>>>>
>>>>> I suspect there are more around the ports tree.
>>>>
>>>> I cannot reproduce this at all. For me the sassc binary runs fine, 
>>>> and also the x11-themes/greybird-theme port builds fine. Then again, 
>>>> my base system is probably older than yours? Which revision are you 
>>>> running?
>>>
>>> I'm running cdfd0600dc8882f0a0d0e6d9a1cdcf926edba6d6 from Tue Nov 5 
>>> 13:35:17 2024 -0800 (cut & paste from git log)
>>>
>>>
>>
>> I tried upgrading to 07593d13fa2ad6fe4d962b7473c6020aef2a0414 from 
>> yesterday, cleaning all ports, forcing a rebuild, but I see the exact 
>> same issue.
>>
> 
> In fact, I noticed, guile2 is also showing this behaviour.
> 
> 

I've tried some more experiments, to rule out some possibilities on my part:

- rebuilt from scratch, cleaning up obj and ccache
- rolled back a couple of commits in the dynamic loader, just in case
- updated to a newer snapshot [1], since I noticed a new version of clang 
is included.

Unfortunately none of this worked, and the issue still shows up 
every time.

I must admit I'm out of ideas. I still think an issue in the llvm suite 
is the most probable cause, but that is just a hunch I cannot back up 
with any proof.
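
For anyone trying to narrow this down outside of the ports, here is a 
minimal sketch of the pattern that frames #1 and #2 of the backtrace 
point at. It assumes that units.hpp:11 boils down to a namespace-scope 
constant initialized with std::acos(-1); the namespace and all names 
below are hypothetical, and whether this reduction crashes on an 
affected system is untested.

```cpp
// Hypothetical stand-alone reduction; none of these names come from libsass.
#include <cmath>

namespace repro {

// Dynamic initialization at namespace scope. With an int argument,
// std::acos uses the integer overload, which converts to double. Unless the
// compiler folds the call, it is evaluated at load time from the
// translation unit's global constructor (the _GLOBAL__sub_I_* frame).
const double PI_RUNTIME = std::acos(-1);

// Alternative that needs no call at load time: a literal constant.
constexpr double PI_LITERAL = 3.14159265358979323846;

// True when both derivations agree to within a small tolerance.
inline bool pi_values_agree() {
    return std::fabs(PI_RUNTIME - PI_LITERAL) < 1e-12;
}

}  // namespace repro
```

Calling repro::pi_values_agree() from the main() of a small program, or 
putting the runtime constant into a shared library as libsass does, 
would show whether the crash depends on the initializer itself or on 
something else in the affected libraries.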


I really hope something is uncovered about this.

Should I create a bug report on bugzilla to track this?


[1] now testing with commit 718519f4efc71096422fc71dab90b2a3369871ff 
from Sun Nov 24 10:04:11 2024 +0100

-- 
Guido Falsi <mad@madpilot.net>