Re: i386 on amd64 can fail to return from cond_wait_user, using basically 100% of a FreeBSD cpu [IGNORE: wrong thread]

From: Mark Millard <marklmi_at_yahoo.com>
Date: Sat, 05 Jul 2025 22:21:36 UTC

On Jul 5, 2025, at 10:08, Mark Millard <marklmi@yahoo.com> wrote:

> On Jul 5, 2025, at 01:23, Konstantin Belousov <kostikbel@gmail.com> wrote:
> 
>> On Fri, Jul 04, 2025 at 11:01:22PM -0700, Mark Millard wrote:
>>> Some package builds are failing on the port-packages build cluster
>>> machines that do i386 builds during the following code. The analysis
>>> is from replication in a personal context (using poudriere bulk with
>>> -i), using poudriere-devel instead. I'll note that the personal
>>> context is from using PkgBase 14.3-RELEASE in the poudriere jail.
>>> I also installed most of the related *-dbg* PkgBase packages in order
>>> to get the nicer backtracing.
>>> 
>>> (gdb) bt
>>> #0  _umtx_op_err () at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/arch/i386/i386/_umtx_op_err.S:37
>>> #1  0x2499f897 in _thr_umtx_timedwait_uint (mtx=0x249a365c, id=0, clockid=4, abstime=0x0, shared=0) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_umtx.c:233
>>> #2  0x24995b26 in _thr_sleep (curthread=0x24d36004, clockid=4, abstime=0x0) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_kern.c:197
>>> #3  0x24990beb in cond_wait_user (cvp=0x24dfa8a0, mp=0x24d38d04, abstime=<optimized out>, cancel=<optimized out>) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_cond.c:317
>>> 
>>> NOTE: cond_wait_user never returns but #2..#0 repeat (observed by
>>> repeated ^c and bt usage).
>>> 
>>> (i386 is the oddball with 32-bit time_t but I do not know
>>> if that is involved here.)
>>> 
>>> #4  cond_wait_common (cond=<optimized out>, mutex=<optimized out>, abstime=0x0, cancel=1) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_cond.c:377
>>> #5  0x24990e8f in __thr_cond_wait (cond=0x23b9b4f4, mutex=0x23b9b4ec) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_cond.c:392
>>> #6  0x23be1e4b in uv_cond_wait () from /usr/local/lib/libuv.so.1
>>> #7  0x024bd497 in node::NodePlatform::DrainTasks(v8::Isolate*) ()
>>> #8  0x0232f5b6 in node::SpinEventLoopInternal(node::Environment*) ()
>>> #9  0x02485bf0 in node::NodeMainInstance::Run() ()
>>> #10 0x023eaba1 in node::Start(int, char**) ()
>>> #11 0x24a1da85 in __libc_start1 (argc=5, argv=0xffffda3c, env=0xffffda54, cleanup=0x23b73020 <rtld_nop_exit>, mainX=0x314a720 <main>)
>>>   at /home/pkgbuild/worktrees/releng/14.3/lib/libc/csu/libc_start1.c:157
>>> #12 0x0232d0a8 in _start ()
>>> 
>>> www/librewolf and other firefox related package builds can do
>>> this until a 7200 sec timeout by poudriere occurs:
>>> 
>>> =>> Killing runaway build after 7200 seconds with no output
>>> 
>>> I'll note that truss did not generate any output when used to
>>> watch the process that was stuck. It appears to be a world-internal
>>> problem.
> 
> Looks like my truss comment does not have the implications
> that I thought.
> 
>> Can you provide a minimal stand-alone reproducer in C?
> 
> Unsure. My attempt will be mostly exploratory, not being familiar
> with the subject area or /usr/local/lib/libuv.so.1's uv_cond_wait
> or node's code.
> 

[I've removed a bunch of notes here that now prove
irrelevant.]

I finally noticed that I was looking at the wrong
thread: the busy thread is thread 3 below, and its
backtrace does not involve _umtx_op_err or the like.
See the "v8::. . ." frames below.

Sorry for the noise.

(gdb) info threads
  Id   Target Id                   Frame 
  1    LWP 101947 of process 78445 _umtx_op_err () at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/arch/i386/i386/_umtx_op_err.S:37
  2    LWP 257278 of process 78445 _kevent () at _kevent.S:4
* 3    LWP 257279 of process 78445 0x039fb33e in v8::internal::compiler::turboshaft::MaybeRedundantStoresTable::map_to_key(v8::internal::compiler::turboshaft::OpIndex, int, unsigned char) ()
  4    LWP 257280 of process 78445 _umtx_op_err () at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/arch/i386/i386/_umtx_op_err.S:37
  5    LWP 257281 of process 78445 _umtx_op_err () at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/arch/i386/i386/_umtx_op_err.S:37
  6    LWP 257282 of process 78445 _umtx_op_err () at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/arch/i386/i386/_umtx_op_err.S:37
  7    LWP 257283 of process 78445 _umtx_op () at _umtx_op.S:4

(gdb) bt
#0  0x039fb33e in v8::internal::compiler::turboshaft::MaybeRedundantStoresTable::map_to_key(v8::internal::compiler::turboshaft::OpIndex, int, unsigned char) ()
#1  0x039f9a49 in v8::internal::compiler::turboshaft::RedundantStoreAnalysis::ProcessBlock(v8::internal::compiler::turboshaft::Block const&) ()
#2  0x039f95b3 in v8::internal::compiler::turboshaft::StoreStoreEliminationReducer<v8::internal::compiler::turboshaft::ReducerStack<v8::internal::compiler::turboshaft::Assembler<v8::internal::compiler::turboshaft::reducer_list<v8::internal::compiler::turboshaft::TurboshaftAssemblerOpInterface, v8::internal::compiler::turboshaft::GraphVisitor, v8::internal::compiler::turboshaft::StoreStoreEliminationReducer, v8::internal::compiler::turboshaft::LateLoadEliminationReducer, v8::internal::compiler::turboshaft::MachineOptimizationReducer, v8::internal::compiler::turboshaft::BranchEliminationReducer, v8::internal::compiler::turboshaft::ValueNumberingReducer, v8::internal::compiler::turboshaft::TSReducerBase> >, true, v8::internal::compiler::turboshaft::LateLoadEliminationReducer, v8::internal::compiler::turboshaft::MachineOptimizationReducer, v8::internal::compiler::turboshaft::BranchEliminationReducer, v8::internal::compiler::turboshaft::ValueNumberingReducer, v8::internal::compiler::turboshaft::TSReducerBase> >::Analyze() ()
#3  0x039f7b42 in void v8::internal::compiler::turboshaft::GraphVisitor<v8::internal::compiler::turboshaft::ReducerStack<v8::internal::compiler::turboshaft::Assembler<v8::internal::compiler::turboshaft::reducer_list<v8::internal::compiler::turboshaft::TurboshaftAssemblerOpInterface, v8::internal::compiler::turboshaft::GraphVisitor, v8::internal::compiler::turboshaft::StoreStoreEliminationReducer, v8::internal::compiler::turboshaft::LateLoadEliminationReducer, v8::internal::compiler::turboshaft::MachineOptimizationReducer, v8::internal::compiler::turboshaft::BranchEliminationReducer, v8::internal::compiler::turboshaft::ValueNumberingReducer, v8::internal::compiler::turboshaft::TSReducerBase> >, true, v8::internal::compiler::turboshaft::StoreStoreEliminationReducer, v8::internal::compiler::turboshaft::LateLoadEliminationReducer, v8::internal::compiler::turboshaft::MachineOptimizationReducer, v8::internal::compiler::turboshaft::BranchEliminationReducer, v8::internal::compiler::turboshaft::ValueNumberingReducer, v8::internal::compiler::turboshaft::TSReducerBase> >::VisitGraph<false>() ()
#4  0x039f7acd in v8::internal::compiler::turboshaft::CopyingPhaseImpl<v8::internal::compiler::turboshaft::StoreStoreEliminationReducer, v8::internal::compiler::turboshaft::LateLoadEliminationReducer, v8::internal::compiler::turboshaft::MachineOptimizationReducer, v8::internal::compiler::turboshaft::BranchEliminationReducer, v8::internal::compiler::turboshaft::ValueNumberingReducer>::Run(v8::internal::compiler::turboshaft::Graph&, v8::internal::Zone*, bool) ()
#5  0x039f79e8 in v8::internal::compiler::turboshaft::StoreStoreEliminationPhase::Run(v8::internal::Zone*) ()
#6  0x0344167e in auto v8::internal::compiler::PipelineImpl::Run<v8::internal::compiler::turboshaft::StoreStoreEliminationPhase>() ()
#7  0x0343bb88 in v8::internal::compiler::PipelineImpl::OptimizeGraph(v8::internal::compiler::Linkage*) ()
#8  0x0343af76 in v8::internal::compiler::PipelineCompilationJob::ExecuteJobImpl(v8::internal::RuntimeCallStats*, v8::internal::LocalIsolate*) ()
#9  0x026d2311 in v8::internal::OptimizedCompilationJob::ExecuteJob(v8::internal::RuntimeCallStats*, v8::internal::LocalIsolate*) ()
#10 0x02707aef in v8::internal::OptimizingCompileDispatcher::CompileNext(v8::internal::TurbofanCompilationJob*, v8::internal::LocalIsolate*) ()
#11 0x02709369 in v8::internal::OptimizingCompileDispatcher::CompileTask::Run(v8::JobDelegate*) ()
#12 0x0315ff4e in v8::platform::DefaultJobWorker::Run() ()
#13 0x024bb718 in ?? ()
#14 0x24991cf4 in thread_start (curthread=0x24d3a004) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_create.c:289
#15 0x00000000 in ?? ()



===
Mark Millard
marklmi at yahoo.com