Re: i386 on amd64 can fail to return from cond_wait_user, using basically 100% of a FreeBSD cpu [IGNORE: wrong thread]
Date: Sat, 05 Jul 2025 22:21:36 UTC
On Jul 5, 2025, at 10:08, Mark Millard <marklmi@yahoo.com> wrote:

> On Jul 5, 2025, at 01:23, Konstantin Belousov <kostikbel@gmail.com> wrote:
>
>> On Fri, Jul 04, 2025 at 11:01:22PM -0700, Mark Millard wrote:
>>> Some package builds are failing on the port-packages build cluster
>>> machines that do i386 builds during the following code. The analysis
>>> is from replication in a personal context (using poudriere bulk with
>>> -i), using poudriere-devel instead. I'll note that the personal
>>> context is from using PkgBase 14.3-RELEASE in the poudriere jail.
>>> I also installed most of the related *-dbg* PkgBase packages in order
>>> to get the nicer backtracing.
>>>
>>> (gdb) bt
>>> #0 _umtx_op_err () at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/arch/i386/i386/_umtx_op_err.S:37
>>> #1 0x2499f897 in _thr_umtx_timedwait_uint (mtx=0x249a365c, id=0, clockid=4, abstime=0x0, shared=0) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_umtx.c:233
>>> #2 0x24995b26 in _thr_sleep (curthread=0x24d36004, clockid=4, abstime=0x0) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_kern.c:197
>>> #3 0x24990beb in cond_wait_user (cvp=0x24dfa8a0, mp=0x24d38d04, abstime=<optimized out>, cancel=<optimized out>) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_cond.c:317
>>>
>>> NOTE: cond_wait_user never returns but #2..#0 repeat (observed by
>>> repeated ^c and bt usage).
>>>
>>> (i386 is the oddball with 32-bit time_t but I do not know
>>> if that is involved here.)
>>>
>>> #4 cond_wait_common (cond=<optimized out>, mutex=<optimized out>, abstime=0x0, cancel=1) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_cond.c:377
>>> #5 0x24990e8f in __thr_cond_wait (cond=0x23b9b4f4, mutex=0x23b9b4ec) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_cond.c:392
>>> #6 0x23be1e4b in uv_cond_wait () from /usr/local/lib/libuv.so.1
>>> #7 0x024bd497 in node::NodePlatform::DrainTasks(v8::Isolate*) ()
>>> #8 0x0232f5b6 in node::SpinEventLoopInternal(node::Environment*) ()
>>> #9 0x02485bf0 in node::NodeMainInstance::Run() ()
>>> #10 0x023eaba1 in node::Start(int, char**) ()
>>> #11 0x24a1da85 in __libc_start1 (argc=5, argv=0xffffda3c, env=0xffffda54, cleanup=0x23b73020 <rtld_nop_exit>, mainX=0x314a720 <main>)
>>>     at /home/pkgbuild/worktrees/releng/14.3/lib/libc/csu/libc_start1.c:157
>>> #12 0x0232d0a8 in _start ()
>>>
>>> www/librewolf and other firefox related package builds can do
>>> this until a 7200 sec timeout by poudriere occurs:
>>>
>>> =>> Killing runaway build after 7200 seconds with no output
>>>
>>> I'll note that truss did not generate any output when used to
>>> watch the process that was stuck. It appears to be a world-internal
>>> problem.
>
> Looks like my truss comment does not have the implications
> that I thought.
>
>> Can you provide a minimal stand-alone reproducer in C?
>
> Unsure. My attempt will be mostly exploratory, not being familiar
> with the subject area or /usr/local/lib/libuv.so.1's uv_cond_wait
> or node's code.
>

[I've removed a bunch of notes here that now prove irrelevant.]

I finally managed to notice that I was looking at the wrong thread:
the busy thread does not involve _umtx_op_err or the like. See
the "v8::. . ." below. Sorry for the noise.

(gdb) info threads
  Id   Target Id                    Frame
  1    LWP 101947 of process 78445  _umtx_op_err () at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/arch/i386/i386/_umtx_op_err.S:37
  2    LWP 257278 of process 78445  _kevent () at _kevent.S:4
* 3    LWP 257279 of process 78445  0x039fb33e in v8::internal::compiler::turboshaft::MaybeRedundantStoresTable::map_to_key(v8::internal::compiler::turboshaft::OpIndex, int, unsigned char) ()
  4    LWP 257280 of process 78445  _umtx_op_err () at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/arch/i386/i386/_umtx_op_err.S:37
  5    LWP 257281 of process 78445  _umtx_op_err () at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/arch/i386/i386/_umtx_op_err.S:37
  6    LWP 257282 of process 78445  _umtx_op_err () at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/arch/i386/i386/_umtx_op_err.S:37
  7    LWP 257283 of process 78445  _umtx_op () at _umtx_op.S:4

(gdb) bt
#0 0x039fb33e in v8::internal::compiler::turboshaft::MaybeRedundantStoresTable::map_to_key(v8::internal::compiler::turboshaft::OpIndex, int, unsigned char) ()
#1 0x039f9a49 in v8::internal::compiler::turboshaft::RedundantStoreAnalysis::ProcessBlock(v8::internal::compiler::turboshaft::Block const&) ()
#2 0x039f95b3 in v8::internal::compiler::turboshaft::StoreStoreEliminationReducer<v8::internal::compiler::turboshaft::ReducerStack<v8::internal::compiler::turboshaft::Assembler<v8::internal::compiler::turboshaft::reducer_list<v8::internal::compiler::turboshaft::TurboshaftAssemblerOpInterface, v8::internal::compiler::turboshaft::GraphVisitor, v8::internal::compiler::turboshaft::StoreStoreEliminationReducer, v8::internal::compiler::turboshaft::LateLoadEliminationReducer, v8::internal::compiler::turboshaft::MachineOptimizationReducer, v8::internal::compiler::turboshaft::BranchEliminationReducer, v8::internal::compiler::turboshaft::ValueNumberingReducer, v8::internal::compiler::turboshaft::TSReducerBase> >, true, v8::internal::compiler::turboshaft::LateLoadEliminationReducer, v8::internal::compiler::turboshaft::MachineOptimizationReducer, v8::internal::compiler::turboshaft::BranchEliminationReducer, v8::internal::compiler::turboshaft::ValueNumberingReducer, v8::internal::compiler::turboshaft::TSReducerBase> >::Analyze() ()
#3 0x039f7b42 in void v8::internal::compiler::turboshaft::GraphVisitor<v8::internal::compiler::turboshaft::ReducerStack<v8::internal::compiler::turboshaft::Assembler<v8::internal::compiler::turboshaft::reducer_list<v8::internal::compiler::turboshaft::TurboshaftAssemblerOpInterface, v8::internal::compiler::turboshaft::GraphVisitor, v8::internal::compiler::turboshaft::StoreStoreEliminationReducer, v8::internal::compiler::turboshaft::LateLoadEliminationReducer, v8::internal::compiler::turboshaft::MachineOptimizationReducer, v8::internal::compiler::turboshaft::BranchEliminationReducer, v8::internal::compiler::turboshaft::ValueNumberingReducer, v8::internal::compiler::turboshaft::TSReducerBase> >, true, v8::internal::compiler::turboshaft::StoreStoreEliminationReducer, v8::internal::compiler::turboshaft::LateLoadEliminationReducer, v8::internal::compiler::turboshaft::MachineOptimizationReducer, v8::internal::compiler::turboshaft::BranchEliminationReducer, v8::internal::compiler::turboshaft::ValueNumberingReducer, v8::internal::compiler::turboshaft::TSReducerBase> >::VisitGraph<false>() ()
#4 0x039f7acd in v8::internal::compiler::turboshaft::CopyingPhaseImpl<v8::internal::compiler::turboshaft::StoreStoreEliminationReducer, v8::internal::compiler::turboshaft::LateLoadEliminationReducer, v8::internal::compiler::turboshaft::MachineOptimizationReducer, v8::internal::compiler::turboshaft::BranchEliminationReducer, v8::internal::compiler::turboshaft::ValueNumberingReducer>::Run(v8::internal::compiler::turboshaft::Graph&, v8::internal::Zone*, bool) ()
#5 0x039f79e8 in v8::internal::compiler::turboshaft::StoreStoreEliminationPhase::Run(v8::internal::Zone*) ()
#6 0x0344167e in auto v8::internal::compiler::PipelineImpl::Run<v8::internal::compiler::turboshaft::StoreStoreEliminationPhase>() ()
#7 0x0343bb88 in v8::internal::compiler::PipelineImpl::OptimizeGraph(v8::internal::compiler::Linkage*) ()
#8 0x0343af76 in v8::internal::compiler::PipelineCompilationJob::ExecuteJobImpl(v8::internal::RuntimeCallStats*, v8::internal::LocalIsolate*) ()
#9 0x026d2311 in v8::internal::OptimizedCompilationJob::ExecuteJob(v8::internal::RuntimeCallStats*, v8::internal::LocalIsolate*) ()
#10 0x02707aef in v8::internal::OptimizingCompileDispatcher::CompileNext(v8::internal::TurbofanCompilationJob*, v8::internal::LocalIsolate*) ()
#11 0x02709369 in v8::internal::OptimizingCompileDispatcher::CompileTask::Run(v8::JobDelegate*) ()
#12 0x0315ff4e in v8::platform::DefaultJobWorker::Run() ()
#13 0x024bb718 in ?? ()
#14 0x24991cf4 in thread_start (curthread=0x24d3a004) at /home/pkgbuild/worktrees/releng/14.3/lib/libthr/thread/thr_create.c:289
#15 0x00000000 in ?? ()

===
Mark Millard
marklmi at yahoo.com