[Bug 277671] 14-RELEASE/14-STABLE crash with heavy disk IO on AMD Asus x670e motherboard and Intel i225 (igc) breakage NIC non-functioning
Date: Mon, 10 Jun 2024 16:08:45 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277671

--- Comment #9 from Cameron <cam@vasteel.io> ---
Tried running monerod for the first time in a while... And my system no longer
crashes! This could be resolved by one or more of the following changes:

1. Upgraded to 14.1-RELEASE. I tried 14-STABLE maybe within a few months of
14.1-RELEASE and still had the problem.
2. Started using "zpool trim"... But I have another FreeBSD box that had
14.0-RELEASE where I didn't run trim and had no problems.
3. I'm on a beta BIOS for this motherboard that's more recent than the current
latest official release.

I notice after monerod has run for a while, I start getting tons of these
messages in dmesg:

Jun 5 02:19:11 hostname kernel: sonewconn: pcb 0xfffff802963b9540 (0.0.0.0:18080 (proto 6)): Listen queue overflow: 193 already in queue awaiting acceptance (1 occurrences), euid 781, rgid 781, jail 0
Jun 5 02:25:11 hostname kernel: sonewconn: pcb 0xfffff802963b9540

Increasing kern.ipc.soacceptqueue doesn't seem to help at all. I wonder if IO
is so slow that monerod can't keep up with the connections?

The first few times I ran "zpool trim", it only took a few minutes... But over
time, it has progressively gotten worse, now taking 21+ minutes. That suggests
there's still some IO issue, perhaps the same issue I've had in the past when
running monerod, but now it no longer causes my box to completely lock up.

I can now run monerod constantly without locking up my box though, which is a
nice improvement!
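
One possible reason the sysctl has no visible effect, offered as an assumption
rather than a diagnosis: kern.ipc.soacceptqueue only caps the backlog a program
passes to listen(2), so a daemon that asks for 128 still gets 128. The 193 in
the message above would fit a backlog of 128 if the kernel reports an overflow
once the queue exceeds 1.5 times the backlog (3 * 128 / 2 = 192). The Python
sketch below pulls the queue length out of such a message and compares it with
that assumed threshold. The regex and both helper names are illustrative and
not part of any FreeBSD tool.

```python
import re

# Matches the "sonewconn" kernel message format quoted in this comment.
# The pattern is a hypothetical helper written for this example.
SONEWCONN_RE = re.compile(
    r"sonewconn: pcb 0x[0-9a-f]+ "
    r"\((?P<addr>\S+) \(proto \d+\)\): "
    r"Listen queue overflow: (?P<queued>\d+) already in queue"
)

SAMPLE = (
    "Jun 5 02:19:11 hostname kernel: sonewconn: pcb 0xfffff802963b9540 "
    "(0.0.0.0:18080 (proto 6)): Listen queue overflow: 193 already in queue "
    "awaiting acceptance (1 occurrences), euid 781, rgid 781, jail 0"
)


def parse_overflow(line):
    """Return (local address, queued connections), or None if no match."""
    match = SONEWCONN_RE.search(line)
    if match is None:
        return None
    return match.group("addr"), int(match.group("queued"))


def overflow_threshold(backlog):
    """Queue length above which an overflow would be logged.

    Assumption: the kernel uses 3 * backlog / 2 with integer division.
    """
    return 3 * backlog // 2


if __name__ == "__main__":
    addr, queued = parse_overflow(SAMPLE)
    for backlog in (128, 256, 1024):
        limit = overflow_threshold(backlog)
        verdict = "overflow" if queued > limit else "ok"
        print(f"{addr}: queued={queued} backlog={backlog} "
              f"threshold={limit} -> {verdict}")
```

If that reading is right, the number to change would be the backlog monerod
itself requests, with the sysctl raised only as far as needed to permit it.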

In /var/log/monerod.log, I see a lot of traces:

2024-06-10 15:46:31.253 [P2P6] INFO stacktrace src/common/stack_trace.cpp:134 Exception: boost::wrapexcept<boost::bad_weak_ptr>
2024-06-10 15:46:31.253 [P2P6] INFO stacktrace src/common/stack_trace.cpp:135 Unwound call stack:
2024-06-10 15:46:31.385 [P2P6] INFO stacktrace src/common/stack_trace.cpp:163 1 0x9ab808 __cxa_throw + 0xc8
2024-06-10 15:46:31.510 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 2 0x50b05f
2024-06-10 15:46:31.633 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 3 0x7e1f4a
2024-06-10 15:46:31.757 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 4 0x7dc205
2024-06-10 15:46:31.879 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 5 0x788439
2024-06-10 15:46:32.001 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 6 0x78886c
2024-06-10 15:46:32.122 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 7 0x7c05e2
2024-06-10 15:46:32.244 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 8 0x7b2e5b
2024-06-10 15:46:32.365 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 9 0x7bc49d
2024-06-10 15:46:32.486 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 a 0x4d9b88
2024-06-10 15:46:32.607 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 b 0x491100
2024-06-10 15:46:32.728 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 c 0x48eddd
2024-06-10 15:46:32.849 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 d 0x48c562
2024-06-10 15:46:32.970 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 e 0x7e39a5
2024-06-10 15:46:33.091 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 f 0x7fd24f
2024-06-10 15:46:33.212 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 10 0x7fd118
2024-06-10 15:46:33.333 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 11 0x4fb1b2
2024-06-10 15:46:33.453 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 12 0x4f03c4
2024-06-10 15:46:33.575 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 13 0x4efe94
2024-06-10 15:46:33.695 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 14 0x4efbcc
2024-06-10 15:46:33.816 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 15 0x7deaa2
2024-06-10 15:46:33.937 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 16 0x82bec79bd
2024-06-10 15:46:34.058 [P2P6] INFO stacktrace src/common/stack_trace.cpp:159 17 0x8324bcb05

I see similar traces on my other box where monerod has never given me problems,
but the traces become far more common on the box that does give me problems
once the sonewconn errors start appearing. The sonewconn errors have never
appeared on the other working box.

Once it gets to this point, monerod seems mostly or entirely unable to continue
syncing the block chain and logs constant stacktraces until I completely reboot
the system. Completely stopping and starting monerod doesn't help.

Looking at sockstat -c the last time I was in this state, I only had a bit over
200 connections.

--
You are receiving this mail because:
You are the assignee for the bug.