Re: Panic in pipe_write from syslogd in 12.2?

From: Mark Johnston <markj_at_freebsd.org>
Date: Wed, 13 Oct 2021 14:46:29 UTC

On Tue, Oct 12, 2021 at 08:39:55PM +0200, Peter Eriksson wrote:
> I just noticed that a couple of my servers running 12.2-RELEASE-p4 have… 8263, 14474 and 3831 defunct subprocesses from syslogd and also seem to have stopped writing to the log files… When I tried to kill syslogd on a fourth server (with some X000 defunct processes) the machine panic’ed and rebooted.
> 
> I seem to have a vague memory of this being a known bug: someone saw something similar, or it was perhaps even solved in a later patch release? But my google-fu seems to be failing me today. Does anyone else remember?

I don't believe we've released a patch that would fix this.  The unix
domain socket code has been refactored a fair bit with respect to
locking since 12.2, and I believe this panic will be fixed in 12.3.  In
particular, I've seen one other report of a similar panic that went
away after
https://cgit.freebsd.org/src/commit/?id=ccdadf1a9bb64156e4a62bb6207c37b841467cb7 .
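
Until a fixed release is installed, it may be useful to check whether
other machines are accumulating defunct syslogd children, as described
in the quoted report.  The following is only a rough sketch, not
something taken from this thread: the function names are made up for
illustration, and it assumes that zombies show a state beginning with
"Z" in the output of "ps -axo pid,ppid,stat,comm".

```python
import subprocess


def run_ps():
    """Return the raw text of `ps -axo pid,ppid,stat,comm` (hypothetical helper)."""
    result = subprocess.run(
        ["ps", "-axo", "pid,ppid,stat,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def count_zombie_children(ps_output, parent_comm="syslogd"):
    """Count zombie processes whose parent is named parent_comm.

    Assumption: ps_output has the four columns pid, ppid, stat, comm,
    and a zombie's state string starts with "Z".
    """
    rows = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        pid, ppid, stat, comm = fields
        # The header line ("PID PPID STAT COMMAND") is skipped here
        # because its first two fields are not numeric.
        if not (pid.isdigit() and ppid.isdigit()):
            continue
        rows.append((int(pid), int(ppid), stat, comm.strip()))

    # PIDs of every process whose command name matches the parent.
    parents = {pid for pid, _, _, comm in rows if comm == parent_comm}

    # Zombies are rows in state "Z..." whose parent is one of those PIDs.
    return sum(
        1
        for _, ppid, stat, _ in rows
        if ppid in parents and stat.startswith("Z")
    )
```

Calling count_zombie_children(run_ps()) on a host prints nothing by
itself; it returns the count, which can then be logged or compared
against a threshold of your choosing.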

> (The one that panic’ed is now running -p10 instead which they should have done a long time ago but…)
> 
> 
> I reported it on the FreeBSD bugzilla:
>   https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=259084
> 
> 
> Output from one that still is running:
> # egrep syslogd /var/log/sys/15:10/procstat-kk-a.log
>  9212 101640 syslogd             -                   mi_switch+0xd4 sleepq_catch_signals+0x403 sleepq_wait_sig+0xf _sleep+0x1de pipe_write+0x583 dofilewrite+0xb0 sys_write+0xc0 amd64_syscall+0x387 fast_syscall_common+0xf8
> 
> Output from the one that panic’ed:
> Fatal trap 12: page fault while in kernel mode
> cpuid = 20; apic id = 14
> fault virtual address	= 0x410
> fault code		= supervisor read data, page not present
> instruction pointer	= 0x20:0xffffffff80b9f55c
> stack pointer	        = 0x28:0xfffffe14debc6710
> frame pointer	        = 0x28:0xfffffe14debc6790
> code segment		= base rx0, limit 0xfffff, type 0x1b
> 			= DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags	= interrupt enabled, resume, IOPL = 0
> current process		= 9277 (sshd)
> trap number		= 12
> panic: page fault
> cpuid = 20
> time = 1633990484
> KDB: stack backtrace:
> #0 0xffffffff80c0ad75 at kdb_backtrace+0x65
> #1 0xffffffff80bbf02b at vpanic+0x17b
> #2 0xffffffff80bbeea3 at panic+0x43
> #3 0xffffffff8108e911 at trap_fatal+0x391
> #4 0xffffffff8108e96f at trap_pfault+0x4f
> #5 0xffffffff8108dfb6 at trap+0x286
> #6 0xffffffff81066c28 at calltrap+0x8
> #7 0xffffffff80c6365f at unp_pcb_owned_lock2_slowpath+0x12f
> #8 0xffffffff80c61e0f at uipc_send+0x139f
> #9 0xffffffff80c55b7a at sosend_generic+0x4ca
> #10 0xffffffff80c55f90 at sosend+0x50
> #11 0xffffffff80c5cc55 at kern_sendit+0x225
> #12 0xffffffff80c5cfcc at sendit+0x19c
> #13 0xffffffff80c5ce1d at sys_sendto+0x4d
> #14 0xffffffff8108f4c7 at amd64_syscall+0x387
> #15 0xffffffff8106754e at fast_syscall_common+0xf8
> Uptime: 212d21h35m47s
> 
> - Peter
>