Message-Id: <202203211511.22LFBiHG041121@mail.karels.net>
To: Kristof Provost
cc: freebsd-net@freebsd.org
From: Mike Karels
Reply-to: mike@karels.net
Subject: Re: kernel epoch crash in IPv4 multicast code
In-reply-to: Your message of Mon, 21 Mar 2022 13:41:15 +0100.
             <9E6CA0F5-5E02-4458-8D9F-C7F8F1715BFC@FreeBSD.org>
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net
Date: Mon, 21 Mar 2022 10:11:44 -0500

Kristof wrote:
> On 18 Mar 2022, at 19:02, Mike Karels wrote:
> > It looks like the IPv4 multicast code has not been fully converted to
> > use epochs.  I installed this week's snapshot of -current, configured
> > and started mrouted, and started rwhod -m.
> > The system crashed shortly thereafter with this:
> >
> > panic: Assertion in_epoch(net_epoch_preempt) failed at /usr/src/sys/netinet/ip_output.c:343
> > cpuid = 15
> > time = 1647609865
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01b51a39d0
> > vpanic() at vpanic+0x17f/frame 0xfffffe01b51a3a20
> > panic() at panic+0x43/frame 0xfffffe01b51a3a80
> > ip_output() at ip_output+0x15f9/frame 0xfffffe01b51a3b80
> > phyint_send() at phyint_send+0x107/frame 0xfffffe01b51a3be0
> > ip_mdq() at ip_mdq+0x259/frame 0xfffffe01b51a3c60
> > X_ip_mrouter_set() at X_ip_mrouter_set+0x9e4/frame 0xfffffe01b51a3d30
> > sosetopt() at sosetopt+0xee/frame 0xfffffe01b51a3d80
> > kern_setsockopt() at kern_setsockopt+0xad/frame 0xfffffe01b51a3de0
> > sys_setsockopt() at sys_setsockopt+0x24/frame 0xfffffe01b51a3e00
> > amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe01b51a3f30
> > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe01b51a3f30
> > --- syscall (105, FreeBSD ELF64, sys_setsockopt), rip = 0x821b72dda, rsp = 0x8204c06f8, rbp = 0x8204c0750 ---
> > KDB: enter: panic
> >
> > The kgdb backtrace is appended.
> >
> > It looks like ip_mroute is protected in the forwarding path (it's called
> > from ip_input) and the output path, but not in the setup path from
> > setsockopt().  At least the MRT_ADD_MFC call needs to enter an epoch.
> > I tried adding epoch handling in add_mfc(), and that seems to work.
> > The alternative would be to do it in Xip_mrouter_set() so it would cover
> > all the calls.  Any opinions?
> >
> Your analysis looks reasonable.
> I think I'd suggest adding the NET_EPOCH_ENTER() calls in add_mfc(). We already do that in add_vif(), so we'd be following existing choices.
> I'd also suggest adding NET_EPOCH_ASSERT() to everything which directly or indirectly calls ip_output(). That should help us catch other potential issues like this one.

Thanks.
I had already added one assert; I added one in send_packet() as well.
For anyone interested, this is now in review:
https://reviews.freebsd.org/D34624.

		Mike

> Br,
> Kristof
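
For readers who want to see the shape of the problem outside the kernel,
here is a minimal userland sketch. It is not the change in the review and
not kernel code: the MOCK_* macros, mock_epoch_depth, and the mock_*
functions are all hypothetical stand-ins I made up for illustration. They
only model the idea discussed above: a callee that requires the network
epoch (as ip_output() does via its assertion), a setup path that reaches
it without entering the epoch, and the same path wrapped in an
enter/exit pair the way NET_EPOCH_ENTER()/NET_EPOCH_EXIT() would be used
in add_mfc().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userland stand-ins, for illustration only.  The real kernel macros are
 * NET_EPOCH_ENTER(et), NET_EPOCH_EXIT(et) and NET_EPOCH_ASSERT(); nothing
 * below is their actual implementation.
 */
struct mock_epoch_tracker {
	int unused;
};

/* Nesting depth of the mock epoch (single-threaded model). */
static int mock_epoch_depth;

#define MOCK_NET_EPOCH_ENTER(et) do { (void)&(et); mock_epoch_depth++; } while (0)
#define MOCK_NET_EPOCH_EXIT(et)  do { (void)&(et); mock_epoch_depth--; } while (0)
#define MOCK_IN_NET_EPOCH()      (mock_epoch_depth > 0)

/*
 * Models a callee that requires the epoch.  The kernel would panic on a
 * failed assertion; here the result is returned so it can be inspected.
 */
static bool
mock_ip_output(void)
{
	return (MOCK_IN_NET_EPOCH());
}

/* Models the setup path reaching the callee without entering the epoch. */
static bool
mock_add_mfc_unprotected(void)
{
	return (mock_ip_output());
}

/* Models the same path with the epoch entered around the call. */
static bool
mock_add_mfc_protected(void)
{
	struct mock_epoch_tracker et;
	bool ok;

	MOCK_NET_EPOCH_ENTER(et);
	ok = mock_ip_output();
	MOCK_NET_EPOCH_EXIT(et);
	return (ok);
}
```

In this model, mock_add_mfc_unprotected() reports that the epoch
requirement was not met, while mock_add_mfc_protected() satisfies it and
leaves the depth counter balanced. Placing the enter/exit pair in the
function closest to the call, rather than at the top of the setsockopt
handler, mirrors the suggestion above to follow what add_vif() already
does.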