From: Kristof Provost
To: Andriy Gapon
Cc: net@freebsd.org
Subject: Re: page fault in pfioctl
Date: Sun, 13 Jun 2021 10:19:18 +0200

> On 13 Jun 2021, at 09:41, Andriy Gapon wrote:
>
> On 13/06/2021 10:26, Kristof Provost wrote:
>>> On 12 Jun 2021, at 19:59, Andriy Gapon wrote:
>>> Not sure if this has been reported, or maybe even fixed, yet.
>>> The crash happened with stable/13 as of 92f49c769b4 (June 3).
>>> Judging from the time I think that it happened when running a periodic report (likely 520.pfdenied).
>>> I have the vmcore, can take a look into it on Monday.
>>>
>>> Ah, and I must add that this is a custom kernel configuration with INVARIANTS.
>>>
>>> Kernel page fault with the following non-sleepable locks held:
>>> exclusive rm pf rulesets (pf rulesets) r = 0 (0xffffffff85558e58) locked @ /usr/devel/git/trant/sys/netpfil/pf/pf_ioctl.c:2459
>>>
>> This panic doesn’t seem to ring any bells for me.
>> I’d be interested in seeing what kgdb can pull out of the vmcore.
>> The line number for the lock would suggest it happened in DIOCGETRULENV, and the backtrace suggests it’s during the copyout.
>> I’m just not sure how that’d panic, because we copy out the result of nvlist_pack() (and have checked that for NULL), using the size it gave us.
>> Hopefully the vmcore will be more enlightening.
>> That is fairly new code though, so bugs are not impossible.
>
> Based on the panic message (page fault with non-sleepable locks held), it seems that the problem is with holding the lock across the copyout. Usually that won't panic, but if the destination happens to be paged out...
> And only with INVARIANTS, I guess...

Oh right. Thanks.
I’ve gotten bitten by that one before, but had clearly garbage collected the memory.

I’ll fix this one and check for others on Monday.
I’ll also see if we can persuade copyout to always panic on this bug, not just when the destination memory is actually paged out.
That way we’ll catch this in the regression tests in the future.

Best regards,
Kristof
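
Illustration of the pattern discussed in this thread (not holding a non-sleepable lock across a copy to user memory), as a minimal userland C sketch. It is an analogy under stated assumptions, not the actual pf change: struct ruleset, fake_copyout() and get_rule() are hypothetical names, a pthread mutex stands in for the pf rules lock, and memcpy() stands in for copyout(9), which in the kernel may fault and sleep.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for state protected by the pf rules lock. */
struct ruleset {
	pthread_mutex_t	lock;
	char		data[64];
	size_t		len;	/* bytes valid in data, <= sizeof(data) */
};

/* Hypothetical stand-in for copyout(9); the real one may page fault. */
int
fake_copyout(const void *kaddr, void *uaddr, size_t len)
{
	memcpy(uaddr, kaddr, len);
	return (0);
}

/*
 * Snapshot the protected data while holding the lock, drop the lock,
 * and only then copy to the caller's buffer.
 * Returns 0 on success, -1 if the caller's buffer is too small.
 */
int
get_rule(struct ruleset *rs, void *ubuf, size_t ubuflen, size_t *outlen)
{
	char snap[sizeof(rs->data)];
	size_t len;

	pthread_mutex_lock(&rs->lock);
	len = rs->len;
	assert(len <= sizeof(snap));
	memcpy(snap, rs->data, len);	/* no user memory touched here */
	pthread_mutex_unlock(&rs->lock);

	/* The lock is no longer held, so a fault here could safely sleep. */
	if (len > ubuflen)
		return (-1);
	*outlen = len;
	return (fake_copyout(snap, ubuf, len));
}
```

The design choice is that everything needing the lock is finished into a private buffer first, so the step that can fault runs without the lock held. In the kernel case described above, the packed nvlist would play the role of snap.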