Re: page fault in pfioctl

From: Kristof Provost <kp_at_freebsd.org>
Date: Sun, 13 Jun 2021 08:19:18 UTC

> On 13 Jun 2021, at 09:41, Andriy Gapon <avg@freebsd.org> wrote:
> 
> On 13/06/2021 10:26, Kristof Provost wrote:
>>> On 12 Jun 2021, at 19:59, Andriy Gapon wrote:
>>> Not sure if this has been reported, or maybe even fixed, yet.
>>> The crash happened with stable/13 as of 92f49c769b4 (June 3).
>>> Judging from the time I think that it happened when running a periodic report (likely 520.pfdenied).
>>> I have the vmcore, can take a look into it on Monday.
>>> 
>>> Ah, and I must add that this is a custom kernel configuration with INVARIANTS.
>>> 
>>> Kernel page fault with the following non-sleepable locks held:
>>> exclusive rm pf rulesets (pf rulesets) r = 0 (0xffffffff85558e58) locked @ /usr/devel/git/trant/sys/netpfil/pf/pf_ioctl.c:2459
>>> 
>> This panic doesn’t seem to ring any bells for me.
>> I’d be interested in seeing what kgdb can pull out of the vmcore.
>> The line number for the lock would suggest it happened in DIOCGETRULENV, and the backtrace suggests it’s during the copyout.
>> I’m just not sure how that’d panic, because we copy out the result of nvlist_pack() (and have checked that for NULL), using the size it gave us.
>> Hopefully the vmcore will be more enlightening.
>> That is fairly new code though, so bugs are not impossible.
> 
> Based on the panic message (page fault with non-sleepable locks held), it seems that the problem is with holding the lock across the copyout.  Usually that won't panic, but if the destination happens to be paged out...
> And only with INVARIANTS, I guess...

Oh right. Thanks. 
I’ve gotten bitten by that one before, but had clearly garbage collected the memory. 

I’ll fix this one and check for others on Monday. 

I’ll also see if we can persuade copyout to always panic on this bug, not just when the destination memory is actually paged out. 
That way we’ll catch this in the regression tests in the future. 
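
To illustrate the pattern being discussed, here is a minimal userland sketch, not the actual pf code or the eventual fix. Every name in it (locks_held, rules_wlock, rules_wunlock, mock_copyout, get_rule_buggy, get_rule_fixed) is a hypothetical stand-in, and the "lock" is just a counter. The mock copy routine refuses to run whenever the lock is held, which models the "always fail, not only when the page is out" check; the fixed variant snapshots the data under the lock and copies it out only after unlocking.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical model of "non-sleepable locks held by this thread". */
static int locks_held;

static void rules_wlock(void) { locks_held++; }
static void rules_wunlock(void) { assert(locks_held > 0); locks_held--; }

/*
 * Stand-in for copyout(): fails unconditionally when a lock is held,
 * whether or not the destination would actually fault.
 */
static int
mock_copyout(const void *kaddr, void *uaddr, size_t len)
{
	if (locks_held > 0)
		return (EDEADLK);
	memcpy(uaddr, kaddr, len);
	return (0);
}

/* Stand-in for the lock-protected state (e.g. a packed rule). */
static const char ruleset_state[] = "pass in all";

/* Buggy shape: the copy to the caller happens with the lock held. */
static int
get_rule_buggy(void *ubuf, size_t ulen)
{
	int error;

	if (ulen < sizeof(ruleset_state))
		return (ENOSPC);
	rules_wlock();
	error = mock_copyout(ruleset_state, ubuf, sizeof(ruleset_state));
	rules_wunlock();
	return (error);
}

/* Fixed shape: snapshot under the lock, unlock, then copy out. */
static int
get_rule_fixed(void *ubuf, size_t ulen)
{
	char snapshot[sizeof(ruleset_state)];

	if (ulen < sizeof(snapshot))
		return (ENOSPC);
	rules_wlock();
	memcpy(snapshot, ruleset_state, sizeof(snapshot));
	rules_wunlock();
	return (mock_copyout(snapshot, ubuf, sizeof(snapshot)));
}
```

In this model get_rule_buggy() fails on every call with EDEADLK, while get_rule_fixed() succeeds, which is the property a regression test would rely on.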

Best regards,
Kristof