Re: page fault in pfioctl

From: Andriy Gapon <>
Date: Sun, 13 Jun 2021 07:39:43 UTC
On 13/06/2021 10:26, Kristof Provost wrote:
> On 12 Jun 2021, at 19:59, Andriy Gapon wrote:
>> Not sure if this has been reported, or maybe even fixed, yet.
>> The crash happened with stable/13 as of 92f49c769b4 (June 3).
>> Judging from the time I think that it happened when running a periodic report 
>> (likely 520.pfdenied).
>> I have the vmcore, can take a look into it on Monday.
>> Ah, and I must add that this is a custom kernel configuration with INVARIANTS.
>> Kernel page fault with the following non-sleepable locks held:
>> exclusive rm pf rulesets (pf rulesets) r = 0 (0xffffffff85558e58) locked @ 
>> /usr/devel/git/trant/sys/netpfil/pf/pf_ioctl.c:2459
> This panic doesn’t seem to ring any bells for me.
> I’d be interested in seeing what kgdb can pull out of the vmcore.
> The line number for the lock would suggest it happened in DIOCGETRULENV, and the 
> backtrace suggests it’s during the copyout.
> I’m just not sure how that’d panic, because we copy out the result of 
> nvlist_pack() (and have checked that for NULL), using the size it gave us.
> Hopefully the vmcore will be more enlightening.
> That is fairly new code though, so bugs are not impossible.

Based on the panic message (page fault with non-sleepable locks held), it seems 
that the problem is with holding the lock across the copyout.  Usually that 
won't panic, but if the destination happens to be paged out...
And only with INVARIANTS, I guess...

Based on the kgdb backtrace it's indeed DIOCGETRULENV.

I guess that the copyout of the (packed) nvlist can be delayed until after 
unlocking the rules lock.

And it seems that the pattern is present in some other ioctls.

#0  doadump (textdump=textdump@entry=1) at 
#1  0xffffffff808443f2 in kern_reboot (howto=260) at 
#2  0xffffffff80844a47 in vpanic (fmt=0xffffffff80bee342 "%s", 
ap=0xfffffe01cb072190) at /usr/devel/git/trant/sys/kern/kern_shutdown.c:919
#3  0xffffffff808445f3 in panic (fmt=<unavailable>) at 
#4  0xffffffff80b300a5 in trap_fatal (frame=0xfffffe01cb0723c0, eva=34370363392) 
at /usr/devel/git/trant/sys/amd64/amd64/trap.c:915
#5  0xffffffff80b30180 in trap_pfault (frame=frame@entry=0xfffffe01cb0723c0, 
usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized out>, 
ucode@entry=0x0) at /usr/devel/git/trant/sys/amd64/amd64/trap.c:732
#6  0xffffffff80b2f729 in trap (frame=frame@entry=0xfffffe01cb0723c0) at 
#7  0xffffffff80b304d9 in trap_check (frame=0xfffffe01cb0723c0) at 
#8  <signal handler called>
#9  copyout_nosmap_std () at /usr/devel/git/trant/sys/amd64/amd64/support.S:855
#10 0xffffffff85540358 in pfioctl (dev=<optimized out>, cmd=3222815751, 
addr=<optimized out>, flags=<optimized out>, td=<optimized out>) at 
#11 0xffffffff807176cf in devfs_ioctl (ap=0xfffffe01cb0729d8) at 
#12 0xffffffff80bb26e2 in VOP_IOCTL_APV (vop=0xffffffff80ead008 <devfs_specops>, 
a=a@entry=0xfffffe01cb0729d8) at vnode_if.c:1231
#13 0xffffffff80928014 in VOP_IOCTL (vp=vp@entry=0xfffff8068c0737a0, 
command=<optimized out>, data=0x117e, data@entry=0xfffffe01cb072ba0, fflag=47, 
cred=0xfffffe01ced81000, cred@entry=0xfffff8045b7a9b00, td=0x8080808080808080)
     at ./vnode_if.h:642
#14 0xffffffff80923330 in vn_ioctl (fp=0xfffff802cb0baf00, 
com=18446741882451533824, data=0xfffffe01cb072ba0, 
active_cred=0xfffff8045b7a9b00, td=0x8080808080808080) at 
#15 0xffffffff80717bbe in devfs_ioctl_f (fp=0x800a22000, 
com=18446741882451533824, data=0x117e, cred=0x2f, td=0xfffff804132d0000) at 
#16 0xffffffff808abc6b in fo_ioctl (fp=0x800a22000, com=18446741882451533824, 
com@entry=3222815751, data=0x117e, data@entry=0xfffffe01cb072ba0, 
active_cred=0x2f, td=0xfffffe01ced81000, td@entry=0xfffff804132d0000)
     at /usr/devel/git/trant/sys/sys/file.h:356
#17 0xffffffff808abc01 in kern_ioctl (td=td@entry=0xfffff804132d0000, fd=3, 
com=com@entry=3222815751, data=data@entry=0xfffffe01cb072ba0 "") at 
#18 0xffffffff808ab982 in sys_ioctl (td=0xfffff804132d0000, 
uap=0xfffff804132d03e8) at /usr/devel/git/trant/sys/kern/sys_generic.c:711
#19 0xffffffff80b30cc9 in syscallenter (td=0xfffff804132d0000) at 
#20 0xffffffff80b309a5 in amd64_syscall (td=0xfffff804132d0000, traced=0) at 

Andriy Gapon