Re: git: 95f773e59482 - main - i386 copyout_fast: improve detection of a fault on accessing userspace

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Thu, 08 Sep 2022 06:00:13 UTC
On Wed, Sep 07, 2022 at 11:18:39PM +0300, Konstantin Belousov wrote:
> On Wed, Sep 07, 2022 at 06:38:04PM +0200, Tijl Coosemans wrote:
> > On Tue, 6 Sep 2022 23:17:45 +0200 Tijl Coosemans <tijl@FreeBSD.org> wrote:
> > > On Tue, 6 Sep 2022 18:30:01 +0300 Konstantin Belousov
> > > <kostikbel@gmail.com> wrote:
> > >> I suspect you see that leftover panics, which I am working on right now.  
> > > 
> > > Yes, it looks like this:
> > > 
> > > panic: vm_fault_lookup: fault on nofault entry, addr: 0
> > > 
> > > GNU gdb (GDB) 12.1 [GDB v12.1 for FreeBSD]
> > > Copyright (C) 2022 Free Software Foundation, Inc.
> > > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> > > This is free software: you are free to change and redistribute it.
> > > There is NO WARRANTY, to the extent permitted by law.
> > > Type "show copying" and "show warranty" for details.
> > > This GDB was configured as "i386-portbld-freebsd14.0".
> > > Type "show configuration" for configuration details.
> > > For bug reporting instructions, please see:
> > > <https://www.gnu.org/software/gdb/bugs/>.
> > > Find the GDB manual and other documentation resources online at:
> > >     <http://www.gnu.org/software/gdb/documentation/>.
> > > 
> > > For help, type "help".
> > > Type "apropos word" to search for commands related to "word"...
> > > Reading symbols from /boot/kernel/kernel...
> > > Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...
> > > 
> > > Unread portion of the kernel message buffer:
> > > panic: vm_fault_lookup: fault on nofault entry, addr: 0
> > > time = 1662487582
> > > KDB: stack backtrace:
> > > db_trace_self_wrapper(5,ffffffff,0,0,0,...) at db_trace_self_wrapper+0x28/frame 0xffc09cf4
> > > kdb_backtrace(1a716ae0,100,e340e0,ffc09db4,ffc09e00,...) at kdb_backtrace+0x2b/frame 0xffc09d4c
> > > vpanic(bef316,ffc09d88,ffc09d88,ffc09dac,b3181a,...) at vpanic+0xe9/frame 0xffc09d68
> > > panic(bef316,be68d1,0,ffc09dc4,480f308,...) at panic+0x14/frame 0xffc09d7c
> > > vm_fault_lookup(0,0,1a710701,0,0,...) at vm_fault_lookup+0x13a/frame 0xffc09dac
> > > vm_fault(e340e0,0,1,0,0) at vm_fault+0x79/frame 0xffc09e30
> > > vm_fault_trap(e340e0,3b,1,0,0,0) at vm_fault_trap+0x44/frame 0xffc09e58
> > > trap_pfault(3b,0,0) at trap_pfault+0x119/frame 0xffc09ea0
> > > trap(ffc09f6c,8,28,28,e247000,...) at trap+0x2d4/frame 0xffc09f60
> > > calltrap() at 0xffc031ff/frame 0xffc09f60
> > > --- trap 0xc, eip = 0x3b, esp = 0xffc09fac, ebp = 0xffc033dc ---
> > > (null)() at 0x3b/frame 0xffc033dc
> > > KDB: enter: panic
> > > 
> > > 0x009a1dfd in dump_savectx ()
> > >     at /home/tijl/freebsd/base/main/sys/kern/kern_shutdown.c:404
> > > (kgdb) #0  0x009a1dfd in dump_savectx ()
> > >     at /home/tijl/freebsd/base/main/sys/kern/kern_shutdown.c:404
> > > Backtrace stopped: Cannot access memory at address 0xffc09af4
> > 
> > The eip = 0x3b here looks like the size of a copy so I added a test on
> > top of your patch to see if copyout_fast or copyin_fast was called a
> > second time (see attached patch) because that would push the size where
> > eip is in the trapframe.  This triggered after running for a while with
> > this backtrace:
> > 
> > Unread portion of the kernel message buffer:
> > panic: 
> > time = 1662560141
> > KDB: stack backtrace:
> > db_trace_self_wrapper(5,ffffffff,0,0,0,...) at db_trace_self_wrapper+0x28/frame 0x1b0bca0c
> > kdb_backtrace(1c1f93a0,100,1b0bcb1a,ffc06ff0,141e000,...) at kdb_backtrace+0x2b/frame 0x1b0bca64
> > vpanic(b8da7d,1b0bcaa0,1b0bcaa0,1b0bcaac,ffc042af,...) at vpanic+0xe9/frame 0x1b0bca80
> > panic(b8da7d,fb028f96,2,0,1b0bcadc,...) at panic+0x14/frame 0x1b0bca94
> > __stop_set_sysinit_set(1b0bcb1a,fb028f96,2,141e000) at 0xffc042af/frame 0x1b0bcaac
> > copyout(1b0bcb1a,fb028f96,2) at copyout+0x58/frame 0x1b0bcadc
> > pollout(fb028f90,2) at pollout+0x3c/frame 0x1b0bcb04
> > kern_poll(1c1f93a0,fb028f90,2,1b0bcc3c,0) at kern_poll+0x83/frame 0x1b0bcc20
> > sys_poll(1c1f93a0,1c1f9644) at sys_poll+0x54/frame 0x1b0bcc48
> > syscallenter(1,0,1c1f93a0,1b0bcd30,480f318,...) at syscallenter+0xc0/frame 0x1b0bcc78
> > syscall(1b0bcce8,3b,3b,3b,3dfb1300,...) at syscall+0x28/frame 0x1b0bccdc
> > Xint0x80_syscall() at 0xffc03419/frame 0x1b0bccdc
> > --- syscall (4, FreeBSD ELF32, sys_write), eip = 0x2058c67b, esp = 0xffbfc524, ebp = 0x1 ---
> > KDB: enter: panic
> > 
> > 
> > So interrupts must have been reenabled somehow, probably by the page
> > fault handler, and this allows context switches and then another process
> > can call copyout or copyin and corrupt the trampoline stack and
> > copyout_buf.
> I do not see where the interrupts could be reenabled in copyout_fast path,
> without or with page fault on the userspace access.  Also I am not sure
> what should your patch demonstrate.
> 
> Also, pmap_trm_alloc() puts guard page between consequent allocations,
> so trampoline stack overflow must be very careful to not really overflow,
> but just touch enough bytes to give the effect.
> 
> But lets check the hypothesis.  If interrupts are enabled somehow, then
> processor would execute interrupt/fault handler on the same stack, which
> is trampoline stack and not the kstack.  I added an INVARIANTS check that
> verifies that both trap() and syscall() use kstack.  Could you, please,
> fetch my ast branch and see what it the outcome?

I realized that interrupt handlers (as opposed to exception handlers)
were missed.  I updated the commit with handling for apic/atpic interrupts
and TLB shutdown handlers.