Re: Panics in recent NFS server

From: Mateusz Guzik <mjguzik_at_gmail.com>
Date: Mon, 31 May 2021 17:03:57 +0200
I reproduced the panic, things work for me with the patch below.
However, there may be more to it so I'm going to ask Rick to weigh in.
but short version is that length returned by nfsrv_parsename is off by
one compared to copyinstr.

diff --git a/sys/fs/nfsserver/nfs_nfsdsubs.c b/sys/fs/nfsserver/nfs_nfsdsubs.c
index 2b6e17752544..8c7db36bbd05 100644
--- a/sys/fs/nfsserver/nfs_nfsdsubs.c
+++ b/sys/fs/nfsserver/nfs_nfsdsubs.c
_at__at_ -2065,7 +2065,7 _at__at_ nfsrv_parsename(struct nfsrv_descript *nd, char
*bufp, u_long *hashp,
            }
        }
        *tocp = '\0';
-       *outlenp = (size_t)outlen;
+       *outlenp = (size_t)outlen + 1;
        if (hashp != NULL)
                *hashp = hash;
 nfsmout:


On 5/31/21, Mateusz Guzik <mjguzik_at_gmail.com> wrote:
> On 5/31/21, Mateusz Guzik <mjguzik_at_gmail.com> wrote:
>> It's probably my commit d81aefa8b7dd8cbeffeda541fca9962802404983 ,
>> I'll look at this later.
>
> Well let me rephrase. While the panic was added in said commit, I
> suspect the bug is on nfs side -- it has its own namei variant which I
> suspect is managing ni_pathlen in a manner different than the
> original, it just happens to not panic on kernels prior to the above
> change.
>
>>
>> On 5/31/21, Dimitry Andric <dim_at_freebsd.org> wrote:
>>> Hi,
>>>
>>> I recently upgraded a -CURRENT NFS server from 2021-05-12 to today
>>> (2021-05-31), and when the first NFS client attempted to connect, I got
>>> this
>>> panic:
>>>
>>> panic: lookup: expected nul at 0xfffff800104b3002; string [dim]
>>>
>>> cpuid = 0
>>> time = 1622463863
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>>> 0xfffffe00747e89b0
>>> vpanic() at vpanic+0x187/frame 0xfffffe00747e8a10
>>> panic() at panic+0x43/frame 0xfffffe00747e8a70
>>> lookup() at lookup+0xad2/frame 0xfffffe00747e8b10
>>> nfsvno_namei() at nfsvno_namei+0x1a4/frame 0xfffffe00747e8bc0
>>> nfsrvd_lookup() at nfsrvd_lookup+0x191/frame 0xfffffe00747e8eb0
>>> nfsrvd_dorpc() at nfsrvd_dorpc+0xfab/frame 0xfffffe00747e90c0
>>> nfssvc_program() at nfssvc_program+0x604/frame 0xfffffe00747e92a0
>>> svc_run_internal() at svc_run_internal+0xa72/frame 0xfffffe00747e93d0
>>> svc_run() at svc_run+0x250/frame 0xfffffe00747e9430
>>> nfsrvd_nfsd() at nfsrvd_nfsd+0x33c/frame 0xfffffe00747e9590
>>> nfssvc_nfsd() at nfssvc_nfsd+0x473/frame 0xfffffe00747e9aa0
>>> sys_nfssvc() at sys_nfssvc+0xc7/frame 0xfffffe00747e9ac0
>>> amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe00747e9bf0
>>> fast_syscall_common() at fast_syscall_common+0xf8/frame
>>> 0xfffffe00747e9bf0
>>> --- syscall (155, FreeBSD ELF64, sys_nfssvc), rip = 0x8011aa59a, rsp =
>>> 0x7fffffffe4e8, rbp = 0x7fffffffe780 ---
>>> KDB: enter: panic
>>>
>>> __curthread ()
>>>     at /share/dim/src/freebsd/src-dim/sys/amd64/include/pcpu_aux.h:55
>>> 55		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
>>> (kgdb) #0  __curthread ()
>>>     at /share/dim/src/freebsd/src-dim/sys/amd64/include/pcpu_aux.h:55
>>> #1  doadump (textdump=textdump_at_entry=0)
>>>     at /share/dim/src/freebsd/src-dim/sys/kern/kern_shutdown.c:399
>>> #2  0xffffffff804cca5a in db_dump (dummy=<optimized out>,
>>>     dummy2=<unavailable>, dummy3=<unavailable>, dummy4=<unavailable>)
>>>     at /share/dim/src/freebsd/src-dim/sys/ddb/db_command.c:575
>>> #3  0xffffffff804cc912 in db_command (last_cmdp=<optimized out>,
>>>     cmd_table=<optimized out>, dopager=dopager_at_entry=1)
>>>     at /share/dim/src/freebsd/src-dim/sys/ddb/db_command.c:482
>>> #4  0xffffffff804cc58d in db_command_loop ()
>>>     at /share/dim/src/freebsd/src-dim/sys/ddb/db_command.c:535
>>> #5  0xffffffff804cfd06 in db_trap (type=<optimized out>, code=<optimized
>>> out>)
>>>     at /share/dim/src/freebsd/src-dim/sys/ddb/db_main.c:270
>>> #6  0xffffffff80c69f17 in kdb_trap (type=type_at_entry=3,
>>> code=code_at_entry=0,
>>>     tf=tf_at_entry=0xfffffe00747e88e0)
>>>     at /share/dim/src/freebsd/src-dim/sys/kern/subr_kdb.c:727
>>> #7  0xffffffff810d7aee in trap (frame=0xfffffe00747e88e0)
>>>     at /share/dim/src/freebsd/src-dim/sys/amd64/amd64/trap.c:576
>>> #8  <signal handler called>
>>> #9  kdb_enter (why=0xffffffff812d3d27 "panic", msg=<optimized out>)
>>>     at /share/dim/src/freebsd/src-dim/sys/kern/subr_kdb.c:506
>>> #10 0xffffffff80c1d248 in vpanic (
>>>     fmt=0xffffffff8129dfef "%s: expected nul at %p; string [%s]\n",
>>>     ap=<optimized out>, ap_at_entry=0xfffffe00747e8a50)
>>>     at /share/dim/src/freebsd/src-dim/sys/kern/kern_shutdown.c:907
>>> #11 0xffffffff80c1cfd3 in panic (
>>>     fmt=0xffffffff81e9b9c8 <cnputs_mtx> "=\t)\201\377\377\377\377")
>>>     at /share/dim/src/freebsd/src-dim/sys/kern/kern_shutdown.c:843
>>> #12 0xffffffff80cfa992 in lookup (ndp=ndp_at_entry=0xfffffe00747e8d90)
>>>     at /share/dim/src/freebsd/src-dim/sys/kern/vfs_lookup.c:919
>>> #13 0xffffffff80b33f84 in nfsvno_namei (nd=nd_at_entry=0xfffffe00747e9100,
>>>     ndp=ndp_at_entry=0xfffffe00747e8d90, dp=<optimized out>,
>>>     dp_at_entry=0xfffff80010544380, islocked=<optimized out>,
>>> islocked_at_entry=0,
>>>     exp=exp_at_entry=0xfffffe00747e8fd8, p=p_at_entry=0xfffffe00bbfa3000,
>>>     retdirp=0xfffffe00747e8e78)
>>>     at
>>> /share/dim/src/freebsd/src-dim/sys/fs/nfsserver/nfs_nfsdport.c:597
>>> #14 0xffffffff80b266a1 in nfsrvd_lookup (nd=0xfffffe00747e9100,
>>>     isdgram=<optimized out>, dp=0xfffff80010544380,
>>> vpp=0xfffffe00747e9010,
>>>     fhp=0xfffffe00747e9074, exp=0xfffffe00747e8fd8)
>>>     at
>>> /share/dim/src/freebsd/src-dim/sys/fs/nfsserver/nfs_nfsdserv.c:607
>>> #15 0xffffffff80b1073b in nfsrvd_compound (nd=0xfffffe00747e9100,
>>> isdgram=0,
>>>     tag=0xf <error: Cannot access memory at address 0xf>, taglen=6,
>>>     minorvers=4294967294)
>>>     at
>>> /share/dim/src/freebsd/src-dim/sys/fs/nfsserver/nfs_nfsdsocket.c:1098
>>> #16 nfsrvd_dorpc (nd=nd_at_entry=0xfffffe00747e9100,
>>> isdgram=isdgram_at_entry=0,
>>>     tag=0xf <error: Cannot access memory at address 0xf>, taglen=6,
>>>     minorvers=4294967294)
>>>     at
>>> /share/dim/src/freebsd/src-dim/sys/fs/nfsserver/nfs_nfsdsocket.c:626
>>> #17 0xffffffff80b24c44 in nfs_proc (nd=0xfffffe00747e9100,
>>>     xid=<optimized out>, xprt=0xfffff80003a14e00, rpp=<optimized out>)
>>>     at
>>> /share/dim/src/freebsd/src-dim/sys/fs/nfsserver/nfs_nfsdkrpc.c:402
>>> #18 nfssvc_program (rqst=0xfffff80010455800, xprt=0xfffff80003a14e00)
>>>     at
>>> /share/dim/src/freebsd/src-dim/sys/fs/nfsserver/nfs_nfsdkrpc.c:288
>>> #19 0xffffffff80edead2 in svc_executereq (rqstp=0xfffff80010455800)
>>>     at /share/dim/src/freebsd/src-dim/sys/rpc/svc.c:1037
>>> #20 svc_run_internal (grp=<optimized out>, grp_at_entry=0xfffff800100e6100,
>>>     ismaster=ismaster_at_entry=1)
>>>     at /share/dim/src/freebsd/src-dim/sys/rpc/svc.c:1313
>>> #21 0xffffffff80eddf80 in svc_run (pool=<optimized out>)
>>>     at /share/dim/src/freebsd/src-dim/sys/rpc/svc.c:1392
>>> #22 0xffffffff80b251ec in nfsrvd_nfsd (td=<optimized out>,
>>>     td_at_entry=0xfffffe00bbfa3000, args=args_at_entry=0xfffffe00747e9660)
>>>     at
>>> /share/dim/src/freebsd/src-dim/sys/fs/nfsserver/nfs_nfsdkrpc.c:561
>>> #23 0xffffffff80b3ec93 in nfssvc_nfsd (td=0xfffffe00bbfa3000,
>>>     uap=<optimized out>)
>>>     at
>>> /share/dim/src/freebsd/src-dim/sys/fs/nfsserver/nfs_nfsdport.c:3714
>>> #24 0xffffffff80e6f647 in sys_nfssvc (td=0xfffffe00bbfa3000,
>>>     uap=0xfffffe00bbfa33e8)
>>>     at /share/dim/src/freebsd/src-dim/sys/nfs/nfs_nfssvc.c:111
>>> #25 0xffffffff810d891e in syscallenter (td=<optimized out>)
>>>     at
>>> /share/dim/src/freebsd/src-dim/sys/amd64/amd64/../../kern/subr_syscall.c:189
>>> #26 amd64_syscall (td=0xfffffe00bbfa3000, traced=0)
>>>     at /share/dim/src/freebsd/src-dim/sys/amd64/amd64/trap.c:1156
>>> #27 <signal handler called>
>>> #28 0x00000008011aa59a in ?? ()
>>>
>>> Is anybody seeing this too? :)
>>>
>>> I can probably bisect, but it'll take quite a while.
>>>
>>> -Dimitry
>>>
>>>
>>
>>
>> --
>> Mateusz Guzik <mjguzik gmail.com>
>>
>
>
> --
> Mateusz Guzik <mjguzik gmail.com>
>


-- 
Mateusz Guzik <mjguzik gmail.com>
Received on Mon May 31 2021 - 15:03:57 UTC

Original text of this message