kern/125149: [zfs][nfs] changing into .zfs dir from nfs client causes endless panic loop

Volker Werth vwe at freebsd.org
Thu Oct 2 21:15:42 UTC 2008


On 10/02/08 21:05, Weldon Godfrey wrote:
> Yes, I can replicate statting .zfs dir from NFS client causes FreeBSD to
> panic and reboot, this time from CentOS 5.0 box.  ...
> 
> 
> Replicate:
> 
> [root at asmtp2 ~]# df
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/mapper/VolGroup00-LogVol00
>                       60817412   2814548  54863692   5% /
> /dev/sda1               101086     28729     67138  30% /boot
> tmpfs                  2008628         0   2008628   0% /dev/shm
> 192.168.2.22:/vol/enamail
>                      1286702144 1032758816 253943328  81%
> /var/spool/mail
> 192.168.2.21:/vol/exports/gaggle
>                      400959408 144327584 256631824  36%
> /var/spool/mail/archive/gaggle
> 192.168.2.36:/export/store1-1
>                      1413955712   4619136 1409336576   1%
> /var/spool/mail/store1-1
> [root at asmtp2 ~]# 
> [root at asmtp2 ~]# 
> [root at asmtp2 ~]# cd /var/spool/mail/store1-1
> [root at asmtp2 store1-1]# ls
> 1  2  3  4  5  6  7  8  9  crap
> [root at asmtp2 store1-1]# cd .zfs
> [root at asmtp2 .zfs]# ls
> (FreeBSD ZFS server panics here)
> 
> Weldon
> 
> Backtrace:
> 
> store1# kgdb /usr/obj/usr/src/sys/GENERIC/kernel.debug vmcore.27
> [GDB will not be able to debug user-mode threads:
> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you
> are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for
> details.
> This GDB was configured as "amd64-marcel-freebsd".
> 
> Unread portion of the kernel message buffer:
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 5; apic id = 05
> fault virtual address   = 0x108
> fault code              = supervisor write data, page not present
> instruction pointer     = 0x8:0xffffffff804f06fa
> stack pointer           = 0x10:0xffffffffdf761590
> frame pointer           = 0x10:0x4
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 807 (nfsd)
> trap number             = 12
> panic: page fault
> cpuid = 5
> Uptime: 1m19s
> Physical memory: 16367 MB
> Dumping 891 MB: 876 860 844 828 812 796 780 764 748 732 716 700 684 668
> 652 636 620 604 588 572 556 540 524 508 492 476 460 444 428 412 396 380
> 364 348 332 316 300 284 268 252 236 220 204 188 172 156 140 124 108 92
> 76 60 44 28 12
> 
> #0  doadump () at pcpu.h:194
> 194     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) vt
> Undefined command: "vt".  Try "help".
> (kgdb) bt
> #0  doadump () at pcpu.h:194
> #1  0x0000000000000004 in ?? ()
> #2  0xffffffff80477699 in boot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:409
> #3  0xffffffff80477a9d in panic (fmt=0x104 <Address 0x104 out of
> bounds>) at /usr/src/sys/kern/kern_shutdown.c:563
> #4  0xffffffff8072ed24 in trap_fatal (frame=0xffffff00059a0340,
> eva=18446742974291977320)
>     at /usr/src/sys/amd64/amd64/trap.c:724
> #5  0xffffffff8072f0f5 in trap_pfault (frame=0xffffffffdf7614e0,
> usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
> #6  0xffffffff8072fa38 in trap (frame=0xffffffffdf7614e0) at
> /usr/src/sys/amd64/amd64/trap.c:410
> #7  0xffffffff807156ae in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:169
> #8  0xffffffff804f06fa in vput (vp=0x0) at atomic.h:142
> #9  0xffffffff8060670d in nfsrv_readdirplus (nfsd=0xffffff000584f100,
> slp=0xffffff0005725900, 
>     td=0xffffff00059a0340, mrq=0xffffffffdf761af0) at
> /usr/src/sys/nfsserver/nfs_serv.c:3613
> #10 0xffffffff80615a5d in nfssvc (td=Variable "td" is not available.
> ) at /usr/src/sys/nfsserver/nfs_syscalls.c:461
> #11 0xffffffff8072f377 in syscall (frame=0xffffffffdf761c70) at
> /usr/src/sys/amd64/amd64/trap.c:852
> #12 0xffffffff807158bb in Xfast_syscall () at
> /usr/src/sys/amd64/amd64/exception.S:290
> #13 0x000000080068746c in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> 
> 

Weldon,

Can you please run the following commands in kgdb and send the output:

(kgdb) frame 9
(kgdb) list
(kgdb) p *vp
(kgdb) p *dp
(kgdb) frame 8
(kgdb) list

Please keep the core dump as we might need to check some variable values
later.

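I think the problem is the NULL pointer passed to vput(): frame 8 shows
vput (vp=0x0), called from nfsrv_readdirplus at nfs_serv.c:3613, and the
fault address 0x108 is a small offset from NULL. A maintainer needs to
check how nvp can end up NULL there (assuming my fresh codebase is not
too different from yours).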

Thanks

Volker


More information about the freebsd-bugs mailing list