kern/125149: [zfs][nfs] changing into .zfs dir from nfs client causes endless panic loop

Volker Werth vwe at freebsd.org
Sun Oct 5 17:10:04 UTC 2008


The following reply was made to PR kern/125149; it has been noted by GNATS.

From: Volker Werth <vwe at freebsd.org>
To: bug-followup at FreeBSD.org
Cc:  
Subject: RE: kern/125149: [zfs][nfs] changing into .zfs dir from nfs client
 causes endless panic loop
Date: Sun, 05 Oct 2008 19:05:22 +0200

 Attach submitted debugging information to the PR.
 
 -------- Original Message --------
 Subject: RE: kern/125149: [zfs][nfs] changing into .zfs dir from nfs
 client causes endless panic loop
 Date: Fri, 3 Oct 2008 08:58:42 -0500
 From: Weldon Godfrey <wgodfrey at ena.com>
 To: Volker Werth <vwe at freebsd.org>
 CC: <freebsd-bugs at freebsd.org>
 References: <200810012106.m91L6jq2007417 at freefall.freebsd.org>
 <A7B0A9F02975A74A845FE85D0B95B8FA0A1107A6 at misex01.ena.com>
 <48E535D8.4030101 at freebsd.org>
 
 
 No problem, here is the result.  Thanks!
 Weldon
 
 
 store1# kgdb /usr/obj/usr/src/sys/GENERIC/kernel.debug vmcore.27
 [GDB will not be able to debug user-mode threads:
 /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain
 conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as "amd64-marcel-freebsd".
 
 Unread portion of the kernel message buffer:
 
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 5; apic id = 05
 fault virtual address   = 0x108
 fault code              = supervisor write data, page not present
 instruction pointer     = 0x8:0xffffffff804f06fa
 stack pointer           = 0x10:0xffffffffdf761590
 frame pointer           = 0x10:0x4
 code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags        = interrupt enabled, resume, IOPL = 0
 current process         = 807 (nfsd)
 trap number             = 12
 panic: page fault
 cpuid = 5
 Uptime: 1m19s
 Physical memory: 16367 MB
 Dumping 891 MB: 876 860 844 828 812 796 780 764 748 732 716 700 684 668
 652 636 620 604 588 572 556 540 524 508 492 476 460 444 428 412 396 380
 364 348 332 316 300 284 268 252 236 220 204 188 172 156 140 124 108 92
 76 60 44 28 12
 
 #0  doadump () at pcpu.h:194
 194     pcpu.h: No such file or directory.
         in pcpu.h
 (kgdb) frame 9
 #9  0xffffffff8060670d in nfsrv_readdirplus (nfsd=0xffffff000584f100,
 slp=0xffffff0005725900,
     td=0xffffff00059a0340, mrq=0xffffffffdf761af0) at
 /usr/src/sys/nfsserver/nfs_serv.c:3613
 3613            vput(nvp);
 (kgdb) list
 3608                    nfsm_reply(NFSX_V3POSTOPATTR);
 3609                    nfsm_srvpostop_attr(getret, &at);
 3610                    error = 0;
 3611                    goto nfsmout;
 3612            }
 3613            vput(nvp);
 3614            nvp = NULL;
 3615
 3616            dirlen = len = NFSX_V3POSTOPATTR + NFSX_V3COOKIEVERF +
 3617                2 * NFSX_UNSIGNED;
 (kgdb) p *vp
 $1 = {v_type = VDIR, v_tag = 0xffffffffdf8a7647 "zfs", v_op =
 0xffffffffdf8ab4e0, v_data = 0xffffff0005958d00,
   v_mount = 0xffffff0005908978, v_nmntvnodes = {tqe_next =
 0xffffff0005aed1f0, tqe_prev = 0xffffff0005a117e8},
   v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo =
 0x0}, v_hashlist = {le_next = 0x0,
     le_prev = 0x0}, v_hash = 0, v_cache_src = {lh_first = 0x0},
 v_cache_dst = {tqh_first = 0x0,
     tqh_last = 0xffffff0005aed440}, v_dd = 0x0, v_cstart = 0, v_lasta =
 0, v_lastw = 0, v_clen = 0, v_lock = {
     lk_object = {lo_name = 0xffffffffdf8a7647 "zfs", lo_type =
 0xffffffffdf8a7647 "zfs", lo_flags = 70844416,
       lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness =
 0x0}}, lk_interlock = 0xffffffff80a49ed0,
     lk_flags = 128, lk_sharecount = 0, lk_waitcount = 0,
 lk_exclusivecount = 0, lk_prio = 80, lk_timo = 51,
     lk_lockholder = 0xffffffffffffffff, lk_newlock = 0x0}, v_interlock =
 {lock_object = {
       lo_name = 0xffffffff807ee47a "vnode interlock", lo_type =
 0xffffffff807ee47a "vnode interlock",
       lo_flags = 16973824, lo_witness_data = {lod_list = {stqe_next =
 0x0}, lod_witness = 0x0}}, mtx_lock = 4,
     mtx_recurse = 0}, v_vnlock = 0xffffff0005aed478, v_holdcnt = 2,
 v_usecount = 2, v_iflag = 0, v_vflag = 0,
   v_writecount = 0, v_freelist = {tqe_next = 0x0, tqe_prev = 0x0},
 v_bufobj = {bo_mtx = 0xffffff0005aed4c8,
     bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last =
 0xffffff0005aed538}, bv_root = 0x0, bv_cnt = 0}, bo_dirty = {
       bv_hd = {tqh_first = 0x0, tqh_last = 0xffffff0005aed558}, bv_root
 = 0x0, bv_cnt = 0}, bo_numoutput = 0,
     bo_flag = 0, bo_ops = 0xffffffff809cc320, bo_bsize = 0, bo_object =
 0x0, bo_synclist = {le_next = 0x0,
       le_prev = 0x0}, bo_private = 0xffffff0005aed3e0, __bo_vnode =
 0xffffff0005aed3e0}, v_pollinfo = 0x0,
   v_label = 0x0}
 (kgdb) p *dp
 $2 = {d_fileno = 1, d_reclen = 12, d_type = 4 '\004', d_namlen = 1 '\001',
   d_name =
 ".\000\000\000\001\000\000\000\f\000\004\002..\000\000\002\000\000\000\024\000\004\bsnapshot\000\000\000\000\000\000\000\000 at s'\n\000ÿÿÿ\004\000\000\000\003\000\000\000\022\000\000\000\000\000\000\000|D~\200ÿÿÿÿ|D~\200ÿÿÿÿ\000\000:\002",
 '\0' <repeats 12 times>, "\006", '\0' <repeats 32 times>,
 "à\224\005\000ÿÿÿ\000à\224\005\000ÿÿÿ\000à\224\005\000ÿÿÿ\000\000\000\000\000\000\000\000\030Ö\224\005\000ÿÿÿ",
 '\0' <repeats 87 times>}
 (kgdb) frame 8
 #8  0xffffffff804f06fa in vput (vp=0x0) at atomic.h:142
 142     atomic.h: No such file or directory.
         in atomic.h
 (kgdb) list
 137     in atomic.h
 (kgdb)
 
 Weldon
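
The fault address can be cross-checked against the dump above. With vp=0x0 in frame 8, a fault at 0x108 is consistent with a write to a field 0x108 bytes into the vnode. The sketch below redoes that arithmetic. It assumes that bo_mtx points at the vnode's v_interlock and that the lock structures have the amd64 layout printed by kgdb; the mock_* types are hypothetical stand-ins for illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the lock_object layout as printed by "p *vp" (assumed LP64 sizes). */
struct mock_lock_object {
    const char *lo_name;
    const char *lo_type;
    unsigned int lo_flags;
    void *lo_witness_data;      /* printed as a union of two pointers */
};

/* Mock of the mutex that embeds it; mtx_lock follows lock_object. */
struct mock_mtx {
    struct mock_lock_object lock_object;
    volatile uintptr_t mtx_lock;
    volatile unsigned int mtx_recurse;
};

/* Addresses copied from the "p *vp" output. */
#define VNODE_ADDR     UINT64_C(0xffffff0005aed3e0)  /* __bo_vnode */
#define INTERLOCK_ADDR UINT64_C(0xffffff0005aed4c8)  /* bo_mtx (assumed &v_interlock) */

/* Address touched when the interlock's mtx_lock word is written through vp. */
static uint64_t predicted_fault_address(uint64_t vp)
{
    assert(INTERLOCK_ADDR > VNODE_ADDR);
    uint64_t interlock_offset = INTERLOCK_ADDR - VNODE_ADDR;   /* 0xe8 */
    return vp + interlock_offset + offsetof(struct mock_mtx, mtx_lock);
}
```

On an LP64 platform, predicted_fault_address(0) evaluates to 0x108, which matches the reported fault virtual address and the "supervisor write data" fault code, so the trap looks like a lock attempt on the interlock of a NULL vnode.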
 
 
 -----Original Message-----
 From: Volker Werth [mailto:vwe at freebsd.org]
 Sent: Thursday, October 02, 2008 3:58 PM
 To: Weldon Godfrey
 Cc: freebsd-bugs at freebsd.org
 Subject: Re: kern/125149: [zfs][nfs] changing into .zfs dir from nfs
 client causes endless panic loop
 
 On 10/02/08 21:05, Weldon Godfrey wrote:
 > Yes, I can replicate it: statting the .zfs dir from an NFS client causes
 > FreeBSD to panic and reboot, this time from a CentOS 5.0 box.  ...
 > 
 > 
 > Replicate:
 > 
 > [root at asmtp2 ~]# df
 > Filesystem           1K-blocks      Used Available Use% Mounted on
 > /dev/mapper/VolGroup00-LogVol00
 >                       60817412   2814548  54863692   5% /
 > /dev/sda1               101086     28729     67138  30% /boot
 > tmpfs                  2008628         0   2008628   0% /dev/shm
 > 192.168.2.22:/vol/enamail
 >                      1286702144 1032758816 253943328  81%
 > /var/spool/mail
 > 192.168.2.21:/vol/exports/gaggle
 >                      400959408 144327584 256631824  36%
 > /var/spool/mail/archive/gaggle
 > 192.168.2.36:/export/store1-1
 >                      1413955712   4619136 1409336576   1%
 > /var/spool/mail/store1-1
 > [root at asmtp2 ~]# 
 > [root at asmtp2 ~]# 
 > [root at asmtp2 ~]# cd /var/spool/mail/store1-1
 > [root at asmtp2 store1-1]# ls
 > 1  2  3  4  5  6  7  8  9  crap
 > [root at asmtp2 store1-1]# cd .zfs
 > [root at asmtp2 .zfs]# ls
 > (FreeBSD ZFS server panics here)
 > 
 > Weldon
 > 
 > Backtrace:
 > 
 > store1# kgdb /usr/obj/usr/src/sys/GENERIC/kernel.debug vmcore.27
 > [GDB will not be able to debug user-mode threads:
 > /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
 > GNU gdb 6.1.1 [FreeBSD]
 > Copyright 2004 Free Software Foundation, Inc.
 > GDB is free software, covered by the GNU General Public License, and you
 > are
 > welcome to change it and/or distribute copies of it under certain
 > conditions.
 > Type "show copying" to see the conditions.
 > There is absolutely no warranty for GDB.  Type "show warranty" for
 > details.
 > This GDB was configured as "amd64-marcel-freebsd".
 > 
 > Unread portion of the kernel message buffer:
 > 
 > 
 > Fatal trap 12: page fault while in kernel mode
 > cpuid = 5; apic id = 05
 > fault virtual address   = 0x108
 > fault code              = supervisor write data, page not present
 > instruction pointer     = 0x8:0xffffffff804f06fa
 > stack pointer           = 0x10:0xffffffffdf761590
 > frame pointer           = 0x10:0x4
 > code segment            = base 0x0, limit 0xfffff, type 0x1b
 >                         = DPL 0, pres 1, long 1, def32 0, gran 1
 > processor eflags        = interrupt enabled, resume, IOPL = 0
 > current process         = 807 (nfsd)
 > trap number             = 12
 > panic: page fault
 > cpuid = 5
 > Uptime: 1m19s
 > Physical memory: 16367 MB
 > Dumping 891 MB: 876 860 844 828 812 796 780 764 748 732 716 700 684 668
 > 652 636 620 604 588 572 556 540 524 508 492 476 460 444 428 412 396 380
 > 364 348 332 316 300 284 268 252 236 220 204 188 172 156 140 124 108 92
 > 76 60 44 28 12
 > 
 > #0  doadump () at pcpu.h:194
 > 194     pcpu.h: No such file or directory.
 >         in pcpu.h
 > (kgdb) vt
 > Undefined command: "vt".  Try "help".
 > (kgdb) bt
 > #0  doadump () at pcpu.h:194
 > #1  0x0000000000000004 in ?? ()
 > #2  0xffffffff80477699 in boot (howto=260) at
 > /usr/src/sys/kern/kern_shutdown.c:409
 > #3  0xffffffff80477a9d in panic (fmt=0x104 <Address 0x104 out of
 > bounds>) at /usr/src/sys/kern/kern_shutdown.c:563
 > #4  0xffffffff8072ed24 in trap_fatal (frame=0xffffff00059a0340,
 > eva=18446742974291977320)
 >     at /usr/src/sys/amd64/amd64/trap.c:724
 > #5  0xffffffff8072f0f5 in trap_pfault (frame=0xffffffffdf7614e0,
 > usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
 > #6  0xffffffff8072fa38 in trap (frame=0xffffffffdf7614e0) at
 > /usr/src/sys/amd64/amd64/trap.c:410
 > #7  0xffffffff807156ae in calltrap () at
 > /usr/src/sys/amd64/amd64/exception.S:169
 > #8  0xffffffff804f06fa in vput (vp=0x0) at atomic.h:142
 > #9  0xffffffff8060670d in nfsrv_readdirplus (nfsd=0xffffff000584f100,
 > slp=0xffffff0005725900, 
 >     td=0xffffff00059a0340, mrq=0xffffffffdf761af0) at
 > /usr/src/sys/nfsserver/nfs_serv.c:3613
 > #10 0xffffffff80615a5d in nfssvc (td=Variable "td" is not available.
 > ) at /usr/src/sys/nfsserver/nfs_syscalls.c:461
 > #11 0xffffffff8072f377 in syscall (frame=0xffffffffdf761c70) at
 > /usr/src/sys/amd64/amd64/trap.c:852
 > #12 0xffffffff807158bb in Xfast_syscall () at
 > /usr/src/sys/amd64/amd64/exception.S:290
 > #13 0x000000080068746c in ?? ()
 > Previous frame inner to this frame (corrupt stack?)
 > 
 > 
 
 Weldon,
 
 can you please try the following from kgdb and send the output:
 
 (kgdb) frame 9
 (kgdb) list
 (kgdb) p *vp
 (kgdb) p *dp
 (kgdb) frame 8
 (kgdb) list
 
 Please keep the core dump as we might need to check some variable values
 later.
 
 I think the problem is the NULL pointer passed to vput(). A maintainer
 needs to check how nvp can end up NULL at that call (assuming my fresh
 codebase is not too different from yours).
 
 Thanks
 
 Volker
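
One hypothetical way for nvp to reach vput() as NULL is sketched below: a lookup that can fail with several error codes, and a caller that tests for only one of them. This is a simplified model, not the code in nfs_serv.c; mock_vget, mock_vput and the probe_* functions are invented names, and whether the real caller behaves like probe_single_errno is an assumption a maintainer would have to confirm.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mock_vnode { int usecount; };

enum probe_result { PROBE_OK, PROBE_UNSUPPORTED, PROBE_FAILED, PROBE_NULL_RELEASE };

/* Hypothetical lookup: on any error *out stays NULL, on success it is referenced. */
static int mock_vget(int error, struct mock_vnode *vn, struct mock_vnode **out)
{
    *out = NULL;
    if (error != 0)
        return error;
    vn->usecount++;
    *out = vn;
    return 0;
}

/* Hypothetical release: like the real call, it requires a non-NULL vnode. */
static void mock_vput(struct mock_vnode *vp)
{
    assert(vp != NULL);
    vp->usecount--;
}

/* Caller that only recognises EOPNOTSUPP; any other error falls through. */
static enum probe_result probe_single_errno(int vget_error, struct mock_vnode *vn)
{
    struct mock_vnode *nvp = NULL;

    if (mock_vget(vget_error, vn, &nvp) == EOPNOTSUPP)
        return PROBE_UNSUPPORTED;
    if (nvp == NULL)
        return PROBE_NULL_RELEASE;  /* a real vput(nvp) would fault here */
    mock_vput(nvp);
    return PROBE_OK;
}

/* Caller that treats every non-zero return (or a NULL result) as failure. */
static enum probe_result probe_any_error(int vget_error, struct mock_vnode *vn)
{
    struct mock_vnode *nvp = NULL;
    int error = mock_vget(vget_error, vn, &nvp);

    if (error == EOPNOTSUPP)
        return PROBE_UNSUPPORTED;
    if (error != 0 || nvp == NULL)
        return PROBE_FAILED;
    mock_vput(nvp);
    return PROBE_OK;
}
```

With an error such as EINVAL, probe_single_errno reports PROBE_NULL_RELEASE (the point where the kernel would dereference NULL), while probe_any_error returns PROBE_FAILED without touching nvp.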
 

