kern/156168: [nfs] [panic] Kernel panic under concurrent access over NFS

Rick Macklem rmacklem at uoguelph.ca
Wed Oct 19 16:04:16 UTC 2011


Mark Saad wrote:
> The following reply was made to PR kern/156168; it has been noted by
> GNATS.
> 
> From: Mark Saad <nonesuch at longcount.org>
> To: bug-followup at FreeBSD.org, niakrisn at gmail.com
> Cc:
> Subject: Re: kern/156168: [nfs] [panic] Kernel panic under concurrent access over NFS
> Date: Thu, 29 Sep 2011 11:32:12 -0400
> 
> All,
> I am seeing a similar crash on 7.3-RELEASE-p2 amd64 when using
> apache-1.3.34 with accf_http and an NFS docroot.
> The servers that have crashed are all FreeBSD 7.3-RELEASE amd64.
> The hardware is HP DL145 G2.
> They have 2 GB of RAM and 2 GB of swap, with one single-core Opteron CPU.
> 
> 
> We are using the following sysctls:
> 
> kern.ipc.maxsockbuf=2097152
> kern.ipc.nmbclusters=32768
> kern.ipc.somaxconn=1024
> kern.maxfiles=131072
> kern.maxfilesperproc=32768
> net.inet.tcp.inflight.enable=0
> net.inet.tcp.path_mtu_discovery=0
> net.inet.tcp.recvbuf_inc=524288
> net.inet.tcp.recvbuf_max=8388608
> net.inet.tcp.recvspace=32768
> net.inet.tcp.sendbuf_inc=16384
> net.inet.tcp.sendbuf_max=8388608
> net.inet.tcp.sendspace=32768
> net.inet.udp.recvspace=42080
> net.isr.direct=1
> vm.pmap.shpgperproc=600
> 
> 
> Uptime prior to the crash: the other system was up for 11 days;
> this one was up for 6 days.
> 
> Here are the contents of my crash dump:
> 
> 
> [root at web29 /var/crash]# kgdb /boot/kernel/kernel /var/crash/vmcore.0
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for
> details.
> This GDB was configured as "amd64-marcel-freebsd"...
> 
> Unread portion of the kernel message buffer:
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x258
> fault code = supervisor read data, page not present
> instruction pointer = 0x8:0xffffffff8051a66d
> stack pointer = 0x10:0xffffff803e69b1c0
> frame pointer = 0x10:0xffffff0001b50ae0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 9336 (libhttpd.ep)
> trap number = 12
> panic: page fault
> cpuid = 0
> Uptime: 6d5h18m39s
> Physical memory: 2034 MB
> Dumping 1451 MB: 1436 1420 1404 1388 1372 1356 1340 1324 1308 1292
> 1276 1260 1244 1228 1212 1196 1180 1164 1148 1132 1116 1100 1084 1068
> 1052 1036 1020 1004 988 972 956 940 924 908 892 876 860 844 828 812
> 796 780 764 748 732 716 700 684 668 652 636 620 604 588 572 556 540
> 524 508 492 476 460 444 428 412 396 380 364 348 332 316 300 284 268
> 252 236 220 204 188 172 156 140 124 108 92 76 60 44 28 12
> 
> Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from
> /boot/kernel/accf_http.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/accf_http.ko
> #0 doadump () at pcpu.h:195
> 195 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) bt
> #0 doadump () at pcpu.h:195
> #1 0x0000000000000004 in ?? ()
> #2 0xffffffff805285f9 in boot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:418
> #3 0xffffffff80528a02 in panic (fmt=0x104 <Address 0x104 out of
> bounds>) at /usr/src/sys/kern/kern_shutdown.c:574
> #4 0xffffffff807ec813 in trap_fatal (frame=0xffffff0001b50ae0,
> eva=Variable "eva" is not available.
> ) at /usr/src/sys/amd64/amd64/trap.c:777
> #5 0xffffffff807ecbe5 in trap_pfault (frame=0xffffff803e69b110,
> usermode=0) at /usr/src/sys/amd64/amd64/trap.c:693
> #6 0xffffffff807ed50c in trap (frame=0xffffff803e69b110) at
> /usr/src/sys/amd64/amd64/trap.c:464
> #7 0xffffffff807d614e in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:218
> #8 0xffffffff8051a66d in _mtx_lock_sleep (m=0xffffff002f3d7a80,
> tid=18446742974226565856, opts=Variable "opts" is not available.
> )
> at /usr/src/sys/kern/kern_mutex.c:339
> #9 0xffffffff80701f60 in clnt_dg_create (so=0xffffff00017755a0,
> svcaddr=0xffffff803e69b310, program=100000, version=4, sendsz=Variable
> "sendsz" is not available.
> )
> at /usr/src/sys/rpc/clnt_dg.c:259
> #10 0xffffffff806e97c9 in nlm_get_rpc (sa=Variable "sa" is not
> available.
> ) at /usr/src/sys/nlm/nlm_prot_impl.c:327
> #11 0xffffffff806e9d39 in nlm_host_get_rpc (host=0xffffff0001705000)
> at /usr/src/sys/nlm/nlm_prot_impl.c:1199
> #12 0xffffffff806e680f in nlm_clearlock (host=0xffffff0001705000,
> ext=0xffffff803e69b9a0, vers=4, timo=0xffffff803e69b9d0,
> retries=2147483647, vp=0xffffff004881edc8, op=2,
> fl=0xffffff803e69bac0, flags=64, svid=9336, fhlen=32,
> fh=0xffffff803e69b750,
> size=689) at /usr/src/sys/nlm/nlm_advlock.c:943
> #13 0xffffffff806e7801 in nlm_advlock_internal (vp=0xffffff004881edc8,
> id=Variable "id" is not available.
> ) at /usr/src/sys/nlm/nlm_advlock.c:355
> #14 0xffffffff806e8166 in nlm_advlock (ap=Variable "ap" is not
> available.
> ) at /usr/src/sys/nlm/nlm_advlock.c:392
> #15 0xffffffff806ced28 in nfs_advlock (ap=0xffffff803e69ba90) at
> /usr/src/sys/nfsclient/nfs_vnops.c:3153
> #16 0xffffffff804f40e2 in closef (fp=0xffffff0073716d80,
> td=0xffffff0001b50ae0) at vnode_if.h:1036
> #17 0xffffffff804f462b in kern_close (td=0xffffff0001b50ae0,
> fd=Variable "fd" is not available.
> ) at /usr/src/sys/kern/kern_descrip.c:1125
> #18 0xffffffff807ece67 in syscall (frame=0xffffff803e69bc80) at
> /usr/src/sys/amd64/amd64/trap.c:920
> #19 0xffffffff807d635b in Xfast_syscall () at
> /usr/src/sys/amd64/amd64/exception.S:339
> #20 0x00000008009c5b1c in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> 
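An aside on reading the trace quoted above: kgdb prints the tid argument
of _mtx_lock_sleep (frame #8) as an unsigned decimal, which hides the fact
that it is the same value as the td pointer shown in frames #16 and #17.
The short Python sketch below only does that base conversion, using
numbers copied from the trace. The helper name as_pointer is made up for
illustration. Reading the small fault address as a field offset from a
NULL or stale pointer is an assumption on my part, not something the dump
proves.

```python
# Values copied verbatim from the kgdb output quoted above.
TID_DECIMAL = 18446742974226565856   # tid argument in frame #8 (_mtx_lock_sleep)
TD_POINTER = 0xffffff0001b50ae0      # td argument in frames #16 and #17
FAULT_ADDRESS = 0x258                # "fault virtual address" in the trap message


def as_pointer(value: int) -> str:
    """Format an integer as a 64-bit hexadecimal pointer (hypothetical helper)."""
    return f"0x{value & 0xFFFFFFFFFFFFFFFF:016x}"


if __name__ == "__main__":
    # The decimal tid is the current thread pointer written in another base.
    print("tid as pointer:", as_pointer(TID_DECIMAL))
    print("tid matches td:", TID_DECIMAL == TD_POINTER)

    # Assumption: an address this small looks like an offset applied to a
    # NULL (or already freed) structure pointer rather than a real mapping.
    print("fault address in decimal:", FAULT_ADDRESS)
```

So the thread that faulted while taking the mutex in clnt_dg_create is the
same thread that called close() from process 9336.
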
You could try the attached patch, which contains some of the changes
from the newer versions of clnt_dg.c. (There have been many changes, so
carrying them all across isn't practical, for me at least.)

I have no way of testing this patch at this time, so all I did was
compile it.

rick

> --
> mark saad | nonesuch at longcount.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nlmdg7.patch
Type: text/x-patch
Size: 2276 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20111019/a28a6f98/nlmdg7.bin