i386/84563: Panics occur when PAE enabled and >3.5GB memory used

David Kirchner dpk at dpk.net
Thu Aug 4 21:30:16 GMT 2005


>Number:         84563
>Category:       i386
>Synopsis:       Panics occur when PAE enabled and >3.5GB memory used
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-i386
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Aug 04 21:30:14 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     David Kirchner
>Release:        5.4-RELEASE-p5 and -STABLE as of a few days ago
>Organization:
>Environment:
FreeBSD host 5.4-RELEASE-p5 FreeBSD 5.4-RELEASE-p5 #0: Thu Aug  4 13:06:27 PDT 2005     root at host:/usr/src/sys/i386/compile/STD  i386
>Description:
This occurs on a Supermicro X6DVA-4G/EG system with 4GB of RAM. We have several of these with the same configuration, and they all have the same problem.

The problem is repeatable using the PAE kernel config that comes stock with the OS.

The problem appears to occur when memory above 3.5GB (memory which the BIOS remaps to just above 4096MB) is touched in some way, perhaps when it is paged out.

Here are two traces from two different panics, with something in common:

(gdb) bt
#0  kdb_enter (msg=0x12 <Address 0x12 out of bounds>) at ../../../kern/subr_kdb.c:266
#1  0xc033ea1f in panic (fmt=0xc04d782d "ffs_write: dir write") at ../../../kern/kern_shutdown.c:550
#2  0xc04292de in ffs_write (ap=0xeb858a94) at ../../../ufs/ffs/ffs_vnops.c:614
#3  0xc0452e71 in vnode_pager_generic_putpages (vp=0xc6237630, m=0xeb858bf0, bytecount=4096,
    flags=0, rtvals=0xeb858b70) at vnode_if.h:432
#4  0xc038b7e2 in vop_stdputpages (ap=0x12) at ../../../kern/vfs_default.c:650
#5  0xc038af3b in vop_defaultop (ap=0x0) at ../../../kern/vfs_default.c:157
#6  0xc0435ebf in ufs_vnoperate (ap=0x0) at ../../../ufs/ufs/ufs_vnops.c:2821
#7  0xc0452c0e in vnode_pager_putpages (object=0xc6901a50, m=0x12, count=18, sync=0, rtvals=0x12)
    at vnode_if.h:1357
#8  0xc044a5db in vm_pageout_flush (mc=0xeb858bf0, count=1, flags=0) at vm_pager.h:147
#9  0xc044a505 in vm_pageout_clean (m=0x0) at ../../../vm/vm_pageout.c:347
#10 0xc044b386 in vm_pageout_scan (pass=1) at ../../../vm/vm_pageout.c:985
#11 0xc044c106 in vm_pageout () at ../../../vm/vm_pageout.c:1476
#12 0xc032911d in fork_exit (callout=0xc044bdf4 <vm_pageout>, arg=0x0, frame=0xeb858d48)
    at ../../../kern/kern_fork.c:791
#13 0xc0474f6c in fork_trampoline () at ../../../i386/i386/exception.s:209

#0  kdb_enter (msg=0x12 <Address 0x12 out of bounds>) at ../../../kern/subr_kdb.c:266
#1  0xc033ea1f in panic (fmt=0xc04c99ff "lockmgr: thread %p, not %s %p unlocking")
    at ../../../kern/kern_shutdown.c:550
#2  0xc0333181 in lockmgr (lkp=0xc61f5e14, flags=6, interlkp=0x1000000, td=0x0)
    at ../../../kern/kern_lock.c:419
#3  0xc038b08b in vop_stdunlock (ap=0x12) at ../../../kern/vfs_default.c:295
#4  0xc038af3b in vop_defaultop (ap=0x0) at ../../../kern/vfs_default.c:157
#5  0xc03010bb in spec_vnoperate (ap=0x0) at ../../../fs/specfs/spec_vnops.c:118
#6  0xc0301648 in spec_write (ap=0xeb858a94) at vnode_if.h:1044
#7  0xc03010bb in spec_vnoperate (ap=0x0) at ../../../fs/specfs/spec_vnops.c:118
#8  0xc0452ecd in vnode_pager_generic_putpages (vp=0xc61f5d68, m=0xeb858bf0, bytecount=4096,
    flags=0, rtvals=0xeb858b70) at vnode_if.h:432
#9  0xc038b7e2 in vop_stdputpages (ap=0x12) at ../../../kern/vfs_default.c:650
#10 0xc038af3b in vop_defaultop (ap=0x0) at ../../../kern/vfs_default.c:157
#11 0xc03010bb in spec_vnoperate (ap=0x0) at ../../../fs/specfs/spec_vnops.c:118
#12 0xc0452c6a in vnode_pager_putpages (object=0xc085e7bc, m=0x12, count=18, sync=0, rtvals=0x12)
    at vnode_if.h:1357
#13 0xc044a603 in vm_pageout_flush (mc=0xeb858bf0, count=1, flags=0) at vm_pager.h:147
#14 0xc044a52d in vm_pageout_clean (m=0x0) at ../../../vm/vm_pageout.c:347
#15 0xc044b3df in vm_pageout_scan (pass=0) at ../../../vm/vm_pageout.c:996
#16 0xc044c162 in vm_pageout () at ../../../vm/vm_pageout.c:1487
#17 0xc032911d in fork_exit (callout=0xc044be50 <vm_pageout>, arg=0x0, frame=0xeb858d48)
    at ../../../kern/kern_fork.c:791
#18 0xc0474fcc in fork_trampoline () at ../../../i386/i386/exception.s:209

In both cases, you'll notice that vm_pageout_flush's mc argument is identical (0xeb858bf0). That is 3,951,397,872 in decimal. When you boot these servers without PAE enabled, the "real memory" reported is 3,757,965,312. I think this indicates that the page the kernel is dealing with is within the "remapped" region.
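
The arithmetic behind that comparison can be checked with a few lines of C. This is only a sketch of the number conversion, not part of the original debugging session: the two constants are copied from this report, while the helper names bytes_above() and print_comparison() are made up for illustration. It makes no claim about whether the value gdb shows for mc is a physical address; it only shows how far the number lies above the non-PAE "real memory" figure.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the report; nothing here is measured. */
#define MC_ARG       UINT64_C(0xeb858bf0)  /* mc argument shown in both traces */
#define NON_PAE_REAL UINT64_C(3757965312)  /* "real memory" when booted without PAE */

/* Hypothetical helper: how far `value` lies above `limit`, or 0 if it does not. */
uint64_t bytes_above(uint64_t value, uint64_t limit)
{
    return value > limit ? value - limit : 0;
}

/* Hypothetical helper: print the hex value, its decimal form, and the gap. */
void print_comparison(void)
{
    printf("mc argument      : 0x%" PRIx64 " = %" PRIu64 "\n", MC_ARG, MC_ARG);
    printf("non-PAE real mem : %" PRIu64 "\n", NON_PAE_REAL);
    printf("difference       : %" PRIu64 "\n",
        bytes_above(MC_ARG, NON_PAE_REAL));
}
```

Calling print_comparison() from main() shows that the mc value is 193,432,560 (roughly 184MB) above the non-PAE "real memory" figure.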

A third panic also occurs. I do not have a trace for it, but it follows the same pattern that this person saw:

http://groups-beta.google.com/group/lucky.freebsd.stable/browse_thread/thread/99978f6cbf071223/136ab31fcd339d5c?lnk=st&q=freebsd+4GB+PAE+thread&rnum=5&hl=en#136ab31fcd339d5c

That report seems to involve memory in roughly the same range as what I'm seeing.

My understanding of kernel internals and fancy PAE memory access is pretty limited, so I could be way off on my guesses. It does seem that others are having the same trouble, though.
>How-To-Repeat:
This bug is very easy to reproduce. On the system, compile and install the PAE kernel, reboot, then run a program which calloc()s 500MB several times, while rebuilding the kernel repeatedly. Eventually the kernel will crash (usually around 10-30 minutes in). I believe it crashes when it starts putting pages above 3.5GB into the inactive queue, or tries to swap them out, or something like that.
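
The original test program is not included in this report. The following is a minimal sketch of what such a memory hog might look like, under stated assumptions: the function names hold_chunks() and release_chunks() are hypothetical, and writing to every byte after calloc() is an assumption made here so that the pages are actually backed by memory rather than left untouched.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate up to `count` zeroed chunks of `chunk_bytes` each and write to
 * every byte so that each page is really touched. The pointers are stored
 * in `chunks`, which must have room for `count` entries.
 * Returns the number of chunks that were successfully allocated.
 */
size_t hold_chunks(void **chunks, size_t count, size_t chunk_bytes)
{
    size_t i;

    for (i = 0; i < count; i++) {
        chunks[i] = calloc(1, chunk_bytes);
        if (chunks[i] == NULL)
            break;                              /* out of memory or address space */
        memset(chunks[i], 0xA5, chunk_bytes);   /* dirty every page */
    }
    return i;
}

/* Free the first `count` chunks and clear their pointers. */
void release_chunks(void **chunks, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        free(chunks[i]);
        chunks[i] = NULL;
    }
}
```

To approximate the steps above, a main() could call hold_chunks() with a chunk size of 500MB (500UL * 1024 * 1024) and a count of a few chunks, keep the memory held for a while, release it, and repeat, all while the kernel is being rebuilt in a loop. How many 500MB chunks a single process can hold on i386 depends on its address space and resource limits, so the count may need adjusting.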
>Fix:
Unknown. Disabling PAE works, but is obviously not ideal.
>Release-Note:
>Audit-Trail:
>Unformatted:

