ALPHA4 panic in VM
Steve Kargl
sgk at troutmask.apl.washington.edu
Wed Sep 19 21:11:59 UTC 2018
On Wed, Sep 19, 2018 at 05:02:11PM -0400, Mark Johnston wrote:
> On Wed, Sep 19, 2018 at 01:01:52PM -0700, Steve Kargl wrote:
> > I have the kernel and core file if more information is needed.
> >
> > % cat info.2
> > Dump header from device: /dev/ada0p3
> > Architecture: amd64
> > Architecture Version: 2
> > Dump Length: 2348281856
> > Blocksize: 512
> > Compression: none
> > Dumptime: Wed Sep 19 12:29:59 2018
> > Hostname: troutmask.apl.washington.edu
> > Magic: FreeBSD Kernel Dump
> > Version String: FreeBSD 12.0-ALPHA4 #0 r338505: Thu Sep 6 13:45:34 PDT 2018
> > kargl at troutmask.apl.washington.edu:/usr/obj/usr/src/amd64.amd64/sys/SPEW
> > Panic String: page fault
> > Dump Parity: 2676008548
> > Bounds: 2
> > Dump Status: good
> >
> > % more core.txt.2
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 11
> > fault virtual address = 0xffffb8000719a428
>
> This seems to be the result of a bit-flip. cred is 0xffffb8000719a400,
> which is almost but not quite in the direct map. In particular we have:
>
> (kgdb) frame 10
> #10 0xffffffff8083e07d in vm_object_destroy (object=<optimized out>) at /usr/src/sys/vm/vm_object.c:703
> 703 swap_release_by_cred(object->charge, object->cred);
> (kgdb) p object
> $8 = <optimized out>
> (kgdb) p *(vm_object_t)$r13
> $9 = {
> ...
> cred = 0xffffb8000719a400,
> charge = 28672,
> umtx_data = 0x0
> }
> (kgdb) p *(struct ucred *)0xfffff8000719a400
> $10 = {
> cr_ref = 5737,
> cr_uid = 1001,
> cr_ruid = 1001,
> cr_svuid = 1001,
> cr_ngroups = 7,
> cr_rgid = 1001,
> cr_svgid = 1001,
> cr_uidinfo = 0xfffff80007285500,
> cr_ruidinfo = 0xfffff80007285500,
> cr_prison = 0xffffffff80a9de10 <prison0>,
> ... <more sane-looking ucred fields>
>
> That is, flipping one of the bits in the fault address leads me to a
> valid ucred. This could in principle be the result of a software bug,
> but I'd be more inclined to suspect the hardware.
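
The single-bit difference described above can be checked by XORing the
bad pointer with the address that yields a valid ucred. What follows is
only a minimal sketch in Python, not part of kgdb or any FreeBSD tool;
the helper name flipped_bits is hypothetical, and the two addresses are
copied from the kgdb session quoted above.

```python
def flipped_bits(a: int, b: int) -> list[int]:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# Values taken from the quoted kgdb output.
observed = 0xffffb8000719a400    # object->cred as found in the vm_object
candidate = 0xfffff8000719a400   # address that dereferences to a sane ucred
fault_addr = 0xffffb8000719a428  # "fault virtual address" from core.txt.2

bits = flipped_bits(observed, candidate)
offset = fault_addr - observed   # how far past the bad pointer the access landed

print(hex(observed ^ candidate), bits, hex(offset))
```

If exactly one bit position is reported, the two pointers differ by a
single bit, which is consistent with (though not proof of) a memory
error rather than a stray write of an unrelated value.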
Mark,
Thanks for looking into the problem. This system has
been running for probably 2 years or so without issues.
I guess it's time to pull out memtest86+ (or similar)
to see if hardware is starting to fail.
--
Steve