Ian J Hart
ianjhart at ntlworld.com
Tue Jul 7 09:51:08 UTC 2009
Quoting Ian J Hart <ianjhart at ntlworld.com>:
> Quoting Ian J Hart <ianjhart at ntlworld.com>:
>> Is this likely to be hardware? Details will follow if not.
>> [copied from a screen dump]
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 1; apic id = 01
>> fault virtual address = 0x0
>> fault code = supervisor write data, page not present
>> instruction pointer = 0x8:0xffffffff807c6c12
>> stack pointer = 0x10:0xffffffff510e7890
>> frame pointer = 0x10:0xffffff00054a6c90
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 75372 (printf)
>> trap number = 12
>> panic: page fault
>> cpuid = 1
>> uptime: 8m2s
>> Cannot dump. No dump device defined.
> [First attempt apparently went into a black hole. Apologies if you
> get this twice.]
> Some suggestions (off list) that it may not be hardware, so here's
> the follow up.
> supermicro 5015b-mt (super X7SBi mobo)
> Intel Q6600
> 8GB ECC DDR2
> 4x Seagate 320GB, two gmirror, two idle.
> issues so far
> 1 OK) 7.x doesn't boot without hw.ata.atapi_dma=0. Not recently tested.
> 2 OK) disks enumerate differently 6.x to 7.x. Painful if you
> hardwired the provider into your mirror.
> 3) 6.3 and 7.2 remote dump over ssh fails with 'Disconnecting:
> Corrupted MAC on input.'
> 4) On 7.2 (AFAICT from logs) random reboots under load. e.g. the
> above generated by a portupgrade run.
> I had dumpdev=none as I hadn't set up rc.early to allow savecore to work.
> In the interests of full disclosure I should say that this box was
> migrated from older hardware and then source upgraded from i386 to
> amd64 (6.3). Only one issue with that: the format of the accounting
> file. Upgraded to 7.2, with a rebuild or two since then.
> This box is our email server and there's no load. An identical box
> running as a gateway/firewall backup dumps okay and doesn't reboot.
> That box does drop network connections when running a cvsup server
> (treelist write), but when configured to pass through these
> connections (using balance) runs okay. But that's a story for
> another day as it's still on 6.x.
> Anyway, I put the two gmirror disks in another chassis and the
> remote dumps are now completing. This at least does seem to be
> Before I moved the two gmirror disks I synced a third disk. I can
> now test (most of) the original hardware and software.
> I was unable to make this single disk system crash, so I added two
> new disks and synced them. Now a 3 disk mirror, one disk idle.
> I've disabled sendmail and the email server so as not to clash.
> A portupgrade run caused a crash. I've set up coredumps so I can now
> test. Remote backup dumps do fail.
> xmail# kldstat
> Id Refs Address Size Name
> 1 2 0xffffffff80100000 bd23e0 kernel
> 2 1 0xffffffff80cd3000 20608 geom_mirror.ko
> I did have ipfw module loaded, but I got the crash without it so
> I've removed it (firewall_type=OPEN).
> Ran crashinfo, now have much more info than I need ;)
> Starting another portupgrade run now to see how reproducible this is.
> Later BIOS waiting in USB floppy.
It took 2 runs of portupgrade -af. There is some corruption in the dbs;
I may have to pkg_delete -a.
FreeBSD * 7.2-RELEASE-p1 FreeBSD 7.2-RELEASE-p1 #0: Tue Jun 16
18:03:10 BST 2009 *@*:/usr/obj/usr/src/sys/GENERIC amd64
panic: page fault
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Unread portion of the kernel message buffer:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0xfffffffff5555570
fault code = supervisor write data, page not present
instruction pointer = 0x8:0xffffffff807c429b
stack pointer = 0x10:0xffffffff511e4710
frame pointer = 0x10:0x20
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 69996 (mkdir)
trap number = 12
panic: page fault
cpuid = 1
Physical memory: 8177 MB
Dumping 730 MB: 715 699 683 667 651 635 619 603 587 571 555 539 523
507 491 475 459 443 427 411 395 379 363 347 331 315 299 283 267 251
235 219 203 187 171 155 139 123 107 91 75 59 43 27 11
Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols
Loaded symbols for /boot/kernel/geom_mirror.ko
#0 doadump () at pcpu.h:195
195 pcpu.h: No such file or directory.
(kgdb) #0 doadump () at pcpu.h:195
#1 0x0000000000000004 in ?? ()
#2 0xffffffff8050df19 in boot (howto=260)
#3 0xffffffff8050e322 in panic (fmt=0x104 <Address 0x104 out of bounds>)
#4 0xffffffff807d21f3 in trap_fatal (frame=0xffffff0005f94a50,
eva=Variable "eva" is not available.
#5 0xffffffff807d25c5 in trap_pfault (frame=0xffffffff511e4660, usermode=0)
#6 0xffffffff807d2f04 in trap (frame=0xffffffff511e4660)
#7 0xffffffff807b706e in calltrap ()
#8 0xffffffff807c429b in free_pv_entry (pmap=0xffffffff80b66c80,
pv=Variable "pv" is not available.
#9 0xffffffff807c4403 in pmap_remove_entry (pmap=Variable "pmap" is
#10 0xffffffff807c6447 in pmap_remove_pte (pmap=0xffffffff80b66c80,
ptq=0xaaaaaaa8, va=18446744070506639360, ptepde=23601251,
free=0xffffffff511e4790) at /usr/src/sys/amd64/amd64/pmap.c:2366
#11 0xffffffff807cab87 in pmap_remove (pmap=0xffffffff80b66c80,
#12 0xffffffff8073bf80 in vm_map_delete (map=0xffffff00016830f8,
#13 0xffffffff80739905 in kmem_free_wakeup (map=0xffffff00016830f8,
addr=18446744070506639360, size=267264) at /usr/src/sys/vm/vm_kern.c:462
#14 0xffffffff804e648d in exec_free_args (args=0xffffffff511e4b00)
#15 0xffffffff804e784a in kern_execve (td=0xffffff0005f94a50,
args=0xffffffff511e4b00, mac_p=Variable "mac_p" is not available.
) at /usr/src/sys/kern/kern_exec.c:836
#16 0xffffffff804e7fd7 in execve (td=0xffffff0005f94a50, uap=Variable
"uap" is not available.
#17 0xffffffff807d2847 in syscall (frame=0xffffffff511e4c80)
#18 0xffffffff807b727b in Xfast_syscall ()
#19 0x00000008005044b0 in ?? ()
Previous frame inner to this frame (corrupt stack?)
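
kgdb prints some arguments in decimal (va in frame #10, addr and size in frame #13), which makes them hard to compare with the hex addresses in the trap output. A minimal sketch for converting them, assuming plain Python 3; the helper name as_kernel_hex is made up for illustration and is not part of kgdb or crashinfo:

```python
def as_kernel_hex(value: int) -> str:
    """Format an integer as a 64-bit hex address, like the trap output does."""
    return f"0x{value & 0xFFFFFFFFFFFFFFFF:016x}"


# Values copied from frames #10 and #13 of the backtrace above.
va = 18446744070506639360
size = 267264

print("va   =", as_kernel_hex(va))         # start of the region being freed
print("size =", hex(size))                 # length passed to kmem_free_wakeup
print("end  =", as_kernel_hex(va + size))  # first address past the region
```

This only reformats numbers already present in the dump; it says nothing about the cause of the fault.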
ian j hart