kern/101618: kernel panic on multiple Dell PE2850s

Dave Clausen d.clausen at kualo.com
Tue Aug 8 00:10:29 UTC 2006


>Number:         101618
>Category:       kern
>Synopsis:       kernel panic on multiple Dell PE2850s
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Aug 08 00:10:16 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     Dave Clausen
>Release:        FreeBSD 5.5-RELEASE i386
>Organization:
Kualo LTD
>Environment:
The problem occurs on multiple Dell PowerEdge 2850s, each with dual Xeons with Hyper-Threading (HT) enabled.
System: FreeBSD kernbuild.kgix.net 5.5-RELEASE FreeBSD 5.5-RELEASE #0: Sun May 28 22:29:05 UTC 2006 tech at kernbuild.kgix.net:/usr/obj/usr/src/sys/KGIX-SMP-i386 i386

>Description:
We are seeing multiple kernel panics per day across a series of Dell PowerEdge 2850 servers, each running cPanel. The current process in the panic shown below is dcpumon, a CPU monitoring process distributed with cPanel.

The kernel is GENERIC plus firewalling.

dmesg:

Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.5-RELEASE #0: Sun May 28 22:29:05 UTC 2006
    tech at kernbuild.kgix.net:/usr/obj/usr/src/sys/KGIX-SMP-i386
ACPI APIC Table: <DELL   PE BKC  >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.00GHz (2992.51-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf43  Stepping = 3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Hyperthreading: 2 logical CPUs
real memory  = 2147221504 (2047 MB)
avail memory = 2095767552 (1998 MB)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  6
 cpu3 (AP): APIC ID:  7
ioapic0: Changing APIC ID to 8
ioapic1: Changing APIC ID to 9
ioapic1: WARNING: intbase 32 != expected base 24
ioapic2: Changing APIC ID to 10
ioapic2: WARNING: intbase 64 != expected base 56
ioapic3: Changing APIC ID to 11
ioapic3: WARNING: intbase 96 != expected base 88
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 32-55 on motherboard
ioapic2 <Version 2.0> irqs 64-87 on motherboard
ioapic3 <Version 2.0> irqs 96-119 on motherboard
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <DELL PE BKC> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
amr0: <LSILogic MegaRAID 1.51> mem 0xfeac0000-0xfeafffff,0xf80f0000-0xf80fffff irq 46 at device 14.0 on pci2
amr0: <LSILogic PERC 4e/Di> Firmware 521S, BIOS H430, 256MB RAM
pcib3: <ACPI PCI-PCI bridge> at device 0.2 on pci1
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 5.0 on pci0
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 0.0 on pci5
pci6: <ACPI PCI bus> on pcib6
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 0xecc0-0xecff mem 0xfe7e0000-0xfe7fffff irq 64 at device 7.0 on pci6
em0: Ethernet address: 00:14:22:20:34:5a
pcib7: <ACPI PCI-PCI bridge> at device 0.2 on pci5
pci7: <ACPI PCI bus> on pcib7
em1: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 0xdcc0-0xdcff mem 0xfe5e0000-0xfe5fffff irq 65 at device 8.0 on pci7
em1: Ethernet address: 00:14:22:20:34:5b
pcib8: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci8: <ACPI PCI bus> on pcib8
pcib9: <ACPI PCI-PCI bridge> at device 0.0 on pci8
pci9: <ACPI PCI bus> on pcib9
pcib10: <ACPI PCI-PCI bridge> at device 0.2 on pci8
pci10: <ACPI PCI bus> on pcib10
pcib11: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci11: <ACPI PCI bus> on pcib11
pci11: <display, VGA> at device 13.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0xfc00-0xfc0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xec000-0xeffff,0xc0000-0xcafff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: parallel port not found.
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
Timecounters tick every 10.000 msec
ipfw2 initialized, divert disabled, rule-based forwarding enabled, default to accept, logging limited to 100 packets/entry by default
acd0: CDROM <TEAC CD-ROM CD-224E-N/3.AB> at ata0-master PIO4
amrd0: <LSILogic MegaRAID logical drive> on amr0
amrd0: 69880MB (143114240 sectors) RAID 1 (optimal)
amrd1: <LSILogic MegaRAID logical drive> on amr0
amrd1: 69880MB (143114240 sectors) RAID 0 (optimal)
ses0 at amr0 bus 0 target 6 lun 0
ses0: <PE/PV 1x6 SCSI BP 1.0> Fixed Processor SCSI-2 device
ses0: SAF-TE Compliant Device
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
Mounting root from ufs:/dev/amrd0s1a
WARNING: / was not properly dismounted

kgdb output:
Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 06
fault virtual address   = 0x1c
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc062dd01
stack pointer           = 0x10:0xf15fa780
frame pointer           = 0x10:0xf15fa79c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 23694 (dcpumon)
trap number             = 12
panic: page fault
cpuid = 2
boot() called on cpu#0
Uptime: 21h7m11s
Dumping 2047 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608 624 640 656 672 688 704 720 736 752 768 784 800 816 832 848 864 880 896 912 928 944 960 976 992 1008 1024 1040 1056 1072 1088 1104 1120 1136 1152 1168 1184 1200 1216 1232 1248 1264 1280 1296 1312 1328 1344 1360 1376 1392 1408 1424 1440 1456 1472 1488 1504 1520 1536 1552 1568 1584 1600 1616 1632 1648 1664 1680 1696 1712 1728 1744 1760 1776 1792 1808 1824 1840 1856 1872 1888 1904 1920 1936 1952 1968 1984 2000 2016 2032

#0  doadump () at pcpu.h:160
160             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb)

(kgdb) bt
#0  doadump () at pcpu.h:160
#1  0xc05cca59 in boot (howto=260) at ../../../kern/kern_shutdown.c:412
#2  0xc05ccd7d in panic (fmt=0xc07c4944 "%s")
    at ../../../kern/kern_shutdown.c:568
#3  0xc0786410 in trap_fatal (frame=0xf15fa740, eva=28)
    at ../../../i386/i386/trap.c:822
#4  0xc078614f in trap_pfault (frame=0xf15fa740, usermode=0, eva=28)
    at ../../../i386/i386/trap.c:737
#5  0xc0785d89 in trap (frame=
      {tf_fs = -1067581416, tf_es = -1065615344, tf_ds = -1067581424, tf_edi = 4, tf_esi = 0, tf_ebp = -245389412, tf_isp = -245389460, tf_ebx = 131074, tf_edx = -979096576, tf_ecx = 0, tf_eax = 4, tf_trapno = 12, tf_err = 2, tf_eip = -1067262719, tf_cs = 8, tf_eflags = 66118, tf_esp = 7, tf_ss = -245388144})
    at ../../../i386/i386/trap.c:427
#6  0xc077385a in calltrap () at ../../../i386/i386/exception.s:140
#7  0xc05e0018 in devclass_find_free_unit (dc=0x0, unit=0)
    at ../../../kern/subr_bus.c:1147
#8  0xc058bc85 in procfs_doprocfile (td=0x0, p=0x20002, pn=0xc3674c80,
    sb=0xf15fa7f0, uio=0x0) at ../../../fs/procfs/procfs.c:73
#9  0xc058fef4 in pfs_readlink (va=0x0) at pcpu.h:157
#10 0xc0628c1c in kern_readlink (td=0xc5a42c00, path=0x0,
    pathseg=UIO_USERSPACE, buf=0x0, bufseg=UIO_USERSPACE, count=1024)
    at vnode_if.h:925
#11 0xc0628b42 in readlink (td=0xc5a42c00, uap=0x0)
    at ../../../kern/vfs_syscalls.c:2197
#12 0xc078674b in syscall (frame=
      {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 135507064, tf_esi = 135655436, tf_ebp = -1077941016, tf_isp = -245387932, tf_ebx = 674068596, tf_edx = -1077942040, tf_ecx = 0, tf_eax = 58, tf_trapno = 47, tf_err = 2, tf_eip = 672570948, tf_cs = 31, tf_eflags = 643, tf_esp = -1077942100, tf_ss = 47})
    at ../../../i386/i386/trap.c:1014
#13 0xc07738af in Xint0x80_syscall () at ../../../i386/i386/exception.s:201
#14 0x0000002f in ?? ()
#15 0x0000002f in ?? ()
#16 0x0000002f in ?? ()
#17 0x0813ac78 in ?? ()
#18 0x0815f00c in ?? ()
#19 0xbfbfece8 in ?? ()
#20 0xf15fad64 in ?? ()
#21 0x282d7874 in ?? ()
#22 0xbfbfe8e8 in ?? ()
#23 0x00000000 in ?? ()
#24 0x0000003a in ?? ()
#25 0x0000002f in ?? ()
#26 0x00000002 in ?? ()
#27 0x28169e44 in ?? ()
#28 0x0000001f in ?? ()
#29 0x00000283 in ?? ()
#30 0xbfbfe8ac in ?? ()
#31 0x0000002f in ?? ()
#32 0x00000000 in ?? ()
#33 0x00000000 in ?? ()
#34 0x00000000 in ?? ()
#35 0x00000000 in ?? ()
#36 0x415ff000 in ?? ()
#37 0xc8668a98 in ?? ()
#38 0xc5a42c00 in ?? ()
#39 0xf15fa600 in ?? ()
#40 0xf15fa5e8 in ?? ()
#41 0xc3547300 in ?? ()
#42 0xc05dd0c3 in sched_switch (td=0x815f00c, newtd=0x282d7874, flags=Cannot access memory at address 0xbfbfecf8
)
    at ../../../kern/sched_4bsd.c:881
Previous frame inner to this frame (corrupt stack?)
(kgdb)
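
As a cross-check on the trace above, the signed decimal values kgdb prints in the frame #5 trap frame can be converted back to the hex addresses shown in the kernel message buffer. The following is only a small sketch, not part of the original kgdb session; the helper names to_u32 and decode_pf_error are made up for illustration, and the error-code decoding assumes the standard i386 page-fault bit layout (bit 0 = protection violation, bit 1 = write, bit 2 = user mode).

```python
def to_u32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


def decode_pf_error(err: int) -> str:
    """Decode the low three bits of an i386 page-fault error code.

    Assumed layout: bit 0 = protection violation (else page not present),
    bit 1 = write (else read), bit 2 = user mode (else supervisor).
    """
    mode = "user" if err & 0x4 else "supervisor"
    access = "write" if err & 0x2 else "read"
    cause = "protection violation" if err & 0x1 else "page not present"
    return f"{mode} {access}, {cause}"


# Values copied from the frame #5 trap frame in the backtrace above.
tf_eip = -1067262719
tf_ebp = -245389412
tf_err = 2
eva = 28

print("instruction pointer:", hex(to_u32(tf_eip)))
print("frame pointer:      ", hex(to_u32(tf_ebp)))
print("fault address:      ", hex(eva))
print("fault code:         ", decode_pf_error(tf_err))
```

Under those assumptions the converted values agree with the "instruction pointer", "frame pointer", "fault virtual address" and "fault code" lines of the panic message, so the trap frame in frame #5 does appear to belong to the reported fault.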

>How-To-Repeat:

>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted:

