Kernel Panic, not sure what to do.

M_SPAHZgORN m_spahzgorn at yahoo.com
Fri May 14 13:03:20 PDT 2004


The panic happens with my custom kernel (config below). I haven't tried the
GENERIC kernel yet, because the panic only happens about every other day.
Someone mentioned that they think this PR describes the problem:

http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/53382

The panic seems to happen after a lot of disk activity, e.g. cvsup (run from a
cron job), etc.

He recommended I add 'options KVA_PAGES=512' to my kernel config, so I did, but
it has only been 24 hours without a panic since then.
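
For reference, here is a rough sketch of what that option changes. It assumes
the convention described in the 4.x i386 LINT file: each KVA_PAGES unit covers
4 MB of kernel virtual address space and the default is 256 (1 GB). The
function names are made up for illustration.

```python
# Rough arithmetic for KVA_PAGES on i386 (non-PAE).
# Assumption: each KVA_PAGES unit maps 4 MB and the default value is 256.
MB_PER_KVA_PAGE = 4
TOTAL_ADDRESS_SPACE_MB = 4096  # 32-bit virtual address space


def kva_split(kva_pages):
    """Return (kernel_mb, user_mb) for a given KVA_PAGES value."""
    kernel_mb = kva_pages * MB_PER_KVA_PAGE
    return kernel_mb, TOTAL_ADDRESS_SPACE_MB - kernel_mb


def kernel_base(kva_pages):
    """Lowest kernel virtual address implied by the split (hypothetical helper)."""
    _, user_mb = kva_split(kva_pages)
    return user_mb << 20  # MB to bytes


for pages in (256, 512):
    kernel_mb, user_mb = kva_split(pages)
    print(f"KVA_PAGES={pages}: kernel {kernel_mb} MB, user {user_mb} MB, "
          f"base {hex(kernel_base(pages))}")
```

If those assumptions hold, KVA_PAGES=512 puts the kernel base at 0x80000000,
which would be consistent with the "Preloaded elf kernel" address in the dmesg
below.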

My system will be a web server running Apache, BIND, MySQL, and qmail.  The
load should be pretty light, maybe 1,000 unique visits per day.

Here is my dmesg output:

###
Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.9-RELEASE-p7 #1: Thu May 13 16:26:56 EDT 2004
    blah at blah.com:/usr/obj/usr/src/sys/DEBUG2
Timecounter "i8254"  frequency 1193182 Hz
CPU: Intel(R) Xeon(TM) CPU 2.40GHz (2399.33-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf27  Stepping = 7
 
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Hyperthreading: 2 logical CPUs
real memory  = 2146893824 (2096576K bytes)
avail memory = 2087829504 (2038896K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor motherboard: 4 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00050014, at 0xfee00000
 cpu1 (AP):  apic id:  1, version: 0x00050014, at 0xfee00000
 cpu2 (AP):  apic id:  6, version: 0x00050014, at 0xfee00000
 cpu3 (AP):  apic id:  7, version: 0x00050014, at 0xfee00000
 io0 (APIC): apic id:  2, version: 0x00178020, at 0xfec00000
Preloaded elf kernel "kernel" at 0x802b9000.
Warning: Pentium 4 CPU: PSE disabled
Pentium Pro MTRR support enabled
Using $PIR table, 19 entries at 0x800fde90
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
IOAPIC #0 intpin 17 -> irq 2
pci0: <PCI bus> on pcib0
agp0: <Intel Generic host to PCI bridge> mem 0xec000000-0xedffffff at device 0.0 on pci0
pci0: <unknown card> (vendor=0x8086, dev=0x2551) at 0.1
pcib1: <PCI to PCI bridge (vendor=8086 device=2552)> mem 0xee000000-0xefffffff at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <NVidia GeForce2 GTS graphics accelerator> at 0.0 irq 2
pcib2: <Intel 82801BA/BAM (ICH2) Hub to PCI bridge> at device 30.0 on pci0
IOAPIC #0 intpin 18 -> irq 5
IOAPIC #0 intpin 16 -> irq 11
pci2: <PCI bus> on pcib2
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.16> port 0x2000-0x203f mem 0xea000000-0xea01ffff irq 5 at device 3.0 on pci2
em0:  Speed:N/A  Duplex:N/A
pcib3: <PCI to PCI bridge (vendor=1044 device=a500)> at device 4.0 on pci2
pci3: <PCI bus> on pcib3
asr0: <Adaptec Caching SCSI RAID> mem 0xf8000000-0xf9ffffff irq 11 at device 4.1 on pci2
asr0: major=154
asr0: ADAPTEC 2100S FW Rev. 320P, 1 channel, 256 CCBs, Protocol I2O
isab0: <PCI to ISA bridge (vendor=8086 device=24c0)> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH4 ATA100 controller> port 0x1420-0x142f,0-0x3,0-0x7,0-0x3,0-0x7 irq 0 at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: <unknown card> (vendor=0x8086, dev=0x24c3) at 31.3 irq 2
orm0: <Option ROMs> at iomem 0xcc000-0xd1fff,0xe0000-0xe3fff on isa0
pmtimer0 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <1 virtual consoles, flags=0x300>
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
IP packet filtering initialized, divert disabled, rule-based forwarding enabled, default to deny, logging limited to 10 packets/entry by default
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
Mounting root from ufs:/dev/da0s1a
da0 at asr0 bus 0 target 0 lun 0
da0: <ADAPTEC RAID-5 320P> Fixed Direct Access SCSI-2 device
da0: Tagged Queueing Enabled
da0: 35002MB (71684096 512 byte sectors: 255H 63S/T 4462C)
em0: Link is up 100 Mbps Full Duplex
###

> what version of FreeBSD?

Running: FreeBSD 4.9 (4.9-RELEASE-p7, per the dmesg above)

> what error message comes to the screen when it panics?

I don't know because it usually happens between 3:03 AM and
4:40 AM EST. I am sleeping at this time. ;-)

> does the panic occur regularly (when I run this it fails, fail once
> in a while, etc)?

Yes, it seems to happen every day or every other day at the
times stated above. I've reinstalled the OS at least 10 times,
following the same procedure each time, so I think it has something
to do with my install process. There are no cron jobs scheduled
at the times it happens, so I don't believe it's program related.

> any hardware issues flaky RAM/powersupply, non-terminate SCSI bus
> or heat problems that could be the problem?

Running high-quality Kingston ECC RAM (2 GB) with 4 GB of swap,
three Seagate SCSI drives in RAID-5 on an Adaptec 2100S
controller, and a Tyan dual-Xeon motherboard with two CPUs. Everything
is top of the line. Heat is not an issue; I have extremely good
airflow in the box (15 fans total). The case is a 4U rackmount,
the power supply is a high-quality 500 W unit, and all SCSI devices
are terminated properly. I plan to co-locate this box as soon as I
have figured this problem out.

> do you have GDB compiled into the kernel? (nice to have the symbol
> table)

I believe so. Here is my kernel config:

#####

machine i386
ident DEBUG
maxusers 0
options MAXDSIZ="(512*1024*1024)"
options SMP
options APIC_IO
cpu I686_CPU
options COMPAT_43
options SYSVSHM
options SYSVSEM
options SYSVMSG
options KTRACE
options INET
pseudo-device ether
pseudo-device loop
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPFIREWALL_VERBOSE_LIMIT=10
options IPSTEALTH
options RANDOM_IP_ID
options ACCEPT_FILTER_DATA
options ACCEPT_FILTER_HTTP
options ICMP_BANDLIM
options FFS
options FFS_ROOT
options PROCFS
options SOFTUPDATES
options P1003_1B
options _KPOSIX_PRIORITY_SCHEDULING
device scbus
device da
device pass
options SCSI_DELAY=5000
pseudo-device pty
device isa
device atkbdc0 at isa? port IO_KBD
device atkbd0 at atkbdc? irq 1 flags 0x1
#options KBD_INSTALL_CDEV
device vga0 at isa?
device sc0 at isa? flags 0x100
options MAXCONS=1
options SC_DISABLE_DDBKEY
options SC_DISABLE_REBOOT
options SC_NO_CUTPASTE
options SC_NO_FONT_LOADING
options SC_NO_HISTORY
options SC_NO_SYSMOUSE
device npx0 at nexus? port IO_NPX irq 13
device ata
options ATA_STATIC_ID
device pci
device agp
device em
options NMBCLUSTERS=87040
device asr
options DDB
options DDB_UNATTENDED
makeoptions DEBUG=-g
options DIAGNOSTIC

#####
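
Since the suggested fix (KVA_PAGES) concerns kernel address space, here is a
rough estimate of the most memory the NMBCLUSTERS setting above could account
for. This is only a sketch: it assumes the usual mbuf cluster size of 2048
bytes (MCLBYTES) and ignores the mbufs themselves and any other overhead.

```python
# Upper-bound estimate of mbuf cluster memory for the config above.
# Assumption: one cluster is 2048 bytes (MCLBYTES); overhead is ignored.
MCLBYTES = 2048
NMBCLUSTERS = 87040  # value from the kernel config


def cluster_memory_mb(clusters, cluster_bytes=MCLBYTES):
    """Return the memory covered by `clusters` mbuf clusters, in MB."""
    return clusters * cluster_bytes / (1024 * 1024)


print(f"NMBCLUSTERS={NMBCLUSTERS}: up to {cluster_memory_mb(NMBCLUSTERS)} MB")
```

Under that assumption the setting works out to 170 MB.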

And in my rc.conf I added:

#####
dumpdev="/dev/da0s1b"
dumpdir="/var/crash"
#####

Then after it crashes I run:

shell> gdb -k /usr/obj/usr/src/sys/DEBUG/kernel.debug /var/crash/vmcore.0

... I also run ...

shell> gdb -k /usr/obj/usr/src/sys/DEBUG/kernel.debug.orig /var/crash/vmcore.0

... because I've been told to back up my original 'kernel.debug'
file, since it is changed after a crash. I get the same output
when I run gdb on either 'kernel.debug' file.

Then...

(kgdb) where

... and here is the output (which is what I need help interpreting):

-------------------------------------------------------------------------------
#0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
487 if (dumping++) {
(kgdb) where
#0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
#1 0xc014ba30 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
#2 0xc014beb1 in panic (fmt=0xc0230db9 "%s") at
/usr/src/sys/kern/kern_shutdown.c:595
#3 0xc01ffc82 in trap_fatal (frame=0xfe9fac2c, eva=0) at
/usr/src/sys/i386/i386/trap.c:974
#4 0xc01ff8d5 in trap_pfault (frame=0xfe9fac2c, usermode=0, eva=0) at
/usr/src/sys/i386/i386/trap.c:867
#5 0xc01ff41b in trap (frame={tf_fs = -752156648, tf_es = -1071316976,
tf_ds = 16, tf_edi = 0, tf_esi = -737259520, tf_ebp = -23090016,
tf_isp = -23090088, tf_ebx = 0, tf_edx = -1744879617, tf_ecx = 42,
tf_eax = 0, tf_trapno = 12, tf_err = 2, tf_eip = -1071651613, tf_cs = 8,
tf_eflags = 66050, tf_esp = -24734848, tf_ss = -1072191432}) at
/usr/src/sys/i386/i386/trap.c:466
#6 0xc01fe4e3 in generic_bzero ()
#7 0xc01b7bd0 in ffs_vget (mp=0xd34da200, ino=739027, vpp=0xfe9fad50)
at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1109
#8 0xc01bab2f in ufs_lookup (ap=0xfe9fada8) at
/usr/src/sys/ufs/ufs/ufs_lookup.c:611
#9 0xc01bf595 in ufs_vnoperate (ap=0xfe9fada8) at
/usr/src/sys/ufs/ufs/ufs_vnops.c:2376
#10 0xc017684a in vfs_cache_lookup (ap=0xfe9fae00) at vnode_if.h:77
#11 0xc01bf595 in ufs_vnoperate (ap=0xfe9fae00) at
/usr/src/sys/ufs/ufs/ufs_vnops.c:2376
#12 0xc0179921 in lookup (ndp=0xfe9fae7c) at vnode_if.h:52
#13 0xc017940c in namei (ndp=0xfe9fae7c) at /usr/src/sys/kern/vfs_lookup.c:153
#14 0xc017f93d in lstat (p=0xfe869380, uap=0xfe9faf80) at
/usr/src/sys/kern/vfs_syscalls.c:1824
#15 0xc01fffed in syscall2 (frame={tf_fs = 142082095, tf_es = 47,
tf_ds = -1078001617, tf_edi = 136761280, tf_esi = 142748160,
tf_ebp = -1077946144, tf_isp = -23089196, tf_ebx = 136761432,
tf_edx = -1077945820, tf_ecx = 142568712, tf_eax = 190,
tf_trapno = -1077945552, tf_err = 2, tf_eip = 674403276, tf_cs = 31,
tf_eflags = 646, tf_esp = -1077946524, tf_ss = 47}) at
/usr/src/sys/i386/i386/trap.c:1175
#16 0xc01ecf8b in Xint0x80_syscall ()
#17 0x80e336e in ?? ()
#18 0x8111cbb in ?? ()
#19 0x804e2dd in ?? ()
#20 0x804fbab in ?? ()
#21 0x804ed51 in ?? ()
#22 0x804fbab in ?? ()
#23 0x804ed51 in ?? ()
#24 0x804fbab in ?? ()
#25 0x804ed51 in ?? ()
#26 0x804fbab in ?? ()
#27 0x804ed51 in ?? ()
#28 0x8050930 in ?? ()
#29 0x807b819 in ?? ()
#30 0x806a029 in ?? ()
#31 0x804adfe in ?? ()
-------------------------------------------------------------------------------
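
kgdb prints the trap frame fields as signed decimals, which makes them hard to
compare with the addresses in the backtrace. A minimal sketch for converting
the values from frame #5, assuming the standard x86 page-fault error-code bits
(bit 0 = page present, bit 1 = write, bit 2 = user mode) and that trap number
12 is a page fault on i386. The helper names are made up for illustration.

```python
# Decode a few trap frame fields from frame #5 of the backtrace above.
T_PAGEFLT = 12  # assumed i386 trap number for a page fault

# Values copied from the kgdb output (signed 32-bit decimals).
frame = {"tf_eip": -1071651613, "tf_edi": 0, "tf_trapno": 12, "tf_err": 2}


def u32(value):
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF


def decode_page_fault_error(err):
    """Split an x86 page-fault error code into its three low bits."""
    return {
        "page_present": bool(err & 0x1),
        "write": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
    }


print("faulting eip:", hex(u32(frame["tf_eip"])))
print("is page fault:", frame["tf_trapno"] == T_PAGEFLT)
print("error bits:", decode_page_fault_error(frame["tf_err"]))
```

The converted tf_eip is 0xc01fe4e3, the same address as frame #6
(generic_bzero). Together with eva=0 and tf_edi = 0, one possible reading is a
kernel-mode write to address 0 from inside generic_bzero, called from ffs_vget.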

... and here is the relevant part from my all.log (5-13-04 (FRESH INSTALL) -
4:40:10):

#####

May 13 04:13:00 tycobb /usr/sbin/cron[3366]: (root) CMD
(/usr/local/sbin/tripwire --check | mail -s "Cron <
May 13 04:15:00 tycobb /usr/sbin/cron[3370]: (root) CMD (/usr/libexec/atrun)
May 13 04:29:56 tycobb syslogd: restart
May 13 04:29:56 tycobb /kernel: Checking for core dump:
May 13 04:29:56 tycobb /kernel: savecore: reboot after panic: page fault
May 13 04:29:56 tycobb savecore: reboot after panic: page fault
May 13 04:29:57 tycobb /kernel: savecore: system went down at Thu May 13 04:15:35 2004
May 13 04:29:57 tycobb /kernel: savecore: /var/crash/bounds: No such file or directory

#####


	
		


More information about the freebsd-questions mailing list