help to track server crash (LONG)

Alexandre Biancalana biancalana at gmail.com
Wed Aug 29 13:34:28 PDT 2007


I have a 6-STABLE server (sources from 2 weeks ago) that reboots at
random. I have replaced the memory, CPU, and motherboard, and the problem
persists. This machine is a firewall running pf, squid, and bind.

I have now recompiled the kernel with debugging options. The debug session,
dmesg output, and kernel config file follow.

I appreciate any help in tracking this down.


FW1:/sys/i386/compile/GW.debug # kgdb kernel.debug /var/crash/vmcore.0
kgdb: kvm_nlist(_stopped_cpus):
kgdb: kvm_nlist(_stoppcbs):
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so:
Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
KDB: enter: manual escape to debugger
panic: from debugger
Uptime: 31m48s
Dumping 502 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 503MB (128559 pages) 487 471 455 439 423 407 391 375 359 343 327
311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7

#0  doadump () at pcpu.h:165
165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc0533578 in boot (howto=260) at ../../../kern/kern_shutdown.c:409
#2  0xc0533823 in panic (fmt=0xc069754b "from debugger") at
../../../kern/kern_shutdown.c:565
#3  0xc047cd61 in db_panic (addr=-1068190189, have_addr=0, count=-1,
modif=0xd54cbab8 "") at ../../../ddb/db_command.c:438
#4  0xc047ccf8 in db_command (last_cmdp=0xc07081a4, cmd_table=0x0,
aux_cmd_tablep=0xc06cc0c4, aux_cmd_tablep_end=0xc06cc0c8)
    at ../../../ddb/db_command.c:350
#5  0xc047cdc0 in db_command_loop () at ../../../ddb/db_command.c:458
#6  0xc047e9bd in db_trap (type=3, code=0) at ../../../ddb/db_main.c:222
#7  0xc054b88f in kdb_trap (type=3, code=0, tf=0xd54cbbf8) at
../../../kern/subr_kdb.c:473
#8  0xc0675364 in trap (frame=
      {tf_fs = -716439544, tf_es = -1068236760, tf_ds = -1066794968, tf_edi
= -1065903680, tf_esi = -1065924320, tf_ebp = -716391368, tf_isp =
-716391388, tf_ebx = 0, tf_edx = 0, tf_ecx = -1056878592, tf_eax = 38,
tf_trapno = 3, tf_err = 0, tf_eip = -1068190189, tf_cs = 32, tf_eflags =
524930, tf_esp = -716391296, tf_ss = -1067080420}) at
../../../i386/i386/trap.c:594
#9  0xc06636aa in calltrap () at ../../../i386/i386/exception.s:139
#10 0xc054b613 in kdb_enter (msg=0x26 <Address 0x26 out of bounds>) at
cpufunc.h:60
#11 0xc065a51c in scgetc (sc=0xc07799c0, flags=2) at
../../../dev/syscons/syscons.c:3365
#12 0xc06561e8 in sckbdevent (thiskbd=0xc07618c0, event=0, arg=0xc07799c0)
at ../../../dev/syscons/syscons.c:659
#13 0xc0641cd9 in atkbd_intr (kbd=0xc07618c0, arg=0x0) at
../../../dev/atkbdc/atkbd.c:503
#14 0xc0642df6 in atkbdintr (arg=0xc1015000) at
../../../dev/atkbdc/atkbd_atkbdc.c:174
#15 0xc05201ae in ithread_execute_handlers (p=0xc3391430, ie=0xc322cc00) at
../../../kern/kern_intr.c:682
#16 0xc05202de in ithread_loop (arg=0xc338c340) at
../../../kern/kern_intr.c:765
#17 0xc051f41c in fork_exit (callout=0xc0520278 <ithread_loop>,
arg=0xc338c340, frame=0xd54cbd38) at ../../../kern/kern_fork.c:830
#18 0xc066370c in fork_trampoline () at ../../../i386/i386/exception.s:208
(kgdb)
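
kgdb prints the trap frame fields in frame #8 as signed 32-bit integers,
which makes them hard to compare with the hex addresses in the rest of the
backtrace. The following is a minimal sketch in Python, assuming 32-bit
i386 register values; the helper name to_u32_hex is made up for this
illustration and is not part of kgdb. It converts the signed values back
to unsigned hex.

```python
# Sketch: reinterpret the signed 32-bit values kgdb printed for the
# trap frame as unsigned hex addresses. Assumes i386 (32-bit registers).

def to_u32_hex(value: int) -> str:
    """Return a signed 32-bit integer as an 8-digit unsigned hex string."""
    return f"0x{value & 0xFFFFFFFF:08x}"

# Values copied from frame #8 of the backtrace above.
frame = {
    "tf_eip": -1068190189,
    "tf_ebp": -716391368,
    "tf_esp": -716391296,
}

decoded = {name: to_u32_hex(value) for name, value in frame.items()}

for name, addr in decoded.items():
    print(f"{name} = {addr}")
```

With these inputs, tf_eip decodes to 0xc054b613, the same address shown
for kdb_enter in frame #10.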


Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-STABLE #0: Wed Aug 29 09:40:39 BRT 2007
    root at FW1:/usr/src/sys/i386/compile/GW.debug
WARNING: WITNESS option enabled, expect reduced performance.
WARNING: DIAGNOSTIC option enabled, expect reduced performance.
ACPI APIC Table: <INTEL  D915GAG >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.20GHz (3200.00-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf43  Stepping = 3

Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x649d<SSE3,RSVD2,MON,DS_CPL,EST,CNTX-ID,CX16,<b14>>
  AMD Features=0x20100000<NX,LM>
  Logical CPUs per core: 2
real memory  = 527626240 (503 MB)
avail memory = 506597376 (483 MB)
ioapic0 <Version 2.0> irqs 0-23 on motherboard
acpi0: <INTEL D915GAG> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_perf0: <ACPI CPU Frequency Control> on cpu0
acpi_perf0: failed in PERF_STATUS attach
device_attach: acpi_perf0 attach returned 6
acpi_perf0: <ACPI CPU Frequency Control> on cpu0
acpi_perf0: failed in PERF_STATUS attach
device_attach: acpi_perf0 attach returned 6
acpi_throttle0: <ACPI CPU Throttling> on cpu0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pci0: <display, VGA> at device 2.0 (no driver attached)
pcib2: <ACPI PCI-PCI bridge> at device 28.0 on pci0
pci5: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 28.1 on pci0
pci4: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 28.2 on pci0
pci3: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 28.3 on pci0
pci2: <ACPI PCI bus> on pcib5
pci0: <serial bus, USB> at device 29.0 (no driver attached)
pci0: <serial bus, USB> at device 29.1 (no driver attached)
pci0: <serial bus, USB> at device 29.2 (no driver attached)
pci0: <serial bus, USB> at device 29.3 (no driver attached)
pci0: <serial bus, USB> at device 29.7 (no driver attached)
pcib6: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci6: <ACPI PCI bus> on pcib6
rl0: <RealTek 8139 10/100BaseTX> port 0xb800-0xb8ff mem
0xff511000-0xff5110ff irq 21 at device 0.0 on pci6
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl0: Ethernet address: 00:50:fc:61:6e:e7
pcib7: <PCI-PCI bridge> at device 1.0 on pci6
pci7: <PCI bus> on pcib7
ste0: <D-Link DL10050 10/100BaseTX> port 0xac00-0xac7f irq 22 at device 4.0 on pci7
miibus1: <MII bus> on ste0
ukphy0: <Generic IEEE 802.3u media interface> on miibus1
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ste0: Ethernet address: 00:0d:88:68:92:7c
ste1: <D-Link DL10050 10/100BaseTX> port 0xa800-0xa87f irq 21 at device 5.0 on pci7
miibus2: <MII bus> on ste1
ukphy1: <Generic IEEE 802.3u media interface> on miibus2
ukphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ste1: Ethernet address: 00:0d:88:68:92:7d
ste2: <D-Link DL10050 10/100BaseTX> port 0xa400-0xa47f irq 20 at device 6.0 on pci7
miibus3: <MII bus> on ste2
ukphy2: <Generic IEEE 802.3u media interface> on miibus3
ukphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ste2: Ethernet address: 00:0d:88:68:92:7e
ste3: <D-Link DL10050 10/100BaseTX> port 0xa000-0xa07f irq 23 at device 7.0 on pci7
miibus4: <MII bus> on ste3
ukphy3: <Generic IEEE 802.3u media interface> on miibus4
ukphy3:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ste3: Ethernet address: 00:0d:88:68:92:7f
fxp0: <Intel 82562EZ (ICH6)> port 0xbc00-0xbc3f mem 0xff510000-0xff510fff
irq 20 at device 8.0 on pci6
miibus5: <MII bus> on fxp0
inphy0: <i82562ET 10/100 media interface> on miibus5
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:16:76:24:23:25
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH6 UDMA100 controller> port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
atapci1: <Intel ICH6 SATA150 controller> port
0xe800-0xe807,0xe400-0xe403,0xe000-0xe007,0xdc00-0xdc03,0xd800-0xd80f irq 19
at device 31.2 on pci0
ata2: <ATA channel 0> on atapci1
ata3: <ATA channel 1> on atapci1
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 3199999632 Hz quality 800
Timecounters tick every 1.000 msec
ad4: 76319MB <Seagate ST3808110AS 3.AAH> at ata2-master SATA150
GEOM_MIRROR: Device gm0 created (id=2284208825).
GEOM_MIRROR: Device gm0: provider ad4 detected.
ad6: 76319MB <Seagate ST380817AS 3.42> at ata3-master SATA150
GEOM_MIRROR: Device gm0: provider ad6 detected.
GEOM_MIRROR: Device gm0: provider ad6 activated.
GEOM_MIRROR: Device gm0: provider mirror/gm0 launched.
GEOM_MIRROR: Device gm0: rebuilding provider ad4.
Trying to mount root from ufs:/dev/mirror/gm0s1a
WARNING: / was not properly dismounted
WARNING: /tmp was not properly dismounted
WARNING: /usr was not properly dismounted
/usr: mount pending error: blocks 4 files 1
WARNING: /var was not properly dismounted
Expensive timeout(9) function: 0xc05f38a8(0xc3312000) 0.002864889 s
lock order reversal:
 1st 0xc0759dac tcp (tcp) @ netinet/tcp_input.c:625
 2nd 0xc0704840 pf task mtx (pf task mtx) @ contrib/pf/net/pf.c:6386
KDB: stack backtrace:
kdb_backtrace(0,ffffffff,c071b3d0,c0719760,c06e35e4,...) at
kdb_backtrace+0x29
witness_checkorder(c0704840,9,c0696032,18f2) at witness_checkorder+0x578
_mtx_lock_flags(c0704840,0,c0696029,18f2,c0704840,2,c0696029,18f2) at
_mtx_lock_flags+0x78
pf_test(2,c331b400,d4038b00,0,0) at pf_test+0x86
pf_check_out(0,d4038b00,c331b400,2,0) at pf_check_out+0x4f
pfil_run_hooks(c0759940,d4038b74,c331b400,2,0,...) at pfil_run_hooks+0xc9
ip_output(c3617900,0,d4038b40,0,0,...) at ip_output+0x682
tcp_respond(0,c358e010,c358e024,c3617900,0,386b91fb,4) at tcp_respond+0x2ae
tcp_input(c3617900,14,13e,39a06c9,0,...) at tcp_input+0x2a08
ip_input(c3617900) at ip_input+0x5ee
netisr_processqueue(c0758ed8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c322ea78,c327a280) at ithread_execute_handlers+0xe6
ithread_loop(c3217690,d4038d38,c3217690,c0520278,0,...) at ithread_loop+0x66
fork_exit(c0520278,c3217690,d4038d38) at fork_exit+0xa0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xd4038d6c, ebp = 0 ---
lock order reversal:
 1st 0xc0704840 pf task mtx (pf task mtx) @ contrib/pf/net/pf.c:6386
 2nd 0xc075a4ac udp (udp) @ contrib/pf/net/pf.c:2744
KDB: stack backtrace:
kdb_backtrace(0,ffffffff,c0719760,c071b420,c06e35e4,...) at
kdb_backtrace+0x29
witness_checkorder(c075a4ac,9,c0696032,ab8) at witness_checkorder+0x578
_mtx_lock_flags(c075a4ac,0,c0696029,ab8,c3640028,...) at
_mtx_lock_flags+0x78
pf_socket_lookup(d4038b08,d4038b0c,1,d4038bc8,0,...) at
pf_socket_lookup+0x130
pf_test_udp(d4038b78,d4038b70,1,c34fbc00,c35d4900,...) at pf_test_udp+0x4fd
pf_test(1,c331b400,d4038c64,0,0) at pf_test+0x664
pf_check_in(0,d4038c64,c331b400,1,0) at pf_check_in+0x37
pfil_run_hooks(c0759940,d4038cb4,c331b400,1,0) at pfil_run_hooks+0xc9
ip_input(c35d4900) at ip_input+0x251
netisr_processqueue(c0758ed8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c322ea78,c327a280) at ithread_execute_handlers+0xe6
ithread_loop(c3217690,d4038d38,c3217690,c0520278,0,...) at ithread_loop+0x66
fork_exit(c0520278,c3217690,d4038d38) at fork_exit+0xa0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xd4038d6c, ebp = 0 ---
lock order reversal:
 1st 0xc375b360 inp (tcpinp) @ netinet/tcp_usrreq.c:368
 2nd 0xc0704840 pf task mtx (pf task mtx) @ contrib/pf/net/pf.c:6386
KDB: stack backtrace:
kdb_backtrace(0,ffffffff,c071b3a8,c0719760,c06e35e4,...) at
kdb_backtrace+0x29
witness_checkorder(c0704840,9,c0696032,18f2) at witness_checkorder+0x578
_mtx_lock_flags(c0704840,0,c0696029,18f2,c0704840,2,c0696029,18f2) at
_mtx_lock_flags+0x78
pf_test(2,c33a7400,d54c2b48,0,c375b2d0) at pf_test+0x86
pf_check_out(0,d54c2b48,c33a7400,2,c375b2d0) at pf_check_out+0x4f
pfil_run_hooks(c0759940,d54c2bbc,c33a7400,2,c375b2d0,...) at
pfil_run_hooks+0xc9
ip_output(c35d1900,0,d54c2b88,0,0,c375b2d0) at ip_output+0x682
tcp_output(c375c740) at tcp_output+0xe01
tcp_usr_connect(c358742c,c34e0220,c3392780) at tcp_usr_connect+0xe3
soconnect(c358742c,c34e0220,c3392780) at soconnect+0x4e
kern_connect(c3392780,6,c34e0220,c34e0220,0,...) at kern_connect+0x74
connect(c3392780,d54c2d04) at connect+0x2f
syscall(3b,3b,3b,bfbf9e7c,0,...) at syscall+0x25b
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (98, FreeBSD ELF32, connect), eip = 0x282de573, esp =
0xbfbf9e2c, ebp = 0xbfbfa068 ---
Expensive timeout(9) function: 0xc05bd5a0(0) 0.022250034 s
Expensive timeout(9) function: 0xc0657ea0(0xc07799c0) 0.149071646 s
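
WITNESS reports three lock order reversals in the dmesg output above, and
each one involves "pf task mtx". The following is a rough sketch in Python
for pulling the lock pairs out of a saved dmesg. The regular expression is
written against the line format shown above only, and the names LOR_LINE,
parse_lors and SAMPLE are made up for this illustration.

```python
import re

# Matches lines such as:
#  1st 0xc0759dac tcp (tcp) @ netinet/tcp_input.c:625
LOR_LINE = re.compile(
    r"^\s*(1st|2nd)\s+0x[0-9a-f]+\s+(.+?)\s+\((.+?)\)\s+@\s+(\S+):(\d+)\s*$"
)

def parse_lors(text):
    """Return a list of ((name, location), (name, location)) lock pairs."""
    pairs = []
    first = None
    for line in text.splitlines():
        match = LOR_LINE.match(line)
        if not match:
            continue
        order, name, _lock_type, path, lineno = match.groups()
        entry = (name, f"{path}:{lineno}")
        if order == "1st":
            first = entry
        elif first is not None:
            pairs.append((first, entry))
            first = None
    return pairs

# The three reversals copied from the dmesg output above.
SAMPLE = """\
lock order reversal:
 1st 0xc0759dac tcp (tcp) @ netinet/tcp_input.c:625
 2nd 0xc0704840 pf task mtx (pf task mtx) @ contrib/pf/net/pf.c:6386
lock order reversal:
 1st 0xc0704840 pf task mtx (pf task mtx) @ contrib/pf/net/pf.c:6386
 2nd 0xc075a4ac udp (udp) @ contrib/pf/net/pf.c:2744
lock order reversal:
 1st 0xc375b360 inp (tcpinp) @ netinet/tcp_usrreq.c:368
 2nd 0xc0704840 pf task mtx (pf task mtx) @ contrib/pf/net/pf.c:6386
"""

for first, second in parse_lors(SAMPLE):
    print(f"{first[0]} ({first[1]}) -> {second[0]} ({second[1]})")
```

This only summarizes what WITNESS already printed. It does not show whether
any of these reversals is related to the reboots.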



machine         i386
cpu             I686_CPU
ident           GW

# To statically compile in device wiring instead of /boot/device.hints
#hints          "GENERIC.hints"         # Default places to look for devices.

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols
options         KDB
options         DDB
#options                BREAK_TO_DEBUGGER
options         INVARIANTS
options         INVARIANT_SUPPORT
options         WITNESS
options         DEBUG_LOCKS
options         DEBUG_VFS_LOCKS
options         DIAGNOSTIC

#options        SCHED_ULE               # ULE scheduler
options         SCHED_4BSD              # 4BSD scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES             # Enable FFS soft updates support
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         MD_ROOT                 # MD is a potential root device
options         NFSCLIENT               # Network Filesystem Client
options         NFSSERVER               # Network Filesystem Server
options         NFS_ROOT                # NFS usable as /, requires NFSCLIENT
options         MSDOSFS                 # MSDOS Filesystem
options         CD9660                  # ISO 9660 Filesystem
options         PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         GEOM_GPT                # GUID Partition Tables.
options         COMPAT_43               # Compatible with BSD 4.3 [KEEP THIS!]
#options        COMPAT_FREEBSD4         # Compatible with FreeBSD4
#options        COMPAT_FREEBSD5         # Compatible with FreeBSD5
options         SCSI_DELAY=5000         # Delay (in ms) before probing SCSI
options         KTRACE                  # ktrace(1) support
options         SYSVSHM                 # SYSV-style shared memory
options         SYSVMSG                 # SYSV-style message queues
options         SYSVSEM                 # SYSV-style semaphores
options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         AHC_REG_PRETTY_PRINT    # Print register bitfields in debug
                                        # output.  Adds ~128k to driver.
options         AHD_REG_PRETTY_PRINT    # Print register bitfields in debug
                                        # output.  Adds ~215k to driver.
options         ADAPTIVE_GIANT          # Giant mutex is adaptive.

device          apic                    # I/O APIC

# Bus support.
device          eisa
device          pci

# Floppy drives
device          fdc

# ATA and ATAPI devices
device          ata
device          atadisk         # ATA disk drives
device          ataraid         # ATA RAID drives
device          atapicd         # ATAPI CDROM drives
device          atapifd         # ATAPI floppy drives
device          atapist         # ATAPI tape drives
options         ATA_STATIC_ID   # Static device numbering

# SCSI Controllers
device          ahc             # AHA2940 and onboard AIC7xxx devices

# SCSI peripherals
device          scbus           # SCSI bus (required for SCSI)
device          ch              # SCSI media changers
device          da              # Direct Access (disks)
device          sa              # Sequential Access (tape etc)
device          cd              # CD
device          pass            # Passthrough device (direct SCSI access)
device          ses             # SCSI Environmental Services (and SAF-TE)

# atkbdc0 controls both the keyboard and the PS/2 mouse
device          atkbdc          # AT keyboard controller
device          atkbd           # AT keyboard
device          psm             # PS/2 mouse

device          vga             # VGA video card driver

device          splash          # Splash screen and screen saver support

# syscons is the default console driver, resembling an SCO console
device          sc

# Enable this for the pcvt (VT220 compatible) console driver
#device         vt
#options        XSERVER         # support for X server on a vt console
#options        FAT_CURSOR      # start with block cursor

#device         agp             # support several AGP chipsets

# Power management support (see NOTES for more options)
#device         apm
# Add suspend/resume support for the i8254.
device          pmtimer

# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device          miibus          # MII bus support
device          fxp             # Intel EtherExpress PRO/100B (82557, 82558)
device          ste             # Sundance ST201 (D-Link DFE-550TX)
device          em              # Intel PRO/1000 adapter Gigabit Ethernet Card
device          rl              # RealTek 8129/8139
device          xl              # 3Com 3c90x (``Boomerang'', ``Cyclone'')

# Pseudo devices.
device          loop            # Network loopback
device          random          # Entropy device
device          ether           # Ethernet support
device          pty             # Pseudo-ttys (telnet etc)

# The `bpf' device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
# Note that 'bpf' is required for DHCP.
device          bpf             # Berkeley packet filter

device          pf
device          pflog
device          pfsync
device          carp

options         ALTQ
options         ALTQ_CBQ
options         ALTQ_RED
options         ALTQ_RIO
options         ALTQ_HFSC
options         ALTQ_CDNR
options         ALTQ_PRIQ


More information about the freebsd-stable mailing list