5.2.1-p9 kernel hang (unknown reason, but more info)

Dave Stephens hsoftdev17 at hotmail.com
Tue Aug 10 20:15:28 PDT 2004


I know this may seem a silly suggestion after all the info you've dug up, 
but it almost sounds like a bad power supply.  I've seen virtually identical 
problems with my old dual P2/333 box, and the cause turned out to be the AT 
motherboard power connectors from the power supply, which were damaged and 
making a flaky connection.

http://life.trollserver.com/image/dsc00332.jpg - very burnt AT connector
http://life.trollserver.com/image/dsc00333.jpg - visible scarring on the 
motherboard itself

The electrical engineer explanation is:  the power connector was making a 
poor connection on one or more of the 5V lines, so the others carried the 
load.  Burning caused by the excess current made more 5V lines fail, and 
the load on the remaining ones rose higher still...  Cue the cascade effect 
to utter failure (and nearly a fire).  After I had the power connectors 
replaced on the power supply and polished the motherboard pins, it worked 
well for a while and then started acting like your descriptions.  The 
eventual solution (keep in mind this was an old AT board, and this 
solution is not recommended for the hardware you/the data center have) was 
to cut the power connectors off completely and solder the wires directly to 
the motherboard.  No more flaky power issues.  :)
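
To put rough numbers on that cascade, here is a minimal sketch.  It assumes 
the 5V load is shared evenly across whichever pins still make good contact. 
TOTAL_AMPS and PINS are made-up illustrative values, not measurements from 
my board or yours.  Since the heat dissipated in a contact goes as I^2 * R, 
each pin that drops out makes the survivors run disproportionately hotter.

```python
def per_pin_current(total_amps, good_pins):
    """Current through each remaining pin, assuming the load is shared evenly."""
    if good_pins <= 0:
        raise ValueError("no working pins left")
    return total_amps / good_pins


TOTAL_AMPS = 12.0  # hypothetical total draw on the 5V rail
PINS = 4           # hypothetical number of 5V pins on the connector

for good in range(PINS, 0, -1):
    amps = per_pin_current(TOTAL_AMPS, good)
    # Relative heating compared with all pins healthy (P = I^2 * R, same R).
    heat = (amps / per_pin_current(TOTAL_AMPS, PINS)) ** 2
    print(f"{good} good pin(s): {amps:.1f} A each, {heat:.1f}x heat per pin")
```

With these assumed values the per-pin current goes 3 A, 4 A, 6 A, 12 A as 
pins fail, which is why the failure accelerates instead of levelling off.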

dave

----Original Message Follows----
(note: I'm not subscribed to the list with this email, but I couldn't send 
this large mail by my webmail)

Hey list,


Recently my server started failing daily; I thought it was because of high 
loads. Now it even goes down after a few hours when completely idle.

I *still* don't have the panic message (lazy datacenter lol) but I booted 
the server with verbose enabled and got the list of hardware. I also removed 
any optimizations in sysctl stuff I had added (maxfiles, maxprocs, ipc 
stuff, etc, all defaults now)

The server ran fine for the first 20 days or so, then it started dying on a 
daily basis. Unfortunately I still don't have the panic message. But if 
someone can give some hints based on the hardware list, dmesg and mptable, 
it would help. I've been shooting in the dark so far.


Here is the dmesg (long):

Frequency 1193182 Hz quality 0
Calibrating TSC clock ... TSC clock: 2800121436 Hz
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2800.12-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf25  Stepping = 5
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Hyperthreading: 2 logical CPUs
real memory  = 4088397824 (3899 MB)
Physical memory chunk(s):
0x0000000000001000 - 0x000000000009efff, 647168 bytes (158 pages)
0x0000000000100000 - 0x00000000003fffff, 3145728 bytes (768 pages)
0x0000000000c29000 - 0x00000000ef64ffff, 4003622912 bytes (977447 pages)
avail memory = 3973623808 (3789 MB)
APIC ID: physical 0, logical 0:0
APIC ID: physical 1, logical 0:1
APIC ID: physical 6, logical 0:2
APIC ID: physical 7, logical 0:3
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
cpu0 (BSP): APIC ID:  0
cpu1 (AP): APIC ID:  1
cpu2 (AP): APIC ID:  6
cpu3 (AP): APIC ID:  7
bios32: Found BIOS32 Service Directory header at 0xc00fdb70
bios32: Entry = 0xfdb80 (c00fdb80)  Rev = 0  Len = 1
pcibios: PCI BIOS entry at 0xf0000+0xdba1
pnpbios: Found PnP BIOS data at 0xc00f4420
pnpbios: Entry = f0000:32c4  Rev = 1.0
Other BIOS signatures found:
ioapic0: Assuming intbase of 0
ioapic0: intpin 0 -> ExtINT (edge, activehi)
ioapic0: intpin 1 -> irq 1 (edge, activehi)
ioapic0: intpin 2 -> irq 2 (edge, activehi)
ioapic0: intpin 3 -> irq 3 (edge, activehi)
ioapic0: intpin 4 -> irq 4 (edge, activehi)
ioapic0: intpin 5 -> irq 5 (edge, activehi)
ioapic0: intpin 6 -> irq 6 (edge, activehi)
ioapic0: intpin 7 -> irq 7 (edge, activehi)
ioapic0: intpin 8 -> irq 8 (edge, activehi)
ioapic0: intpin 9 -> irq 9 (edge, activehi)
ioapic0: intpin 10 -> irq 10 (edge, activehi)
ioapic0: intpin 11 -> irq 11 (edge, activehi)
ioapic0: intpin 12 -> irq 12 (edge, activehi)
ioapic0: intpin 13 -> irq 13 (edge, activehi)
ioapic0: intpin 14 -> irq 14 (edge, activehi)
ioapic0: intpin 15 -> irq 15 (edge, activehi)
ioapic1: Assuming intbase of 16
ioapic1: intpin 0 -> irq 16 (level, activelo)
ioapic1: intpin 1 -> irq 17 (level, activelo)
ioapic1: intpin 2 -> irq 18 (level, activelo)
ioapic1: intpin 3 -> irq 19 (level, activelo)
ioapic1: intpin 4 -> irq 20 (level, activelo)
ioapic1: intpin 5 -> irq 21 (level, activelo)
ioapic1: intpin 6 -> irq 22 (level, activelo)
ioapic1: intpin 7 -> irq 23 (level, activelo)
ioapic1: intpin 8 -> irq 24 (level, activelo)
ioapic1: intpin 9 -> irq 25 (level, activelo)
ioapic1: intpin 10 -> irq 26 (level, activelo)
ioapic1: intpin 11 -> irq 27 (level, activelo)
ioapic1: intpin 12 -> irq 28 (level, activelo)
ioapic1: intpin 13 -> irq 29 (level, activelo)
ioapic1: intpin 14 -> irq 30 (level, activelo)
ioapic1: intpin 15 -> irq 31 (level, activelo)
ioapic2: Assuming intbase of 32
ioapic2: intpin 0 -> irq 32 (level, activelo)
ioapic2: intpin 1 -> irq 33 (level, activelo)
ioapic2: intpin 2 -> irq 34 (level, activelo)
ioapic2: intpin 3 -> irq 35 (level, activelo)
ioapic2: intpin 4 -> irq 36 (level, activelo)
ioapic2: intpin 5 -> irq 37 (level, activelo)
ioapic2: intpin 6 -> irq 38 (level, activelo)
ioapic2: intpin 7 -> irq 39 (level, activelo)
ioapic2: intpin 8 -> irq 40 (level, activelo)
ioapic2: intpin 9 -> irq 41 (level, activelo)
ioapic2: intpin 10 -> irq 42 (level, activelo)
ioapic2: intpin 11 -> irq 43 (level, activelo)
ioapic2: intpin 12 -> irq 44 (level, activelo)
ioapic2: intpin 13 -> irq 45 (level, activelo)
ioapic2: intpin 14 -> irq 46 (level, activelo)
ioapic2: intpin 15 -> irq 47 (level, activelo)
ioapic3: Assuming intbase of 48
ioapic3: intpin 0 -> irq 48 (level, activelo)
ioapic3: intpin 1 -> irq 49 (level, activelo)
ioapic3: intpin 2 -> irq 50 (level, activelo)
ioapic3: intpin 3 -> irq 51 (level, activelo)
ioapic3: intpin 4 -> irq 52 (level, activelo)
ioapic3: intpin 5 -> irq 53 (level, activelo)
ioapic3: intpin 6 -> irq 54 (level, activelo)
ioapic3: intpin 7 -> irq 55 (level, activelo)
ioapic3: intpin 8 -> irq 56 (level, activelo)
ioapic3: intpin 9 -> irq 57 (level, activelo)
ioapic3: intpin 10 -> irq 58 (level, activelo)
ioapic3: intpin 11 -> irq 59 (level, activelo)
ioapic3: intpin 12 -> irq 60 (level, activelo)
ioapic3: intpin 13 -> irq 61 (level, activelo)
ioapic3: intpin 14 -> irq 62 (level, activelo)
ioapic3: intpin 15 -> irq 63 (level, activelo)
ioapic1: intpin 13 trigger: level
ioapic1: intpin 13 polarity: active-lo
ioapic1: intpin 1 trigger: level
ioapic1: intpin 1 polarity: active-lo
ioapic1: intpin 10 trigger: level
ioapic1: intpin 10 polarity: active-lo
ioapic1: intpin 12 trigger: level
ioapic1: intpin 12 polarity: active-lo
ioapic0: intpin 1 trigger: edge
ioapic0: intpin 1 polarity: active-hi
ioapic0: Routing IRQ 0 -> intpin 2
ioapic0: intpin 2 trigger: edge
ioapic0: intpin 2 polarity: active-hi
ioapic0: intpin 3 trigger: edge
ioapic0: intpin 3 polarity: active-hi
ioapic0: intpin 4 trigger: edge
ioapic0: intpin 4 polarity: active-hi
ioapic0: intpin 5 trigger: edge
ioapic0: intpin 5 polarity: active-hi
ioapic0: intpin 6 trigger: edge
ioapic0: intpin 6 polarity: active-hi
ioapic0: intpin 7 trigger: edge
ioapic0: intpin 7 polarity: active-hi
ioapic0: intpin 8 trigger: edge
ioapic0: intpin 8 polarity: active-hi
ioapic0: intpin 12 trigger: edge
ioapic0: intpin 12 polarity: active-hi
ioapic0: intpin 13 trigger: edge
ioapic0: intpin 13 polarity: active-hi
ioapic0: intpin 14 trigger: edge
ioapic0: intpin 14 polarity: active-hi
ioapic0: intpin 15 trigger: edge
ioapic0: intpin 15 polarity: active-hi
lapic: Routing ExtINT -> LINT0
lapic: LINT0 trigger: edge
lapic: LINT0 polarity: active-hi
lapic: Routing NMI -> LINT1
lapic: LINT1 trigger: edge
lapic: LINT1 polarity: active-hi
ioapic0 <Version 1.1> irqs 0-15 on motherboard
ioapic1 <Version 1.1> irqs 16-31 on motherboard
ioapic2 <Version 1.1> irqs 32-47 on motherboard
ioapic3 <Version 1.1> irqs 48-63 on motherboard
cpu0 BSP:
     ID: 0x00000000   VER: 0x00050014 LDR: 0x01000000 DFR: 0x0fffffff
  lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
wlan: <802.11 Link Layer>
null: <null device, zero device>
random: <entropy source>
mem: <memory & I/O>
Pentium Pro MTRR support enabled
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
pci_open(1):mode 1 addr port (0x0cf8) is 0x800078ac
pci_open(1a):mode1res=0x80000000 (0x80000000)
pci_cfgcheck:device 0 [class=060000] [hdr=80] is there (id=00171166)
pcibios: BIOS version 2.10
Using $PIR table, 12 entries at 0xc00f4a20
PCI-Only Interrupts: none
Location  Bus Device Pin  Link  IRQs
embedded    0   15    A   0x01  10
slot 1      0    5    A   0x14  3 4 5 7 9 10 11 12 14 15
slot 1      0    5    B   0x15  3 4 5 7 9 10 11 12 14 15
slot 1      0    5    C   0x10  3 4 5 7 9 10 11 12 14 15
slot 1      0    5    D   0x11  3 4 5 7 9 10 11 12 14 15
slot 2      0    4    A   0x12  3 4 5 7 9 10 11 12 14 15
slot 2      0    4    B   0x13  3 4 5 7 9 10 11 12 14 15
slot 2      0    4    C   0x18  3 4 5 7 9 10 11 12 14 15
slot 2      0    4    D   0x19  3 4 5 7 9 10 11 12 14 15
slot 3      0    3    A   0x10  3 4 5 7 9 10 11 12 14 15
slot 3      0    3    B   0x11  3 4 5 7 9 10 11 12 14 15
slot 3      0    3    C   0x1e  3 4 5 7 9 10 11 12 14 15
slot 3      0    3    D   0x1f  3 4 5 7 9 10 11 12 14 15
slot 4      0    6    A   0x16  3 4 5 7 9 10 11 12 14 15
slot 4      0    6    B   0x17  3 4 5 7 9 10 11 12 14 15
slot 4      0    6    C   0x12  3 4 5 7 9 10 11 12 14 15
slot 4      0    6    D   0x13  3 4 5 7 9 10 11 12 14 15
slot 5      0    7    A   0x18  3 4 5 7 9 10 11 12 14 15
slot 5      0    7    B   0x19  3 4 5 7 9 10 11 12 14 15
slot 5      0    7    C   0x14  3 4 5 7 9 10 11 12 14 15
slot 5      0    7    D   0x15  3 4 5 7 9 10 11 12 14 15
slot 5      0    2    A   0x12  3 4 5 7 9 10 11 12 14 15
slot 5      0    2    B   0x13  3 4 5 7 9 10 11 12 14 15
slot 5      0    2    C   0x12  3 4 5 7 9 10 11 12 14 15
slot 5      0    2    D   0x13  3 4 5 7 9 10 11 12 14 15
embedded    0    8    A   0x1c  3 4 5 7 9 10 11 12 14 15
embedded    0    9    A   0x1a  3 4 5 7 9 10 11 12 14 15
embedded    0   10    A   0x1e  3 4 5 7 9 10 11 12 14 15
embedded    0   10    B   0x1f  3 4 5 7 9 10 11 12 14 15
embedded    0   11    A   0x1d  3 4 5 7 9 10 11 12 14 15
embedded    0    0    C   0x62  3 4 5 7 9 10 11 12 14 15
embedded    0    0    D   0x62  3 4 5 7 9 10 11 12 14 15
pcib0: <MPTable Host-PCI bridge> at pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
pci0: physical bus=0
found->vendor=0x1166, dev=0x0017, revid=0x32
bus=0, slot=0, func=0
class=06-00-00, hdrtype=0x00, mfdev=1
cmdreg=0x0000, statreg=0x0000, cachelnsz=16 (dwords)
lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
found->vendor=0x1166, dev=0x0017, revid=0x00
bus=0, slot=0, func=1
class=06-00-00, hdrtype=0x00, mfdev=1
cmdreg=0x0000, statreg=0x0000, cachelnsz=16 (dwords)
lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
map[10]: type 1, range 32, base feb40000, size 17, enabled
map[18]: type 4, range 32, base 0000e000, size  6, enabled
pcib0: slot 8 INTA routed to irq 28
found->vendor=0x8086, dev=0x100e, revid=0x02
bus=0, slot=8, func=0
class=02-00-00, hdrtype=0x00, mfdev=0
cmdreg=0x0117, statreg=0x0230, cachelnsz=16 (dwords)
lattimer=0x40 (1920 ns), mingnt=0xff (63750 ns), maxlat=0x00 (0 ns)
intpin=a, irq=28
powerspec 2  supports D0 D3  current D0
MSI supports 1 message, 64 bit
map[10]: type 1, range 32, base feb80000, size 17, enabled
map[18]: type 4, range 32, base 0000e400, size  6, enabled
pcib0: slot 9 INTA routed to irq 26
found->vendor=0x8086, dev=0x100e, revid=0x02
bus=0, slot=9, func=0
class=02-00-00, hdrtype=0x00, mfdev=0
cmdreg=0x0117, statreg=0x0230, cachelnsz=16 (dwords)
lattimer=0x40 (1920 ns), mingnt=0xff (63750 ns), maxlat=0x00 (0 ns)
intpin=a, irq=26
powerspec 2  supports D0 D3  current D0
MSI supports 1 message, 64 bit
map[10]: type 1, range 32, base fd000000, size 24, enabled
map[14]: type 4, range 32, base 0000e800, size  8, enabled
map[18]: type 1, range 32, base febff000, size 12, enabled
pcib0: slot 11 INTA routed to irq 29
found->vendor=0x1002, dev=0x4752, revid=0x27
bus=0, slot=11, func=0
class=03-00-00, hdrtype=0x00, mfdev=0
cmdreg=0x0087, statreg=0x0290, cachelnsz=16 (dwords)
lattimer=0x40 (1920 ns), mingnt=0x08 (2000 ns), maxlat=0x00 (0 ns)
intpin=a, irq=29
powerspec 2  supports D0 D1 D2 D3  current D0
found->vendor=0x1166, dev=0x0203, revid=0xa0
bus=0, slot=15, func=0
class=06-00-00, hdrtype=0x00, mfdev=1
cmdreg=0x0107, statreg=0x2200, cachelnsz=0 (dwords)
lattimer=0x40 (1920 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
map[10]: type 4, range 32, base 000001f0, size  3, enabled
map[14]: type 4, range 32, base 000003f4, size  2, enabled
map[18]: type 4, range 32, base 00000170, size  3, enabled
map[1c]: type 4, range 32, base 00000374, size  2, enabled
map[20]: type 4, range 32, base 0000ffa0, size  4, enabled
found->vendor=0x1166, dev=0x0213, revid=0xa0
bus=0, slot=15, func=1
class=01-01-8a, hdrtype=0x00, mfdev=1
cmdreg=0x0015, statreg=0x0200, cachelnsz=8 (dwords)
lattimer=0x40 (1920 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
map[10]: type 1, range 32, base febfd000, size 12, enabled
pcib0: slot 15 INTA routed to irq 17
found->vendor=0x1166, dev=0x0221, revid=0x05
bus=0, slot=15, func=2
class=0c-03-10, hdrtype=0x00, mfdev=1
cmdreg=0x0117, statreg=0x0280, cachelnsz=0 (dwords)
lattimer=0x40 (1920 ns), mingnt=0x00 (0 ns), maxlat=0x50 (20000 ns)
intpin=a, irq=17
found->vendor=0x1166, dev=0x0227, revid=0x00
bus=0, slot=15, func=3
class=06-01-00, hdrtype=0x00, mfdev=1
cmdreg=0x0107, statreg=0x0200, cachelnsz=0 (dwords)
lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.19> port 
0xe000-0xe03f mem 0xfeb40000-0xfeb5ffff irq 28 at device 8.0 on pci0
em0: bpf attached
em0:  Speed:N/A  Duplex:N/A
em1: <Intel(R) PRO/1000 Network Connection, Version - 1.7.19> port 
0xe400-0xe43f mem 0xfeb80000-0xfeb9ffff irq 26 at device 9.0 on pci0
em1: bpf attached
em1:  Speed:N/A  Duplex:N/A
pci0: <display, VGA> at device 11.0 (no driver attached)
atapci0: <ServerWorks CSB6 UDMA100 controller> port 
0xffa0-0xffaf,0x374-0x377,0x170-0x177,0x3f4-0x3f7,0x1f0-0x1f7 at device 15.1 
on pci0
ata0: reset tp1 mask=03 ostat0=50 ostat1=50
ata0-master: stat=0x50 err=0x01 lsb=0x00 msb=0x00
ata0-slave:  stat=0x50 err=0x01 lsb=0x00 msb=0x00
ata0: reset tp2 mask=03 stat0=50 stat1=50 devices=0x3<ATA_SLAVE,ATA_MASTER>
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: reset tp1 mask=03 ostat0=50 ostat1=00
ata1-master: stat=0x50 err=0x01 lsb=0x00 msb=0x00
ata1-slave:  stat=0x00 err=0x01 lsb=0x00 msb=0x00
ata1: reset tp2 mask=03 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
ohci0: <OHCI (generic) USB controller> mem 0xfebfd000-0xfebfdfff irq 17 at 
device 15.2 on pci0
ohci0: (New OHCI DeviceId=0x02211166)
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: (0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
isab0: <PCI-ISA bridge> at device 15.3 on pci0
isa0: <ISA bus> on isab0
pcib255: <ServerWorks host to PCI bridge(unknown chipset)> at pcibus 255 on 
motherboard
pci255: <PCI bus> on pcib255
pci255: physical bus=255
unknown: status reg test failed ff
unknown: status reg test failed ff
unknown: status reg test failed ff
unknown: status reg test failed ff
unknown: status reg test failed ff
unknown: status reg test failed ff
ata: ata0 already exists; skipping it
ata: ata1 already exists; skipping it
Trying Read_Port at 203
Trying Read_Port at 243
Trying Read_Port at 283
Trying Read_Port at 2c3
Trying Read_Port at 303
Trying Read_Port at 343
Trying Read_Port at 383
Trying Read_Port at 3c3
ex_isa_identify()
pnpbios: 15 devices, largest 234 bytes
pnpbios: handle 0 device ID PNP0c01 (010cd041)
PNP0200: adding dma mask 0x10
PNP0200: adding io range 0-0xf, size=0x10, align=0x1
PNP0200: adding io range 0x80-0x90, size=0x11, align=0x1
PNP0200: adding io range 0x94-0x9f, size=0xc, align=0x1
PNP0200: adding io range 0xc0-0xde, size=0x1f, align=0x1
pnpbios: handle 2 device ID PNP0200 (0002d041)
PNP0100: adding irq mask 0x1
PNP0100: adding io range 0x40-0x43, size=0x4, align=0x1
pnpbios: handle 3 device ID PNP0100 (0001d041)
PNP0b00: adding irq mask 0x100
PNP0b00: adding io range 0x70-0x71, size=0x2, align=0x1
pnpbios: handle 4 device ID PNP0b00 (000bd041)
PNP0303: adding irq mask 0x2
PNP0303: adding io range 0x60-0x60, size=0x1, align=0x1
PNP0303: adding io range 0x64-0x64, size=0x1, align=0x1
pnpbios: handle 5 device ID PNP0303 (0303d041)
PNP0800: adding io range 0x61-0x61, size=0x1, align=0x1
pnpbios: handle 6 device ID PNP0800 (0008d041)
PNP0c04: adding irq mask 0x2000
PNP0c04: adding io range 0xf0-0xff, size=0x10, align=0x1
pnpbios: handle 7 device ID PNP0c04 (040cd041)
PNP0c02: adding io range 0x4d0-0x4d1, size=0x2, align=0x1
PNP0c02: adding io range 0xcf8-0xcff, size=0x8, align=0x1
PNP0c02: adding io range 0x10-0x1f, size=0x10, align=0x1
PNP0c02: adding io range 0x40b-0x40b, size=0x1, align=0x1
PNP0c02: adding io range 0x4d6-0x4d6, size=0x1, align=0x1
PNP0c02: adding io range 0xc00-0xc01, size=0x2, align=0x1
PNP0c02: adding io range 0xc14-0xc14, size=0x1, align=0x1
PNP0c02: adding io range 0xc49-0xc4a, size=0x2, align=0x1
PNP0c02: adding io range 0xc52-0xc52, size=0x1, align=0x1
PNP0c02: adding io range 0xc6c-0xc6c, size=0x1, align=0x1
PNP0c02: adding io range 0xc6f-0xc6f, size=0x1, align=0x1
PNP0c02: adding io range 0xcd6-0xcd7, size=0x2, align=0x1
PNP0c02: adding io range 0xf50-0xf58, size=0x9, align=0x1
PNP0c02: adding io range 0x374-0x375, size=0x2, align=0x1
PNP0c02: adding io range 0x377-0x377, size=0x1, align=0x1
pnpbios: handle 8 device ID PNP0c02 (020cd041)
PNP0c02: adding io range 0x2e-0x2e, size=0x1, align=0x1
PNP0c02: adding io range 0x2f-0x2f, size=0x1, align=0x1
PNP0c02: adding io range 0x580-0x58f, size=0x10, align=0x1
PNP0c02: adding io range 0x500-0x51f, size=0x20, align=0x1
PNP0c02: adding io range 0x480-0x48f, size=0x10, align=0x1
pnpbios: handle 9 device ID PNP0c02 (020cd041)
pnpbios: handle 10 device ID PNP0a03 (030ad041)
PNP0501: adding io range 0x3f8-0x3ff, size=0x8, align=0x8
PNP0501: adding irq mask 0x10
pnpbios: handle 11 device ID PNP0501 (0105d041)
PNP0501: adding io range 0x2f8-0x2ff, size=0x8, align=0x8
PNP0501: adding irq mask 0x8
pnpbios: handle 12 device ID PNP0501 (0105d041)
PNP0401: adding io range 0x378-0x37b, size=0x4, align=0x8
PNP0401: adding io range 0x778-0x77a, size=0x3, align=0x8
PNP0401: adding irq mask 0x80
PNP0401: adding dma mask 0x8
pnpbios: handle 13 device ID PNP0401 (0104d041)
PNP0700: adding io range 0x3f0-0x3f5, size=0x6, align=0x1
PNP0700: adding irq mask 0x40
PNP0700: adding dma mask 0x4
pnpbios: handle 14 device ID PNP0700 (0007d041)
sc: sc0 already exists; skipping it
vga: vga0 already exists; skipping it
isa_probe_children: disabling PnP devices
isa_probe_children: probing non-PnP devices
orm0: <Option ROM> at iomem 0xc0000-0xc7fff on isa0
pmtimer0 on isa0
adv0: not probed (disabled)
aha0: not probed (disabled)
aic0: not probed (disabled)
atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
psm0: current command byte:0065
psm0: failed to reset the aux device.
bt0: not probed (disabled)
cs0: not probed (disabled)
ed0: not probed (disabled)
fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> at port 
0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
fe0: not probed (disabled)
ie0: not probed (disabled)
le0: not probed (disabled)
lnc0: not probed (disabled)
pcic0 failed to probe at port 0x3e0 iomem 0xd0000 on isa0
pcic1: not probed (disabled)
ppc0: parallel port found at 0x378
ppc0: using extended I/O port range
ppc0: ECP SPP SPP
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
plip0: bpf attached
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sc0: fb0, terminal emulator: sc (syscons terminal)
sio0: irq maps: 0xc0c1 0xc0d1 0xc0c1 0xc0c1
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1: irq maps: 0xc0c1 0xc0c9 0xc0c1 0xc0c1
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
sio2: not probed (disabled)
sio3: not probed (disabled)
sn0: not probed (disabled)
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
fb0: vga0, vga, type:VGA (5), flags:0x7007f
fb0: port:0x3c0-0x3df, crtc:0x3d4, mem:0xa0000 0x20000
fb0: init mode:24, bios mode:3, current mode:24
fb0: window:0xc00b8000 size:32k gran:32k, buf:0 size:32k
VGA parameters upon power-up
50 18 10 00 00 00 03 00 02 67 5f 4f 50 82 55 81
bf 1f 00 4f 0d 0e 00 00 07 80 9c 8e 8f 28 1f 96
b9 a3 ff 00 01 02 03 04 05 14 07 38 39 3a 3b 3c
3d 3e 3f 0c 00 0f 08 00 00 00 00 00 10 0e 00 ff
VGA parameters in BIOS for mode 24
50 18 10 00 10 00 03 00 02 67 5f 4f 50 82 55 81
bf 1f 00 4f 0d 0e 00 00 00 00 9c 8e 8f 28 1f 96
b9 a3 ff 00 01 02 03 04 05 14 07 38 39 3a 3b 3c
3d 3e 3f 0c 00 0f 08 00 00 00 00 00 10 0e 00 ff
EGA/VGA parameters to be used for mode 24
50 18 10 00 10 00 03 00 02 67 5f 4f 50 82 55 81
bf 1f 00 4f 0d 0e 00 00 00 00 9c 8e 8f 28 1f 96
b9 a3 ff 00 01 02 03 04 05 14 07 38 39 3a 3b 3c
3d 3e 3f 0c 00 0f 08 00 00 00 00 00 10 0e 00 ff
vt0: not probed (disabled)
isa_probe_children: probing PnP devices
adv1: Invalid baseport of 0x0 specified. Nearest valid baseport is 0x100.  
Failing probe.
adv1: Invalid baseport of 0x40 specified. Nearest valid baseport is 0x100.  
Failing probe.
adv1: Invalid baseport of 0x70 specified. Nearest valid baseport is 0x100.  
Failing probe.
unknown: <PNP0303> can't assign resources (port)
unknown: <PNP0303> at port 0x60 on isa0
adv1: Invalid baseport of 0x61 specified. Nearest valid baseport is 0x100.  
Failing probe.
unknown: <PNP0800> failed to probe at port 0x61 on isa0
adv1: Invalid baseport of 0xf0 specified. Nearest valid baseport is 0x100.  
Failing probe.
adv1: Invalid baseport of 0x4d0 specified. Nearest valid baseport is 0x330.  
Failing probe.
adv1: Invalid baseport of 0x2e specified. Nearest valid baseport is 0x100.  
Failing probe.
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0501> at port 0x3f8-0x3ff on isa0
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0501> at port 0x2f8-0x2ff on isa0
unknown: <PNP0401> can't assign resources (port)
unknown: <PNP0401> at port 0x378-0x37b on isa0
unknown: <PNP0700> can't assign resources (port)
unknown: <PNP0700> at port 0x3f0-0x3f5 on isa0
Device configuration finished.
procfs registered
Timecounter "TSC" frequency 2800121436 Hz quality -100
Timecounters tick every 10.000 msec
lo0: bpf attached
ata0-slave: pio=0x0c wdma=0x22 udma=0x46 cable=80pin
ata0-master: pio=0x0c wdma=0x22 udma=0x46 cable=40pin
ata0-master: setting PIO4 on ServerWorks CSB6 chip
ata0-master: DMA limited to UDMA33, non-ATA66 cable or device
ata0-master: setting UDMA33 on ServerWorks CSB6 chip
ata0-slave: setting PIO4 on ServerWorks CSB6 chip
ata0-slave: setting UDMA100 on ServerWorks CSB6 chip
GEOM: create disk ad0 dp=0xcb0c4e60
ad0: <Maxtor 6Y080L0/YAR41BW0> ATA-7 disk at ata0-master
ad0: 78167MB (160086528 sectors), 158816 C, 16 H, 63 S, 512 B
ad0: 16 secs/int, 1 depth queue, UDMA33
GEOM: new disk ad0
ar: FreeBSD check1 failed
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):1023/254/63 s:63 l:73384857
[1] f:00 typ:165 s(CHS):1023/255/63 e(CHS):1023/254/63 s:73384920 l:86686740
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ad0s1, start 32256 length 37573046784 end 37573079039
GEOM: Configure ad0s2, start 37573079040 length 44383610880 end 81956689919
GEOM: create disk ad1 dp=0xcb0c4b60
ad1: <Maxtor 6Y080L0/YAR41BW0> ATA-7 disk at ata0-slave
ad1: 78167MB (160086528 sectors), 158816 C, 16 H, 63 S, 512 B
ad1: 16 secs/int, 1 depth queue, UDMA100
ar: FreeBSD check1 failed
GEOM: Configure ad0s1a, start 0 length 157286400 end 157286399
GEOM: Configure ad0s1b, start 157286400 length 2147483648 end 2304770047
GEOM: Configure ad0s1c, start 0 length 37573046784 end 37573046783
GEOM: Configure ad0s1d, start 2304770048 length 314572800 end 2619342847
GEOM: Configure ad0s1e, start 2619342848 length 4294967296 end 6914310143
GEOM: Configure ad0s1f, start 6914310144 length 5368709120 end 12283019263
GEOM: Configure ad0s1g, start 12283019264 length 7516192768 end 19799212031
GEOM: Configure ad0s1h, start 19799212032 length 17773834752 end 37573046783
ata1-master: pio=0x0c wdma=0x22 udma=0x46 cable=40pin
ata1-master: setting PIO4 on ServerWorks CSB6 chip
ata1-master: DMA limited to UDMA33, non-ATA66 cable or device
ata1-master: setting UDMA33 on ServerWorks CSB6 chip
GEOM: create disk ad2 dp=0xcb0c4a60
ad2: <Maxtor 6Y200P0/YAR41BW0> ATA-7 disk at ata1-master
ad2: 194481MB (398297088 sectors), 395136 C, 16 H, 63 S, 512 B
ad2: 16 secs/int, 1 depth queue, UDMA33
GEOM: Configure ad0s2c, start 0 length 44383610880 end 44383610879
GEOM: Configure ad0s2d, start 0 length 44383610880 end 44383610879
GEOM: new disk ad1
ar: FreeBSD check1 failed
SMP: AP CPU #2 Launched!
cpu2 AP:
     ID: 0x06000000   VER: 0x00050014 LDR: 0x04000000 DFR: 0x0fffffff
  lint0: 0x00010700 lint1: 0x00010400 TPR: 0x00000000 SVR: 0x000001ff
SMP: AP CPU #1 Launched!
cpu1 AP:
     ID: 0x01000000   VER: 0x00050014 LDR: 0x02000000 DFR: 0x0fffffff
  lint0: 0x00010700 lint1: 0x00010400 TPR: 0x00000000 SVR: 0x000001ff
SMP: AP CPU #3 Launched!
cpu3 AP:
     ID: 0x07000000   VER: 0x00050014 LDR: 0x08000000 DFR: 0x0fffffff
  lint0: 0x00010700 lint1: 0x00010400 TPR: 0x00000000 SVR: 0x000001ff
ioapic0: routing intpin 3 (IRQ 3) to cluster 0
ioapic0: routing intpin 4 (IRQ 4) to cluster 0
ioapic0: routing intpin 6 (IRQ 6) to cluster 0
ioapic0: routing intpin 7 (IRQ 7) to cluster 0
ioapic0: routing intpin 8 (IRQ 8) to cluster 0
ioapic0: routing intpin 13 (IRQ 13) to cluster 0
ioapic0: routing intpin 14 (IRQ 14) to cluster 0
ioapic0: routing intpin 15 (IRQ 15) to cluster 0
ioapic1: routing intpin 1 (IRQ 17) to cluster 0
ioapic1: routing intpin 10 (IRQ 26) to cluster 0
ioapic1: routing intpin 12 (IRQ 28) to cluster 0
GEOM: new disk ad2
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):1023/3/63 s:63 l:160071597
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ad1s1, start 32256 length 81956657664 end 81956689919
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):1023/7/63 s:63 l:398283417
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ad2s1, start 32256 length 203921109504 end 203921141759
GEOM: Configure ad1s1c, start 0 length 81956657664 end 81956657663
GEOM: Configure ad1s1d, start 0 length 2147483648 end 2147483647
GEOM: Configure ad1s1e, start 2147483648 length 10737418240 end 12884901887
GEOM: Configure ad2s1c, start 0 length 203921109504 end 203921109503
GEOM: Configure ad2s1d, start 0 length 5368709120 end 5368709119
GEOM: Configure ad2s1e, start 5368709120 length 2147483648 end 7516192767
Mounting root from ufs:/dev/ad0s1a
start_init: trying /sbin/init
em1: Link is up 100 Mbps Full Duplex
Linux ELF exec handler installed
pflog0: bpf attached
pflog: $Name: VERSION_2_03 $
pfsync0: bpf attached
pfsync: $Name: VERSION_2_03 $
in6_ifattach: pflog0 is not multicast capable, IPv6 not enabled
in6_ifattach: pfsync0 is not multicast capable, IPv6 not enabled
pflog0: promiscuous mode enabled
pf: $Name: VERSION_2_03 $
em1: Link is up 100 Mbps Full Duplex
WARNING: /jailsystem/bulletservices was not properly dismounted
WARNING: /jailsystem/raptor was not properly dismounted


-
Some oddities:

adv1: Invalid baseport of 0x61 specified. Nearest valid baseport is 0x100.  
Failing probe.
unknown: status reg test failed ff
disks: only one is UDMA100, the others are UDMA33; I don't know if this 
could cause any trouble.

I have no idea what these are, but they could lead to something.
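
For the UDMA point, the dmesg above already has the pieces: the cable type 
the ata driver detected per device, and the mode each disk ended up in.  A 
minimal sketch that lines them up is below.  The sample lines are copied 
from the dmesg in this mail; the regular expressions and the summarize() 
helper are just an illustration, not part of any FreeBSD tool.

```python
import re

# Lines copied from the dmesg above.
DMESG_SAMPLE = """\
ata0-slave: pio=0x0c wdma=0x22 udma=0x46 cable=80pin
ata0-master: pio=0x0c wdma=0x22 udma=0x46 cable=40pin
ad0: <Maxtor 6Y080L0/YAR41BW0> ATA-7 disk at ata0-master
ad0: 16 secs/int, 1 depth queue, UDMA33
ad1: <Maxtor 6Y080L0/YAR41BW0> ATA-7 disk at ata0-slave
ad1: 16 secs/int, 1 depth queue, UDMA100
ata1-master: pio=0x0c wdma=0x22 udma=0x46 cable=40pin
ad2: <Maxtor 6Y200P0/YAR41BW0> ATA-7 disk at ata1-master
ad2: 16 secs/int, 1 depth queue, UDMA33
"""

CABLE_RE = re.compile(r"^(ata\d+-(?:master|slave)): .*cable=(\d+pin)", re.M)
ATTACH_RE = re.compile(
    r"^(ad\d+): <[^>]*> .* disk at (ata\d+-(?:master|slave))", re.M
)
MODE_RE = re.compile(r"^(ad\d+): .*depth queue, (\S+)$", re.M)


def summarize(dmesg):
    """Map each disk to (channel position, detected cable, transfer mode)."""
    cables = dict(CABLE_RE.findall(dmesg))
    modes = dict(MODE_RE.findall(dmesg))
    return {
        disk: (chan, cables.get(chan), modes.get(disk))
        for disk, chan in ATTACH_RE.findall(dmesg)
    }


for disk, (chan, cable, mode) in sorted(summarize(DMESG_SAMPLE).items()):
    print(f"{disk}: {chan}, cable={cable}, mode={mode}")
```

Both UDMA33 disks are the ones reported with cable=40pin, which matches the 
"DMA limited to UDMA33, non-ATA66 cable or device" lines in the dmesg.  Note 
that ata0-master and ata0-slave report different cable types even though 
they are on the same channel, so the cabling may be worth a look.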

mptable:

===============================================================================

MPTable, version 2.0.15

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:BIOS
  physical address:0x000ff780
  signature:'_MP_'
  length:16 bytes
  version:1.4
  checksum:0xe3
  mode:Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:0x000f0ea0
  signature:'PCMP'
  base table length:324
  version:1.4
  checksum:0x87
  OEM ID:'AMI     '
  Product ID:'GCHE        '
  OEM table pointer:0x00000000
  OEM table size:0
  entry count:29
  local APIC address:0xfee00000
  extended table length:124
  extended table checksum:17

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors: APIC ID  Version  State        Family  Model  Step  Flags
            0        0x14     BSP, usable  15      2      5     0xbfebfbff
            1        0x14     AP, usable   15      2      5     0xbfebfbff
            6        0x14     AP, usable   15      2      5     0xbfebfbff
            7        0x14     AP, usable   15      2      5     0xbfebfbff
--
Bus:        Bus ID   Type
            0        PCI
            1        ISA
--
I/O APICs:  APIC ID  Version  State   Address
            8        0x11     usable  0xfec00000
            9        0x11     usable  0xfec01000
            10       0x11     usable  0xfec02000
            11       0x11     usable  0xfec03000
--
I/O Ints:   Type    Polarity   Trigger  Bus ID  IRQ   APIC ID  PIN#
            INT     active-lo  level    0       11:A  9        13
            INT     active-lo  level    0       15:A  9        1
            INT     active-lo  level    0       9:A   9        10
            INT     active-lo  level    0       8:A   9        12
            ExtINT  active-hi  edge     1       0     8        0
            INT     active-hi  edge     1       1     8        1
            INT     active-hi  edge     1       0     8        2
            INT     active-hi  edge     1       3     8        3
            INT     active-hi  edge     1       4     8        4
            INT     active-hi  edge     1       5     8        5
            INT     active-hi  edge     1       6     8        6
            INT     active-hi  edge     1       7     8        7
            INT     active-hi  edge     1       8     8        8
            INT     active-hi  edge     1       12    8        12
            INT     active-hi  edge     1       13    8        13
            INT     active-hi  edge     1       14    8        14
            INT     active-hi  edge     1       15    8        15
--
Local Ints: Type    Polarity   Trigger  Bus ID  IRQ   APIC ID  PIN#
            ExtINT  active-hi  edge     1       0     255      0
            NMI     active-hi  edge     0       0:A   255      1

-------------------------------------------------------------------------------

MP Config Extended Table Entries:

--
System Address Space
bus ID: 0 address type: I/O address
address base: 0xd000
address range: 0x2000
--
System Address Space
bus ID: 0 address type: I/O address
address base: 0x0
address range: 0x100
--
System Address Space
bus ID: 0 address type: memory address
address base: 0xa0000
address range: 0x20000
--
System Address Space
bus ID: 0 address type: memory address
address base: 0xfcb00000
address range: 0x2100000
--
System Address Space
bus ID: 0 address type: prefetch address
address base: 0xfca00000
address range: 0x100000
--
Bus Heirarchy
bus ID: 1 bus info: 0x01 parent bus ID: 0
--
Compatibility Bus Address
bus ID: 0 address modifier: add
predefined range: 0x00000000
--
Compatibility Bus Address
bus ID: 0 address modifier: add
predefined range: 0x00000001

===============================================================================


pciconf -lv:

hostb0 at pci0:0:0:class=0x060000 card=0x00000000 chip=0x00171166 rev=0x32 
hdr=0x00
    vendor   = 'ServerWorks (Was: Reliance Computer Corp)'
    device   = 'CMIC-SL'
    class    = bridge
    subclass = HOST-PCI
hostb1@pci0:0:1: class=0x060000 card=0x00000000 chip=0x00171166 rev=0x00 hdr=0x00
    vendor   = 'ServerWorks (Was: Reliance Computer Corp)'
    device   = 'CMIC-SL'
    class    = bridge
    subclass = HOST-PCI
em0@pci0:8:0: class=0x020000 card=0x004e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82544XT PRO/1000 MT Gigabit Ethernet Controller'
    class    = network
    subclass = ethernet
em1@pci0:9:0: class=0x020000 card=0x004e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82544XT PRO/1000 MT Gigabit Ethernet Controller'
    class    = network
    subclass = ethernet
none0@pci0:11:0: class=0x030000 card=0x00081002 chip=0x47521002 rev=0x27 hdr=0x00
    vendor   = 'ATI Technologies'
    device   = 'Rage XL PCI'
    class    = display
    subclass = VGA
hostb2@pci0:15:0: class=0x060000 card=0x415515d9 chip=0x02031166 rev=0xa0 hdr=0x00
    vendor   = 'ServerWorks (Was: Reliance Computer Corp)'
    class    = bridge
    subclass = HOST-PCI
atapci0@pci0:15:1: class=0x01018a card=0x021211d9 chip=0x02131166 rev=0xa0 hdr=0x00
    vendor   = 'ServerWorks (Was: Reliance Computer Corp)'
    class    = mass storage
    subclass = ATA
ohci0@pci0:15:2: class=0x0c0310 card=0x415515d9 chip=0x02211166 rev=0x05 hdr=0x00
    vendor   = 'ServerWorks (Was: Reliance Computer Corp)'
    class    = serial bus
    subclass = USB
isab0@pci0:15:3: class=0x060100 card=0x415515d9 chip=0x02271166 rev=0x00 hdr=0x00
    vendor   = 'ServerWorks (Was: Reliance Computer Corp)'
    class    = bridge
    subclass = PCI-ISA
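
As a reading aid for the listing above: the chip= field packs the PCI device 
ID into the upper 16 bits and the vendor ID into the lower 16 bits, and the 
class= field holds base class, subclass and programming interface in one byte 
each. A minimal Python sketch of that decoding follows; the helper names are 
made up for illustration and are not part of pciconf.

```python
def decode_chip(chip):
    """Split a pciconf chip= value into (vendor_id, device_id)."""
    return chip & 0xFFFF, (chip >> 16) & 0xFFFF


def decode_class(cls):
    """Split a pciconf class= value into (base class, subclass, prog-if)."""
    return (cls >> 16) & 0xFF, (cls >> 8) & 0xFF, cls & 0xFF


if __name__ == "__main__":
    # em0 from the listing: chip=0x100e8086
    vendor, device = decode_chip(0x100E8086)
    print("vendor=0x%04x device=0x%04x" % (vendor, device))
    # atapci0 from the listing: class=0x01018a
    print("class=0x%02x subclass=0x%02x progif=0x%02x" % decode_class(0x01018A))
```

For em0 this gives vendor 0x8086 and device 0x100e, which matches the 
'Intel Corporation' vendor string pciconf printed.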


And finally, the hardware list as given by the datacenter:

Motherboard
SuperMicro X5DEI-GG

Memory
Apacer DDR 1024MB PC266 ECC+REG

CDrom
SuperMicro CDM-TEAC-32Black 32X Slimline

Heatsink
SuperMicro Heatsink SNK-0039

Proc
2 Intel S604 Xeon P4 2.8 512kb 533FSB Box

Hdd
Maxtor 080Gb U133 7200rpm 2MB 6Y080L0
Maxtor 080Gb U133 7200rpm 2MB 6Y080L0
Maxtor 200Gb U133 7200rpm 8MB 6Y200P0


-------


I have APM disabled (it doesn't work with this board).

My kernel has some custom options:
options                 QUOTA
options                 RANDOM_IP_ID
options                 SC_DISABLE_REBOOT
options                 IPSTEALTH
options                 ADAPTIVE_MUTEXES

And it is an SMP kernel (as seen in the dmesg).

A couple of things changed before the server started dying daily. I've tried 
reverting them all, but none of them seemed to be the cause:

ad2 (200G) - It wasn't present when I got the server, and the problems 
started roughly after adding the disk. I asked the datacenter to remove it; 
the server then ran for about 8 hours and died as usual.

jails (@ad1,ad2) - I thought it could be a problem with writing to multiple 
disks at once, so I let it run without any jails. It lasted 30 minutes that 
time.

High loads - I had seti@home running (4 processes) and the server was always 
loaded. I thought the panic (I still don't have the message) was caused by 
high loads. To test this I disabled seti and let the server idle. It ran for 
around 8 hours again, then died. Recently I loaded the hell out of it with 
seti at -15 priority, portsdb -Uu and compiling stuff, and got a steady load 
of 50. It didn't die until half an hour later, when the load was around 25. 
Since it held a load of 50 for an hour or so, I'm assuming high loads aren't 
the problem.
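
Since the panic message is still missing, one cheap way to pin down when the 
box dies and under what load is to log the load average with a timestamp and 
sync every record to disk, so the last line written before a hang is still 
there after the reboot. A rough Python sketch of the idea; the function 
names, the interval and the log path in the note below are assumptions for 
illustration, not part of the setup described here.

```python
import os
import time


def format_record(now, load):
    """Build one log line: timestamp followed by the 1/5/15 minute load."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return "%s load %.2f %.2f %.2f\n" % ((stamp,) + tuple(load))


def log_load(path, interval=60, iterations=None):
    """Append a record every `interval` seconds; run forever if iterations is None."""
    count = 0
    with open(path, "a") as fh:
        while iterations is None or count < iterations:
            fh.write(format_record(time.time(), os.getloadavg()))
            fh.flush()
            os.fsync(fh.fileno())  # ask the kernel to push the line to disk now
            count += 1
            if iterations is None or count < iterations:
                time.sleep(interval)
```

Calling log_load("/var/log/loadwatch.log") (a hypothetical path) from a 
background process would leave a trail whose last entry is at most one 
interval older than the hang.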

pf - packet filter. I had to reboot the server in order to enable the pf 
firewall, and I had no problems until I started using it. But I doubt a 
firewall would cause the server to die, so I haven't tried running the 
server without it yet. Has anyone had problems like this because of pf? I 
don't think so.

sysctls - I made a lot of optimizations to the sysctls to handle the large 
volume of file descriptors and sockets I was expecting. About 15 minutes 
ago, I commented out ALL of the stuff I changed and rebooted the server.
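
To double-check that every tuning really is back at its default, two saved 
dumps of `sysctl -a` (one with the tunings, one after reverting) can be 
compared. A minimal Python sketch, assuming each dump has one "name: value" 
pair per line and simply skipping anything that doesn't fit that shape; the 
function names are hypothetical.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines into a dict, ignoring anything else."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        # Skip continuation lines of multi-line values and malformed lines.
        if sep and name and " " not in name:
            values[name] = value.strip()
    return values


def diff_sysctl(before, after):
    """Return {name: (old, new)} for every sysctl whose value differs."""
    changed = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed
```

Feeding it the contents of the two dump files lists every sysctl that still 
differs between the tuned and the reverted system; run-time counters will 
show up too and can be ignored.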




As a final note, at the moment only named, postfix, courier imap, apache, 
setiathome and sshd are running on the system (+2 jails).

Any suggestions as to what is causing the server to stop daily?


Many thanks if you got this far reading this :-)

Regards,

Hugo



_______________________________________________
freebsd-smp at freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-smp
To unsubscribe, send any mail to "freebsd-smp-unsubscribe at freebsd.org"
