Unexplained panic? on Sun X2100 - help / pointers?

Peter Thoenen eol1 at yahoo.com
Thu Feb 16 13:07:13 PST 2006


Purchased a Sun X2100 a couple of weeks ago and have been experiencing
what I assume are unexplained panics.*  Narrowed it down to what I think
is the bge driver, but I am unsure how to confirm or prove this.  Help
would be appreciated, as I would like to get this fixed (if it's bad
hardware, get replacement parts; if it's bad software, hope that
FreeBSD 6.1 fixes it).  No point in owning a server if you can't use it :)

* It's at a remote colo, so I don't actually see a panic screen, and
neither the logged serial console nor syslog reports a panic.  I assume
a panic because the box just arbitrarily dies after a couple of hours.

* Not seeing anything in /var/crashes, though I have enough space on my
dumpdir and swap.
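
Since nothing is logged at the moment the box dies, one way to at least
pin down the time of death is a heartbeat file.  Below is a minimal
Python sketch, not something running on this box: the function names,
the log path in the usage note and the 10 second interval are all
assumptions.  It cannot tell a panic from a hard hang; it only records
the last moment the system was still able to write to disk.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamp line and force it out to disk."""
    # time.localtime(None) means "current time"; `now` exists for testing.
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(path, "a") as fh:
        fh.write(stamp + "\n")
        fh.flush()
        # fsync so the line survives an abrupt death of the machine.
        os.fsync(fh.fileno())
    return stamp


def last_heartbeat(path):
    """Return the last non-empty line, or None if there is no data yet."""
    try:
        with open(path) as fh:
            lines = [ln.strip() for ln in fh if ln.strip()]
    except FileNotFoundError:
        return None
    return lines[-1] if lines else None


def run(path, interval=10):
    """Write a heartbeat every `interval` seconds until the box goes down."""
    while True:
        write_heartbeat(path)
        time.sleep(interval)
```

Usage would be something like run("/var/log/heartbeat.log") started
before the bandwidth test, then last_heartbeat() on the same file after
the reboot to see how long the box survived.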

Reasoning for suspecting the bge driver (though it might be something
else network related):
- Installed 6.0 w/ GENERIC kernel.  Runs fine. (72 hour test, no crash)
- Installed custom kernel. Runs fine. (72 hour test, no crash)
- Installed dns, httpd, mail, ssh, and a couple of other low-bandwidth
items.  Runs fine. (72 hour test, no crash)
- Installed tor, i2p, freenet.  Run ANY ONE (or several) of these items
and the box dies after about 3 to 4 hours at the max pf queue rate
(450kbs for each of the 3 items).
- Reboot and repeat the bandwidth-intensive test; the box dies again.
- Do not run any of those 3 bandwidth-intensive items.  Box stays up
for 72 hours.
- Run any one of the 3 items and once again the box dies within 3 to 4
hours.
- Rebuilt GENERIC kernel (thought it might be a custom kernel issue).
- Repeated the tests, same issue: box dies after 3 or 4 hours of
constant 450kbs+ traffic.
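
The list above can be written out as a small table to check which
factor actually lines up with the crashes.  This is only a sketch in
Python: the kernel assigned to the middle trials is my reading of the
order of events (an assumption), and the helper names are made up for
illustration.

```python
# Each trial: (kernel, heavy_traffic, crashed), transcribed from the list
# above.  Kernel for the middle trials is assumed from the order of events.
TRIALS = [
    ("GENERIC", False, False),  # fresh 6.0 install, 72 hour test
    ("CUSTOM", False, False),   # custom kernel, 72 hour test
    ("CUSTOM", False, False),   # plus dns, httpd, mail, ssh
    ("CUSTOM", True, True),     # tor / i2p / freenet, dies in 3 to 4 hours
    ("CUSTOM", True, True),     # reboot and repeat
    ("CUSTOM", False, False),   # heavy items off, up for 72 hours
    ("CUSTOM", True, True),     # heavy items back on
    ("GENERIC", True, True),    # rebuilt GENERIC, same result
]


def outcomes_by(index):
    """Map each value of one factor to the set of crash outcomes seen with it."""
    seen = {}
    for trial in TRIALS:
        seen.setdefault(trial[index], set()).add(trial[2])
    return seen


def separates(index):
    """True if every value of the factor always led to the same outcome."""
    return all(len(v) == 1 for v in outcomes_by(index).values())


print("kernel predicts crash:", separates(0))         # False
print("heavy traffic predicts crash:", separates(1))  # True
```

Both kernels show up with and without a crash, while heavy traffic
lines up with the crash every time.  That points at something in the
network path under load, but it does not by itself separate bge from
pf/altq.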

Might be the bge driver, might be pf altq related, might be *other*.
I am lost as to where to proceed from here.  I no longer think it's
hardware related (as in bad hardware), as I can run it for 72+ hours on
low bandwidth with no crash.  It only crashes under (albeit minor) load.

Not sure if this is of any use but below is my DMESG:

---------------------------------------------------------

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
1994
        The Regents of the University of California. All rights
reserved.
FreeBSD 6.0-RELEASE-p4 #1: Mon Feb 13 20:48:35 EST 2006
    root@nan-elmoth:/usr/obj/usr/src/sys/CUSTOM
ACPI APIC Table: <SUNW   AWRDACPI>
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Opteron(tm) Processor 148 (2211.34-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x20f71  Stepping = 1
  Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,<b25>,LM,3DNow+,3DNow>
real memory  = 2146304000 (2046 MB)
avail memory = 2063179776 (1967 MB)
ioapic0 <Version 1.1> irqs 0-23 on motherboard
acpi0: <SUNW AWRDACPI> on motherboard
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi0: Power Button (fixed)
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
pci_link0: <ACPI PCI Link LNK1> on acpi0
pci_link1: <ACPI PCI Link LNK2> on acpi0
pci_link2: <ACPI PCI Link LNK3> on acpi0
pci_link3: <ACPI PCI Link LNK4> irq 5 on acpi0
pci_link4: <ACPI PCI Link LNK5> on acpi0
pci_link5: <ACPI PCI Link LUBA> irq 10 on acpi0
pci_link6: <ACPI PCI Link LUBB> on acpi0
pci_link7: <ACPI PCI Link LMAC> irq 3 on acpi0
pci_link8: <ACPI PCI Link LACI> irq 7 on acpi0
pci_link9: <ACPI PCI Link LMCI> on acpi0
pci_link10: <ACPI PCI Link LSMB> irq 5 on acpi0
pci_link11: <ACPI PCI Link LUB2> irq 11 on acpi0
pci_link12: <ACPI PCI Link LIDE> on acpi0
pci_link13: <ACPI PCI Link LSID> irq 11 on acpi0
pci_link14: <ACPI PCI Link LFID> irq 10 on acpi0
pci_link15: <ACPI PCI Link LPCA> on acpi0
pci_link16: <ACPI PCI Link APC1> irq 0 on acpi0
pci_link17: <ACPI PCI Link APC2> irq 0 on acpi0
pci_link18: <ACPI PCI Link APC3> irq 0 on acpi0
pci_link19: <ACPI PCI Link APC4> irq 0 on acpi0
pci_link20: <ACPI PCI Link APC5> irq 16 on acpi0
pci_link21: <ACPI PCI Link APCF> irq 0 on acpi0
pci_link22: <ACPI PCI Link APCG> irq 0 on acpi0
pci_link23: <ACPI PCI Link APCH> irq 0 on acpi0
pci_link24: <ACPI PCI Link APCJ> irq 0 on acpi0
pci_link25: <ACPI PCI Link APCK> irq 0 on acpi0
pci_link26: <ACPI PCI Link APCS> irq 0 on acpi0
pci_link27: <ACPI PCI Link APCL> irq 0 on acpi0
pci_link28: <ACPI PCI Link APCZ> irq 0 on acpi0
pci_link29: <ACPI PCI Link APSI> irq 0 on acpi0
pci_link30: <ACPI PCI Link APSJ> irq 0 on acpi0
pci_link31: <ACPI PCI Link APCP> irq 0 on acpi0
unknown: I/O range not supported
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci_link26: BIOS IRQ 5 for -2145826954.1.INTA is invalid
pci_link21: BIOS IRQ 10 for -2145826954.2.INTA is invalid
pci_link27: BIOS IRQ 11 for -2145826954.2.INTB is invalid
pci_link23: BIOS IRQ 3 for -2145826954.10.INTA is invalid
pci_link29: BIOS IRQ 11 for -2145826954.7.INTA is invalid
pci_link30: BIOS IRQ 10 for -2145826954.8.INTA is invalid
pci0: <ACPI PCI bus> on pcib0
pci0: <memory> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
pci0: <serial bus, SMBus> at device 1.1 (no driver attached)
ohci0: <OHCI (generic) USB controller> mem 0xfe02f000-0xfe02ffff irq 21
at device 2.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 8 ports with 8 removable, self powered
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xfeb00000-0xfeb000ff
irq 22 at device 2.1 on pci0
ehci0: [GIANT-LOCKED]
usb1: EHCI version 1.0
usb1: companion controller, 4 ports each: usb0
usb1: <EHCI (generic) USB 2.0 controller> on ehci0
usb1: USB revision 2.0
uhub1: nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 8 ports with 8 removable, self powered
atapci0: <nVidia nForce4 UDMA133 controller> port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe800-0xe80f at device 6.0 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
atapci1: <nVidia nForce4 SATA150 controller> port
0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xd400-0xd40f mem
0xfe02c000-0xfe02cfff irq 23 at device 7.0 on pci0
ata2: <ATA channel 0> on atapci1
ata3: <ATA channel 1> on atapci1
atapci2: <nVidia nForce4 SATA150 controller> port
0x9e0-0x9e7,0xbe0-0xbe3,0x960-0x967,0xb60-0xb63,0xc000-0xc00f mem
0xfe02b000-0xfe02bfff irq 21 at device 8.0 on pci0
ata4: <ATA channel 0> on atapci2
ata5: <ATA channel 1> on atapci2
pcib1: <ACPI PCI-PCI bridge> at device 9.0 on pci0
pci_link16: BIOS IRQ 23 for 0.7.INTA is invalid
pci_link19: BIOS IRQ 21 for 0.8.INTA is invalid
pci_link17: BIOS IRQ 22 for 0.10.INTA is invalid
pci1: <ACPI PCI bus> on pcib1
pci1: <display, VGA> at device 5.0 (no driver attached)
nve0: <NVIDIA nForce MCP9 Networking Adapter> port 0xbc00-0xbc07 mem
0xfe02a000-0xfe02afff irq 22 at device 10.0 on pci0
nve0: Ethernet address 00:e0:81:59:33:88
miibus0: <MII bus> on nve0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto
nve0: Ethernet address: 00:e0:81:59:33:88
nve0: [GIANT-LOCKED]
pcib2: <ACPI PCI-PCI bridge> at device 11.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 12.0 on pci0
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 13.0 on pci0
pci4: <ACPI PCI bus> on pcib4
bge0: <Broadcom BCM5721 Gigabit Ethernet, ASIC rev. 0x4101> mem
0xfdaf0000-0xfdafffff irq 19 at device 0.0 on pci4
miibus1: <MII bus> on bge0
brgphy0: <BCM5750 10/100/1000baseTX PHY> on miibus1
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
1000baseTX-FDX, auto
bge0: Ethernet address: 00:e0:81:59:33:89
pcib5: <ACPI PCI-PCI bridge> at device 14.0 on pci0
pci5: <ACPI PCI bus> on pcib5
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on
acpi0
sio0: type 16550A, console
orm0: <ISA Option ROMs> at iomem
0xc0000-0xc7fff,0xc8000-0xcbfff,0xce000-0xcf7ff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
device_attach: atkbd0 attach returned 6
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on
isa0
ukbd0: GM-TEK USB Composite Device, rev 1.01/0.01, addr 2, iclass 3/1
kbd0 at ukbd0
uhid0: GM-TEK USB Composite Device, rev 1.01/0.01, addr 2, iclass 3/1
Timecounter "TSC" frequency 2211343601 Hz quality 800
Timecounters tick every 1.000 msec
Fast IPsec: Initialized Security Association Processing.
acd0: DVDROM <MATSHITADVD-ROM SR-8178/PZ16> at ata0-master UDMA66
ad4: 238475MB <Seagate ST3250823AS 3.03> at ata2-master SATA150
Trying to mount root from ufs:/dev/ad4s1a
bge0: link state changed to UP

