Continued instability with 5.3-STABLE

Doug White dwhite at
Wed Mar 9 23:46:21 PST 2005

On Wed, 9 Mar 2005, Tony Arcieri wrote:

> I have a dual Opteron which seems to only stay up approximately two
> weeks at a time then spontaneously reboots.  It's colocated so I can't ever
> see panic messages, and I don't have another system colocated at the same
> place I can use to gather debugging info.

You may want to consider finding a small system with a free serial port to
serve as a temporary serial console.  Without output from the crash it's
impossible to tell what went wrong.
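
A minimal sketch of that setup, assuming a null-modem cable between the
first serial ports of the two machines and that both run FreeBSD 5.x
(the device name below is an assumption and differs on other releases):

    # On the crashing machine, in /boot/loader.conf:
    console="comconsole"

    # On the helper machine, attach to the serial line at 9600 baud:
    cu -l /dev/cuaa0 -s 9600

Running cu under script(1), or in a terminal with a large scrollback,
should keep whatever the kernel prints just before the reboot.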

> I've never managed to get the system to generate a crash dump either.  It
> has a 1GB swap partition and 2GB of physical RAM but through the last
> few reboots I've been setting hw.physmem to 896M as the only custom parameter
> in loader.conf.  The swap partition is labeled as follows:
> twed0s1b  swap         1024MB SWAP
> And dumpdev is set in rc.conf as follows:
> dumpdev="/dev/twed0s1b"
> /var/crash/minfree is set to 2048
> Lately I built a kernel from GENERIC using the latest RELENG_5 sources and
> without SMP support and experienced a reboot after approximately 16 days uptime,
> roughly equivalent to how long it took the system to crash with SMP enabled.
> No core file was generated.
> The kernel was built using source checked out from RELENG_5 on February 18th.
> I'm not sure if any Opteron specific fixes have been applied to the branch
> since then.

Make sure you're actually running this kernel since crashdump support for
twe was added 2/12, in rev of src/sys/dev/twe/twe.c.
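
A rough way to check, assuming the kernel is installed at the default
/boot/kernel/kernel and that twe.c carries an RCS ID string:

    uname -v                                 # build date of the running kernel
    sysctl kern.bootfile                     # path of the kernel that booted
    ident /boot/kernel/kernel | grep twe.c   # twe.c revision compiled in
    dumpon -v /dev/twed0s1b                  # check the dump device is accepted

If dumpon complains about the device, the driver in the running kernel
most likely predates the crashdump support.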

> Are there any other means of gathering debugging data that would work in
> my situation?  As is, I'm still unsure if my problems are hardware or
> software related, as I've still never seen a panic message from the
> system (hardware is a Tyan K8S motherboard in a Tyan Transport system).

You really, really want a serial console.

> Should I look into using KTR ALQ to log KTR data to the swap partition, and
> if it fills up will it wrap over to the beginning?  I've never used that
> feature before...

If you don't have a serial console to drive ddb from, or working
crashdumps, then there is no way to retrieve the KTR data.

Doug White                    |  FreeBSD: The Power to Serve
dwhite at          |

