durham at jcdurham.com
Fri Oct 1 17:23:07 PDT 2004
On Friday 01 October 2004 06:38 pm, Kris Kennaway wrote:
> On Thu, Sep 30, 2004 at 10:03:00AM -0400, Jim Durham wrote:
> > I have had this problem now with at least 3 FreeBSD servers over a period
> > of about 2 years. I had put it down to some hardware problem but it seems
> > to be too much of a coincidence with 3 different machines doing the same
> > thing.
> > The first time was when I put 4.5-RELEASE on a brand new Dell PowerEdge
> > 2650. I ran it on the bench for a week or so, then decided all was well
> > and put it in the server rack and started doing the company's email
> > service on it. After a few weeks, it suddenly would 'reboot' for no
> > apparent reason. No log entries, nothing at all except the usual stuff in
> > /var/log/messages about '/ was not unmounted correctly', etc. Just like
> > you had pulled the power plug.
> > The 2nd instance was a server that I maintain for an ISP that was a
> > mirror image of their primary server, a 'hot spare' so to speak. The
> > primary, running the same software was solid, but the backup would reboot
> > at about 5:20 every morning with the same syndrome: no log entries of any
> > sort and just the usual entries in /var/log/messages saying that the /
> > partition was not unmounted properly. The odd thing was that it was
> > happening at virtually the same time every morning.
> > I upgraded both systems to the latest -RELEASE and it made no
> > difference. Then, they both just *stopped doing it by themselves* with no
> > apparent correlation to anything installed software-wise. Neither server
> > has had any problem for over a year now.
> > The 3rd instance is happening now. Another server I maintain for my
> > 'night job' is doing the same thing for a customer. It just 'stops' like
> > you pulled the power plug. However, this time I thought to check using
> > 'last' and found that I had accidentally left an ssh session open and
> > that entry said 'crash'. There are no other log entries I can find
> > related to the 'reboot'.
> Do you have ddb enabled? If not, the machine may be panicking and
> rebooting automatically.
No. Not on any of the 3 boxes. Like I said, the problem has gone away and not
returned on the Dell and the ISP's box and the loads on those boxes are
always increasing and they've been fine for over a year now. It was just when
this same thing started with a customer's server box that I started to wonder
if it was some very intermittent problem in the kernel.
> Actual "spontaneous reboots" are very rare
These are very rare.... except they seem to happen about once a day for a
while and then stop... very strange..
> and usually caused by hardware problems (e.g. faulty power supply,
> overheating CPU, bad RAM).
Possible, but if so, the hardware fixed itself on the first two boxes I
> Enable DDB, and see what happens the next
> time it crashes.
I'll try that on the one that's doing it now. Any suggestions as to how to log
this to get the most info? I've not played with ddb, but I'll read the docs
and get it going.
Thanks much to all who responded!