Server crashing, no explanations

Chris Pratt eagletree at hughes.net
Wed May 21 15:17:29 UTC 2008


On May 20, 2008, at 7:17 AM, Alan Gilmour wrote:

> Hey all,
>
> We have recently been getting a lot of traffic to one of our sites.
> During busy periods the CPU is consistently at 100% utilisation.
> When this happens we have approx. 150 apache threads, and the load
> goes way above 15.
>
> However recently the server has been auto-restarting (when under heavy
> load) with no explanation in any logs. I've checked the console log,
> messages, db logs etc., but no mention of anything wrong.
>
> Brief server summary :
>
> FreeBSD 6.3-STABLE #0:
> CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2800.11-MHz 686-class CPU)
>  Logical CPUs per core: 2
> real memory  = 17716740096 (16896 MB)
> avail memory = 16837763072 (16057 MB)
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>
> We tried installing mbmon, lmmon and healthd, but none seem to work.
>
> Anyone got any suggestions for other things we can try to detect why
> the server is failing? or other ways to check things like CPU temp and
> memory status?

We have experienced this since 6.x began, and it's not hardware.
It can be reproduced by moving the role to another, similar server.
When the role and its traffic (not necessarily the load) are moved,
the problem goes away on the old box or, rather, transfers to the
new one.

Look at the thread named "zonealarm issues" on freebsd-net from a
couple of months ago. You may find it applies, but there aren't
any answers there yet. I gather that people need more data to be
collected. I have never figured out how to get a dump, though
people have recommended things to try over the last couple of
years. I was hoping 7.0 would be the solution, but I'm told it's
not.
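
For anyone else trying to get a dump, the usual starting point is the
pair of rc.conf settings below. This is only a sketch and is untested
against this particular hang: a dump is written only if the kernel
actually panics, not if the box hard-resets. dumpdev and dumpdir are
standard rc.conf(5) variables. The assumption here is that your release
accepts AUTO; if it does not, name your swap device explicitly instead.

```shell
# /etc/rc.conf fragment (sketch; assumes the release accepts AUTO)
dumpdev="AUTO"        # let the system pick a swap device for kernel dumps
dumpdir="/var/crash"  # where savecore(8) stores the dump on the next boot
```

The swap device has to be large enough to hold the dump, and dumpdir
needs enough free space for savecore(8) to copy it out after the reboot.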

Reduce your traffic and the problem will go away. Splitting the
traffic across more than one server is one way to do this. We
increased our uptime drastically by doing so, but we still get hit
hard enough at times to go down. During our low-traffic periods of
the year, we simply stay up all the time (even on the hottest days
of summer).

By the way, the symptom I see is never an immediate reboot; the box
will hang for a reasonable period of time before rebooting. As I
monitor ours 24/7, I reset power on the box before it reboots, to
reduce the outage for customers. If I'm not watching, it will
eventually reboot on its own. Brutal, but it works.

Realize it's possible you don't have this problem, but there are a
few of us who do. It has something to do with buffers not being
freed up.
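
Since the missing piece seems to be data from just before the hang, one
option is to log buffer statistics at intervals so that something is on
disk when the box goes down. The following is a minimal sketch, not a
tested fix. The function name log_snapshot and the log path are made up
for illustration, and it assumes the buffers in question are network
mbufs, which netstat -m and vmstat -z report on FreeBSD.

```shell
#!/bin/sh
# Append a timestamped snapshot of each command's output to a log file.
# Usage: log_snapshot LOGFILE 'cmd one' 'cmd two' ...
# (log_snapshot is a hypothetical helper, not a FreeBSD tool.)
log_snapshot() {
    logfile=$1
    shift
    {
        echo "=== $(date -u '+%Y-%m-%d %H:%M:%S') UTC ==="
        for cmd in "$@"; do
            echo "--- $cmd"
            # Note a failing or missing command instead of aborting the run.
            sh -c "$cmd" 2>&1 || echo "(command failed: $cmd)"
        done
    } >> "$logfile"
    # Flush to disk so the last snapshot has a chance of surviving a reset.
    sync
}

# Example loop, commented out so the file can be sourced safely.
# The log path is an assumption; pick one that suits your system.
# while :; do
#     log_snapshot /var/log/mbuf-watch.log 'netstat -m' 'vmstat -z'
#     sleep 60
# done
```

After a crash, the last few snapshots in the log show whether mbuf or
zone usage was climbing in the run-up to the hang, which is the sort of
data the freebsd-net thread seems to be waiting for.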

>
> Cheers
>
> Alan
> _______________________________________________
> freebsd-questions at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to
> "freebsd-questions-unsubscribe at freebsd.org"


