Compaq 1850R freezing, controller issues?

Gerald bsdlist at bsdisp.com
Tue Apr 27 09:19:25 PDT 2004


I'll try to be as specific as possible without overkill.

I have a Compaq 1850R with dual P3 450s and a gig of RAM running FreeBSD
4.8-RELEASE-p16. The 3 internal 36G SCSI 10k disks are set up as RAID 5 on
the Smart 2 SL RAID controller. SMP is enabled.

FreeBSD uses the ida driver to interact with the RAID controller.

This is the heaviest workload I've put on one of these 1850s so far, and
it started freezing shortly after I was forced to put it in production.
There is a lot of disk I/O since this is a mail server (POP & SMTP), and
the disk is being accessed over NFS as well. Time between freezes ranges
from 15 to 72 hours.

I've set up a lot of debugging to try and find what is going on with the
machine and I had a little more light shed this morning. Let me define
freeze:

- no network response at all
- the display was still updating on the monitor
- Alt-Fn keys would switch virtual consoles, but...
- typing a username at the login prompt and hitting enter just gets the
  enter acknowledged with line feeds.
- mrtg was also registering a huge release of memory right before the
  crash. Free memory normally averages 10-100 MB, but just before the
  freeze mrtg would register all of the memory being freed up.

After the last freeze on Saturday, I set up 2 consoles running commands,
since output appeared to keep reaching the monitor after the machine died
to everything else. One was running top -ores (since mrtg was pointing at
memory) and the other was running systat -vm 5. When it froze today, there
were about 15 processes in State: inode running at priority -14. They
weren't all
sendmail either. snmpd, radiator (just for accounting), and sendmail were
running at -14. The top process and other processes were still running but
all services had died again and I couldn't pull it out of the coma
without hitting the power button...again.
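
To get that same picture without staring at top, a small script could tally
how many processes sit on each wait channel. The following is only a
minimal sketch: it assumes the text comes from something like
`ps -axo pid,pri,wchan,command`, the function name and the sample listing
are invented for illustration (not captured from the real machine), and the
priority that ps prints may not be the same number top shows.

```python
from collections import Counter


def tally_wait_channels(ps_output):
    """Count processes per (priority, wait channel) pair.

    Expects text shaped like `ps -axo pid,pri,wchan,command` output:
    one header line, then one process per line.
    """
    counts = Counter()
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header line
        # Split into at most 4 fields so the command keeps its spaces.
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # ignore short or malformed lines
        _pid, pri, wchan, _command = fields
        counts[(pri, wchan)] += 1
    return counts


# Invented sample data, shaped like the situation described above.
SAMPLE = """\
  PID PRI WCHAN  COMMAND
  101 -14 inode  sendmail: accepting connections
  102 -14 inode  snmpd
  103   2 select radiator
  104 -14 inode  sendmail: server relay
"""

if __name__ == "__main__":
    for (pri, wchan), n in tally_wait_channels(SAMPLE).most_common():
        print(f"{n:3d}  pri={pri:>4}  wchan={wchan}")
```

A sudden pile-up on a single wait channel such as inode would then show up
as one large count instead of 15 separate lines to read off the screen.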

None of the logs record anything out of the ordinary. The machine goes
from normal operation to freeze too fast to record the problem. Also, if
it is a disk access problem, then that would explain why my logs don't
have anything.
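
One way around that might be to push periodic snapshots to a second
machine, so whatever was sent just before the freeze survives even if the
local disk stops taking writes. This is a rough sketch under assumptions,
not something I have running: the address and port are placeholders, the
function name is made up, and a plain UDP listener is assumed on the other
end. Since the network dies at the freeze too, only data sent beforehand
would get through.

```python
import socket
import time

# Placeholder destination (documentation address range); not a real host.
LOG_HOST = ("192.0.2.10", 5140)
MAX_DATAGRAM = 1400  # stay under a typical Ethernet MTU


def send_snapshot(sock, addr, text):
    """Send one timestamped snapshot as a single UDP datagram.

    Returns the number of bytes handed to the socket.
    """
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    payload = f"{stamp} {text}".encode("utf-8", "replace")[:MAX_DATAGRAM]
    return sock.sendto(payload, addr)


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # In practice the text would be a short summary of ps or vmstat
        # output, sent every few seconds from a loop.
        send_snapshot(sock, LOG_HOST, "15 procs waiting on inode")
    finally:
        sock.close()
```

UDP is used here because a send does not wait on the receiver and never
touches the local disk, so the logger itself is less likely to block on
the same problem it is trying to record.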

If I had to put this as questions...:

- What do I do to keep the freezes from happening?
- What can I do to record more information to find out what specifically
  is causing the freeze? (or is this enough information and I just don't
  know the answer?)
- Has anyone else put the Smart 2 SL or the 1850Rs through some heavy
  lifting on 4.8?

I'm going to do some research on the Smart 2 SL and see if there are any
updates that the SmartStart CD might have put an SEP field around and I'm
going to try to find drop in replacement controller prices.

The disks are brand new from newegg so I don't suspect them yet.

Thanks for any help, suggestions, pointers, or assistance in advance,

Gerald

P.S. First post to the FreeBSD lists. Go easy on me.



More information about the freebsd-stable mailing list