System hangs up every day

Jeremy Chadwick koitsu at FreeBSD.org
Mon Jan 21 23:19:37 PST 2008


On Tue, Jan 22, 2008 at 04:01:56AM +0300, A. Rymkus wrote:
> Tuesday, December 18, 2007, 11:42:19 AM, you wrote:
> > Unfortunately my problem still doesn't have a solution.
> > But I have an interesting observation: the gateway freezes very
> > quickly if torrent client programs are running on workstations.
> > I assume the cause of the problem is the large number of
> > TCP/IP connections that a torrent client establishes.
> > Any ideas?
> > Maybe I can somehow tune TCP/IP via kernel, sysctl or pf settings?
> 
> >> There is one FreeBSD server in our company. The server platform is: Supermicro SuperServer 6014V-T2B (2x Intel Xeon 2.8, 1Gb RAM, 3WARE 3W-8006-2LP RAID-Controller).
> >> The server works as:
> >> - a gateway between LAN and Internet
> >> - an Intranet web- and database server (Apache + MySQL + PHP)
> >> - a firewall (OpenBSD pf)
> >> - a transparent proxy server (Squid)
> >> Monthly traffic through this server is about 100Gb. There are about 200 internet users in our company.
> >> FreeBSD 6.2-RELEASE-p8
> 
> >> This server hangs every day without any messages in the log files or on the system console. The keyboard doesn't work either. I can only do a hard reset, and after the restart no coredump files appear.
> 
> >> If I build and install a kernel with SMP options, the system under working load begins to hang every two hours.
> 
> >> Two days of "Memtest" gave no result.
> >> I tried installing the newest Intel ethernet adapter driver, but without any results.
> >> As an experiment I also tried moving the system HDD to another server platform (SuperServer 6015V-TB), but the hangs didn't stop.
> >> I think that it is not only a hardware problem.
> >> Linux (Gentoo) and Windows Server 2003 were working fine on this hardware.
> 
>   Got the same problem on 6.2 running under VMware ESX 3.0, with both
> provided adapter types - lnc & em. The system hangs one adapter first, then hangs completely.
>   Can this problem be solved by cvsup'ing and rebuilding the whole world, or
> do I have to wait for 7.0-RELEASE? ;)

The interesting thing about these reported problems is that I cannot
reproduce them on any of our boxes.  Here's the hardware, and the
network traffic that occurs on each of them over a one-month period:

* Supermicro SuperServer 5015M-T
    - Single E6420 CPU (dual core), 2GB RAM (non-ECC)
    - RELENG_7, SMP enabled, using ULE scheduler
    - apache 2.2, mysql 5, PHP 5, postfix
    - load is usually 0.40 (mainly httpd and mysqld)
    - em0: 10gbit/month, em1: 8gbit/month

* Supermicro SuperServer 5015M-T
    - Single E6420 CPU (dual core), 2GB RAM (non-ECC)
    - RELENG_6, SMP enabled, using 4BSD scheduler
    - mysql 5, ntpd, bind, postfix, and ircd-ratbox
    - load is usually 0.09 (mainly mysqld)
    - em0: 7gbit/month, em1: 42gbit/month

* Supermicro SuperServer 5014C-MR
    - Single Pentium 4 CPU, 3GB RAM (non-ECC)
    - RELENG_6, SMP disabled, using 4BSD scheduler
    - apache 2.2, PHP 5, bind, postfix, shell services
    - load is usually 0.08
    - bge0: 30gbit/month, bge1: 32gbit/month

* Supermicro SuperServer 5013C-T
    - Single Pentium 4 CPU, 1GB RAM (non-ECC)
    - RELENG_6, SMP disabled, using 4BSD scheduler
    - bind, postfix, miscellaneous minor stuff
    - load is usually 0.03
    - em0: 5gbit/month, em1: 41gbit/month

* Supermicro SuperServer 5010E
    - Single Pentium 3 CPU, 512MB RAM (non-ECC)
    - RELENG_6, SMP disabled, using 4BSD scheduler
    - apache 2.0, mysql 5, bind, sendmail
    - load is usually 0.05
    - fxp0: 2gbit/month, fxp1: 5.1gbit/month

All machines have pf(4) enabled and in use, but are not routing traffic
as a gateway; they simply provide content and other services.
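For a sense of scale, the monthly totals above can be turned into average
line rates.  This is only a rough sketch: it assumes "gbit" means 10**9
bits and that a month is 30 days, neither of which is stated above, and
the function and dictionary names are purely illustrative.

```python
# Rough conversion of a monthly traffic total to an average line rate.
# Assumptions (not from the figures above): "gbit" means 10**9 bits,
# and a month is 30 days long.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def avg_rate_kbit_s(gbit_per_month: float) -> float:
    """Return the average rate in kbit/s for a monthly total in gigabits."""
    return gbit_per_month * 1e9 / SECONDS_PER_MONTH / 1e3


# Per-interface monthly totals as listed above, in gbit/month.
monthly_gbit = {
    "5015M-T (RELENG_7)": {"em0": 10, "em1": 8},
    "5015M-T (RELENG_6)": {"em0": 7, "em1": 42},
    "5014C-MR": {"bge0": 30, "bge1": 32},
    "5013C-T": {"em0": 5, "em1": 41},
    "5010E": {"fxp0": 2, "fxp1": 5.1},
}

for box, nics in monthly_gbit.items():
    for nic, gbit in nics.items():
        print(f"{box} {nic}: {avg_rate_kbit_s(gbit):.1f} kbit/s average")
```

An average like this hides bursts and says nothing about the number of
concurrent connections, so it is only a coarse way to compare these boxes
against a gateway handling torrent traffic.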

Based on what you've described, I'm left thinking there could be some
BIOS-related setting which is tickling a bug under FreeBSD, or something
along those lines.

By the way, hard-resetting the box will not cause a kernel panic, thus
there will be no coredumps to examine.  Additionally, there's a
chicken-and-egg situation which is causing savecore(8) to not create any
coredumps from panics anyways -- see PR 118255 for details of that.

-- 
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |



More information about the freebsd-stable mailing list