Big problem still remains with 7.2-STABLE locking up
NAKAJI Hiroyuki
nakaji at jp.freebsd.org
Sat Jun 6 14:33:36 UTC 2009
Hi,
Some months ago I noticed frequent lockups on my RELENG_6 server (ECS
PM800-M2, Celeron 2.6GHz (UP), 2GB RAM, ATA HDDs and 3Com NIC (xl0)),
and eventually I gave up on that old server.
Last month, I replaced this 'unstable' server with a new one running
7.2-RELEASE, which worked very well until I set it up as 'a server'. The
problem began just after it started 'the services'.
My story is very similar to Pete's.
http://lists.freebsd.org/pipermail/freebsd-stable/2009-January/047487.html
I followed some of the instructions in that thread, but unfortunately the
big problem still remains: the 7.2-STABLE server locks up frequently.
Help! :-(
The server is an NEC Express5800 S70/SD.
o CPU: Intel(R) Celeron(R) CPU 440 @ 2.00GHz (2280.25-MHz K8-class CPU)
o 6GB RAM
o ACPI APIC Table: <NEC DT000020>
o 80GB and 250GB SATA HDDs
o http://www.heimat.gr.jp/~nakaji/localhost/dmesg.boot
The kernel configuration is:
include GENERIC
ident HEIMAT
options MSGBUF_SIZE=81920
makeoptions DEBUG=-g
options KDB
options DDB
options BREAK_TO_DEBUGGER
options QUOTA
options DEVICE_POLLING
options HZ=1000
options SW_WATCHDOG
options DEBUG_VFS_LOCKS
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options WITNESS_SKIPSPIN
options LOCK_PROFILING
This server runs as a web server, NFS server, DHCP server, NTP server,
mail server with spam checks, ML server, Usenet server and so on. The
following services have "_enable" lines in /etc/rc.conf*.
o ntpdate
o ntpd
o nfs_server
o sshd
o inetd
o named
o sendmail
o rtadvd
o watchdogd
o dhcpd
o snmpd
o apache22
o samba
o zope29
o zope210
o amavisd
o amavisd_milter
o cvsupd
o ntop
o compat6x
o munin_node
o spamd
o spamass_milter
o smartd
o mailman
o sshblock
o innd
o skkserv
From munin's graphs, the 'resets' value in netstat is increasing on this
server, while on my other 'desktops' it remains zero. I have not found
out whether there is an exact threshold for 'resets', but when the value
reaches 0.8 - 1.2 the server locks up. When that happens there is no
ping response, no message on the console, no keyboard response, and, of
course, Ctrl-Alt-Esc does not work. I wonder why netstat's 'resets'
value keeps increasing.
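A rough sketch of how the same counter could be watched by hand,
without munin. It is only an illustration: the helper name sum_counter
is made up, and the pattern 'reset' is a guess at which line of the
"netstat -s" output munin's plugin reads. It also assumes that the
0.8 - 1.2 figure on the graph is a per-second rate.

```sh
#!/bin/sh
# sum_counter PATTERN
# Reads "netstat -s" style text on stdin and adds up the leading number
# of every line that matches PATTERN. Prints 0 when nothing matches.
# (Hypothetical helper; the pattern you need depends on your netstat.)
sum_counter() {
    awk -v pat="$1" '$0 ~ pat { total += $1 } END { print total + 0 }'
}

# Usage sketch (commented out so the file is safe to source anywhere):
#   a=$(netstat -s -p tcp | sum_counter 'reset'); sleep 60
#   b=$(netstat -s -p tcp | sum_counter 'reset')
#   echo "$a $b" | awk '{ printf "%.2f resets/s\n", ($2 - $1) / 60 }'
```

Sampling the counter twice and dividing the difference by the interval
gives a rate that can be compared with the value on the munin graph.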
I learned a workaround from other Japanese users: if the box has an ICH
chipset, enabling ichwd and running watchdogd can reboot the box when it
locks up. Sure enough, about 4 hours later, the box rebooted itself
while I was in bed last night. So watchdogd works very well.
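For reference, a minimal sketch of the settings involved in this
workaround. It assumes ichwd is loaded as a kernel module at boot
rather than compiled into the kernel, and it leaves watchdogd at its
default timeout. The exact lines used on this server are not quoted
here, so treat it as an example only.

```sh
# /boot/loader.conf
# Load the Intel ICH hardware watchdog driver at boot
# (assumption: module, not "device ichwd" in the kernel config).
ichwd_load="YES"

# /etc/rc.conf
# Start watchdogd so that it keeps resetting the watchdog timer;
# if the box locks up, the timer expires and the hardware reboots it.
watchdogd_enable="YES"
```

This only recovers the box after a lockup. It does not explain or fix
the lockup itself.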
Advice? Thanks.
--
NAKAJI Hiroyuki