5.4-RC2 freezing - ATA related?

On Mon, May 16, 2005 at 06:40:01AM -0600, Elliot Finley wrote:
> This has been happening since 5.3-R.  I've been tuning different parameters
> to no avail.  I've taken the disks off of the onboard ICH5 controller and
> put them on a Promise TX4 S150 controller, but still the same thing happens.
> The system freezes, but isn't totally dead.  It'll still respond to pings,
> the screensaver still functions, but it won't respond to a CAD at the
> console.  But if I press 'Enter' at the console, it'll give me a 'login:'
> prompt, but after entering the username, it never comes back with the
> 'password:' prompt.
> After manually resetting the system it boots and says 'Automatic file system
> check failed; help!' and drops into single user mode.  Running fsck manually
> corrects errors on all volumes.  Then it'll boot from that point.
> This seems to be triggered by daily periodic as it happens at 3:02-3:03AM
> each time.  But it doesn't happen *every* morning.
> I suspect a bug in FreeBSD because this mode of failure happens on 3
> different machines, all configured similarly.
> ASUS P4P800
> 2G RAM (though the other affected systems only have 1G)
> 80G Seagate Barracuda SATA drives (one system now on Promise TX4 S150
> controller, others on onboard ICH5)
> On my lightly loaded systems, it happens rarely.  On my mailserver (fairly
> heavy disk load), it happens quite frequently.
> How can I troubleshoot this?

I managed to get a crash dump on our system for a similar problem we are seeing:

[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
#0  doadump () at pcpu.h:160
160             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:160
#1  0xc05131ae in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:410
#2  0xc0513474 in panic (fmt=0xc06c3da5 "%s") at /usr/src/sys/kern/kern_shutdown.c:566
#3  0xc0691e18 in trap_fatal (frame=0xecb4bb34, eva=532) at /usr/src/sys/i386/i386/trap.c:817
#4  0xc0691b73 in trap_pfault (frame=0xecb4bb34, usermode=0, eva=532) at /usr/src/sys/i386/i386/trap.c:735
#5  0xc0691771 in trap (frame=
      {tf_fs = -1068433384, tf_es = -989790192, tf_ds = 16, tf_edi = -1066124736, tf_esi = -1066124736, tf_ebp = -323699844, tf_isp = -323699872, tf_ebx = -1007063716, tf_edx = 528, tf_ecx = -1013235680, tf_eax = 307472464, tf_trapno = 12, tf_err = 2, tf_eip = -1067870386, tf_cs = 8, tf_eflags = 66050, tf_esp = -989760240, tf_ss = -1007063716}) at /usr/src/sys/i386/i386/trap.c:425
#6  0xc068168a in calltrap () at /usr/src/sys/i386/i386/exception.s:140
#7  0xc0510018 in crcopy () at /usr/src/sys/kern/kern_prot.c:1810
#8  0xc0598c77 in in_pcbdetach (inp=0xc0743a40) at /usr/src/sys/netinet/in_pcb.c:720
#9  0xc05b21a6 in tcp_close (tp=0x0) at /usr/src/sys/netinet/tcp_subr.c:783
#10 0xc05ae560 in tcp_input (m=0xc3a6a300, off0=20) at /usr/src/sys/netinet/tcp_input.c:2308
#11 0xc05a5aed in ip_input (m=0xc3a6a300) at /usr/src/sys/netinet/ip_input.c:776
#12 0xc0582f13 in netisr_processqueue (ni=0xc0742498) at /usr/src/sys/net/netisr.c:233
#13 0xc058310a in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:346
#14 0xc04ffa79 in ithread_loop (arg=0xc3481600) at /usr/src/sys/kern/kern_intr.c:547
#15 0xc04fed0c in fork_exit (callout=0xc04ff928 <ithread_loop>, arg=0xc3481600, frame=0xecb4bd38) at /usr/src/sys/kern/kern_fork.c:791
#16 0xc06816ec in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:209
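
The trap frame in frame #5 is printed as signed decimals, which is hard to
compare against the hex addresses in the rest of the trace.  Below is a rough
sketch in Python for converting a few of those fields and decoding the error
code.  The helper names (u32, decode_pf_err) are made up for illustration, and
the bit meanings assume the standard x86 page-fault error code (bit 0 = present,
bit 1 = write, bit 2 = user).  On i386, trap number 12 is a page fault.

```python
# Values copied from frame #5 of the backtrace above.
frame = {
    "tf_eip": -1067870386,
    "tf_ebp": -323699844,
    "tf_esp": -989760240,
    "tf_trapno": 12,
    "tf_err": 2,
}
eva = 532  # faulting virtual address, from trap_pfault()/trap_fatal()


def u32(value):
    """Reinterpret a signed 32-bit integer as an unsigned one."""
    return value & 0xFFFFFFFF


def decode_pf_err(err):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if err & 0x1 else "page not present",
        "access": "write" if err & 0x2 else "read",
        "mode": "user" if err & 0x4 else "supervisor",
    }


if __name__ == "__main__":
    for name in ("tf_eip", "tf_ebp", "tf_esp"):
        print(f"{name} = {u32(frame[name]):#010x}")
    print(f"eva = {eva:#x}")
    print(decode_pf_err(frame["tf_err"]))
```

Read this way, the fault is a supervisor-mode write to a non-present page at
address 0x214.  A fault address that small often means a structure field was
accessed through a NULL pointer, though that is only a guess from the numbers
and not something the dump confirms by itself.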

Help? ;)

