ad8: TIMEOUT - WRITE_DMA errors UFS 7.0-RC1

Jeremy Chadwick koitsu at FreeBSD.org
Sat Jan 26 19:58:15 PST 2008


On Sat, Jan 26, 2008 at 04:28:29PM -0700, Joe Peterson wrote:
> Remco van Bekkum wrote:
> > Same here. On an amd64 system with 1x sata disk (Western Digital Caviar
> > Green Power) on an amd690G chipset, with UFS and intensive disk activity
> > the system hangs and in the end it may panic. I've csupped today and
> > rebuilt world & generic kernel, but it's still very unstable; sometimes it
> > even hangs when activating geom volumes at boot time... 
> > I must add that this is a new system so I'm not 100% sure the hardware is sane.
> > Using ZFS it also crashed when doing intensive I/O.
> 
> This is very interesting.  It seems there are several of us who are
> experiencing something that *looks* like hardware (disk) issues when using 7.0.

We need Soren Schmidt and/or Xin Li to help with this situation.  I
really don't know what we can provide (other than hardware, which I am
more than happy to donate).  In my case, I was able to let the machine
remain broken for 15 minutes or so, and it eventually panic'd.  Of
course due to PR 118255, it's becoming difficult to get a coredump.

> Could this be related to the mouse freeze issue?  Could some process be
> locking/grabbing the CPU at inopportune times and causing not only the
> freezing symptoms but also read/write problems?

I don't use a mouse on my systems, but what you've described is
possible.  I'm guessing some sort of loop in the kernel (or a driver)
which holds the system down for too long.

> If this is widespread, I think the chances are slim that it is a
> hardware problem in every case.

I'm in definite agreement here.  I think it might be worthwhile to note
what hardware we're all using, in case there's something similar between
our systems (chipset, disk vendor, etc.).

My system is as follows; timeouts were reported during an rsync of data
from the ZFS stripe (ad8+ad10) to a UFS2 filesystem on ad6.  System
eventually panic'd after remaining deadlocked (while kernel messages
about timeouts kept printing on the console for ad6 only) for 10-15
minutes.

*   MB: Supermicro PDSMI+  (Intel ICH7-based)
*  CPU: Intel Core 2 Duo E6600
*  RAM: Corsair CM2X1024-6400 DDR2, 2GB
*  ad4: WD Caviar SE WD2000JD (boot/OS)
*  ad6: Seagate Barracuda 7200.10 ST3500630AS
*  ad8: WD Caviar SE16 WD5000AAKS (ZFS stripe)
* ad10: WD Caviar SE16 WD5000AAKS (ZFS stripe)
* All drives are hooked up to the ICH7.
* SMART stats showed no problems on any of the drives before or after.
* RELENG_7, i386, ULE scheduler.
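
For anyone wanting to compare which devices and commands are affected
across systems, here is a rough sketch that tallies timeout messages per
device.  It assumes the kernel log lines contain the same
"adN: TIMEOUT - COMMAND" prefix seen in the subject of this thread and
ignores whatever follows it.  The function name count_timeouts and the
log path mentioned in the comment are illustrative assumptions only.

```python
import re
from collections import Counter

# Matches the prefix seen in this thread's subject line, e.g.
# "ad8: TIMEOUT - WRITE_DMA". Anything after the command name is ignored.
TIMEOUT_RE = re.compile(r"\b(ad\d+): TIMEOUT - (\w+)")


def count_timeouts(lines):
    """Return a Counter keyed by (device, command) for each timeout line.

    `lines` is any iterable of strings, for example an open log file.
    Which file holds the kernel messages is an assumption
    (commonly /var/log/messages); adjust for your system.
    """
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.groups()] += 1
    return counts
```

Feeding it the saved console or log output from each machine would show
at a glance whether the timeouts stick to one device (as with ad6 here)
or are spread across several.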

-- 
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |



More information about the freebsd-stable mailing list