ad8: TIMEOUT - WRITE_DMA errors UFS 7.0-RC1
Jeremy Chadwick
koitsu at FreeBSD.org
Sat Jan 26 19:58:15 PST 2008
On Sat, Jan 26, 2008 at 04:28:29PM -0700, Joe Peterson wrote:
> Remco van Bekkum wrote:
> > Same here. On an amd64 system with 1x sata disk (Western Digital Caviar
> > Green Power) on an amd690G chipset, with UFS and intensive disk activity
> > the system hangs and in the end it may panic. I've csupped today and
> > rebuilt world & generic kernel, but it's still very unstable; sometimes it
> > even hangs when activating geom volumes at boot time...
> > I must add that this is a new system so I'm not 100% sure the hardware is sane.
> > Using ZFS it also crashed when doing intensive I/O.
>
> This is very interesting. It seems there are several of us who are
> experiencing something that *looks* like hardware (disk) issues when using 7.0.
We need Soren Schmidt and/or Xin Li to help with this situation. I
really don't know what we can provide (other than hardware, which I am
more than happy to donate). In my case, I was able to let the machine
remain broken for 15 minutes or so, and it eventually panic'd. Of
course due to PR 118255, it's becoming difficult to get a coredump.
> Could this be related to the mouse freeze issue? Could some process be
> locking/grabbing the CPU at inopportune times and causing not only the
> freezing symptoms but also reads/writes problems?
I don't use a mouse on my systems, but what you've described is
possible. I'm guessing some sort of loop in the kernel (or a driver)
which holds the system down for too long.
> If this is widespread, I think the chances are slim that it is a
> hardware problem in every case.
I'm in definite agreement here. I think it might be worthwhile to note
what hardware we're all using, in case there's something similar between
our systems (chipset, disk vendor, etc.).
My system is as follows; timeouts were reported during an rsync of data
from the ZFS stripe (ad8+ad10) to a UFS2 filesystem on ad6. System
eventually panic'd after remaining deadlocked (while kernel messages
about timeouts kept printing on the console for ad6 only) for 10-15
minutes.
* MB: Supermicro PDSMI+ (Intel ICH7-based)
* CPU: Intel Core 2 Duo E6600
* RAM: Corsair CM2X1024-6400 DDR2, 2GB
* ad4: WD Caviar SE WD2000JD (boot/OS)
* ad6: Seagate Barracuda 7200.10 ST3500630AS
* ad8: WD Caviar SE16 WD5000AAKS (ZFS stripe)
* ad10: WD Caviar SE16 WD5000AAKS (ZFS stripe)
* All drives are hooked up to the ICH7.
* SMART stats showed no problems on any of the drives before or after.
* RELENG_7, i386, ULE scheduler.
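
For anyone comparing notes, here is a minimal sketch (Python) of one way to tally timeout messages per device from saved console or log output, so reports from different systems can be compared. The regular expression assumes each message starts like the subject line of this thread ("ad8: TIMEOUT - WRITE_DMA"); the rest of the line is not parsed, and the function name count_timeouts is only illustrative.

```python
import re
from collections import Counter

# Assumed message shape, taken from the subject line of this thread:
#   "ad8: TIMEOUT - WRITE_DMA ..."
# Anything after the command name is ignored.
TIMEOUT_RE = re.compile(r"\b(ad\d+): TIMEOUT - (\w+)")


def count_timeouts(log_text):
    """Return a Counter keyed by (device, command) for each timeout line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Passing it the text of a saved log and printing the resulting counts would show at a glance whether the timeouts are confined to one disk (as with ad6 here) or spread across several.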
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |