"ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

Remco van Bekkum remco at spacemarines.us
Mon Feb 11 12:11:35 UTC 2008


On Mon, Feb 11, 2008 at 01:00:57PM +0100, Remco van Bekkum wrote:
> On Fri, Jan 25, 2008 at 04:38:46PM -0800, Jeremy Chadwick wrote:
> > Joe, I wanted to send you a note about something that I'm still in the
> > process of dealing with.  The timing couldn't be more ironic.
> > 
> > I decided it would be worthwhile to migrate from my two-disk ZFS stripe
> > with a non-ZFS disk for nightly backups, to a RAIDZ pool of all 3
> > disks combined (since they're all the same size).  I had another
> > terminal with gstat -I500ms running in it, so I could see overall I/O.
> > 
> > All was going well until about the 81GB mark of the copy.  gstat started
> > showing 0KB in/out on all the drives, and the rsync was stalled.  ^Z did
> > nothing, which is usually a bad sign.  :-)  I ssh'd in and did a dmesg
> > (summarised):
> > 
> > ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> > ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> > ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951071
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951327
> > ad6: FAILURE - WRITE_DMA timed out LBA=13951071
> > ad6: FAILURE - WRITE_DMA timed out LBA=13951327
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951583
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951839
> > ad6: FAILURE - WRITE_DMA timed out LBA=13951583
> > ad6: FAILURE - WRITE_DMA timed out LBA=13951839
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952095
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952351
> > g_vfs_done():ad6s1d[WRITE(offset=7142916096, length=131072)]error = 5
> > g_vfs_done():ad6s1d[WRITE(offset=7143047168, length=131072)]error = 5
> > g_vfs_done():ad6s1d[WRITE(offset=7143178240, length=131072)]error = 5
> > g_vfs_done():ad6s1d[WRITE(offset=7143309312, length=131072)]error = 5
> > g_vfs_done():ad6s1d[WRITE(offset=7143440384, length=131072)]error = 5
> > 
> > It appears my /dev/ad6 (a Seagate -- more irony) must have some bad
> > blocks.  Actually, after letting things go for a while, I realised the
> > box just locked up.  Probably kernel panic'd due to the I/O problem.
> > I'll have to poke at SMART stats later to see what showed up.
> > 
> > -- 
> > | Jeremy Chadwick                                    jdc at parodius.com |
> > | Parodius Networking                           http://www.parodius.com/ |
> > | UNIX Systems Administrator                      Mountain View, CA, USA |
> > | Making life hard for others since 1977.                  PGP: 4BD6C0CB |
> > 
> > _______________________________________________
> > freebsd-stable at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> > To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
> 
> Hi all,
> 
> After replacing my first SATA disk with one of the same type and
> still getting the same errors, I replaced this 1TB drive with 4x500GB
> Hitachi P7K500 in raidz. It worked fine for a week, but yesterday I
> cvsupped and rebuilt world. This afternoon everything is breaking down
> again with the same errors:
> 
> Feb 11 12:34:09 xaero kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Feb 11 12:34:13 xaero kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Feb 11 12:34:17 xaero kernel: ad6: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> Feb 11 12:34:21 xaero kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> Feb 11 12:34:25 xaero kernel: ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Feb 11 12:34:25 xaero kernel: ad6: FAILURE - WRITE_DMA48 timed out LBA=298014274
> 
> Feb 11 12:34:29 xaero kernel: ad8: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Feb 11 12:34:33 xaero kernel: ad8: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Feb 11 12:34:37 xaero kernel: ad8: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> Feb 11 12:34:41 xaero kernel: ad8: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> Feb 11 12:34:45 xaero kernel: ad8: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Feb 11 12:34:45 xaero kernel: ad8: FAILURE - WRITE_DMA48 timed out LBA=298013590
> 
> So of 6 new disks, 4 show the same errors. It seems quite safe, then,
> not to blame the disks, imho. I've tested the second drive in another
> machine, but still got these timeout errors. What's wrong here?
> It's on an amd64, Asus m2a-vm with ati xp600, AMD BE-2350 CPU, 2GB
> 800MHz RAM.
> 
> Regards,
> 
> Remco

Sorry, that should have been the ATI IXP SB600 chipset, not "xp600".
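
For anyone who wants to compare logs across machines, here is a rough
Python sketch that tallies these ATA messages per device. It assumes
the kernel message format quoted above ("adN: WARNING|TIMEOUT|FAILURE - ...");
the function name tally_ata_events and the regular expression are
illustrative only and not part of any FreeBSD tool.

```python
import re
import sys
from collections import Counter

# Matches e.g. "ad6: FAILURE - WRITE_DMA48 timed out" and captures the
# device name and the severity word. The pattern is an assumption based
# only on the messages quoted in this thread.
ATA_EVENT = re.compile(r"\b(ad\d+): (WARNING|TIMEOUT|FAILURE) - ")


def tally_ata_events(log_text):
    """Count ATA kernel messages per (device, severity) pair."""
    # findall() returns one (device, severity) tuple per match, so
    # mail-wrapped log lines are still counted once each.
    return Counter(ATA_EVENT.findall(log_text))


if __name__ == "__main__":
    # Read a log from stdin and print one line per (device, severity).
    counts = tally_ata_events(sys.stdin.read())
    for (device, severity), n in sorted(counts.items()):
        print(f"{device} {severity} {n}")
```

It can be fed a saved log on stdin, for example
"python tally.py < /var/log/messages" (the script name is hypothetical).
If several devices on the same controller show similar counts, that
points away from a single bad disk.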

Remco
