"ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

Jeremy Chadwick koitsu at FreeBSD.org
Fri Jan 25 16:38:46 PST 2008


Joe, I wanted to send you a note about something that I'm still in the
process of dealing with.  The timing couldn't be more ironic.

I decided it would be worthwhile to migrate from my two-disk ZFS stripe
plus a non-ZFS disk for nightly backups, to a RAIDZ pool of all 3
disks combined (since they're all the same size).  I had another
terminal with gstat -I500ms running in it, so I could see overall I/O.

All was going well until about the 81GB mark of the copy.  gstat started
showing 0KB in/out on all the drives, and the rsync had stalled.  ^Z did
nothing, which is usually a bad sign.  :-)  I ssh'd in and ran dmesg
(summarised):

ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951071
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951327
ad6: FAILURE - WRITE_DMA timed out LBA=13951071
ad6: FAILURE - WRITE_DMA timed out LBA=13951327
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951583
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951839
ad6: FAILURE - WRITE_DMA timed out LBA=13951583
ad6: FAILURE - WRITE_DMA timed out LBA=13951839
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952095
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952351
g_vfs_done():ad6s1d[WRITE(offset=7142916096, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143047168, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143178240, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143309312, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143440384, length=131072)]error = 5
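
The numbers in the output above can be cross-checked against each other.
What follows is only a rough sketch in Python: it assumes 512-byte
sectors (the log does not state the sector size), and the helper names
are made up for illustration.  It pulls the LBAs and the g_vfs_done
offsets out of the log text and compares their spacing.

```python
import re

# Assumption: 512-byte sectors; the dmesg output does not say so explicitly.
SECTOR_BYTES = 512

# Excerpt of the dmesg output quoted above (one line per distinct LBA/offset).
DMESG = """\
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951071
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951327
ad6: FAILURE - WRITE_DMA timed out LBA=13951071
ad6: FAILURE - WRITE_DMA timed out LBA=13951327
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951583
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951839
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952095
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952351
g_vfs_done():ad6s1d[WRITE(offset=7142916096, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143047168, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143178240, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143309312, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143440384, length=131072)]error = 5
"""


def failed_lbas(log):
    """Return the sorted, de-duplicated LBAs named in the ATA lines."""
    return sorted({int(n) for n in re.findall(r"LBA=(\d+)", log)})


def vfs_writes(log):
    """Return (offset, length) pairs from the g_vfs_done WRITE errors."""
    pattern = r"WRITE\(offset=(\d+), length=(\d+)\)"
    return [(int(off), int(length)) for off, length in re.findall(pattern, log)]


lbas = failed_lbas(DMESG)
writes = vfs_writes(DMESG)

# Spacing between consecutive failing LBAs, in sectors.
lba_steps = {b - a for a, b in zip(lbas, lbas[1:])}
# Spacing between consecutive failing filesystem writes, in bytes.
offset_steps = {b[0] - a[0] for a, b in zip(writes, writes[1:])}
# Difference between the first LBA and the first offset expressed in sectors.
sector_gap = lbas[0] - writes[0][0] // SECTOR_BYTES

print("LBA step (sectors):", lba_steps)
print("LBA step (bytes):  ", {s * SECTOR_BYTES for s in lba_steps})
print("offset step (bytes):", offset_steps)
print("first LBA minus first offset (sectors):", sector_gap)
```

Under that assumption the failing LBAs advance by 256 sectors, which is
131072 bytes and matches both the length and the spacing of the
g_vfs_done writes.  The first offset sits 63 sectors below the first
LBA, which would be consistent with ad6s1d starting at sector 63 of the
disk, though the log itself does not confirm that layout.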

It appears my /dev/ad6 (a Seagate -- more irony) must have some bad
blocks.  Actually, after letting things go for a while, I realised the
box had locked up entirely.  It probably kernel panic'd due to the I/O
problem.  I'll have to poke at SMART stats later to see what showed up.

-- 
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |



More information about the freebsd-stable mailing list