ZFS and DMA read error

Tim Judd tajudd at gmail.com
Mon Aug 31 18:22:27 UTC 2009


On 8/31/09, Mark Stapper <stark at mapper.nl> wrote:
> Good day to you,
>
> I'm having a bit of trouble with one of the disks in my ZFS raidz1 pool.
> It's giving me DMA read errors, and zpool is reporting READ failures.
> However, data integrity is OK :-)
> Unfortunately I was in the middle of rearranging my backup media, so I'm
> backing up everything as we speak.
> I will be testing the failing drive in another computer soon; however,
> before I return it I'd like to know if this could be caused by something
> other than failing hardware.
> Below the output of "zpool status" and a snippet of /var/log/messages
> showing the DMA errors.
> Thanks for the input.
> Greetz,
> Mark
>
>
> pool: data
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
>         attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>         using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         data        ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             ad4     ONLINE       0     0     0
>             ad6     ONLINE      21     0     0
>             ad8     ONLINE       0     0     0
>             ad10    ONLINE       0     0     0
>
> errors: No known data errors
>
> Aug 31 03:04:35 yoshi kernel: ad6: FAILURE - READ_DMA48
> status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=932040832
> Aug 31 03:04:35 yoshi root: ZFS: vdev I/O failure, zpool=data
> path=/dev/ad6 offset=477204905984 size=65536 error=5
> Aug 31 03:04:35 yoshi root: ZFS: vdev I/O failure, zpool=data
> path=/dev/ad6 offset=477204925440 size=2560 error=5
<snip 9 identical messages, based on the uncorrectable LBA error>



Since all of the errors point at the same LBA, I'd run SMART
diagnostics on the drive (I think the port is sysutils/smartmontools)
and see whether it reports errors too.  This looks like a failing or
failed drive, and I would recommend replacing it.  You can try
SpinRite, but I doubt it will help once a drive gets to this point.

SpinRite's website is grc.com
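
As a rough cross-check that the kernel and ZFS messages quoted above
describe the same spot on the disk, the LBA can be converted to a byte
offset and compared with the offsets ZFS logged.  This is only a sketch:
it assumes 512-byte sectors and that the offset ZFS logs is a byte
offset from the start of /dev/ad6.  The helper names are made up for
illustration.

```python
# Assumption: 512-byte logical sectors, and the ZFS "offset" is a byte
# offset from the start of the whole-disk vdev /dev/ad6.
SECTOR_SIZE = 512


def lba_to_byte_offset(lba, sector_size=SECTOR_SIZE):
    """Byte offset of the first byte of sector `lba` (hypothetical helper)."""
    return lba * sector_size


def byte_offset_to_lba(offset, sector_size=SECTOR_SIZE):
    """Sector number that contains byte `offset` (hypothetical helper)."""
    return offset // sector_size


def sector_range(offset, size, sector_size=SECTOR_SIZE):
    """First and last sector touched by an I/O of `size` bytes at `offset`."""
    first = byte_offset_to_lba(offset, sector_size)
    last = byte_offset_to_lba(offset + size - 1, sector_size)
    return first, last


# Values copied from the quoted /var/log/messages lines.
failed_lba = 932040832
zfs_failures = [(477204905984, 65536), (477204925440, 2560)]

print("LBA", failed_lba, "starts at byte", lba_to_byte_offset(failed_lba))
for offset, size in zfs_failures:
    first, last = sector_range(offset, size)
    print("ZFS I/O offset", offset, "size", size, "-> sectors", first, "to", last)
```

Under those assumptions the first ZFS failure starts exactly at the LBA
the kernel reported, and the second one begins 38 sectors further on,
inside the same 64 KB region.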


Hope you have backups or redundancy.  No fun replacing data.


--TJ


More information about the freebsd-questions mailing list