Forcing full file read in ZFS even when checksum error encountered

Tue Feb 5 19:39:24 UTC 2008

On Tue, Feb 05, 2008 at 06:22:30PM +0100, Dag-Erling Smørgrav wrote:
> Joe Peterson <joe at skyrush.com> writes:
> > Dag-Erling Smørgrav <des at des.no> writes:
> > > There is no way to "read the bad data" since an unrecoverable
> > > checksum error means that ZFS has no idea which of the multiple
> > > versions of the affected block is the right one.
> > Nope, no mirror, no RAIDZ - just one partition.  But as far as I know, there
> > were no read errors, just a checksum error.
> 
> A checksum error results from a read error.  Check your drive's SMART
> error log if it has one.  It might not be detectable in a surface scan,
> as the damaged sector will be automatically reassigned if it's written
> to (which ZFS may very well have done).

Joe's drive is in decent shape, and SMART shows absolutely no sign of
temporary or even transient error.  The thread regarding how all of that
was done, and the results, is over on -stable:

http://lists.freebsd.org/pipermail/freebsd-stable/2008-January/039983.html

Worth noting is that Joe's situation is somewhat similar to one which
happened to me during the course of our conversation (totally by
chance!).  Joe was able to get ZFS to increment CKSUM counts, while in
my case ZFS scrubbing has never once incremented R/W/CK counters -- even
after the incident.

In my case, SMART showed no problems, the cabling was absolutely fine,
and the controller stable and reliable (ICH7).  I'm pretty familiar with
diagnosing disk-level problems (I deal with it at work on a near-daily
basis, and that's with SCSI), so I feel confident stating that our
problems were *not* caused by flaky hardware.  Joe and I have different
hardware, though we both saw the issue with Seagate disks.

The timing of the discussion became even more interesting when, a
day or so later, someone posted about a reproducible deadlock when copying
data from a ZFS pool to a UFS filesystem -- which is exactly what I was
doing when my crash happened.  We don't know if that individual saw any
DMA errors or timeouts in the ATA subsystem before the panic, though:

http://lists.freebsd.org/pipermail/freebsd-stable/2008-January/040047.html

I realise I classify pretty much everything as "priority 0" (of utmost
importance), but there doesn't appear to be any indication of anyone
caring about this problem.  I want to do more to help, but I don't know
what I *can* do, besides offer to buy people identical hardware for
attempted reproduction, or put up a test box with serial console access
for developers to beat up on.

If we need some central point or archive for all the issues of this type
seen with ZFS, I would be more than happy to take on that task.

-- 
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |