zfs i/o error, no driver error

Mon Jun 7 17:11:47 UTC 2010

On Mon, 7 Jun 2010, Jeremy Chadwick wrote:

> rubbish.  "Datacenter-quality drives?"  Oh, I think they mean
> "enterprise-grade drives", which really don't offer much more than
> high-end consumer-grade drives at this point in time[2].  One of the key
> points of ZFS's creation was to provide a reliable filesystem using
> cheap disks[3][4].

There are differences between disks.  High-grade enterprise disks 
offer uncorrected error rates at least an order of magnitude better 
than typical tier-2 "SATA" disks and sometimes two orders of magnitude 
better than a cheap maximum-density drive.  Yes, there are tier-2 
drives that come with SAS interfaces, but you can immediately 
recognize them because they offer high storage capacities at 
more reasonable prices.
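
To put those orders of magnitude in perspective, here is a rough 
sketch of the arithmetic.  The drive size and the three error rates 
are assumed, datasheet-style figures chosen only to be one and two 
orders of magnitude apart; they are not numbers taken from this 
thread or from any particular drive.

```python
# Illustrative arithmetic only.  DISK_BYTES and ASSUMED_RATES are
# assumptions made for this sketch, not measured or quoted values.
DISK_BYTES = 2 * 10**12        # assumed 2 TB drive
BITS_READ = DISK_BYTES * 8     # bits transferred by one full read

# Assumed uncorrected error rates, in errors per bit read.
ASSUMED_RATES = {
    "cheap maximum-density drive": 1e-14,
    "typical tier-2 SATA drive": 1e-15,
    "high-grade enterprise drive": 1e-16,
}


def expected_errors(bits_read, errors_per_bit):
    """Expected number of uncorrected errors for the given amount read."""
    return bits_read * errors_per_bit


for name, rate in ASSUMED_RATES.items():
    errors = expected_errors(BITS_READ, rate)
    print(f"{name}: {errors:.4f} expected errors per full read")
```

Under these assumptions the expected count per full read drops by a 
factor of ten at each step, which is the whole difference being 
argued about.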

> What's confusing about this is the phrase that pool verification is done
> by "verifying all the blocks can be read".  Doesn't that happen when a
> standard read operation comes down the pipe for a file?  What I'm

No.  A standard read does not verify that all data and metadata can be 
read.  Only one copy of the data and metadata is read and there may be 
several such copies.  Metadata is always stored multiple times, even 
if the vdev does not offer additional redundancy.
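
The difference can be shown with a toy model.  This is only a sketch: 
the Block class and the helper functions are hypothetical, CRC32 
stands in for the real checksum, and how an actual read chooses among 
copies is not modeled.

```python
import zlib


class Block:
    """Toy block: several stored copies plus the checksum of the good data."""

    def __init__(self, data, copies):
        self.checksum = zlib.crc32(data)   # CRC32 is a stand-in checksum
        self.copies = [bytes(data) for _ in range(copies)]


def copy_ok(block, index):
    """True if the copy at this index still matches the stored checksum."""
    return zlib.crc32(block.copies[index]) == block.checksum


def normal_read(block):
    """Return the first copy that verifies; later copies are never examined."""
    for i in range(len(block.copies)):
        if copy_ok(block, i):
            return block.copies[i]
    raise IOError("no readable copy")


def scrub(block):
    """Read and verify every copy; return the indexes of the bad ones."""
    return [i for i in range(len(block.copies)) if not copy_ok(block, i)]


meta = Block(b"metadata", copies=2)
meta.copies[1] = b"metadatX"    # damage the second copy only

print(normal_read(meta))        # succeeds from the first copy
print(scrub(meta))              # reports the damaged second copy
```

In this model the ordinary read succeeds and never notices the 
damaged copy, while the scrub visits every copy and reports it.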

> The topic of scrub intervals was also brought up a month later[7].
> Someone said:
>
> "We did a study on re-write scrubs which showed that once per year was a
> good interval for modern, enterprise-class disks.  However, ZFS does a
> read-only scrub, so you might want to scrub more often".

The concept of "bit rot" on modern disk drives is largely unproven.  The 
magnetism will surely last 1000+ years so the issue is mostly with 
stability of the media material and the heads.  The idea that scrub 
should re-write the data assumes that magnetic hysteresis is lost over 
time.  This is all very silly for a device with an expected service 
life of 5 years.  It is much more likely for the drive heads to lose 
their function or for a mechanical defect to appear.

Given the above, it makes sense to scrub more often on pools which see 
a lot of writes (to verify the recently written data), and less often 
on pools which are rarely updated.  More levels of redundancy 
diminish the value of the scrub.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/