add BIO_NORETRY flag, implement support in ata_da, use in ZFS vdev_geom

Andriy Gapon avg at FreeBSD.org
Tue Dec 12 16:20:13 UTC 2017


On 25/11/2017 12:54, Scott Long wrote:
> Why is overloading EIO so bad?  brelse() will call bdirty() when a BIO_WRITE
> command has failed with EIO.  Calling bdirty() has the effect of retrying the I/O.
> This disregards the fact that disk drivers only return EIO when they’ve decided
> that the I/O cannot be retried.  It has no termination condition for the retries, and
> will endlessly retry I/O in vain; I’ve seen this quite frequently.  It also disregards
> the fact that I/O marked as B_PAGING can’t be retried in this fashion, and will
> trigger a panic.  Because we pretend that EIO can be retried, we are left with
> a system that is very fragile when I/O actually does fail.  Instead of adding
> more special cases and blurred lines, I want to go back to enforcing strict
> contracts between the layers and force the core parts of the system to respect
> those contracts and handle errors properly, instead of just retrying and
> hoping for the best.

I agree with your intention.
But let's not project what I consider to be a bug in the buffer cache code onto
all consumers of the bio / GEOM interface.
In fact, I am quite surprised that there is any code that treats EIO as
retriable.  I don't know of any other such code except for specialized disk
recovery tools.
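
To make the contract discussed above concrete, here is a minimal user-space
sketch, not kernel code.  All names in it (fake_disk, submit_strict,
submit_retry_everything) are hypothetical, and the choice of ENOMEM as the
example of a transient error is only an assumption for illustration.  It
contrasts a consumer that treats EIO as final with one that retries every
error.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a disk driver (not a FreeBSD API). */
struct fake_disk {
	int calls;       /* how many times the I/O has been submitted */
	int fail_first;  /* number of initial submissions that fail */
	int fail_errno;  /* error returned by a failing submission */
};

/* Submit one write; fails for the first fail_first submissions. */
static int
fake_disk_write(struct fake_disk *d)
{
	d->calls++;
	return (d->calls <= d->fail_first ? d->fail_errno : 0);
}

/*
 * Assumption for this sketch: only ENOMEM counts as transient.
 * EIO is deliberately absent, because the lower layer has already
 * decided that the I/O cannot be retried.
 */
static bool
error_is_transient(int error)
{
	return (error == ENOMEM);
}

/* Consumer that respects the contract: EIO is returned immediately. */
static int
submit_strict(struct fake_disk *d, int max_retries)
{
	int attempt = 0;

	for (;;) {
		int error = fake_disk_write(d);

		if (error == 0 || !error_is_transient(error) ||
		    attempt >= max_retries)
			return (error);
		attempt++;
	}
}

/*
 * Consumer that retries any error.  The cap exists only so that the
 * sketch terminates; without a cap it would never stop retrying a
 * permanent failure.
 */
static int
submit_retry_everything(struct fake_disk *d, int max_retries)
{
	int attempt = 0;

	for (;;) {
		int error = fake_disk_write(d);

		if (error == 0 || attempt >= max_retries)
			return (error);
		attempt++;
	}
}
```

With a disk that always fails with EIO, submit_strict() gives up after a
single submission, while submit_retry_everything() spends every allowed
retry and still ends up with the same EIO.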

-- 
Andriy Gapon

