add BIO_NORETRY flag, implement support in ata_da, use in ZFS vdev_geom

Andriy Gapon avg at FreeBSD.org
Tue Dec 12 16:09:09 UTC 2017


On 25/11/2017 13:37, Poul-Henning Kamp wrote:
> The real fundamental deficiency is that we do not have a way to say "give up
> if this bio cannot be completed in X time" which is what people actually want.

Indeed.
And I think that is also what Warner was trying to help me understand:
it is not about an absolute retry count, but about a time budget for a request.

> That is surprisingly hard to provide, there are far too many
> corner-cases for me to enumerate them all, but let me just give one
> example:

This is true and this is a good example.
I think that we might want to try first to handle simpler cases, like deciding
whether to retry a request if we get a transient error.
Dealing with a request that just doesn't come back is the much harder piece, of
course.
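
To make the simpler case concrete, here is a minimal userland sketch of a
retry decision driven by a time budget rather than a retry count. Everything
in it is an assumption for illustration: struct io_budget, should_retry(),
the millisecond clock and the choice of errno values treated as transient are
hypothetical and are not part of struct bio or of any existing driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-request budget; not a field of the real struct bio. */
struct io_budget {
	long deadline_ms;	/* absolute time by which the caller wants an answer */
};

/* Assumption: these errno values stand in for "transient" failures. */
static bool
error_is_transient(int error)
{
	return (error == EAGAIN || error == EBUSY || error == ETIMEDOUT);
}

/*
 * Retry only if the error is transient and one more attempt, estimated at
 * est_attempt_ms, would still finish by the deadline.
 */
static bool
should_retry(const struct io_budget *b, int error, long now_ms,
    long est_attempt_ms)
{
	assert(b != NULL);
	if (!error_is_transient(error))
		return (false);
	return (now_ms + est_attempt_ms <= b->deadline_ms);
}
```

With a 1000 ms deadline and an estimate of 200 ms per attempt, a transient
error seen at t=100 would be retried, the same error at t=900 would not, and
a hard error such as EIO would never be retried.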

> Imagine you issue a deadlined write to a RAID5 thing.  Three component
> writes happen smoothly, but the last two fail the deadline, with
> no way to predict how long it will take before they complete
> or fail.
> 
> * Does the bio write transaction fail ?
> 
> * Does the bio write transaction time out ?
> 
> * Do you attempt to complete the write to the RAID5 ?
> 
> * Where do you store a copy of the data if you do ?
> 
> * What happens next time a read happens on this bio's extent ?
> 
> Then for an encore, imagine it was a read bio: Three DMAs go smoothly,
> two are outstanding and you don't know if/when they will complete/fail.
> 
> * If you fail or time out the bio, how do you "taint" the space
>   being read into while the two remaining DMAs are outstanding?
> 
> * What if that space is mapped into userland ?
> 
> * What if that space is being executed ?
> 
> * What if one of the two outstanding DMAs later return garbage ?
> 
> My conclusion back when I did GEOM, was that the only way to
> do something like this sanely, is to have a special GEOM do it
> for you, which always allocates a temp-space:
> 
> 	allocate temp buffer
> 	if (write)
> 		copy write data to temp buffer
> 	issue bio downwards on temp buffer
> 	if timeout
> 		park temp buffer until biodone
> 		return(timeout)
> 	if (read)
> 		copy temp buffer to read space
> 	return (ok/error)
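
For readers who prefer real code, the pseudocode above could be sketched in
userland C roughly as follows. This is only an illustration under stated
assumptions, not a GEOM class: deadline_io(), struct tmpbuf, late_done() and
the two fake lower layers (lower_ok, lower_late) are all hypothetical names,
and the "timeout" is modelled by the lower layer simply returning false.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum io_dir { IO_READ, IO_WRITE };

/* Temp buffer with a list link so it can be parked while I/O is in flight. */
struct tmpbuf {
	struct tmpbuf *next;
	size_t len;
	unsigned char data[];
};

/* Lower layer: returns false if the deadline passed with I/O still pending. */
typedef bool (*lower_fn)(enum io_dir, void *, size_t, int *);

static struct tmpbuf *parked;		/* buffers still owned by late I/O */
static unsigned char disk[16];		/* fake backing store for the demo */

/* Fake lower layer that always completes in time. */
static bool
lower_ok(enum io_dir dir, void *buf, size_t len, int *error)
{
	assert(len <= sizeof(disk));
	if (dir == IO_WRITE)
		memcpy(disk, buf, len);
	else
		memcpy(buf, disk, len);
	*error = 0;
	return (true);
}

/* Fake lower layer that never completes before the deadline. */
static bool
lower_late(enum io_dir dir, void *buf, size_t len, int *error)
{
	(void)dir; (void)buf; (void)len; (void)error;
	return (false);
}

static int
deadline_io(enum io_dir dir, void *data, size_t len, lower_fn lower)
{
	struct tmpbuf *t;
	int error = 0;

	t = malloc(sizeof(*t) + len);		/* allocate temp buffer */
	if (t == NULL)
		return (ENOMEM);
	t->len = len;
	if (dir == IO_WRITE)
		memcpy(t->data, data, len);	/* copy write data to temp */
	if (!lower(dir, t->data, len, &error)) {
		t->next = parked;		/* park until the I/O is done */
		parked = t;
		return (ETIMEDOUT);		/* caller's buffer is untouched */
	}
	if (dir == IO_READ && error == 0)
		memcpy(data, t->data, len);	/* copy temp to read space */
	free(t);
	return (error);
}

/* Called when a late I/O finally finishes: unpark and free its buffer. */
static void
late_done(struct tmpbuf *t)
{
	struct tmpbuf **pp;

	for (pp = &parked; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			free(t);
			return;
		}
	}
}
```

The point of the copy is that on a timeout the caller's memory has never been
handed to the lower layer, so a late DMA can only scribble on the parked
buffer. One deviation from the pseudocode: the sketch copies data back on a
read only when the lower layer reported no error.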


-- 
Andriy Gapon

