svn commit: r227015 - head/sys/geom

Pawel Jakub Dawidek pjd at FreeBSD.org
Thu Nov 3 07:52:21 UTC 2011


On Wed, Nov 02, 2011 at 05:09:42PM +0200, Alexander Motin wrote:
> On 11/02/11 15:42, Pawel Jakub Dawidek wrote:
> > On Wed, Nov 02, 2011 at 09:24:59AM +0000, Alexander Motin wrote:
> >> Author: mav
> >> Date: Wed Nov  2 09:24:59 2011
> >> New Revision: 227015
> >> URL: http://svn.freebsd.org/changeset/base/227015
> >>
> >> Log:
> >>   Add mutex and two flags to make orphan() call properly asynchronous:
> >>    - delay consumer closing and detaching on orphan() until all I/Os complete;
> >>    - prevent new I/Os submission after orphan() called.
> >>   Previous implementation could destroy consumers still having active
> >>   requests and worked only because of global workaround made on GEOM level.
> > 
> > Alexander, I'm not sure I agree with your recent changes to address
> > this. The checks in GEOM were there to avoid the need for counting
> > outstanding I/O requests in every single GEOM class.
> 
> Sorry, I've sent you letter last week asking for your opinion on this
> problem, but got no response. :(

I did not receive it, sorry.

> > Why do you think the checks in GEOM are not good enough?
> 
> Mostly because nstart increment and request submission are not locked.
> There are race windows between start() call, request submission and
> consumer detach: start() method may get provider pointer that will not
> be valid in time of the request submission.
> 
> According geom(4), that race should be closed by assumption that
> provider should not be closed until all active requests are completed.
> Kind of reference counting, done by top consumers, such as geom_dev or
> geom_vfs. That is what I am trying to fix with my changes.
> 
> Also I don't very like idea of periodic polling, trying to catch moment
> when nstart == nend. As soon as at that time we haven't called orphan()
> method yet, requests may go infinitely, even though each will be aborted
> quickly. It looks dirty at least.
> 
> Counting of outstanding I/O requests needed only for classes that for
> some reason can't follow that assumption. For example, geom_vfs releases
> provider as soon as it orphans, not waiting for close() call from file
> system. Another example is gmirror -- we want to drop single
> disconnected disk while upstream provider is still working and won't be
> closed.

Ok, I can see your point now. You are right. The check that nstart is
equal to nend I added was not to fix races within one class. IIRC I was
seeing panics there with simple classes that only forward orphan events,
so even if provider's error was set and no new I/O requests were coming,
we could destroy orphaned provider even if there were still in-flight
requests. I was a bit confused, because you were refering to the check I
added and I don't think it is really related.

> > Can we design solution that can be implemented in the framework itself,
> > so simple GEOM classes can stay simple?
> 
> Simple classes without special needs around access() method and
> especially with one consumer should stay simple. All they should do on
> orphan() call is just forward orphan() call up, calling some
> g_wither_XXX() and not trying to forcefully close consumer, and let
> things happen naturally.
> 
> The problem is that start() and bio_done() methods are not locked now,
> but same time they expect topology not to change under them. I am not
> sure how can it be simplified globally, unless we want to lock all of
> them with topology lock.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://yomoli.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/svn-src-head/attachments/20111103/92133acd/attachment.pgp


More information about the svn-src-head mailing list