svn commit: r240822 - head/sys/geom

Pawel Jakub Dawidek pjd at FreeBSD.org
Wed Sep 26 19:45:17 UTC 2012


On Wed, Sep 26, 2012 at 01:21:17PM -0600, Kenneth D. Merry wrote:
> On Wed, Sep 26, 2012 at 20:53:39 +0200, Pawel Jakub Dawidek wrote:
> > On Wed, Sep 26, 2012 at 11:29:17AM -0600, Kenneth D. Merry wrote:
> > > Here is what CAM needs at each step:
> > > 
> > > 1.  When a device goes away, we need a method to call from daoninvalidate()
> > >     (or any other peripheral driver invalidate routine) with these
> > >     properties:
> > >     - It tells GEOM that the device has gone away, and starts the process
> > >       of shutting down the device.  (i.e. withers/orphans the provider)
> > >     - It is callable from an interrupt context, with the SIM (MTX_DEF) lock
> > >       held, so it can't sleep.
> > 
> > Neither g_wither_provider() nor g_orphan_provider() requires the topology
> > lock. They only acquire the event lock, but it is a regular mutex, so this
> > is fine. Traversing the geom's providers list looks like something that
> > does need the topology lock, but maybe traversing is not needed at all.
> > The reason for this change was a panic in the iSCSI initiator where
> > disk_gone() was called and the provider was destroyed before
> > g_wither_geom() returned.
> 
> Ahh.  How about using LIST_FOREACH_SAFE?  Would that address the problem at
> hand?  Are there any other races in there?

It depends. If one geom can hold more than one provider then it might be
racy, but from what I see there is always exactly one provider. There has
to be only one, because disk_destroy() destroys it and a struct disk
always represents exactly one disk. If that's true then I see no reason to
have a loop in there. I'd change it to:

void
disk_gone(struct disk *dp)
{
	struct g_geom *gp;
	struct g_provider *pp;

	gp = dp->d_geom;
	if (gp != NULL) {
		/* Assumes a disk geom has at most one provider, so no loop. */
		pp = LIST_FIRST(&gp->provider);
		if (pp != NULL)
			g_wither_provider(pp, ENXIO);
	}
}
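
As a rough illustration of the single-provider lookup above, here is a
userland sketch that can be compiled outside the kernel. It is not GEOM
code: every toy_* name is hypothetical, and toy_wither_provider() only
records the error, which is an assumption made for the example and not a
description of what g_wither_provider() really does. The return value of
toy_disk_gone() is likewise added only so the result can be checked.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/queue.h>

/* Toy stand-ins for the kernel structures; all toy_* names are hypothetical. */
struct toy_provider {
	int error;				/* 0 while the provider is usable */
	LIST_ENTRY(toy_provider) provider;	/* linkage on the geom's list */
};

struct toy_geom {
	LIST_HEAD(, toy_provider) provider;	/* list of providers */
};

struct toy_disk {
	struct toy_geom *d_geom;		/* NULL once the geom is gone */
};

/* Model of withering: record the error only; nothing is unlinked or freed. */
static void
toy_wither_provider(struct toy_provider *pp, int error)
{
	assert(error != 0);
	pp->error = error;
}

/*
 * Same shape as the proposed disk_gone(): look at the first provider only.
 * Returns the number of providers withered (0 or 1).
 */
static int
toy_disk_gone(struct toy_disk *dp)
{
	struct toy_geom *gp;
	struct toy_provider *pp;

	gp = dp->d_geom;
	if (gp == NULL)
		return (0);
	pp = LIST_FIRST(&gp->provider);
	if (pp == NULL)
		return (0);
	toy_wither_provider(pp, ENXIO);
	return (1);
}
```

With no loop there is no list iterator that could be left pointing at a
freed element, which is the race LIST_FOREACH_SAFE was meant to paper over.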

> > So maybe disk_destroy() should first orphan provider, which in turn will
> > set its error. If provider's error is set, all I/O requests will be
> > denied by GEOM by returning provider's error, so strategy method within
> > a driver won't be called.
> 
> The current semantics of disk_destroy() are that the da(4) driver won't use
> the disk structure after it is called.  We can guarantee that if it is
> called from dacleanup(), but not if it is called from daoninvalidate().
> 
> And if we combined the functionality of the current disk_gone() (which
> orphans the provider) and disk_destroy() routines, we would have to call it
> from daoninvalidate().  And that won't work, because the da(4) driver may
> well access elements of the disk structure after daoninvalidate() is
> called.

And I assume this is not something that can be fixed/changed?
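
For reference, the gating described in the quoted text (once a provider's
error is set, requests are refused before they reach the driver's strategy
method) can be sketched in userland roughly as follows. The toy_* names and
the strategy_calls counter are hypothetical and exist only for this example;
this is not the real GEOM request path.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical toy type; not the real struct g_provider. */
struct toy_ioprov {
	int error;		/* nonzero once the provider is orphaned */
	int strategy_calls;	/* how many requests reached the driver */
};

/* Stand-in for a driver's strategy method. */
static void
toy_strategy(struct toy_ioprov *pp)
{
	pp->strategy_calls++;
}

/* Model of orphaning: record the error on the provider. */
static void
toy_orphan_provider(struct toy_ioprov *pp, int error)
{
	pp->error = error;
}

/*
 * Model of the request path: fail early with the provider's error if it
 * is set, otherwise hand the request to the driver. Returns 0 on success.
 */
static int
toy_io_request(struct toy_ioprov *pp)
{
	if (pp->error != 0)
		return (pp->error);
	toy_strategy(pp);
	return (0);
}
```

In this model, a request issued after toy_orphan_provider(pp, ENXIO)
returns ENXIO and leaves strategy_calls unchanged.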

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl