Geom / destroy_dev() deadlock

Steven Haber steven.haber at isilon.com
Mon Jun 11 22:28:07 UTC 2012


> On Mon, Jun 11, 2012 at 02:15:56PM -0700, Steven Haber wrote:
> > > On Mon, Jun 11, 2012 at 10:19:07AM -0700, Steven Haber wrote:
> > > > Hey FreeBSD devs,
> > > >  
> > > > I hope this is the right forum for this. I haven't used the
FreeBSD
> > > > mailing lists before. I'm a relatively new hire at EMC Isilon.
We are
> > > > using FreeBSD 7.0 as the basis for our scale-out NAS product
line. We
> > > > are frequently hitting the deadlock described here:
> > > > 
> > > >
> >
http://lists.freebsd.org/pipermail/freebsd-geom/2010-February/003888.htm
l
> > > > 
> > > > The race is that destroy_dev() sleeps indefinitely waiting for
thread
> > > > references to drop from a dev. In the case of geom, we hold the
> > topology
> > > > lock the whole time we're in the dev layer. In devfs_open() and
> > > > devfs_close(), we take a ref on the dev before calling into to
geom.
> > > > 
> > > > The proposed solution of destroy_dev_sched() makes sense to me.
I am
> > > > trying to understand the necessity of Alexander Motin's
additional
> > > > changes. destroy_dev_tq() holds the devmtx during the destroy
and
> > devfs
> > > > uses this lock to take refs before calling into geom. To me it
seems
> > we
> > > > should be able to protect the cdev with just the devmtx,
provided
> > > > consumers of d_close(), d_open(), etc. ref and rel the dev
> > > > appropriately.
> > 
> > 
> > > From: Konstantin Belousov
> > > Sent: Monday, June 11, 2012 1:44 PM
> > > To: Steven Haber
> > > Cc: freebsd-geom at freebsd.org
> > > Subject: Re: Geom / destroy_dev() deadlock
> > 
> > > devmtx is mutex. If taken, then cdevsw methods would be unable to
> > sleep.
> > 
> > To clarify, I'm not suggesting additional locking with devmtx.
Currently
> > we use the devmtx to take thread references on the dev.
destroy_devl()
> > sleeps until there are no references. I don't see why it matters if
we a
> > foreground destroy (current code, destroy_dev()) or scheduled
destroy
> > (using destroy_dev_sched()) since the devmtx combined with thread
refs
> > should guarantee we are never in the cdevsw methods during a
destroy. I
> > think that the single routine change is sufficient, and I can't seem
to
> > cause any sort of crash on my system with that change in place.

> From: Konstantin Belousov
> Sent: Monday, June 11, 2012 2:56 PM
> To: Steven Haber
> Cc: freebsd-geom at freebsd.org
> Subject: Re: Geom / destroy_dev() deadlock

> I do not understand what you are proposing. Could you, please, show
> the patch ?

---
 src/sys/geom/geom_dev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/sys/geom/geom_dev.c b/src/sys/geom/geom_dev.c
index 38251e1..787235a 100644
--- a/src/sys/geom/geom_dev.c
+++ b/src/sys/geom/geom_dev.c
@@ -497,7 +497,7 @@ g_dev_orphan(struct g_consumer *cp)
 
        /* Destroy the struct cdev *so we get no more requests */
        unit = dev2unit(dev);
-       destroy_dev(dev);
+       destroy_dev_sched(dev);
        free_unr(unithdr, unit);
 
        /* Wait for the cows to come home */
-- 

Steven


More information about the freebsd-geom mailing list