Geom / destroy_dev() deadlock

Steven Haber steven.haber at isilon.com
Tue Jun 19 18:24:49 UTC 2012


> On Mon, Jun 11, 2012 at 03:27:39PM -0700, Steven Haber wrote:
> > I do not understand what you are proposing. Could you, please, show
> > > the patch ?
> > 
> > ---
> >  src/sys/geom/geom_dev.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/src/sys/geom/geom_dev.c b/src/sys/geom/geom_dev.c
> > index 38251e1..787235a 100644
> > --- a/src/sys/geom/geom_dev.c
> > +++ b/src/sys/geom/geom_dev.c
> > @@ -497,7 +497,7 @@ g_dev_orphan(struct g_consumer *cp)
> >  
> >         /* Destroy the struct cdev *so we get no more requests */
> >         unit = dev2unit(dev);
> > -       destroy_dev(dev);
> > +       destroy_dev_sched(dev);
> >         free_unr(unithdr, unit);
> >  
> >         /* Wait for the cows to come home */
>
> From: Konstantin Belousov
> Sent: Monday, June 11, 2012 3:53 PM
> To: Steven Haber
> Cc: freebsd-geom at freebsd.org
> Subject: Re: Geom / destroy_dev() deadlock
>
> Did you noted the comment above the block you changing ?
> The patch would cause races allowing arbitrary kernel memory
corruption.
>
> The moment when the cdev is destroyed is somewhere in future, while
> structures that the cdev reference are freed synchronously.
>
> I tried to put some safety measures into destroy_dev_sched(9), namely
> CDP_SCHED_DTR flag that somewhat reduces the possibility of usermode
> accessing cdev after destroy_dev_sched(), but this cannot be
eliminated
> entirely.

So destroy_dev_sched() is inherently racey. That explains why there
aren't many examples of usage in the kernel.

Without doing a scheduled destroy, can you think of any way to prevent
the devdrn/geom deadlock? From the original discussion:

	GEOM calls destroy_dev() while holding the topology lock.

	Destroy_dev() wants to destroy device, but can't because there
are
	threads that still have it open.

	The threads can't close it, because to close it they need the
topology
	lock.

There may be additional locking that can be done in the dev layer, like
a third sleepable lock. We could also set a devdrn flag on the cdev to
cause g_dev calls to return with an error prior to taking the topology
lock.

Steven


More information about the freebsd-geom mailing list