newbus integration of MOD_QUIESCE

Poul-Henning Kamp phk at phk.freebsd.dk
Wed Jul 14 11:25:32 PDT 2004


In message <20040714.090222.36824783.imp at bsdimp.com>, "M. Warner Losh" writes:

>: More like "If you are idle, prepare to be unloaded, otherwise yell".
>
>OK.  So there is a race here.  The modules have to cope with the lag
>between the MOD_QUIESCE call and the MOD_UNLOAD that will come later.
>Can the modules assume that if they get a MOD_QUIESCE that they WILL
>get a MOD_UNLOAD later if they return 0?

Yes, they can assume that.  There is the footnote that we can have
multiple modules in one kld and if one of them vetos the MOD_MODQUIESCE
the MOD_UNLOAD may not happen until the user tries to kldunload again
or at all.

Notice that MOD_UNLOAD can also be veto'ed and that we already have
this multi-modules in one KLD issue if one of the modules veto
MOD_UNLOAD and the others don't.

>That's a fairly high standard to live up to.  It would mean setting a
>flag and not accepting any new business.

I'm not sure I find it a particular high standard to add code to:

	case MOD_QUIESCE:
		...
		if (error == 0)
			sorry_we_re_close = 1;
		return (error)

and then check the flag in the probe/attach/taste/config routines.

>I thought in irc discussions
>that this was a 'am I busy or not' call, but it is clear that this has
>side effects.  There's no way to ask the module if it is idle or not,
>so this is just a small increment over what we have now.

Yes.  The smallest increment which allowed me to implement -f(orce).

>: Finegrained guidance for what a module should do in MOD_QUISCE:
>: 
>:     Network driver:
>: 
>: 	If any instance is IF_UP, return EBUSY.
>: 	Otherwise detach all instances and do not attach to
>: 	any new instances offered.
>
>This sounds a lot like what we call detach in newbus.  Yet this is
>different than what we'd talked about in terms of the
>'notify/shutdown' split that we've been designing.  This makes it
>harder to do that.

(I should clarify here that the detachment doesn't have to happen
until MOD_UNLOAD, the important bit is determining if EBUSY should
be returned and if not, prevent new business.)

>I don't like this guidance.  It doesn't map to
>anything in newbus that we have or we're planning.

MOD_QUIESCE is designed for kldunload -f, not for newbus so this
is hardly surprising.

>So this would also act as a gate to all further probe routines
>returning 'good' as well?  That's a lot of side effects.

You lost me there.  Once MOD_QUISCE returned zero, all probe
routines in the module should fail to recognize things.  I
would expectat that to mean that they return 'bad', but this
may just be a terminology thing.

>This makes it hard to map into other plans we have for newbus, and
>hard for us to provide a hook into newbus drivers for this module
>message.

I hate to be so blunt about this, but why is it that you think you
need to do that ?

Let me explain why I can't see the connection:

Imagine that we have a pretty hairy driver which attaches to things
on ISA, PCI and S-BUS.

MOD_QUIESCE acts on all instances of this driver, no matter which bus
it is attached to, and no matter what it attaches to upwards in the
kernel (netgraph, geom, netstack, filesystemc, MAC etc).  The user
wants the kld unloaded, he asks for it, and MOD_QUISCE + MOD_UNLOAD
handles this for him.

If I have understood anything about what you are talking about, the
newbus involvement comes from the event expressing that I pressed
the "I'll yank this one" button on a hot-swap PCI slot.

That would not and should not affect the instances which sit on 
the ISA or S-BUS.

I fail to see how it can avoid having a device_t argument to tell
which instance the event is going to operate on, and further more,
I entirely fail to see why you would use a module event to communicate
this rather than have newbus call the instance in question directly
with a quiesce method.

What am I missing ?

How is MOD_QUIESCE any different from MOD_UNLOAD with respect to
newbus ?

>: 	If any instance is opened, return EBUSY.
>: 	Otherwise destroy_dev all instances and do not attach to
>: 	any new instances offered.
>
>Same as the above.  Likely with a lot of wrinkles for bus drivers.  If
>pccard were to receive this, does it consider itself busy because it
>has children drivers?  Or should it pass the request down to them?
>Since the children drivers aren't necessarily modules, we can't just
>rely on them.

I think you have overlooked something fundamental Warner.

MOD_QUIESCE only offers the ability to split the current MOD_UNLOAD
semantics in two parts so that -f(orce) can override the
policy decision part of the action in the module.

If pccard today implements MOD_UNLOAD, then you can split that code
in two and put half of it in MOD_QUIESCE if you want to implement
a -f(orce) semantic for pccard.

If modules cannot implement MOD_QUISCE they don't have to.

If you don't do that, then it will work exactly the way it does
today (provided it returns a sensible value for an unknown events).

MOD_UNLOAD has the same semantics as they already have, including
the ability to simply not implement unloads because it is too hairy.

>In addition, these sorts of recommendations fundamentally change the
>way that drivers unload these days.  Today, kldunload of a typical
>network driver will detach it, even if it is up or busy.  It is an
>extraordinary driver that does anything differently.

No, it doesn't change anything "fundamentally".

It gives the module writer the ability to split the MOD_UNLOAD into
two so that a -f(orce) semantic can be implemented, and if he doesn't
do that, then you have the exact same situation you have today.

>Do you plan on fixing all the drivers/modules in the tree that doesn't
>do this now?

No, MOD_QUIESCE is entirely optional.  If you don't implement it
there will be no difference between kldunload with or without -f(orce)
and that's it.  There is no difference today for any driver.

>Having inconsistant unload behavior is one of the
>problems we have in the tree, and it seems like this would make it
>worse.

If you want consistent unload behaviour, you need to start out by
defining what it should be.

For GEOM we currently refuse unloading if anything is busy.  This is
based on the behaviour of unmount which refuses if any vnodes are in use.

This entire thing comes about because sometimes you may want to
unload busy GEOM classes, and that's what -f(orce) gives you a
lever for.

If we have modules which today unload while being in use then I would
personally argue that it is a POLA violation, but given the lack of
the -f(orce) lever until now, I can fully understand the reason
why it got that way.

Hopefully now with -f(orce) we can get a consistent behaviour
which IMO should be:

	"kldunload" fails if the module is in use.
	"kldunload -f" suceeds even if the module is in use.

>Also, in preparing this message, I noticed a number of drivers that
>return EINVAL instead of EOPNOSUPP for MOD_ events they don't
>understand.  There are several that also return 0 for events that they
>don't understand.  These should be corrected in the tree.  Do you have
>plans to do this?

As I find them yes.

Currently EOPNOTSUPP and zero are treated as "OK" return from
MOD_QUIESCE.  You'll notice that I put it in the man-page to define
a sensible default behaviour so we can eliminate this particular
inconsistency over time.

>And none of this helps us to unload idle drivers, but I guess that
>wasn't your goal.

You mean automatically unload drivers which are not/no longer used ?
No, that was nowhere near my goals.  If you mean something else
you'll have to explain to me what you meant.

>Finally, I'm not trying to be negative here.  There are just a lot of
>issues that haven't been considered yet.  I'm a little grumpy that I
>was specifically told that you didn't want to hear about newbus
>issues, but I'm trying to be constructive in my comments here none the
>less.  I am looking to implement something in newbus with these
>messages, so I'm trying to understand it at a high level of detail.

As you can see above I still totally fail to see the newbus connection
in this.

I can see that implementations of *both* MOD_UNLOAD and MOD_QUIESCE
for bus modules could get some help from newbus, but we don't have
that today for MOD_UNLOAD as far as I know, so I doubt that MOD_QUIESCE
changes that.

And mind you, I would far have preferred if there had been a flag
argument to ->modevent() so that MOD_UNLOAD could simply have gotten
a MOD_UNLOAD_FORCE flag passed in, but fixing that would require
changing every single modevent handler in the system and I was
not too interested in that.

If Robert wants increased granularity as he argued for in one email,
then the flag route would probably be necessary, but I don't think
he can carry that motion.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.


More information about the freebsd-arch mailing list