cvs commit: src/sys/geom/eli g_eli.c

Mon Jan 29 20:20:33 UTC 2007

On Mon, Jan 29, 2007 at 07:52:20PM +0000, Robert Watson wrote:
> 
> On Mon, 29 Jan 2007, Pawel Jakub Dawidek wrote:
> 
> >>Why?  You're proposing yet another intrusive change to the kernel to handle yet another one-off requirement of your code.  Why not do what I suggested before with hooking 
> >>the appropriate SYSINIT in your module? Or why not follow Robert's suggestion and implement a simple event mechanism so that any module can know when a CPU has come 
> >>online or offline.  Heck, you probably don't even need to implement a new mechanism, just hook the existing EVENTHANDLER mechanism.  That's what it's designed for!!
> >
> >I'm afraid Scott that your proposals are hacks. As a GEOM class I should not use SYSINIT, EVENTHANDLER, etc. I shouldn't bother if CPUs are online or not. All events I 
> >need to implement a GEOM class I should receive from the infrastructure. Also I shouldn't be called by the infrastructure when the system is not yet ready for my activity, 
> >that's why I proposed to implement this functionality in the infrastructure (i.e. delay the GEOM tasting mechanism) rather than hack SYSINITs in every single GEOM class that needs to 
> >bind to a CPU.
> 
> I guess I'm not sure I entirely agree.  I think that we lack some important infrastructure, which we've been talking about on and off for a dev summit or two now, for handling 
> the arrival and departure of CPU resources ("dynamic reconfiguration").  While once this wasn't really an issue on PC hardware, it now is, with the advent of hypervisors, 
> virtualization, not to mention more multiprocessing, etc.  We have quite a few algorithms and data structures that assume that the set of CPUs is static, and fail quite 
> badly (i.e., memory leaks, work lost, etc) if a CPU were to stop scheduling threads.  Geli is not alone in wanting to know what and when CPUs are available for concurrent 
> work, and like other pieces of code (UMA is the piece I have particular familiarity with), finds our infrastructure lacking.  I'm also not entirely convinced I agree with 
> you as by the same token that you might claim sysinits and event handlers shouldn't be used by GEOM modules, perhaps kthreads should also not be used :-).  Sysinits, 
> eventhandlers, and kthreads are all ways for scheduling and dispatching work.

The infrastructure is also there to help: to simplify the code and to
avoid code duplication. I see no reason to start GEOM classes when the
system is simply not ready. So instead of using yet another KPI in every
GEOM class that would like to bind to a CPU, I suggested removing the
code from geli and instructing GEOM to do it for all classes in one go.

> So perhaps we need to start having the conversation about CPU events more seriously now.  What do you think of the idea of the following: two event handlers, a CPU start 
> event and a CPU stop event, which are guaranteed to run on each CPU as the CPU comes online, and just before the CPU goes offline. Kernel subsystems could use these 
> events to determine when CPU resources were arriving and departing in some serious sense (not just "busy") in order to initialize and tear down per-CPU data structures, 
> rebalance workloads, start or stop per-CPU works, etc.  The example I have in mind here is the network stack, which might reasonably wish to have per-CPU netisr (worker) 
> threads. When the set of CPUs changes, it would like to increase or decrease the number of workers -- having the same number of workers compressed down to a smaller number 
> of CPUs by migration would be a disaster for performance.

I fully agree that there should be a clean KPI for this. What you
proposed is fine. Because there is no such KPI, geli also has to handle
HTT CPUs, which are turned off by default in releases, by abusing
scheduler internals. The KPI you proposed would allow me to remove those
hacks, and I'm really all for it.
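
To make the proposal above concrete, here is a minimal userland sketch of
a two-event CPU online/offline mechanism and a consumer that keeps one
worker per online CPU. Every name in it (cpu_event_register,
cpu_set_online, worker_pool, and so on) is hypothetical and is not an
existing kernel KPI. It only simulates CPUs with a flag array, so the
handlers do not actually run on the affected CPU as the proposal requires.
Replaying "online" events to late registrants is an added assumption, not
something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; none of this is an existing kernel KPI. */
#define SKETCH_MAXCPU      8
#define SKETCH_MAXHANDLERS 4

enum cpu_event { CPU_EVENT_ONLINE, CPU_EVENT_OFFLINE };

typedef void (*cpu_event_fn)(int cpu, enum cpu_event ev, void *arg);

struct cpu_event_handler {
	cpu_event_fn fn;
	void *arg;
};

static struct cpu_event_handler handlers[SKETCH_MAXHANDLERS];
static int nhandlers;
static int cpu_is_online[SKETCH_MAXCPU];	/* simulated CPU state */

/* Call every registered handler for one event on one CPU. */
static void
cpu_event_dispatch(int cpu, enum cpu_event ev)
{
	for (int i = 0; i < nhandlers; i++)
		handlers[i].fn(cpu, ev, handlers[i].arg);
}

/* Register a handler; returns 0 on success, -1 if the table is full. */
int
cpu_event_register(cpu_event_fn fn, void *arg)
{
	if (fn == NULL || nhandlers == SKETCH_MAXHANDLERS)
		return (-1);
	handlers[nhandlers].fn = fn;
	handlers[nhandlers].arg = arg;
	nhandlers++;
	/* Assumption: replay "online" for CPUs that are already running. */
	for (int cpu = 0; cpu < SKETCH_MAXCPU; cpu++)
		if (cpu_is_online[cpu])
			fn(cpu, CPU_EVENT_ONLINE, arg);
	return (0);
}

void
cpu_set_online(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_MAXCPU);
	if (cpu_is_online[cpu])
		return;
	cpu_is_online[cpu] = 1;
	cpu_event_dispatch(cpu, CPU_EVENT_ONLINE);
}

void
cpu_set_offline(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_MAXCPU);
	if (!cpu_is_online[cpu])
		return;
	/* Handlers run just before the CPU is marked as gone. */
	cpu_event_dispatch(cpu, CPU_EVENT_OFFLINE);
	cpu_is_online[cpu] = 0;
}

/* Example consumer: one worker per online CPU. */
struct worker_pool {
	int nworkers;
	int has_worker[SKETCH_MAXCPU];
};

void
worker_pool_cpu_event(int cpu, enum cpu_event ev, void *arg)
{
	struct worker_pool *pool = arg;

	if (ev == CPU_EVENT_ONLINE && !pool->has_worker[cpu]) {
		pool->has_worker[cpu] = 1;	/* a real class would start a bound thread */
		pool->nworkers++;
	} else if (ev == CPU_EVENT_OFFLINE && pool->has_worker[cpu]) {
		pool->has_worker[cpu] = 0;	/* a real class would stop that thread */
		pool->nworkers--;
	}
}
```

With something of this shape, a class would register one callback and
never look at scheduler internals: the set of workers follows the set of
online CPUs, which is the property the HTT workaround is trying to get.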

What you and Scott are missing is that when I implement a GEOM class,
I'm using what is available to do my work. I'm not going to educate
myself on how schedulers work and implement a nice, clean KPI just to
use it in my class. I'm not saying it wouldn't be great to be able to do
so, but I don't have time for everything, unfortunately, and you guys
should understand that very well.

I had a conversation with John (jhb@) on IRC in which I asked him how I
could skip CPUs that are turned off. He mentioned that it should be
handled by the KPI you're proposing, but also that I should go with the
solution I have now, because at this point there is nothing better.

Anyway, I'd love to remove the current hacks and use what you proposed,
Robert.

> How to handle the boot processor is an interesting question -- are we interested in configuring away the boot processor at run-time?  If not, we probably want to handle it 
> as a special case via sysinit.  If all CPUs are equal and any may go away, then we might need to rework our notion of shutdown, and provide these same events for the boot 
> CPU (which does sound desirable so as not to end up with lots of special casing in subsystems). Regardless, we are hardly the first OS to try to address these issues via a 
> clean architectural solution, and my thinking is we should do a bit of research.  A first place to look would definitely be OpenSolaris.

I'd prefer the boot CPU not to be treated in any special way. If it has
to be, that can be hidden from the subsystems, e.g. by sending a
CPU-online event for the boot CPU but never sending a CPU-offline event,
or something like this.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!