Generic Kernel API

Scott Long scottl at samsco.org
Wed Nov 9 19:39:56 PST 2005


Chuck Swiger wrote:
> Scott Long wrote:
> 
>> Charles Swiger wrote:
> 
> [ ... ]
> 
>> I have a fair amount of very close experience with the OSX kernel.  See
>> my comment below:
> 
> 
> I'd say that you have some experience with the FreeBSD kernel, too.  :-)
> 
>>> I'm not strongly advocating the use of C++ in the kernel, but Apple  
>>> is using g++ to build their kernels, so I'd imagine that FreeBSD  
>>> could utilize the same embedded C++ dialect in our kernels if people  
>>> wanted to do so.  The things that leapt out at me in comparing the  
>>> FreeBSD APIs and IOKit were:
>>
>>
>> A cut down version of C++ is used for IOKit, it is not used for the 
>> whole kernel.  The large majority of the kernel is written in C, not
>> C++.
> 
> 
> Agreed.
> 
>> Not all kernel modules are hardware device drivers, neither in
>> OSX or in FreeBSD.  GEOM modules, filesystems, and netgraph modules are
>> all valid examples of pseudo drivers that benefit from a stable API but
>> do not represent hardware devices.  So IOKit is not the cure-all API.
> 
> 
> Goodness, no.  In some ways, I actually like FreeBSD's C implementation 
> of device_t's using kobj's quite a bit compared to C++ code in the 
> IOKit, and some driver families (sound in particular) seem to take 
> advantage of inheritance more than other drivers do.
> 
> The IOKit has some C++-related blemishes like:
> 
> 3-pan% tail 
> /System/Library/Frameworks/IOKit.framework/Versions/A/Headers/network/IONetworkController.h 
> 
>     OSMetaClassDeclareReservedUnused( IONetworkController, 28);
>     OSMetaClassDeclareReservedUnused( IONetworkController, 29);
>     OSMetaClassDeclareReservedUnused( IONetworkController, 30);
>     OSMetaClassDeclareReservedUnused( IONetworkController, 31);
> };
> 
> #endif /* defined(KERNEL) && defined(__cplusplus) */
> 
> #endif /* !_IONETWORKCONTROLLER_H */
> 
> ...reserving 32 slots and keeping a pointer variable to an undefined 
> struct (*_unused) handy just in case due to the fragile base class issue.
> 
>>> 1) the notion of a system-wide driver registry, which could be  
>>> obtained easily from the existing code in sys/bus.h & 
>>> kern/subr_bus.c  which keeps track of this:
>>>
>>> typedef TAILQ_HEAD(driver_list, driverlink) driver_list_t;
>>>
>>> [ devclass_get_devices() is close but not quite the same thing... ]
>>
>>
>> There is already a module registry.  It's used to know when to reject
>> loading KLDs that contain modules that are already in the system.  This
>> works for both device drivers and pseudo drivers.
> 
> 
> True, but a list of modules was not quite what I was looking for.
> 
>>> 2) the "work loop" abstraction (long link, again):
>>>
>>> http://developer.apple.com/documentation/DeviceDrivers/Conceptual/ 
>>> IOKitFundamentals/HandlingEvents/chapter_8_section_2.html
>>>
>>> Programming using callbacks or continuations, having to serialize  
>>> access to driver data structures, etc is one of the most difficult  
>>> areas to deal with, and race conditions and so forth are a common  
>>> source of evil, tricky, hard-to-reproduce bugs.  There isn't a free  
>>> lunch, the kernel has got to deal with such things, but having an  
>>> abstraction like this would probably help make the lives of people  
>>> writing drivers easier. [1]
>>
>>
>> I've written an IOKit driver for high performance hardware.  I'm not
>> convinced that the work loop paradigm is any more efficient than
>> locking.  Apple advocates it because it is indeed easier to program to
>> and takes less to explain than using the different locking primitives.
> 
> 
> The IOKit provides relatively fine-grain mutex locking (on the class or 
> instance level of driver objects) and supports re-entrancy:
> 
> "An IOWorkLoop object (or simply, a work loop) is primarily a gating 
> mechanism that ensures single-threaded access to the data structures 
> used by hardware. For some event contexts, a work loop is also a thread. 
> In essence, a work loop is a mutually exclusive (mutex) lock associated 
> with a thread."
> 
> ...while providing an API (or KPI) which lets the developer code as if he 
> or she had a single worker thread, even though underneath, the system 
> may be scheduling many worker threads amongst the available CPUs and/or 
> event sources.
> 
> Certainly that's better (more efficient) than contending over the GIANT 
> lock.

Contending over Giant is a thing of the past.  Most of the major 
subsystems and drivers are out from under it.  The few that are not are
now separated enough that contention is extremely low.  Studies are 
being done on this very topic right now, in fact, and the results are
quite good.

> 
>> Remember that the target audience for much of the Apple documentation is
>> people who have never programmed in a Unix kernel before, be they coming
>> from Windows or coming from OS9.  In fact, the Apple docs go out of 
>> their way to discourage you from writing kernel modules entirely.
> 
> 
> Sure-- don't you agree that anything which can be done in userland, 
> generally ought to be done there?  Apple has to contend with developers 
> who are looking to hook into the vertical blanking handler for 
> screensavers and clock programs and who knows what else, just like they 
> did in OS 9.  Discouraging such things from going into the kernel is a 
> good idea.
> 
> Also remember that Mach is closer to being a microkernel than the other 
> BSD kernels are, and the philosophy is showing in the design.  That 
> doesn't mean it's always the best approach, but Mach feels more 
> consistent to me.

The use of Mach in OSX is a whole lot more limited than you might think.
The three uses are the BSD+IOKit kernel, the window server, and the 
security server.  The filesystems are still inside the BSD task, as are
most drivers.  The exception here is certain kinds of USB peripheral
drivers.  The USB hardware driver itself is still inside the kernel task.

> 
>>> 3) the IOMemoryDescriptor and IOMemoryCursor classes, which provide  
>>> an abstraction for managing virtual memory mappings and representing  
>>> DMA or PIO activity (ie, building a scatter/gather list appropriate  
>>> for a particular NIC or RAID controller's DMA engine):
>>>
>>> http://developer.apple.com/documentation/DeviceDrivers/Conceptual/ 
>>> IOKitFundamentals/DataMgmt/chapter_9_section_5.html
>>
>>
>> There is already a well established and stable API for doing DMA in 
>> FreeBSD.  Just about every driver in the kernel uses it.  Why change?
> 
> 
> You mean isa_dmacascade(), isa_dma_acquire(), isa_dmainit() and 
> bus_dma_*...?
> 
> Eww.

Uh, what?

> 
> The forces of entropy are winning the fight to keep the ISA bus and DMA 
> bounce buffers which must be less than 64K around forever, even on 
> hardware which doesn't have such limitations.  :-)

Until the G5 was introduced, OSX never had to worry about making 32-bit 
DMA work on >4GB memory configurations, and it certainly never worried
about ISA DMA.  These are all still realities for i386 and amd64.  There
are a lot of common I/O controllers out there, including traditional 
ATA, that can't do 64-bit DMA and thus __require__ bounce buffers.
Sparc64 requires that you program the IOMMU in order to do any DMA.
Busdma makes all of this transparent.  And as for the G5, it does h0h0
magic to make 32bit DMA work that is outside the scope of the IOMemory
classes.

So, I'm not sure what you have against the existing APIs, but they work well
for the FreeBSD environment.
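
To make the bounce-buffer point concrete, here is a rough userland sketch
in C.  It is not the busdma API and it touches no real hardware.  Every
name in it (toy_dma_load, toy_dma_map, DMA_LIMIT_32BIT, bounce_page, the
pretend physical addresses) is made up for illustration.  It only shows
the decision that has to be made somewhere: if the device cannot reach
the buffer, the data is copied to memory it can reach, and the driver
gets handed that address instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limit for a controller that can only address 32 bits. */
#define DMA_LIMIT_32BIT 0xFFFFFFFFULL
#define BOUNCE_SIZE     4096

/* Toy stand-in for a DMA mapping; not a real kernel structure. */
struct toy_dma_map {
    uint64_t dev_addr;  /* address the device would be told to use */
    int      bounced;   /* 1 if the data went through the bounce page */
    size_t   len;
};

/* Simulated bounce page with a pretend low physical address. */
static uint8_t bounce_page[BOUNCE_SIZE];
static const uint64_t bounce_paddr = 0x00100000ULL;

/* Is the whole range [paddr, paddr + len) at or below the limit? */
static int
toy_dma_reachable(uint64_t paddr, size_t len, uint64_t limit)
{
    return (len > 0 && paddr <= limit && (len - 1) <= limit - paddr);
}

/*
 * Pick the address to hand to the device for an outbound transfer.
 * Returns 0 on success, -1 if the request cannot be satisfied.
 */
static int
toy_dma_load(struct toy_dma_map *map, const void *buf, uint64_t paddr,
    size_t len, uint64_t limit)
{
    assert(map != NULL && buf != NULL);
    if (len == 0)
        return (-1);
    if (toy_dma_reachable(paddr, len, limit)) {
        /* Device can reach the buffer directly; no copy needed. */
        map->dev_addr = paddr;
        map->bounced = 0;
    } else {
        /* Out of reach: copy into the bounce page first. */
        if (len > BOUNCE_SIZE)
            return (-1);
        memcpy(bounce_page, buf, len);
        map->dev_addr = bounce_paddr;
        map->bounced = 1;
    }
    map->len = len;
    return (0);
}
```

The caller never has to know which branch was taken, which is the sense
in which the copy is transparent to the driver.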

> 
>> There are good ideas in the IOKit that I've advocated for FreeBSD in the
>> past (interrupt filters, for example), and the object oriented approach
>> there is certainly interesting, but I don't see it as a cure all to 
>> stability or ease.
> 
> 
> The IOKit isn't a cure-all, nor is an OO viewpoint always the best 
> approach. There isn't too much difference between inheriting the right 
> behavior and having stuff like this in every driver:
> 
> static device_method_t mypci_methods[] = {
>     /* Device interface */
>     DEVMETHOD(device_probe,     mypci_probe),
>     DEVMETHOD(device_attach,    mypci_attach),
>     DEVMETHOD(device_detach,    mypci_detach),
>     DEVMETHOD(device_shutdown,  mypci_shutdown),
>     DEVMETHOD(device_suspend,   mypci_suspend),
>     DEVMETHOD(device_resume,    mypci_resume),
>     { 0, 0 }
> };
> 
> On the other hand, using inheritance for drivers seems to work pretty 
> well in practice, and the notion of encapsulation seems to help Darwin 
> avoid running into nearly as many lock-order reversals and layering 
> violations.
> 

Again, IOKit doesn't cover pseudo drivers, and it papers over locking by
providing high level serialization constructs.  It would be interesting
to write an IOKit driver two different ways, one that uses work loops
and one that uses mutexes directly, and see if there is any performance
difference on SMP.  Until then, it's hard to say that work loops have a
practical advantage in high performance environments.  I'm starting to
see evidence in FreeBSD that excessive serialization in device drivers
is not good.  Also, workloops aren't available outside of IOKit, and
Darwin provides no good tools like WITNESS to detect and debug
locking problems, so it must be done through trial and error. That is
really not fun.  As interesting as Darwin is, I still prefer to work in
FreeBSD.
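
For what it's worth, the structural difference between the two styles
can be sketched in userland C with pthreads.  This is not IOKit or
FreeBSD kernel code, and all of the names (toy_gate, toy_softc,
toy_run_pair and so on) are invented for the example.  The "gated"
variant funnels every action through one lock, roughly the way a work
loop gate serializes a driver.  The "locked" variant gives the
transmit and receive state their own mutexes so the two paths never
wait on each other.  It measures nothing; timing the two on SMP is the
experiment I'd still like to see.

```c
#include <assert.h>
#include <pthread.h>

#define TOY_ITERS 100000L

typedef void (*toy_action_t)(void *ctx);

/* Work-loop style gate: one lock, actions run one at a time. */
struct toy_gate {
    pthread_mutex_t lock;
};

/* Toy per-device state with both locking schemes side by side. */
struct toy_softc {
    struct toy_gate gate;       /* serializes everything */
    pthread_mutex_t tx_lock;    /* protects tx_count only */
    pthread_mutex_t rx_lock;    /* protects rx_count only */
    long tx_count;
    long rx_count;
};

static void
toy_softc_init(struct toy_softc *sc)
{
    pthread_mutex_init(&sc->gate.lock, NULL);
    pthread_mutex_init(&sc->tx_lock, NULL);
    pthread_mutex_init(&sc->rx_lock, NULL);
    sc->tx_count = sc->rx_count = 0;
}

/* Run an action with the gate held. */
static void
toy_gate_run(struct toy_gate *g, toy_action_t action, void *ctx)
{
    pthread_mutex_lock(&g->lock);
    action(ctx);
    pthread_mutex_unlock(&g->lock);
}

static void toy_tx_action(void *ctx) { ((struct toy_softc *)ctx)->tx_count++; }
static void toy_rx_action(void *ctx) { ((struct toy_softc *)ctx)->rx_count++; }

/* Gated variant: tx and rx both go through the single gate. */
static void *
toy_tx_gated(void *arg)
{
    struct toy_softc *sc = arg;
    for (long i = 0; i < TOY_ITERS; i++)
        toy_gate_run(&sc->gate, toy_tx_action, sc);
    return (NULL);
}

static void *
toy_rx_gated(void *arg)
{
    struct toy_softc *sc = arg;
    for (long i = 0; i < TOY_ITERS; i++)
        toy_gate_run(&sc->gate, toy_rx_action, sc);
    return (NULL);
}

/* Mutex variant: each path takes only the lock for its own state. */
static void *
toy_tx_locked(void *arg)
{
    struct toy_softc *sc = arg;
    for (long i = 0; i < TOY_ITERS; i++) {
        pthread_mutex_lock(&sc->tx_lock);
        sc->tx_count++;
        pthread_mutex_unlock(&sc->tx_lock);
    }
    return (NULL);
}

static void *
toy_rx_locked(void *arg)
{
    struct toy_softc *sc = arg;
    for (long i = 0; i < TOY_ITERS; i++) {
        pthread_mutex_lock(&sc->rx_lock);
        sc->rx_count++;
        pthread_mutex_unlock(&sc->rx_lock);
    }
    return (NULL);
}

/* Reset the counters, run one tx and one rx thread, wait for both. */
static void
toy_run_pair(struct toy_softc *sc, void *(*tx)(void *), void *(*rx)(void *))
{
    pthread_t a, b;
    int rc;

    sc->tx_count = sc->rx_count = 0;
    rc = pthread_create(&a, NULL, tx, sc);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, rx, sc);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
}
```

Calling toy_run_pair() once with the gated pair and once with the
locked pair gives the same counts either way.  Only the amount of
serialization differs, which is the thing worth timing.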

Scott

