Linux kernel compatibility

Jeff Roberson jroberson at jroberson.net
Tue Jan 4 03:21:37 UTC 2011


On Mon, 3 Jan 2011, Matthew Jacob wrote:

>
>>> Also, linux likes to change things very rapidly.  Not to mention a lot of 
>>> their APIs go against the grain on BSD and we would not find them 
>>> aesthetically or architecturally pleasing.
>> 
>> Yea, that's happened so often, one has to wonder if it is intentional on 
>> their part :)  But, alas, I think phk's signature actually explains it.
>> 
>> 
>
> You should read 
> http://www.kernel.org/doc/Documentation/stable_api_nonsense.txt for a 
> reasoning. It's also a misreading of both motive and capabilities to ascribe 
> either malice or incompetence to the linux folks.
>

I would agree with this.  It's more an issue of different priorities.  We 
tend to have more layering, contained APIs, and long-term stability.  They 
tend to favor performance above all else.

> On the title subject, having written several multiplatform drivers and been 
> involved trying to write APIs and ABIs common to multiple platforms and OS' 
> for the last 20 odd years or so, I have to say I have very mixed feelings 
> about the approach Jeff has taken here.
>
> Syntactic differences between different OS functions are *generally* trivial. 
> printf is printf.
>
> Semantic, and implied semantic differences can be tricky. The linux locking 
> model is fundamentally different from the FreeBSD, and only conscious and 
> careful programming in code that calls shim layer stuff can avoid problems 
> (although maybe Jeff has been cleverer than I can be with this).

The locking was relatively straightforward.  And in fact our kernel looks 
like a preemptable real-time linux kernel.  In that case they do the very 
same transformations that I have done.  Spinlocks become mutexes, there 
are no real spinlocks, and semaphores, etc. are implemented with sx.
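
To illustrate the transformation described above, here is a minimal
userland sketch.  It assumes pthread mutexes as a stand-in for kernel
mutexes, and every name in it (the shim functions, struct counter) is
hypothetical; it is not taken from the actual OFED wrapper.

```c
/*
 * Userland sketch only: pthread_mutex_t stands in for a kernel mutex.
 * All names are illustrative and are NOT the real OFED wrapper code.
 */
#include <pthread.h>

/* Linux-style spinlock type, backed by a sleepable mutex. */
typedef struct {
	pthread_mutex_t m;
} spinlock_t;

static inline void
spin_lock_init(spinlock_t *l)
{
	pthread_mutex_init(&l->m, NULL);
}

static inline void
spin_lock(spinlock_t *l)
{
	pthread_mutex_lock(&l->m);
}

static inline void
spin_unlock(spinlock_t *l)
{
	pthread_mutex_unlock(&l->m);
}

/* Returns 1 if the lock was acquired and 0 if it was already held. */
static inline int
spin_trylock(spinlock_t *l)
{
	return (pthread_mutex_trylock(&l->m) == 0);
}

/* Caller code written against the Linux-style names, left unmodified. */
struct counter {
	spinlock_t lock;
	int value;
};

static inline void
counter_init(struct counter *c)
{
	spin_lock_init(&c->lock);
	c->value = 0;
}

/* Increment under the lock and return the new value. */
static inline int
counter_inc(struct counter *c)
{
	int v;

	spin_lock(&c->lock);
	v = ++c->value;
	spin_unlock(&c->lock);
	return (v);
}
```

The caller (counter_inc) never sees that the "spinlock" can sleep, which
is the point of a shim: the imported code keeps its original spelling and
only the thin layer underneath changes.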

In other cases I had to be very careful to preserve the semantics and 
contexts of execution, which at times required the implementation of new 
BSD functionality.  Indeed, any way you crack the nut, care and attention 
to detail are required.

>
> Given the above, centralized KAPI services that multiple OS platforms can 
> share seem a hard goal to reach. There was at least one (sponsored by SCO) 
> which was a complete non-starter (despite the best of intentions). There have 
> been several hardware based efforts that have also come a cropper (I2O and 
> OBIOS).

I agree with this.

>
> Even *within* a single OS community, variants abound that are irreconcilable 
> (the multiple *BSDs; AT&T vs BSD; .....). And once you get outside the 
> kernel, it's complete chaos (the multitude of Linux distros).
>
> I take from all of this that only select and relatively contained kernel 
> subsystems are good candidates for multiple platform instantiations. Direct 
> hardware drivers are generally pretty good for this since they can usually be 
> written with shim type supports pretty easily.
>
> Filesystems are tricky. The attempts by SGI to maintain multiple platform 
> versions of XFS have failed spectacularly. The GFS folks gave up on this one 
> early.

Yes, you'll notice I only did minimal compatibility with network 
interfaces and none with mbufs.  At some points the wrapper is not 
worthwhile.  For Infiniband, SDP (sockets) and IPoIB were more direct 
ports since they touched layers that were too big to translate.

>
> OFED is something of a special case. It's a very large body of code, written 
> specifically for Linux, and is essentially the de facto standard for HPC 
> interconnects. Like I said, I have mixed feelings about Jeff's approach here.
>
> On the one hand, that's a huge shim layer to write (10K LOC!).
>
> On the *other* hand, it makes importation of updated (Linux based) code quite 
> workable. What that *does* imply is that native OFED development will 
> unlikely ever be undertaken on FreeBSD (probably fine- whoza gonna pay for it 
> that ain't paying for Linux development already?).

Please consider that we have imported nearly 700,000 lines of code as part 
of the OFED distribution.  Actually writing the shim layer took less time 
than it would have taken to hand port all of that, and in almost all cases 
the only gain would be incompatibility with updates, as most differences 
are actually quite trivial.

Just yesterday I updated our sources with 9 months of changes from the 
ofed git repository.  It took about an hour to resolve the diffs and 
update the wrapper with a few missing features.  It took probably 4 hours 
for all of the svn operations to complete.

The original OFED porting effort I did with John Polstra and the people at 
Isilon was never updated to my knowledge.  It was more mechanical changes 
and 'felt' more like FreeBSD but fell so far out of date as to be useless. 
Interestingly, there was originally a porting layer in the ofed stack, as 
it originally compiled on many operating systems.  However, the opensource 
effort focused on linux and the linux people wouldn't take it unless the 
shims were removed.

>
> The other thing to ask in making for common functionality is whether or not 
> there are performance and/or functionality tradeoffs being made. There are 
> things you can do in the FreeBSD kernel that you can't (easily) do in Linux 
> (and vice versa). That's another consideration.
>
> Anyway, to end this meandering email, my suggestion would be "Yes, the 10K 
> LOC would be useful for all modules to use!", but only expect it to work 
> (well) for OFED.

The two parts of OFED which required hand porting ended up requiring less 
of it and being more compatible with the Linux code as a result of leaving 
the things which didn't matter on the wrapper.  It's not an all or none 
approach.  Of course that isn't aesthetically pleasing but functionally 
it's very useful.

Thanks,
Jeff


