Linux kernel compatibility
Jeff Roberson
jroberson at jroberson.net
Tue Jan 4 23:10:23 UTC 2011
On Tue, 4 Jan 2011, Julian Elischer wrote:
> On 1/4/11 12:53 PM, Jeff Roberson wrote:
>> On Tue, 4 Jan 2011, Alexander Kabaev wrote:
>>
>>> On Mon, 3 Jan 2011 19:03:01 -1000 (HST)
>>> Jeff Roberson <jroberson at jroberson.net> wrote:
>>>
>>>> On Mon, 3 Jan 2011, Alexander Kabaev wrote:
>>>>
>>>>> On Mon, 3 Jan 2011 10:31:24 -1000 (HST)
>>>>> Jeff Roberson <jroberson at jroberson.net> wrote:
>>>>>
>>>>>> Hello Folks,
>>>>>>
>>>>>> Some of you may have seen my infiniband work proceed in svn. It is
>>>>>> coming to a close soon and I will be integrating it into current.
>>>>>> I have a few patches to the kernel to send for review but I wanted
>>>>>> to bring up the KPI wrapper itself for discussion.
>>>>>>
>>>>>> The infiniband port has been done by creating a 10,000 line KPI
>>>>>> compatibility layer. With this layer the vast majority of the
>>>>>> driver code runs unmodified. The exceptions are anything that
>>>>>> interfaces with skbs and most of the code that deals with network
>>>>>> interfaces.
>>>>>>
>>>>>> Some examples of things supported by the wrapper:
>>>>>>
>>>>>> atomics, types, bitops, byte order conversion, character devices,
>>>>>> pci devices, dma, non-device files, idr tables, interrupts,
>>>>>> ioremap, hashes, kobjects, radix trees, lists, modules, notifier
>>>>>> blocks, rbtrees, rwlock, rwsem, semaphore, schedule, spinlocks,
>>>>>> kalloc, wait queues, workqueues, timers, etc.
>>>>>>
>>>>>> Obviously a complete wrapper is impossible and I only implemented
>>>>>> the features that I needed. The build is accomplished by pointing
>>>>>> the linux compatible code at sys/ofed/include/ which has a
>>>>>> simulated linux kernel include tree. There are some config(8)
>>>>>> changes to help this along as well.
>>>>>>
>>>>>> I have seen that some attempt at similar wrappers has been made
>>>>>> elsewhere. I wonder if instead of making each one tailored to
>>>>>> individual components, which mostly seem to be filesystems so far,
>>>>>> should we put this in a central place under compat somewhere? Is
>>>>>> this project doomed to be tied to a single consumer by the specific
>>>>>> nature of it?
>>>>>>
>>>>>> Other comments or concerns?
>>>>>>
>>>>>> Thanks,
>>>>>> Jeff
>>>>>
>>>>>
>>>>> This probably will go against popular opinion here, but having a 10k-line
>>>>> Linux emulation layer that _almost_ works in the tree will be an
>>>>> unfortunate event and will do more damage to FreeBSD as a platform
>>>>> than good in the long run. I would rather see this code never hit
>>>>> main repository.
>>>>
>>>> I would argue that the layer works very well for infiniband. Much
>>>> better than almost. It is only almost complete in that there is no
>>>> need for me to implement features that we're not using.
>>>>
>>>> I am interested in hearing your other concerns however.
>>>>
>>>> Thanks,
>>>> Jeff
>>>>
>>>
>>
>> Alexander, let me first start out by saying I have a great deal of respect
>> for you and I hear your concerns. I see that this is a somewhat heated
>> issue and I can really only address the technical points. The more
>> existential questions about FreeBSD will have to be left to others.
>>
>>> The considerations are simple enough. First, we do not have many IB
>>> users of FreeBSD in the wild and those that we have (Isilon) seem to be
>>> perfectly capable of managing the IB stack out of the tree, without
>>> dumping the thousands of lines of the code into the base. If they had
>>> the stack before, but were not willing/capable to provide adequate care
>>> for it in the past, there is no reason to expect things to change with
>>> second stack, which now will rot in our tree instead of theirs.
>>
>> They provided adequate care for it to keep their product running on old
>> versions of FreeBSD. Unfortunately it is a large stack and there are a
>> great number of people and organizations working on improving and advancing
>> it on Linux via OFED and having a private stack does not give you the
>> benefit of their work. The motivation for making the wrapper layer was
>> entirely to keep pace with this development and make it less likely that
>> what is in the tree will rot.
>>
>>>
>>> Second, semi-complete Linux compat layer in kernel will have the
>>> same effect as linuxulator in userland - we do have some vendors still
>>> trying to bother with FreeBSD drivers for their hardware now and we
>>> will have none after we provide the possibility to hack their Linux
>>> code to run somewhat stably on top of Linux compat layer. Due to
>>> intentional fluidity of Linux kAPI, our shims will never quite walk and
>>> quack like their original implementation in Linux kernel and combined
>>> result will always be less stable than native Linux drivers in
>>> Linux kernel.
>>
>> I have heard this argument about the linuxulator and what we're really
>> talking about is slipping FreeBSD marketshare. I don't share the view that
>> the linuxulator furthered this slip but rather my view is that it allows us
>> to stay relevant in areas where companies can not justify an independent
>> FreeBSD effort. Adobe is a good example of this.
>>
>> Let's talk nuts and bolts about what this thing does. In the vast majority
>> of cases it simply shuffles arguments and function names around where there
>> is a 1:1 correlation between linux api and FreeBSD API. Think about things
>> like atomics, callouts, locks, jiffies vs ticks, etc. In these areas the
>> systems are trivially different. In a very small number of areas where
>> this wasn't the case I did a direct port and noted it with an #ifdef.
>>
>> This works specifically in the infiniband case because it is its own middle
>> layer. You can't write a scsi driver for linux and use it on BSD with
>> this. You can't write a network driver even. But if you do bring in code
>> from linux you don't have to worry about changing every kmalloc to malloc
>> and every printk to printf so diffs can be reduced in trivial cases. I
>> thought given your work on XFS for FreeBSD that would make sense to you.
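To make the "shuffles arguments and function names" point concrete, here is a minimal userland sketch in C of what such 1:1 shims can look like. It is not the code in sys/ofed/include/. The flag value, the `ticks` stand-in, and the `shim_demo()` helper are assumptions made only for illustration, and the in-kernel equivalents would map onto malloc(9) and the kernel printf rather than libc.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for Linux allocation flags (values are arbitrary). */
typedef unsigned int gfp_t;
#define GFP_KERNEL 0x01u

/*
 * kmalloc -> malloc: same job, different name and argument list.
 * This sketch ignores the flags; an in-kernel shim would translate them
 * to the native allocator's flags (M_WAITOK / M_NOWAIT in malloc(9)).
 */
static inline void *kmalloc(size_t size, gfp_t flags)
{
    (void)flags;
    return malloc(size);
}

static inline void kfree(const void *p)
{
    free((void *)p);
}

/* printk -> printf: here routed to stderr because this runs in userland. */
#define printk(...) fprintf(stderr, __VA_ARGS__)

/* jiffies vs ticks: a plain rename of the tick counter (simulated here). */
static volatile int ticks;
#define jiffies ticks

/* Code written against the Linux-style names, running on the shims. */
int shim_demo(void)
{
    char *buf = kmalloc(16, GFP_KERNEL);

    if (buf == NULL)
        return -1;
    strcpy(buf, "ofed");
    assert(strcmp(buf, "ofed") == 0);

    ticks = 42;                      /* pretend the clock has advanced */
    int ok = (jiffies == 42);        /* reads the native counter via the alias */

    printk("shim demo: buf=%s jiffies=%d\n", buf, (int)jiffies);
    kfree(buf);
    return ok ? 0 : 1;
}
```

Calling `shim_demo()` from a test program exercises all three mappings. The point of the sketch is only that code written against the Linux names compiles unchanged once the shim header is on the include path.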
>>
>> Our options are: leave FreeBSD users without infiniband, which I can
>> tell you has cost us more market share, as I know of specific cases we
>> have lost due to it; maintain our own stack independently, which no one
>> has the budget for; or try to integrate with OFED. Do you see some other
>> approach?
> As you may know Alexander and I both work for a company that produces a
> "large" driver that runs on everything from windows to FreeBSD and everything
> in between (linux, osx, esx, aix, solaris).. we have a porting layer
> (Alexander hates it I know).
> But it's not a freebsd to linux layer, it's a freebsd (or whatever) to
> 'internal' layer,
> (though it started out being the first).
> Looking at what we have and what you have it seems to me that we could take a
> subset of the basic CS101
> methods that are there and make a linux driver porter's toolkit.
Yes the problem in this case is that OFED controls the code and they
specifically removed a portability layer. So the portability layer is the
Linux APIs they are currently using. Really in some sense it ends up
being the same thing.
>
> things like linux style linked lists vs our queue macros can be
> addressed easily, but it's just a pain in the neck to do so..
> also linux run queues and such are different but making a small
> set of toolkits to do so wouldn't be a bad idea.
> some of the more esoteric parts might stay with the ib code however.
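As a rough illustration of the linked-list piece of such a toolkit, the sketch below implements a small Linux-style circular list in standalone C. It is an assumption about what a shim could look like, not the wrapper's actual code. The `work_item` structure and `list_demo_sum()` are hypothetical, and a real layer might map onto the sys/queue.h macros instead of carrying its own implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Linux-style list node: embedded in the containing structure. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
static inline void INIT_LIST_HEAD(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert new_node just before head, i.e. at the tail of the list. */
static inline void list_add_tail(struct list_head *new_node,
                                 struct list_head *head)
{
    new_node->prev = head->prev;
    new_node->next = head;
    head->prev->next = new_node;
    head->prev = new_node;
}

/* Unlink entry from whatever list it is on. */
static inline void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = NULL;
}

/* Recover the containing structure from a pointer to its embedded node. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each(pos, head) \
    for ((pos) = (head)->next; (pos) != (head); (pos) = (pos)->next)

/* Hypothetical consumer structure. */
struct work_item {
    int id;
    struct list_head link;
};

/* Build a list of three items, remove the middle one, sum the rest. */
int list_demo_sum(void)
{
    struct list_head head;
    struct work_item a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };
    struct list_head *pos;
    int sum = 0;

    INIT_LIST_HEAD(&head);
    list_add_tail(&a.link, &head);
    list_add_tail(&b.link, &head);
    list_add_tail(&c.link, &head);
    assert(head.next == &a.link);    /* insertion order is preserved */

    list_del(&b.link);

    list_for_each(pos, &head)
        sum += list_entry(pos, struct work_item, link)->id;
    return sum;
}
```

The design difference this highlights is that the Linux-style head and node share one type and the list is circular, while the queue macros generate typed heads per structure, so a straight rename is not always enough for this part.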
After this discussion I'm leaning towards leaving the layer I have in the
ofed/ directory and leaving it tied to the version of ofed we currently
have imported. They actually have a set of scripts to 'backport' their
stack to different linux versions if we were to standardize on some older
linux kernel release but I don't think it's worth the effort.
I understand wanting to limit the spread of hybridized linux kernel code.
It is not my first choice but comparing it with the alternative of not
having some desired feature I will choose the feature.
Thanks,
Jeff
>
>
>
>>
>> Thanks,
>> Jeff
>>
>>>
>>>
>>> --
>>> Alexander Kabaev
>>>
>> _______________________________________________
>> freebsd-arch at freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
>> To unsubscribe, send any mail to "freebsd-arch-unsubscribe at freebsd.org"
>>
>