[CFT] new tables for ipfw

Alexander V. Chernikov melifaro at yandex-team.ru
Thu Aug 14 12:46:12 UTC 2014


On 14.08.2014 16:08, Marko Zec wrote:
> On Thu, 14 Aug 2014 15:52:34 +0400
> "Alexander V. Chernikov" <melifaro at yandex-team.ru> wrote:
>
>> On 14.08.2014 15:15, Luigi Rizzo wrote:
>>>
>>>
>>> On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov
>>> <melifaro at yandex-team.ru> wrote:
>>>
>>>      On 14.08.2014 14:44, Luigi Rizzo wrote:
>>>>
>>>>
>>>>      On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov
>>>>      <melifaro at yandex-team.ru> wrote:
>>>>
>>>>          On 14.08.2014 13:23, Luigi Rizzo wrote:
>>>>>
>>>>>
>>>>>          On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov
>>>>>          <melifaro at yandex-team.ru> wrote:
>>>>>
>>>>>              Hello list.
>>>>>
>>>>>              I've been hacking on ipfw for a while, and it seems there
>>>>> is something ready to test/review in the projects/ipfw branch.
>>>>>
>>>>>
>>>>>          this is a fantastic piece of work, thanks for doing it
>>>>> and for integrating the feedback.
>>>>>
>>>>>          I have some detailed feedback that I will send you
>>>>> privately, but just a curiosity:
>>>>>
>>>>>              ...
>>>>>
>>>>>              Some examples (see ipfw(8) manual page for the
>>>>> description):
>>>>>
>>>>>              ...
>>>>>
>>>>>
>>>>>                ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64"
>>>>>
>>>>>
>>>>>          why do we need to specify mask lengths in the above?
>>>>          Well, since we're hashing the IP address, we have to know the
>>>> mask in advance to cut off the host bits.
>>>>          (And the real reason is that I'm too lazy to implement
>>>>          hierarchical matching (check /32, then /31, then /30) like
>>>>          how, for example,
>>>>
>>>>
>>>>      oh well, for that we should use cidr:radix
>>>>
>>>>      Research results have never shown a strong superiority of
>>>>      hierarchical hash tables over good radix implementations,
>>>>      and in those cases one usually adopts partial prefix
>>>>      expansion so you only have, say, masks that are a
>>>>      multiple of 2..8 bits so you only need a small number of
>>>>      hash lookups.
>>>      Definitely, especially for IPv6. So I was actually thinking
>>> about covering some special sparse cases (e.g. someone having a
>>> bunch of /32 and a bunch of /30 and that's all).
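[Illustration, not part of the original exchange.] The sparse case discussed above (a few known prefix lengths, one hash lookup per length) can be sketched as follows. This is a toy model under stated assumptions, not the ipfw implementation: the class name `MaskedHashTable` and its methods are hypothetical, it handles IPv4 only, and it uses a plain Python dict per prefix length. It shows why the mask lengths must be known up front: the host bits have to be cut before the address can be used as a hash key.

```python
import ipaddress


class MaskedHashTable:
    """Toy IPv4 table: one dict per configured prefix length (hypothetical)."""

    def __init__(self, masks):
        # Try the longest prefix first so the most specific entry wins.
        self.masks = sorted(set(masks), reverse=True)
        self.buckets = {m: {} for m in self.masks}

    def add(self, prefix, value):
        net = ipaddress.IPv4Network(prefix, strict=False)
        if net.prefixlen not in self.buckets:
            raise ValueError("prefix length /%d not configured" % net.prefixlen)
        self.buckets[net.prefixlen][int(net.network_address)] = value

    def lookup(self, addr):
        ip = int(ipaddress.IPv4Address(addr))
        for m in self.masks:
            # Cut the host bits, then do one exact-match hash lookup.
            key = ip & (((1 << m) - 1) << (32 - m))
            if key in self.buckets[m]:
                return self.buckets[m][key]
        return None


table = MaskedHashTable([32, 30])
table.add("10.0.0.4/30", "net")
table.add("10.0.0.5/32", "host")
print(table.lookup("10.0.0.5"))  # matches the /32 entry
print(table.lookup("10.0.0.6"))  # falls through to the /30 entry
```

The lookup cost grows with the number of configured prefix lengths, which is why this only looks attractive when that number is small.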
>>>
>>>      Btw, since we're talking about "good radix implementation": what
>>>      license does DXR have? :)
>>>      Is it OK to merge it as another cidr implementation?
>>>
>>> "cidr" is a very ugly name, I'd rather use "addr".
>> Ok, no problem with that. "addr" really sounds better.
>>> DXR has a BSD license and of course it is possible to use it.
>>> You should ask Marko Zec for his latest version of the code
>>> (and probably make sure we have one copy of the code in the source
>>> tree).
>> Great! I'll ask him :)
> The cleanest DXR implementation so far is significantly C++-polluted and
> wrapped inside Click glue (available here: http://www.nxab.fer.hr/dxr)
>
> I'll try to backport the fixes to the original C-only / BSD
> implementation over the weekend and let you know how it goes...
Great! I've got the 2012 version half-ported (and the radix fix has been
merged to the tree), but something has definitely changed since then :)
I'd be happy to hear from you :)
>
> Marko
>
>
>>> Speaking of features, one thing that would be nice is the ability
>>> for tables to reference the in-kernel tables (e.g. fibs, socket
>>> lists, interface lists...), perhaps in read-only mode.
>>> How complex do you think that would be?
>> Implementing algo support for a particular provider like
>> sockets/iflists shouldn't be hard. Most of the algorithms' complexity
>> lies in table modifications. Here we only have to support
>> lookup and dump operations, so it is a question of providing the
>> necessary bindings to existing mechanisms (via some direct binding or
>> by utilizing things like kernel_sysctl for dump support).
>>
>> It looks like the following maps well to the current table concept:
>> * such tables are not created by default
>> * the user issues
>>    `ipfw table kfib create type addr algo "addr:kernel fib=0"`
>> or
>>    `ipfw table ktcp create type flow algo "flow:kernel_tcp fib=0"`
>> or
>>    `ipfw table kiface create type iface algo "iface:kernel"`
>> * tables have a special "readonly" type; flush_all requests are ignored
>> * no state is stored internally
>>
>> So the generic table handling code needs to be modified to support
>> read-only tables (and to make more callbacks optional).
>> Additionally, we might need to proxy the "info" request into an algo
>> callback (optional, "real" algorithms won't implement it) to be able
>> to show the number of items (and some other info) to the user.
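[Illustration, not part of the original exchange.] The read-only table idea above can be sketched as generic table code that treats most algorithm callbacks as optional. Everything here is hypothetical: `TableAlgo`, `Table`, and the `kernel_ifaces` dict (a stand-in for a kernel-owned interface list) are invented names for illustration, not ipfw structures. The sketch assumes a table is read-only when its algorithm provides no `add` callback, that flush requests on such a table are silently ignored, and that the "info" request is proxied to the algorithm only when it supplies an `info` callback.

```python
class TableAlgo:
    """Hypothetical callback set; None means 'not implemented'."""

    def __init__(self, lookup, dump, add=None, flush=None, info=None):
        self.lookup = lookup  # required
        self.dump = dump      # required
        self.add = add        # optional: absent for read-only providers
        self.flush = flush    # optional
        self.info = info      # optional: extra fields for the "info" request


class Table:
    """Generic table code that tolerates missing callbacks."""

    def __init__(self, name, algo):
        self.name = name
        self.algo = algo

    @property
    def readonly(self):
        return self.algo.add is None

    def add(self, key, value):
        if self.readonly:
            raise PermissionError("table %s is read-only" % self.name)
        self.algo.add(key, value)

    def flush(self):
        # Flush requests on tables without a flush callback are ignored.
        if self.algo.flush is None:
            return False
        self.algo.flush()
        return True

    def info(self):
        result = {"name": self.name, "readonly": self.readonly}
        if self.algo.info is not None:
            result.update(self.algo.info())  # proxy to the algorithm
        return result


# Stand-in for a kernel-owned interface list; the table keeps no state itself.
kernel_ifaces = {"em0": 1, "lo0": 2}

kiface = Table("kiface", TableAlgo(
    lookup=kernel_ifaces.get,
    dump=lambda: sorted(kernel_ifaces.items()),
    info=lambda: {"count": len(kernel_ifaces)},
))
print(kiface.info())
```

Because the table only forwards to the provider, anything that changes in `kernel_ifaces` is visible on the next lookup or dump without any table modification.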
>>
>>
>>
>>> cheers
>>> luigi
>>>
