ipfw divert filter for IPv4 geo-blocking

Mon Aug 1 06:17:55 UTC 2016

On 30/07/2016 10:17 PM, Dr. Rolf Jansen wrote:
>> On 29.07.2016 at 10:23, Dr. Rolf Jansen <rj at obsigna.com> wrote:
>>> On 29.07.2016 at 06:50, Julian Elischer <julian at freebsd.org> wrote:
>>> On 29/07/2016 5:22 PM, Julian Elischer wrote:
>>>> On 29/07/2016 4:53 PM, Dr. Rolf Jansen wrote:
>>>>>> On 28.07.2016 at 23:48, Lee Brown <leeb at ratnaling.org> wrote:
>>>>>>
>>>>>> That makes sense to me.  Your /20 range encompasses 201.222.16.0 -
>>>>>> 201.222.31.255.
>>>>>> If you want 201.222.20.0-201.222.31.255, you'll need 3 ranges:
>>>>>>
>>>>>> 201.222.20.0/22 (201.222.20.0-201.222.23.255)
>>>>>> 201.222.24.0/22 (201.222.24.0-201.222.27.255)
>>>>>> 201.222.28.0/22 (201.222.28.0-201.222.31.255)
>>>>> Ian, Julian and Lee,
>>>>>
>>>>> Thank you very much for your responses. In order not to bloat the thread, I am replying to only one message.
>>>>>
>>>>> I found the problem. As a matter of fact, the respective IP ranges in the LACNIC delegation statistics file are 3 adjacent blocks with 1024 addresses, i.e. those that you listed in your message above:
>>>>>
>>>>> $ grep 201.222.2 /usr/local/etc/ipdb/IPRanges/lacnic.dat
>>>>> lacnic|BR|ipv4|201.222.20.0|1024|20140710|allocated|164725
>>>>> lacnic|BR|ipv4|201.222.24.0|1024|20140630|allocated|138376
>>>>> lacnic|BR|ipv4|201.222.28.0|1024|20140701|allocated|129095
>>>>>
>>>>> However, my database compilation combines adjacent blocks with the same country code, and the ranges above turn into one block of 3072 addresses, which obviously doesn't have a valid netmask: log2(3072) = 11.5849625, which is not an integer.
>>>>> ...
>>>>> ..., it is not sufficient to forget about optimization; I also need to check whether the delegation files already contain some non-CIDR ranges, which would need to be broken down.
>>>> there is code to take ranges and produce cidr sets.
>>>>
>>>> We used to have exactly that code in the appletalk code before we took it out. Appletalk uses ranges.
>>>> https://svnweb.freebsd.org/base/release/3.2.0/sys/netatalk/at_control.c?view=annotate#l703
>>> though that assumes input in the form of an appletalk sockaddr..
>>> there is also this python module
>>> https://pythonhosted.org/netaddr/tutorial_01.html#support-for-non-standard-address-ranges
>>>
>>>> maybe you can find other versions on the net.
>>>> however if you fully populate the table, you will get the correct result because more specific entries will
>>>> override less specific entries. To do that you would have to have a way to describe to your program what
>>>> value each table entry should output.
>>>> If you did what you do now, then you would specify the value for the required countries, and give a default value for "all others".
>>>> aggregation of adjacent ranges with same value would be an optimisation.
>> Don't worry, breaking down an arbitrary IP range into a CIDR-conforming set of ranges doesn't seem too difficult. ...
>> ...
>> Once I come to a conclusion, I will post it to this mailing list.
> I finished the work on CIDR conformity of the IP ranges tables generated by the tool geoip. The main constraint is that the start and end address of an IP block given by the delegation files MUST BE PRESERVED during the transformation to a set of CIDR records. This target is achieved by:
>
>   1. Finding the largest common netmask boundary of the start address utilizing
>      int(log2(addr_count)); then iteration like Euclid's algorithm in computing
>      a GCD.
>
>   2. Output the CIDR with the given start address and the masklen belonging
>      to the found netmask.
>
>   3. If the CIDR does not match the whole original IP range then set the start
>      address of the next CIDR block to the next boundary of the common netmask,
>      and loop over starting at 1. until the original range has been satisfied.

check out the appletalk code I pointed out to you.. I wrote that in 93 or so
but I remember sweating blood over it to get it right.
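
For comparison, the three quoted steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is neither the code from the geoip tool nor the appletalk code, and the name range_to_cidrs is made up here. At each step the block size is taken as the largest power of two that both divides the start address and still fits into what is left of the range. The standard library's ipaddress.summarize_address_range does the same job and is printed as a cross-check.

```python
import ipaddress

def range_to_cidrs(first, last):
    """Split the inclusive IPv4 range first..last into CIDR blocks.

    Hypothetical helper for illustration; start and end are preserved exactly.
    """
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    blocks = []
    while start <= end:
        # Largest block the start address is aligned to (its lowest set bit).
        # Address 0 is aligned to everything, so allow the full 2**32.
        align = start & -start if start else 1 << 32
        # Largest power of two that still fits into the remaining range.
        fit = 1 << ((end - start + 1).bit_length() - 1)
        size = min(align, fit)
        prefixlen = 32 - (size.bit_length() - 1)
        blocks.append(f"{ipaddress.IPv4Address(start)}/{prefixlen}")
        start += size
    return blocks

if __name__ == "__main__":
    mine = range_to_cidrs("201.222.20.0", "201.222.31.255")
    ref = ipaddress.summarize_address_range(
        ipaddress.IPv4Address("201.222.20.0"),
        ipaddress.IPv4Address("201.222.31.255"))
    print(mine)
    print([str(n) for n in ref])
```

For the LACNIC range discussed earlier in the thread (201.222.20.0 to 201.222.31.255, 3072 addresses) this gives two blocks, a /22 followed by a /21, because 201.222.24.0 happens to sit on a /21 boundary.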
>
> I carefully tested the algorithm, and a table that I pipe from the new geoip tool into ipfw is 100 % identical to the output of the ipfw command 'table N list'.
though that doesn't mean it is semantically identical to the original 
table due to "most specific rule wins" behaviour.

for example, if you type in:

1.2.3.0/24 -> A
and
1.2.3.0/26 -> B
then both rules will be listed the same as what you put in
but if you wanted to get all rules that point to A, without having 
rules that point to B, then you would have to export
1.2.3.64/26  -> A
1.2.3.128/25 -> A
  (i.e. TWO rules)

you could also export
1.2.3.0/24 -> A
1.2.3.0/26 -> 0  (think of it as an "EXCEPT for these" rule)

which is ALSO two rules but you would need to be sure that the 
receiver knows what to do with them.
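
A small Python sketch of the example above, assuming a plain dictionary as the table and a linear scan in place of ipfw's real lookup structure (the lookup helper is hypothetical, not an ipfw interface). It shows the most specific entry winning, and uses the standard library's address_exclude to derive the two blocks that still point to A once the /26 is carved out.

```python
import ipaddress

def lookup(table, addr):
    """Return the value of the most specific entry containing addr, else None."""
    ip = ipaddress.ip_address(addr)
    matches = [net for net in table if ip in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]

table = {
    ipaddress.ip_network("1.2.3.0/24"): "A",
    ipaddress.ip_network("1.2.3.0/26"): "B",
}

# The part of the /24 that still resolves to A once the /26 is removed.
only_a = sorted(
    ipaddress.ip_network("1.2.3.0/24").address_exclude(
        ipaddress.ip_network("1.2.3.0/26")))

if __name__ == "__main__":
    print(lookup(table, "1.2.3.10"))    # address inside the /26
    print(lookup(table, "1.2.3.200"))   # address only inside the /24
    print([str(n) for n in only_a])
```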

>
> It is worth noting that the original RIR delegation files already contain 457 non-CIDR-conforming IPv4 ranges in a total of 165815 original records. I guess that this number will increase in the future, because the RIRs have run out of new IPv4 ranges and have to subdivide returned old ranges for new delegations. The above algorithm is ready for this.
>
> Generally, CIDR-conforming tables are more than twice as large as optimized (joined adjacencies) IP range tables. All said changes have been pushed to GitHub already.
Unfortunately there is no way to specify (using cidr notation) a.b.1.x 
AND a.b.2.x without including a.b.[03].x.

HOWEVER
if you specified the FULL table you could use the "except" feature of 
routing table behaviour where
a.b.0.x/22  -> A
a.b.0.x/24  -> B
a.b.3.x/24  -> B
gives you the same thing because of the "most specific rule wins" 
nature of routing table evaluation.
I believe this is the case in the tables you imported.
the trick is to be able to take an "optimised" table such as that 
above and produce, given a required subset, just the required part, 
while changing the rules as needed on the fly to "de-optimise" them 
enough to maintain correctness.
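
One way such a "de-optimising" export could look, as a rough Python sketch only. The helper names subtract and export_value are invented for this illustration, the table is assumed to be a dictionary of IPv4 CIDR strings to values, and 10.0.x.x stands in for a.b.x.x. For every entry with the wanted value, all strictly more specific entries are cut out of it, so the result is correct without relying on override behaviour at the receiver. It needs Python 3.7 or later for subnet_of.

```python
import ipaddress

def subtract(pieces, hole):
    """Remove the network `hole` from a list of non-overlapping networks."""
    out = []
    for piece in pieces:
        if piece.subnet_of(hole):
            continue                      # piece is swallowed entirely
        if hole.subnet_of(piece):
            out.extend(piece.address_exclude(hole))
        else:
            out.append(piece)             # no overlap, keep as is
    return out

def export_value(table, wanted):
    """List non-overlapping CIDRs that resolve to `wanted` under
    most-specific-wins. `table` maps network strings to values."""
    nets = {ipaddress.ip_network(k): v for k, v in table.items()}
    result = []
    for net, value in nets.items():
        if value != wanted:
            continue
        pieces = [net]
        for other in nets:
            # Any strictly more specific entry overrides this one.
            if other != net and other.subnet_of(net):
                pieces = subtract(pieces, other)
        result.extend(pieces)
    return sorted(result)

if __name__ == "__main__":
    full = {"10.0.0.0/22": "A", "10.0.0.0/24": "B", "10.0.3.0/24": "B"}
    print([str(n) for n in export_value(full, "A")])
```

The result is not merged again afterwards; adjacent blocks with the same value could be passed through ipaddress.collapse_addresses if a shorter list is wanted.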

>
> I am still a little bit amazed that ipfw accepts incorrect CIDR ranges and arbitrarily moves the start/end addresses in order to achieve CIDR conformity, without any further notice, given that ipfw can be considered quite relevant to system security. Or may I assume that ipfw always knows better than the user what should be allowed or denied? Otherwise, perhaps I am the only one who has ever input incorrect CIDR ranges for processing by ipfw.

I answered this before but can't see the answer in my out box, plus I 
have added info..

The ipfw code is derived from the routing code.  The a.b.c.d/N form is 
shorthand notation for a.b.c.d [netmask e.f.g.h]
there is nothing that says that a.b.c.d need be the first address in 
the range. (though some vendors may require that.)
to quote wikipedia on the topic (yes, I know, not an authoritative source)

==== quote ====

The address may denote a single, distinct interface address or the 
beginning address of an entire network. The maximum size of the 
network is given by the number of addresses that are possible with the 
remaining, least-significant bits below the prefix. The aggregation of 
these bits is often called the /host identifier/.

For example:

  * 192.168.100.14/24 represents the IPv4
    <https://en.wikipedia.org/wiki/IPv4> address 192.168.100.14 and
    its associated routing prefix 192.168.100.0, or equivalently, its
    subnet mask 255.255.255.0, which has 24 leading 1-bits.

==== end quote ====

I use this all the time when parsing information that contains a 
hostname, and I know the netmask width. It saves me from having to 
have complicated shell code to pull apart the address and zero out the 
host bits of the address.
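
The same masking can be illustrated with Python's standard ipaddress module. This is a sketch of the behaviour, not ipfw's code, and the function names normalize and is_aligned are made up for the example. With strict=False the host bits are silently zeroed, much as described above; with strict=True an address that is not the first address of its block raises ValueError, which a tool could use to flag such entries before handing them to ipfw.

```python
import ipaddress

def normalize(cidr):
    """Zero the host bits of 'address/len' and return the network as text."""
    return str(ipaddress.ip_network(cidr, strict=False))

def is_aligned(cidr):
    """True if the address part is already the first address of the block."""
    try:
        ipaddress.ip_network(cidr, strict=True)
        return True
    except ValueError:
        return False

if __name__ == "__main__":
    print(normalize("192.168.100.14/24"))
    print(is_aligned("192.168.100.14/24"))
    print(normalize("201.222.20.0/20"))
    print(is_aligned("201.222.20.0/20"))
```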

> Best regards
>
> Rolf