Re: [PATCH] Re: 12.2 Splay Tree ipfw potential panic source

From: Karl Denninger <karl_at_denninger.net>
Date: Sat, 10 Jul 2021 12:36:29 UTC
On 7/10/2021 04:52, Stefan Esser wrote:
> Am 10.07.21 um 10:23 schrieb Stefan Esser:
>> Am 10.07.21 um 04:41 schrieb Karl Denninger:
>>> Ok, so I have good news and bad news.
>>>
>>> I have the trap and it is definitely in libalias which appears to come about as
>>> a result of a NAT translation attempt.
>>>
>>> Fatal trap 18: integer divide fault while in kernel mode
>> [...]
>>> HouseKeeping() at HouseKeeping+0x1c/frame 0xfffffe0017b6b320
>> The divide by zero at one of the first instructions of HouseKeeping()
>> seems to be caused by this line:
>>
>> /sys/netinet/libalias/alias_db.c:1753:
>>
>>          if (packets % packet_limit == 0) {
>>
>> Seems that packet_limit can become zero, there ...
>>
>> At line 1780 within that function:
>>
>>        		if (now != LibAliasTime) {
>>                          /* retry three times a second */
>>                          packet_limit = packets / 3;
>>                          packets = 0;
>>                          LibAliasTime = now;
>>                  }
>>
>> The static variable packet limit is divided by 3 without any
>> protection against going down to 0.
>>
>> A packet_limit of zero makes no sense (besides causing a divide
>> by zero abort), therefore this value should probably have a lower
>> limit of 1.
>>
>> Maybe that
>>                          packet_limit = packets / 3 + 1;
>>
>> would give an acceptably close result in all cases.
>>
>> Else enforce a minimum value of 1 after the division:
>>
>>                          packet_limit = packets / 3;
>>                          if (packet_limit == 0)
>>                                  packet_limit = 1;
>> Or just:
>>                          packet_limit = packets >= 3 ? packets / 3 : 1;
>>
>> Regards, STefan
> I have just noticed that enforcing a lower limit of 1 is totally
> equivalent to testing for zero before performing the modulo operation.
>
> The attached patch should fix the panic and does not change the way
> packet_limit is calculated. Since the variable is immediately used
> in the modulo when not zero, the additional cost of the test for zero
> is extremely low, less than that of the other suggested changes.
>
> Maybe that increasing packet_limit by 1 is sensible, anyway, since at
> low packet rates it will result in 0 to 5 packets giving the same
> effect (the condition in line 1753 evaluates to true).
>
> Anyway, please try the attached patch, which will fix the divide by
> zero panic.
>
> Regards, STefan
>
> PS: Patch inline in case it is stripped by the mail-list:
>
> diff --git a/sys/netinet/libalias/alias_db.c b/sys/netinet/libalias/alias_db.c
> index c09ad4352ce4..d5dec0709cbe 100644
> --- a/sys/netinet/libalias/alias_db.c
> +++ b/sys/netinet/libalias/alias_db.c
> @@ -1769,7 +1769,7 @@ HouseKeeping(struct libalias *la)
>           * Reduce the amount of house keeping work substantially by
>           * sampling over the packets.
>           */
> -       if (packets % packet_limit == 0) {
> +       if (packet_limit == 0 || packets % packet_limit == 0) {
>                  time_t now;
>
>   #ifdef _KERNEL
>
>
> (Line numbers from -CURRENT, may be slightly off for stable/12.)
Compiling now; I have a roughly hour-long window before a blackout 
period where I can't have that system crashing until late afternoon.  If 
I can get it loaded before then will advise but yeah, what you 
identified would certainly do it if packet_limit became zero......
-- 
Karl Denninger
karl@denninger.net <mailto:karl@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/