Re: 12.2 Splay Tree ipfw potential panic source

From: Karl Denninger <karl_at_denninger.net>
Date: Fri, 09 Jul 2021 00:53:01 UTC

On 7/8/2021 18:11, Lutz Donnerhacke wrote:
> On Thu, Jul 08, 2021 at 05:38:35PM -0400, Karl Denninger wrote:
>> This is the only change I'm aware of in the build I just put on a firewall
>> that uses ipfw heavily, and it is now panic'ing quite regularly.
> The code was delayed considerably for MFC until such problems are
> identified, tested and solved (AFAIK) in main. The tests are also present in
> stable, and the CI system (jenkins) did not report any failure after MFCing.
>
>> I'll see if I can get a dump from it but that may be difficult as the
>> machine in question is a "small box" without attached storage (boots from
>> an SD card.)
> That would be really helpful. Most notably, the bugs I can't find easily are
> in the area of inbound redirection and protocol helpers. My use case is
> Carrier Grade NAT (outbound).

The box in question has a material number of "permanent" hole punch 
redirects for inbound links, to wit:

        ${fwcmd} nat 100 config if ${oif} log same_ports reset \
            redirect_port tcp ${fwd}:2552 2552 \
            redirect_port tcp 192.168.10.214:8080 8080 \
            redirect_port tcp 192.168.10.203:443 4443 \
            redirect_port tcp 192.168.10.216:8080 8088 \
            redirect_port tcp ${fwd}:993 993 \
            redirect_port tcp ${fwd}:4552 4552 \
            redirect_port tcp ${fwd}:imaps imaps \
            redirect_port tcp ${fwd}:11443 11443 \
            redirect_port tcp ${fwd}:80 80 \
            redirect_port tcp ${fwd}:2200 22042
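
For anyone trying to reproduce this, here is a minimal sketch of how the same
rule could be assembled from a table of redirects, so entries can be dropped
one at a time while bisecting. It is an illustration only, not the script on
the box: the values of oif and fwd are made-up placeholders, and fwcmd is set
to "echo ipfw" so the command is printed rather than executed.

```shell
#!/bin/sh
# Sketch only. fwcmd, oif and fwd are placeholders (assumptions); the real
# firewall script sets them elsewhere. With fwcmd="echo ipfw" the assembled
# command is printed instead of being run.
fwcmd="echo ipfw"
oif="em0"            # hypothetical outside interface
fwd="192.168.10.1"   # hypothetical internal target host

# One "target:port public_port" pair per line, copied from the rule above.
redirects="
${fwd}:2552 2552
192.168.10.214:8080 8080
192.168.10.203:443 4443
192.168.10.216:8080 8088
${fwd}:993 993
${fwd}:4552 4552
${fwd}:imaps imaps
${fwd}:11443 11443
${fwd}:80 80
${fwd}:2200 22042
"

# Turn each table row into a "redirect_port tcp ..." argument group.
args=""
while read -r target port; do
    [ -n "${target}" ] || continue   # skip the blank first and last lines
    args="${args} redirect_port tcp ${target} ${port}"
done <<EOF
${redirects}
EOF

nat_cmd="nat 100 config if ${oif} log same_ports reset${args}"
${fwcmd} ${nat_cmd}
```

Commenting out or deleting a row in the table removes that redirect from the
generated command, which may help narrow down whether one particular inbound
mapping is involved.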

The kernel with the splay-tree improvements survives less than five
minutes before it blows up -- repeatedly.  There is a client (on an
Android phone) that maintains a connection via one of those redirects
(specifically, to the 10.214 IP), along with inbound email on 2552 from
an off-site relay.

I will see if I can get at least a panic backtrace, although the
impacted box is a pcEngines firewall that boots off an SD card.  An AMD
box running the same rev, but without any ipfw side of things running,
has been up for a couple of days without incident.  Odds are reasonably
high that the splay tree changes are responsible, since that's the
"stand-out" difference between the two boxes.

-- 
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/