Machine freezes when loading pf ruleset

Kolontai Andrej Andrej.Kolontai at Verwaltung.Uni-Muenchen.DE
Tue Aug 25 15:56:25 UTC 2015


Hello,

I'm new to this list and I hope it's the right place to ask. 

We have highly utilized installation of two FreeBSD-machines running 10.1-RELEASE, pf and carp. There are about 50 networks (some via vlan, some ipsec) connected to them, usually about 50000 pf states, about 1500 rules and traffic  sometimes hitting 1 gbit/s or more. 

When we load the ruleset on the active machine it sometimes freezes. It occurs more often as our installation grows. Sometimes it freezes completely, which is actually good as the backup machine takes over. But at other times it's still some kind of alive but unbearably slow (ping is about 40 s instead of the usual 0.2 ms), the backup machine does not take over and people start to get angry. 
The crashed machine usually recovers after a couple of minutes. 

We've been trying to resolve this for some months now and are running out of ideas. We do believe that something is wrong with or setup but have no clue what it could be. 

When it happens, the syslogs show tons of messages like these:

Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.101.87 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.101.87 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.107.206 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.101.87 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.107.206 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.101.87 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.106.171 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.106.171 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.106.72 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.106.58 to 141.84.149.19
Aug 24 09:08:39 applej last message repeated 2 times
Aug 24 09:08:39 applej kernel: pf: BAD state: TCP in wire: 10.200.108.141:49202 172.23.2.92:8192 stack: - [lo=3482585550 high=3482585552 win=8192 modulator=0pf_map_addr: src tracking maps ]172.23.106.72 [lo=0 hi
gh=1 win=1 modulator=0 to ]141.84.149.18 2:0
Aug 24 09:08:39 applej kernel: S seq=3503222902 (3503222902) ack=0 len=0 ackskew=0 pkts=1:1 dir=in,fwd
Aug 24 09:08:39 applej kernel: pf: State failure on: 1       | 5  
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.106.171 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.107.20 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps 172.23.101.87 to 141.84.149.18
Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps pf_map_addr: src tracking maps pf_map_addr: src tracking maps 172.23.106.171172.23.106.171 to 172.23.106.171141.84.149.18 to  to 
Aug 24 09:08:39 applej kernel: 141.84.149.18141.84.149.18
Aug 24 09:08:39 applej kernel: 
Aug 24 09:08:40 applej kernel: pf_map_addr: src tracking maps pf_map_addr: src tracking maps 172.23.106.171172.23.101.87 to 
Aug 24 09:08:40 applej kernel: to 141.84.149.18141.84.149.18
Aug 24 09:08:40 applej kernel: 
Aug 24 09:08:40 applej kernel: pf: OK ICMP 3:10 141.84.149.249 -> 10.153.101.239 state: TCP in wire: 141.84.149.249:37677 10.153.101.239
Aug 24 09:08:40 applej kernel: :80 stack: - [lo=4171141930 high=4171159337 win=106 modulator=0 wscale=7] [lo=2653133771 high=2653147339 win=136 modulator=0 wscale=7] 10:10 seq=2653133771
Aug 24 09:08:40 applej kernel: pf: OK ICMP 3:10 141.84.149.249 -> 10.153.101.239 state: TCP out wire: 10.153.101.239:80 141.84.149.249
Aug 24 09:08:40 applej kernel: :37677 stack: - [lo=4171141930 high=4171159337 win=106 modulator=0 wscale=7] [lo=2653133771 high=2653147339 win=136 modulator=0 wscale=7] 10:10 seq=2653133771
Aug 24 09:08:40 applej kernel: pf_map_addr: src tracking maps pf_map_addr: src tracking maps 172.23.106.184pf: wire key attach failed on vlan42: 172.23.106.165pf_map_addr: src tracking maps 
Aug 24 09:08:40 applej kernel: 112 to  in141.84.149.18 wire: 172.23.106.58
Aug 24 09:08:40 applej kernel: 141.84.149.27
Aug 24 09:08:40 applej kernel: to 224.0.0.18 to  1:0141.84.149.18, existing: 
Aug 24 09:08:40 applej kernel: 112141.84.149.19 in
Aug 24 09:08:40 applej kernel: wire: 
Aug 24 09:08:40 applej kernel: 141.84.149.27 
Aug 24 09:08:40 applej kernel: 224.0.0.18 stack: 141.84.149.27 224.0.0.18 1:0
Aug 24 09:08:40 applej kernel: pf: wire key attach failed on vlan42: 112 in wire: 141.84.149.1 224.0.0.18 1:0, existing: 
Aug 24 09:08:40 applej kernel: 112 in wire: 141.84.149.1 224.0.0.18 stack: 141.84.149.1 224.0.0.18 1:0
Aug 24 09:08:40 applej kernel: 
Aug 24 09:08:40 applej kernel: pf_map_addr: src tracking maps pf_map_addr: src tracking maps pf: wire key attach failed on vlan250: 172.23.100.108pf: OK ICMP 3:10 112
Aug 24 09:08:40 applej kernel: 172.23.106.184 in to  wire:  to 10.153.101.73141.84.149.18 224.0.0.18141.84.149.18 1:0141.84.149.198, existing: 
Aug 24 09:08:40 applej kernel: 112 in
Aug 24 09:08:40 applej kernel: pf_map_addr: src tracking maps 
Aug 24 09:08:40 applej kernel: 
Aug 24 09:08:40 applej kernel: wire:  -> 10.153.101.73172.23.106.58  to 224.0.0.18141.84.149.19 stack: 
Aug 24 09:08:40 applej kernel: 10.153.101.73
Aug 24 09:08:40 applej kernel: 77.7.121.109  state: 224.0.0.18TCP 1:0
Aug 24 09:08:40 applej kernel: out wire: 141.84.149.198:80 77.7.121.109:52521 stack: 
Aug 24 09:08:40 applej kernel: - [lo=2750816253 high=2750829820 win=256 modulator=0 wscale=8] [lo=2902946358 high=2903011894 win=106 modulator=0
Aug 24 09:08:40 applej kernel: wscale=7] 7:9 seq=2750816252
Aug 24 09:08:40 applej kernel: pf: OK ICMP 3:10 141.84.149.198 -> 77.7.121.109 state: TCP in wire: 77.7.121.109:52521
Aug 24 09:08:40 applej kernel: 141.84.149.198:80 stack: - [lo=2750816253 high=2750829820 win=256 modulator=0
Aug 24 09:08:40 applej kernel: wscale=8] [lo=2902946358 high=2903011894 win=106 modulator=0 wscale=7] 7:9 seq=2750816252 
....
Aug 24 09:09:11 applej kernel: 4:7:80:1433  seq=1025128833
Aug 24 09:09:11 applej kernel: 141.84.149.19:47806  stack: 10.110.32.129-:50143 [lo=3790813207 high=3790831127 win=14600 modulator=0 stack:  wscale=8-] [lo=3961092366 high=3962143087 win=4106 modulator=0 [lo=155
1251521 high=1554989121 win=17898 modulator=0 wscale=8 wscale=8]] [lo=1005669588 high=1006720647 win=4106 modulator=0 4:2 wscale=8 seq=1551251520
Aug 24 09:09:11 applej kernel: ] 4:4 seq=1005669587
Aug 24 09:09:11 applej kernel: pf: OK ICMP 3:1 172.23.254.250pf: OK ICMP 3:1  -> 10.110.32.220172.23.106.87 ->  state: 10.110.32.193TCP state:  inTCP wire:  out172.23.106.87 wire: :5017310.110.32.193 :1433141.84
.149.21 :844310.110.32.130 stack: :54454172.23.106.87 stack: :50173-  [lo=2704530546 high=2705581267 win=4106 modulator=0141.84.149.18 wscale=8:8443] [lo=521335749 high=521353924 win=256 modulator=0 [lo=19884644
01 high=1989515460 win=4106 modulator=0 wscale=8 wscale=8]] [lo=3203135938 high=3203200962 win=71 modulator=0 4:4 wscale=8 seq=1988464400
Aug 24 09:09:11 applej kernel: ] 7:4 seq=521335748
Aug 24 09:09:11 applej kernel: pf: OK ICMP 3:1 pf: OK ICMP 3:1 pf: OK ICMP 3:1 pf: OK ICMP 3:1 pf: OK ICMP 3:1 pf: OK ICMP 3:1 172.23.254.250172.23.254.25010.110.32.220 ->  -> 172.23.106.8710.110.32.193 state:  
state: TCPTCP in out wire:  wire: 172.23.106.8710.110.32.193:50168:1433  141.84.149.2110.110.32.129:8443:50175 stack:  stack: 172.23.106.87-:50168 [lo=1504649300 high=1505700021 win=4106 modulator=0  wscale=8141
.84.149.18]:8443 [lo=723475428 high=724526487 win=4106 modulator=0 [lo=975354116 high=975374851 win=256 modulator=0 wscale=8 wscale=8]] 4:4 [lo=4251253613 high=4251317869 win=81 modulator=0 seq=723475427
Aug 24 09:09:11 applej kernel: wscale=8] 7:4 seq=975354115
Aug 24 09:09:11 applej kernel: 141.84.44.211 -> 172.23.254.250217.86.168.236 ->  state: 172.23.3.29TCP state:  inTCP wire: 141.84.44.211217.86.168.236 -> :5098310.180.10.209  state: 141.84.149.198TCP:80 in stack
:  wire: -10.180.10.209 [lo=3709359805 high=3709359806 win=8192 modulator=0 in:49223]  [lo=0 high=8192 win=1 modulator=0141.84.149.244]:443 2:0 stack:  seq=3709359804
Aug 24 09:09:11 applej kernel: - [lo=827930303 high=827944894 win=67 modulator=0 wire:  wscale=8172.23.3.29]:35196 [lo=1796383793 high=1796400177 win=114 modulator=0  wscale=710.153.101.108]:6556 10:10 stack:  s
eq=827930303
Aug 24 09:09:11 applej kernel: - [lo=1991477221 high=1991477222 win=14600 modulator=0 -> ]172.23.101.172 [lo=0 high=14600 win=1 modulator=0 state: ]TCP 2:0 in seq=1991477220
Aug 24 09:09:11 applej kernel: wire: 172.23.101.172:49613 141.84.149.21:8080 stack: 172.23.101.172:49613 141.84.149.19:8080 [lo=3559829126 high=3559829127 win=8192 modulator=0] [lo=0 high=8192 win=1 modulator=0]
 2:0 seq=3559829125
Aug 24 09:09:11 applej kernel: pf: OK ICMP 3:1 pf: OK ICMP 3:1 10.110.32.220141.84.44.211 -> pf_map_addr: src tracking maps  -> 172.23.101.8710.110.32.193 to 115.248.50.20141.84.149.19 state: 
Aug 24 09:09:11 applej kernel: state: TCPTCP out in wire:  wire: 10.110.32.193115.248.50.20:1433:50426  10.110.32.130141.84.149.229:54455:80 stack:  stack: -- [lo=201188016 high=202238737 win=513 modulator=0 [lo
=2382456191 high=2382456193 win=2048 modulator=0 wscale=8]] [lo=0 high=1 win=1 modulator=0 [lo=3935828538 high=3935959789 win=4106 modulator=0] wscale=8 2:0] seq=2382456191
Aug 24 09:09:11 applej kernel: 4:4 seq=3935828537

Sorry if this is too much but I really can' t tell  which of these messages could help. Perhaps there isn't any useful information at all. To me it seems it's just a side effect of the actual problem which I am unable to see.

Maybe somebody has an idea about what's going wrong and what to investigate. Or theories right now are:
* issues with pfsync  (messages complaining about states, but unlikely, it's normally working just fine)
* some lack of ressources (yet, which?)
* race condition inside the kernel (there are 24 CPU cores)

We have not yet been able to reproduce it in a lab environment. 

Right now, we work around this by putting the master into the backup state (using demotion) before loading the ruleset and changing back to master state after the reload. This is working fine. But it kills, of course, the IPSec-SAs. But that's something we can live with right now. 


Viele Grüße 
Andrej Kolontai

Ludwig-Maximilians-Universitaet Muenchen
Ref. VI.4 (IT-Sicherheit & Verzeichnisdienste) 
Martiusstrasse 4 / 207
80802 Muenchen

phone  +49 (0)89 2180-3815
email  mailto:andrej.kolontai at verwaltung.uni-muenchen.de
web     http://www.uni-muenchen.de/zuv/it/




More information about the freebsd-pf mailing list