NFE adapter 'hangs'

Fri Sep 3 06:59:37 UTC 2010

Thank you for your very quick response :)

>First of all, if you have 'route-to' rules in pf please disable TSO
>on nfe(4). This is a known pf(4) issue for handling TSO frames.

There is route-to in the PF ruleset but it is not evaluated, and I've confirmed the rules is not matched

@0 pass in on nfe0 route-to (gif1000 X) inet from X/26 to ! <internalnets> flags S/SA keep state
  [ Evaluations: 126538    Packets: 0         Bytes: 0           States: 0     ]
  [ Inserted: uid 0 pid 79327 ]

>If this is not the case, would you show me the output of
>"sysctl dev.nfe.0.stats"? Can you see these statistics are still
>updated at the time of hang?

Before:

M [root at X:~]# sysctl dev.nfe.0.stats
dev.nfe.0.stats.rx.frame_errors: 0
dev.nfe.0.stats.rx.extra_bytes: 0
dev.nfe.0.stats.rx.late_cols: 0
dev.nfe.0.stats.rx.runts: 0
dev.nfe.0.stats.rx.jumbos: 0
dev.nfe.0.stats.rx.fifo_overuns: 0
dev.nfe.0.stats.rx.crc_errors: 0
dev.nfe.0.stats.rx.fae: 0
dev.nfe.0.stats.rx.len_errors: 0
dev.nfe.0.stats.rx.unicast: 0
dev.nfe.0.stats.rx.multicast: 0
dev.nfe.0.stats.rx.broadcast: 0
dev.nfe.0.stats.rx.octets: 53488472383
dev.nfe.0.stats.rx.pause: 0
dev.nfe.0.stats.rx.drops: 0
dev.nfe.0.stats.tx.octets: 7480691065
dev.nfe.0.stats.tx.zero_rexmits: 68040095
dev.nfe.0.stats.tx.one_rexmits: 0
dev.nfe.0.stats.tx.multi_rexmits: 0
dev.nfe.0.stats.tx.late_cols: 0
dev.nfe.0.stats.tx.fifo_underuns: 0
dev.nfe.0.stats.tx.carrier_losts: 0
dev.nfe.0.stats.tx.excess_deferrals: 0
dev.nfe.0.stats.tx.retry_errors: 0
dev.nfe.0.stats.tx.deferrals: 0
dev.nfe.0.stats.tx.frames: 68040095
dev.nfe.0.stats.tx.pause: 0

After:
M [root at X:~]# sysctl dev.nfe.0.stats
dev.nfe.0.stats.rx.frame_errors: 0
dev.nfe.0.stats.rx.extra_bytes: 0
dev.nfe.0.stats.rx.late_cols: 0
dev.nfe.0.stats.rx.runts: 0
dev.nfe.0.stats.rx.jumbos: 0
dev.nfe.0.stats.rx.fifo_overuns: 0
dev.nfe.0.stats.rx.crc_errors: 0
dev.nfe.0.stats.rx.fae: 0
dev.nfe.0.stats.rx.len_errors: 0
dev.nfe.0.stats.rx.unicast: 0
dev.nfe.0.stats.rx.multicast: 0
dev.nfe.0.stats.rx.broadcast: 0
dev.nfe.0.stats.rx.octets: 53488985199
dev.nfe.0.stats.rx.pause: 0
dev.nfe.0.stats.rx.drops: 0
dev.nfe.0.stats.tx.octets: 7480798032
dev.nfe.0.stats.tx.zero_rexmits: 68041016
dev.nfe.0.stats.tx.one_rexmits: 0
dev.nfe.0.stats.tx.multi_rexmits: 0
dev.nfe.0.stats.tx.late_cols: 0
dev.nfe.0.stats.tx.fifo_underuns: 0
dev.nfe.0.stats.tx.carrier_losts: 0
dev.nfe.0.stats.tx.excess_deferrals: 0
dev.nfe.0.stats.tx.retry_errors: 0
dev.nfe.0.stats.tx.deferrals: 0
dev.nfe.0.stats.tx.frames: 68041016
dev.nfe.0.stats.tx.pause: 0

M [root at X:~]# sysctl dev.nfe.0.stats
dev.nfe.0.stats.rx.frame_errors: 0
dev.nfe.0.stats.rx.extra_bytes: 0
dev.nfe.0.stats.rx.late_cols: 0
dev.nfe.0.stats.rx.runts: 0
dev.nfe.0.stats.rx.jumbos: 0
dev.nfe.0.stats.rx.fifo_overuns: 0
dev.nfe.0.stats.rx.crc_errors: 0
dev.nfe.0.stats.rx.fae: 0
dev.nfe.0.stats.rx.len_errors: 0
dev.nfe.0.stats.rx.unicast: 0
dev.nfe.0.stats.rx.multicast: 0
dev.nfe.0.stats.rx.broadcast: 0
dev.nfe.0.stats.rx.octets: 53489024128
dev.nfe.0.stats.rx.pause: 0
dev.nfe.0.stats.rx.drops: 0
dev.nfe.0.stats.tx.octets: 7480798032
dev.nfe.0.stats.tx.zero_rexmits: 68041016
dev.nfe.0.stats.tx.one_rexmits: 0
dev.nfe.0.stats.tx.multi_rexmits: 0
dev.nfe.0.stats.tx.late_cols: 0
dev.nfe.0.stats.tx.fifo_underuns: 0
dev.nfe.0.stats.tx.carrier_losts: 0
dev.nfe.0.stats.tx.excess_deferrals: 0
dev.nfe.0.stats.tx.retry_errors: 0
dev.nfe.0.stats.tx.deferrals: 0
dev.nfe.0.stats.tx.frames: 68041016
dev.nfe.0.stats.tx.pause: 0

>Also I'd like to know whether both RX and TX are dead or only one
>RX/TX path is hung. Can you see incoming traffic with tcpdump when
>you think the controller is in stuck?

Yes, though not very much. The traffic to 4800 is every second so you can see in the following trace when it stops

07:10:42.287163 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:42.911995
07:10:43.112073 STP 802.1d, Config, Flags [Topology change], bridge-id 8000.c4:7d:4f:a9:ac:30.8008, length 43
07:10:43.148659 IP 192.168.1.203.57026 > 192.168.1.255.4800: UDP, length 60
07:10:43.148684 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 192.168.1.203.57026 > 192.168.1.129.4800: UDP, length 60
07:10:43.148689 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 192.168.1.203.57026 > 192.168.1.1.4800: UDP, length 60
07:10:43.148918 IP 192.168.1.213.40677 > 192.168.1.255.4800: UDP, length 48
07:10:43.329272 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:44.151762 IP 192.168.1.203.57026 > 192.168.1.255.4800: UDP, length 60
07:10:44.151787 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 192.168.1.203.57026 > 192.168.1.129.4800: UDP, length 60
07:10:44.151791 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 192.168.1.203.57026 > 192.168.1.1.4800: UDP, length 60
07:10:44.152043 IP 192.168.1.213.40677 > 192.168.1.255.4800: UDP, length 48
07:10:44.371387 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:45.154868 IP 192.168.1.203.57026 > 192.168.1.255.4800: UDP, length 60
07:10:45.154892 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 192.168.1.203.57026 > 192.168.1.129.4800: UDP, length 60
07:10:45.154896 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 192.168.1.203.57026 > 192.168.1.1.4800: UDP, length 60
07:10:45.155146 IP 192.168.1.213.40677 > 192.168.1.255.4800: UDP, length 48
07:10:45.413495 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:45.513130 IP 192.168.1.213 > 224.0.0.240:  pfsync 196
07:10:46.157973 IP 192.168.1.203.57026 > 192.168.1.255.4800: UDP, length 60
07:10:46.157997 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 192.168.1.203.57026 > 192.168.1.129.4800: UDP, length 60
07:10:46.158002 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 192.168.1.203.57026 > 192.168.1.1.4800: UDP, length 60
07:10:46.158222 IP 192.168.1.213.40677 > 192.168.1.255.4800: UDP, length 48
07:10:46.455622 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:47.497702 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:48.108787 STP 802.1d, Config, Flags [Topology change], bridge-id 8000.c4:7d:4f:a9:ac:30.8008, length 43
07:10:48.150038 IP 192.168.1.213 > 224.0.0.240:  pfsync 108
07:10:48.539811 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:49.581924 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:49.869585
07:10:50.154224 IP 192.168.1.213 > 224.0.0.240:  pfsync 108
07:10:50.624031 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:51.666137 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:52.156425 IP 192.168.1.213 > 224.0.0.240:  pfsync 108
07:10:52.708257 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:53.105667 STP 802.1d, Config, Flags [Topology change], bridge-id 8000.c4:7d:4f:a9:ac:30.8008, length 43
07:10:53.712615 IP 192.168.1.213 > 224.0.0.240:  pfsync 284
07:10:53.750351 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:54.792460 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:54.996715 IP 192.168.1.213 > 224.0.0.240:  pfsync 36
07:10:55.160134 IP 192.168.1.213 > 224.0.0.240:  pfsync 68
07:10:55.834569 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:56.160794 IP 192.168.1.213 > 224.0.0.240:  pfsync 108
07:10:56.876680 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
07:10:57.918791 IP 192.168.1.203 > 224.0.0.240:  pfsync 108

a bit later on, still broken, a slight odd message:
07:11:43.079720 IP 172.31.1.129 > 172.31.1.213: GREv0, length 52: IP 192.168.1.129.60446 > 192.168.1.213.179:  tcp 12 [bad hdr length 16 - too short, < 20]
07:11:44.210794 IP 172.31.1.129 > 172.31.1.203: GREv0, length 84: IP 192.168.1.129.64744 > 192.168.1.203.4800: UDP, length 52
07:11:44.210831 IP 172.31.1.129 > 172.31.1.213: GREv0, length 84: IP 192.168.1.129.64744 > 192.168.1.213.4800: UDP, length 52

Now this really is odd, I don't recognise either of those MAC addresses, though the SQL shown is used on this machine (
07:12:13.054393 45:43:54:20:41:63 > 00:00:03:53:45:4c, ethertype Unknown (0x6374), length 60:
        0x0000:  556e 6971 7565 4964 2046 524f 4d20 7261  UniqueId.FROM.ra
        0x0010:  6461 6363 7420 2057 4845 5245 2043 616c  dacct..WHERE.Cal
        0x0020:  6c69 6e67 5374 6174 696f 6e49 6420       lingStationId.
07:12:13.229942 ARP, Request who-has 172.31.1.129 tell 172.31.1.213, length 28
07:12:13.229969 ARP, Request who-has 172.31.1.129 tell 172.31.1.213, length 28
07:12:14.600212 IP 172.31.1.213 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 240, authtype none, intvl 1s, length 36
07:12:14.600234 ARP, Request who-has 172.31.1.193 tell 172.31.1.193, length 28
07:12:15.643450 IP 172.31.1.213 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 240, authtype none, intvl 1s, length 36
07:12:16.261215 ARP, Request who-has 172.31.1.129 tell 172.31.1.213, length 28
07:12:17.583475 IP 172.31.1.213 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 240, authtype none, intvl 1s, length 36
07:12:18.051111 41:54:45:20:74:72 > 00:00:03:55:50:44, ethertype Unknown (0x616e), length 60:
        0x0000:  7369 656e 742e 7261 6469 7070 6f6f 6c20  sient.radippool.
        0x0010:  5345 5420 6e61 7369 7061 6464 7265 7373  SET.nasipaddress
        0x0020:  203d 2027 3137 322e 3331 2e33 2e31       .=.'172.30.0.1

I then restarted the interface (nfe down/up, route restart)

From dmesg at the time (slight obfuscated)
Sep  3 07:10:19 manch2 bgpd[89612]: neighbor XX: received notification: HoldTimer expired, unknown subcode 0
Sep  3 07:10:49 manch2 bgpd[89612]: neighbor XX connect: Host is down
# at this point I took the interface down & up and reloaded the routing tables
Sep  3 07:12:07 manch2 kernel: carp0: link state changed to DOWN
Sep  3 07:12:07 manch2 kernel: carp0: link state changed to DOWN
Sep  3 07:12:07 manch2 kernel: nfe0: link state changed to DOWN
Sep  3 07:12:07 manch2 kernel: carp0: link state changed to DOWN
Sep  3 07:12:11 manch2 kernel: nfe0: link state changed to UP   
Sep  3 07:12:11 manch2 kernel: carp0: link state changed to DOWN
Sep  3 07:12:14 manch2 kernel: carp0: link state changed to UP
Sep  3 07:13:09 manch2 bgpd[89612]: neighbor YYY: socket error: Operation timed out

And a while later

M [root at manch2:~]# sysctl dev.nfe.0.stats
dev.nfe.0.stats.rx.frame_errors: 0
dev.nfe.0.stats.rx.extra_bytes: 0
dev.nfe.0.stats.rx.late_cols: 0
dev.nfe.0.stats.rx.runts: 0
dev.nfe.0.stats.rx.jumbos: 0
dev.nfe.0.stats.rx.fifo_overuns: 0
dev.nfe.0.stats.rx.crc_errors: 0
dev.nfe.0.stats.rx.fae: 0
dev.nfe.0.stats.rx.len_errors: 0
dev.nfe.0.stats.rx.unicast: 0
dev.nfe.0.stats.rx.multicast: 0
dev.nfe.0.stats.rx.broadcast: 0
dev.nfe.0.stats.rx.octets: 53502234479
dev.nfe.0.stats.rx.pause: 0
dev.nfe.0.stats.rx.drops: 0
dev.nfe.0.stats.tx.octets: 7482784573
dev.nfe.0.stats.tx.zero_rexmits: 68060063
dev.nfe.0.stats.tx.one_rexmits: 0
dev.nfe.0.stats.tx.multi_rexmits: 0
dev.nfe.0.stats.tx.late_cols: 0
dev.nfe.0.stats.tx.fifo_underuns: 0
dev.nfe.0.stats.tx.carrier_losts: 0
dev.nfe.0.stats.tx.excess_deferrals: 0
dev.nfe.0.stats.tx.retry_errors: 0
dev.nfe.0.stats.tx.deferrals: 0
dev.nfe.0.stats.tx.frames: 68060063
dev.nfe.0.stats.tx.pause: 0

>By chance, are you using polling(4) or VLAN on nfe(4)?

No

Also, I tried turning TSO off and the same behaviour happened though over a slightly longer period of time (could be co-incidence)

(with TSO off)
nfe0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
        ether 00:26:9e:9b:a6:14
        inet 172.31.1.213 netmask 0xffffff80 broadcast 172.31.1.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

Mel