[Bug 260427] [regression]: netmap causes packet drops

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 15 Dec 2021 04:55:19 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260427

            Bug ID: 260427
           Summary: [regression]: netmap causes packet drops
           Product: Base System
           Version: 12.2-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: emz@norma.perm.ru

Env:
FreeBSD 12.2-RELEASE (the host previously ran 11.x and 10.x)
Input traffic from a Catalyst 2960, roughly 30-300 Mbit/sec, media 1000baseT

Hardware:
IBM System x3250 m2

Hardware interfaces:
bge(4) NetXtreme BCM5722 Gigabit Ethernet PCI Express
em(4) 82572EI Gigabit Ethernet Controller (Copper)

At first the input flow was directed through the onboard bge(4), port no. 1.
Several months ago we noticed a high error rate, reflected in the netstat input
errors and in hardware dev.bge.0 counters such as

dev.bge.0.stats.InputDiscards

The input error rate varied from 0 (most of the time) to 6K-80K per second.
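
For anyone trying to reproduce the per-second figures, a minimal sketch of how
such a rate can be sampled from a cumulative sysctl counter follows. It is not
the tool used for this report: the helper names (read_counter, per_second,
watch) are hypothetical, it assumes the counter only ever increases, and it
ignores counter wrap-around.

```python
import subprocess
import time


def read_counter(oid):
    """Return the integer value of a sysctl OID, read with `sysctl -n`."""
    out = subprocess.run(
        ["sysctl", "-n", oid],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return int(out.strip())


def per_second(prev, cur, interval):
    """Convert two cumulative counter readings into an events-per-second rate."""
    return (cur - prev) / interval


def watch(oid, interval=1.0, samples=10, read=read_counter, sleep=time.sleep):
    """Sample `oid` every `interval` seconds and return the list of rates.

    `read` and `sleep` are injectable so the logic can be exercised
    without a FreeBSD host (hypothetical design choice for this sketch).
    """
    rates = []
    prev = read(oid)
    for _ in range(samples):
        sleep(interval)
        cur = read(oid)
        rates.append(per_second(prev, cur, interval))
        prev = cur
    return rates
```

On the affected host this could be called as
watch("dev.bge.0.stats.InputDiscards", samples=60); the same call works for any
other cumulative counter exposed through sysctl.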

Recovery measures (none of these steps helped):
- changed the patch cable to the Catalyst
- changed the onboard port from 1 to 0
- started to suspect the onboard ethernet controller and added an Intel Pro/1000
MT external adapter via the riser card; the errors moved to the
dev.em.0.mac_stats.missed_packets counter, sometimes also incrementing
dev.em.0.mac_stats.recv_no_buff:

dev.em.0.mac_stats.recv_no_buff: 9424
dev.em.0.mac_stats.missed_packets: 1853592

- added the iflib/netmap tuning:

net.isr.numthreads="2"
net.isr.maxthreads="2"

dev.em.0.iflib.rx_budget="65535"
dev.em.0.iflib.override_nrxds="4096"
dev.em.0.iflib.override_ntxds="4096"
dev.em.0.iflib.disable_msix="0"

- added the interrupt moderation:

dev.em.0.rx_int_delay="200"
dev.em.0.tx_int_delay="200"
dev.em.0.rx_abs_int_delay="4000"
dev.em.0.tx_abs_int_delay="4000"

- tried changing the kern.eventtimer mode:

kern.eventtimer.periodic="1"

Steps that did help:

- decided to try the Intel em(4) module from ports, which doesn't work with
netmap and requires a kernel built without em(4) support. Added the "nodevice em"
and "nodevice netmap" config lines, rebuilt the kernel, installed it and
rebooted (still with the stock driver at this point, just to switch to the
loadable module).

Errors magically stopped.
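
For reference, a kernel configuration along these lines might look like the
sketch below. Only the two "nodevice" lines come from this report; the config
name NONETMAP and the use of "include GENERIC" are assumptions made for
illustration.

```
# Hypothetical custom kernel config, e.g. /usr/src/sys/amd64/conf/NONETMAP
# (the amd64 path and the NONETMAP name are assumptions, not from the report)
include GENERIC
ident   NONETMAP

# Drop the compiled-in em(4) driver so it can be loaded as a module instead
nodevice em
# Drop netmap(4) support from the kernel
nodevice netmap
```

Such a config would normally be built and installed from /usr/src with
"make buildkernel KERNCONF=NONETMAP" followed by
"make installkernel KERNCONF=NONETMAP".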
