Issue with BGP router / high interrupt / Chelsio / FreeBSD 12.1

BulkMailForRudy crapsh at monkeybrains.net
Fri Feb 14 22:58:10 UTC 2020


On 2/14/20 4:21 AM, Andrey V. Elsukov wrote:
> On 13.02.2020 06:21, Rudy wrote:
>>
>> I'm having issues with a box that is acting as a BGP router for my
>> network.  3 Chelsio cards, two T5 and one T6.  It was working great
>> until I turned up our first port on the T6.  It seems like traffic
>> passing in from a T5 card and out the T6 causes a really high load (and
>> high interrupts).
>>
>> Traffic (not that much, right?)
>>
>>       Dev    RX bps    TX bps    RX PPS    TX PPS     Error
>>       cc0       0         0         0         0         0
>>       cc1    2212 M       7 M     250 k       6 k       0   (100Gbps uplink, filtering inbound routes to keep TX low)
>>      cxl0     287 k    2015 M     353       244 k       0   (our network)
>>      cxl1     940 M    3115 M     176 k     360 k       0   (our network)
>>      cxl2     634 M    1014 M     103 k     128 k       0   (our network)
>>      cxl3       1 k      16 M       1         4 k       0
>>      cxl4       0         0         0         0         0
>>      cxl5       0         0         0         0         0
>>      cxl6    2343 M     791 M     275 k     137 k       0   (IX, part of lagg0)
>>      cxl7    1675 M     762 M     215 k     133 k       0   (IX, part of lagg0)
>>      ixl0     913 k      18 M       0         0         0
>>      ixl1       1 M      30 M       0         0         0
>>     lagg0    4019 M    1554 M     491 k     271 k       0
>>     lagg1       1 M      48 M       0         0         0
>> FreeBSD 12.1-STABLE orange                 976 Bytes/Packetavg
>>   1:42PM  up 13:25, 5 users, load averages: 9.38, 10.43, 9.827
> Hi,
>
> did you try to use pmcstat to determine what is the heaviest task for
> your system?
>
> # kldload hwpmc
> # pmcstat -S inst_retired.any -Tw1


PMC: [inst_retired.any] Samples: 168557 (100.0%) , 2575 unresolved
Key: q => exiting...
%SAMP IMAGE      FUNCTION             CALLERS
  16.6 kernel     sched_idletd         fork_exit
  14.7 kernel     cpu_search_highest   cpu_search_highest:12.4 sched_switch:1.4 sched_idletd:0.9
  10.5 kernel     cpu_search_lowest    cpu_search_lowest:9.6 sched_pickcpu:0.9
   4.2 kernel     eth_tx               drain_ring
   3.4 kernel     rn_match             fib4_lookup_nh_basic
   2.4 kernel     lock_delay           __mtx_lock_sleep
   1.9 kernel     mac_ifnet_check_tran ether_output

>
> Then capture several first lines from the output and quit using 'q'.
>
> Do you use some firewall? Also, can you show the snapshot from the `top
> -HPSIzts1` output.


last pid: 28863;  load averages:  9.30, 10.33, 10.56    up 0+14:16:08  14:53:23
817 threads:   25 running, 586 sleeping, 206 waiting
CPU 0:   0.8% user,  0.0% nice,  6.2% system,  0.0% interrupt, 93.0% idle
CPU 1:   2.4% user,  0.0% nice,  0.0% system,  7.9% interrupt, 89.8% idle
CPU 2:   0.0% user,  0.0% nice,  0.8% system,  7.1% interrupt, 92.1% idle
CPU 3:   1.6% user,  0.0% nice,  0.0% system, 10.2% interrupt, 88.2% idle
CPU 4:   0.0% user,  0.0% nice,  0.0% system,  9.4% interrupt, 90.6% idle
CPU 5:   0.8% user,  0.0% nice,  0.8% system, 20.5% interrupt, 78.0% idle
CPU 6:   1.6% user,  0.0% nice,  0.0% system,  5.5% interrupt, 92.9% idle
CPU 7:   0.0% user,  0.0% nice,  0.0% system,  3.1% interrupt, 96.9% idle
CPU 8:   0.8% user,  0.0% nice,  0.8% system,  7.1% interrupt, 91.3% idle
CPU 9:   0.0% user,  0.0% nice,  0.8% system,  9.4% interrupt, 89.8% idle
CPU 10:  0.0% user,  0.0% nice,  0.0% system, 35.4% interrupt, 64.6% idle
CPU 11:  0.0% user,  0.0% nice,  0.0% system, 36.2% interrupt, 63.8% idle
CPU 12:  0.0% user,  0.0% nice,  0.0% system, 38.6% interrupt, 61.4% idle
CPU 13:  0.0% user,  0.0% nice,  0.0% system, 49.6% interrupt, 50.4% idle
CPU 14:  0.0% user,  0.0% nice,  0.0% system, 46.5% interrupt, 53.5% idle
CPU 15:  0.0% user,  0.0% nice,  0.0% system, 32.3% interrupt, 67.7% idle
CPU 16:  0.0% user,  0.0% nice,  0.0% system, 46.5% interrupt, 53.5% idle
CPU 17:  0.0% user,  0.0% nice,  0.0% system, 56.7% interrupt, 43.3% idle
CPU 18:  0.0% user,  0.0% nice,  0.0% system, 31.5% interrupt, 68.5% idle
CPU 19:  0.0% user,  0.0% nice,  0.8% system, 34.6% interrupt, 64.6% idle
Mem: 636M Active, 1159M Inact, 5578M Wired, 24G Free
ARC: 1430M Total, 327M MFU, 589M MRU, 32K Anon, 13M Header, 502M Other
      268M Compressed, 672M Uncompressed, 2.51:1 Ratio
Swap: 4096M Total, 4096M Free

   PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
    12 root        -92    -     0B  3376K WAIT    13  41:13  12.86% intr{irq358: t5nex0:2a1}
    12 root        -92    -     0B  3376K WAIT    12  48:08  12.77% intr{irq347: t5nex0:1a6}
    12 root        -92    -     0B  3376K CPU13   13  47:40  11.96% intr{irq348: t5nex0:1a7}
    12 root        -92    -     0B  3376K WAIT    17  43:46  11.38% intr{irq342: t5nex0:1a1}
    12 root        -92    -     0B  3376K WAIT    14  29:17  10.70% intr{irq369: t5nex0:2ac}
    12 root        -92    -     0B  3376K WAIT    11  47:55   9.85% intr{irq428: t5nex1:2a5}
    12 root        -92    -     0B  3376K WAIT    16  46:11   9.22% intr{irq351: t5nex0:1aa}
    12 root        -92    -     0B  3376K WAIT    19  42:28   9.04% intr{irq344: t5nex0:1a3}
    12 root        -92    -     0B  3376K WAIT    16  46:45   8.82% intr{irq341: t5nex0:1a0}
    12 root        -92    -     0B  3376K RUN     11  48:04   8.33% intr{irq356: t5nex0:1af}
    12 root        -92    -     0B  3376K WAIT    10  46:24   8.32% intr{irq355: t5nex0:1ae}
    12 root        -92    -     0B  3376K WAIT    10  42:03   8.32% intr{irq345: t5nex0:1a4}
    12 root        -92    -     0B  3376K WAIT    14  36:34   8.29% intr{irq441: t5nex1:3a2}
    12 root        -92    -     0B  3376K WAIT    19  46:14   8.21% intr{irq354: t5nex0:1ad}
    12 root        -92    -     0B  3376K WAIT    14  47:29   8.13% intr{irq349: t5nex0:1a8}
    12 root        -92    -     0B  3376K WAIT    11  40:25   7.91% intr{irq346: t5nex0:1a5}
    12 root        -92    -     0B  3376K WAIT    15  49:33   7.62% intr{irq350: t5nex0:1a9}
    12 root        -92    -     0B  3376K WAIT     5  45:37   7.57% intr{irq322: t6nex0:1af}
    12 root        -92    -     0B  3376K WAIT    18  45:41   7.43% intr{irq353: t5nex0:1ac}
    12 root        -92    -     0B  3376K WAIT    17  36:43   7.34% intr{irq434: t5nex1:2ab}
    12 root        -92    -     0B  3376K WAIT    17  33:30   7.11% intr{irq424: t5nex1:2a1}
    12 root        -92    -     0B  3376K WAIT     4  31:43   7.02% intr{irq312: t6nex0:1a5}
    12 root        -92    -     0B  3376K WAIT    16  35:01   6.95% intr{irq433: t5nex1:2aa}
    12 root        -92    -     0B  3376K WAIT    17  47:03   6.84% intr{irq352: t5nex0:1ab}
    12 root        -92    -     0B  3376K WAIT    18  41:33   6.73% intr{irq343: t5nex0:1a2}
    12 root        -92    -     0B  3376K WAIT     9  37:02   6.42% intr{irq317: t6nex0:1aa}
    12 root        -92    -     0B  3376K WAIT    10  32:22   6.40% intr{irq427: t5nex1:2a4}
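
To compare the cards more easily, the per-thread WCPU figures in a
snapshot like the one above can be summed per adapter.  This is only a
rough sketch: sum_wcpu_by_nexus is a made-up helper name, and it assumes
the intr thread names look like "intr{irqN: tXnexY:queue}" as they do
here.  It scans field by field, so it does not matter whether the mail
client wrapped the thread name onto its own line.

```sh
#!/bin/sh
# Hypothetical helper: sum the WCPU column per Chelsio nexus device
# (t5nex0, t5nex1, t6nex0, ...) from top(1) thread output on stdin.
sum_wcpu_by_nexus() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[0-9.]+%$/) {
                    # Remember the most recent percentage field (WCPU).
                    pct = $i + 0
                } else if ($i ~ /^t[0-9]nex[0-9]+:/) {
                    # Field such as "t5nex0:2a1}": keep the device name.
                    dev = $i
                    sub(/:.*/, "", dev)
                    sum[dev] += pct
                }
            }
        }
        END { for (dev in sum) printf "%s %.2f\n", dev, sum[dev] }
    ' | sort
}
```

It can be fed a saved snapshot, for example
"sum_wcpu_by_nexus < top-snapshot.txt" (the file name is just an
example).  Only the threads that top displayed are counted, so the
totals are a lower bound for each card.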




Thanks.  I did change chelsio_affinity today to get the cards to bind
their IRQs to CPU cores in the same NUMA domain.  Still, load seems a bit
high when using the T6 card compared to just using the T5 cards.
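
For anyone following along, binding a card's interrupts to a set of
cores comes down to cpuset(1) calls of the form "cpuset -l <cpu> -x
<irq>".  The sketch below is an illustration only, not the
chelsio_affinity script itself: plan_irq_binding is a made-up helper, it
expects lines in the style of "vmstat -ai" on stdin (for example
"irq312: t6nex0:1a5 ..."), and which CPUs belong to which NUMA domain
is an assumption you have to supply yourself.  It only prints the
commands so they can be reviewed before anything is changed.

```sh
#!/bin/sh
# Hypothetical helper: print (do not run) cpuset commands that spread
# the IRQs of one nexus device round-robin over a range of CPUs.
#   $1 = nexus device name, e.g. t6nex0
#   $2 = first CPU of the target range (assumed to be in the card's domain)
#   $3 = number of CPUs in that range
plan_irq_binding() {
    awk -v nexus="$1" -v first="$2" -v n="$3" '
        # Match lines like "irq312: t6nex0:1a5 ..." for the chosen device.
        $1 ~ /^irq[0-9]+:$/ && index($2, nexus ":") == 1 {
            irq = $1
            sub(/^irq/, "", irq)
            sub(/:$/, "", irq)
            printf "cpuset -l %d -x %s\n", first + (count % n), irq
            count++
        }'
}
```

Something like "vmstat -ai | plan_irq_binding t6nex0 0 10" would then
list one cpuset command per T6 queue interrupt; the range 0-9 is only a
placeholder for whichever cores share a domain with the card.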



More information about the freebsd-net mailing list