Re: 60+% ping packet loss on Pi3 under -current and stable-13

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 02 May 2022 18:31:43 UTC
On 2022-May-2, at 08:53, bob prohaska <fbsd@www.zefox.net> wrote:

> On Mon, May 02, 2022 at 08:56:12AM +0200, Hans Petter Selasky wrote:
> [reply at end]
>> On 5/2/22 03:13, bob prohaska wrote:
>>> On Sun, May 01, 2022 at 05:10:59PM -0700, Mark Millard wrote:
>>> [reply at end]
>>>> On 2022-May-1, at 16:27, bob prohaska <fbsd@www.zefox.net> wrote:
>>>> 
>>>>> On Sun, May 01, 2022 at 12:58:45PM -0700, Mark Millard wrote:
>>>>>> 
>>>>>> Looks like there is some problem getting past
>>>>>> gig1-1-1.gw.davsca11.sonic.net .
>>>>>> 
>>>>> 
>>>>> That seems independent of my own internal connection problems,
>>>>> but worth taking up with my ISP on Monday. Meanwhile, can you
>>>>> ping any other hosts in the 50.1.20.31-24 range? All are up
>>>>> at the moment. Hosts 28 and 24 are the troublemakers.
>>>>> 
>>>>> If anybody cares there's an ascii-art network diagram at
>>>>> http://www.zefox.net/~fbsd/netmap
>>>>> 
>>>>> Not sure it'll survive the mailing list, but here goes:
>>>>> dsl_modem-----switch---------router-----lan-------wifi-----pi4_workstation
>>>>>                      |                  |             |
>>>>>                      |                  |             |---Mac workstation
>>>>>                      |                  |
>>>>>                      |                  |------printer
>>>>>    ------------------|
>>>>>    |
>>>>>    |------50.1.20.30 ns1.zefox.net Pi2 12.3 usb-serial----50.1.20.27
>>>>>    |------50.1.20.29 ns2.zefox.net Pi2 12.3 usb-serial----50.1.20.30
>>>>>    |------50.1.20.27 www.zefox.net Pi2 12.3 usb-serial----50.1.20.26
>>>>>    |------50.1.20.26 www.zefox.com Pi2 -current usb-serial---50.1.20.24
>>>>>    |------50.1.20.24 pelorus.zefox.org Pi3 13.1 usb-serial---50.1.20.28
>>>>> switch
>>>>>    |------50.1.20.25 nemesis.zefox.com Pi4 -current usb-serial---50.1.20.29
>>>>>    |------50.1.20.28 www.zefox.org Pi3 -current usb-serial----50.1.20.25
>>>> 
>>>> 
>>>> For ns1.zefox.net there is no problem and
>>>> it looks like:
>>>> 
>>>>                                      My traceroute  [v0.95]
>>>> amd64_ZFS (192.168.1.120) -> ns1.zefox.net (50.1.20.29)                2022-05-01T16:52:27-0700
>>>> Keys:  Help   Display mode   Restart statistics   Order of fields   quit
>>>>                                                        Packets               Pings
>>>>  Host                                                Loss%   Snt   Last   Avg  Best  Wrst StDev
>>>>  1. 192.168.1.1                                       0.0%    53    1.2   0.8   0.1   1.4   0.4
>>>>  2. 172.30.26.67                                      0.0%    53   11.8  25.0  11.8  61.0  11.4
>>>>  3. 68.85.243.125                                     0.0%    53   10.0  10.0   7.7  46.9   5.3
>>>>  4. 96.216.60.165                                     0.0%    53    8.8   9.3   7.8  12.1   0.9
>>>>  5. 68.85.243.197                                     0.0%    53    8.6  13.2   8.6  28.3   4.2
>>>>  6. be-36231-cs03.seattle.wa.ibone.comcast.net        0.0%    53   15.3  14.8  13.0  16.9   1.0
>>>>  7. be-2312-pe12.seattle.wa.ibone.comcast.net         0.0%    53   16.2  15.9  12.9  59.8   6.5
>>>>  8. (waiting for reply)
>>>>  9. be3717.ccr22.sfo01.atlas.cogentco.com             0.0%    53   29.8  30.9  26.5  97.9  10.1
>>>> 10. be2430.ccr31.sjc04.atlas.cogentco.com             0.0%    53   29.0  29.0  26.6  39.3   1.8
>>>> 11. 38.104.141.82                                     0.0%    53   28.9  33.8  26.1 115.0  17.0
>>>> 12. 0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net            0.0%    53   32.1  31.3  29.2  33.9   1.0
>>>> 13. 0.xe-0-0-0.cr1.scrmca13.sonic.net                 0.0%    53   30.5  32.1  29.2  57.6   4.3
>>>> 14. gig1-1-1.gw.wscrca11.sonic.net                    0.0%    53   31.8  32.0  28.8  43.7   2.0
>>>> 15. gig1-1-1.gw.davsca11.sonic.net                    0.0%    52   31.0  32.4  30.2  38.4   1.4
>>>> 16. ns1.zefox.net                                     0.0%    52   51.4  51.1  49.8  53.4   0.8
>>>> 
>>>> ns2.zefox.net and others got a 17. instead of
>>>> a 16. An example is:
>>>> 
>>>>                                      My traceroute  [v0.95]
>>>> amd64_ZFS (192.168.1.120) -> ns2.zefox.net (50.1.20.30)                2022-05-01T16:58:45-0700
>>>> Keys:  Help   Display mode   Restart statistics   Order of fields   quit
>>>>                                                        Packets               Pings
>>>>  Host                                                Loss%   Snt   Last   Avg  Best  Wrst StDev
>>>>  1. 192.168.1.1                                       0.0%    55    0.3   0.9   0.1   1.4   0.4
>>>>  2. 172.30.26.66                                      0.0%    55   13.5  26.4  10.4  54.7  10.1
>>>>  3. 68.85.243.77                                      0.0%    55   10.5   9.1   7.9  10.5   0.6
>>>>  4. 24.124.129.106                                    0.0%    54    8.3   9.5   8.2  13.4   1.0
>>>>  5. 96.216.60.165                                     0.0%    54    8.8   9.8   7.8  22.8   2.2
>>>>  6. 68.85.243.197                                     0.0%    54   17.1  15.1   9.0  37.3   5.9
>>>>  7. be-36241-cs04.seattle.wa.ibone.comcast.net        0.0%    54   15.2  15.0  13.2  17.8   0.9
>>>>  8. be-2412-pe12.seattle.wa.ibone.comcast.net         0.0%    54   15.0  14.8  13.2  17.1   1.0
>>>>  9. (waiting for reply)
>>>> 10. be2075.ccr21.sfo01.atlas.cogentco.com             0.0%    54   28.4  29.2  26.9  36.8   1.4
>>>> 11. be2379.ccr31.sjc04.atlas.cogentco.com             0.0%    54   29.8  30.0  27.3  84.2   7.6
>>>> 12. 38.104.141.82                                     0.0%    54   28.6  33.7  27.5 105.5  16.2
>>>> 13. 0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net            0.0%    54   31.6  31.4  29.5  33.8   0.9
>>>> 14. 0.xe-0-0-0.cr1.scrmca13.sonic.net                 0.0%    54   31.1  32.1  29.1  52.9   3.4
>>>> 15. gig1-1-1.gw.wscrca11.sonic.net                    0.0%    54   31.2  31.9  30.0  34.1   0.9
>>>> 16. gig1-1-1.gw.davsca11.sonic.net                    0.0%    54   33.3  32.6  30.8  45.8   2.1
>>>> 17. ns2.zefox.net                                     0.0%    54   52.5  51.4  49.1  54.9   1.2
>>>> 
>>>> The routing need not be the same from one
>>>> try to the next.
>>>> 
>>>> www.zefox.net     is similar.
>>>> www.zefox.com     is similar.
>>>> pelorus.zefox.org is similar.
>>>> nemesis.zefox.com is similar.
>>>> www.zefox.org     is similar.
>>>> 
>>>> Notably www.zefox.org was what I tried and
>>>> reported on before that had the failures.
>>>> 
>>>> I observed an initial connection sequence once
>>>> for pelorus.zefox.org where it briefly displayed
>>>> something like (not captured, just from memory):
>>>> 
>>>> 16. gig1-1-1.gw.davsca11.sonic.net
>>>> 17. (waiting for reply)
>>>> 18. (waiting for reply)
>>>> 19. pelorus.zefox.org
>>>> 
>>>> before changing to
>>>> 
>>>> 16. gig1-1-1.gw.davsca11.sonic.net
>>>> 17. ns2.zefox.net
>>>> 
>>>> That may be normal, but timed such that I
>>>> would not usually see it.
>>>> 
>>>> But it might actually be evidence of a stage that
>>>> leads to the overall failure by never getting
>>>> past the:
>>>> 
>>>> 16. gig1-1-1.gw.davsca11.sonic.net
>>>> 17. (waiting for reply)
>>>> 18. (waiting for reply)
>>>> 19. WHATEVER
>>>> 
>>>> in some cases.
>>>> 
>>>> However, in the above tests the hosts below worked fine:
>>>> 
>>>> 50.1.20.24 pelorus.zefox.org Pi3 13.1 usb-serial---50.1.20.28
>>>> 50.1.20.28 www.zefox.org Pi3 -current usb-serial----50.1.20.25
>>>> 
>>>> What changed?
>>> 
>>> I restarted an outgoing ping so I could access those hosts via ssh,
>>> to bring up a serial console connection to the next host in the "ring".
>>> Usually I simply ping 50.1.20.31 (my router) but at least in the past
>>> it did not matter what the destination was. In one case I tried an
>>> unused address. That makes the role of a distant host somewhat
>>> baffling.
>>> 
>>> Thanks for checking!
>>> 
>>> bob prohaska
>> 
>> Hi,
>> 
>> Did you try to force the link mode to 100MBit/s ?
>> 
> 
> Not explicitly, but ifconfig -a reports
> ue0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>        options=80009<RXCSUM,VLAN_MTU,LINKSTATE>
>        ether b8:27:eb:71:46:4e
>        inet 50.1.20.28 netmask 0xffffff00 broadcast 50.1.20.255
>        media: Ethernet autoselect (100baseTX <full-duplex>)
>        status: active
>        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> so I think it's 100MBit/s anyway.

He might have been asking about how the other equipment
negotiates the link: setting it to offer only 100baseTX when
talking to the 2 RPi3B's ( 50.1.20.28 and 50.1.20.24 ).

FYI, I just checked and I'm seeing:

                                     My traceroute  [v0.95]
amd64_ZFS (192.168.1.120) -> pelorus.zefox.org (50.1.20.24)            2022-05-02T11:19:40-0700
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                       Packets               Pings
 Host                                                Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 192.168.1.1                                       0.0%    56    0.5   0.8   0.1   1.4   0.4
 2. 172.30.26.66                                      0.0%    56   81.1  73.6  23.8 138.6  25.5
 3. 68.85.243.77                                      0.0%    56   70.0  62.0   8.1  77.2  17.7
 4. 24.124.129.106                                    0.0%    56   18.4  49.0  12.0  83.5  23.7
 5. 96.216.60.165                                     0.0%    56   72.0  57.0  14.4  74.7  21.8
 6. 68.85.243.197                                     0.0%    56   72.1  63.3  10.6 115.9  22.9
 7. be-36221-cs02.seattle.wa.ibone.comcast.net        0.0%    56   79.4  58.8  14.3  79.4  23.3
 8. be-2212-pe12.seattle.wa.ibone.comcast.net         0.0%    56   77.2  64.3  19.5 125.1  22.2
 9. (waiting for reply)
10. be2075.ccr21.sfo01.atlas.cogentco.com             0.0%    55   86.8  74.9  30.5 104.6  21.7
11. be2379.ccr31.sjc04.atlas.cogentco.com             3.6%    55   47.4  75.8  29.4  92.8  22.0
12. 38.104.141.82                                     0.0%    55   35.5  83.3  29.0 155.0  27.5
13. 0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net            0.0%    55   92.6  81.0  37.0  95.0  17.9
14. 0.xe-0-0-0.cr1.scrmca13.sonic.net                 0.0%    55   91.3  78.8  32.6 136.1  21.5
15. gig1-1-1.gw.wscrca11.sonic.net                    0.0%    55   96.9  79.9  31.3  96.9  21.4
16. gig1-1-1.gw.davsca11.sonic.net                    0.0%    55   91.8  79.9  38.7  97.2  19.8
17. pelorus.zefox.org                                67.3%    55  173.6 152.8  51.3 177.6  45.9

                                     My traceroute  [v0.95]
amd64_ZFS (192.168.1.120) -> www.zefox.org (50.1.20.28)                2022-05-02T11:21:29-0700
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                       Packets               Pings
 Host                                                Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 192.168.1.1                                       0.0%    57    0.7   0.7   0.1   1.4   0.4
 2. 172.30.26.66                                      0.0%    57   78.2  75.3  20.2 193.3  28.8
 3. 68.85.243.77                                      0.0%    57   67.8  53.2  11.4  73.1  20.8
 4. 24.124.129.106                                    0.0%    56   72.3  51.4  15.8  73.4  21.7
 5. 96.216.60.165                                     0.0%    56   73.6  53.5  10.4  74.6  22.8
 6. 68.85.243.197                                     0.0%    56   72.4  63.5  19.3  93.4  19.6
 7. be-36241-cs04.seattle.wa.ibone.comcast.net        0.0%    56   77.8  58.9  16.6  78.9  22.4
 8. be-2412-pe12.seattle.wa.ibone.comcast.net         0.0%    56   78.0  61.7  23.5  79.8  20.8
 9. (waiting for reply)
10. be2075.ccr21.sfo01.atlas.cogentco.com             0.0%    56   92.1  75.2  27.6 137.5  24.2
11. be2379.ccr31.sjc04.atlas.cogentco.com             0.0%    56   99.5  75.3  31.3  99.5  19.9
12. 38.104.141.82                                     0.0%    56   53.6  79.8  35.4 132.7  22.6
13. 0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net            0.0%    56   94.0  75.5  35.4 108.0  21.6
14. 0.xe-0-0-0.cr1.scrmca13.sonic.net                 0.0%    56   94.8  77.6  37.1 102.5  20.2
15. gig1-1-1.gw.wscrca11.sonic.net                    0.0%    56   93.5  80.6  32.8 128.6  20.3
16. gig1-1-1.gw.davsca11.sonic.net                    0.0%    56   63.6  97.5  36.9 274.3  46.5
17. www.zefox.org                                    48.2%    56  175.6 153.4  58.4 179.1  44.1

But 0.0% on the others.

This is different than the basic lack of connection
I reported originally.
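
For anyone wanting to compare runs without eyeballing the tables, here is
a rough sketch (Python, not something I used for the above) that pulls the
Loss% column out of mtr-style rows and lists the hops above a threshold.
The function name lossy_hops, the 1.0% default threshold, and the SAMPLE
text (a few rows copied from the pelorus.zefox.org run) are my own
assumptions for illustration. It assumes the row layout shown above.

```python
import re

# One mtr row: hop number, host, Loss%, Snt, Last, Avg, Best, Wrst, StDev.
ROW = re.compile(
    r"^\s*(\d+)\.\s+(\S+)\s+(\d+(?:\.\d+)?)%\s+(\d+)"
    r"\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def lossy_hops(report, threshold=1.0):
    """Return (hop, host, loss_percent) for rows whose Loss% exceeds threshold."""
    found = []
    for line in report.splitlines():
        m = ROW.match(line)
        if m is None:
            # Header lines and "(waiting for reply)" rows carry no statistics.
            continue
        hop, host, loss = int(m.group(1)), m.group(2), float(m.group(3))
        if loss > threshold:
            found.append((hop, host, loss))
    return found


# A few rows copied from the pelorus.zefox.org run above.
SAMPLE = """\
 9. (waiting for reply)
10. be2075.ccr21.sfo01.atlas.cogentco.com             0.0%    55   86.8  74.9  30.5 104.6  21.7
11. be2379.ccr31.sjc04.atlas.cogentco.com             3.6%    55   47.4  75.8  29.4  92.8  22.0
16. gig1-1-1.gw.davsca11.sonic.net                    0.0%    55   91.8  79.9  38.7  97.2  19.8
17. pelorus.zefox.org                                67.3%    55  173.6 152.8  51.3 177.6  45.9
"""

for hop, host, loss in lossy_hops(SAMPLE):
    print(f"hop {hop}: {host} lost {loss}%")
```

On the sample this reports hop 11 (3.6%) and hop 17 (67.3%). Loss at an
intermediate hop that does not carry through to the later hops, as with
hop 11 here, usually just means that router was slow to answer probes
itself. The loss at the final hop is the one that matters.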

> One new oddity is seeing in the daily security report the lines
> www.zefox.org kernel log messages:
> +ue0: promiscuous mode enabled
> +ue0: promiscuous mode disabled
> +ue0: promiscuous mode enabled
> +ue0: promiscuous mode disabled
> +ue0: promiscuous mode enabled
> +ue0: promiscuous mode disabled

Looks to me like you ran tcpdump (or other such)
3 times.
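
The counting behind that guess can be sketched as follows (Python, purely
illustrative): each "enabled" line followed by a "disabled" line is taken
as one capture session. The function name capture_sessions and the LOG
list are my own assumptions. The message text is copied from the report
quoted above. Anything that opens the interface in promiscuous mode would
produce the same pairs, so this counts sessions and says nothing about
which program caused them.

```python
def capture_sessions(log_lines, ifname="ue0"):
    """Count enabled/disabled pairs of promiscuous-mode messages for ifname."""
    enabled = f"{ifname}: promiscuous mode enabled"
    disabled = f"{ifname}: promiscuous mode disabled"
    sessions = 0
    active = False
    for line in log_lines:
        if enabled in line:
            active = True
        elif disabled in line and active:
            # A completed enable -> disable pair counts as one session.
            sessions += 1
            active = False
    return sessions


# The six lines from the daily security report quoted above.
LOG = [
    "+ue0: promiscuous mode enabled",
    "+ue0: promiscuous mode disabled",
    "+ue0: promiscuous mode enabled",
    "+ue0: promiscuous mode disabled",
    "+ue0: promiscuous mode enabled",
    "+ue0: promiscuous mode disabled",
]

print(capture_sessions(LOG))
```

For the six quoted lines this prints 3.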

> I'm using static addresses set in /etc/rc.conf. The DHCP line is commented
> out but not explicitly disabled. Could something else be trying to turn DHCP
> on, which I gather would also place the interface in promiscuous mode? 

I'd guess that DHCP does not of itself enable promiscuous mode.

But monitoring all the traffic on the Ethernet connection basically
requires promiscuous mode during the monitoring.

> Uname -a reports:
> 
> FreeBSD www.zefox.org 14.0-CURRENT FreeBSD 14.0-CURRENT #55 main-n255108-9fb40baf604: Fri Apr 29 20:42:26 PDT 2022     bob@www.zefox.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64
> 




===
Mark Millard
marklmi at yahoo.com