Re: 60+% ping packet loss on Pi3 under -current and stable-13

From: Mark Millard <marklmi_at_yahoo.com>
Date: Sun, 01 May 2022 04:18:02 UTC
On 2022-Apr-30, at 18:11, bob prohaska <fbsd@www.zefox.net> wrote:

> On Fri, Apr 29, 2022 at 08:14:27PM -0700, Mark Millard wrote:
>> On 2022-Apr-29, at 19:12, bob prohaska <fbsd@www.zefox.net> wrote:
>> 
>>> Since about December of 2021 I've been noticing problems with
>>> wired network connectivity on a pair of raspberry pi 3 machines
>>> using wired network connections. One runs stable-13.1, the other
>>> runs -current, both are up to date as of a few days ago.
>> 
>> Compared to your later notes about 192.168.1.n style use,
>> are any of the above that way? Or are they all well-analogous
>> to the "on the public network" context mentioned later?
>> 
> Not sure I follow what you're getting at, could you clarify
> please? The move between public and private networks was done
> by changing comment delimiters in /etc/rc.conf and moving
> cables between public switch and private router. Only the two
> Pi3s have so far failed to answer pings and ssh connections
> after reboot. 
> 


What, if anything, has been tested that did not fail to
answer pings and ssh connections after reboot on the
public network? Any other types of RPi*'s?

For example, temporarily moving a RPi4B from the private
network to the public one, but booted from the same
13.1-RC4 microsd card as used for the RPi3B test, would
allow checking if the problem happens on the additional
type of RPi*.

A RPi2 v1.1 (armv7) could not use the same 13.1-RC4 microsd
card content as the RPi3B can, since that content is aarch64.
It would still be a useful test, but I mention RPi4B above
because it can boot from the same media content as was used
for the RPi3B testing.
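
For comparing the public and private network results across
boards, it may help to reduce each ping run to one number. The
following is a minimal Python sketch, assuming the summary line
has the "N packets transmitted, M packets received" shape. The
function name loss_percent and the sample summary lines are
made up for illustration; they are not measurements from this
thread.

```python
import re

# Matches the counts in a ping summary line. The "packets " before
# "received" is optional because some ping implementations omit it.
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")

def loss_percent(ping_output):
    """Return packet loss in percent, computed from the summary counts."""
    m = SUMMARY_RE.search(ping_output)
    if m is None:
        raise ValueError("no ping summary line found")
    sent, received = int(m.group(1)), int(m.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent

# Hypothetical summaries, one per network context.
samples = {
    "public": "20 packets transmitted, 7 packets received, 65.0% packet loss",
    "private": "20 packets transmitted, 20 packets received, 0.0% packet loss",
}
for context, text in samples.items():
    print(context, loss_percent(text))
```

Recording the figure per board type and per network context
would make it easier to see which combinations misbehave.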


>>> Essentially both machines fail to respond to inbound network
>>> connections via ssh or ping after reboot. If I get on the 
>>> serial console and start an outbound ping to anywhere, both
>>> machines respond to incoming pings with about a 65% packet
>>> loss. Ssh connections are answered with delays of zero to
>>> perhaps thirty seconds. Once connected ssh is usable but
>>> erratic, with dropped characters, multi-second delays and
>>> disconnects after random intervals from minutes to hours.
>>> 
>>> There are five other Raspberry Pi's on the network. Three
>>> Pi2's run 12.3-stable, one Pi2 runs -current
>> 
>> RPi2 v1.2's used as aarch64? (So similar to RPi3*'s.)
> No, the Pi2s are v1.1.
>> RPi2 v1.1's (armv7)?
> Yes.

Good to know.

> 
>> Which type of RPi3* variant? B? B+? Revision?
>> 
> The stable/13 machine reports:
> bob@pelorus:~ % sysctl -a | grep model
> hw.model: ARM Cortex-A53 r0p4
> hw.fdt.compatible: raspberrypi,3-model-b brcm,bcm2837
> hw.fdt.model: Raspberry Pi 3 Model B Rev 1.2

A RPi3B+ would be Rev 1.3 based on the table near the
bottom of the page at:

https://www.raspberrypi.com/documentation/computers/raspberry-pi.html

There is no Rev 1.2 or earlier for the RPi3B+. The only
revision code documented for such a B+ is a020d3.

There is also such a thing as a non-+ RPi3B with Rev 1.3,
but most of the revision codes documented for the RPi3B
are Rev 1.2.
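
The revision codes in that table can also be decoded
mechanically. The following is a minimal Python sketch, assuming
the new-style bit layout described on that documentation page
(revision in bits 0-3, board type in bits 4-11, processor in
bits 12-15, memory size in bits 20-22, new-style flag in bit 23).
The function name decode_revision and the small lookup tables are
just for illustration and cover only board types mentioned in
this thread.

```python
# Lookup tables limited to the boards discussed here (an assumption of
# this sketch); unknown values are reported as hex strings instead.
BOARD_TYPES = {0x04: "2B", 0x08: "3B", 0x0D: "3B+", 0x11: "4B"}
PROCESSORS = {0: "BCM2835", 1: "BCM2836", 2: "BCM2837", 3: "BCM2711"}

def decode_revision(code):
    """Decode a new-style Raspberry Pi revision code into its fields."""
    if not (code >> 23) & 1:
        raise ValueError("old-style revision code; not handled by this sketch")
    board = (code >> 4) & 0xFF
    proc = (code >> 12) & 0xF
    return {
        "type": BOARD_TYPES.get(board, hex(board)),
        "processor": PROCESSORS.get(proc, hex(proc)),
        # Memory field: 0 = 256MB, 1 = 512MB, 2 = 1GB, and so on.
        "memory_mb": 256 << ((code >> 20) & 0x7),
        # Rendered as 1.N, which matches the table entries for these boards.
        "board_rev": "1.%d" % (code & 0xF),
    }

print(decode_revision(0xA020D3))
print(decode_revision(0xA02082))
```

With that layout, a020d3 decodes as a 3B+ Rev 1.3 and a02082
decodes as a 3B Rev 1.2, consistent with the table.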

I'll note that if the RPi* firmware debugging output
is enabled via config.txt then there are lines in the
output identifying the exact .dtb file that is used
as the starting point:

MESS:00:00:02.136715:0: dtb_file 'bcm2710-rpi-3-b.dtb'
MESS:00:00:02.140152:0: Trying Device Tree file 'bcm2710-rpi-3-b.dtb'
MESS:00:00:02.155700:0: brfs: File read: /mfs/sd/bcm2710-rpi-3-b.dtb
MESS:00:00:02.160357:0: Loading 'bcm2710-rpi-3-b.dtb' to 0x4000 size 0x70fb

The names are as below and explicitly indicate whether
the board is a plus or not:

bcm2710-rpi-2-b.dtb
bcm2710-rpi-3-b-plus.dtb
bcm2710-rpi-3-b.dtb
bcm2710-rpi-cm3.dtb
bcm2711-rpi-4-b.dtb
bcm2711-rpi-400.dtb
bcm2711-rpi-cm4.dtb
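
If the debug output has been captured to a file, the check can
be scripted. The following is a minimal Python sketch, assuming
the log lines have the "dtb_file '...'" form shown above. The
helper names dtb_from_log and is_plus are hypothetical, and the
"-plus.dtb" suffix test relies only on the naming pattern in the
list above.

```python
import re

# Matches the firmware line that names the selected .dtb file.
DTB_RE = re.compile(r"dtb_file '([^']+\.dtb)'")

def dtb_from_log(lines):
    """Return the first .dtb name reported by a dtb_file line, or None."""
    for line in lines:
        m = DTB_RE.search(line)
        if m:
            return m.group(1)
    return None

def is_plus(dtb_name):
    """True when the .dtb name marks a plus board, per the naming above."""
    return dtb_name.endswith("-plus.dtb")

# Two of the lines from the example output earlier in this message.
log = [
    "MESS:00:00:02.136715:0: dtb_file 'bcm2710-rpi-3-b.dtb'",
    "MESS:00:00:02.140152:0: Trying Device Tree file 'bcm2710-rpi-3-b.dtb'",
]
name = dtb_from_log(log)
print(name, "B+" if is_plus(name) else "not a B+")
```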

Enabling the debug output looks like:

enable_uart=1
uart_2ndstage=1
dtdebug=1

> dev.smscphy.0.%pnpinfo: oui=0x800f model=0xc rev=0x3
> bob@pelorus:~ % 
> 
> and the -current machine reports: 
> bob@www:~ % sysctl -a | grep -i model
>      Memory Model Features 0 = <TGran4,TGran64,SNSMem,BigEnd,16bit ASID,1TB PA>
>      Memory Model Features 1 = <8bit VMID>
>      Memory Model Features 2 = <32bit CCIDX,48bit VA>
> hw.model: ARM Cortex-A53 r0p4
> hw.fdt.compatible: raspberrypi,3-model-b brcm,bcm2837
> hw.fdt.model: Raspberry Pi 3 Model B Rev 1.2

Again, if the Rev 1.2 is accurate, it is unlikely to be
a RPi3B+.

> dev.smscphy.0.%pnpinfo: oui=0x800f model=0xc rev=0x3
> bob@www:~ % 
> 
> That's slightly surprising, since they are of different age and
> one has WiFi, not sure which. I believe that makes one a B+ though
> I gather FreeBSD still doesn't support the on-board WiFi. Either
> way, I thought the wired ethernet setup was identical. 
> 

Both have WiFi: all RPi3's have WiFi.

QUOTING https://www.raspberrypi.com/products/raspberry-pi-3-model-b/ :
Specification

Raspberry Pi 3 Model B is the earliest model of the third-generation Raspberry Pi. It replaced Raspberry Pi 2 Model B in February 2016. See also Raspberry Pi 3 Model B+, the latest product in the Raspberry Pi 3 range.

	• Quad Core 1.2GHz Broadcom BCM2837 64bit CPU
	• 1GB RAM
	• BCM43438 wireless LAN and Bluetooth Low Energy (BLE) on board
. . .
END QUOTE

What was different was the vintage of WiFi for the RPi3B+ :

QUOTING https://www.raspberrypi.com/products/raspberry-pi-3-model-b-plus/ :
Specification

The Raspberry Pi 3 Model B+ is the final revision in the Raspberry Pi 3 range.

	• Broadcom BCM2837B0, Cortex-A53 (ARMv8) 64-bit SoC @ 1.4GHz
	• 1GB LPDDR2 SDRAM
	• 2.4GHz and 5GHz IEEE 802.11.b/g/n/ac wireless LAN, Bluetooth 4.2, BLE
. . .
END QUOTE

So the RPi3B+ added 5 GHz 802.11n and 802.11ac.


The one that I tested via a private network was also
a RPi3B (non-+). I do not have access to a RPi3B+.


The RPi3B+ also has different, faster Ethernet. The right-hand
sides below again quote those pages:

RPi3B : • 100 Base Ethernet
RPi3B+: • Gigabit Ethernet over USB 2.0 (maximum throughput 300 Mbps)



>>> and a Pi4 runs
>>> -current. All have no problems pinging one another and out
>>> of network, so there's nothing obviously wrong with the net.
>>> The network is not routed, but rather a block of eight
>>> addresses simply bridged from my ISP over DSL.
>>> 
>>> It's been found that an image of 13.1-RC4 behaves similarly
>>> on one Pi3 when on the public network but exhibits more normal
>>> ping response when moved to a 192.168.1.n private network. 
> 
> Just to be clear, it was the same Pi3, I  moved the cables and 
> changed lines in /etc/rc.conf to make the switch.
> 

Yep. I've suggested testing a RPi4B via such switching of
cables and /etc/rc.conf adjustment.

>>> On the face of it, this seems significant, but I can't guess how.
>> 
>> Did you try a RPi4B on the public network, booted using the
>> same 13.1-RC4 microsd card you used in the RPi3* testing?
>> (Modern aarch64 RPi* images should boot either type of
>> aarch64 RPI*.)
>> 
> 
>> If yes, what was the behavior like? Did it behave like the
>> RPi3*?
>> 
>> If no, it should be a good test for how specific the problem
>> is to the RPi3* vs. RPi*'s more generally.
>> 
> 
> I haven't tried yet, since the Pi4 was on the private network to
> begin with and it has never had problems answering ping and ssh.

The remaining question is whether it would have problems on
the public network. I cannot reasonably predict that from
the private network result.

> AIUI the Pi4 ethernet is on PCIe, while the Pi3 uses USB. If the
> Pi4 failed to answer ping when running the snapshot I guess that
> would point to either faulty media or a different place in the
> network software. Perhaps worth a try. 
> 

Yep, that is the kind of information I was after.

> 
>> Testing a EtherNet dongle known to use a different driver
>> could also be a form of cross check, if you happen to have
>> such available.
> 
> My only alternative Ethernet adapter is a Ralink WiFi dongle.
> My WiFi is private-network only, and the snapshot worked reasonably
> well when wired on the private network. A wired adapter would be
> more informative, but I'll have to figure out what to order. 

Only being able to test on a private network definitely
limits the utility of such a WiFi test.

I'm not sure if you want to get a device just for the test
activity at this point.

===
Mark Millard
marklmi at yahoo.com