Re: LOR on Star64 + network problem (eqos TX not working)

From: JMT Sihvola <jsihv_at_gmx.com>
Date: Sat, 07 Jun 2025 21:37:54 UTC

> Sent: Saturday, June 07, 2025 at 10:17 AM
> From: "Milan Obuch" <freebsd-riscv@dino.sk>
> To: riscv@freebsd.org
> Subject: Re: LOR on Star64 + network problem (eqos TX not working)
>
> On Thu, 29 May 2025 12:43:25 +0200
> Milan Obuch <freebsd-riscv@dino.sk> wrote:
> 
> > On Thu, 29 May 2025 12:14:09 +0200
> > JMT Sihvola <jsihv@gmx.com> wrote:
> > 
> > > > Sent: Thursday, May 29, 2025 at 9:48 AM
> > > > From: "Milan Obuch" <freebsd-riscv@dino.sk>
> > > > To: riscv@freebsd.org
> > > > Subject: LOR on Star64 + network problem (eqos TX not working)
> > > >
> > > > This occurred on cable attach to network port of Star64 board:
> > > > 
> > > > lock order reversal: (sleepable after non-sleepable)
> > > >  1st 0xffffffc086509070 eqos lock (network driver, sleep mutex) @ /usr/src/sys/kern/kern_mutex.c:213
> > > >  2nd 0xffffffc0008f90c0 Clock topology lock (Clock topology lock, sx) @ /usr/src/sys/dev/clk/clk.c:1208
> > > > lock order network driver -> Clock topology lock attempted at:
> > > > #0 0xffffffc00038b6be at witness_checkorder+0xa02
> > > > #1 0xffffffc00032d776 at _sx_xlock+0x58
> > > > #2 0xffffffc0000ecc0a at clk_set_freq+0x44
> > > > #3 0xffffffc000613884 at if_eqos_starfive_set_speed+0x78
> > > > #4 0xffffffc000611eaa at eqos_miibus_statchg+0x13e
> > > > #5 0xffffffc000114aae at miibus_statchg+0x50
> > > > #6 0xffffffc0001157c0 at mii_phy_update+0x60
> > > > #7 0xffffffc0001133d8 at mcommphy_service+0x226
> > > > #8 0xffffffc000114426 at mii_tick+0x32
> > > > #9 0xffffffc000613474 at eqos_tick+0x68
> > > > #10 0xffffffc00033ee20 at $x+0
> > > > #11 0xffffffc000340354 at softclock_thread+0xaa
> > > > #12 0xffffffc0002de9bc at fork_exit+0x68
> > > > #13 0xffffffc0005fd49a at fork_trampoline+0xa
> > > > 
> > > > Additionally, network does not fully work - I can tcpdump on
> > > > eqos0, I see some packet being received, arp protocol works at
> > > > least to some degree - I see arp table entry on Star64, but not
> > > > on the other side. It looks like receive path is OK, but sending
> > > > does not work.
> > > > 
> > > > This test was done first with cable put to 100 Mbps switch port,
> > > > no idea whether it's relevant. I tried with 1 Gbps as well, the
> > > > result is the same. So TX path is not working for me.
> > > > 
> > > > Link negotiation seems to be OK, ifconfig output shows 100 or 1000
> > > > reflecting the port speed it is connected to, but no packet is
> > > > seen arriving at the other side of the cable.
> > > > 
> > > > Regards,
> > > > Milan
> > > >     
> > > 
> > > LOR and the network problem have been discussed on this
> > > differential: https://reviews.freebsd.org/D45600  
> > 
> > I'll give a look there.
> >
> 
> I tried, no new ideas from there. Maybe it is not a big problem, after
> all.
> 
> > > So far it has seemed that on VisionFive2 whichever Ethernet port
> > > is connected first works. If it's not connected during boot,
> > > "dhclient eqos[0 or 1]" may be required.  
> > 
> > It did not work for me - I connected the cable in fully booted state,
> > but 'dhclient eqos0' was the first thing I tried. To no avail.
> > 
> > Also, tried reboot with cable attached, no change. To reiterate: I see
> > no packet from Star64 on wire (or rather, on the other side interface,
> > using tcpdump). On Star64, I see packets coming from the other side.
> > 
> > Similar behaviour was seen on some special board under development,
> > with hardware bug - crystal for network interface (PHY, if I still
> > remember it exactly) was wrong, should be 25 MHz, somehow 26 MHz got
> > soldered. After this bug was found and the crystal replaced with the
> > correct one, everything was working. This does not mean the problem
> > here is the same, just that the behaviour is the same.
> > 
> > I am going to prepare a test with a Linux based OS, which I used for
> > a basic functionality check. Stay tuned.
> > 
> 
> As expected, Linux based DietPi just works (I could ssh into the box,
> so the network works).
> 
> I noticed there are two different DTBs for Vision Five boards, so there
> should be two different versions. There are somewhat different
> definitions for eqos interfaces in those DTBs. So it would be
> interesting to know which board version (or maybe there are more
> revisions, I did not look for this) those trying FreeBSD on it have.
> And how it goes with network - does it work at all? Are there any
> troubles?
> 
> Anybody else with Star64 board?
> 
> Regards,
> Milan
> 

I noticed that mainline Linux has a later addition in its dwmac-starfive.c
driver which handles the TX clock rate differently if the device tree node
has the property "starfive,tx-use-rgmii-clk". This might be the reason TX is
not working. I could try to implement this within a week (let me know if you
are already working on it yourself).
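
To make the idea concrete, here is a rough standalone sketch in C. It is not
code from the Linux or FreeBSD driver; the helper name and its signature are
made up for illustration. The rates are the usual RGMII TX clock rates for
each link speed. The sketch ASSUMES that the property means the TX clock is
supplied externally, so the driver should leave the clock rate alone when it
is present; that reading still needs to be checked against the Linux source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Usual RGMII TX clock rates for each link speed, in Hz. */
#define TX_CLK_1000	125000000ULL
#define TX_CLK_100	 25000000ULL
#define TX_CLK_10	  2500000ULL

/*
 * Hypothetical helper (not from any real driver): return the TX clock
 * rate to program for a link speed given in Mb/s, or 0 if the clock
 * should be left untouched.
 */
static uint64_t
tx_clk_rate_for_speed(unsigned int speed_mbps, bool tx_use_rgmii_clk)
{
	/*
	 * ASSUMPTION: with "starfive,tx-use-rgmii-clk" present the TX
	 * clock follows the external RGMII clock, so do not reprogram it.
	 */
	if (tx_use_rgmii_clk)
		return (0);

	switch (speed_mbps) {
	case 1000:
		return (TX_CLK_1000);
	case 100:
		return (TX_CLK_100);
	case 10:
		return (TX_CLK_10);
	default:
		/* Unknown speed: leave the clock as it is. */
		return (0);
	}
}
```

In a driver, the boolean would come from checking the property on the
device tree node at attach time, and a non-zero result would be passed to
the clock framework when the link state changes.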

-jari