From nobody Thu Sep 23 17:46:04 2021
X-Original-To: arm@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 1398B17C126F
	for ; Thu, 23 Sep 2021 17:46:08 +0000 (UTC)
	(envelope-from avg@FreeBSD.org)
Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
	 client-signature RSA-PSS (4096 bits) client-digest SHA256)
	(Client CN "smtp.freebsd.org", Issuer "R3" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 4HFjJJ05Tgz3RGl;
	Thu, 23 Sep 2021 17:46:08 +0000 (UTC)
	(envelope-from avg@FreeBSD.org)
Received: from [192.168.0.88] (unknown [195.64.148.76])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(Client did not present a certificate)
	(Authenticated sender: avg/mail)
	by smtp.freebsd.org (Postfix) with ESMTPSA id 64FD2255F7;
	Thu, 23 Sep 2021 17:46:07 +0000 (UTC)
	(envelope-from avg@FreeBSD.org)
From: Andriy Gapon
To: Emmanuel Vadot, "freebsd-arm@freebsd.org"
References: <20210920190213.5839f18816daf1f6e4289b94@bidouilliste.com>
Subject: Re: rock64 verbose boot hangs
Message-ID:
Date: Thu, 23 Sep 2021 20:46:04 +0300
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:78.0) Gecko/20100101
 Firefox/78.0 Thunderbird/78.14.0
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm
List-Help:
List-Post:
List-Subscribe:
List-Unsubscribe:
Sender: owner-freebsd-arm@freebsd.org
MIME-Version: 1.0
In-Reply-To: <20210920190213.5839f18816daf1f6e4289b94@bidouilliste.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-ThisMailContainsUnwantedMimeParts: N

On 20/09/2021 20:02, Emmanuel Vadot wrote:
>
>  Hi Andriy,
>
> On Sat, 18 Sep 2021 15:58:00 +0300
> Andriy Gapon wrote:
>
>>
>> Normal boot works every time, but with boot_verbose="YES" it hanged on all
>> attempts so far.
>>
>> Last messages on the console:
>> cpulist0: on ofwbus0
>> cpu0: on cpulist0
>> cpu0: Nominal frequency 600Mhz
>> cpufreq_dt0: on cpu0
>> cpufreq_dt0: 408.000 Mhz (950000 uV)
>> cpufreq_dt0: 600.000 Mhz (950000 uV)
>> cpufreq_dt0: 816.000 Mhz (1000000 uV)
>> cpufreq_dt0: 1008.000 Mhz (1100000 uV)
>> cpufreq_dt0: 1200.000 Mhz (1225000 uV)
>> cpufreq_dt0: 1296.000 Mhz (1300000 uV)
>> cpu1: on cpulist0
>> cpu1: Nominal frequency 600Mhz
>> cpufreq_dt1: on cpu1
>>
>> The kernel is totally unresponsive after that.
>
>  Can't reproduce here, I'm running 548a706608d with latest DTB and
> latest u-boot/atf
>
>> Any suggestions on how to debug this?
>
>  Not really sure how to start, that seems weird that the kernel will
> hang at the cpufreq attach but maybe try modifying the DTB to remove
> this node ?
>  Also did that happens with my recent commit on clock or was this the
> same before ?

Thank you and everyone else who responded with information and
suggestions.

Some extra details.
I have had this problem ever since I got this board 9 months ago.
Since then the board has been through several upgrades of FreeBSD, of
U-Boot and of the files in the ESP, and the problem has always been
present.

Now I've done more extensive testing, with a couple of dozen reboots in
a row and some additional debug prints (for example, DEBUG in
subr_bus.c).
I actually see several variations of the problem.
Sometimes it's a hang, but sometimes it's a crash.
A hang can happen in different places, and so can a crash.
Some crashes happen during AP startup, and the information I get from
them is not very usable.
Some crashes happen during driver probing, when the bus code searches
the hints memory space. Those crashes look as if memory is being
corrupted there at random.

Given those variations, plus some other differences that I have compared
to other Rock64 users (like needing a special setup for eMMC and for the
watchdog), I am inclined to think that my board has something special
either in the hardware (like a different configuration set via some
fuses) or in the BootROM, even though the PCB has the standard markings.
I would not be surprised if it were a customized production run, as I
got my Rock64-s via a special / unusual deal on Amazon.
For those interested, Iconikal and Recon Sentinel are the keywords to
search for.
Some news articles from the time:
https://liliputing.com/2020/09/this-10-single-board-computer-is-faster-than-a-raspberry-pi-3.html
https://www.tomshardware.com/news/raspberry-pi-sized-iconikal-rockchip-sbc-only-dollar8-on-amazon

So, in the end, I still do not know what causes the verbose boot to
hang / crash.
Maybe there is some (not fully working) watchdog that gets armed and
disarmed by some hardware accesses, and the verbose boot is too slow to
complete in time.

Here is a small subset of the panics and hangs that I saw:
https://people.freebsd.org/~avg/rock64-verbose-boot-panic.txt

-- 
Andriy Gapon