Re: Loading splash ok -> <reset>, how to debug?

From: Bjoern A. Zeeb <bzeeb-lists_at_lists.zabbadoz.net>
Date: Mon, 09 Jun 2025 01:29:50 UTC
On Mon, 9 Jun 2025, Bjoern A. Zeeb wrote:

> On Sun, 8 Jun 2025, Warner Losh wrote:
>
>> On Sun, Jun 8, 2025 at 11:46 AM Bjoern A. Zeeb
>> <bzeeb-lists@lists.zabbadoz.net> wrote:
>>>
>>> On Sun, 8 Jun 2025, Warner Losh wrote:
>>>
>>>> On Sun, Jun 8, 2025, 10:08 AM Tomek CEDRO <tomek@cedro.info> wrote:
>>>>
>>>>> On Sun, Jun 8, 2025 at 5:10 PM Bjoern A. Zeeb
>>>>> <bzeeb-lists@lists.zabbadoz.net> wrote:
>>>>>> Hi,
>>>>>> is there any good way to debug early kernel kabooms (or maybe efi
>>>>>> loader
>>>>>> handover) without JTAG in place?
>>>>>> I have a SoC using U-Boot and often after a cold start it goes like
>>>>>> this:
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>> Hit [Enter] to boot immediately, or any other key for command prompt.
>>>>>> Booting [/boot/kernel/kernel]...
>>>>>> Using DTB provided by EFI at 0x3e6c3000.
>>>>>> Loading splash ok
>>>>>> <reset>
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>> Once it's been looping for a few iterations eventually it starts
>>>>>> booting.
>>>>>> Given there is zero output of the error, is there anything one could try
>>>>>> to do to debug this (remotely)?
>>>>>> I can netboot it and change loader/kernel.
>>>>>
>>>>> DDB?
>>>>> https://docs.freebsd.org/en/books/developers-handbook/kerneldebug/#kerneldebug-online-ddb
>>>>>
>>>>> KGDB?
>>>>> https://docs.freebsd.org/en/books/developers-handbook/kerneldebug/#kerneldebug-online-gdb
>>>>>
>>>>> over the Serial Port? :-)
>>>>>
>>>>
>>>> I think he's not even getting to the handoff. Best to toss out all
>>>
>>> hmm..
>>>
>>>> graphical things that are configured and see if that gets us to the
>>>> handoff
>>>> point... get rid of the splash screen first of all..
>>>
>>> How?  I do not have boot_mute="YES" set.
>>
>> By not configuring the splash screen. We wouldn't be loading it if it
>> wasn't configured.
>
> Then it's done because it's in defaults/loader.conf ?
>
> I set splash="" now, which seems to make a difference;  magic not
> working correctly enabling it automatically?
>
>
>>> I tried setting console to just efi manually; that didn't help. Also on
>>> first attempt I got an error, typing it again worked (but I didn't
>>> re-validate).
>>
>> OK. efi is the default console, so it's not surprising it didn't
>> change anything.
>>
>>> I have the following in loader.conf and I tried various combinations
>>> without
>>> success to see ... I also tried to fdt ls to make sure fdt is all right
>>> and there...
>>>
>>> ----------------------------------------
>>> boot_verbose="-v"
>>> debug.kdb.alt_break_to_debugger=1
>>> beastie_disable="YES"
>>> ----------------------------------------
>>>
>>> Do we have a way to validate in loader if a kernel loaded into memory is
>>> correct?  I was wondering if the kernel was corrupted...
>>
>> It's not. Or rather, the data you've given doesn't show the kernel
>> loading in the detail that should be there. But yea, if the kernel is
>> corrupt, all bets are off.
>>
>> Can you just post the entire log from when the boot loader starts to
>> the error? That will end all ambiguity.
>
> Had to hard reset it;  with splash="" the error seems to have shifted:
>
> ------------------------------------------------------------------------
> Consoles: EFI console
>    Reading loader env vars from /efi/freebsd/loader.env
> FreeBSD/arm64 EFI loader, Revision 3.0
> (Thu Jun  5 10:57:09 UTC 2025 bz@somearm64)
>
>   Command line arguments: loader.efi
>   Image base: 0x3e5ee000
>   EFI version: 2.110
>   EFI Firmware: Das U-Boot (rev 8229.1024)
>   Console: efi,comconsole (0)
>   Load Path: /\exports\foo01\boot\loader.efi
>   Load Device: ...
>   BootOrder: 0000
> Setting currdev to net0:
>
>
> cLoading kernel...
> /boot/kernel/kernel text=0x318 text=0x9be4e8 text=0x2d7260 data=0x181168
> data=0x0+0x371000 0x8+0x1721a0+0x8+0x1a2cf5-
> Loading configured modules...
> can't find '/boot/entropy'
> can't find '/etc/hostid'
>
> Hit [Enter] to boot immediately, or any other key for command prompt.
> Booting [/boot/kernel/kernel] in 9 seconds...
>
>
> Using DTB provided by EFI at 0x3e6c3000.
> \
> ------------------------------------------------------------------------

I turned beastie back on and by default it thinks it's booting using
video;  when I toggle through to serial it seems to boot;  now that
might be the accidental "working now".

Where does that menu get its values from compared to the console
variable?

Not toggling the menu I see:
console=efi,comconsole
efi-version=2.110
efi_max_resolution=1x1

Hmm toggling it sets:
boot_serial=YES

I'll leave that in loader.conf for now and see tomorrow.  We are getting
too complicated with too many knobs these days.  I know why I love the
forth loader.  You write a 7 line loader.rc and you are done and know
what you have...
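
For reference, a sketch of the loader.conf state described in this
thread, assembled only from the settings mentioned above (beastie menu
re-enabled, so beastie_disable is dropped).  It is not a verified fix;
whether boot_serial is what actually avoids the reset is still an
assumption to be confirmed.

----------------------------------------
# carried over from the earlier loader.conf
boot_verbose="-v"
debug.kdb.alt_break_to_debugger=1
# empty value so no splash screen gets loaded
splash=""
# what toggling the menu entry to serial sets
boot_serial="YES"
----------------------------------------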

-- 
Bjoern A. Zeeb                                                     r15:7