Re: randomdev hangs during initial boot of -current on Raspberry Pi [main 74cf7cae4d22 issue]

From: Mark Millard <marklmi_at_yahoo.com>
Date: Wed, 02 Feb 2022 10:47:31 UTC
[Forwarding to John Baldwin, who authored and committed:
https://cgit.freebsd.org/src/commit/?id=74cf7cae4d22
"softclock: Use dedicated ithreads for running callouts."
Also including text from the original message:
https://lists.freebsd.org/archives/freebsd-current/2022-January/001474.html
"randomdev hangs during initial boot of -current on Raspberry Pi"
so there is a description of the problem without having
to look elsewhere.
]

Mike Karels <mike_at_karels.net> wrote on
Date: Mon, 31 Jan 2022 12:27:41 -0600 :

> I hadn't updated my Raspberry Pi 4B running -current for a couple of
> months, so I booted the latest snapshot (Jan 27).  It hangs when it
> does the "growfs" step, expanding the root partition and fs to fill
> the SD card.  When it hangs, it prints this every 10 seconds or so:
> 
>     random: randomdev_wait_until_seeded unblock wait
> 
> I waited several minutes the first time, and 20 minutes on another trial.
> If I hold down the return key on the serial console, the device unblocks
> and the boot continues.  This only happens on the initial boot, when the
> growfs script runs.  The hang happens on a Raspberry Pi 3B+ as well.
> It also happens with the two-week-old snapshot, but not the Nov 25
> snapshot.  The program that's running during the hang is awk, doing
> a read, according to ^T; the script uses awk to parse output from
> mount, glabel, and sysctl.
> 
> It sounds like there is no source of entropy at this point, and there
> was no cache.  I don't see any changes to the random device since this
> was working.  Does anyone have a guess what to look for?  A bisect
> would be rather laborious, building a modified SD card each time,
> even if just testing kernel changes.  Any other suggestions?
> 
> An excerpt from /var/log/messages during this time is appended.
> 
> 		Mike
> 
> Jan 27 10:38:48 generic kernel: umass0 on uhub0
> Jan 27 10:38:48 generic kernel: umass0: <ADATA SD600Q, class 0/0, rev 3.00/93.01, addr 2> on usbus0
> Jan 27 10:38:48 generic kernel: umass0:  SCSI over Bulk-Only; quirks = 0x8100
> Jan 27 10:38:48 generic kernel: umass0:0:0: Attached to scbus0
> Jan 27 10:38:48 generic kernel: da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> Jan 27 10:38:48 generic kernel: da0: <ADATA SD600Q 9301> Fixed Direct Access SPC-4 SCSI device
> Jan 27 10:38:48 generic kernel: da0: Serial Number 40118905201B
> Jan 27 10:38:48 generic kernel: da0: 400.000MB/s transfers
> Jan 27 10:38:48 generic kernel: da0: 228936MB (468862128 512 byte sectors)
> Jan 27 10:38:48 generic kernel: da0: quirks=0x2<NO_6_BYTE>
> Jan 27 10:38:48 generic kernel: random: randomdev_wait_until_seeded unblock wait
> Jan 27 10:38:48 generic syslogd: last message repeated 48 times
> Jan 27 10:38:48 generic kernel: random: unblocking device.
> Jan 27 10:38:48 generic kernel: GEOM_PART: mmcsd0s2 was automatically resized.
> Jan 27 10:38:48 generic kernel:   Use `gpart commit mmcsd0s2` to save changes or `gpart undo mmcsd0s2` to revert them.
> Jan 27 10:38:48 generic kernel: lo0: link state changed to UP


Later material . . .

On 2022-Feb-2, at 01:40, Jesper Schmitz Mouridsen <jsm@FreeBSD.org> wrote:
> 
> On 31.01.2022 22.20, Mark Millard wrote:
>> Mike Karels <mike_at_karels.net> wrote on
>> Date: Mon, 31 Jan 2022 12:27:41 -0600 :
>>> A bisect
>>> would be rather laborious, building a modified SD card each time,
>>> even if just testing kernel changes.  Any other suggestions?
>> Historically I've used:
>> https://artifact.ci.freebsd.org/snapshot/main/?C=M&O=D
>> and the likes of kernel.txz (or more) from, for example:
>> https://artifact.ci.freebsd.org/snapshot/main/b4cc5d63b6112746598d21413c9800a43171da52/arm64/aarch64/?C=M&O=D
>> to update just the kernel (or whatever) and rebooted.
>> (It can help to have a somewhat older world that is
>> left in place instead of running newer worlds on older
>> kernels. Avoiding needing to update world as well has
>> been helpful when testing for kernel issues.)
>> This avoids building the kernels and allows a somewhat
>> bisect-like activity until some subrange has no
>> arm64/aarch64 artifacts available.
>> One can sometimes run into the dates for the sort for:
>> https://artifact.ci.freebsd.org/snapshot/main/?C=M&O=D
>> not matching up well with the dates on the files of
>> interest in specific subdirectories. (Some sort of
>> directory update?) This can make the bisect far more
>> difficult, given the choice to not have the directory
>> names prefixed with text that would sort by a
>> date/time estimate when sorted by name. (Only using
>> the commit id/hash completely randomizes the naming.)
>> ===
>> Mark Millard
>> marklmi at yahoo.com
> Hi
> My bisect gives:
> The latest working is:
> dda9847275da79ccbb2f0b7079b250e28b3b3b2a
> The immediately following commit:
> 74cf7cae4d2238ae6d1c949b2bbd077e1ab33634 is bad.
> So 74cf7cae4d2238ae6d1c949b2bbd077e1ab33634 is where the problem starts for me.
> Hope that someone can explain why 74cf7cae4d2238ae6d1c949b2bbd077e1ab33634 blocks entropy/random seeding on first boot around the growfs invocation on arm64.
> /Jsm
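
For anyone wanting to repeat this sort of artifact-based bisect, the
following is only a rough sketch of the bookkeeping, not a tested tool.
It assumes you already have the commit hashes in true commit order
(oldest first, for example from git log, since the artifact directory
dates do not sort reliably) and that every commit in the list has an
arm64/aarch64 artifact. The names kernel_url, first_bad and is_bad are
made up for illustration; is_bad stands for the manual step of
installing that kernel.txz, booting, and reporting whether the hang
shows up.

```python
from typing import Callable, Sequence

# Base of the artifact tree referenced earlier in this thread.
ARTIFACT_BASE = "https://artifact.ci.freebsd.org/snapshot/main"


def kernel_url(commit: str, arch: str = "arm64/aarch64") -> str:
    """Build the kernel.txz URL for one commit (layout assumed from the
    directory URL quoted above)."""
    return f"{ARTIFACT_BASE}/{commit}/{arch}/kernel.txz"


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit.

    Assumes commits is ordered oldest to newest, commits[0] is known
    good, and commits[-1] is known bad. is_bad is a hypothetical
    callback: in practice a person installs kernel_url(commit), boots,
    and answers.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # first bad commit is at mid or earlier
        else:
            lo = mid  # first bad commit is after mid
    return commits[hi]
```

With the two hashes from the bisect result above as the only entries,
first_bad returns 74cf7cae4d2238ae6d1c949b2bbd077e1ab33634 without
needing any test boots, since the endpoints are already known.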

===
Mark Millard
marklmi at yahoo.com