Re: Rock64 configuration fails to boot for main 22c4ab6cb015 but worked for main 06bd74e1e39c (Nov 21): e.MMC mishandled?
Date: Sun, 12 Dec 2021 00:19:15 UTC
[I've cut out the history: just presenting some new evidence.]
First, a little context from getting to the db> prompt.
db> ps
pid ppid pgrp uid state wmesg wchan cmd
18 0 0 0 DL syncer 0xffff000000eca5a8 [syncer]
17 0 0 0 DL vlruwt 0xffffa00007d2ea60 [vnlru]
16 0 0 0 DL (threaded) [bufdaemon]
100089 D qsleep 0xffff000000ec9478 [bufdaemon]
100092 D - 0xffff000000c11100 [bufspacedaemon-0]
100093 D - 0xffff000000c21680 [bufspacedaemon-1]
9 0 0 0 DL psleep 0xffff000000ef0650 [vmdaemon]
8 0 0 0 DL (threaded) [pagedaemon]
100087 D psleep 0xffff000000ee2b38 [dom0]
100094 D launds 0xffff000000ee2b44 [laundry: dom0]
100095 D umarcl 0xffff0000007b38d8 [uma]
7 0 0 0 DL mmcsd d 0xffffa00007b72e00 [mmcsd0boot1: mmc/sd]
6 0 0 0 DL mmcsd d 0xffffa00007b71300 [mmcsd0boot0: mmc/sd]
5 0 0 0 DL mmcreq 0xffff00009b5d0710 [mmcsd0: mmc/sd card]
4 0 0 0 DL - 0xffff000000ccc020 [rand_harvestq]
15 0 0 0 DL (threaded) [usb]
. . .
and "mmcreq" is from the while loop in:
static int
mmc_wait_for_req(struct mmc_softc *sc, struct mmc_request *req)
{
req->done = mmc_wakeup;
req->done_data = sc;
if (__predict_false(mmc_debug > 1)) {
device_printf(sc->dev, "REQUEST: CMD%d arg %#x flags %#x",
req->cmd->opcode, req->cmd->arg, req->cmd->flags);
if (req->cmd->data) {
printf(" data %d\n", (int)req->cmd->data->len);
} else
printf("\n");
}
MMCBR_REQUEST(device_get_parent(sc->dev), sc->dev, req);
MMC_LOCK(sc);
while ((req->flags & MMC_REQ_DONE) == 0)
msleep(req, &sc->sc_mtx, 0, "mmcreq", 0);
MMC_UNLOCK(sc);
if (__predict_false(mmc_debug > 2 || (mmc_debug > 0 &&
req->cmd->error != MMC_ERR_NONE)))
device_printf(sc->dev, "CMD%d RESULT: %d\n",
req->cmd->opcode, req->cmd->error);
return (0);
}
So it appears that the error report:
mmcsd0: Error indicated: 4 Failed
ends up associated with (req->flags & MMC_REQ_DONE) == 0 staying
true in the above code: an unbounded loop with MMC_LOCK(sc) active.
The "4" in the error report seems to be from:
#define MMC_ERR_FAILED 4
It looks like there are some problems with handling errors, problems
such that it gets stuck looping (no panic, no progress).
That seems to be separate from why the MMC_ERR_FAILED was generated
in the first place. So: 2 problems, not just one. Thus it may be a
good context for tackling the looping problem with a known example
failure to look at.
Just for reference, I tried "boot -v" with debug.verbose_sysinit=1 in place,
just to capture and report the tail of the output for the boot failure.
. . .
subsystem f000000
release_aps(0)... Release APs...done
done.
intr_irq_shuffle(0)... Trying to mount root from ufs:/dev/gpt/Rock64root []...
done.
netisr_start(0)... done.
taskqgroup_bind_softirq(0)... done.
GEOM: new disk mmcsd0
GEOM: new disk mmcsd0boot0
GEOM: new disk mmcsd0boot1
smp_after_idle_runnable(0)... done.
taskqgroup_bind_if_config_tqg(0)... done.
taskqgroup_bind_if_io_tqg(0)... done.
tmr_setup_user_access(0)... done.
subsystem f000001
mmcsd0: Error indicated: 4 Failed
epoch_init_smp(0)... done.
subsystem f100000
racctd_init(0)... done.
subsystem fffffff
start_periodic_resettodr(0)... done.
oktousecallout(0)... done.
clknode_finish(0)... Unresolved linked clock found: hdmi_phy
Unresolved linked clock found: usb480m_phy
done.
regulator_constraint(0)... done.
regulator_shutdown(0)... regulator: shutting down unused regulators
regulator: shutting down vcc_sd... busy
done.
uhub0: 1 port with 1 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub3: 1 port with 1 removable, self powered
uhub1: 1 port with 1 removable, self powered
ugen4.2: <Samsung PSSD T7 Touch> at usbus4
umass0 on uhub2
umass0: <Samsung PSSD T7 Touch, class 0/0, rev 3.20/1.00, addr 1> on usbus4
umass0: SCSI over Bulk-Only; quirks = 0x0000
umass0:0:0: Attached to scbus0
pass0 at umass-sim0 bus 0 scbus0 target 0 lun 0
pass0: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
pass0: Serial Number REPLACED
pass0: 400.000MB/s transfers
da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
da0: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
da0: Serial Number REPLACED
da0: 400.000MB/s transfers
da0: 953869MB (1953525168 512 byte sectors)
da0: quirks=0x2<NO_6_BYTE>
da0: Delete methods: <NONE(*),ZERO>
random: unblocking device.
No more output after that.
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)