Re: Rock64 configuration fails to boot for main 22c4ab6cb015 but worked for main 06bd74e1e39c (Nov 21): e.MMC mishandled?

From: Kornel_Dulęba <mindal_at_semihalf.com>
Date: Fri, 17 Dec 2021 11:17:12 UTC
> >>> [I've cut out the history: just presenting some new evidence.]
> >>>
> >>> First, a little context from getting to the db> prompt.
> >>>
> >>> db> ps
> >>> pid  ppid  pgrp   uid  state   wmesg   wchan               cmd
> >>> 18     0     0     0  DL      syncer  0xffff000000eca5a8  [syncer]
> >>> 17     0     0     0  DL      vlruwt  0xffffa00007d2ea60  [vnlru]
> >>> 16     0     0     0  DL      (threaded)                  [bufdaemon]
> >>> 100089                   D       qsleep  0xffff000000ec9478  [bufdaemon]
> >>> 100092                   D       -       0xffff000000c11100  [bufspacedaemon-0]
> >>> 100093                   D       -       0xffff000000c21680  [bufspacedaemon-1]
> >>>  9     0     0     0  DL      psleep  0xffff000000ef0650  [vmdaemon]
> >>>  8     0     0     0  DL      (threaded)                  [pagedaemon]
> >>> 100087                   D       psleep  0xffff000000ee2b38  [dom0]
> >>> 100094                   D       launds  0xffff000000ee2b44  [laundry: dom0]
> >>> 100095                   D       umarcl  0xffff0000007b38d8  [uma]
> >>>  7     0     0     0  DL      mmcsd d 0xffffa00007b72e00  [mmcsd0boot1: mmc/sd]
> >>>  6     0     0     0  DL      mmcsd d 0xffffa00007b71300  [mmcsd0boot0: mmc/sd]
> >>>  5     0     0     0  DL      mmcreq  0xffff00009b5d0710  [mmcsd0: mmc/sd card]
> >>>  4     0     0     0  DL      -       0xffff000000ccc020  [rand_harvestq]
> >>> 15     0     0     0  DL      (threaded)                  [usb]
> >>> . . .
> >>>
> >>> and "mmcreq" is from the while loop in:
> >>>
> >>> static int
> >>> mmc_wait_for_req(struct mmc_softc *sc, struct mmc_request *req)
> >>> {
> >>>
> >>>      req->done = mmc_wakeup;
> >>>      req->done_data = sc;
> >>>      if (__predict_false(mmc_debug > 1)) {
> >>>              device_printf(sc->dev, "REQUEST: CMD%d arg %#x flags %#x",
> >>>                  req->cmd->opcode, req->cmd->arg, req->cmd->flags);
> >>>              if (req->cmd->data) {
> >>>                      printf(" data %d\n", (int)req->cmd->data->len);
> >>>              } else
> >>>                      printf("\n");
> >>>      }
> >>>      MMCBR_REQUEST(device_get_parent(sc->dev), sc->dev, req);
> >>>      MMC_LOCK(sc);
> >>>      while ((req->flags & MMC_REQ_DONE) == 0)
> >>>              msleep(req, &sc->sc_mtx, 0, "mmcreq", 0);
> >>>      MMC_UNLOCK(sc);
> >>>      if (__predict_false(mmc_debug > 2 || (mmc_debug > 0 &&
> >>>          req->cmd->error != MMC_ERR_NONE)))
> >>>              device_printf(sc->dev, "CMD%d RESULT: %d\n",
> >>>                  req->cmd->opcode, req->cmd->error);
> >>>      return (0);
> >>> }
> >>>
> >>> So it appears that the error report:
> >>>
> >>> mmcsd0: Error indicated: 4 Failed
> >>>
> >>> ends up associated with (req->flags & MMC_REQ_DONE) == 0 staying
> >>> true in the above code: an unbounded loop with MMC_LOCK(sc) active.
> >>> The "4" in the error report seems to be from:
> >>>
> >>> #define MMC_ERR_FAILED  4
> >>>
> >>> It looks like there are some problems with handling errors, problems
> >>> such that it gets stuck looping (no panic, no progress).
> >>>
> >>> That seems to be separate from why the MMC_ERR_FAILED was generated
> >>> in the first place. So: 2 problems, not just one. Thus it may be a
> >>> good context for tackling the looping problem with a known example
> >>> failure to look at.
> >>>
> >>>
> >>>
> >>> Just for reference, I tried "boot -v" with debug.verbose_sysinit=1 in place,
> >>> just to capture and report the tail of the output for the boot failure.
> >>>
> >>> . . .
> >>> subsystem f000000
> >>> release_aps(0)... Release APs...done
> >>> done.
> >>> intr_irq_shuffle(0)... Trying to mount root from ufs:/dev/gpt/Rock64root []...
> >>> done.
> >>> netisr_start(0)... done.
> >>> taskqgroup_bind_softirq(0)... done.
> >>> GEOM: new disk mmcsd0
> >>> GEOM: new disk mmcsd0boot0
> >>> GEOM: new disk mmcsd0boot1
> >>> smp_after_idle_runnable(0)... done.
> >>> taskqgroup_bind_if_config_tqg(0)... done.
> >>> taskqgroup_bind_if_io_tqg(0)... done.
> >>> tmr_setup_user_access(0)... done.
> >>> subsystem f000001
> >>> mmcsd0: Error indicated: 4 Failed
> >>> epoch_init_smp(0)... done.
> >>> subsystem f100000
> >>> racctd_init(0)... done.
> >>> subsystem fffffff
> >>> start_periodic_resettodr(0)... done.
> >>> oktousecallout(0)... done.
> >>> clknode_finish(0)... Unresolved linked clock found: hdmi_phy
> >>> Unresolved linked clock found: usb480m_phy
> >>> done.
> >>> regulator_constraint(0)... done.
> >>> regulator_shutdown(0)... regulator: shutting down unused regulators
> >>> regulator: shutting down vcc_sd... busy
> >>> done.
> >>> uhub0: 1 port with 1 removable, self powered
> >>> uhub2: 2 ports with 2 removable, self powered
> >>> uhub3: 1 port with 1 removable, self powered
> >>> uhub1: 1 port with 1 removable, self powered
> >>> ugen4.2: <Samsung PSSD T7 Touch> at usbus4
> >>> umass0 on uhub2
> >>> umass0: <Samsung PSSD T7 Touch, class 0/0, rev 3.20/1.00, addr 1> on usbus4
> >>> umass0:  SCSI over Bulk-Only; quirks = 0x0000
> >>> umass0:0:0: Attached to scbus0
> >>> pass0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> >>> pass0: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
> >>> pass0: Serial Number REPLACED
> >>> pass0: 400.000MB/s transfers
> >>> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> >>> da0: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
> >>> da0: Serial Number REPLACED
> >>> da0: 400.000MB/s transfers
> >>> da0: 953869MB (1953525168 512 byte sectors)
> >>> da0: quirks=0x2<NO_6_BYTE>
> >>> da0: Delete methods: <NONE(*),ZERO>
> >>> random: unblocking device.
> >>>
> >>> No more output after that.
> >>
> >> As for why MMC_ERR_FAILED results, the following code diff is
> >> intended to suggest what I think may be incomplete about sticking
> >> to what the device-specific code supports vs. does not support
> >> (not supporting HS200 here). The code does compile in my context
> >> but is untested.
> >
> > It is now tested (at least to be a useful hack): no longer am I
> > running an older 1400042 kernel. For reference,
> >
> > # uname -apKU
> > FreeBSD Rock64_RPi_4_3_2v1p2 14.0-CURRENT FreeBSD 14.0-CURRENT #18 main-n251456-22c4ab6cb015-dirty: Sun Dec 12 00:34:53 PST 2021     root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA53-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA53  arm64 aarch64 1400043 1400043
> >
> > And it reports during the boot (other than the "REPLACED"):
> >
> > mmcsd0: 125GB <MMCHC DJNB4R 0.7 SN REPLACED MFG 06/2016 by 21 0x0000> at mmc0 52.0MHz/8bit/1016-block
> >
> > So it no longer sets up a mode that the rk3328-specific-code does not
> > actually support.
> >
> > (Nothing that I've done here deals with the looping issue when there
> > is a MMC_ERR_FAILED or the like.)
> >
> >> The email handling may mess up some leading
> >> whitespace --but, again, I'm only trying to suggest a type of
> >> change.
> >>
> >> # git -C /usr/main-src/ diff /usr/main-src/sys/dev/mmc
> >> diff --git a/sys/dev/mmc/mmc.c b/sys/dev/mmc/mmc.c
> >> index 9c73dfd57ce0..dffd1c382684 100644
> >> --- a/sys/dev/mmc/mmc.c
> >> +++ b/sys/dev/mmc/mmc.c
> >> @@ -59,6 +59,7 @@ __FBSDID("$FreeBSD$");
> >> #include <sys/param.h>
> >> #include <sys/systm.h>
> >> #include <sys/kernel.h>
> >> +#include <sys/kobj.h>
> >> #include <sys/malloc.h>
> >> #include <sys/lock.h>
> >> #include <sys/module.h>
> >> @@ -1512,6 +1513,8 @@ mmc_timing_to_string(enum mmc_bus_timing timing)
> >> static bool
> >> mmc_host_timing(device_t dev, enum mmc_bus_timing timing)
> >> {
> >> +       kobjop_desc_t kobj_desc;
> >> +       kobj_method_t *kobj_method;
> >>       int host_caps;
> >>
> >>       host_caps = mmcbr_get_caps(dev);
> >> @@ -1543,14 +1546,37 @@ mmc_host_timing(device_t dev, enum mmc_bus_timing timing)
> >>       case bus_timing_mmc_ddr52:
> >>               return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_DDR52));
> >>       case bus_timing_mmc_hs200:
> >> -               return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS200_120) ||
> >> -                       HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS200_180));
> >>       case bus_timing_mmc_hs400:
> >> -               return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS400_120) ||
> >> -                       HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS400_180));
> >>       case bus_timing_mmc_hs400es:
> >> -               return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS400 |
> >> -                   MMC_CAP_MMC_ENH_STROBE));
> >> +               /*
> >> +                * Disable eMMC modes that require use of
> >> +                * MMC_SEND_TUNING_BLOCK_HS200 to set things up if either the
> >> +                * tune or re-tune method is the default NULL implementation.
> >> +                */
> >> +               kobj_desc = &mmcbr_tune_desc;
> >> +               kobj_method = kobj_lookup_method(((kobj_t)dev)->ops->cls, NULL,
> >> +                   kobj_desc);
> >> +               if (kobj_method == &kobj_desc->deflt)
> >> +                       return (false);
> >> +               kobj_desc = &mmcbr_retune_desc;
> >> +               kobj_method = kobj_lookup_method(((kobj_t)dev)->ops->cls, NULL,
> >> +                   kobj_desc);
> >> +               if (kobj_method == &kobj_desc->deflt) {
> >> +                       return (false);
> >> +               }
> >> +
> >> +               /*
> >> +                * Otherwise track the host capabilities.
> >> +                */
> >> +               if (timing == bus_timing_mmc_hs200)
> >> +                       return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS200_120) ||
> >> +                               HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS200_180));
> >> +               if (timing == bus_timing_mmc_hs400)
> >> +                       return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS400_120) ||
> >> +                               HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS400_180));
> >> +               if (timing == bus_timing_mmc_hs400es)
> >> +                       return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS400 |
> >> +                               MMC_CAP_MMC_ENH_STROBE));
> >>       }
> >>
> >> #undef HOST_TIMING_CAP
> >>
> >>
> >> In other words: have mmc_host_timing avoid returning true for some
> >> combinations that definitely do not have sufficient software support
> >> present at the time. (So far as I can tell, the rk3328's get the
> >> NULL-implementations as things are.)
> >>
> >> I expect that this sort of thing would go back to using
> >> MMC_CAP_MMC_DDR52 for the rk3328's, as an example. Working, but in a
> >> slower mode, the same mode as FreeBSD was previously using.
> >>
> >> A possible incompleteness in the suggestion is that there is also a
> >> drive-strength setting involved. If that also had "kobj" interfacing
> >> and NULL-implementation possibilities, then in the future there would
> >> be more to test for possibly forcing return-false than I did above.
> >>
> >> Hopefully this sort of thing would help, possibly more than just for
> >> rk3328's.
> >
> >
>
> As for what was happening without my patch . . .
>
> sys/dev/mmc/mmcbr_if.m defines:
>
>         static int
>         null_retune(device_t brdev __unused, device_t reqdev __unused,
>             bool reset __unused)
>         {
>
>                 return (0);
>         }
>
>         static int
>         null_tune(device_t brdev __unused, device_t reqdev __unused,
>             bool hs400 __unused)
>         {
>
>                 return (0);
>         }
> . . .
> #
> # Called by the mmcbus with the bridge claimed to execute initial tuning.
> #
> METHOD int tune {
>         device_t        brdev;
>         device_t        reqdev;
>         bool            hs400;
> } DEFAULT null_tune;
>
> #
> # Called by the mmcbus with the bridge claimed to execute re-tuning.
> #
> METHOD int retune {
>         device_t        brdev;
>         device_t        reqdev;
>         bool            reset;
> } DEFAULT null_retune;
> . . .
>
> It is these success-reporting no-op routines that were being
> used to attempt the tuning: so there was no tuning done.
>
> The code that I added detects that these routines would be
> used and avoids allowing contexts that would involve putting
> them to use with HS200 mode.
>
> I'll note that there is another such null_* routine that the
> code (even with my patch) does not deal with avoiding the use
> of:
>
> . . .
>         static int
>         null_switch_vccq(device_t brdev __unused, device_t reqdev __unused)
>         {
>
>                 return (0);
>         }
> . . .
> #
> # Called by the mmcbus to switch the signaling voltage (VCCQ).
> #
> METHOD int switch_vccq {
>         device_t        brdev;
>         device_t        reqdev;
> } DEFAULT null_switch_vccq;
> . . .
>
> /usr/main-src/sys/dev/sdhci/sdhci.c has somewhat analogous code for
> somewhat analogous null_* routines. null_set_uhs_timing for that is
> from sys/dev/sdhci/sdhci_if.m (but the other two are again the above
> null_tune and null_retune routines, so not repeated here):
>
> . . .
>         static void
>         null_set_uhs_timing(device_t brdev __unused,
>             struct sdhci_slot *slot __unused)
>         {
>
>         }
> . . .
> METHOD void set_uhs_timing {
>         device_t                brdev;
>         struct sdhci_slot       *slot;
> } DEFAULT null_set_uhs_timing;
> . . .
>
> sdhci_init_slot(device_t dev, struct sdhci_slot *slot, int num)
> in sdhci.c looks like (in part):
>
> . . .
>         /*
>          * Disable UHS-I and eMMC modes if the set_uhs_timing method is the
>          * default NULL implementation.
>          */
>         kobj_desc = &sdhci_set_uhs_timing_desc;
>         kobj_method = kobj_lookup_method(((kobj_t)dev)->ops->cls, NULL,
>             kobj_desc);
>         if (kobj_method == &kobj_desc->deflt)
>                 host_caps &= ~(MMC_CAP_UHS_SDR12 | MMC_CAP_UHS_SDR25 |
>                     MMC_CAP_UHS_SDR50 | MMC_CAP_UHS_DDR50 | MMC_CAP_UHS_SDR104 |
>                     MMC_CAP_MMC_DDR52 | MMC_CAP_MMC_HS200 | MMC_CAP_MMC_HS400);
>
> #define SDHCI_CAP_MODES_TUNING(caps2)                                   \
>     (((caps2) & SDHCI_TUNE_SDR50 ? MMC_CAP_UHS_SDR50 : 0) |             \
>     MMC_CAP_UHS_DDR50 | MMC_CAP_UHS_SDR104 | MMC_CAP_MMC_HS200 |        \
>     MMC_CAP_MMC_HS400)
>
>         /*
>          * Disable UHS-I and eMMC modes that require (re-)tuning if either
>          * the tune or re-tune method is the default NULL implementation.
>          */
>         kobj_desc = &mmcbr_tune_desc;
>         kobj_method = kobj_lookup_method(((kobj_t)dev)->ops->cls, NULL,
>             kobj_desc);
>         if (kobj_method == &kobj_desc->deflt)
>                 goto no_tuning;
>         kobj_desc = &mmcbr_retune_desc;
>         kobj_method = kobj_lookup_method(((kobj_t)dev)->ops->cls, NULL,
>             kobj_desc);
>         if (kobj_method == &kobj_desc->deflt) {
> no_tuning:
>                 host_caps &= ~(SDHCI_CAP_MODES_TUNING(caps2));
>         }
> . . .
>
> What I've done in my patch is analogous to what the the code shown
> after the #define SDHCI_CAP_MODES_TUNING above does, translated to
> fit the mmc's pre-existing code structure.

Good catch. For some reason mmcbr_tune/mmcbr_retune are not
implemented in sdhci_fdt.c.
This looks like a separate bug/issue.
Note that pretty much all other SDHCI drivers (sdhci_pci.c,
sdhci_xenon.c, sdhci_fsl_fdt.c, sdhci_acpi.c) provide some
implementation of mmcbr_tune/mmcbr_retune.
I agree that the logic in sdhci.c should disable HS200 if those
methods are implemented.
Could you try hooking mmcbr_tune and mmcbr_retune to their generic
sdhci implementations?
If we're lucky this could be enough to make HS200 work on rock64,
though it's unlikely.