From nobody Fri Dec 17 11:17:12 2021 X-Original-To: freebsd-arm@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id CD1D218EA390 for ; Fri, 17 Dec 2021 11:17:18 +0000 (UTC) (envelope-from mindal@semihalf.com) Received: from mail-ed1-x52c.google.com (mail-ed1-x52c.google.com [IPv6:2a00:1450:4864:20::52c]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4JFmfQ1H16z4grt for ; Fri, 17 Dec 2021 11:17:18 +0000 (UTC) (envelope-from mindal@semihalf.com) Received: by mail-ed1-x52c.google.com with SMTP id y13so6328306edd.13 for ; Fri, 17 Dec 2021 03:17:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=semihalf-com.20210112.gappssmtp.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=h0qsgPokLGQzgknTj23i2fawgFZkkbEUDbTNrhjKbAY=; b=yQTZxgKoUi1ovAG+P2IZqNjiAZA/JtwYc8g46c7WhI4J8F420ukg7SxzV/vcL/Hz6y zpTpjWZiJBoaIj++tKhnhiJwFzFdJrO3lCf9wKxs7uXeQQEi15K8bTx/5fIdWx5Cwk+z qbzQJGHFxU2ZDWLHTqZiXQWO7fLqo6fU318+fcld0Qooy2L1XEYlA/mYrilvcMu1v7j1 WnNdPK0KcfY3e6Z0yFx3IxFUU1DXvyAVyN4R/hnwa+K2RnOgyR05KAYB8tYGk3N/kRjm 7Z6S2O3g7Rel8b0CCWXqi6GKao+FSvl6HTDuJtSkz8C/9O/JHYNQFlqJ80iIZrIJuKnm K6RQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=h0qsgPokLGQzgknTj23i2fawgFZkkbEUDbTNrhjKbAY=; b=WzuVOiCYxo22f+uuPGPwpafBWsQLCflQEmC90bwj7EdGz/W87PNiRxe5119x2sjFg7 6e8hKXeV2J3N1p0lgsRV0i3B3c7jtmaOAGmx4YiYVn8aQIw8xozpcBiP0INJmz2XgvcZ zkH9cGlNTiyuOay6CcCi8O4n+2qtE6lYO0yH5T8+/BcI/k11Y3wnTP+YsCk3q5OHt4pJ Ntdv8IcTVl+CbgefY20bItatK5gBjprKiGT9n1mvP3j8Q0N0FxJGHOAsNK/KO5jsoEVX kF3ukEPBj8k0bSErcChdqzZDfv4IYDDqt9RyhHbViWcB9H9OlV+5dyuo9XtJXWZVUsP4 8L3A== X-Gm-Message-State: AOAM5331FHYlWPUZTPwOTh1mXaKGvLyojEQaxsERoPmy3npIMfoffnW6 2n7DBQ7zIvHhnUTOGOQf4DHWz7kU1uaiFTvnj98LG+QfJkZ6Fg== X-Google-Smtp-Source: ABdhPJwGe20s4sC4IDd2jyYjOl1Fd04oUzPbpckOVrOqHXYcbN8o7uqXeKSweM4KPldSYyvPBquJcrz60nacsTrxZds= X-Received: by 2002:a17:907:2d0b:: with SMTP id gs11mr2150168ejc.700.1639739836967; Fri, 17 Dec 2021 03:17:16 -0800 (PST) List-Id: Porting FreeBSD to ARM processors List-Archive: https://lists.freebsd.org/archives/freebsd-arm List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arm@freebsd.org MIME-Version: 1.0 References: <243CBFC7-DFB5-4F8B-A8A3-CFF78455148D.ref@yahoo.com> <243CBFC7-DFB5-4F8B-A8A3-CFF78455148D@yahoo.com> <20211209081930.7970b6995a8f7c5f7466227d@bidouilliste.com> <053617FD-AA34-4A3F-853A-4D2E44F8254B@yahoo.com> <43901D57-9C39-4FAC-A2BE-CCE642791705@yahoo.com> <8DAA50A1-3CF0-4AFA-9977-58FE15D4F171@yahoo.com> <21B0478B-340F-4BB2-9189-B5A6AE458134@yahoo.com> <7717F6CF-0239-4DC0-B23F-B9D5F75C0A8D@yahoo.com> <7EFA98DF-325F-4821-A040-FB4A9E66AB8F@yahoo.com> In-Reply-To: From: =?UTF-8?Q?Kornel_Dul=C4=99ba?= Date: Fri, 17 Dec 2021 12:17:12 +0100 Message-ID: Subject: Re: Rock64 configuration fails to boot for main 22c4ab6cb015 but worked for main 06bd74e1e39c (Nov 21): e.MMC mishandled? To: Mark Millard Cc: Emmanuel Vadot , Free BSD Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 4JFmfQ1H16z4grt X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=semihalf-com.20210112.gappssmtp.com header.s=20210112 header.b=yQTZxgKo; dmarc=none; spf=none (mx1.freebsd.org: domain of mindal@semihalf.com has no SPF policy when checking 2a00:1450:4864:20::52c) smtp.mailfrom=mindal@semihalf.com X-Spamd-Result: default: False [-2.24 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-0.94)[-0.944]; R_DKIM_ALLOW(-0.20)[semihalf-com.20210112.gappssmtp.com:s=20210112]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_HAM_LONG(-1.00)[-1.000]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-arm@freebsd.org]; DMARC_NA(0.00)[semihalf.com]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[semihalf-com.20210112.gappssmtp.com:+]; NEURAL_HAM_SHORT(-1.00)[-1.000]; RCVD_IN_DNSWL_NONE(0.00)[2a00:1450:4864:20::52c:from]; R_SPF_NA(0.00)[no SPF record]; FREEMAIL_TO(0.00)[yahoo.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; SUBJECT_ENDS_QUESTION(1.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[] X-ThisMailContainsUnwantedMimeParts: N > >>> [I've cut out the history: just presenting some new evidence.] > >>> > >>> First, a little context from getting to the db> prompt. > >>> > >>> db> ps > >>> pid ppid pgrp uid state wmesg wchan cmd > >>> 18 0 0 0 DL syncer 0xffff000000eca5a8 [syncer] > >>> 17 0 0 0 DL vlruwt 0xffffa00007d2ea60 [vnlru] > >>> 16 0 0 0 DL (threaded) [bufdaemon] > >>> 100089 D qsleep 0xffff000000ec9478 [bufdaem= on] > >>> 100092 D - 0xffff000000c11100 [bufspac= edaemon-0] > >>> 100093 D - 0xffff000000c21680 [bufspac= edaemon-1] > >>> 9 0 0 0 DL psleep 0xffff000000ef0650 [vmdaemon] > >>> 8 0 0 0 DL (threaded) [pagedaemon= ] > >>> 100087 D psleep 0xffff000000ee2b38 [dom0] > >>> 100094 D launds 0xffff000000ee2b44 [laundry= : dom0] > >>> 100095 D umarcl 0xffff0000007b38d8 [uma] > >>> 7 0 0 0 DL mmcsd d 0xffffa00007b72e00 [mmcsd0boot= 1: mmc/sd] > >>> 6 0 0 0 DL mmcsd d 0xffffa00007b71300 [mmcsd0boot= 0: mmc/sd] > >>> 5 0 0 0 DL mmcreq 0xffff00009b5d0710 [mmcsd0: mm= c/sd card] > >>> 4 0 0 0 DL - 0xffff000000ccc020 [rand_harve= stq] > >>> 15 0 0 0 DL (threaded) [usb] > >>> . . . > >>> > >>> and "mmcreq" is from the while loop in: > >>> > >>> static int > >>> mmc_wait_for_req(struct mmc_softc *sc, struct mmc_request *req) > >>> { > >>> > >>> req->done =3D mmc_wakeup; > >>> req->done_data =3D sc; > >>> if (__predict_false(mmc_debug > 1)) { > >>> device_printf(sc->dev, "REQUEST: CMD%d arg %#x flags %#x= ", > >>> req->cmd->opcode, req->cmd->arg, req->cmd->flags); > >>> if (req->cmd->data) { > >>> printf(" data %d\n", (int)req->cmd->data->len); > >>> } else > >>> printf("\n"); > >>> } > >>> MMCBR_REQUEST(device_get_parent(sc->dev), sc->dev, req); > >>> MMC_LOCK(sc); > >>> while ((req->flags & MMC_REQ_DONE) =3D=3D 0) > >>> msleep(req, &sc->sc_mtx, 0, "mmcreq", 0); > >>> MMC_UNLOCK(sc); > >>> if (__predict_false(mmc_debug > 2 || (mmc_debug > 0 && > >>> req->cmd->error !=3D MMC_ERR_NONE))) > >>> device_printf(sc->dev, "CMD%d RESULT: %d\n", > >>> req->cmd->opcode, req->cmd->error); > >>> return (0); > >>> } > >>> > >>> So it appears that the error report: > >>> > >>> mmcsd0: Error indicated: 4 Failed > >>> > >>> ends up associated with (req->flags & MMC_REQ_DONE) =3D=3D 0 staying > >>> true in the above code: an unbounded loop with MMC_LOCK(sc) active. > >>> The "4" in the error report seems to be from: > >>> > >>> #define MMC_ERR_FAILED 4 > >>> > >>> It looks like there are some problems with handling errors, problems > >>> such that it gets stuck looping (no panic, no progress). > >>> > >>> That seems to be separate from why the MMC_ERR_FAILED was generated > >>> in the first place. So: 2 problems, not just one. Thus it may be a > >>> good context for tackling the looping problem with a known example > >>> failure to look at. > >>> > >>> > >>> > >>> Just for reference, I tried "boot -v" with debug.verbose_sysinit=3D1 = in place, > >>> just to capture and report the tail of the output for the boot failur= e. > >>> > >>> . . . > >>> subsystem f000000 > >>> release_aps(0)... Release APs...done > >>> done. > >>> intr_irq_shuffle(0)... Trying to mount root from ufs:/dev/gpt/Rock64r= oot []... > >>> done. > >>> netisr_start(0)... done. > >>> taskqgroup_bind_softirq(0)... done. > >>> GEOM: new disk mmcsd0 > >>> GEOM: new disk mmcsd0boot0 > >>> GEOM: new disk mmcsd0boot1 > >>> smp_after_idle_runnable(0)... done. > >>> taskqgroup_bind_if_config_tqg(0)... done. > >>> taskqgroup_bind_if_io_tqg(0)... done. > >>> tmr_setup_user_access(0)... done. > >>> subsystem f000001 > >>> mmcsd0: Error indicated: 4 Failed > >>> epoch_init_smp(0)... done. > >>> subsystem f100000 > >>> racctd_init(0)... done. > >>> subsystem fffffff > >>> start_periodic_resettodr(0)... done. > >>> oktousecallout(0)... done. > >>> clknode_finish(0)... Unresolved linked clock found: hdmi_phy > >>> Unresolved linked clock found: usb480m_phy > >>> done. > >>> regulator_constraint(0)... done. > >>> regulator_shutdown(0)... regulator: shutting down unused regulators > >>> regulator: shutting down vcc_sd... busy > >>> done. > >>> uhub0: 1 port with 1 removable, self powered > >>> uhub2: 2 ports with 2 removable, self powered > >>> uhub3: 1 port with 1 removable, self powered > >>> uhub1: 1 port with 1 removable, self powered > >>> ugen4.2: at usbus4 > >>> umass0 on uhub2 > >>> umass0: on = usbus4 > >>> umass0: SCSI over Bulk-Only; quirks =3D 0x0000 > >>> umass0:0:0: Attached to scbus0 > >>> pass0 at umass-sim0 bus 0 scbus0 target 0 lun 0 > >>> pass0: Fixed Direct Access SPC-4 SCSI devic= e > >>> pass0: Serial Number REPLACED > >>> pass0: 400.000MB/s transfers > >>> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0 > >>> da0: Fixed Direct Access SPC-4 SCSI device > >>> da0: Serial Number REPLACED > >>> da0: 400.000MB/s transfers > >>> da0: 953869MB (1953525168 512 byte sectors) > >>> da0: quirks=3D0x2 > >>> da0: Delete methods: > >>> random: unblocking device. > >>> > >>> No more output after that. > >> > >> As for why MMC_ERR_FAILED results, the following code diff is > >> intended to suggest what I think may be incomplete about sticking > >> to what the device-specific code supports vs. does not support > >> (not supporting HS200 here). The code does compile in my context > >> but is untested. > > > > It is now tested (at least to be a useful hack): no longer am I > > running an older 1400042 kernel. For reference, > > > > # uname -apKU > > FreeBSD Rock64_RPi_4_3_2v1p2 14.0-CURRENT FreeBSD 14.0-CURRENT #18 main= -n251456-22c4ab6cb015-dirty: Sun Dec 12 00:34:53 PST 2021 root@CA72_16G= p_ZFS:/usr/obj/BUILDs/main-CA53-nodbg-clang/usr/main-src/arm64.aarch64/sys/= GENERIC-NODBG-CA53 arm64 aarch64 1400043 1400043 > > > > And it reports during the boot (other than the "REPLACED"): > > > > mmcsd0: 125GB a= t mmc0 52.0MHz/8bit/1016-block > > > > So it no longer sets up a mode that the rk3328-specific-code does not > > actually support. > > > > (Nothing that I've done here deals with the looping issue when there > > is a MMC_ERR_FAILED or the like.) > > > >> The email handling may mess up some leading > >> whitespace --but, again, I'm only trying to suggest a type of > >> change. > >> > >> # git -C /usr/main-src/ diff /usr/main-src/sys/dev/mmc > >> diff --git a/sys/dev/mmc/mmc.c b/sys/dev/mmc/mmc.c > >> index 9c73dfd57ce0..dffd1c382684 100644 > >> --- a/sys/dev/mmc/mmc.c > >> +++ b/sys/dev/mmc/mmc.c > >> @@ -59,6 +59,7 @@ __FBSDID("$FreeBSD$"); > >> #include > >> #include > >> #include > >> +#include > >> #include > >> #include > >> #include > >> @@ -1512,6 +1513,8 @@ mmc_timing_to_string(enum mmc_bus_timing timing) > >> static bool > >> mmc_host_timing(device_t dev, enum mmc_bus_timing timing) > >> { > >> + kobjop_desc_t kobj_desc; > >> + kobj_method_t *kobj_method; > >> int host_caps; > >> > >> host_caps =3D mmcbr_get_caps(dev); > >> @@ -1543,14 +1546,37 @@ mmc_host_timing(device_t dev, enum mmc_bus_tim= ing timing) > >> case bus_timing_mmc_ddr52: > >> return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_DDR52)); > >> case bus_timing_mmc_hs200: > >> - return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS200_1= 20) || > >> - HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS200_1= 80)); > >> case bus_timing_mmc_hs400: > >> - return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS400_1= 20) || > >> - HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS400_1= 80)); > >> case bus_timing_mmc_hs400es: > >> - return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC_HS400 | > >> - MMC_CAP_MMC_ENH_STROBE)); > >> + /* > >> + * Disable eMMC modes that require use of > >> + * MMC_SEND_TUNING_BLOCK_HS200 to set things up if eit= her the > >> + * tune or re-tune method is the default NULL implemen= tation. > >> + */ > >> + kobj_desc =3D &mmcbr_tune_desc; > >> + kobj_method =3D kobj_lookup_method(((kobj_t)dev)->ops-= >cls, NULL, > >> + kobj_desc); > >> + if (kobj_method =3D=3D &kobj_desc->deflt) > >> + return (false); > >> + kobj_desc =3D &mmcbr_retune_desc; > >> + kobj_method =3D kobj_lookup_method(((kobj_t)dev)->ops-= >cls, NULL, > >> + kobj_desc); > >> + if (kobj_method =3D=3D &kobj_desc->deflt) { > >> + return (false); > >> + } > >> + > >> + /* > >> + * Otherwise track the host capabilities. > >> + */ > >> + if (timing =3D=3D bus_timing_mmc_hs200) > >> + return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC= _HS200_120) || > >> + HOST_TIMING_CAP(host_caps, MMC_CAP_MMC= _HS200_180)); > >> + if (timing =3D=3D bus_timing_mmc_hs400) > >> + return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC= _HS400_120) || > >> + HOST_TIMING_CAP(host_caps, MMC_CAP_MMC= _HS400_180)); > >> + if (timing =3D=3D bus_timing_mmc_hs400es) > >> + return (HOST_TIMING_CAP(host_caps, MMC_CAP_MMC= _HS400 | > >> + MMC_CAP_MMC_ENH_STROBE)); > >> } > >> > >> #undef HOST_TIMING_CAP > >> > >> > >> In other words: have mmc_host_timing avoid returning true for some > >> combinations that definitely do not have sufficient software support > >> present at the time. (So far as I can tell, the rk3328's get the > >> NULL-implementations as things are.) > >> > >> I expect that this sort of thing would go back to using > >> MMC_CAP_MMC_DDR52 for the rk3328's, as an example. Working, but in a > >> slower mode, the same mode as FreeBSD was previously using. > >> > >> A possible incompleteness in the suggestion is that there is also a > >> drive-strength setting involved. If that also had "kobj" interfacing > >> and NULL-implementation possibilities, then in the future there would > >> be more to test for possibly forcing return-false than I did above. > >> > >> Hopefully this sort of thing would help, possibly more than just for > >> rk3328's. > > > > > > As for what was happening without my patch . . . > > sys/dev/mmc/mmcbr_if.m defines: > > static int > null_retune(device_t brdev __unused, device_t reqdev __unused, > bool reset __unused) > { > > return (0); > } > > static int > null_tune(device_t brdev __unused, device_t reqdev __unused, > bool hs400 __unused) > { > > return (0); > } > . . . > # > # Called by the mmcbus with the bridge claimed to execute initial tuning. > # > METHOD int tune { > device_t brdev; > device_t reqdev; > bool hs400; > } DEFAULT null_tune; > > # > # Called by the mmcbus with the bridge claimed to execute re-tuning. > # > METHOD int retune { > device_t brdev; > device_t reqdev; > bool reset; > } DEFAULT null_retune; > . . . > > It is these success-reporting no-op routines that were being > used to attempt the tuning: so there was no tuning done. > > The code that I added detects that these routines would be > used and avoids allowing contexts that would involve putting > them to use with HS200 mode. > > I'll note that there is another such null_* routine that the > code (even with my patch) does not deal with avoiding the use > of: > > . . . > static int > null_switch_vccq(device_t brdev __unused, device_t reqdev __unuse= d) > { > > return (0); > } > . . . > # > # Called by the mmcbus to switch the signaling voltage (VCCQ). > # > METHOD int switch_vccq { > device_t brdev; > device_t reqdev; > } DEFAULT null_switch_vccq; > . . . > > /usr/main-src/sys/dev/sdhci/sdhci.c has somewhat analogous code for > somewhat analogous null_* routines. null_set_uhs_timing for that is > from sys/dev/sdhci/sdhci_if.m (but the other two are again the above > null_tune and null_retune routines, so not repeated here): > > . . . > static void > null_set_uhs_timing(device_t brdev __unused, > struct sdhci_slot *slot __unused) > { > > } > . . . > METHOD void set_uhs_timing { > device_t brdev; > struct sdhci_slot *slot; > } DEFAULT null_set_uhs_timing; > . . . > > sdhci_init_slot(device_t dev, struct sdhci_slot *slot, int num) > in sdhci.c looks like (in part): > > . . . > /* > * Disable UHS-I and eMMC modes if the set_uhs_timing method is t= he > * default NULL implementation. > */ > kobj_desc =3D &sdhci_set_uhs_timing_desc; > kobj_method =3D kobj_lookup_method(((kobj_t)dev)->ops->cls, NULL, > kobj_desc); > if (kobj_method =3D=3D &kobj_desc->deflt) > host_caps &=3D ~(MMC_CAP_UHS_SDR12 | MMC_CAP_UHS_SDR25 | > MMC_CAP_UHS_SDR50 | MMC_CAP_UHS_DDR50 | MMC_CAP_UHS_S= DR104 | > MMC_CAP_MMC_DDR52 | MMC_CAP_MMC_HS200 | MMC_CAP_MMC_H= S400); > > #define SDHCI_CAP_MODES_TUNING(caps2) \ > (((caps2) & SDHCI_TUNE_SDR50 ? MMC_CAP_UHS_SDR50 : 0) | \ > MMC_CAP_UHS_DDR50 | MMC_CAP_UHS_SDR104 | MMC_CAP_MMC_HS200 | \ > MMC_CAP_MMC_HS400) > > /* > * Disable UHS-I and eMMC modes that require (re-)tuning if eithe= r > * the tune or re-tune method is the default NULL implementation. > */ > kobj_desc =3D &mmcbr_tune_desc; > kobj_method =3D kobj_lookup_method(((kobj_t)dev)->ops->cls, NULL, > kobj_desc); > if (kobj_method =3D=3D &kobj_desc->deflt) > goto no_tuning; > kobj_desc =3D &mmcbr_retune_desc; > kobj_method =3D kobj_lookup_method(((kobj_t)dev)->ops->cls, NULL, > kobj_desc); > if (kobj_method =3D=3D &kobj_desc->deflt) { > no_tuning: > host_caps &=3D ~(SDHCI_CAP_MODES_TUNING(caps2)); > } > . . . > > What I've done in my patch is analogous to what the the code shown > after the #define SDHCI_CAP_MODES_TUNING above does, translated to > fit the mmc's pre-existing code structure. Good catch. For some reason mmcbr_tune/mmcbr_retune are not implemented in sdhci_fdt.c. This looks like a separate bug/issue. Note that pretty much all other SDHCI drivers (sdhci_pci.c, sdhci_xenon.c, sdhci_fsl_fdt.c, sdhci_acpi.c) provide some implementation of mmcbr_tune/mmcbr_retune. I agree that the logic in sdhci.c should disable HS200 if those methods are implemented. Could you try hooking mmcbr_tune and mmcbr_retune to their generic sdhci implementations? If we're lucky this could be enough to make HS200 work on rock64, though it's unlikely.