From: Mark Millard via freebsd-arm
Reply-To: marklmi@yahoo.com
Subject: Re: FYI for aarch64 main [14] running a mid March version: I ended up with [usb{usbus2}] stuck at (near) 100% cpu
Date: Fri, 14 May 2021 14:51:27 -0700
To: "Rodney W. Grimes"
Cc: freebsd-arm, freebsd-current
In-Reply-To: <202105142048.14EKmm9K082795@gndrsh.dnsmgr.net>
List-Id: Porting FreeBSD to ARM processors
List-Archive: http://lists.freebsd.org/arm

On 2021-May-14, at 13:48, Rodney W. Grimes wrote:

>>
>> On 2021-May-14, at 05:52, Rodney W. Grimes wrote:
>>
>>>> Note: The context was using a non-debug main build
>>>> from mid-2021-Mar. (More details identified
>>>> later.)
>>>>
>>>> The issue happened while attempting a:
>>>>
>>>> # zfs send -R zpold@for-copy | zfs recv -Fdv zpnew
>>>>
>>>> where the drives involved in the command were:
>>>>
>>>> zpold: a USB3 SSD, using /dev/da0p3
>>>> zpnew: a 480 GiByte Optane in the PCIe slot, using /dev/nda0p3
>>>>
>>>> with:
>>>>
>>>> # gpart show -pl
>>>> =>         40  468862048    da0  GPT  (224G)
>>>>            40     532480  da0p1  4C8GCA72EFI  (260M)
>>>>        532520       2008         - free -  (1.0M)
>>>>        534528   29360128  da0p2  4C8GCA72swp14  (14G)
>>>>      29894656    4194304         - free -  (2.0G)
>>>>      34088960   33554432  da0p4  4C8GCA72swp16  (16G)
>>>>      67643392  401217536  da0p3  4C8GCA72zfs  (191G)
>>>>     468860928       1160         - free -  (580K)
>>>>
>>>> =>          40  2000409184    ada0  GPT  (954G)
>>>>             40      409600  ada0p1  (null)  (200M)
>>>>         409640  1740636160  ada0p2  FBSDmacchroot  (830G)
>>>>     1741045800    58720256  ada0p3  FBSDmacchswp0  (28G)
>>>>     1799766056   176160768  ada0p4  FBSDmacchswp1  (84G)
>>>>     1975926824    24482400          - free -  (12G)
>>>>
>>>> =>         40  937703008    nda0  GPT  (447G)
>>>>            40     532480  nda0p1  CA72opt0EFI  (260M)
>>>>        532520       2008          - free -  (1.0M)
>>>>        534528  117440512  nda0p2  CA72opt0swp56  (56G)
>>>>     117975040   16777216          - free -  (8.0G)
>>>>     134752256  134217728  nda0p4  CA72opt0swp64  (64G)
>>>>     268969984  668731392  nda0p3  CA72opt0zfs  (319G)
>>>>     937701376       1672          - free -  (836K)
>>>>
>>>> The system running was that on /dev/ada0p2 (FBSDmacchroot,
>>>> which is UFS instead of ZFS).
>>>>
>>>> The [usb{usbus2}] process eventually got stuck-busy, no
>>>> more I/O:
>>>>
>>>> CPU 0:  0.0% user,  0.0% nice,  100% system,  0.0% interrupt,   0.0% idle
>>>> CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>>>> CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>>>> CPU 3:  0.4% user,  0.0% nice,  0.0% system,  0.0% interrupt,  99.6% idle
>>>>
>>>>  PID USERNAME  PRI NICE    SIZE     RES STATE   C  TIME    CPU COMMAND
>>>>   15 root      -72    -      0B 262144B CPU0    0  8:51 99.95% [usb{usbus2}]
>>>>
>>>> 1295 root       -8    0 20108Ki  8092Ki q->bq_  2  0:04  0.00% zfs recv -Fdv zpnew{receive_writer_thre}
>>>> 1295 root       48    0 20108Ki  8092Ki piperd  2  0:22  0.00% zfs recv -Fdv zpnew{zfs}
>>>> 1294 root       -8    0 17544Ki  7740Ki q->bq_  2  0:01  0.00% zfs send -R zpold@for-copy{send_reader_thread}
>>>> 1294 root       -8    0 17544Ki  7740Ki q->bq_  0  0:00  0.00% zfs send -R zpold@for-copy{send_merge_thread}
>>>> 1294 root       -8    0 17544Ki  7740Ki hdr->b  2  0:00  0.00% zfs send -R zpold@for-copy{send_traverse_threa}
>>>> 1294 root       52    0 17544Ki  7740Ki range-  3  0:20  0.00% zfs send -R zpold@for-copy{zfs}
>>>>
>>>> 1036 root       -8    -      0B  1488Ki t->zth  0  0:00  0.00% [zfskern{z_checkpoint_discar}]
>>>> 1036 root       -8    -      0B  1488Ki t->zth  1  0:00  0.00% [zfskern{z_livelist_condense}]
>>>> 1036 root       -8    -      0B  1488Ki t->zth  2  0:00  0.00% [zfskern{z_livelist_destroy}]
>>>> 1036 root       -8    -      0B  1488Ki t->zth  1  0:00  0.00% [zfskern{z_indirect_condense}]
>>>> 1036 root       -8    -      0B  1488Ki mmp->m  3  0:00  0.00% [zfskern{mmp_thread_enter}]
>>>> 1036 root       -8    -      0B  1488Ki tx->tx  1  0:00  0.00% [zfskern{txg_thread_enter}]
>>>> 1036 root       -8    -      0B  1488Ki tx->tx  2  0:00  0.00% [zfskern{txg_thread_enter}]
>>>>
>>>> I was unable to ^c or ^z the process where I
>>>> typed the command. I eventually stopped the
>>>> system with "shutdown -p now" from a ssh
>>>> session (that had already been in place).
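If the hang shows up again, capturing where the usb thread is
spinning before any shutdown would likely help. A rough sketch
(assuming the stuck kernel thread is still PID 15, as in the top
output above):

# procstat -kk 15
  (kernel stacks for the usb process: shows where usbus2 is looping)
# gstat -spod
  (per-device I/O: confirms whether da0 has gone completely quiet)
# dmesg
  (any xhci or umass errors the kernel may have logged)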
>>>
>>> Should this occur again, before doing the shutdown run a
>>> zpool status &
>>> I have gotten in this state when the recv pool was a usb device
>>
>> The USB device had the send pool in my example.
>>
>>> and for some reason it had a timeout and gone offline.
>>
>> No messages about timeouts or other such were made.
>>
>>> The clues that this occurred are in dmesg and zpool status.
>>
>> No console, dmesg -a, or /var/log/messages output was
>> generated. (And the system was running from a SATA SSD
>> that was operating well.)
>>
>> For reference, the USB Ethernet device that was in a
>> USB2 port continued to operate just fine, allowing the
>> use of existing ssh sessions that were displaying
>> gstat -spod and top -Samio -ototal until I started
>> looking at the problem. (I did not try making a new
>> ssh session.)
>>
>> I did not do the "zpool status" so I cannot report
>> on it.
>>
>>> Unplug/plug the USB device, check dmesg that it came online,
>>> and do a zpool clear.
>>
>> Okay. I normally avoid unplugging USB storage media
>> if the system overall does not hang up: hopes of
>> a clean shutdown leaving things better.
>
> Do the zpool status, and only if that indicates a device offline
> or other problem would you proceed to do the unplug/plug, as at
> that point zfs has stopped doing anything to the device and
> your shutdown won't do anything as far as zfs for that pool anyway.
>
>>
>> The system did appear to shut down to completion.
>
> Yes, you can shut down a system with a zpool in a failed state.

I still have the console output available and I was wrong:
the shutdown hung up and apparently I cut power before
"All buffers synced". All it got to was:

Stopping cron.
Waiting for PIDS: 858.
Stopping sshd.
Waiting for PIDS: 852.
fstab: /etc/fstab:6: Inappropriate file type or format
Stopping ntpd.
Waiting for PIDS: 807.
Stopping nfsd.
Waiting for PIDS: 781 782.
Stopping mountd.
Waiting for PIDS: 779.
Stopping rpcbind.
Waiting for PIDS: 741.
Stopping devd.
Waiting for PIDS: 431.
Writing entropy file: .
Writing early boot entropy file: .
.
Terminated
May 13 20:16:26 FBSDmacch syslogd: exiting on signal 15

It did not get to any of the usual sort of (from a different
shutdown):

Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 0 0 0 0 0 0 0 0 0 0 done
Waiting (max 60 seconds) for system thread `bufdaemon' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-1' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-3' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-4' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-5' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-0' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-2' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-6' to stop... done
All buffers synced.
Uptime: 8m12s
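For reference, the check-and-recover sequence described above would
be roughly this (a sketch only: zpold is the pool on the USB device
here, and which steps apply depends on what "zpool status" actually
reports):

# zpool status zpold
  (look for a vdev marked REMOVED, FAULTED, or OFFLINE; if
  everything is ONLINE, unplugging is not called for)
# dmesg | tail
  (after an unplug/replug, check that da0 re-attached cleanly)
# zpool online zpold da0p3
  (only if status still shows the vdev offline after re-attach)
# zpool clear zpold
  (clear the pool's error counters once the device is back)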
>>
>>>>
>>>> When I retried after rebooting and scrubbing (no
>>>> problems found), the problem did not repeat.
>>>>
>>>> I do not have more information nor a way to repeat
>>>> the problem on demand, unfortunately.
>>>>
>>>> Details of the vintage of the system software and
>>>> such:
>>>>
>>>> # ~/fbsd-based-on-what-freebsd-main.sh
>>>> FreeBSD FBSDmacch 14.0-CURRENT FreeBSD 14.0-CURRENT mm-src-n245445-def0058cc690 GENERIC-NODBG arm64 aarch64 1400005 1400005
>>>> def0058cc690 (HEAD -> mm-src) mm-src snapshot for mm's patched build in git context.
>>>> merge-base: 7381bbee29df959e88ec59866cf2878263e7f3b2
>>>> merge-base: CommitDate: 2021-03-12 20:29:42 +0000
>>>> 7381bbee29df (freebsd/main, freebsd/HEAD, pure-src, main) cam: Run all XPT_ASYNC ccbs in a dedicated thread
>>>> n245444 (--first-parent --count for merge-base)
>>>>
>>>> The system was a MACCHIATObin Double Shot.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)