List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm
Subject: Re: Cores of different performance vs. time spent creating threads: Windows Dev Kit 2023 example [Oddity is back!]
From: Mark Millard
Date: Tue, 16 May 2023 02:03:50 -0700
To: FreeBSD Hackers
Cc: freebsd-arm
Message-Id: <91F3816A-EFF5-462A-8580-EE5C73A0FBEB@yahoo.com>
In-Reply-To: <47DE0BF6-3A16-4F87-AEEF-6D320BBC90E5@yahoo.com>
References: <11EBAA22-6E0F-4B27-9799-7786E149D9B1@yahoo.com> <47DE0BF6-3A16-4F87-AEEF-6D320BBC90E5@yahoo.com>

On May 15, 2023, at 12:14, Mark Millard wrote:

> On May 9, 2023, at 19:19, Mark Millard wrote:
>
>> First some context that reaches an oddity that seems to
>> be involved in the time to create threads . . .
>>
>> The Windows Dev Kit 2023 (WDK23 abbreviation here) boot reports:
>>
>> CPUs (cores) 0..3: cortex-a78c (the slower cores)
>> CPUs (cores) 4..7: cortex-x1c (the faster cores)
>>
>> Building a kernel with an explicit -mcpu= setting
>> gets the following oddity relative to cpu numbering
>> when the kernel is used:
>>
>> -mcpu=cortex-x1c or -mcpu=cortex-a78c:
>> Benchmarking tracks that number/performance pairing.
>>
>> -mcpu=cortex-a72:
>> The slower vs. faster number blocks get swapped.
>>
>> So, for -mcpu=cortex-a72 , 0..3 are the faster cores.
>>
>> This sets up for the following . . .
>>
>> But I also observe (a relative comparison of contexts
>> via some benchmark-like activity):
>>
>> -mcpu=cortex-x1c or -mcpu=cortex-a78c based kernel:
>> threads take more time to create
>>
>> -mcpu=cortex-a72 based kernel:
>> threads take less time to create
>>
>> The difference is not trivial for the activity involved
>> for this WDK23 context.
>>
>> If there is a bias as to which core(s) are involved in part
>> of thread creation generally, it would appear to be important
>> that the bias be toward the more performant cores (for what the
>> activity involves). The above suggests that such is possibly
>> not necessarily the case for FreeBSD as is. BIG/little (and
>> analogous?) cause this to become more relevant.
>>
>> Does this hypothesis about what type of thing is going on
>> fit with how FreeBSD actually works?
>>
>> As stands, I'm going to experiment with the WDK23 using
>> a cortex-a72 targeted kernel but a cortex-x1c/cortex-a78c
>> targeted world for my general operation of the WDK23.
>>
>>
>> Note: While the benchmark results allow seeing in plots
>> what traces back to thread creation time contributions,
>> the benchmark itself does not directly measure that time.
>> It is more like, the average work rate for a time changes
>> based on the fraction of the time involved in the thread
>> creations for each given problem size.
>> The actual definition
>> of work here involves a mathematical quantity for a
>> mathematical problem (that need not be limited to computers
>> doing the work).
>>
>> The benchmark results are more useful for discovering that
>> there is something to potentially investigate than to
>> actually do an investigation with.
>>
>
> Never mind: I was wrong about that . . . it's back. (See later below.)
> Starting over did not reproduce the oddity. So:
> operator oddity/error, though I've no clue of how
> to reproduce the odd swap of which cpu number ranges
> took more vs. less time for each given size problem.
> (Or any other aspect that might be considered also
> odd, such as specific performance figures.)
>
> Retry details:
>
> I booted the WDK23 via UFS media set up for
> cortex-a72, media that I use for UFS activities on
> the HoneyComb (for example). I built the benchmark
> and ran it.
>
> As stands, I've only done the "cpu lock down" case.
> It produces less messy data by avoiding cpu
> migration once the lockdown completes (singleton
> cpuset for the thread). I'll also run the variant
> that does not have the cpu lock downs (standard
> C++ code without FreeBSD specifics added).

I got the swapped number blocks vs. performance again,
this time not for cortex-a72 tailored FreeBSD but for
cortex-x1c/cortex-a78c +nolse tailored FreeBSD.

Not rebooting for now, the oddity exists for the benchmark
built with each of:

clang 16 plus libc++
g++ 13 plus libc++
g++ 13 plus libstdc++

As before, top shows the same CPUs for STATE that the
benchmark uses for its cpuset-based cpu ids (bit numbering).

As before, the measured performance for "faster" is also
higher than normal.

As a cross check: Avoiding use of my benchmark program . . .

# cpuset -l0-3 openssl speed
Doing mdc2 for 3s on 16 size blocks: 1705580 mdc2's in 3.10s
. . .

vs.

# cpuset -l4-7 openssl speed
Doing mdc2 for 3s on 16 size blocks: 1079870 mdc2's in 3.03s
. . .
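
For comparing the two runs just shown, a minimal sketch of the
per-second arithmetic (Python; the names runs and rate are only
illustrative, and the figures are copied from the two mdc2 lines
above, not re-measured):

```python
# Figures copied from the two `openssl speed` mdc2 lines quoted above:
# (operation count, elapsed seconds) for each cpuset range.
runs = {
    "cpus 0-3": (1705580, 3.10),
    "cpus 4-7": (1079870, 3.03),
}


def rate(ops: int, seconds: float) -> float:
    """Return operations per second for one openssl speed line."""
    return ops / seconds


# Per-second rate for each cpu range.
rates = {label: rate(ops, secs) for label, (ops, secs) in runs.items()}
for label, per_sec in rates.items():
    print(f"{label}: about {per_sec:.0f} mdc2/sec")

# Ratio > 1 means the 0-3 block was the faster one in this boot.
ratio = rates["cpus 0-3"] / rates["cpus 4-7"]
print(f"0-3 vs. 4-7: about {ratio:.2f} times")
```

Dividing by the reported elapsed time (rather than the nominal 3s)
keeps runs with slightly different durations comparable.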

So, openssl speed also shows the oddity: 0-3 usage
being faster than 4-7 usage.

The 1705580 is also somewhat large compared to a normal
"4-7 is faster" context: 1705580/3.1 approx= 550187/sec .
Compare that to the similar calculations below.

For example: Shutting down, powering off, powering on,
booting, and doing the openssl speed type of examples:

# cpuset -l0-3 openssl speed
Doing mdc2 for 3s on 16 size blocks: 997679 mdc2's in 3.09s
. . .

# cpuset -l4-7 openssl speed
Doing mdc2 for 3s on 16 size blocks: 1360400 mdc2's in 3.02s
. . .

# cpuset -l0-3 openssl speed
Doing mdc2 for 3s on 16 size blocks: 967253 mdc2's in 3.00s
. . .

# cpuset -l4-7 openssl speed
Doing mdc2 for 3s on 16 size blocks: 1406978 mdc2's in 3.08s
. . .

So (2 calculations similar to the one earlier above, for the
4-7 runs): About 550187/sec vs. about 450463/sec and
456811/sec .

That is about 1.2 times faster.

I've no clue about the cause or what stage(s) lead to the
odd context happening.

===
Mark Millard
marklmi at yahoo.com