From: Mark Millard
To: freebsd-amd64@freebsd.org
Subject: 7950X3D: using 1 hardware thread per core vs. 2 hardware threads per core: a fairly large difference
Date: Fri, 19 Jan 2024 00:22:01 -0800
Message-Id: <8001F567-7E02-43FB-8D08-D42B560369D8@yahoo.com>

I do not know how much the below generalizes as I do not have access to other rather modern FreeBSD amd64 systems to test, just the 7950X3D system with 192 GiBytes of RAM.

The gist:

https://gist.github.com/markmi/193423c6fd6f534a72725d7d5cd0236a

is an image showing performance curves for a benchmark. Each curve is for 8 hardware threads in use. The x axis is the problem size (Bytes, logarithmic scaling). The y axis is performance (linear scaling). (The performance measure is a mathematical definition in a mathematical approximation problem that is handled a specific way in the benchmark.)

As the problem size grows significantly larger than a RAM cache, the access pattern makes that RAM cache notably less effective.

The benchmark variant restricts each software thread to a specific hardware thread (singleton cpuset) after the thread starts, generally avoiding losing structural information to thread migration variability in the structures used.

The major performance difference ends up being tied to:

1 hardware thread per core
vs.
2 hardware threads per core

A quick textual summary giving a clue is:

1 per core, 8 cores: around 800*(10^6) to 850*(10^6) peak.
2 per core, 4 cores: around 500*(10^6) to 550*(10^6) peak.
(same units)

But far more than the peaks show large differences: the curves generally differ in the same direction for the same caching regime. Think of the area under a curve over a size range as being what matters for that size range.

Each hardware thread does independent processing. (But the threads' results are combined to get the overall result for a problem size.) So more RAM-cache sharing and other resource sharing is involved for 2 threads per core, and the competition for shared resources has non-trivial performance consequences.

The far right of each curve [around 150*(10^6)] vs. the peak of that curve suggests how much the RAM caching helps the performance (or how much the processor waits for RAM when RAM caching is not very effective vs. when RAM caching is more effective).

The RAM is DDR5-5200, 2 DIMMs per channel, 2 channels, 48 GiBytes per DIMM.
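
To put rough numbers on the comparison, the approximate figures quoted above can be turned into ratios. This is only a back-of-the-envelope sketch in Python: the constant names and the ratio_range helper are made up for illustration, the inputs are the eyeballed ranges from the text rather than measured data, and the units are whatever the benchmark reports.

```python
# Approximate figures quoted in the text (same, unspecified, units).
PEAK_1_PER_CORE = (800e6, 850e6)  # 8 cores, 1 hardware thread per core
PEAK_2_PER_CORE = (500e6, 550e6)  # 4 cores, 2 hardware threads per core
FAR_RIGHT = (150e6, 150e6)        # far right of each curve, caching ineffective


def ratio_range(numer, denom):
    """Smallest and largest ratio obtainable from two (low, high) ranges."""
    return numer[0] / denom[1], numer[1] / denom[0]


# How much higher the 1-per-core peak is than the 2-per-core peak.
peak_ratio = ratio_range(PEAK_1_PER_CORE, PEAK_2_PER_CORE)

# How much higher each peak is than the far-right (RAM-bound) level.
cache_gain_1 = ratio_range(PEAK_1_PER_CORE, FAR_RIGHT)
cache_gain_2 = ratio_range(PEAK_2_PER_CORE, FAR_RIGHT)

# RAM configuration: 2 channels x 2 DIMMs per channel x 48 GiBytes per DIMM.
total_ram_gib = 2 * 2 * 48

print(f"1-per-core peak / 2-per-core peak: {peak_ratio[0]:.2f} to {peak_ratio[1]:.2f}")
print(f"1-per-core peak / far right:       {cache_gain_1[0]:.2f} to {cache_gain_1[1]:.2f}")
print(f"2-per-core peak / far right:       {cache_gain_2[0]:.2f} to {cache_gain_2[1]:.2f}")
print(f"total RAM: {total_ram_gib} GiBytes")
```

These ratios only describe the peaks and the far-right level. As noted above, the differences over whole size ranges matter more than the peaks alone.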

Note: The benchmark can also be built to not have the CPU LockDown used, allowing general migration of software threads across the hardware threads in a cpuset. Seeing the CPU LockDown results first can help interpret the messier with-migration results.

===
Mark Millard
marklmi at yahoo.com