From: Johannes Totz
To: FreeBSD Hackers
Date: Fri, 19 May 2023 01:29:29 +0100
Subject: Re: cpufreq & hwpstate_amd & Zen 2

On 15/05/2023 22:16, Johannes Totz wrote:
> Hi all,
>
> I'm poking cpufreq's hwpstate_amd to see what I can tune re performance
> vs power vs heat trade-off.

Here are some patches, if anyone is interested:

https://reviews.freebsd.org/D40139
Adds a tunable for cpufreq/hwpstate to get the P-state info from the
CPU's MSR instead of acpi_perf.

https://reviews.freebsd.org/D40158
Adds another tunable that allows overriding the default (or
BIOS-configured?) P-state configuration. Stuff like over- or
underclocking and -volting.

https://reviews.freebsd.org/D40140
Adds power calculation if P-state info comes from MSR. This was missing
until now but is really just cosmetic.

These do not solve the mystery below though :(

And fwiw, C-state power saving is really effective. Messing with the
P-states does not do much while idle, it's measurable only when the CPU
is busy.

> I'm struggling with the P-state behaviour though.
> The code looks really straight-forward:
> https://github.com/freebsd/freebsd-src/blob/main/sys/x86/cpufreq/hwpstate_amd.c#L172
>
> But enabling hwpstate_verify, it looks like P-state transitions never go
> as requested.
> For this, I'm not running powerd.
> In addition to the existing verify code, I've sprinkled in a few more
> printfs.
>
> PStateCurLim (aka MSR_AMD_10H_11H_LIMIT = 0x20) and PStateDef (aka
> MSR_AMD_10H_11H_CONFIG = eg 0x8000000049120890) look all reasonable.
>
>
> $ sysctl dev.cpu.0
> dev.cpu.0.freq_levels: 3600/3960 2800/2800 2200/1980
> dev.cpu.0.freq: 2800
>
> $ sysctl dev.cpu.0.freq=3600
> dev.cpu.0.freq: 2800 -> 3600
>
> $ cat /var/log/messages
> [...extra printf debugging...]
> kernel: hwpstate0: setting P0-state on cpu0
> kernel: hwpstate0: setting P1(2) -> P0 on cpu1
> [...same for all the other cpus...]
> kernel: hwpstate0: setting P1(2) -> P0 on cpu15
>
>
> This shows that cpufreq thought we were at P1 and wanted to transition
> to P0. But actually, the CPU was in P2 (the 2 in brackets).
>
> We want to go from P0 to P2...
>
>
> $ sysctl dev.cpu.0.freq=2200
> dev.cpu.0.freq: 3600 -> 2200
>
> $ cat /var/log/messages
> kernel: hwpstate0: setting P2-state on cpu0
> kernel: hwpstate0: setting P0(1) -> P2 on cpu1
>
>
> ...but CPU was in P1 at that time.
>
> Wanting to go from P2 back to P1...
>
>
> $ sysctl dev.cpu.0.freq=2800
> dev.cpu.0.freq: 2200 -> 2800
>
> $ cat /var/log/messages
> kernel: hwpstate0: setting P1-state on cpu0
> kernel: hwpstate0: setting P2(2) -> P1 on cpu1
>
>
> ...shows that this time the CPU really was in P2 (yeay). But it did not
> transition to P1, it stayed in P2 (not shown in the log).
>
>
> So question is: what else could be interfering with P-state?
>
>
> thanks,
>
> Johannes
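
For anyone who wants to sanity-check the two MSR values quoted above by hand, here is a rough Python sketch that splits them into fields. The bit layout, the 200 MHz * FID / DID frequency formula, the SVI2-style voltage formula (1.55 V minus 6.25 mV per VID step) and the IddDiv scaling are all my assumptions, based on how I read AMD's Family 17h PPR. The function names are made up for illustration and this is not the driver's code.

```python
# Hypothetical helpers, not part of FreeBSD. The field layout is assumed
# from AMD's Family 17h PPR; check it against the PPR for your CPU model.

def decode_pstate_def(msr: int) -> dict:
    """Split a PStateDef value (MSR_AMD_10H_11H_CONFIG) into its fields."""
    fid = msr & 0xFF                # CpuFid[7:0], frequency ID
    dfs_id = (msr >> 8) & 0x3F      # CpuDfsId[13:8], divisor ID
    vid = (msr >> 14) & 0xFF        # CpuVid[21:14], voltage ID
    idd_value = (msr >> 22) & 0xFF  # IddValue[29:22], current value
    idd_div = (msr >> 30) & 0x3     # IddDiv[31:30], current divisor (3 is reserved)
    enabled = bool((msr >> 63) & 1) # PstateEn[63]

    # Assumed formula: core frequency = 200 MHz * FID / DID.
    freq_mhz = 200 * fid // dfs_id if dfs_id else 0
    # Assumed SVI2 encoding: 1.55 V minus 6.25 mV per VID step.
    microvolts = 1_550_000 - 6_250 * vid
    # Assumed scaling: IddValue is in amps, divided by 10^IddDiv.
    milliamps = idd_value * 1000 // 10 ** idd_div
    milliwatts = microvolts * milliamps // 1_000_000

    return {
        "enabled": enabled,
        "freq_mhz": freq_mhz,
        "microvolts": microvolts,
        "milliamps": milliamps,
        "milliwatts": milliwatts,
    }


def decode_pstate_cur_lim(msr: int) -> dict:
    """Split a PStateCurLim value (MSR_AMD_10H_11H_LIMIT) into its fields."""
    return {
        "cur_pstate_limit": msr & 0x7,      # highest-performance P-state allowed
        "pstate_max_val": (msr >> 4) & 0x7, # lowest-performance P-state available
    }


if __name__ == "__main__":
    print(decode_pstate_def(0x8000000049120890))
    print(decode_pstate_cur_lim(0x20))
```

Under those assumptions, 0x8000000049120890 decodes to an enabled P-state at 3600 MHz and 3960 mW, which lines up with the 3600/3960 entry in dev.cpu.0.freq_levels, and 0x20 decodes to a limit of P0 with P2 as the last available state.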