From: Mike Karels
To: Michael Butler
Cc: freebsd-current@freebsd.org
Subject: Re: [Intel AlderLake] Read&Write files to FAT32 or UFS partition cause data corrupt due to P-Core&E-Core
Date: Mon, 18 Sep 2023 13:27:55 -0500
Message-ID: <77F5DC92-726E-4F26-ACEA-0AF92E0AF5D2@karels.net>
In-Reply-To: <946c1f29-dd2a-776d-e88d-7523c103b221@protected-networks.net>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On 18 Sep 2023, at 10:38, Michael Butler wrote:

> On 8/8/23 13:50, Michael Butler wrote:
>> On 8/8/23 10:56, Tomoaki AOKI wrote:
>>> On Tue, 8 Aug 2023 17:02:32 +0300
>>> Konstantin Belousov wrote:
>>
>>  [ .. snip .. ]
>>
>>>> The workaround is switched on automatically, when the kernel detects 'small cores'
>>>> reported by CPUID.
>>>
>>> If I read the code correctly, vm.pmap.pcid_invlpg_workaround
>>> (precisely, the corresponding variable) is set to non-zero when the
>>> workaround is enabled. Not sure it was detected correctly at the
>>> original reporter's environment, but forcibly setting the tunable to 1
>>> was not reported to help sufficiently.
>>> Currently, only setting tunable vm.pmap.pcid_enabled to 0 could help.
>>
>> I'm seeing similar stability problems on an N95-based device.
>> This too is an Alderlake-N device with only E-cores although I'm running it with a compilation with CPUTYPE=tremont .. from an older, verbose start-up ..
>>
>> PPIM 0: PA=0x4000000000, VA=0xffffffff82710000, size=0x1d5000, mode=0x1
>> pmap: large map 8 PML4 slots (4096 GB)
>> VT(efifb): resolution 800x600
>> Preloaded elf kernel "/boot/kernel.new/kernel" at 0xffffffff8234e000.
>> Preloaded boot_entropy_cache "/boot/entropy" at 0xffffffff82357d08.
>> Preloaded cpu_microcode "/boot/firmware/intel-ucode.bin" at 0xffffffff82357d60.
>> Preloaded hostuuid "/etc/hostid" at 0xffffffff82357dc0.
>> Preloaded TSLOG data "TSLOG" at 0xffffffff82357e10.
>> CPU: Intel(R) N95 (1689.60-MHz K8-class CPU)
>>   Origin="GenuineIntel"  Id=0xb06e0  Family=0x6  Model=0xbe  Stepping=0
>>   Features=0xbfebfbff
>>   Features2=0x7ffafbbf
>>   AMD Features=0x2c100800
>>   AMD Features2=0x121
>>   Structured Extended Features=0x239ca7eb
>>   Structured Extended Features2=0x98c007bc
>>   Structured Extended Features3=0xfc184410
>>   XSAVE Features=0xf
>>   IA32_ARCH_CAPS=0x180fd6b
>>   VT-x: Basic Features=0x3da0500
>>         Pin-Based Controls=0xff<ExtINT,NMI,VNMI,PreTmr,PostIntr>
>>         Primary Processor Controls=0xfffbfffe
>>         Secondary Processor Controls=0x75d7fff
>>         Exit Controls=0x3da0500<PAT-LD,EFER-SV,PTMR-SV>
>>         Entry Controls=0x3da0500
>>         EPT Features=0x6f34141
>>         VPID Features=0xf01
>>   TSC: P-state invariant, performance statistics
>> 64-Byte prefetching
>> L2 cache: 2048 kbytes, 16-way associative, 64 bytes/line
>> real memory  = 17179869184 (16384 MB)
>> Physical memory chunk(s):
>> 0x0000000000010000 - 0x000000000009dfff, 581632 bytes (142 pages)
>> 0x000000000009f000 - 0x000000000009ffff, 4096 bytes (1 pages)
>> 0x0000000000100000 - 0x000000005fffffff, 1609564160 bytes (392960 pages)
>> 0x0000000062401000 - 0x000000007264dfff, 270848000 bytes (66125 pages)
>> 0x0000000075fff000 - 0x0000000075ffffff, 4096 bytes (1 pages)
>> 0x0000000100001000 - 0x0000000462497fff, 14533881856 bytes (3548311 pages)
>> 0x000000047fa00000 - 0x000000047fb68fff, 1478656 bytes (361 pages)
>> avail memory = 16363008000 (15604 MB)
>> CPU microcode: updated from 0xc to 0x10
>
> With the most recent microcode update, this device reports ..
>
> CPU microcode: updated from 0xc to 0x11
>
> .. and is now stable with vm.pmap.pcid_enabled=0, vm.pmap.pcid_invlpg_workaround=1, and CPUTYPE?=alderlake set in /etc/make.conf over multiple full system builds.
>
> I have not tested with vm.pmap.pcid_invlpg_workaround=0.

I believe that vm.pmap.pcid_invlpg_workaround does not matter if
vm.pmap.pcid_enabled=0. Enabling the workaround or disabling pcid should
be basically the same for this CPU, so I don't understand why that
isn't true. It might be interesting to test with pcid enabled with
the new microcode, although I don't see why that would affect the
results (pcid should still not be used on any CPU). The CPUTYPE for
the compiler should not affect the pcid vm issues, just change the
optimization by the compiler.

		Mike

>> On start-up, vm.pmap.pcid_invlpg_workaround=1 but seemingly random faults still occurred under load, for example, 'make buildworld'. Apparent misreads of source-files resulting in syntax errors were the most common symptom. Compilation reattempts (mostly) succeed.
>>
>> Initially, I put this down to an inadequate power-supply but setting vm.pmap.pcid_enabled=0 seems to have stabilised it.
>>
>> I guess there's another dragon in there .. :-(
>>
>>     Michael
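
To see which combination of the two settings discussed above is in effect
on a given machine, something like the following could be used. It is only
a sketch and not code from this thread: the helper names (parse_sysctl,
describe, read_tunables) are made up for illustration, it assumes that
sysctl(8) prints "name: value" lines, and the interpretation merely encodes
the expectation stated above that vm.pmap.pcid_invlpg_workaround should not
matter once vm.pmap.pcid_enabled=0.

```python
import subprocess

# The two settings named in the thread.
TUNABLES = ("vm.pmap.pcid_enabled", "vm.pmap.pcid_invlpg_workaround")


def parse_sysctl(text):
    """Parse "name: value" lines (assumed sysctl output) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name in TUNABLES:
            values[name] = int(value.strip())
    return values


def describe(values):
    """Summarise the PCID state, following the reasoning in the reply above."""
    if values["vm.pmap.pcid_enabled"] == 0:
        # Assumption taken from the thread: with PCID off, the INVLPG
        # workaround setting is not expected to make a difference.
        return "PCID disabled; pcid_invlpg_workaround does not matter"
    if values.get("vm.pmap.pcid_invlpg_workaround", 0) != 0:
        return "PCID enabled; INVLPG workaround active"
    return "PCID enabled; INVLPG workaround not active"


def read_tunables():
    """Query the running system; only meaningful on FreeBSD itself."""
    out = subprocess.run(
        ["sysctl", *TUNABLES], capture_output=True, text=True, check=True
    ).stdout
    return parse_sysctl(out)
```

On the affected machine, print(describe(read_tunables())) would report the
current combination. read_tunables() needs the sysctl binary and both names
to exist, so it is not expected to work on other systems.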