How many segments does it take to span from VM_MIN_KERNEL_ADDRESS through VM_MAX_SAFE_KERNEL_ADDRESS? 128 in moea64_late_bootstrap
Mark Millard
marklmi at yahoo.com
Thu May 2 10:45:55 UTC 2019
["vt_upgrade(&vt_consdev). . ." hang-ups with the patch do
happen for -r347003. The patch does not fix the overall
hangs-up behavior, although it changes some detailed
behavior that is associated. I've also avoided the panic
issue by avoiding cmpb use. This does not fix the "mtx_lock
of spin mutex WWV" but avoids it.]
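As context for the "128" in the subject line, here is a back-of-the-envelope
sketch of why the old unconditional prefault loop asks for more SLB entries
than exist. The figures below are assumptions for illustration only (a 256 MB
SEGMENT_LENGTH, a 32 GB span from VM_MIN_KERNEL_ADDRESS to
VM_MAX_SAFE_KERNEL_ADDRESS, and 64 for n_slbs), not values quoted from
vmparam.h or the hardware:

#include <stdio.h>

int
main(void)
{
	/* Illustrative assumptions, not quoted from the headers: */
	unsigned long long segment_length = 0x10000000ULL;  /* 256 MB      */
	unsigned long long kva_span       = 0x800000000ULL; /* 32 GB span  */
	unsigned long long n_slbs         = 64;              /* typical SLB */

	/* Segments the old loop tries to prefault: 128. */
	printf("segments to span: %llu\n", kva_span / segment_length);

	/* Segments the patched loop limits itself to: 63. */
	printf("prefault limit:   %llu\n", n_slbs - 1);

	return (0);
}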
On 2019-May-1, at 23:21, Mark Millard <marklmi at yahoo.com> wrote:
> [Some results, mixed I'm afraid.]
>
> On 2019-May-1, at 17:22, Mark Millard <marklmi at yahoo.com> wrote:
>
>> On 2019-May-1, at 14:54, Justin Hibbits <chmeeedalf at gmail.com> wrote:
>>
>>> On Wed, 1 May 2019 14:35:56 -0700
>>> Mark Millard <marklmi at yahoo.com> wrote:
>>>
>>>>>> What happens if you revert all your patches,
>>>>>
>>>>> Most of the patches in Bugzilla 233863 are not for this
>>>>> issue at all and are not tied to starting the non-bsp
>>>>> cpus. (The one for improving how close the Time Base
>>>>> registers are is tied to starting these cpus.) Only the
>>>>> aim/mp_cpudep.c and aim/slb.c changes seem relevant.
>>>>>
>>>>> Are you worried about some form of interaction that means
>>>>> I need to avoid patches for other issues?
>>>>>
>>>>> Note: for now I'm staying at using head -r345758 as the
>>>>> basis for my experiments.
>>>>>
>>>>>> and change this loop to
>>>>>> stop at n_slb? So something more akin to:
>>>>>>
>>>>>> int i = 0;
>>>>>>
>>>>>> for (va = virtual_avail; va < virtual_end && i < n_slb - 1;
>>>>>>     va += SEGMENT_LENGTH, i++)
>>>>>> ...
>>>>>>
>>>>>> If it reliably boots with that, then that's fine. We can prefault
>>>>>> as much as we can and leave the rest for on-demand.
>>>>>
>>>>> I'm happy to experiment with this loop without my hack
>>>>> for forcing the slb entry to exist in cpudep_ap_bootstrap.
>>>>>
>>>>> But, it seems to presume that the pc_curpcb's will
>>>>> all always point into the lower address range spanned
>>>>> when cpudep_ap_bootstrap is executing on the cpu.
>>>>> Does some known property limit the pc_curpcb->
>>>>> references to such? Only that would be sure to
>>>>> avoid an slb-miss at that stage. Or is this just an
>>>>> alternate hack or a means of getting evidence, not a
>>>>> proposed solution?
>>>>>
>>>>> (Again, I'm happy to disable my hack that forces the
>>>>> slb entry and to try the loop suggested.)
>>> ...
>>>> And the patch for the loop looks like:
>>>>
>>>> virtual_end = VM_MAX_SAFE_KERNEL_ADDRESS;
>>>>
>>>> /*
>>>> - * Map the entire KVA range into the SLB. We must not fault there.
>>>> + * Map the lower-address part of the KVA range into the SLB. We must not fault there.
>>>> */
>>>> #ifdef __powerpc64__
>>>> - for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
>>>> + i = 0;
>>>> + for (va = virtual_avail; va < virtual_end && i<n_slbs-1; va += SEGMENT_LENGTH, i++)
>>>> moea64_bootstrap_slb_prefault(va, 0);
>>>> #endif
>>>>
>>>
>>> Yep, that's the patch I was going for.
>>>
>>>>
>>>> So I've built, installed, and tested some: it did not go well
>>>> overall.
>>>>
>>>> Using:
>>>>
>>>> OK set debug.verbose_sysinit=1
>>>>
>>>> to show better context about where the hangs occur, shows:
>>>> (Typed from a screen picture.)
>>>>
>>>> subsystem a800000
>>>> boot_run_interrupt_driven_config_hooks(0)...
>>>> . . . (omitted) . . .
>>>> done.
>>>> vt_upgrade(&vt_consdev). . .
>>>>
>>>> The "vt_upgrade(&vt_consdev). . ." never says done when booting
>>>> hangs with the above changes.
>>>>
>>>> Trying to boot a bunch of times did produce one
>>>> completed boot, all 4 cpus working. Otherwise I'm
>>>> using kernel.old to manage to complete a boot.
>>>>
>>>> I'll note that "vt_upgrade(&vt_consdev). . ." is where
>>>> Dennis Clarke reported the hang-ups that he was seeing,
>>>> without any of my patches being available back then:
>>>> 2019-Feb-14.
>>>
>>> Maybe try the commit that caused the problem back in July? r334498.
>>>
>>
>> I'd already started down the path of getting materials from:
>>
>> https://artifact.ci.freebsd.org/snapshot/head/r347003/powerpc/powerpc64/
>>
>> and putting them on a separate SSD that I sometimes use for artifact.ci
>> or snapshot experiments. Also: checking out matching svn sources for
>> -r347003 and then doing a buildworld buildkernel with a bootstrap gcc
>> 4.2.1 compiler used. I'm verifying that I can build it before making
>> the source changes for the kernel. The build is of a debug kernel
>> (GENERIC64).
>>
>> The test buildworld is still in process.
>>
>> Let me know if this is insufficient for your purposes. I could revert
>> to:
>>
>> https://artifact.ci.freebsd.org/snapshot/head/r334594/powerpc/powerpc64/
>>
>> (There is no head/r334498/ and the first after that with a
>> powerpc64/ is head/r334594/ .)
>>
>> For either head/r347003/ or head/r334594/ :
>>
>> Use of artifact materials allows using officially built files for
>> every file but some specific file(s) that I replace. It also allows
>> comparison/contrast of the behavior of the official files vs. when
>> adjusted ones are substituted.
>>
>> Use of artifact-version materials also means that I know I'm using
>> a vintage that actually built --and so I hope to avoid other problems
>> getting in the way.
>
> I present without-the-patch results before presenting
> with-the-patch results. The end result is mixed, I'm
> afraid.
>
>
>
> As for the results without any patch,
> just artifact materials . . .
>
> Note: "Add debug.verbose_sysinit tunable for VERBOSE_SYSINIT" was
> not checked-in until -r335458 .
>
> Trying to boot without any updates or rebuilds, just artifact
> materials shows variable stopping points:
>
> (For debug.verbose_sysinit=1 :)
> -r347003 stops sometimes at: vt_upgrade(&vt_consdev). . .
> -r347003 stops sometimes at: cpu_mp_unleash(0). . .
>
> -r334594 stops after: ada0 lines, VERBOSE_SYSINIT not built in
>
>
>
> So I had to build my own -r334594 kernel to see verbose_sysinit
> information about the stopping point. Again, no patch here,
> I just copied over my build of the /boot/kernel/kernel file:
>
> -r334594 stops sometimes at: vt_upgrade(&vt_consdev). . .
> -r334594 stops sometimes at: cpu_mp_unleash(0). . .
>
>
> Summary thus far:
>
> I did not find any obvious difference in how often each stops
> in either of the alternatives.
>
> So I'm seeing if the proposed patch changes the behavior of
> -r347003 .
>
>
>
> Later test of patched -r347003 . . .
>
> The patched kernel is based on:
>
> # svnlite diff /mnt/usr/src/ | more
> Index: /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c
> ===================================================================
> --- /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c (revision 347003)
> +++ /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c (working copy)
> @@ -959,7 +959,8 @@
> * Map the entire KVA range into the SLB. We must not fault there.
> */
> #ifdef __powerpc64__
> - for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
> + i = 0;
> + for (va = virtual_avail; va < virtual_end && i<n_slbs-1; va += SEGMENT_LENGTH, i++)
> moea64_bootstrap_slb_prefault(va, 0);
> #endif
>
>
> So far with the patched code:
>
> -r347003 has never stopped at: vt_upgrade(&vt_consdev). . .
I have since had hang-ups at "vt_upgrade(&vt_consdev). . .".
> -r347003 stops sometimes at: cpu_mp_unleash(0). . . [but differently!]
> -r347003 panics at a particular point the rest of the time
>
> The cpu_mp_unleash hangups report:
> (typed from screen pictures)
>
> subsystem f000000
> cpu_mp_unleash(0)... Launching APs 1 2 SMP: 4 CPUs found; 4 CPUs usable; 3 CPUs woken
>
> After that it is hung-up.
>
>
> As for when that does not happen . . .
>
> I do not even have /etc/fstab set up and so end up at the mountroot>
> prompt. When I enter "ufs:/dev/daa0s3" I get a panic for:
>
> panic: mtx_lock of spin mutex WWV @ /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c:2812
> (it is a debug-kernel build)
>
> For reference, line 2812 is: PMAP_LOCK(pm);
>
> panic is reached via an interesting(?) call chain,
> showing the backtrace (typed from screen pictures):
>
> .__mtx_lock_flags+0xd4
> .moea64_sync_icache+0x48
> .pmap_sync_icache+0x90
> .ppc_instr_emulate+0x1b4
> .trap+0x10fc
> .powerpc_interrupt+0x2cc
> user PGM trap by 0x810053bb4: srr1=0x900000000008d032
> r1=0x3ffffffffffffcc00 cr=0x20002024 xer=0 ctr=0x1 r2=0x81007bdd0 frame=0xe000000070ca9810
>
> It was thread pid 28 tid 100097
>
> So far these details seem consistent.
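For what that panic string reports generically, here is a minimal
kernel-side sketch (my illustration with a made-up mutex, not the pmap
code involved): taking a mutex initialized with MTX_SPIN through the
default mtx_lock() path trips the same class of assertion.

#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/lock.h>
#include <sys/mutex.h>

/*
 * Illustration only: "example_mtx" is a made-up spin mutex.  Taking it
 * with the sleep-mutex mtx_lock() entry point (instead of
 * mtx_lock_spin()) asserts with "mtx_lock() of spin mutex <name> @
 * file:line" on an INVARIANTS kernel, the same form as the WWV report.
 */
static struct mtx example_mtx;
MTX_SYSINIT(example_mtx, &example_mtx, "example", MTX_SPIN);

static void
example_bad_lock(void)
{
	mtx_lock(&example_mtx);		/* wrong lock routine for MTX_SPIN */
	mtx_unlock(&example_mtx);
}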
>
> But I will note that openfirmware use via ofwdump -ap
> and the like causes system crashes going back to when
> the direct map base was moved to high memory addresses
> ( -r330610 and later ). This is one of the reasons I
> want to avoid openfirmware and use the conversion to
> fdt instead. (There is a -r330614 artifact to test
> such crashes with --or use a later one that otherwise
> boots.)
I avoided the panics by adjusting src/lib/libc/powerpc64/string/strcmp.S
to not use cmpb instructions. This does not fix the "mtx_lock of spin
mutex WWV" panic but avoids it. So now there are two patches (a C sketch
of the cmpb replacement follows the diff):
# svnlite diff /mnt/usr/src/
Index: /mnt/usr/src/lib/libc/powerpc64/string/strcmp.S
===================================================================
--- /mnt/usr/src/lib/libc/powerpc64/string/strcmp.S (revision 347003)
+++ /mnt/usr/src/lib/libc/powerpc64/string/strcmp.S (working copy)
@@ -88,9 +88,16 @@
.Lstrcmp_compare_by_word:
ld %r5,0(%r3) /* Load double words. */
ld %r6,0(%r4)
- xor %r8,%r8,%r8 /* %r8 <- Zero. */
+ lis %r8,32639 /* 0x7f7f */
+ ori %r8,%r8,32639 /* 0x7f7f7f7f */
+ rldimi %r8,%r8,32,0 /* 0x7f7f7f7f'7f7f7f7f */
xor %r0,%r5,%r6 /* Check if double words are different. */
- cmpb %r7,%r5,%r8 /* Check if double words contain zero. */
+ /* Check for zero vs. not bytes: */
+ and %r9,%r5,%r8 /* 0x00->0x00, 0x80->0x00, other->ms-bit-in-byte==0 */
+ add %r9,%r9,%r8 /* ->0x7f, ->0x7f, ->ms-bit-in-byte==1 */
+ nor %r7,%r9,%r5 /* ->0x80, ->0x00, ->ms-bit-in-byte==0 */
+ andc %r7,%r7,%r8 /* ->0x80, ->0x00, ->0x00 */
+ /* sort of like cmpb %r7,%r5,%r8 for %r8 being zero */
/*
* If double words are different or contain zero,
@@ -104,7 +111,12 @@
ldu %r5,8(%r3) /* Load double words. */
ldu %r6,8(%r4)
xor %r0,%r5,%r6 /* Check if double words are different. */
- cmpb %r7,%r5,%r8 /* Check if double words contain zero. */
+ /* Check for zero vs. not bytes: */
+ and %r9,%r5,%r8 /* 0x00->0x00, 0x80->0x00, other->ms-bit-in-byte==0 */
+ add %r9,%r9,%r8 /* ->0x7f, ->0x7f, ->ms-bit-in-byte==1 */
+ nor %r7,%r9,%r5 /* ->0x80, ->0x00, ->ms-bit-in-byte==0 */
+ andc %r7,%r7,%r8 /* ->0x80, ->0x00, ->0x00 */
+ /* sort of like cmpb %r7,%r5,%r8 for %r8 being zero */
/*
* If double words are different or contain zero,
Index: /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c
===================================================================
--- /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c (revision 347003)
+++ /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c (working copy)
@@ -959,7 +959,8 @@
* Map the entire KVA range into the SLB. We must not fault there.
*/
#ifdef __powerpc64__
- for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
+ i = 0;
+ for (va = virtual_avail; va < virtual_end && i<n_slbs-1; va += SEGMENT_LENGTH, i++)
moea64_bootstrap_slb_prefault(va, 0);
#endif
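For readability, here is a userland C sketch (my rendering, not part of
the patch) of what the replacement instruction sequence computes in place
of cmpb against zero: each byte lane of the result becomes 0x80 where the
input byte was 0x00 and 0x00 elsewhere (cmpb itself would mark matching
lanes with 0xff, which is why the comments say "sort of like").

#include <stdint.h>
#include <stdio.h>

static uint64_t
zero_byte_mask(uint64_t v)
{
	const uint64_t ones7 = 0x7f7f7f7f7f7f7f7fULL; /* the lis/ori/rldimi constant */
	uint64_t t;

	t = (v & ones7) + ones7; /* and+add: lane bit 7 set iff low 7 bits nonzero  */
	t = ~(t | v);            /* nor: lane bit 7 set iff the whole byte was zero */
	return (t & ~ones7);     /* andc: keep only bit 7 of each lane              */
}

int
main(void)
{
	/* Zero bytes in lanes 1, 4, and 5 (counting from the high end): */
	printf("%016jx\n", (uintmax_t)zero_byte_mask(0x4100424300004445ULL));
	/* Prints 0080000080800000: 0x80 marks each lane that held 0x00. */
	return (0);
}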
With this I do sometimes manage to boot. So, in this modified
context, I've seen all 3 of:
-r347003M stops sometimes at: vt_upgrade(&vt_consdev). . .
-r347003M stops sometimes at: cpu_mp_unleash(0). . .
[but with: "SMP: 4 CPUs found; 4 CPUs usable; 3 CPUs woken"]
-r347003M boots and operates sometimes.
(I did not do much with it booted, focusing on more boot attempts
instead.)
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)