RPi3 not using SMP?
Mark Millard
marklmi at yahoo.com
Sat Feb 8 22:28:48 UTC 2020
On 2020-Feb-7, at 21:58, Mark Millard <marklmi at yahoo.com> wrote:
> [A note on avoiding a bad interpretation of
> my evidence.]
>
> On 2020-Feb-7, at 20:12, Mark Millard <marklmi at yahoo.com> wrote:
>
>> On 2020-Feb-7, at 19:10, Mark Millard <marklmi at yahoo.com> wrote:
>>
>>> [I now have some bounds on when PSCI_FN_VERSION
>>> gets a 0 result from smc #0 instead of the correct
>>> result: 2. Hopefully I can narrow the range more.]
>>>
>>> On 2020-Feb-7, at 18:46, Mark Millard <marklmi at yahoo.com> wrote:
>>>
>>>
>>>
>>>> On 2020-Feb-7, at 17:19, bob prohaska <fbsd at www.zefox.net> wrote:
>>>>
>>>>> For some weeks now an RPi3 running -current has seemed rather slow....
>>>>>
>>>>> On looking at the early part of the boot message the processor
>>>>> attributes seem rather scant:
>>>>>
>>>>> ......
>>>>> Release APs...APs not started
>>>>> CPU 0: ARM Cortex-A53 r0p4 affinity: 0
>>>>> Instruction Set Attributes 0 = <CRC32>
>>>>> Instruction Set Attributes 1 = <>
>>>>> Processor Features 0 = <AdvSIMD,FP,EL3 32,EL2 32,EL1 32,EL0 32>
>>>>> Processor Features 1 = <>
>>>>> Memory Model Features 0 = <TGran4,TGran64,SNSMem,BigEnd,16bit ASID,1TB PA>
>>>>> Memory Model Features 1 = <8bit VMID>
>>>>> Memory Model Features 2 = <32bit CCIDX,48bit VA>
>>>>> Debug Features 0 = <2 CTX BKPTs,4 Watchpoints,6 Breakpoints,PMUv3,Debugv8>
>>>>> Debug Features 1 = <>
>>>>> Auxiliary Features 0 = <>
>>>>> Auxiliary Features 1 = <>
>>>>> CPU 1: (null) (null) r0p0 affinity: 0
>>>>> CPU 2: (null) (null) r0p0 affinity: 0
>>>>> CPU 3: (null) (null) r0p0 affinity: 0
>>>>> ............
>>>>> In a top window, STATE is reported as RUN, rather than the
>>>>> former CPUn.
>>>>>
>>>>> Is a software switch now required to enable multiprocessing?
>>>>>
>>>>> Or, could it be related to the lines:
>>>>> psci0: PSCI version number mismatched with DT
>>>>> as pointed out by Mark M in reference to the cpu_reset failed
>>>>> problem, which is still manifest?
>>>>>
>>>>> The kernel is at r357644.
>>>>
>>>> Head -r356767's kernel does not have this problem for RPi3/4 used as
>>>> aarch64 FreeBSD.
>>>>
>>>> Head -r356776 and later all have this problem for both RPi3 and RPi4.
>>>>
>>>> Note: There are no head versions between those.
>>>>
>>>> The console log shows evidence of the problem much earlier.
>>>> Instead of saying:
>>>>
>>>> psci0: <ARM Power State Co-ordination Interface Driver> on ofwbus0
>>>> psci0: PSCI version 0.2 compatible
>>>>
>>>> (once) it says:
>>>>
>>>> psci0: <ARM Power State Co-ordination Interface Driver> on ofwbus0
>>>> psci0: PSCI version number 0 mismatched with DT, default 2
>>>> device_attach: psci0 attach returned 6
>>>>
>>>> (and those 3 lines repeat in various places). None of them
>>>> show up for -r356767.
>>>>
>>>> Without identifying and using PSCI, the extra cores will not
>>>> start and the cpu(s) will not reset. (PSCI is the ARM interface
>>>> for doing such things.)
>>>>
>>>> I've no clue why, but the version number it gets in my RPi4
>>>> experiments is 0. My only guess is that at some point
>>>> memory important to ARM's PSCI operation has been touched
>>>> and is no longer valid for the PSCI operation.
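
For reference, the PSCI specification packs the PSCI_VERSION result into one 32-bit value: major version in bits 31:16, minor version in bits 15:0. A minimal sketch of that decoding follows, in Python for convenience. The helper name is made up for illustration and this is not the FreeBSD driver's code. It shows why a raw result of 2 reads as "version 0.2" while a raw 0 decodes to 0.0, which matches nothing the device tree would advertise.

```python
def decode_psci_version(raw):
    """Split a raw PSCI_VERSION result into (major, minor).

    Hypothetical helper: major is bits 31:16, minor is bits 15:0.
    """
    major = (raw >> 16) & 0xFFFF
    minor = raw & 0xFFFF
    return major, minor


# The two raw values seen in the boot logs: 2 (working) and 0 (broken).
for raw in (2, 0):
    major, minor = decode_psci_version(raw)
    print("raw %#x -> PSCI %d.%d" % (raw, major, minor))
```
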
>>>
>>> I've been doing some crude printf-based information
>>> gathering for RPi4B boots (based on head -r357529 just
>>> because it is convenient in my context) and I have
>>> learned something about the stage at which PSCI_FN_VERSION
>>> goes from working to no longer working.
>>>
>>> For the below text extraction from a boot . . .
>>> Lines with: "PSCI_FN_VERSION 2" are working cases.
>>> Lines with: "PSCI_FN_VERSION 0" are no-longer-working cases.
>>>
>>> . . .
>>> FreeBSD clang version 9.0.1 (git at github.com:llvm/llvm-project.git c1a0a213378a458fbea1a5c77b315c7dce08fd05) (based on LLVM 9.0.1)
>>> uma_startup1 start: PSCI_FN_VERSION 2
>>> uma_startup1 end: PSCI_FN_VERSION 2
>>> uma_startup2 start: PSCI_FN_VERSION 2
>>> uma_startup2 end: PSCI_FN_VERSION 2
>>> VT(efifb): resolution 1824x984
>>> module firmware already present!
>>> psci_fdt_get_callfn: method 'smc'
>>> psci_fdt_get_callfn: arm_smccc_hvc '0xffff0000008663b0'
>>> psci_fdt_get_callfn: arm_smccc_smc '0xffff0000008663c8'
>>> psci_init: PSCI_FN_VERSION 0
>>> Starting CPU 1 (1)
>>> Starting CPU 2 (2)
>>> Starting CPU 3 (3)
>>> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>>> random: unblocking device.
>>> uma_startup3 start: PSCI_FN_VERSION 0
>>> uma_startup3 start: PSCI_FN_VERSION 0
>>> . . .
>>>
>>> So sometime after uma_startup2 ends, but before psci_init
>>> sets up the normal use of arm_smccc_smc, things get
>>> messed up.
>>>
>>> My guess is that one or more of the kernel memory
>>> allocations ended up getting memory used by ARM's
>>> PSCI implementation and the content was invalidated
>>> for ARM's PSCI purposes.
>>>
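
Finding the window by eye gets tedious once many probes are in the log. The following is a rough sketch, assuming the probe lines keep the "<where>: PSCI_FN_VERSION <n>" shape shown in the excerpt. The function name and the sample list are illustrative only. It reports the last probe that still returned 2 and the first probe that did not.

```python
import re

# Matches probe lines such as "uma_startup2 end: PSCI_FN_VERSION 2".
PROBE = re.compile(r"^(?P<where>.*?):\s*PSCI_FN_VERSION\s+(?P<value>\d+)\s*$")


def find_transition(lines, good=2):
    """Return (last_good_where, first_bad_where) around the first change.

    Either element is None if no such probe was seen.
    """
    last_good = None
    for line in lines:
        # Drop mail quote prefixes such as ">>> " before matching.
        m = PROBE.match(line.lstrip("> ").rstrip())
        if m is None:
            continue  # not a probe line
        if int(m.group("value")) == good:
            last_good = m.group("where")
        else:
            return last_good, m.group("where")
    return last_good, None


# Sample lines copied from the boot excerpt.
sample = [
    "uma_startup2 start: PSCI_FN_VERSION 2",
    "uma_startup2 end: PSCI_FN_VERSION 2",
    "VT(efifb): resolution 1824x984",
    "psci_init: PSCI_FN_VERSION 0",
    "uma_startup3 start: PSCI_FN_VERSION 0",
]
print(find_transition(sample))
```

The result only bounds the window between two probes. It says nothing about which allocation or code path in between is responsible.
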
>>
>> The transition from working to not is before the
>> debug.verbose_sysinit=1 output turns into
>> symbolic notation:
>>
>> . . .
>> subsystem 1800000
>> mi_startup: PSCI_FN_VERSION 2
>> 0xffff00000047e170(0)... done.
>> mi_startup: PSCI_FN_VERSION 2
>> 0xffff0000007bda18(0)... done.
>> mi_startup: PSCI_FN_VERSION 2
>> 0xffff0000004407e4(0)... done.
>> mi_startup: PSCI_FN_VERSION 2
>> 0xffff0000004400c0(0xffff000000a75890)... done.
>> mi_startup: PSCI_FN_VERSION 2
>> 0xffff0000004400c0(0xffff000000a76340)... done.
>> . . .
>> mi_startup: PSCI_FN_VERSION 2
>> 0xffff0000004400c0(0xffff000000aca328)... done.
>> mi_startup: PSCI_FN_VERSION 2
>> 0xffff0000004400c0(0xffff000000aca378)... done.
>> mi_startup: PSCI_FN_VERSION 0
>> 0xffff0000004400c0(0xffff000000aca5f8)... done.
>> . . .
>> mi_startup: PSCI_FN_VERSION 0
>> 0xffff0000004c62b0(0)... done.
>> mi_startup: PSCI_FN_VERSION 0
>> 0xffff00000049e93c(0)... done.
>> mi_startup: PSCI_FN_VERSION 0
>> linker_preload(0)... Preloaded elf kernel "/boot/kernel/kernel" at 0xffff000001568000.
>> Preloaded elf module "/boot/kernel/umodem.ko" at 0xffff000001571020.
>> Preloaded elf module "/boot/kernel/ucom.ko" at 0xffff000001571838.
>> Preloaded boot_entropy_cache "/boot/entropy" at 0xffff000001572010.
>> module firmware already present!
>> done.
>> subsystem 1ffffff
>> mi_startup: PSCI_FN_VERSION 0
>> ucom_init(0)... done.
>> subsystem 2000000
>> mi_startup: PSCI_FN_VERSION 0
>> procs_show_all_add(0)... done.
>> . . .
>>
>>
>> It turns out that 0xffff0000004400c0 is: malloc_init .
>>
>> It turns out that 0xffff000000aca378 is for: M_IFMADDR :
>>
>> . . .
>>
>
> Do not take the above as an indication that the change
> happens at a stable stage. As my activities change the memory
> allocation patterns and the memory layout somewhat (and other
> sources of variability are possibly involved), the point in
> the boot sequence where the change happens moves around.
>
> For example, while the above was before the output
> turned to symbolic notation, that need not be the
> case in general.
>
> So it is just one example of where the change can happen. It
> bounds a kind of activity that is sufficient for the change in
> status to happen, since it is a fairly specific example context.
>
> The way things move around, I'm not likely to come up with
> a narrower type of activity spanning the status change.
>
> I have no evidence on whether the Excluded Memory Regions
> are sufficient or respected.
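
One way to gather such evidence would be to compare the physical ranges of suspect allocations against the excluded regions. The following is only a sketch of the overlap test itself. All addresses below are made-up placeholders, not values from a real RPi4, and the function name is hypothetical.

```python
def overlaps(start, size, regions):
    """Return the (base, length) regions that intersect [start, start + size)."""
    end = start + size
    return [(base, length) for base, length in regions
            if start < base + length and base < end]


# Hypothetical excluded regions as (base, length) pairs.
excluded = [(0x0000_0000, 0x1000), (0x3B40_0000, 0x04C0_0000)]

# Hypothetical physical ranges of two allocations.
print(overlaps(0x0000_2000, 0x1000, excluded))  # range clear of both regions
print(overlaps(0x3B3F_F000, 0x2000, excluded))  # range straddling a region start
```

An empty list for every allocation would suggest the regions are respected. It would not show that the regions are sufficient to cover everything the PSCI implementation uses.
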
>
> For reference, I'd used
>
> set debug.verbose_sysinit=1
> boot -v
>
> for the above example and my own printf's were
> involved. And I based this testing on head
> -r357529 instead of directly on -r356776 . The
> kernel was a non-debug build (with symbols).
>
FYI: I ran into a note on the web reporting that for the
RPi4:
The SoC does not seem to feature a secure memory controller of any kind, so portions of DRAM can’t be protected properly from the Non-secure world.
( https://trustedfirmware-a.readthedocs.io/en/latest/plat/rpi4.html#tf-a-port-design )
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)