svn commit: r313037 - in head/sys: amd64/include kern mips/include net powerpc/include sparc64/include

Svatopluk Kraus onwahe at gmail.com
Sat Feb 4 21:34:22 UTC 2017


Probably not related, but when I took a short look at the patch to see
what could go wrong, I ran into the following comment in
_rm_wlock(): "Assumes rm->rm_writecpus update is visible on other CPUs
before rm_cleanIPI is called." There is no explicit barrier to ensure
this. However, there might be some barriers inside
smp_rendezvous_cpus(). I have no idea what could happen if this
assumption is not met. Note that rm_cleanIPI() is affected by the
patch.
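
To illustrate the ordering that comment relies on, here is a minimal
userspace sketch using C11 atomics and pthreads. It is not kernel code:
writecpus, ipi_pending, handler_cpu() and publish_then_signal() are
made-up stand-ins, and I have not checked which barriers
smp_rendezvous_cpus() really provides. It only shows the
publish-then-signal pattern that would make the update visible before
the handler runs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins, NOT the kernel's data structures. */
static atomic_uint writecpus;    /* plays the role of rm->rm_writecpus */
static atomic_int  ipi_pending;  /* plays the role of the IPI delivery */
static unsigned    seen_by_handler;

/* Runs on the "other CPU"; stands in for rm_cleanIPI(). */
static void *
handler_cpu(void *arg)
{
	(void)arg;
	/* The acquire load pairs with the release store in the writer. */
	while (atomic_load_explicit(&ipi_pending, memory_order_acquire) == 0)
		;
	/* Guaranteed to observe the mask stored before the signal. */
	seen_by_handler = atomic_load_explicit(&writecpus,
	    memory_order_relaxed);
	return (NULL);
}

/* Publish the mask, then signal; return what the handler observed. */
static unsigned
publish_then_signal(unsigned mask)
{
	pthread_t t;
	int error;

	atomic_store(&writecpus, 0);
	atomic_store(&ipi_pending, 0);
	error = pthread_create(&t, NULL, handler_cpu, NULL);
	assert(error == 0);

	/* 1. Update the shared mask. */
	atomic_store_explicit(&writecpus, mask, memory_order_relaxed);
	/* 2. Release store: the update above is ordered before the signal. */
	atomic_store_explicit(&ipi_pending, 1, memory_order_release);

	pthread_join(t, NULL);
	return (seen_by_handler);
}
```

Calling publish_then_signal(0x5) from a small main() should return 0x5.
If the second store were relaxed instead of release, nothing in the
language would order the two stores, and the handler could in
principle see a stale mask on a weakly ordered machine.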



On Sat, Feb 4, 2017 at 9:39 PM, Jason Harmening
<jason.harmening at gmail.com> wrote:
> Can you post an example of such a panic?  Only 2 MI pieces were changed,
> netisr and rmlock.  I haven't seen problems on my own amd64/i386/arm testing
> of this, so a backtrace might help to narrow down the cause.
>
> On Sat, Feb 4, 2017 at 12:22 PM, Andreas Tobler <andreast at freebsd.org>
> wrote:
>>
>> On 04.02.17 20:54, Jason Harmening wrote:
>>>
>>> I suspect this broke rmlocks for mips because the rmlock implementation
>>> takes the address of the per-CPU pc_rm_queue when building tracker
>>> lists.  That address may be later accessed from another CPU and will
>>> then translate to the wrong physical region if the address was taken
>>> relative to the globally-constant pcpup VA used on mips.
>>>
>>> Regardless, for mips get_pcpup() should be implemented as
>>> pcpu_find(curcpu) since returning an address that may mean something
>>> different depending on the CPU seems like a big POLA violation if
>>> nothing else.
>>>
>>> I'm more concerned about the report of powerpc breakage.  For powerpc we
>>> simply take each pcpu pointer from the pc_allcpu list (which is the same
>>> value stored in the cpuid_to_pcpu array) and pass it through the ap_pcpu
>>> global to each AP's startup code, which then stores it in sprg0.  It
>>> should be globally unique and won't have the variable-translation issues
> >>> seen on mips.  Andreas, are you certain this change was responsible for the
>>> breakage you saw, and was it the same sort of hang observed on mips?
>>
>>
> >> I'm really sure. 313036 booted fine and allowed me to execute heavy
> >> compilation jobs, np. 313037 on the other hand gave me various patterns
> >> of panics. Some occurred during startup, but I also managed to get into
> >> multiuser, and then the panic happened during port building.
> >>
> >> I have no deeper insight into where pcpu data is used. Justin mentioned netisr?
>>
>> Andreas
>>
>

