'make -j16 universe' gives SIReset

Peter Jeremy peterjeremy at acm.org
Mon Jun 13 23:51:56 UTC 2011


On 2011-Jun-09 00:48:01 +0200, Marius Strobl <marius at alchemy.franken.de> wrote:
>This might be due to the excessive use of sched_lock by SCHED_4BSD
>and the MD code, e.g. more CPUs means fewer TLB contexts per CPU, which
>in turn means more flushes that are protected by sched_lock.

I have noticed that systat reports very high trap & fault counts.

>It would
>be great if the machine wouldn't lock up so you could check what
>exactly is holding the mutex for so long.

Agreed.

>I think meanwhile I had a sound idea how to achieve the necessary level
>of protection in the MD code using just atomic operations instead of
>sched_lock, which further down would also allow the use of SCHED_ULE.

Sounds good - let me know if there's anything I can do to help.

>> I tried adding this and the system survived a "make -j30 universe" on
>> -stable (BTW "make universe" seems to have issues cross-building x86
>> derivatives).  I'm now trying that on -current.  I'm not sure what the
>> implications of the above change are.
>> 
>
>What was the outcome of these tests?

I got a "spinlock held too long" panic that should have gone to DDB
but the system wouldn't respond to anything other than an RSC reset.

>Nevertheless it also would be interesting to know if you end up
>with a corrupt kernel stack with DDB, KDB and r222840 in place,
>especially in case disabling superscalar dispatch doesn't solve
>the problem.

I'm building r223035 with DDB & KDB and will see how that goes.

-- 
Peter Jeremy