FreeBSD 10-STABLE/sparc64 panic
John-Mark Gurney
jmg at funkthat.com
Mon Sep 29 04:23:05 UTC 2014
Chris Ross wrote this message on Mon, Sep 29, 2014 at 00:00 -0400:
> On Jun 30, 2014, at 10:40 , Chris Ross <cross+freebsd at distal.com> wrote:
> > tl;dr : I've finished my testing and have a result, but see other things I
> > don't understand. Could use more help.
>
> Old thread, problem still exists. Noticed in head around:
>
> http://lists.freebsd.org/pipermail/freebsd-sparc64/2014-March/009261.html
>
> And in stable/10 as of revision 263676 (likely earlier). Like numerous other
> people, I have tried to narrow it down to a commit, or a small number of
> commits, but the failure is sporadic. I think looking at the current code,
> which is still failing, may be most useful.
>
> I am right now seeing this on stable/10 code updated today, 10.1-BETA3,
> r272264. As noted earlier in these threads, I am running a Sun Fire v240. At
> least one or two other folks with v240s have seen this, as has, I think,
> someone with a SunBlade variant that also has bge interfaces.
>
> Multiuser boot panics at:
>
> Setting hostname: hostname.distal.com.
> bge0: link state changed to DOWN
> spin lock 0xc0c95330 (smp rendezvous) held by 0xfffff8000560a490 (tid 100347) too long
> timeout stopping cpus
> panic: spin lock held too long
> cpuid = 1
> KDB: stack backtrace:
> #0 0xc054a0d0 at _mtx_lock_spin_failed+0x50
> #1 0xc054a198 at _mtx_lock_spin_cookie+0xb8
> #2 0xc08b989c at tick_get_timecount_mp+0xdc
> #3 0xc056c33c at binuptime+0x3c
> #4 0xc08857ac at timercb+0x6c
> #5 0xc08b9c00 at tick_intr+0x220
> Uptime: 20s
> Automatic reboot in 15 seconds - press a key on the console to abort
>
> With past kernels more recent than March 2014, it will sometimes
> boot [to multiuser] on the first try, but usually it will crash a few times
> before eventually coming all the way up. Given 30-40 minutes, it will usually
> recover to multiuser, and is stable forever (in past testing) at that point.
> This evening, it was rebooting for about 40 minutes (11 panic and
> reboot sequences), but then came up.
>
> I would be happy to dig into this further, but will need some advice and
> instruction. I fear I may not even have built the kernel with full debugging,
> but can do so. I'll look into that now that the machine is up again.
>
> Please let me know what I can do to help. Thanks.
If you could get a core dump (call doadump), that'd be good, but dumping
the stack of the thread (tid) that held the spin lock too long would be a
good start.
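
As a rough illustration of the suggestion above, here is a minimal sketch that pulls the thread ID out of the "spin lock ... held by ... too long" console line and lists the commands one might type at the db> prompt. It assumes a kernel built with "options KDB" and "options DDB", that the machine drops into the debugger on panic instead of rebooting, and that a dump device is configured. The helper names (parse_spin_panic, ddb_commands) are made up for this example and are not part of any FreeBSD tool.

```python
import re

# Matches the console line printed before the panic, for example:
# spin lock 0xc0c95330 (smp rendezvous) held by 0xfffff8000560a490 (tid 100347) too long
SPIN_RE = re.compile(
    r"spin lock (0x[0-9a-f]+) \((.+?)\) held by (0x[0-9a-f]+) \(tid (\d+)\) too long"
)


def parse_spin_panic(console_text):
    """Return (lock_addr, lock_name, thread_addr, tid), or None if no match."""
    m = SPIN_RE.search(console_text)
    if m is None:
        return None
    lock_addr, lock_name, thread_addr, tid = m.groups()
    return lock_addr, lock_name, thread_addr, int(tid)


def ddb_commands(tid):
    """Hypothetical helper: DDB commands to type by hand at the db> prompt."""
    return [
        "show thread %d" % tid,  # basic state of the thread holding the lock
        "trace %d" % tid,        # stack trace of that thread
        "call doadump",          # write a crash dump to the dump device
    ]


if __name__ == "__main__":
    sample = ("spin lock 0xc0c95330 (smp rendezvous) held by "
              "0xfffff8000560a490 (tid 100347) too long")
    parsed = parse_spin_panic(sample)
    if parsed is not None:
        for cmd in ddb_commands(parsed[3]):
            print(cmd)
```

The script only formats text; the commands themselves still have to be entered on the serial console while the machine sits in the debugger.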
--
John-Mark Gurney Voice: +1 415 225 5579
"All that I will do, has been done, All that I have, has not."