10-stable sparc64 boot problems

Mon Mar 24 15:50:36 UTC 2014

  I just updated my 10-stable sparc64 machine (Sun Fire V240) to a 10-stable kernel from revision 263676, which was the first one I found after the numerous failures over the weekend in route6d.  After rebooting into single-user for mergemaster and installworld, I attempted to reboot into multi-user.  The first two attempts yielded:

Trying to mount root from zfs:zroot []...
Setting hostuuid: 94588820-cd20-11e1-b15b-0003bae34047.
Setting hostid: 0x4f9a5776.
Entropy harvesting: interrupts ethernet point_to_point swi.
Starting file system checks:
Mounting local file systems:.
Writing entropy file:.
Setting hostname: hostname.distal.com.
bge0: link state changed to DOWN
spin lock 0xc0c61cb0 (smp rendezvous) held by 0xfffff800054dcdb0 (tid 100328) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 1
KDB: stack backtrace:
#0 0xc051fcf0 at _mtx_lock_spin_failed+0x50
#1 0xc051fdb8 at _mtx_lock_spin_cookie+0xb8
#2 0xc088771c at tick_get_timecount_mp+0xdc
#3 0xc0541efc at binuptime+0x3c
#4 0xc08513cc at timercb+0x6c
#5 0xc0887a80 at tick_intr+0x220
Uptime: 23s
Automatic reboot in 15 seconds - press a key on the console to abort

  Both panics were the same, except that the uptime was 27s in one case and 23s in the other.  The next reboot went all the way to multi-user, and the machine appears to be operating normally, at least for the first 5 minutes.

  I’ll keep an eye on it.  But is this possibly related to the bge0 device driver, or is it more likely a problem in the sparc/sparc64 code that is not related to a specific device?

  The prior kernel that had been running without this problem was:

FreeBSD 10.0-STABLE #6 r261083: Thu Jan 23 17:54:24 EST 2014

  I just wanted to see if anyone had any thoughts, and I’m hoping the machine stays operational now that it’s up and running…

                                 - Chris