Strange issue after early AP startup
Cy Schubert
Cy.Schubert at komquats.com
Thu Jan 19 06:50:11 UTC 2017
In message <1922021.4HJeqFJ74r at ralph.baldwin.cx>, John Baldwin writes:
> On Tuesday, January 17, 2017 05:08:58 PM Cy Schubert wrote:
> > In message <1492450.XZfNz8zFfg at ralph.baldwin.cx>, John Baldwin writes:
> > > On Tuesday, January 17, 2017 12:53:19 PM Cy Schubert wrote:
> > > > In message <b9c53237-4b1a-a140-f692-bf5837060b18 at selasky.org>,
> > > > Hans Petter Selasky writes:
> > > > > Hi,
> > > > >
> > > > > When booting I observe an additional 30-second delay after this print:
> > > > >
> > > > > > Timecounters tick every 1.000 msec
> > > > >
> > > > > ~30 second delay and boot continues like normal.
> > > > >
> > > > > Checking "vmstat -i" reveals that some timers have been running loose.
> > > > >
> > > > > > cpu0:timer 44300 442
> > > > > > cpu1:timer 40561 404
> > > > > > cpu3:timer 48462822 483058
> > > > > > cpu2:timer 48477898 483209
> > > > >
> > > > > Trying to add delays and/or prints around the Timecounters printout
> > > > > makes the issue go away. Any ideas for debugging?
> > > > >
> > > > > Looks like a startup race to me.
> > > >
> > > > just picking a random email to reply to, I'm seeing a different issue
> > > > with early AP startup. It affects one of my four machines, my laptop.
> > > > My three server systems downstairs have no problem however my laptop
> > > > will reboot repeatedly at:
> > > >
> > > > Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed:
> > > > NOT READY, Medium not present - tray closed
> > >
> > > So it panics and reboots after this?
> >
> > Yes, it goes into a panic/reboot loop for a few iterations until it
> > successfully boots. Disabling early AP startup allows it to boot up without
> > the assumed race.
>
> Can you add DDB to the kernel config (and remove DDB_UNATTENDED) to get it
> to break into DDB when it panics to get the panic message (and a stack trace
> as well)?
I found and fixed the problem. It was a lock assertion in some code I added
to the bge driver a long time ago to implement WOL but have not committed
yet.
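
For anyone chasing a similar panic/reboot loop, the suggestion quoted above
(build in DDB and do not let it run unattended) might look roughly like the
kernel config fragment below. This is only a sketch, not the config used in
this thread: the MYKERNEL name is hypothetical, and it assumes a kernel where
the unattended knob is spelled KDB_UNATTENDED rather than DDB_UNATTENDED.

```
# Hypothetical custom kernel config; the name MYKERNEL is an assumption.
include         GENERIC
ident           MYKERNEL

options         KDB                 # kernel debugger framework
options         DDB                 # interactive in-kernel debugger
nooptions       KDB_UNATTENDED      # stop at the db> prompt instead of rebooting

# To rule out early AP startup while testing, uncomment:
# nooptions     EARLY_AP_STARTUP
```

With a kernel built this way, a panic should print its message and stop at
the db> prompt, where "bt" shows the stack trace.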
--
Cheers,
Cy Schubert <Cy.Schubert at cschubert.com>
FreeBSD UNIX: <cy at FreeBSD.org> Web: http://www.FreeBSD.org
The need of the many outweighs the greed of the few.