Strange issue after early AP startup

Cy Schubert Cy.Schubert at komquats.com
Thu Jan 19 06:50:11 UTC 2017


In message <1922021.4HJeqFJ74r at ralph.baldwin.cx>, John Baldwin writes:
> On Tuesday, January 17, 2017 05:08:58 PM Cy Schubert wrote:
> > In message <1492450.XZfNz8zFfg at ralph.baldwin.cx>, John Baldwin writes:
> > > On Tuesday, January 17, 2017 12:53:19 PM Cy Schubert wrote:
> > > > In message <b9c53237-4b1a-a140-f692-bf5837060b18 at selasky.org>, Hans Petter Selasky writes:
> > > > > Hi,
> > > > > 
> > > > > When booting I observe an additional 30-second delay after this print:
> > > > > 
> > > > > > Timecounters tick every 1.000 msec
> > > > > 
> > > > > ~30 second delay and boot continues like normal.
> > > > > 
> > > > > Checking "vmstat -i" reveals that some timers have been running loose.
> > > > > 
> > > > > > cpu0:timer                         44300        442
> > > > > > cpu1:timer                         40561        404
> > > > > > cpu3:timer                      48462822     483058
> > > > > > cpu2:timer                      48477898     483209
> > > > > 
> > > > > Trying to add delays and/or prints around the Timecounters printout 
> > > > > makes the issue go away. Any ideas for debugging?
> > > > > 
> > > > > Looks like a startup race to me.
> > > > 
> > > > just picking a random email to reply to, I'm seeing a different issue with
> > > > early AP startup. It affects one of my four machines, my laptop. My three
> > > > server systems downstairs have no problem however my laptop will reboot
> > > > repeatedly at:
> > > > 
> > > > Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
> > > 
> > > So it panics and reboots after this?
> > 
> > Yes, it goes into a panic/reboot loop for a few iterations until it 
> > successfully boots. Disabling early AP startup allows it to boot up without
> > the assumed race.
> 
> Can you add DDB to the kernel config (and remove DDB_UNATTENDED) to get it
> to break into DDB when it panics to get the panic message (and a stack trace
> as well)?

I found and fixed the problem. It was in code I added to the bge driver a
long time ago to implement WOL but have not committed yet. The culprit was
a lock assertion in that code.


-- 
Cheers,
Cy Schubert <Cy.Schubert at cschubert.com>
FreeBSD UNIX:  <cy at FreeBSD.org>   Web:  http://www.FreeBSD.org

	The need of the many outweighs the greed of the few.



