Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

Kenneth D. Merry ken at freebsd.org
Tue Jun 21 16:17:21 UTC 2011


On Mon, Jun 20, 2011 at 15:46:56 +0400, Andrey Chernov wrote:
> On Mon, Jun 20, 2011 at 11:01:46AM +0300, Kostik Belousov wrote:
> > On Mon, Jun 20, 2011 at 11:02:22AM +0400, Andrey Chernov wrote:
> > > On Sun, Jun 19, 2011 at 08:15:43PM -0600, Justin T. Gibbs wrote:
> > > > On 6/19/11 6:19 PM, Andrey Chernov wrote:
> > > > > Exactly that commit is responsible for boot hang.
> > > > > Please fix.
> > > > > BTW, I have MBR on SATA disk (CAM emulated), ICH9.
> > > > 
> > > > Since it works for me, you'll need to provide more information.  Can you
> > > > at least drop into kdb to determine the likely source of the hang by
> > > > getting a stack trace of all processes to see where they are sleeping
> > > > and dumping lock information?
> > > 
> > > I drop into DDB and put 'bt' console photo in the very first message of 
> > > this thread - nothing unusual seen in the main stack. Could you please 
> > > specify exact DDB commands you want to be issued by me? No dump can be 
> > > provided since nothing is mounted yet, including swap.
> > > 
> > > BTW, I remember I saw previously unseen warnings with post Jun 14 kernels:
> > > "xpt_action_default: CCB type 0xe not supported"
> > > 
> > > 'ps' inside DDB shows [xpt_thrd] at "ccb_scan" wmesg state and [g_event]
> > > at "caplck" wmesg state, [kernel] at "g_waitid" state.
> > > I don't even know if it matters.
> > 
> > Just in case, please try r223277.
> 
> As the second message in the thread states, I first tried even 223296, with
> the same hang and the same
> xpt_action_default: CCB type 0xe not supported
> As I see it, DDB's 'ps' indicates that the kernel waits for something from geom and
> geom waits for something from ccb_scan forever; just a raw guess. I will be glad to
> issue more specific DDB commands and upload corresponding photos.
> BTW, plugging and unplugging USB devices works at that stage.

Can you run the following DDB commands when the hang happens:

ps
alltrace
show locks
show msgbuf

Hopefully that will give us something to start looking at...

This would really work a lot better if there were any way to get a serial
console on the machine.  The above will produce a good bit of output, and
capturing it from the screen would likely need a lot of pictures.

Since we can't reproduce the problem here, some debugging help would be
greatly appreciated.

Thanks,

Ken
-- 
Kenneth Merry
ken at FreeBSD.ORG

