Sleeping thread owns a nonsleepable lock panic (& lor)
Kostik Belousov
kostikbel at gmail.com
Wed Jul 27 12:08:58 UTC 2011
On Tue, Jul 26, 2011 at 07:12:23PM -0400, Rick Macklem wrote:
> Kostik Belousov wrote:
> > On Tue, Jul 26, 2011 at 01:17:52PM +0200, Herve Boulouis wrote:
> > > On 26/07/2011 12:06, Kostik Belousov wrote:
> > > > On Tue, Jul 26, 2011 at 11:49:13AM +0200, Herve Boulouis wrote:
> > > > > On 25/07/2011 11:59, Kostik Belousov wrote:
> > > > >
> > > > > OK, the patched server crashed strangely this morning: all httpd
> > > > > processes were stuck in nfs or vmopar
> > > > > and were unkillable. Below is the full ps.
> > > >
> > > > Please see the
> > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
> > > > for information required to debug the deadlocks.
> > >
> > > The box was not strictly deadlocked since I was able to interact with
> > > it, but I suppose you want me to
> > > break into the debugger when the symptoms appear again and report the
> > > output of all the commands listed in the handbook
> > > deadlock section?
> >
> > Exactly.
> >
> > I think everything was hung that accessed an nfs mount point.
> > From usermode, procstat -kk could catch some interesting
> > information,
> > but it is redundant if the ddb output is captured.
>
> Would it be worth considering reverting r223054?
> (Note that I don't understand the VM side, so this may be completely
> wrong:-)
>
> The sleeps on vmopar could be happening because a dirty page is busy
> and r223054 changes the VM_PAGER_xx value set in a couple of ways.
> 1 - When it returns VM_PAGER_ERROR instead of VM_PAGER_AGAIN, the
> return value of "runlen" from vm_pageout_flush() changes.
> 2 - I'm not sure, but I think the pre-r223054 code marked a partially
> written page as VM_PAGER_OK instead of VM_PAGER_AGAIN?
> (I'm wondering about this one, since the problem seems to happen
> when the file's size has been truncated.)
>
> Herve Boulouis, if you want to see what r223054 changes, just go to
> http://svn.freebsd.org/viewvc/stable/8/sys/nfsclient
> and then click on nfs_bio.c.
> (The changes are small and could easily be reverted with a manual
> edit.)
>
> Since r223054 went into stable/8 on Jun 13, it seems a possible
> explanation? rick
I doubt it. The ps output makes it fairly plausible that the
reporter hit a LOR between the vnode lock and the page busy flag. The
correct order is vnode lock -> busy bit. vmopar is a wait for the busy
page state.
The mentioned revision does not change the lock order.
Anyway, this is only speculation until the requested data is provided.
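
To illustrate what a lock order reversal between these two objects means,
here is a minimal user-space sketch in C. It is not kernel code and not how
WITNESS or the VM layer is implemented. The names lock_rank, held_locks,
acquire(), correct_path() and reversed_path() are all hypothetical, and the
page busy flag is modeled as an ordinary ranked lock purely for illustration.
The only fact taken from the discussion is the order vnode lock -> busy bit.

```c
#include <assert.h>

/*
 * Hypothetical ranks: a lock with a lower rank must be taken first.
 * This encodes the order "vnode lock -> busy bit" from the discussion.
 */
enum lock_rank { RANK_VNODE = 1, RANK_PAGE_BUSY = 2 };

#define MAX_HELD 8

/* Per-thread record of the ranks currently held (illustrative only). */
struct held_locks {
	enum lock_rank stack[MAX_HELD];
	int depth;
};

/*
 * Record an acquisition. Returns 0 if the order is respected and -1 if
 * a lock of equal or higher rank is already held, i.e. a reversal (or a
 * recursive acquisition, which this sketch also rejects).
 */
int
acquire(struct held_locks *h, enum lock_rank rank)
{
	assert(h->depth < MAX_HELD);
	for (int i = 0; i < h->depth; i++) {
		if (h->stack[i] >= rank)
			return (-1);
	}
	h->stack[h->depth++] = rank;
	return (0);
}

/* Drop everything; enough for this sketch. */
void
release_all(struct held_locks *h)
{
	h->depth = 0;
}

/* Correct order: vnode lock first, then the page busy state. */
int
correct_path(void)
{
	struct held_locks h = { .depth = 0 };
	int error;

	error = acquire(&h, RANK_VNODE);
	if (error == 0)
		error = acquire(&h, RANK_PAGE_BUSY);
	release_all(&h);
	return (error);
}

/* Reversed order: page already busied, then the vnode lock is requested. */
int
reversed_path(void)
{
	struct held_locks h = { .depth = 0 };
	int error;

	error = acquire(&h, RANK_PAGE_BUSY);
	if (error == 0)
		error = acquire(&h, RANK_VNODE);
	release_all(&h);
	return (error);
}
```

In this sketch correct_path() succeeds and reversed_path() is rejected. In a
real system two threads taking the two paths concurrently can each end up
holding one object while waiting for the other, which would look like
processes stuck waiting on the busy page state.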