FreeBSD 10.x + LiquidSoap + NFS == Server Hang

Rick Macklem rmacklem at uoguelph.ca
Fri Jul 4 12:10:31 UTC 2014


Marc Fournier wrote:
> 
> k, just found
> http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-online-ddb.html
> and set up KDB/DDB and just tested that using the ‘sysctl’ works to
> get me to the KDB prompt … hopefully this will allow me to provide
> more useful information, if someone can let me know what exactly
> that would be for next time it hangs? :)
> 
> 
> thx
> 
> 
> On Jul 3, 2014, at 9:26 PM, Marc Fournier <scrappy at hub.org> wrote:
> 
> > 
> > Oh, on the remote console, last two lines I see are:
> > 
> > ==
> > nfs_getpages: error 4
> > vm_fault: pager read error, pid 2957 (liquidsoap)
> > ==
Error 4 is EINTR, which suggests you might have the "intr" option on the mount.

If you are using "intr", try taking it out.

The problem with "intr" is that if anything posts a signal to a process while
NFS I/O is in progress, that I/O fails with EINTR. In this case the failure is
in nfs_getpages(), which is a pagein operation (and you don't want those to fail).

If you aren't using "intr", then I have no idea why a read would fail with EINTR.
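
A quick way to check for the option is to look at the NFS lines in
/etc/fstab. The following is only a rough sketch: it assumes the mount is
configured through fstab (rather than mounted by hand or from a script), and
the function name list_intr_mounts is made up for this example, not an
existing tool.

```shell
# Sketch: print the mount point of every NFS entry in an fstab-format file
# whose option list contains "intr". list_intr_mounts is a hypothetical
# helper name chosen for this example.
list_intr_mounts() {
    # $1: path to an fstab-format file (defaults to /etc/fstab)
    awk '
        $1 ~ /^#/ { next }                 # skip comment lines
        NF >= 4 && $3 == "nfs" {           # fields: device, mountpoint, fstype, options
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] == "intr") {   # exact match, so "nointr" is not reported
                    print $2
                    break
                }
            }
        }
    ' "${1:-/etc/fstab}"
}
```

Calling list_intr_mounts with no argument reads /etc/fstab; any mount point it
prints has "intr" set and is a candidate for taking the option out.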

rick

> > 
> > if that helps any ...
> > 
> > On Jul 3, 2014, at 9:23 PM, Marc Fournier <scrappy at hub.org> wrote:
> > 
> >> 
> >> Hi all …
> >> 
> >> 	I have a jail running on FreeBSD 10-STABLE (svn update as of July
> >> 	2nd @ ~05:30 UTC):
> >> 
> >> ==
> >> Working Copy Root Path: /usr/src
> >> URL: https://svn0.us-east.freebsd.org/base/stable/10
> >> Relative URL: ^/stable/10
> >> Repository Root: https://svn0.us-east.freebsd.org/base
> >> Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
> >> Revision: 268135
> >> Node Kind: directory
> >> Schedule: normal
> >> Last Changed Author: pfg
> >> Last Changed Rev: 268132
> >> Last Changed Date: 2014-07-02 01:28:38 +0000 (Wed, 02 Jul 2014)
> >> ==
> >> 
> >> 	Currently it has 3 jail’d environments running off it, with the
> >> 	files for them NFS mounted from a NetApp filer … and right now,
> >> 	the NFS mount that these jails are running from is “locked” … a
> >> 	‘df’ hangs … trying to do a ‘jexec # /bin/tcsh’ into one of the
> >> 	jails hangs … etc.
> >> 
> >> 	The same NFS file system is mounted and running on a half dozen
> >> 	other servers, and they are all operating just fine, so the
> >> 	NetApp is operating properly.
> >> 
> >> 	If I move the jail with liquidsoap running around to a different
> >> 	server, the hang will follow to the new server, and the old
> >> 	server will once more become rock solid …
> >> 
> >> 	I’m not 100% certain it is liquidsoap, but the hang appears to
> >> 	always coincide with reloading a new playlist … and although it
> >> 	happens frequently (more with recent upgrades), it doesn’t
> >> 	happen *every* night …
> >> 
> >> 	This is on a remote server … so doing things at the console isn’t
> >> 	possible, and although I’ve got a remote console on this, I’ve
> >> 	never figured out how to break to the debugger through it,
> >> 	although I’m going to work on it to see if I can’t get it to
> >> 	work …
> >> 
> >> 	Barring breaking to the debugger (is there a way, from the command
> >> 	line, to force it to break to the debugger?), is there anything
> >> 	else I can use to provide some sort of useful information?
> >> 
> >> ps aux for the process shows:
> >> 
> >> # ps aux | grep liq
> >> 1002     2957   0.0  0.7 226888 112792  -  TLJ   4:45AM  370:27.23 /usr/local/bin/liquidsoap -q -d /usr/local/etc/liquidsoap/liquidsoap.liq
> >> 
> >> and:
> >> 
> >> # ps auxxwl | grep 2957
> >> 1002     2957   0.0  0.7 226888 112792  -  TLJ   4:45AM  370:27.23 /usr/local/bin/l  1002     1   0  20  0 -
> >> 1002    96280   0.0  0.0  12316      0  -  IWJ  -          0:00.00 pwait 2957        1002 96274   0  52  0 kqread
> >> root    96508   0.0  0.0  18788   1828  4  S+    4:19AM    0:00.00 grep 2957            0 96505   0  20  0 piperd
> >> 
> >> 	Other commands I can / should run next time it happens … ?
> >> 	   Which won’t take long ...
> >> 
> >> Thanks …
> >> 
> >> 
> > 
> > _______________________________________________
> > freebsd-stable at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> > To unsubscribe, send any mail to
> > "freebsd-stable-unsubscribe at freebsd.org"
