mpd has hung

Alexander Shikoff minotaur at crete.org.ua
Mon Feb 22 11:34:58 UTC 2010


On Sat, Feb 20, 2010 at 12:04:35PM +0000, Bjoern A. Zeeb wrote:
> On Sat, 20 Feb 2010, Bjoern A. Zeeb wrote:
> 
> > On Fri, 19 Feb 2010, Mikolaj Golub wrote:
> >
> >> On Thu, 18 Feb 2010 17:32:37 +0200 Nikos Vassiliadis wrote:
> >> 
> >>> On 2/17/2010 3:26 PM, Alexander Shikoff wrote:
> >>>> Hello All,
> >>>> 
> >>>> I have mpd 5.3 running on 8.0-RC1 as PPPoE server (now only 5 clients).
> >>>> Today mpd process hung and I cannot kill it with -9 signal, and I cannot
> >>>> access its console via telnet.
> >>>> 
> >>>> State of process in `top` output is STOP:
> >>>> 73551 root          2  44    0 29588K  5692K STOP    6   0:32  0.00% mpd5
> >>>> 
> >>>> # procstat -kk 73551
> >>>>    PID    TID COMM             TDNAME           KSTACK
> >>>> 73551 100233 mpd5             -                mi_switch+0x16f 
> >>>> sleepq_wait+0x42 _cv_wait+0x111 flowtable_flush+0x51 if_detach+0x2f2 
> >>>> ng_iface_shutdown+0x1e ng_rmnode+0x167 ng_apply_item+0xef7 
> >>>> ng_snd_item+0x2ce ngc_send+0x1d2 sosend_generic+0x3f6 kern_sendit+0x13d 
> >>>> sendit+0xdc sendto+0x4d syscall+0x1da Xfast_syscall+0xe1
> >>>> 73551 100502 mpd5             -                mi_switch+0x16f 
> >>>> thread_suspend_switch+0xc6 thread_single+0x1b6 exit1+0x72 sigexit+0x7c 
> >>>> postsig+0x306 ast+0x279 doreti_ast+0x1f
> >>>> 
> >>>> Is there a way to stop a process without rebooting a whole system?
> >>>> Thanks in advance!
> >>>> 
> >>>> P.S. I'm ready for experiments with it before tonight, but I cannot
> >>>> force system to crash in order to get crash dump right now.
> >>>> 
> >>> 
> >>> It's probably too late now, but are you sure that nobody pressed
> >>> CTRL-Z while in the mpd console?
> >>> 
> >>> CTRL-Z will send SIGSTOP to the process and the process will
> >>> stop. While stopped, all processing stops (including receiving
> >>> SIGKILL, you cannot kill it, and the signals are queued). You
> >>> have to send SIGCONT for the process to continue.
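
For reference, the stop/continue cycle mentioned above can be sketched in
userland. This is only an illustration on a throwaway child process, not
something to run against mpd; the helper name stop_and_continue() and the
sleeping child are made up for the example.

```python
import os
import signal
import subprocess
import sys


def stop_and_continue():
    """Stop a throwaway child with SIGSTOP, then resume it with SIGCONT."""
    # Hypothetical stand-in for a long-running daemon: a child that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        # Put the child into the stopped state (shown as STOP/T by top/ps).
        os.kill(child.pid, signal.SIGSTOP)
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)

        # Let it run again.
        os.kill(child.pid, signal.SIGCONT)
        _, status = os.waitpid(child.pid, os.WCONTINUED)
        continued = os.WIFCONTINUED(status)
    finally:
        # Clean up the child regardless of what happened above.
        child.kill()
        child.wait()
    return stopped, continued
```

From a shell the equivalent of the second step would be `kill -CONT <pid>`.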
> >> 
> >> We were discussing this problem with Alexander on another
> >> (Russian/Ukrainian-speaking) mailing list, and it looks like the
> >> problem is the following.
> >> 
> >> An mpd5 thread was detaching an ng interface, and while doing
> >> flowtable_flush() it slept in cv_wait, waiting for the flowclean_cycles
> >> variable to be updated. It should have been awakened by the flowcleaner
> >> thread, but that thread got stuck in an endless loop, supposedly in
> >> flowtable_clean_vnet()/flowtable_free_stale(), I think because of an
> >> inconsistent state of some lists (iface?) due to if_detach being in
> >> progress.
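
The wait described above can be modelled roughly in userland. This is a toy
sketch, not the FreeBSD code: the class FlowCleanerModel, its methods and the
timeouts are all invented for illustration, and the only thing taken from the
description is the idea of a waiter sleeping on a condition variable until a
cycle counter changes.

```python
import threading


class FlowCleanerModel:
    """Toy model: a waiter sleeps until a cleaner bumps a cycle counter."""

    def __init__(self):
        self.cond = threading.Condition()
        self.cycles = 0  # stands in for flowclean_cycles

    def clean_once(self):
        # A healthy cleaner finishes a pass, bumps the counter
        # and wakes every waiter.
        with self.cond:
            self.cycles += 1
            self.cond.notify_all()

    def flush(self, timeout):
        # The detaching thread remembers the counter and sleeps until it
        # changes. The timeout exists only so this demo can terminate;
        # a wait without one would sleep forever if the cleaner is stuck.
        with self.cond:
            start = self.cycles
            return self.cond.wait_for(lambda: self.cycles != start, timeout)


def demo():
    model = FlowCleanerModel()

    # Healthy case: the cleaner completes a pass while flush() is waiting.
    timer = threading.Timer(0.2, model.clean_once)
    timer.start()
    woke = model.flush(timeout=5.0)
    timer.join()

    # Stuck case: nobody ever calls clean_once(), so only the timeout
    # ends the wait.
    stuck = model.flush(timeout=0.2)
    return woke, stuck
```

In this model the second flush() returns only because of the timeout, which
mirrors why the waiter never wakes up if the thread responsible for advancing
the counter loops forever.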
> >
> > I have patches that are out for review.
> 
> I am not sure if they apply cleanly as they are broken out of the tail
> side of a larger patchset.
> 
> If you are not using VIMAGEs you could ignore the ones I marked with (*).
> 
> http://people.freebsd.org/~bz/20100216-10-ft-cv.diff
> http://people.freebsd.org/~bz/20100216-11-ft-debugging.diff
> http://people.freebsd.org/~bz/20100216-12-ft-cleanup.diff	(*)
> http://people.freebsd.org/~bz/20100216-13-ft-ll-cleanup.diff
> http://people.freebsd.org/~bz/20100216-18-ft-free.diff		(*)
> 
> If you are still seeing the hang and have DDB support in your kernel,
> then break into the debugger and save the complete output of
>  	ddb> ps
> before rebooting.

I cannot run tests right now because that box is in production.
I need some time to move all traffic off it.

-- 
MINO-RIPE

