mpd has hung

Bjoern A. Zeeb bzeeb-lists at lists.zabbadoz.net
Sat Feb 20 12:05:07 UTC 2010


On Sat, 20 Feb 2010, Bjoern A. Zeeb wrote:

> On Fri, 19 Feb 2010, Mikolaj Golub wrote:
>
>> On Thu, 18 Feb 2010 17:32:37 +0200 Nikos Vassiliadis wrote:
>> 
>>> On 2/17/2010 3:26 PM, Alexander Shikoff wrote:
>>>> Hello All,
>>>> 
>>>> I have mpd 5.3 running on 8.0-RC1 as PPPoE server (now only 5 clients).
>>>> Today the mpd process hung; I cannot kill it with signal 9, and I cannot
>>>> access its console via telnet.
>>>> 
>>>> State of process in `top` output is STOP:
>>>> 73551 root          2  44    0 29588K  5692K STOP    6   0:32  0.00% mpd5
>>>> 
>>>> # procstat -kk 73551
>>>>    PID    TID COMM             TDNAME           KSTACK
>>>> 73551 100233 mpd5             -                mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 flowtable_flush+0x51 if_detach+0x2f2 ng_iface_shutdown+0x1e ng_rmnode+0x167 ng_apply_item+0xef7 ng_snd_item+0x2ce ngc_send+0x1d2 sosend_generic+0x3f6 kern_sendit+0x13d sendit+0xdc sendto+0x4d syscall+0x1da Xfast_syscall+0xe1
>>>> 73551 100502 mpd5             -                mi_switch+0x16f thread_suspend_switch+0xc6 thread_single+0x1b6 exit1+0x72 sigexit+0x7c postsig+0x306 ast+0x279 doreti_ast+0x1f
>>>> 
>>>> Is there a way to stop a process without rebooting a whole system?
>>>> Thanks in advance!
>>>> 
>>>> P.S. I'm ready for experiments with it before tonight, but I cannot
>>>> force system to crash in order to get crash dump right now.
>>>> 
>>> 
>>> It's probably too late now, but are you sure that nobody pressed
>>> CTRL-Z while in the mpd console?
>>> 
>>> CTRL-Z will send SIGSTOP to the process and the process will
>>> stop. While stopped, all processing stops (including receiving
>>> SIGKILL, you cannot kill it, and the signals are queued). You
>>> have to send SIGCONT for the process to continue.
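
The stop/continue part of the suggestion above can be tried from user
space. This is only a minimal sketch: the function name and the sleeping
child process are made up for illustration, it covers a plain
SIGSTOP/SIGCONT round trip and nothing else, and it says nothing about
why the mpd5 process in this thread shows STOP.

```python
import os
import signal
import subprocess
import sys


def stop_and_continue():
    """Stop a throwaway child, confirm it is stopped, then resume it."""
    # Hypothetical stand-in for a long-running daemon.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        os.kill(child.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid() report a stopped (not exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        was_stopped = os.WIFSTOPPED(status)
        # Resume the child, as suggested in the quoted text.
        os.kill(child.pid, signal.SIGCONT)
    finally:
        # Clean up the throwaway child.
        child.kill()
        child.wait()
    return was_stopped
```

Calling stop_and_continue() reports whether the child was seen in the
stopped state before it was resumed.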
>> 
>> We were discussing this problem with Alexander on another
>> (Russian/Ukrainian speaking) mailing list, and it looks like the problem
>> is the following.
>> 
>> The mpd5 thread was detaching the ng interface, and in flowtable_flush() it
>> slept in cv_wait, waiting for the flowclean_cycles variable to be updated.
>> It should have been woken by the flowcleaner thread, but that thread got
>> stuck in an endless loop, supposedly in
>> flowtable_clean_vnet()/flowtable_free_stale(), I think because of an
>> inconsistent state of some lists (iface?) due to if_detach being in
>> progress.
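
The wait described above can be illustrated with a user-space analogy.
This is a sketch under stated assumptions, not the kernel code and not
what the patches do: the class and method names are hypothetical, the
counter merely stands in for flowclean_cycles, and the timeout exists
only so that the "stuck cleaner" case can be shown without hanging the
demo. The stack trace in this thread shows a plain cv_wait, which has no
timeout.

```python
import threading


class CycleWaiter:
    """Hypothetical user-space analogy of waiting for a cleaner pass."""

    def __init__(self):
        self.cond = threading.Condition()
        self.cycles = 0  # stands in for flowclean_cycles

    def cleaner_pass(self):
        # The cleaner bumps the counter at the end of each pass and
        # wakes every waiter.
        with self.cond:
            self.cycles += 1
            self.cond.notify_all()

    def flush(self, timeout=None):
        # Wait until the counter moves past the value seen on entry.
        # Returns True if a pass completed, False if the wait timed out.
        with self.cond:
            start = self.cycles
            return self.cond.wait_for(lambda: self.cycles != start, timeout)


def demo():
    w = CycleWaiter()
    stop = threading.Event()

    def cleaner():
        # Periodic cleaner: one pass every 10 ms until told to stop.
        while not stop.is_set():
            w.cleaner_pass()
            stop.wait(0.01)

    t = threading.Thread(target=cleaner)
    t.start()
    healthy = w.flush(timeout=2.0)  # cleaner is running, so this returns
    stop.set()
    t.join()
    # With no cleaner pass ever completing, the waiter is never woken.
    stuck = w.flush(timeout=0.1)
    return healthy, stuck
```

With timeout=None the second flush() would block forever, which is the
analogue of the mpd5 thread sitting in cv_wait while the flowcleaner
thread never finishes a pass.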
>
> I have patches that are out for review.

I am not sure if they apply cleanly, as they are broken out of the tail
end of a larger patchset.

If you are not using VIMAGE, you can ignore the ones I marked with (*).

http://people.freebsd.org/~bz/20100216-10-ft-cv.diff
http://people.freebsd.org/~bz/20100216-11-ft-debugging.diff
http://people.freebsd.org/~bz/20100216-12-ft-cleanup.diff	(*)
http://people.freebsd.org/~bz/20100216-13-ft-ll-cleanup.diff
http://people.freebsd.org/~bz/20100216-18-ft-free.diff		(*)

If you are still seeing the hang and have DDB support in your kernel,
then break into the debugger and save the complete output of
 	ddb> ps
before rebooting.

Regards,
Bjoern

-- 
Bjoern A. Zeeb         It will not break if you know what you are doing.


More information about the freebsd-net mailing list