mpd has hung
Bjoern A. Zeeb
bzeeb-lists at lists.zabbadoz.net
Sat Feb 20 12:05:07 UTC 2010
On Sat, 20 Feb 2010, Bjoern A. Zeeb wrote:
> On Fri, 19 Feb 2010, Mikolaj Golub wrote:
>
>> On Thu, 18 Feb 2010 17:32:37 +0200 Nikos Vassiliadis wrote:
>>
>>> On 2/17/2010 3:26 PM, Alexander Shikoff wrote:
>>>> Hello All,
>>>>
>>>> I have mpd 5.3 running on 8.0-RC1 as PPPoE server (now only 5 clients).
>>>> Today mpd process hung and I cannot kill it with -9 signal, and I cannot
>>>> access its console via telnet.
>>>>
>>>> State of process in `top` output is STOP:
>>>> 73551 root 2 44 0 29588K 5692K STOP 6 0:32 0.00% mpd5
>>>>
>>>> # procstat -kk 73551
>>>> PID TID COMM TDNAME KSTACK
>>>> 73551 100233 mpd5 - mi_switch+0x16f
>>>> sleepq_wait+0x42 _cv_wait+0x111 flowtable_flush+0x51 if_detach+0x2f2
>>>> ng_iface_shutdown+0x1e ng_rmnode+0x167 ng_apply_item+0xef7
>>>> ng_snd_item+0x2ce ngc_send+0x1d2 sosend_generic+0x3f6 kern_sendit+0x13d
>>>> sendit+0xdc sendto+0x4d syscall+0x1da Xfast_syscall+0xe1
>>>> 73551 100502 mpd5 - mi_switch+0x16f
>>>> thread_suspend_switch+0xc6 thread_single+0x1b6 exit1+0x72 sigexit+0x7c
>>>> postsig+0x306 ast+0x279 doreti_ast+0x1f
>>>>
>>>> Is there a way to stop a process without rebooting a whole system?
>>>> Thanks in advance!
>>>>
>>>> P.S. I'm ready for experiments with it before tonight, but I cannot
>>>> force system to crash in order to get crash dump right now.
>>>>
>>>
>>> It's probably too late now, but are you sure that nobody pressed
>>> CTRL-Z while in the mpd console?
>>>
>>> CTRL-Z will send SIGSTOP to the process and the process will
>>> stop. While stopped, all processing stops (including receiving
>>> SIGKILL; you cannot kill it, and the signals are queued). You
>>> have to send SIGCONT for the process to continue.
>>
>> We were discussing this problem with Alexander on another
>> (Russian/Ukrainian-speaking) mailing list, and it looks like the
>> problem is the following.
>>
>> The mpd5 thread was detaching the ng interface, and while doing
>> flowtable_flush() it slept in cv_wait, waiting for the flowclean_cycles
>> variable to be updated. It should have been awakened by the flowcleaner
>> thread, but that thread got stuck in an endless loop, supposedly in
>> flowtable_clean_vnet()/flowtable_free_stale(), I think because of an
>> inconsistent state of some lists (iface?) due to if_detach being in
>> progress.
>
> I have patches that are out for review.

I am not sure if they apply cleanly, as they are broken out of the tail
end of a larger patchset.

If you are not using VIMAGE, you can ignore the ones I marked with (*).

http://people.freebsd.org/~bz/20100216-10-ft-cv.diff
http://people.freebsd.org/~bz/20100216-11-ft-debugging.diff
http://people.freebsd.org/~bz/20100216-12-ft-cleanup.diff (*)
http://people.freebsd.org/~bz/20100216-13-ft-ll-cleanup.diff
http://people.freebsd.org/~bz/20100216-18-ft-free.diff (*)

If you are still seeing the hang and have DDB support in your kernel,
then break into the debugger and save the complete output of

    ddb> ps

before rebooting.

Regards,
Bjoern
--
Bjoern A. Zeeb It will not break if you know what you are doing.
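
For readers following the analysis quoted earlier in this message: the
wait in flowtable_flush() amounts to sleeping on a condition variable
until a cycle counter advances. Below is a minimal Python sketch of
that pattern. It is not the kernel code; the names Cleaner,
run_one_pass and flush, and the timeout parameter, are hypothetical
and exist only for illustration.

```python
import threading


class Cleaner:
    """Toy stand-in for the flowcleaner thread and its cycle counter.

    All names here are hypothetical; this only mimics the wait pattern
    described in the thread, not the FreeBSD implementation.
    """

    def __init__(self):
        self.cond = threading.Condition()
        self.cycles = 0  # plays the role of flowclean_cycles

    def run_one_pass(self):
        # A real cleaner would walk the flow table before this point.
        # If that walk never returns, the counter is never bumped and
        # nobody waiting in flush() is ever woken.
        with self.cond:
            self.cycles += 1
            self.cond.notify_all()

    def flush(self, timeout=None):
        """Wait until the cleaner completes a pass after this call.

        Returns True once the counter has advanced, False on timeout.
        With timeout=None this blocks until a pass completes.
        """
        with self.cond:
            start = self.cycles
            return self.cond.wait_for(lambda: self.cycles != start, timeout)
```

If no cleaner pass ever completes, flush(timeout=0.05) returns False
here. The cv_wait seen in the procstat stack has no timeout, so in the
same situation the waiting thread simply never wakes up, which matches
the hang reported above.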