Re: why a FreeBSD 10.2 VM hang under high pressure?

Liuyingdong liuyingdong at huawei.com
Tue Dec 27 14:40:00 UTC 2016


Hello Roger,
	Thank you for the time and patience you devoted to reading my messages and e-mails. I really appreciate it.
	After analysis, the cause was a user process named monitor that frequently read and wrote xenstore in my test VM. The monitor had been suspended after stop_all_proc was called, and I found it was still holding the xs.request_mutex lock, so the subsequent xs_write and xs_suspend calls could never acquire it. Hence the VM hang.
	I can't use git send-email, so I am attaching the patch directly. In this patch, stop_all_proc acquires the xs.request_mutex lock beforehand, and xs_dev_write in the xenstore device must wait until stop_all_proc has finished.
	In my test environment, the VM stayed alive after live migrating back and forth 1000 times.
	I would like to know what you think of this patch. Please let me know if you have any questions. Thanks!

    Note: The patch is based on the origin/release/10.2.0 branch.

-----Original Message-----
From: Roger Pau Monne [mailto:roger.pau at citrix.com] 
Sent: 16 December 2016 0:47
To: Liuyingdong
Cc: Zhaojun (Euler); Suoben
Subject: Re: why a FreeBSD 10.2 VM hang under high pressure?

On Thu, Dec 15, 2016 at 01:58:33PM +0000, Liuyingdong wrote:
> Hello,
> 	The migration process didn't finish successfully; the VM hung on the sending side. 
> 	Because I had introduced a suspend-cancel mechanism in my test 
> environment, I think the VM will stay alive on the sending side if the libxl suspend times out. When memory pressure on the sending side was too high, the VM dirtied pages faster than they could be copied, which would have resulted in a much longer downtime, so the prediction algorithm cancelled the live migration. In my test environment, the VM stayed alive on the sending side as I expected.

Right, this is thanks to your recover-from-failed-suspension patch; I will try to review it ASAP.

>     I found stop_all_proc had finished by printing log:
> =====================log start===============================
> [===test===]->xctrl_suspend
> [===test===]start to stop_all_proc
> [===1209===stop]p_comm=sftp-server,pid=3158,r=0
> [===1209===stop]p_comm=csh,pid=3156,r=0
> [===1209===stop]p_comm=sshd,pid=3152,r=0
> [===1209===stop]p_comm=csh,pid=836,r=0
> [===1209===stop]p_comm=sshd,pid=833,r=0
> [===1209===stop]p_comm=getty,pid=715,r=0
> [===1209===stop]p_comm=getty,pid=714,r=0
> [===1209===stop]p_comm=getty,pid=713,r=0
> [===1209===stop]p_comm=getty,pid=712,r=0
> [===1209===stop]p_comm=getty,pid=711,r=0
> [===1209===stop]p_comm=getty,pid=710,r=0
> [===1209===stop]p_comm=getty,pid=709,r=0
> [===1209===stop]p_comm=getty,pid=708,r=0
> [===1209===stop]p_comm=getty,pid=707,r=0
> [===1209===stop]p_comm=cron,pid=663,r=0
> [===1209===stop]p_comm=sendmail,pid=659,r=0
> [===1209===stop]p_comm=sendmail,pid=656,r=0
> [===1209===stop]p_comm=sshd,pid=653,r=0
> [===1209===stop]p_comm=uvp-monitor,pid=559,r=0
> [===1209===stop]p_comm=syslogd,pid=455,r=0
> [===1209===stop]p_comm=devd,pid=381,r=0
> [===1209===stop]p_comm=dhclient,pid=380,r=0
> [===1209===stop]p_comm=dhclient,pid=318,r=0
> [===test===]finish to stop_all_proc
> [===test===]start to power_suspend
> [===test===]finish to power_suspend
> [===test===]start to sched_bind
> [===test===]finish to sched_bind
> [===test===]running on CPU#0
> ...
> ======================log end==============================
> 
> 	I debugged remotely and the stack information is shown below. To my surprise, acpi_cpu.c:1038 is in acpi_cpu_c1(), but the next frame jumped suddenly to the pmap_kenter() function. I guess the VM was doing a crashdump map after acpi_cpu_c1().
> What's your opinion on it?

See below for a little bit of information about this trace. I'm a little bit puzzled here: did this crash/hang happen with your resume-from-failed-suspension patch applied, or is this trace from before applying your patch?

> ======================stack start==============================
> (kgdb) target remote 9.61.1.166:123456
> Remote debugging using 9.61.1.166:123456
> [New Thread 0] [Switching to Thread 0]
> 0xffffffff80d2edc6 in map_table (pa=18446744073709551614, offset=0, 
>     sig=0xf "$0f00000$2430663$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343"...)
>     at /usr/src/sys/amd64/acpica/acpi_machdep.c:135
> 135                     pmap_kenter(va, pa);

Hm, I don't think so; here the pmap_kenter happens in the function map_table, which suggests that acpi_cpu_idle was trying to fetch some information from an ACPI table, maybe?

> Current language:  auto; currently minimal
> (kgdb) bt
> #0  0xffffffff80d2edc6 in map_table (pa=18446744073709551614, offset=0, 
>     sig=0xf "$0f00000$2430663$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343"...)
>     at /usr/src/sys/amd64/acpica/acpi_machdep.c:135
> #1  0xffffffff8036ca3a in acpi_cpu_idle (sbt=<value optimized out>) at 
> /usr/src/sys/dev/acpica/acpi_cpu.c:1038

This doesn't seem to make much sense: the trace points to a call to acpi_cpu_c1 (which is basically a hlt instruction), yet the debugger seems to think it's calling map_table. So either the symbols are out of sync or there's something weird going on here.

> #2  0xffffffff80d352ff in cpu_idle (busy=-2123763712) at 
> /usr/src/sys/amd64/amd64/machdep.c:828
> #3  0xffffffff80d353a0 in cpu_idle_wakeup (cpu=53412) at 
> /usr/src/sys/amd64/amd64/machdep.c:852
> #4  0xffffffff80975ee5 in sched_idletd (dummy=<value optimized out>) 
> at /usr/src/sys/kern/sched_ule.c:808
> #5  0xffffffff80912e7a in fork_exit (callout=0xffffffff80975d10 <sched_idletd+160>, arg=0x0, frame=0xfffffe007afadc00)
>     at /usr/src/sys/kern/kern_fork.c:1018
> #6  0xffffffff80d31cde in Xint0x80_syscall () at ia32_exception.S:72
> #7  0x0000000000000000 in ?? ()
> (kgdb) info threads 
>   4 Thread 3  0xffffffff80d2edc6 in map_table (pa=18446744073709551607, offset=0, 
>     sig=0xf "$0f00000$2430663$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343"...)
>     at /usr/src/sys/amd64/acpica/acpi_machdep.c:135
>   3 Thread 2  0xffffffff80d2edc6 in map_table (pa=18446744073709551611, offset=0, 
>     sig=0xf "$0f00000$2430663$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343"...)
>     at /usr/src/sys/amd64/acpica/acpi_machdep.c:135
>   2 Thread 1  0xffffffff80d2edc6 in map_table (pa=18446744073709551613, offset=0, 
>     sig=0xf "$0f00000$2430663$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343"...)
>     at /usr/src/sys/amd64/acpica/acpi_machdep.c:135
> * 1 Thread 0  0xffffffff80d2edc6 in map_table (pa=18446744073709551614, offset=0, 
>     sig=0xf "$0f00000$2430663$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343$2432343"...)
>     at /usr/src/sys/amd64/acpica/acpi_machdep.c:135
> ======================stack end==============================

Roger.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: monitor need wait until stop_all_proc has finished.patch
Type: application/octet-stream
Size: 2932 bytes
Desc: monitor need wait until stop_all_proc has finished.patch
URL: <http://lists.freebsd.org/pipermail/freebsd-xen/attachments/20161227/03701aa7/attachment.obj>

