Recent MFC to 7 causes crash on VMware ESXi

John Baldwin jhb at freebsd.org
Mon Feb 8 15:33:17 UTC 2010


On Monday 08 February 2010 9:56:36 am Kostik Belousov wrote:
> On Mon, Feb 08, 2010 at 09:49:00AM -0500, John Baldwin wrote:
> > On Saturday 06 February 2010 4:47:16 pm Tom McLaughlin wrote:
> > > John Baldwin wrote, On 02/05/2010 08:27 AM:
> > > > On Thursday 04 February 2010 10:00:55 pm Tom McLaughlin wrote:
> > > >> Hi all, a recent MFC to 7-STABLE has started to cause issues for my VMs
> > > >> on VMware ESXi 3.5u4.  After loading the mpt driver for the LSI disk
> > > >> controller the VM just shuts off.  The workaround is to change the disk
> > > >> controller to the BusLogic type.  Still, it used to work up until last
> > > >> week.  The change was made around January 26th and based on the commits
> > > >> that day I'm guessing it's either r203047 or r203073
> > > >>
> > > >> I have the same issue with both amd64 and i386 VMs.  This affects HEAD
> > > >> and 8-STABLE as well and first affected HEAD over the summer.  (I just
> > > >> worked around it and went about my business at the time. :-/)  I've
> > > >> attached a dmesg from a kernel before the problem and one from after it
> > > >> started.
> > > > 
> > > > What if you set 'hw.clflush_disable=1' from the loader?
> > > > 
> > > 
> > > Yes, that corrected it on all my VMs.  I've talked to people on ESXi 4
> > > and they do not see the problem.  I have yet to try 3.5u5 to see if this
> > > is a non-issue.  3.5 will be supported for awhile longer from VMware.
> > > I'm going to try upgrading the box during the week.
> > 
> > I believe folks had to do this on HEAD/8.x as well.  Perhaps we can 
> > automatically disable clflush if we are executing under VMware or Xen:
> > 
> > Index: amd64/amd64/initcpu.c
> > ===================================================================
> > --- amd64/amd64/initcpu.c	(revision 203430)
> > +++ amd64/amd64/initcpu.c	(working copy)
> > @@ -177,17 +177,16 @@
> >  	if ((cpu_feature & CPUID_CLFSH) != 0)
> >  		cpu_clflush_line_size = ((cpu_procinfo >> 8) & 0xff) * 8;
> >  	/*
> > -	 * XXXKIB: (temporary) hack to work around traps generated when
> > -	 * CLFLUSHing APIC registers window.
> > +	 * XXXKIB: (temporary) hack to work around traps generated
> > +	 * when CLFLUSHing APIC registers window under virtualization
> > +	 * environments.
> >  	 */
> >  	TUNABLE_INT_FETCH("hw.clflush_disable", &hw_clflush_disable);
> > -	if (cpu_vendor_id == CPU_VENDOR_INTEL && !(cpu_feature & CPUID_SS) &&
> > -	    hw_clflush_disable == -1)
> > +	if (vm_guest != 0 /* VM_GUEST_NO */ && hw_clflush_disable == -1)
> >  		cpu_feature &= ~CPUID_CLFSH;
> >  	/*
> >  	 * Allow to disable CLFLUSH feature manually by
> > -	 * hw.clflush_disable tunable.  This may help Xen guest on some AMD
> > -	 * CPUs.
> > +	 * hw.clflush_disable tunable.
> >  	 */
> >  	if (hw_clflush_disable == 1)
> >  		cpu_feature &= ~CPUID_CLFSH;
> > Index: i386/i386/initcpu.c
> > ===================================================================
> > --- i386/i386/initcpu.c	(revision 203430)
> > +++ i386/i386/initcpu.c	(working copy)
> > @@ -724,17 +724,16 @@
> >  	if ((cpu_feature & CPUID_CLFSH) != 0)
> >  		cpu_clflush_line_size = ((cpu_procinfo >> 8) & 0xff) * 8;
> >  	/*
> > -	 * XXXKIB: (temporary) hack to work around traps generated when
> > -	 * CLFLUSHing APIC registers window.
> > +	 * XXXKIB: (temporary) hack to work around traps generated
> > +	 * when CLFLUSHing APIC registers window under virtualization
> > +	 * environments.
> >  	 */
> >  	TUNABLE_INT_FETCH("hw.clflush_disable", &hw_clflush_disable);
> > -	if (cpu_vendor_id == CPU_VENDOR_INTEL && !(cpu_feature & CPUID_SS) &&
> > -	    hw_clflush_disable == -1)
> > +	if (vm_guest != 0 /* VM_GUEST_NO */ && hw_clflush_disable == -1)
> >  		cpu_feature &= ~CPUID_CLFSH;
> >  	/*
> >  	 * Allow to disable CLFLUSH feature manually by
> > -	 * hw.clflush_disable tunable.  This may help Xen guest on some AMD
> > -	 * CPUs.
> > +	 * hw.clflush_disable tunable.
> >  	 */
> >  	if (hw_clflush_disable == 1)
> >  		cpu_feature &= ~CPUID_CLFSH;
> 
> It might be better to "or" the old condition, i.e. Intel without SS, with
> the new one, vm_guest != 0, instead of replacing the old one?

I thought the old condition only happened under VMware?
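
For comparison, here is a rough userland sketch of the "or" variant suggested
above. It is not a tested kernel patch. clflush_policy() is a hypothetical
helper invented for illustration, and the #define values are stand-ins for the
kernel's own definitions (CPUID_CLFSH, CPUID_SS, CPU_VENDOR_INTEL,
VM_GUEST_NO); the real code in initcpu.c modifies the global cpu_feature in
place.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in values for illustration only; the kernel headers define the real ones. */
#define CPUID_CLFSH      0x00080000u
#define CPUID_SS         0x08000000u
#define CPU_VENDOR_INTEL 0x8086
#define VM_GUEST_NO      0

/*
 * Hypothetical helper: return cpu_feature with CLFSH cleared when either
 * the old heuristic (Intel without SS) or the new one (running as a guest)
 * applies and the tunable was left at its default of -1, or when the
 * tunable explicitly asks for CLFLUSH to be disabled.
 */
static uint32_t
clflush_policy(uint32_t cpu_feature, int cpu_vendor_id, int vm_guest,
    int hw_clflush_disable)
{
	bool old_cond = cpu_vendor_id == CPU_VENDOR_INTEL &&
	    !(cpu_feature & CPUID_SS);
	bool new_cond = vm_guest != VM_GUEST_NO;

	if ((old_cond || new_cond) && hw_clflush_disable == -1)
		cpu_feature &= ~CPUID_CLFSH;
	/* Manual override through the hw.clflush_disable tunable. */
	if (hw_clflush_disable == 1)
		cpu_feature &= ~CPUID_CLFSH;
	return (cpu_feature);
}
```

With this form, setting hw.clflush_disable=0 from the loader would still keep
CLFLUSH enabled in either case, same as with the patch as posted.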

-- 
John Baldwin

