Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
Adrian Chadd
adrian at freebsd.org
Fri May 25 00:56:01 UTC 2012
Hi,
You guys now absolutely, positively have enough information for a PR.
It's still not clear whether it's a device/interrupt layer issue in
FreeBSD, or whether VMware is doing something wrong with how it
implements shared interrupts, or a bit of both.
Adrian
On 24 May 2012 13:54, dane foster <dene at ilovedene.com> wrote:
> Hey all,
>
> On 25/05/2012, at 1:47 AM, Mark Felder wrote:
>
>> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd <adrian at freebsd.org> wrote:
>>
>>> Hi,
>>>
>>> can you please, -please- file a PR? And place all of the above
>>> information in it so we don't lose it?
>>>
>>
>> I'd be glad to post a PR and assist in helping to get it permanently fixed. I certainly don't want this data to get lost and honestly our business uses FreeBSD on VMWare so much that we really need a permanent fix as much as anyone else :-)
>>
>> The reason I've hesitated to post a PR so far is that I didn't have any truly useful or concrete evidence of where the problem lies. After Dane Foster contacted me and told me he could recreate the crash on demand with his workload, it was easier to narrow things down. The suggestion that it was an interrupt issue (possibly from Bjoern Zeeb?), together with Dane's discovery that his crashes ceased when em0 and mpt0 share an IRQ but em0 is completely unused, is strong evidence in favor of the interrupt theory.
>>
>> Dane, what's the status on your end? Has your fix still been successful? Is it also stable if you simply set hint.mpt.0.msi_enable="1" ?
>>
>
> The situation I've got that's stable now is:
>
> hw.pci.enable_msi="0"
> hw.pci.enable_msix="0"
>
> in /boot/loader.conf
>
> and:
>
> samael:~:% vmstat -i                                    [ 6:31PM]
> interrupt                      total       rate
> irq1: atkbd0                       6          0
> irq18: em0 mpt0              3061100         15
> irq19: em1                   6891706         35
> cpu0: timer                166383735        868
> cpu1: timer                166382123        868
> cpu3: timer                166382123        868
> cpu2: timer                166382121        868
> Total                      675482914       3525
>
> Not using em0. This works for 8 (FreeBSD samael.slush.ca 8.3-STABLE FreeBSD 8.3-STABLE #1: Mon May 7 11:51:03 NZST 2012 root at samael.slush.ca:/usr/obj/usr/src/sys/DENE amd64).
>
> Neither of those settings on its own seems to stop it from happening.
>
> The 9 box I've tried this on still hangs almost every time I run HandBrake, no matter whether MSI/MSI-X is enabled or whether I have separate IRQs for mpt0 and em0/em1.
>
> I can cause the hang mostly on demand, but I'm not quite sure what information to provide from the hung system. If somebody can let me know what they need, including root access, I can make that happen.
>
> Cheers,
>
> Dane
>
>
>
>>
>> Thanks!
>
>
>
>
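For anyone wanting to check whether their own guest has devices sharing an
interrupt line, as em0 and mpt0 do on irq18 in the vmstat -i output quoted
above, here is a minimal sketch. The function name shared_irqs is made up
for illustration, and the script assumes the FreeBSD 8.x "irqN: dev [dev ...]
total rate" line layout shown in that output; other releases may format the
lines differently.

```shell
#!/bin/sh
# shared_irqs (hypothetical helper): read `vmstat -i` output on stdin and
# print every irq line that names more than one device.
# Assumes each irq line is "irqN: dev [dev ...] total rate", so a line with
# more than four whitespace-separated fields has at least two devices.
shared_irqs() {
    awk '/^irq[0-9]+:/ && NF > 4 {
        printf "%s", $1
        # Fields 2 .. NF-2 are device names; the last two are total and rate.
        for (i = 2; i <= NF - 2; i++) printf " %s", $i
        printf "\n"
    }'
}

# Demo using the interrupt lines quoted in this thread.
shared_irqs <<'EOF'
interrupt total rate
irq1: atkbd0 6 0
irq18: em0 mpt0 3061100 15
irq19: em1 6891706 35
cpu0: timer 166383735 868
Total 675482914 3525
EOF
```

On a live system the same function would be fed with "vmstat -i | shared_irqs".
For the sample above it prints only the irq18 line with em0 and mpt0.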