Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
nightrecon at hotmail.com
Mon May 21 18:48:00 UTC 2012
Mark Felder wrote:
> OK guys I've been talking with another user who can recreate this crash
> and the last bit of information we've learned seems to be leaning towards
> interrupts/IRQ issues like someone (bz@ perhaps?) suggested.
> I'm still trying to test this myself, but the other user was able to
> recreate my crash pretty much on demand. The fix was to not use the first
> NIC in the VM because it will always share an IRQ with mpt0. Once mpt0 is
> on its own the crash does not seem to be reproducible anymore.
I am not anywhere near your level in this subject area. My understanding is
limited and I do not have the in-depth experience. However, please allow me to
possibly add an idea or two.
I am shakedown testing FreeBSD 9 in a VirtualBox VM - so there is definitely
a degree of 'apples vs oranges' present. VirtualBox (as I am using it) is a
userland app and not a bare-metal hypervisor. When I set up the VM I chose
to use the synthetic SAS controller as that would best represent actual
server hardware in my workplace, along with the corresponding mpt driver in
the FreeBSD 9 guest.
Please note some of the following for comparative purposes only:
Event timer "LAPIC" quality 400
ACPI APIC Table: <VBOX VBOXAPIC>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
ioapic0 <Version 1.1> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <VBOX VBOXXSDT> on motherboard
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
Timecounter "HPET" frequency 14318180 Hz quality 950
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
em0: <Intel(R) PRO/1000 Legacy Network Connection 1.0.3> port 0xd000-0xd007
mem 0xf0000000-0xf001ffff irq 19 at device 3.0 on pci0
mpt0: <LSILogic SAS/SATA Adapter> port 0xd100-0xd1ff mem
0xf0820000-0xf083ffff,0xf0840000-0xf085ffff irq 22 at device 22.0 on pci0
mpt0: MPI Version=126.96.36.199
em0 is the first Intel NIC in VBox; notice how it and mpt0 come up
with distinctly different IRQs.
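For anyone who wants to run the same comparison on their own boot messages, here is a minimal sketch in Python that groups devices by IRQ and reports any IRQ claimed by more than one device. The sample text is the two device lines from the listing above, with the wrapped lines joined back together; the function names are my own, and the regular expression only assumes the "irq N at device S.F on pciN" shape seen in these lines.

```python
import re
from collections import defaultdict

# The two device attach lines from the dmesg listing above, with each
# wrapped line joined back onto a single line.
DMESG = """\
em0: <Intel(R) PRO/1000 Legacy Network Connection 1.0.3> port 0xd000-0xd007 mem 0xf0000000-0xf001ffff irq 19 at device 3.0 on pci0
mpt0: <LSILogic SAS/SATA Adapter> port 0xd100-0xd1ff mem 0xf0820000-0xf083ffff,0xf0840000-0xf085ffff irq 22 at device 22.0 on pci0
"""

# Assumed line shape: "<name>: ... irq <N> at device <slot>.<func> on <bus>"
ATTACH_RE = re.compile(r"^(\w+): .*\birq (\d+) at device (\d+)\.(\d+) on (\w+)$")


def devices_by_irq(text):
    """Map each IRQ number to the list of device names attached to it."""
    by_irq = defaultdict(list)
    for line in text.splitlines():
        match = ATTACH_RE.match(line)
        if match:
            by_irq[int(match.group(2))].append(match.group(1))
    return dict(by_irq)


def shared_irqs(text):
    """Return only the IRQs that more than one device is attached to."""
    return {irq: devs for irq, devs in devices_by_irq(text).items() if len(devs) > 1}


if __name__ == "__main__":
    print(devices_by_irq(DMESG))
    print(shared_irqs(DMESG))
```

On a guest showing the problem described in the quoted message, I would expect em0 and mpt0 to appear together under one IRQ in the second result; on my VirtualBox guest they do not.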
A sysctl -a |grep mpt returns this:
dev.mpt.0.%desc: LSILogic SAS/SATA Adapter
dev.mpt.0.%location: slot=22 function=0
dev.mpt.0.%pnpinfo: vendor=0x1000 device=0x0054 subvendor=0x1000
Very curious how 'irq 22 at device 22.0' and 'dev.mpt.0.%location: slot=22'
all match with a '22'.
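As far as I understand it, the "device 22.0" in the boot message and the "slot=22 function=0" in the sysctl output are two reports of the same PCI slot and function, so those two matching is expected; only the IRQ number is a separate value. A minimal sketch in Python that pulls the numbers out of both strings and compares them, assuming only the formats shown above (the function names are my own):

```python
import re

# Strings copied from the sysctl and dmesg output above.
SYSCTL_LOCATION = "dev.mpt.0.%location: slot=22 function=0"
DMESG_ATTACH = "irq 22 at device 22.0 on pci0"


def parse_location(line):
    """Return (slot, function) from a sysctl %location line."""
    match = re.search(r"slot=(\d+) function=(\d+)", line)
    if match is None:
        raise ValueError("no slot/function found in: " + line)
    return int(match.group(1)), int(match.group(2))


def parse_attach(fragment):
    """Return (irq, slot, function) from a dmesg attach fragment."""
    match = re.search(r"irq (\d+) at device (\d+)\.(\d+)", fragment)
    if match is None:
        raise ValueError("no irq/device found in: " + fragment)
    return int(match.group(1)), int(match.group(2)), int(match.group(3))


def same_pci_address(location_line, attach_fragment):
    """True when both strings name the same PCI slot and function."""
    _irq, slot, function = parse_attach(attach_fragment)
    return parse_location(location_line) == (slot, function)


if __name__ == "__main__":
    irq, slot, function = parse_attach(DMESG_ATTACH)
    print("irq", irq, "slot", slot, "function", function)
    print("same PCI address:", same_pci_address(SYSCTL_LOCATION, DMESG_ATTACH))
```

Whether the IRQ also landing on 22 means anything is something I cannot say from this output alone.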
The obvious thing here is we are comparing a userland Vbox guest to a VMWare
hypervisor. From what little I know concerning any of this, to me it sounds
vaguely like an APIC, LAPIC, and I/O APIC bug. There are known bugs with
respect to the BIOS setting up IRQ routing incorrectly, and/or providing
incorrect ACPI and/or IMS tables to operating systems.
The parallel in this case would be the logical or synthetic so-called "BIOS"
that the VMWare hypervisor presents to the FreeBSD guest at guest boot time.
In this case the truest fix for the problem would fall to VMWare, e.g. if the
hypervisor is setting up tables in such a way as to create the shared IRQ
problem in the first place.
All of this assumes my idea/theory/potential hypothesis has any merit. I do
not understand why any of this would be different depending upon which guest
is installed, but I also know absolutely nothing about VMWare hypervisor
internals.
> Is there any other way we can make mpt0 get its own dedicated IRQ without
> having to do this? The problem is that it causes us to have to make
> rc.conf changes, pf.conf changes, and who knows what other software could
> be on these machines that is trying to bind to a specific NIC...
Very possibly Andrew's device.hints is probably your best shot at a
Wish you the best of luck in any case. You have done quite a job in
researching this problem even to arrive at this point. Thank-you for that,
and for sharing it with the community. Even though I can't really offer the
kind of assistance you require, I have followed along with interest for self