Illumos boot

Tycho Nightingale tycho.nightingale at pluribusnetworks.com
Tue Oct 13 21:15:45 UTC 2015


Hi,

On Oct 13, 2015, at 9:35 AM, Matt Churchyard wrote:
> 
>> On Oct 13, 2015, at 7:17 AM, Matt Churchyard via freebsd-virtualization <freebsd-virtualization at freebsd.org> wrote:
> 
>> In my quest to continue expanding guest support in my vm-bhyve utility (See https://github.com/churchers/vm-bhyve :) ), I've found the Windows support pretty solid once I got clear on the slot requirements. I'm now trying an OS that requires CSM (Illumos) but unfortunately I'm currently struggling to get it to boot up correctly.
>> 
>> Here's an example of the command I'm generating at the moment (This is running on an Intel Core-i3):
>> 
>> bhyve -c 2 -m 2G -s 0,hostbridge -s 31,lpc \
>>     -s 3,ahci-cd,/data/vm/.iso/smartos-latest.iso \
>>     -s 4:0,ahci-hd,/data/vm/smartos/disk0.img \
>>     -s 5:0,virtio-net,tap0 \
>>     -l com1,stdio -l com2,/dev/nmdm2A \
>>     -H -l bootrom,/data/vm/.config/BHYVE_UEFI_CSM.fd \
>>     smartos
>> 
>> I have com1 set to stdio so I can easily watch the output as it runs.
>> It tends to get as far as "Legacy INT19 Boot...", then fall over.
>> Depending on whether I put the network interface directly in the slot after the HDD, I seem to get different errors -
>> 
>> slot 3 - cd
>> slot 4 - hdd
>> slot 5 - virtio-net
>> 
>> panic[cpu0]/thread=ffffff01457cdb40: BAD TRAP: type=e (#pf Page fault) rp=ffffff0004a69a60 addr=40 occurred in module "genunix" due to a NULL pointer dereference
>> 
>> slot 3 - cd
>> slot 4 - hdd
>> slot 7 - virtio-net
>> 
>> panic[cpu1]/thread=ffffff0004002c40: BAD TRAP: type=d (#gp General protection) rp=ffffff0004002740 addr=0
>> 
>> On com2 I see the boot menu, then one and a half lines of dots. The second line of dots stops about 2/3 of the way across.
> 
>> Have you tried booting illumos in verbose mode - edit the grub command line and provide '-v'.  That may give you a better backtrace than a program counter.
> 
> This is what I get from a verbose boot:
> 
> Bhyve-HandleProtocol: Copying DevPath: PciRoot(0x0)/Pci(0x3,0x0)/Sata(0x0,0x0,0x0) [32]
> Legacy INT19 Boot...
> cpu0: x86 (chipid 0x0 GenuineIntel 206A7 family 6 model 42 step 7 clock 3109 MHz)
> cpu0: Intel(r) Core(tm) i3-2100 CPU @ 3.10GHz
> pseudo-device: stmf_sbd0
> stmf_sbd0 is /pseudo/stmf_sbd at 0
> pseudo-device: lofi0
> lofi0 is /pseudo/lofi at 0
> pseudo-device: devinfo0
> devinfo0 is /pseudo/devinfo at 0
> acpinex0 at root
> acpinex0 is /fw
> iscsi0 at root
> iscsi0 is /iscsi
> xsvc0 at root: space 0 offset 0
> xsvc0 is /xsvc at 0,0
> acpinex: sb at 0, acpinex1
> acpinex1 is /fw/sb at 0
> pseudo-device: pseudo1
> pseudo1 is /pseudo/zconsnex at 1
> pseudo-device: pseudo2
> pseudo2 is /pseudo/zfdnex at 2
> /pci at 0,0/pci8086,2821 at 3 :
>        SATA CD/DVD (ATAPI) device at port 0
>        model BHYVE SATA DVD ROM
>        firmware 001
>        serial number BHYVE-EA14-A68A-54FA
>        supported features:
>         DMA
>        SATA Gen3 signaling speed (6.0Gbps)
> pseudo-device: llc10
> llc10 is /pseudo/llc1 at 0
> pseudo-device: power0
> power0 is /pseudo/power at 0
> pseudo-device: ramdisk1024
> ramdisk1024 is /pseudo/ramdisk at 1024
> pseudo-device: ucode0
> ucode0 is /pseudo/ucode at 0
> pseudo-device: zfs0
> zfs0 is /pseudo/zfs at 0
> pseudo-device: srn0
> srn0 is /pseudo/srn at 0
> pseudo-device: dtrace0
> dtrace0 is /pseudo/dtrace at 0
> pseudo-device: dcpc0
> dcpc0 is /pseudo/dcpc at 0
> pseudo-device: fasttrap0
> fasttrap0 is /pseudo/fasttrap at 0
> pseudo-device: fbt0
> fbt0 is /pseudo/fbt at 0
> pseudo-device: profile0
> profile0 is /pseudo/profile at 0
> pseudo-device: lockstat0
> lockstat0 is /pseudo/lockstat at 0
> pseudo-device: sdt0
> sdt0 is /pseudo/sdt at 0
> pseudo-device: systrace0
> systrace0 is /pseudo/systrace at 0
> pseudo-device: ipd0
> ipd0 is /pseudo/ipd at 0
> pseudo-device: stmf0
> stmf0 is /pseudo/stmf at 0
> sd0 at ahci0: target 0 lun 0
> sd0 is /pci at 0,0/pci8086,2821 at 3/cdrom at 0,0
> pseudo-device: fssnap0
> fssnap0 is /pseudo/fssnap at 0
> /pci at 0,0/pci8086,2821 at 3/cdrom at 0,0 (sd0) online
> /pci at 0,0/pci8086,2821 at 4 :
>        SATA disk device at port 0
>        model BHYVE SATA DISK
>        firmware 001
>        serial number BHYVE-3083-1AF1-1754
>        supported features:
>         48-bit LBA, DMA, Native Command Queueing
>        SATA Gen3 signaling speed (6.0Gbps)
>        Supported queue depth 32
>        capacity = 62914560 sectors
> WARNING: kvm: no hardware support
> 
> pseudo-device: pool0
> pool0 is /pseudo/pool at 0
> pseudo-device: bpf0
> bpf0 is /pseudo/bpf at 0
> sd1 at ahci1: target 0 lun 0
> sd1 is /pci at 0,0/pci8086,2821 at 4/disk at 0,0
> pseudo-device: pm0
> pm0 is /pseudo/pm at 0
> pseudo-device: nsmb0
> nsmb0 is /pseudo/nsmb at 0
> pseudo-device: tap0
> tap0 is /pseudo/tap at 0
> /pci at 0,0/pci8086,2821 at 4/disk at 0,0 (sd1) online
> NOTICE: vioif0: Got MAC address from host: e4:94:1:0:ff:ff
> pseudo-device: tun0
> tun0 is /pseudo/tun at 0
> pseudo-device: lx_systrace0
> lx_systrace0 is /pseudo/lx_systrace at 0
> 
> panic[cpu0]/thread=ffffff0002566c40: BAD TRAP: type=d (#gp General protection) rp=ffffff00025664c0 addr=20
> 
> sched: #gp General protection
> addr=0x20
> pid=0, pc=0xfffffffff80d375a, sp=0xffffff00025665b0, eflags=0x10282
> cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 406b8<osxsav,xmme,fxsr,pge,pae,pse,de>
> cr2: fed3b5accr3: 1dc00000cr8: c
> 
>        rdi: 7f1a90ffffff00c3 rsi: ffffff00c33321f8 rdx: ffffff00c37f5828
>        rcx: ffffff00c3868603  r8: ffffff00ca7aa600  r9:             2ba6
>        rax:                0 rbx: ffffff00c30d0ef0 rbp: ffffff00025665c0
>        r10: fffffffffb8554c4 r11:                1 r12:               1f
>        r13: ffffff00c319f880 r14:               10 r15:               20
>        fsb:                0 gsb: fffffffffbc326a0  ds:               4b
>         es:               4b  fs:                0  gs:              1c3
>        trp:                d err:                0 rip: fffffffff80d375a
>         cs:               30 rfl:            10282 rsp: ffffff00025665b0
>         ss:               38
> 
> This is the log of the bhyve options used (apart from 1 cpu, 1G ram)
> 
> Oct 13 14:22:43:  [bhyve devices: -s 0,hostbridge -s 31,lpc -s 4:0,ahci-hd,/data/vm/smartos/disk0.img -s 7:0,virtio-net,tap1]
> Oct 13 14:22:43:  [bhyve console: -l com1,/dev/nmdm1A -l com2,/dev/nmdm2A]
> Oct 13 14:22:43:  [bhyve options: -Hw -l bootrom,/data/vm/.config/BHYVE_UEFI_CSM.fd]
> Oct 13 14:22:43:  [bhyve iso device: -s 3:0,ahci-cd,/data/vm/.iso/smartos-latest.iso]

Ouch, even with the additional verbosity, the output isn't very insightful.  All I can glean is that you are reasonably far along in the boot process.  You could try to run with KMDB (-k boot option) and do some disassembly around that program counter to see if any specific module is implicated.

Alternatively, I'd omit the network device and see how far that gets you.
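
As a rough sketch of that test, the invocation from the original report with the virtio-net slot dropped would look something like the following. Every flag and path is copied from the quoted command; the bhyve_cmd wrapper is only a hypothetical helper that prints the command for review, so remove the echo to actually launch the guest.

```shell
#!/bin/sh
# Sketch only: the reported bhyve command line minus the virtio-net device.
# Paths and the VM name come from the original report; adjust for your setup.
# bhyve_cmd is a hypothetical helper that prints the command instead of running it.
bhyve_cmd() {
    echo bhyve -c 2 -m 2G -s 0,hostbridge -s 31,lpc \
        -s 3,ahci-cd,/data/vm/.iso/smartos-latest.iso \
        -s 4:0,ahci-hd,/data/vm/smartos/disk0.img \
        -l com1,stdio -l com2,/dev/nmdm2A \
        -H -l bootrom,/data/vm/.config/BHYVE_UEFI_CSM.fd \
        smartos
}

bhyve_cmd
```

If the guest gets past the point of the panic with this configuration, that points at the network device rather than the AHCI slots.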

>> Interestingly, my code normally puts the CD after the HDD, which Windows seems happy with as long as the slots are consecutive.
>> In SmartOS this gives me a different error:
>> 
>> slot 3 - hdd
>> slot 4 - cd
>> slot 5 - virtio-net
>> 
>> PlatformBdsBootFail
>> Boot Failed. Harddisk 1
>> !!!! Find PE image /home/grehan/proj/stock_edk2/Build/BhyveX64/DEBUG_GCC48/X64/UefiCpuPkg/CpuDxe/CpuDxe/DEBUG/CpuDxe.dll (ImageBase=000000007F8DC000, EntryPoint=000000007F8DC2AF) !!!!
> 
>> This error is from the UEFI code.  It implies that the CSM boot failed or was never invoked.  If the HDD isn't bootable, yet the CD is, that is the most likely source as the CSM assumes the first block device it encounters is the desired boot source.
> 
> Ok, so the boot semantics are currently different between the CSM and non-CSM firmware? CSM will try and boot the first device and fail if it's not bootable, whereas non-CSM will always boot CD if it's bootable, regardless of order (from Windows instructions).


Yes, the boot semantics are different.  The UEFI (non-CSM) path is somewhat more tolerant of trying the next device in the boot-order if the current one is deemed "unbootable".  The CSM path is cruder in that it just searches for the first block device and goes for it.  If that device is unbootable, it will fall through to the UEFI path somewhat ungracefully, as UEFI isn't really expecting CSM to ever return.
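
To make the difference concrete, here is a toy model of the two selection policies in plain sh. It is an illustration of the behaviour described above, not firmware code: csm_pick, uefi_pick and the "name:bootable" device notation are all made up for this sketch, and the real UEFI boot manager is more involved than a simple loop.

```shell
#!/bin/sh
# Toy model only; not taken from the bhyve firmware sources.
# Each argument is a device in slot order, written as "name:yes" or "name:no"
# depending on whether it holds a bootable image (hypothetical notation).

csm_pick() {
    # CSM-style: commit to the first block device, bootable or not.
    first=$1
    name=${first%%:*}
    flag=${first##*:}
    if [ "$flag" = yes ]; then
        echo "boot $name"
    else
        echo "fail $name"
    fi
}

uefi_pick() {
    # UEFI-style: walk the boot order and skip unbootable devices.
    for dev in "$@"; do
        if [ "${dev##*:}" = yes ]; then
            echo "boot ${dev%%:*}"
            return 0
        fi
    done
    echo "fail none"
    return 1
}

# Blank HDD in the lower slot, bootable CD after it.
csm_pick hdd:no cd:yes
uefi_pick hdd:no cd:yes
```

With the HDD first, the CSM-style policy gives up on the blank disk while the UEFI-style policy moves on to the CD, which is why putting the CD in the lower slot matters for the CSM firmware.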

Tycho

