FreeBSD PVH guest support
Roger Pau Monné
roger.pau at citrix.com
Mon Oct 28 13:35:11 UTC 2013
Hello,
The Xen community is working on a new virtualization mode (or maybe I
should say an extension of HVM) to be able to run PV guests inside HVM
containers without requiring a device-model (Qemu). One of the
advantages of this new virtualization mode is that it is now much
easier to port guests to run under it (compared to pure PV guests).
Given that FreeBSD already supports PVHVM, adding PVH support is quite
easy: we only need some glue for the PV entry point, plus hooks that let
some early init functions (like fetching the e820 map or starting the
APs) diverge from the native implementation.
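To illustrate the idea of diverging early init functions, here is a minimal userland sketch of a table of function pointers that defaults to the native implementations and is overridden when booting as a PV guest. All names in it (early_ops, pv_set_early_ops, boot_memory_source and the returned labels) are hypothetical and simplified for illustration; the structures actually added by the patch are init_ops and cpu_ops, shown in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a table of early-init hooks. */
struct early_ops {
    const char *(*fetch_memory_map)(void); /* returns a label for the source */
    int (*start_aps)(int ncpus);           /* returns the number of APs started */
};

/* Stand-ins for the bare-metal implementations. */
static const char *native_fetch_memory_map(void) { return "native"; }
static int native_start_aps(int ncpus) { return ncpus - 1; }

/* Stand-ins for the PV implementations. */
static const char *pv_fetch_memory_map(void) { return "pv"; }
static int pv_start_aps(int ncpus) { return ncpus - 1; }

/* The table defaults to the native functions. */
static struct early_ops early_ops = {
    .fetch_memory_map = native_fetch_memory_map,
    .start_aps = native_start_aps,
};

/* Called once, very early, when the PV entry point was used. */
static void pv_set_early_ops(void)
{
    early_ops.fetch_memory_map = pv_fetch_memory_map;
    early_ops.start_aps = pv_start_aps;
}

/* Common init code only ever goes through the table. */
static const char *boot_memory_source(void)
{
    assert(early_ops.fetch_memory_map != NULL);
    return early_ops.fetch_memory_map();
}
```

With this layout the common boot path does not need Xen-specific conditionals: the PV entry point swaps the pointers once, and everything after that calls through the table.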
The attached patch contains all these changes, and allows an SMP FreeBSD
guest to fully boot (and AFAIK work) under this new PVH mode. The patch
can also be found on my git repo:
git://xenbits.xen.org/people/royger/freebsd.git pvh_v2
The patch touches quite a lot of the early init code, so I've CCed the
people who maintain those areas so they can review it.
In order to test it, and since the PVH changes are not yet merged into
upstream Xen, the use of a patched Xen is necessary. I've collected the
patches for PVH guest support from George Dunlap (v13) and fixed some
bugs on top of them. The tree can be found at:
git://xenbits.xen.org/people/royger/xen.git fix_pvh
For those curious, here is a dmesg of a FreeBSD PVH guest booting:
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
SMAP type=01 base=0000000000000000 len=0000000138800000
ACPI BIOS Error (bug): A valid RSDP was not found (20130823/tbxfroot-223)
APIC: Using the Xen PV enumerator.
SMP: Added CPU 0 (BSP)
SMP: Added CPU 2 (AP)
SMP: Added CPU 4 (AP)
SMP: Added CPU 6 (AP)
SMP: Added CPU 8 (AP)
SMP: Added CPU 10 (AP)
SMP: Added CPU 12 (AP)
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.0-CURRENT #420: Mon Oct 28 13:07:53 CET 2013
root at odin:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
WARNING: WITNESS option enabled, expect reduced performance.
Hypervisor: Origin = "XenVMMXenVMM"
Calibrating TSC clock ... TSC clock: 3066775691 Hz
CPU: Intel(R) Xeon(R) CPU W3550 @ 3.07GHz (3066.78-MHz K8-class CPU)
Origin = "GenuineIntel" Id = 0x106a5 Family = 0x6 Model = 0x1a Stepping = 5
Features=0x1fc98b75<FPU,DE,TSC,MSR,PAE,CX8,APIC,SEP,CMOV,PAT,CLFLUSH,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT>
Features2=0x80982201<SSE3,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,HV>
AMD Features=0x20100800<SYSCALL,NX,LM>
AMD Features2=0x1<LAHF>
real memory = 5242880000 (5000 MB)
Physical memory chunk(s):
0x0000000000010000 - 0x00000000001fffff, 2031616 bytes (496 pages)
0x0000000002708000 - 0x0000000130864fff, 5068148736 bytes (1237341 pages)
avail memory = 5035581440 (4802 MB)
INTR: Adding local APIC 2 as a target
INTR: Adding local APIC 4 as a target
INTR: Adding local APIC 6 as a target
INTR: Adding local APIC 8 as a target
INTR: Adding local APIC 10 as a target
INTR: Adding local APIC 12 as a target
FreeBSD/SMP: Multiprocessor System Detected: 7 CPUs
FreeBSD/SMP: 1 package(s) x 7 core(s)
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 2
cpu2 (AP): APIC ID: 4
cpu3 (AP): APIC ID: 6
cpu4 (AP): APIC ID: 8
cpu5 (AP): APIC ID: 10
cpu6 (AP): APIC ID: 12
XEN: CPU 0 has VCPU ID 0
XEN: CPU 1 has VCPU ID 1
XEN: CPU 2 has VCPU ID 2
XEN: CPU 3 has VCPU ID 3
XEN: CPU 4 has VCPU ID 4
XEN: CPU 5 has VCPU ID 5
XEN: CPU 6 has VCPU ID 6
x86bios: IVT 0x000000-0x0004ff at 0xfffff80000000000
x86bios: SSEG 0x010000-0x010fff at 0xfffffe012e79d000
x86bios: ROM 0x0a0000-0x0fefff at 0xfffff800000a0000
random device not loaded; using insecure entropy
ULE: setup cpu 0
ULE: setup cpu 1
ULE: setup cpu 2
ULE: setup cpu 3
ULE: setup cpu 4
ULE: setup cpu 5
ULE: setup cpu 6
Event-channel device installed.
snd_unit_init() u=0x00ff8000 [512] d=0x00007c00 [32] c=0x000003ff [1024]
feeder_register: snd_unit=-1 snd_maxautovchans=16 latency=5 feeder_rate_min=1 feeder_rate_max=2016000 feeder_rate_round=25
wlan: <802.11 Link Layer>
Hardware, VIA Nehemiah Padlock RNG: VIA Padlock RNG not present
Hardware, Intel IvyBridge+ RNG: RDRAND is not present
null: <null device, zero device>
Falling back to <Software, Yarrow> random adaptor
random: <Software, Yarrow> initialized
nfslock: pseudo-device
kbd0 at kbdmux0
module_register_init: MOD_LOAD (vesa, 0xffffffff80d21c60, 0) error 19
io: <I/O>
VMBUS: load
mem: <memory>
hpt27xx: RocketRAID 27xx controller driver v1.1
hptrr: RocketRAID 17xx/2xxx SATA controller driver v1.2
hptnr: R750/DC7280 controller driver v1.0
ACPI BIOS Error (bug): A valid RSDP was not found (20130823/tbxfroot-223)
ACPI: Table initialisation failed: AE_NOT_FOUND
ACPI: Try disabling either ACPI or apic support.
xenstore0: <XenStore> on motherboard
Grant table initialized
xc0: <Xen Console> on motherboard
xen_et0: <Xen PV Clock> on motherboard
Event timer "XENTIMER" frequency 1000000000 Hz quality 950
Timecounter "XENTIMER" frequency 1000000000 Hz quality 950
xen_et0: registered as a time-of-day clock (resolution 10000000us, adjustment 5.000000000s)
pvcpu0: <Xen PV CPU> on motherboard
pvcpu1: <Xen PV CPU> on motherboard
pvcpu2: <Xen PV CPU> on motherboard
pvcpu3: <Xen PV CPU> on motherboard
pvcpu4: <Xen PV CPU> on motherboard
pvcpu5: <Xen PV CPU> on motherboard
pvcpu6: <Xen PV CPU> on motherboard
legacy_pcib_identify: no bridge found, adding pcib0 anyway
pcib0 pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
pci0: domain=0, physical bus=0
cpu0 on motherboard
cpu1 on motherboard
cpu2 on motherboard
cpu3 on motherboard
cpu4 on motherboard
cpu5 on motherboard
cpu6 on motherboard
isa0: <ISA bus> on motherboard
qpi0: <QPI system bus> on motherboard
ex_isa_identify()
isa_probe_children: disabling PnP devices
isa_probe_children: probing non-PnP devices
fb: new array size 4
sc0: <System console> on isa0
sc0: MDA <16 virtual consoles, flags=0x100>
sc0: fb0, kbd0, terminal emulator: scteken (teken terminal)
vga0: <Generic ISA VGA> at port 0x3b0-0x3bb iomem 0xb0000-0xb7fff on isa0
isa_probe_children: probing PnP devices
Device configuration finished.
procfs registered
Timecounters tick every 1.000 msec
vlan: initialized, using hash tables with chaining
tcp_init: net.inet.tcp.tcbhashsize auto tuned to 65536
lo0: bpf attached
hpt27xx: no controller detected.
hptrr: no controller detected.
hptnr: no controller detected.
xenbusb_front0: <Xen Frontend Devices> on xenstore0
xenbusb_add_device: Device device/suspend/event-channel ignored. State 6
xn0: <Virtual Network Interface> at device/vif/0 on xenbusb_front0
xn0: bpf attached
xn0: Ethernet address: 00:16:3e:0b:a4:b1
xenbusb_back0: <Xen Backend Devices> on xenstore0
xctrl0: <Xen Control Device> on xenstore0
xn0: backend features: feature-sg feature-gso-tcp4
xbd0: 20480MB <Virtual Block Device> at device/vbd/51712 on xenbusb_front0
xbd0: features: flush, write_barrier
xbd0: synchronize cache commands enabled.
GEOM: new disk xbd0
random: unblocking device.
Netvsc initializing... SMP: AP CPU #5 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #6 Launched!
SMP: AP CPU #4 Launched!
TSC timecounter discards lower 1 bit(s)
Timecounter "TSC-low" frequency 1533387845 Hz quality -100
WARNING: WITNESS option enabled, expect reduced performance.
Trying to mount root from ufs:/dev/xbd0p2 []...
start_init: trying /sbin/init
Setting hostuuid: c9230f36-1a54-489e-877c-1d15b8f463e9.
Setting hostid: 0xd52252c7.
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Entropy harvesting: interrupts ethernet point_to_pointsha256: /kernel: No such file or directory
kickstart.
Starting file system checks:
/dev/xbd0p2: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/xbd0p2: clean, 2213647 free (17111 frags, 274567 blocks, 0.4% fragmentation)
Mounting local file systems:.
Writing entropy file:.
xn0: link state changed to DOWN
xn0: link state changed to UP
Starting Network: lo0 xn0.
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
inet 127.0.0.1 netmask 0xff000000
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
xn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=503<RXCSUM,TXCSUM,TSO4,LRO>
ether 00:16:3e:0b:a4:b1
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
media: Ethernet manual
status: active
Starting devd.
Starting dhclient.
DHCPDISCOVER on xn0 to 255.255.255.255 port 67 interval 7
DHCPOFFER from 172.16.1.1
DHCPREQUEST on xn0 to 255.255.255.255 port 67
DHCPACK from 172.16.1.1
bound to 172.16.1.149 -- renewal in 43200 seconds.
add net ::ffff:0.0.0.0: gateway ::1
add net ::0.0.0.0: gateway ::1
add net fe80::: gateway ::1
add net ff02::: gateway ::1
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib
32-bit compatibility ldconfig path: /usr/lib32
Creating and/or trimming log files.
Starting syslogd.
No core dumps found.
lock order reversal:
1st 0xfffffe012e861e28 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:3050
2nd 0xfffff80005b87c00 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:284
KDB: stack backtrace:
X_db_symbol_values() at X_db_symbol_values+0x10b/frame 0xfffffe012fb8c410
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe012fb8c4c0
witness_checkorder() at witness_checkorder+0xd23/frame 0xfffffe012fb8c550
_sx_xlock() at _sx_xlock+0x75/frame 0xfffffe012fb8c590
ufsdirhash_add() at ufsdirhash_add+0x3b/frame 0xfffffe012fb8c5d0
ufs_direnter() at ufs_direnter+0x688/frame 0xfffffe012fb8c690
ufs_vinit() at ufs_vinit+0x33f3/frame 0xfffffe012fb8c890
VOP_MKDIR_APV() at VOP_MKDIR_APV+0xf0/frame 0xfffffe012fb8c8c0
kern_mkdirat() at kern_mkdirat+0x1ff/frame 0xfffffe012fb8cae0
amd64_syscall() at amd64_syscall+0x265/frame 0xfffffe012fb8cbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe012fb8cbf0
--- syscall (136, FreeBSD ELF64, sys_mkdir), rip = 0x80092faaa, rsp = 0x7fffffffd788, rbp = 0x7fffffffdc70 ---
Clearing /tmp (X related).
Updating motd:.
Configuring syscons: keymap blanktime.
Performing sanity check on sshd configuration.
Starting sshd.
Starting cron.
Starting background file system checks in 60 seconds.
Mon Oct 28 13:22:52 CET 2013
FreeBSD/amd64 (Amnesiac) (xc0)
-------------- next part --------------
From 16de1566ada65e5838105870df576ab8258ed8b6 Mon Sep 17 00:00:00 2001
From: Roger Pau Monne <roger.pau at citrix.com>
Date: Mon, 14 Oct 2013 18:33:17 +0200
Subject: [PATCH] Xen x86 PVH support
This is still very experimental, and PVH support has not yet been
merged into upstream Xen.
PVH mode is basically a PV guest inside an HVM container, and shares
a great amount of code with PVHVM. The main difference is the way the
guest is started, PVH uses the PV start sequence, jumping directly
into the kernel entry point in long mode and with page tables set.
The main work of this patch consists in setting the environment as
similar as possible to what native FreeBSD expects, and then adding
hooks to the PV ops when necessary.
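As an illustration of what "setting the environment" involves, the following userland sketch models the boot page tables that hammer_time_xen builds in the diff below: every level 4 slot points to one level 3 page, every level 3 slot points to one level 2 page, and the level 2 page maps 1GB with 2MB pages. The fake physical addresses PT3_PA and PT2_PA, the helper names and the translate() walk are assumptions made for this sketch only; the flag values are the standard x86 page table bits.

```c
#include <assert.h>
#include <stdint.h>

#define NPTE        512
#define PG_V        0x001ULL /* valid */
#define PG_RW       0x002ULL /* writable */
#define PG_U        0x004ULL /* user accessible */
#define PG_PS       0x080ULL /* 2MB page */
#define PG_FRAME    0x000ffffffffff000ULL
#define PG_FRAME_2M 0x000fffffffe00000ULL
#define SIZE_2M     (2ULL * 1024 * 1024)

/* Made-up physical addresses for the model tables (sketch only). */
#define PT3_PA 0x2000ULL
#define PT2_PA 0x3000ULL

static uint64_t pt4[NPTE], pt3[NPTE], pt2[NPTE];

static void build_boot_tables(void)
{
    for (uint64_t i = 0; i < NPTE; i++) {
        /* Every level 4 slot points to the same level 3 page. */
        pt4[i] = PT3_PA | PG_V | PG_RW | PG_U;
        /* Every level 3 slot points to the same level 2 page. */
        pt3[i] = PT2_PA | PG_V | PG_RW | PG_U;
        /* Level 2 maps the first 1GB with 2MB pages. */
        pt2[i] = (i * SIZE_2M) | PG_V | PG_RW | PG_PS | PG_U;
    }
}

/* Software walk of the three levels for a 2MB mapping. */
static uint64_t translate(uint64_t va)
{
    uint64_t l4e = pt4[(va >> 39) & (NPTE - 1)];
    assert((l4e & PG_V) && (l4e & PG_FRAME) == PT3_PA);

    uint64_t l3e = pt3[(va >> 30) & (NPTE - 1)];
    assert((l3e & PG_V) && (l3e & PG_FRAME) == PT2_PA);

    uint64_t l2e = pt2[(va >> 21) & (NPTE - 1)];
    assert((l2e & PG_V) && (l2e & PG_PS));

    return (l2e & PG_FRAME_2M) + (va & (SIZE_2M - 1));
}
```

Because all upper-level slots alias the same pages, any virtual address resolves to its low 30 bits. Both low addresses and addresses at the amd64 KERNBASE (0xffffffff80000000) therefore reach the first 1GB of physical memory, which is the kind of mapping the native init code expects from the boot trampoline.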
sys/amd64/amd64/locore.S:
* Add PV entry point, hypervisor_page and the necessary elfnotes.
sys/amd64/amd64/machdep.c:
* Add hooks to replace bare metal operations that should use a PV
helper, this includes:
- Preload metadata
- i8254_init and i8254_delay
- Fetching the e820 memory map
- Reservation of the MP bootstrap region
* Create a DELAY function that uses the PV hooks.
* Introduce a new hammer_time_xen that sets the necessary stuff when
running in PVH mode.
sys/amd64/amd64/mp_machdep.c:
* Introduce a hook to replace start_all_aps.
* Introduce a lapic_disabled variable to prevent polluting the code
with xen specific gates.
sys/amd64/include/asmacros.h:
* Copy the ELFNOTE macro from the i386 Xen PV port.
sys/amd64/include/clock.h:
sys/i386/include/clock.h:
* Prototypes for the xen early delay initialization and usage.
sys/amd64/include/cpu.h:
* Introduce a new cpu hook to init APs.
sys/amd64/include/sysarch.h:
* Declare the init_ops structure.
sys/amd64/include/xen/hypercall.h:
sys/i386/include/xen/hypercall.h:
* Switch to the PV style hypercall mechanism for HVM also.
sys/conf/files:
* Make the PV console available on XENHVM also.
sys/conf/files.amd64:
* Include the new files for the PVH port.
sys/dev/xen/console/console.c:
sys/dev/xen/console/xencons_ring.c:
* Gate the PV console attach so it is only used on PV ports.
* Use HYPERVISOR_start_info instead of xen_start_info.
* Use HYPERVISOR_event_channel_op to kick the event channel before
xen interrupts are set up.
sys/dev/xen/control/control.c:
* Use the PV shutdown on PVH.
sys/dev/xen/timer/timer.c:
* Pass a vcpu_info to xen_fetch_vcpu_time, this allows using this
function at very early init, before per-cpu vcpu_info is set.
* Remove critical_{enter/exit} from xen_fetch_vcpu_time so it can be
used at early boot, instead place them on the callers.
* Introduce two new functions, xen_delay_init and xen_delay that can
be used at early boot to implement the generic DELAY function.
sys/i386/i386/locore.s:
* Reserve space for the hypercall page.
sys/i386/i386/machdep.c:
* Create a generic DELAY function.
sys/i386/xen/xen_machdep.c:
* Set HYPERVISOR_start_info.
sys/x86/isa/clock.c:
* Rename the generic DELAY function to i8254_delay.
sys/x86/x86/delay.c:
* Put generic delay helpers here, get_tsc and delay_tc.
sys/x86/x86/local_apic.c:
* Prevent the local apic from attaching when running in PVH mode.
sys/x86/xen/hvm.c:
* Set the start_all_aps hook.
* Fix the setting of the hypercall page now that we are using the
same mechanism as the PV port.
* Initialize Xen CPU hooks for the PVH port.
* Introduce the xen_early_printf debug function, which prints
directly to the hypervisor console.
sys/x86/xen/mptable.c:
* Create a dummy PV CPU enumerator for the PVH port.
sys/x86/xen/pv.c:
* Implement the PV functions for the early boot hooks,
parse_preload_data and fetch_e820_map.
* Implement the PV function for the start_all_aps hook.
sys/x86/xen/pvcpu.c:
* Dummy Xen PV CPU device, that we use to set the per-cpu pc_device.
sys/xen/gnttab.c:
* Allocate resume_frames for the PVH port.
sys/xen/interface/arch-x86/xen.h:
* Interface change for the PVH port (not used on FreeBSD).
sys/xen/pv.h:
* Header that exports the specific PV functions.
sys/xen/xen-os.h:
* Declare prototypes for the newly added functions.
sys/xen/xenstore/xenstore.c:
* Make the xenstore driver hang from both xenpci and the nexus when
running XENHVM, this is because we don't have a xenpci device on
the PVH port.
* Gate xenstore addition to parent == xenpci on the HVM case.
---
sys/amd64/amd64/locore.S | 53 ++++++++
sys/amd64/amd64/machdep.c | 179 ++++++++++++++++++++++----
sys/amd64/amd64/mp_machdep.c | 27 +++--
sys/amd64/include/asmacros.h | 26 ++++
sys/amd64/include/clock.h | 6 +
sys/amd64/include/cpu.h | 1 +
sys/amd64/include/sysarch.h | 19 +++
sys/amd64/include/xen/hypercall.h | 7 -
sys/conf/files | 4 +-
sys/conf/files.amd64 | 4 +
sys/conf/files.i386 | 1 +
sys/dev/xen/console/console.c | 23 +++-
sys/dev/xen/console/xencons_ring.c | 15 ++-
sys/dev/xen/control/control.c | 37 +++---
sys/dev/xen/timer/timer.c | 59 +++++++--
sys/i386/i386/locore.s | 9 ++
sys/i386/i386/machdep.c | 9 ++
sys/i386/include/clock.h | 6 +
sys/i386/include/xen/hypercall.h | 7 -
sys/i386/xen/xen_machdep.c | 4 +-
sys/x86/isa/clock.c | 53 +--------
sys/x86/x86/delay.c | 95 ++++++++++++++
sys/x86/x86/local_apic.c | 8 +-
sys/x86/xen/hvm.c | 93 ++++++++++----
sys/x86/xen/mptable.c | 136 ++++++++++++++++++++
sys/x86/xen/pv.c | 247 ++++++++++++++++++++++++++++++++++++
sys/x86/xen/pvcpu.c | 98 ++++++++++++++
sys/xen/gnttab.c | 21 +++-
sys/xen/interface/arch-x86/xen.h | 11 ++-
sys/xen/pv.h | 29 ++++
sys/xen/xen-os.h | 8 +
sys/xen/xenstore/xenstore.c | 32 ++++--
32 files changed, 1141 insertions(+), 186 deletions(-)
create mode 100644 sys/x86/x86/delay.c
create mode 100644 sys/x86/xen/mptable.c
create mode 100644 sys/x86/xen/pv.c
create mode 100644 sys/x86/xen/pvcpu.c
create mode 100644 sys/xen/pv.h
diff --git a/sys/amd64/amd64/locore.S b/sys/amd64/amd64/locore.S
index 55cda3a..e04cc48 100644
--- a/sys/amd64/amd64/locore.S
+++ b/sys/amd64/amd64/locore.S
@@ -31,6 +31,12 @@
#include <machine/pmap.h>
#include <machine/specialreg.h>
+#ifdef XENHVM
+#include <xen/xen-os.h>
+#define __ASSEMBLY__
+#include <xen/interface/elfnote.h>
+#endif
+
#include "assym.s"
/*
@@ -86,3 +92,50 @@ NON_GPROF_ENTRY(btext)
ALIGN_DATA /* just to be sure */
.space 0x1000 /* space for bootstack - temporary stack */
bootstack:
+
+#ifdef XENHVM
+/* Xen */
+.section __xen_guest
+ ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS, .asciz, "FreeBSD")
+ ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION, .asciz, "HEAD")
+ ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION, .asciz, "xen-3.0")
+ ELFNOTE(Xen, XEN_ELFNOTE_VIRT_BASE, .quad, KERNBASE)
+ ELFNOTE(Xen, XEN_ELFNOTE_PADDR_OFFSET, .quad, KERNBASE) /* Xen honours elf->p_paddr; compensate for this */
+ ELFNOTE(Xen, XEN_ELFNOTE_ENTRY, .quad, xen_start)
+ ELFNOTE(Xen, XEN_ELFNOTE_HYPERCALL_PAGE, .quad, hypercall_page)
+ ELFNOTE(Xen, XEN_ELFNOTE_HV_START_LOW, .quad, HYPERVISOR_VIRT_START)
+ ELFNOTE(Xen, XEN_ELFNOTE_FEATURES, .asciz, "writable_descriptor_tables|auto_translated_physmap|supervisor_mode_kernel|hvm_callback_vector")
+ ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE, .asciz, "yes")
+ ELFNOTE(Xen, XEN_ELFNOTE_L1_MFN_VALID, .long, PG_V, PG_V)
+ ELFNOTE(Xen, XEN_ELFNOTE_LOADER, .asciz, "generic")
+ ELFNOTE(Xen, XEN_ELFNOTE_SUSPEND_CANCEL, .long, 0)
+ ELFNOTE(Xen, XEN_ELFNOTE_BSD_SYMTAB, .asciz, "yes")
+
+ .text
+.p2align PAGE_SHIFT, 0x90 /* Hypercall_page needs to be PAGE aligned */
+
+NON_GPROF_ENTRY(hypercall_page)
+ .skip 0x1000, 0x90 /* Fill with "nop"s */
+
+NON_GPROF_ENTRY(xen_start)
+ /* Don't trust what the loader gives for rflags. */
+ pushq $PSL_KERNEL
+ popfq
+
+ /* Parameters for the xen init function */
+ movq %rsi, %rdi /* shared_info (arg 1) */
+ movq %rsp, %rsi /* xenstack (arg 2) */
+
+ /* Use our own stack */
+ movq $bootstack,%rsp
+ xorl %ebp, %ebp
+
+ /* u_int64_t hammer_time_xen(start_info_t *si, u_int64_t xenstack); */
+ call hammer_time_xen
+ movq %rax, %rsp /* set up kstack for mi_startup() */
+ call mi_startup /* autoconfiguration, mountroot etc */
+
+ /* NOTREACHED */
+0: hlt
+ jmp 0b
+#endif
diff --git a/sys/amd64/amd64/machdep.c b/sys/amd64/amd64/machdep.c
index 2b2e47f..b649def 100644
--- a/sys/amd64/amd64/machdep.c
+++ b/sys/amd64/amd64/machdep.c
@@ -127,6 +127,7 @@ __FBSDID("$FreeBSD$");
#include <machine/reg.h>
#include <machine/sigframe.h>
#include <machine/specialreg.h>
+#include <machine/sysarch.h>
#ifdef PERFMON
#include <machine/perfmon.h>
#endif
@@ -147,10 +148,20 @@ __FBSDID("$FreeBSD$");
#include <isa/isareg.h>
#include <isa/rtc.h>
+#ifdef XENHVM
+/* Xen */
+#include <xen/xen-os.h>
+#include <xen/hvm.h>
+#include <xen/pv.h>
+#endif
+
/* Sanity check for __curthread() */
CTASSERT(offsetof(struct pcpu, pc_curthread) == 0);
extern u_int64_t hammer_time(u_int64_t, u_int64_t);
+#ifdef XENHVM
+extern u_int64_t hammer_time_xen(start_info_t *, u_int64_t);
+#endif
extern void printcpuinfo(void); /* XXX header file */
extern void identify_cpu(void);
@@ -166,6 +177,23 @@ static int set_fpcontext(struct thread *td, const mcontext_t *mcp,
char *xfpustate, size_t xfpustate_len);
SYSINIT(cpu, SI_SUB_CPU, SI_ORDER_FIRST, cpu_startup, NULL);
+/* Preload data parse function */
+static caddr_t native_parse_preload_data(u_int64_t);
+
+/* Native function to fetch the e820 map */
+static void native_fetch_e820_map(caddr_t, struct bios_smap **, u_int32_t *);
+
+/* Default init_ops implementation. */
+struct init_ops init_ops = {
+ .parse_preload_data = native_parse_preload_data,
+ .early_delay_init = i8254_init,
+ .early_delay = i8254_delay,
+ .fetch_e820_map = native_fetch_e820_map,
+#ifdef SMP
+ .mp_bootaddress = mp_bootaddress,
+#endif
+};
+
/*
* The file "conf/ldscript.amd64" defines the symbol "kernphys". Its value is
* the physical address at which the kernel is loaded.
@@ -216,6 +244,15 @@ struct mem_range_softc mem_range_softc;
struct mtx dt_lock; /* lock for GDT and LDT */
+void
+DELAY(int n)
+{
+ if (delay_tc(n))
+ return;
+
+ init_ops.early_delay(n);
+}
+
static void
cpu_startup(dummy)
void *dummy;
@@ -1408,6 +1445,24 @@ add_smap_entry(struct bios_smap *smap, vm_paddr_t *physmap, int *physmap_idxp)
return (1);
}
+static void
+native_fetch_e820_map(caddr_t kmdp, struct bios_smap **smap, u_int32_t *size)
+{
+ /*
+ * get memory map from INT 15:E820, kindly supplied by the
+ * loader.
+ *
+ * subr_module.c says:
+ * "Consumer may safely assume that size value precedes data."
+ * ie: an int32_t immediately precedes smap.
+ */
+ *smap = (struct bios_smap *)preload_search_info(kmdp,
+ MODINFO_METADATA | MODINFOMD_SMAP);
+ if (*smap == NULL)
+ panic("No BIOS smap info from loader!");
+ *size = *((u_int32_t *)*smap - 1);
+}
+
/*
* Populate the (physmap) array with base/bound pairs describing the
* available physical memory in the system, then test this memory and
@@ -1433,19 +1488,8 @@ getmemsize(caddr_t kmdp, u_int64_t first)
basemem = 0;
physmap_idx = 0;
- /*
- * get memory map from INT 15:E820, kindly supplied by the loader.
- *
- * subr_module.c says:
- * "Consumer may safely assume that size value precedes data."
- * ie: an int32_t immediately precedes smap.
- */
- smapbase = (struct bios_smap *)preload_search_info(kmdp,
- MODINFO_METADATA | MODINFOMD_SMAP);
- if (smapbase == NULL)
- panic("No BIOS smap info from loader!");
+ init_ops.fetch_e820_map(kmdp, &smapbase, &smapsize);
- smapsize = *((u_int32_t *)smapbase - 1);
smapend = (struct bios_smap *)((uintptr_t)smapbase + smapsize);
for (smap = smapbase; smap < smapend; smap++)
@@ -1467,7 +1511,8 @@ getmemsize(caddr_t kmdp, u_int64_t first)
#ifdef SMP
/* make hole for AP bootstrap code */
- physmap[1] = mp_bootaddress(physmap[1] / 1024);
+ if (init_ops.mp_bootaddress)
+ physmap[1] = init_ops.mp_bootaddress(physmap[1] / 1024);
#endif
/*
@@ -1681,6 +1726,98 @@ do_next:
msgbufp = (struct msgbuf *)PHYS_TO_DMAP(phys_avail[pa_indx]);
}
+static caddr_t
+native_parse_preload_data(u_int64_t modulep)
+{
+ caddr_t kmdp;
+
+ preload_metadata = (caddr_t)(uintptr_t)(modulep + KERNBASE);
+ preload_bootstrap_relocate(KERNBASE);
+ kmdp = preload_search_by_type("elf kernel");
+ if (kmdp == NULL)
+ kmdp = preload_search_by_type("elf64 kernel");
+ boothowto = MD_FETCH(kmdp, MODINFOMD_HOWTO, int);
+ kern_envp = MD_FETCH(kmdp, MODINFOMD_ENVP, char *) + KERNBASE;
+#ifdef DDB
+ ksym_start = MD_FETCH(kmdp, MODINFOMD_SSYM, uintptr_t);
+ ksym_end = MD_FETCH(kmdp, MODINFOMD_ESYM, uintptr_t);
+#endif
+
+ return (kmdp);
+}
+
+#ifdef XENHVM
+/*
+ * First function called by the Xen PVH boot sequence.
+ *
+ * Set some Xen global variables and prepare the environment so it is
+ * as similar as possible to what native FreeBSD init function expects.
+ */
+u_int64_t
+hammer_time_xen(start_info_t *si, u_int64_t xenstack)
+{
+ u_int64_t physfree;
+ u_int64_t *PT4 = (u_int64_t *)xenstack;
+ u_int64_t *PT3 = (u_int64_t *)(xenstack + PAGE_SIZE);
+ u_int64_t *PT2 = (u_int64_t *)(xenstack + 2 * PAGE_SIZE);
+ int i;
+
+ KASSERT((si != NULL && xenstack != 0),
+ ("invalid start_info or xenstack"));
+
+ xen_early_printf("FreeBSD PVH running on %s\n", si->magic);
+
+ /* We use 3 pages of xen stack for the boot pagetables */
+ physfree = xenstack + 3 * PAGE_SIZE - KERNBASE;
+
+ /* Setup Xen global variables */
+ HYPERVISOR_start_info = si;
+ HYPERVISOR_shared_info =
+ (shared_info_t *)(si->shared_info + KERNBASE);
+
+ /*
+ * Setup some misc global variables for Xen devices
+ *
+ * XXX: devices that need this specific variables should
+ * be rewritten to fetch this info by themselves from the
+ * start_info page.
+ */
+ console_page =
+ (char *)(ptoa(si->console.domU.mfn) + KERNBASE);
+ xen_store = (struct xenstore_domain_interface *)
+ (ptoa(si->store_mfn) + KERNBASE);
+
+ xen_domain_type = XEN_PV_DOMAIN;
+ vm_guest = VM_GUEST_XEN;
+
+ /*
+ * Use the stack Xen gives us to build the page tables
+ * as native FreeBSD expects to find them (created
+ * by the boot trampoline).
+ */
+ for (i = 0; i < 512; i++) {
+ /* Each slot of the level 4 pages points to the same level 3 page */
+ PT4[i] = ((u_int64_t)&PT3[0]) - KERNBASE;
+ PT4[i] |= PG_V | PG_RW | PG_U;
+
+ /* Each slot of the level 3 pages points to the same level 2 page */
+ PT3[i] = ((u_int64_t)&PT2[0]) - KERNBASE;
+ PT3[i] |= PG_V | PG_RW | PG_U;
+
+ /* The level 2 page slots are mapped with 2MB pages for 1GB. */
+ PT2[i] = i * (2 * 1024 * 1024);
+ PT2[i] |= PG_V | PG_RW | PG_PS | PG_U;
+ }
+ load_cr3(((u_int64_t)&PT4[0]) - KERNBASE);
+
+ /* Set the hooks for early functions that diverge from bare metal */
+ xen_pv_set_init_ops();
+
+ /* Now we can jump into the native init function */
+ return hammer_time(0, physfree);
+}
+#endif
+
u_int64_t
hammer_time(u_int64_t modulep, u_int64_t physfree)
{
@@ -1705,17 +1842,7 @@ hammer_time(u_int64_t modulep, u_int64_t physfree)
*/
proc_linkup0(&proc0, &thread0);
- preload_metadata = (caddr_t)(uintptr_t)(modulep + KERNBASE);
- preload_bootstrap_relocate(KERNBASE);
- kmdp = preload_search_by_type("elf kernel");
- if (kmdp == NULL)
- kmdp = preload_search_by_type("elf64 kernel");
- boothowto = MD_FETCH(kmdp, MODINFOMD_HOWTO, int);
- kern_envp = MD_FETCH(kmdp, MODINFOMD_ENVP, char *) + KERNBASE;
-#ifdef DDB
- ksym_start = MD_FETCH(kmdp, MODINFOMD_SSYM, uintptr_t);
- ksym_end = MD_FETCH(kmdp, MODINFOMD_ESYM, uintptr_t);
-#endif
+ kmdp = init_ops.parse_preload_data(modulep);
/* Init basic tunables, hz etc */
init_param1();
@@ -1799,10 +1926,10 @@ hammer_time(u_int64_t modulep, u_int64_t physfree)
lidt(&r_idt);
/*
- * Initialize the i8254 before the console so that console
+ * Initialize the early delay before the console so that console
* initialization can use DELAY().
*/
- i8254_init();
+ init_ops.early_delay_init();
/*
* Initialize the console before we print anything out.
diff --git a/sys/amd64/amd64/mp_machdep.c b/sys/amd64/amd64/mp_machdep.c
index 4ef4b3d..44c2a45 100644
--- a/sys/amd64/amd64/mp_machdep.c
+++ b/sys/amd64/amd64/mp_machdep.c
@@ -90,7 +90,8 @@ extern struct pcpu __pcpu[];
/* AP uses this during bootstrap. Do not staticize. */
char *bootSTK;
-static int bootAP;
+int bootAP;
+bool lapic_disabled = false;
/* Free these after use */
void *bootstacks[MAXCPU];
@@ -122,9 +123,12 @@ u_long *ipi_rendezvous_counts[MAXCPU];
static u_long *ipi_hardclock_counts[MAXCPU];
#endif
+int native_start_all_aps(void);
+
/* Default cpu_ops implementation. */
struct cpu_ops cpu_ops = {
- .ipi_vectored = lapic_ipi_vectored
+ .ipi_vectored = lapic_ipi_vectored,
+ .start_all_aps = native_start_all_aps,
};
extern inthand_t IDTVEC(fast_syscall), IDTVEC(fast_syscall32);
@@ -138,7 +142,7 @@ extern int pmap_pcid_enabled;
static volatile cpuset_t ipi_nmi_pending;
/* used to hold the AP's until we are ready to release them */
-static struct mtx ap_boot_mtx;
+struct mtx ap_boot_mtx;
/* Set to 1 once we're ready to let the APs out of the pen. */
static volatile int aps_ready = 0;
@@ -165,7 +169,6 @@ static int cpu_cores; /* cores per package */
static void assign_cpu_ids(void);
static void set_interrupt_apic_ids(void);
-static int start_all_aps(void);
static int start_ap(int apic_id);
static void release_aps(void *dummy);
@@ -569,7 +572,7 @@ cpu_mp_start(void)
assign_cpu_ids();
/* Start each Application Processor */
- start_all_aps();
+ cpu_ops.start_all_aps();
set_interrupt_apic_ids();
}
@@ -707,7 +710,8 @@ init_secondary(void)
wrmsr(MSR_SF_MASK, PSL_NT|PSL_T|PSL_I|PSL_C|PSL_D);
/* Disable local APIC just to be sure. */
- lapic_disable();
+ if (!lapic_disabled)
+ lapic_disable();
/* signal our startup to the BSP. */
mp_naps++;
@@ -733,7 +737,7 @@ init_secondary(void)
/* A quick check from sanity claus */
cpuid = PCPU_GET(cpuid);
- if (PCPU_GET(apic_id) != lapic_id()) {
+ if (!lapic_disabled && PCPU_GET(apic_id) != lapic_id()) {
printf("SMP: cpuid = %d\n", cpuid);
printf("SMP: actual apic_id = %d\n", lapic_id());
printf("SMP: correct apic_id = %d\n", PCPU_GET(apic_id));
@@ -749,7 +753,8 @@ init_secondary(void)
mtx_lock_spin(&ap_boot_mtx);
/* Init local apic for irq's */
- lapic_setup(1);
+ if (!lapic_disabled)
+ lapic_setup(1);
/* Set memory range attributes for this CPU to match the BSP */
mem_range_AP_init();
@@ -764,7 +769,7 @@ init_secondary(void)
if (cpu_logical > 1 && PCPU_GET(apic_id) % cpu_logical != 0)
CPU_SET(cpuid, &logical_cpus_mask);
- if (bootverbose)
+ if (!lapic_disabled && bootverbose)
lapic_dump("AP");
if (smp_cpus == mp_ncpus) {
@@ -908,8 +913,8 @@ assign_cpu_ids(void)
/*
* start each AP in our list
*/
-static int
-start_all_aps(void)
+int
+native_start_all_aps(void)
{
vm_offset_t va = boot_address + KERNBASE;
u_int64_t *pt4, *pt3, *pt2;
diff --git a/sys/amd64/include/asmacros.h b/sys/amd64/include/asmacros.h
index 1fb592a..ce8dce4 100644
--- a/sys/amd64/include/asmacros.h
+++ b/sys/amd64/include/asmacros.h
@@ -201,4 +201,30 @@
#endif /* LOCORE */
+#ifdef __STDC__
+#define ELFNOTE(name, type, desctype, descdata...) \
+.pushsection .note.name ; \
+ .align 4 ; \
+ .long 2f - 1f /* namesz */ ; \
+ .long 4f - 3f /* descsz */ ; \
+ .long type ; \
+1:.asciz #name ; \
+2:.align 4 ; \
+3:desctype descdata ; \
+4:.align 4 ; \
+.popsection
+#else /* !__STDC__, i.e. -traditional */
+#define ELFNOTE(name, type, desctype, descdata) \
+.pushsection .note.name ; \
+ .align 4 ; \
+ .long 2f - 1f /* namesz */ ; \
+ .long 4f - 3f /* descsz */ ; \
+ .long type ; \
+1:.asciz "name" ; \
+2:.align 4 ; \
+3:desctype descdata ; \
+4:.align 4 ; \
+.popsection
+#endif /* __STDC__ */
+
#endif /* !_MACHINE_ASMACROS_H_ */
diff --git a/sys/amd64/include/clock.h b/sys/amd64/include/clock.h
index d7f7d82..e7817ab 100644
--- a/sys/amd64/include/clock.h
+++ b/sys/amd64/include/clock.h
@@ -25,6 +25,12 @@ extern int smp_tsc;
#endif
void i8254_init(void);
+void i8254_delay(int);
+#ifdef XENHVM
+void xen_delay_init(void);
+void xen_delay(int);
+#endif
+int delay_tc(int);
/*
* Driver to clock driver interface.
diff --git a/sys/amd64/include/cpu.h b/sys/amd64/include/cpu.h
index 3d9ff531..ed9f1db 100644
--- a/sys/amd64/include/cpu.h
+++ b/sys/amd64/include/cpu.h
@@ -64,6 +64,7 @@ struct cpu_ops {
void (*cpu_init)(void);
void (*cpu_resume)(void);
void (*ipi_vectored)(u_int, int);
+ int (*start_all_aps)(void);
};
extern struct cpu_ops cpu_ops;
diff --git a/sys/amd64/include/sysarch.h b/sys/amd64/include/sysarch.h
index cd380d4..27fd3ba 100644
--- a/sys/amd64/include/sysarch.h
+++ b/sys/amd64/include/sysarch.h
@@ -4,3 +4,22 @@
/* $FreeBSD$ */
#include <x86/sysarch.h>
+
+#include <machine/pc/bios.h>
+/*
+ * Struct containing pointers to init functions whose
+ * implementation is run time selectable. Selection can be made,
+ * for example, based on detection of a BIOS variant or
+ * hypervisor environment.
+ */
+struct init_ops {
+ caddr_t (*parse_preload_data)(u_int64_t);
+ void (*early_delay_init)(void);
+ void (*early_delay)(int);
+ void (*fetch_e820_map)(caddr_t, struct bios_smap **, u_int32_t *);
+#ifdef SMP
+ u_int (*mp_bootaddress)(u_int);
+#endif
+};
+
+extern struct init_ops init_ops;
diff --git a/sys/amd64/include/xen/hypercall.h b/sys/amd64/include/xen/hypercall.h
index a1b2a5c..499fb4d 100644
--- a/sys/amd64/include/xen/hypercall.h
+++ b/sys/amd64/include/xen/hypercall.h
@@ -51,15 +51,8 @@
#define CONFIG_XEN_COMPAT 0x030002
#define __must_check
-#ifdef XEN
#define HYPERCALL_STR(name) \
"call hypercall_page + ("STR(__HYPERVISOR_##name)" * 32)"
-#else
-#define HYPERCALL_STR(name) \
- "mov $("STR(__HYPERVISOR_##name)" * 32),%%eax; "\
- "add hypercall_stubs(%%rip),%%rax; " \
- "call *%%rax"
-#endif
#define _hypercall0(type, name) \
({ \
diff --git a/sys/conf/files b/sys/conf/files
index f3e298c..6040447 100644
--- a/sys/conf/files
+++ b/sys/conf/files
@@ -2508,8 +2508,8 @@ dev/xe/if_xe_pccard.c optional xe pccard
dev/xen/balloon/balloon.c optional xen | xenhvm
dev/xen/blkfront/blkfront.c optional xen | xenhvm
dev/xen/blkback/blkback.c optional xen | xenhvm
-dev/xen/console/console.c optional xen
-dev/xen/console/xencons_ring.c optional xen
+dev/xen/console/console.c optional xen | xenhvm
+dev/xen/console/xencons_ring.c optional xen | xenhvm
dev/xen/control/control.c optional xen | xenhvm
dev/xen/netback/netback.c optional xen | xenhvm
dev/xen/netfront/netfront.c optional xen | xenhvm
diff --git a/sys/conf/files.amd64 b/sys/conf/files.amd64
index 1914c48..bd52e8f 100644
--- a/sys/conf/files.amd64
+++ b/sys/conf/files.amd64
@@ -554,5 +554,9 @@ x86/x86/mptable_pci.c optional mptable pci
x86/x86/msi.c optional pci
x86/x86/nexus.c standard
x86/x86/tsc.c standard
+x86/x86/delay.c standard
x86/xen/hvm.c optional xenhvm
x86/xen/xen_intr.c optional xen | xenhvm
+x86/xen/mptable.c optional xenhvm
+x86/xen/pvcpu.c optional xenhvm
+x86/xen/pv.c optional xenhvm
diff --git a/sys/conf/files.i386 b/sys/conf/files.i386
index e259659..15a3aae 100644
--- a/sys/conf/files.i386
+++ b/sys/conf/files.i386
@@ -577,5 +577,6 @@ x86/x86/mptable_pci.c optional apic native pci
x86/x86/msi.c optional apic pci
x86/x86/nexus.c standard
x86/x86/tsc.c standard
+x86/x86/delay.c standard
x86/xen/hvm.c optional xenhvm
x86/xen/xen_intr.c optional xen | xenhvm
diff --git a/sys/dev/xen/console/console.c b/sys/dev/xen/console/console.c
index 65a0e7d..86dc2a4 100644
--- a/sys/dev/xen/console/console.c
+++ b/sys/dev/xen/console/console.c
@@ -69,11 +69,14 @@ struct mtx cn_mtx;
static char wbuf[WBUF_SIZE];
static char rbuf[RBUF_SIZE];
static int rc, rp;
-static unsigned int cnsl_evt_reg;
+unsigned int cnsl_evt_reg;
static unsigned int wc, wp; /* write_cons, write_prod */
xen_intr_handle_t xen_intr_handle;
device_t xencons_dev;
+/* Virt address of the shared console page */
+char *console_page;
+
#ifdef KDB
static int xc_altbrk;
#endif
@@ -113,6 +116,9 @@ static struct ttydevsw xc_ttydevsw = {
static void
xc_cnprobe(struct consdev *cp)
{
+ if (!xen_pv_domain())
+ return;
+
cp->cn_pri = CN_REMOTE;
sprintf(cp->cn_name, "%s0", driver_name);
}
@@ -175,7 +181,7 @@ static void
xc_cnputc(struct consdev *dev, int c)
{
- if (xen_start_info->flags & SIF_INITDOMAIN)
+ if (HYPERVISOR_start_info->flags & SIF_INITDOMAIN)
xc_cnputc_dom0(dev, c);
else
xc_cnputc_domu(dev, c);
@@ -206,8 +212,7 @@ xcons_putc(int c)
xcons_force_flush();
#endif
}
- if (cnsl_evt_reg)
- __xencons_tx_flush();
+ __xencons_tx_flush();
/* inform start path that we're pretty full */
return ((wp - wc) >= WBUF_SIZE - 100) ? TRUE : FALSE;
@@ -217,6 +222,10 @@ static void
xc_identify(driver_t *driver, device_t parent)
{
device_t child;
+
+ if (!xen_pv_domain())
+ return;
+
child = BUS_ADD_CHILD(parent, 0, driver_name, 0);
device_set_driver(child, driver);
device_set_desc(child, "Xen Console");
@@ -245,7 +254,7 @@ xc_attach(device_t dev)
cnsl_evt_reg = 1;
callout_reset(&xc_callout, XC_POLLTIME, xc_timeout, xccons);
- if (xen_start_info->flags & SIF_INITDOMAIN) {
+ if (HYPERVISOR_start_info->flags & SIF_INITDOMAIN) {
error = xen_intr_bind_virq(dev, VIRQ_CONSOLE, 0, NULL,
xencons_priv_interrupt, NULL,
INTR_TYPE_TTY, &xen_intr_handle);
@@ -309,7 +318,7 @@ __xencons_tx_flush(void)
sz = wp - wc;
if (sz > (WBUF_SIZE - WBUF_MASK(wc)))
sz = WBUF_SIZE - WBUF_MASK(wc);
- if (xen_start_info->flags & SIF_INITDOMAIN) {
+ if (HYPERVISOR_start_info->flags & SIF_INITDOMAIN) {
HYPERVISOR_console_io(CONSOLEIO_write, sz, &wbuf[WBUF_MASK(wc)]);
wc += sz;
} else {
@@ -424,7 +433,7 @@ xcons_force_flush(void)
{
int sz;
- if (xen_start_info->flags & SIF_INITDOMAIN)
+ if (HYPERVISOR_start_info->flags & SIF_INITDOMAIN)
return;
/* Spin until console data is flushed through to the domain controller. */
diff --git a/sys/dev/xen/console/xencons_ring.c b/sys/dev/xen/console/xencons_ring.c
index 3701551..3046498 100644
--- a/sys/dev/xen/console/xencons_ring.c
+++ b/sys/dev/xen/console/xencons_ring.c
@@ -32,9 +32,9 @@ __FBSDID("$FreeBSD$");
#define console_evtchn console.domU.evtchn
xen_intr_handle_t console_handle;
-extern char *console_page;
extern struct mtx cn_mtx;
extern device_t xencons_dev;
+extern unsigned int cnsl_evt_reg;
static inline struct xencons_interface *
xencons_interface(void)
@@ -60,6 +60,7 @@ xencons_ring_send(const char *data, unsigned len)
struct xencons_interface *intf;
XENCONS_RING_IDX cons, prod;
int sent;
+ struct evtchn_send send = { .port = HYPERVISOR_start_info->console.domU.evtchn };
intf = xencons_interface();
cons = intf->out_cons;
@@ -76,7 +77,11 @@ xencons_ring_send(const char *data, unsigned len)
wmb();
intf->out_prod = prod;
- xen_intr_signal(console_handle);
+ if (cnsl_evt_reg)
+ xen_intr_signal(console_handle);
+ else
+ HYPERVISOR_event_channel_op(EVTCHNOP_send, &send);
+
return sent;
@@ -125,11 +130,11 @@ xencons_ring_init(void)
{
int err;
- if (!xen_start_info->console_evtchn)
+ if (!HYPERVISOR_start_info->console_evtchn)
return 0;
err = xen_intr_bind_local_port(xencons_dev,
- xen_start_info->console_evtchn, NULL, xencons_handle_input, NULL,
+ HYPERVISOR_start_info->console_evtchn, NULL, xencons_handle_input, NULL,
INTR_TYPE_MISC | INTR_MPSAFE, &console_handle);
if (err) {
return err;
@@ -145,7 +150,7 @@ void
xencons_suspend(void)
{
- if (!xen_start_info->console_evtchn)
+ if (!HYPERVISOR_start_info->console_evtchn)
return;
xen_intr_unbind(&console_handle);
diff --git a/sys/dev/xen/control/control.c b/sys/dev/xen/control/control.c
index a9f8d1b..35c923d 100644
--- a/sys/dev/xen/control/control.c
+++ b/sys/dev/xen/control/control.c
@@ -317,21 +317,6 @@ xctrl_suspend()
EVENTHANDLER_INVOKE(power_resume);
}
-static void
-xen_pv_shutdown_final(void *arg, int howto)
-{
- /*
- * Inform the hypervisor that shutdown is complete.
- * This is not necessary in HVM domains since Xen
- * emulates ACPI in that mode and FreeBSD's ACPI
- * support will request this transition.
- */
- if (howto & (RB_HALT | RB_POWEROFF))
- HYPERVISOR_shutdown(SHUTDOWN_poweroff);
- else
- HYPERVISOR_shutdown(SHUTDOWN_reboot);
-}
-
#else
/* HVM mode suspension. */
@@ -447,6 +432,21 @@ xctrl_halt()
shutdown_nice(RB_HALT);
}
+static void
+xen_pv_shutdown_final(void *arg, int howto)
+{
+ /*
+ * Inform the hypervisor that shutdown is complete.
+ * This is not necessary in HVM domains since Xen
+ * emulates ACPI in that mode and FreeBSD's ACPI
+ * support will request this transition.
+ */
+ if (howto & (RB_HALT | RB_POWEROFF))
+ HYPERVISOR_shutdown(SHUTDOWN_poweroff);
+ else
+ HYPERVISOR_shutdown(SHUTDOWN_reboot);
+}
+
/*------------------------------ Event Reception -----------------------------*/
static void
xctrl_on_watch_event(struct xs_watch *watch, const char **vec, unsigned int len)
@@ -529,10 +529,9 @@ xctrl_attach(device_t dev)
xctrl->xctrl_watch.callback_data = (uintptr_t)xctrl;
xs_register_watch(&xctrl->xctrl_watch);
-#ifndef XENHVM
- EVENTHANDLER_REGISTER(shutdown_final, xen_pv_shutdown_final, NULL,
- SHUTDOWN_PRI_LAST);
-#endif
+ if (xen_pv_domain())
+ EVENTHANDLER_REGISTER(shutdown_final, xen_pv_shutdown_final, NULL,
+ SHUTDOWN_PRI_LAST);
return (0);
}
diff --git a/sys/dev/xen/timer/timer.c b/sys/dev/xen/timer/timer.c
index 824c75b..13bd852 100644
--- a/sys/dev/xen/timer/timer.c
+++ b/sys/dev/xen/timer/timer.c
@@ -59,6 +59,9 @@ __FBSDID("$FreeBSD$");
#include <machine/_inttypes.h>
#include <machine/smp.h>
+/* For the declaration of clock_lock */
+#include <isa/rtc.h>
+
#include "clock_if.h"
static devclass_t xentimer_devclass;
@@ -234,18 +237,16 @@ xen_fetch_vcpu_tinfo(struct vcpu_time_info *dst, struct vcpu_time_info *src)
* it happens to be less than another CPU's previously determined value.
*/
static uint64_t
-xen_fetch_vcpu_time(void)
+xen_fetch_vcpu_time(struct vcpu_info *vcpu)
{
struct vcpu_time_info dst;
struct vcpu_time_info *src;
uint32_t pre_version;
uint64_t now;
volatile uint64_t last;
- struct vcpu_info *vcpu = DPCPU_GET(vcpu_info);
src = &vcpu->time;
- critical_enter();
do {
pre_version = xen_fetch_vcpu_tinfo(&dst, src);
barrier();
@@ -266,16 +267,19 @@ xen_fetch_vcpu_time(void)
}
} while (!atomic_cmpset_64(&xen_timer_last_time, last, now));
- critical_exit();
-
return (now);
}
static uint32_t
xentimer_get_timecount(struct timecounter *tc)
{
+ uint32_t xen_time;
+
+ critical_enter();
+ xen_time = (uint32_t)xen_fetch_vcpu_time(DPCPU_GET(vcpu_info)) & UINT_MAX;
+ critical_exit();
- return ((uint32_t)xen_fetch_vcpu_time() & UINT_MAX);
+ return xen_time;
}
/**
@@ -305,7 +309,12 @@ xen_fetch_wallclock(struct timespec *ts)
static void
xen_fetch_uptime(struct timespec *ts)
{
- uint64_t uptime = xen_fetch_vcpu_time();
+ uint64_t uptime;
+
+ critical_enter();
+ uptime = xen_fetch_vcpu_time(DPCPU_GET(vcpu_info));
+ critical_exit();
+
ts->tv_sec = uptime / NSEC_IN_SEC;
ts->tv_nsec = uptime % NSEC_IN_SEC;
}
@@ -354,7 +363,7 @@ xentimer_intr(void *arg)
struct xentimer_softc *sc = (struct xentimer_softc *)arg;
struct xentimer_pcpu_data *pcpu = DPCPU_PTR(xentimer_pcpu);
- pcpu->last_processed = xen_fetch_vcpu_time();
+ pcpu->last_processed = xen_fetch_vcpu_time(DPCPU_GET(vcpu_info));
if (pcpu->timer != 0 && sc->et.et_active)
sc->et.et_event_cb(&sc->et, sc->et.et_arg);
@@ -415,7 +424,9 @@ xentimer_et_start(struct eventtimer *et,
do {
if (++i == 60)
panic("can't schedule timer");
- next_time = xen_fetch_vcpu_time() + first_in_ns;
+ critical_enter();
+ next_time = xen_fetch_vcpu_time(DPCPU_GET(vcpu_info)) + first_in_ns;
+ critical_exit();
error = xentimer_vcpu_start_timer(cpu, next_time);
} while (error == -ETIME);
@@ -573,6 +584,36 @@ xentimer_suspend(device_t dev)
return (0);
}
+/*
+ * Xen delay early init
+ */
+void xen_delay_init(void)
+{
+ /* Init the clock lock */
+ mtx_init(&clock_lock, "clk", NULL, MTX_SPIN | MTX_NOPROFILE);
+}
+/*
+ * Xen PV DELAY function
+ *
+ * When running in PVH mode we don't have an emulated i8254, so
+ * make use of the Xen time info in order to code a simple DELAY
+ * function that can be used during early boot.
+ */
+void xen_delay(int n)
+{
+ uint64_t end_ns;
+ uint64_t current;
+
+ end_ns = xen_fetch_vcpu_time(&HYPERVISOR_shared_info->vcpu_info[0]);
+ end_ns += n * NSEC_IN_USEC;
+
+ for (;;) {
+ current = xen_fetch_vcpu_time(&HYPERVISOR_shared_info->vcpu_info[0]);
+ if (current >= end_ns)
+ break;
+ }
+}
+
static device_method_t xentimer_methods[] = {
DEVMETHOD(device_identify, xentimer_identify),
DEVMETHOD(device_probe, xentimer_probe),
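
The xen_delay() added above is a plain busy-wait against a nanosecond clock. A minimal standalone sketch of the same loop follows, assuming a clock passed in as a function pointer so it can be exercised outside the kernel; demo_delay, demo_fake_clock and DEMO_NSEC_IN_USEC are hypothetical names, and the fake clock simply advances 250 ns per read.

```c
#include <stdint.h>

#define DEMO_NSEC_IN_USEC 1000ULL   /* hypothetical local constant */

typedef uint64_t (*demo_clock_fn)(void);

/* Fake monotonic clock for illustration: each read advances 250 ns. */
static uint64_t demo_now_ns;

static uint64_t
demo_fake_clock(void)
{
    demo_now_ns += 250;
    return (demo_now_ns);
}

/*
 * Spin until `usec` microseconds have passed according to clock_ns().
 * Returns how many times the clock was polled while still waiting.
 */
static unsigned
demo_delay(demo_clock_fn clock_ns, int usec)
{
    /* Widen before multiplying so the deadline is computed in 64 bits. */
    uint64_t end_ns = clock_ns() + (uint64_t)usec * DEMO_NSEC_IN_USEC;
    unsigned polls = 0;

    while (clock_ns() < end_ns)
        polls++;
    return (polls);
}
```

The explicit cast keeps the deadline arithmetic in 64 bits regardless of how the nanoseconds-per-microsecond constant happens to be defined.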
diff --git a/sys/i386/i386/locore.s b/sys/i386/i386/locore.s
index 68cb430..bd136b1 100644
--- a/sys/i386/i386/locore.s
+++ b/sys/i386/i386/locore.s
@@ -898,3 +898,12 @@ done_pde:
#endif
ret
+
+#ifdef XENHVM
+/* Xen Hypercall page */
+ .text
+.p2align PAGE_SHIFT, 0x90 /* Hypercall_page needs to be PAGE aligned */
+
+NON_GPROF_ENTRY(hypercall_page)
+ .skip 0x1000, 0x90 /* Fill with "nop"s */
+#endif
diff --git a/sys/i386/i386/machdep.c b/sys/i386/i386/machdep.c
index c430316..8bd9a8e 100644
--- a/sys/i386/i386/machdep.c
+++ b/sys/i386/i386/machdep.c
@@ -254,6 +254,15 @@ struct mtx icu_lock;
struct mem_range_softc mem_range_softc;
+void
+DELAY(int n)
+{
+ if (delay_tc(n))
+ return;
+
+ i8254_delay(n);
+}
+
static void
cpu_startup(dummy)
void *dummy;
diff --git a/sys/i386/include/clock.h b/sys/i386/include/clock.h
index d980ec7..287b2c8 100644
--- a/sys/i386/include/clock.h
+++ b/sys/i386/include/clock.h
@@ -22,6 +22,12 @@ extern int tsc_is_invariant;
extern int tsc_perf_stat;
void i8254_init(void);
+void i8254_delay(int);
+#ifdef XENHVM
+void xen_delay_init(void);
+void xen_delay(int);
+#endif
+int delay_tc(int);
/*
* Driver to clock driver interface.
diff --git a/sys/i386/include/xen/hypercall.h b/sys/i386/include/xen/hypercall.h
index edc13f4..1c15b0f 100644
--- a/sys/i386/include/xen/hypercall.h
+++ b/sys/i386/include/xen/hypercall.h
@@ -40,15 +40,8 @@
#define CONFIG_XEN_COMPAT 0x030002
-#if defined(XEN)
#define HYPERCALL_STR(name) \
"call hypercall_page + ("STR(__HYPERVISOR_##name)" * 32)"
-#else
-#define HYPERCALL_STR(name) \
- "mov hypercall_stubs,%%eax; " \
- "add $("STR(__HYPERVISOR_##name)" * 32),%%eax; " \
- "call *%%eax"
-#endif
#define _hypercall0(type, name) \
({ \
diff --git a/sys/i386/xen/xen_machdep.c b/sys/i386/xen/xen_machdep.c
index 7049be6..1b1c74d 100644
--- a/sys/i386/xen/xen_machdep.c
+++ b/sys/i386/xen/xen_machdep.c
@@ -89,6 +89,7 @@ IDTVEC(div), IDTVEC(dbg), IDTVEC(nmi), IDTVEC(bpt), IDTVEC(ofl),
int xendebug_flags;
start_info_t *xen_start_info;
+start_info_t *HYPERVISOR_start_info;
shared_info_t *HYPERVISOR_shared_info;
xen_pfn_t *xen_machine_phys = machine_to_phys_mapping;
xen_pfn_t *xen_phys_machine;
@@ -744,7 +745,7 @@ void initvalues(start_info_t *startinfo);
struct xenstore_domain_interface;
extern struct xenstore_domain_interface *xen_store;
-char *console_page;
+extern char *console_page;
void *
bootmem_alloc(unsigned int size)
@@ -927,6 +928,7 @@ initvalues(start_info_t *startinfo)
HYPERVISOR_vm_assist(VMASST_CMD_enable, VMASST_TYPE_4gb_segments_notify);
#endif
xen_start_info = startinfo;
+ HYPERVISOR_start_info = startinfo;
xen_phys_machine = (xen_pfn_t *)startinfo->mfn_list;
IdlePTD = (pd_entry_t *)((uint8_t *)startinfo->pt_base + PAGE_SIZE);
diff --git a/sys/x86/isa/clock.c b/sys/x86/isa/clock.c
index a12e175..a5aed1c 100644
--- a/sys/x86/isa/clock.c
+++ b/sys/x86/isa/clock.c
@@ -247,61 +247,13 @@ getit(void)
return ((high << 8) | low);
}
-#ifndef DELAYDEBUG
-static u_int
-get_tsc(__unused struct timecounter *tc)
-{
-
- return (rdtsc32());
-}
-
-static __inline int
-delay_tc(int n)
-{
- struct timecounter *tc;
- timecounter_get_t *func;
- uint64_t end, freq, now;
- u_int last, mask, u;
-
- tc = timecounter;
- freq = atomic_load_acq_64(&tsc_freq);
- if (tsc_is_invariant && freq != 0) {
- func = get_tsc;
- mask = ~0u;
- } else {
- if (tc->tc_quality <= 0)
- return (0);
- func = tc->tc_get_timecount;
- mask = tc->tc_counter_mask;
- freq = tc->tc_frequency;
- }
- now = 0;
- end = freq * n / 1000000;
- if (func == get_tsc)
- sched_pin();
- last = func(tc) & mask;
- do {
- cpu_spinwait();
- u = func(tc) & mask;
- if (u < last)
- now += mask - last + u + 1;
- else
- now += u - last;
- last = u;
- } while (now < end);
- if (func == get_tsc)
- sched_unpin();
- return (1);
-}
-#endif
-
/*
* Wait "n" microseconds.
* Relies on timer 1 counting down from (i8254_freq / hz)
* Note: timer had better have been programmed before this is first used!
*/
void
-DELAY(int n)
+i8254_delay(int n)
{
int delta, prev_tick, tick, ticks_left;
#ifdef DELAYDEBUG
@@ -317,9 +269,6 @@ DELAY(int n)
}
if (state == 1)
printf("DELAY(%d)...", n);
-#else
- if (delay_tc(n))
- return;
#endif
/*
* Read the counter first, so that the rest of the setup overhead is
diff --git a/sys/x86/x86/delay.c b/sys/x86/x86/delay.c
new file mode 100644
index 0000000..7ea70b1
--- /dev/null
+++ b/sys/x86/x86/delay.c
@@ -0,0 +1,95 @@
+/*-
+ * Copyright (c) 1990 The Regents of the University of California.
+ * Copyright (c) 2010 Alexander Motin <mav at FreeBSD.org>
+ * All rights reserved.
+ *
+ * This code is derived from software contributed to Berkeley by
+ * William Jolitz and Don Ahn.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 4. Neither the name of the University nor the names of its contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * from: @(#)clock.c 7.2 (Berkeley) 5/12/91
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+/* Generic x86 routines to handle delay */
+
+#include <sys/param.h>
+#include <sys/systm.h>
+#include <sys/timetc.h>
+#include <sys/proc.h>
+#include <sys/kernel.h>
+#include <sys/sched.h>
+
+#include <machine/clock.h>
+#include <machine/cpu.h>
+
+static u_int
+get_tsc(__unused struct timecounter *tc)
+{
+
+ return (rdtsc32());
+}
+
+int
+delay_tc(int n)
+{
+ struct timecounter *tc;
+ timecounter_get_t *func;
+ uint64_t end, freq, now;
+ u_int last, mask, u;
+
+ tc = timecounter;
+ freq = atomic_load_acq_64(&tsc_freq);
+ if (tsc_is_invariant && freq != 0) {
+ func = get_tsc;
+ mask = ~0u;
+ } else {
+ if (tc->tc_quality <= 0)
+ return (0);
+ func = tc->tc_get_timecount;
+ mask = tc->tc_counter_mask;
+ freq = tc->tc_frequency;
+ }
+ now = 0;
+ end = freq * n / 1000000;
+ if (func == get_tsc)
+ sched_pin();
+ last = func(tc) & mask;
+ do {
+ cpu_spinwait();
+ u = func(tc) & mask;
+ if (u < last)
+ now += mask - last + u + 1;
+ else
+ now += u - last;
+ last = u;
+ } while (now < end);
+ if (func == get_tsc)
+ sched_unpin();
+ return (1);
+}
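
The part of delay_tc() that is easy to misread is the accumulation of elapsed ticks from a counter that may be narrower than 32 bits and may wrap between two reads. The sketch below isolates just that arithmetic in userland C; counter_advance and usec_to_ticks are hypothetical helper names, not functions from the tree, and only a single wrap between reads is handled (as in the original loop).

```c
#include <stdint.h>

/*
 * Add to `now` the ticks elapsed between two reads (`last`, then `cur`)
 * of a free-running counter whose valid bits are given by `mask`.
 */
static uint64_t
counter_advance(uint64_t now, uint32_t last, uint32_t cur, uint32_t mask)
{
    if (cur < last)
        /* Counter wrapped: ticks up to the mask, plus one, plus cur. */
        now += (uint64_t)mask - last + cur + 1;
    else
        now += cur - last;
    return (now);
}

/* Number of counter ticks that correspond to n microseconds at freq Hz. */
static uint64_t
usec_to_ticks(uint64_t freq, int n)
{
    return (freq * (uint64_t)n / 1000000);
}
```

For example, with a 16-bit counter (mask 0xffff) going from 0xfff0 to 0x0005, 21 ticks have elapsed, which is what the wrap branch computes.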
diff --git a/sys/x86/x86/local_apic.c b/sys/x86/x86/local_apic.c
index 8c8eef6..d8d7701 100644
--- a/sys/x86/x86/local_apic.c
+++ b/sys/x86/x86/local_apic.c
@@ -1368,9 +1368,13 @@ apic_setup_io(void *dummy __unused)
if (retval != 0)
printf("%s: Failed to setup I/O APICs: returned %d\n",
best_enum->apic_name, retval);
-#ifdef XEN
- return;
+
+#if defined(XEN) || defined(XENHVM)
+ /* There's no lapic on PV Xen */
+ if (xen_pv_domain())
+ return;
#endif
+
/*
* Finish setting up the local APIC on the BSP once we know how to
* properly program the LINT pins.
diff --git a/sys/x86/xen/hvm.c b/sys/x86/xen/hvm.c
index 72811dc..be15594 100644
--- a/sys/x86/xen/hvm.c
+++ b/sys/x86/xen/hvm.c
@@ -35,15 +35,21 @@ __FBSDID("$FreeBSD$");
#include <sys/proc.h>
#include <sys/smp.h>
#include <sys/systm.h>
+#include <sys/lock.h>
+#include <sys/mutex.h>
+#include <sys/reboot.h>
#include <vm/vm.h>
#include <vm/pmap.h>
+#include <vm/vm_kern.h>
+#include <vm/vm_extern.h>
#include <dev/pci/pcivar.h>
#include <machine/cpufunc.h>
#include <machine/cpu.h>
#include <machine/smp.h>
+#include <machine/stdarg.h>
#include <x86/apicreg.h>
@@ -52,6 +58,9 @@ __FBSDID("$FreeBSD$");
#include <xen/gnttab.h>
#include <xen/hypervisor.h>
#include <xen/hvm.h>
+#ifdef __amd64__
+#include <xen/pv.h>
+#endif
#include <xen/xen_intr.h>
#include <xen/interface/hvm/params.h>
@@ -97,6 +106,11 @@ extern void pmap_lazyfix_action(void);
/* Variables used by mp_machdep to perform the bitmap IPI */
extern volatile u_int cpu_ipi_pending[MAXCPU];
+#ifdef __amd64__
+/* Native AP start used on PVHVM */
+extern int native_start_all_aps(void);
+#endif
+
/*---------------------------------- Macros ----------------------------------*/
#define IPI_TO_IDX(ipi) ((ipi) - APIC_IPI_INTS)
@@ -119,7 +133,10 @@ enum xen_domain_type xen_domain_type = XEN_NATIVE;
struct cpu_ops xen_hvm_cpu_ops = {
.ipi_vectored = lapic_ipi_vectored,
.cpu_init = xen_hvm_cpu_init,
- .cpu_resume = xen_hvm_cpu_resume
+ .cpu_resume = xen_hvm_cpu_resume,
+#ifdef __amd64__
+ .start_all_aps = native_start_all_aps,
+#endif
};
static MALLOC_DEFINE(M_XENHVM, "xen_hvm", "Xen HVM PV Support");
@@ -157,8 +174,9 @@ DPCPU_DEFINE(xen_intr_handle_t, ipi_handle[nitems(xen_ipis)]);
/*------------------ Hypervisor Access Shared Memory Regions -----------------*/
/** Hypercall table accessed via HYPERVISOR_*_op() methods. */
-char *hypercall_stubs;
+extern char *hypercall_page;
shared_info_t *HYPERVISOR_shared_info;
+start_info_t *HYPERVISOR_start_info;
#ifdef SMP
/*---------------------------- XEN PV IPI Handlers ---------------------------*/
@@ -522,7 +540,7 @@ xen_setup_cpus(void)
{
int i;
- if (!xen_hvm_domain() || !xen_vector_callback_enabled)
+ if (!xen_vector_callback_enabled)
return;
#ifdef __amd64__
@@ -558,7 +576,7 @@ xen_hvm_cpuid_base(void)
* Allocate and fill in the hypcall page.
*/
static int
-xen_hvm_init_hypercall_stubs(void)
+xen_hvm_init_hypercall_stubs(enum xen_hvm_init_type init_type)
{
uint32_t base, regs[4];
int i;
@@ -567,7 +585,7 @@ xen_hvm_init_hypercall_stubs(void)
if (base == 0)
return (ENXIO);
- if (hypercall_stubs == NULL) {
+ if (init_type == XEN_HVM_INIT_COLD) {
do_cpuid(base + 1, regs);
printf("XEN: Hypervisor version %d.%d detected.\n",
regs[0] >> 16, regs[0] & 0xffff);
@@ -577,18 +595,9 @@ xen_hvm_init_hypercall_stubs(void)
* Find the hypercall pages.
*/
do_cpuid(base + 2, regs);
-
- if (hypercall_stubs == NULL) {
- size_t call_region_size;
-
- call_region_size = regs[0] * PAGE_SIZE;
- hypercall_stubs = malloc(call_region_size, M_XENHVM, M_NOWAIT);
- if (hypercall_stubs == NULL)
- panic("Unable to allocate Xen hypercall region");
- }
for (i = 0; i < regs[0]; i++)
- wrmsr(regs[1], vtophys(hypercall_stubs + i * PAGE_SIZE) + i);
+ wrmsr(regs[1], vtophys((char *)&hypercall_page + i * PAGE_SIZE) + i);
return (0);
}
@@ -677,8 +686,6 @@ xen_hvm_disable_emulated_devices(void)
if (inw(XEN_MAGIC_IOPORT) != XMI_MAGIC)
return;
- if (bootverbose)
- printf("XEN: Disabling emulated block and network devices\n");
outw(XEN_MAGIC_IOPORT, XMI_UNPLUG_IDE_DISKS|XMI_UNPLUG_NICS);
}
@@ -691,7 +698,12 @@ xen_hvm_init(enum xen_hvm_init_type init_type)
if (init_type == XEN_HVM_INIT_CANCELLED_SUSPEND)
return;
- error = xen_hvm_init_hypercall_stubs();
+ if (xen_pv_domain()) {
+ /* hypercall page is already set in the PV case */
+ error = 0;
+ } else {
+ error = xen_hvm_init_hypercall_stubs(init_type);
+ }
switch (init_type) {
case XEN_HVM_INIT_COLD:
@@ -701,6 +713,12 @@ xen_hvm_init(enum xen_hvm_init_type init_type)
setup_xen_features();
cpu_ops = xen_hvm_cpu_ops;
vm_guest = VM_GUEST_XEN;
+#ifdef __amd64__
+ if (xen_pv_domain())
+ cpu_ops.start_all_aps = xen_pv_start_all_aps;
+ else
+#endif
+ printf("XEN: Disabling emulated block and network devices\n");
break;
case XEN_HVM_INIT_RESUME:
if (error != 0)
@@ -715,10 +733,13 @@ xen_hvm_init(enum xen_hvm_init_type init_type)
}
xen_vector_callback_enabled = 0;
- xen_domain_type = XEN_HVM_DOMAIN;
- xen_hvm_init_shared_info_page();
xen_hvm_set_callback(NULL);
- xen_hvm_disable_emulated_devices();
+
+ if (!xen_pv_domain()) {
+ xen_domain_type = XEN_HVM_DOMAIN;
+ xen_hvm_init_shared_info_page();
+ xen_hvm_disable_emulated_devices();
+ }
}
void
@@ -749,10 +770,11 @@ xen_set_vcpu_id(void)
struct pcpu *pc;
int i;
- /* Set vcpu_id to acpi_id */
+ /* Set vcpu_id to acpi_id for PVHVM guests */
CPU_FOREACH(i) {
pc = pcpu_find(i);
- pc->pc_vcpu_id = pc->pc_acpi_id;
+ if (xen_hvm_domain())
+ pc->pc_vcpu_id = pc->pc_acpi_id;
if (bootverbose)
printf("XEN: CPU %u has VCPU ID %u\n",
i, pc->pc_vcpu_id);
@@ -790,6 +812,31 @@ xen_hvm_cpu_init(void)
DPCPU_SET(vcpu_info, vcpu_info);
}
+/*----------------------------- Debug functions ------------------------------*/
+#define PRINTK_BUFSIZE 1024
+static int
+vprintk(const char *fmt, __va_list ap)
+{
+ int retval, len;
+ static char buf[PRINTK_BUFSIZE];
+
+ /* vsnprintf() NUL-terminates; its return value may exceed the buffer */
+ (void)vsnprintf(buf, sizeof(buf), fmt, ap);
+ len = strlen(buf);
+ retval = HYPERVISOR_console_io(CONSOLEIO_write, len, (char *)buf);
+ return retval;
+}
+
+void
+xen_early_printf(const char *fmt, ...)
+{
+ __va_list ap;
+
+ va_start(ap, fmt);
+ vprintk(fmt, ap);
+ va_end(ap);
+}
+
SYSINIT(xen_hvm_init, SI_SUB_HYPERVISOR, SI_ORDER_FIRST, xen_hvm_sysinit, NULL);
#ifdef SMP
SYSINIT(xen_setup_cpus, SI_SUB_SMP, SI_ORDER_FIRST, xen_setup_cpus, NULL);
diff --git a/sys/x86/xen/mptable.c b/sys/x86/xen/mptable.c
new file mode 100644
index 0000000..8916314
--- /dev/null
+++ b/sys/x86/xen/mptable.c
@@ -0,0 +1,136 @@
+/*-
+ * Copyright (c) 2003 John Baldwin <jhb at FreeBSD.org>
+ * Copyright (c) 2013 Roger Pau Monné <roger.pau at citrix.com>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the author nor the names of any co-contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <sys/param.h>
+#include <sys/systm.h>
+#include <sys/bus.h>
+#include <sys/kernel.h>
+#include <sys/smp.h>
+#include <sys/pcpu.h>
+#include <vm/vm.h>
+#include <vm/pmap.h>
+
+#include <machine/intr_machdep.h>
+#include <machine/apicvar.h>
+
+#include <machine/cpu.h>
+#include <machine/smp.h>
+
+#include <xen/xen-os.h>
+#include <xen/hypervisor.h>
+
+#include <xen/interface/vcpu.h>
+
+static int xenpv_probe(void);
+static int xenpv_probe_cpus(void);
+static int xenpv_setup_local(void);
+static int xenpv_setup_io(void);
+
+static struct apic_enumerator xenpv_enumerator = {
+ "Xen PV",
+ xenpv_probe,
+ xenpv_probe_cpus,
+ xenpv_setup_local,
+ xenpv_setup_io
+};
+
+/*
+ * Always claim the platform; there is no ACPI MADT to look for on Xen PV.
+ */
+static int
+xenpv_probe(void)
+{
+ return (-100);
+}
+
+/*
+ * Ask the hypervisor which vCPUs exist and add each one of them.
+ */
+static int
+xenpv_probe_cpus(void)
+{
+ int i, ret;
+
+ for (i = 0; i < MAXCPU; i++) {
+ ret = HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL);
+ if (ret >= 0)
+ cpu_add((i * 2), (i == 0));
+ }
+
+ return (0);
+}
+
+/*
+ * No local APIC is set up here; just record the vCPU ID of the BSP.
+ */
+static int
+xenpv_setup_local(void)
+{
+ PCPU_SET(vcpu_id, 0);
+ return (0);
+}
+
+/*
+ * Nothing to do: no I/O APICs are enumerated here.
+ */
+static int
+xenpv_setup_io(void)
+{
+ return (0);
+}
+
+static void
+xenpv_register(void *dummy __unused)
+{
+ if (xen_pv_domain()) {
+ apic_register_enumerator(&xenpv_enumerator);
+ }
+}
+SYSINIT(xenpv_register, SI_SUB_TUNABLES - 1, SI_ORDER_FIRST, xenpv_register, NULL);
+
+/*
+ * Set up per-CPU vCPU IDs.
+ */
+static void
+xenpv_set_ids(void *dummy)
+{
+ struct pcpu *pc;
+ int i;
+
+ CPU_FOREACH(i) {
+ pc = pcpu_find(i);
+ pc->pc_vcpu_id = i;
+ }
+ return;
+}
+SYSINIT(xenpv_set_ids, SI_SUB_CPU, SI_ORDER_MIDDLE, xenpv_set_ids, NULL);
diff --git a/sys/x86/xen/pv.c b/sys/x86/xen/pv.c
new file mode 100644
index 0000000..6756dec
--- /dev/null
+++ b/sys/x86/xen/pv.c
@@ -0,0 +1,247 @@
+/*
+ * Copyright (c) 2004 Christian Limpach.
+ * Copyright (c) 2004-2006,2008 Kip Macy
+ * Copyright (c) 2013 Roger Pau Monné <roger.pau at citrix.com>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <sys/param.h>
+#include <sys/bus.h>
+#include <sys/kernel.h>
+#include <sys/malloc.h>
+#include <sys/proc.h>
+#include <sys/smp.h>
+#include <sys/systm.h>
+#include <sys/lock.h>
+#include <sys/mutex.h>
+#include <sys/reboot.h>
+
+#include <vm/vm.h>
+#include <vm/pmap.h>
+#include <vm/vm_kern.h>
+#include <vm/vm_extern.h>
+
+#include <dev/pci/pcivar.h>
+
+#include <machine/cpufunc.h>
+#include <machine/cpu.h>
+#include <machine/smp.h>
+#include <machine/tss.h>
+#include <machine/sysarch.h>
+#include <machine/clock.h>
+
+#include <x86/apicreg.h>
+
+#include <xen/xen-os.h>
+#include <xen/features.h>
+#include <xen/gnttab.h>
+#include <xen/hypervisor.h>
+#include <xen/hvm.h>
+#include <xen/pv.h>
+#include <xen/xen_intr.h>
+
+#include <xen/interface/hvm/params.h>
+#include <xen/interface/vcpu.h>
+
+#define MAX_E820_ENTRIES 128
+
+/*--------------------------- Forward Declarations ---------------------------*/
+static caddr_t xen_pv_parse_preload_data(u_int64_t);
+static void xen_pv_fetch_e820_map(caddr_t, struct bios_smap **, u_int32_t *);
+
+/*---------------------------- Extern Declarations ---------------------------*/
+/* Variables used by amd64 mp_machdep to start APs */
+extern struct mtx ap_boot_mtx;
+extern void *bootstacks[];
+extern char *doublefault_stack;
+extern char *nmi_stack;
+extern void *dpcpu;
+extern int bootAP;
+extern char *bootSTK;
+extern bool lapic_disabled;
+
+/*-------------------------------- Global Data -------------------------------*/
+/* Xen init_ops implementation. */
+struct init_ops xen_init_ops = {
+ .parse_preload_data = xen_pv_parse_preload_data,
+ .early_delay_init = xen_delay_init,
+ .early_delay = xen_delay,
+ .fetch_e820_map = xen_pv_fetch_e820_map,
+};
+
+static struct
+{
+ const char *ev;
+ int mask;
+} howto_names[] = {
+ {"boot_askname", RB_ASKNAME},
+ {"boot_single", RB_SINGLE},
+ {"boot_nosync", RB_NOSYNC},
+ {"boot_halt", RB_HALT},
+ {"boot_serial", RB_SERIAL},
+ {"boot_cdrom", RB_CDROM},
+ {"boot_gdb", RB_GDB},
+ {"boot_gdb_pause", RB_RESERVED1},
+ {"boot_verbose", RB_VERBOSE},
+ {"boot_multicons", RB_MULTIPLE},
+ {NULL, 0}
+};
+
+static struct bios_smap xen_smap[MAX_E820_ENTRIES];
+
+static int
+start_xen_ap(int cpu)
+{
+ struct vcpu_guest_context *ctxt;
+ int ms, cpus = mp_naps;
+
+ ctxt = malloc(sizeof(*ctxt), M_TEMP, M_NOWAIT | M_ZERO);
+ if (ctxt == NULL)
+ panic("unable to allocate memory");
+
+ ctxt->flags = VGCF_IN_KERNEL;
+ ctxt->user_regs.rip = (unsigned long) init_secondary;
+ ctxt->user_regs.rsp = (unsigned long) bootSTK;
+
+ /* Set the CPU to use the same page tables and CR4 value */
+ ctxt->ctrlreg[3] = KPML4phys;
+ ctxt->ctrlreg[4] = rcr4();
+
+ if (HYPERVISOR_vcpu_op(VCPUOP_initialise, cpu, ctxt))
+ panic("unable to initialize CPU#%d\n", cpu);
+
+ free(ctxt, M_TEMP);
+
+ /* Launch the vCPU */
+ if (HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL))
+ panic("unable to start AP#%d\n", cpu);
+
+ /* Wait up to 5 seconds for it to start. */
+ for (ms = 0; ms < 5000; ms++) {
+ if (mp_naps > cpus)
+ return 1; /* return SUCCESS */
+ DELAY(1000);
+ }
+
+ return 0;
+}
+
+int
+xen_pv_start_all_aps(void)
+{
+ int cpu;
+
+ mtx_init(&ap_boot_mtx, "ap boot", NULL, MTX_SPIN);
+ lapic_disabled = true;
+
+ for (cpu = 1; cpu < mp_ncpus; cpu++) {
+
+ /* allocate and set up an idle stack data page */
+ bootstacks[cpu] = (void *)kmem_malloc(kernel_arena,
+ KSTACK_PAGES * PAGE_SIZE, M_WAITOK | M_ZERO);
+ doublefault_stack = (char *)kmem_malloc(kernel_arena,
+ PAGE_SIZE, M_WAITOK | M_ZERO);
+ nmi_stack = (char *)kmem_malloc(kernel_arena, PAGE_SIZE,
+ M_WAITOK | M_ZERO);
+ dpcpu = (void *)kmem_malloc(kernel_arena, DPCPU_SIZE,
+ M_WAITOK | M_ZERO);
+
+ bootSTK = (char *)bootstacks[cpu] + KSTACK_PAGES * PAGE_SIZE - 8;
+ bootAP = cpu;
+
+ /* attempt to start the Application Processor */
+ if (!start_xen_ap(cpu))
+ panic("AP #%d failed to start!", cpu);
+
+ CPU_SET(cpu, &all_cpus); /* record AP in CPU map */
+ }
+
+ return mp_naps;
+}
+
+/*
+ * Functions to convert the "extra" parameters passed by Xen
+ * into FreeBSD boot options (from the i386 Xen port).
+ */
+static char *
+xen_setbootenv(char *cmd_line)
+{
+ char *cmd_line_next;
+
+ /* Skip leading spaces */
+ for (; *cmd_line == ' '; cmd_line++);
+
+ for (cmd_line_next = cmd_line; strsep(&cmd_line_next, ",") != NULL;);
+ return (cmd_line);
+}
+
+static int
+xen_boothowto(char *envp)
+{
+ int i, howto = 0;
+
+ /* get equivalents from the environment */
+ for (i = 0; howto_names[i].ev != NULL; i++)
+ if (getenv(howto_names[i].ev) != NULL)
+ howto |= howto_names[i].mask;
+ return (howto);
+}
+
+static caddr_t
+xen_pv_parse_preload_data(u_int64_t modulep)
+{
+ /* Parse the extra boot information given by Xen */
+ if (HYPERVISOR_start_info->cmd_line)
+ kern_envp = xen_setbootenv(HYPERVISOR_start_info->cmd_line);
+ boothowto |= xen_boothowto(kern_envp);
+
+ return (NULL);
+}
+
+static void
+xen_pv_fetch_e820_map(caddr_t kmdp, struct bios_smap **smap, u_int32_t *size)
+{
+ struct xen_memory_map memmap;
+ int rc;
+
+ /* Fetch the E820 map from Xen */
+ memmap.nr_entries = MAX_E820_ENTRIES;
+ set_xen_guest_handle(memmap.buffer, xen_smap);
+ rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
+ if (rc)
+ panic("unable to fetch Xen E820 memory map");
+
+ *smap = xen_smap;
+ *size = memmap.nr_entries * sizeof(xen_smap[0]);
+}
+
+void
+xen_pv_set_init_ops(void)
+{
+ /* Init ops for Xen PV */
+ init_ops = xen_init_ops;
+}
diff --git a/sys/x86/xen/pvcpu.c b/sys/x86/xen/pvcpu.c
new file mode 100644
index 0000000..00e063b
--- /dev/null
+++ b/sys/x86/xen/pvcpu.c
@@ -0,0 +1,98 @@
+/*
+ * Copyright (c) 2013 Roger Pau Monné <roger.pau at citrix.com>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <sys/param.h>
+#include <sys/systm.h>
+#include <sys/bus.h>
+#include <sys/kernel.h>
+#include <sys/module.h>
+#include <sys/pcpu.h>
+#include <sys/smp.h>
+
+#include <xen/xen-os.h>
+
+static void
+xenpvcpu_identify(driver_t *driver, device_t parent)
+{
+ int i;
+
+ if (!xen_pv_domain())
+ return;
+
+ CPU_FOREACH(i)
+ BUS_ADD_CHILD(parent, 0, "pvcpu", i);
+}
+
+static int
+xenpvcpu_probe(device_t dev)
+{
+ if (!xen_pv_domain())
+ return (ENXIO);
+
+ device_set_desc(dev, "Xen PV CPU");
+ return (0);
+}
+
+static int
+xenpvcpu_attach(device_t dev)
+{
+ struct pcpu *pc;
+ int cpu;
+
+ cpu = device_get_unit(dev);
+ pc = pcpu_find(cpu);
+ pc->pc_device = dev;
+ return (0);
+}
+
+static int
+xenpvcpu_detach(device_t dev)
+{
+
+ return (0);
+}
+
+static device_method_t xenpvcpu_methods[] = {
+ DEVMETHOD(device_identify, xenpvcpu_identify),
+ DEVMETHOD(device_probe, xenpvcpu_probe),
+ DEVMETHOD(device_attach, xenpvcpu_attach),
+ DEVMETHOD(device_detach, xenpvcpu_detach),
+ DEVMETHOD_END
+};
+
+static driver_t xenpvcpu_driver = {
+ "pvcpu",
+ xenpvcpu_methods,
+ 0,
+};
+
+devclass_t xenpvcpu_devclass;
+
+DRIVER_MODULE(xenpvcpu, nexus, xenpvcpu_driver, xenpvcpu_devclass, 0, 0);
+MODULE_DEPEND(xenpvcpu, nexus, 1, 1, 1);
diff --git a/sys/xen/gnttab.c b/sys/xen/gnttab.c
index 03c32b7..909378a 100644
--- a/sys/xen/gnttab.c
+++ b/sys/xen/gnttab.c
@@ -25,6 +25,7 @@ __FBSDID("$FreeBSD$");
#include <sys/lock.h>
#include <sys/malloc.h>
#include <sys/mman.h>
+#include <sys/limits.h>
#include <xen/xen-os.h>
#include <xen/hypervisor.h>
@@ -607,6 +608,7 @@ gnttab_resume(void)
{
int error;
unsigned int max_nr_gframes, nr_gframes;
+ void *alloc_mem;
nr_gframes = nr_grant_frames;
max_nr_gframes = max_nr_grant_frames();
@@ -614,11 +616,20 @@ gnttab_resume(void)
return (ENOSYS);
if (!resume_frames) {
- error = xenpci_alloc_space(PAGE_SIZE * max_nr_gframes,
- &resume_frames);
- if (error) {
- printf("error mapping gnttab share frames\n");
- return (error);
+ if (xen_pv_domain()) {
+ alloc_mem = contigmalloc(max_nr_gframes * PAGE_SIZE,
+ M_DEVBUF, M_NOWAIT, 0,
+ ULONG_MAX, PAGE_SIZE, 0);
+ KASSERT((alloc_mem != NULL),
+ ("unable to alloc memory for gnttab"));
+ resume_frames = vtophys(alloc_mem);
+ } else {
+ error = xenpci_alloc_space(PAGE_SIZE * max_nr_gframes,
+ &resume_frames);
+ if (error) {
+ printf("error mapping gnttab share frames\n");
+ return (error);
+ }
}
}
diff --git a/sys/xen/interface/arch-x86/xen.h b/sys/xen/interface/arch-x86/xen.h
index 1c186d7..6cc15d3 100644
--- a/sys/xen/interface/arch-x86/xen.h
+++ b/sys/xen/interface/arch-x86/xen.h
@@ -147,7 +147,16 @@ struct vcpu_guest_context {
struct cpu_user_regs user_regs; /* User-level CPU registers */
struct trap_info trap_ctxt[256]; /* Virtual IDT */
unsigned long ldt_base, ldt_ents; /* LDT (linear address, # ents) */
- unsigned long gdt_frames[16], gdt_ents; /* GDT (machine frames, # ents) */
+ union {
+ struct {
+ /* PV: GDT (machine frames, # ents).*/
+ unsigned long gdt_frames[16], gdt_ents;
+ } pv;
+ struct {
+ /* PVH: GDTR addr and size */
+ unsigned long gdtaddr, gdtsz;
+ } pvh;
+ } u;
unsigned long kernel_ss, kernel_sp; /* Virtual TSS (only SS1/SP1) */
/* NB. User pagetable on x86/64 is placed in ctrlreg[1]. */
unsigned long ctrlreg[8]; /* CR0-CR7 (control registers) */
diff --git a/sys/xen/pv.h b/sys/xen/pv.h
new file mode 100644
index 0000000..bbb1048
--- /dev/null
+++ b/sys/xen/pv.h
@@ -0,0 +1,29 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * $FreeBSD$
+ */
+
+#ifndef __XEN_PV_H__
+#define __XEN_PV_H__
+
+int xen_pv_start_all_aps(void);
+void xen_pv_set_init_ops(void);
+
+#endif /* __XEN_PV_H__ */
\ No newline at end of file
diff --git a/sys/xen/xen-os.h b/sys/xen/xen-os.h
index 95e8c6a..d3dccad 100644
--- a/sys/xen/xen-os.h
+++ b/sys/xen/xen-os.h
@@ -53,6 +53,11 @@ void force_evtchn_callback(void);
extern int gdtset;
extern shared_info_t *HYPERVISOR_shared_info;
+extern start_info_t *HYPERVISOR_start_info;
+
+/* XXX: we need to get rid of this and use HYPERVISOR_start_info directly */
+extern struct xenstore_domain_interface *xen_store;
+extern char *console_page;
enum xen_domain_type {
XEN_NATIVE, /* running on bare hardware */
@@ -80,6 +85,9 @@ xen_hvm_domain(void)
return (xen_domain_type == XEN_HVM_DOMAIN);
}
+/* Debug function, prints directly to hypervisor console */
+void xen_early_printf(const char *, ...);
+
#ifndef xen_mb
#define xen_mb() mb()
#endif
diff --git a/sys/xen/xenstore/xenstore.c b/sys/xen/xenstore/xenstore.c
index d404862..b9885af 100644
--- a/sys/xen/xenstore/xenstore.c
+++ b/sys/xen/xenstore/xenstore.c
@@ -1082,6 +1082,19 @@ xs_init_comms(void)
static void
xs_identify(driver_t *driver, device_t parent)
{
+ const char *parent_name;
+
+ if (!xen_domain())
+ return;
+
+ /*
+ * On HVM domains we will get called twice, once from the nexus
+ * and another time after the xenpci device is attached; we should
+ * only attach after the xenpci device has been added.
+ */
+ parent_name = device_get_name(parent);
+ if (xen_hvm_domain() && strncmp(parent_name, "xenpci", 6) != 0)
+ return;
BUS_ADD_CHILD(parent, 0, "xenstore", 0);
}
@@ -1147,13 +1160,15 @@ xs_attach(device_t dev)
/* Initialize the interface to xenstore. */
struct proc *p;
-#ifdef XENHVM
- xs.evtchn = hvm_get_parameter(HVM_PARAM_STORE_EVTCHN);
- xs.gpfn = hvm_get_parameter(HVM_PARAM_STORE_PFN);
- xen_store = pmap_mapdev(xs.gpfn * PAGE_SIZE, PAGE_SIZE);
-#else
- xs.evtchn = xen_start_info->store_evtchn;
-#endif
+ if (xen_hvm_domain()) {
+ xs.evtchn = hvm_get_parameter(HVM_PARAM_STORE_EVTCHN);
+ xs.gpfn = hvm_get_parameter(HVM_PARAM_STORE_PFN);
+ xen_store = pmap_mapdev(xs.gpfn * PAGE_SIZE, PAGE_SIZE);
+ } else if (xen_pv_domain()) {
+ xs.evtchn = HYPERVISOR_start_info->store_evtchn;
+ } else {
+ panic("Unknown domain type, cannot initialize xenstore\n");
+ }
TAILQ_INIT(&xs.reply_list);
TAILQ_INIT(&xs.watch_events);
@@ -1263,9 +1278,8 @@ static devclass_t xenstore_devclass;
#ifdef XENHVM
DRIVER_MODULE(xenstore, xenpci, xenstore_driver, xenstore_devclass, 0, 0);
-#else
-DRIVER_MODULE(xenstore, nexus, xenstore_driver, xenstore_devclass, 0, 0);
#endif
+DRIVER_MODULE(xenstore, nexus, xenstore_driver, xenstore_devclass, 0, 0);
/*------------------------------- Sysctl Data --------------------------------*/
/* XXX Shouldn't the node be somewhere else? */
--
1.7.7.5 (Apple Git-26)
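
As a rough illustration (not part of the patch), the following userspace sketch shows the idea behind xen_setbootenv() and xen_boothowto(): the comma-separated "extra" string from Xen is split in place into NUL-separated "name=value" entries, and each known boot variable found there contributes a flag bit. All sk_* names are hypothetical, the flag values are stand-ins for the real ones in <sys/reboot.h>, and the lookup helper only approximates what the kernel getenv() does. The sketch also assumes the buffer has a spare NUL byte after the string, so the resulting list ends with an empty entry.

```c
#define _DEFAULT_SOURCE		/* for strsep() on glibc */
#include <stddef.h>
#include <string.h>

/* Stand-in flag values for illustration only; the kernel uses <sys/reboot.h>. */
#define SK_RB_SINGLE	0x002
#define SK_RB_VERBOSE	0x800

/* Hypothetical, shortened version of the howto_names table. */
static const struct {
	const char *ev;
	int mask;
} sk_howto_names[] = {
	{"boot_single", SK_RB_SINGLE},
	{"boot_verbose", SK_RB_VERBOSE},
	{NULL, 0}
};

/*
 * Skip leading spaces, then overwrite every ',' with '\0' in place.
 * The caller must leave one spare '\0' after the string so that the
 * result ends with an empty entry (double NUL).
 */
static char *
sk_setbootenv(char *cmd_line)
{
	char *next;

	while (*cmd_line == ' ')
		cmd_line++;
	for (next = cmd_line; strsep(&next, ",") != NULL;)
		;
	return (cmd_line);
}

/* Return nonzero if some entry of the NUL-separated block is "name=...". */
static int
sk_env_has(const char *env, const char *name)
{
	size_t len = strlen(name);

	for (; *env != '\0'; env += strlen(env) + 1)
		if (strncmp(env, name, len) == 0 && env[len] == '=')
			return (1);
	return (0);
}

/* OR together the mask of every known variable present in the block. */
static int
sk_boothowto(const char *env)
{
	int i, howto = 0;

	for (i = 0; sk_howto_names[i].ev != NULL; i++)
		if (sk_env_has(env, sk_howto_names[i].ev))
			howto |= sk_howto_names[i].mask;
	return (howto);
}
```

With this sketch, a guest configured with extra = "boot_verbose=1,foo=bar" would end up with the two entries "boot_verbose=1" and "foo=bar" and the verbose bit set. Note that an empty field (two consecutive commas) would terminate the list early in this simplified version.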