Date: Mon, 8 May 2023 08:25:28 GMT
Message-Id: <202305080825.3488PSiw050355@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Corvin Köhne
Subject: git: 694f2c9d354e - stable/13 - bhyve: add basic E820 implementation
List-Id: Commits to the stable branches of the FreeBSD src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-branches
Sender: owner-dev-commits-src-branches@freebsd.org
X-BeenThere: dev-commits-src-branches@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: corvink
X-Git-Repository: src
X-Git-Refname: refs/heads/stable/13
X-Git-Reftype: branch
X-Git-Commit: 694f2c9d354e9938f7ab10376e04e5ee1c04940b
Auto-Submitted: auto-generated
X-ThisMailContainsUnwantedMimeParts: N

The branch stable/13 has been updated by corvink:

URL: https://cgit.FreeBSD.org/src/commit/?id=694f2c9d354e9938f7ab10376e04e5ee1c04940b

commit 694f2c9d354e9938f7ab10376e04e5ee1c04940b
Author:     Corvin Köhne
AuthorDate: 2021-09-09 09:37:03 +0000
Commit:     Corvin Köhne
CommitDate: 2023-05-08 08:21:30 +0000

    bhyve: add basic E820 implementation

    There are some use cases where bhyve has to prepare some special memory
    regions. E.g. GPU passthrough for Intel integrated graphic devices needs
    to reserve some memory for the graphic device. So, bhyve has to inform
    the guest about those memory regions. This information can be passed by
    the qemu fwcfg interface. As qemu creates an E820 table, we can reuse
    the existing fwcfg item "etc/e820".

    This commit is the first one of a series. It only adds a basic
    implementation for the creation of the E820 table. Some subsequent
    commits will add more items to the E820 table and register it as fwcfg
    item.

    Reviewed by:            markj
    MFC after:              1 week
    Sponsored by:           Beckhoff Automation GmbH & Co. KG
    Differential Revision:  https://reviews.freebsd.org/D39545

    (cherry picked from commit 9180daa1e34577aaccf3ff64cc63a5179c4f09d8)
---
 usr.sbin/bhyve/e820.c | 233 ++++++++++++++++++++++++++++++++++++++++++++++++++
 usr.sbin/bhyve/e820.h |  28 ++++++
 2 files changed, 261 insertions(+)

diff --git a/usr.sbin/bhyve/e820.c b/usr.sbin/bhyve/e820.c
new file mode 100644
index 000000000000..746d34d6521c
--- /dev/null
+++ b/usr.sbin/bhyve/e820.c
@@ -0,0 +1,233 @@
+/*-
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright (c) 2021 Beckhoff Automation GmbH & Co. KG
+ * Author: Corvin Köhne
+ */
+
+#include
+#include
+
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "e820.h"
+#include "qemu_fwcfg.h"
+
+#define E820_FWCFG_FILE_NAME "etc/e820"
+
+#define KB (1024UL)
+#define MB (1024 * KB)
+#define GB (1024 * MB)
+
+struct e820_element {
+	TAILQ_ENTRY(e820_element) chain;
+	uint64_t base;
+	uint64_t end;
+	enum e820_memory_type type;
+};
+static TAILQ_HEAD(e820_table, e820_element) e820_table = TAILQ_HEAD_INITIALIZER(
+    e820_table);
+
+static struct e820_element *
+e820_element_alloc(uint64_t base, uint64_t end, enum e820_memory_type type)
+{
+	struct e820_element *element;
+
+	element = calloc(1, sizeof(*element));
+	if (element == NULL) {
+		return (NULL);
+	}
+
+	element->base = base;
+	element->end = end;
+	element->type = type;
+
+	return (element);
+}
+
+struct qemu_fwcfg_item *
+e820_get_fwcfg_item(void)
+{
+	struct qemu_fwcfg_item *fwcfg_item;
+	struct e820_element *element;
+	struct e820_entry *entries;
+	int count, i;
+
+	count = 0;
+	TAILQ_FOREACH(element, &e820_table, chain) {
+		++count;
+	}
+	if (count == 0) {
+		warnx("%s: E820 table empty", __func__);
+		return (NULL);
+	}
+
+	fwcfg_item = calloc(1, sizeof(struct qemu_fwcfg_item));
+	if (fwcfg_item == NULL) {
+		return (NULL);
+	}
+
+	fwcfg_item->size = count * sizeof(struct e820_entry);
+	fwcfg_item->data = calloc(count, sizeof(struct e820_entry));
+	if (fwcfg_item->data == NULL) {
+		free(fwcfg_item);
+		return (NULL);
+	}
+
+	i = 0;
+	entries = (struct e820_entry *)fwcfg_item->data;
+	TAILQ_FOREACH(element, &e820_table, chain) {
+		struct e820_entry *entry = &entries[i];
+
+		entry->base = element->base;
+		entry->length = element->end - element->base;
+		entry->type = element->type;
+
+		++i;
+	}
+
+	return (fwcfg_item);
+}
+
+static int
+e820_add_entry(const uint64_t base, const uint64_t end,
+    const enum e820_memory_type type)
+{
+	struct e820_element *new_element;
+	struct e820_element *element;
+	struct e820_element *ram_element;
+
+	assert(end >= base);
+
+	new_element = e820_element_alloc(base, end, type);
+	if (new_element == NULL) {
+		return (ENOMEM);
+	}
+
+	/*
+	 * E820 table should always be sorted in ascending order. Therefore,
+	 * search for a range whose end is larger than the base parameter.
+	 */
+	TAILQ_FOREACH(element, &e820_table, chain) {
+		if (element->end > base) {
+			break;
+		}
+	}
+
+	/*
+	 * System memory requires special handling.
+	 */
+	if (type == E820_TYPE_MEMORY) {
+		/*
+		 * base is larger than of any existing element. Add new system
+		 * memory at the end of the table.
+		 */
+		if (element == NULL) {
+			TAILQ_INSERT_TAIL(&e820_table, new_element, chain);
+			return (0);
+		}
+
+		/*
+		 * System memory shouldn't overlap with any existing element.
+		 */
+		assert(end >= element->base);
+
+		TAILQ_INSERT_BEFORE(element, new_element, chain);
+
+		return (0);
+	}
+
+	assert(element != NULL);
+	/* Non system memory should be allocated inside system memory. */
+	assert(element->type == E820_TYPE_MEMORY);
+	/* New element should fit into existing system memory element. */
+	assert(base >= element->base && end <= element->end);
+
+	if (base == element->base) {
+		/*
+		 * New element at system memory base boundary. Add new
+		 * element before current and adjust the base of the old
+		 * element.
+		 *
+		 * Old table:
+		 *	[ 0x1000, 0x4000] RAM		<-- element
+		 * New table:
+		 *	[ 0x1000, 0x2000] Reserved
+		 *	[ 0x2000, 0x4000] RAM		<-- element
+		 */
+		TAILQ_INSERT_BEFORE(element, new_element, chain);
+		element->base = end;
+	} else if (end == element->end) {
+		/*
+		 * New element at system memory end boundary. Add new
+		 * element after current and adjust the end of the
+		 * current element.
+		 *
+		 * Old table:
+		 *	[ 0x1000, 0x4000] RAM		<-- element
+		 * New table:
+		 *	[ 0x1000, 0x3000] RAM		<-- element
+		 *	[ 0x3000, 0x4000] Reserved
+		 */
+		TAILQ_INSERT_AFTER(&e820_table, element, new_element, chain);
+		element->end = base;
+	} else {
+		/*
+		 * New element inside system memory entry. Split it by
+		 * adding a system memory element and the new element
+		 * before current.
+		 *
+		 * Old table:
+		 *	[ 0x1000, 0x4000] RAM		<-- element
+		 * New table:
+		 *	[ 0x1000, 0x2000] RAM
+		 *	[ 0x2000, 0x3000] Reserved
+		 *	[ 0x3000, 0x4000] RAM		<-- element
+		 */
+		ram_element = e820_element_alloc(element->base, base,
+		    E820_TYPE_MEMORY);
+		if (ram_element == NULL) {
+			return (ENOMEM);
+		}
+		TAILQ_INSERT_BEFORE(element, ram_element, chain);
+		TAILQ_INSERT_BEFORE(element, new_element, chain);
+		element->base = end;
+	}
+
+	return (0);
+}
+
+int
+e820_init(struct vmctx *const ctx)
+{
+	uint64_t lowmem_size, highmem_size;
+	int error;
+
+	TAILQ_INIT(&e820_table);
+
+	lowmem_size = vm_get_lowmem_size(ctx);
+	error = e820_add_entry(0, lowmem_size, E820_TYPE_MEMORY);
+	if (error) {
+		warnx("%s: Could not add lowmem", __func__);
+		return (error);
+	}
+
+	highmem_size = vm_get_highmem_size(ctx);
+	if (highmem_size != 0) {
+		error = e820_add_entry(4 * GB, 4 * GB + highmem_size,
+		    E820_TYPE_MEMORY);
+		if (error) {
+			warnx("%s: Could not add highmem", __func__);
+			return (error);
+		}
+	}
+
+	return (0);
+}
diff --git a/usr.sbin/bhyve/e820.h b/usr.sbin/bhyve/e820.h
new file mode 100644
index 000000000000..6843ad5dc736
--- /dev/null
+++ b/usr.sbin/bhyve/e820.h
@@ -0,0 +1,28 @@
+/*-
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright (c) 2021 Beckhoff Automation GmbH & Co. KG
+ * Author: Corvin Köhne
+ */
+
+#pragma once
+
+#include
+
+#include "qemu_fwcfg.h"
+
+enum e820_memory_type {
+	E820_TYPE_MEMORY = 1,
+	E820_TYPE_RESERVED = 2,
+	E820_TYPE_ACPI = 3,
+	E820_TYPE_NVS = 4
+};
+
+struct e820_entry {
+	uint64_t base;
+	uint64_t length;
+	uint32_t type;
+} __packed;
+
+struct qemu_fwcfg_item *e820_get_fwcfg_item(void);
+int e820_init(struct vmctx *const ctx);
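
For readers following the three carve-out cases described in the comments of e820_add_entry() (reserved range at the base of a RAM range, at its end, or strictly inside it), here is a minimal standalone sketch of the same idea. It is not the code from this commit: it uses a small fixed-size array instead of a TAILQ, and every name in it (sketch_range, sketch_table, sketch_insert_at, sketch_reserve, SKETCH_MAX) is hypothetical. It also treats a request that covers a whole RAM range by retyping that range, which is a choice made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; this is not the bhyve implementation. */
enum sketch_type { SKETCH_RAM = 1, SKETCH_RESERVED = 2 };

struct sketch_range {
	uint64_t base;
	uint64_t end;		/* exclusive upper bound */
	enum sketch_type type;
};

#define SKETCH_MAX 16

struct sketch_table {
	struct sketch_range r[SKETCH_MAX];
	size_t n;		/* number of used slots, kept sorted by base */
};

/* Insert 'range' at index 'pos', shifting later entries up by one slot. */
static int
sketch_insert_at(struct sketch_table *t, size_t pos, struct sketch_range range)
{
	if (t->n == SKETCH_MAX || pos > t->n)
		return (-1);
	memmove(&t->r[pos + 1], &t->r[pos], (t->n - pos) * sizeof(t->r[0]));
	t->r[pos] = range;
	t->n++;
	return (0);
}

/*
 * Carve [base, end) out of the RAM range that fully contains it.
 * Returns 0 on success, -1 if no RAM range contains the request or the
 * table has no room for the extra entries.
 */
static int
sketch_reserve(struct sketch_table *t, uint64_t base, uint64_t end)
{
	struct sketch_range res = { base, end, SKETCH_RESERVED };
	struct sketch_range low;
	size_t i;

	if (end <= base)
		return (-1);

	/* Find the RAM range that contains the whole request. */
	for (i = 0; i < t->n; i++) {
		if (t->r[i].type == SKETCH_RAM && base >= t->r[i].base &&
		    end <= t->r[i].end)
			break;
	}
	if (i == t->n)
		return (-1);

	if (base == t->r[i].base && end == t->r[i].end) {
		/* Sketch-only case: the request covers the whole range. */
		t->r[i].type = SKETCH_RESERVED;
	} else if (base == t->r[i].base) {
		/* At the base boundary: insert before, shrink RAM from below. */
		if (sketch_insert_at(t, i, res) != 0)
			return (-1);
		t->r[i + 1].base = end;
	} else if (end == t->r[i].end) {
		/* At the end boundary: insert after, shrink RAM from above. */
		if (sketch_insert_at(t, i + 1, res) != 0)
			return (-1);
		t->r[i].end = base;
	} else {
		/* Strictly inside: split into low RAM, reserved, high RAM. */
		if (t->n + 2 > SKETCH_MAX)
			return (-1);
		low.base = t->r[i].base;
		low.end = base;
		low.type = SKETCH_RAM;
		(void)sketch_insert_at(t, i, low);
		(void)sketch_insert_at(t, i + 1, res);
		t->r[i + 2].base = end;
	}
	return (0);
}
```

Starting from a table that holds a single RAM range [0x1000, 0x4000), a call to sketch_reserve(&t, 0x2000, 0x3000) should leave three entries in ascending order: RAM [0x1000, 0x2000), reserved [0x2000, 0x3000), RAM [0x3000, 0x4000). That mirrors the "New table" diagram in the third comment of the patch. Checking for free slots before the split keeps the table unchanged when the request cannot be satisfied.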