Date: Thu, 9 Feb 2023 12:08:49 +0000
From: David Chisnall
To: freebsd-hackers
Subject: CFT: snmalloc as libc malloc
Message-ID: <2f3dcda0-5135-290a-2dff-683b2e9fe271@FreeBSD.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

Hi,

For the past few
years I've been running locally with snmalloc as the malloc in libc. Eventually I'd like to propose this for upstreaming, but it needs some wider testing first.

For those unfamiliar with snmalloc (https://github.com/microsoft/snmalloc), it is an allocator (or, rather, a toolkit for building allocators) from my team at Microsoft Research, designed for both performance and security. A few highlights:

 - Snmalloc uses a message-passing design, which makes allocating on one thread and freeing on another cheap.
 - Very fast allocation performance.
 - Randomisation of relative locations of allocations.
 - Most metadata is stored out-of-band.
 - In-band metadata uses some lightweight encryption to protect against corruption.
 - Support for CHERI.

In the (limited!) testing that I've done, it outperforms jemalloc and results in a smaller libc binary.

I've also previously managed to use it in the kernel, though that code hasn't been tested in a while (last used with FreeBSD 11):

https://github.com/microsoft/snmalloc/blob/main/src/snmalloc/pal/pal_freebsd_kernel.h

It is also used in the Verona process sandboxing work, which makes it easy to isolate a library in a Capsicum sandbox:

https://github.com/microsoft/verona/tree/master/experiments/process_sandbox

We test on FreeBSD in CI upstream and the code is actively maintained. We have implemented compatibility wrappers for all of the jemalloc non-standard APIs that FreeBSD's libc exposes.

In particular, snmalloc is designed to make it very cheap to find the start and end of an allocation, given a heap pointer. This means that we can insert bounds checks in critical libc functions to prevent heap overflow. This is done in the branch for memcpy, which some investigation of a corpus of security vulnerabilities showed was the root cause of about 10% of arbitrary-code-execution vulnerabilities.

The bounds checks are controlled via an environment variable, LIBC_BOUNDS_CHECKS.
Setting this to 0 disables the checks, 1 checks destination arguments only, and 2 checks both sources and destinations. An ifunc resolver selects the correct memcpy implementation at load time. I did have a version that checked a bunch of other libc functions (e.g. sprintf, puts) but it was quite hacky (and the way the ifunc resolver was implemented broke tcl).

The current branch puts two things behind the MALLOC_PRODUCTION toggle:

 - The additional security checks that detect corruption of malloc state.
 - Pretty-printing errors.

We are currently separating the former into separate knobs upstream; some subset should probably be turned on by default in production. The latter has less of a performance impact than it used to and will probably be on for all configurations at some point, once we've refactored slightly to ensure the compiler can tail call the failure function (which moves it entirely off the fast path). With pretty-printing enabled, you get errors that look like this:

Fatal Error!
memcpy with source out of bounds of heap allocation:
    range [0x14823c02440, 0x14823c0246a)
    allocation [0x14823c02440, 0x14823c02450)
    range goes beyond allocation by 0x1a bytes
Abort trap (core dumped)

Without it, you just get an illegal instruction trap.

There are a few limitations in the current branch:

 - The memcpy integration is broken on non-amd64 platforms (patches welcome from people who can test these!).
 - Only memcpy (not, for example, memmove) has bounds checks.
 - The memcpy in rtld is naive, which may impact performance.
 - MALLOC_PRODUCTION conflates too many things.

The branch is here:

https://github.com/davidchisnall/freebsd-src/tree/snmalloc2

It adds snmalloc as a submodule in contrib. FreeBSD is allergic to submodules, so upstreaming will need to replace this with something more complicated.

You should be able to cherry-pick the top commit on any vaguely-recent -CURRENT.
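
To make the bounds-check behaviour described above a bit more concrete, here is a minimal sketch in C. It is not the code in the branch, and every name in it (alloc_bounds, bounds_of, bytes_past_end, parse_check_level, checked_copy) is hypothetical. It makes several simplifying assumptions: the allocation bounds are passed in by hand rather than looked up from the pointer by the allocator, the check level is a plain argument rather than something an ifunc resolver acts on at load time, and a failed check returns -1 instead of aborting. The level to use when LIBC_BOUNDS_CHECKS is unset is left as a parameter, because the default is not stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one heap allocation: [start, end). */
struct alloc_bounds {
	uintptr_t start;
	uintptr_t end;
};

/* Stand-in for the allocator's "find start and end" lookup. */
static struct alloc_bounds
bounds_of(const void *p, size_t size)
{
	struct alloc_bounds b = { (uintptr_t)p, (uintptr_t)p + size };

	return (b);
}

/*
 * Return how many bytes the range [addr, addr + len) extends past the end
 * of the allocation, or 0 if it fits.  addr must lie inside the allocation.
 */
static size_t
bytes_past_end(struct alloc_bounds b, uintptr_t addr, size_t len)
{
	size_t room;

	assert(addr >= b.start && addr <= b.end);
	room = (size_t)(b.end - addr);
	return (len > room ? len - room : 0);
}

/*
 * Map a LIBC_BOUNDS_CHECKS value to a level: 0 = no checks,
 * 1 = destination only, 2 = source and destination.  The fallback is
 * used when the value is missing or unrecognised (an assumption; the
 * real default is not given in this mail).
 */
static int
parse_check_level(const char *value, int fallback)
{
	if (value != NULL && value[0] >= '0' && value[0] <= '2' &&
	    value[1] == '\0')
		return (value[0] - '0');
	return (fallback);
}

/*
 * Copy len bytes if the checks enabled by level pass.  Returns 0 on
 * success and -1 if a check fails (the real implementation aborts).
 */
static int
checked_copy(int level, void *dst, struct alloc_bounds dst_b,
    const void *src, struct alloc_bounds src_b, size_t len)
{
	if (level >= 1 && bytes_past_end(dst_b, (uintptr_t)dst, len) != 0)
		return (-1);
	if (level >= 2 && bytes_past_end(src_b, (uintptr_t)src, len) != 0)
		return (-1);
	memcpy(dst, src, len);
	return (0);
}
```

A caller could pick the level with something like parse_check_level(getenv("LIBC_BOUNDS_CHECKS"), 0), where the 0 is an arbitrary choice for this sketch. Feeding bytes_past_end the numbers from the error message above (a 0x2a-byte range starting at the base of a 0x10-byte allocation) gives 0x1a, matching the reported overrun.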
You should also be able to build the libc from this branch against the version that you're running and try it with LD_LIBRARY_PATH.

I'd love to hear feedback on:

 - Performance, especially workloads where snmalloc does badly.
 - RSS usage (again, especially workloads where snmalloc does badly).
 - Anything that breaks.

David