From: Mateusz Guzik
Date: Thu, 9 Feb 2023 20:15:24 +0100
Subject: Re: CFT: snmalloc as libc malloc
To: David Chisnall
Cc: freebsd-hackers
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

On 2/9/23, David Chisnall wrote:
> Hi,
>
> For the past few years I've been running locally with snmalloc as the
> malloc in libc.  Eventually I'd like to propose this for upstreaming but
> it needs some wider testing first.
>
> For those unfamiliar with snmalloc
> (https://github.com/microsoft/snmalloc), it is an allocator (or, rather,
> a toolkit for building allocators) from my team at Microsoft Research
> designed for both performance and security.  A few highlights:
>
>   - Snmalloc uses a message-passing design, which makes allocating on
>     one thread and freeing on another cheap.
>   - Very fast allocation performance
>   - Randomisation of relative locations of allocations
>   - Most metadata is stored out-of-band
>   - In-band metadata uses some lightweight encryption to protect against
>     corruption.
>   - Support for CHERI.
>
> In the (limited!) testing that I've done, it outperforms jemalloc and
> results in a smaller libc binary.
>
> I've also previously managed to use it in the kernel, though that code
> hasn't been tested in a while (last used with FreeBSD 11):
>
> https://github.com/microsoft/snmalloc/blob/main/src/snmalloc/pal/pal_freebsd_kernel.h
>
> It is also used in the Verona process sandboxing work, which makes it
> easy to isolate a library in a Capsicum sandbox:
>
> https://github.com/microsoft/verona/tree/master/experiments/process_sandbox
>
> We test on FreeBSD in CI upstream and the code is actively maintained.
> We have implemented compatibility wrappers for all of the jemalloc
> non-standard APIs that FreeBSD's libc exposes.
>
> In particular, snmalloc is designed to make it very cheap to find the
> start and end of an allocation, given a heap pointer.  This means that
> we can insert bounds checks in critical libc functions to prevent heap
> overflow.  This is done in the branch for memcpy, which some
> investigation of a corpus of security vulnerabilities showed was the
> root cause of about 10% of arbitrary-code-execution vulnerabilities.
>
> The bounds checks are controlled via an environment variable
> LIBC_BOUNDS_CHECKS.  Setting this to 0 disables checks, to 1 checks on
> destination arguments, and to 2 checks sources and destinations.
> An ifunc resolver selects the correct memcpy implementation at load time.
>
> I did have a version that checked a bunch of other libc functions (e.g.
> sprintf, puts) but it was quite hacky (and the way the ifunc resolver
> was implemented broke tcl).
>
> The current branch puts two things behind the MALLOC_PRODUCTION toggle:
>
>   - The additional security checks that detect corruption of malloc state.
>   - Pretty-printing errors.
>
> We are currently separating the former into separate knobs upstream;
> some subset should probably be turned on by default in production.  The
> latter has less of a performance impact than it had and will probably be
> on for all configurations at some point once we've refactored slightly
> to ensure the compiler can tail call the failure function (which moves
> it entirely off the fast path).  With this enabled, you get errors that
> look like this:
>
>     Fatal Error!
>     memcpy with source out of bounds of heap allocation:
>              range [0x14823c02440, 0x14823c0246a)
>              allocation [0x14823c02440, 0x14823c02450)
>              range goes beyond allocation by 0x1a bytes
>
>     Abort trap (core dumped)
>
> Without it, you just get an illegal instruction trap.
>
> There are a few limitations in the current branch:
>
>   - The memcpy integration is broken on non-amd64 platforms (patches
>     welcome from people who can test these!).
>   - Only memcpy (not, for example, memmove) has bounds checks.
>   - The memcpy in rtld is naive, which may impact performance.
>   - MALLOC_PRODUCTION conflates too many things.
>
> The branch is here:
>
> https://github.com/davidchisnall/freebsd-src/tree/snmalloc2
>
> It adds snmalloc as a submodule in contrib.  FreeBSD is allergic to
> submodules, so upstreaming will need to replace this with something more
> complicated.  You should be able to cherry-pick the top commit on any
> vaguely-recent -CURRENT.
>
> You should also be able to build the libc from this branch against the
> version that you're running and try it with LD_LIBRARY_PATH.
>
> I'd love to hear feedback on:
>
>   - Performance, especially workloads where snmalloc does badly.
>   - RSS usage (again, especially workloads where snmalloc does badly).
>   - Anything that breaks.
>

It fails to build for me:

/usr/src/lib/libc/stdlib/snmalloc/malloc.cc:35:10: fatal error: 'override/jemalloc_compat.cc' file not found
#include "override/jemalloc_compat.cc"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
--- malloc.o ---
*** [malloc.o] Error code 1

make[4]: stopped in /usr/src/lib/libc
/usr/src/lib/libc/stdlib/snmalloc/memcpy.cc:25:10: fatal error: 'global/memcpy.h' file not found
#include <global/memcpy.h>
         ^~~~~~~~~~~~~~~~~
1 error generated.
--- memcpy.o ---
*** [memcpy.o] Error code 1

This is a fresh world, top of the snmalloc2 branch:

commit a5c83c69817d03943b8be982dd815c7e263d1a83
Author: David Chisnall
Date:   Fri Jan 21 15:13:09 2022 +0000

    Initial commit of snmalloc2 in libc.

Anyway, I wanted to say I find the memcpy thing incredibly suspicious.  I
found one article,
https://github.com/microsoft/snmalloc/blob/main/docs/security/GuardedMemcpy.md,
which benches it, and that made it even more suspicious.  What did the
benched memcpy look like inside?

-- 
Mateusz Guzik
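
For reference, the three LIBC_BOUNDS_CHECKS levels described in the quoted
message can be sketched roughly as follows.  This is only an illustration
under stated assumptions, not the code in the snmalloc2 branch: toy_alloc,
toy_remaining and guarded_memcpy are hypothetical names, the bounds lookup
is a linear scan over a small table rather than snmalloc's metadata query,
the level is passed as an argument instead of being chosen by an ifunc
resolver, and a failed check returns -1 where the real code traps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "find the start and end of an allocation,
 * given a heap pointer".  A real allocator answers this from its own
 * metadata; this toy just records every allocation in a table. */
#define MAX_TRACKED 16
static struct { char *base; size_t size; } tracked[MAX_TRACKED];
static size_t ntracked;

static void *toy_alloc(size_t size)
{
    assert(ntracked < MAX_TRACKED);
    char *p = malloc(size);
    if (p != NULL) {
        tracked[ntracked].base = p;
        tracked[ntracked].size = size;
        ntracked++;
    }
    return p;
}

/* Bytes from p to the end of its allocation.  Returns SIZE_MAX when p is
 * not inside a tracked allocation (stack, globals), so such pointers are
 * never rejected. */
static size_t toy_remaining(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    for (size_t i = 0; i < ntracked; i++) {
        uintptr_t b = (uintptr_t)tracked[i].base;
        if (a >= b && a - b < tracked[i].size)
            return tracked[i].size - (a - b);
    }
    return SIZE_MAX;
}

/* level follows the values described in the quoted message:
 * 0 = no checks, 1 = check destination, 2 = check source and destination.
 * Returns 0 after copying, -1 if a check fails (assumption: the real
 * implementation aborts instead of returning). */
static int guarded_memcpy(int level, void *dst, const void *src, size_t n)
{
    if (level >= 1 && n > toy_remaining(dst))
        return -1;
    if (level >= 2 && n > toy_remaining(src))
        return -1;
    memcpy(dst, src, n);
    return 0;
}
```

With a 16-byte source allocation, a 42-byte copy at level 2 is rejected,
which matches the shape of the quoted error (0x2a bytes requested, 0x10
available, 0x1a beyond the allocation).  The sketch says nothing about the
cost of the lookup, which is the part the benchmark question is about.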