Re: CFT: snmalloc as libc malloc

From: Mateusz Guzik <mjguzik_at_gmail.com>
Date: Thu, 09 Feb 2023 19:15:24 UTC
On 2/9/23, David Chisnall <theraven@freebsd.org> wrote:
> Hi,
>
> For the past few years I've been running locally with snmalloc as the malloc
> in libc.  Eventually I'd like to propose this for upstreaming but it
> needs some wider testing first.
>
> For those unfamiliar with snmalloc
> (https://github.com/microsoft/snmalloc), it is an allocator (or, rather,
> a toolkit for building allocators) from my team at Microsoft Research
> designed for both performance and security.  A few highlights:
>
>   - Snmalloc uses a message-passing design, which makes allocating on
> one thread and freeing on another cheap.
>   - Very fast allocation performance
>   - Randomisation of relative locations of allocations
>   - Most metadata is stored out-of-band
>   - In-band metadata uses some lightweight encryption to protect against
> corruption.
>   - Support for CHERI.
>
> In the (limited!) testing that I've done, it outperforms jemalloc and
> results in a smaller libc binary.
>
> I've also previously managed to use it in the kernel, though that code
> hasn't been tested in a while (last used with FreeBSD 11):
>
> https://github.com/microsoft/snmalloc/blob/main/src/snmalloc/pal/pal_freebsd_kernel.h
>
> It is also used in the Verona process sandboxing work, which makes it
> easy to isolate a library in a Capsicum sandbox:
>
> https://github.com/microsoft/verona/tree/master/experiments/process_sandbox
>
> We test on FreeBSD in CI upstream and the code is actively maintained.
> We have implemented compatibility wrappers for all of the jemalloc
> non-standard APIs that FreeBSD's libc exposes.
>
> In particular, snmalloc is designed to make it very cheap to find the
> start and end of an allocation, given a heap pointer.  This means that
> we can insert bounds checks in critical libc functions to prevent heap
> overflow.  This is done in the branch for memcpy, which some
> investigation of a corpus of security vulnerabilities showed was the
> root cause of about 10% of arbitrary-code-execution vulnerabilities.
>
> The bounds checks are controlled via an environment variable
> LIBC_BOUNDS_CHECKS.  Setting this to 0 disables checks, to 1 checks on
> destination arguments, and to 2 checks sources and destinations.  An
> ifunc resolver selects the correct memcpy implementation at load time.
>
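
[A minimal sketch of how the three LIBC_BOUNDS_CHECKS values described
above could map onto checking modes. The enum and function names are
hypothetical and not taken from the branch, and falling back to "no
checks" for an unset or unrecognised value is an assumption, since the
mail does not state the default. It shows only the value parsing, not
the ifunc machinery.]

```c
#include <stdlib.h>

/* Hypothetical names, for illustration only. */
enum bounds_mode {
    CHECK_NONE = 0,        /* "0": checks disabled */
    CHECK_DST = 1,         /* "1": check destination arguments */
    CHECK_DST_AND_SRC = 2  /* "2": check sources and destinations */
};

/*
 * Map the textual value of LIBC_BOUNDS_CHECKS to a mode.
 * Assumption: unset, empty, or malformed values mean CHECK_NONE.
 */
static enum bounds_mode parse_bounds_mode(const char *value)
{
    char *end;
    long v;

    if (value == NULL || value[0] == '\0')
        return CHECK_NONE;
    v = strtol(value, &end, 10);
    if (*end != '\0' || v < 0 || v > 2)
        return CHECK_NONE;
    return (enum bounds_mode)v;
}
```

[A caller would pass it the result of getenv("LIBC_BOUNDS_CHECKS") once
and pick a memcpy variant from the returned mode.]
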
> I did have a version that checked a bunch of other libc functions (e.g.
> sprintf, puts) but it was quite hacky (and the way the ifunc resolver
> was implemented broke tcl).
>
> The current branch puts two things behind the MALLOC_PRODUCTION toggle:
>
>   - The additional security checks that detect corruption of malloc state.
>   - Pretty-printing errors.
>
> We are currently separating the former into separate knobs upstream;
> some subset should probably be turned on by default in production.  The
> latter has less of a performance impact than it had and will probably be
> on for all configurations at some point once we've refactored slightly
> to ensure the compiler can tail call the failure function (which moves
> it entirely off the fast path).  With this enabled, you get errors that
> look like this:
>
> Fatal Error!
> memcpy with source out of bounds of heap allocation:
>          range [0x14823c02440, 0x14823c0246a)
>          allocation [0x14823c02440, 0x14823c02450)
> range goes beyond allocation by 0x1a bytes
>
> Abort trap (core dumped)
>
> Without it, you just get an illegal instruction trap.
>
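
[The arithmetic behind the sample error above can be sketched as
follows. This is an illustration of the half-open interval check only,
not snmalloc's code; the function name and signature are hypothetical,
and it assumes the access starts inside the allocation.]

```c
#include <stdint.h>

/*
 * Return how many bytes the access [addr, addr + len) extends past the
 * end of the allocation [alloc_start, alloc_end), or 0 if it fits.
 * Assumption: addr lies within the allocation, as in the sample error.
 */
static uint64_t overrun_bytes(uint64_t addr, uint64_t len,
                              uint64_t alloc_start, uint64_t alloc_end)
{
    uint64_t access_end = addr + len;

    (void)alloc_start; /* unused here; kept to mirror the reported range */
    if (access_end <= alloc_end)
        return 0;
    return access_end - alloc_end;
}
```

[With the numbers from the message, a 0x2a-byte copy starting at
0x14823c02440 ends at 0x14823c0246a, while the 16-byte allocation ends
at 0x14823c02450, so the overrun is 0x1a bytes.]
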
> There are a few limitations in the current branch:
>
>   - The memcpy integration is broken on non-amd64 platforms (patches
> welcome from people who can test these!).
>   - Only memcpy (not, for example, memmove) has bounds checks.
>   - The memcpy in rtld is naive, which may impact performance.
>   - MALLOC_PRODUCTION conflates too many things
>
> The branch is here:
>
> https://github.com/davidchisnall/freebsd-src/tree/snmalloc2
>
> It adds snmalloc as a submodule in contrib.  FreeBSD is allergic to
> submodules, so upstreaming will need to replace this with something more
> complicated.  You should be able to cherry-pick the top commit on any
> vaguely-recent -CURRENT.
>
> You should also be able to build the libc from this branch against the
> version that you're running and try it with LD_LIBRARY_PATH.
>
> I'd love to hear feedback on:
>
>   - Performance, especially workloads where snmalloc does badly.
>   - RSS usage (again, especially workloads where snmalloc does badly).
>   - Anything that breaks.
>

it fails to build for me:

/usr/src/lib/libc/stdlib/snmalloc/malloc.cc:35:10: fatal error:
'override/jemalloc_compat.cc' file not found
#include "override/jemalloc_compat.cc"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
--- malloc.o ---
*** [malloc.o] Error code 1

make[4]: stopped in /usr/src/lib/libc
/usr/src/lib/libc/stdlib/snmalloc/memcpy.cc:25:10: fatal error:
'global/memcpy.h' file not found
#include <global/memcpy.h>
         ^~~~~~~~~~~~~~~~~
1 error generated.
--- memcpy.o ---
*** [memcpy.o] Error code 1


this is a fresh world, top of snmalloc2 branch:
commit a5c83c69817d03943b8be982dd815c7e263d1a83
Author: David Chisnall <theraven@FreeBSD.org>
Date:   Fri Jan 21 15:13:09 2022 +0000

    Initial commit of snmalloc2 in libc.

anyway, I wanted to say I find the memcpy thing incredibly suspicious.
I found one article in
https://github.com/microsoft/snmalloc/blob/main/docs/security/GuardedMemcpy.md
which benches it and that made it even more suspicious. What did the
benched memcpy look like inside?

-- 
Mateusz Guzik <mjguzik gmail.com>