Message-ID: <8df2d609-0a0d-0a25-4918-a27c91c3790a@gmail.com>
Date: Wed, 30 Mar 2022 13:52:09 -0400
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
To: Baptiste Daroussin
Cc: freebsd-hackers@FreeBSD.org
References: <25b5c60f-b9cc-78af-86d7-1cc714232364@gmail.com> <20220330072210.icontj4n7hcqwtql@aniel.nours.eu>
From: Mathieu
Subject: Re: curtain: WIP sandboxing mechanism with pledge()/unveil() support
In-Reply-To: <20220330072210.icontj4n7hcqwtql@aniel.nours.eu>

On 3/30/22 03:22, Baptiste Daroussin wrote:
> Hello Mathieu,
> First of all, thank you for this amazing work, leveraging the mac framework to
> build curtain is imho an excellent idea, I personally see a curtain like
> approach as complementary to a capsicum approach rather than an antagonist
> feature, I can see many possible usage of curtains in freebsd in particular in
> the port framework!

Alright!

One nice thing with the MAC approach is that many of the checks were
already carefully placed so as not to deviate too much from the behavior
that applications expect.  At first I tried to do all of the checks in
namei(), and this often made syscalls fail too early with the wrong
errnos, which confuses some programs.  When I figured out the right
place to add the checks, it was almost always right next to a MAC check.

Also, if I hadn't used a MAC module and had added a new layer instead,
you could say that it would have been the fifth (!!!) general access
control mechanism in the kernel (after traditional UNIX, Capsicum, jails
and MAC).
This was starting to feel like a bit much.  So the MAC framework is not
a perfect fit for some of what this module wants to do, but it could be
worse.

Also, if you combine curtains with chroots you kind of get jails.  It's
chroot escape-proof because unveils deny access to the outer directories
like any other.  Privileges are restricted.  It doesn't have all the
features of jails but it's very jail-like.  Just to say, the MAC
framework was pretty close to being able to implement jails too.

> To allow to integrate and permit reviews from developers, I think we can/should
> split the review. The first thing will probably be imho to start a review
> process of the sysfilt feature, this is probably the part that will need most of
> the back and forth discussion given the rest is pretty isolated (mac module,
> userland).

Yeah, I think it's a good idea.  I could do some minor cleanups and add
some comments while at it before submitting it.

I'm still working on this, but the majority of the changes should be in
the userland and the module itself now.  And getting some feedback on
the kernel parts (and eventually knowing that they're in an acceptable
state) would help.  The module works well enough already that I don't
think that there's anything critical that would require extra changes to
the rest of the kernel (unless some problem is found).

I think the kernel changes could be broken up something like this:

1) Sysfils.  Adds annotations to every syscall in syscalls.master
(Linuxulator could be done later too).  Generalizes the Capsicum flag to
a sysfils bitmap both for sysents and ucreds.  Changes to the syscall
entry checks.  Some sysfil checks spread out in the kernel.

2) There are some small modifications that I believe are bug fixes or
minor improvements, or that basically have no effect in the current
state of things (but are important for mac_curtain).

3) The new MAC handlers.  That's a significant part too and it can be
separate from sysfils.
This would include the new call sites in vfs_lookup() (and there's some
extra logic too).  New call sites in SysV/POSIX IPC modules (and some
other changes to path handling).  There's a function that deals with
getdirentries() filtering directly in the MAC framework (the MAC hooks
only deal with the dirents one by one) that maybe doesn't belong there.
vn_open() got a new flag.

4) Various modifications that are specifically to support mac_curtain.
I tried to minimize those but there are some.  struct thread/proc need
new fields (opaque pointers for unveil tracking/caching).  struct file
needs a few extra integer fields (I really tried to get rid of those but
couldn't figure out a good way).  I also added "sysctl_shadow" objects
to kern_sysctl.c.  These are handles that can outlast sysctl_oids.  It's
not elegant, but mac_curtain needs it.  At first I thought that I could
add long-lived references to sysctl_oids and I designed the curtain data
structure for that.  Then I realized that sysctl_oids get totally nuked
when a module unloads.  They get unmapped from memory with the rest of
the module's data and there's no holding on to them (IIUC).  So I added
an intermediate data structure between sysctl_oids and curtains.

The mallocs with non-sleepable locks in the SysV IPC module with late
label initializations are annoying, but this could be fixed separately
(it might need a bit of code restructuring...).  Making the module
fplookup-friendly would probably need some changes to the MAC framework
itself; this can be done separately too.

There are others, but they're not critical and can be done separately.

> Can you isolate the sysfils code and start a review in phabricator? If you need
> help for this don't hesitate to ask me ;)

Yup, I'm on it.

> Again thanks for the huge work.

No problem!  This is gonna be a lot of work to review so thanks to
anyone involved too.

>
> Best regards,
> Bapt
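
For illustration only, here is a minimal userland sketch of the idea in
item 1 of the breakdown above: a single Capsicum-style flag generalized
to a bitmap that is carried both by each syscall table entry and by the
credential, and compared at syscall entry.  None of this is taken from
the actual patch.  The type names (sketch_sysent, sketch_ucred), the
SYSFIL_* indices, the "all required bits must be allowed" rule and the
EPERM return value are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical sysfil indices; the real patch defines its own set. */
enum {
	SYSFIL_DEFAULT = 0,	/* assumed to be always permitted */
	SYSFIL_CAPSICUM,	/* stands in for the old capability-mode flag */
	SYSFIL_VFS_READ,
	SYSFIL_NET,
	SYSFIL_COUNT
};

typedef uint64_t sysfilset_t;

/* Turn a sysfil index into its bit in the set. */
static sysfilset_t
sysfil_bit(unsigned int idx)
{
	assert(idx < SYSFIL_COUNT);
	return ((sysfilset_t)1 << idx);
}

/* Stand-in for a syscall table entry annotated from syscalls.master. */
struct sketch_sysent {
	const char *name;
	sysfilset_t required;	/* sysfils the caller must hold */
};

/* Stand-in for the credential that carries the allowed set. */
struct sketch_ucred {
	sysfilset_t allowed;
};

/*
 * Syscall entry check: returns 0 when every required sysfil is allowed
 * by the credential, otherwise an errno (EPERM is an assumption here).
 */
static int
sysfil_check(const struct sketch_ucred *cr, const struct sketch_sysent *se)
{
	if ((se->required & ~cr->allowed) == 0)
		return (0);
	return (EPERM);
}

/*
 * Restriction only ever shrinks the allowed set: bits that were already
 * dropped cannot be regained.  SYSFIL_DEFAULT is kept only if it was
 * still present before the call.
 */
static void
sysfil_restrict(struct sketch_ucred *cr, sysfilset_t keep)
{
	cr->allowed &= keep | sysfil_bit(SYSFIL_DEFAULT);
}
```

In this sketch a process that starts with every bit set and then calls
sysfil_restrict() with only SYSFIL_VFS_READ can still pass the check for
an entry that requires SYSFIL_VFS_READ, but fails the check for one that
requires SYSFIL_NET, and a later attempt to "restrict" to SYSFIL_NET
does not bring that bit back.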