Re: Capsicum revocable (proxy) file descriptors

From: Vinícius_dos_Santos_Oliveira <vini.ipsmaker_at_gmail.com>
Date: Wed, 08 Oct 2025 10:45:18 UTC
On Tue, Oct 7, 2025 at 14:16, David Chisnall
<theraven@freebsd.org> wrote:
> A file descriptor that is a proxy to another file descriptor with an ioctl that deletes the proxied thing seems moderately easy.  Ideally it would at least have another ioctl that is an is-revocable thing.

I don't see how that's a problem.

Anyway, I think it'd be better to have two file descriptors (as with
ptys and similar patterns). One file descriptor is the revoker. The
other is the proxy that you pass around.
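
To make the split concrete, here is a rough userspace approximation
in Python. It is only a sketch under assumptions: RevocableWriter,
proxy_fd and revoke() are hypothetical names, and a pipe plus a
forwarding thread stands in for what would really be a kernel object.
It covers write() only, whereas a real proxy would need to cover every
operation on the descriptor.

```python
import os
import select
import threading


class RevocableWriter:
    """Hypothetical userspace stand-in for a revoker/proxy pair.

    `proxy_fd` is the descriptor you hand to the untrusted party.
    The object itself plays the role of the revoker.
    """

    def __init__(self, target_fd: int) -> None:
        self._target_fd = target_fd
        # Data written to proxy_fd is forwarded to target_fd.
        self._data_r, self.proxy_fd = os.pipe()
        # Control pipe used only to wake the forwarder on revocation.
        self._ctl_r, self._ctl_w = os.pipe()
        self._revoked = False
        self._thread = threading.Thread(target=self._forward, daemon=True)
        self._thread.start()

    def _forward(self) -> None:
        try:
            while True:
                ready, _, _ = select.select([self._data_r, self._ctl_r], [], [])
                if self._ctl_r in ready:
                    break  # revoked: stop forwarding immediately
                chunk = os.read(self._data_r, 65536)
                if not chunk:
                    break  # every copy of the proxy end was closed
                while chunk:  # os.write may write only part of the data
                    chunk = chunk[os.write(self._target_fd, chunk):]
        finally:
            # With the read end gone, writes to proxy_fd fail with EPIPE.
            os.close(self._data_r)

    def revoke(self) -> None:
        if self._revoked:
            return
        self._revoked = True
        os.close(self._ctl_w)  # EOF on the control pipe wakes the forwarder
        self._thread.join()
        os.close(self._ctl_r)


if __name__ == "__main__":
    sink_r, sink_w = os.pipe()          # stands in for the real output file
    grant = RevocableWriter(sink_w)
    os.write(grant.proxy_fd, b"download")
    print(os.read(sink_r, 100))
    grant.revoke()
    try:
        os.write(grant.proxy_fd, b"more")
    except BrokenPipeError:
        print("write after revoke failed with EPIPE")
```

Because dup'd copies of proxy_fd all share the same pipe, revoking
once cuts off every copy, which is the property I'm after.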

> But even then, being able to do things like execve a setuid binary and then close its stdout *after* it has checked that it’s checked that stdout exists makes me nervous.

Closing the associated read end of a pipe will cause the write end to
return EPIPE on writes. So "eternal availability" is already not
guaranteed. A proxy file descriptor would work in a similar fashion.
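
The existing pipe behaviour is easy to check. A minimal Python sketch
(the function name is mine, purely for illustration): CPython ignores
SIGPIPE at startup, so the write fails with EPIPE instead of killing
the process.

```python
import errno
import os


def write_after_reader_closed() -> int:
    """Return the errno seen when writing to a pipe with no reader left."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the peer goes away, much like a revocation
    try:
        os.write(write_fd, b"hello")
    except OSError as exc:  # BrokenPipeError is the EPIPE subclass
        return exc.errno
    finally:
        os.close(write_fd)
    return 0


if __name__ == "__main__":
    print(errno.errorcode[write_after_reader_closed()])
```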

>  I’m not sure if it’s useful to an attacker, but it’s a change in existing behaviour and so would need some analysis.  I’d want some experienced exploit developers to weigh in with ‘I can’t think of a way that I’d use that primitive’.

Fair enough.

> There’s also a bit the question of: what security problem would it solve?

I'd like to use it in Flatpak portals, for instance. If you're running
Firefox in a container, and Firefox asks for a file to write to (e.g.
when it's downloading something), I'd like to revoke the write access
once the operation is done (the output file lies outside Firefox's
container rootfs, and Firefox shouldn't have unlimited access there).
A friend of mine had a similar problem, and he solved it by syncing
files.

For new applications (built around the capabilities model), this way
to revoke access is the standard pattern, and I can see myself using
it more often.

> I think it would be more useful to have a protocol where the untrusted party agrees to close a file descriptor and then a mechanism for validating that they have.  This has the advantage of working in both asymmetric and mutual distrust domains.

If the other party is cooperating, it can already cope with EPIPE or
similar errors that arise from the proxy being revoked. I'd argue that
such a protocol would be less useful: it's less general, and it solves
fewer problems.

> The mechanism for that is mandatory locking: you agree to close the file descriptor, I try to acquire an exclusive read-write lock.  If it succeeds, you don’t have an open fd to the resource, if it fails, you (or someone else) do.  Similarly, if I acquire a read lock, I know that you’ve closed the write file descriptor.  Currently, unless it was added recently while I wasn’t looking, FreeBSD has only advisory locks and so you can’t do this.  And that’s a shame.
>
> Mandatory locking is a generally useful mechanism, rather than something special cased for a particular Capsicum use case.
>
> Note: On CHERIoT RTOS, we lean into capability models far more than FreeBSD and we don’t have a mechanism for revoking arbitrary capabilities in the presence of a peer that wants to retain them.  We do have a notion of lexically-scoped delegation, where you can hand a peer a capability for the duration of a call.  If FreeBSD had something like io_uring (which has a per-ring namespace for file descriptors as well as the process-global one) then you can imagine doing something similar on top of some underlying IPC mechanism used for RPC, where a UNIX domain socket sends you one or more FDs that enter reserved slots in that ring’s namespace that are available until you send the response, at which point they’re implicitly closed (and cannot be dup’d in the middle).

I thought about the "private file descriptor on an io ring" design for
a day. Frankly, that's not going to be an easy solution to sell. It's
basically incompatible with anything that exists today, and I don't
see which problems it solves if it goes in this direction.

Think about FUSE. You can already create proxies in a way. You're just
syncing stuff manually.