From: David Chisnall
Date: Mon, 22 Jan 2024 10:13:30 +0000
Subject: Re: The Case for Rust (in the base system)
Cc: George Mitchell, freebsd-hackers@freebsd.org
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
To: Alan Somers

On 21 Jan 2024, at 16:04, Alan Somers wrote:
>
> Perhaps it will.  But like David Chisnall, I'm afraid that if FreeBSD never
> modernizes, then it itself will go out of fashion by the 2040s.

Apparently I’m participating in this thread already. I’m getting over a nasty cold and my head is full of cotton wool, so apologies in advance if this is more rambling than normal:

I hope it’s no surprise to anyone that I am in favour of languages that give stronger guarantees to programmers and let you think more about the problems. I can’t imagine going back to writing anything non-trivial in a language without RAII or a rich set of generic collections.

To give a bit of personal background: In my previous role, I was one of the coauthors of the internal strategy document that argued for safe languages at Microsoft. Our rough recommendation was:

 - No new C code. There are *always* better options.
 - C++ code should follow the Core Guidelines and use static analysis. New C++ code is acceptable in projects that are already C/C++ and need to incrementally improve.
 - Rust in new projects that need a systems programming language.
 - Managed languages anywhere a systems language is not needed (i.e. most places).

Between modern C++ with static analysers and Rust, there was a small safety delta. The recommendation was primarily based on a human-factors decision: it’s far easier to prevent people from committing code that doesn’t compile than it is to prevent them from committing code that raises static analysis warnings. If a project isn’t doing pre-merge static analysis, it’s basically impossible. Between using modern C++ (even just smart pointers and ranges) and C, there is an enormous safety delta.

The unstable Rust ecosystem was less of an issue for Microsoft because they had a large compiler team and were happy to maintain security back-ports of any critical crates. The same software supply-chain rules applied to Rust as to everything else: no random pulling from Cargo, dependencies need to be cloned internally and run through a load of compliance checks. That’s probably the only sensible way of interacting with the Rust ecosystem.

For userspace, I’d love to see FreeBSD more actively support the cap-std project in Rust, which makes it incredibly easy to write Rust programs that play nicely with Capsicum.

It’s unclear to me that now is the right time to support Rust in the base system, because there’s still a lot of churn. Facebook has effectively forked Rust because their (huge) Rust codebase doesn’t build with newer compilers. If you’re Microsoft or Facebook, maintaining an old Rust compiler for a few years and back-porting things to work with that language snapshot is a cost that may be worth paying. I don’t think the FreeBSD project has the resources to do so. A limited set of dependencies may work.

There are a few caveats about Rust:

First, it’s quite hard to find competent Rust developers. Here are the OpenHub stats on new F/OSS code being written in Rust, C, and C++:

https://openhub.net/languages/compare?language_name%5B%5D=c&language_name%5B%5D=cpp&language_name%5B%5D=rust&language_name%5B%5D=-1&language_name%5B%5D=-1&measure=loc_changed

C++ has been slowly trending up, and C down, for the last decade. Rust is trending up a lot, but it’s starting from zero and there’s still a lot more C or C++ code being written than Rust. It’s now easier to hire systems programmers to write C++ than C, and easier to hire either than to hire good Rust programmers.
This tradeoff may be very different for an open source project because there are a lot of *very* enthusiastic Rust developers and attracting a dozen or two of them to contribute would be a huge win. People tend to be less enthusiastic about C or C++.

Most of the new kernels written in the last 20 years have been in C++; most of those written in the last four years have been in Rust. Make of that what you will.

Neither Rust nor C++ guarantees safety. C++ can always escape to bare pointers (it’s a code smell, but it’s sometimes unavoidable). Rust has unsafe and requires it for any data structure that isn’t a tree (either directly or via some existing code such as the Rc / Arc types). One of our concerns was the degree to which the different uses of unsafe in various Rust crates compose. There was a paper a couple of years ago that found a lot of vulnerabilities from this composition. I don’t personally have a great deal of faith that unique ownership at an object level, with a load of heuristics about when it’s safe to alias, is the right long-term model. Verona went a very different way and I hope Rust may be able to retrofit our ideas at some point.

One project that I worked with, for example, was bitten by the fact that unsafe in Rust means ‘I promise to follow all of the Rust rules, you just can’t mechanically check them’. It read a value from an MMIO register into a variable typed as an enumeration. Outside of the unsafe block, it then checked that the value was in range. Rust enumerations are type safe and so the compiler helpfully elided this check. Moving the check into the unsafe block fixed it, but ran counter to the generic ‘put as little in unsafe blocks as humanly possible’ advice that is given to Rust programmers.
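
One way to avoid that class of bug can be sketched as follows. This is not the fix the project above used (it moved the check into the unsafe block); it is an alternative pattern in which the register is read as a plain integer and only converted to the enumeration after validation, so no invalid enum value ever exists for the compiler to reason about. The names `DeviceState` and `read_state_register`, the enum values, and the use of an ordinary variable in place of a real MMIO address are all hypothetical and chosen only for illustration.

```rust
use std::convert::TryFrom;

/// Hypothetical device states; the names and values are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
enum DeviceState {
    Idle = 0,
    Busy = 1,
    Fault = 2,
}

impl TryFrom<u32> for DeviceState {
    /// On failure, hand back the raw value so the caller can report it.
    type Error = u32;

    fn try_from(raw: u32) -> Result<Self, Self::Error> {
        // The range check happens on the integer, before any enum value
        // exists, so the compiler has no grounds to elide it.
        match raw {
            0 => Ok(DeviceState::Idle),
            1 => Ok(DeviceState::Busy),
            2 => Ok(DeviceState::Fault),
            other => Err(other),
        }
    }
}

/// Reads the (hypothetical) state register as a plain integer.
///
/// # Safety
/// `reg` must be valid for reads and properly aligned.
unsafe fn read_state_register(reg: *const u32) -> u32 {
    // Only the volatile read itself needs to be unsafe.
    unsafe { core::ptr::read_volatile(reg) }
}

fn main() {
    // Stand-in for an MMIO register so the sketch runs on a hosted system.
    let fake_register: u32 = 7;
    let raw = unsafe { read_state_register(&fake_register) };

    match DeviceState::try_from(raw) {
        Ok(state) => println!("device state: {:?}", state),
        Err(bad) => println!("unexpected register value: {}", bad),
    }
}
```

The unsafe region stays as small as the usual advice asks for, and the validation lives in safe code because it operates on a `u32` rather than on an already-typed enumeration.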

When I looked at a couple of hobbyist kernels written in Rust, they had trivial security vulnerabilities due to not sanitising system call arguments. This was depressing because both Rust and C++ make it trivial to wrap userspace pointers in a smart pointer type that does the checks automatically.

In snmalloc, for example, we use C++ templates to express the lifecycle of memory throughout its allocation flow. This would also be possible in Rust, but isn’t free in either language: you have to use the tools provided, but the outcome is that we can statically check a lot of properties at compile time.

Wearing one of my other hats, I am the maintainer of an RTOS that is written in C++ and runs on a platform where the hardware enforces spatial and temporal memory safety. To date, I don’t believe we’ve had any bugs that would have been prevented by Rust. All of the memory-safety bugs (we have had some, and we catch them fairly easily because they lead to traps and so are easy to add tests for) have been in code that’s doing intrinsically unsafe things (memory allocators, for example). We use C++20, with moderately heavy use of concepts. We have a ring buffer implementation that uses a mixture of static_asserts and templates to verify the wrapping behaviour at compile time, and that’s just one example of a place where we do a lot of compile-time checks that are impossible in C.

I’d also like to clear up a few misunderstandings about C++:

 - The Itanium C++ ABI has been stable for 20+ years. Linking C++ shared libraries compiled with clang against those compiled with GCC (or vice versa), or with different versions of the same compiler, has been standard practice for a long time. Both libstdc++ and libc++ use inline namespaces for the standard-library types and so allow something like symbol versioning but exposed at the language level.
You can see ABI = breaks if one library uses a newer version of a type and the other an = older one, but that=E2=80=99s why we only bump those forward on major = releases: C++ DSOs compiled for FreeBSD 13 may not link with binaries = compiled for FreeBSD 14. - Command-line argument parsing and JSON are not part of the C++ = standard library, but there are de-facto standards. Nlohmann JSON[1] = and CLI11[2] are widely used (it=E2=80=99s been a long time since I=E2=80=99= ve seen a project that used anything else) and have very easy-to-use = interfaces. I believe (I am a member of the C++ standards committee, = but I only recently joined and have not participated in discussions = around this) that a big part of the reason it isn=E2=80=99t in the core = specification is that there is a de-facto standard and there=E2=80=99s = little urgency in adding it to the core. Finally, one of the key things that we found was that a lot of projects = used C/C++ out of inertia. They don=E2=80=99t have peak memory or = sub-millisecond-latency constraints and could easily be written in a = managed language, often even in an interpreted one. We have Lua in the = base system. I=E2=80=99d love to see a richer set of things exposed to = Lua. I played a bit with a kqueue wrapper using Sol2[3] that lets you = write Lua coroutines and have them implicitly yield on blocking = operations. =20 I=E2=80=99d love to see a generic process manager in the base system = that subsumes devd and inetd written in Lua, with C++ wrappers around = pdfork (ideally pdvfork, but it doesn=E2=80=99t exist yet) and friends, = exposed via sol2. The code in C++ is dealing directly with low-level = system interfaces and would not be safer in Rust, but all of the parsing = and control-plane logic can live in a safe GC=E2=80=99d language. You = can run a lot of Lua code in the time it takes one fork call to execute. 

If we exposed type info from dynamic sysctls generically (I think there’s a project working on this?) then things like sysstat could be written in Lua. I was experimenting with Dear ImGui for this, since it had back ends that rendered in X11, Wayland, in a terminal, or remotely over a websocket. Unfortunately, the latter two were never merged and are probably unmaintained (the author is also the person behind llama.cpp and so probably isn’t going to work on it for a while). Being able to run management tools in a terminal and click on a URL to open them in the web browser would be amazing, but doesn’t require a new systems programming language.

I’d love to see a default that anything intended to run with elevated privilege is written in Lua.

David

[1] https://github.com/nlohmann/json
[2] https://github.com/CLIUtils/CLI11
[3] https://sol2.readthedocs.io/