Date: Tue, 18 Jan 2022 00:51:52 +0200
From: Konstantin Belousov
To: "Damian's Proton Mail"
Cc: "freebsd-hackers@freebsd.org"
Subject: Re: amd64 syscall ABI (vs. Darwin)
In-Reply-To: <94B30813-0034-4F90-9AAC-113402A1A3E8@dmcyk.xyz>
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

On Mon, Jan 17, 2022 at 10:31:09PM +0000, Damian's Proton Mail wrote:
>
>
> On 17 Jan 2022, at 14:38, Konstantin Belousov wrote:
> >
> > On Mon, Jan 17, 2022 at 12:41:59PM +0000,
> > Damian Malarczyk wrote:
> >> Hello,
> >>
> >> I'm hacking on a toy project to run Darwin (Mach-O) binaries on FreeBSD.
> >> Currently I'm at the stage of syscall support, and I've noticed a difference in the amd64 ABI that I didn't expect.
> >>
> >> FreeBSD changes the values of some registers that aren't used as syscall output; e.g., r8-r11 are changed, while r12-r15 don't seem to be affected.
> >> That's not the case on Darwin: from what I've seen, only rax and rdx, which are used as syscall results, are changed.
> >> It looks like FreeBSD's syscall calling convention is more like a standard function call, and r8-r11 should always be treated as caller-saved.
> > It is not 'more like'. FreeBSD follows the C ABI for amd64 for syscall
> > register handling. An additional twist is that the registers which are
> > declared as callee-clobbered are zeroed to avoid kernel data leakage to
> > userspace.
> Oh I see, this explains it then.
>
> >>
> >> At first glance the Darwin approach seems more optimal, as fewer registers get clobbered. Is there any specific reason why this isn't also the case on FreeBSD?
> >> I'm also wondering where exactly the register values are changed. When I look at the trapframe contents in the sv_set_syscall_retval system vector callback, the r8 register value is the same as on input, so it must be changed somewhere later. Does anyone know where exactly this happens?
> >
> > Look at sys/amd64/amd64/exceptions.S. The fast_syscall entry point
> > is where we receive control after the syscall instruction.
> A lot of new things in there for me, but the flow is clear. I was able to find the corresponding logic in XNU’s sources too. Earlier I said:
> > > At first glance the Darwin approach seems more optimal
> But it’s instead the opposite, or no difference at all, since Darwin explicitly restores/sets all registers, including the callee-saved r12-r15.
>
> Explicitly preserving registers would prevent kernel data leakage too.
> Doing so in FreeBSD would also be an ABI-compatible change, I think, since users shouldn’t rely on the values in those registers.
> I’m curious if you see any obvious pros/cons with either approach, or is it just a more arbitrary implementation choice?

We preserve everything on syscall entry; it is the SYSCALL instruction's
behavior that makes it look somewhat convoluted. I suggest you read the
Intel SDM description of the SYSCALL instruction to understand the
register manipulations on entry.

On the other hand, on the fast syscall return, we indeed do not restore
everything. If you want to restore the full frame, use the PCB_FULL_IRET
pcb flag to request the iretq return path.

>
> Not that I’d propose changing the ABI though, I also want my toy project to work as a plug-in kernel module.
> I guess the only other option to emulate Darwin's behaviour would be to intercept syscalls in userspace somehow first and manually preserve the register values?

To emulate Darwin, you would need a specific ABI personality (sysent) in
the kernel, which would also provide the sv_syscall_ret method. The method
can do whatever is needed to the return frame, and set PCB_FULL_IRET to
indicate that the kernel should load it into the CPU GPR file as is.

BTW, does Darwin use the SYSCALL instruction for syscall entry on amd64?