Re: git: e85eaa930862 - main - Have rtld query the page size from the kernel

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Thu, 14 Nov 2024 14:47:30 UTC
On 11/13/24 12:10, Jessica Clarke wrote:
> On 13 Nov 2024, at 19:44, John Baldwin <jhb@FreeBSD.org> wrote:
>>
>> On 4/7/22 07:38, Andrew Turner wrote:
>>> The branch main has been updated by andrew:
>>> URL: https://cgit.FreeBSD.org/src/commit/?id=e85eaa930862d5b4dc917bc31e8d7254a693635d
>>> commit e85eaa930862d5b4dc917bc31e8d7254a693635d
>>> Author:     Andrew Turner <andrew@FreeBSD.org>
>>> AuthorDate: 2022-04-04 15:05:40 +0000
>>> Commit:     Andrew Turner <andrew@FreeBSD.org>
>>> CommitDate: 2022-04-07 14:37:37 +0000
>>>      Have rtld query the page size from the kernel
>>>           To allow for a dynamic page size on arm64 have the runtime linker
>>>      query the kernel for the currentl page size.
>>>           Reviewed by:    kib
>>>      Sponsored by:   The FreeBSD Foundation
>>>      Differential Revision: https://reviews.freebsd.org/D34765
>>
>> This broke relro handling for rtld.  The reason is that init_pagesizes() is
>> called after parsing the program headers for rltd in init_rtld().  As a result,
>> page_size is 0 when rtld_round_page() is called so the relro_size is 0.  The
>> RTLD_INIT_EARLY_PAGESIZES case was for ia64, and in the early case it's probably
>> not safe to call sysctl?  If it is safe to call sysctl, we could just always
>> init pagesizes early?
> 
> It looks like there are a few things going on:
> 
> 1. relocate_object calls obj_enforce_relro if !obj->mainprog, so will
> try to enforce RELRO for RTLD itself whilst page_size is 0
> 
> 2. init_rtld later calls obj_enforce_relro for obj_rtld, after
> page_size has been initialised
> 
> 3. init_rtld is careful to avoid using global variables until it’s
> called relocate_objects for RTLD itself, but by hiding accesses to
> page_size away in rtld_*_page that’s no longer true (definitely not
> true in the case of text relocations, for example, though whether it
> also occurs for other cases we care more about I don’t know)
> 
> So I think there are a couple of things to fix:
> 
> 1. Stop accessing page_size prior to relocate_objects returning for
> RTLD itself
> 
> 2. Stop enforcing RELRO twice for RTLD (e.g. add && obj != rtldobj to
> relocate_object’s case)
> 
> At least, that’s what I’ve inferred from reading the code.
> 
> Though, to be honest, things might be rather nicer if we just made
> .rtld_start responsible for relocating RTLD itself prior to calling
> init_rtld, that’s what we have to do for CHERI, as do arm, powerpc and
> powerpc64, and it means you can use globals from the start in init_rtld.

I've done 2) locally which fixed my immediate issue.

I agree though that having all arches implement rtld_relocate_nonplt_self()
which is called before rtld() would be nicer.

-- 
John Baldwin