svn commit: r339876 - head/libexec/rtld-elf

Alexander Richardson arichardson at freebsd.org
Wed Oct 31 19:55:16 UTC 2018


On Wed, 31 Oct 2018 at 18:38, Mark Millard <marklmi26-fbsd at yahoo.com> wrote:
>
> On 2018-Oct-30, at 3:23 PM, Alexander Richardson <arichardson at freebsd.org> wrote:
>
> > . . .
> > Before this change obj->textsize was always set as the end of
> > PT_LOAD[0]. Now it will contain everything up to the end of the last
> > PT_LOAD with execute permissions. In the binary you dumped this is
> > PT_LOAD[0] both before and after the patch so there is no change in
> > behaviour. The .got and .plt we not included in textsize before this
> > patch either. Therefore the only way I can see this patch breaking
> > anything is if the PHDRS of one of the loaded libraries causes a
> > change in behaviour.
>
> A fair night's sleep helps.
>
> Not only did I read something into your description that was
> not there, I looked at powerpc64 when it turns out that
> only 32-bit powerpc shows the problem. (I asked.)
>
> I'm also told reverting just those changes fixes 32-bit
> powerpc context.
>
> So I'm using an older 32-bit powerpc context to look at:
>
> https://artifact.ci.freebsd.org/snapshot/head/r339870/powerpc/powerpc/base*.txz
>
> materials now. -r339870 is the first 32-bit powerpc build
> available there after the changes were made.
>
> So in a /339870/ area from expanding the .txz's
> inside it I get:
>
> # elfdump -p ./bin/ls | less
>
> program header:
>
> entry: 0
>         p_type: PT_PHDR
>         p_offset: 52
>         p_vaddr: 0x1800034
>         p_paddr: 0x1800034
>         p_filesz: 224
>         p_memsz: 224
>         p_flags: PF_X|PF_R
>         p_align: 4
>
> entry: 1
>         p_type: PT_INTERP
>         p_offset: 276
>         p_vaddr: 0x1800114
>         p_paddr: 0x1800114
>         p_filesz: 21
>         p_memsz: 21
>         p_flags: PF_R
>         p_align: 1
>
> entry: 2
>         p_type: PT_LOAD
>         p_offset: 0
>         p_vaddr: 0x1800000
>         p_paddr: 0x1800000
>         p_filesz: 34112
>         p_memsz: 34112
>         p_flags: PF_X|PF_R
>         p_align: 65536
>
> entry: 3
>         p_type: PT_LOAD
>         p_offset: 34112
>         p_vaddr: 0x1818540
>         p_paddr: 0x1818540
>         p_filesz: 316
>         p_memsz: 1752
>         p_flags: PF_X|PF_W|PF_R
>         p_align: 65536
>
> entry: 4
>         p_type: PT_DYNAMIC
>         p_offset: 34132
>         p_vaddr: 0x1818554
>         p_paddr: 0x1818554
>         p_filesz: 216
>         p_memsz: 216
>         p_flags: PF_W|PF_R
>         p_align: 4
>
> entry: 5
>         p_type: PT_NOTE
>         p_offset: 300
>         p_vaddr: 0x180012c
>         p_paddr: 0x180012c
>         p_filesz: 48
>         p_memsz: 48
>         p_flags: PF_R
>         p_align: 4
>
> entry: 6
>         p_type: PT_LOAD
>         p_offset: 0
>         p_vaddr: 0
>         p_paddr: 0
>         p_filesz: 0
>         p_memsz: 0
>         p_flags: PF_W|PF_R
>         p_align: 4
>
> I note some things unique to this compared
> to powerpc64 and to what the old code would
> do for what is unique:
>
> There are 2 PT_LOADS with PF_X and there are a
> bunch of pages between that are not covered by
> any entry. This is from p_align being 65536 from
> what I can tell.
>
> entry 2:
> 0x1800000+34112==0x1808540 for the ending of the read-only PF_X area.
> entry 3:
> 0x1818540 for the beginning of the writable PF_X area.
> 0x0010000 differences: 65536 Bytes.
>
> What is the handling of the page range that is
> not described in any entry? Is
> __syncicache(obj->mapbase, obj->textsize)
> spanning the hole valid? Previously the
> hole was not spanned as I understand.
>
>
> Libraries also have the hole between the
> PT_LOAD with PF_X ranges. Using /lib/libc.so
> as an example:
>
> # elfdump -p ./lib/libc.so.7 | less
>
> program header:
>
> entry: 0
>         p_type: PT_LOAD
>         p_offset: 0
>         p_vaddr: 0
>         p_paddr: 0
>         p_filesz: 1746472
>         p_memsz: 1746472
>         p_flags: PF_X|PF_R
>         p_align: 65536
>
> entry: 1
>         p_type: PT_LOAD
>         p_offset: 1746480
>         p_vaddr: 0x1ba630
>         p_paddr: 0x1ba630
>         p_filesz: 44188
>         p_memsz: 201536
>         p_flags: PF_X|PF_W|PF_R
>         p_align: 65536
>
> entry: 2
>         p_type: PT_DYNAMIC
>         p_offset: 1762600
>         p_vaddr: 0x1be528
>         p_paddr: 0x1be528
>         p_filesz: 192
>         p_memsz: 192
>         p_flags: PF_W|PF_R
>         p_align: 4
>
> entry: 3
>         p_type: PT_TLS
>         p_offset: 1746480
>         p_vaddr: 0x1ba630
>         p_paddr: 0x1ba630
>         p_filesz: 2832
>         p_memsz: 2860
>         p_flags: PF_R
>         p_align: 16
>
> entry: 4
>         p_type: PT_NULL
>         p_offset: 1745288
>         p_vaddr: 0x1aa188
>         p_paddr: 0x1aa188
>         p_filesz: 236
>         p_memsz: 236
>         p_flags: PF_R
>         p_align: 4
>
> entry: 5
>         p_type: PT_LOAD
>         p_offset: 0
>         p_vaddr: 0
>         p_paddr: 0
>         p_filesz: 0
>         p_memsz: 0
>         p_flags: PF_W|PF_R
>         p_align: 4
>
> 1746472==0x1AA628 for the ending of the readonly code.
> 0x1ba630 for the beginning of the writable code.
> 0x10008 difference: 655544 bytes.
>
>
> I doubt the following would contribute but I
> note them:
>
> PT_PHDR for ./bin/ls has PF_X indicated.
>
> Various other entries overlap with the
> PT_LOAD's that have PF_X.
>

This seems to indicate that it is not possible to for __syncicache()
to span unmapped ranges.
It seems to me like the current behaviour for flushing the icache was
making some assumptions that do not actually hold and my change
exposed them.
Apparently we have some DSOs that have multiple PF_X PT_LOAD segments
(which should probably be included in the __syncicache() call) but
currently
only the value of the first PT_LOAD is included. This will work
correctly for most cases since PT_LOAD[0] usually contains .text but
will break once we start using
LLD since LLD sometimes places a read-only segment first.
I'm not sure what the correct solution is here. Clearly, we should be
flushing the icache for all executable segments and just PT_LOAD[0]
(which may not even be executable).
If it's not possible to use a single contiguous range we might have to
store a list.

Alex


More information about the svn-src-head mailing list