svn commit: r339876 - head/libexec/rtld-elf

Mark Millard marklmi26-fbsd at yahoo.com
Wed Oct 31 19:28:47 UTC 2018


On 2018-Oct-31, at 11:53 AM, Alexander Richardson <arichardson at freebsd.org> wrote:

> On Wed, 31 Oct 2018 at 18:38, Mark Millard <marklmi26-fbsd at yahoo.com> wrote:
>> 
>> On 2018-Oct-30, at 3:23 PM, Alexander Richardson <arichardson at freebsd.org> wrote:
>> 
>>> . . .
>>> Before this change obj->textsize was always set as the end of
>>> PT_LOAD[0]. Now it will contain everything up to the end of the last
>>> PT_LOAD with execute permissions. In the binary you dumped this is
>>> PT_LOAD[0] both before and after the patch so there is no change in
>>> behaviour. The .got and .plt we not included in textsize before this
>>> patch either. Therefore the only way I can see this patch breaking
>>> anything is if the PHDRS of one of the loaded libraries causes a
>>> change in behaviour.
>> 
>> A fair night's sleep helps.
>> 
>> Not only did I read something into your description that was
>> not there, I looked at powerpc64 when it turns out that
>> only 32-bit powerpc shows the problem. (I asked.)
>> 
>> I'm also told reverting just those changes fixes 32-bit
>> powerpc context.
>> 
>> So I'm using an older 32-bit powerpc context to look at:
>> 
>> https://artifact.ci.freebsd.org/snapshot/head/r339870/powerpc/powerpc/base*.txz
>> 
>> materials now. -r339870 is the first 32-bit powerpc build
>> available there after the changes were made.
>> 
>> So in a /339870/ area from expanding the .txz's
>> inside it I get:
>> 
>> # elfdump -p ./bin/ls | less
>> 
>> program header:
>> 
>> entry: 0
>>        p_type: PT_PHDR
>>        p_offset: 52
>>        p_vaddr: 0x1800034
>>        p_paddr: 0x1800034
>>        p_filesz: 224
>>        p_memsz: 224
>>        p_flags: PF_X|PF_R
>>        p_align: 4
>> 
>> entry: 1
>>        p_type: PT_INTERP
>>        p_offset: 276
>>        p_vaddr: 0x1800114
>>        p_paddr: 0x1800114
>>        p_filesz: 21
>>        p_memsz: 21
>>        p_flags: PF_R
>>        p_align: 1
>> 
>> entry: 2
>>        p_type: PT_LOAD
>>        p_offset: 0
>>        p_vaddr: 0x1800000
>>        p_paddr: 0x1800000
>>        p_filesz: 34112
>>        p_memsz: 34112
>>        p_flags: PF_X|PF_R
>>        p_align: 65536
>> 
>> entry: 3
>>        p_type: PT_LOAD
>>        p_offset: 34112
>>        p_vaddr: 0x1818540
>>        p_paddr: 0x1818540
>>        p_filesz: 316
>>        p_memsz: 1752
>>        p_flags: PF_X|PF_W|PF_R
>>        p_align: 65536
>> 
>> entry: 4
>>        p_type: PT_DYNAMIC
>>        p_offset: 34132
>>        p_vaddr: 0x1818554
>>        p_paddr: 0x1818554
>>        p_filesz: 216
>>        p_memsz: 216
>>        p_flags: PF_W|PF_R
>>        p_align: 4
>> 
>> entry: 5
>>        p_type: PT_NOTE
>>        p_offset: 300
>>        p_vaddr: 0x180012c
>>        p_paddr: 0x180012c
>>        p_filesz: 48
>>        p_memsz: 48
>>        p_flags: PF_R
>>        p_align: 4
>> 
>> entry: 6
>>        p_type: PT_LOAD
>>        p_offset: 0
>>        p_vaddr: 0
>>        p_paddr: 0
>>        p_filesz: 0
>>        p_memsz: 0
>>        p_flags: PF_W|PF_R
>>        p_align: 4
>> 
>> I note some things unique to this compared
>> to powerpc64 and to what the old code would
>> do for what is unique:
>> 
>> There are 2 PT_LOADS with PF_X and there are a
>> bunch of pages between that are not covered by
>> any entry. This is from p_align being 65536 from
>> what I can tell.
>> 
>> entry 2:
>> 0x1800000+34112==0x1808540 for the ending of the read-only PF_X area.
>> entry 3:
>> 0x1818540 for the beginning of the writable PF_X area.
>> 0x0010000 differences: 65536 Bytes.
>> 
>> What is the handling of the page range that is
>> not described in any entry? Is
>> __syncicache(obj->mapbase, obj->textsize)
>> spanning the hole valid? Previously the
>> hole was not spanned as I understand.
>> 
>> 
>> Libraries also have the hole between the
>> PT_LOAD with PF_X ranges. Using /lib/libc.so
>> as an example:
>> 
>> # elfdump -p ./lib/libc.so.7 | less
>> 
>> program header:
>> 
>> entry: 0
>>        p_type: PT_LOAD
>>        p_offset: 0
>>        p_vaddr: 0
>>        p_paddr: 0
>>        p_filesz: 1746472
>>        p_memsz: 1746472
>>        p_flags: PF_X|PF_R
>>        p_align: 65536
>> 
>> entry: 1
>>        p_type: PT_LOAD
>>        p_offset: 1746480
>>        p_vaddr: 0x1ba630
>>        p_paddr: 0x1ba630
>>        p_filesz: 44188
>>        p_memsz: 201536
>>        p_flags: PF_X|PF_W|PF_R
>>        p_align: 65536
>> 
>> entry: 2
>>        p_type: PT_DYNAMIC
>>        p_offset: 1762600
>>        p_vaddr: 0x1be528
>>        p_paddr: 0x1be528
>>        p_filesz: 192
>>        p_memsz: 192
>>        p_flags: PF_W|PF_R
>>        p_align: 4
>> 
>> entry: 3
>>        p_type: PT_TLS
>>        p_offset: 1746480
>>        p_vaddr: 0x1ba630
>>        p_paddr: 0x1ba630
>>        p_filesz: 2832
>>        p_memsz: 2860
>>        p_flags: PF_R
>>        p_align: 16
>> 
>> entry: 4
>>        p_type: PT_NULL
>>        p_offset: 1745288
>>        p_vaddr: 0x1aa188
>>        p_paddr: 0x1aa188
>>        p_filesz: 236
>>        p_memsz: 236
>>        p_flags: PF_R
>>        p_align: 4
>> 
>> entry: 5
>>        p_type: PT_LOAD
>>        p_offset: 0
>>        p_vaddr: 0
>>        p_paddr: 0
>>        p_filesz: 0
>>        p_memsz: 0
>>        p_flags: PF_W|PF_R
>>        p_align: 4
>> 
>> 1746472==0x1AA628 for the ending of the readonly code.
>> 0x1ba630 for the beginning of the writable code.
>> 0x10008 difference: 655544 bytes.

[Dumb typo (too many 5's), fix is: 65544 .]

>> I doubt the following would contribute but I
>> note them:
>> 
>> PT_PHDR for ./bin/ls has PF_X indicated.
>> 
>> Various other entries overlap with the
>> PT_LOAD's that have PF_X.
>> 
> 
> This seems to indicate that it is not possible to for __syncicache()
> to span unmapped ranges.
> It seems to me like the current behaviour for flushing the icache was
> making some assumptions that do not actually hold and my change
> exposed them.
> Apparently we have some DSOs that have multiple PF_X PT_LOAD segments
> (which should probably be included in the __syncicache() call) but
> currently
> only the value of the first PT_LOAD is included. This will work
> correctly for most cases since PT_LOAD[0] usually contains .text but
> will break once we start using
> LLD since LLD sometimes places a read-only segment first.
> I'm not sure what the correct solution is here. Clearly, we should be
> flushing the icache for all executable segments and just PT_LOAD[0]
> (which may not even be executable).
> If it's not possible to use a single contiguous range we might have to
> store a list.

This is not the first time that the __syncicache (or analogous) and
such holes have been an issue. Nathan Whitehorn and I had an exchange
long ago that was tied to such --but I've not been able to find a
copy in my list/E-mail records or in bugzilla.

Back then it turned out that it happened because I'd made a dumb
typo someplace where the exponent of 2 was supposed to be in an
assembler source to specify a value that had to be a power of 2
but I'd typed the intended overall value instead, making the size
bigger than I intended. But, as I remember, Nathan thought that
the mistake had exposed a legitimate issue: the hole could have
existed for other reasons and requiring there be no hole was too
much (long term anyway).

Unfortunately, I only have vague memories of what all was written
at the time --or exactly what code was involved at the time. I
would not depend on any details from back then that come from me.

Still, if you are lucky, Nathan might remember what he was thinking
back then and know if it applies to this.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



More information about the svn-src-head mailing list