Reasons to still not build buildworld buildkernel via system-clang --John Baldwin notes one I was unaware of

Mark Millard marklmi at yahoo.com
Tue Oct 23 21:15:09 UTC 2018


[Mostly just giving some powerpc64 detail, at least
when base/binutils is used.]

On 2018-Oct-22, at 2:35 PM, John Baldwin <jhb at freebsd.org> wrote:

> On 10/19/18 7:23 AM, Mark Millard wrote:
>> [I'm adding toolchain and John B. to the TO: list. John B.
>> may want to reply to Sean F. I'm also adding a
>> /lib/libgcc_s.so related item to the list of 3 known
>> issues.]
>> 
>>> On 2018-Oct-19, at 6:21 AM, Sean Fertile <sd.fertile at gmail.com> wrote:
>>> 
>>> Clang isn't getting the TLS model wrong, it actually generates pic code by default as if -fpic were specified. I think the original intent behind switching
>>> to pic by default was due to incorrectly thinking gcc was pic by default (I'm not sure if this was meant as only gcc on BSD or gcc on powerpc64 in general),
>>> as well as working around some problems that clang's non-pic codegen has/had for the ELF V1 ABI. There are some patches on phabricator for switching
>>> the default back to non-pic codegen, but they leave the pic default for BSD: https://reviews.llvm.org/D53384 and https://reviews.llvm.org/D53383
>>> 
>>> According to what you and John are saying, the pic default is incorrect for BSD as well. If that's the case, please either comment on the reviews to let Stefan know,
>>> or let me know here and we can update the patches accordingly.
> 
> No, what I am saying is that in GCC, the decision for dynamic TLS model
> vs static TLS model is based on whether or not -fPIC is explicitly
> given on the command line.  For MIPS at least (where I am familiar with
> this), both GCC and clang default to implicit PIC.

FYI:

John found that clang makes PIC the default for mips64 and
powerpc64 (I'm ignoring x86_64, Windows, MachO and MacOSX in
my comments). The code in question is:

bool Generic_GCC::isPICDefault() const {
 switch (getArch()) {
 case llvm::Triple::x86_64:
   return getTriple().isOSWindows();
 case llvm::Triple::ppc64:
 case llvm::Triple::ppc64le:
   return !getTriple().isOSBinFormatMachO() && !getTriple().isMacOSX();
 case llvm::Triple::mips64:
 case llvm::Triple::mips64el:
   return true;
 default:
   return false;
 }
}

>   However, GCC uses
> static TLS models (initial-exec, etc.) when -fPIC isn't given on the
> command line even if PIC is still implicitly enabled.  It only uses the
> dynamic TLS models (intended for use in shared libraries) if -fPIC is
> explicitly passed on the command line.


> However, clang implements implicit PIC by passing the equivalent of
> -fPIC to the llvm backend, so on MIPS at least, the result is that llvm
> _always_ uses the dynamic TLS models including for static libraries and
> binaries where this is wrong.  I have some patches from one of the LLVM
> MIPS folks that kind of hackishly fix this, but by adding a new flag
> only for MIPS.  I wanted to adjust their patches to be more generic so
> that there's a new '-mshared-library' or some such passed from clang
> to llvm and have clang only set that to true if -fPIC is explicitly
> given on the command line to match GCC's behavior.
> 
> So to be clear, this isn't saying that the implicit PIC setting is
> wrong.  It is saying that the llvm backend incorrectly interprets
> the clang front-end's implicit PIC setting as being an explicit PIC
> setting for the purposes of choosing the TLS model to use.

For powerpc64 things are somewhat different, because of some
link-time optimizations that apply when base/binutils is in
use. (lld is not ready for use on powerpc64, as I understand
it, and I've no clue what would happen with lld.)

cc -g -O2 -c will produce .o files with the __tls_get_addr
calls, for example (source shown later):

# objdump -r tlsy.o | grep __tls_get_addr
0000000000000024 R_PPC64_REL24     __tls_get_addr
0000000000000038 R_PPC64_REL24     __tls_get_addr

# objdump -r tls_main.o | grep __tls_get_addr
0000000000000020 R_PPC64_REL24     __tls_get_addr
0000000000000034 R_PPC64_REL24     __tls_get_addr
000000000000008c R_PPC64_REL24     __tls_get_addr
00000000000000a0 R_PPC64_REL24     __tls_get_addr

This is as John indicated. But the likes of:

# cc -g -O0 tls_main.o tlsy.o tls.o

ends up with a.out using r13 (the powerpc64
thread pointer) and having no calls to
__tls_get_addr left in x, y, or main. There are
some nop instructions where the substitutions
were made.

It appears that mips64 does not have such a
late optimization in John's context: as I
understand it, the __tls_get_addr use survives
into a.out.

The source for the example used above was:

# more tls.c
int __thread i = 3;
int __thread j = 2;

# more tlsy.c
extern int __thread i;
extern int __thread j;
int y() { return i+j; }

# more tls_main.c
extern int __thread i;
extern int __thread j;
extern int y();
int x() { return i+j; }
int main ()
{
   return x()-y();
}

(So main must have had x() inlined into it,
which is why tls_main.o shows 4 R_PPC64_REL24
uses of __tls_get_addr: 2 in x itself and 2
in the inlined copy in main.)

I have no clue if the late-optimization for
powerpc64 covers all the cases where direct use
of some static TLS model would be appropriate.
I just know that, at least in some types of
contexts, some calls to __tls_get_addr are
eliminated at link time when base/binutils is
in use.
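
For anyone who wants to experiment without
depending on what the linker does: both GCC and
clang document a tls_model variable attribute
(and a -ftls-model= command line option) for
requesting a TLS model explicitly. Below is a
minimal single-file sketch of the example above
with initial-exec requested. The function name
tls_difference is made up for illustration, and
I have not checked what code this produces on
powerpc64 or mips64.

```c
/* Single-file variant of tls.c / tlsy.c / tls_main.c with the
 * TLS model requested explicitly instead of left to the default.
 * Assumes a GCC-compatible compiler that accepts the tls_model
 * attribute; tls_difference is a hypothetical helper name. */

/* Same two thread-local variables as in tls.c. */
__thread int i __attribute__((tls_model("initial-exec"))) = 3;
__thread int j __attribute__((tls_model("initial-exec"))) = 2;

/* Same bodies as x() and y() in the original example. */
int x(void) { return i + j; }
int y(void) { return i + j; }

/* What main() returned in the original example. */
int tls_difference(void) { return x() - y(); }
```

Compiling this with cc -g -O2 -c and running
objdump -r on the result, as above, would show
whether any __tls_get_addr relocations remain
in the .o file.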

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


