clang 3.8.0 can mess up __builtin_dwarf_cfa (), at least for TARGET_ARCH=armv6, powerpc, powerpc64: a bug 207325 update

Mon Feb 29 01:41:07 UTC 2016

Looking some at clang/llvm history shows releases/branches:

V2.6 did not have "case Builtin::BI__builtin_dwarf_cfa".
V2.7 did have "case Builtin::BI__builtin_dwarf_cfa" but PPCTargetLowering::LowerFRAMEADDR ignored the argument.
V2.8 had PPCTargetLowering::LowerFRAMEADDR using its argument (as a frame depth, not a byte offset).

The apparently incorrect (not matching g++ frame depth returned) comments, naming, and value (when viewed as a frame depth) for "case Builtin::BI__builtin_dwarf_cfa" started in V2.7 and continues to this day.

That is a lot of time for various dependencies on the clang (mis)definition to accumulate across everything that uses clang.

It may be that limiting any change to specific TARGET_ARCH's for FreeBSD is appropriate. FreeBSD would likely need to list the appropriate TARGET_ARCH's, likely including powerpc and powerpc64 since clang before 3.8.0 was not in use for buildworld for powerpc/powerpc64.

Still this may have consequences for ports that use clang and might reference clang-compiled __builtin_dwarf_cfa() use, possibly from a lang/clang* instead of the system clang. My guess is that the interoperability with being able to use g++ vintages as well may lead to (modern?) lang/clang*'s tracking the fix for FreeBSD TARGET_ARCH's that are fixed.

I can ignore all this and build a system based on using 1 as the frame depth just to test, just as a matter of proof of concept for powerpc. (Powerpc64 hits a system libgcc_s defect and so needs more before C++ exceptions can be tested overall.)

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-28, at 2:20 PM, Mark Millard <markmi at dsl-only.net> wrote:

In /usr/src/contrib/llvm/tools/clang/lib/CodeGen/CGBuiltin.cpp there is:

>  case Builtin::BI__builtin_dwarf_cfa: {
>    // The offset in bytes from the first argument to the CFA.
>    //
>    // Why on earth is this in the frontend?  Is there any reason at
>    // all that the backend can't reasonably determine this while
>    // lowering llvm.eh.dwarf.cfa()?
>    //
>    // TODO: If there's a satisfactory reason, add a target hook for
>    // this instead of hard-coding 0, which is correct for most targets.
>    int32_t Offset = 0;
> 
>    Value *F = CGM.getIntrinsic(Intrinsic::eh_dwarf_cfa);
>    return RValue::get(Builder.CreateCall(F,
>                                      llvm::ConstantInt::get(Int32Ty, Offset)));
>  }

I would have guessed that the internal argument was how many frames away on the stack to go from what 0 produces (high er address direction). g++'s __builtin_dwarf_cfa() returns the address for the next frame compared to clang 3.8.0 (higher address direction).

I'd call that more of a frame depth than an offset. .eh_frame and its cfa material use offset terminology as byte offsets. And the comments above talk of an offset in bytes --but "next frame" distances in bytes would not be constant.

Looking at a use of LowerFRAMEADDR in a LowerRETURNADDR, for example,

> SDValue ARMTargetLowering::LowerRETURNADDR(SDValue Op, SelectionDAG &DAG) const{
> . . .
>  EVT VT = Op.getValueType();
>  SDLoc dl(Op);
>  unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();
>  if (Depth) {
>    SDValue FrameAddr = LowerFRAMEADDR(Op, DAG);
>    SDValue Offset = DAG.getConstant(4, dl, MVT::i32);
>    return DAG.getLoad(VT, dl, DAG.getEntryNode(),
>                       DAG.getNode(ISD::ADD, dl, VT, FrameAddr, Offset),
>                       MachinePointerInfo(), false, false, false, 0);
>  }
> . . .
> }

(PPCTargetLowering::LowerRETURNADDR is similar.) 

This has a mix of Depth and Offset overall, with the depth going to LowerFRAMEADDR via Op but Offset used later in GAG.getLoad via adding to the FrameAddr.

This would lead me to guess that the terminology and comments in "case Builtin::BI__builtin_dwarf_cfa" are wrong and that the Builder.CreateCall has been given a frame depth, not an offset.

> SDValue PPCTargetLowering::LowerFRAMEADDR(SDValue Op,
>                                          SelectionDAG &DAG) const {
>  SDLoc dl(Op);
>  unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();
> 
>  MachineFunction &MF = DAG.getMachineFunction();
>  MachineFrameInfo *MFI = MF.getFrameInfo();
>  MFI->setFrameAddressIsTaken(true);
> 
>  EVT PtrVT = DAG.getTargetLoweringInfo().getPointerTy(MF.getDataLayout());
>  bool isPPC64 = PtrVT == MVT::i64;
> 
>  // Naked functions never have a frame pointer, and so we use r1. For all
>  // other functions, this decision must be delayed until during PEI.
>  unsigned FrameReg;
>  if (MF.getFunction()->hasFnAttribute(Attribute::Naked))
>    FrameReg = isPPC64 ? PPC::X1 : PPC::R1;
>  else
>    FrameReg = isPPC64 ? PPC::FP8 : PPC::FP;
> 
>  SDValue FrameAddr = DAG.getCopyFromReg(DAG.getEntryNode(), dl, FrameReg,
>                                         PtrVT);
>  while (Depth--)
>    FrameAddr = DAG.getLoad(Op.getValueType(), dl, DAG.getEntryNode(),
>                            FrameAddr, MachinePointerInfo(), false, false,
>                            false, 0);
>  return FrameAddr;
> } 

Again Op is called Depth --and is used to get from one frame pointer value to the next: a frame depth.

To match g++ 4.2.1 the value to use is 1 for depth.

Overall, at least applied to powerpc/powerpc64:

> . . .

>    // TODO: If there's a satisfactory reason, add a target hook for
>    // this instead of hard-coding 0, which is correct for most targets.
>    int32_t Offset = 0;

I think the comments in this area are actually talking about byte offsets, not depths and are just wrong. A byte offset of 0 would make sense relative to hardcoding but the value is actually a frame depth --a very different context.

I think that the naming of the variable is just wrong: it should be called Depth.

And I think that the comments should have originally talked about using a hard coded Depth 1 to match g++ and preexisting library usage of __builtin_dwarf_cfa() for C++ and other exception handling (.eh_frame usage). ANd the code should avhe matched.

As far as I can tell this error in the "case Builtin::BI__builtin_dwarf_cfa:" design was not caught until now.

But since the mess has been around a longtime just switching everything to match the g++ context now likely has its own problems. (Not just a FreeBSD issue.)

For FreeBSD I expect that Depth 1 could be used for powerpc and powerpc64: if it has been wrong for a long time (not just 3.8.0) then powerpc/powerpc64 FreeBSD has likely been broken for C++ exception handling when buildworld was via clang for just as long. (But only recently has clang gotten this near working for buildworld for at least one of powerpc/powerpc64. Currently powerpc is closer, given that powerpc64 does not support softfloat last I knew.)

For other TARGET_ARCH's:

For FreeBSD armv6 it is less clear to me: it is based on clang as it is and I do not know what C++ exception ABI it uses. If a modern gcc/g++ buildworld had problems with C++ exception handling, does anything need to be done about it? For FreeBSD armv6 and the like: is xtoolchain like support important?

FreeBSD may have similar questions for other TARGET_ARCH's.

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-28, at 2:46 AM, Mark Millard <markmi at dsl-only.net> wrote:

On 2016-Feb-28, at 12:39 AM, Roman Divacky <rdivacky at vlakno.cz> wrote:
> 
> Mark,
> 
> __builtin_dwarf_cfa() is lowered in clang to llvm intrinsic eh_dwarf_cfa.
> There's a depth argument (which defaults to 0, saying it's correct for most
> targets). 
> 
> Then the intrinsic gets lowered in SelectionDAG using
> PPCTargetLowering::LowerFRAMEADDR()
> 
> 
> Can you check that 1) the depth should be 0 for ppc64/ppc32 2) that
> LowerFRAMEADDR() does something sensible?
> 
> There's a loop depth-times, so I wonder if that makes a difference.
> 
> Thanks, Roman

"Lowered"? I'm not familiar with the clang code base or its terminology. Handed off to a lower level interface, may be?

As near as I can tell libgcc_s could be made to deal with clang 3.8.0's way of __builtin_dwarf_cfa() working for powerpc/powerpc64. But then use with g++ would be broken on powerpc/powerpc64 unless there were some sort of live "which compiler's type of code" test also involved.

Having only one libgcc_s and multiple compilers using it for a given TARGET_ARCH= (for example, devel/powerpc64-xtoolchain-gcc like uses) suggests sticking to one convention per TARGET_ARCH= for __builtin_dwarf_cfa().

I would guess that g++ conventions win in this type of context for FreeBSD, under the guideline of trying to be gcc 4.2.1 "ABI" compatible. libgcc_s from FreeBSD works for C++ exceptions with its gcc 4.2.1 for powerpc and powerpc64 as things are as far as I know.

But for clang++ FreeBSD is one context among many and other libraries may be based on clang 3.8.0's existing interpretation, without gcc/g++ compatibility constraints. (I've no experience with earlier clang vintages for the issue.) It may be that any change needs to be FreeBSD target-triple specific for all I know. In essence: making the convention part of the ABI chosen.

I'll probably get some sleep before looking much at the code that you reference. A quick look at part of it suggests a fair amount of research/study for me to interpret things reliably.

The loop may be somewhat analogous to _Unwind_RaiseException's loop, but for a specific depth. I would currently guess that depth 1 would match gcc 4.2.1's result for __builtin_dwarf_cfa().

But there was also some other "address"(?) builtin support routine nearby that seemed to call into LowerFRAMEADDR() and I've no clue if g++ 4.2.1 uses the same depth-figure standard for both sorts of things or not. For all I know both types of builtins(?) might have mismatches with gcc/g++ 4.2.1 and both might need fixes.

I do vaguely remember seeing a builtin in FreeBSD code for something that had an explicit number in the argument list, possibly __builtin_frame_address(n)(?). But I only saw __builtin_dwarf_cfa() with no argument in the argument list as far as I remember.

If clang 3.8.0 and gcc 4.2.1 disagreed about what the numbering standard referred to for __builtin_frame_address(n) (or whatever it was), that would not be good and compatibility would imply conversions between the conventions for the 2 FreeBSD contexts.

I have not checked for armv6 related clang++ vs. g++ compatibility for C++ exception-handling. If anything is not operating for that context I expect that it would be g++ that generated buildworld code that did not work based on the FreeBSD source code: clang/clang++ is the normal compiler and kyua seemed to operate, unlike on the powerpc/powerpc64.

I've never tried to build armv6 via an equivalent of devel/powerpc64-gcc. I do not know if armv6 even uses the same sort of C++ exception-handling ABI or not. But I do know that __builtin_dwarf_cfa() is not compatible between clang++ and g++ from the 2-line source and comparing objdump -d results.

So more than powerpc/powerpc64 might be involved overall.

===
Mark Millard
markmi at dsl-only.net