maintainer-feedback requested: [Bug 258954] java/openjdk11 java/openjdk12 java/openjdk13: crashes when built with clang 13

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 05 Oct 2021 19:44:19 UTC
Bugzilla Automation <bugzilla@FreeBSD.org> has asked freebsd-java (Nobody)
<java@FreeBSD.org> for maintainer-feedback:
Bug 258954: java/openjdk11 java/openjdk12 java/openjdk13: crashes when built
with clang 13
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258954



--- Description ---
During an exp-run for llvm 13 (see bug 258209), it turned out that
java/openjdk11 through openjdk13 fail to build with clang 13:

=== Output from failing command(s) repeated here ===
* For target jdk__packages_attribute.done:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000802c8a991, pid=92123, tid=618713
#
# JRE version:	(11.0.12+7) (build )
# Java VM: OpenJDK 64-Bit Server VM (11.0.12+7-1, mixed mode, tiered,
compressed oops, serial gc, bsd-amd64)
# Problematic frame:
# V  [libjvm.so+0xe8a991]  JVM_RaiseSignal+0x3bfbf1
#
# Core dump will be written. Default location:
/wrkdirs/usr/ports/java/openjdk11/work/jdk11u-jdk-11.0.12-7-1/make/java.core
#
# An error report file with more information is saved as:
#
/wrkdirs/usr/ports/java/openjdk11/work/jdk11u-jdk-11.0.12-7-1/make/hs_err_pid92123.log

These crashes are all caused by the markOop/markOopDesc classes, which are used
to keep track of objects, and which are 'marked' using the low few bits; see
https://github.com/openjdk/jdk13u/blob/master/src/hotspot/share/oops/markOop.hpp

After some laborious bisecting, I found out that these crashes start occurring
after the upstream commit
https://github.com/llvm/llvm-project/commit/16d03818412 (Return "[CGCall]
Annotate this argument with alignment").

What happens afterwards is that clang considers the "this" pointer to always
be aligned to the alignment of the actual object. It may therefore assume that
the low bits of "this" are zero, so masking or adding a few low bits no longer
works as expected.
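
As a rough illustration of the pattern involved, here is a simplified sketch.
It is not the actual HotSpot code: OldMark, tag_address, as_old_mark and
lock_bits_of_real_object are hypothetical names, and the two-bit lock_mask is
an assumption made for the example. The point is that the "value" lives in the
address held in "this", so lock_bits() is only meaningful if the compiler does
not reason about the alignment of "this".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the pre-JDK-14 markOopDesc: the
// "value" of a mark is the address held in `this`, not a stored field.
class alignas(std::uintptr_t) OldMark {
public:
  static constexpr std::uintptr_t lock_mask = 0x3;  // assumed: two tag bits

  std::uintptr_t value() const {
    return reinterpret_cast<std::uintptr_t>(this);
  }

  // Masks the low bits of `this`. Once `this` is annotated as aligned to
  // alignof(OldMark), the compiler may fold this expression to 0.
  std::uintptr_t lock_bits() const { return value() & lock_mask; }
};

static_assert(alignof(OldMark) > OldMark::lock_mask,
              "the tag bits sit below the alignment of the class");

// Combine an aligned address with a tag, as plain integer arithmetic.
inline std::uintptr_t tag_address(std::uintptr_t addr, std::uintptr_t tag) {
  assert((addr & OldMark::lock_mask) == 0);
  assert(tag <= OldMark::lock_mask);
  return addr | tag;
}

// The problematic step: a tagged integer is reinterpreted as an object
// pointer. Calling a member function through the result is undefined
// behaviour whenever the low bits are non-zero, so this sketch never does.
inline OldMark* as_old_mark(std::uintptr_t bits) {
  return reinterpret_cast<OldMark*>(bits);
}

// Well-defined reference case: a real, properly aligned object.
inline std::uintptr_t lock_bits_of_real_object() {
  static OldMark real{};
  return real.lock_bits();
}
```

For a real object the low bits are always zero, which is exactly what the
compiler is now allowed to assume for every "this" of that type. A call such as
as_old_mark(tag_address(0x1000, 0x1))->lock_bits() has no defined result, and
an optimizing build is free to return 0 instead of 1.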

The reason openjdk14 and higher work fine with clang 13, and don't crash
similarly, is that the OpenJDK people completely redid the markOop/markOopDesc
classes in
https://github.com/openjdk/jdk/commit/ae5615c6142a4dc0d9033462f4880d7b3c127e26
("8229258: Rework markOop and markOopDesc into a simpler mark word value
carrier"). E.g., the markOopDesc class was renamed to markWord, and *stores* a
pointer-like value instead of *being* a pointer-like value. This is a much
safer way of handling things.
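
To make the "stores" versus "is" distinction concrete, here is a minimal sketch
of a value carrier. It is only written in the spirit of markWord and is not the
upstream class: MarkWordSketch, with_lock_bits and the two-bit lock_mask are
hypothetical choices for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified value carrier: the bits are an ordinary data
// member, so `this` always points at a real, properly aligned object.
class MarkWordSketch {
  std::uintptr_t _value;

public:
  static constexpr std::uintptr_t lock_mask = 0x3;  // assumed: two tag bits

  explicit constexpr MarkWordSketch(std::uintptr_t v) : _value(v) {}

  constexpr std::uintptr_t value() const { return _value; }

  // Masks the stored integer, not the address of the object.
  constexpr std::uintptr_t lock_bits() const { return _value & lock_mask; }

  // Returns a copy with the low bits replaced by `bits`.
  MarkWordSketch with_lock_bits(std::uintptr_t bits) const {
    assert(bits <= lock_mask);
    return MarkWordSketch((_value & ~lock_mask) | bits);
  }
};
```

Here all the bit manipulation is plain integer arithmetic on _value, so no
assumption about the alignment of "this" can change the result.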

However, this upstream commit is *very* large, as are a few of its follow-ups,
which is probably the reason why it has not been backported to JDKs <= 13. I
tried manually backporting it, but got lost in many nasty patch conflicts and
problems.

I would like to solicit some opinions from our OpenJDK maintainers, on how to
move forward with this issue. I see a few ways:
* Get someone well-versed in OpenJDK internals to backport '8229258: Rework
markOop and markOopDesc' (this is a *lot* of tricky stuff, and has to be done
for at least 11, 12 and 13; but maybe earlier JDKs too).
* Find some alternative way of simplifying the approach in '8229258: Rework
markOop and markOopDesc', and backport that
* Revert the upstream LLVM commit; I don't really like this because we would
have to carry that patch forever (as LLVM upstream obviously won't accept it)
* Adjust the port Makefiles for openjdk11 through openjdk13 to use the clang12
port
* ... something else?