(Finally) solving a substitution GraalVM issue

Nicolas Fränkel - Nov 29 '20 - Dev Community

One of my current talks is about creating a Kubernetes operator in Java. I demo it step by step. In the later steps, I'm using GraalVM native image to create a native executable. In that regard, some libraries are not compatible with the native image creation process. Several options are available to make them work anyway.

One of the options is to substitute incompatible code with compatible code. In a previous post, I described how to use those substitutions. I thought I had it right. I was wrong. The code didn't work in my latest demos. Either I had it right and the platform changed its behavior across version upgrades, or I didn't and the earlier success came from a misconfiguration.

The documentation about substitutions is sparse. I started to convince myself that they didn't work. Then I talked with my colleague Grzegorz, who manages to use substitutions in the Hazelcast Client Quarkus plugin. I decided to solve this once and for all. In this post, I'd like to describe my journey toward resolving the issue.

Scoping the problem

The problem comes from the OkHttp library. To sum it up: one library class has a field that loads a charset during initialization. This charset is not present on the target platform, an exception is thrown, the JVM fails to start, Kubernetes kills the pod, end game.

The incompatible code looks like:

public final class Util {

  private static final ByteString UTF_8_BOM = ByteString.decodeHex("efbbbf");
  private static final ByteString UTF_16_LE_BOM = ByteString.decodeHex("fffe");
  private static final ByteString UTF_32_LE_BOM = ByteString.decodeHex("ffff0000");
  public static final Charset UTF_8 = Charset.forName("UTF-8");
  private static final Charset UTF_16_LE = Charset.forName("UTF-16LE");
  private static final Charset UTF_32_LE = Charset.forName("UTF-32LE");

  public static Charset bomAwareCharset(BufferedSource source, Charset charset) throws IOException {
    if (source.rangeEquals(0, UTF_8_BOM)) {
      source.skip(UTF_8_BOM.size());
      return UTF_8;
    }
    if (source.rangeEquals(0, UTF_16_LE_BOM)) {
      source.skip(UTF_16_LE_BOM.size());
      return UTF_16_LE;
    }
    if (source.rangeEquals(0, UTF_32_LE_BOM)) {
      source.skip(UTF_32_LE_BOM.size());
      return UTF_32_LE;
    }
    return charset;
  }
}

At runtime, the platform initializes static fields: it calls the Charset.forName() methods. It works for UTF-8 and UTF-16LE because these charsets are available in the image. But it fails on the UTF_32_LE field because it's not.
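To see this failure mode outside of GraalVM, here is a minimal sketch in plain Java. It assumes nothing about OkHttp: FragileConstants is a hypothetical stand-in for Util, and "X-NOT-IN-IMAGE" is a made-up charset name that no runtime provides. The point is only that a failing Charset.forName() in a static field makes the whole class unusable.

```java
import java.nio.charset.Charset;

// Hypothetical stand-in for a class such as okhttp3.internal.Util.
final class FragileConstants {
    // Made-up charset name: the lookup fails while the class initializes.
    static final Charset MISSING = Charset.forName("X-NOT-IN-IMAGE");

    // Calling any static method forces class initialization.
    static void touch() { }
}

final class StaticInitDemo {
    // Returns the exception thrown by the static initializer, or null if none.
    static Throwable loadFragileClass() {
        try {
            FragileConstants.touch();
            return null;
        } catch (ExceptionInInitializerError e) {
            // The JVM wraps the original exception in this error.
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        System.out.println("Initialization failed with: " + loadFragileClass());
    }
}
```

Note that the unused field is enough to break the class: no method ever reads MISSING, yet the class cannot be initialized.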

It's possible to solve the issue by adding all charsets at build time with the -H:+AddAllCharsets option. This option increases the final image size. Let's see how to use substitutions to keep the size low.

What are substitutions?

GraalVM native image transforms bytecode into native code via an AOT compilation process. That process allows "hooks" to change the resulting native code: these are substitutions. In theory, you could change the entire result. In practice, you want to make the code work outside the JVM. This sounds like a solution for the above issue.

The first step is to add the GraalVM library to the compile classpath:

<dependency>
  <groupId>com.oracle.substratevm</groupId>
  <artifactId>svm</artifactId>
  <version>19.2.1</version>
  <scope>provided</scope>                   <!-- 1 -->
</dependency>
  1. Do not add the library to the final executable

The second step is to configure which parts of the code you need to change or remove. You can use either configuration files or annotations. Because I didn't find any relevant documentation about the former, I chose the latter.

The third step is to create a new class that references the class to be replaced (or removed).

Removing the offending field

As described above, one field needs to be removed because the charset it references is not available at runtime in the final image. My previous assumption was that since the field was not referenced, the AOT compiler wouldn't find it and it wouldn't end up in the native code. I was wrong. To prevent native image from writing a field into the resulting native code, you need to explicitly mark it with @Delete.

@TargetClass(okhttp3.internal.Util.class)     // 1
public final class okhttp3_internal_Util {    // 2

  @Delete private static Charset UTF_32_LE;   // 3
}
  1. The class to apply changes to
  2. Name it as you want. I chose to adopt the fully qualified class name for readability purposes
  3. The field to remove from the final native code

Astute readers will notice that this code won't compile: the bomAwareCharset() method references the field to remove. For that reason, we need to update the method as well.

Update the referencing method

To update a method in the target class, we need to use the @Substitute annotation. By default, the target method has the same name as the annotated one.

@TargetClass(okhttp3.internal.Util.class)
public final class okhttp3_internal_Util {

    @Delete private static Charset UTF_32_LE;

    @Substitute                                       // 1
    public static Charset bomAwareCharset(BufferedSource source, Charset charset) throws IOException {
        if (source.rangeEquals(0, UTF_8_BOM)) {
            source.skip(UTF_8_BOM.size());
            return UTF_8;
        }
        if (source.rangeEquals(0, UTF_16_LE_BOM)) {
            source.skip(UTF_16_LE_BOM.size());
            return UTF_16_LE;
        }
        return charset;
    }
}
  1. Replaces the bomAwareCharset() method in the target class

This is only the first step, as this code doesn't compile either. The static fields, e.g. UTF_16_LE_BOM, are private in the Util class! GraalVM still lets you reference them: declare them in the substitution class and mark them with @Alias.

The final code is:

@TargetClass(Util.class)
public final class okhttp3_internal_Util {

    @Alias  private static ByteString UTF_8_BOM;       // 1
    @Alias  private static ByteString UTF_16_LE_BOM;   // 1
    @Alias  public static Charset UTF_8;               // 1
    @Alias  private static Charset UTF_16_LE;          // 1
    @Delete private static Charset UTF_32_LE;

    @Substitute
    public static Charset bomAwareCharset(BufferedSource source, Charset charset) throws IOException {
        if (source.rangeEquals(0, UTF_8_BOM)) {        // 2
            source.skip(UTF_8_BOM.size());             // 2
            return UTF_8;                              // 2
        }
        if (source.rangeEquals(0, UTF_16_LE_BOM)) {
            source.skip(UTF_16_LE_BOM.size());
            return UTF_16_LE;
        }
        return charset;
    }
}
  1. Reference the field from the target class
  2. Use it as usual

The AOT-compilation will make the correct substitution.
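To check what the substituted method still does once the UTF-32LE branch is gone, here is a sketch of the same logic in plain Java, without okio or GraalVM. BomSniffer and its byte-array signature are my own simplification, not OkHttp's API. The BOM values are the standard Unicode ones.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified equivalent of the substituted bomAwareCharset().
final class BomSniffer {
    private static final byte[] UTF_8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
    private static final byte[] UTF_16_LE_BOM = {(byte) 0xFF, (byte) 0xFE};

    // Returns the charset announced by a leading BOM, else the fallback.
    static Charset detect(byte[] data, Charset fallback) {
        if (startsWith(data, UTF_8_BOM)) {
            return StandardCharsets.UTF_8;
        }
        if (startsWith(data, UTF_16_LE_BOM)) {
            return StandardCharsets.UTF_16LE;
        }
        return fallback;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        byte[] plain = {'h', 'i'};
        System.out.println(detect(utf8, StandardCharsets.ISO_8859_1));
        System.out.println(detect(plain, StandardCharsets.ISO_8859_1));
    }
}
```

Unlike the original method, this sketch doesn't skip the BOM bytes. It only shows the detection order that remains after the substitution.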

The final surprise

I was expecting the previous code to be the last step. But when I ran the build, I got the following error:

Error: com.oracle.svm.hosted.substitute.DeletedElementException: Unsupported field okhttp3.internal.Util.UTF_32_BE is reachable
To diagnose the issue, you can add the option --report-unsupported-elements-at-runtime. The unsupported element is then reported at run time when it is accessed the first time.
Detailed message:
Trace:
    at parsing okhttp3.internal.Util.<clinit>(Util.java:75)
Call path from entry point to okhttp3.internal.Util.<clinit>():
    no path found from entry point to target method


Error: Use -H:+ReportExceptionStackTraces to print stacktrace of underlying exception
Error: Image build request failed with exit status 1

The solution is to initialize the class at build-time. You need to add the initialize-at-build-time option to the command:

native-image --initialize-at-build-time=okhttp3.internal.Util \
              -jar operator.jar

This leads to another error:

Error: Classes that should be initialized at runtime got initialized during image building:
 okio.ByteString was unintentionally initialized at build time. To see why okio.ByteString got initialized use -H:+TraceClassInitialization
okio.Util was unintentionally initialized at build time. To see why okio.Util got initialized use -H:+TraceClassInitialization

Error: Use -H:+ReportExceptionStackTraces to print stacktrace of underlying exception
Error: Image build request failed with exit status 1

Util makes use of other classes. You need to initialize those mentioned in the error message at build-time as well. Let's update the command accordingly:
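The reason is ordinary Java class initialization: running a class's static initializers also initializes every class they touch. Here is a minimal plain-Java sketch of that chain. FakeUtil, FakeByteString and InitLog are hypothetical stand-ins, not the real OkHttp or okio classes.

```java
import java.util.ArrayList;
import java.util.List;

// Records the order in which the classes below get initialized.
final class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

// Hypothetical stand-in for okio.ByteString.
final class FakeByteString {
    static {
        InitLog.EVENTS.add("FakeByteString");
    }

    static String decodeHex(String hex) {
        return hex;
    }
}

// Hypothetical stand-in for okhttp3.internal.Util.
final class FakeUtil {
    // This static field forces FakeByteString to initialize first.
    static final String BOM = FakeByteString.decodeHex("fffe");

    static {
        InitLog.EVENTS.add("FakeUtil");
    }

    static String bom() {
        return BOM;
    }
}

final class InitChainDemo {
    public static void main(String[] args) {
        FakeUtil.bom();                      // triggers the whole chain
        System.out.println(InitLog.EVENTS);  // prints [FakeByteString, FakeUtil]
    }
}
```

Initializing FakeUtil at build time would drag FakeByteString along in the same way, which is what the error message reports for okio.ByteString and okio.Util.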

native-image --initialize-at-build-time=okhttp3.internal.Util,okio.ByteString,okio.Util \
              -jar operator.jar

This last command succeeds in building the native executable.

The complete source code for this post can be found on GitHub in Maven format.

Originally published at A Java Geek on November 29th 2020
