Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GraalVM support with testing #749

Closed
wants to merge 36 commits into from

Conversation

maschnetwork
Copy link
Contributor

@maschnetwork maschnetwork commented Jan 16, 2024

Issue # and related PRs

#464
#311

And larger issue on the AWS SDK repo:

aws/aws-sdk-java-v2#2948

Description of changes:

Added configuration files (in META-INF/native-image) for latest version of CRT client to support GraalVM native images. Added Maven Profile (graalvm-native) for running unit tests as a native image which should verify that all necessary configuration is present. This would allow us to identify if any configuration is missing in the future. I did not include it into any build pipeline and will rely on the maintainers guidance on how to implement this check properly. The command to run the unit tests with the profile is:

mvn clean test -Pgraalvm-native 

The native testing method requires Junit 5 which is why the profile is leveraging the JUnitVintage Engine. Also I had to add the src/main/resources folder to be recognized in the pom - otherwise the configuration files were ignored during test runs. Lastly, I added a check skipIfNativeImage() to skip JVM related tests as they would not apply to a native binary.

How were the configuration files created?

The files were generated by attaching the native image tracing agent to the maven-surefire-plugin via the argline command. This will generate the configuration based on the unit tests that are executed. I also had to set forkCount to 1 otherwise the agent was not started:

<!-- unit test configuration -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <version>3.2.3</version>
    <configuration>
          <!-- Other configuration ommited for readability -->
          <argLine>-agentlib:native-image-agent=config-merge-dir=native-configuration -Daws.crt.memory.tracing=2 -Xcheck:jni</argLine>
         <forkCount>1</forkCount>
      </configuration>
</plugin>

Since the generated configuration also includes test classes that are not needed for the final artifact I haven't automated this task and manually diffed and merged them into the final META-INF/native-image folder.

A fully working version (with the configuration being manually added) can be found here: https://github.com/maschnetwork/aws-lambda-graalvm-crt - I also built a SNAPSHOT jar locally and tested it with both Sync and Async client for DynamoDB.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link

@deki deki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

pom.xml Outdated
Comment on lines 230 to 232
<groupId>org.junit.vintage</groupId>
<artifactId>junit-vintage-engine</artifactId>
<version>5.10.1</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just switch the whole project to JUnit 5 (vintage if that's easier)? Seems simpler than having it be different just for GraalVM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Moved Vintage Engine out of the profile.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The files were generated by attaching the native image tracing agent to the maven-surefire-plugin via the argline command. This will generate the configuration based on the unit tests that are executed.

Since the generated configuration also includes test classes that are not needed for the final artifact I haven't automated this task and manually diffed and merged them into the final META-INF/native-image folder.

I'm worried about devs forgetting to update this file, and not having any way for CI to catch it without 100% test coverage. Some ideas that come to mind:

  1. Move all reflection lookups to one place, and ensure they're ALWAYS run immediately on startup, so that any missing entries are sure to be caught in the Graalvm CI run . This is mostly the case already for JNI code, java_class_ids.c has 99% of reflection lookups. But we'd need to do something similar for any reflection from Java code.
  2. Use automation to keep these files in sync. Maybe some script like codegen-reflection.py that generates java_class_ids.h/c. And we move ALL reflection calls (FindClass()/GetFieldId()/GetMethodId()) into that file. The script could use this src/main/resources/META-INF/native-image/software.amazon.awssdk/crt/aws-crt/jni-config.json files as its source. Or the source could be some other file, and the script generates the jni-config.json file as well. We also add CI that ensures the code-genned files are all up to date and weren't edited by hand.

1 seems simpler in the short term, less code to write and alter.

It would also be great to have some CI check that failed if it found a reflection call outside the blessed locations (e.g. fail if "GetMethodID" found in any .c file other than java_class_ids.c, or "Class.forName" found in a misc .java file). It could be something as dumb as searching text.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that a 100% coverage with automation would be the ideal case. However, looking at the current implementation of the aws-java-sdk-v2 we also seem to rely on integration tests on a subset of functionality that check if the compiled native binary can do S3 and DynamoDB Service calls (Link).

At the moment the users are self-generating their configuration files - generating them automatically from the tests and checking them on the CI build will already improve this situation.

As an alternative to configuration files we could also use the Feature implementation as code. Example:

  public void beforeAnalysis(BeforeAnalysisAccess access) {
    try {
      RuntimeReflection.register(String.class);
      RuntimeReflection.register(String.class.getDeclaredField("value"));
      RuntimeReflection.register(String.class.getDeclaredField("hash"));
      RuntimeReflection.register(String.class.getDeclaredConstructor(char[].class));
      RuntimeReflection.register(String.class.getDeclaredMethod("charAt", int.class));
      RuntimeReflection.register(String.class.getDeclaredMethod("format", String.class, Object[].class));
      RuntimeReflection.register(String.CaseInsensitiveComparator.class);
      RuntimeReflection.register(String.CaseInsensitiveComparator.class.getDeclaredMethod("compare", String.class, String.class));
    } catch (NoSuchMethodException | NoSuchFieldException e) { ... }
  }

"methods":[{"name":"getSuppressed","parameterTypes":[] }]
},
{
"name":"java.util.concurrent.ForkJoinTask",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do some of these come from? I don't see any reference to "ForkJoinTask" anywhere in this codebase

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are files that have been generated by running the Unit-Tests and some of them seem to be related to test cases only. Those should be not needed as the native-image tests require a normal test run in advance so that those files are automatically created. I cleaned up the obvious ones and will push an update.

@@ -0,0 +1 @@
Args=--initialize-at-run-time=io.netty.util.internal.logging.Log4JLogger -H:ClassInitialization=org.slf4j:build_time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's up with this? We don't have a dependency on netty

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was copied from the previous PR (https://github.com/awslabs/aws-crt-java/pull/464/files) - could not reproduce the warning and will therefore remove.

@jmklix jmklix added the needs-review This issue or pull request needs review from a core team member. label Jan 24, 2024
@graebm graebm added the feature-request A feature should be added or improved. label Feb 13, 2024
@klopfdreh
Copy link

I recently tested the jni-config.json on x86_64 / linux and an upload to s3 with the AWS CRT dependency (S3AsyncClient.crtBuilder()) like
native-image ... -H:JNIConfigurationFiles=META-INF/native-image/spring-boot-jdk17-x86-native/jni-config.json
and it seems to work!

I don't know if the JNI configuration have been changed again as there are many "Waiting for status to be reported" and the branch is "out-of-date"

image
image

nice work anyway @maschnetwork 👍

@klopfdreh
Copy link

klopfdreh commented Apr 26, 2024

Hey @maschnetwork maybe it would be a good idea to introduce the native support to the code as in Spring Cloud AWS - see: https://github.com/awspring/spring-cloud-aws/pull/856/files - this way there would be no issues even if you move the classes.

Edit: Saw that those classes are Spring related and can't be used in the CRT. Sorry for the noise.

@maschnetwork
Copy link
Contributor Author

Hi @klopfdreh, thanks for reaching out. I will update the PR shortly and will add missing configuration files (if there are any). Thanks for testing and verifying that it is working for you.

@klopfdreh
Copy link

@maschnetwork - thanks a lot! Just one hint: When I tried it with the newest version of AWS SDK V2 for Java there is a runtime issue that a method is missing in the S3AsyncClient when using crtBuilder.

@maschnetwork
Copy link
Contributor Author

@klopfdreh I updated the branch with the new configuration file. There was one configuration missing software.amazon.awssdk.crt.http.HttpProxyOptions which has been added. Can you please confirm your setup is working with the new configuration?

@graebm I now also added a new maven profile generate-graalvm-files that runs the unit tests with the tracing agent to create the new configuration file. Verifying and adding the missing configuration is now relatively straight forward. I will outline the example on how i created the new version below:

  1. Initially I ran mvn clean test -Pgraalvm-native which failed as expected as there were configuration files missing:

Bildschirmfoto 2024-05-07 um 07 14 48

  1. It correctly pointed me to java_class_ids.c which has been changed recently:
    image

  2. I then ran mvn test -Pgenerate-graalvm-files to generate the new configuration files and diffed them with the existing ones and I identified correctly that the following configuration was missing:

{ "name":"software.amazon.awssdk.crt.http.HttpProxyOptions$HttpProxyAuthorizationType", "methods":[{"name":"getValue","parameterTypes":[] }] }

  1. I added this to the src/resources/META-INF/../jni-config.json and reran mvn clean test -Pgraalvm-native which then resulted in a successful build.

@klopfdreh
Copy link

@maschnetwork - I am going to test it with the newest AWS SDK V2 and CRT with those configs and give feedback later this day. 👍

@klopfdreh
Copy link

klopfdreh commented May 7, 2024

@maschnetwork - I used the following versions:

<awssdk-v2.version>2.25.46</awssdk-v2.version>
<aws-crt.version>0.29.18</aws-crt.version>
  1. native-build was a success ✔️
  2. I received a runtime exception when I tried to get resources from S3 ❌
java.lang.NoSuchMethodError: software.amazon.awssdk.identity.spi.Identity.providerName()
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApplyUserAgentStage.providerNameFromIdentity(ApplyUserAgentStage.java:174) ~[na:na]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApplyUserAgentStage.identityProviderName(ApplyUserAgentStage.java:168) ~[na:na]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApplyUserAgentStage.getUserAgent(ApplyUserAgentStage.java:147) ~[na:na]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApplyUserAgentStage.execute(ApplyUserAgentStage.java:135) ~[na:na]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApplyUserAgentStage.execute(ApplyUserAgentStage.java:52) ~[na:na]

I guess there is a hint missing in a reflect-config.json in the core module at AWS SDK V2.

Edit: Regarding 2. This was due to a wrong version of identity-spi while testing.

@maschnetwork
Copy link
Contributor Author

@klopfdreh Yes this is not related to the CRT client configuration but to the AWS SDK v2: https://github.com/aws/aws-sdk-java-v2 - I see that providerName has been recently added there. Can you please open an issue there with your example code and logs? I will take care of communicating it internally and can also take a look into providing a fix.

@klopfdreh
Copy link

@klopfdreh Yes this is not related to the CRT client configuration but to the AWS SDK v2: https://github.com/aws/aws-sdk-java-v2 - I see that providerName has been recently added there. Can you please open an issue there with your example code and logs? I will take care of communicating it internally and can also take a look into providing a fix.

Hey @maschnetwork - I double checked my dependencies and found identity-spi version was not set correctly. I just fixed that and can confirm that all is running correctly! 🎉

I used the following code to upload a 145MB file to S3 and all is working within a native image. 👍

try (S3AsyncClient s3AsyncClient = S3AsyncClient.crtBuilder()
        .credentialsProvider(awsCredentialsProvider)
        .region(region)
        .endpointOverride(endpoint)
        .targetThroughputInGbps(20.0)
        .minimumPartSizeInBytes(20000000L)
        .maxConcurrency(8)
        .build();
     S3TransferManager s3TransferManager = S3TransferManager
             .builder()
             .s3Client(s3AsyncClient)
             .build()) {
 // ...
 }

Copy link

@klopfdreh klopfdreh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test native image s3 upload.

@graebm
Copy link
Contributor

graebm commented May 8, 2024

I chatted with @maschnetwork.

Before this is merged, ALL reflection calls need to be moved into functions that are called on startup. Most reflection calls already happen this way, but I want to be sure we're not missing anything. This way, we'll detect missing entries in META-INF/native-image/... immediately, without requiring 100% branch coverage in our tests.

If we do this, maybe just remove any automation for generating META-INF/native-image? Since, from hereon, it will just be single line additions? Less code is best code.

We'll also need CI for GraalVM, so we don't break this moving forward. I suppose this can be added in a subsequent PR.

@klopfdreh
Copy link

Thanks a lot! When ever I can test something regarding the native build, please ping me at this PR. 👍

@klopfdreh
Copy link

klopfdreh commented May 13, 2024

If we do this, maybe just remove any automation for generating META-INF/native-image? Since, from hereon, it will just be single line additions? Less code is best code.

Less code is best code - that's true 😃 and for normal use cases I would also say that the automation is not required anymore.

But I like that those plugins are configured within a profile so that if there is a major JNI rework you are still able to run it and compare the jin-config.json and reflect-config.json with the one present in the src folder. There is no other way to generate the complete JNI configuration and sometimes it is hard to keep all required JNI information in sync.

pom.xml Outdated
Comment on lines 234 to 240
<execution>
<id>build-native</id>
<goals>
<goal>build</goal>
</goals>
<phase>package</phase>
</execution>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to get your opinion, do we need the build phase? Should it be the application using our lib to build the native executable?

From my understanding, we will not package our lib as a native lib for other package to use, right??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, for the current setup we don't need it. Let's remove it together with the tracing agent generation if we agree on the json vs. class format for the hints.

@maschnetwork
Copy link
Contributor Author

maschnetwork commented Jun 1, 2024

Hi @graebm and @TingDaoK, can you please take a look into the latest commit 094af5e? This is an example of leveraging the Feature implementation to provide the JNI configs via a Java class. Wdyt?

@maschnetwork
Copy link
Contributor Author

For transparency : After investigation together with @TingDaoK the CI builds are now working: https://github.com/awslabs/aws-crt-java/pull/786/checks.

@TingDaoK
Copy link
Contributor

TingDaoK commented Jun 3, 2024

Hi @graebm and @TingDaoK, can you please take a look into the latest commit 094af5e? This is an example of leveraging the Feature implementation to provide the JNI configs via a Java class. Wdyt?

With the the Feature implementation, can we use Java reflection to iterator through the classes and methods fromsoftware.amazon.awssdk.crt, and register them all?

But, looks like the CI fails for Java 17 Specified module 'org.graalvm.nativeimage' is unknown.
Also, for Java 8, looks like graalvm SDK wants Java 11+, but we need to support Java 8.
https://github.com/awslabs/aws-crt-java/actions/runs/9354653868/job/25747987564?pr=786
We need to solve those issues before we keep investigating the the Feature approach.

I am fine with the current .json config approach, it's just an extra step to add the binding. And if we forget to add the config, it should be caught by the CI.

@maschnetwork
Copy link
Contributor Author

@TingDaoK based on your feedback I reverted the last commit and we stick with the config.json for now - also removed the build execution from the POM. If CI is passing - is there anything left from your side?

@TingDaoK
Copy link
Contributor

@maschnetwork I just took over the branch to keep it simple. Mostly, I removed the unnecessary reflection from reflect-config.json based on my tests. And removed native-image.properties as we are not using lf4j. And several trivial fix for the tests we have. Let me know if you want to keep this PR instead of #791

@maschnetwork
Copy link
Contributor Author

@TingDaoK Great - even better if it works with less config for the reflection. On this PR I see that you have added several FATAL_ASSERT to the c file. I don't see them anymore on your PR - does it make sense to add them there? src/native/java_class_ids.c

@TingDaoK
Copy link
Contributor

@TingDaoK Great - even better if it works with less config for the reflection. On this PR I see that you have added several FATAL_ASSERT to the c file. I don't see them anymore on your PR - does it make sense to add them there? src/native/java_class_ids.c

yeah, I have already merged that change into main as a separate PR, here, since I wanted the fix to be merged quicker, it should affect the regular JVM as well.

@maschnetwork
Copy link
Contributor Author

@TingDaoK Great - anything else missing then?

@TingDaoK
Copy link
Contributor

@maschnetwork nothing blocking, thanks for your contributing!

merged with #791

@klopfdreh
Copy link

klopfdreh commented Jun 11, 2024

@TingDaoK and @maschnetwork - thanks a lot for all the work - maybe it would be good to add a hint in the README.md on how to use the related json files during a native build and what is required for it. WDYT?

@maschnetwork
Copy link
Contributor Author

@TingDaoK Great - thank you for all your effort on the CI/Testing side and the final cleanup

how to use the related json files during a native build

@klopfdreh Can you elaborate on this? As the config files now ship with CRT itself there should be no action or manual config changes needed on the client/user side of the library.

@klopfdreh
Copy link

klopfdreh commented Jun 11, 2024

@maschnetwork - sure - I assumed like mentioned here https://aws.amazon.com/de/blogs/developer/graalvm-native-image-support-in-the-aws-sdk-for-java-2-x/ we have to add the following arguments to the native-image command for a native build:

native-image \
... \
-H:JNIConfigurationFiles=META-INF/native-image/software.amazon.awssdk/crt/aws-crt/jni-config.json \
-H:ReflectionConfigurationResources=META-INF/native-image/software.amazon.awssdk/crt/aws-crt/reflect-config.json \
-H:ResourceConfigurationResources=META-INF/native-image/software.amazon.awssdk/crt/aws-crt/resource-config.json

Those information I wanted to put in the README.md to ensure that a native build is performed correctly without any issues.

@maschnetwork
Copy link
Contributor Author

@klopfdreh If the configuration files are placed (by the library owner) under the default GraalVM folder e.g. resources/META_INF/native-image/the_library_package - then no additional configuration should be needed on your side. They should be automatically picked up.

@klopfdreh
Copy link

@maschnetwork - all right 👍

@maschnetwork
Copy link
Contributor Author

🎉 Merged with #791. SDK Related PR to bump the version: aws/aws-sdk-java-v2#5290.

Thanks for all the contributions here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved. needs-review This issue or pull request needs review from a core team member.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants