HotSpot-based Java 11 and higher VM crashes when loaded and initialized via JNI Invocation API on AIX #997
Comments
HotSpot creates guard pages for each Java Thread. This causes unfortunate limitations, especially on AIX. I believe it doesn't work for the primordial thread and it requires a certain thread stack size. (These are only some of the reasons why I don't like this design. I hope that we can remove it at some point.)
@TheRealMDoerr We are aware of the primordial thread issue. The HotSpot JVM on AIX produces an error message stating this if an attempt is made to create a JVM instance on the primordial thread. The sample code that we attached creates a separate thread on which it attempts to create the JVM instance. Since we use C++ std::thread instead of pthreads, we can't directly set the stack size. However, based on your response and the test code that you referenced, we replaced the std::thread used to initialize the JVM with a pthread configured with a large stack size. This test loaded the HotSpot JVM successfully, so this does give us a work-around for this problem. My obvious next questions follow:
The HotSpot JVM implementation, which uses guard pages for each thread, results in inconsistent JNI Invocation API behavior when compared with JREs which use the OpenJ9 JVM. The OpenJ9 JVM can be loaded in the primordial thread AND does not explicitly require a very large thread stack size. This means that we can initialize it directly in the primary thread used to invoke main. Also, if we do initialize the OpenJ9-based JVM in a separate thread, we can use C++ std::thread. Additionally, it causes inconsistent JNI Invocation API behavior when compared with the HotSpot JVM on other platforms (like Linux and Windows). We do NOT have to use a separate thread to initialize the HotSpot JVM on those platforms. Is it possible to rethink the HotSpot JVM design decision which led to all of these issues on AIX? It's a little late for us since we now know the source of the problem, but it might help the next organization.
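For readers who hit the same crash, the pthread work-around mentioned above (replacing the std::thread that loads the JVM with a pthread that has an explicit stack size) might look roughly like the sketch below. The names jvmHostThread and runWithStackSize are hypothetical, and the stack size is left to the caller because this thread does not state the value that was actually used. The function that would call JNI_CreateJavaVM is replaced by a stub so that the example does not depend on jni.h.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical stand-in for the code that would call JNI_CreateJavaVM.
// It only records that it ran, so the sketch stays independent of jni.h.
static void* jvmHostThread(void* arg) {
    bool* ran = static_cast<bool*>(arg);
    *ran = true;  // a real application would create the JVM here
    return nullptr;
}

// Runs `entry` on a new pthread with an explicit stack size and waits for it.
// Returns 0 on success, or the error code of the first failing pthread call.
int runWithStackSize(void* (*entry)(void*), void* arg, std::size_t stackBytes) {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) {
        return rc;
    }

    // std::thread offers no equivalent of this call, which is why the
    // work-around falls back to the pthread API for this one thread.
    rc = pthread_attr_setstacksize(&attr, stackBytes);
    if (rc == 0) {
        pthread_t tid;
        rc = pthread_create(&tid, &attr, entry, arg);
        if (rc == 0) {
            rc = pthread_join(tid, nullptr);
        }
    }

    pthread_attr_destroy(&attr);
    return rc;
}
```

A caller would pass a flag and a size, for example runWithStackSize(jvmHostThread, &ran, 8u * 1024u * 1024u), where the 8 MiB figure is purely an assumption for illustration. All other threads in the application can remain std::thread.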
I have filed a JBS issue: https://bugs.openjdk.org/browse/JDK-8324431
@TheRealMDoerr Based on your feedback, we have replaced the top-level thread that we use to load the JVM on AIX (only) with a pthread. The top-level thread on AIX was previously a C++11 std::thread, which has no API to set the thread stack size. This has gotten us past the crash-on-load issue with the HotSpot JVM. We continue to use std::thread for all other threads that we need to create in our application. Unfortunately, I think this work-around would be completely unacceptable to many companies. I know that we didn't like making the exception. Thanks for submitting JDK-8324431. I've read through the comments. Maybe I'm taking this out of context, but I completely disagree with the following statement by David Holmes: "If someone reports "My Java application won't run on a C++ Thread because C++ makes the stack too small" then that is not a Java problem." for the following reasons. Our application uses C++ std::thread across all supported platforms, which doesn't have an API to set the stack size. We transitioned to std::thread around 10 years ago after C++11-compliant compilers were readily available on all of our supported platforms. We continued to dynamically load the JVM (both HotSpot and OpenJ9) via the JNI Invocation Interface reliably with Java 7 and Java 8 on all platforms, including AIX. After transitioning to Java 11 a couple of years ago, there were suddenly issues loading the HotSpot JVM on AIX that didn't exist previously. We suddenly couldn't load the HotSpot JVM on the primordial thread (our application is primarily single-threaded, but we have several cases where multiple threads are needed) ONLY on AIX. We were able (and still are) to load the HotSpot JVM on the primordial thread on our other platforms. We are able to load the OpenJ9 JVM on the primordial thread on all of our platforms, including AIX. The saving grace with the HotSpot JVM primordial thread issue on AIX is that at least it provides a useful error when it fails.
Unfortunately, the second issue with the HotSpot JVM 11+ was a complete mystery to us, since all that it does is throw a SIGSEGV with no useful information other than approximately where in the JVM it occurs when our application attempts to load it via the JNI Invocation API. We spent a considerable amount of time attempting to debug our application, changing compiler options, etc., trying to figure out why the JVM kept crashing. At the end of the day, we ended up resorting to embedding the IBM Semeru JRE distribution (OpenJ9-based) on AIX instead of the HotSpot distribution. Unfortunately, the java CLI in IBM's Semeru distribution crashes when the IBM XL C++ runtime is higher than a certain patch level, which is unacceptable to our customers who load our application via their own Java application (thus not needing the JNI Invocation API since the JVM is already loaded). We've had to tell those customers to download the Adoptium HotSpot distribution for use with their application. So, in a nutshell, David Holmes's statement that this isn't a "java" issue may be right, but based on many months of pulling my hair out on AIX, I would argue that it is DEFINITELY a "JVM" issue.
Is JNI officially compatible with C++? I'd always go through a C layer. There may be more problems when combining JNI and C++.
We are marking this issue as stale because it has not been updated for a while. This is just a way to keep the support issues queue manageable. |
Please provide a brief summary of the bug
All HotSpot-based distributions that we've tested since 11.0.12 (it might have happened earlier), including JDK 21 distributions, crash very early in JVM initialization on AIX. The same method that we've been using since around 2003 for loading the JVM still works without fail on our other supported platforms (Windows and Red Hat-compatible Linux distributions). As a result, we're currently having to embed IBM's OpenJ9-based Semeru JRE in our product distribution, but we recommend the HotSpot-based Adoptium builds to our customers who embed our product in their Java applications (which don't need the JNI Invocation API).
As implied above, the OpenJ9-based distributions through JDK 21 (the highest version that we've tested) load and initialize via the JNI Invocation Interface without error on all of our supported platforms, including AIX.
Did you test with the latest update version?
Please provide steps to reproduce where possible
JvmLoader.tar.gz
Attached is a tar.gz which contains a small sample program which illustrates the segmentation fault crash in the JVM. To run it, follow these steps:
Expected Results
After explicitly loading libjvm.so, the JVM loads successfully when JNI_CreateJavaVM is called.
Actual Results
After explicitly loading libjvm.so, the JVM crashes with a segmentation fault while calling JNI_CreateJavaVM. The dbx utility reports the following stack trace from our test application:
IPRA.$checked_mprotect__FPcUli(??, ??, ??) at 0x90000003024778c
guard_memory__2osFPcUl(??, ??) at 0x900000030241978
create_stack_guard_pages__10JavaThreadFv(??) at 0x9000000302fd9bc
create_vm__7ThreadsFP14JavaVMInitArgsPb(??, ??) at 0x900000030302470
JNI_CreateJavaVM_inner__FPP7JavaVM_PPvPv(??, ??, ??) at 0x900000030a2f92c
JvmManager::initializeJvm()(), line 2179 in "memory"
JvmLoader.JvmManager::JvmManager()::'lambda'()::operator()() const(this = 0x000000011004a4e0), line 39 in "JvmManager.h"
unnamed block in _ZNSt3__117__call_once_proxyINS_5tupleIJOZN10JvmManagerC1EvEUlvE_EEEEEvPv(__vp = 0x000000011004a4f8), line 2220 in "type_traits"
unnamed block in _ZNSt3__117__call_once_proxyINS_5tupleIJOZN10JvmManagerC1EvEUlvE_EEEEEvPv(__vp = 0x000000011004a4f8), line 2220 in "type_traits"
_ZNSt3__117__call_once_proxyINS_5tupleIJOZN10JvmManagerC1EvEUlvE_EEEEEvPv(__vp = 0x000000011004a4f8), line 2220 in "type_traits"
std::__1::__call_once(unsigned long volatile&, void*, void (*)(void*))(??, ??, ??) at 0x9000000035c37c8
unnamed block in JvmLoader.JvmManager::JvmManager()(this = 0x000000011004a5d0), line 666 in "mutex"
unnamed block in JvmLoader.JvmManager::JvmManager()(this = 0x000000011004a5d0), line 666 in "mutex"
JvmLoader.JvmManager::JvmManager()(this = 0x000000011004a5d0), line 666 in "mutex"
main::$_0::operator()() const(this = 0x0000000110016730), line 7 in "JvmLoader.cpp"
unnamed block in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, main::$_0> >(void*)(__vp = 0x0000000110016730), line 2227 in "type_traits"
unnamed block in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, main::$_0> >(void*)(__vp = 0x0000000110016730), line 2227 in "type_traits"
void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, main::$_0> >(void*)(__vp = 0x0000000110016730), line 2227 in "type_traits"
What Java Version are you using?
openjdk version "11.0.19" 2023-04-18 OpenJDK Runtime Environment Temurin-11.0.19+7 (build 11.0.19+7) OpenJDK 64-Bit Server VM Temurin-11.0.19+7 (build 11.0.19+7, mixed mode)
What is your operating system and platform?
AIX 7.2 with IBM XL C++ runtime 16.1.0.10 (note that we experience the same crash with many versions of the IBM XL C++ runtime, including Open XL C++ 17.1.x).
How did you install Java?
Most tests are on JDK/JRE distributions expanded from a tar.gz archive.
Did it work before?
Did you test with other Java versions?
Relevant log output