Fix acquireVMaccessIfNeeded for Non-Compiler Threads #19260

Merged
merged 3 commits into from
Apr 5, 2024

Conversation

dsouzai
Contributor

@dsouzai dsouzai commented Apr 2, 2024

When a non-comp thread invokes acquireVMaccessIfNeeded, the function
silently returns, because it assumes that a non-comp thread should
already have had VMAccess. This is particularly problematic when using
TR::VMAccessCriticalSection, as it is not at all obvious that the
current thread may not actually have VMAccess. Furthermore, there are
legitimate circumstances when a non-comp thread will not have VMAccess
and will need to use this API to acquire it.

Additionally, this PR includes:

  • Enforce VMAccess when calling compileMethod
  • Use TR::VMAccessCriticalSection in dumpIPBCDataCallGraph

Enables #18982
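
The behaviour described above can be illustrated with a minimal sketch. This is not the OpenJ9 code: `MockThread`, `acquireIfNeededOld`, `acquireIfNeededNew`, and `compileMethodSketch` are hypothetical names invented for illustration, and the real function works on actual VM thread state.

```cpp
#include <cassert>

// Hypothetical stand-in for a VM thread; not the real J9VMThread.
struct MockThread {
    bool isCompilationThread;
    bool hasVMAccess;
};

// Behaviour before the fix, as described in this PR: a non-compilation
// thread is assumed to hold VM access already, so the function returns
// without checking. Returns true only if this call acquired access.
inline bool acquireIfNeededOld(MockThread &t) {
    if (!t.isCompilationThread)
        return false;          // silent return, even when hasVMAccess is false
    if (t.hasVMAccess)
        return false;
    t.hasVMAccess = true;
    return true;
}

// Behaviour the fix aims for (assumption): look at the thread's actual
// state, whatever kind of thread it is.
inline bool acquireIfNeededNew(MockThread &t) {
    if (t.hasVMAccess)
        return false;          // nothing to do, caller already holds access
    t.hasVMAccess = true;
    return true;
}

// Sketch of "Enforce VMAccess when calling compileMethod": fail loudly
// instead of continuing without access.
inline void compileMethodSketch(const MockThread &t) {
    assert(t.hasVMAccess && "caller must hold VM access");
}
```

With an application thread that does not hold access, `acquireIfNeededOld` leaves `hasVMAccess` false, while `acquireIfNeededNew` sets it and reports that it did so, which lets the caller release it afterwards.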

@dsouzai
Contributor Author

dsouzai commented Apr 2, 2024

@mpirvu Could you please review?

@dsouzai
Contributor Author

dsouzai commented Apr 3, 2024

Furthermore, there are legitimate circumstances when a non-comp thread will not have VMAccess

void *currOldStartPC = startPCIfAlreadyCompiled(vmThread, details, startPC);

is one legitimate place where the app thread does not have VMAccess, but startPCIfAlreadyCompiled tries to acquire VMAccess via TR::VMAccessCriticalSection.
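
A rough sketch of why this case matters, assuming a scoped guard that acquires access only when the thread lacks it and releases only what it acquired. `MockThread`, `AccessGuard`, and `lookupUnderGuard` are hypothetical names; the real TR::VMAccessCriticalSection and startPCIfAlreadyCompiled have different signatures and more responsibilities.

```cpp
#include <cassert>

// Hypothetical stand-in for a VM thread.
struct MockThread {
    bool hasVMAccess = false;
};

// Hypothetical RAII guard in the spirit of TR::VMAccessCriticalSection:
// acquire on construction if needed, release on destruction only if this
// guard was the one that acquired.
class AccessGuard {
public:
    explicit AccessGuard(MockThread &t)
        : _thread(t), _acquired(!t.hasVMAccess) {
        if (_acquired)
            _thread.hasVMAccess = true;
    }
    ~AccessGuard() {
        if (_acquired)
            _thread.hasVMAccess = false;
    }
    AccessGuard(const AccessGuard &) = delete;
    AccessGuard &operator=(const AccessGuard &) = delete;

private:
    MockThread &_thread;
    bool _acquired;
};

// Hypothetical lookup standing in for startPCIfAlreadyCompiled: the body
// is only safe if the guard really did obtain access.
inline bool lookupUnderGuard(MockThread &t) {
    AccessGuard guard(t);
    return t.hasVMAccess;   // reports whether the guarded region had access
}
```

If the guard silently skipped acquisition for an application thread, the body would run without access even though the code reads as if it were protected.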

Contributor

@mpirvu mpirvu left a comment

The code needs to take into account the fact that TR_DisableNoVMAccess is an option that only applies to compilation threads. Looks good otherwise.
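
One way to read this request, as a hedged sketch: the option check should short-circuit only for compilation threads. The code assumes that with the option set a compilation thread keeps VM access for the whole compilation, so it has nothing to acquire. `MockThread`, `acquireIfNeeded`, and the `disableNoVMAccess` parameter are hypothetical stand-ins, not the actual change made in VMJ9.cpp.

```cpp
#include <cassert>

// Hypothetical stand-in for a VM thread.
struct MockThread {
    bool isCompilationThread;
    bool hasVMAccess;
};

// disableNoVMAccess stands in for the TR_DisableNoVMAccess option.
// Returns true only if this call acquired access.
inline bool acquireIfNeeded(MockThread &t, bool disableNoVMAccess) {
    // Assumption: the option is meaningful only for compilation threads,
    // which then already hold access, so there is nothing to acquire.
    if (t.isCompilationThread && disableNoVMAccess)
        return false;

    // Every other case, including all non-compilation threads, is decided
    // by the thread's actual state.
    if (t.hasVMAccess)
        return false;
    t.hasVMAccess = true;
    return true;
}
```

Under this reading, an application thread without access still acquires it when the option is set, because the option is never consulted for that thread.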

runtime/compiler/env/VMJ9.cpp
dsouzai added 3 commits April 3, 2024 11:05
When a non-comp thread invokes acquireVMaccessIfNeeded, the function
silently returns, because it assumes that a non-comp thread should
already have had VMAccess. This is particularly problematic when using
TR::VMAccessCriticalSection, as it is not at all obvious that the
current thread may not actually have VMAccess. Furthermore, there are
legitimate circumstances when a non-comp thread will not have VMAccess
and will need to use this API to acquire it.

Signed-off-by: Irwin D'Souza <[email protected]>
@mpirvu mpirvu self-assigned this Apr 3, 2024
@mpirvu
Contributor

mpirvu commented Apr 3, 2024

jenkins test sanity all jdk17

@dsouzai
Contributor Author

dsouzai commented Apr 3, 2024

zlinux failure due to #17474

[2024-04-03T16:28:14.324Z]  [ERR] Assertion failed at /home/jenkins/workspace/Build_JDK17_s390x_linux_Personal/openj9/runtime/compiler/env/JITServerPersistentCHTable.cpp:172: classInfo
[2024-04-03T16:28:14.324Z]  [ERR] 	subclass info cannot be null: ensure subclasses are loaded before superclass

@dsouzai
Contributor Author

dsouzai commented Apr 3, 2024

xlinux errors are due to #19114

[2024-04-03T18:33:24.061Z]  [OUT] initiate restore
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1839): prctl failed @1839 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1840): prctl failed @1840 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1841): prctl failed @1841 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1842): prctl failed @1842 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1843): prctl failed @1843 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1844): prctl failed @1844 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1845): prctl failed @1845 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1846): prctl failed @1846 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1847): prctl failed @1847 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1848): prctl failed @1848 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1849): prctl failed @1849 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:1850): prctl failed @1850 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:777): prctl failed @777 with -1
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:779): Can't restore EXE link (-1)
[2024-04-03T18:33:24.061Z]  [OUT] pie: 14706: Error (criu/pie/restorer.c:2102): Restorer fail 14706
[2024-04-03T18:33:24.061Z]  [OUT] Error (criu/cr-restore.c:2547): Restoring FAILED.

@mpirvu
Contributor

mpirvu commented Apr 4, 2024

aarch64 tests aborted for no apparent reason:

21:11:45  ---TEST RESULTS---
21:11:45  Number of PASSED tests: 8 out of 8
21:11:45  Number of FAILED tests: 0 out of 8
21:11:45  -----------------------------------
21:11:45  cmdLineTester_ShareClassesSimpleSanity_0_PASSED
21:11:45  -----------------------------------
21:11:45  
21:11:45  TEST TEARDOWN:
21:12:00  JVMSHRC005I No shared class caches available
Calling Pipeline was cancelled
21:12:03  Sending interrupt signal to process
21:12:04  /bin/sh: line 14: 812128 Terminated              "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/jdkbinary/j2sdk-image/bin/java" -Xshareclasses:groupAccess,destroyAll
21:12:04  cache cleanup done
21:12:04  cmdLineTester_ShareClassesSimpleSanity_0 Finish Time: Wed Apr  3 21:12:03 2024 Epoch Time (ms): 1712193123758
21:12:04  make[7]: Leaving directory '/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/functional/cmdLineTests/shareClassTests/ShareClassesSimpleSanity'
21:12:04  make[7]: Entering directory '/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/functional/cmdLineTests/shareClassTests/URLHelperTests'
21:12:04  
21:12:04  make[6]: *** [/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/../TKG/settings.mk:356: testList-URLHelperTests] Terminated
21:12:04  ===============================================
21:12:04  Running test cmdLineTester_SCURLHelperTests_90_0 ...
21:12:04  make[5]: *** [/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/../TKG/settings.mk:356: testList-shareClassTests] Terminated
21:12:04  make[5]: *** wait: No child processes.  Stop.
21:12:04  make[5]: *** Waiting for unfinished jobs....
21:12:04  make[5]: *** wait: No child processes.  Stop.
21:12:04  make[4]: *** [/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/../TKG/settings.mk:356: testList-cmdLineTests] Error 2
21:12:04  make[4]: Leaving directory '/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/functional'
21:12:04  make[3]: *** [/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/../TKG/settings.mk:356: testList-functional] Error 2
21:12:04  make[3]: Leaving directory '/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests'
21:12:04  make[2]: *** [settings.mk:356: testList-..] Error 2
21:12:04  make[2]: Leaving directory '/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG'
21:12:04  make[1]: *** [makefile:65: _testList] Error 2
21:12:04  make[1]: Leaving directory '/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG'
21:12:04  make: *** [parallelList.mk:8: testList_0] Error 2

@dsouzai
Contributor Author

dsouzai commented Apr 4, 2024

The build failed because "Calling Pipeline was cancelled", but going up the pipeline it just says "Body of block-scoped step failed", so I'm not sure what caused it.

@dsouzai
Contributor Author

dsouzai commented Apr 4, 2024

jenkins test sanity alinux64 jdk17

@mpirvu
Contributor

mpirvu commented Apr 5, 2024

Test failure on aarch64:

20:22:59  Testing: unmodifiedarray8.2
20:22:59  Test start time: 2024/04/04 20:22:59 Eastern Standard Time
20:22:59  Running command: "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/jdkbinary/j2sdk-image/bin/java" -cp "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdline_options_testresources/cmdlinetestresources.jar" -Xcheck:jni:advice -Xgcthreads1 j9vm.test.jnichk.ModifyArrayData float 10 -1 -1 0
20:22:59  Time spent starting: 3 milliseconds
20:23:07  Cancelling nested steps due to timeout
20:23:07  Sending interrupt signal to process
20:23:08  Time spent executing: 7884 milliseconds
20:23:08  Test result: FAILED
20:23:08  Output from test:
20:23:08   [ERR] JVMJNCK001I JNI check utility installed. Use -Xcheck:jni:help for usage
20:23:08  >> Success condition was not found: [Output match: JVMJNCK074I]
20:23:08  
20:23:08  Testing: originalarraymodified8.2
20:23:08  Test start time: 2024/04/04 20:23:07 Eastern Standard Time
20:23:08  Running command: "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/jdkbinary/j2sdk-image/bin/java" -cp "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdline_options_testresources/cmdlinetestresources.jar" -Xcheck:jni:warn -Xgcthreads1 j9vm.test.jnichk.ModifyArrayData float 10 9 9 0
20:23:08  Time spent starting: 2 milliseconds
20:23:08  -----------------------------------
20:23:08  cmdLineTester_XcheckJNI_0_FAILED
20:23:08  -----------------------------------
20:23:08  
20:23:08  TEST TEARDOWN:

@mpirvu
Contributor

mpirvu commented Apr 5, 2024

20:22:09  ===============================================
20:22:09  Running test cmdLineTester_SCURLHelperTests_90_1 ...
20:22:09  ===============================================
20:22:09  cmdLineTester_SCURLHelperTests_90_1 Start Time: Thu Apr  4 20:22:07 2024 Epoch Time (ms): 1712276527585
20:22:09  variation: Mode610
20:22:09  JVM_OPTIONS:  -Xcompressedrefs -Xjit -Xgcpolicy:gencon 
20:22:09  { \
20:22:09  echo "";	echo "TEST SETUP:"; \
20:22:09  "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java" -Xshareclasses:destroyAll; "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java" -Xshareclasses:groupAccess,destroyAll; echo "cache cleanup done"; \
20:22:09  mkdir -p "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/../TKG/output_17122435429215/cmdLineTester_SCURLHelperTests_90_1"; \
20:22:09  cd "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/../TKG/output_17122435429215/cmdLineTester_SCURLHelperTests_90_1"; \
20:22:09  echo "";	echo "TESTING:"; \
20:22:09  cp -r "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/shareClassTests/URLHelperTests/URLHelperTests.jar" .; \
20:22:09  "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/jar" -xf URLHelperTests.jar; \
20:22:09   \
20:22:09  "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java" "-Xshareclasses:none" -DJAVA_HOME='/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/jdkbinary/j2sdk-image' -DPATHSEP="/" -DRUN_SCRIPT=sh -DPROPS_DIR=props_unix -DSCRIPT_SUFFIX=.sh -DEXECUTABLE_SUFFIX= -DJAVA_EXE='"/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java"  -Xcompressedrefs -Xjit -Xgcpolicy:gencon ' -DCPDL=":" -DSCMODE=210 -DTEST_JVM_OPTIONS=" -Xcompressedrefs -Xjit -Xgcpolicy:gencon " \
20:22:09  -jar "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdline_options_tester/cmdlinetester.jar" \
20:22:09  -config "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/shareClassTests/URLHelperTests/URLHelperTests.xml" -xids all,linux_aarch64,17 -xlist "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/shareClassTests/URLHelperTests/exclude.xml" \
20:22:09  -nonZeroExitWhenError \
20:22:09  -outputLimit 300; \
20:22:09  if [ $? -eq 0 ]; then echo "-----------------------------------"; echo "cmdLineTester_SCURLHelperTests_90_1""_PASSED"; echo "-----------------------------------"; cd /home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/..; rm -f -r "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/../TKG/output_17122435429215/cmdLineTester_SCURLHelperTests_90_1"; else echo "-----------------------------------"; echo "cmdLineTester_SCURLHelperTests_90_1""_FAILED"; echo "-----------------------------------"; fi; \
20:22:09  echo "";	echo "TEST TEARDOWN:"; \
20:22:09  "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java" -Xshareclasses:destroyAll; "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java" -Xshareclasses:groupAccess,destroyAll; echo "cache cleanup done"; \
20:22:09   } 2>&1 | tee -a "/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/../TKG/output_17122435429215/TestTargetResult";
20:22:09  
20:22:09  TEST SETUP:
20:22:24  JVMSHRC005I No shared class caches available
20:22:39  JVMSHRC005I No shared class caches available
20:22:39  cache cleanup done
20:22:39  
20:22:39  TESTING:
20:23:07  Cancelling nested steps due to timeout
20:23:07  Sending interrupt signal to process

Not sure if these are hangs, or the machine is just too slow.

@pshipton
Member

pshipton commented Apr 5, 2024

It's a network problem and I've disabled the problematic machines for now.

jenkins test sanity alinux64 jdk17

@mpirvu
Contributor

mpirvu commented Apr 5, 2024

Merging since no issues are caused by this PR.

@mpirvu mpirvu merged commit 6f8105f into eclipse-openj9:master Apr 5, 2024
15 of 19 checks passed