harmony crash vmState=0x0005ffff #20546

Closed
pshipton opened this issue Nov 8, 2024 · 29 comments
Labels: blocker, comp:jit, segfault, test failure

Comments

@pshipton
Member

pshipton commented Nov 8, 2024

Internal build
[Linux PPC 64bit] 80 Load_Level_2.harmony.5mins.Mode112 -Xgcpolicy:gencon -Xjit:count=0 -Xnocompressedrefs
rhel7p8vm14

vmState [0x5ffff]: {J9VMSTATE_JIT} {Illegal optimization number}
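
A minimal sketch of how the `0x5ffff` value above can be split into two halves. It assumes the upper 16 bits carry the major state (reported as `J9VMSTATE_JIT`) and the lower 16 bits carry the JIT-specific detail (reported as an illegal optimization number). The constant and function names are illustrative, not taken from the OpenJ9 sources.

```python
from typing import Tuple

# Assumption: upper 16 bits = major state, lower 16 bits = JIT detail field.
VMSTATE_MAJOR_SHIFT = 16
VMSTATE_MINOR_MASK = 0xFFFF


def split_vmstate(vmstate: int) -> Tuple[int, int]:
    """Return the (major, minor) halves of a 32-bit vmState value."""
    return vmstate >> VMSTATE_MAJOR_SHIFT, vmstate & VMSTATE_MINOR_MASK


major, minor = split_vmstate(0x0005FFFF)
print(f"major=0x{major:x} minor=0x{minor:04x}")
```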

30x grinder failed 1/30

j> 08:31:55 #0: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so(+0x16d954c) [0x3fff7eb6954c]
j> 08:31:55 #1: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so(+0x16fdf08) [0x3fff7eb8df08]
j> 08:31:55 #2: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so(+0x5aa1a0) [0x3fff7da3a1a0]
j> 08:31:55 #3: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9prt29.so(+0x61160) [0x3fff7f671160]
j> 08:31:55 #4: [0x3fff80760478]
j> 08:31:55 #5: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so(+0x63c820) [0x3fff7dacc820]
j> 08:31:55 #6: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so(+0x5a7c28) [0x3fff7da37c28]
j> 08:31:55 #7: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9prt29.so(+0x5cec8) [0x3fff7f66cec8]
j> 08:31:55 #8: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so(+0x5a06c4) [0x3fff7da306c4]
j> 08:31:55 #9: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so(+0x59f700) [0x3fff7da2f700]
j> 08:31:55 #10: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so(+0x59dff8) [0x3fff7da2dff8]
j> 08:31:55 #11: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so(+0x59ddf4) [0x3fff7da2ddf4]
j> 08:31:55 #12: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9prt29.so(+0x5cec8) [0x3fff7f66cec8]
j> 08:31:55 #13: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so(+0x59b8c8) [0x3fff7da2b8c8]
j> 08:31:55 #14: /bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9thr29.so(+0xcd44) [0x3fff7f72cd44]
j> 08:31:55 #15: /usr/lib64/libpthread.so.0(+0xcafc) [0x3fff8071cafc]
j> 08:31:55 #16: /usr/lib64/libc.so.6(clone-0xc453c) [0x3fff805b704c]
j> 08:31:55 Unhandled exception
j> 08:31:55 Type=Segmentation error vmState=0x0005ffff
j> 08:31:55 J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
j> 08:31:55 Handler1=00003FFF7FA1A968 Handler2=00003FFF7F6EF4B8
j> 08:31:55 R0=FFFFFFFFFF830000 R1=00003FFEB53AA0F0 R2=00003FFF7F2F0400 R3=0000000000000001
j> 08:31:55 R4=00003FFF7F222350 R5=0000000000001000 R6=FFFFFFFFFF000000 R7=00006A6176612F69
j> 08:31:55 R8=0000000000000010 R9=FFFFFFFFFFFFFFF4 R10=00000000000000C0 R11=0000000000000000
j> 08:31:55 R12=00003FFF7DA39F50 R13=00003FFEB53B6900 R14=00003FFEB53AA658 R15=00003FFEB53AA6D8
j> 08:31:55 R16=00003FFEB53AA6B8 R17=00003FFEB53AA698 R18=0000000000000001 R19=0000000000000000
j> 08:31:55 R20=00003FFF7F6E7D10 R21=00003FFF7F6F5F70 R22=00003FFEB53ABAE8 R23=00003FFCF402CB68
j> 08:31:55 R24=00003FFF780FE670 R25=00003FFEA4002200 R26=00003FFEB53AA948 R27=00003FFEA4022940
j> 08:31:55 R28=00003FFF7C171BA0 R29=00003FFEB53ABAE8 R30=0000000000000001 R31=0000000000000000
j> 08:31:55 NIP=00003FFF7DCBCC84 MSR=800000010280F032 ORIG_GPR3=C0000000000093BC CTR=00003FFF7DAD9EC0
j> 08:31:55 LINK=00003FFF7DAD9F74 XER=0000000020000000 CCR=0000000048004088 SOFTE=0000000000000001
j> 08:31:55 TRAP=0000000000000300 DAR=FFFFFFFFFFFFFFFD dsisr=0000000040000000 RESULT=0000000000000000
j> 08:31:55 FPR0=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR1=c3e0000000000000 (f: 0.000000, d: -9.223372e+18)
j> 08:31:55 FPR2=41cdcd6500000000 (f: 0.000000, d: 1.000000e+09)
j> 08:31:55 FPR3=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR4=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR5=c3e0000000000000 (f: 0.000000, d: -9.223372e+18)
j> 08:31:55 FPR6=405a000000000000 (f: 0.000000, d: 1.040000e+02)
j> 08:31:55 FPR7=412e848000000000 (f: 0.000000, d: 1.000000e+06)
j> 08:31:55 FPR8=4000000000000000 (f: 0.000000, d: 2.000000e+00)
j> 08:31:55 FPR9=4530000000000000 (f: 0.000000, d: 1.934281e+25)
j> 08:31:55 FPR10=412e848000000000 (f: 0.000000, d: 1.000000e+06)
j> 08:31:55 FPR11=43300000000f4240 (f: 1000000.000000, d: 4.503600e+15)
j> 08:31:55 FPR12=4530000000000000 (f: 0.000000, d: 1.934281e+25)
j> 08:31:55 FPR13=3fb745d100000000 (f: 0.000000, d: 9.090906e-02)
j> 08:31:55 FPR14=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR15=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR16=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR17=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR18=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR19=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR20=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR21=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR22=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR23=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR24=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR25=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR26=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR27=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR28=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR29=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR30=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 FPR31=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 08:31:55 Module=/bluebird/builds/bld_81252/sdk/xp6480/jre/lib/ppc64/default/libj9jit29.so
j> 08:31:55 Module_base_address=00003FFF7D490000
j> 08:31:55 
j> 08:31:55 Method_being_compiled=java/io/ObjectInputStream.defaultReadFields(Ljava/lang/Object;Ljava/io/ObjectStreamClass;)V
j> 08:31:55 Target=2_90_20241108_81252 (Linux 3.10.0-1160.119.1.el7.ppc64)
j> 08:31:55 CPU=ppc64 (16 logical CPUs) (0xfc2e0000 RAM)
j> 08:31:55 ----------- Stack Backtrace -----------
j> 08:31:55  (0x00003FFF7F6AACB8 [libj9prt29.so+0x9acb8])
j> 08:31:55  (0x00003FFF7F66CEC8 [libj9prt29.so+0x5cec8])
j> 08:31:55  (0x00003FFF7F6AB428 [libj9prt29.so+0x9b428])
j> 08:31:55  (0x00003FFF7F6AA7BC [libj9prt29.so+0x9a7bc])
j> 08:31:55  (0x00003FFF7F66CEC8 [libj9prt29.so+0x5cec8])
j> 08:31:55  (0x00003FFF7F6AA878 [libj9prt29.so+0x9a878])
j> 08:31:55  (0x00003FFF7F87AED4 [libj9vm29.so+0x11aed4])
j> 08:31:55  (0x00003FFF7F66CEC8 [libj9prt29.so+0x5cec8])
j> 08:31:55  (0x00003FFF7F87A688 [libj9vm29.so+0x11a688])
j> 08:31:55  (0x00003FFF7F671160 [libj9prt29.so+0x61160])
j> 08:31:55 __kernel_sigtramp_rt64+0x0 (0x00003FFF80760478)
j> 08:31:55  (0x00003FFF7DACC820 [libj9jit29.so+0x63c820])
j> 08:31:55  (0x00003FFF7DA37C28 [libj9jit29.so+0x5a7c28])
j> 08:31:55  (0x00003FFF7F66CEC8 [libj9prt29.so+0x5cec8])
j> 08:31:55  (0x00003FFF7DA306C4 [libj9jit29.so+0x5a06c4])
j> 08:31:55  (0x00003FFF7DA2F700 [libj9jit29.so+0x59f700])
j> 08:31:55  (0x00003FFF7DA2DFF8 [libj9jit29.so+0x59dff8])
j> 08:31:55  (0x00003FFF7DA2DDF4 [libj9jit29.so+0x59ddf4])
j> 08:31:55  (0x00003FFF7F66CEC8 [libj9prt29.so+0x5cec8])
j> 08:31:55  (0x00003FFF7DA2B8C8 [libj9jit29.so+0x59b8c8])
j> 08:31:55  (0x00003FFF7F72CD44 [libj9thr29.so+0xcd44])
j> 08:31:55  (0x00003FFF8071CAFC [libpthread.so.0+0xcafc])
j> 08:31:55 clone+0xfff3bac4 (0x00003FFF805B704C [libc.so.6+0x14704c])
j> 08:31:55 ---------------------------------------
@pshipton
Member Author

pshipton commented Nov 8, 2024

@zl-wang fyi


github-actions bot commented Nov 8, 2024

Issue Number: 20546
Status: Open
Recommended Components: comp:vm, comp:gc, comp:build
Recommended Assignees: pshipton, hangshao0, tajila

@pshipton
Member Author

See also #20567

@pshipton
Member Author

http://vmfarm.rtp.raleigh.ibm.com/job_output.php?id=95899244
[AIX64] 80 Load_Level_2.harmony.5mins.Mode112

@pshipton
Member Author

pshipton commented Nov 14, 2024

Not sure if this is the same issue, but the vmState and mode are the same.
Also the method being compiled is related.

http://vmfarm.rtp.raleigh.ibm.com/job_output.php?id=95965401
[Linux Hammer] 80 Load_Level_2.harmony.5mins.Mode112 -Xgcpolicy:gencon -Xjit:count=0 -Xnocompressedrefs
rtv-rhel8x86-rtp-test-us8h2-1

100x grinder failed 5/100
Also failing on the last ibuild (http://vmfarm.rtp.raleigh.ibm.com/build_info.php?build_id=81589) so it's not a result of the new changes in the abuild.

j> 06:55:00 #0: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x80fd95) [0x7fedf5362d95]
j> 06:55:00 #1: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x81c0e0) [0x7fedf536f0e0]
j> 06:55:00 #2: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x1274e9) [0x7fedf4c7a4e9]
j> 06:55:00 #3: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9prt29.so(+0x26a58) [0x7fedf7176a58]
j> 06:55:00 #4: /usr/lib64/libpthread.so.0(+0x12d20) [0x7fedf6e49d20]
j> 06:55:00 #5: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x26ae92) [0x7fedf4dbde92]
j> 06:55:00 #6: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x136c64) [0x7fedf4c89c64]
j> 06:55:00 #7: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9prt29.so(+0x274a9) [0x7fedf71774a9]
j> 06:55:00 #8: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x134fad) [0x7fedf4c87fad]
j> 06:55:00 #9: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x1352f4) [0x7fedf4c882f4]
j> 06:55:00 #10: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x13435f) [0x7fedf4c8735f]
j> 06:55:00 #11: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x134588) [0x7fedf4c87588]
j> 06:55:00 #12: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x134622) [0x7fedf4c87622]
j> 06:55:00 #13: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9prt29.so(+0x274a9) [0x7fedf71774a9]
j> 06:55:00 #14: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so(+0x1349eb) [0x7fedf4c879eb]
j> 06:55:00 #15: /bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9thr29.so(+0xb393) [0x7fedf71f3393]
j> 06:55:00 #16: /usr/lib64/libpthread.so.0(+0x81ca) [0x7fedf6e3f1ca]
j> 06:55:00 #17: /usr/lib64/libc.so.6(clone+0x43) [0x7fedf68968d3]
j> 06:55:00 Unhandled exception
j> 06:55:00 Type=Segmentation error vmState=0x0005ffff
j> 06:55:00 J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
j> 06:55:00 Handler1=00007FEDF5BDB530 Handler2=00007FEDF7176820 InaccessibleAddress=FFFFFFFFFFFFFFF9
j> 06:55:00 RDI=FFFFFFFFFFFFFFFD RSI=0000000000000010 RAX=0000000000000000 RBX=00007FEDF02516A0
j> 06:55:00 RCX=0000000000000012 RDX=000000000000001A R8=0000000000000001 R9=00007FED1C72F150
j> 06:55:00 R10=0000000000000000 R11=00007FED1C72F9F0 R12=00007FEDF0922750 R13=000055C3D04B9218
j> 06:55:00 R14=00007FED1C72FA28 R15=00007FECCD4BE8E0
j> 06:55:00 RIP=00007FEDF4DBDE92 GS=0000 FS=0000 RSP=00007FED1C72F498
j> 06:55:00 EFlags=0000000000010246 CS=0033 RBP=00007FED1C72F5A0 ERR=0000000000000005
j> 06:55:00 TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=FFFFFFFFFFFFFFF9
j> 06:55:00 xmm0=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 06:55:00 xmm1=00007fedf02516a0 (f: 4028962560.000000, d: 6.949523e-310)
j> 06:55:00 xmm2=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 06:55:00 xmm3=00007fed08003c00 (f: 134233088.000000, d: 6.949331e-310)
j> 06:55:00 xmm4=00007fed1c72f7a0 (f: 477296544.000000, d: 6.949348e-310)
j> 06:55:00 xmm5=0000003000000020 (f: 32.000000, d: 1.018558e-312)
j> 06:55:00 xmm6=00007fed1c72f980 (f: 477297024.000000, d: 6.949348e-310)
j> 06:55:00 xmm7=654b657461766972 (f: 1635150208.000000, d: 8.881360e+179)
j> 06:55:00 xmm8=00007fec97363070 (f: 2536910848.000000, d: 6.949237e-310)
j> 06:55:00 xmm9=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 06:55:00 xmm10=3f3f3f3f3f3f3f3f (f: 1061109568.000000, d: 4.767923e-04)
j> 06:55:00 xmm11=9999999999999999 (f: 2576980480.000000, d: -2.353437e-185)
j> 06:55:00 xmm12=2020202020202020 (f: 538976256.000000, d: 6.013470e-154)
j> 06:55:00 xmm13=000000003eaaa9ec (f: 1051372032.000000, d: 5.194468e-315)
j> 06:55:00 xmm14=ffffffffffffffff (f: 4294967296.000000, d: -nan)
j> 06:55:00 xmm15=0000000000000000 (f: 0.000000, d: 0.000000e+00)
j> 06:55:00 Module=/bluebird/builds/bld_81617/sdk/xa6480/jre/lib/amd64/default/libj9jit29.so
j> 06:55:00 Module_base_address=00007FEDF4B53000
j> 06:55:00 
j> 06:55:00 Method_being_compiled=java/io/ObjectOutputStream.defaultWriteFields(Ljava/lang/Object;Ljava/io/ObjectStreamClass;)V
j> 06:55:00 Target=2_90_20241114_81617 (Linux 4.18.0-553.27.1.el8_10.x86_64)
j> 06:55:00 CPU=amd64 (4 logical CPUs) (0x1e112c000 RAM)
j> 06:55:00 ----------- Stack Backtrace -----------
j> 06:55:01 _ZN2J913Recompilation23getJittedBodyInfoFromPCEPv+0x2 (0x00007FEDF4DBDE92 [libj9jit29.so+0x26ae92])
j> 06:55:01 _ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv+0x144 (0x00007FEDF4C89C64 [libj9jit29.so+0x136c64])
j> 06:55:01 omrsig_protect+0x239 (0x00007FEDF71774A9 [libj9prt29.so+0x274a9])
j> 06:55:01 _ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadP21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x37d (0x00007FEDF4C87FAD [libj9jit29.so+0x134fad])
j> 06:55:01 _ZN2TR24CompilationInfoPerThread12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x164 (0x00007FEDF4C882F4 [libj9jit29.so+0x1352f4])
j> 06:55:01 _ZN2TR24CompilationInfoPerThread14processEntriesEv+0x37f (0x00007FEDF4C8735F [libj9jit29.so+0x13435f])
j> 06:55:01 _ZN2TR24CompilationInfoPerThread3runEv+0x68 (0x00007FEDF4C87588 [libj9jit29.so+0x134588])
j> 06:55:01 _Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0x82 (0x00007FEDF4C87622 [libj9jit29.so+0x134622])
j> 06:55:01 omrsig_protect+0x239 (0x00007FEDF71774A9 [libj9prt29.so+0x274a9])
j> 06:55:01 _Z21compilationThreadProcPv+0x17b (0x00007FEDF4C879EB [libj9jit29.so+0x1349eb])
j> 06:55:01 thread_wrapper+0x163 (0x00007FEDF71F3393 [libj9thr29.so+0xb393])
j> 06:55:01 start_thread+0xea (0x00007FEDF6E3F1CA [libpthread.so.0+0x81ca])
j> 06:55:01 clone+0x43 (0x00007FEDF68968D3 [libc.so.6+0x398d3])
j> 06:55:01 ---------------------------------------
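
The `_ZN...` symbols in the backtrace above are Itanium C++ ABI mangled names. As a rough reading aid, the sketch below pulls out only the length-prefixed components of a nested name. It is not a full demangler: it ignores parameter types, templates, and substitutions, and the function names are made up for illustration.

```python
def nested_name_components(mangled):
    """Return the length-prefixed identifiers of an Itanium `_ZN...E` name.

    Only plain identifiers are handled, which is enough for the `_ZN`
    symbols in this backtrace.
    """
    prefix = "_ZN"
    if not mangled.startswith(prefix):
        raise ValueError("not a nested mangled name: " + mangled)
    pos = len(prefix)
    parts = []
    while pos < len(mangled) and mangled[pos] != "E":
        start = pos
        while pos < len(mangled) and mangled[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError("unsupported construct at offset %d" % start)
        length = int(mangled[start:pos])
        parts.append(mangled[pos:pos + length])
        pos += length
    return parts


def qualified_name(mangled):
    """Join the components with `::`, dropping the parameter list."""
    return "::".join(nested_name_components(mangled))


print(qualified_name("_ZN2J913Recompilation23getJittedBodyInfoFromPCEPv"))
print(qualified_name(
    "_ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv"))
```

The two calls cover the top two frames of the backtrace: the faulting function and its caller.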

@pshipton
Member Author

@hzongaro please take a look

@pshipton
Member Author

pshipton commented Nov 14, 2024

http://vmfarm.rtp.raleigh.ibm.com/job_output.php?id=95980123
[Linux Hammer] 80 Load_Level_2.harmony.5mins.Mode112

http://vmfarm.rtp.raleigh.ibm.com/job_output.php?id=95999705
[AIX] 80 Load_Level_2.harmony.5mins.Mode112

@hzongaro
Member

@pshipton, sorry for the basic question, but it's been a long time since I looked at a failure reported in vmfarm. Where should I be looking for the Linux x86 builds that you mentioned in #20546 (comment) and #20546 (comment)?

[j9build@rtv-rhel8le-cuda-1 ~]$ ls /bluebird/builds/bld_81617
ls: cannot access '/bluebird/builds/bld_81617': No such file or directory
[j9build@rtv-rhel8le-cuda-1 ~]$ ls /bluebird/builds/bld_81633
ls: cannot access '/bluebird/builds/bld_81633': No such file or directory
[j9build@rtv-rhel8le-cuda-1 ~]$ ls /bluebird/builds
'""'   bld_81404   bld_81485   bld_81552   bld_81567   bld_81619   bld_81656   bld_81696   bld_81702
 77    bld_81412   bld_81503   bld_81554   bld_81618   bld_81629   bld_81660   bld_81700   generate

@pshipton
Member Author

The builds are retired and no longer on disk, but you can find them in Artifactory.
i.e. https://na.artifactory.swg-devops.com/ui/native/sys-rt-vmfarm-generic-local/R29/acceptance/xa6480/81617

@hzongaro
Member

Looking at the x86 Linux core from http://vmfarm.rtp.raleigh.ibm.com/job_output.php?id=95965401, the problem appears to be the same as that described by @IBMJimmyk in #20567 (comment)

(kca) where
0x00007fedf4dbde92 {libj9jit29.so}{_ZN2J913Recompilation23getJittedBodyInfoFromPCEPv} [0x7fed1c72f498]
0x00007fedf4c89c64 {libj9jit29.so}{_ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv} [0x7fed1c72f4a0]
(kca) (0x00007fedf4dbde90)/8i
0x7fedf4dbde90 {libj9jit29.so}{_ZN2J913Recompilation23getJittedBodyInfoFromPCEPv} +0               31c0                 xor       eax, eax
0x7fedf4dbde92 {libj9jit29.so}{_ZN2J913Recompilation23getJittedBodyInfoFromPCEPv} +2               f647fc30             test      byte ptr [rdi - 4], 0x30  
0x7fedf4dbde96 {libj9jit29.so}{_ZN2J913Recompilation23getJittedBodyInfoFromPCEPv} +6          *    7404                 je        0x7fedf4dbde9c C>> +12
0x7fedf4dbde98 {libj9jit29.so}{_ZN2J913Recompilation23getJittedBodyInfoFromPCEPv} +8          |    488b47f4             mov       rax, qword ptr [rdi - 0xc]
0x7fedf4dbde9c {libj9jit29.so}{_ZN2J913Recompilation23getJittedBodyInfoFromPCEPv} +12         >    c3                   ret        <<< +6
0x7fedf4dbde9d {libj9jit29.so}{_ZN2J913Recompilation23getJittedBodyInfoFromPCEPv} +13              90                   nop
0x7fedf4dbde9e {libj9jit29.so}{_ZN2J913Recompilation23getJittedBodyInfoFromPCEPv} +14              6690                 nop
0x7fedf4dbdea0 {libj9jit29.so}{_ZN2J913Recompilation29isAlreadyPreparedForRecompileEPv} +0               0fb757fe             movzx     edx, word ptr [rdi - 2]
(kca) p $rdi
%1 = 0xfffffffffffffffd   (-3)
(kca) j9m 0x00007feccd4be8e0
Method   {ClassPath/Name.MethodName}: {java/io/ObjectOutputStream.defaultWriteFields}
                           Signature: (Ljava/lang/Object;Ljava/io/ObjectStreamClass;)V
                              Access: Private 
                    J9Class/J9Method: 0x00007feccd4bec00 / 0x00007feccd4be8e0
               Compiled Method Start: Not Compiled! (count=-3)
                      ByteCode Start: 0x00007fece015b834 (270 bytes)
                   ROM Constant Pool: 0x00007fece0158e20 (261 entries)
                       Constant Pool: 0x00007feccd4bd1e0 (260 entries)
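
To connect the disassembly to the register dump, here is a small sketch of the address arithmetic. With `rdi` holding -3, the `test byte ptr [rdi - 4], 0x30` instruction reads from an address that wraps to `0xFFFFFFFFFFFFFFF9`, which matches the `InaccessibleAddress` and `CR2` values in the crash report. The helper names are illustrative.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value):
    """Interpret a 64-bit register value as a signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def effective_address(base, displacement):
    """Wrap base + displacement to 64 bits, as x86-64 addressing does."""
    return (base + displacement) & MASK64


rdi = 0xFFFFFFFFFFFFFFFD            # value printed by `p $rdi`
fault = effective_address(rdi, -4)  # operand of `test byte ptr [rdi - 4], 0x30`

print(to_signed64(rdi))
print(hex(fault))
```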

@pshipton
Member Author

AIX core files can be found under http://vmfarm.rtp.raleigh.ibm.com/etc/cores/tmp/

@hzongaro
Member

i.e. https://na.artifactory.swg-devops.com/ui/native/sys-rt-vmfarm-generic-local/R29/acceptance/xa6480/81617

@pshipton, do you know where I might find the debug image for that build?

@pshipton
Member Author

We have https://na.artifactory.swg-devops.com/ui/native/sys-rt-vmfarm-generic-local/R29/acceptance/jvmxa6480/81617/jvmxa6480.zip which appears to contain debug information.

@hzongaro
Member

Though the root cause of the problem is the same as that reported in #20567, the location where it occurs is different. That debug build helped move me further in investigating the problem. I was unable to tie addresses reported in the core file directly to line numbers in the source code, but I managed to piece together where exactly the problem occurred by looking for method calls that weren't virtual.

It looks like the problem in this case occurs where TR::CompilationInfoPerThreadBase::wrappedCompile (the calling frame in the x86 backtrace) calls TR_ResolvedJ9Method::getExistingJittedBodyInfo(), which in turn calls TR::Recompilation::getJittedBodyInfoFromPC.
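
For reference, the `libj9jit29.so+0x...` offsets in the x86 backtrace can be recomputed from the absolute addresses and the `Module_base_address` in the dump. The sketch below does only that subtraction, and the helper name is illustrative.

```python
def module_offset(address, module_base):
    """Return the offset of an absolute address within a loaded module."""
    if address < module_base:
        raise ValueError("address lies below the module base")
    return address - module_base


MODULE_BASE = 0x00007FEDF4B53000  # Module_base_address from the x86 dump

# Top two frames of the x86 backtrace.
frames = {
    "getJittedBodyInfoFromPC+0x2": 0x00007FEDF4DBDE92,
    "wrappedCompile+0x144": 0x00007FEDF4C89C64,
}

for name, address in frames.items():
    print(name, hex(module_offset(address, MODULE_BASE)))
```

The results agree with the `+0x26ae92` and `+0x136c64` offsets printed next to those frames.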

@pshipton
Member Author

@vij-singh

@IBMJimmyk @zl-wang Can we verify this one with the fix for #20567 ?

@IBMJimmyk
Contributor

Based on Henry's comments, my fix for #20567 would not help. My understanding is that getExistingJittedBodyInfo calls startAddressForInterpreterOfJittedMethod, which reads the extra field and returns -3 because the latest compilation failed. getExistingJittedBodyInfo then calls getJittedBodyInfoFromPC with -3 and crashes, since that's not a real start PC.

This is similar to the problem I saw but is taking a different path. I think the solution might also be to add error checking for invalid startPCs and take appropriate action. I would need to look at the code in more detail to determine the best way to handle an error. It might be possible for getJittedBodyInfoFromPC to return NULL for invalid startPCs, and then to check that every caller can handle a NULL return.
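
The idea of returning NULL for an invalid start PC can be modelled with a toy sketch. The unchecked lookup mirrors the disassembly earlier in this thread (test the byte at `startPC - 4` against `0x30`, then load the 8-byte value at `startPC - 0xc`). Everything else is an assumption made for illustration: the `FakeMemory` class, the function names, and especially the rule that a non-positive start PC counts as invalid. This is not OpenJ9 code and not the fix that was eventually delivered.

```python
import struct

MASK64 = (1 << 64) - 1


class FakeMemory:
    """Toy address space: mapped bytes live in a dict, anything else faults."""

    def __init__(self):
        self._bytes = {}

    def write(self, address, data):
        for i, value in enumerate(data):
            self._bytes[(address + i) & MASK64] = value

    def read(self, address, size):
        try:
            return bytes(self._bytes[(address + i) & MASK64] for i in range(size))
        except KeyError:
            raise MemoryError("segfault at 0x%016x" % (address & MASK64)) from None


def to_signed64(value):
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def body_info_unchecked(mem, start_pc):
    """Mirror of the disassembly: no validation of start_pc."""
    flags = mem.read(start_pc - 4, 1)[0]
    if (flags & 0x30) == 0:
        return None
    return struct.unpack("<Q", mem.read(start_pc - 0xC, 8))[0]


def body_info_checked(mem, start_pc):
    """Hypothetical guard: treat non-positive start PCs as 'no compiled body'."""
    if to_signed64(start_pc) <= 0:
        return None
    return body_info_unchecked(mem, start_pc)


mem = FakeMemory()
valid_pc = 0x1000
mem.write(valid_pc - 0xC, struct.pack("<Q", 0xABCD))  # pretend body-info pointer
mem.write(valid_pc - 4, bytes([0x10]))                # a bit inside the 0x30 mask

print(body_info_checked(mem, valid_pc))  # valid start PC: returns the pointer
print(body_info_checked(mem, -3))        # sentinel start PC: returns None
```

Calling `body_info_unchecked(mem, -3)` instead raises `MemoryError`, the toy equivalent of the segfault. As the comment notes, a guard like this only helps if every caller tolerates the NULL (here `None`) result.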

@pshipton
Member Author

http://vmfarm.rtp.raleigh.ibm.com/job_output.php?id=96726917
[Win64 Hammer Compressed Pointers] 80 Load_Level_2.harmony.5mins.Mode688

@hzongaro
Member

@IBMJimmyk, I know you were thinking about a fix for this. Do you think it can make it into 0.49 or should this move out to 0.51?

@pshipton
Member Author

We need a good justification to move a regression/blocker out. This failure has been occurring frequently and also occurred in a 25_01 build.

@pshipton
Member Author

@IBMJimmyk
Contributor

I just talked to @dsouzai and it looks like this issue might be fixed by this recently merged PR:
#20763

It reverts a problematic PR that modified the extra field and was originally merged on Nov 7. I think this is just before these Harmony -3 start PC problems showed up (this issue was opened on Nov 8). The problematic PR was also letting a bad start PC cause problems in other places.

I am currently in the middle of trying to see if I can verify that reverting the change will fix the problem.

@pshipton
Member Author

@IBMJimmyk do you have any conclusions? Something seems to have fixed the problem as I haven't been seeing it in the head stream builds any more. We should get the fix backported asap.

@dsouzai
Contributor

dsouzai commented Dec 12, 2024

I opened #20825; the original issue I had fixed wasn't tagged for any release, so I never double delivered.

@IBMJimmyk
Contributor

It does seem like the (now reverted) change that modified the extra field can potentially cause the problem in this issue. So the PR that reverts it should fix this issue as well.


Issue Number: 20546
Status: Closed
Actual Components: comp:jit, test failure, blocker, segfault
Actual Assignees: No one :(
PR Assignees: No one :(
