Matrix exponential fails #1346

Closed
gdalle opened this issue Mar 15, 2024 · 1 comment
Labels: bug, linear algebra

Comments

@gdalle
Contributor

gdalle commented Mar 15, 2024

julia> using Enzyme

julia> x = rand(2, 2)
2×2 Matrix{Float64}:
 0.133862  0.197559
 0.265893  0.474916

julia> dx = Enzyme.make_zero(x)
2×2 Matrix{Float64}:
 0.0  0.0
 0.0  0.0

julia> exp(x)
2×2 Matrix{Float64}:
 1.17714  0.271507
 0.36542  1.64585

julia> autodiff(Forward, exp, Duplicated, Duplicated(x, dx))
┌ Warning: Using fallback BLAS replacements for (["dgemm_64_", "dsyrk_64_", "dnrm2_64_"]), performance may be degraded
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:59
ERROR: Enzyme execution failed.
Enzyme compilation failed.
Current scope: 
; Function Attrs: mustprogress willreturn
define internal fastcc void @preprocess_julia_gebal__19748({ i64, i64, {} addrspace(10)* }* noalias nocapture nofree noundef nonnull writeonly sret({ i64, i64, {} addrspace(10)* }) align 8 dereferenceable(24) %0, [1 x {} addrspace(10)*]* noalias nocapture nofree noundef nonnull writeonly align 8 dereferenceable(8) "enzyme_inactive" "enzymejl_returnRoots" %1, {} addrspace(10)* noundef nonnull align 16 dereferenceable(40) "enzymejl_parmtype"="127156449281104" "enzymejl_parmtype_ref"="2" %2) unnamed_addr #109 !dbg !12652 {
top:
  %3 = alloca i64, align 16
  %4 = bitcast i64* %3 to i8*
  %5 = alloca i64, align 16
  %6 = bitcast i64* %5 to i8*
  %7 = alloca i64, align 16
  %8 = bitcast i64* %7 to i8*
  %9 = alloca i8, align 1
  %10 = alloca i64, align 16
  %11 = bitcast i64* %10 to i8*
  %12 = alloca i64, align 16
  %13 = bitcast i64* %12 to i8*
  %newstruct74 = alloca [2 x i64], align 8
  %14 = call {}*** @julia.get_pgcstack() #110
  %ptls_field80 = getelementptr inbounds {}**, {}*** %14, i64 2
  %15 = bitcast {}*** %ptls_field80 to i64***
  %ptls_load8182 = load i64**, i64*** %15, align 8, !tbaa !113
  %16 = getelementptr inbounds i64*, i64** %ptls_load8182, i64 2
  %safepoint = load i64*, i64** %16, align 8, !tbaa !117, !invariant.load !112
  fence syncscope("singlethread") seq_cst
  call void @julia.safepoint(i64* %safepoint) #110, !dbg !12653
  fence syncscope("singlethread") seq_cst
  %17 = addrspacecast {} addrspace(10)* %2 to {} addrspace(10)* addrspace(11)*, !dbg !12654
  %arraysize_ptr = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %17, i64 3, !dbg !12654
  %18 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr to i64 addrspace(11)*, !dbg !12654
  %arraysize = load i64, i64 addrspace(11)* %18, align 8, !dbg !12654, !tbaa !117, !range !126, !invariant.load !112, !alias.scope !127, !noalias !130
  %arraysize_ptr2 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %17, i64 4, !dbg !12654
  %19 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr2 to i64 addrspace(11)*, !dbg !12654
  %arraysize3 = load i64, i64 addrspace(11)* %19, align 16, !dbg !12654, !tbaa !117, !range !126, !invariant.load !112, !alias.scope !127, !noalias !130
  %.not = icmp eq i64 %arraysize, %arraysize3, !dbg !12657
  br i1 %.not, label %L13, label %L6, !dbg !12658

L6:                                               ; preds = %top
  %20 = getelementptr inbounds [2 x i64], [2 x i64]* %newstruct74, i64 0, i64 0, !dbg !12659
  store i64 %arraysize, i64* %20, align 8, !dbg !12659, !tbaa !319, !alias.scope !321, !noalias !12660
  %21 = getelementptr inbounds [2 x i64], [2 x i64]* %newstruct74, i64 0, i64 1, !dbg !12659
  store i64 %arraysize3, i64* %21, align 8, !dbg !12659, !tbaa !319, !alias.scope !321, !noalias !12660
  %22 = addrspacecast [2 x i64]* %newstruct74 to [2 x i64] addrspace(11)*, !dbg !12658
  %23 = call fastcc nonnull {} addrspace(10)* @julia_string_19657({} addrspace(10)* nofree noundef nonnull align 16 addrspacecast ({}* inttoptr (i64 127156352624368 to {}*) to {} addrspace(10)*), [2 x i64] addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(16) %22) #110, !dbg !12658
  %current_task75102 = getelementptr inbounds {}**, {}*** %14, i64 -14, !dbg !12658
  %current_task75 = bitcast {}*** %current_task75102 to {}**, !dbg !12658
  %box76 = call noalias nonnull dereferenceable(8) "enzyme_inactive" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task75, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 127156287023792 to {}*) to {} addrspace(10)*)) #111, !dbg !12658
  %24 = bitcast {} addrspace(10)* %box76 to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !12658
  %25 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %24, i64 0, i64 0, !dbg !12658
  store {} addrspace(10)* %23, {} addrspace(10)* addrspace(10)* %25, align 8, !dbg !12658, !tbaa !181, !alias.scope !185, !noalias !12663
  %26 = addrspacecast {} addrspace(10)* %box76 to {} addrspace(12)*, !dbg !12658
  call void @ijl_throw({} addrspace(12)* %26) #112, !dbg !12658
  unreachable, !dbg !12658

L13:                                              ; preds = %top
  %27 = addrspacecast {} addrspace(10)* %2 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !12664
  %arraylen_ptr = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %27, i64 0, i32 1, !dbg !12664
  %arraylen = load i64, i64 addrspace(11)* %arraylen_ptr, align 8, !dbg !12664, !tbaa !117, !range !126, !invariant.load !112, !alias.scope !127, !noalias !130
  %.not83 = icmp eq i64 %arraylen, 0, !dbg !12669
  br i1 %.not83, label %L60, label %L29, !dbg !12665

L29:                                              ; preds = %L13
  %28 = addrspacecast {} addrspace(10)* %2 to double addrspace(13)* addrspace(11)*, !dbg !12671
  %arrayptr84 = load double addrspace(13)*, double addrspace(13)* addrspace(11)* %28, align 16, !dbg !12671, !tbaa !117, !invariant.load !112, !alias.scope !12672, !noalias !130, !nonnull !112
  %value_phi6114 = load double, double addrspace(13)* %arrayptr84, align 8, !dbg !12673, !tbaa !274, !alias.scope !185, !noalias !186
  %29 = fsub double %value_phi6114, %value_phi6114, !dbg !12674
  %30 = fcmp ord double %29, 0.000000e+00, !dbg !12677
  br i1 %30, label %L41.lr.ph, label %L38, !dbg !12676

L41.lr.ph:                                        ; preds = %L29
  %31 = add nuw nsw i64 %arraylen, 1, !dbg !12676
  br label %L41, !dbg !12676

L38.loopexit:                                     ; preds = %L53
  br label %L38, !dbg !12679

L38:                                              ; preds = %L38.loopexit, %L29
  %32 = call fastcc [1 x {} addrspace(10)*] @julia_ArgumentError_19561({} addrspace(10)* nofree noundef nonnull align 32 addrspacecast ({}* inttoptr (i64 127156347529312 to {}*) to {} addrspace(10)*)) #110, !dbg !12679
  %current_task885 = getelementptr inbounds {}**, {}*** %14, i64 -14, !dbg !12679
  %current_task8 = bitcast {}*** %current_task885 to {}**, !dbg !12679
  %box = call noalias nonnull dereferenceable(8) "enzyme_inactive" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task8, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 127156310686912 to {}*) to {} addrspace(10)*)) #111, !dbg !12679
  %33 = bitcast {} addrspace(10)* %box to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !12679
  %34 = extractvalue [1 x {} addrspace(10)*] %32, 0, !dbg !12679
  %35 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %33, i64 0, i64 0, !dbg !12679
  store {} addrspace(10)* %34, {} addrspace(10)* addrspace(10)* %35, align 8, !dbg !12679, !tbaa !181, !alias.scope !185, !noalias !12663
  %36 = addrspacecast {} addrspace(10)* %box to {} addrspace(12)*, !dbg !12679
  call void @ijl_throw({} addrspace(12)* %36) #112, !dbg !12679
  unreachable, !dbg !12679

L41:                                              ; preds = %L53, %L41.lr.ph
  %iv = phi i64 [ %iv.next, %L53 ], [ 0, %L41.lr.ph ]
  %37 = add nuw i64 %iv, 2, !dbg !12680
  %iv.next = add nuw nsw i64 %iv, 1, !dbg !12680
  %exitcond.not = icmp eq i64 %37, %31, !dbg !12680
  br i1 %exitcond.not, label %L60.loopexit, label %L53, !dbg !12682

L53:                                              ; preds = %L41
  %38 = add nsw i64 %37, -1, !dbg !12684
  %39 = getelementptr inbounds double, double addrspace(13)* %arrayptr84, i64 %38, !dbg !12686
  %40 = add nuw i64 %37, 1, !dbg !12687
  %value_phi6 = load double, double addrspace(13)* %39, align 8, !dbg !12673, !tbaa !274, !alias.scope !185, !noalias !186
  %41 = fsub double %value_phi6, %value_phi6, !dbg !12674
  %42 = fcmp ord double %41, 0.000000e+00, !dbg !12677
  br i1 %42, label %L41, label %L38.loopexit, !dbg !12676

L60.loopexit:                                     ; preds = %L41
  br label %L60, !dbg !12688

L60:                                              ; preds = %L60.loopexit, %L13
  %current_task1788 = getelementptr inbounds {}**, {}*** %14, i64 -14, !dbg !12688
  %current_task17 = bitcast {}*** %current_task1788 to {}**, !dbg !12688
  %43 = call noalias nonnull {} addrspace(10)* @ijl_alloc_array_1d({} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 127156239837216 to {}*) to {} addrspace(10)*), i64 %arraysize) #113, !dbg !12691
  store i8 66, i8* %9, align 1, !dbg !12696, !tbaa !739, !alias.scope !185, !noalias !12663
  store i64 %arraysize, i64* %10, align 16, !dbg !12696, !tbaa !739, !alias.scope !185, !noalias !12663
  %.not93 = icmp eq i64 %arraysize, 0, !dbg !12700
  %44 = select i1 %.not93, i64 1, i64 %arraysize, !dbg !12702
  store i64 %44, i64* %12, align 16, !dbg !12696, !tbaa !739, !alias.scope !185, !noalias !12663
  %45 = addrspacecast {} addrspace(10)* %2 to {} addrspace(11)*, !dbg !12703
  %46 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %45) #114, !dbg !12703
  %47 = bitcast {}* %46 to i8**, !dbg !12703
  %arrayptr39 = load i8*, i8** %47, align 8, !dbg !12703, !tbaa !117, !invariant.load !112, !alias.scope !127, !noalias !130, !nonnull !112
  %48 = ptrtoint i8* %arrayptr39 to i64, !dbg !12703
  %bitcast_coercion47 = ptrtoint i64* %5 to i64, !dbg !12704
  %bitcast_coercion51 = ptrtoint i64* %3 to i64, !dbg !12704
  %49 = addrspacecast {} addrspace(10)* %43 to {} addrspace(11)*, !dbg !12703
  %50 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* %49) #114, !dbg !12703
  %51 = bitcast {}* %50 to i8**, !dbg !12703
  %arrayptr53 = load i8*, i8** %51, align 8, !dbg !12703, !tbaa !750, !alias.scope !753, !noalias !754, !nonnull !112
  %52 = ptrtoint i8* %arrayptr53 to i64, !dbg !12703
  %bitcast_coercion57 = ptrtoint i64* %7 to i64, !dbg !12704
  call void @dgebal_64_(i8* noundef nonnull %9, i8* noundef nonnull %11, i64 %48, i8* noundef nonnull %13, i64 noundef %bitcast_coercion47, i64 noundef %bitcast_coercion51, i64 %52, i64 noundef %bitcast_coercion57, i64 noundef 1) #110 [ "jl_roots"({} addrspace(10)* null, {} addrspace(10)* %43, {} addrspace(10)* null, {} addrspace(10)* null, {} addrspace(10)* null, {} addrspace(10)* %2, {} addrspace(10)* null, {} addrspace(10)* null) ], !dbg !12699
  %53 = load i64, i64* %7, align 16, !dbg !12706, !tbaa !739, !alias.scope !185, !noalias !186
  %.not95 = icmp eq i64 %53, 0, !dbg !12709
  br i1 %.not95, label %L176, label %L166, !dbg !12710

L166:                                             ; preds = %L60
  %54 = icmp sgt i64 %53, -1, !dbg !12711
  br i1 %54, label %L173, label %L168, !dbg !12712

L168:                                             ; preds = %L166
  %55 = sub i64 0, %53, !dbg !12713
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %4) #110
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %6) #110
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %8) #110
  call void @llvm.lifetime.end.p0i8(i64 noundef 1, i8* noundef nonnull %9) #110
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %11) #110
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %13) #110
  %56 = call noalias nonnull "enzyme_inactive" {} addrspace(10)* @ijl_box_int64(i64 signext %55) #113, !dbg !12714
  %57 = call nonnull {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* noundef nonnull @ijl_invoke, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 127156268451584 to {}*) to {} addrspace(10)*), {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 127156246650304 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 127156347531504 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %56, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 127156347531472 to {}*) to {} addrspace(10)*)) #115, !dbg !12714
  %box61 = call noalias nonnull dereferenceable(8) "enzyme_inactive" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task17, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 127156310686912 to {}*) to {} addrspace(10)*)) #111, !dbg !12714
  %58 = bitcast {} addrspace(10)* %box61 to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !12714
  %59 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %58, i64 0, i64 0, !dbg !12714
  store {} addrspace(10)* %57, {} addrspace(10)* addrspace(10)* %59, align 8, !dbg !12714, !tbaa !181, !alias.scope !185, !noalias !12663
  %60 = addrspacecast {} addrspace(10)* %box61 to {} addrspace(12)*, !dbg !12714
  call void @ijl_throw({} addrspace(12)* %60) #112, !dbg !12714
  unreachable, !dbg !12714

L173:                                             ; preds = %L166
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %4) #110
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %6) #110
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %8) #110
  call void @llvm.lifetime.end.p0i8(i64 noundef 1, i8* noundef nonnull %9) #110
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %11) #110
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %13) #110
  %box66 = call noalias nonnull dereferenceable(8) "enzyme_inactive" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task17, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 127156276572992 to {}*) to {} addrspace(10)*)) #111, !dbg !12715
  %memcpy_refined_dst = bitcast {} addrspace(10)* %box66 to i64 addrspace(10)*, !dbg !12715
  store i64 %53, i64 addrspace(10)* %memcpy_refined_dst, align 8, !dbg !12715, !tbaa !181, !alias.scope !185, !noalias !12663
  %61 = addrspacecast {} addrspace(10)* %box66 to {} addrspace(12)*, !dbg !12715
  call void @ijl_throw({} addrspace(12)* %61) #112, !dbg !12715
  unreachable, !dbg !12715

L176:                                             ; preds = %L60
  %62 = load i64, i64* %5, align 16, !dbg !12716, !tbaa !739, !alias.scope !185, !noalias !186
  %63 = load i64, i64* %3, align 16, !dbg !12716, !tbaa !739, !alias.scope !185, !noalias !186
  %64 = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*]* %1, i64 0, i64 0, !dbg !12718
  store {} addrspace(10)* %43, {} addrspace(10)** %64, align 8, !dbg !12718, !noalias !12719
  %.repack = getelementptr inbounds { i64, i64, {} addrspace(10)* }, { i64, i64, {} addrspace(10)* }* %0, i64 0, i32 0, !dbg !12718
  store i64 %62, i64* %.repack, align 8, !dbg !12718, !noalias !12719
  %.repack96 = getelementptr inbounds { i64, i64, {} addrspace(10)* }, { i64, i64, {} addrspace(10)* }* %0, i64 0, i32 1, !dbg !12718
  store i64 %63, i64* %.repack96, align 8, !dbg !12718, !noalias !12719
  %.repack98 = getelementptr inbounds { i64, i64, {} addrspace(10)* }, { i64, i64, {} addrspace(10)* }* %0, i64 0, i32 2, !dbg !12718
  store {} addrspace(10)* %43, {} addrspace(10)** %.repack98, align 8, !dbg !12718, !noalias !12719
  ret void, !dbg !12718
}

No forward mode derivative found for dgebal_64_
 at context:   call void @dgebal_64_(i8* noundef nonnull %9, i8* noundef nonnull %11, i64 %48, i8* noundef nonnull %13, i64 noundef %bitcast_coercion47, i64 noundef %bitcast_coercion51, i64 %52, i64 noundef %bitcast_coercion57, i64 noundef 1) #110 [ "jl_roots"({} addrspace(10)* null, {} addrspace(10)* %43, {} addrspace(10)* null, {} addrspace(10)* null, {} addrspace(10)* null, {} addrspace(10)* %2, {} addrspace(10)* null, {} addrspace(10)* null) ], !dbg !216

Stacktrace:
 [1] gebal!
   @ ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/lapack.jl:221


Stacktrace:
  [1] throwerr(cstr::Cstring)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/l4FS0/src/compiler.jl:1289
  [2] gebal!
    @ ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/lapack.jl:221
  [3] exp!
    @ ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/dense.jl:654
  [4] exp
    @ ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/dense.jl:594 [inlined]
  [5] fwddiffejulia_exp_19479wrap
    @ ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/dense.jl:0
  [6] macro expansion
    @ ~/.julia/packages/Enzyme/l4FS0/src/compiler.jl:5378 [inlined]
  [7] enzyme_call
    @ ~/.julia/packages/Enzyme/l4FS0/src/compiler.jl:5056 [inlined]
  [8] ForwardModeThunk
    @ ~/.julia/packages/Enzyme/l4FS0/src/compiler.jl:5001 [inlined]
  [9] autodiff(::ForwardMode{FFIABI}, f::Const{typeof(exp)}, ::Type{Duplicated}, args::Duplicated{Matrix{Float64}})
    @ Enzyme ~/.julia/packages/Enzyme/l4FS0/src/Enzyme.jl:324
 [10] autodiff(::ForwardMode{FFIABI}, ::typeof(exp), ::Type, ::Duplicated{Matrix{Float64}})
    @ Enzyme ~/.julia/packages/Enzyme/l4FS0/src/Enzyme.jl:224
 [11] top-level scope
    @ REPL[70]:1
@vchuravy added the bug and linear algebra labels on Mar 15, 2024
@wsmoses
Member

wsmoses commented Mar 15, 2024

Duplicate of #1222.

@vchuravy this shows the need for LAPACK function support, either proper or fallback.

Rules for this are welcome!
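For reference, the tangent that the failing forward-mode call would return can be computed without differentiating through `gebal!`. The following is a minimal sketch, not Enzyme's API and not a proposed rule: it relies on the standard block-matrix identity `exp([X E; 0 X]) = [exp(X) L; 0 exp(X)]`, where `L` is the directional derivative of `exp` at `X` in direction `E`. The function name `exp_with_tangent` is hypothetical, and only `LinearAlgebra` from the standard library is assumed.

```julia
using LinearAlgebra

# Hypothetical helper (not part of Enzyme or LinearAlgebra).
# Returns (exp(X), L), where L is the directional derivative of the
# matrix exponential at X in direction E, read off the upper-right
# block of exp([X E; 0 X]).
function exp_with_tangent(X::AbstractMatrix, E::AbstractMatrix)
    n = LinearAlgebra.checksquare(X)
    size(E) == (n, n) || throw(DimensionMismatch("E must have the same size as X"))
    M = [X E; zero(X) X]        # 2n×2n block upper-triangular matrix
    F = exp(M)                  # one ordinary matrix exponential, no AD involved
    primal = F[1:n, 1:n]        # equals exp(X)
    tangent = F[1:n, n+1:2n]    # derivative of exp at X applied to E
    return primal, tangent
end
```

Calling `exp_with_tangent(x, dx)` with a nonzero `dx` gives the pair that a forward-mode rule for matrix `exp` would be expected to produce, which may be useful as a reference value when testing such a rule. The cost is one exponential of a matrix twice the size, so this is a correctness aid rather than an efficient implementation.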

@wsmoses closed this as not planned (duplicate) on Mar 15, 2024