gradient (and gradient!) fails on basic example #1340
After reading the Zygote.jl documentation I realized that array mutation is hard for AD packages. Thus, I changed the objective slightly to avoid mutation:

using Enzyme
using LinearAlgebra
# objective = 0.5 || y - X*beta ||^2
function ols(y, X, beta)
storage = X * beta # avoid mul!
obj = zero(eltype(beta))
for i in eachindex(y)
obj += abs2(y[i] - storage[i])
end
return 0.5obj
end
ols(beta::AbstractVector) = ols(y, X, beta)
# simulate data
n = 100
p = 50
X = randn(n, p)
y = randn(n)
beta = randn(p)
ols(y, X, beta)
# autodiff grad
grad1 = zeros(length(beta))
Enzyme.gradient!(Reverse, grad1, ols, beta) # method 1
grad2 = Enzyme.gradient(Reverse, ols, beta) # method 2
grad3 = zeros(length(beta))
Enzyme.autodiff(Reverse, ols, Active, Duplicated(beta, grad3)) # method 3
# analytical grad
true_grad = -X' * (y - X*beta)
# compare answers
[true_grad grad1 grad2 grad3]
50×4 Matrix{Float64}:
223.421 223.421 223.421 223.421
56.115 56.115 56.115 56.115
162.285 162.285 162.285 162.285
-58.6251 -58.6251 -58.6251 -58.6251
109.732 109.732 109.732 109.732
-152.867 -152.867 -152.867 -152.867
51.7608 51.7608 51.7608 51.7608
-64.8512 -64.8512 -64.8512 -64.8512
38.8877 38.8877 38.8877 38.8877
-62.3895 -62.3895 -62.3895 -62.3895
92.2303 92.2303 92.2303 92.2303
208.785 208.785 208.785 208.785
47.6175 47.6175 47.6175 47.6175
⋮
-26.5664 -26.5664 -26.5664 -26.5664
65.8905 65.8905 65.8905 65.8905
279.925 279.925 279.925 279.925
-18.8817 -18.8817 -18.8817 -18.8817
-169.649 -169.649 -169.649 -169.649
-134.628 -134.628 -134.628 -134.628
273.975 273.975 273.975 273.975
0.220214 0.220214 0.220214 0.220214
-130.344 -130.344 -130.344 -130.344
-3.59516 -3.59516 -3.59516 -3.59516
204.023 204.023 204.023 204.023
-268.678 -268.678 -268.678 -268.678

But of course I would prefer
I'm not an Enzyme expert or anything, but this case is discussed pretty clearly in the documentation. You need to explicitly tell Enzyme about your storage array, e.g.:

Enzyme.autodiff(Reverse, ols, Active, Const(y), Const(X), Duplicated(beta, grad_storage), Duplicated(storage, zero(storage)))

will produce the correct gradient.
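For reference, a minimal runnable sketch of that pattern, with the mutated buffer passed explicitly and shadowed by its own Duplicated (the function name ols! and the setup code here are illustrative, not from the thread):

```julia
using Enzyme, LinearAlgebra

# Mutating objective: writes X*beta into a preallocated buffer via mul!.
function ols!(y, X, beta, storage)
    mul!(storage, X, beta)          # in-place matrix-vector product
    obj = zero(eltype(beta))
    for i in eachindex(y)
        obj += abs2(y[i] - storage[i])
    end
    return 0.5obj
end

n, p = 100, 50
X = randn(n, p); y = randn(n); beta = randn(p)
storage = zeros(n)
grad = zeros(p)

# Shadow the mutated buffer with its own Duplicated so Enzyme can track
# the writes that happen inside mul!; y and X are marked Const.
Enzyme.autodiff(Reverse, ols!, Active,
                Const(y), Const(X),
                Duplicated(beta, grad),
                Duplicated(storage, zero(storage)))

grad ≈ -X' * (y - X * beta)   # matches the analytical gradient
```

Without the second Duplicated, Enzyme has no shadow memory for the writes mul! performs into storage, which is what made the original mutating version fail.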
It might also be possible to pre-wrap the variables with
@bgroenks96 thank you very much. I actually saw that example, but its meaning somehow did not register with me when I first saw it. Closing this now.
In the following least squares objective, it does not seem like calling gradient! (or gradient) is doing anything. MWE:

Compare answers:

This is on Julia v1.9.1 with Enzyme v0.11.19.