regularization_L2_beta error #23

EhsanMehdipour · 2024-03-07T14:31:34Z

Hi,

When I initialize DINCAE with regularization_L2_beta = 0.001, I receive the following error.

ERROR: LoadError: MethodError: no method matching abs2(::CuArray{Float32, 4, CUDA.Mem.DeviceBuffer})

Closest candidates are:
  abs2(!Matched::Complex)
   @ Base complex.jl:281
  abs2(!Matched::ForwardDiff.Dual{T}) where T
   @ ForwardDiff ~/.julia/packages/ForwardDiff/PcZ48/src/dual.jl:238
  abs2(!Matched::DualNumbers.Dual)
   @ DualNumbers ~/.julia/packages/DualNumbers/5knFX/src/dual.jl:204
  ...

Stacktrace:
  [1] MappingRF
    @ ./reduce.jl:95 [inlined]
  [2] _foldl_impl(op::Base.MappingRF{typeof(abs2), Base.BottomRF{typeof(Base.add_sum)}}, init::Base._InitialValue, itr::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}})
    @ Base ./reduce.jl:58
  [3] foldl_impl
    @ ./reduce.jl:48 [inlined]
  [4] mapfoldl_impl(f::typeof(abs2), op::typeof(Base.add_sum), nt::Base._InitialValue, itr::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}})
    @ Base ./reduce.jl:44
  [5] mapfoldl(f::Function, op::Function, itr::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}}; init::Base._InitialValue)
    @ Base ./reduce.jl:170
  [6] mapfoldl
    @ ./reduce.jl:170 [inlined]
  [7] #mapreduce#292
    @ ./reduce.jl:302 [inlined]
  [8] mapreduce
    @ ./reduce.jl:302 [inlined]
  [9] #sum#295
    @ ./reduce.jl:530 [inlined]
 [10] sum(f::Function, a::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}})
    @ Base ./reduce.jl:530
 [11] loss_function(model::DINCAE.StepModel{DINCAE.var"#52#56"{Float64}, DINCAE.var"#53#57"{Bool, Int64, Int64}}, xin::CuArray{Float32, 4, CUDA.Mem.DeviceBuffer}, xtrue::CuArray{Float32, 4, CUDA.Mem.DeviceBuffer})
    @ DINCAE ~/DINCAE/DINCAE.jl/src/model.jl:220
 [12] reconstruct(Atype::Type, data_all::Vector{Vector{NamedTuple{(:filename, :varname, :obs_err_std, :jitter_std, :isoutput), Tuple{String, String, Int64, Float64, Bool}}}}, fnames_rec::Vector{String}; epochs::Int64, batch_size::Int64, truth_uncertain::Bool, enc_nfilter_internal::Vector{Int64}, skipconnections::UnitRange{Int64}, clip_grad::Float64, regularization_L1_beta::Int64, regularization_L2_beta::Float64, save_epochs::StepRange{Int64, Int64}, is3D::Bool, upsampling_method::Symbol, ntime_win::Int64, learning_rate::Float64, learning_rate_decay_epoch::Float64, min_std_err::Float64, loss_weights_refine::Tuple{Float64, Float64}, cycle_periods::Tuple{Float64}, output_ndims::Int64, direction_obs::Nothing, remove_mean::Bool, paramfile::Nothing, laplacian_penalty::Int64, laplacian_error_penalty::Int64)
    @ DINCAE ~/DINCAE/DINCAE.jl/src/model.jl:490
 [13] top-level scope
    @ ~/DINCAE/python/DINCAE/8_3.jl:106

Alexander-Barth · 2024-03-11T13:35:52Z

Can you provide a (minimal) reproducible example?
I just tried the example below (with CUDA 5.2.0 and Flux 0.14.13) but did not get the error.

Maybe this error occurs when you combine different options? Feel free to adapt the example below to whatever is needed to trigger the error.

using DINCAE
using Base.Iterators
using Random
using NCDatasets
using CUDA

const F = Float32
Atype = CuArray{F}

filename = "avhrr_sub_add_clouds_n10.nc"

if !isfile(filename)
    download("https://dox.ulg.ac.be/index.php/s/2yFgNMkpsGumVSM/download", filename)
end


data = [
   (filename = filename,
    varname = "SST",
    obs_err_std = 1,
    jitter_std = 0.05,
    isoutput = true,
   )
]
data_test = data;
data_all = [data,data_test]

epochs = 3
batch_size = 5
save_each = 10
skipconnections = [1,2]
enc_nfilter_internal = round.(Int,32 * 2 .^ (0:3))
clip_grad = 5.0
save_epochs = [epochs]
ntime_win = 3
upsampling_method = :nearest

fnames_rec = [tempname()]
paramfile = tempname()

losses = DINCAE.reconstruct(
    Atype,data_all,fnames_rec;
    epochs = epochs,
    batch_size = batch_size,
    enc_nfilter_internal = enc_nfilter_internal,
    clip_grad = clip_grad,
    save_epochs = save_epochs,
    upsampling_method = upsampling_method,
    ntime_win = ntime_win,
    paramfile = paramfile,
    regularization_L2_beta = 0.001,
    )

Output for me:

julia> include("/home/abarth/.julia/dev/DINCAE/test/test_DINCAE_SST_1.jl");
[ Info: Number of threads: 1
SST data shape: 112×112×10 data range: (13.575001f0, 17.775002f0)
SST data shape: 112×112×10 data range: (13.575001f0, 17.775002f0)
[ Info: Output variables:  ["SST"]
[ Info: Input size:        112×112×10×5
[ Info: Input sum:         -9574.162
[ Info: Number of filters in encoder: [10, 32, 64, 128, 256]
[ Info: Number of filters in decoder: [2, 32, 64, 128, 256]
[ Info: Gamma:             10.0
[ Info: Number of filters: [10, 32, 64, 128, 256]
skip connections at level 4
skip connections at level 3
skip connections at level 2
[ Info: using device:      gpu
[ Info: Output size:       112×112×2×5
[ Info: Output range:      (-1.6730540579839301, 2.0068249099692506)
[ Info: Output sum:        51437.69955057635
[ Info: Initial loss:      1.206571102595678
epoch:     1 loss 0.6725
epoch:     2 loss 3.8611
epoch:     3 loss -0.6723
Save output 3
  1.291923 seconds (2.55 M allocations: 179.634 MiB, 7.65% gc time)
  59.881056 seconds (136.24 M allocations: 7.765 GiB, 6.52% gc time, 0.01% compilation time)

Thanks!

EhsanMehdipour · 2024-03-26T10:07:14Z

Hi,

I changed the value of loss_weights_refine from (1.,) to (0.3,0.7) and got the same error.
Is there an incompatibility between refinement and regularization?

The hyperparameters I am using for the test case:

epochs = 3
batch_size = 5
skipconnections = [1,2]
enc_nfilter_internal = round.(Int,32 * 2 .^ (0:3))
regularization_L2_beta = 0.001
ntime_win = 3
upsampling_method = :nearest
loss_weights_refine = (0.3,0.7) ## With refinement
# loss_weights_refine = (1.,) ## without refinement
save_epochs = [epochs]
truth_uncertain = true
remove_mean=false

Alexander-Barth · 2024-03-28T10:21:41Z

Thanks, I could now reproduce the error and committed a fix. Does it also work for you?
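
For anyone hitting the same MethodError: the stack trace in the first comment shows `sum(abs2, ...)` being called on a `Zygote.Params` collection inside `loss_function` for a `DINCAE.StepModel`, so `abs2` is applied to each parameter array as a whole, and `abs2` has no method for arrays. The sketch below only illustrates that failure mode and one common way around it. It is not the code from the committed fix; the plain CPU arrays and the helper name `l2_penalty` are assumptions made for illustration.

```julia
# Stand-in for the trainable parameters: a collection of arrays.
# (Hypothetical values; in the issue this is a Zygote.Params of CuArrays.)
params = [Float32[1 2; 3 4], Float32[1, 2]]

# sum(abs2, params) calls abs2 on each *array*, which has no method,
# so it throws a MethodError like the one reported above.
failed = try
    sum(abs2, params)
    false
catch err
    err isa MethodError
end

# Reducing each array elementwise first, then summing over the
# collection, avoids calling abs2 on an array.
# `l2_penalty` is a hypothetical helper name, not part of DINCAE.
l2_penalty(ps) = sum(p -> sum(abs2, p), ps)

beta = 0.001f0                       # plays the role of regularization_L2_beta
penalty = beta * l2_penalty(params)  # scalar L2 term to add to the loss
```

The inner `sum(abs2, p)` is defined for any numeric array, which is why it is nested inside the outer `sum` over the collection.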

EhsanMehdipour · 2024-04-18T09:36:53Z

Thank you, it is working now.

EhsanMehdipour closed this as completed Mar 26, 2024

EhsanMehdipour reopened this Mar 26, 2024

Alexander-Barth added a commit that referenced this issue Mar 28, 2024

issue #23

f37e535

Alexander-Barth added a commit that referenced this issue Mar 28, 2024

test case for issue #23

98ca393

Alexander-Barth added a commit that referenced this issue Mar 28, 2024

issue #23

c761469

EhsanMehdipour closed this as completed Apr 18, 2024
