AssertionError: norm(G * v .- curr) / norm(curr) - multiple failures #370

Closed
ptfreeman-csp opened this issue Dec 7, 2022 · 20 comments

@ptfreeman-csp

ptfreeman-csp commented Dec 7, 2022

Describe the bug, with logs
I have been running into an issue with Omniscape (which calls parts of Circuitscape), and @vlandau suggested that I cross-post here for help. I'm in desperate search of help, as I am mystified as to why this error is occurring and am also on a rapidly approaching deadline. I am trying to run two rather large omni-directional connectivity models with substantial moving-window sizes (though not the most massive among other species that have run just fine using this exact workflow). Omniscape ran (in this case for over four hours) with one message that a moving window had failed at a particular cell, but otherwise fine, and then about 20 minutes before completion I got the error message pasted below. I ran a much smaller test area covering only a small component of the larger raster and the model completed successfully without errors, so I'm not sure what is happening here.

```
Stacktrace:
 [1] wait
   @ ./task.jl:322 [inlined]
 [2] threading_run(func::Function)
   @ Base.Threads ./threadingconstructs.jl:34
 [3] macro expansion
   @ ./threadingconstructs.jl:93 [inlined]
 [4] run_omniscape(cfg::Dict{String, String}, resistance::Matrix{Union{Missing, Float64}}; reclass_table::Matrix{Union{Missing, Float64}}, source_strength::Matrix{Union{Missing, Float64}}, condition1::Matrix{Union{Missing, Float64}}, condition2::Matrix{Union{Missing, Float64}}, condition1_future::Matrix{Union{Missing, Float64}}, condition2_future::Matrix{Union{Missing, Float64}}, wkt::String, geotransform::Vector{Float64}, write_outputs::Bool)
   @ Omniscape ~/.julia/packages/Omniscape/9gHf2/src/main.jl:257
 [5] run_omniscape(path::String)
   @ Omniscape ~/.julia/packages/Omniscape/9gHf2/src/main.jl:536
 [6] top-level scope
   @ REPL[5]:1

    nested task error: AssertionError: norm(G * v .- curr) / norm(curr) < 1.0e-6
    Stacktrace:
     [1] macro expansion
       @ ~/.julia/packages/Omniscape/9gHf2/src/main.jl:278 [inlined]
     [2] (::Omniscape.var"#161#threadsfor_fun#11"{Int64, ProgressMeter.Progress, Int64, Dict{String, String}, Omniscape.ConditionLayers{Float64, 2}, Omniscape.Conditions, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}})(onethread::Bool)
       @ Omniscape ./threadingconstructs.jl:81
     [3] (::Omniscape.var"#161#threadsfor_fun#11"{Int64, ProgressMeter.Progress, Int64, Dict{String, String}, Omniscape.ConditionLayers{Float64, 2}, Omniscape.Conditions, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}})()
       @ Omniscape ./threadingconstructs.jl:48
    
    caused by: AssertionError: norm(G * v .- curr) / norm(curr) < 1.0e-6
    Stacktrace:
     [1] solve_linear_system(G::SparseArrays.SparseMatrixCSC{Float64, Int64}, curr::Vector{Float64}, M::AlgebraicMultigrid.Preconditioner{AlgebraicMultigrid.MultiLevel{AlgebraicMultigrid.Pinv{Float64}, AlgebraicMultigrid.GaussSeidel{AlgebraicMultigrid.SymmetricSweep}, AlgebraicMultigrid.GaussSeidel{AlgebraicMultigrid.SymmetricSweep}, SparseArrays.SparseMatrixCSC{Float64, Int64}, SparseArrays.SparseMatrixCSC{Float64, Int64}, LinearAlgebra.Adjoint{Float64, SparseArrays.SparseMatrixCSC{Float64, Int64}}, AlgebraicMultigrid.MultiLevelWorkspace{Vector{Float64}, 1}}, AlgebraicMultigrid.V})
       @ Circuitscape ~/.julia/packages/Circuitscape/XpftG/src/core.jl:616
     [2] macro expansion
       @ ./timing.jl:287 [inlined]
     [3] multiple_solve(s::Circuitscape.AMGSolver, matrix::SparseArrays.SparseMatrixCSC{Float64, Int64}, sources::Vector{Float64}, suppress_info::Bool)
       @ Circuitscape ~/.julia/packages/Circuitscape/XpftG/src/raster/advanced.jl:312
     [4] multiple_solver(cfg::Dict{String, String}, solver::Circuitscape.AMGSolver, a::SparseArrays.SparseMatrixCSC{Float64, Int64}, sources::Vector{Float64}, grounds::Vector{Float64}, finitegrounds::Vector{Float64})
       @ Circuitscape ~/.julia/packages/Circuitscape/XpftG/src/raster/advanced.jl:292
     [5] compute_omniscape_current(conductance::Matrix{Float64}, source::Matrix{Float64}, ground::Matrix{Float64}, cs_cfg::Dict{String, String})
       @ Circuitscape ~/.julia/packages/Circuitscape/XpftG/src/utils.jl:564
     [6] solve_target!(target::Omniscape.Target, int_arguments::Dict{String, Int64}, source_strength::Matrix{Union{Missing, Float64}}, resistance::Matrix{Union{Missing, Float64}}, os_flags::Omniscape.OmniscapeFlags, cs_cfg::Dict{String, String}, condition_layers::Omniscape.ConditionLayers{Float64, 2}, conditions::Omniscape.Conditions, correction_array::Matrix{Float64}, cum_currmap::Array{Float64, 3}, fp_cum_currmap::Array{Float64, 3}, precision::DataType)
       @ Omniscape ~/.julia/packages/Omniscape/9gHf2/src/utils.jl:332
     [7] macro expansion
       @ ~/.julia/packages/Omniscape/9gHf2/src/main.jl:264 [inlined]
     [8] (::Omniscape.var"#161#threadsfor_fun#11"{Int64, ProgressMeter.Progress, Int64, Dict{String, String}, Omniscape.ConditionLayers{Float64, 2}, Omniscape.Conditions, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}})(onethread::Bool)
       @ Omniscape ./threadingconstructs.jl:81
     [9] (::Omniscape.var"#161#threadsfor_fun#11"{Int64, ProgressMeter.Progress, Int64, Dict{String, String}, Omniscape.ConditionLayers{Float64, 2}, Omniscape.Conditions, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}})()
       @ Omniscape ./threadingconstructs.jl:48
```
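
For context, the failing assertion is a relative-residual convergence check on each moving window's linear solve. A minimal Julia sketch of the quantity being tested (names follow the stack trace; this is illustrative, not Circuitscape's exact code):

```julia
using LinearAlgebra, SparseArrays

# G    : sparse system (conductance) matrix for one window
# curr : right-hand side (injected current)
# v    : solution returned by the preconditioned CG solve
relative_residual(G::SparseMatrixCSC, v::AbstractVector, curr::AbstractVector) =
    norm(G * v .- curr) / norm(curr)

# Circuitscape asserts that this value is below 1e-6; the AssertionError above
# means the iterative solver stopped before reaching that tolerance.
```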

**How to reproduce (PLEASE ATTACH YOUR PROBLEM FILES)**
I have included the source and resistance layers and the .ini file in this folder. I reproduce the .ini text here for one of the species I'm running into trouble with, in case anyone spots any problems. I have to grant access to the folder individually, so please just request access.

```ini
[Input files]
resistance_file = /home/cdca-conn/data/muledeer/omniscape-inputs/update/SCENARIO1/md-scenario1-resistance.tif
source_file = /home/cdca-conn/data/muledeer/omniscape-inputs/update/SCENARIO1/md-scenario1-source-clipped.tif
[Options]
#resistance_file_is_conductance = false
block_size = 35
radius = 444
project_name = output/muledeer/omniscape/update/muledeer-scenario1
correct_artifacts = true
source_threshold = 0.03808912
solver = cg+amg
parallelize = true
parallel_batch_size = 10
```
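
For completeness, the run is launched from the Julia REPL with `run_omniscape`, as shown in the stack trace; the file name below is just a placeholder for the config above.

```julia
using Omniscape

# Run the moving-window analysis described by the .ini file above
# (substitute the actual path to your config file).
run_omniscape("md-scenario1.ini")
```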

Circuitscape and Julia version

Circuitscape v5.11.2
Omniscape v0.5.8
Julia Version 1.6.1
Commit 6aaedecc44 (2021-04-23 05:59 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: AMD EPYC 7763 64-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-11.0.1 (ORCJIT, generic)

Additional context
I have run several other models with slightly different footprints, all in the same general area, without issue.

@ptfreeman-csp
Author

@ranjanan -- out of curiosity, is there any way to "hack" the solver error threshold, if that is indeed what is throwing these errors? I'm not even sure how that error is arising, and maybe that's a stupid question, but I figured I would at least ask.

@ranjanan
Member

ranjanan commented Dec 27, 2022

@ptfreeman-csp there isn't a way to hack that solver error tolerance. It's in there to make sure the solver has converged to the required relative tolerance. Let me take a look at these files.

@ranjanan
Member

@ptfreeman-csp I've requested access to the files through my email.

@ptfreeman-csp
Author

Hi @ranjanan -- I was able to get Omniscape to run successfully using some older versions of Omniscape and its supporting packages, via the Project.toml and Manifest.toml from @gagecarto, who outlined his fix here. I will share those files with you along with the input files.

@ranjanan
Member

I see, thanks for the update @ptfreeman-csp. That solver check is in there so you know you can trust your answers. There are a lot of really ill-conditioned matrices in here, so it's always good to know that the linear solver has converged.
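
To get a feel for how ill-conditioned such a matrix can be, here is an illustrative sketch using a toy grounded resistor chain with a huge conductance contrast as a stand-in for one window's system matrix; `cond` densifies the matrix, so this is only practical for small examples.

```julia
using LinearAlgebra, SparseArrays

# Toy stand-in for one window's system matrix: a 1-D resistor chain with a
# huge conductance contrast, grounded at node 1 so the matrix is nonsingular.
c = vcat(fill(1.0, 25), fill(1e9, 25))   # edge conductances
d = vcat(c, 0.0) .+ vcat(0.0, c)         # weighted node degrees
G = spdiagm(-1 => -c, 0 => d, 1 => -c)   # graph Laplacian of the chain
G[1, 1] += 1.0                           # ground node 1

# A very large condition number is what makes the CG+AMG solve struggle to
# reach the 1e-6 relative-residual tolerance.
@show cond(Matrix(G))                    # dense conversion: small matrices only
```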

@ranjanan
Member

I'm going to keep this open, try to isolate one of the failing linear solves, and then study it.

@ptfreeman-csp
Author

ptfreeman-csp commented Dec 28, 2022 via email

@ranjanan
Member

In theory, yes. Although the version you said things worked on also has a check for convergence, and you're saying it worked, so this could be a less concerning bug somewhere else. It would be easier for me if you could reproduce this error with a standalone Circuitscape advanced-mode problem, though. Is that easy to do on your end?
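
For reference, a standalone advanced-mode problem is just a Circuitscape .ini passed to `compute`; the file name below is hypothetical, and the .ini would set `scenario = advanced` and point at the failing window's resistance raster plus explicit source and ground rasters.

```julia
using Circuitscape

# Hypothetical standalone reproduction of one failing window
# (the .ini must be written separately with scenario = advanced).
compute("advanced_problem.ini")
```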

@ranjanan
Member

This problem runs fine as of 5.11.2

@ptfreeman-csp
Author

I don't understand how that can be possible; perhaps it's the combination of dependencies in my Julia environment.

@mir123

mir123 commented Jan 16, 2023

> This problem runs fine as of 5.11.2

Does this mean running Omniscape with Circuitscape version 5.11.2 (or maybe you mean 5.12.2) fixes it? The current release of Omniscape uses 5.11.

@ptfreeman-csp
Author

ptfreeman-csp commented Jan 16, 2023 via email

@mir123

mir123 commented Jan 23, 2023

> This problem runs fine as of 5.11.2

@ranjanan can you confirm you used Circuitscape 5.12.2?

@mir123

mir123 commented Jan 30, 2023

Can confirm this still happens with Circuitscape 5.12.2 (this time at "100%" completion, with 9 minutes to go on a 3-day run).

```
ERROR: TaskFailedException
Stacktrace:
 [1] wait
   @ ./task.jl:345 [inlined]
 [2] threading_run(fun::Omniscape.var"#161#threadsfor_fun#12"{Omniscape.var"#161#threadsfor_fun#11#13"{Int64, ProgressMeter.Progress, Int64, Dict{String, String}, Omniscape.ConditionLayers{Float64, 2}, Omniscape.Conditions, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}}}, static::Bool)
   @ Base.Threads ./threadingconstructs.jl:38
 [3] macro expansion
   @ ./threadingconstructs.jl:89 [inlined]
 [4] run_omniscape(cfg::Dict{String, String}, resistance::Matrix{Union{Missing, Float64}}; reclass_table::Matrix{Union{Missing, Float64}}, source_strength::Matrix{Union{Missing, Float64}}, condition1::Matrix{Union{Missing, Float64}}, condition2::Matrix{Union{Missing, Float64}}, condition1_future::Matrix{Union{Missing, Float64}}, condition2_future::Matrix{Union{Missing, Float64}}, wkt::String, geotransform::Vector{Float64}, write_outputs::Bool)
   @ Omniscape ~/.julia/packages/Omniscape/9gHf2/src/main.jl:257
 [5] run_omniscape(path::String)
   @ Omniscape ~/.julia/packages/Omniscape/9gHf2/src/main.jl:536
 [6] top-level scope
   @ REPL[2]:1

 nested task error: AssertionError: norm(G * v .- curr) / norm(curr) < 1.0e-6
    Stacktrace:
     [1] macro expansion
       @ ~/.julia/packages/Omniscape/9gHf2/src/main.jl:278 [inlined]
     [2] (::Omniscape.var"#161#threadsfor_fun#12"{Omniscape.var"#161#threadsfor_fun#11#13"{Int64, ProgressMeter.Progress, Int64, Dict{String, String}, Omniscape.ConditionLayers{Float64, 2}, Omniscape.Conditions, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}}})(tid::Int64; onethread::Bool)
       @ Omniscape ./threadingconstructs.jl:84
     [3] #161#threadsfor_fun
       @ ./threadingconstructs.jl:51 [inlined]
     [4] (::Base.Threads.var"#1#2"{Omniscape.var"#161#threadsfor_fun#12"{Omniscape.var"#161#threadsfor_fun#11#13"{Int64, ProgressMeter.Progress, Int64, Dict{String, String}, Omniscape.ConditionLayers{Float64, 2}, Omniscape.Conditions, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}}}, Int64})()
       @ Base.Threads ./threadingconstructs.jl:30

    caused by: AssertionError: norm(G * v .- curr) / norm(curr) < 1.0e-6
    Stacktrace:
      [1] solve_linear_system(G::SparseArrays.SparseMatrixCSC{Float64, Int64}, curr::Vector{Float64}, M::AlgebraicMultigrid.Preconditioner{AlgebraicMultigrid.MultiLevel{AlgebraicMultigrid.Pinv{Float64}, AlgebraicMultigrid.GaussSeidel{AlgebraicMultigrid.SymmetricSweep}, AlgebraicMultigrid.GaussSeidel{AlgebraicMultigrid.SymmetricSweep}, SparseArrays.SparseMatrixCSC{Float64, Int64}, SparseArrays.SparseMatrixCSC{Float64, Int64}, LinearAlgebra.Adjoint{Float64, SparseArrays.SparseMatrixCSC{Float64, Int64}}, AlgebraicMultigrid.MultiLevelWorkspace{Vector{Float64}, 1}}, AlgebraicMultigrid.V})
        @ Circuitscape ~/.julia/packages/Circuitscape/HNOrX/src/core.jl:613
      [2] macro expansion
        @ ./timing.jl:382 [inlined]
      [3] multiple_solve(s::Circuitscape.AMGSolver, matrix::SparseArrays.SparseMatrixCSC{Float64, Int64}, sources::Vector{Float64}, suppress_info::Bool)
        @ Circuitscape ~/.julia/packages/Circuitscape/HNOrX/src/raster/advanced.jl:311
      [4] multiple_solver(cfg::Dict{String, String}, solver::Circuitscape.AMGSolver, a::SparseArrays.SparseMatrixCSC{Float64, Int64}, sources::Vector{Float64}, grounds::Vector{Float64}, finitegrounds::Vector{Float64})
        @ Circuitscape ~/.julia/packages/Circuitscape/HNOrX/src/raster/advanced.jl:291
      [5] compute_omniscape_current(conductance::Matrix{Float64}, source::Matrix{Float64}, ground::Matrix{Float64}, cs_cfg::Dict{String, String})
        @ Circuitscape ~/.julia/packages/Circuitscape/HNOrX/src/utils.jl:529
      [6] solve_target!(target::Omniscape.Target, int_arguments::Dict{String, Int64}, source_strength::Matrix{Union{Missing, Float64}}, resistance::Matrix{Union{Missing, Float64}}, os_flags::Omniscape.OmniscapeFlags, cs_cfg::Dict{String, String}, condition_layers::Omniscape.ConditionLayers{Float64, 2}, conditions::Omniscape.Conditions, correction_array::Matrix{Float64}, cum_currmap::Array{Float64, 3}, fp_cum_currmap::Array{Float64, 3}, precision::DataType)
        @ Omniscape ~/.julia/packages/Omniscape/9gHf2/src/utils.jl:332
      [7] macro expansion
        @ ~/.julia/packages/Omniscape/9gHf2/src/main.jl:264 [inlined]
      [8] (::Omniscape.var"#161#threadsfor_fun#12"{Omniscape.var"#161#threadsfor_fun#11#13"{Int64, ProgressMeter.Progress, Int64, Dict{String, String}, Omniscape.ConditionLayers{Float64, 2}, Omniscape.Conditions, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}}})(tid::Int64; onethread::Bool)
        @ Omniscape ./threadingconstructs.jl:84
      [9] #161#threadsfor_fun
        @ ./threadingconstructs.jl:51 [inlined]
     [10] (::Base.Threads.var"#1#2"{Omniscape.var"#161#threadsfor_fun#12"{Omniscape.var"#161#threadsfor_fun#11#13"{Int64, ProgressMeter.Progress, Int64, Dict{String, String}, Omniscape.ConditionLayers{Float64, 2}, Omniscape.Conditions, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}}}, Int64})()
        @ Base.Threads ./threadingconstructs.jl:30
```

@vlandau
Member

vlandau commented Jan 30, 2023

@ranjanan when the assertion fails, could this just throw a warning stating that results may be inaccurate, instead of throwing an error?
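
A minimal sketch of that proposal, purely hypothetical and not necessarily the change that was eventually made: compute the relative residual and warn rather than assert.

```julia
using LinearAlgebra

# Hypothetical: warn instead of erroring when the solve misses the tolerance.
# G, v, and curr are the same quantities named in the assertion above.
function check_residual(G, v, curr; rtol = 1e-6)
    relres = norm(G * v .- curr) / norm(curr)
    if relres >= rtol
        @warn "Linear solve did not reach the requested tolerance; results may be inaccurate" relres
    end
    return relres
end
```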

@gravary

gravary commented Mar 7, 2023

I am still having this issue. If at all possible, I would prefer not to use the older-environment fix described in this thread: https://github.com/Circuitscape/Omniscape.jl/issues/127

@slamander

I'm also still having this issue, but unlike the cases above, my moving-window failures do not occur on NA/NaN or correspondingly high-resistance cells. Have there been any developments? (BTW, I'm using Omniscape 0.5.8.)

@clescoat

I am also having the issue in Circuitscape (not using Omniscape) when I set the precision to "single".

Circuitscape v5.13.1 / Julia v1.9.3

@BortEdwards

I have just run into this same issue (Julia 1.9.4) on runs with 25- or 50-mile windows, using resistance surfaces that work fine with smaller windows. I tinkered with small changes in block size, hoping they might nudge the windows past some problematic pixel arrangement, but to no effect. I am not relishing the idea of having to rework a painfully arrived-at series of window/block cell sizes. Hopefully I can identify the problem cells and try a method from another thread to fix them individually...
I appreciate everyone's work, but it is frustrating that there has been no fix :/

@vlandau
Member

vlandau commented Feb 17, 2024

For future reference, if anyone is coming across this issue because they're getting the same error, make sure you're using Circuitscape 5.13.3 or greater (to check your version of Circuitscape, enter the package prompt by typing `]` into the Julia REPL, then run `status Circuitscape` to get version info). If it still isn't working after upgrading, open a new issue.
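
The equivalent with the Pkg API, for anyone scripting it (`Pkg.status` and `Pkg.update` accept a package name):

```julia
using Pkg

Pkg.status("Circuitscape")   # prints the installed Circuitscape version
Pkg.update("Circuitscape")   # upgrade if it is older than 5.13.3
```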

@Circuitscape locked this issue as resolved and limited conversation to collaborators on Feb 17, 2024