MPI error with kernel_launching.jl
#3981
francispoulin
started this conversation in
Experimental features
Weird, I ran it on main and I don't see any such problems:
(base) simonesilvestri@Simones-MacBook-Pro Oceananigans.jl % mpiexecjl -np 4 julia --project validation/distributed_simulations/distributed_nonhydrostatic_turbulence.jl
[ Info: MPI has not been initialized, so we are calling MPI.Init().
[ Info: MPI has not been initialized, so we are calling MPI.Init().
[ Info: MPI has not been initialized, so we are calling MPI.Init().
[ Info: MPI has not been initialized, so we are calling MPI.Init().
grid = 64×256×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Periodic, Flat} on Distributed{CPU} with 3×3×0 halo
├── FullyConnected x ∈ [3.14159, 4.71239) regularly spaced with Δx=0.0245437
├── Periodic y ∈ [-1.91418e-18, 6.28319) regularly spaced with Δy=0.0245437
└── Flat z
grid = 64×256×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Periodic, Flat} on Distributed{CPU} with 3×3×0 halo
├── FullyConnected x ∈ [4.71239, 6.28319) regularly spaced with Δx=0.0245437
├── Periodic y ∈ [-1.91418e-18, 6.28319) regularly spaced with Δy=0.0245437
└── Flat z
grid = 64×256×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Periodic, Flat} on Distributed{CPU} with 3×3×0 halo
├── FullyConnected x ∈ [2.41353e-18, 1.5708) regularly spaced with Δx=0.0245437
├── Periodic y ∈ [-1.91418e-18, 6.28319) regularly spaced with Δy=0.0245437
└── Flat z
grid = 64×256×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Periodic, Flat} on Distributed{CPU} with 3×3×0 halo
├── FullyConnected x ∈ [1.5708, 3.14159) regularly spaced with Δx=0.0245437
├── Periodic y ∈ [-1.91418e-18, 6.28319) regularly spaced with Δy=0.0245437
└── Flat z
[ Info: Initializing simulation...
[ Info: Initializing simulation...
[ Info: Initializing simulation...
[ Info: Initializing simulation...
[ Info: Iteration: 0, time: 0 seconds
[ Info: Rank 1: max|ζ|: 7.58e+01, max(e): 2.33e-01
[ Info: Rank 3: max|ζ|: 7.49e+01, max(e): 2.17e-01
[ Info: Rank 0: max|ζ|: 7.60e+01, max(e): 2.39e-01
[ Info: Rank 2: max|ζ|: 7.54e+01, max(e): 2.52e-01
[ Info: ... simulation initialization complete (14.299 seconds)
[ Info: ... simulation initialization complete (14.139 seconds)
[ Info: Executing initial time step...
[ Info: Executing initial time step...
[ Info: ... simulation initialization complete (14.232 seconds)
[ Info: Executing initial time step...
[ Info: ... simulation initialization complete (14.290 seconds)
[ Info: Executing initial time step...
[ Info: ... initial time step complete (5.912 seconds).
[ Info: ... initial time step complete (5.913 seconds).
[ Info: ... initial time step complete (5.913 seconds).
[ Info: ... initial time step complete (5.883 seconds).
[ Info: Iteration: 10, time: 100.000 ms
[ Info: Rank 2: max|ζ|: 4.50e+01, max(e): 9.66e-02
[ Info: Rank 3: max|ζ|: 4.45e+01, max(e): 9.25e-02
[ Info: Rank 0: max|ζ|: 4.28e+01, max(e): 9.77e-02
[ Info: Rank 1: max|ζ|: 4.48e+01, max(e): 9.63e-02
[ Info: Iteration: 20, time: 190.000 ms
[ Info: Rank 3: max|ζ|: 3.31e+01, max(e): 7.02e-02
[ Info: Rank 1: max|ζ|: 3.33e+01, max(e): 7.47e-02
[ Info: Rank 2: max|ζ|: 3.47e+01, max(e): 6.17e-02
[ Info: Rank 0: max|ζ|: 3.41e+01, max(e): 7.20e-02
[ Info: Iteration: 30, time: 280.000 ms
...
[ Info: Simulation is stopping after running for 47.977 seconds.
[ Info: Simulation is stopping after running for 47.941 seconds.
[ Info: Model iteration 1000 equals or exceeds stop iteration 1000.
[ Info: Simulation is stopping after running for 47.910 seconds.
[ Info: Model iteration 1000 equals or exceeds stop iteration 1000.
[ Info: Model iteration 1000 equals or exceeds stop iteration 1000.
[ Info: Simulation is stopping after running for 47.821 seconds.
[ Info: Model iteration 1000 equals or exceeds stop iteration 1000.
[ Info: Iteration: 1000, time: 9.170 seconds
[ Info: Rank 0: max|ζ|: 3.39e+00, max(e): 8.75e-03
[ Info: Rank 1: max|ζ|: 3.44e+00, max(e): 7.93e-03
[ Info: Rank 2: max|ζ|: 3.63e+00, max(e): 9.52e-03
[ Info: Rank 3: max|ζ|: 3.65e+00, max(e): 7.15e-03
(base) simonesilvestri@Simones-MacBook-Pro Oceananigans.jl %
Is your MPI configured correctly? Are you maybe using an old version of Oceananigans?
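To answer the version question quickly, something like the following could be run with `julia --project` from the same directory as the failing script. It is only a sketch: the helper name `installed_version` is made up for illustration, and it uses nothing beyond the `Pkg` standard library. It reports which versions of Oceananigans and MPI.jl the active environment resolves to, so the two setups can be compared.

```julia
using Pkg

# Hypothetical helper (not part of Oceananigans or MPI.jl): return the version
# of `name` in the active environment, or `nothing` if it is not found there
# (or has no recorded version).
function installed_version(name::AbstractString)
    # When running from inside a package checkout with `--project`,
    # the package is the active project itself rather than a dependency.
    proj = Pkg.project()
    proj.name == name && return proj.version

    # Otherwise look through the resolved dependencies of the environment.
    for (_, info) in Pkg.dependencies()
        info.name == name && return info.version
    end
    return nothing
end

for name in ("Oceananigans", "MPI")
    v = installed_version(name)
    println(rpad(name, 16), v === nothing ? "not found in this environment" : string("v", v))
end

# For the MPI configuration itself, recent MPI.jl releases should also provide
# `MPI.versioninfo()` (after `using MPI`), which prints the MPI library in use.
```

If the reported Oceananigans version is well behind main, updating the environment before re-running the script would be the first thing to try.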
We (@jakob-braga and @francispoulin) are trying to run distributed_nonhydrostatic_turbulence.jl and are getting an error. We have tried two different servers, and on both the error comes from kernel_launching.jl. @glwagner @simone-silvestri?