[WIP] Offload DG method to GPUs #1485

Draft: wants to merge 96 commits into main from dg_gpu_port

Commits (96)
096b860
Initial; Added KernelAbstractions dependency
jkravs May 24, 2023
61adfd3
Add bad initial gpu offloading for volume integral
jkravs May 29, 2023
3aa6897
Add possibility to test correctness using CI
jkravs May 29, 2023
92b9e68
Merge branch 'main' into dg_gpu_port
jkravs May 29, 2023
fabd803
Fixed data race
jkravs Jun 6, 2023
bf5102c
RWTH cluster CI added
jkravs Jun 6, 2023
5f14874
Merge branch 'main' into dg_gpu_port
jkravs Jun 6, 2023
4a13bde
Fixed typo in yml key
jkravs Jun 6, 2023
22cf6e4
Change Backend with trixi_include
jkravs Jun 7, 2023
92c9f34
Fixed gitlab ci
jkravs Jun 7, 2023
35f4926
Removed after scripts
jkravs Jun 7, 2023
a05ed8a
Install OrdinaryDiffEq in CI
jkravs Jun 7, 2023
a848f72
Fixed ODE version because of dependency breakage
jkravs Jun 7, 2023
fdf3931
Added KernelAbstractions dependency in CI
jkravs Jun 7, 2023
54d9fc7
Fixed crash of default cpu computation
jkravs Jun 7, 2023
48c31b8
Removing workgroupsize autotunes kernels in CUDA.jl
jkravs Jun 12, 2023
651d26d
Initial interface flux calculation offloaded
jkravs Jun 13, 2023
9b0bc7e
Merge branch 'main' into dg_gpu_port
jkravs Jun 13, 2023
ad2763f
Fixed CUDA offloading bugs
jkravs Jun 14, 2023
79c5286
Fixed scalar indexing issue during interface init
jkravs Jun 14, 2023
e79d8bb
Julia 1.9 test pipeline
jkravs Jun 19, 2023
9e3db29
Fixed invalid yaml file
jkravs Jun 19, 2023
eeb04d2
Added Extensions to better deal with missing KA API calls
jkravs Jun 20, 2023
7e48fb7
Removed runtime downcasting bottleneck
jkravs Jun 20, 2023
1b3ac6c
Merge branch 'main' into dg_gpu_port
jkravs Jun 20, 2023
b4dd4da
Boundary GPU ported
jkravs Jun 21, 2023
382712f
Fixed issues with zero element arrays and CUDA.jl
jkravs Jun 28, 2023
ee93f61
Prolong2mortars on gpu
jkravs Jun 28, 2023
baf75e2
Dummy for mortar calc
jkravs Jun 28, 2023
6b7a738
Surface integral computation offloaded
jkravs Jun 28, 2023
39cbf80
apply_jacobian offloaded
jkravs Jun 30, 2023
58d165e
calc_sources offloaded
jkravs Jun 30, 2023
ba1a530
Merge branch 'main' into dg_gpu_port
jkravs Jun 30, 2023
f27bb8e
Added suggestions
jkravs Jul 5, 2023
b075711
Initialize derivative_dhat on GPU
jkravs Jul 5, 2023
6df94be
surface_flux_values initialized on gpu
jkravs Jul 5, 2023
8fc47fe
boundary_interpolation initialized on gpu
jkravs Jul 5, 2023
3eb35d0
initialize inverse_jacobian on gpu
jkravs Jul 5, 2023
5350e78
Fixed init of derivative matrix
jkravs Jul 5, 2023
7e23af5
Fixed scalar indexing issue
jkravs Jul 5, 2023
530d10a
Merge branch 'main' into dg_gpu_port
jkravs Jul 5, 2023
c39790b
u/du initialized on gpu
jkravs Jul 12, 2023
7cd2045
Merge branch 'main' into dg_gpu_port
jkravs Jul 12, 2023
8a1de82
Better compatibility of StrideArrays and KA
jkravs Jul 17, 2023
31915a7
Fix scalar indexing on test
jkravs Jul 17, 2023
9a81463
Removed unnecessary allocations in rhs_gpu
jkravs Jul 17, 2023
dce23a9
Merge branch 'main' into dg_gpu_port
jkravs Jul 17, 2023
9a5a8c4
fixed scalar indexing issues on analysis callback
jkravs Jul 17, 2023
be39bb8
Removed allowed scalar indexing because of performance bottleneck
jkravs Jul 18, 2023
35843bf
Benchmark CI
jkravs Jul 18, 2023
19ba7c8
Merge branch 'main' into dg_gpu_port
jkravs Jul 18, 2023
dd65936
Display benchmark results
jkravs Jul 18, 2023
14db899
P4est advection basic testing
jkravs Jul 18, 2023
1c17e35
Replace Symbols with Int
jkravs Jul 25, 2023
651ef17
Merge branch 'main' into dg_gpu_port
jkravs Jul 25, 2023
ce44e4f
Typo
jkravs Jul 25, 2023
c243726
Change Ints to Index Enum
jkravs Jul 25, 2023
92736d8
Changed all Indices Symbols to Enum
jkravs Jul 25, 2023
f877a12
Separate internal loop for prolong2interfaces
jkravs Jul 26, 2023
184cf37
Initial CPU offloading possible
jkravs Jul 26, 2023
d428557
Merge branch 'main' into dg_gpu_port
jkravs Jul 26, 2023
1edb32c
Test p4est elixir in CI
jkravs Jul 26, 2023
93daf64
Separated internal loop in calc_interface_flux
jkravs Aug 1, 2023
0459fdf
calc_interface_flux offloaded to cpu
jkravs Aug 1, 2023
44343de
Merge branch 'main' into dg_gpu_port
jkravs Aug 1, 2023
a3cd9d4
Separated shared loop of p4est weak form kernel
jkravs Aug 1, 2023
9d8c156
CPU offloading of p4est weak form kernel
jkravs Aug 1, 2023
c2eec36
Separated internal loop of surface integral calc
jkravs Aug 2, 2023
2084f33
Add CPU Offloading of surface integral calc
jkravs Aug 2, 2023
8fb7a3e
Longer CI Timeout
jkravs Aug 9, 2023
21dfe5a
apply_jacobian offloaded for p4est meshes
jkravs Aug 9, 2023
6eb5933
P4est advection basic on GPU
jkravs Aug 9, 2023
d81fe1a
CI Test of p4est on GPU
jkravs Aug 9, 2023
85efe0a
Init interfaces.u on GPU
jkravs Aug 9, 2023
d3b5e09
Merge branch 'main' into dg_gpu_port
jkravs Aug 9, 2023
38ea4bc
Init data from interface container
jkravs Aug 9, 2023
2c6443e
Data from element container init on GPU
jkravs Aug 10, 2023
bd2696d
Removed scalar indexing with dg init on GPU
jkravs Aug 10, 2023
ad1b7b6
Merge branch 'main' into dg_gpu_port
jkravs Aug 10, 2023
8442991
New elixirs
jkravs Sep 6, 2023
2186fd4
Remove all symbols
jkravs Sep 6, 2023
1164e4c
Merge branch 'main' into dg_gpu_port
jkravs Sep 6, 2023
a4fb500
reset du offload
jkravs Sep 6, 2023
3cfc529
Flux differencing kernel offload
jkravs Sep 6, 2023
369518f
prolong2interfaces offloaded
jkravs Sep 6, 2023
50a3ba0
calc_interfaces in 3d p4est offloaded
jkravs Sep 6, 2023
0c1ea4b
dummy functions for p4est 3d boundaries/mortars
jkravs Sep 6, 2023
6e89971
surface_integral p4est 3d offload
jkravs Sep 6, 2023
4ea72b1
apply jacobian p4est gpu offload
jkravs Sep 6, 2023
a597b10
calc source terms 3d offload
jkravs Sep 6, 2023
2ddb8d6
Merge branch 'main' into dg_gpu_port
jkravs Sep 6, 2023
3022e5c
elixir_advection_basic_fd offloaded to gpu
jkravs Sep 6, 2023
74b61a1
p4est euler taylor green vortex elixir offloaded to gpu
jkravs Sep 6, 2023
8e69091
Reduced memory usage of calculate dt
jkravs Sep 13, 2023
a880c16
Merge branch 'main' into dg_gpu_port
jkravs Nov 27, 2023
c7df644
Merge branch 'main' into dg_gpu_port
jkravs Nov 29, 2023
Files changed
21 changes: 21 additions & 0 deletions .gitlab-ci.yml
@@ -0,0 +1,21 @@
stages:
  - test

.trigger-template:
  stage: test
  trigger:
    include: /.test-ci.yml
    strategy: depend
    forward:
      yaml_variables: true

julia-1.8-test:
  extends: .trigger-template
  allow_failure: true
  variables:
    JULIA_EXEC: "julia-1.8"

julia-1.9-test:
  extends: .trigger-template
  variables:
    JULIA_EXEC: "julia-1.9"
84 changes: 84 additions & 0 deletions .test-ci.yml
@@ -0,0 +1,84 @@
stages:
  - precompile
  - test
  - benchmark

default:
  tags: [ "downscope" ]

.julia-job:
  variables:
    SLURM_PARAM_ACCOUNT: "-A thes1464"
    SLURM_PARAM_TASKS: "-n 1"
    SLURM_PARAM_CPUS: "--cpus-per-task=24"
    SLURM_PARAM_TIME: "-t 10:00:00"
  before_script:
    - source /work/co693196/MA/julia.sh

precompile-job:
  extends: .julia-job
  stage: precompile
  script:
    - mkdir run
    - cd run
    - $JULIA_EXEC --project="." -e 'using Pkg; Pkg.develop(PackageSpec(path=".."))'

.test-job:
  extends: .julia-job
  stage: test
  before_script:
    - source /work/co693196/MA/julia.sh
    - mkdir run
    - cd run
    - $JULIA_EXEC --project="." -e 'using Pkg; Pkg.add(["OrdinaryDiffEq", "KernelAbstractions"]); Pkg.develop(PackageSpec(path=".."));'

.benchmark-job:
  extends: .julia-job
  stage: benchmark
  before_script:
    - source /work/co693196/MA/julia.sh
    - mkdir run
    - cd run
    - $JULIA_EXEC --project="." -e 'using Pkg; Pkg.add(["OrdinaryDiffEq", "KernelAbstractions", "BenchmarkTools"]); Pkg.develop(PackageSpec(path=".."));'

cpu-test-job:
  extends: .test-job
  script:
    - $JULIA_EXEC --project="." --threads=24 -e 'using Trixi; trixi_include(pkgdir(Trixi, "test", "test_tree_2d_advection.jl"), offload=false)'
    - $JULIA_EXEC --project="." --threads=24 -e 'using Trixi; trixi_include(pkgdir(Trixi, "test", "test_p4est_2d.jl"), offload=false)'

cpu-offload-test-job:
  extends: .test-job
  script:
    - $JULIA_EXEC --project="." --threads=24 -e 'using Trixi; trixi_include(pkgdir(Trixi, "test", "test_tree_2d_advection.jl"), offload=true)'
    - $JULIA_EXEC --project="." --threads=24 -e 'using Trixi; trixi_include(pkgdir(Trixi, "test", "test_p4est_2d.jl"), offload=true)'

gpu-offload-test-job:
  extends: .test-job
  variables:
    SLURM_PARAM_GPUS: "--gres=gpu:volta:1"
    SLURM_PARAM_PARTITION: "--partition=c18g"
  script:
    - $JULIA_EXEC --project="." --threads=24 -e 'using Pkg; Pkg.add("CUDA")'
    - $JULIA_EXEC --project="." --threads=24 -e 'using Trixi, CUDA; using CUDA.CUDAKernels; trixi_include(pkgdir(Trixi, "test", "test_tree_2d_advection.jl"), offload=true, backend=CUDABackend())'
    - $JULIA_EXEC --project="." --threads=24 -e 'using Trixi, CUDA; using CUDA.CUDAKernels; trixi_include(pkgdir(Trixi, "test", "test_p4est_2d.jl"), offload=true, backend=CUDABackend())'

cpu-benchmark-job:
  extends: .benchmark-job
  script:
    - $JULIA_EXEC --project="." --threads=24 -e 'using Trixi, BenchmarkTools; show(stderr, "text/plain", @benchmark trixi_include($joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_basic.jl"), offload=false))' 1> /dev/null

cpu-offload-benchmark-job:
  extends: .benchmark-job
  script:
    - $JULIA_EXEC --project="." --threads=24 -e 'using Trixi, BenchmarkTools; show(stderr, "text/plain", @benchmark trixi_include($joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_basic.jl"), offload=true))' 1> /dev/null

gpu-offload-benchmark-job:
  extends: .benchmark-job
  variables:
    SLURM_PARAM_GPUS: "--gres=gpu:volta:1"
    SLURM_PARAM_PARTITION: "--partition=c18g"
  script:
    - $JULIA_EXEC --project="." --threads=24 -e 'using Pkg; Pkg.add("CUDA")'
    - $JULIA_EXEC --project="." --threads=24 -e 'using Trixi, CUDA, CUDA.CUDAKernels, BenchmarkTools; show(stderr, "text/plain", @benchmark trixi_include($joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_basic.jl"), offload=true, backend=CUDABackend()))' 1> /dev/null
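The benchmark jobs write the BenchmarkTools table to stderr and discard program stdout. The same measurement should be reproducible outside CI; a minimal sketch, assuming a Julia project with Trixi.jl and BenchmarkTools.jl installed:

using Trixi, BenchmarkTools

# Elixir benchmarked by the CI jobs above
elixir = joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_basic.jl")

# Interpolating `$elixir` keeps the path construction outside the timed region,
# so @benchmark measures only the trixi_include call
result = @benchmark trixi_include($elixir, offload = false)
show(stdout, "text/plain", result)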
14 changes: 14 additions & 0 deletions Project.toml
@@ -10,8 +10,10 @@ DiffEqCallbacks = "459566f4-90b8-5000-8ac3-15dfb0a30def"
EllipsisNotation = "da5c29d0-fa7d-589e-88eb-ea29b0a81949"
FillArrays = "1a297f60-69ca-5386-bcde-b61e274b549b"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
GPUArrays = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7"
Review comment (Member): Change to GPUArraysCore.jl (see discussion on Julia Slack). A sketch of the suggested swap follows this file's diff.

HDF5 = "f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f"
IfElse = "615f187c-cbe4-4ef1-ba3b-2fcf58d6d173"
KernelAbstractions = "63c18a36-062a-441e-b654-da1e3ab1ce7c"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
LinearMaps = "7a12625a-238d-50fd-b39a-03d52299707e"
LoopVectorization = "bdcacae8-1622-11e9-2a5c-532679323890"
@@ -44,10 +46,18 @@ TriplotBase = "981d1d27-644d-49a2-9326-4793e63143c3"
TriplotRecipes = "808ab39a-a642-4abf-81ff-4cb34ebbffa3"

[weakdeps]
AMDGPU = "21141c5a-9bdb-4563-92ae-f87d6854732e"
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
Metal = "dde4c033-4e86-420c-a63e-0dd931031962"
oneAPI = "8f75cd03-7ff8-4ecb-9b8f-daf728133b1b"

[extensions]
TrixiAMDGPUExt = "AMDGPU"
TrixiCUDAExt = "CUDA"
TrixiMakieExt = "Makie"
TrixiMetalExt = "Metal"
TrixiOneAPIExt = "oneAPI"

[compat]
CodeTracking = "1.0.5"
@@ -92,4 +102,8 @@ TriplotRecipes = "0.1"
julia = "1.8"

[extras]
AMDGPU = "21141c5a-9bdb-4563-92ae-f87d6854732e"
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
Metal = "dde4c033-4e86-420c-a63e-0dd931031962"
oneAPI = "8f75cd03-7ff8-4ecb-9b8f-daf728133b1b"
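Acting on the review comment above would swap the GPUArrays entry in [deps] for its lightweight core package; a sketch of the resulting line (the GPUArraysCore UUID is quoted from memory here and should be verified against the General registry before use):

# In [deps], replacing the GPUArrays entry; UUID unverified
GPUArraysCore = "46192b85-c4d5-4398-a991-12ede77f4527"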
9 changes: 5 additions & 4 deletions examples/p4est_2d_dgsem/elixir_advection_basic.jl
@@ -3,15 +3,17 @@

using OrdinaryDiffEq
using Trixi
using KernelAbstractions

###############################################################################
# semidiscretization of the linear advection equation

advection_velocity = (0.2, -0.7)
equations = LinearScalarAdvectionEquation2D(advection_velocity)
backend = CPU()

# Create DG solver with polynomial degree = 3 and (local) Lax-Friedrichs/Rusanov flux as surface flux
solver = DGSEM(polydeg = 3, surface_flux = flux_lax_friedrichs)
solver = DGSEM(polydeg = 3, surface_flux = flux_lax_friedrichs, backend=backend)

coordinates_min = (-1.0, -1.0) # minimum coordinates (min(x), min(y))
coordinates_max = (1.0, 1.0) # maximum coordinates (max(x), max(y))
@@ -24,14 +26,13 @@ mesh = P4estMesh(trees_per_dimension, polydeg = 3,
                 initial_refinement_level = 1)

# A semidiscretization collects data structures and functions for the spatial discretization
semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition_convergence_test,
                                    solver)
semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition_convergence_test, solver; backend=backend)

###############################################################################
# ODE solvers, callbacks etc.

# Create ODE problem with time span from 0.0 to 1.0
ode = semidiscretize(semi, (0.0, 1.0));
ode = semidiscretize(semi, (0.0, 1.0); offload=false, backend=backend);

# At the beginning of the main loop, the SummaryCallback prints a summary of the simulation setup
# and resets the timers
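The GPU CI jobs above suggest that an elixir's top-level `offload` and `backend` variables can be overridden via `trixi_include` instead of editing the file; a minimal sketch, assuming CUDA.jl is installed and a CUDA-capable device is available:

using Trixi, CUDA
using CUDA.CUDAKernels  # provides CUDABackend

# Re-run the elixir above with its `offload` and `backend` variables overridden,
# mirroring the gpu-offload-test-job in .test-ci.yml
trixi_include(joinpath(examples_dir(), "p4est_2d_dgsem", "elixir_advection_basic.jl"),
              offload = true, backend = CUDABackend())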
63 changes: 63 additions & 0 deletions examples/p4est_3d_dgsem/elixir_advection_basic_fd.jl
@@ -0,0 +1,63 @@
using OrdinaryDiffEq
using Trixi
using KernelAbstractions

###############################################################################
# semidiscretization of the linear advection equation

backend = CPU()

advection_velocity = (0.2, -0.7, 0.5)
equations = LinearScalarAdvectionEquation3D(advection_velocity)

# Create DG solver with polynomial degree = 3 and (local) Lax-Friedrichs/Rusanov flux as surface flux
solver = DGSEM(polydeg=3, surface_flux=flux_lax_friedrichs,
               volume_integral=VolumeIntegralFluxDifferencing(flux_lax_friedrichs), backend=backend)

coordinates_min = (-1.0, -1.0, -1.0) # minimum coordinates (min(x), min(y), min(z))
coordinates_max = ( 1.0, 1.0, 1.0) # maximum coordinates (max(x), max(y), max(z))

# Create P4estMesh with 8 x 8 x 8 elements (note `refinement_level=1`)
trees_per_dimension = (4, 4, 4)
mesh = P4estMesh(trees_per_dimension, polydeg=1,
                 coordinates_min=coordinates_min, coordinates_max=coordinates_max,
                 initial_refinement_level=1)

# A semidiscretization collects data structures and functions for the spatial discretization
semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition_convergence_test, solver; backend=backend)

###############################################################################
# ODE solvers, callbacks etc.

# Create ODE problem with time span from 0.0 to 1.0
tspan = (0.0, 1.0)
ode = semidiscretize(semi, tspan; offload=false, backend=backend)

# At the beginning of the main loop, the SummaryCallback prints a summary of the simulation setup
# and resets the timers
summary_callback = SummaryCallback()

# The AnalysisCallback allows to analyse the solution in regular intervals and prints the results
analysis_callback = AnalysisCallback(semi, interval=100)

# The SaveSolutionCallback allows to save the solution to a file in regular intervals
save_solution = SaveSolutionCallback(interval=100,
                                     solution_variables=cons2prim)

# The StepsizeCallback handles the re-calculation of the maximum Δt after each time step
stepsize_callback = StepsizeCallback(cfl=1.2)

# Create a CallbackSet to collect all callbacks such that they can be passed to the ODE solver
callbacks = CallbackSet(summary_callback, analysis_callback, save_solution, stepsize_callback)


###############################################################################
# run the simulation

# OrdinaryDiffEq's `solve` method evolves the solution in time and executes the passed callbacks
sol = solve(ode, CarpenterKennedy2N54(williamson_condition=false),
            dt=1.0, # solve needs some value here but it will be overwritten by the stepsize_callback
            save_everystep=false, callback=callbacks);

# Print the timer summary
summary_callback()
80 changes: 80 additions & 0 deletions examples/p4est_3d_dgsem/elixir_euler_taylor_green_vortex.jl
@@ -0,0 +1,80 @@
using OrdinaryDiffEq
using Trixi
using KernelAbstractions

###############################################################################
# semidiscretization of the compressible Euler equations

equations = CompressibleEulerEquations3D(1.4)

"""
    initial_condition_taylor_green_vortex(x, t, equations::CompressibleEulerEquations3D)

The classical inviscid Taylor-Green vortex.
"""
function initial_condition_taylor_green_vortex(x, t, equations::CompressibleEulerEquations3D)
    A = 1.0  # magnitude of speed
    Ms = 0.1 # maximum Mach number

    rho = 1.0
    v1 = A * sin(x[1]) * cos(x[2]) * cos(x[3])
    v2 = -A * cos(x[1]) * sin(x[2]) * cos(x[3])
    v3 = 0.0
    p = (A / Ms)^2 * rho / equations.gamma # scaling to get Ms
    p = p + 1.0/16.0 * A^2 * rho * (cos(2*x[1])*cos(2*x[3]) + 2*cos(2*x[2]) + 2*cos(2*x[1]) + cos(2*x[2])*cos(2*x[3]))

    return prim2cons(SVector(rho, v1, v2, v3, p), equations)
end

backend = CPU()

initial_condition = initial_condition_taylor_green_vortex

solver = DGSEM(polydeg=3, surface_flux=flux_lax_friedrichs,
               volume_integral=VolumeIntegralFluxDifferencing(flux_lax_friedrichs), backend=backend)

coordinates_min = (-1.0, -1.0, -1.0) .* pi
coordinates_max = ( 1.0, 1.0, 1.0) .* pi

# Create P4estMesh with 8 x 8 x 8 elements (note `refinement_level=1`)
trees_per_dimension = (4, 4, 4)
mesh = P4estMesh(trees_per_dimension, polydeg=1,
                 coordinates_min=coordinates_min, coordinates_max=coordinates_max,
                 initial_refinement_level=1)

semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition, solver; backend=backend)


###############################################################################
# ODE solvers, callbacks etc.

tspan = (0.0, 5.0)
ode = semidiscretize(semi, tspan; offload=true, backend=backend)

summary_callback = SummaryCallback()

analysis_interval = 100
analysis_callback = AnalysisCallback(semi, interval=analysis_interval)

alive_callback = AliveCallback(analysis_interval=analysis_interval)

save_solution = SaveSolutionCallback(interval=100,
                                     save_initial_solution=true,
                                     save_final_solution=true,
                                     solution_variables=cons2prim)

stepsize_callback = StepsizeCallback(cfl=0.9)

callbacks = CallbackSet(summary_callback,
                        analysis_callback, alive_callback,
                        save_solution,
                        stepsize_callback)


###############################################################################
# run the simulation

sol = solve(ode, CarpenterKennedy2N54(williamson_condition=false),
            dt=1.0, # solve needs some value here but it will be overwritten by the stepsize_callback
            save_everystep=false, callback=callbacks);
summary_callback() # print the timer summary
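This elixir sets `offload=true` but keeps the KernelAbstractions `CPU()` backend, i.e. it exercises the offloading code path on the host. Given the AMDGPU extension added further below, the same setup should also accept a ROCm backend; a hedged sketch, assuming AMDGPU.jl is installed and an AMD GPU is present:

using Trixi, AMDGPU
using AMDGPU.ROCKernels  # provides ROCBackend

# Override the elixir's top-level `backend` variable via trixi_include
trixi_include(joinpath(examples_dir(), "p4est_3d_dgsem", "elixir_euler_taylor_green_vortex.jl"),
              backend = ROCBackend())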
10 changes: 6 additions & 4 deletions examples/tree_2d_dgsem/elixir_advection_basic.jl
@@ -1,15 +1,17 @@

using OrdinaryDiffEq
using KernelAbstractions
using Trixi

###############################################################################
# semidiscretization of the linear advection equation

backend = CPU()
advection_velocity = (0.2, -0.7)
equations = LinearScalarAdvectionEquation2D(advection_velocity)

# Create DG solver with polynomial degree = 3 and (local) Lax-Friedrichs/Rusanov flux as surface flux
solver = DGSEM(polydeg = 3, surface_flux = flux_lax_friedrichs)
solver = DGSEM(polydeg=3, surface_flux=flux_lax_friedrichs, backend=backend)

coordinates_min = (-1.0, -1.0) # minimum coordinates (min(x), min(y))
coordinates_max = (1.0, 1.0) # maximum coordinates (max(x), max(y))
@@ -20,14 +22,14 @@ mesh = TreeMesh(coordinates_min, coordinates_max,
                n_cells_max = 30_000) # set maximum capacity of tree data structure

# A semidiscretization collects data structures and functions for the spatial discretization
semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition_convergence_test,
                                    solver)
semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition_convergence_test, solver; backend=backend)


###############################################################################
# ODE solvers, callbacks etc.

# Create ODE problem with time span from 0.0 to 1.0
ode = semidiscretize(semi, (0.0, 1.0));
ode = semidiscretize(semi, (0.0, 1.0); offload=false, backend=backend);

# At the beginning of the main loop, the SummaryCallback prints a summary of the simulation setup
# and resets the timers
19 changes: 19 additions & 0 deletions ext/TrixiAMDGPUExt.jl
@@ -0,0 +1,19 @@
# Package extension for some GPGPU API calls missing in KernelAbstractions

module TrixiAMDGPUExt

using Trixi
if isdefined(Base, :get_extension)
    using AMDGPU: ROCArray
    using AMDGPU.ROCKernels: ROCBackend
else
    # Until Julia v1.9 is the minimum required version for Trixi.jl, we still support Requires.jl
    using ..AMDGPU: ROCArray
    using ..AMDGPU.ROCKernels: ROCBackend
end

function Trixi.get_array_type(backend::ROCBackend)
    return ROCArray
end

end
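Project.toml also registers TrixiCUDAExt, TrixiMetalExt, and TrixiOneAPIExt, whose files are not part of this excerpt. They presumably mirror the AMDGPU extension above; a sketch of the CUDA variant under that assumption:

# Hypothetical sketch of ext/TrixiCUDAExt.jl; the actual file is not shown in this diff
module TrixiCUDAExt

using Trixi
if isdefined(Base, :get_extension)
    using CUDA: CuArray
    using CUDA.CUDAKernels: CUDABackend
else
    # Requires.jl fallback for Julia < v1.9
    using ..CUDA: CuArray
    using ..CUDA.CUDAKernels: CUDABackend
end

# Map the KernelAbstractions backend to its device array type
function Trixi.get_array_type(backend::CUDABackend)
    return CuArray
end

end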