
GPU Support for Operators #9

Merged: 47 commits merged into main on Jun 27, 2024
Conversation

@nHackel (Member) commented Apr 9, 2024

This PR adds GPU support for some of the operators. I will keep an updated list below of which operators were ported and how they were changed.

Operators:

  • FFTOp
  • DSTOp
  • DCTOp
  • WaveletOp*
  • GradientOp
  • ProdOp
  • SamplingOp*
  • WeightingOp
  • NormalOp
  • NFFTOp
  • NFFTToeplitzNormalOp

Changes:
LinearOperators.jl has a keyword argument S which describes the storage_type of the operator. To make an operator work on the GPU, this has to be adapted from the default Vector{T} to a CuArray or another GPU array type; it is usually just the typeof(...) of the vector to which the operator will be applied. I've started adding this keyword argument to all operators/constructors defined in this package to stay consistent, though not all operators will be able to work on the GPU. The S kwarg does not seem to be applicable to operators that are compositions of existing ones, such as ProdOp and NormalOp.
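
For illustration, a minimal usage sketch of passing S (the ComplexF32 element type and 64×64 shape are arbitrary assumptions, not from this PR):

```julia
using CUDA, LinearOperatorCollection

x = CuArray(rand(ComplexF32, 64 * 64))        # vector the operator will be applied to
# Pass the storage type of x so the operator allocates its internal buffers on the GPU
op = FFTOp(ComplexF32; shape = (64, 64), S = typeof(x))
y = op * x                                    # evaluated on the GPU
```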

FFTOp: Did not require any specific CUDA dependency so far. I could just refactor the struct a little to allow for the S kwarg and keep track of this information. One issue is that the CUDA FFT does not allow FFTW flags to be set (the operator previously used FFTW.MEASURE). My workaround for now is to give the constructor kwargs... which it passes on to the plan call; the caller can then decide whether a flag should be used. Alternatively, we could dispatch on S and, depending on that, load a different plan/different plan arguments.
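
To illustrate the kwargs passthrough (a sketch assuming the constructor forwards a flags keyword to the plan call, as FFTW.plan_fft accepts; the exact keyword is not fixed by this PR):

```julia
using FFTW, CUDA, LinearOperatorCollection

# CPU: the caller can still request a measured FFTW plan via the passthrough
op_cpu = FFTOp(ComplexF32; shape = (64, 64), flags = FFTW.MEASURE)

# GPU: CUFFT plans accept no FFTW flags, so the caller simply omits them
op_gpu = FFTOp(ComplexF32; shape = (64, 64), S = CuArray{ComplexF32, 1})
```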

WaveletOp: Wavelets.jl does not seem to work on a GPU. The operator now carries two dense arrays to which it copies any "non-dense" arguments before doing the transformation, i.e. it computes on the CPU and then moves the result back to a GPU array.
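
A minimal sketch of that CPU round trip (field and function names here are hypothetical, not the PR's actual implementation):

```julia
using Wavelets

# Hypothetical apply routine: stage the device vector into preallocated host buffers,
# run the wavelet transform on the CPU, then copy the result back to the device.
function wavelet_prod!(res::AbstractVector, op, x::AbstractVector)
    copyto!(op.xDense, x)                                           # device -> host
    op.resDense .= vec(dwt(reshape(op.xDense, op.shape), op.wt))    # CPU transform
    copyto!(res, op.resDense)                                       # host -> device
    return res
end
```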

GradientOp: I've added an extension for GPUArrays.jl and added new dispatch on GPUArrays for the grad! methods of this operator.
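
The idea behind the GPU dispatch, as a hedged sketch (a hypothetical helper, not the package's actual grad! signature): forward differences written as a broadcast over shifted views, so no scalar indexing is needed on the device.

```julia
using GPUArrays

# Hypothetical helper illustrating the GPU-array dispatch:
# finite differences along `dim` via a broadcast over shifted views.
function grad_device!(res::AbstractGPUArray, img::AbstractGPUArray,
                      shape::NTuple{N,Int}, dim::Int) where N
    imgr = reshape(img, shape)
    lo = ntuple(i -> i == dim ? (1:shape[i]-1) : (1:shape[i]), N)
    hi = ntuple(i -> i == dim ? (2:shape[i])   : (1:shape[i]), N)
    res .= vec(view(imgr, lo...) .- view(imgr, hi...))
    return res
end
```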

ProdOp: Did not receive an S kwarg; instead the operator was restricted to work on operators that implement storage_type. At the moment LinearOperator(gpuArray) does not derive a correct storage type. I have an open PR that should change that (currently only for CUDA).
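
As a small illustration of that restriction (a hypothetical guard, built on the storage_type function referenced above):

```julia
using LinearOperators

# Hypothetical check before composing operators: all storage types must agree,
# otherwise the composition cannot run consistently on the GPU.
function common_storage_type(ops...)
    S = LinearOperators.storage_type(first(ops))
    all(op -> LinearOperators.storage_type(op) == S, ops) ||
        throw(ArgumentError("operators have mismatching storage types"))
    return S
end
```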

SamplingOp: Added an S kwarg; however, this requires a change downstream in LinearOperators.jl, since the opRestriction this operator is (partially) based on works on its own with GPUs but cannot be combined with other operators to work on the GPU. I've added a PR to LinearOperators.jl. I've also removed a superfluous (I think) opEye.

WeightingOp: Works out of the box; I did not (yet) add a WeightingOp(...; kwargs...) method to "accept"/ignore a potential S kwarg.

NormalOp: Similar to ProdOp. I have also slightly rearranged the call order of the constructors. I think this is an operator that is inspected by MRIReco.jl, so I might need to revisit the API again when I adapt the interface there. If the limitation to LinearOperators is an issue, we could also overload the storage_type call on the matrices themselves; I've added this as an option to my PR in LinearOperators.jl. I've also reused the WeightingOp here; one could go a step further and collapse the whole NormalOp into a ProdOp.
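
Conceptually, the WeightingOp reuse amounts to composing the weighted normal operator from existing pieces (a hedged sketch; the constructor spelling below is illustrative, and A and w are assumed inputs, not from this PR):

```julia
using LinearOperatorCollection

# Weighted normal operator A' * W * A assembled from existing operators;
# A is an operator with a GPU-compatible storage type, w are the weights.
W = WeightingOp(ComplexF32; weights = w)
normal = adjoint(A) * W * A    # this product could equally be expressed as a ProdOp
```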

NFFTOp: Similar to FFTOp, though the GPU version seems to accept the same kwargs as the CPU version, so I did not change it.

NFFTToeplitzNormalOp: Similar to FFTOp; this time I had to remove the FFTW flags.

"""
function LinearOperatorCollection.FFTOp(T::Type; shape::NTuple{D,Int64}, shift::Bool=true, unitary::Bool=true, cuda::Bool=false) where D
function LinearOperatorCollection.FFTOp(T::Type; shape::NTuple{D,Int64}, shift::Bool=true, unitary::Bool=true, S = Array{Complex{real(T)}}, kwargs...) where D
Suggested change:

-function LinearOperatorCollection.FFTOp(T::Type; shape::NTuple{D,Int64}, shift::Bool=true, unitary::Bool=true, S = Array{Complex{real(T)}}, kwargs...) where D
+function LinearOperatorCollection.FFTOp(T::Type{<:Number}; shape::NTuple{D,Int64}, shift::Bool=true, unitary::Bool=true, S = Array{Complex{real(T)}}, kwargs...) where D

probably you want Numbers?

@nHackel (Member, Author) replied:

Yes, good point! The restrictions on T are a bit inconsistent across the package. I will do a pass with this change across all operators once I am done with the GPU changes; potentially I might do this in another PR.

@nHackel nHackel marked this pull request as ready for review June 27, 2024 13:54
@nHackel nHackel merged commit 01d720c into main Jun 27, 2024
3 of 6 checks passed