More efficient `eval_f` in RBC on GPU #492

brownbaerchen · 2024-09-26T12:52:47Z

After profiling the code on GPUs some more, I noticed that some efficiency can be gained by reducing the number of matrix multiplications. That is all I am doing here: rather than computing derivatives by multiplying each component with the derivative matrix separately, I construct one large matrix that contains the derivative matrix multiple times, so that all derivatives are computed in a single multiplication. I also cache the $L$ matrix in the basis of Chebyshev T polynomials rather than recomputing it on every call. This is not a huge difference, but it saves a few ms per call of the function on both CPU and GPU.
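For readers who want to see the idea in isolation, here is a minimal NumPy sketch; it is not the code from this PR. It makes several assumptions: the large matrix is taken to be block diagonal (built with `np.kron` and an identity), the derivative matrix acts on Chebyshev T coefficients, and the names `chebyshev_derivative_matrix` and `BatchedDerivative` are hypothetical. Dense arrays keep the example short; real code would more likely use sparse matrices.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def chebyshev_derivative_matrix(N):
    """Matrix mapping N Chebyshev T coefficients to those of the derivative."""
    D = np.zeros((N, N))
    # Column j of chebder(eye(N)) holds the T coefficients of d/dx T_j.
    D[:-1, :] = cheb.chebder(np.eye(N))
    return D


class BatchedDerivative:
    """Hypothetical helper: differentiate all components with one matmul."""

    def __init__(self, N, ncomp):
        self.N = N
        self.ncomp = ncomp
        self._big = None  # filled on first use, then reused

    @property
    def batched_matrix(self):
        # Build the block-diagonal matrix once and cache it.
        if self._big is None:
            D = chebyshev_derivative_matrix(self.N)
            self._big = np.kron(np.eye(self.ncomp), D)
        return self._big

    def derivatives(self, u_hat):
        # u_hat has shape (ncomp, N); flatten, multiply once, restore shape.
        return (self.batched_matrix @ u_hat.reshape(-1)).reshape(u_hat.shape)


# Compare against the component-by-component loop.
rng = np.random.default_rng(0)
u_hat = rng.standard_normal((4, 8))
op = BatchedDerivative(N=8, ncomp=4)
D = chebyshev_derivative_matrix(8)
looped = np.stack([D @ u_hat[i] for i in range(4)])
batched = op.derivatives(u_hat)
```

The block-diagonal layout replaces `ncomp` small multiplications with one larger one, which tends to suit GPUs because it reduces the number of kernel launches. Whether it pays off in a given setup still has to be measured.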

pancetta · 2024-09-29T14:19:30Z

Please merge master again

brownbaerchen added 2 commits September 26, 2024 14:12

More efficient eval_f in RBC on GPU

7ec7bc5

Merge remote-tracking branch 'upstream/master' into RBC_efficiency

34aa041

Merge remote-tracking branch 'upstream/master' into RBC_efficiency

6826b5e

pancetta merged commit 85dc966 into Parallel-in-Time:master Oct 8, 2024
86 checks passed

brownbaerchen deleted the RBC_efficiency branch October 8, 2024 16:01
