Cupy implementations of dpbench workloads #310
Merged
This PR adds cupy implementations of six workloads: blackscholes, gpairs, rambo, pca, l2_norm, and pairwise_distance. It also contains the infrastructure changes needed to run these workloads. It does not change the CI to run them regularly. For evidence of successful execution, see the report below.
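For readers unfamiliar with how a workload like l2_norm maps onto cupy, here is a minimal sketch. It is not the code from this PR: the function signature, the `to_host` helper, and the NumPy fallback are assumptions made so the example runs on machines without cupy or a CUDA device. It only illustrates that cupy exposes a NumPy-compatible array API, so a row-wise Euclidean norm can be written the same way for both libraries.

```python
import numpy as np

try:
    import cupy as xp  # GPU array library with a NumPy-compatible API

    xp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
except Exception:
    xp = np  # assumption: fall back to NumPy so the sketch still runs


def l2_norm(a, d):
    """Write the Euclidean norm of each row of `a` into `d` (hypothetical signature)."""
    sq = xp.square(a)
    d[:] = xp.sqrt(sq.sum(axis=1))


def to_host(x):
    """Copy a cupy array back to host memory; pass NumPy arrays through."""
    return x.get() if hasattr(x, "get") else np.asarray(x)


a = xp.asarray([[3.0, 4.0], [6.0, 8.0]])
d = xp.empty(2)
l2_norm(a, d)
result = to_host(d)
print(result)  # [ 5. 10.]
```

The output array is preallocated and filled in place so that the same function body works whichever array library `xp` refers to.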
Tested on the orsatdevnuc01 development machine. The test report follows.
Report for 2023-11-02 00:05:57 run
Legend

| | postfix | description | device |
|---|---|---|---|
| 0 | cupy | cupy | 12th Gen Intel(R) Core(TM) i9-12900 |
| 1 | dpnp | dpnp | 12th Gen Intel(R) Core(TM) i9-12900 |
| 2 | numpy | NumPy | 12th Gen Intel(R) Core(TM) i9-12900 |

Summary of current implementation (status)

| | input_size | benchmark | problem_preset | cupy | dpnp | numpy |
|---|---|---|---|---|---|---|
| 0 | 20MB | black_scholes | S | Success | Success | Success |
| 1 | 8KB | gpairs | S | Success | Success | Success |
| 2 | 1MB | l2_norm | S | Success | Success | Success |
| 3 | 8MB | pairwise_distance | S | Success | Success | Success |
| 4 | 1MB | pca | S | Success | Success | Success |
| 5 | 7MB | rambo | S | Success | Success | Success |

Summary of current implementation (time)

| | input_size | benchmark | problem_preset | cupy | dpnp | numpy |
|---|---|---|---|---|---|---|
| 0 | 20MB | black_scholes | S | 0.16ms | 11.87ms | 24.01ms |
| 1 | 8KB | gpairs | S | 1.66ms | 16.81ms | 0.67ms |
| 2 | 1MB | l2_norm | S | 0.08ms | 3.75ms | 0.37ms |
| 3 | 8MB | pairwise_distance | S | 0.14ms | 2.29ms | 3.28ms |
| 4 | 1MB | pca | S | 11.75ms | 10.93ms | 2.81ms |
| 5 | 7MB | rambo | S | 0.49ms | 4.53ms | 1.27ms |
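
The timings above are easier to compare as ratios against the numpy column. The sketch below is not part of dpbench; the `relative_to` helper is a hypothetical name introduced here. It divides the numpy time by each framework's time, so a value above 1 means that framework was faster than numpy in this run. The numbers come from a single run on one machine with the S preset, so the ratios describe only this report.

```python
# Timings in milliseconds, copied from the report above.
timings_ms = {
    "black_scholes": {"cupy": 0.16, "dpnp": 11.87, "numpy": 24.01},
    "gpairs": {"cupy": 1.66, "dpnp": 16.81, "numpy": 0.67},
    "l2_norm": {"cupy": 0.08, "dpnp": 3.75, "numpy": 0.37},
    "pairwise_distance": {"cupy": 0.14, "dpnp": 2.29, "numpy": 3.28},
    "pca": {"cupy": 11.75, "dpnp": 10.93, "numpy": 2.81},
    "rambo": {"cupy": 0.49, "dpnp": 4.53, "numpy": 1.27},
}


def relative_to(baseline, timings):
    """Return baseline_time / framework_time for every benchmark and framework."""
    return {
        bench: {fw: round(row[baseline] / t, 2) for fw, t in row.items()}
        for bench, row in timings.items()
    }


ratios = relative_to("numpy", timings_ms)
for bench, row in ratios.items():
    cells = "  ".join(f"{fw}={value:.2f}x" for fw, value in row.items())
    print(f"{bench:<18} {cells}")
```

In this run, gpairs and pca come out below 1 for cupy, meaning numpy was faster on those two workloads at this problem size.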