Investigate best practices for benchmarking comparisons between CPU and GPU #2

Open
matthewfeickert opened this issue Jun 2, 2020 · 4 comments
Assignees
Labels
research Experimental or investigation

Comments

@matthewfeickert
Member

There may be existing literature/blogs/documents on best practices for benchmarking comparisons between CPU and GPU. It would be good to investigate this and learn about what has already been done in this space.

@matthewfeickert matthewfeickert added the research Experimental or investigation label Jun 2, 2020
@coolalexzb
Contributor

Hi all!
Currently, there are many benchmarking tools for checking the performance of GPUs and CPUs. There also exist benchmark suites like Rodinia, Parboil, etc. We can learn from the ideas behind these applications, but these existing resources are not directly applicable to our project.

Among the research papers and blogs I read, I found that pyhpc-benchmarks, a GitHub repository, is one of the most promising resources we can use for this project. pyhpc-benchmarks is

a suite of benchmarks to test the sequential CPU and GPU performance of various computational backends with Python frontends.

It constructs benchmarks for different applications over various Python backends, such as NumPy, JAX, PyTorch, TensorFlow, etc., and reports statistical metrics like mean, stdev, min, max, and median. When testing the performance of CPU and GPU, we can control variables such as data size, number of iterations, and number of threads (for multi-threaded programming).
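
As a rough sketch of this kind of harness (illustrative only, not pyhpc-benchmarks' actual code; the workload and `repeat` count are arbitrary choices of mine), one could time a NumPy workload over repeated runs and report the same summary statistics:

```python
import statistics
import time

import numpy as np


def benchmark(fn, *args, repeat=10):
    """Call fn(*args) `repeat` times and return timing statistics in seconds."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return {
        "mean": statistics.mean(timings),
        "stdev": statistics.stdev(timings),
        "min": min(timings),
        "max": max(timings),
        "median": statistics.median(timings),
    }


# The data size n is one of the variables we would want to control.
n = 1_000_000
x = np.random.default_rng(0).normal(size=n)
print(benchmark(np.sort, x))
```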

I need to get a deeper understanding of pyhf to help me work out a method for building the pyhf benchmark suite. @matthewfeickert

The above is my current thinking on benchmarking comparisons between CPU and GPU. I will update it in the future. Anyone is welcome to correct me!

@matthewfeickert
Member Author

matthewfeickert commented Jun 8, 2020

Among the research papers and blogs I read

Can you share some of the more interesting ones here as references?

pyhpc-benchmarks is

a suite of benchmarks to test the sequential CPU and GPU performance of various computational backends with Python frontends.

It constructs benchmarks for different applications over various Python backends, such as NumPy, JAX, PyTorch, TensorFlow, etc., and reports statistical metrics like mean, stdev, min, max, and median. When testing the performance of CPU and GPU, we can control variables such as data size, number of iterations, and number of threads (for multi-threaded programming).

Maybe you can try to determine if this is something useful just by looking at CPU tests for the time being. I see that they also have examples of using this to benchmark on Google Colab GPUs, so while the group GPU machine is getting set up to support all backends you could do preliminary tests on Colab.

While they advocate for using Conda environments, we don't want to be forced to use Conda. pyhpc-benchmarks doesn't explicitly require Conda, though, so that should be fine.

My first impression is that, as pyhpc-benchmarks is more focused on testing

which high-performance backend is best for geophysical (finite-difference based) simulations.

while we really care about having a tool to quickly test the performance of the backends on different workspaces, pyhpc-benchmarks might be a good starting place or point of inspiration for how they did things, but maybe not an "out of the box" solution. I'd be very happy to be wrong here, though.
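
To make that concrete, here is a minimal sketch of the kind of per-backend timing I have in mind, built around the "Hello World" example from the pyhf documentation (`hepdata_like` and `hypotest` are real pyhf v0.4 APIs; the timing loop, backend list, and `repeat` count are my own illustrative choices, and GPU-enabled installs of the backends are assumed where relevant):

```python
import time

import pyhf


def time_hypotest(backend, repeat=5):
    """Return the best wall-clock time of pyhf.infer.hypotest on a toy model."""
    pyhf.set_backend(backend)
    model = pyhf.simplemodels.hepdata_like(
        signal_data=[12.0, 11.0], bkg_data=[50.0, 52.0], bkg_uncerts=[3.0, 7.0]
    )
    data = [51.0, 48.0] + model.config.auxdata
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        pyhf.infer.hypotest(1.0, data, model)
        timings.append(time.perf_counter() - start)
    return min(timings)


for backend in ("numpy", "tensorflow", "pytorch", "jax"):
    print(backend, time_hypotest(backend))
```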

@coolalexzb
Copy link
Contributor

coolalexzb commented Jun 8, 2020

First, I read Computing Performance Benchmarks among CPU, GPU, and FPGA. Among the benchmark suites it introduces, I dove into the Rodinia Suite and the Parboil Suite.

From the paper, both benchmark suites cover a range of application domains, but I have not yet found a way to use the Rodinia Suite or the Parboil Suite directly in our project. In the paper, I found some of the metrics they used in experiments: time consumption, throughput, kernel execution time, memory used, CPU-GPU communication time, etc. Time consumption is the most common and direct way to show the performance of GPU versus CPU. These papers do not describe how to measure all of these metrics, so I will do more research when I start to implement the measurement of related metrics.
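
One measurement pitfall worth noting for the time-consumption metric (a sketch assuming a PyTorch workload; this is not from the papers above): GPU kernels launch asynchronously, so a wall-clock timer has to synchronize with the device before stopping, otherwise it only measures the kernel launch overhead rather than the kernel itself:

```python
import time

import torch


def time_gpu_matmul(n=4096, repeat=5):
    """Time an n x n matrix multiply on the GPU, including kernel completion."""
    a = torch.randn(n, n, device="cuda")
    b = torch.randn(n, n, device="cuda")
    torch.cuda.synchronize()  # ensure setup transfers finish before timing
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        _ = a @ b
        torch.cuda.synchronize()  # wait for the kernel, not just its launch
        timings.append(time.perf_counter() - start)
    return min(timings)
```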

Other useful papers:
  • Hetero-Mark, A Benchmark Suite for CPU-GPU Collaborative Computing
  • NUPAR: A Benchmark Suite for Modern GPU Architectures
  • A Survey of CPU-GPU Heterogeneous Computing Techniques
  • AI Benchmark: All About Deep Learning on Smartphones in 2019
  • GPU Computing with Python: Performance, Energy Efficiency and Usability

Wikipedia link: Benchmark

As for the blogs I read, I found most of them are about existing benchmark software, such as blog1, which I think is not useful to our project.

If anyone gets more useful information from the links I mentioned, or has other useful references, you are welcome to share and discuss!

@coolalexzb
Contributor

I found another Python package, wandb, which might be a good reference for helping us build a benchmark suite for GPU and CPU. wandb makes use of the following system metrics:

  • CPU Utilization
  • System Memory Utilization
  • Disk I/O Utilization
  • Network traffic (bytes sent and received)
  • GPU Utilization
  • GPU Temperature
  • GPU Time Spent Accessing Memory (as a percentage of the sample time)
  • GPU Memory Allocated

The wandb package also uses the nvidia-ml-py3 package to collect GPU metrics.
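
As a small sketch of what those bindings expose (the pynvml module below is what nvidia-ml-py3 installs; the metric selection just mirrors the wandb list above):

```python
import pynvml  # installed by the nvidia-ml-py3 package

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

util = pynvml.nvmlDeviceGetUtilizationRates(handle)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)

print(f"GPU utilization: {util.gpu}%")
print(f"GPU time spent accessing memory: {util.memory}%")
print(f"GPU memory allocated: {mem.used / mem.total:.1%}")
print(f"GPU temperature: {temp} C")

pynvml.nvmlShutdown()
```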

Useful links:
https://www.wandb.com/
https://lambdalabs.com/blog/weights-and-bias-gpu-cpu-utilization/
