
[ Back to index ]

## Build Nvidia Docker Container (from the 3.1 inference round)

```bash
cm docker script --tags=build,nvidia,inference,server
```
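
Once the build completes, a quick sanity check that the image exists (this assumes a standard local Docker setup; the image naming is a guess, so grep loosely):

```bash
# List local images and look for the freshly built container.
# The exact repository/tag naming depends on the CM script version.
docker images | grep -i mlperf
```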

## Run this benchmark via CM

### Do a test run to detect and record the system performance

cmr "run-mlperf inference _find-performance" --scenario=Offline \
--model=rnnt --implementation=nvidia-original --device=cuda --backend=tensorrt \
--category=edge --division=open --quiet
  • Use --division=closed to run all scenarios for the closed division (compliance tests are skipped for _find-performance mode)
  • Use --category=datacenter to run datacenter scenarios
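
For instance, the two options above can be combined. The sketch below is the same test run retargeted at the closed division and datacenter category; only `--division` and `--category` differ from the command above:

```bash
# Closed-division, datacenter-category test run (compliance tests are
# still skipped in _find-performance mode).
cmr "run-mlperf inference _find-performance" --scenario=Offline \
--model=rnnt --implementation=nvidia-original --device=cuda --backend=tensorrt \
--category=datacenter --division=closed --quiet
```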

### Do full accuracy and performance runs for all the scenarios

cmr "run-mlperf inference _submission _all-scenarios" --model=rnnt \
--device=cuda --implementation=nvidia-original --backend=tensorrt \
--execution-mode=valid --category=edge --division=open --quiet
  • Use --power=yes for measuring power. It is ignored for accuracy and compliance runs
  • Use --division=closed to run all scenarios for the closed division including the compliance tests
  • --offline_target_qps, --server_target_qps, and --singlestream_target_latency can be used to override the determined performance numbers
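
As an illustration, the sketch below combines these options into a closed-division valid run with power measurement and an explicit Offline target QPS. The QPS value is a placeholder; replace it with the number recorded during the test run:

```bash
# Closed-division run across all scenarios, with power measurement and a
# manually overridden Offline target QPS (placeholder value).
cmr "run-mlperf inference _submission _all-scenarios" --model=rnnt \
--device=cuda --implementation=nvidia-original --backend=tensorrt \
--execution-mode=valid --category=edge --division=closed \
--power=yes --offline_target_qps=100 --quiet
```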

## Generate and upload MLPerf submission

Follow this guide to generate the submission tree and upload your results.
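
As a rough sketch only, submission generation via CM tends to take the following shape; the script tags and directory flags here are assumptions for illustration, and the linked guide has the authoritative command:

```bash
# Hypothetical sketch: build the MLPerf submission tree from recorded
# results. Tags and flags are illustrative; defer to the linked guide.
cmr "generate inference submission" --results_dir=$HOME/results \
  --submission_dir=$HOME/submission --division=open --category=edge
```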

## Questions? Suggestions?

Check the MLCommons Task Force on Automation and Reproducibility and get in touch via its public Discord server.

## Acknowledgments

- CM automation for Nvidia's MLPerf inference implementation was developed by Arjun Suresh and Grigori Fursin.
- Nvidia's MLPerf inference implementation was developed by Zhihan Jiang, Ethan Cheng, Yiheng Zhang, and Jinho Suh.