BERT has two variants, bert-99 and bert-99.9, where 99 and 99.9 specify the required accuracy constraint (99% or 99.9% of the accuracy of the reference floating-point model). The bert-99.9 model is applicable only to datacenter systems. In the edge category, bert-99 has the Offline and SingleStream scenarios; in the datacenter category, both bert-99 and bert-99.9 have the Offline and Server scenarios. Please check the MLPerf inference GitHub repository for more details.
Run using the MLCommons CM framework
Since February 2024, we suggest you use this GUI to configure the MLPerf inference benchmark, generate CM commands to run it across different implementations, models, datasets, software, and hardware, and prepare your submissions.
Install the MLCommons CM automation framework with the automation recipes for MLPerf as described here.
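As a quick orientation, the installation typically looks like the sketch below. This is a minimal outline, not a substitute for the linked guide; the repository name `mlcommons@cm4mlops` is an assumption and may differ from the version current when you read this.

```bash
# Install the CM (Collective Mind) automation framework
pip install cmind

# Pull the MLCommons repository with CM automation recipes for MLPerf
# (repository name is an assumption; follow the linked install guide
#  for the recipes current at the time you run this)
cm pull repo mlcommons@cm4mlops
```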
The following guides explain how to run different implementations of this benchmark via CM; a sample command is sketched after the list:
- MLCommons reference implementation in Python (CPU & GPU)
- NVIDIA optimized implementation (GPU)
- Intel optimized implementation (CPU)
- Qualcomm optimized implementation (QAIC)
- DeepSparse implementation (CPU: x64, Arm64)
- Run custom ONNX models with MLPerf reference implementation
- Run multiple DeepSparse Zoo models via MLPerf
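For illustration, a typical CM command to run this benchmark looks roughly like the following. This is a sketch assuming the CM recipes above are installed; the exact tags and flags vary across CM versions and implementations, so treat the per-implementation guides as authoritative.

```bash
# Sketch: run bert-99 with the reference implementation in the Offline
# scenario on CPU, in test mode. Flag names follow the CM
# "run-mlperf,inference" script; consult the guides above for the
# options supported by your CM version.
cm run script --tags=run-mlperf,inference \
   --model=bert-99 \
   --implementation=reference \
   --framework=onnxruntime \
   --category=edge \
   --scenario=Offline \
   --device=cpu \
   --execution_mode=test \
   --quiet
```

Swapping `--implementation`, `--framework`, `--device`, and `--scenario` selects the other implementations and scenarios listed above.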
Check the MLCommons Task Force on Automation and Reproducibility and get in touch via the public Discord server.