one-off runs #3
Performance speedup
Accuracy
Compilation latency (sec)
Peak Memory Compression Ratio
Absolute latency (ms)
huggingface suite with amp precision
Performance speedup
Accuracy
Compilation latency (sec)
Peak Memory Compression Ratio
Absolute latency (ms)
timm_models suite with amp precision
Performance speedup
Accuracy
Compilation latency (sec)
Peak Memory Compression Ratio
Absolute latency (ms)
Performance Dashboard for amp precision (max autotune, with cold start)
Executive Summary
We evaluate different backends across three benchmark suites: torchbench, huggingface, and timm. All experiments run on A100 GPUs. Each experiment runs one iteration: a forward and backward pass for training, or a forward pass only for inference. For accuracy, we check the numerical correctness of the forward-pass outputs and gradients by comparing them with native PyTorch. We measure speedup by normalizing against the performance of native PyTorch. We report mean compilation latency and the peak memory footprint reduction ratio.
Caveats
To measure performance, compilation latency, and memory footprint reduction, we remove the models that fail accuracy checks.
Passrate
Geometric mean speedup
Mean compilation time (seconds)
Peak memory footprint compression ratio (higher is better)
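The four summary metrics above can be sketched in a few lines of Python. This is only an illustration of the aggregation described in the summary, not the dashboard's actual code: the `ModelResult` record, its field names, and the `summarize` helper are hypothetical, and averaging the memory ratio with an arithmetic mean is an assumption, since the summary does not say how that ratio is aggregated.

```python
from dataclasses import dataclass
from statistics import fmean, geometric_mean


@dataclass
class ModelResult:
    """Hypothetical per-model record; field names are illustrative only."""
    name: str
    passed_accuracy: bool
    eager_latency_ms: float      # native PyTorch latency
    compiled_latency_ms: float   # latency with the evaluated backend
    compile_time_s: float
    eager_peak_mem_gb: float
    compiled_peak_mem_gb: float


def summarize(results):
    """Aggregate one suite into the four summary metrics."""
    if not results:
        raise ValueError("no results to summarize")

    # Models that fail the accuracy check count against the pass rate
    # but are removed before computing the other metrics.
    passing = [r for r in results if r.passed_accuracy]
    if not passing:
        raise ValueError("no model passed the accuracy check")

    # Speedup is normalized against native PyTorch: > 1 means faster.
    speedups = [r.eager_latency_ms / r.compiled_latency_ms for r in passing]
    # Compression ratio: > 1 means a smaller peak footprint than eager.
    mem_ratios = [r.eager_peak_mem_gb / r.compiled_peak_mem_gb for r in passing]

    return {
        "passrate": len(passing) / len(results),
        "geomean_speedup": geometric_mean(speedups),
        "mean_compile_time_s": fmean(r.compile_time_s for r in passing),
        # Assumption: arithmetic mean of the per-model ratios.
        "mean_mem_compression": fmean(mem_ratios),
    }
```

The geometric mean is the natural aggregate for speedups because they are ratios: a 2x speedup and a 0.5x slowdown average out to 1x, which an arithmetic mean would overstate.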
Warnings
We flag models where:
Accuracy warnings
Performance speedup warnings
Compilation latency (sec) warnings
Peak Memory Compression Ratio warnings
torchbench suite with amp precision
Performance speedup
Accuracy
Compilation latency (sec)
Peak Memory Compression Ratio
Absolute latency (ms)
huggingface suite with amp precision
Performance speedup
Accuracy
Compilation latency (sec)
Peak Memory Compression Ratio
Absolute latency (ms)
timm_models suite with amp precision
Performance speedup
Accuracy
Compilation latency (sec)
Peak Memory Compression Ratio
Absolute latency (ms)
Build Summary
Run name
day_079_20_03_23_performance_amp_778
Commit hashes
pytorch commit: 9423b863f800c6d20b9b3de4422558cbb338fb83
TorchDynamo config flags
Torch version
torch: 2.1.0a0+git9423b86
Environment variables
TORCH_CUDA_ARCH_LIST = 8.0
GPU details
CUDNN VERSION: 8401
(next 2 comments are for max-autotune, warm start run)
AMP RUN
Geometric mean speedup
Mean compilation time (seconds)
Peak memory footprint compression ratio (higher is better)