Add flops benchmark #169

Delaunay · 2023-10-16T18:00:57Z

Adds a FLOPS benchmark to use as a sanity check and to help diagnose bottlenecks.
New hardware might not be used efficiently by today's models, because those models were built for different hardware.
Matrix multiplication is everywhere in AI, so this benchmark should show whether the hardware itself is slow or is simply not being used efficiently by the models.
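
For readers unfamiliar with the idea, here is a minimal sketch of how matrix-multiply throughput can be estimated. It is not the code added in this PR: the function names and the `repeat` / `number` parameters are hypothetical, and it uses a pure-Python matmul so it runs anywhere, whereas a real benchmark would run the multiply on the accelerator. The only fact it relies on is that an (m x k) by (k x n) product costs about 2·m·n·k floating-point operations.

```python
import time


def matmul_flop_count(m, n, k):
    """FLOPs for an (m x k) @ (k x n) product: one multiply and one add per inner step."""
    return 2 * m * n * k


def naive_matmul(a, b):
    """Reference matmul on lists of lists (stand-in for a GPU kernel)."""
    cols = list(zip(*b))  # columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def measure_flops(matmul, m, n, k, repeat=3, number=2):
    """Return the best observed FLOP/s for `matmul` on (m x k) @ (k x n) inputs.

    `repeat` and `number` are hypothetical knobs in the style of `timeit`:
    `number` calls are timed together, and the best of `repeat` rounds is kept.
    """
    a = [[1.0] * k for _ in range(m)]
    b = [[1.0] * n for _ in range(k)]
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        for _ in range(number):
            matmul(a, b)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / number)
    # Guard against a zero reading on very coarse timers.
    return matmul_flop_count(m, n, k) / max(best, 1e-12)


if __name__ == "__main__":
    print(f"{measure_flops(naive_matmul, 32, 32, 32):.3e} FLOP/s")
```

Comparing a number like this against the throughput a full model achieves is what lets you tell "slow hardware" apart from "hardware the model does not use well".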

Delaunay · 2023-10-31T23:39:55Z

Source: /Tmp/slurm.3790816.0/base/runs/nilafimu.2023-10-31_18:41:14.852395
=================
Benchmark results
=================
                         fail   n       perf   sem%   std% peak_memory          score weight
bert-fp16                   0   1      49.79   0.0%   0.2%       23952      49.790125   0.00
bert-fp32                   0   1      19.45   0.1%   0.3%       30922      19.452139   0.00
bert-tf32                   0   1      19.47   0.1%   0.3%       30922      19.465893   0.00
bert-tf32-fp16              0   1      49.78   0.0%   0.2%       23952      49.780634   3.00
bf16                        0   1       7.51   0.0%   0.2%        1140       7.507708   0.00
convnext_large-fp16         0   1     121.26   2.6%  13.8%       26632     121.255449   0.00
convnext_large-fp32         0   1      30.83   0.4%   2.2%       45356      30.828629   0.00
convnext_large-tf32         0   1      30.85   0.5%   2.9%       45356      30.845418   0.00
convnext_large-tf32-fp16    0   1     120.86   2.5%  13.6%       26632     120.861461   3.00
davit_large                 0   1     126.33   0.7%   5.1%       32130     126.325192   1.00
davit_large-multi           0   1     126.63   0.6%   4.6%       32374     126.628733   5.00
dlrm                        0   1  258228.32   0.4%   3.2%        6354  258228.316970   1.00
focalnet                    0   1     147.90   1.7%  12.8%       24000     147.901628   2.00
fp16                        0   1      93.70   0.2%   1.6%        1142      93.703274   0.00
fp32                        0   1      13.50   0.0%   0.1%        1524      13.495270   0.00
opt-1_3b                    1   1        NaN    NaN    NaN          -1            NaN   5.00
opt-1_3b-multinode        NaN NaN        NaN    NaN    NaN         NaN            NaN  10.00
opt-6_7b                    1   1        NaN    NaN    NaN       13622            NaN   5.00
opt-6_7b-multinode        NaN NaN        NaN    NaN    NaN         NaN            NaN  10.00
reformer                    0   1      10.21   0.0%   0.1%       24756      10.213739   1.00
regnet_y_128gf              0   1      29.60   0.0%   0.2%       30748      29.596963   2.00
resnet152                   0   1     226.73   1.0%   7.6%       29604     226.730824   1.00
resnet152-multi             0   1     228.92   1.0%   7.9%       29878     228.916266   5.00
resnet50                    0   1     436.92   2.9%  22.1%        4166     436.919842   1.00
rwkv                        0   1     104.91   0.1%   1.1%        4944     104.905045   1.00
stargan                     0   1      11.47   4.5%  34.5%       35648      11.467530   1.00
super-slomo                 0   1      11.28   0.1%   0.4%       36364      11.275021   1.00
t5                          0   1      14.12   0.5%   3.7%       34794      14.124875   2.00
tf32                        0   1      13.50   0.0%   0.2%        1524      13.501287   0.00
whisper                     0   1      81.20   0.1%   0.7%       35968      81.195167   1.00

Scores
------
Failure rate:       7.14% (FAIL)
Score:              10.70

Errors
------
2 errors, details in HTML report.

Delaunay · 2023-11-01T18:18:03Z

Source: /Tmp/slurm.3792181.0/base/runs/kizovota.2023-11-01_14:10:01.514247
=================
Benchmark results
=================
                         fail   n       perf   sem%   std% peak_memory          score weight
bert-fp16                 NaN NaN        NaN    NaN    NaN         NaN            NaN   0.00
bert-fp32                 NaN NaN        NaN    NaN    NaN         NaN            NaN   0.00
bert-tf32                 NaN NaN        NaN    NaN    NaN         NaN            NaN   0.00
bert-tf32-fp16            NaN NaN        NaN    NaN    NaN         NaN            NaN   3.00
bf16                      NaN NaN        NaN    NaN    NaN         NaN            NaN   0.00
convnext_large-fp16       NaN NaN        NaN    NaN    NaN         NaN            NaN   0.00
convnext_large-fp32       NaN NaN        NaN    NaN    NaN         NaN            NaN   0.00
convnext_large-tf32       NaN NaN        NaN    NaN    NaN         NaN            NaN   0.00
convnext_large-tf32-fp16  NaN NaN        NaN    NaN    NaN         NaN            NaN   3.00
davit_large               NaN NaN        NaN    NaN    NaN         NaN            NaN   1.00
davit_large-multi           0   1     163.91   0.5%   3.7%          -1     163.912403   5.00
dlrm                        0   1  246579.75   0.6%   4.3%          -1  246579.749664   1.00
focalnet                  NaN NaN        NaN    NaN    NaN         NaN            NaN   2.00
fp16                      NaN NaN        NaN    NaN    NaN         NaN            NaN   0.00
fp32                      NaN NaN        NaN    NaN    NaN         NaN            NaN   0.00
opt-1_3b                  NaN NaN        NaN    NaN    NaN         NaN            NaN   5.00
opt-1_3b-multinode        NaN NaN        NaN    NaN    NaN         NaN            NaN  10.00
opt-6_7b                  NaN NaN        NaN    NaN    NaN         NaN            NaN   5.00
opt-6_7b-multinode        NaN NaN        NaN    NaN    NaN         NaN            NaN  10.00
reformer                  NaN NaN        NaN    NaN    NaN         NaN            NaN   1.00
regnet_y_128gf            NaN NaN        NaN    NaN    NaN         NaN            NaN   2.00
resnet152                 NaN NaN        NaN    NaN    NaN         NaN            NaN   1.00
resnet152-multi             0   1     356.95   1.0%   7.7%          -1     356.954961   5.00
resnet50                  NaN NaN        NaN    NaN    NaN         NaN            NaN   1.00
rwkv                      NaN NaN        NaN    NaN    NaN         NaN            NaN   1.00
stargan                   NaN NaN        NaN    NaN    NaN         NaN            NaN   1.00
super-slomo               NaN NaN        NaN    NaN    NaN         NaN            NaN   1.00
t5                        NaN NaN        NaN    NaN    NaN         NaN            NaN   2.00
tf32                      NaN NaN        NaN    NaN    NaN         NaN            NaN   0.00
whisper                   NaN NaN        NaN    NaN    NaN         NaN            NaN   1.00

Scores
------
Failure rate:       0.00% (PASS)
Score:               3.01

Delaunay · 2023-11-01T19:03:28Z

Source: /Tmp/slurm.3793160.0/base/runs/rejerebu.2023-11-01_14:20:35.760834
=================
Benchmark results
=================
                         fail   n       perf   sem%   std% peak_memory          score weight
bert-fp16                   0   1     152.11   0.9%   4.8%       24423     152.109606   0.00
bert-fp32                   0   1      29.50   0.1%   0.3%       31387      29.503697   0.00
bert-tf32                   0   1     114.31   0.2%   1.0%       31389     114.310077   0.00
bert-tf32-fp16              0   1     152.44   0.9%   4.7%       24423     152.444346   3.00
bf16                        0   1     294.36   0.2%   1.8%        1611     294.362907   0.00
convnext_large-fp16         0   1     334.82   3.8%  20.3%       27285     334.819560   0.00
convnext_large-fp32         0   1      45.03   0.2%   1.0%       49405      45.026259   0.00
convnext_large-tf32         0   1     124.00   1.8%   9.4%       49405     123.996143   0.00
convnext_large-tf32-fp16    0   1     322.06   3.9%  20.8%       27285     322.056504   3.00
davit_large                 0   1     305.12   0.8%   6.5%       34067     305.119898   1.00
davit_large-multi           0   1     307.03   0.7%   5.6%       34067     307.033831   5.00
dlrm                        0   1  357004.59   1.2%   9.4%        6927  357004.590028   1.00
focalnet                    0   1     384.33   0.6%   4.5%       26165     384.331351   2.00
fp16                        0   1     294.19   0.0%   0.3%        1611     294.192666   0.00
fp32                        0   1      19.13   0.0%   0.1%        1989      19.127785   0.00
opt-1_3b                    1   1        NaN    NaN    NaN          -1            NaN   5.00
opt-1_3b-multinode        NaN NaN        NaN    NaN    NaN         NaN            NaN  10.00
opt-6_7b                    1   1        NaN    NaN    NaN       14041            NaN   5.00
opt-6_7b-multinode        NaN NaN        NaN    NaN    NaN         NaN            NaN  10.00
reformer                    0   1      61.86   0.0%   0.1%       25227      61.857388   1.00
regnet_y_128gf              0   1      87.18   0.5%   4.0%       31377      87.175186   2.00
resnet152                   0   1     670.30   0.8%   6.5%       34277     670.298059   1.00
resnet152-multi             0   1     666.32   1.1%   8.2%       33793     666.319142   5.00
resnet50                    0   1     671.66   5.2%  40.3%        4553     671.657510   1.00
rwkv                        0   1     473.50   0.2%   1.6%        5413     473.502226   1.00
stargan                     0   1      46.71   3.8%  29.2%       37249      46.714831   1.00
super-slomo                 0   1      42.10   0.9%   6.9%       33623      42.101922   1.00
t5                          0   1      48.27   0.7%   5.2%       35267      48.267302   2.00
tf32                        0   1     148.05   0.1%   0.9%        1989     148.048218   0.00
whisper                     0   1     241.36   0.3%   2.7%       36547     241.364055   1.00

Scores
------
Failure rate:       7.14% (FAIL)
Score:              18.21

Errors
------
2 errors, details in HTML report.

Delaunay · 2023-11-01T19:29:57Z

Source: /Tmp/slurm.3792172.0/base/runs/podovepu.2023-11-01_14:48:57.018371
=================
Benchmark results
=================
                         fail   n       perf   sem%   std% peak_memory          score weight
bert-fp16                   0   1     134.01   0.0%   0.3%       24356     134.005684   0.00
bert-fp32                   0   1      28.35   0.1%   0.3%       31320      28.354964   0.00
bert-tf32                   0   1     108.71   0.1%   0.6%       31322     108.705248   0.00
bert-tf32-fp16              0   1     134.01   0.1%   0.4%       24356     134.007693   3.00
bf16                        0   1     294.46   0.2%   1.6%        1544     294.457774   0.00
convnext_large-fp16         0   1     316.19   2.7%  14.8%       27218     316.187095   0.00
convnext_large-fp32         1   1        NaN    NaN    NaN       40852            NaN   0.00
convnext_large-tf32         1   1        NaN    NaN    NaN       40852            NaN   0.00
convnext_large-tf32-fp16    0   1     310.22   3.6%  19.5%       27218     310.218148   3.00
davit_large                 0   1     297.82   0.6%   4.6%       34000     297.823959   1.00
davit_large-multi           0   1     298.50   0.6%   4.8%       34000     298.503954   5.00
dlrm                        0   1  319430.15   0.6%   4.6%        6860  319430.147148   1.00
focalnet                    0   1     376.86   0.5%   4.1%       25664     376.861197   2.00
fp16                        0   1     289.99   0.1%   0.5%        1544     289.985696   0.00
fp32                        0   1      19.18   0.0%   0.0%        1922      19.176587   0.00
opt-1_3b                    1   1        NaN    NaN    NaN          -1            NaN   5.00
opt-1_3b-multinode        NaN NaN        NaN    NaN    NaN         NaN            NaN  10.00
opt-6_7b                    1   1        NaN    NaN    NaN       13974            NaN   5.00
opt-6_7b-multinode        NaN NaN        NaN    NaN    NaN         NaN            NaN  10.00
reformer                    0   1      47.58   0.2%   1.2%       25160      47.581238   1.00
regnet_y_128gf              0   1      78.37   0.5%   3.7%       31310      78.371645   2.00
resnet152                   0   1     608.47   1.1%   8.3%       34218     608.473292   1.00
resnet152-multi             0   1     611.36   0.9%   6.6%       34722     611.362747   5.00
resnet50                    0   1     763.73   4.1%  31.2%        4486     763.728071   1.00
rwkv                        0   1     418.74   0.2%   1.6%        5346     418.743522   1.00
stargan                     0   1      37.88   2.7%  20.8%       37172      37.882975   1.00
super-slomo                 0   1      41.66   1.1%   8.4%       33556      41.662153   1.00
t5                          0   1      42.10   0.0%   0.4%       35200      42.095743   2.00
tf32                        0   1     145.85   0.1%   0.8%        1922     145.846341   0.00
whisper                     0   1     200.34   0.1%   0.5%       36480     200.344352   1.00

Scores
------
Failure rate:      14.29% (FAIL)
Score:              17.48

Errors
------
4 errors, details in HTML report.
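
The failure rates reported above look consistent with dividing the number of failed benchmarks by the number of benchmarks that actually ran (rows whose `fail` column is not NaN): 2 of 28 gives 7.14% and 4 of 28 gives 14.29%. The sketch below reproduces that arithmetic under this assumption; `failure_rate` is a hypothetical helper, not milabench's own code.

```python
def failure_rate(fail_flags):
    """Percentage of failed benchmarks among those that ran.

    `fail_flags` holds 0 (passed), 1 (failed) or None (did not run, shown as NaN).
    """
    ran = [flag for flag in fail_flags if flag is not None]
    if not ran:
        return 0.0
    return 100.0 * sum(ran) / len(ran)


# 30 rows per report: 2 multinode rows did not run, leaving 28.
two_failures = [0] * 26 + [1] * 2 + [None] * 2
four_failures = [0] * 24 + [1] * 4 + [None] * 2

print(f"{failure_rate(two_failures):.2f}%")
print(f"{failure_rate(four_failures):.2f}%")
```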

Pierre Delaunay and others added 4 commits October 16, 2023 13:57

Add flops benchmark

08069b3

Add model flops

8e80c84

Merge branch 'master' of github.com:mila-iqia/milabench into add_flops

bd1e4ef

-

7199fbe

Delaunay force-pushed the add_flops branch 4 times, most recently from d36760c to c194ffe Compare October 26, 2023 17:33

Generate pinned dependencies

eea0249

Delaunay force-pushed the add_flops branch from c194ffe to eea0249 Compare October 26, 2023 17:36

pierre.delaunay added 5 commits October 31, 2023 12:46

Add repeat & number args

ecc19d6

Add tag

a23bd12

Add an activator

be34e4c

Tweaks

c1da61b

Working

5131c2c

Update voir

848c3c3

Delaunay merged commit 8e62d37 into master Nov 6, 2023
1 of 2 checks passed

Delaunay deleted the add_flops branch November 6, 2023 14:37
