Fix rlhf on trl v0.11.0 #290

Merged
merged 1 commit into from
Oct 2, 2024

satyaog
Member

@satyaog satyaog commented Sep 23, 2024

Tested on:

remote [stdout] =================
remote [stdout] Benchmark results
remote [stdout] =================
remote [stdout]
remote [stdout] System
remote [stdout] ------
remote [stdout] cpu:      AMD EPYC 7543 32-Core Processor
remote [stdout] n_cpu:    64
remote [stdout] product:  NVIDIA A100-SXM4-80GB
remote [stdout] n_gpu:    1
remote [stdout] memory:   81920.0
remote [stdout]
remote [stdout] Breakdown
remote [stdout] ---------
remote [stdout] bench       | fail |   n | ngpu |       perf |   sem% |   std% | peak_memory |      score | weight
remote [stdout] rlhf-single |    0 |   1 |    1 |    2589.65 |   0.5% |   4.2% |       12455 |    2589.65 |   1.00
remote [stdout]
remote [stdout] Scores
remote [stdout] ------
remote [stdout] Failure rate:       0.00% (PASS)
remote [stdout] Score:            2589.65

=================
Benchmark results
=================

System
------
cpu:      AMD EPYC 7543 32-Core Processor
n_cpu:    64
product:  NVIDIA A100-SXM4-80GB
n_gpu:    4
memory:   81920.0

Breakdown
---------
bench     | fail |   n | ngpu |       perf |   sem% |   std% | peak_memory |      score | weight
rlhf-gpus |    0 |   1 |    4 |    7179.24 |   0.4% |   2.9% |       21351 |    7179.24 |   1.00

Scores
------
Failure rate:       0.00% (PASS)
Score:            7179.24
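
For anyone who wants to compare the two runs above, here is a minimal sketch that parses a `Breakdown` row into a dict and derives per-GPU throughput. The helper names (`parse_breakdown_row`, `perf_per_gpu`) are hypothetical and not part of the benchmark tooling. The unit of `perf` is not stated in the output, and `rlhf-single` and `rlhf-gpus` may not be configured identically, so treat the ratio as a rough indication only.

```python
def parse_breakdown_row(header: str, row: str) -> dict:
    """Map column names to cell values for one pipe-separated Breakdown row."""
    names = [cell.strip() for cell in header.split("|")]
    values = [cell.strip() for cell in row.split("|")]
    return dict(zip(names, values))


def perf_per_gpu(result: dict) -> float:
    """Divide the reported perf by the number of GPUs used for the run."""
    return float(result["perf"]) / int(result["ngpu"])


# Rows copied from the two Breakdown tables in this comment.
HEADER = "bench | fail | n | ngpu | perf | sem% | std% | peak_memory | score | weight"
SINGLE_ROW = "rlhf-single | 0 | 1 | 1 | 2589.65 | 0.5% | 4.2% | 12455 | 2589.65 | 1.00"
MULTI_ROW = "rlhf-gpus | 0 | 1 | 4 | 7179.24 | 0.4% | 2.9% | 21351 | 7179.24 | 1.00"

single = parse_breakdown_row(HEADER, SINGLE_ROW)
multi = parse_breakdown_row(HEADER, MULTI_ROW)

# Ratio of the 4-GPU perf to the 1-GPU perf (illustrative, see caveats above).
speedup = float(multi["perf"]) / float(single["perf"])

if __name__ == "__main__":
    print(f"single-GPU perf per GPU: {perf_per_gpu(single):.2f}")
    print(f"4-GPU perf per GPU:      {perf_per_gpu(multi):.2f}")
    print(f"perf ratio (4 GPU / 1):  {speedup:.2f}")
```

The parser assumes every row has the same number of `|`-separated cells as the header, which holds for the two tables shown here.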

@Delaunay Delaunay merged commit 34f56e7 into mila-iqia:master Oct 2, 2024
1 of 3 checks passed