Feat: unified gae #1129
base: develop
Looks much nicer! Just some minor nitpicks. Of course we should wait for the final benchmarks, but if they're all good then I'm happy.
Please make an issue for doing this to Sable/MAT, unless you're planning on doing it now (which would be great).
My bad, should be N instead of A for number of agents.
Co-authored-by: Sasha Abramowitz <[email protected]>
@SimonDuToit could you update MAT and Sable to use the GAE function from this util file? 🙏
Thanks @SimonDuToit for the changes. Could you just fix the format and type issues that I mentioned? 🙏
Makes all PPO systems use a single shared function for calculating the generalised advantage estimate (GAE). This is made possible by having the feed-forward systems use the dones from the previous step, which brings them in line with the recurrent systems. Benchmark results.
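For readers following along, here is a rough sketch of what a shared GAE function of this kind could look like in JAX. It is not the code from this PR: the name `calculate_gae`, its signature, and the default `gamma` and `gae_lambda` values are all hypothetical. The sketch assumes the convention described above, where `dones[t]` records whether the step before `t` ended an episode, so the flag that masks the bootstrap at step `t` is the one stored at `t + 1` (or `last_done` for the final step).

```python
import jax
import jax.numpy as jnp


def calculate_gae(rewards, values, dones, last_value, last_done,
                  gamma=0.99, gae_lambda=0.95):
    """Hypothetical shared GAE helper (illustrative, not the PR's code).

    rewards, values, dones: arrays with a leading time axis [T, ...].
    dones[t] is assumed to flag that the step *before* t ended an episode.
    last_value, last_done: value and done flag for the step after the rollout.
    Returns (advantages, value_targets), both shaped like `values`.
    """
    # Cast flags to the value dtype so the scan carry keeps a single dtype.
    dones = dones.astype(values.dtype)
    last_done = jnp.asarray(last_done, dtype=values.dtype)

    def step(carry, x):
        gae, next_value, next_done = carry
        done, value, reward = x
        not_done = 1.0 - next_done
        # One-step TD error; the bootstrap is masked if the episode ended.
        delta = reward + gamma * next_value * not_done - value
        gae = delta + gamma * gae_lambda * not_done * gae
        # This step's value and done become "next" for the earlier step.
        return (gae, value, done), gae

    init = (jnp.zeros_like(last_value), last_value, last_done)
    # reverse=True walks the trajectory from the final step back to the first.
    _, advantages = jax.lax.scan(
        step, init, (dones, values, rewards), reverse=True
    )
    return advantages, advantages + values
```

Because the done flag travels in the scan carry, feed-forward and recurrent rollouts can share this function as long as both store dones with the same one-step offset.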