This repository contains the source code for the Off-Policy RL for Diffusion Samplers website.
If you find this work useful, please cite:
@inproceedings{venkatraman2024amortizing,
  title={Amortizing intractable inference in diffusion models for vision, language, and control},
  author={Siddarth Venkatraman and Moksh Jain and Luca Scimeca and Minsu Kim and Marcin Sendera and Mohsin Hasan and Luke Rowe and Sarthak Mittal and Pablo Lemos and Emmanuel Bengio and Alexandre Adam and Jarrid Rector-Brooks and Yoshua Bengio and Glen Berseth and Nikolay Malkin},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://openreview.net/forum?id=gVTkMsaaGI}
}
@inproceedings{sendera2024improved,
  title={Improved off-policy training of diffusion samplers},
  author={Marcin Sendera and Minsu Kim and Sarthak Mittal and Pablo Lemos and Luca Scimeca and Jarrid Rector-Brooks and Alexandre Adam and Yoshua Bengio and Nikolay Malkin},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://openreview.net/forum?id=vieIamY2Gi}
}
This website is forked from the Understanding RLHF website.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.