Nice work! However, I have a question. Since I'm not very familiar with Reinforcement Learning, I wonder which part of this uses RL? Section 3.3.2 (fine-tuning) says "Update the model P(G,S) on the fine-tuning set $D^f$ using policy gradient method", which suggests RL is used there. However, in the code, it just computes the topology, atom, and bond type losses between the expanded $S_i$ and $G_i^k$.
Thanks!
Allow me to comment. I don't think you will find algorithms like REINFORCE here. What Jin meant is improving p(G|S) using the desired chemical properties. This happens in the property_filter() method in finetune.py.
For posterity (happy to be corrected by the authors), what the fine-tuning step (and property_filter) seems to consist of is: generate $Nm$ datapoints from the $m$ rationales, filter those points to keep only the ones with the desired properties, and take a maximum-likelihood step on the remaining points.
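To make that concrete, here is a minimal sketch of such a generate-filter-MLE step, assuming a hypothetical interface: `model.sample`, `model.nll`, and the `property_ok` predicate are stand-ins, not the repository's actual method names (only property_filter() appears in finetune.py).

```python
import torch

def finetune_step(model, optimizer, rationales, property_ok, n_per_rationale=20):
    """One generate -> filter -> maximum-likelihood step (sketch, assumed interface)."""
    # 1. Expand each of the m rationales S into N candidate molecules G ~ p(G | S),
    #    giving N*m candidates in total.
    candidates = [(S, model.sample(S))
                  for S in rationales
                  for _ in range(n_per_rationale)]

    # 2. Keep only the candidates with the desired chemical properties
    #    (the role played by property_filter() in finetune.py).
    accepted = [(S, G) for (S, G) in candidates if property_ok(G)]

    # 3. Maximum-likelihood update on the survivors: minimize -log p(G | S).
    if accepted:
        optimizer.zero_grad()
        loss = torch.stack([model.nll(G, S) for (S, G) in accepted]).mean()
        loss.backward()
        optimizer.step()

    return len(accepted), len(candidates)
```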
This might be a totally valid thing to do, but it is indeed far from a policy-gradient method. In the strictest sense, you could interpret it as baseline-less REINFORCE with $R=0$ for all the rejected points (which makes their gradient contribution 0) and $R=1$ for the points that property_filter keeps. In practice this is not a recommended RL setup, as it has all sorts of instabilities.
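To spell out that correspondence: the REINFORCE estimator of the policy gradient is

$$\nabla_\theta \, \mathbb{E}_{G \sim p_\theta(\cdot \mid S)}\big[R(G)\big] \approx \frac{1}{N} \sum_{i=1}^{N} R(G_i)\, \nabla_\theta \log p_\theta(G_i \mid S),$$

so with $R(G_i) \in \{0, 1\}$ only the accepted samples contribute, and the estimator reduces (up to the $1/N$ vs. $1/|\text{accepted}|$ normalization) to the gradient of the average log-likelihood over the filtered set, i.e. exactly the maximum-likelihood step described above.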