This repository contains the source code for the paper Deep Inverse Reinforcement Learning for Structural Evolution of Small Molecules. The work proposes a framework for training compound generators using Deep Inverse Reinforcement Learning.
Library/Project | Version |
---|---|
pytorch | 1.3.0 |
numpy | 1.18.4 |
ptan | 0.6 |
tqdm | 4.35.0 |
scikit-learn | 0.23.1 |
joblib | 0.13.2 |
soek | 0.0.1 |
pandas | 1.0.3 |
xgboost | 0.90 |
rdkit | 2019.09.3.0 |
gym | 0.15.6 |
To install the dependencies, we suggest you install Anaconda first and then follow the commands below:
- Create anaconda environment
$ conda create -n irelease python=3.7
- Activate environment
$ conda activate irelease
- Install the dependencies above according to their official websites or documentation.
For instance, you can install XGBoost using the command:
$ pip install xgboost==0.90
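Once the environment is set up, a quick check can confirm that the installed versions match the table above. The following is a minimal sketch and not part of the repository: the PINNED mapping copies a few rows from the table, the distribution names (for example torch for PyTorch) are assumptions about how the packages were installed, and check_versions is a hypothetical helper.

```python
try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # Python 3.7 needs the importlib_metadata backport to be installed.
    from importlib_metadata import version, PackageNotFoundError

# A subset of the version table; distribution names are assumptions.
PINNED = {
    "torch": "1.3.0",
    "numpy": "1.18.4",
    "tqdm": "4.35.0",
    "scikit-learn": "0.23.1",
    "pandas": "1.0.3",
    "xgboost": "0.90",
    "gym": "0.15.6",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_versions(pinned, lookup=installed_version):
    """Return {name: (expected, found)} for every package whose version differs."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result from check_versions(PINNED) means every listed package matches its pinned version.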
The demonstration datasets used in the experiments are as follows:
Experiment | Dataset |
---|---|
DRD2 Activity | drd2_active_filtered.smi |
LogP | logp_smiles_biased.smi |
JAK2 Max | jak2_max_smiles_biased.smi |
JAK2 Min | jak2_min_smiles_biased.smi |
The datasets used for training the models used as evaluation functions are:
Experiment | Dataset |
---|---|
DRD2 Activity | drd2_bin_balanced.csv |
LogP | logP_labels.csv |
JAK2 Max and Min | jak2_data.csv |
Pretraining dataset: chembl.smi
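The .smi files are plain-text SMILES files. As a sketch, assuming one molecule per line with the SMILES string as the first whitespace-separated token (a common convention that is not confirmed here for these particular files), they could be loaded as follows. read_smiles is a hypothetical helper, not a function from this repository.

```python
from pathlib import Path


def read_smiles(path):
    """Return the SMILES strings in a .smi file, skipping blank lines."""
    smiles = []
    with Path(path).open() as handle:
        for line in handle:
            tokens = line.split()
            if tokens:  # ignore empty lines
                # Keep only the first token; any trailing name/ID column is dropped.
                smiles.append(tokens[0])
    return smiles
```

For example, read_smiles("../data/drd2_active_filtered.smi") would return the DRD2 demonstrations as a list of strings, under the assumption above.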
Install the project as a standard python package from the project directory:
$ pip install -e .
Then cd into the proj directory:
$ cd proj/
The Stack-RNN model used in our work can be pretrained with the following command:
$ cd proj
$ python pretrain_rnn.py --data ../data/chembl.smi
The pretrained model we used can be downloaded from here.
The evaluation function for the DRD2 experiment is an RNN classifier trained with the BCE loss function. The following is the command to train the model using 5-fold cross validation:
$ python expert_rnn_bin.py --data_file ../data/drd2_bin_balanced.csv --cv
After training, the evaluation can be done using:
$ python expert_rnn_bin.py --data_file ../data/drd2_bin_balanced.csv --cv --eval --eval_model_dir ./model_dir/expert_rnn_bin/
Note: The value of the --eval_model_dir flag is a directory that contains the 5 models saved from the CV training stage.
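For readers unfamiliar with 5-fold cross validation: the data is split into five parts, and each of the five models is trained on four parts and validated on the remaining one. The sketch below illustrates the splitting in plain Python. It is not how expert_rnn_bin.py implements it, and k_fold_indices is a hypothetical helper (scikit-learn's KFold class offers the same functionality).

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        # The training set is the union of all folds except the validation fold.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx
```

Every sample appears in exactly one validation fold, so each of the five saved models has been validated on a different fifth of the data.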
The evaluation function for the LogP optimization experiment is an RNN model trained using the MSE loss function. The following command invokes training:
$ python expert_rnn_reg.py --data_file ../data/logP_labels.csv --cv
After training, the evaluation can be done using:
$ python expert_rnn_reg.py --data_file ../data/logP_labels.csv --cv --eval --eval_model_dir ./model_dir/expert_rnn_reg/
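The two loss functions named above, BCE for the DRD2 classifier and MSE for the LogP regressor, can be written out in a few lines. This is a plain-Python sketch of the standard definitions for illustration only; the training scripts presumably rely on PyTorch's own loss implementations, and bce_loss and mse_loss are hypothetical names.

```python
import math


def bce_loss(targets, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    total = 0.0
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(targets)


def mse_loss(targets, preds):
    """Mean squared error between real-valued targets and predictions."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(targets, preds)) / len(targets)
```

BCE suits the binary active/inactive labels of the DRD2 data, while MSE suits the continuous LogP values.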
We trained XGBoost models for the JAK2 maximization experiment. The same XGBoost models were used for the JAK2 minimization experiment, as mentioned in the paper.
The following invokes the training process:
$ python expert_xgb_reg.py --data_file ../data/jak2_data.csv --cv
Evaluation can be done using:
$ python expert_xgb_reg.py --data_file ../data/jak2_data.csv --cv --eval --eval_model_dir ./model_dir/expert_xgb_reg/
The following files are used for PPO training for both DIRL and IRL:
- DRD2 Activity: ppo_rl_drd2.py
- LogP Optimization: ppo_rl_logp.py
- JAK2 Maximization: ppo_rl_jak2_minmax.py
- JAK2 Minimization: ppo_rl_jak2_min.py
For DRL training, the following files are used:
- DRD2 Activity: reinforce_rl_drd2.py
- LogP Optimization: reinforce_rl_logp.py
- JAK2 Maximization: reinforce_rl_jak2_minmax.py
- JAK2 Minimization: ppo_rl_jak2_min.py
These files mostly share command line flags for training. For instance, to train a generator with the DRD2 demonstrations (DIRL) the following command could be used:
$ python ppo_rl_drd2.py --exp_name drd2 --demo ../data/drd2_active_filtered.smi --unbiased ../data/unbiased_smiles.smi --prior_data ../data/chembl.smi --pretrained_model irelease_prior.mod
For DRL, just add the flag --use_true_reward:
$ python ppo_rl_drd2.py --exp_name drd2 --demo ../data/drd2_active_filtered.smi --unbiased ../data/unbiased_smiles.smi --prior_data ../data/chembl.smi --pretrained_model irelease_prior.mod --use_true_reward
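Since the DIRL and DRL commands differ only in the --use_true_reward flag, a small wrapper can help when scripting several runs. The sketch below only assembles the argument list from the flags shown above. build_train_command is a hypothetical helper, not part of the repository, and it assumes the scripts accept exactly these flags.

```python
def build_train_command(script, exp_name, demo, unbiased, prior_data,
                        pretrained_model, use_true_reward=False):
    """Return the training command as an argument list for subprocess.run."""
    cmd = [
        "python", script,
        "--exp_name", exp_name,
        "--demo", demo,
        "--unbiased", unbiased,
        "--prior_data", prior_data,
        "--pretrained_model", pretrained_model,
    ]
    if use_true_reward:
        # DRL: train against the true reward instead of the learned one.
        cmd.append("--use_true_reward")
    return cmd


if __name__ == "__main__":
    cmd = build_train_command(
        "ppo_rl_drd2.py", "drd2",
        "../data/drd2_active_filtered.smi",
        "../data/unbiased_smiles.smi",
        "../data/chembl.smi",
        "irelease_prior.mod",
        use_true_reward=True,
    )
    print(" ".join(cmd))
```

Passing the returned list to subprocess.run(cmd, check=True) from inside the proj directory would launch the run.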
Assuming the training phase produces the model biased_generator.mod, compound samples, in the form of SMILES, can be generated using:
$ python pretrain_rnn.py --data ../data/chembl.smi --eval --eval_model_name biased_generator.mod --num_smiles 1000
The --num_smiles flag controls the number of SMILES (valid and invalid) that are sampled from the generator.
After generation, a JSON file is produced that contains the valid and invalid SMILES. In our experiments, we process this .json file using smiles_worker.py to save the valid SMILES into a CSV file.
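To illustrate the filtering step, the sketch below keeps only the strings that RDKit can parse and writes them to a CSV file. It is not the code of smiles_worker.py. The layout of the generated JSON file is not described here, so the sketch starts from a plain list of SMILES strings. save_valid_smiles, rdkit_is_valid, and the smiles column name are hypothetical.

```python
import csv


def rdkit_is_valid(smiles):
    """Return True if RDKit can parse the string (requires rdkit to be installed)."""
    from rdkit import Chem  # imported lazily so the rest works without RDKit
    return Chem.MolFromSmiles(smiles) is not None


def save_valid_smiles(smiles_list, csv_path, is_valid=rdkit_is_valid):
    """Write the valid SMILES to csv_path and return how many were kept."""
    kept = [smi for smi in smiles_list if is_valid(smi)]
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["smiles"])  # header row; the column name is an assumption
        writer.writerows([smi] for smi in kept)
    return len(kept)
```

The is_valid parameter lets a different validity check be swapped in, for example when RDKit is not available in the current environment.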
A sample JSON file produced after SMILES generation is here. The corresponding processed CSV file containing the valid SMILES and the evaluation function's predictions is also here.
We thank the authors of ReLeaSE for their original implementation of Stack-RNN. We thank Maxim Lapan for his book on DRL and the ptan project. We also acknowledge the post of int8 on Monte-Carlo Tree Search.
@article{10.1093/bib/bbaa364,
author = {Agyemang, Brighter and Wu, Wei-Ping and Addo, Daniel and Kpiebaareh, Michael Y and Nanor, Ebenezer and Roland Haruna, Charles},
title = "{Deep inverse reinforcement learning for structural evolution of small molecules}",
journal = {Briefings in Bioinformatics},
year = {2020},
month = {12},
issn = {1477-4054},
doi = {10.1093/bib/bbaa364},
url = {https://doi.org/10.1093/bib/bbaa364},
note = {bbaa364},
}