Metis - A Python-Based User Interface to Collect Expert Feedback for Generative Chemistry Models

Metis is a GUI for collecting accurate and detailed feedback on small molecules. At its core, it is built around Esben Bjerrum's rdEditor using PySide2.

You can find the preprint on ChemRxiv.

Table of Contents

Metis
- Set Up
  - Installation
    - Dependencies
  - SSH
- Usage
  - Examples
- Settings

Set up

Installation

Download the repository and navigate to the download location. Make sure the environment you want to install into is activated and has Python >= 3.9, < 3.11 installed. Then install Metis by running pip install . (note the trailing dot, which refers to the current directory).
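As a quick sanity check before installing, the Python version constraint stated above can be tested from within the target environment. This is only a minimal sketch; the helper name python_version_supported is hypothetical and not part of Metis.

```python
import sys


def python_version_supported(version_info=None):
    """Return True if the interpreter satisfies python >= 3.9, < 3.11."""
    if version_info is None:
        version_info = sys.version_info
    # Only major and minor matter for the stated constraint.
    major_minor = tuple(version_info[:2])
    return (3, 9) <= major_minor < (3, 11)


if __name__ == "__main__":
    if python_version_supported():
        print("Python version is within the supported range.")
    else:
        print("Unsupported Python version:", sys.version.split()[0])
```

Run it with the interpreter of the environment you plan to install into.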

If you wish to use REINVENT 3 in the backend, also install REINVENT 3 on a remote machine.

Dependencies

Some notes on the dependencies.

PySide 2

Getting the environment set up with PySide2 can be somewhat challenging. A move to PySide6 is planned, and a branch for it already exists, which you can try out. It works but has not yet been fully tested.

scikit-learn

The scikit-learn version constraints are set only to make sure that the examples given here work. In theory, you could use any scikit-learn version. If you want to use REINVENT in the backend, make sure that the scikit-learn version REINVENT uses on the remote machine matches the version used by your local Metis installation.
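One way to compare the local and remote scikit-learn versions is sketched below. The helper names release_tuple and same_release are hypothetical, the remote version string is a placeholder you would replace with the value reported on the remote machine, and treating equal major.minor numbers as a match is an assumption; a stricter setup may want to compare the full version string.

```python
import re
from importlib import metadata


def release_tuple(version, parts=2):
    """Return the first `parts` numeric components of a version string."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:parts])


def same_release(local_version, remote_version, parts=2):
    """Compare two version strings on their leading numeric components."""
    return release_tuple(local_version, parts) == release_tuple(remote_version, parts)


if __name__ == "__main__":
    # Placeholder: paste the version reported on the remote machine here.
    remote = "1.2.2"
    try:
        local = metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        print("scikit-learn is not installed in this environment.")
    else:
        if same_release(local, remote):
            print("Versions match:", local)
        else:
            print(f"Mismatch: local {local}, remote {remote}")
```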

cairosvg

Depending on your OS, installing cairosvg through pip can cause issues because the cairo library is not found. On macOS, you can solve this by installing cairo using Homebrew, or you can install cairosvg using conda.
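To find out whether your environment is affected, you can try importing cairosvg. The sketch below assumes that a missing native cairo library surfaces as an OSError at import time; the helper name cairosvg_available is hypothetical.

```python
def cairosvg_available():
    """Return True if cairosvg and the underlying cairo library can be loaded."""
    try:
        import cairosvg  # noqa: F401
    except (ImportError, OSError):
        # ImportError: the Python package itself is missing.
        # OSError: assumed failure mode when the package is installed
        # but the native cairo library cannot be located.
        return False
    return True


if __name__ == "__main__":
    print("cairosvg usable:", cairosvg_available())
```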

SSH

It is assumed that you have a working installation of REINVENT on a server that runs Slurm and is reachable via SSH.

Change the SSH settings in the example_project/de_novo_files/ssh_settings.yml file.
- ssh_login: your SSH login, e.g. username@remote_server. You should be able to access the remote server without a password, for example by using an RSA key.
- path_remote_folder: path on the remote machine from which REINVENT files are loaded and where they are stored.
- de_novo_json: which default reinvent.json file to use.
- default_slurm: which default Slurm job to use.

Copy metis_reinvent.zip to the remote machine and unzip it. Make sure that path_remote_folder in the ssh_settings.yml file and the paths in initial_reinvent.json match the folder location.
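Before starting a run, it can help to check that the four settings listed above are filled in and that key-based login works. The following is a sketch, not Metis code: the function names are hypothetical, the settings are passed in as a plain dictionary (loading the YAML file is left out), and treating all four keys as mandatory is an assumption. The check relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password.

```python
import subprocess

# Keys described in the SSH section; assumed here to all be mandatory.
REQUIRED_KEYS = ("ssh_login", "path_remote_folder", "de_novo_json", "default_slurm")


def missing_ssh_keys(settings):
    """Return the required keys that are absent or empty in the settings dict."""
    return [key for key in REQUIRED_KEYS if not settings.get(key)]


def passwordless_check_command(ssh_login):
    """Build an ssh command that fails rather than asking for a password."""
    return ["ssh", "-o", "BatchMode=yes", ssh_login, "true"]


def can_login_without_password(ssh_login, timeout=15):
    """Run the check; True means the key-based login succeeded."""
    try:
        result = subprocess.run(
            passwordless_check_command(ssh_login),
            capture_output=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = {"ssh_login": "username@remote_server"}
    print("Missing keys:", missing_ssh_keys(example))
```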

Usage

After installation simply run:

metis -f path/to/settings.yml --output /path/where/to/save/

This will start the GUI. Examples can be found below.
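If you prefer to launch Metis from a script, the command above can be wrapped as follows. This is a small sketch that only uses the -f and --output options shown above; the function names build_metis_command and run_metis are hypothetical.

```python
import subprocess
from pathlib import Path


def build_metis_command(settings_file, output_dir):
    """Return the argument list for the metis command line shown above."""
    return ["metis", "-f", str(settings_file), "--output", str(output_dir)]


def run_metis(settings_file, output_dir):
    """Start the GUI and return the exit code of the metis process."""
    settings_file = Path(settings_file)
    if not settings_file.is_file():
        raise FileNotFoundError(f"Settings file not found: {settings_file}")
    completed = subprocess.run(build_metis_command(settings_file, output_dir))
    return completed.returncode


if __name__ == "__main__":
    # Only print the command here; call run_metis(...) to actually start the GUI.
    print(" ".join(build_metis_command("path/to/settings.yml", "/path/where/to/save/")))
```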

Examples

UI Only

In the simplest example, only the GUI is started to collect feedback. No models are trained and no de novo run is started.

Note: If you show the atom contributions to the predictions (the model explanation) by setting show_atom_contributions: render: True, you will experience heavy slowdowns when switching to a new molecule. The only solution at the moment is not to show them. Setting show_atom_contributions: render: False yields a much smoother experience.

cd example_project
metis -f settings_ui.yml --output results/

Reward Model

Here, in addition to collecting feedback, a reward model is trained on that feedback. For this, we provide a QSAR model and an oracle model for JNK3 activity. With the setting use_oracle_score: False, the human feedback is used as the target variable to be predicted. With use_oracle_score: True, the molecules liked by the chemist are scored by the oracle, and these scores are then used as the target variable for the reward model. This can be thought of as an active learning setting in which the chemist decides which molecules are "biologically validated".

cd example_project
metis -f settings_reward_model.yml --output results/
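The effect of use_oracle_score on the training targets of the reward model can be illustrated with a small sketch. This is not the Metis implementation: the function name build_training_targets is hypothetical, feedback is reduced to a liked/not liked flag encoded as 1/0, and the oracle is any callable that maps a molecule to a score. It also assumes that with use_oracle_score set to True only the liked molecules enter the training set.

```python
def build_training_targets(molecules, liked, use_oracle_score, oracle=None):
    """Return (molecule, target) pairs for fitting a reward model.

    molecules: list of molecule identifiers, e.g. SMILES strings
    liked: list of booleans, one per molecule, from the human feedback
    oracle: callable scoring a molecule; needed if use_oracle_score is True
    """
    if len(molecules) != len(liked):
        raise ValueError("molecules and liked must have the same length")

    if not use_oracle_score:
        # The human feedback itself is the target variable.
        return [(mol, int(flag)) for mol, flag in zip(molecules, liked)]

    if oracle is None:
        raise ValueError("use_oracle_score=True requires an oracle")

    # Only molecules the chemist liked are "validated" by the oracle,
    # and the oracle scores become the target variable.
    return [(mol, oracle(mol)) for mol, flag in zip(molecules, liked) if flag]


if __name__ == "__main__":
    smiles = ["CCO", "c1ccccc1", "CC"]
    feedback = [True, False, True]
    toy_oracle = len  # stand-in for a real activity model
    print(build_training_targets(smiles, feedback, use_oracle_score=False))
    print(build_training_targets(smiles, feedback, use_oracle_score=True, oracle=toy_oracle))
```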

De Novo Design

With these settings, a REINVENT de novo run can be started on a remote machine directly from Metis. The remote machine needs:

- a working installation of REINVENT 3
- REINVENT's scikit-learn updated to a version > 1.0.0
- Slurm
- SSH access with a key
- the unzipped example_project/metis_reinvent.zip folder

Once the folder is copied and unzipped, adapt the paths and settings in the de_novo_files folder to match your paths on the remote machine.

cd example_project
metis -f settings_denovo.yml --output results/

Settings

Here is a brief overview of all settings.

Name	Type	Required	Default
seed	Union[int, None]	False
tutorial	bool	False	False
debug	bool	False	False
max_iterations	int	True	...
innerloop_iterations	Union[int, None]	False	None
activity_label	str	True	...
introText	str	True	...
propertyLabels	Dict	True	...
data	DataConfig	True	...
ui	UIConfig	True	...
de_novo_model	Union[DeNovoConfig, None]	False	None
reward_model	Union[RewardModelConfig, None]	False	None

debug: if True, existing results folders are overwritten
max_iterations: how often molecules are sampled, feedback is collected, and the model is updated
innerloop_iterations: how often molecules are resampled from the same scaffold memory before the model is sent to the remote machine

DataConfig

Name	Type	Required	Default
initial_path	str	True	...
path	str	True	...
selection_strategy	str	True	...
num_molecules	int	True	...
run_name	str	True	...

initial_path: path to the initial dataset, i.e. the molecules that are evaluated first
path: path to the subsequent datasets, which come from the server and are generated by REINVENT; it should end in scaffold_memory.csv
selection_strategy: how to pick which molecules to show
num_molecules: how many molecules to show
run_name: name of the run, under which the results are stored
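The DataConfig fields described above can be mirrored in a small data class to catch obvious mistakes early. This is a sketch only: the class name DataConfigSketch is hypothetical, Metis may define and validate its configuration differently, and the requirement that num_molecules be positive is an assumption. The values in the example are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataConfigSketch:
    """Illustrative mirror of the DataConfig table; all fields are required."""

    initial_path: str
    path: str
    selection_strategy: str
    num_molecules: int
    run_name: str

    def __post_init__(self):
        # Assumption: showing zero or fewer molecules makes no sense.
        if self.num_molecules <= 0:
            raise ValueError("num_molecules must be positive")
        # Taken from the description of `path` above.
        if not self.path.endswith("scaffold_memory.csv"):
            raise ValueError("path should end in scaffold_memory.csv")


if __name__ == "__main__":
    config = DataConfigSketch(
        initial_path="data/initial.csv",            # placeholder path
        path="results/scaffold_memory.csv",         # placeholder path
        selection_strategy="placeholder_strategy",  # placeholder value
        num_molecules=10,
        run_name="demo_run",
    )
    print(config)
```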

UIConfig

Name	Type	Required	Default
show_atom_contributions	AdditionalWindowsConfig	False	{'render': False, 'path': None, 'ECFP': None}
show_reference_molecules	AdditionalWindowsConfig	False	{'render': False, 'path': None, 'ECFP': None}
tab	TabConfig	True	...
navigationbar	NavigationbarConfig	True	...
general	GeneralConfig	True	...
substructures	SubstructureConfig	True	...
global_properties	GlobalPropertiesConfig	True	...

AdditionalWindowsConfig

Name	Type	Required	Default
render	bool	False	False
path	Union[str, None]	False
ECFP	Union[ECFPConfig, None]	False

ECFPConfig

Name	Type	Required	Default
bitSize	int	True	...
radius	int	True	...
useCounts	bool	False	False

TabConfig

Name	Type	Required	Default
render	bool	True	...
tab_names	List	True	...

render: if False, the additional tabs are not rendered

NavigationbarConfig

Name	Type	Required	Default
sendButton	NavButtonConfig	True	...
editButton	NavButtonConfig	True	...

NavButtonConfig

Name	Type	Required	Default
render	bool	False	False

GeneralConfig

Name	Type	Required	Default
render	bool	False	True
slider	bool	False	False

SubstructureConfig

Name	Type	Required	Default
render	bool	False	False
liabilities	Dict	True	...

Liabilities control which properties you can select substructures for. Keys such as ugly or tox are only used internally by the script. name defines the label of the button, and color defines the color of the button as well as the color of the atom highlight.

liabilities:
      ugly:
        name: "Mutagenicity"
        color: "#ff7f7f"
      tox:
        name: "Toxicity" 
        color: "#51d67e"
      stability:
        name: "Stability"
        color: "#eed358"
      like:
        name: "Good"
        color: "#9542f5"
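A liabilities mapping like the one above can be checked for missing fields before starting the GUI. The sketch below works on a plain dictionary with the same structure; the function name liability_problems is hypothetical, and requiring colors in the six-digit #rrggbb form is an assumption based on the example values, since Metis may accept other color formats.

```python
import re

# Assumption: colors are given as six-digit hex codes, as in the example above.
HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")


def liability_problems(liabilities):
    """Return a list of human-readable problems found in a liabilities dict."""
    problems = []
    for key, entry in liabilities.items():
        for field in ("name", "color"):
            if not entry.get(field):
                problems.append(f"{key}: missing {field}")
        color = entry.get("color")
        if color and not HEX_COLOR.match(color):
            problems.append(f"{key}: color {color!r} is not of the form #rrggbb")
    return problems


if __name__ == "__main__":
    liabilities = {
        "ugly": {"name": "Mutagenicity", "color": "#ff7f7f"},
        "tox": {"name": "Toxicity", "color": "#51d67e"},
        "stability": {"name": "Stability", "color": "#eed358"},
        "like": {"name": "Good", "color": "#9542f5"},
    }
    print(liability_problems(liabilities) or "No problems found.")
```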

GlobalPropertiesConfig

Name	Type	Required	Default
render	bool	False	False
liabilities	List	True	...

DeNovoConfig

Name	Type	Required	Default
ssh_settings	str	True	...
use_human_scoring_func	bool	False	False
use_reward_model	bool	False	False

RewardModelConfig

Name	Type	Required	Default
use_oracle_score	bool	False	True
weight	Union[str, None]	False	None
oracle_path	Union[str, None]	False	None
qsar_model_path	str	True	...
training_data_path	str	True	...
ECFP	ECFPConfig	True	...

use_oracle_score: instead of using the feedback directly to train the reward model, the oracle model can be used to score the molecules liked by the user. The reward model is then trained on the predictions of the oracle rather than on the direct feedback. This mimics an active learning scenario in which the chemist chooses which molecules to validate biologically.
