Merge pull request #128 from raquellrios/master
New  documentation and tutorials
AndreaVolkamer authored Dec 20, 2023
2 parents d2352d6 + ad2559b commit b4d1942
Showing 23 changed files with 54,345 additions and 32 deletions.
39 changes: 32 additions & 7 deletions README.md
@@ -11,7 +11,21 @@ KinoML
![GitHub closed issues](https://img.shields.io/github/issues-closed-raw/openkinome/kinoml)
![GitHub open issues](https://img.shields.io/github/issues/openkinome/kinoml)

Machine Learning for kinase modeling.
**KinoML** is a modular and extensible framework for machine learning (ML) in small molecule drug discovery with a special focus on kinases. It enables users to easily:
1. **Access and download data**: from online data sources, such as ChEMBL or PubChem, as well as from their own files, with a focus on data availability and immutability.
2. **Featurize data**: so that it is ML-readable. KinoML offers a wide variety of featurization schemes, from ligand-only to ligand:kinase complexes.
3. **Run structure-based experiments**: using KinoML's implemented models, with a special focus on reproducibility.



The purpose of KinoML is to help users conduct ML kinase experiments, from data collection to model evaluation. Tutorials and working examples that show how to run experiments end-to-end with KinoML can be found [here](https://github.com/raquellrios/kinoml/tree/master/tutorials). Although KinoML focuses on kinases, it can be applied to any protein system. For more detailed instructions, please refer to the [Documentation](https://openkinome.org/kinoml/index.html).

A KinoML workflow to achieve points **1, 2** and **3** is illustrated in the following image:

![KinoML object model](kinoml/data/fig_1_kinomltechpaper_v2.png)
**Fig. 1:** KinoML workflow overview. Colors represent objects of the same class.



### Notice

@@ -30,12 +44,23 @@ pip install https://github.com/openkinome/kinoml/archive/master.tar.gz

### Usage

Several notebooks providing usage examples can be found in [examples](https://github.com/openkinome/kinoml/tree/master/examples)
including a [getting started notebook](https://github.com/openkinome/kinoml/blob/master/examples/getting_started.ipynb).
This framework is tightly bound to other repositories:
- [experiments-binding-affinity](https://github.com/openkinome/experiments-binding-affinity) - for advanced and reproducible ML experiments using KinoML
- [kinodata](https://github.com/openkinome/kinodata) - ready-to-use kinase-focused datasets from ChEMBL
### Copyright
The tutorials folder is divided into two parts:

1. [**Getting started**](https://github.com/raquellrios/kinoml/tree/master/tutorials/getting_started): the notebooks in this folder aim to give the user an understanding of how to use KinoML to: (1) **access and download** data, (2) **featurize** data, and (3) **run a** (simple) **ML model** on the featurized data obtained with KinoML to predict ligand binding affinity. Additionally, this folder contains notebooks that explain the **KinoML object model** and how to access the different objects, as well as notebooks **showcasing all the different featurizers** implemented within KinoML and how to use each of them.

2. [**Experiments**](https://github.com/raquellrios/kinoml/tree/master/tutorials/experiments): this folder contains four individual structure-based experiments to predict ligand binding affinity. All experiments use KinoML to obtain the data, featurize it, and train and evaluate an ML model implemented within the `kinoml.ml` class. The purpose of these experiments is to display usage examples of KinoML for conducting end-to-end structure-based kinase experiments.


⚠️ You will need a valid OpenEye License for the structural featurizers of the tutorials to work. For the Schrodinger featurizers tutorial you will also need a Schrodinger License!


Users interested in more KinoML usage examples can check out other repositories under the [OpenKinome](https://github.com/openkinome/) initiative. Two repositories that may be of particular interest are:


- [kinodata](https://github.com/openkinome/kinodata): repository with ready-to-use kinase-focused datasets from ChEMBL, as well as tutorials explaining how to process kinase data for ML applications.
- [experiments-binding-affinity](https://github.com/openkinome/experiments-binding-affinity): more advanced and reproducible ML experiments using KinoML.



Copyright (c) 2019, OpenKinome

2 changes: 1 addition & 1 deletion docs/conf.py
@@ -136,7 +136,7 @@
"html_prettify": True,
"css_minify": True,
"repo_type": "github",
"globaltoc_depth": 2,
"globaltoc_depth": 3,
"color_primary": "#3f51b5",
"color_accent": "blue",
"touch_icon": "images/custom_favicon.png",
26 changes: 19 additions & 7 deletions docs/index.md
@@ -10,30 +10,42 @@ If you are interested in this code, please wait for the official release to use
```

# OpenKinome & KinoML
# KinoML

The [OpenKinome](https://openkinome.org) initiative aims to leverage the increasingly available bioactivity data and scalable computational resources to perform kinase-centric drug design in the context of structure-informed machine learning and free energy calculations. `KinoML` is the main library supporting these efforts.
Welcome to the Documentation of KinoML! The documentation is divided into two parts:

* **User guide**: in this section you will learn how to use KinoML to filter and download data from a database, featurize your kinase data so that it is ML-friendly, and train and evaluate an ML model on your featurized kinase data. You will also learn about the KinoML object model and how to access each of its objects. We also provide detailed examples of how to use every featurizer implemented within KinoML.

* **Experiment tutorials**: this section shows how to use KinoML to run ML experiments. All experiments are structure-based and end-to-end, from data collection to model training and evaluation.



KinoML falls under the [OpenKinome](https://openkinome.org) initiative, which aims to leverage the increasingly available bioactivity data and scalable computational resources to perform kinase-centric drug design in the context of structure-informed machine learning and free energy calculations. `KinoML` is the main library supporting these efforts.

Do you want to know more about the OpenKinome ecosystem? Check its [website](https://openkinome.org).

<!-- Notify Sphinx about the TOC -->

```{toctree}
:caption: User guide
:maxdepth: 1
:maxdepth: 3
:hidden:
notebooks/getting_started.nblink
notebooks/kinoml_object_model.nblink
notebooks/OpenEye_structural_featurizer.nblink
notebooks/Schrodinger_structural_featurizer.nblink
```

```{toctree}
:caption: Tutorials
:maxdepth: 1
:caption: Experiment tutorials
:maxdepth: 2
:hidden:
notebooks/OpenEye_structural_featurizer.nblink
notebooks/Schrodinger_structural_featurizer.nblink
notebooks/ligand-only-smiles-EGFR.nblink
notebooks/ligand-only-morgan1024-EGFR.nblink
notebooks/kinase-ligand-informed-smiles-sequence-EGFR.nblink
notebooks/kinase-ligand-informed-morgan-composition-EGFR.nblink
```

```{toctree}
2 changes: 1 addition & 1 deletion docs/notebooks/OpenEye_structural_featurizer.nblink
@@ -1 +1 @@
{"path": "../../examples/OpenEye_structural_featurizer.ipynb"}
{"path": "../../tutorials/getting_started/OpenEye_structural_featurizer_showcase.ipynb"}
2 changes: 1 addition & 1 deletion docs/notebooks/Schrodinger_structural_featurizer.nblink
@@ -1 +1 @@
{"path": "../../examples/Schrodinger_structural_featurizer.ipynb"}
{"path": "../../tutorials/getting_started/Schrodinger_structural_featurizer_showcase.ipynb"}
2 changes: 1 addition & 1 deletion docs/notebooks/getting_started.nblink
@@ -1 +1 @@
{"path": "../../examples/getting_started.ipynb"}
{"path": "../../tutorials/getting_started/getting_started_with_kinoml.ipynb"}
@@ -0,0 +1 @@
{"path": "../../tutorials/experiments/kinase-ligand-informed-morgan-composition-EGFR/experiments_notebook.ipynb"}
@@ -0,0 +1 @@
{"path": "../../tutorials/experiments/kinase-ligand-informed-smiles-sequence-EGFR/experiment_notebook.ipynb"}
2 changes: 1 addition & 1 deletion docs/notebooks/kinoml_object_model.nblink
@@ -1 +1 @@
{"path": "../../examples/kinoml_object_model.ipynb", "extra-media": ["../../kinoml/data/"]}
{"path": "../../tutorials/getting_started/kinoml_object_model.ipynb", "extra-media": ["../../kinoml/data/"]}
1 change: 1 addition & 0 deletions docs/notebooks/ligand-only-morgan1024-EGFR.nblink
@@ -0,0 +1 @@
{"path": "../../tutorials/experiments/ligand-only-morgan1024-EGFR/experiment_notebook.ipynb"}
1 change: 1 addition & 0 deletions docs/notebooks/ligand-only-smiles-EGFR.nblink
@@ -0,0 +1 @@
{"path": "../../tutorials/experiments/ligand-only-smiles-EGFR/experiment_notebook.ipynb"}
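Each `.nblink` file in this commit is a small JSON object whose `path` points at a notebook, written relative to the `.nblink` file itself. Because the notebooks move from `examples/` to `tutorials/`, it may be worth checking that every link still resolves. For instance, one of the new links targets `experiments_notebook.ipynb` while the others target `experiment_notebook.ipynb`; the diff alone cannot tell whether that difference is intentional. The following is a minimal sketch and not part of the commit: the function name `find_broken_nblinks` and the `docs/notebooks` location used in the example call are assumptions for illustration.

```python
import json
from pathlib import Path


def find_broken_nblinks(notebooks_dir):
    """Return (nblink file name, target path) pairs whose target file is missing."""
    broken = []
    for link_file in sorted(Path(notebooks_dir).glob("*.nblink")):
        # Each .nblink file is JSON with at least a "path" key.
        target = json.loads(link_file.read_text(encoding="utf-8"))["path"]
        # The target is interpreted relative to the directory holding the .nblink file.
        if not (link_file.parent / target).is_file():
            broken.append((link_file.name, target))
    return broken


if __name__ == "__main__":
    # Assumed layout: run from the repository root.
    for name, target in find_broken_nblinks("docs/notebooks"):
        print(f"{name}: missing {target}")
```

Run from the repository root, the script prints one line per link whose notebook cannot be found and prints nothing when all links resolve.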
Binary file added kinoml/data/fig_1_kinomltechpaper_v2.png
Binary file added kinoml/data/first_tutorial_scheme_v2.png
29 changes: 29 additions & 0 deletions tutorials/README.md
@@ -0,0 +1,29 @@
How to use the tutorials folder
==============================
This tutorial folder contains two subfolders:



* **getting_started**: this folder contains four Jupyter notebook tutorials that give the user a general overview of KinoML's potential usage and capabilities.

    * **getting_started_with_kinoml**: this notebook gives a brief overview of KinoML's capabilities. It is divided into three parts that show how to use KinoML to: (1) filter and obtain the desired data from an external data source, (2) featurize this data to make it ML-readable, and (3) train and evaluate an ML model on the featurized data obtained from the previous steps.

    * **kinoml_object_model**: this notebook guides the user through the KinoML object model, showing how to access each object.

    * **OpenEye_structural_featurizer_showcase**: this notebook displays all the OpenEye-based structural modeling featurizers implemented in KinoML and how to use each of them.

    * **Schrodinger_structural_featurizer_showcase**: this notebook introduces the structural modeling featurizers implemented in KinoML that use the molecular modeling capabilities of the Schrodinger Suite to prepare protein structures and to dock small molecules into their binding sites.



* **experiments**: this folder contains four separate structure-based experiments to predict ligand binding affinity to the EGFR kinase. The aim of these notebooks is to showcase how to use KinoML to conduct experiments end-to-end, from obtaining the data from the database to training and evaluating an ML model to predict ligand binding affinity. Users who want to run these notebooks with their own data can do so by adjusting the necessary parameters within the notebooks. All experiments are divided into two parts:

    1. **Featurize the data set**: obtain the data set and featurize it with the featurization pipeline of choice.

    2. **Run the experiment**: the ML model of choice, implemented in the `kinoml.ml` class, is trained and evaluated.


Please note that the order in which the different notebooks are displayed here is the recommended order for running them, providing a more comprehensive understanding of KinoML.

⚠️ You will need a valid OpenEye License for the featurizers of the tutorials to work. For the Schrodinger featurizers tutorial (`Schrodinger_structural_featurizer_showcase.ipynb`) you will also need a Schrodinger License!
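
To make the "Featurize the data set" step more concrete, here is a minimal stand-alone sketch of what turning a ligand into an ML-readable input can look like for a SMILES-based representation. It does not use KinoML and is not taken from the notebooks: the function name `one_hot_smiles`, the character-level alphabet, and the fixed padding length are all assumptions chosen for illustration. KinoML's own featurizers are covered in the getting_started notebooks.

```python
def one_hot_smiles(smiles, alphabet, max_len):
    """One-hot encode a SMILES string character by character (illustrative only).

    Returns a max_len x len(alphabet) matrix as nested lists; rows beyond the
    end of the string stay all-zero (padding).
    """
    if len(smiles) > max_len:
        raise ValueError(f"SMILES longer than max_len={max_len}: {smiles!r}")
    index = {char: i for i, char in enumerate(alphabet)}
    matrix = [[0] * len(alphabet) for _ in range(max_len)]
    for position, char in enumerate(smiles):
        # Hypothetical simplification: two-letter atoms such as "Cl" are split
        # into single characters; a KeyError signals a character outside the alphabet.
        matrix[position][index[char]] = 1
    return matrix


if __name__ == "__main__":
    # Ethanol ("CCO") with a toy two-symbol alphabet, padded to length 4.
    print(one_hot_smiles("CCO", alphabet="CO", max_len=4))
```

With the toy inputs above, the result is `[[1, 0], [1, 0], [0, 1], [0, 0]]`: one row per character position, with the final row left as zero padding. A fixed-size matrix like this is the kind of input an ML model can be trained on in the "Run the experiment" step.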
