Merge pull request #22 from volkamerlab/polish
Polish READMEs
dominiquesydow authored Jul 22, 2020
2 parents ff9bde0 + 9d05253 commit 0ed7a41
Showing 7 changed files with 30 additions and 33 deletions.
15 changes: 6 additions & 9 deletions README.md
@@ -13,15 +13,12 @@

## Repository content

This repository holds
(i) fragment library data,
(ii) a *quick start* notebook explaining how to load and use the library, alongside
(iii) notebooks covering the full analyses regarding the fragment library and the combinatorial library as described in
the corresponding paper.

data/
notebooks/
environment.yml
This repository holds the following resources:

1. Fragment library data and a link to the combinatorial library data.
2. *Quick start* notebook explaining how to load and use the fragment library.
3. Notebooks covering the full analyses regarding the fragment and combinatorial libraries as described in
the corresponding paper.

Please find a detailed description of the files in `data/` and `notebooks/` in the folders' `README` files.

6 changes: 3 additions & 3 deletions data/README.md
@@ -1,9 +1,9 @@
# Data

Overview on data content:
Overview on data content.

- `fragment_library/`: Fullfragment library resulting from the KinFragLib fragmentation procedure comprises about 3,000 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases.
- `fragment_library_filtered/`: Filtered fragment library: Select fragments meaningful for the recombination (remove pool X, deduplicate per subpocket, remove unfragmented ligands, remove all fragments that connect only to pool X, keep only fragment-like fragments, and filter for hinge-like AP fragments).
- `fragment_library/`: Full fragment library resulting from the KinFragLib fragmentation procedure comprises about 3000 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases.
- `fragment_library_filtered/`: Filtered fragment library: Select fragments tailored for the recombination (remove pool X, deduplicate per subpocket, remove unfragmented ligands, remove all fragments that connect only to pool X, keep only fragment-like fragments, and filter for hinge-like AP fragments).
- `fragment_library_reduced/`: Reduced fragment library: Select a diverse set of fragments (per subpocket) for recombination starting from the filtered fragment library.
- `combinatorial_library/`: Combinatorial library based on the reduced fragment library.
- `external/`: Data from external resources.
20 changes: 10 additions & 10 deletions data/fragment_library/README.md
@@ -17,15 +17,15 @@ Fragments are organized by the subpockets they occupy. Each fragment subpocket p

Each fragment contains the following information:

- 3D coordinates of the fragment's atoms as reported in the corresponding KLIFS complex structure file
- 3D coordinates of the fragment's atoms as reported in the corresponding KLIFS complex structure file.
- `kinase`, `family`, and `group`:
*Kinase* name, *family* and *group* of the kinase that the ligand (from which the fragment originates) was
co-crystallized with
co-crystallized with.
- `complex_pdb`, `ligand_pdb`, `alt`, and `chain`:
*PDB complex* and *ligand ID*, *alternate model* and *chain* for the KLIFS structure that the ligand
(from which the fragment originates) was co-crystallized with
- `atom.prop.subpocket`: Subpocket assignment for each of the fragment's atoms
- `atom.prop.environment`: BRICS environment IDs for each of the fragment's atoms
(from which the fragment originates) was co-crystallized with.
- `atom.prop.subpocket`: Subpocket assignment for each of the fragment's atoms.
- `atom.prop.environment`: BRICS environment IDs for each of the fragment's atoms.

Please refer to `notebooks/1_1_quick_start.ipynb` on how to load and work with this dataset.
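
As a rough illustration of the per-fragment fields listed above, the sketch below reads data items from an SDF-style record using only the Python standard library. It assumes, which this README does not state, that the fragments are stored as SDF records with fields such as `kinase` and `atom.prop.subpocket` written as `> <name>` data items. The helper name `read_sdf_data_items` and the example record (atom block omitted, values made up) are hypothetical; the quick start notebook remains the reference for loading the real data.

```python
def read_sdf_data_items(record: str) -> dict[str, str]:
    """Collect `> <name>` data items from one SDF-style record (hypothetical helper)."""
    items: dict[str, str] = {}
    name = None
    for line in record.splitlines():
        if line.startswith(">") and "<" in line and ">" in line[1:]:
            # Header line such as "> <kinase>": the field name sits between < and >.
            name = line[line.index("<") + 1 : line.rindex(">")]
            items[name] = ""
        elif name is not None:
            if line.strip() == "" or line.startswith("$$$$"):
                name = None  # A blank line or the record terminator ends the field.
            else:
                # Multi-line values are joined with a newline.
                items[name] = line if items[name] == "" else items[name] + "\n" + line
    return items


# Made-up record for illustration only; a real record would start with an atom block.
EXAMPLE_RECORD = """\
> <kinase>
EGFR

> <atom.prop.subpocket>
AP AP AP SE

$$$$
"""

fields = read_sdf_data_items(EXAMPLE_RECORD)
# Assumption: per-atom values are whitespace-separated, one entry per atom.
subpockets = fields["atom.prop.subpocket"].split()
```

A cheminformatics toolkit would normally handle the parsing, including the 3D coordinates; the point here is only how the listed fields could map onto a record.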

@@ -38,14 +38,14 @@ Original ligands that are composed of the fragments in the full fragment library
Each ligand contains the following information:

- `kinase`, `family`, and `group`:
*Kinase* name, *family* and *group* of the kinase that the ligand was co-crystallized with
*Kinase* name, *family* and *group* of the kinase that the ligand was co-crystallized with.
- `complex_pdb`, `ligand_pdb`, `alt`, and `chain`:
*PDB complex* and *ligand ID*, *alternate model* and *chain* for the KLIFS structure that the ligand was co-crystallized with
*PDB complex* and *ligand ID*, *alternate model* and *chain* for the KLIFS structure that the ligand was co-crystallized with.
- `subpocket`:
Subpockets that the ligand occupies
Subpockets that the ligand occupies.
- `ac_helix`:
aC-helix conformation for the KLIFS structure that the ligand was co-crystallized with
aC-helix conformation for the KLIFS structure that the ligand was co-crystallized with.
- `smiles`:
Ligand's SMILES string
Ligand's SMILES string.

Please refer to `notebooks/2_1_fragment_analysis_original_ligands.ipynb` where this data is generated.
4 changes: 2 additions & 2 deletions data/fragment_library_filtered/README.md
@@ -2,7 +2,7 @@

The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 7486 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).

In order to prepare a library with fragments meaningful for recombination, we offer here a filtered fragment library (2009 fragments) based on the following filters:
In order to prepare a library with fragments tailored for recombination, we offer here a filtered fragment library (2009 fragments) based on the following filters:

1. Remove pool X
2. Deduplicate fragment library (per subpocket)
@@ -11,4 +11,4 @@ In order to prepare a library with fragments meaningful for recombination, we of
5. Keep "Rule of Three (Ro3)" compliant fragments (fragment-likeness)
6. Filter AP subpocket fragments (typical hinge-like)

Please refer to the notebook `notebooks/3_1_fragment_library_reduced.ipynb` to check how the data was generated and/or use it as a starting point for customized protocols.
Please refer to the notebook `notebooks/3_1_fragment_library_reduced.ipynb` to check how the data was generated and/or use it as a starting point for customized protocols.
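
Two of the filters listed above can be sketched in plain Python: step 2 (deduplication per subpocket) and step 5 (Rule of Three). This is a minimal sketch, not the notebook's implementation. The `Fragment` record, its field names, and the descriptor values are hypothetical, and the Ro3 thresholds are the commonly cited ones (molecular weight ≤ 300, logP ≤ 3, at most 3 hydrogen bond donors and 3 acceptors); the notebook may define them differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Hypothetical fragment record with precomputed descriptors."""
    subpocket: str
    smiles: str
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def deduplicate_per_subpocket(fragments):
    """Keep the first occurrence of each (subpocket, SMILES) pair."""
    seen = set()
    unique = []
    for fragment in fragments:
        key = (fragment.subpocket, fragment.smiles)
        if key not in seen:
            seen.add(key)
            unique.append(fragment)
    return unique


def is_ro3_compliant(fragment):
    """Commonly cited Rule of Three thresholds (assumed, not taken from the notebook)."""
    return (
        fragment.mol_weight <= 300
        and fragment.logp <= 3
        and fragment.h_donors <= 3
        and fragment.h_acceptors <= 3
    )


# Illustrative values only; they are not computed from the structures.
fragments = [
    Fragment("AP", "c1ccncc1", 79.1, 0.7, 0, 1),
    Fragment("AP", "c1ccncc1", 79.1, 0.7, 0, 1),  # duplicate within AP: dropped
    Fragment("SE", "c1ccncc1", 79.1, 0.7, 0, 1),  # same SMILES, other subpocket: kept
    Fragment("FP", "CCCCCCCCCCCCCCCCCCCCCCCC", 338.7, 12.0, 0, 0),  # fails Ro3
]
filtered = [f for f in deduplicate_per_subpocket(fragments) if is_ro3_compliant(f)]
```

Deduplicating on the pair of subpocket and SMILES keeps a fragment that occurs in several subpockets once per subpocket, which is what "per subpocket" suggests.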
6 changes: 3 additions & 3 deletions data/fragment_library_reduced/README.md
@@ -1,13 +1,13 @@
# Reduced fragment library

The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 7486 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).
The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises 7,486 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases (see `data/fragment_library/`).

In order to demonstrate how this library can be used for recombining ligands, we offer here a reduced fragment library (624 fragments) based on the following filters:

1. Remove all fragments that are not useful in a recombination. Check `data/fragment_library_filtered/`.
2. Select a diverse set of fragments (per subpocket) for recombination to (i) save computational cost and (ii) avoid recombination of highly similar fragments.

Step 1 is necessary to focus on fragments meaningful for the recombination, whereas step 2 mainly aims to reduce computational costs during recombination.
Step 1 is necessary to focus on fragments tailored for the recombination, whereas step 2 mainly aims to reduce computational costs during recombination.

## Reduction steps

@@ -23,4 +23,4 @@ Step 1 is necessary to focus on fragments meaningful for the recombination, wher
- `N_REPRESENTED_FRAGMENTS` = 10
- `INCLUDE_SINGLETONS` = True

Please refer to the notebook `notebooks/3_1_fragment_library_reduced.ipynb` to check how the data was generated and/or use it as a starting point for customized protocols.
Please refer to the notebook `notebooks/3_1_fragment_library_reduced.ipynb` to check how the data was generated and/or use it as a starting point for customized protocols.
4 changes: 2 additions & 2 deletions notebooks/3_1_fragment_library_reduced.ipynb
@@ -1060,7 +1060,7 @@
" f'The (full) fragment library resulting from the KinFragLib fragmentation procedure comprises of {fragment_library_concat.shape[0]} fragments, '\n",
" f'which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases '\n",
" f'(see `data/fragment_library/`).\\n\\n'\n",
" f'In order to prepare a library with fragments meaningful for recombination, '\n",
" f'In order to prepare a library with fragments tailored for recombination, '\n",
" f'we offer here a filtered fragment library ({fragment_library_concat_filtered.shape[0]} fragments) based on the following filters:\\n\\n'\n",
" f'1. Remove pool X\\n'\n",
" f'2. Deduplicate fragment library (per subpocket)\\n'\n",
@@ -16550,7 +16550,7 @@
" f'(i) save computational cost and '\n",
" f'(ii) avoid recombination of highly similar fragments.'\n",
" f'\\n\\n'\n",
" f'Step 1 is necessary to focus on fragments meaningful for the recombination, whereas step 2 mainly aims to reduce computational costs during recombination.'\n",
" f'Step 1 is necessary to focus on fragments tailored for the recombination, whereas step 2 mainly aims to reduce computational costs during recombination.'\n",
" f'\\n\\n'\n",
" f'## Reduction steps'\n",
" f'\\n\\n'\n",
8 changes: 4 additions & 4 deletions notebooks/README.md
@@ -1,6 +1,6 @@
# Notebooks

Overview on notebook content:
Overview on notebook content.

## 1. Quick start

@@ -50,12 +50,12 @@ The aim of this notebook is to extract information from the combinatorial librar

### `4_2_combinatorial_library_properties.ipynb`

In this notebook we want to analyze properties of the combinatorial library, such as the ligand size and Lipinski's rule of five criteria.
In this notebook, we want to analyze properties of the combinatorial library, such as the ligand size and Lipinski's rule of five criteria.
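
For orientation, Lipinski's rule of five is commonly stated as: molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen bond donors, and at most 10 hydrogen bond acceptors. The sketch below evaluates these criteria from precomputed descriptor values. The function name and the example values are hypothetical, and the notebook may compute and aggregate the criteria differently.

```python
def lipinski_criteria(mol_weight, logp, h_donors, h_acceptors):
    """Return which rule-of-five criteria are fulfilled (hypothetical helper)."""
    return {
        "mol_weight": mol_weight <= 500,
        "logp": logp <= 5,
        "h_donors": h_donors <= 5,
        "h_acceptors": h_acceptors <= 10,
    }


# Made-up descriptor values for one recombined ligand.
criteria = lipinski_criteria(mol_weight=446.9, logp=4.3, h_donors=1, h_acceptors=7)
n_fulfilled = sum(criteria.values())

# Assumption: a ligand counts as compliant if at most one criterion is violated.
is_compliant = n_fulfilled >= 3
```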

### `4_3_combinatorial_library_comparison_klifs.ipynb`

In this notebook we want to compare the combinatorial library to the original KLIFS ligands, i.e. the ligands from which the fragment library originates. We consider exact and substructure matches.
In this notebook, we want to compare the combinatorial library to the original KLIFS ligands, i.e. the ligands from which the fragment library originates. We consider exact and substructure matches.

### `4_4_combinatorial_library_comparison_chembl.ipynb`

In this notebook we want to compare the combinatorial library to the ChEMBL 25 dataset in order to find exact matches and the most similar ChEMBL ligand per recombined ligand.
In this notebook, we want to compare the combinatorial library to the ChEMBL 25 dataset in order to find exact matches and the most similar ChEMBL molecule per recombined ligand.
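
Finding the most similar ChEMBL molecule per recombined ligand can be sketched as a nearest-neighbor search under Tanimoto similarity. The example below represents fingerprints as Python sets of on-bit indices. The fingerprint representation, the identifiers, and the bit sets are made up for illustration and are not taken from the notebook.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # Convention chosen here for two empty fingerprints.
    return len(a & b) / len(a | b)


def most_similar(query: set[int], references: dict[str, set[int]]) -> tuple[str, float]:
    """Return the ID and similarity of the reference closest to the query."""
    best_id = max(references, key=lambda ref_id: tanimoto(query, references[ref_id]))
    return best_id, tanimoto(query, references[best_id])


# Hypothetical reference fingerprints standing in for ChEMBL molecules.
chembl_fps = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {2, 3, 9},
    "mol_c": {7, 8},
}
recombined_fp = {1, 2, 3, 5}

best_id, similarity = most_similar(recombined_fp, chembl_fps)
```

Note that a fingerprint similarity of 1.0 does not by itself prove that two molecules are identical, so exact matches are usually checked on a canonical identifier rather than on fingerprints.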
