algorithms_bioinformatics

read in the data
- visualize the distribution of the data to decide which IC50 threshold to use
- filter out non-binders (IC50 = 10,000 nM)
- decide which alleles to use and split the peptide sequences 80/20 into train/test sets
- cluster peptides with Hobohm and apply heuristic weighting (1/(r*s))
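
The clustering and weighting step can be sketched as follows. This is a minimal illustration, not the code in this repository: it assumes equal-length peptides, uses the fraction of identical positions as the similarity measure, and picks an identity threshold of 0.62, all of which are assumptions. The function names (`identity`, `hobohm1`, `heuristic_weights`) and the example peptides are hypothetical.

```python
from collections import Counter


def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length peptides match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def hobohm1(peptides, threshold=0.62):
    """Greedy Hobohm 1 selection.

    Walk through the peptides in order and keep one only if its identity
    to every representative accepted so far is below `threshold`
    (the threshold value is an assumption for this sketch).
    """
    representatives = []
    for pep in peptides:
        if all(identity(pep, rep) < threshold for rep in representatives):
            representatives.append(pep)
    return representatives


def heuristic_weights(peptides):
    """Heuristic 1/(r*s) sequence weights.

    For each column, r is the number of distinct residues in that column
    and s is how often the peptide's own residue occurs there. A peptide's
    weight is the sum of 1/(r*s) over all its positions.
    """
    length = len(peptides[0])
    weights = [0.0] * len(peptides)
    for pos in range(length):
        column = Counter(pep[pos] for pep in peptides)
        r = len(column)
        for i, pep in enumerate(peptides):
            weights[i] += 1.0 / (r * column[pep[pos]])
    return weights


# Hypothetical example peptides (two near-duplicate pairs).
peptides = ["ALAKAAAAM", "ALAKAAAAN", "GILGFVFTL", "GILGFVFTM"]
reps = hobohm1(peptides)
weights = heuristic_weights(peptides)
```

With the heuristic scheme, each column contributes a total weight of 1 across all peptides, so the weights always sum to the peptide length.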
construct the PSSMs for X MHC class I alleles (X = 20)
visualize the PSSMs as sequence logos
predict on the test data with the PSSMs
compute Pearson correlations for the predictions
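
The PSSM construction, prediction, and correlation steps can be sketched as follows. This is a simplified illustration and not the repository's implementation: it assumes a flat pseudocount and a uniform background frequency of 1/20, which are simplifications chosen for brevity. The function names (`build_pssm`, `score_peptide`), the peptides, and the measured values are made up for the example.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def build_pssm(peptides, weights=None, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length peptides.

    Assumptions of this sketch: a flat pseudocount per cell and a
    uniform background of 1/20 for every amino acid.
    """
    length = len(peptides[0])
    if weights is None:
        weights = [1.0] * len(peptides)
    # Start every cell at the pseudocount, then add weighted observations.
    counts = np.full((length, len(AMINO_ACIDS)), pseudocount)
    for pep, w in zip(peptides, weights):
        for pos, aa in enumerate(pep):
            counts[pos, AA_INDEX[aa]] += w
    freqs = counts / counts.sum(axis=1, keepdims=True)
    background = 1.0 / len(AMINO_ACIDS)
    return np.log2(freqs / background)


def score_peptide(pssm, peptide):
    """Sum the per-position log-odds scores of a peptide."""
    return float(sum(pssm[pos, AA_INDEX[aa]] for pos, aa in enumerate(peptide)))


# Toy training peptides (illustrative only).
train = ["ALAKAAAAM", "ALAKAAAAN", "ALAKAAAAV", "GMNERPILT"]
pssm = build_pssm(train)

# Toy test peptides with made-up target values, for illustration only.
test_peptides = ["ALAKAAAAM", "GILGFVFTL", "ALAKAVAAL"]
measured = [0.9, 0.1, 0.7]

predicted = [score_peptide(pssm, p) for p in test_peptides]
pcc = np.corrcoef(predicted, measured)[0, 1]  # Pearson correlation coefficient
```

The optional `weights` argument is where sequence weights from a clustering or heuristic scheme could be passed in.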

To run the full pipeline, download the entire repository and run the __main__.py file. Be aware that every file contains a hard-coded data directory, which must be changed to the directory where you downloaded the repository. PS: This still needs to be fixed so that you don't have to touch the code at all and can pass in the different settings as variables.
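
One way the hard-coded data directory could be avoided is a small command-line interface. The sketch below is only a suggestion and is not part of the repository: the `--data-dir` and `--test-fraction` options and the `parse_args` helper are hypothetical names.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse pipeline settings from the command line (hypothetical options)."""
    parser = argparse.ArgumentParser(description="Run the PSSM pipeline.")
    parser.add_argument(
        "--data-dir",
        type=Path,
        default=Path("data"),
        help="directory containing the input data (default: ./data)",
    )
    parser.add_argument(
        "--test-fraction",
        type=float,
        default=0.2,
        help="fraction of peptides held out for testing (default: 0.2)",
    )
    return parser.parse_args(argv)


# Example: parse an explicit argument list instead of sys.argv.
args = parse_args(["--data-dir", "/tmp/algorithms_bioinformatics/data"])
data_dir = args.data_dir
```

Each module could then receive `data_dir` as a function argument instead of defining its own path.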

laurasansc/algorithms_bioinformatics

Folders and files

Latest commit

History

Repository files navigation

algorithms_bioinformatics

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages