laurasansc/algorithms_bioinformatics

algorithms_bioinformatics

  • read-in data

    • visualize the distribution of the data to decide which IC50 threshold to use
    • filter out non-binders (IC50 = 10,000 nM)
    • decide which alleles to use and split the peptide sequences 80/20 into train/test sets
    • cluster peptides with Hobohm and heuristics (1/r*s)
  • construct a PSSM for each of the X MHC class I alleles (X = 20)

  • visualize the PSSMs as logos

  • predict binding for the test data with the PSSMs

  • evaluate the predictions with Pearson correlations
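
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's actual code: the function names (`heuristic_weights`, `build_pssm`, `score_peptide`, `pearson`) are hypothetical, the PSSM uses a uniform background with a flat pseudocount (an assumed simplification), and the 1/(r*s) heuristic is assumed to mean position-based sequence weighting, where r is the number of distinct residues in a column and s is the number of sequences sharing the residue at that position.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def heuristic_weights(peptides):
    """Position-based 1/(r*s) weights (assumed meaning of the heuristic).

    r = number of distinct residues in a column,
    s = number of peptides sharing this peptide's residue in that column.
    """
    weights = [0.0] * len(peptides)
    for pos in range(len(peptides[0])):
        counts = Counter(p[pos] for p in peptides)
        r = len(counts)
        for i, pep in enumerate(peptides):
            weights[i] += 1.0 / (r * counts[pep[pos]])
    return weights


def build_pssm(peptides, weights=None, pseudocount=1.0):
    """Log-odds matrix: one dict of residue scores per peptide position.

    Assumes equal-length peptides made of the 20 standard residues,
    a uniform background, and a flat pseudocount (simplifications).
    """
    if weights is None:
        weights = [1.0] * len(peptides)
    background = 1.0 / len(AMINO_ACIDS)
    total = sum(weights) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(len(peptides[0])):
        counts = dict.fromkeys(AMINO_ACIDS, pseudocount)
        for pep, w in zip(peptides, weights):
            counts[pep[pos]] += w
        pssm.append(
            {aa: math.log2(counts[aa] / total / background) for aa in AMINO_ACIDS}
        )
    return pssm


def score_peptide(pssm, peptide):
    """Sum the per-position log-odds scores of a peptide."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

With training peptides for one allele in `train` and `(peptide, measured_value)` pairs in `test`, a run would look like `pssm = build_pssm(train, heuristic_weights(train))`, followed by `pearson([score_peptide(pssm, p) for p, _ in test], [v for _, v in test])`.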

The whole pipeline can be run by downloading the entire repository and running the __main__.py file. Be aware that every file specifies a data directory, which must be changed to the directory where you downloaded the repository. PS: This still needs to be fixed so that you don't have to touch the code and can call it with several different variables.
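
One possible way to avoid editing the hard-coded data directory is to read it from the command line. The sketch below is only a suggestion: `resolve_data_dir`, the `--data-dir` flag and the default `data` folder name are hypothetical and do not exist in the repository.

```python
import argparse
from pathlib import Path


def resolve_data_dir(argv=None):
    """Return the data directory given on the command line.

    Falls back to a "data" folder in the current working directory
    (hypothetical default; adjust to the repository layout).
    """
    parser = argparse.ArgumentParser(description="Run the PSSM pipeline")
    parser.add_argument(
        "--data-dir",
        type=Path,
        default=Path.cwd() / "data",
        help="directory containing the input data files",
    )
    args = parser.parse_args(argv)
    return args.data_dir.expanduser()
```

Each script could then call `resolve_data_dir()` once at start-up and pass the resulting path on, instead of defining its own directory.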

About

22125 - Algorithms in Bioinformatics Project
