nigelhartm/structural_characterization_uorf

Structural characterization of uORFs using Deep Learning prediction methods

This project was developed in the third semester of my master's degree in bioinformatics. Our goal was to present to the scientific community, for the first time, a characterization of the folding preferences of so-called upstream open reading frames (uORFs).

To this end, we developed our own deep learning approach, trained on proteins from the PDB, and compared its results to the current front-runner, AlphaFold. Additionally, we computed basic statistics on uORFs and used IUPred2A to predict their intrinsic disorder.
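To illustrate what "basic statistics" on uORF peptides could look like, here is a minimal sketch that computes the mean peptide length and the amino acid composition of a set of sequences. The function name `peptide_stats` and the example peptides are hypothetical and are not taken from the repository's scripts.

```python
from collections import Counter


def peptide_stats(peptides):
    """Return (mean length, amino acid composition) for a list of peptides.

    Hypothetical helper for illustration; the composition maps each
    residue letter to its fraction of all residues in the input.
    """
    if not peptides:
        raise ValueError("need at least one peptide")
    lengths = [len(p) for p in peptides]
    counts = Counter("".join(peptides))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("peptides contain no residues")
    composition = {aa: n / total for aa, n in counts.items()}
    mean_length = sum(lengths) / len(lengths)
    return mean_length, composition


if __name__ == "__main__":
    # Made-up example peptides, not real uORF translations.
    mean_len, comp = peptide_stats(["MA", "MKKA"])
    print(mean_len, comp)
```

The same function can be run on a reference set of full-length proteins to compare the two compositions side by side.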

(For more information please don't hesitate to contact me.)

This repository currently lacks good documentation and commented code because most scripts were used only once. It will be updated when time allows.

Results

  • We observed properties that are unique to uORFs
  • The results suggest that uORFs may tend to avoid defined structures
  • Caveat: the machine learning approaches were trained on proteins, not peptides
  • Further research is needed (different species, etc.)
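One simple way to quantify whether a peptide "avoids defined structures" is to look at how much of a predicted secondary structure string falls into the coil class. The sketch below assumes a three-state alphabet (H = helix, E = strand, C = coil), which is a common convention but is not confirmed to be the encoding used in this project. Both function names are hypothetical.

```python
def state_fractions(ss, states="HEC"):
    """Fraction of residues assigned to each secondary structure state.

    Assumes a three-state string such as "CCHHHHCC" (illustrative only).
    """
    if not ss:
        raise ValueError("empty secondary structure string")
    return {s: ss.count(s) / len(ss) for s in states}


def agreement(ss_a, ss_b):
    """Per-residue agreement between two predictions of equal length."""
    if len(ss_a) != len(ss_b) or not ss_a:
        raise ValueError("predictions must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ss_a, ss_b))
    return matches / len(ss_a)


if __name__ == "__main__":
    # Made-up predictions for a four-residue peptide.
    print(state_fractions("HHCC"))
    print(agreement("HHCC", "HECC"))
```

A high coil fraction across many uORF peptides would be consistent with the observation above, and `agreement` gives a rough per-residue measure for comparing two predictors on the same sequence.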

Presentation

The presentation is available in presentation.pdf and covers:

  1. Locate uORFs
  2. Statistics on uORFs
  3. Gene Ontology
  4. Secondary structure prediction - Simple CNN
  5. Secondary structure prediction - AlphaFold
  6. Simple CNN vs AlphaFold
  7. Prediction of intrinsic disorder
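The first step, locating uORFs, can be sketched as a scan of a 5' leader sequence for start codons followed by an in-frame stop codon. This is only an illustration under stated assumptions: it considers canonical ATG starts only, requires the stop codon to lie within the given sequence, and uses a hypothetical function name `find_uorfs` and parameter `min_codons`. The repository's actual procedure may differ.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr5, min_codons=2):
    """Return (start, end, sequence) for each ATG...stop ORF in a 5' leader.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    min_codons counts coding codons (including ATG, excluding the stop).
    Assumption: only ORFs that terminate inside the leader are reported.
    """
    seq = utr5.upper().replace("U", "T")  # accept RNA or DNA input
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3, seq[start:pos + 3]))
                break
    return uorfs


if __name__ == "__main__":
    # Made-up leader sequence for demonstration.
    print(find_uorfs("CCATGAAATAGCC"))
```

Nested ORFs that share a stop codon are each reported separately here. Whether to keep only the longest one per stop codon is a design choice this sketch leaves open.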

Dataset

The dataset resulting from this project is available at https://www.kaggle.com/datasets/nigelhartm/arabidopsis-thaliana-uorf
