Protein-Knowledge-Graph-Application

In this research, we show that efficiently incorporating the Protein Knowledge Graph can detect scientifically important relations among full protein sequences.

Data

We use the ProteinKG25 knowledge graph data (https://www.zjukg.org/project/ProteinKG25/) and curated phosphosite data (https://huggingface.co/datasets/waylandy/phosformer_curated/blob/main/curated/phosphosites_11mer_kinase_specific.tsv). We also use the following databases:

Reactome Database: https://drive.google.com/file/d/1k1NuOmRCrFNp0S4EyRmfXVjwJTZI8pzN/view?usp=sharing

Phosphorylation Database: https://drive.google.com/file/d/1hjUg_jFUhuhrBN3hw4H5y4mJYgv08ew6/view?usp=sharing
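The curated phosphosite link above points at the Hugging Face file viewer (`/blob/`), which returns an HTML page rather than the TSV itself. A minimal sketch of fetching the file programmatically is shown below. The helper names `to_raw_url` and `load_phosphosites` are hypothetical and not part of this repository, and the sketch assumes pandas is installed and makes no assumption about the column names in the file.

```python
from urllib.parse import urlsplit, urlunsplit

# URL exactly as listed in this README (file-viewer form).
PHOSPHOSITES_URL = (
    "https://huggingface.co/datasets/waylandy/phosformer_curated/"
    "blob/main/curated/phosphosites_11mer_kinase_specific.tsv"
)


def to_raw_url(page_url: str) -> str:
    """Turn a Hugging Face file-viewer URL (/blob/) into a direct-download URL (/resolve/)."""
    parts = urlsplit(page_url)
    # Replace only the first "/blob/" path segment; leave the rest untouched.
    raw_path = parts.path.replace("/blob/", "/resolve/", 1)
    return urlunsplit(parts._replace(path=raw_path))


def load_phosphosites(page_url: str = PHOSPHOSITES_URL):
    """Download the TSV into a DataFrame (needs pandas and network access)."""
    import pandas as pd  # imported lazily so to_raw_url works without pandas

    return pd.read_csv(to_raw_url(page_url), sep="\t")
```

Calling `load_phosphosites()` in the notebook would return the table as a DataFrame. The Google Drive links are not handled by this sketch.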

The ProteinKG25 knowledge graph can be accessed from the following Google Drive link:

https://drive.google.com/drive/folders/1UzEb-mWO6gG2GyQqmQG_oF8vWqmKkDfG?usp=sharing

Data from our Experimental Outcomes (this data is used in Global Ranking)

Protein sequence pairs (that appear together only after incorporating the knowledge graph) using the TransE embedding model: https://drive.google.com/file/d/1x3e6q4AQB6QepVkta751RnjLHd764Uyt/view?usp=sharing

Protein sequence pairs (that appear together only after incorporating the knowledge graph) using the RotatE embedding model: https://drive.google.com/file/d/1TUcfOlAqmeWDzZi6_z_2GiHFxaynFZuW/view?usp=sharing

Protein sequence pairs (that appear together only after incorporating the knowledge graph) using the HolE embedding model: https://drive.google.com/file/d/1GWUpJXVZAT1HUX9siEyOm9PCk6An8mBR/view?usp=sharing

Protein sequence pairs (that appear together only after incorporating the knowledge graph) using the DistMult embedding model: https://drive.google.com/file/d/19GCax77Ue8RAl8OjUO1F2qxivO4RSz5i/view?usp=sharing

Protein sequence pairs (that appear together only after incorporating the knowledge graph) using the ComplEx embedding model: https://drive.google.com/file/d/18N6Qbt7P_98KisOT_boJndNKsQjdwOYn/view?usp=sharing

Code

The code that generates the outcomes incorporating the knowledge graph is in the ProteinKg25_ESM-2_Application notebook. To obtain results for a different embedding model, edit the following assignment in the cell named "Ampligraph Library to extract the knowledge graph embedding":

Embedding__MODEL=Model_NamE

Replace Model_NamE with the name of the target embedding model. We used five embedding models: ['HolE', 'RotatE', 'ComplEx', 'DistMult', 'TransE']
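As a minimal sketch of the assignment described above, the snippet below checks the chosen name against the five models listed in this README before assigning it. The helper `resolve_model_name` and the constant `SUPPORTED_MODELS` are hypothetical and do not appear in the notebook. How the notebook passes `Embedding__MODEL` on to AmpliGraph is not shown here.

```python
# The five embedding models named in this README.
SUPPORTED_MODELS = ["HolE", "RotatE", "ComplEx", "DistMult", "TransE"]


def resolve_model_name(name: str) -> str:
    """Return the canonical spelling of a supported model name.

    Matching ignores case and surrounding whitespace, so "transe"
    resolves to "TransE". Unknown names raise ValueError.
    """
    lookup = {model.lower(): model for model in SUPPORTED_MODELS}
    key = name.strip().lower()
    if key not in lookup:
        raise ValueError(
            f"Unknown embedding model {name!r}; expected one of {SUPPORTED_MODELS}"
        )
    return lookup[key]


# Variable name as used in the notebook cell; the value is one example choice.
Embedding__MODEL = resolve_model_name("TransE")
```

Validating the name up front is a design convenience: a misspelled model name fails immediately instead of partway through a long embedding run.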

The Global Ranking code can be accessed through the following Google Colab link:

https://colab.research.google.com/drive/1wo-Zdb8rgeD7RJvY4XKo7xpjdDjeXFHr?usp=sharing
