SpokeSigv2

Knowledge graph-based integration of individual electronic health records and common genetic variants to predict disease risk

Introduction

This project explores the use of Propagated SPOKE Entry Vectors (PSEVs) and the combination of genetic and clinical data to enhance disease prediction models. By drawing on both genetic and clinical information, we aim to improve the accuracy and robustness of predictive models in healthcare.

Propagated SPOKE Entry Vectors

Propagated SPOKE Entry Vectors (PSEVs) are a novel approach to representing genetic and clinical information in a format suitable for machine learning models. PSEVs are derived from the SPOKE biological knowledge graph.

Key features of PSEVs:

Capture complex relationships between genes and biological entities
Provide a dense, fixed-length vector representation of genetic information
Enable integration of diverse biological knowledge into predictive models

Combining Genetic and Clinical Data

Our approach combines PSEVs (representing genetic data) with traditional clinical data to create more comprehensive and powerful predictive models. The combination process involves:

Normalizing both the genetic and the clinical PSEVs
Applying different weights to genetic and clinical components
Exploring the optimal balance between genetic and clinical information
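
The steps above can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the choice of L2 normalization, the convex weighting (the clinical weight is one minus the genetic weight), and the toy vectors are all assumptions made for illustration. It also assumes both vectors have the same length.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]

def combine_psevs(genetic, clinical, genetic_weight):
    """Weighted sum of two normalized vectors of equal length.

    genetic_weight must lie in [0, 1]; the clinical part gets 1 - genetic_weight.
    (Hypothetical helper, not part of the repository.)
    """
    if len(genetic) != len(clinical):
        raise ValueError("vectors must have the same length")
    if not 0.0 <= genetic_weight <= 1.0:
        raise ValueError("genetic_weight must be between 0 and 1")
    g = l2_normalize(genetic)
    c = l2_normalize(clinical)
    w = genetic_weight
    return [w * gi + (1.0 - w) * ci for gi, ci in zip(g, c)]

# Toy vectors for illustration only; real vectors would be much longer.
genetic_psev = [3.0, 0.0, 4.0]
clinical_psev = [0.0, 2.0, 0.0]

# Sweep a few weights to explore the genetic/clinical balance.
combined_by_weight = {
    w: combine_psevs(genetic_psev, clinical_psev, w)
    for w in (0.0, 0.25, 0.5, 0.75, 1.0)
}
```

Normalizing before weighting keeps either component from dominating simply because its raw values are larger, so the weight alone controls the balance.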

Benefits of this combined approach:

Leverages complementary information from both data types
Potentially improves prediction accuracy for complex diseases
Allows for personalized risk assessment based on both genetic predisposition and clinical factors

Results and Interpretation

After running the scripts, you'll find the following results:

ROC curves for different genetic-clinical weight combinations
Performance metrics (AUC, F1 score, balanced accuracy) for each weight
Predictions and probabilities for the validation set
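
For readers unfamiliar with the listed metrics, here is a small dependency-free sketch of how they can be computed from validation labels and predicted probabilities. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions for illustration; the project's scripts may compute these differently (for example with a library such as scikit-learn).

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_and_balanced_accuracy(y_true, scores, threshold=0.5):
    """Threshold the scores, then return (F1 score, balanced accuracy)."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return f1, (sensitivity + specificity) / 2

# Toy validation labels and predicted probabilities (made up for illustration).
y_val = [1, 0, 1, 1, 0, 0]
p_val = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]

auc = roc_auc(y_val, p_val)
f1, balanced_acc = f1_and_balanced_accuracy(y_val, p_val)
```

AUC is threshold-free, while F1 and balanced accuracy depend on the chosen cutoff, so the three can rank weight settings differently.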

Interpret these results to:

Determine the optimal balance between genetic and clinical data
Assess the model's performance across different disease outcomes
Identify potential biomarkers or risk factors
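
One simple way to compare weight settings is to pick the weight with the best value of a chosen metric. The sketch below assumes the per-weight metrics have been collected into a dictionary; the helper name, the dictionary layout, and the numbers are placeholders for illustration, not results from this project.

```python
def best_weight(metrics_by_weight, metric="auc"):
    """Return the weight with the highest value of `metric`; ties go to the smallest weight."""
    return min(metrics_by_weight, key=lambda w: (-metrics_by_weight[w][metric], w))

# Placeholder numbers for illustration only, not results from this project.
example_metrics = {
    0.0: {"auc": 0.70, "f1": 0.55, "balanced_accuracy": 0.62},
    0.5: {"auc": 0.78, "f1": 0.61, "balanced_accuracy": 0.69},
    1.0: {"auc": 0.74, "f1": 0.63, "balanced_accuracy": 0.66},
}

chosen_by_auc = best_weight(example_metrics)
chosen_by_f1 = best_weight(example_metrics, metric="f1")
```

Different metrics can favor different weights, as in this toy table. Note also that a weight selected on the validation set will look somewhat optimistic when scored on that same set, so a separate held-out set gives a fairer final estimate.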

For more information about SPOKE and PSEVs, please visit SPOKE's official website.
