This project involves analysis of bulk RNA-seq data and construction and simulation of genome-scale metabolic models in MATLAB to study AKR1A1 deficiency.
The pipeline requires MATLAB R2019b along with the required toolboxes. Dependencies include rFASTCORMICS and R functions for gene-length normalization.
Normalized RNA-seq counts are processed to assess the data distribution using boxplots, histograms, and kernel density estimates (`ksdensity`). PCA is performed for dimensionality reduction, followed by data discretization and metabolic modeling.
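As a rough illustration of the PCA step, the sketch below centers a small log-transformed expression matrix and derives sample scores from a singular value decomposition. The matrix `counts` is made-up toy data, the pseudocount is arbitrary, and using `svd` instead of the Statistics Toolbox `pca` function is an assumption made to keep the example toolbox-free; the actual pipeline may differ.

```matlab
% Toy expression matrix: rows = genes, columns = samples (hypothetical values).
counts = [10 12 200 210;
          50 55  40  45;
           5  4  90 100;
          30 28  31  29];

% Log-transform to reduce skew; the pseudocount of 1 is an assumption.
logExpr = log2(counts + 1);

% PCA treats samples as observations, so transpose to samples x genes
% and center each gene (column) at zero mean.
X  = logExpr';
Xc = X - repmat(mean(X, 1), size(X, 1), 1);

% Economy-size SVD: U*S holds the sample scores on the principal components.
[U, S, ~] = svd(Xc, 'econ');
scores = U * S;

% Fraction of the total variance explained by each component.
singVals  = diag(S);
explained = singVals.^2 / sum(singVals.^2);

fprintf('PC1 explains %.1f%% of the variance\n', 100 * explained(1));
```

Plotting `scores(:, 1)` against `scores(:, 2)` gives a scatter plot of the samples in the space of the first two components.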
Execute the `masterdriver.m` script to run the pre-processing and analysis pipeline, which includes data loading, preprocessing, discretization, model building, and initial model analysis. For detailed metabolic flux analysis after the HPC sampling tests, execute `masterdriverAnalysisSampling.m`.
- `driverData.m`: Loads RNA-seq data and performs initial preprocessing and descriptive statistics.
  - Boxplots: Visual representations of the distribution across different samples (`h_boxplot_all.png`, `r_boxplot_all.png`).
  - Histograms: Distributions of gene expression levels (`h_gene_distribution_all.png`, `r_gene_distribution_all.png`).
  - Cumulative Distribution Functions (CDFs): Density plots for gene expression (`h_cdf_all.png`, `r_cdf_all.png`).
  - PCA Plots: Scatter plots from the PCA showing data clustering (`h_pca_score_1.png`, `r_pca_score_1.png`).
- `driverModel_withoutO2S.m`: Sets up medium composition and constructs genome-scale metabolic models.
- `setMediumConstraints_Chiara.m`: Applies medium constraints to models based on experimental conditions.
  - Histograms illustrating the distribution of non-zero exchange reaction fluxes for all conditions (sc1, sc2, sc12) in both 769-P and Huh7 type models.
- `KO_GLO1_treatment.m`: Simulates genetic alterations or treatment effects such as GLO1 gene knockout.
- `removeunusedgen.m`: Removes unused genes to optimize model efficiency.
- `analysis.m`: Conducts preliminary analysis on refined models.
  - Jaccard Similarity Heatmaps: Visual representations of model similarities (`ModelsimilaritybasedonJaccarddistance_H.png`, `ModelsimilaritybasedonJaccarddistance_7.png`).
  - Pathway Activity Clustergrams: Show the activity of different metabolic pathways across models (`Pathwayactivityforallmodels_H.png`, `Pathwayactivityforallmodels_7.png`).
  - Flux Variability Analysis (FVA) Heatmaps: Model similarity based on Flux Variability Analysis (`FVA_similarity_heatmap_7_.png`, `FVA_similarity_heatmap_H_.png`).
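The Jaccard comparison behind the similarity heatmaps can be sketched as follows. Each model is represented by a binary vector marking which reactions of a shared reference it contains, and the similarity of two models is the size of the intersection of their reaction sets divided by the size of their union. The matrix `presence` is invented toy data, and whether `analysis.m` compares reactions, genes, or metabolites is not stated here, so the reaction-based comparison is an assumption.

```matlab
% Rows = reactions of a reference model, columns = context-specific models.
% true means the reaction is present in that model (hypothetical toy data).
presence = logical([1 1 0;
                    1 1 1;
                    0 1 1;
                    1 0 0;
                    1 1 0]);

nModels    = size(presence, 2);
jaccardSim = zeros(nModels);

for i = 1:nModels
    for j = 1:nModels
        shared = sum(presence(:, i) & presence(:, j));   % size of the intersection
        total  = sum(presence(:, i) | presence(:, j));   % size of the union
        jaccardSim(i, j) = shared / total;
    end
end

% Jaccard distance is the complement of the similarity.
jaccardDist = 1 - jaccardSim;

disp(jaccardSim);
```

A matrix like `jaccardSim` can be displayed with `imagesc` to obtain a heatmap of pairwise model similarity. The toy data contains no empty model, so the division by `total` is safe here; real data may need a guard for that case.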
After completing the initial modeling and analysis on your local machine, the models may be subject to more intensive computational tasks, such as sampling tests, which are typically run on a High-Performance Computing (HPC) system. These HPC-related steps will follow a different procedure. Upon completion of the HPC sampling tests, the results are collected and used for further analysis, which may involve additional scripts.
Sampling analysis is conducted by `masterdriverAnalysisSampling.m`, which includes detailed metabolic flux analysis.
- `performAnalysisSampling`: Performs metabolic flux sampling analysis, comparing control and treated models to highlight significant metabolic differences.
- `performAnalysisFluxSum`: Performs metabolic flux-sum sampling analysis (flux-sum = metabolite turnover rate).
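To make the flux-sum idea concrete, the sketch below uses a commonly cited definition: the flux-sum of a metabolite is half the sum of the absolute fluxes of all reactions that produce or consume it, which at steady state equals its turnover rate. The network, the matrix `S`, and the `samples` values are hypothetical, and it is an assumption that `performAnalysisFluxSum` follows this exact definition.

```matlab
% Toy stoichiometric matrix: rows = metabolites A and B, columns = reactions.
% R1: -> A, R2: A -> B, R3: B ->   (hypothetical linear pathway)
S = [ 1 -1  0;
      0  1 -1];

% Two sampled steady-state flux vectors, one per column (S * samples = 0).
samples = [2 3;
           2 3;
           2 3];

% Flux-sum per metabolite and sample: 0.5 * sum_j |S_ij * v_j|.
% Because |S_ij * v_j| = |S_ij| * |v_j|, this is a single matrix product.
fluxSum = 0.5 * abs(S) * abs(samples);

disp(fluxSum);
```

Each row of `fluxSum` then holds the turnover distribution of one metabolite across samples, which can be compared between control and treated models.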
Metabolic pathway visualization was performed with the R script `AKR1A1_exploration5.R`, which visualizes metabolic pathway alterations in AKR1A1 deficiency using heatmaps. The script generates heatmaps for selected subsystems under various conditions, highlighting significant metabolic shifts.
The script generates a series of PDF files, each representing a heatmap of Signal-to-Noise ratio (SNR) changes across different conditions for selected subsystems:
- File Naming Convention: `Sampling_heatmap_[subsystem]_dir_[direction].pdf`
- Example files:
  - `Sampling_heatmap_Glycolysis_gluconeogenesis_dir_0.pdf`
  - `Sampling_heatmap_Pentose_phosphate_pathway_dir_1.pdf`
  - `Sampling_heatmap_Pyruvate_metabolism_dir_-1.pdf`
Each PDF file corresponds to a specific direction of change:
- `dir_0`: No significant change
- `dir_1`: Positive change
- `dir_-1`: Negative change
These heatmaps help identify key metabolic changes and are grouped by the directionality of reaction flux differences: increased, decreased, or unchanged.
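The direction labels can be illustrated with a small sketch that computes a signal-to-noise ratio for one reaction and maps it to `1`, `-1`, or `0`. The SNR formula (difference of means divided by the sum of standard deviations), the cutoff, and the sampled values are all assumptions for illustration; the project's scripts may define SNR and significance differently.

```matlab
% Sampled fluxes for one reaction (hypothetical values): control vs. treated.
control = [1.0 1.2 0.9 1.1 1.0];
treated = [2.0 2.3 1.9 2.1 2.2];

% Signal-to-noise ratio: difference of means scaled by the summed standard deviations.
snr = (mean(treated) - mean(control)) / (std(treated) + std(control));

% Classify the direction of change; the cutoff of 1 is an arbitrary placeholder.
cutoff = 1;
if snr > cutoff
    direction = 1;      % positive change
elseif snr < -cutoff
    direction = -1;     % negative change
else
    direction = 0;      % no significant change
end

% Build a file name following the Sampling_heatmap naming convention.
subsystem = 'Pyruvate_metabolism';
fileName  = sprintf('Sampling_heatmap_%s_dir_%d.pdf', subsystem, direction);
disp(fileName);
```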
- File Naming Convention for flux-sum heatmaps: `heatmap_fluxsum_[subsystem]_5_[condition].pdf`
- Example: `heatmap_fluxsum_ppp_5_0.pdf`
Evelyn Gonzalez, Chiara Pecorari, Maria Pires Pacheco, Thomas Sauter
04/24