Skip to content

Commit

Permalink
fix: synchronize STEP numbers protocol and paper
Browse files Browse the repository at this point in the history
  • Loading branch information
edkerk committed Dec 1, 2023
1 parent 8d90e87 commit 9546b87
Show file tree
Hide file tree
Showing 3 changed files with 88 additions and 97 deletions.
128 changes: 59 additions & 69 deletions tutorials/full_ecModel/protocol.m
Original file line number Diff line number Diff line change
Expand Up @@ -7,16 +7,17 @@
% dependent on how you intend to use the ecModel it may require additional
% curation and evaluation.
%
% DO NOT USE THE ECMODEL GENERATED HERE OUTSIDE OF THIS TUTORIAL.
% DO NOT DIRECTLY USE THE ECMODEL GENERATED HERE OUTSIDE OF THIS TUTORIAL.
%
% In comparison to the published GECKO3 Nature Protocols paper, this script
% might have more up-to-date descriptions about the capabilities and
% functions, which were introduced after the Nature Protocols paper was
% published. This script will have additional commands and analyses that
% are not included in the paper as such. This script should always work with
% the most recent GECKO3 release.
% the most recent GECKO3 release. The STAGE and STEP numbers match those in the
% Nature Protocols paper.

%% Preparation stage for ecModel reconstruction
%% Installation
% - Install RAVEN & GECKO
% - Install RAVEN by following the installation instructions:
% https://github.com/SysBioChalmers/RAVEN/wiki/Installation
Expand All @@ -25,29 +26,33 @@
% - The installation of Gurobi as LP solver is highly recommended
checkInstallation; % Confirm that RAVEN is functional, should be 2.8.3 or later.

% - GECKO can be installed via cloning or direct download of ZIP file.
% See installation instructions in the README.md:
% https://github.com/SysBioChalmers/GECKO/tree/main#installation
% - Alternatively, GECKO can be installed as MATLAB Add-On:
% https://se.mathworks.com/help/matlab/matlab_env/get-add-ons.html
% - Install GECKO by following the installation instructions:
% https://github.com/SysBioChalmers/GECKO/wiki/Installation-and-upgrade
% - As RAVEN, GECKO can also be installed as MATLAB Add-On (link above)
% - Add the appropriate GECKO (sub)folders to MATLAB path:
GECKOInstaller.install

%% STAGE 0: Preparation stage for ecModel reconstruction
% STEP 1 Preparation of project file structure and data
% - Initiate a basic structure of files and folders for your intended
% project. This includes a copy of the template adapter.
% The next line is commented out as the project structure is already
% available in GECKO/tutorials/full_ecModel.
%startGECKOproject()

% STEP 2 Store the starting GEM
% - Find a high-quality GEM of your species of interest. ecYeastGEM is
% based on yeast-GEM https://github.com/SysBioChalmers/yeast-GEM/releases
% - Release v8.6.2 of yeast-GEM is also distributed with GECKO at
% tutorials/full_ecModel/models/yeast-GEM.yml
% - Modify the model adapter (e.g. tutorials/full_ecModel/ecYeastGEMadapter.m)
% to contain organism- and model-specific parameters.

% STEP 3-7 Modify the model adapter
% In the Nature Protocols paper is explained how to decide on the organism- and
% model-specific parameters in the model adapter, which for this tutorial is
% located at tutorials/full_ecModel/ecYeastGEMadapter.m.

%% STAGE 1: expansion from a starting metabolic model to an ecModel structure
% STEP 1 Set modelAdapter
% STEP 8 Set modelAdapter
adapterLocation = fullfile(findGECKOroot,'tutorials','full_ecModel','YeastGEMAdapter.m');
ModelAdapter = ModelAdapterManager.setDefault(adapterLocation);

Expand All @@ -66,7 +71,7 @@
% change in the adapter file and run again the command
% ModelAdapterManager.setDefault()

% STEP 2 Load conventional yeast-GEM
% STEP 9 Load conventional yeast-GEM
% If the location to the conventional GEM was already set in the modelAdapter,
% as obj.param.convGEM, then loadConventionalGEM can directly be used. If
% the model is stored somewhere else (and not specified in obj.param.convGEM),
Expand All @@ -75,7 +80,7 @@
model = loadConventionalGEM();
% model = importModel(fullfile(geckoRoot,'tutorials','full_ecModel','models','yeast-GEM.yml'));

% STEP 3 Prepare ecModel
% STEP 10-11 Prepare ecModel
% We will make a full GECKO ecModel. For an example of reconstructing a
% light GECKO ecModel, see tutorials/light_ecModel.
[ecModel, noUniprot] = makeEcModel(model,false);
Expand All @@ -86,7 +91,7 @@
% function and what is the order of inputs and outputs.
doc makeEcModel

% STEP 4 Annotate with complex data
% STEP 12-13 Annotate with complex data
% As Saccharomyces cerevisiae is available on Complex Portal
% (https://www.ebi.ac.uk/complexportal/), its data is included in
% ecYeastGEM.
Expand All @@ -98,7 +103,7 @@
% complexInfo can be given as second input, but not needed here, as it will
% read the file that was written by getComplexData.

% STEP 5 Store model in YAML format
% STEP 14 Store model in YAML format
% In this script there is code at the end of each stage to store a copy of
% the ecModel. However, only the stage 3 ecModel is kept and distributed
% together with GECKO (as is shown at the end of stage 3).
Expand All @@ -108,11 +113,12 @@
% Uncomment the line below if you want to reload the model.
%ecModel=loadEcModel('ecYeastGEM_stage1.yml');

% STEP 15 Different kcat sources
% Decide which kcat source to use. In the steps below, all options are
% shown. Not all are required, it is up to the user to decide which ones
% they want to use.

% STEP 6 Fuzzy matching with BRENDA
% STEP 16-17 Gather EC numbers, required for BRENDA as kcat source
% Requires EC numbers, which are here first taken from the starting model,
% with the missing ones taken from Uniprot & KEGG databases.
ecModel = getECfromGEM(ecModel);
Expand All @@ -123,47 +129,51 @@
% reactions.
ecModel = getECfromDatabase(ecModel);

% Do the actual fuzzy matching with BRENDA.
% STEP 18-19 Gather kcat values from BRENDA
kcatList_fuzzy = fuzzyKcatMatching(ecModel);
% Now we have a kcatList, which will be used to update ecModel in a later
% step.

% STEP 7 DLKcat prediction through machine learning
% Requires metabolite SMILES, which are gathered from PubChem.
% STEP 20-22 Gather metabolite SMILES, required for DLKcat as kcat source
% Metabolite SMILES are gathered from PubChem.
[ecModel, noSmiles] = findMetSmiles(ecModel);

% STEP 23 Prepare DLKcat input file
% An input file is written, which is then used by DLKcat, while the output
% file is read back into MATLAB. The full_ecModel tutorial already comes
% file is read back into MATLAB.The full_ecModel tutorial already comes
% with a DLKcat.tsv file populated with kcat values. If this file should be
% regenerated, the line below should be uncommented. Note that this
% overwrites the existing files, thereby discarding existing kcat predictions.
%writeDLKcatInput(ecModel,[],[],[],[],true);

% STEP 24 Run DLKcat
% runDLKcat will run the DLKcat algorithm via a Docker image. If the
% DLKcat.tsv file already has kcat values, these will all be overwritten.
%runDLKcat();

% STEP 25 Load DLKcat output
kcatList_DLKcat = readDLKcatOutput(ecModel);

% STEP 8 Combine kcat from BRENDA and DLKcat
% STEP 26 Combine kcat from BRENDA and DLKcat
kcatList_merged = mergeDLKcatAndFuzzyKcats(kcatList_DLKcat, kcatList_fuzzy);

% STEP 9 Take kcatList and populate edModel.ec.kcat
% STEP 27 Take kcatList and populate edModel.ec.kcat
ecModel = selectKcatValue(ecModel, kcatList_merged);

% STEP 10 Apply custom kcat values
% STEP 28 Apply custom kcat values
% During the development of yeast-GEM ecModels (through GECKO 1 & 2),
% various kcat values have been manually curated, to result in a model that
% is able to validate a wide range of phenotypes. These curations are
% summarized under /data/customKcats.tsv, and applied here.
[ecModel, rxnUpdated, notMatch] = applyCustomKcats(ecModel);

% To modify the S-matrix, to actually implement the kcat/MW constraints,
% applyKcatConstraints should be run. This is done in STEP 13.
% applyKcatConstraints should be run. This is done in STEP 31.

% STEP 11 Get kcat values across isozymes
% STEP 29 Get kcat values across isozymes
ecModel = getKcatAcrossIsozymes(ecModel);

% STEP 12 Get standard kcat
% STEP 30 Get standard kcat
% Assign a protein cost to reactions without gene assocation. These
% reactions are identified as those without a corresponding entry in
% ecModel.grRules. The following reactions are exempted:
Expand All @@ -184,8 +194,8 @@
% folder, containing the relevant reaction identifiers.
[ecModel, rxnsMissingGPR, standardMW, standardKcat] = getStandardKcat(ecModel);

% STEP 13 Apply kcat constraints from ecModel.ec.kcat to ecModel.S
% While STEP 9-12 add or modify values in ecModel.ec.kcat, these contraints
% STEP 31 Apply kcat constraints from ecModel.ec.kcat to ecModel.S
% While STEP 27-30 add or modify values in ecModel.ec.kcat, these contraints
% are not yet applied to the S-matrix: the enzyme is not yet set as pseudo-
% substrate with the -MW/kcat stoichiometry. Now, applyKcatConstraints will
% take the values from ecModel.ec.kcat, considers any complex/subunit data
Expand All @@ -196,7 +206,7 @@
% reapply the new constraints onto the metabolic model.
ecModel = applyKcatConstraints(ecModel);

% STEP 14 Set upper bound of protein pool
% STEP 32 Set upper bound of protein pool
% The protein pool exchange is constrained by the total protein content
% (Ptot), multiplied by the f-factor (ratio of enzymes/proteins) and the
% sigma-factor (how saturated enzymes are on average: how close to their
Expand Down Expand Up @@ -224,7 +234,7 @@
% Uncomment the line below if you want to reload the model.
%ecModel=loadEcModel('ecYeastGEM_stage2.yml');

% STEP 15 Test maximum growth rate
% STEP 33-38 Test maximum growth rate
% Test whether the model is able to reach maximum growth if glucose uptake
% is unlimited. First set glucose uptake unconstraint.
ecModel = setParam(ecModel,'lb','r_1714',-1000);
Expand All @@ -237,7 +247,7 @@
% The growth rate is far below 0.41 (the maximum growth rate of
% S. cerevisiae, that is entered in the model adapter as obj.params.gR_exp.

% STEP 16 Relax protein pool constraint
% STEP 39-42 Relax protein pool constraint
% As a simplistic way to ensure the model to reach the growth rate, the
% upper bound of the protein pool exchange reaction can be increased to
% whatever is required. This works, but STEP 17 is preferred.
Expand All @@ -254,7 +264,7 @@
ecModel = setParam(ecModel,'lb','r_4041',0);
ecModel = setParam(ecModel,'obj','r_4041',1);

% STEP 17 Sensitivity tuning
% STEP 43-44 Sensitivity tuning
% First reset the protein pool constraint to a more realistic value,
% reverting STEP 16.
ecModel = setProtPoolSize(ecModel);
Expand All @@ -263,7 +273,7 @@
% Inspect the tunedKcats structure in table format.
struct2table(tunedKcats)

% STEP 18 Curate kcat values based on kcat tuning
% STEP 45-51 Curate kcat values based on kcat tuning
% As example, the kcat of 5'-phosphoribosylformyl glycinamidine synthetase
% (reaction r_0079) was increased from 0.05 to 5. Inspecting the kcat
% source might help to determine if this is reasonable.
Expand Down Expand Up @@ -301,19 +311,20 @@

saveEcModel(ecModel,'ecYeastGEM_stage3.yml');

% STEP 52 Save the curated ecModel without proteomics integration
% This functional ecModel will also be kept in the GECKO GitHub.
saveEcModel(ecModel,'ecYeastGEM.yml');

%% STAGE 4 integration of proteomics data into the ecModel.
% Uncomment the line below if you want to reload the model.
%ecModel=loadEcModel('ecYeastGEM_stage3.yml');

% STEP 19 Load proteomics data and constrain ecModel
% STEP 53-57 Load proteomics data and constrain ecModel
protData = loadProtData(3); %Number of replicates, only one experiment.
ecModel = fillEnzConcs(ecModel,protData);
ecModel = constrainEnzConcs(ecModel);

% STEP 20 Update protein pool
% STEP 58 Update protein pool
% The protein pool reaction will be constraint by the remaining, unmeasured
% enzyme content. This is calculated by subtracting the sum of
% ecModel.ec.concs from the condition-specific total protein content. The
Expand All @@ -322,7 +333,7 @@
fluxData = loadFluxData();
ecModel = updateProtPool(ecModel,fluxData.Ptot(1));

% STEP 21 Load flux data
% STEP 59-63 Load flux data
% Matching the proteomics sample(s), condition-specific flux data needs to
% be loaded to constrain the model. This was already loaded in Step 20 for
% gathering Ptot, but is repeated here nonetheless. Flux data is read from
Expand All @@ -336,7 +347,7 @@
% The growth rate of 0.1 is far from being reached. Therefore, the next
% step is to flexibilize enzyme concentrations.

% STEP 22 Enzyme concentrations are flexibilized (increased), until the
% STEP 64-65 Enzyme concentrations are flexibilized (increased), until the
% intended growth rate is reached. This is condition-specific, so the
% intended growth rate is gathered from the fluxData structure.
[ecModel, flexEnz] = flexibilizeEnzConcs(ecModel,fluxData.grRate(1),10);
Expand Down Expand Up @@ -364,32 +375,11 @@
saveEcModel(ecModel,'ecYeastGEM_stage4');

%% STAGE 5: simulation and analysis
% STEP 23 Example of various useful RAVEN functions
% % Set the upper bound of reaction r_0003 to 10.
% ecModel = setParam(ecModel,'ub','r_0003',10);
% % Set the lower bound of reaction r_0003 to 0.
% ecModel = setParam(ecModel,'lb','r_0003',0);
% % Set the objective function to maximize reaction 'r_4041'.
% ecModel = setParam(ecModel,'obj','r_4041',1);
% % Set the objective function to minimize protein usage.
% ecModel = setParam(ecModel,'obj','prot_pool_exchange',1);
% % Perform flux balance analysis (FBA).
% sol = solveLP(ecModel);
% % Perform parsimonious FBA (minimum total flux).
% sol = solveLP(ecModel,1);
% % Inspect exchange fluxes from FBA solution.
% printFluxes(ecModel,sol.x)
% % Inspect all (non-zero) fluxes from FBA solution.
% printFluxes(ecModel,sol.x,false)
% % Export to Excel file (will not contain potential model.ec content).
% exportToExcelFormat(ecModel,'filename.xlsx');

% STEP 24 Simulate Crabtree effect with protein pool

% (Re)load the ecModel without proteomics integration.
% STEP 66 (Re)load the ecModel without proteomics integration.
ecModel = loadEcModel('ecYeastGEM.yml');

% We will soon run a custom plotCrabtree function that is kept in the code
% STEP 67-68 Simulate Crabtree effect with protein pool
% We will below run a custom plotCrabtree function that is kept in the code
% subfolder. To run this function we will need to navigate into the folder
% where it is stored, but we will navigate back to the current folder
% afterwards.
Expand Down Expand Up @@ -421,16 +411,16 @@
% It is obvious that no total protein constraint is reached, and Crabtree
% effect is not observed.

% Perform the Crabtree simulation on the pre-Step 17 ecModel (where kcat
% Perform the Crabtree simulation on the pre-Step 33 ecModel (where kcat
% sensitivity tuning has not yet been applied).
ecModel_preTuning = loadEcModel('ecYeastGEM_stage2.yml');
ecModel_preTuning = setParam(ecModel_preTuning,'lb','r_1714',-1000);
plotCrabtree(ecModel_preTuning);
saveas(gcf,fullfile(params.path,'output','crabtree_preStep17.pdf'))
saveas(gcf,fullfile(params.path,'output','crabtree_preStep33.pdf'))
% Without kcat tuning, the model gets constrained too early (at too low
% growth rates), which means that no solutions exist at high growth rates.

% STEP 25 Selecting objective functions
% STEP 69-70 Selecting objective functions
ecModel = setParam(ecModel,'obj',params.bioRxn,1);
sol = solveLP(ecModel)
fprintf('Growth rate that is reached: %f /hour.\n', abs(sol.f))
Expand All @@ -444,7 +434,7 @@
sol = solveLP(ecModel)
fprintf('Minimum protein pool usage: %.2f mg/gDCW.\n', abs(sol.f))

% STEP 26 Inspect enzyme usage
% STEP 71 Inspect enzyme usage
% Show the result from the earlier simulation, without mapping to
% non-ecModel.
ecModel = setParam(ecModel, 'obj', 'r_1714', 1);
Expand All @@ -454,15 +444,15 @@
usageReport = reportEnzymeUsage(ecModel,usageData);
usageReport.topAbsUsage

% STEP 27 Compare fluxes from ecModel and conventional GEM
% STEP 72 Compare fluxes from ecModel and conventional GEM
sol = solveLP(ecModel);
% Map the ecModel fluxes back to the conventional GEM.
[mappedFlux, enzUsageFlux, usageEnz] = mapRxnsToConv(ecModel, model, sol.x);
% Confirm that mappedFlux is of the same length as model.rxns.
numel(mappedFlux)
numel(model.rxns)

% STEP 28 Perform (ec)FVA
% STEP 73-75 Perform (ec)FVA
% Perform FVA on a conventional GEM, ecModel, and ecModel plus proteomics
% integration, all under similar exchange flux constraints. First make sure
% that the correct models are loaded.
Expand Down Expand Up @@ -505,7 +495,7 @@
plotEcFVA(minFlux, maxFlux);
saveas(gca, fullfile(params.path,'output','ecFVA.pdf'))

% STEP 29 Compare light and full ecModels
% STEP 76-77 Compare light and full ecModels
% For a fair comparison of the two types of ecModels, the custom
% plotlightVSfull function makes a light and full ecModel for yeast-GEM and
% compares their flux distributions at maximum growth rate.
Expand All @@ -518,5 +508,5 @@
changedFlux = abs(fluxRatio-1) > 0.001;
model.rxnNames(changedFlux)

% STEP 30 Contextualize ecModels
% STEP 78-79 Contextualize ecModels
% This step is exemplified in tutorials/light_ecModel/protocol.m.
Loading

0 comments on commit 9546b87

Please sign in to comment.