Lab4 Metabolic network function
analysis with COBRA
The Constraint-based
Reconstruction and Analysis (COBRA) Toolbox was developed by Palsson's group at UCSD. The COBRA toolbox is described in
the Nature Protocols article: Quantitative
prediction of cellular metabolism with constraint-based models: the COBRA
Toolbox by Scott A Becker, Adam M Feist, Monica L
Mo, Gregory Hannum, Bernhard Ø Palsson
& Markus J Herrgard. (refs:
Nature Protocols 2, 727 - 738 (2007), doi:10.1038/nprot.2007.99)
Equipment
• The COBRA
Toolbox version 3.0(the latest stable version)(early version is also acceptable)
• A computer capable of running MATLAB R2018a(the latest stable version) of MATLAB (Mathworks Inc.) numerical
computation and visualization software(early version is also acceptable)
• libSBML programming library 5.16.0(the latest stable version)(http://www.sbml.org)
• SBMLToolbox version 4.1.0
(the latest stable version) for MATLAB to allow reading and writing models
in SBML format (http://www.sbml.org)
• A linear programming (LP) solver. Currently the COBRA Toolbox
supports:
o Gurobi Optimization through
Gurobi Mex!Recommended!
o CPLEX (ILOG Inc.) through Tomlab Optimization Inc.
o GLPK through glpkmex, GLPK does not provide accurate solutions for OptKnock or GDLS calculations as
implemented in the Toolbox.
• A quadratic programming (QP) solver. (optional) Currently the COBRA
toolbox supports:
o CPLEX (ILOG Inc.) through Tomlab
o QPNG (part of GLPK)QPNG does not provide accurate solutions for MOMA as implemented in the Toolbox.
Equipment Setup
• Install MATLAB
• Install libSBML, the SBML Toolbox,
and selected solvers(Gurobi) according to their specific instructions.
• Unpack the COBRA archive(or follow the instructions on home page above !!beware of doing every step accordingly to meet all prerequisites!!(such as the options when installing git bash))
*Do not put COBRA or other toolboxes into 'toolbox' Directory (../MATLAB/toolbox) created by MATLAB, or you will need to refreash and reload it (rehash tooboxcache) after adding new toolboxes, which will be unnecessarily labour-comsuming.
• Cobra_Install_Path is the path to the top level
directory for the COBRA Toolbox
Procedure
Initializing the Toolbox
1 | Navigate to the directory where you installed the Toolbox: >> initCobraToolbox()
2| Save the paths added if desired: >> savepath()
Changing COBRA solvers
3| Set the solvers used by the COBRA Toolbox
using the following function:
>> changeCobraSolver(solverName, [solverType]);
Variables are defined as
follows: solverName specifies
the solver package to use; the COBRA Toolbox currently
supports ̳gurobi, tomlab_cplex, glpk, and
qpng. solverType (default
LP) specifies the type of problems (LP, MILP, QP, MIQP, NLP) to
solve with the solver specified by solverName. When changeCobraSolver is called without any arguments, it will
return the names of the current solvers settings.
Run COBRA Toolbox test suite
4 | The
test suite contains scripts that test the functionality of scripts within the
Toolbox. The scripts in the testing directory provide useful examples of many
of the Toolboxs functions.
>> testAll()
testAll sequentially navigates the test suite directory
(testing) and runs each test. Upon completion, it displays which tests were
completed successfully and which failed.
Caution! For
solver suites other than Gurobi or Tomlab, the user may encounter failures that require tuning
of solver parameters.
Critical! It is not necessary for all tests to be passed to use the COBRA Toolbox
for your applications. It is only necessary that the tests related to your work
do not fail.
Read COBRA-compliant SBML models into MATLAB
5 | Load a COBRA-compliant model into MATLAB.
To load a model, navigate within MATLAB to the
directory containing the model and call the following function from the command
window:
>> model
= readCbModel([filename]);
When called with no arguments, readCbModel will prompt the user to select a file using a
dialog box. readCbModel
supports SBML-formatted (Level 2 versions 1 or 4)
files. SBML files for a variety of organisms are
available from the BiGG database. The
function returns a COBRA Toolbox model structure
containing the necessary fields to describe the model for use with subsequent
steps.
You can read the E.coli Metabolic model Ecoli1260 or other models for your team
project.
CRITICAL STEP! If the model is not properly loaded into MATLAB, none of the following functions will work. Ensure
that libSBML and SBML Toolbox
are properly installed and accessible by MATLAB and
that the SBML file is formatted correctly.
Saving the model
6 | COBRA Toolbox model structures may be saved as text
or SBML files. On Microsoft Windows, the structures may
also be written to an Excel (xls) file.
>> writeCbModel(model, format, [fileName], [compSymbolList], [compNameList],
[SBMLLevel], [SBMLVersion]);
For format use sbml for SBML file format or xls for Excel format (only available on MS
Windows). For filename use the name of the file. If not provided, a
dialog box will prompt the user to specify name and location of the output
file. This feature is dependent on the SBML Toolbox to
generate the XML file. The toolbox is able to output SBML level 2 versions 1 or 4.
Modify COBRA Toolbox models
7 | Once the model is loaded into MATLAB by readCbModel, the model can be modified to simulate different
conditions such as altering reaction bonds (A), adding (B) or removing
reactions (C) or changing the model objective (D).
(A) to alter reaction bounds:
>> model
= changeRxnBounds(model, rxnNameList, value, boundType);
rxnNameList is a cell
array of reaction ids corresponding to reaction ids in model.rxns;
value is a floating point number; boundType specifies
which bounds to change for the reactions and can take values of l, u, or
b for lower, upper, or both, respectively. This function is useful for
defining the in silico media composition by changing
the lower bounds of exchange reactions.
(B) New reactions can be added
to a COBRA Toolbox model using the following function:
>> [model] = addReaction(model, rxnName, metaboliteList, stoichCoeffList,
[revFlag], [lowerBound], [upperBound], [objCoeff],
[subsystem], [grRule], [geneNameList],
[systNameList], [checkDuplicate]);
metaboliteList is a list of metabolites involved in the reaction (if a
metabolite does not exist in model.mets then this
function will add it);
stoichCoeffList is the stoichiometric
coefficients for the corresponding elements in metaboliteList.
This function checks for reactions with the same name or stoichiometic coefficients, however this can be disabled by
setting checkDuplicate to false.
(C) To remove a reaction, call
the following function:
>> [model] = removeRxns(model, rxnRemoveList)
rxnRemoveList a cell array of reaction ids corresponding to elements
in model.rxns. Metabolites that are no longer
involved in any reactions are removed from the model. The model may no longer
function after reactions have been removed.
(D) COBRA
modeling often entails performing calculations that focus on a specified
objective, such as growth59. To
change the objective function, use the following function:
>> model
= changeObjective(model, rxnNameList, [objectiveCoeff]);
rxnNameList is either a string or a cell array of strings containing
reaction ids corresponding to elements in model.rxns
that should be included in the objective function; objectiveCoeff
specifies the weight given to the respective reaction in rxnNameList.
If left empty, objectiveCoeff is assumed to be 1.
Simulate optimal growth using flux-balance analysis (FBA)
8| Simulating optimal growth using FBA is one of the
fundamental COBRA phenotypic calculations for metabolic
network models. FBA is a method that calculates the
flow of metabolites through a metabolic network28. Growth is simulated by optimizing the model for flux
through the model's biomass function; however, it is also possible to perform
simulations that focus on optimizing other biological characteristics, such as ATP production. The reaction to optimize is set using the model.c vector.
In addition to specifying an
objective, it is also necessary to define the in silico growth
medium; this is accomplished by modifying the bounds of exchange reactions.
Exchange reactions for metabolites comprising the in silico growth
medium should have a lower bound less than 0; all other exchange reactions
should have a lower bound of 0. All exchange reactions should have an upper
bound greater than 0 to prevent metabolite build up. The solution returned will
have units based on the units used in the model (typically mmolgDW-1h-1).
>> [solution] = optimizeCbModel(model, [osenseStr], [minNorm], [allowLoops])
where: osenseStr
is either max or min to maximize or minimize the value of the objective; minNorm (default 0, if nonzero, attempt to find a solution
that minimizes the presence of loops0; allowLoops
(default true, if set to false, use the loop law algorithm49 to remove loops, this procedure can be time consuming).
optimizeCbModel will
return a solution structure containing: the objective value f, the primal
solution x, the dual solution y, the reduced cost w, a universal status
flag stat, a solver specific status flag origStat,
and the time to compute the solution time. The primal solution, x
represents the flux carried by each reaction within the model. The dual
solution, y represents the shadow prices for each metabolite and indicates
how much the addition of the corresponding metabolite will increase or decrease
the objective value28, 60. The
reduced cost, w, indicates how much each reaction affects the objective. A
solver status of 1 indicates that an optimal solution was found.
Simulating deletion studies
9 | Deletion studies can be easily simulated with in silico models.
Gene deletion methods within the Toolbox are dependent on the proper setup of
the gene-reaction matrix as well as the rules defining the Boolean relationship
between genes and reactions. Reactions that are affected by a gene deletion
have their upper and lower flux bounds set to zero and are therefore not
functional. The set of reactions on which a gene deletion has an effect is
calculated using the gene reaction association and rules.
It is possible to study either
(A) single essential gene deletions or (B) pairs of synthetic lethal genes. The
possible results from deletion studies are: 1) unchanged maximal growth, 2)
reduced maximal growth, or 3) no growth (lethal). Deletion studies can be used
to predict gene/reaction essentiality.
(A) Essential Gene Study
>> [grRatio,
grRateKO, grRateWT, hasEffect, delRxns, fluxSolution] = singleGeneDeletion(model, method, [geneList])
where: method can be either
FBA(default), MOMA62 or linear
MOMA (lMOMA); geneList, is a cell array of genes corresponding to model.genes (if not provided deletion simulations are
performed for all genes in the model); grRatio is the
growth rate of the knockout / growth rate of WT; grRateKO
is the growth rate of the knockouts; grRateWT is the
wild-type growth rate; hasEffect is a Boolean list
that contains true for each gene whose deletion alters the growth rate; delRxns contains a list of the reactions, the bounds of
which are set to 0 for each gene deletion; and fluxSolution
is the flux solution for each deletion.
(B) Synthetic Lethal Study
>> [grRatioDble,
grRateKO, grRateWT] = doubleGeneDeletion(model, method, [geneList1], [geneList2])
where: method can be either
FBA(default), MOMA62 or linear
MOMA (lMOMA); geneList1 is
a cell array of genes corresponding to model.genes
(if not provided, the function assumes all genes in model.genes
are to be interrogated); geneList2 is a cell array of genes that correspond to
the second set of genes in the synthetic lethal pair (if not provided, the
function assumes that all genes in model.genes are to
be interrogated); grRatioDble is the growth rate of
the knockout / growth rate of WT; grRateKO is the
growth rate of the knockouts; and grRateWT is the
wild-type growth rate.
Flux Variability Analysis (FVA)
10 | FBA only returns a single flux distribution that
corresponds to maximal growth under given growth conditions. However, alternate
optimal solutions may exist which correspond to maximal growth. FVA calculates the full range of numerical values for each
reaction flux within the network63.
To determine the minimum and maximum flux values that the reactions within the
model can carry while obtaining a specific percentage of optimal growth rate:
>> [minFlux
maxFlux] = fluxVariability(model, optPercentage,
[rxnNameList], [verbFlag],
[allowLoops])
where optPercentage (default 100) specifies the percentage of
optimal that an alternate flux distribution must realize to be considered an
acceptable alternative flux distribution.
Visualization of Flux Variability Analysis Results
11| To visualize the results from this function, a
flux variability map can be generated from an existing reaction map, color
coding reactions based on flux directionality.
>> drawFluxVariability(map, model, minFlux, maxFlux, [options])
where: map is the map structure
corresponding to the model read in using readCbMap(mapfile);
model is the COBRA model structure used in the fluxVariability function; minFlux
and maxFlux are vectors generated by the fluxVariability function described above; and options is a
structure containing optional parameters such as edge and node color and size:
bi-directional reversible reactions are colored green, unidirectional
reversible reactions that carry flux in the forward direction are colored
magenta, unidirectional reversible reactions that carry flux only in the
reverse direction are colored cyan, and irreversible fluxes are colored blue.
Sampling the solution space
12| FBA only returns a single optimal
point and thus gives little information about the entire solution space. An
alternative approach is to characterize the solution space using sampling.
The generalized parallel sampler samples any arbitrary linearly- constrained
space by moving a fixed number of points in parallel.
>> [sampleStructOut,
mixedFrac] = gpSampler(sampleStruct, [nPoints],
[bias], [maxTime], [maxSteps])
where: sampleStruct
is the COBRA Toolbox
problem structure for linear programming problems (see Supplementary
Information); nPoints is the number of sample points
is set through; maxTime is the maximum sampling time;
bias is a structure that imposes marginal distributions on reactions; sampleStructOut is sampleStruct
with the addition of the points field containing the solutions; and mixedFrac gives an estimate of how mixed the sampling
solution is relative to the warmup pointsa mixedFrac value of 0.5 indicates complete mixing.
Finding correlated reaction sets
13| Sampling can be
used to determine dependencies between reactions that can be further used to
define modules of
reactions in networks that have to be
co-utilized in precise stoichiometric ratios
(correlated reaction sets). Correlated reactions
sets are unbiased, condition-dependent definitions of
modules in metabolic networks that can, for example, be compared with
modules obtained using gene expression
data27.
Starting with a set of
samples (the output from first calling sampleCbModel),
identify the correlated reactions sets with:
>> [sets,setNumber,setSize] = identifyCorrelSets(modelS,samples,corrThr)
The inputs to the
function are the model (modelS) and a set of random
flux samples (samples) as well as an optional argument
defining the minimum correlation threshold (corrThr). The function returns a list of the correlated
reaction sets (sets) as well as
the set number that each reaction belongs to (vector setNumber) and the set sizes (vector setSize).
If a reaction is not part of
any correlated reaction set, the corresponding entry in setNumber
equals zero.
Robustness analysis
14| The effect of reducing flux through a single reaction on growth is a network property of great interest. One could, for
example, study the effect of decreasing the
expression level of a specific metabolic enzyme on the growth rate in order to
predict haploinsufficient
phenotypes in eukaryotes. Robustness analysis allows the computation of how an
objective of interest
(e.g., growth rate) changes
as the flux through a specific reaction of interest varies in magnitude. The
function used for a
robustness analysis is:
>> [controlFlux,
objFlux] = robustnessAnalysis(model,
controlRxn, nPoints)
This function is used
to compute and plot the value of the model objective function as a function of
flux values for a reaction
of interest (controlRxn) as a
means to analyze the network robustness with respect to that reaction. The
range of reaction values
vary within the minimum and maximum bounds for the
reaction of interest and the objective function is maximized using FBA
according to each set flux value of controlRxn. A plot with a defined number of points (nPoints) can be generated to visually
assess how the objective function changes as the flux through the control
reaction varies.
Assignment
Use the E.coli metabolic
model Ecoli1260 or other models for your team project as case, do the above analysis, and write
a WORD report including the function and main results of each step.
Please send the report to molle_faust@hotmail.com
before May.26 00:00am. (Label the file name with Lab4+your name
and ID)