Lab4  Metabolic network function analysis with COBRA

The Constraint-based Reconstruction and Analysis (COBRA) Toolbox was developed by Palsson's group at UCSD. The COBRA toolbox is described in the Nature Protocols article: Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox by Scott A Becker, Adam M Feist, Monica L Mo, Gregory Hannum, Bernhard Ø Palsson & Markus J Herrgard. (refs: Nature Protocols 2, 727 - 738 (2007), doi:10.1038/nprot.2007.99)

Equipment

• The COBRA Toolbox version 3.0(the latest stable version)(early version is also acceptable)
• A computer capable of running MATLAB  R2018a(the latest stable version) of MATLAB (Mathworks Inc.) numerical computation and visualization software(early version is also acceptable)
libSBML programming library 5.16.0(the latest stable version)(http://www.sbml.org)
SBMLToolbox version 4.1.0 (the latest stable version) for MATLAB to allow reading and writing models in SBML format (http://www.sbml.org)
• A linear programming (LP) solver. Currently the COBRA Toolbox supports:
o Gurobi Optimization through Gurobi Mex!Recommended!
o CPLEX (ILOG Inc.) through Tomlab Optimization Inc.
o GLPK through glpkmex, GLPK does not provide accurate solutions for OptKnock or GDLS calculations as implemented in the Toolbox.
• A quadratic programming (QP) solver. (optional) Currently the COBRA toolbox supports:
o CPLEX (ILOG Inc.) through Tomlab
o QPNG (part of GLPK)QPNG does not provide accurate solutions for MOMA as implemented in the Toolbox.


Equipment Setup
• Install MATLAB
• Install libSBML, the SBML Toolbox, and selected solvers(Gurobi) according to their specific instructions.
• Unpack the COBRA archive(or follow the instructions on home page above !!beware of doing every step accordingly to meet all prerequisites!!(such as the options when installing git bash))
*Do not put COBRA or other toolboxes into 'toolbox' Directory (../MATLAB/toolbox) created by MATLAB, or you will need to refreash and reload it (rehash tooboxcache) after adding new toolboxes, which will be unnecessarily labour-comsuming.
Cobra_Install_Path is the path to the top level directory for the COBRA Toolbox

Procedure
*The functions in examples are based on older version of COBRA, so they may be slightly different from that in the latest version provided above.

Initializing the Toolbox
1 | Navigate to the directory where you installed the Toolbox: >> initCobraToolbox()
2| Save the paths added if desired: >> savepath()

Changing COBRA solvers
3| Set the solvers used by the COBRA Toolbox using the following function:

>> changeCobraSolver(solverName, [solverType]);

Variables are defined as follows: solverName specifies the solver package to use; the COBRA Toolbox currently supports ̳gurobi, tomlab_cplex, glpk, and qpng. solverType (default LP) specifies the type of problems (LP, MILP, QP, MIQP, NLP) to solve with the solver specified by solverName. When changeCobraSolver is called without any arguments, it will return the names of the current solvers settings.

Run COBRA Toolbox test suite

4 | The test suite contains scripts that test the functionality of scripts within the Toolbox. The scripts in the testing directory provide useful examples of many of the Toolboxs functions.

>> testAll()

testAll sequentially navigates the test suite directory (testing) and runs each test. Upon completion, it displays which tests were completed successfully and which failed.
Caution! For solver suites other than Gurobi or Tomlab, the user may encounter failures that require tuning of solver parameters.
Critical! It is not necessary for all tests to be passed to use the COBRA Toolbox for your applications. It is only necessary that the tests related to your work do not fail.

Read COBRA-compliant SBML models into MATLAB
5 | Load a COBRA-compliant model into MATLAB. To load a model, navigate within MATLAB to the directory containing the model and call the following function from the command window:

>> model = readCbModel([filename]);

When called with no arguments, readCbModel will prompt the user to select a file using a dialog box. readCbModel supports SBML-formatted (Level 2 versions 1 or 4) files. SBML files for a variety of organisms are available from the BiGG database. The function returns a COBRA Toolbox model structure containing the necessary fields to describe the model for use with subsequent steps.

You can read the E.coli Metabolic model Ecoli1260 or other models for your team project.

CRITICAL STEP! If the model is not properly loaded into MATLAB, none of the following functions will work. Ensure that libSBML and SBML Toolbox are properly installed and accessible by MATLAB and that the SBML file is formatted correctly.

Saving the model
6 | COBRA Toolbox model structures may be saved as text or SBML files. On Microsoft Windows, the structures may also be written to an Excel (xls) file.

>> writeCbModel(model, format, [fileName], [compSymbolList], [compNameList], [SBMLLevel], [SBMLVersion]);

For format use sbml for SBML file format or xls for Excel format (only available on MS Windows). For filename use the name of the file. If not provided, a dialog box will prompt the user to specify name and location of the output file. This feature is dependent on the SBML Toolbox to generate the XML file. The toolbox is able to output SBML level 2 versions 1 or 4.

Modify COBRA Toolbox models
7 | Once the model is loaded into MATLAB by readCbModel, the model can be modified to simulate different conditions such as altering reaction bonds (A), adding (B) or removing reactions (C) or changing the model objective (D).
(A) to alter reaction bounds:

>> model = changeRxnBounds(model, rxnNameList, value, boundType);

rxnNameList is a cell array of reaction ids corresponding to reaction ids in model.rxns; value is a floating point number; boundType specifies which bounds to change for the reactions and can take values of l, u, or b for lower, upper, or both, respectively. This function is useful for defining the in silico media composition by changing the lower bounds of exchange reactions.

(B) New reactions can be added to a COBRA Toolbox model using the following function:

>> [model] = addReaction(model, rxnName, metaboliteList, stoichCoeffList, [revFlag], [lowerBound], [upperBound], [objCoeff], [subsystem], [grRule], [geneNameList], [systNameList], [checkDuplicate]);

metaboliteList is a list of metabolites involved in the reaction (if a metabolite does not exist in model.mets then this function will add it);

stoichCoeffList is the stoichiometric coefficients for the corresponding elements in metaboliteList. This function checks for reactions with the same name or stoichiometic coefficients, however this can be disabled by setting checkDuplicate to false.

(C) To remove a reaction, call the following function:

>> [model] = removeRxns(model, rxnRemoveList)

rxnRemoveList a cell array of reaction ids corresponding to elements in model.rxns. Metabolites that are no longer involved in any reactions are removed from the model. The model may no longer function after reactions have been removed.

(D) COBRA modeling often entails performing calculations that focus on a specified objective, such as growth59. To change the objective function, use the following function:

>> model = changeObjective(model, rxnNameList, [objectiveCoeff]);

rxnNameList is either a string or a cell array of strings containing reaction ids corresponding to elements in model.rxns that should be included in the objective function; objectiveCoeff specifies the weight given to the respective reaction in rxnNameList. If left empty, objectiveCoeff is assumed to be 1.

Simulate optimal growth using flux-balance analysis (FBA)
8| Simulating optimal growth using FBA is one of the fundamental COBRA phenotypic calculations for metabolic network models. FBA is a method that calculates the flow of metabolites through a metabolic network
28. Growth is simulated by optimizing the model for flux through the model's biomass function; however, it is also possible to perform simulations that focus on optimizing other biological characteristics, such as ATP production. The reaction to optimize is set using the model.c vector.

In addition to specifying an objective, it is also necessary to define the in silico growth medium; this is accomplished by modifying the bounds of exchange reactions. Exchange reactions for metabolites comprising the in silico growth medium should have a lower bound less than 0; all other exchange reactions should have a lower bound of 0. All exchange reactions should have an upper bound greater than 0 to prevent metabolite build up. The solution returned will have units based on the units used in the model (typically mmolgDW-1h-1).

>> [solution] = optimizeCbModel(model, [osenseStr], [minNorm], [allowLoops])

where: osenseStr is either max or min to maximize or minimize the value of the objective; minNorm (default 0, if nonzero, attempt to find a solution that minimizes the presence of loops0; allowLoops (default true, if set to false, use the loop law algorithm49 to remove loops, this procedure can be time consuming).

optimizeCbModel will return a solution structure containing: the objective value f, the primal solution x, the dual solution y, the reduced cost w, a universal status flag stat, a solver specific status flag origStat, and the time to compute the solution time. The primal solution, x represents the flux carried by each reaction within the model. The dual solution, y represents the shadow prices for each metabolite and indicates how much the addition of the corresponding metabolite will increase or decrease the objective value28, 60. The reduced cost, w, indicates how much each reaction affects the objective. A solver status of 1 indicates that an optimal solution was found.

Simulating deletion studies
9 | Deletion studies can be easily simulated with
in silico models. Gene deletion methods within the Toolbox are dependent on the proper setup of the gene-reaction matrix as well as the rules defining the Boolean relationship between genes and reactions. Reactions that are affected by a gene deletion have their upper and lower flux bounds set to zero and are therefore not functional. The set of reactions on which a gene deletion has an effect is calculated using the gene reaction association and rules.

It is possible to study either (A) single essential gene deletions or (B) pairs of synthetic lethal genes. The possible results from deletion studies are: 1) unchanged maximal growth, 2) reduced maximal growth, or 3) no growth (lethal). Deletion studies can be used to predict gene/reaction essentiality.

(A) Essential Gene Study

>> [grRatio, grRateKO, grRateWT, hasEffect, delRxns, fluxSolution] = singleGeneDeletion(model, method, [geneList])

where: method can be either FBA(default), MOMA62 or linear MOMA (lMOMA); geneList, is a cell array of genes corresponding to model.genes (if not provided deletion simulations are performed for all genes in the model); grRatio is the growth rate of the knockout / growth rate of WT; grRateKO is the growth rate of the knockouts; grRateWT is the wild-type growth rate; hasEffect is a Boolean list that contains true for each gene whose deletion alters the growth rate; delRxns contains a list of the reactions, the bounds of which are set to 0 for each gene deletion; and fluxSolution is the flux solution for each deletion.

(B) Synthetic Lethal Study

>> [grRatioDble, grRateKO, grRateWT] = doubleGeneDeletion(model, method, [geneList1], [geneList2])

where: method can be either FBA(default), MOMA62 or linear MOMA (lMOMA); geneList1 is a cell array of genes corresponding to model.genes (if not provided, the function assumes all genes in model.genes are to be interrogated); geneList2 is a cell array of genes that correspond to the second set of genes in the synthetic lethal pair (if not provided, the function assumes that all genes in model.genes are to be interrogated); grRatioDble is the growth rate of the knockout / growth rate of WT; grRateKO is the growth rate of the knockouts; and grRateWT is the wild-type growth rate.

Flux Variability Analysis (FVA)
10 | FBA only returns a single flux distribution that corresponds to maximal growth under given growth conditions. However, alternate optimal solutions may exist which correspond to maximal growth. FVA calculates the full range of numerical values for each reaction flux within the network
63.
To determine the minimum and maximum flux values that the reactions within the model can carry while obtaining a specific percentage of optimal growth rate:

>> [minFlux maxFlux] = fluxVariability(model, optPercentage, [rxnNameList], [verbFlag], [allowLoops])

where optPercentage (default 100) specifies the percentage of optimal that an alternate flux distribution must realize to be considered an acceptable alternative flux distribution.

Visualization of Flux Variability Analysis Results
11| To visualize the results from this function, a flux variability map can be generated from an existing reaction map, color coding reactions based on flux directionality.

>> drawFluxVariability(map, model, minFlux, maxFlux, [options])

where: map is the map structure corresponding to the model read in using readCbMap(mapfile); model is the COBRA model structure used in the fluxVariability function; minFlux and maxFlux are vectors generated by the fluxVariability function described above; and options is a structure containing optional parameters such as edge and node color and size: bi-directional reversible reactions are colored green, unidirectional reversible reactions that carry flux in the forward direction are colored magenta, unidirectional reversible reactions that carry flux only in the reverse direction are colored cyan, and irreversible fluxes are colored blue.

Sampling the solution space

12| FBA only returns a single optimal point and thus gives little information about the entire solution space. An alternative approach is to characterize the solution space using sampling. The generalized parallel sampler samples any arbitrary linearly- constrained space by moving a fixed number of points in parallel.

>> [sampleStructOut, mixedFrac] = gpSampler(sampleStruct, [nPoints], [bias], [maxTime], [maxSteps])

where: sampleStruct is the COBRA Toolbox problem structure for linear programming problems (see Supplementary Information); nPoints is the number of sample points is set through; maxTime is the maximum sampling time; bias is a structure that imposes marginal distributions on reactions; sampleStructOut is sampleStruct with the addition of the points field containing the solutions; and mixedFrac gives an estimate of how mixed the sampling solution is relative to the warmup pointsa mixedFrac value of 0.5 indicates complete mixing.

Finding correlated reaction sets

13| Sampling can be used to determine dependencies between reactions that can be further used to define modules of

reactions in networks that have to be co-utilized in precise stoichiometric ratios (correlated reaction sets). Correlated reactions

sets are unbiased, condition-dependent definitions of modules in metabolic networks that can, for example, be compared with

modules obtained using gene expression data27.

Starting with a set of samples (the output from first calling sampleCbModel), identify the correlated reactions sets with:

>> [sets,setNumber,setSize] = identifyCorrelSets(modelS,samples,corrThr)

The inputs to the function are the model (modelS) and a set of random flux samples (samples) as well as an optional argument

defining the minimum correlation threshold (corrThr). The function returns a list of the correlated reaction sets (sets) as well as

the set number that each reaction belongs to (vector setNumber) and the set sizes (vector setSize). If a reaction is not part of

any correlated reaction set, the corresponding entry in setNumber equals zero.

Robustness analysis

14| The effect of reducing flux through a single reaction on growth is a network property of great interest. One could, for

example, study the effect of decreasing the expression level of a specific metabolic enzyme on the growth rate in order to

predict haploinsufficient phenotypes in eukaryotes. Robustness analysis allows the computation of how an objective of interest

(e.g., growth rate) changes as the flux through a specific reaction of interest varies in magnitude. The function used for a

robustness analysis is:

>>  [controlFlux, objFlux] = robustnessAnalysis(model, controlRxn, nPoints)

This function is used to compute and plot the value of the model objective function as a function of flux values for a reaction

of interest (controlRxn) as a means to analyze the network robustness with respect to that reaction. The range of reaction values

vary within the minimum and maximum bounds for the reaction of interest and the objective function is maximized using FBA

according to each set flux value of controlRxn. A plot with a defined number of points (nPoints) can be generated to visually

assess how the objective function changes as the flux through the control reaction varies.

Assignment

Use the E.coli metabolic model Ecoli1260 or other models for your team project as case, do the above analysis, and write a WORD report including the function and main results of each step.

Please send the report to molle_faust@hotmail.com before May.26 00:00am.  (Label the file name with Lab4+your name and ID)