Integration of gene expression data into genome-scale metabolic models
Introduction
Following the developments in genomics, there has been an increased focus on the behavior of complete biological systems. In such integrative analysis, also referred to as systems biology (Kitano, 2000; Ideker et al., 2001; Nielsen and Olsson, 2002), biological data from all levels of metabolism, from genome to metabolome, are combined in order to view the studied organism as a whole rather than investigating the single components of the system in isolation. Mathematical models play an important role in integrating this wealth of information, and systems biology is therefore often associated with quantitative investigation of the biological system under study.
Several mathematical modeling frameworks have been developed to describe and analyze the metabolic behavior of an organism or a living cell (Gombert and Nielsen, 2000; Arkin, 2001). One of these approaches is stoichiometric modeling, which relies on mass balances over intracellular metabolites and the assumption of pseudo-steady-state conditions to determine intracellular metabolic fluxes. A stoichiometric model by itself yields an underdetermined system of linear equations, which is not sufficient to calculate a unique flux distribution; the models are therefore combined with additional experimental data or assumptions to yield a well-defined flux map. Applications include calculation of metabolic fluxes for a specific experiment (Aiba and Matsuoka, 1979; Christensen and Nielsen, 2000) and prediction of how phenotypic behavior is affected by genetic or environmental changes (Varma et al., 1993; Edwards and Palsson, 2000; Stuckrath et al., 2002; Segre et al., 2002).
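To make the underdetermination concrete, the following minimal sketch checks the steady-state mass balances for a hypothetical four-reaction toy network. The network, the matrix `S`, the flux values, and all function names are illustrative assumptions and are not taken from the models cited above.

```python
# Hypothetical toy network (illustration only, not from the source):
#   v1: -> A            (substrate uptake)
#   v2: A -> B
#   v3: A -> byproduct  (secreted)
#   v4: B -> biomass
# Rows are intracellular metabolites, columns are reactions v1..v4.
S = [
    [1, -1, -1, 0],   # mass balance for A
    [0, 1, 0, -1],    # mass balance for B
]


def mass_balance_residuals(S, v):
    """Return S*v, the net production rate of each intracellular metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True if every intracellular metabolite is balanced (S*v = 0)."""
    return all(abs(r) <= tol for r in mass_balance_residuals(S, v))


# Two different flux distributions satisfy the same balances:
# two equations cannot fix four unknown fluxes.
flux_a = [10, 10, 0, 10]
flux_b = [10, 4, 6, 4]
print(is_steady_state(S, flux_a), is_steady_state(S, flux_b))


def solve_with_measurements(v1, v3):
    """With uptake (v1) and byproduct secretion (v3) measured,
    the two balances determine the remaining fluxes uniquely."""
    v2 = v1 - v3  # from the balance for A
    v4 = v2       # from the balance for B
    return [v1, v2, v3, v4]


print(solve_with_measurements(10, 6))
```

In this sketch, two measured exchange rates remove the two degrees of freedom left by the balances. This is the sense in which additional experimental data turn an underdetermined system into a well-defined flux map.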
An advantage of stoichiometric modeling is that it is based on well-known stoichiometric coefficients and does not require determination of parameters such as kinetic constants. With the increasing amount of biological knowledge in public databases, it is therefore relatively straightforward to construct detailed metabolic models, and in recent years large-scale models based primarily on genome sequence information have been developed. The modeled organisms include the prokaryotes Haemophilus influenzae, Escherichia coli, and Helicobacter pylori (Edwards and Palsson, 1999, 2000; Schilling et al., 2002), and most recently the eukaryote Saccharomyces cerevisiae (Forster et al., 2003). These models have a few hundred to over a thousand reactions and are typically used for computational studies, for instance systematic insertion or deletion of heterologous reactions to obtain improved metabolic properties (Burgard and Maranas, 2001). A common approach is the so-called flux balance analysis, where metabolic behavior is simulated under the assumption that the cells exhibit optimal growth (Varma and Palsson, 1994). For prokaryotes this assumption seems to hold true in many cases (Edwards et al., 2001), and recently it was demonstrated experimentally that sub-optimally growing cells could be evolved to the predicted optimal phenotype (Ibarra et al., 2002).
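Flux balance analysis is commonly written as a linear program. The formulation below is a standard textbook sketch, not a formula quoted from the source; the symbols are assumptions introduced here: $S$ is the stoichiometric matrix, $v$ the vector of fluxes, $c$ a vector that selects the growth (biomass) reaction, and $v^{\min}$, $v^{\max}$ the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\mathsf{T}} v \\
\text{subject to} \quad & S v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
```

In this notation, the steady-state mass balances appear as the equality constraint, and measured uptake rates or other physiological information enter through the bounds.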
The price to pay for the simplicity of stoichiometric models is that no information on metabolic regulation is included. For instance, the S. cerevisiae model readily provides a good prediction of phenotypic behavior in glucose-limited aerobic and anaerobic chemostats (Famili et al., 2003). However, unless further information is supplied it is difficult to describe batch cultivations where the glucose levels are high and regulatory phenomena, referred to as glucose repression (Ronne, 1995; Gancedo, 1998; Johnston, 1999), drastically decrease respiration, biomass yield, etc. To improve the flux estimates, one can provide additional physiological information, such as experimentally measured uptake rates or knowledge about enzyme activities, to constrain the range of possible flux distributions.
Covert et al. (2001) suggested how the stoichiometric modeling framework could be extended with an overlaid transcriptional regulatory network using a Boolean logic formalism. This was later applied to a moderate-size model of the central carbon metabolism in E. coli (Covert and Palsson, 2002). Apart from the fact that many regulatory phenomena cannot be accurately described by Boolean logic, this approach is at present limited primarily by the available knowledge of regulatory processes. As an alternative, we here investigate how to benefit from genome-wide measurements of transcription, e.g., using DNA or oligonucleotide microarrays. We discuss how such measurements can be combined with genome-scale stoichiometric models, thereby incorporating information on transcriptional regulation and hence improving prediction performance. We give examples in which gene expression data from batch and chemostat cultivations of S. cerevisiae are combined with the recently developed yeast model (Forster et al., 2003), and we compare the results to experimental values obtained by Gombert et al. (2001) using 13C-labeling experiments.
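The idea of overlaying Boolean regulatory rules on flux bounds can be sketched as follows. This is a minimal illustration under stated assumptions, not the formalism of Covert et al.: the reaction names, the bounds, the single rule, and the function `regulated_bounds` are all hypothetical.

```python
def regulated_bounds(bounds, rules, environment):
    """Return a copy of `bounds` in which every reaction whose Boolean
    rule evaluates to False in `environment` is closed (flux fixed at zero)."""
    constrained = dict(bounds)
    for reaction, rule in rules.items():
        if not rule(environment):
            constrained[reaction] = (0.0, 0.0)
    return constrained


# Hypothetical flux bounds (lower, upper) for two lumped pathways.
bounds = {"respiration": (0.0, 20.0), "fermentation": (0.0, 50.0)}

# Hypothetical rule: respiration is available only with oxygen present
# and glucose not at a high level. Real regulation is graded, not on/off.
rules = {
    "respiration": lambda env: env["oxygen"] and not env["high_glucose"],
}

batch = {"oxygen": True, "high_glucose": True}
chemostat = {"oxygen": True, "high_glucose": False}

print(regulated_bounds(bounds, rules, batch))
print(regulated_bounds(bounds, rules, chemostat))
```

The sketch also shows the limitation noted in the text: a Boolean rule can only switch a flux fully on or off, whereas glucose repression decreases respiration without necessarily abolishing it.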
Section snippets
Relating gene expression to fluxes
With the aim of obtaining improved flux predictions by extracting information from gene expression data, an important question is whether, and when, gene expression correlates at all with a given metabolic flux via translation, enzyme regulation, etc. As has been shown, there is not necessarily a correlation between gene expression and protein concentration (Gygi et al., 1999) or between enzyme activity and metabolic flux (ter Kuile and Westerhoff, 2001). Obviously, the same applies for the relation
From transcriptome data to flux constraints: a simple approach
As discussed above, it is difficult to correlate gene expression with metabolic flux. To generate constraints on metabolic fluxes, however, one can exploit the fact that if a gene is not expressed, the corresponding protein and its related activity will, at steady state, be absent. Accordingly, one may use expression data to detect enzyme-coding genes that are not expressed and then constrain the corresponding metabolic fluxes to zero in the model simulation, thus reducing the feasible
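The constraint-generation step described in this snippet can be sketched as follows. The snippet is truncated, so the details here are assumptions rather than the authors' procedure: the present/absent calls in `expressed`, the gene-to-reaction map, the treatment of isoenzymes (a reaction is closed only if all of its genes are called absent), and the handling of genes without data (left unconstrained) are all hypothetical choices.

```python
def constrain_by_expression(bounds, gene_map, expressed):
    """Return a copy of `bounds` with reactions fixed at zero flux when
    every gene associated with the reaction is called 'not expressed'.

    bounds    : {reaction: (lower, upper)}
    gene_map  : {reaction: [genes encoding enzymes for that reaction]}
    expressed : {gene: True/False}; genes without data are simply missing
    """
    constrained = dict(bounds)
    for reaction, genes in gene_map.items():
        calls = [expressed.get(gene) for gene in genes]
        # Close the reaction only if all genes have data and none is expressed.
        if calls and all(call is False for call in calls):
            constrained[reaction] = (0.0, 0.0)
    return constrained


# Hypothetical example data.
bounds = {"R1": (0.0, 1000.0), "R2": (-1000.0, 1000.0), "R3": (0.0, 1000.0)}
gene_map = {
    "R1": ["geneA"],           # single enzyme
    "R2": ["geneB", "geneC"],  # two isoenzymes
    "R3": ["geneD"],           # no expression data available
}
expressed = {"geneA": False, "geneB": False, "geneC": True}

print(constrain_by_expression(bounds, gene_map, expressed))
```

Here only `R1` is closed: `R2` stays open because one isoenzyme gene is expressed, and `R3` stays open because no call is available for its gene. The constrained bounds would then replace the original bounds in the flux balance simulation.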
Prediction of fluxes in batch cultivations of S. cerevisiae
As a case study, the method outlined above was applied to expression data from batch and chemostat cultivations of S. cerevisiae with glucose as carbon source (Piper et al., 2002; Westergaard et al., manuscript in preparation) in combination with the genome-scale model presented in Forster et al. (2003). Flux balance predictions using the model give good results for aerobic and anaerobic glucose-limited chemostat cultivations but, as mentioned above, fail to predict the reduced biomass yield
Robustness: experimental and computational considerations
Control of metabolism rarely resides at single enzymes but is rather distributed over several genes or enzymes, as in glucose repression, where the expression of a large number of genes is affected (Ronne, 1995). Because of the Boolean nature of the presented method, small expression changes in single genes may have a large impact, in particular for lowly expressed genes. On the one hand, one may have to accept that the upper bounds for metabolic performance calculated in the simulations are, although
Concluding remarks
While many systems biology approaches neglect the metabolite level, we want to emphasize the vast amount of existing metabolic knowledge and regard metabolism as an important part of understanding cellular systems. As an example of this, we have discussed how detailed metabolic models can be combined with transcription data to obtain improved predictions of cellular behavior.
The key idea was to exploit regulatory information in transcriptome data to give additional constraints on metabolic fluxes in
Acknowledgements
The authors thank Steen Lund Westergaard and Christoffer Bro for valuable discussions and for sharing experimental data prior to publication. Financial support from the Alf Åkerman—Trygg Hansa foundation, the Danish Biotechnology Instrument Center (DABIC), and the Øresund Bio+IT postdoc program is gratefully acknowledged.
References (48)
- et al. ATP10, a yeast nuclear gene required for the assembly of the mitochondrial F1–F0 complex. J. Biol. Chem. (1990)
- Synthetic cell biology. Curr. Opin. Biotechnol. (2001)
- et al. Transcriptional regulation in constraints-based metabolic models of E. coli. J. Biol. Chem. (2002)
- et al. Regulation of gene expression in flux balance models of metabolism. J. Theor. Biol. (2001)
- et al. The Saccharomyces cerevisiae TCM62 gene encodes a chaperone necessary for the assembly of the mitochondrial succinate dehydrogenase (Complex II). J. Biol. Chem. (1998)
- et al. Systems properties of the Haemophilus influenzae Rd metabolic genotype. J. Biol. Chem. (1999)
- et al. The complete sequence of the mitochondrial genome of Saccharomyces cerevisiae. FEBS Lett. (1998)
- et al. Mathematical modelling of metabolism. Curr. Opin. Biotechnol. (2000)
- Feasting, fasting and fermenting: glucose sensing in yeast and other cells. Trends Genet. (1999)
- et al. An expanded role for microbial physiology in metabolic engineering and functional genomics: moving towards systems biology. FEMS Yeast Res. (2002)
- A single amino acid change in subunit 6 of the yeast mitochondrial ATPase suppresses a null mutation in ATP10. J. Biol. Chem.
- Reproducibility of oligonucleotide microarray transcriptome analyses: an interlaboratory comparison using chemostat cultures of Saccharomyces cerevisiae. J. Biol. Chem.
- Glucose repression in fungi. Trends Genet.
- Detection of elementary flux modes in biochemical networks: a promising tool for pathway analysis and metabolic engineering. Trends Biotechnol.
- Transcriptome meets metabolome: hierarchical and metabolic regulation of the glycolytic pathway. FEBS Lett.
- mRNA degradation machines in eukaryotic cells. Biochimie
- An interlaboratory comparison of physiological and genetic properties of four Saccharomyces cerevisiae strains. Enz. Microb. Technol.
- Identification of metabolic model: citrate production from glucose by Candida lipolytica. Biotechnol. Bioeng.
- Introduction to Linear Optimization
- Probing the performance limits of the E. coli metabolic network subject to gene additions or deletions. Biotechnol. Bioeng.
- Metabolic network analysis of Penicillium chrysogenum using C-13-labeled glucose. Biotechnol. Bioeng.
- The E. coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc. Natl. Acad. Sci.
- In silico predictions of E. coli metabolic capabilities are consistent with experimental data. Nat. Biotechnol.
1 Current address: Novo Nordisk A/S, BioProcess Laboratories, Novo Allé, DK-2880 Bagsvaerd, Denmark.