Elsevier

Metabolic Engineering

Volume 6, Issue 4, October 2004, Pages 285-293

Integration of gene expression data into genome-scale metabolic models

https://doi.org/10.1016/j.ymben.2003.12.002

Abstract

A framework for integration of transcriptome data into stoichiometric metabolic models to obtain improved flux predictions is presented. The key idea is to exploit the regulatory information in the expression data to give additional constraints on the metabolic fluxes in the model. Measurements of gene expression from chemostat and batch cultures of Saccharomyces cerevisiae were combined with a recently developed genome-scale model, and the computed metabolic flux distributions were compared to experimental values from carbon labeling experiments and metabolic network analysis. The integration of expression data resulted in improved predictions of metabolic behavior in batch cultures, enabling quantitative predictions of exchange fluxes as well as qualitative estimations of changes in intracellular fluxes. A critical discussion of correlation between gene expression and metabolic fluxes is given.

Introduction

Following the developments in genomics, there has been an increased focus on the behavior of complete biological systems. In such integrative analysis, also referred to as systems biology (Kitano, 2000; Ideker et al., 2001; Nielsen and Olsson, 2002), biological data from all levels of metabolism, from genome to metabolome, are combined in order to view the studied organism as a whole rather than investigating the single components of the system. Mathematical models play an important role in integrating this wealth of information, and systems biology is therefore often associated with quantitative investigation of the biological system under study.

Several mathematical modeling frameworks have been developed to describe and analyze the metabolic behavior of an organism or a living cell (Gombert and Nielsen, 2000; Arkin, 2001). One of these approaches is stoichiometric modeling, which relies on mass balances over intracellular metabolites and the assumption of pseudo-steady-state conditions to determine intracellular metabolic fluxes. The information contained in a stoichiometric model by itself results in an underdetermined linear equation system, which is not enough to calculate a unique flux distribution, and the models are therefore combined with additional experimental data or assumptions to yield a well-defined flux map. Applications include calculation of metabolic fluxes for a specific experiment (Aiba and Matsuoka, 1979; Christensen and Nielsen, 2000) and prediction of how phenotypic behavior is affected by genetic or environmental changes (Varma et al., 1993; Edwards and Palsson, 2000; Stuckrath et al., 2002; Segre et al., 2002).
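To make the notion of an underdetermined system concrete, the following minimal Python sketch counts the degrees of freedom of a toy network. The network (two metabolites A and B, four reactions) and all names in the code are illustrative assumptions and are not taken from the yeast model discussed in this paper.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix by Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def mass_balance_residual(S, v):
    """Return S*v, which is all zeros at pseudo-steady state."""
    return [sum(s * x for s, x in zip(row, v)) for row in S]


# Hypothetical toy network (rows: metabolites, columns: reactions):
#   v1: -> A (uptake), v2: A -> B, v3: A -> (secretion), v4: B -> (drain)
S = [
    [1, -1, -1, 0],  # balance for A
    [0, 1, 0, -1],   # balance for B
]

rank = matrix_rank(S)
degrees_of_freedom = len(S[0]) - rank

# Two different flux distributions with the same uptake rate both
# satisfy the mass balances, so the balances alone are not sufficient.
flux_a = [10, 6, 4, 6]
flux_b = [10, 2, 8, 2]
residual_a = mass_balance_residual(S, flux_a)
residual_b = mass_balance_residual(S, flux_b)
```

In this toy case two fluxes must be fixed by measurements or assumptions before the remaining two are determined, which mirrors, on a much smaller scale, the situation described for genome-scale models.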

An advantage of stoichiometric modeling is that it is based on well-known stoichiometric coefficients and does not require determination of parameters such as kinetic constants. With the increasing amount of biological knowledge in public databases, it is therefore relatively straightforward to construct detailed metabolic models, and in recent years large-scale models primarily based on genome sequence information have been developed. The modeled organisms include the prokaryotes Haemophilus influenzae, Escherichia coli, and Helicobacter pylori (Edwards and Palsson, 1999, 2000; Schilling et al., 2002), and most recently the eukaryote Saccharomyces cerevisiae (Forster et al., 2003). These models have a few hundred to over a thousand reactions and are typically used for computational studies, for instance systematic insertion or deletion of heterologous reactions to obtain improved metabolic properties (Burgard and Maranas, 2001). A common approach is the so-called flux balance analysis, in which metabolic behavior is simulated under the assumption that the cells exhibit optimal growth (Varma and Palsson, 1994). For prokaryotes this assumption seems to hold true in many cases (Edwards et al., 2001), and recently it was demonstrated experimentally that sub-optimally growing cells could be evolved to the predicted optimal phenotype (Ibarra et al., 2002).
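Flux balance analysis is commonly written as a linear program. The formulation below is a generic sketch in notation assumed here for illustration (it is not quoted from this paper): S is the stoichiometric matrix, v the vector of fluxes, c a weight vector that selects the growth (biomass) flux, and v^min and v^max are lower and upper flux bounds.

```latex
\[
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
\]
```

The equality constraint expresses the pseudo-steady-state mass balances, and the bounds are where additional physiological information, such as measured uptake rates, can enter the problem.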

The price to pay for the simplicity of stoichiometric models is that no information on metabolic regulation is included. For instance, the S. cerevisiae model readily provides a good prediction of phenotypic behavior in glucose-limited aerobic and anaerobic chemostats (Famili et al., 2003). However, unless further information is supplied it is difficult to describe batch cultivations where the glucose levels are high and regulatory phenomena, referred to as glucose repression (Ronne, 1995; Gancedo, 1998; Johnston, 1999), drastically decrease respiration, biomass yield, etc. To improve the flux estimates, one can provide additional physiological information, such as experimentally measured uptake rates or knowledge about enzyme activities, to constrain the range of possible flux distributions.

Covert et al. (2001) suggested how the stoichiometric modeling framework could be extended with an overlaid transcriptional regulatory network using a Boolean logic formalism. This was later applied to a moderate-size model of the central carbon metabolism in E. coli (Covert and Palsson, 2002). Apart from the fact that many regulatory phenomena cannot be accurately described by Boolean logic, this approach is at present primarily limited by the available knowledge on regulatory processes. As an alternative, we here investigate the possibility of exploiting genome-wide measurements of transcription, e.g., using DNA or oligonucleotide microarrays. We discuss how such measurements can be combined with genome-scale stoichiometric models, thereby incorporating information on transcriptional regulation and hence improving prediction performance. Examples where gene expression data from batch and chemostat cultivations of S. cerevisiae are combined with the recently developed yeast model (Forster et al., 2003) are given, and the results are compared to experimental values obtained in Gombert et al. (2001) using 13C-labeling experiments.

Section snippets

Relating gene expression to fluxes

With the aim of obtaining improved flux predictions by extracting information from gene expression data, an important question is whether, and under which conditions, gene expression, via translation, enzyme regulation, etc., correlates at all with a given metabolic flux. As has been shown, there is not necessarily a correlation between gene expression and protein concentration (Gygi et al., 1999) or between enzyme activity and metabolic flux (ter Kuile and Westerhoff, 2001). Obviously, the same applies for the relation

From transcriptome data to flux constraints: a simple approach

As discussed above, it is difficult to correlate gene expression with metabolic flux. To generate constraints on metabolic fluxes, however, one can exploit the fact that if a gene is not expressed, the corresponding protein and its related activity will, at steady state, be absent. Accordingly, one may use expression data to detect enzyme-coding genes that are not expressed and then constrain the corresponding metabolic fluxes to zero in the model simulation, thus reducing the feasible
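The zero-flux constraint described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the reaction identifiers, gene names, and bound values are hypothetical, and the rule that a reaction with several annotated genes is closed only when all of them are called absent is an assumption made here to handle isoenzymes.

```python
def constrain_by_expression(bounds, reaction_genes, absent_genes):
    """Return a copy of `bounds` with non-expressed reactions set to zero flux.

    bounds:         dict mapping reaction id -> (lower, upper) flux bound
    reaction_genes: dict mapping reaction id -> set of enzyme-coding genes
    absent_genes:   set of genes called "not expressed" in the data
    """
    constrained = dict(bounds)
    for reaction, genes in reaction_genes.items():
        # Assumption: reactions without gene annotation are left untouched,
        # and a reaction is closed only if every annotated gene is absent.
        if genes and all(gene in absent_genes for gene in genes):
            constrained[reaction] = (0.0, 0.0)
    return constrained


# Hypothetical example data (names are placeholders, not real yeast genes).
bounds = {
    "R_uptake": (0.0, 10.0),
    "R_respiration": (0.0, 1000.0),
    "R_fermentation": (0.0, 1000.0),
    "R_transport": (-1000.0, 1000.0),
}
reaction_genes = {
    "R_uptake": {"geneA"},
    "R_respiration": {"geneB"},
    "R_fermentation": {"geneC", "geneD"},  # two isoenzymes
    "R_transport": set(),                  # no gene annotation
}
absent_genes = {"geneB", "geneC"}

new_bounds = constrain_by_expression(bounds, reaction_genes, absent_genes)
```

With these inputs only `R_respiration` is closed: `R_fermentation` stays open because one of its two genes is expressed, and `R_transport` stays open because no gene is assigned to it. The constrained bounds would then be passed to the flux balance simulation in place of the original ones.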

Prediction of fluxes in batch cultivations of S. cerevisiae

As a case study, the method outlined above was applied to expression data from batch and chemostat cultivations of S. cerevisiae with glucose as carbon source (Piper et al., 2002; Westergaard et al., manuscript in preparation) in combination with the genome-scale model presented in Forster et al. (2003). Flux balance predictions using the model give good results for aerobic and anaerobic glucose-limited chemostat cultivations but, as mentioned above, the model fails to predict the reduced biomass yield

Robustness: experimental and computational considerations

Control of metabolism rarely resides at single enzymes but is rather distributed over several genes or enzymes, as in glucose repression, where the expression of a large number of genes is affected (Ronne, 1995). The Boolean nature of the presented method may, in particular for lowly expressed genes, give small expression changes in single genes a large impact. On the one hand, one may have to accept that the upper bounds for metabolic performance calculated in the simulations are, although

Concluding remarks

While many systems biology approaches neglect the metabolite level, we want to emphasize the vast amount of existing metabolic knowledge and regard metabolism as an important part of understanding cellular systems. As an example of this, we have discussed how detailed metabolic models can be combined with transcription data to obtain improved predictions of cellular behavior.

The key idea was to exploit regulatory information in transcriptome data to give additional constraints on metabolic fluxes in

Acknowledgements

The authors thank Steen Lund Westergaard and Christoffer Bro for valuable discussions and for sharing experimental data prior to publication. Financial support from the Alf Åkerman—Trygg Hansa foundation, the Danish Biotechnology Instrument Center (DABIC), and the Øresund Bio+IT postdoc program is gratefully acknowledged.

References (48)

  • M.F. Paul et al.

    A single amino acid change in subunit 6 of the yeast mitochondrial ATPase suppresses a null mutation in ATP10

    J. Biol. Chem

    (2000)
  • M.D.W. Piper et al.

    Reproducibility of oligonucleotide microarray transcriptome analyses—an interlaboratory comparison using chemostat cultures of Saccharomyces cerevisiae

    J. Biol. Chem

    (2002)
  • H. Ronne

    Glucose repression in fungi

    Trends Genet

    (1995)
  • S. Schuster et al.

    Detection of elementary flux modes in biochemical networks: a promising tool for pathway analysis and metabolic engineering

    Trends Biotechnol

    (1999)
  • B.H. ter Kuile et al.

    Transcriptome meets metabolome: hierarchical and metabolic regulation of the glycolytic pathway

    FEBS Lett

    (2001)
  • H. Tourriere et al.

    mRNA degradation machines in eukaryotic cells

    Biochimie

    (2002)
  • J.P. van Dijken et al.

    An interlaboratory comparison of physiological and genetic properties of four Saccharomyces cerevisiae strains

    Enz. Microb. Technol

    (2000)
  • Affymetrix, 2000. Affymetrix GeneChip Expression Analysis Technical Manual. Affymetrix Inc., Santa Clara, CA,...
  • S. Aiba et al.

    Identification of metabolic model—citrate production from glucose by Candida lipolytica

    Biotechnol. Bioeng

    (1979)
  • D. Bertsimas et al.

    Introduction to Linear Optimization

    (1997)
  • A.P. Burgard et al.

    Probing the performance limits of the E. coli metabolic network subject to gene additions or deletions

    Biotechnol. Bioeng

    (2001)
  • B. Christensen et al.

    Metabolic network analysis of Penicillium chrysogenum using C-13-labeled glucose

    Biotechnol. Bioeng

    (2000)
  • J.S. Edwards et al.

    The E. coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities

    Proc. Natl. Acad. Sci

    (2000)
  • J.S. Edwards et al.

    In silico predictions of E. coli metabolic capabilities are consistent with experimental data

    Nat. Biotechnol

    (2001)
1 Current address: Novo Nordisk A/S, BioProcess Laboratories, Novo Allé, DK-2880 Bagsvaerd, Denmark.