Open systems: panoramic views of gene expression
Introduction
In the last decade the scope of every biologist’s view of gene expression has increased by several orders of magnitude. Where once differential gene expression was viewed in terms of a single gene on a Northern blot (Eikhom et al., 1975), we now have the ability to assess the expression of tens of thousands of genes simultaneously. The landscape of gene expression is generally referred to as a ‘transcript profile’ or an ‘expression profile’. The transcript profile is an intricate, context-dependent pattern of expressed messages. It is at once simple at the level of a single gene at a single time point (a message is either up-regulated, down-regulated, or unchanged relative to another gene or expression profile), but immensely complex in terms of any given expressed message as it relates to the multitude of genes modulated in response to a stimulus across the transcriptome — the expressed genome (Fig. 1).
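As a minimal illustration of the single-gene view described above, the following Python sketch labels one message as up-regulated, down-regulated, or unchanged from a pair of abundance measurements. The function name, the pseudocount, the two-fold threshold, and the example values are all assumptions chosen for illustration; none is taken from the technologies reviewed here.

```python
import math

def call_modulation(treated, control, min_fold=2.0, pseudocount=1.0):
    """Label one transcript as 'up', 'down', or 'unchanged'.

    treated, control: non-negative abundance measurements (for example
    band or peak intensities). min_fold and pseudocount are illustrative
    assumptions, not values prescribed by any platform.
    """
    if treated < 0 or control < 0:
        raise ValueError("abundances must be non-negative")
    # The pseudocount avoids division by zero when a message is absent.
    log2_ratio = math.log2((treated + pseudocount) / (control + pseudocount))
    threshold = math.log2(min_fold)
    if log2_ratio >= threshold:
        return "up"
    if log2_ratio <= -threshold:
        return "down"
    return "unchanged"

# A tiny, made-up expression profile: gene -> (treated, control).
profile = {
    "geneA": (400.0, 100.0),
    "geneB": (50.0, 210.0),
    "geneC": (120.0, 110.0),
}
calls = {gene: call_modulation(t, c) for gene, (t, c) in profile.items()}
```

The simplicity ends at the single call: the complexity discussed in the text arises when thousands of such calls must be related to one another across conditions and time points.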
The technologies that illuminate expression profiles are the core of the relatively new field of functional genomics: precisely understanding global changes in gene expression in well-defined experimental contexts (Brent, 2000). The state of the art in the analysis of differential gene expression (DGE) is broadly divided into two families of technologies: closed architecture systems and open architecture systems. A DGE technology is deemed ‘closed’ when the genes that determine the space of inquiry are finite, in terms of whole genomes, and predetermined by their selection for inclusion. Absolute novelty is precluded and serendipity is hampered. Closed system analyses are inherently limited by the retrospective nature of the inquiry: one must begin with genes that are known. Coverage of a genome is strictly dependent upon the completeness of our knowledge of that genome, which severely limits comprehensive applicability in all but the most well-characterized species or systems. Genes not represented in a closed system are not assessed. The most common closed-system methods of analysis are oligonucleotide or cDNA array hybridization technologies (Marshall and Hodgson, 1998) and quantitative polymerase chain reaction (qPCR, or TaqMan) (Heid et al., 1996).
While closed systems work well for particular types of analyses, as when a known subset of genes is to be exclusively assessed, they have restricted application at the frontiers of functional genomics, where the genetic topology is not well defined and where novelty is at a premium. Even in species where databases are nearing completion, applications of closed systems will be limited by incomplete knowledge of global genetic complexities such as splice variants, polymorphisms, and RNA editing. When a complete and accurate library of information is available, the need for advance knowledge of the space of inquiry will be less of a drawback for closed systems. However, there will likely still be limitations on the number of genes and gene variants that can be analyzed using a single array, at least for the near term. Closed systems must anticipate transcriptome variation to be optimally effective. The open systems with the most utility must be flexible enough to respond to the dynamic nature of the transcriptome being assayed. Closed systems have been extensively described elsewhere (Lockhart et al., 1996; DeRisi et al., 1997; Marshall and Hodgson, 1998). This review will focus on open systems of differential gene expression analysis.
Open architecture systems, in contrast to closed systems, are defined by the fact that no comprehensive a priori knowledge of the transcriptome is necessary; hence the field of discovery is ‘open’. Advance knowledge of the genes that may be modulated in any given experimental system is not required. Open systems have a natural advantage over closed systems in that the source of the transcriptome and its inherent complexity (expressed single nucleotide polymorphisms (cSNPs), alternative splicing, RNA editing, etc.) are immaterial and are not barriers to discovery. The knowledge of the genome of an experimental model may be finite, yet experiments with open systems still yield vast amounts of highly contextualized information that can be effectively explored and mined when the best bioinformatic and organizational tools are applied. Although open systems need not depend upon existing transcriptome information, the most effective open systems exploit existing expressed genome databases to greatly facilitate the analysis process and increase the efficiency of known and novel gene identification. Filtering out known genes can highlight novel genes. Initial characterization of a novel gene can be facilitated by the context provided by a cohort of similarly modulated known genes. Though the function of a novel gene certainly cannot be ascertained absolutely by simple association, its affiliation with an assemblage of co-modulated genes can be useful for generating a testable hypothesis about novel gene function. This may be particularly informative if the cohort implicates a known biological pathway or is indicative of a biological phenomenon.
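The filtering and cohort ideas in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fragment records, the gene identifiers, and the rule that two messages count as similarly modulated when they change in the same direction are all hypothetical, and the sketch does not describe GeneCalling or any other specific platform.

```python
from collections import defaultdict

# Hypothetical differentially expressed fragments from an open-system run.
# 'match' is the database hit for the fragment, or None if nothing matched.
fragments = [
    {"id": "frag1", "match": "known_gene_1", "direction": "up"},
    {"id": "frag2", "match": None, "direction": "up"},
    {"id": "frag3", "match": "known_gene_2", "direction": "down"},
    {"id": "frag4", "match": "known_gene_3", "direction": "up"},
]

def split_known_and_novel(records):
    """Separate fragments with a database match from those without one."""
    known = [r for r in records if r["match"] is not None]
    novel = [r for r in records if r["match"] is None]
    return known, novel

def cohorts_for_novel(records):
    """For each novel fragment, list known genes modulated in the same direction.

    Shared direction is a deliberately crude stand-in for 'similarly
    modulated'; a real analysis would compare full expression patterns.
    """
    known, novel = split_known_and_novel(records)
    by_direction = defaultdict(list)
    for r in known:
        by_direction[r["direction"]].append(r["match"])
    return {r["id"]: sorted(by_direction[r["direction"]]) for r in novel}

cohorts = cohorts_for_novel(fragments)
```

In this toy data set the unmatched fragment is grouped with the known genes that move in the same direction. Such a cohort is a starting point for a testable hypothesis, not evidence of function.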
Closed systems and open systems are complementary. Once novelty has been identified, whether it is absolute novelty or novel orthologs of known genes in less well-characterized systems, these genes can then be used in directed ways in the closed system architecture. Both systems ultimately produce gene lists, which require a well-planned strategy for annotation and classification by functional role hierarchies, as has been applied to already completed genomes (Rubin et al., 2000).
In this review we provide a comprehensive survey of state-of-the-art open expression systems and introduce important issues concerning data management and experimental design. In addition, we describe a selected application of one system, GeneCalling, which is a patented, industrial-scale open architecture technology for quantitative expression analysis (QEA) (Shimkets et al., 1999) (USPTO 5871697 and USPTO 5972693).
Technologies and comparisons
The earliest technologies for transcript profiling were differential display (DD) (Liang and Pardee, 1992) and arbitrarily primed polymerase chain reaction (AP-PCR) (Welsh and McClelland, 1990). DD is an expression analysis method whereby mRNA from each sample is converted to cDNA, cDNA is PCR-amplified using a combination of random primers and anchored oligo-dT primers, then run on a gel. Each mRNA is represented as a single band and differentially expressed bands are excised, cloned, and
General technology comparisons
The primary considerations in selecting a technology for gene expression analysis are the amount of material required and the sensitivity and coverage of the method. The amount of biological material required by each technology platform is a primary concern. When attempting to analyze small anatomical structures in animal models or patient biopsies the amount of starting material necessary might impede a well-designed, comprehensive analysis. The amount of template cDNA required for each method
Applications of DGE analysis
Recent applications of open systems to diverse areas of research include: a survey of expression by activated T-cells (Prashar and Weissman, 1996), the study of rat cardiac hypertrophy (Shimkets et al., 1999), analysis of estradiol-inducible transcription factors involved in maize flavonoid pathways (Bruce et al., 2000), the discovery of a novel CXC chemokine involved in the regulation of hematopoiesis (see below) (Ohneda et al., 2000), the description of a previously unappreciated endocrine
Discovery and identification of a novel chemokine
In an adult, hematopoiesis occurs in the bone marrow where stem cells divide infrequently to produce more stem cells (self-renewal) and various committed progenitor cells. Committed progenitor cells will, under the control of specific signals, advance down a signal-specified lineage. Hematopoiesis, therefore, is a complex process requiring interplay between many different signals. These signals, both contact-dependent and soluble, are produced by the surrounding stromal cells and in other
Overview and conclusions
Analyses of model systems using comprehensive differential gene expression methodologies can generate vast amounts of information that must be managed and interpreted properly to be of any use. Simple binary comparisons raise more questions than they can possibly answer, so experiments become complex by necessity. The most effective systems for DGE analysis allow one to dexterously manage the data generated from one-to-one comparisons, as well as efficiently triage that data by allowing
Acknowledgements
The authors wish to thank Morgan Kilbourn for assistance with figures, and Jeff Powell, James Slattery, Mark Vincent, Jonathan M. Rothberg, and the manuscript referees for their critical review of this manuscript.
References (53)
- et al. High-throughput gene expression analysis using SAGE. Drug Discovery Today (1998)
- Genomic biology. Cell (2000)
- et al. Ribosomal RNA metabolism in synchronized plasmacytoma cells. Exp. Cell. Res. (1975)
- et al. Subtractive hybridization, a technique for extraction of DNA sequences distinguishing two closely related genomes: critical analysis. Genet. Anal. (1996)
- et al. Equalizing cDNA subtraction based on selective suppression of polymerase chain reaction: cloning of Jurkat cell transcripts induced by phytohemaglutinin and phorbol 12-myristate 13-acetate. Anal. Biochem. (1996)
- et al. cDNA representational difference analysis: a sensitive and flexible method for identification of differentially expressed genes. Methods Enzymol. (1999)
- et al. Definitive hematopoiesis is autonomously initiated by the AGM region. Cell (1996)
- et al. Hematopoietic stem cell maintenance and differentiation are supported by embryonic aorta-gonad-mesonephros region-derived endothelium. Blood (1998)
- et al. WECHE: a novel hematopoietic regulatory factor. Immunity (2000)
- et al. READS: a method for display of 3′-end fragments of restriction enzyme-digested cDNAs for analysis of differential gene expression. Methods Enzymol. (1999)
- Quantitative expression analysis of genes regulated by both obesity and leptin reveals a regulatory loop between leptin and pituitary-derived ACTH. J. Biol. Chem.
- Characterization of the first definitive hematopoietic stem cells in the AGM and liver of the mouse embryo. Immunity
- Stimulation of mouse and human primitive hematopoiesis by murine embryonic aorta-gonad-mesonephros-derived stromal cell lines. Blood
- Visualization of differential gene expression using a novel method of RNA fingerprinting based on AFLP: analysis of gene expression during potato tuber development. Plant J.
- Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res.
- Representational difference analysis of cDNA for the detection of differential gene expression in bacteria: development using a model of iron-regulated gene expression in Neisseria meningitidis. Microbiology
- Expression profiling of the maize flavonoid pathway genes controlled by estradiol-inducible transcription factors CRC and P. Plant Cell
- The first genome from the third domain of life [news]. Nature
- The role of stromal cell heparan sulphate in regulating haemopoiesis. Leuk. Lymphoma
- Exploring the metabolic and genetic control of gene expression on a genomic scale. Science
- Real time quantitative PCR. Genome Res.
1 These authors contributed equally to the preparation of this review.