Abstract
We describe a computational protocol for the ARACNE algorithm, an information-theoretic method for identifying transcriptional interactions between gene products using microarray expression profile data. Similar to other algorithms, ARACNE predicts potential functional associations among genes, or novel functions for uncharacterized genes, by identifying statistical dependencies between gene products. However, based on biochemical validation, literature searches and DNA binding site enrichment analysis, ARACNE has also proven effective in identifying bona fide transcriptional targets, even in complex mammalian networks. Thus we envision that predictions made by ARACNE, especially when supplemented with prior knowledge or additional data sources, can provide appropriate hypotheses for the further investigation of cellular networks. While the examples in this protocol use only gene expression profile data, the algorithm's theoretical basis readily extends to a variety of other high-throughput measurements, such as pathway-specific or genome-wide proteomics, microRNA and metabolomics data. As these data become readily available, we expect that ARACNE might prove increasingly useful in elucidating the underlying interaction models. For a microarray data set containing ∼10,000 probes, reconstructing the network around a single probe completes in several minutes using a desktop computer with a Pentium 4 processor. Reconstructing a genome-wide network generally requires a computational cluster, especially if the recommended bootstrapping procedure is used.
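The statistical-dependency step summarized above can be illustrated with a minimal sketch. It assumes the two stages of the published ARACNE algorithm: pairwise mutual information (MI) scoring, followed by pruning with the data processing inequality (DPI), which drops the weakest edge of every fully connected gene triplet. The histogram MI estimator is a simplified stand-in and not the estimator ARACNE itself uses. The function names, the `bins`, `threshold` and `tolerance` parameters, and the exact form of the tolerance test are hypothetical choices made for illustration only.

```python
import itertools
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram-based MI estimate in nats (simplified stand-in estimator)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def mi_matrix(expr, bins=8):
    """expr has shape (n_genes, n_samples); returns a symmetric MI matrix."""
    n = expr.shape[0]
    mi = np.zeros((n, n))
    for i, j in itertools.combinations(range(n), 2):
        mi[i, j] = mi[j, i] = mutual_information(expr[i], expr[j], bins)
    return mi


def prune_with_dpi(mi, threshold, tolerance=0.0):
    """Keep edges with MI above threshold, then apply the DPI to each triangle.

    Edges are only marked during the scan and removed at the end, so the
    result does not depend on the order in which triplets are visited.
    """
    n = mi.shape[0]
    adj = mi > threshold
    np.fill_diagonal(adj, False)
    remove = np.zeros_like(adj)
    for i, j, k in itertools.combinations(range(n), 3):
        if adj[i, j] and adj[j, k] and adj[i, k]:
            edges = [(i, j), (j, k), (i, k)]
            vals = [mi[e] for e in edges]
            w = int(np.argmin(vals))
            others = [v for idx, v in enumerate(vals) if idx != w]
            # Assumed tolerance rule: remove only if clearly the weakest edge.
            if vals[w] < min(others) * (1.0 - tolerance):
                a, b = edges[w]
                remove[a, b] = remove[b, a] = True
    return adj & ~remove
```

Calling `prune_with_dpi(mi_matrix(expr), threshold=0.1)` on a genes-by-samples array returns a Boolean adjacency matrix. In practice the threshold would be chosen from a significance criterion rather than fixed by hand, and the bootstrapping procedure mentioned above would repeat this over resampled data sets.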
Acknowledgements
This work was supported by the National Cancer Institute, the National Institute of Allergy and Infectious Diseases, and the National Centers for Biomedical Computing NIH Roadmap Initiative. A.A.M. is supported by an IBM Ph.D. fellowship and by the National Library of Medicine Medical Informatics Research Training Program. I.N. is supported by the Department of Energy/National Nuclear Security Administration. We would like to thank R. Dalla-Favera for continued support and insight, K. Basso and U. Klein for contributions to the experimental validation of the original ARACNE algorithm, and K. Smith for help in reviewing the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Cite this article
Margolin, A., Wang, K., Lim, W. et al. Reverse engineering cellular networks. Nat Protoc 1, 662–671 (2006). https://doi.org/10.1038/nprot.2006.106
DOI: https://doi.org/10.1038/nprot.2006.106