Abstract
Recent advances in biotechnology have produced a wealth of genomic data, which capture a variety of complementary cellular features. While these data promise to yield key insights into molecular biology, much of the available information remains underutilized because of the lack of scalable approaches for integrating signals across large, diverse data sets. A proper framework for capturing these numerous snapshots of complementary phenomena under a variety of conditions can provide the holistic view necessary for developing precise systems-level hypotheses.
Here we describe bioPIXIE, a system for combining information from diverse genomic data sets to predict biological networks. bioPIXIE uses a Bayesian framework to probabilistically integrate several high-throughput genomic data types, including gene expression, protein–protein interactions, genetic interactions, protein localization, and sequence data. The main purpose of the system is to support user-driven exploration of the inferred functional network through a public, web-based interface. We describe the features and supporting methods of this integration and discovery framework and present case examples in which bioPIXIE has been used to generate specific, testable hypotheses for Saccharomyces cerevisiae, many of which have been confirmed experimentally.
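To make the idea of Bayesian evidence integration concrete, the following is a minimal sketch of how independent lines of evidence about a gene pair could be combined into a single posterior probability of a functional relationship. It assumes a naive Bayes model (evidence types are conditionally independent given the relationship). The abstract does not specify the network structure, so this is an illustration and not bioPIXIE's actual implementation. The function name, the prior, and all likelihood values are hypothetical and chosen only for demonstration.

```python
from math import prod


def posterior_functional_relationship(prior, likelihoods):
    """Combine evidence for one gene pair under a naive Bayes assumption.

    prior       -- assumed P(FR = yes) before seeing any data
    likelihoods -- list of (P(e | FR = yes), P(e | FR = no)) pairs, one per
                   data set, for the evidence value observed in that data set
    Returns P(FR = yes | all evidence).
    """
    # Unnormalized weight for "functionally related"
    p_yes = prior * prod(l_yes for l_yes, _ in likelihoods)
    # Unnormalized weight for "not functionally related"
    p_no = (1.0 - prior) * prod(l_no for _, l_no in likelihoods)
    # Normalize so the two outcomes sum to 1
    return p_yes / (p_yes + p_no)


# Hypothetical evidence for a single gene pair (illustrative numbers only):
evidence = [
    (0.60, 0.10),  # strong co-expression observed
    (0.30, 0.01),  # physical interaction reported
    (0.50, 0.20),  # same subcellular localization
]
score = posterior_functional_relationship(prior=0.05, likelihoods=evidence)
print(f"posterior = {score:.3f}")
```

With no evidence the function returns the prior unchanged, and each data set shifts the posterior according to how much more likely its observation is for related pairs than for unrelated ones.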
Copyright information
© 2009 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Myers, C.L., Chiriac, C., Troyanskaya, O.G. (2009). Discovering Biological Networks from Diverse Functional Genomic Data. In: Nikolsky, Y., Bryant, J. (eds) Protein Networks and Pathway Analysis. Methods in Molecular Biology, vol 563. Humana Press. https://doi.org/10.1007/978-1-60761-175-2_9
DOI: https://doi.org/10.1007/978-1-60761-175-2_9
Publisher Name: Humana Press
Print ISBN: 978-1-60761-174-5
Online ISBN: 978-1-60761-175-2
eBook Packages: Springer Protocols