Abstract
Boolean networks are a popular model class for capturing the interactions of genes and the global dynamical behavior of genetic regulatory networks. Recently, considerable attention has been focused on inferring, or identifying, the model structure from gene expression data. We consider the Consistency and Best-Fit Extension problems in the context of inferring the networks from data. The latter approach is especially useful when gene expression measurements are noisy and may lead to inconsistent observations. We propose simple, efficient algorithms that can be used to answer the Consistency Problem and to find one or all consistent Boolean networks relative to the given examples. The same method is extended to learning gene regulatory networks under the Best-Fit Extension paradigm. We also introduce a simple and fast way of finding all Boolean networks having limited error size in the Best-Fit Extension Problem setting. We apply the inference methods to a real gene expression data set and present the results for a selected set of genes.
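To make the two problem settings concrete, the following is a minimal Python sketch, not the algorithms proposed in the article. It assumes binarized expression data given as (input state, target value) pairs, unit weights on all examples, and a plain exhaustive search over candidate input sets of size k. The names best_fit_error, is_consistent, and search_predictors are hypothetical and chosen only for illustration. For a fixed set of input genes, the smallest achievable error is obtained by counting, for each observed input pattern, how many examples map it to 1 and how many to 0, and charging the smaller count; the examples are consistent exactly when that error is zero.

```python
from collections import Counter
from itertools import combinations


def best_fit_error(examples, variables):
    """Smallest number of examples misclassified by any Boolean function
    of the chosen input variables (unit weights are an assumption)."""
    true_counts, false_counts = Counter(), Counter()
    for inputs, output in examples:
        # Project the full input state onto the candidate input genes.
        key = tuple(inputs[i] for i in variables)
        if output:
            true_counts[key] += 1
        else:
            false_counts[key] += 1
    keys = set(true_counts) | set(false_counts)
    # For each pattern the best function outputs the majority value,
    # so the minority count is the unavoidable error.
    return sum(min(true_counts[k], false_counts[k]) for k in keys)


def is_consistent(examples, variables):
    """Consistency: some function of these variables fits every example."""
    return best_fit_error(examples, variables) == 0


def search_predictors(examples, n_genes, k, max_error=0):
    """List every k-subset of genes whose best-fit error is at most max_error."""
    found = []
    for variables in combinations(range(n_genes), k):
        error = best_fit_error(examples, variables)
        if error <= max_error:
            found.append((variables, error))
    return found


# Toy data: the target follows x0 AND x1; the last example is a noisy repeat.
clean = [
    ((0, 0, 0), 0),
    ((0, 1, 1), 0),
    ((1, 0, 1), 0),
    ((1, 1, 0), 1),
    ((1, 1, 1), 1),
]
noisy = clean + [((1, 1, 0), 0)]

print(search_predictors(clean, n_genes=3, k=2))               # consistent input sets
print(search_predictors(noisy, n_genes=3, k=2))               # none once noise is added
print(search_predictors(noisy, n_genes=3, k=2, max_error=1))  # bounded-error input sets
```

With max_error=0 the search answers the Consistency Problem for the toy data; raising max_error lists all input sets within a bounded error, in the spirit of the limited-error search mentioned in the abstract. The exhaustive loop over all k-subsets is used here only for clarity.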
Cite this article
Lähdesmäki, H., Shmulevich, I. & Yli-Harja, O. On Learning Gene Regulatory Networks Under the Boolean Network Model. Machine Learning 52, 147–167 (2003). https://doi.org/10.1023/A:1023905711304
DOI: https://doi.org/10.1023/A:1023905711304