A network flow model for biclustering via optimal re-ordering of data matrices

DiMaggio, Peter A.; McAllister, Scott R.; Floudas, Christodoulos A.; Feng, Xiao-Jiang; Rabinowitz, Joshua D.; Rabitz, Herschel A.

doi:10.1007/s10898-008-9349-z


Published: 13 September 2008

Volume 47, pages 343–354 (2010)

Journal of Global Optimization



Abstract

The analysis of large-scale data sets using clustering techniques arises in many different disciplines and has important applications. Most traditional clustering techniques require heuristic methods for finding good solutions and produce suboptimal clusters as a result. In this article, we present a rigorous biclustering approach, OREO, which is based on the Optimal RE-Ordering of the rows and columns of a data matrix. The physical permutations of the rows and columns are accomplished via a network flow model according to a given objective function. This optimal re-ordering model is used in an iterative framework where cluster boundaries in one dimension are used to partition and re-order the other dimensions of the corresponding submatrices. The performance of OREO is demonstrated on metabolite concentration data to validate the ability of the proposed method and compare it to existing clustering methods.
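To make the idea of re-ordering "according to a given objective function" concrete, the following is a minimal sketch and not the authors' formulation. It assumes one plausible objective (the sum of squared differences between adjacent rows) and replaces the network flow model with brute-force enumeration, which is only feasible for very small matrices. The names `adjacent_cost` and `best_row_order` and the toy data are hypothetical.

```python
from itertools import permutations


def adjacent_cost(matrix, order):
    """Sum of squared differences between consecutive rows in the given order.

    This objective is an assumption for illustration; the abstract only says
    the re-ordering follows "a given objective function".
    """
    total = 0.0
    for a, b in zip(order, order[1:]):
        total += sum((x - y) ** 2 for x, y in zip(matrix[a], matrix[b]))
    return total


def best_row_order(matrix):
    """Exhaustively search all row permutations (tiny matrices only).

    Stand-in for the paper's network flow model, which is not reproduced here.
    """
    rows = range(len(matrix))
    return min(permutations(rows), key=lambda order: adjacent_cost(matrix, order))


# Toy data: rows 0 and 2 are similar, as are rows 1 and 3.
data = [
    [9.0, 8.5, 0.1],
    [0.2, 0.1, 7.9],
    [9.1, 8.7, 0.0],
    [0.0, 0.3, 8.1],
]

order = best_row_order(data)
reordered = [data[i] for i in order]
```

In the re-ordered matrix the two pairs of similar rows end up next to each other, so the single large jump between neighbouring rows marks a candidate cluster boundary. The same routine could be applied to the transposed matrix to re-order columns, loosely mirroring the alternating row and column steps described above.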



Author information

Authors and Affiliations

Department of Chemical Engineering, Princeton University, Princeton, NJ, 08544-5263, USA
Peter A. DiMaggio Jr., Scott R. McAllister & Christodoulos A. Floudas
Department of Chemistry, Princeton University, Princeton, NJ, 08544-5263, USA
Xiao-Jiang Feng, Joshua D. Rabinowitz & Herschel A. Rabitz


Corresponding author

Correspondence to Christodoulos A. Floudas.


About this article

Cite this article

DiMaggio, P.A., McAllister, S.R., Floudas, C.A. et al. A network flow model for biclustering via optimal re-ordering of data matrices. J Glob Optim 47, 343–354 (2010). https://doi.org/10.1007/s10898-008-9349-z


Received: 26 August 2008
Accepted: 26 August 2008
Published: 13 September 2008
Issue Date: July 2010
DOI: https://doi.org/10.1007/s10898-008-9349-z
