Abstract
In taxonomy and other branches of classification it is useful to know when tree-like classifications on overlapping sets of labels can be consistently combined into a parent tree. This paper considers the computation complexity of this problem. Recognizing when a consistent parent tree exists is shown to be intractable (NP-complete) for sets of unrooted trees, even when each tree in the set classifies just four labels. Consequently determining the compatibility of qualitative characters and partial binary characters is, in general, also NP-complete. However for sets of rooted trees an algorithm is described which constructs the “strict consensus tree” of all consistent parent trees (when they exist) in polynomial time. The related question of recognizing when a set of subtrees uniquely defines a parent tree is also considered, and a simple necessary and sufficient condition is described for rooted trees.
Similar content being viewed by others
References
AHO, A. V., SAVIG, Y., SZYMANSKI, T. G., and ULLMAN, J.D. (1981), “Inferring a Tree from the Lowest Common Ancestors with an Application to the Optimization of Relational Expressions”,SIAM Journal on Computing, 10(3), 405–421.
BANDELT, H-J., and DRESS, A. (1986), “Reconstructing the Shape of a Tree from Observed Dissimilarity Data”,Advances in Applied Mathematics, 7, 309–343.
BANDELT, H-J., VON HAESELER, A., BOLICK, J., and SCHÜTTE, H. (1990), “A Comparative Study of Sequence Dissimilarities and Evolutionary Distances Derived from Sets of Aligned RNA Sequences”, preprint.
BONDY, J. A., and MURTY, U. S. R. (1976),Graph Theory with Applications, London: Macmillan.
BROSSEIER, G. (1990), “Piecewise Heirarchical Clustering”,Journal of Classification, 7, 197–216.
BUNEMAN, P. (1971), “The Recovery of Trees from Measures of Dissimilarily”, inMathematics in the Archaeological and Historical Sciences, Eds., F. R. Hodson, D. G. Kendall, and P. Tautu, Edinburgh: Edinburgh University Press, 387–395.
BUNEMAN, P. (1974), “A Characterization of Rigid Circuit Graphs”,Discrete Mathematics, 9, 205–212.
CARTER, M., HENDY, M. D., PENNY, D., SZÉKELY, L. A., and WORMALD, N. C. (1990), “On the Distribution of Lengths of Evolutionary Trees”,SIAM Journal on Discrete Mathematics, 3, 38–47.
CAVALLI-SFORZA, L. L., and EDWARDS, A. W. F. (1967), “Phylogenetic Analysis: Models and Estimation Procedures”,Evolution, 21, 550–570.
CAVENDER, J. A. and FELSENSTEIN, J. (1987), “Invariants of Phylogenies: Simple Cases with Discrete States”,Journal of Classification, 4, 57–71.
COLONIUS, H., and SCHULZE, H. H. (1981), “Tree Structures for Proximity Data”,British Journal of Mathematical and Statistical Psychology, 34, 167–180.
CONSTANTINESCU, M., and SANKOFF, D. (1986), “Tree Enumeration Modulo a Consensus”,Journal of Classification, 3, 349–356.
DAY, W. H. E. (1985), “Optimal Algorithms for Comparing Trees with Labeled Leaves”,Journal of Classification, 2, 7–28.
DAY, W. H. E. (1985), “Analysis of Quartet Dissimilarity Measures between Undirected Phylogenetic Trees”,Systematic Zoology, 35(3), 325–333.
DAY, W. H. E., and SANKOFF, D. (1986), “Computational Complexity of Inferring Phylogenies by Compatibility”,Systematic Zoology, 35(2), 224–229.
DEKKER, M. C. H. (1986),Reconstruction Methods for Derivation Trees, Masters thesis, Vrije Universitiet, Amsterdam.
DRESS, A., VON HAESELER, A., and KRUEGER, M. (1986), “Reconstructing Phylogenetic Trees Using Variants of the ‘Four-Point-Condition’,”Studien zur Klassifikation, 17, 299–305.
DRESS, A., and STEEL, M. A. (1991) “Convex Tree Realizations of Partitions,”Applied Mathematics Letters (in press).
DROLET, S., and SANKOFF, D. (1990), “Quadratic Tree Invariants for Multivalued Characters,”Journal of Theoretical Biology, 144, 117–129.
ERDÖS, P., and SZÉKELY, L. A. (1989) “Applications of Antilexicographic Order 1. An Enumerative Theory of Trees,”Advances in Applied Mathematics, 10, 488–496.
ESTABROOK, G. F., JOHNSON, C. S., Jr., and MCMORRIS, F. R. (1976), “An Algebraic Analysis of Cladistic Characters,”Discrete Mathematics, 16, 141–147.
ESTABROOK, G. F., and MCMORRIS, F. R. (1977), “When are Two Taxonomic Characters Compatible?,”Journal of Mathematical Biology, 4, 195–299.
ESTABROOK, G. F., and MEACHAM, C. A. (1979), “How to Determine the Compatibility of Undirected Character State Trees,”Mathematical Biosciences, 46, 251–256.
ESTABROOK, G. F., McMORRIS, F. R., and MEACHAM C. A. (1985), “Comparison of Undirected Phylogenetic Trees based on Subtrees of Four Evolutionary Units,”Systematic Zoology, 34(2), 193–200.
FELSENSTEIN, J. (1988), “Phylogenies from Molecular Sequences: Inference and Reliability,”Annual Review of Genetics, 22, 521–565.
FITCH, W. M. (1975), “Towards Finding the Tree of Maximum Parsimony,” inThe Eighth International Conference on Numerical Taxonomy, Ed., G. F. Estabrook, San Francisco: W.H. Freeman, 189–230.
FOULDS, L. R., and GRAHAM, R. L. (1982), “The Steiner Problem in Phylogeny is NP-complete,”Advances in Applied Mathematics, 3, 43–49.
GAREY, M. R., and JOHNSON, D.S., (1979),Computers and Intractibility, New Jersey: Bell Telephone Laboratories Ltd.
GAVRIL, F. (1974), “The Intersection Graphs of Subtrees in Trees are Exactly the Chordal Graphs,”Journal of Combinatorial Theory (B), 16, 47–56.
GOLUMBIC, M. C. (1980),Algorithmic Graph Theory and Perfect Graphs, New York: Academic Press, 92.
GORDON, A. D. (1986), “Consensus Supertrees: The Synthesis of Rooted Trees Containing Overlapping Sets of Labeled Leaves,”Journal of Classification, 3, 335–348.
GUSFIELD, D. (1991), “Efficient Algorithms for Inferring Evolutionary Trees,”Networks, 21, 19–28.
KANNAN, S., and WARNOW, T. (1990) “Inferring Evolutionary Trees from DNA Sequences,” in31st Annual Symposium on Foundations of Computer Science (Proceedings), Los Alamitos, California: IEEE Computer Society Press, 362–371.
HALL, M. H. Jr. (1967),Combinatorial Theory, Waltham: Blaisdell, 175.
HENDY, M. D. (1989), “The Relationship Between Simple Evolutionary Tree Models and Observable Sequence Data,”Systematic Zoology, 38, 310–321.
KANT-ANTONESCU, M., and SANKOFF, D. (1991), “Efficient Construction of Supertrees,” manuscript.
MCMORRIS, F. R. (1975), “Compatibility Criteria for Cladistic and Qualitative Taxonomic Characters,” inThe Eighth International Conference on Numerical Taxonomy, Ed., E. A. Estabrook, San Fransisco: W.H. Freeman, 399–415.
MCMORRIS, F. R. (1977), “On the Compatibility of Binary Qualitative Taxonomic Characters,”Bulletin of Mathematical Biology, 39, 133–138.
MCMORRIS, F. R., WARNOW, T., and WIMER, T. (1991), “Chordal Completion of Coloured Graphs for a Fixed Number of Colours”, manuscript.
MARGUSH, T., and MCMORRIS, F. R. (1981), “Consensus n-trees,”Bulletin of Mathematical Biology, 43, 239–244.
MEACHAM, C. A. (1981), “A Manual Method for Constructing Trees and Hierarchical Classifications,”Journal of Molecular Evolution, 18, 30–37.
MEACHAM, C. A. (1983), “Theoretical and Computational Considerations of the Compatibility of Qualitative Taxonomic Characters,” inNumerical Taxonomy, Ed., J. Felsenstein, NATO ASI Series Vol. G1, Berlin Heidelberg: Springer-Verlag, 304–314.
MEACHAM, C. A., and DUNCAN, T. (1987), “The Necessity of Convex Groups in Biological Classification,”Systematic Botany, 12, 78–90.
SOKAL, R. R., and ROHLF, F. J. (1981), “Taxonomic Congruence in the Leptopodomorpha Re-examined,”Systematic Zoology, 30, 309–325.
WALTER, J. R. (1972),Representations of Rigid Cycle Graphs, Ph.D. thesis, Wayne State University, Detroit, Michigan.
WARNOW, T. (1991),Combinatorial Algorithms for Constructing Phylogenetic Trees, PhD thesis, University of California-Berkeley.
WATERMAN, M. S., and SMITH, T. S. (1978), “On the Similarity of Dendograms,”Journal of Theoretical Biology, 73, 789–800.
YANNAKAKIS, M. (1981), “Computing the Minimum Fill-in is NP-complete,”SIAM Journal on Algebraic and Discrete Methods, 2, 77–79.
Author information
Authors and Affiliations
Additional information
This work was supproted by the Alexander von Humoldt-Stiftung. I wish to thank Andreas Dress, Hans-Jürgen Bandelt and the referees for their helpful comments.
Rights and permissions
About this article
Cite this article
Steel, M. The complexity of reconstructing trees from qualitative characters and subtrees. Journal of Classification 9, 91–116 (1992). https://doi.org/10.1007/BF02618470
Issue Date:
DOI: https://doi.org/10.1007/BF02618470