Skip to main content
Log in

The complexity of reconstructing trees from qualitative characters and subtrees

  • Published:
Journal of Classification Aims and scope Submit manuscript

Abstract

In taxonomy and other branches of classification it is useful to know when tree-like classifications on overlapping sets of labels can be consistently combined into a parent tree. This paper considers the computation complexity of this problem. Recognizing when a consistent parent tree exists is shown to be intractable (NP-complete) for sets of unrooted trees, even when each tree in the set classifies just four labels. Consequently determining the compatibility of qualitative characters and partial binary characters is, in general, also NP-complete. However for sets of rooted trees an algorithm is described which constructs the “strict consensus tree” of all consistent parent trees (when they exist) in polynomial time. The related question of recognizing when a set of subtrees uniquely defines a parent tree is also considered, and a simple necessary and sufficient condition is described for rooted trees.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • AHO, A. V., SAVIG, Y., SZYMANSKI, T. G., and ULLMAN, J.D. (1981), “Inferring a Tree from the Lowest Common Ancestors with an Application to the Optimization of Relational Expressions”,SIAM Journal on Computing, 10(3), 405–421.

    Article  MATH  MathSciNet  Google Scholar 

  • BANDELT, H-J., and DRESS, A. (1986), “Reconstructing the Shape of a Tree from Observed Dissimilarity Data”,Advances in Applied Mathematics, 7, 309–343.

    Article  MATH  MathSciNet  Google Scholar 

  • BANDELT, H-J., VON HAESELER, A., BOLICK, J., and SCHÜTTE, H. (1990), “A Comparative Study of Sequence Dissimilarities and Evolutionary Distances Derived from Sets of Aligned RNA Sequences”, preprint.

  • BONDY, J. A., and MURTY, U. S. R. (1976),Graph Theory with Applications, London: Macmillan.

    Google Scholar 

  • BROSSEIER, G. (1990), “Piecewise Heirarchical Clustering”,Journal of Classification, 7, 197–216.

    Article  MathSciNet  Google Scholar 

  • BUNEMAN, P. (1971), “The Recovery of Trees from Measures of Dissimilarily”, inMathematics in the Archaeological and Historical Sciences, Eds., F. R. Hodson, D. G. Kendall, and P. Tautu, Edinburgh: Edinburgh University Press, 387–395.

    Google Scholar 

  • BUNEMAN, P. (1974), “A Characterization of Rigid Circuit Graphs”,Discrete Mathematics, 9, 205–212.

    Article  MATH  MathSciNet  Google Scholar 

  • CARTER, M., HENDY, M. D., PENNY, D., SZÉKELY, L. A., and WORMALD, N. C. (1990), “On the Distribution of Lengths of Evolutionary Trees”,SIAM Journal on Discrete Mathematics, 3, 38–47.

    Article  MATH  MathSciNet  Google Scholar 

  • CAVALLI-SFORZA, L. L., and EDWARDS, A. W. F. (1967), “Phylogenetic Analysis: Models and Estimation Procedures”,Evolution, 21, 550–570.

    Article  Google Scholar 

  • CAVENDER, J. A. and FELSENSTEIN, J. (1987), “Invariants of Phylogenies: Simple Cases with Discrete States”,Journal of Classification, 4, 57–71.

    Article  MATH  Google Scholar 

  • COLONIUS, H., and SCHULZE, H. H. (1981), “Tree Structures for Proximity Data”,British Journal of Mathematical and Statistical Psychology, 34, 167–180.

    MATH  MathSciNet  Google Scholar 

  • CONSTANTINESCU, M., and SANKOFF, D. (1986), “Tree Enumeration Modulo a Consensus”,Journal of Classification, 3, 349–356.

    Article  MATH  MathSciNet  Google Scholar 

  • DAY, W. H. E. (1985), “Optimal Algorithms for Comparing Trees with Labeled Leaves”,Journal of Classification, 2, 7–28.

    Article  MATH  MathSciNet  Google Scholar 

  • DAY, W. H. E. (1985), “Analysis of Quartet Dissimilarity Measures between Undirected Phylogenetic Trees”,Systematic Zoology, 35(3), 325–333.

    Article  Google Scholar 

  • DAY, W. H. E., and SANKOFF, D. (1986), “Computational Complexity of Inferring Phylogenies by Compatibility”,Systematic Zoology, 35(2), 224–229.

    Article  Google Scholar 

  • DEKKER, M. C. H. (1986),Reconstruction Methods for Derivation Trees, Masters thesis, Vrije Universitiet, Amsterdam.

    Google Scholar 

  • DRESS, A., VON HAESELER, A., and KRUEGER, M. (1986), “Reconstructing Phylogenetic Trees Using Variants of the ‘Four-Point-Condition’,”Studien zur Klassifikation, 17, 299–305.

    Google Scholar 

  • DRESS, A., and STEEL, M. A. (1991) “Convex Tree Realizations of Partitions,”Applied Mathematics Letters (in press).

  • DROLET, S., and SANKOFF, D. (1990), “Quadratic Tree Invariants for Multivalued Characters,”Journal of Theoretical Biology, 144, 117–129.

    Article  MathSciNet  Google Scholar 

  • ERDÖS, P., and SZÉKELY, L. A. (1989) “Applications of Antilexicographic Order 1. An Enumerative Theory of Trees,”Advances in Applied Mathematics, 10, 488–496.

    Article  MATH  MathSciNet  Google Scholar 

  • ESTABROOK, G. F., JOHNSON, C. S., Jr., and MCMORRIS, F. R. (1976), “An Algebraic Analysis of Cladistic Characters,”Discrete Mathematics, 16, 141–147.

    Article  MATH  MathSciNet  Google Scholar 

  • ESTABROOK, G. F., and MCMORRIS, F. R. (1977), “When are Two Taxonomic Characters Compatible?,”Journal of Mathematical Biology, 4, 195–299.

    MATH  MathSciNet  Google Scholar 

  • ESTABROOK, G. F., and MEACHAM, C. A. (1979), “How to Determine the Compatibility of Undirected Character State Trees,”Mathematical Biosciences, 46, 251–256.

    Article  MathSciNet  Google Scholar 

  • ESTABROOK, G. F., McMORRIS, F. R., and MEACHAM C. A. (1985), “Comparison of Undirected Phylogenetic Trees based on Subtrees of Four Evolutionary Units,”Systematic Zoology, 34(2), 193–200.

    Article  Google Scholar 

  • FELSENSTEIN, J. (1988), “Phylogenies from Molecular Sequences: Inference and Reliability,”Annual Review of Genetics, 22, 521–565.

    Article  Google Scholar 

  • FITCH, W. M. (1975), “Towards Finding the Tree of Maximum Parsimony,” inThe Eighth International Conference on Numerical Taxonomy, Ed., G. F. Estabrook, San Francisco: W.H. Freeman, 189–230.

    Google Scholar 

  • FOULDS, L. R., and GRAHAM, R. L. (1982), “The Steiner Problem in Phylogeny is NP-complete,”Advances in Applied Mathematics, 3, 43–49.

    Article  MATH  MathSciNet  Google Scholar 

  • GAREY, M. R., and JOHNSON, D.S., (1979),Computers and Intractibility, New Jersey: Bell Telephone Laboratories Ltd.

    Google Scholar 

  • GAVRIL, F. (1974), “The Intersection Graphs of Subtrees in Trees are Exactly the Chordal Graphs,”Journal of Combinatorial Theory (B), 16, 47–56.

    Article  MATH  MathSciNet  Google Scholar 

  • GOLUMBIC, M. C. (1980),Algorithmic Graph Theory and Perfect Graphs, New York: Academic Press, 92.

    MATH  Google Scholar 

  • GORDON, A. D. (1986), “Consensus Supertrees: The Synthesis of Rooted Trees Containing Overlapping Sets of Labeled Leaves,”Journal of Classification, 3, 335–348.

    Article  MATH  MathSciNet  Google Scholar 

  • GUSFIELD, D. (1991), “Efficient Algorithms for Inferring Evolutionary Trees,”Networks, 21, 19–28.

    MATH  MathSciNet  Google Scholar 

  • KANNAN, S., and WARNOW, T. (1990) “Inferring Evolutionary Trees from DNA Sequences,” in31st Annual Symposium on Foundations of Computer Science (Proceedings), Los Alamitos, California: IEEE Computer Society Press, 362–371.

    Chapter  Google Scholar 

  • HALL, M. H. Jr. (1967),Combinatorial Theory, Waltham: Blaisdell, 175.

    MATH  Google Scholar 

  • HENDY, M. D. (1989), “The Relationship Between Simple Evolutionary Tree Models and Observable Sequence Data,”Systematic Zoology, 38, 310–321.

    Article  Google Scholar 

  • KANT-ANTONESCU, M., and SANKOFF, D. (1991), “Efficient Construction of Supertrees,” manuscript.

  • MCMORRIS, F. R. (1975), “Compatibility Criteria for Cladistic and Qualitative Taxonomic Characters,” inThe Eighth International Conference on Numerical Taxonomy, Ed., E. A. Estabrook, San Fransisco: W.H. Freeman, 399–415.

    Google Scholar 

  • MCMORRIS, F. R. (1977), “On the Compatibility of Binary Qualitative Taxonomic Characters,”Bulletin of Mathematical Biology, 39, 133–138.

    MATH  MathSciNet  Google Scholar 

  • MCMORRIS, F. R., WARNOW, T., and WIMER, T. (1991), “Chordal Completion of Coloured Graphs for a Fixed Number of Colours”, manuscript.

  • MARGUSH, T., and MCMORRIS, F. R. (1981), “Consensus n-trees,”Bulletin of Mathematical Biology, 43, 239–244.

    MATH  MathSciNet  Google Scholar 

  • MEACHAM, C. A. (1981), “A Manual Method for Constructing Trees and Hierarchical Classifications,”Journal of Molecular Evolution, 18, 30–37.

    Article  Google Scholar 

  • MEACHAM, C. A. (1983), “Theoretical and Computational Considerations of the Compatibility of Qualitative Taxonomic Characters,” inNumerical Taxonomy, Ed., J. Felsenstein, NATO ASI Series Vol. G1, Berlin Heidelberg: Springer-Verlag, 304–314.

    Google Scholar 

  • MEACHAM, C. A., and DUNCAN, T. (1987), “The Necessity of Convex Groups in Biological Classification,”Systematic Botany, 12, 78–90.

    Article  Google Scholar 

  • SOKAL, R. R., and ROHLF, F. J. (1981), “Taxonomic Congruence in the Leptopodomorpha Re-examined,”Systematic Zoology, 30, 309–325.

    Article  Google Scholar 

  • WALTER, J. R. (1972),Representations of Rigid Cycle Graphs, Ph.D. thesis, Wayne State University, Detroit, Michigan.

    Google Scholar 

  • WARNOW, T. (1991),Combinatorial Algorithms for Constructing Phylogenetic Trees, PhD thesis, University of California-Berkeley.

  • WATERMAN, M. S., and SMITH, T. S. (1978), “On the Similarity of Dendograms,”Journal of Theoretical Biology, 73, 789–800.

    Article  MathSciNet  Google Scholar 

  • YANNAKAKIS, M. (1981), “Computing the Minimum Fill-in is NP-complete,”SIAM Journal on Algebraic and Discrete Methods, 2, 77–79.

    Article  MATH  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Additional information

This work was supproted by the Alexander von Humoldt-Stiftung. I wish to thank Andreas Dress, Hans-Jürgen Bandelt and the referees for their helpful comments.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Steel, M. The complexity of reconstructing trees from qualitative characters and subtrees. Journal of Classification 9, 91–116 (1992). https://doi.org/10.1007/BF02618470

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1007/BF02618470

Keywords

Navigation