Explosions and hot spots in supertree methods

https://doi.org/10.1016/j.jtbi.2008.03.024

Abstract

In phylogenetic systematics a problem of great practical and theoretical interest is to construct one or more large phylogenies (evolutionary trees), i.e., supertrees, from a given set of small phylogenies with overlapping sets of leaf labels. Although the methods being used to solve this problem are usually given plausible biological or theoretical justifications, occasionally it is possible to see that the result of a supertree method (SM) is explosive, and therefore logically meaningless, in the sense that it has been inferred from logical propositions that are contradictory. This paper presents the basic ideas and issues of how explosions affect the inference of rooted trees by SMs. We define the relevant concepts, give examples, and show how sometimes it is possible to identify hot spots in the input from which an SM may make explosive inferences that cannot be logically justified.

Introduction

explode vt. 1. to cause to be rejected; expose as false; discredit

hot spot [Slang] 1. an area of actual or potential trouble or violence

  — Webster's New World Dictionary (1972).

A provocative feature of classical logic concerns how contradiction and logical inference interact. This logic's inference relation ⊢ is explosive since according to it a contradiction entails everything, i.e., for any logical propositions P and Q, P ∧ ¬P ⊢ Q. When this condition occurs the set {P, ¬P} of contradictory propositions is said to explode and the logical consequent Q is an explosive, and therefore logically meaningless, inference from that set. Thus if P is “Integer 2 is prime” and Q is “Bill's supertree method (SM) is best” then the logical implication P ∧ ¬P → Q, although valid, provides no support for using Bill's SM, just as P ∧ ¬P → ¬Q, although also valid, provides no support for not using Bill's SM.
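The explosion principle described above can be checked mechanically. The following is a minimal Lean 4 sketch; the theorem names `explosion` and `explosion_neg` are ours for illustration and do not come from the paper.

```lean
-- Explosion (ex falso quodlibet): from a contradiction, any proposition Q follows.
theorem explosion (P Q : Prop) (h : P ∧ ¬P) : Q :=
  absurd h.1 h.2

-- The same contradiction entails ¬Q just as readily, so it supports neither.
theorem explosion_neg (P Q : Prop) (h : P ∧ ¬P) : ¬Q :=
  absurd h.1 h.2
```

Both proofs are identical, which is the point: the contradictory premises do all the work and the conclusion plays no role.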

In several ways exploding sets, and explosive inferences from exploding sets, pertain when inferring super- (or consensus) trees. (i) When inferring rooted phylogenies let an SM C be presented with a set K of rooted triplets (defined in the next section). Each triplet xy|z is a logical proposition that x and y are more closely related than either is to z. If at least two triplets of {xy|z,xz|y,yz|x} are in input K or result C(K) then at least a subset of K or C(K) explodes. (ii) Let C be any supertree (or consensus) method that when presented with a set K of trees infers a set C(K) of super- (or consensus) trees; if K explodes, i.e., if K is logically equivalent to a contradiction, then C(K) could be viewed as an explosive, and therefore logically meaningless, inference from K. (iii) For this reason if C is any supertree (or consensus) method that when presented with a set K of trees infers a set C(K) of super- (or consensus) trees, it would be instructive to characterize the sets K that cause C(K) to explode, i.e., to understand precisely the circumstances in which C(K) becomes a logically meaningless inference from K.
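Point (i) lends itself to a small check. The sketch below is an illustration under our own assumptions, not the authors' code: a triplet xy|z is encoded as a pair (frozenset({x, y}), z), and the helper names `triplet`, `leaves` and `contradictory_pairs` are hypothetical. Two distinct triplets on the same three leaves necessarily resolve those leaves differently, so they contradict each other.

```python
from itertools import combinations

def triplet(x, y, z):
    """Encode xy|z: x and y are more closely related than either is to z."""
    return (frozenset((x, y)), z)

def leaves(t):
    """The three leaf labels that a triplet involves."""
    pair, outlier = t
    return pair | {outlier}

def contradictory_pairs(triplets):
    """Pairs of distinct triplets that resolve the same three leaves differently."""
    distinct = list(set(triplets))
    return [(s, t) for s, t in combinations(distinct, 2)
            if leaves(s) == leaves(t)]

# xy|z alone is consistent; adding xz|y creates one contradictory pair.
consistent = {triplet("x", "y", "z")}
exploding = {triplet("x", "y", "z"), triplet("x", "z", "y")}
print(len(contradictory_pairs(consistent)), len(contradictory_pairs(exploding)))
```

The frozenset encoding makes xy|z and yx|z the same object, so only genuinely different resolutions are reported.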

For example let C be a consensus method and for the leaf set S={a,b,c,d} let C's input K={K1,K2} comprise the rooted trees of Fig. 1. K can be represented as a union of two triplet sets. The first, {ab|c,ab|d}, is contained in K1 and K2 and is their triplet strict consensus (Sokal and Rohlf, 1981, p. 312). The second, {ac|d,cd|a,bc|d,cd|b}, contains contradictory pairs of triplets; if this second set is a region of contradiction in K then C(K) should not derive from any explosive, and therefore logically meaningless, inferences from it.
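The decomposition in this example can be reproduced mechanically. Fig. 1 is not reproduced here, so the sketch below assumes the two tree shapes implied by the triplet sets listed in the text: (((a,b),c),d) and ((a,b),(c,d)). Which of these is K1 and which is K2 is our guess. Trees are written as nested tuples, and the function names are hypothetical.

```python
from itertools import combinations

def leaf_set(tree):
    """Leaves of a tree given as nested tuples with string leaves."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def triplets(tree):
    """Rooted triplets xy|z displayed by the tree, as (frozenset({x, y}), z)."""
    if isinstance(tree, str):
        return set()
    found = set()
    child_leaves = [leaf_set(child) for child in tree]
    for child in tree:
        found |= triplets(child)  # triplets lying wholly inside one child
    for i, inside in enumerate(child_leaves):
        outside = frozenset().union(
            *(c for j, c in enumerate(child_leaves) if j != i))
        # x and y share a child subtree, z sits in a different one: xy|z
        for x, y in combinations(sorted(inside), 2):
            for z in outside:
                found.add((frozenset((x, y)), z))
    return found

def show(t):
    """Render a triplet in the xy|z notation of the text."""
    pair, outlier = t
    return "".join(sorted(pair)) + "|" + outlier

# Assumed tree shapes (see the note above).
K1 = ((("a", "b"), "c"), "d")
K2 = (("a", "b"), ("c", "d"))

shared = triplets(K1) & triplets(K2)               # triplet strict consensus
rest = (triplets(K1) | triplets(K2)) - shared      # triplets the trees disagree on

print(sorted(show(t) for t in shared))
print(sorted(show(t) for t in rest))
```

Under these assumptions the shared part is {ab|c, ab|d} and the remainder is {ac|d, bc|d, cd|a, cd|b}, matching the two sets named in the example.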

This paper presents the basic ideas and issues of how explosions affect the inference of rooted trees by SMs. We define the concepts needed to specify explosions. We use the semi-strict SM (Goloboff and Pol, 2002) and a new semi-closed SM to illustrate properties of exploding SMs and to characterize when those SMs explode trivially. With such characterizations we can identify input hot spots from which those SMs may make explosive inferences that cannot be logically justified. We suggest how to avoid explosive, and therefore logically meaningless, inferences from hot spots. We summarize why users should worry when explosions occur while inferring phylogenies.

Section snippets

Concepts and terms

Our SMs are based on the concept of rooted tree, i.e., an acyclic connected graph with each leaf (vertex of degree 1) uniquely labeled, with one interior vertex that is distinguished and called the root, and with no vertices of degree 2 except possibly the root. Always our trees are rooted with more than two labeled leaves and one interior vertex. A way to study sets of such trees is to replace each tree by a set of its phylogenetically informative subtrees, which may be taken to be binary

Trivial explosions

If an SM C can be defined in terms of logical propositions then it may be possible to characterize the sets at which C explodes trivially. Our examples are SMs that generalize from the consensus (e.g., Bremer, 1990; Day and McMorris, 2003) to the supertree context.

Hot spots

Let C be any SM with a characterization (#) of any triplet set K such that C explodes trivially at K. We could use (#) to identify regions of contradiction (with respect to C) in any set K by applying (#) to the subsets of K. Specifically, take any maximal subset of K that satisfies (#); since C explodes trivially at that subset, the subset is a hot spot in K for C. C explodes trivially at K if and only if K is itself a hot spot in K for C. Informally, although K itself need not explode, any hot spot 
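The search for hot spots described in this snippet can be sketched as a brute-force enumeration of maximal subsets. The characterization (#) is not given in the snippet, so the code takes it as a parameter `satisfies`. The predicate `every_triplet_contradicted` used in the demonstration is a stand-in of our own, not the paper's (#), and all function names are hypothetical.

```python
from itertools import combinations

def maximal_subsets(K, satisfies):
    """Maximal non-empty subsets of K accepted by the predicate `satisfies`."""
    items = list(set(K))
    found = []
    for size in range(len(items), 0, -1):      # largest subsets first
        for combo in combinations(items, size):
            subset = frozenset(combo)
            if any(subset <= bigger for bigger in found):
                continue                        # inside a larger accepted subset
            if satisfies(subset):
                found.append(subset)
    return found

def parse(text):
    """Turn 'ab|c' into (frozenset({'a', 'b'}), 'c'); single-letter leaves only."""
    pair, outlier = text.split("|")
    return (frozenset(pair), outlier)

def every_triplet_contradicted(subset):
    """Stand-in for (#): each triplet has a rival on the same three leaves."""
    def leaves(t):
        return t[0] | {t[1]}
    return all(any(s != t and leaves(s) == leaves(t) for s in subset)
               for t in subset)

K = {parse(s) for s in ["ab|c", "ab|d", "ac|d", "bc|d", "cd|a", "cd|b"]}
hot_spots = maximal_subsets(K, every_triplet_contradicted)
print(len(hot_spots))
```

The enumeration is exponential in the size of K, so it is only suitable for small illustrative inputs. With the stand-in predicate and the six triplets of the introductory example, the single maximal subset found is {ac|d, bc|d, cd|a, cd|b}.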

Discussion

We have studied explosive inferences in a simple phylogenetic context where explosions and hot spots have natural formulations in logical rather than statistical terms. Seen through a statistical lens, an explosive inference may be a best inference in some sense, e.g., using maximum likelihood under some reasonable model, and may be meaningful. Nor would we necessarily condemn majority-rule or median SMs as giving logically meaningless inferences from contradictions, for such results could be

Acknowledgments

We thank P.A. Goloboff and an anonymous referee for their criticisms of a draft of this paper. M.W. was supported in part by BBSRC Grant 40/18385.
