Trends in Ecology & Evolution
Review
The supermatrix approach to systematics
Section snippets
Building ever-larger trees
In The Origin of Species [1], Charles Darwin established that descent with modification explains the similarities and differences among all organisms: from bacteria to orchids to humans, we are all connected in a single Tree of Life. Until recently, taxonomic specialists were restricted to reconstructing phylogenetic relationships among relatively few species using small numbers of characters. However, recent advances now enable phylogenetic analyses of thousands of taxa or characters (e.g. 2, 3
Basic strengths of the supermatrix approach
The supermatrix approach is defined by the direct, simultaneous use of all the character evidence from all included taxa (Figure 1). A basic advantage of the approach is simply that it uses this character evidence more fully in estimating the tree than do supertree methods; in supertree analyses, some of the character information within data sets is lost when sets of characters are summarized as trees [6,13]. This advantage of the supermatrix approach is so fundamental (more information is
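The core idea of combining all character evidence from all included taxa can be illustrated with a minimal sketch. It assumes each data partition is already aligned and stored as a mapping from taxon name to character string; the function name `build_supermatrix`, the example taxa, and the use of `?` for missing data are illustrative assumptions, not details taken from the article.

```python
def build_supermatrix(partitions, missing="?"):
    """Concatenate aligned partitions into one taxon-by-character matrix.

    partitions: list of dicts mapping taxon name -> aligned character string.
    Taxa absent from a partition are padded with the missing-data symbol.
    """
    # Every taxon that appears in at least one partition gets a row.
    taxa = sorted({taxon for part in partitions for taxon in part})
    rows = {taxon: [] for taxon in taxa}
    for part in partitions:
        lengths = {len(seq) for seq in part.values()}
        if len(lengths) != 1:
            raise ValueError("each partition must be aligned to a single length")
        width = lengths.pop()
        for taxon in taxa:
            # Use the observed characters, or pad with missing data.
            rows[taxon].append(part.get(taxon, missing * width))
    return {taxon: "".join(chunks) for taxon, chunks in rows.items()}


# Hypothetical toy data: one molecular partition and one morphological partition.
gene_a = {"human": "ACGT", "mouse": "ACGA", "frog": "ATGA"}
morphology = {"human": "01", "frog": "11", "fossil_x": "10"}

matrix = build_supermatrix([gene_a, morphology])
for taxon, row in matrix.items():
    print(taxon, row)
```

In this toy case the fossil taxon, scored only for morphology, still enters the combined matrix; its molecular characters are simply coded as missing.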
The gene tree–species tree problem
The standard supermatrix approach implicitly assumes that all characters have experienced the same branching history. It is now well known that this assumption is not always valid. In particular, the gene tree for copies of a gene in different species might not match the species tree (the history of splitting of species lineages) because of hybridization, horizontal gene transfer, gene duplication and sorting of genetic polymorphisms among species lineages (lineage sorting) [22]. This ‘gene
New supermatrix methods
An equally weighted parsimony approach has been perhaps the most common mode of analysis in supermatrix studies (e.g. [18,29]). Proponents of parsimony seek maximally general explanations for similarities among taxa as results of inheritance from a common ancestor; the criterion of generality entails choosing a tree that minimizes the number of character-state changes required to explain character similarities [30]. Here, we describe several new methods that extend the capabilities of
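The parsimony criterion described above, choosing the tree that minimizes the number of character-state changes, can be sketched for a fixed tree with Fitch's algorithm. This is a minimal illustration, assuming equally weighted, unordered characters, a rooted binary tree written as nested tuples, and no missing data; the names `fitch_length` and `tree_length` and the four-taxon matrix are hypothetical. Real supermatrix studies rely on dedicated software that also searches tree space heuristically.

```python
def fitch_length(tree, states):
    """Minimum number of state changes for one character on a fixed tree.

    tree: nested 2-tuples with taxon names (strings) at the tips.
    states: dict mapping taxon name -> observed state for this character.
    """
    changes = 0

    def post_order(node):
        nonlocal changes
        if isinstance(node, str):          # tip: its state set is the observation
            return {states[node]}
        left, right = node
        a, b = post_order(left), post_order(right)
        shared = a & b
        if shared:                         # children agree: no change needed here
            return shared
        changes += 1                       # children disagree: one change required
        return a | b

    post_order(tree)
    return changes


def tree_length(tree, matrix):
    """Sum Fitch lengths over all characters (equal weights)."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_length(tree, {taxon: row[i] for taxon, row in matrix.items()})
        for i in range(n_chars)
    )


# Hypothetical matrix: four taxa scored for three binary characters.
matrix = {"A": "000", "B": "010", "C": "101", "D": "111"}
tree_ab_cd = (("A", "B"), ("C", "D"))
tree_ac_bd = (("A", "C"), ("B", "D"))

print(tree_length(tree_ab_cd, matrix), tree_length(tree_ac_bd, matrix))
```

On this toy matrix the first tree requires four changes and the second five, so an equally weighted parsimony analysis would prefer the grouping of A with B and C with D.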
Supermatrices and the Tree of Life
The Tree of Life as we know it includes ∼1.5 million living species [48] plus many thousands of extinct species. Despite the continuing development of methods that increase the speed of phylogenetic analyses (e.g. [40,49,50,51,52]), it might never be possible to estimate the entire Tree of Life reliably using a single, standard supermatrix analysis [53]. Nonetheless, current methods used to construct very large trees suggest that a supermatrix approach can have a central role in assembling
Supermatrices for the future
The supermatrix approach has proven to be a powerful method of combining diverse data to infer phylogenetic relationships. The approach directly uses the evidence provided by each character, often revealing emergent support that is hidden in separate analyses of data partitions, and easily accommodates different classes of character data. Regarding the latter point, molecular systematists sometimes forget that a tree of all known life must include many fossil taxa and, thus, must be derived in
Acknowledgements
We thank C. Hayashi, J. Kim, K. de Queiroz and three anonymous reviewers for discussion or comments on the article, and M. O’Leary and G. Giribet for providing the MorphoBank figure. J.G. was funded by NSF EAR0228629, DEB–0213171 and DEB–0212572.
Glossary
- Bayesian phylogenetic method
- any method that uses Bayesian inference in phylogenetic estimation. In Bayesian phylogenetic inference, the posterior probability of a tree (which, given certain assumptions, is the probability that the tree is correct) is a function of the product of the tree's prior probability and its likelihood. The most commonly used Bayesian phylogenetic method uses a Markov Chain Monte Carlo technique designed to visit different tree topologies with a frequency proportional to
References (73)
Phylogenetic supertrees: assembling the trees of life
Trends Ecol. Evol.
(1998)
The evolution of supertrees
Trends Ecol. Evol.
(2004)
Corroboration among data sets in simultaneous analysis: hidden support for phylogenetic relationships among higher level artiodactyl taxa
Cladistics
(1999)
Genome mosaicism and organismal lineages
Trends Genet.
(2004)
et al.
Uninode coding vs. gene-tree parsimony for phylogenetic reconstruction using duplicate genes
Mol. Phylog. Evol.
(2002)
et al.
Gene tree parsimony vs. uninode coding for phylogenetic reconstruction
Mol. Phylog. Evol.
(2003)
The phylogeny of the extant hexapod orders
Cladistics
(2001)
Is Ellipura monophyletic? A combined analysis of basal hexapod relationships with emphasis on the origin of insects
Org. Diversity Evol.
(2004)
Independence of alignment and tree search
Mol. Phylog. Evol.
(2004)
et al.
Links between maximum likelihood and maximum parsimony under a simple model of site substitution
Bull. Math. Biol.
(1997)
Bayesian inference of the metazoan phylogeny: a combined molecular and morphological approach
Curr. Biol.
Analyzing large data sets in reasonable times: solutions for composite optima
Cladistics
Parsimony jackknifing outperforms neighbor-joining
Cladistics
Unlikelihood that minimal phylogenies for a realistic biological study can be constructed in reasonable computation time
Math. Biosci.
Missing data and the design of phylogenetic analyses
J. Biomed. Inform.
Data exploration in phylogenetic inference: scientific, heuristic, or neither
Cladistics
On the Origin of Species
Simultaneous parsimony jackknife analysis of 2538 rbcL DNA sequences reveals support for major clades of green plants, land plants, seed plants, and flowering plants
Plant Syst. Evol.
Genome-scale approaches to resolving incongruence in molecular phylogenies
Nature
A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes)
Syst. Zool.
Prospects for building the Tree of Life from large sequence databases
Science
Inconsistencies in arguments for the supertree approach: supermatrices versus supertrees of Crocodylia
Syst. Biol.
Molecular evidence and marine snake origins
Biol. Lett.
Relationships of endemic African mammals and their fossil relatives based on morphological and molecular evidence
J. Mamm. Evol.
Phylogenetic relationships of extinct cetartiodactyls: results of simultaneous analyses of molecular, morphological, and stratigraphic data
J. Mamm. Evol.
Phylogenomics and the reconstruction of the tree of life
Nat. Rev. Genet.
Separate versus combined analysis of phylogenetic evidence
Annu. Rev. Ecol. Syst.
Against consensus
Syst. Zool.
Combining data in phylogenetic systematics: an empirical approach using three molecular data sets in the Solanaceae
Syst. Biol.
Partitioned Bremer support localizes significant conflict in bee flies (Diptera: Bombyliidae: Anthracinae)
Invertebr. Syst.
Synergistic effects of combining morphological and molecular data in resolving the phylogeny of butterflies and skippers
Proc. R. Soc. B
Supertree bootstrapping methods for assessing phylogenetic variation among genes in genome-scale data sets
Syst. Biol.
Taxonomic sampling, phylogenetic accuracy, and investigator bias
Syst. Biol.
Increased taxon sampling greatly reduces phylogenetic error
Syst. Biol.
Gene trees in species trees
Syst. Biol.
Inferring phylogeny despite incomplete lineage sorting
Syst. Biol.