Trends in Microbiology
Volume 19, Issue 10, October 2011, Pages 483-491

Review
Microbial systems biology
Phylogenomic networks

https://doi.org/10.1016/j.tim.2011.07.001

Phylogenomics aims to study functional and evolutionary aspects of genome biology using phylogenetic analysis of whole genomes. Current approaches to genome phylogenies are commonly framed in terms of phylogenetic trees. However, several evolutionary processes are non-tree-like in nature, including recombination and lateral gene transfer (LGT). Phylogenomic networks are a special type of phylogenetic network reconstructed from fully sequenced genomes. The network model, comprising genomes connected by pairwise evolutionary relations, enables the reconstruction of both vertical and LGT events. Modeling genome evolution in the form of a network enables the use of an extensive toolbox developed for network research. The structural properties of phylogenomic networks open up fundamentally new insights into genome evolution.

Section snippets

Phylogenomics

The evolutionary history of a species is most commonly depicted as a bifurcating phylogenetic tree (see Glossary) comprising nodes and branches. The nodes in the tree correspond to contemporary species (external nodes) and their ancestors (internal nodes). The branches represent vertical inheritance linking ancestors with their descendants (Figure 1a). The accumulation of fully sequenced genomes since the early 2000s has enabled the practice of phylogenomics, that is, the study of phylogenetic

Networks

A network (or a graph) is a mathematical model of pairwise relations among entities. The entities (vertices or nodes) in the network are linked by edges representing the connections or interactions between these entities. In a coauthorship network, for example, the vertices signify scientists and the edges represent publications shared by the scientists that they connect [22]. In an aviation network, airports are connected by flights [23]. Network approaches are common in almost all fields of

Phylogenetic networks

Networks are commonly used in phylogenetic research for the reconstruction of evolutionary processes that are non-tree-like in nature, including hybridization, recombination, genome fusions and LGT [19]. The application of networks to phylogenetic data enables the modeling and visualization of reticulated evolutionary events that cannot be represented using a bifurcating phylogenetic tree [18, 19, 20, 21, 34, 35, 36]. Network applications can also be used for tree-like (vertical inheritance only)

Phylogenomic networks of shared genes

Networks of shared genes are reconstructed from the presence/absence pattern of all orthologous protein families distributed across the genomes in the network [11, 46, 47, 48, 49, 50, 54]. The vertices in the network are genomes (species) and the edges correspond to gene sharing between the genomes they connect. The gene sharing network reconstruction procedure includes the following steps: (i) selecting the genomes to be included in the network; (ii) sorting all proteins encoded in the selected
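The core idea of a gene sharing network, with genomes as vertices and shared protein families as edges, can be illustrated with a minimal Python sketch. The genome names, family identifiers and the `shared_gene_network` helper are hypothetical, and the edge weight used here (the number of shared families) is an assumption for illustration rather than the published reconstruction procedure.

```python
from itertools import combinations

# Hypothetical presence/absence data: genome -> set of protein family IDs.
families_by_genome = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam2", "fam3", "fam4"},
    "genome_C": {"fam5"},
}


def shared_gene_network(families_by_genome):
    """Return {(genome1, genome2): n_shared} for pairs sharing >= 1 family."""
    edges = {}
    for g1, g2 in combinations(sorted(families_by_genome), 2):
        n_shared = len(families_by_genome[g1] & families_by_genome[g2])
        if n_shared > 0:
            # Edge weight = number of protein families present in both genomes.
            edges[(g1, g2)] = n_shared
    return edges


network = shared_gene_network(families_by_genome)
print(network)
```

Genomes that share no family with any other genome (here `genome_C`) remain as unconnected vertices and simply do not appear in the edge dictionary.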

Phylogenomic LGT networks from shared genes

Phylogenomic LGT networks have been developed to study the lateral component in microbial evolution and are reconstructed from LGT events inferred from genomic data [11, 14, 48, 50, 51]. Networks of laterally shared genes (LSG) are a special case of shared genes networks. These are designed specifically to study gene distribution patterns resulting from LGT during prokaryotic evolution. The vertices in the network are the external and internal nodes of a reference species phylogenetic tree. Edges

Phylogenomic LGT networks from trees

Phylogenomic LGT networks have also been reconstructed from LGT events detected in gene phylogenies [14, 51]. As in the LSG network, the phylogenomic LGT network reconstruction requires a species tree that serves as a reference for distinguishing between vertical inheritance and LGT. For the network reconstruction, a phylogenetic tree is reconstructed for each protein family. Branches (splits) in the protein family tree that are found in disagreement with the reference species tree are
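The comparison of splits between a protein family tree and the reference species tree can be sketched as follows. This is a simplified illustration, not the published method: it assumes both trees contain exactly the same set of taxa, encodes trees as nested tuples of hypothetical leaf names, and treats any split absent from the reference as a candidate for further inspection.

```python
def leaf_sets(tree):
    """Return (leaves, clades): all leaves, and the leaf set under each internal node."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    clades = []
    for child in tree:
        child_leaves, child_clades = leaf_sets(child)
        leaves |= child_leaves
        clades.extend(child_clades)
    clades.append(leaves)
    return leaves, clades


def splits(tree):
    """Return the non-trivial splits (bipartitions of the leaf set) of a tree."""
    all_leaves, clades = leaf_sets(tree)
    result = set()
    for clade in clades:
        other = all_leaves - clade
        # Keep only splits with at least two leaves on each side.
        if len(clade) >= 2 and len(other) >= 2:
            result.add(frozenset([clade, other]))
    return result


# Hypothetical reference species tree and protein family tree (same taxa).
species_tree = ((("A", "B"), "C"), ("D", "E"))
gene_tree = ((("A", "D"), "C"), ("B", "E"))

# Splits present in the gene tree but absent from the reference tree.
conflicting = splits(gene_tree) - splits(species_tree)
for split in conflicting:
    print(sorted(sorted(side) for side in split))
```

Representing each split as an unordered pair of leaf sets makes the comparison independent of where either tree happens to be rooted.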

Structural properties of phylogenomic networks

Structural properties of networks can be analyzed and understood using an extensive set of tools developed over the years [24, 26]. Node connectivity, for example, is a measure that quantifies the extent to which a node is central within the network [26]. A similar measure, vertex centrality, quantifies the frequency with which the vertex occurs along the shortest path between any vertex pair in the network. The overall distribution of vertex centrality is commonly used to test for internal
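Both measures can be computed on a small example network. The sketch below assumes that connectivity corresponds to the degree defined in the Glossary and that the shortest-path measure corresponds to what is usually called betweenness centrality; it uses Brandes' algorithm for unweighted, undirected graphs, and the vertex names are hypothetical.

```python
from collections import deque


def degree(adjacency):
    """Number of edges attached to each vertex."""
    return {v: len(neighbors) for v, neighbors in adjacency.items()}


def betweenness(adjacency):
    """Shortest-path betweenness (Brandes) for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        order = []                                 # vertices in BFS visiting order
        predecessors = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}        # shortest paths from source
        distance = {v: -1 for v in adjacency}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:                # first visit to w
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v precedes w on a shortest path
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        dependency = {v: 0.0 for v in adjacency}
        for w in reversed(order):                  # accumulate from farthest vertices
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered vertex pair was counted once from each endpoint.
    return {v: s / 2.0 for v, s in score.items()}


# Hypothetical network: a hub H linked to A, B and C, with C also linked to D.
adjacency = {
    "H": ["A", "B", "C"],
    "A": ["H"],
    "B": ["H"],
    "C": ["H", "D"],
    "D": ["C"],
}
degrees = degree(adjacency)
centrality = betweenness(adjacency)
print(degrees)
print(centrality)
```

In this example the hub H has the highest degree (3) and lies on the shortest paths of five vertex pairs, whereas C has degree 2 and lies on three.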

Concluding remarks

Each of the different phylogenomic network types presented here offers a different insight into microbial genome evolution. Networks capture a substantial component of genome evolution, which is not tree-like in nature. Therefore, in biological systems where reticulated evolutionary events are common, phylogenomic networks offer a general computational approach that is more biologically realistic and evolutionarily more accurate. The prevalence of LGT during microbial and viral evolution makes

Acknowledgments

I thank Giddy Landan, Liat Shavit-Grievink, Ovidiu Popa, Thorsten Klösges and William Martin for their critical comments on the manuscript. This publication was funded in part by an ERC grant NETWORKORIGINS to William Martin.

Glossary

Conjugation
the transfer of DNA via proteinaceous cell-to-cell junctions in bacteria.
Degree (or connectivity)
the number of edges that connect the node with other nodes.
Directed network
a network where the entities are connected by asymmetric relationships.
Edge (or link)
related vertices are connected by an edge.
Gene copy number
the number of copies of a certain gene within the genome.
Gene transfer agents (GTA)
phage-like DNA-carriers that are produced by a donor cell during the growth phase and

References (61)

  • C.M. Thomas et al.

    Mechanisms of, and barriers to, horizontal gene transfer between bacteria

    Nat. Rev. Microbiol.

    (2005)
  • I. Chen

    The ins and outs of DNA transfer in bacteria

    Science

    (2005)
  • L.D. McDaniel

    High frequency of horizontal gene transfer in the oceans

    Science

    (2010)
  • V. Kunin

    The net of life: reconstructing the microbial phylogenetic network

    Genome Res.

    (2005)
  • T. Dagan et al.

    Ancestral genome sizes specify the minimum rate of lateral gene transfer during prokaryote evolution

    Proc. Natl. Acad. Sci. U.S.A.

    (2007)
  • B.G. Mirkin

    Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes

    BMC Evol. Biol.

    (2003)
  • F. Ge

    The cobweb of life revealed by genome-scale estimates of horizontal gene transfer

    PLoS Biol.

    (2005)
  • R.G. Beiko

    Highways of gene sharing in prokaryotes

    Proc. Natl. Acad. Sci. U.S.A.

    (2005)
  • R. Sorek

    Genome-wide experimental determination of barriers to horizontal gene transfer

    Science

    (2007)
  • R. Jain

    Horizontal gene transfer among genomes: the complexity hypothesis

    Proc. Natl. Acad. Sci. U.S.A.

    (1999)
  • O. Cohen

    The complexity hypothesis revisited: connectivity rather than function constitutes a barrier to horizontal gene transfer

    Mol. Biol. Evol.

    (2011)
  • D.H. Huson et al.

    Application of phylogenetic networks in evolutionary studies

    Mol. Biol. Evol.

    (2006)
  • D.H. Huson et al.

    A survey of combinatorial methods for phylogenetic networks

    Genome Biol. Evol.

    (2011)
  • E. Bapteste

    Prokaryotic evolution and the tree of life are two different things

    Biol. Direct

    (2009)
  • T. Dagan et al.

    Getting a better picture of microbial evolution en route to a network of genomes

    Philos. Trans. R. Soc. Lond. B: Biol. Sci.

    (2009)
  • M.E. Newman

    The structure of scientific collaboration networks

    Proc. Natl. Acad. Sci. U.S.A.

    (2001)
  • R. Guimerà

    The worldwide air transportation network: anomalous centrality, community structure, and cities’ global roles

    Proc. Natl. Acad. Sci. U.S.A.

    (2005)
  • S.H. Strogatz

    Exploring complex networks

    Nature

    (2001)