Trends in Microbiology
Review
Microbial systems biology
Phylogenomic networks
Section snippets
Phylogenomics
The evolutionary history of a species is most commonly depicted as a bifurcating phylogenetic tree (see Glossary) comprising nodes and branches. The nodes in the tree correspond to contemporary species (external nodes) and their ancestors (internal nodes). The branches represent vertical inheritance linking ancestors with their descendants (Figure 1a). The accumulation of fully sequenced genomes since the early 2000s has enabled the practice of phylogenomics, that is, the study of phylogenetic
Networks
A network (or a graph) is a mathematical model of pairwise relations among entities. The entities (vertices or nodes) in the network are linked by edges representing the connections or interactions between these entities. In a coauthorship network, for example, the vertices are scientists and an edge connects two scientists who have published together [22]. In an aviation network, airports are connected by flights [23]. Network approaches are common in almost all fields of
Phylogenetic networks
Networks are commonly used in phylogenetic research for the reconstruction of evolutionary processes that are non-tree-like in nature, including hybridization, recombination, genome fusions and LGT [19]. The application of networks to phylogenetic data enables the modeling and visualization of reticulated evolutionary events that cannot be represented using a bifurcating phylogenetic tree [18–21,34–36]. Network applications can also be used for tree-like (vertical inheritance only)
Phylogenomic networks of shared genes
Networks of shared genes are reconstructed from the presence/absence pattern of all orthologous protein families distributed across the genomes in the network [11,46–50,54]. The vertices in the network are genomes (species) and the edges correspond to gene sharing between the genomes they connect. The gene sharing network reconstruction procedure includes the following steps: (i) selecting the genomes to be included in the network; (ii) sorting all proteins encoded in the selected
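The core idea of a gene-sharing network can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the proteins are taken to be already sorted into families, the edge weight is assumed to be the raw number of families present in both genomes, and the function name `shared_gene_network` and the toy data are hypothetical. The remaining steps of the published procedure are not reproduced here.

```python
from itertools import combinations


def shared_gene_network(families):
    """Build a weighted gene-sharing network.

    `families` maps a protein family id to the set of genomes that encode
    at least one member of that family (a presence/absence pattern).
    Returns a dict mapping each genome pair (a sorted tuple) to the number
    of protein families present in both genomes.
    """
    edges = {}
    for genomes in families.values():
        # Every pair of genomes carrying this family shares one more gene.
        for a, b in combinations(sorted(genomes), 2):
            edges[(a, b)] = edges.get((a, b), 0) + 1
    return edges


# Hypothetical toy data: three genomes, four protein families.
toy_families = {
    "fam1": {"genomeA", "genomeB", "genomeC"},
    "fam2": {"genomeA", "genomeB"},
    "fam3": {"genomeB", "genomeC"},
    "fam4": {"genomeC"},  # present in a single genome, so it adds no edge
}
network = shared_gene_network(toy_families)
```

In this toy case the vertices are the three genomes, and each edge carries the count of families that its two genomes have in common.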
Phylogenomic LGT networks from shared genes
Phylogenomic LGT networks have been developed to study the lateral component in microbial evolution and are reconstructed from LGT events inferred from genomic data [11,14,48,50,51]. Networks of laterally shared genes (LSG) are a special case of shared genes networks. These are designed specifically to study gene distribution patterns resulting from LGT during prokaryotic evolution. The vertices in the network are the external and internal nodes of a reference species phylogenetic tree. Edges
Phylogenomic LGT networks from trees
Phylogenomic LGT networks have also been reconstructed from LGT events detected in gene phylogenies [14,51]. As in the LSG network, the phylogenomic LGT network reconstruction requires a species tree that serves as a reference for distinguishing between vertical inheritance and LGT. For the network reconstruction, a phylogenetic tree is reconstructed for each protein family. Branches (splits) in the protein family tree that are found in disagreement with the reference species tree are
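Comparing the splits of a protein family tree against those of the reference species tree can be sketched as follows. This is a simplified illustration, not the method of the cited studies: trees are assumed to be nested tuples of leaf names, both trees are assumed to have the same leaf set (one gene copy per genome), and the helper names `clades`, `splits` and the toy trees are hypothetical. A split missing from the reference tree is only a candidate for further analysis, since it may also reflect tree reconstruction error.

```python
def clades(tree):
    """Return (leaf set, list of clade leaf sets) for a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def splits(tree):
    """Non-trivial leaf bipartitions, each stored as the side that lacks
    the alphabetically first leaf, so that rooting does not matter."""
    all_leaves, found = clades(tree)
    anchor = min(all_leaves)
    result = set()
    for clade in found:
        side = all_leaves - clade if anchor in clade else clade
        # Skip trivial splits that separate zero or one leaf from the rest.
        if 1 < len(side) < len(all_leaves) - 1:
            result.add(side)
    return result


# Hypothetical toy trees over five genomes.
species_tree = ((("A", "B"), "C"), ("D", "E"))
congruent_gene_tree = (("A", "B"), ("C", ("D", "E")))
conflicting_gene_tree = ((("A", "B"), "D"), ("C", "E"))

reference = splits(species_tree)
no_conflict = splits(congruent_gene_tree) - reference
conflict = splits(conflicting_gene_tree) - reference
```

Here `conflict` holds the single split that groups C with E, which the reference tree does not contain, while the differently rooted but congruent gene tree yields an empty set.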
Structural properties of phylogenomic networks
Structural properties of networks can be analyzed and understood using an extensive set of tools developed over the years [24,26]. Node connectivity, for example, is a measure that quantifies the extent to which a node is central within the network [26]. A similar measure, vertex centrality, quantifies the frequency with which the vertex occurs along the shortest paths between vertex pairs in the network. The overall distribution of vertex centrality is commonly used to test for internal
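Both measures can be computed on a small example. The sketch below assumes an unweighted, undirected network stored as a dictionary of neighbor sets, and it counts shortest-path occurrences with Brandes' algorithm (the quantity usually called betweenness centrality). The function names, the toy network and the lack of normalization are illustrative choices, not details taken from the review.

```python
from collections import deque


def degree(adj):
    """Connectivity: number of edges attached to each vertex."""
    return {v: len(neighbors) for v, neighbors in adj.items()}


def betweenness(adj):
    """Shortest-path centrality for an unweighted, undirected graph.

    For each vertex, sums over all other vertex pairs the fraction of
    shortest paths between the pair that pass through it (Brandes'
    algorithm). Each unordered pair is counted once.
    """
    score = {v: 0.0 for v in adj}
    for source in adj:
        dist = {source: 0}
        n_paths = {v: 0 for v in adj}
        n_paths[source] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:  # breadth-first search from the source
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate from the farthest vertices back
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Every pair was visited from both endpoints, so halve the totals.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical toy network: a hub H linked to A, B and C, plus an A-B edge.
toy = {
    "H": {"A", "B", "C"},
    "A": {"H", "B"},
    "B": {"H", "A"},
    "C": {"H"},
}
toy_degree = degree(toy)
toy_betweenness = betweenness(toy)
```

In the toy network only H lies on shortest paths between other vertices (A to C and B to C), so it is the only vertex with a nonzero score.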
Concluding remarks
Each of the different phylogenomic network types presented here offers a different insight into microbial genome evolution. Networks capture the substantial component of genome evolution that is not tree-like in nature. Therefore, in biological systems where reticulated evolutionary events are common, phylogenomic networks offer a general computational approach that is more biologically realistic and evolutionarily more accurate. The prevalence of LGT during microbial and viral evolution makes
Acknowledgments
I thank Giddy Landan, Liat Shavit-Grievink, Ovidiu Popa, Thorsten Klösges and William Martin for their critical comments on the manuscript. This publication was funded in part by an ERC grant NETWORKORIGINS to William Martin.
Glossary
- Conjugation: the transfer of DNA via proteinaceous cell-to-cell junctions in bacteria.
- Degree (or connectivity): the number of edges that connect the node with other nodes.
- Directed network: a network where the entities are connected by asymmetric relationships.
- Edge (or link): related vertices are connected by an edge.
- Gene copy number: the number of copies of a certain gene within the genome.
- Gene transfer agents (GTA): phage-like DNA-carriers that are produced by a donor cell during the growth phase and
References (61)
- et al. Importance of widespread gene transfer agent genes in alpha-proteobacteria. Trends Microbiol. (2007)
- Scale-free characteristics of random networks: the topology of the World-Wide Web. Phys. A (2000)
- Genome-wide dissection of microRNA functions and cotargeting networks using gene set signatures. Mol. Cell (2010)
- et al. A canonical decomposition theory for metrics on a finite set. Adv. Math. (1992)
- Genome history in the symbiotic hybrid Euglena gracilis. Gene (2007)
- Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res. (1998)
- et al. Phylogenomics: intersection of evolution and genomics. Science (2003)
- Lateral gene transfer and the nature of bacterial innovation. Nature (2000)
- Direct visualization of horizontal gene transfer. Science (2008)
- et al. DNA uptake during bacterial transformation. Nat. Rev. Microbiol. (2004)
- Mechanisms of, and barriers to, horizontal gene transfer between bacteria. Nat. Rev. Microbiol.
- The ins and outs of DNA transfer in bacteria. Science
- High frequency of horizontal gene transfer in the oceans. Science
- The net of life: reconstructing the microbial phylogenetic network. Genome Res.
- Ancestral genome sizes specify the minimum rate of lateral gene transfer during prokaryote evolution. Proc. Natl. Acad. Sci. U.S.A.
- Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes. BMC Evol. Biol.
- The cobweb of life revealed by genome-scale estimates of horizontal gene transfer. PLoS Biol.
- Highways of gene sharing in prokaryotes. Proc. Natl. Acad. Sci. U.S.A.
- Genome-wide experimental determination of barriers to horizontal gene transfer. Science
- Horizontal gene transfer among genomes: the complexity hypothesis. Proc. Natl. Acad. Sci. U.S.A.
- The complexity hypothesis revisited: connectivity rather than function constitutes a barrier to horizontal gene transfer. Mol. Biol. Evol.
- Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol.
- A survey of combinatorial methods for phylogenetic networks. Genome Biol. Evol.
- Prokaryotic evolution and the tree of life are two different things. Biol. Direct
- Getting a better picture of microbial evolution en route to a network of genomes. Philos. Trans. R. Soc. Lond. B: Biol. Sci.
- The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. U.S.A.
- The worldwide air transportation network: anomalous centrality, community structure, and cities’ global roles. Proc. Natl. Acad. Sci. U.S.A.
- Exploring complex networks. Nature
Cited by (58)
The past, present and future of the tree of life
2021, Current Biology
Citation excerpt: Many early depictions of the network of life were attempts at imagining what the network might look like rather than the product of bioinformatics [4,5]. The qualitative nature of these images was recognized by their proponents, and so was the need to develop proper tools for building networks from actual genomic data [28–30]. These points of contention (the extent and relevance of lateral gene transfer, the significance of core genes, and the possibility of a statistical tree of life, as well as broad questions pertaining to pluralism and networks) formed the cornerstones of the tree versus network debate as it initially unfolded.
ORF-based binarized structure network analysis of plasmids (OSNAp), a novel approach to core gene-independent plasmid phylogeny
2020, Plasmid
Citation excerpt: This indicates that evolution of plasmids does not necessarily occur as simple branching events from a common ancestor. Therefore, network analysis, which does not require assumption of linear evolution, is more suitable in describing phylogeny of plasmids (Dagan, 2011). Network analysis is increasingly applied to phylogenetic analysis of plasmids (Brilli et al., 2008; Fondi and Fani, 2010; Tamminen et al., 2012).
A Scalable Permutation Approach Reveals Replication and Preservation Patterns of Network Modules in Large Datasets
2016, Cell Systems
Citation excerpt: In general, these approaches represent each measured variable as a node and the relationships between variables as edges that connect nodes; in aggregate, the connected edges and nodes comprise the network. Statistical analysis of these networks can identify and characterize gene modules, gene regulatory networks, protein-protein interactions, microbial networks and predict diverse molecular interactions (Abraham et al., 2014; Barabási et al., 2011; Dagan, 2011; Faust and Raes, 2012; Lusis and Weiss, 2010; Schadt, 2009). Typically, a research project investigates one or more sub-graphs of these inferred networks, for example, a group of genes associated with disease pathogenesis.
Regulation of competence for natural transformation in streptococci
2015, Infection, Genetics and Evolution
Citation excerpt: For almost two decades, the accumulation of sequenced genomes and the study of their phylogenetic relationships (phylogenomics) has revealed the widespread occurrence of lateral gene transfer (LGT) events in the evolution of prokaryote genomes (for a review, see (Dagan, 2011)).
Adapting simultaneous analysis phylogenomic techniques to study complex disease gene relationships
2015, Journal of Biomedical Informatics
Citation excerpt: However, like experiments focused on a single gene or pathway, an isolated phylogenetic analysis may not capture important features of co-evolution or conservation of gene clusters impacting complex disease processes. Additionally, reliance on phylogenetic trees of individual genes may not fully address the potential for genetic changes such as lateral gene transfer, reversion of mutations, or recombination events [17,18]. To account for multiple evolutionary patterns represented by multiple genes, data matrices can be combined into a single phylogenetic analysis through a “simultaneous analysis” (SA) approach [19–21].