Coevolution, modularity and human disease

https://doi.org/10.1016/j.gde.2006.09.001

The concepts of coevolution and modularity have been studied separately for decades. Recent advances in genomics have led to the first systematic studies in each of these fields at the molecular level, resulting in several important discoveries. Both coevolution and modularity appear to be pervasive features of genomic data from all species studied to date, and their presence can be detected in many types of datasets, including genome sequences, gene expression data, and protein–protein interaction data. Moreover, the combination of these two ideas might have implications for our understanding of many aspects of biology, ranging from the general architecture of living systems to the causes of various human diseases.

Introduction

In more than a century of work, numerous examples of phenotypic coevolution — defined as an interaction in which multiple species influence the fitness and evolution of one another — have been recorded. Some well-known examples include the similarity between the shapes of hummingbirds’ beaks and the flowers they pollinate, and the intricate correspondence between different species of fig and fig wasp [1]. Coevolution at the molecular scale is analogous to phenotypic coevolution but involves interacting molecules instead of species (Figure 1a and ‘Coevolution’). Molecular coevolution has been studied for more than 20 years, both between proteins and nucleic acids [2] and between physically interacting proteins [3]. At all scales in biology, coevolution is an important process in determining evolutionary trajectories. Given that every protein, cell and species must intimately interact with some other protein(s), cell(s) or species, all of which are themselves constantly evolving, coevolution is necessary to maintain functional interactions over time.

As with coevolution, modularity (defined as the tendency for functional linkages to occur within but not between semi-autonomous groups of proteins, cells or species, known as ‘modules’) can be seen at almost any spatial scale, ranging from networks of proteins all the way up to entire species (Figure 1b and ‘Modularity’). Unlike coevolution, however, modularity has until recently, with several notable exceptions, been studied only peripherally or indirectly rather than as an explicit object of research. Only within the past 5–10 years has modularity become a widespread topic of research in its own right. Recent work has identified a modular structure within many large-scale datasets, including those concerning protein–protein interactions, synthetic genetic interactions, and gene expression. Modularity appears to be a fundamental aspect of living systems and is thought to be an important determinant of the evolvability (see Glossary) of species.
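One common way to put a number on this definition, borrowed from network science rather than taken from this review, is Newman's modularity score Q: the fraction of links that fall within modules minus the fraction expected if links were placed at random among nodes of the same degree. The sketch below computes Q for a toy network. The node names, links and module assignments are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def modularity(edges, module_of):
    """Newman's Q for an undirected, unweighted network.

    edges: list of (node_a, node_b) pairs, each link listed once.
    module_of: dict mapping every node to its module label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("network has no edges")
    within = defaultdict(int)   # links with both ends in the same module
    degree = defaultdict(int)   # summed node degree per module
    for a, b in edges:
        degree[module_of[a]] += 1
        degree[module_of[b]] += 1
        if module_of[a] == module_of[b]:
            within[module_of[a]] += 1
    # Q = sum over modules of (observed within-fraction - expected fraction)
    return sum(within[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical toy network: two triangles joined by a single link.
edges = [("A1", "A2"), ("A2", "A3"), ("A1", "A3"),
         ("B1", "B2"), ("B2", "B3"), ("B1", "B3"),
         ("A3", "B1")]
modules = {"A1": "A", "A2": "A", "A3": "A", "B1": "B", "B2": "B", "B3": "B"}
q_modular = modularity(edges, modules)

# Same links, but every node lumped into one module: no modular signal.
q_single = modularity(edges, {node: "all" for node in modules})
```

A clearly positive Q for the two-triangle partition, against zero for the single lumped module, reflects links concentrated within rather than between groups. Real interaction datasets would additionally need a method for finding the modules, which this sketch takes as given.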

In this review, I focus on various questions that have been addressed by recent studies; for example, which genes and/or proteins coevolve, and why? How widespread is modularity in various types of protein networks? How do coevolution and modularity affect the evolution and phenotypes of different species, and how are these two phenomena related to one another? And finally, how can these ideas be used to deepen our understanding of how organisms function (and malfunction)?

Coevolution

Several methods have been developed to detect molecular coevolution. These mostly fall into one of two general categories: functional studies or genome sequence analyses.

Functional studies can investigate many types of molecular coevolution. Physically interacting proteins can be shown to coevolve, by demonstrating the presence of interactions between the two proteins in one species, and between the two orthologs (see Glossary) of those proteins in a second species, without any interaction

Modularity

Examples of protein modules, semi-autonomous groups of proteins with a high density of functional linkages within themselves (Figure 1b), abound. Let us consider three specific cases.

Galactose utilization in S. cerevisiae: the small number of proteins required specifically for galactose import and metabolism in this yeast are primarily involved in transcriptional regulation (GAL3, GAL4, GAL6 and GAL80), galactose transport (GAL2) or enzymatic reactions (GAL1, GAL5, GAL7 and GAL10). There is a

Relationships between coevolution and modularity

Coevolution and modularity are related to one another in several respects. First, many functional modules are highly enriched for protein–protein interactions within the module, and such interactions often coevolve because of structural constraints in the two interactors (see ‘Coevolution’). Additionally, the expression levels of proteins within a module might show correlated changes if the function(s) carried out by the module are needed at different frequencies or levels in different species.
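The idea of correlated expression changes can be illustrated with a simple Pearson correlation between two proteins' expression levels across species. This is a minimal sketch under strong assumptions: the expression values are invented, the protein labels are hypothetical, and no correction is made for the shared ancestry of species, which a real analysis would have to address.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical expression levels (arbitrary units) in five species.
protein_a = [1.0, 2.0, 3.0, 4.0, 5.0]   # member of an imagined module
protein_b = [2.1, 3.9, 6.2, 8.0, 9.8]   # same module: tracks protein_a
protein_c = [5.0, 1.0, 4.0, 2.0, 3.0]   # protein outside the module

r_same_module = pearson(protein_a, protein_b)
r_unrelated = pearson(protein_a, protein_c)
```

Under these made-up values, the two module members are far more strongly correlated across species than the unrelated pair, which is the kind of signal a coevolution-of-expression analysis would look for.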

Applications to human disease

Considering the extent to which modularity and coevolution appear to have influenced the evolutionary trajectories and present-day phenotypes all across the tree of life, these concepts could prove to be extremely useful in studying not just basic biology in model systems but also the etiologies of human diseases. In particular, analyses based on the study of coevolution and modularity have the power to reveal functional relationships between proteins, which might make such analyses ideally

Conclusions

Coevolution and modularity are ubiquitous features of life. In the past several years, systematic and unbiased studies of many species have revealed the influence of both of these intimately related factors in many aspects of evolution and present-day organismal function. Perhaps most interestingly, recent studies have begun to show that these concepts might be useful not only for understanding the most basic functions of proteins and cells; they can also be applied to the study of human

Update

Recent work has shown that genes within functional modules in yeast tend to co-evolve [48], confirming predictions from several publications [9•, 33, 34]. Also, a database of human protein–protein interactions has been used to identify candidate disease genes on the basis of both their interactions with known disease genes and their locations in genomic regions implicated in these diseases [49]; it will be interesting to see if searching for densely connected modules of interactions within

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgements

I thank Joshua B Plotkin, Alice S Chen-Plotkin and Eileen M Woo for helpful discussions. HBF is a Merck/Massachusetts Institute of Technology CSBi postdoctoral fellow.

Glossary

Association study
A population-based study in which genotypes at genetic markers throughout part or all of a genome are compared with a phenotype of interest, such as a disease, to find genotype–phenotype correlations implying that a particular genomic region influences a trait.
Bayesian framework
Related to Bayes’ theorem, which is a method for calculating the probability of a hypothesis from a prior probability as well as the likelihood of some additional data under the current hypothesis. Many

References (49)

  • W. Zhong et al. Genome-wide prediction of C. elegans genetic interactions. Science (2006)
  • P. Shannon et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res (2003)
  • J.N. Thompson. The coevolutionary process. (1994)
  • G.A. Dover et al. Molecular coevolution: DNA divergence and the maintenance of function. Cell (1984)
  • T. Gildor et al. Coevolution of cyclin Pcl5 and its substrate Gcn4. Eukaryot Cell (2005)
  • P.J. Shaw et al. Coevolution in Bicoid-dependent promoters and the inception of regulatory incompatibilities among species of higher Diptera. Evol Dev (2002)
  • A.P. Gasch et al. Conservation and evolution of cis-regulatory systems in ascomycete fungi. PLoS Biol (2004)
  • F. Pazos et al. In silico two-hybrid system for the selection of physically interacting protein pairs. Proteins (2002)
  • H.B. Fraser et al. Coevolution of gene expression among interacting proteins. Proc Natl Acad Sci USA (2004)
  • W.K. Kim et al. Large-scale co-evolution analysis of protein structural interlogues using the global protein structural interactome map (PSIMAP). Bioinformatics (2004)
  • G. Lithwick et al. Relative predicted protein levels of functionally associated proteins are conserved across organisms. Nucleic Acids Res (2005)
  • A.I. Shulman et al. Structural determinants of allosteric ligand activation in RXR heterodimers. Cell (2004)
  • W.P. Russ et al. Natural-like function in artificial WW domains. Nature (2005)
  • T. Ideker et al. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science (2001)