  • Review Article

Comparative genomics at the vertebrate extremes

Key Points

  • Distant species comparisons reveal core vertebrate sequences that often function as enhancers.

  • Sequences that are conserved across long evolutionary distances often occur in clusters.

  • Distant species comparisons enable the analysis of a limited subset of genes in the human genome. These genes often have pivotal roles in embryonic development.

  • Comparisons of multiple primate species enable the identification of primate-specific conserved sequences.

  • Comparisons between human and chimpanzee sequences reveal sequence changes that underlie human-specific adaptations.

  • The sequences of many additional vertebrate genomes, as well as extensive information about human sequence polymorphisms, will become available in the near future, which could significantly change the approach to comparative genomics.

Abstract

Annotators of the human genome are increasingly exploiting comparisons with genomes at both the distal and proximal evolutionary edges of the vertebrate tree. Despite the sequence similarity between primates, comparisons among members of this clade are beginning to identify primate- as well as human-specific functional elements. At the distal evolutionary extreme, comparing the human genome to that of non-mammal vertebrates such as fish has proved to be a powerful filter to prioritize sequences that most probably have significant functional activity in all vertebrates.

Figure 1: Architecture of human–Fugu rubripes conserved non-coding sequences in the human genome.
Figure 2: Sonic hedgehog expression in the limbs is regulated by an enhancer at a distance of 1 Mb.

Acknowledgements

We thank Len Pennacchio and members of the Rubin laboratory for useful discussions. Research was conducted at the E. O. Lawrence Berkeley National Laboratory, with support from a grant from the Programs for Genomic Application, NHLBI, and performed under a Department of Energy contract, University of California, USA.

Author information

Corresponding author

Correspondence to Edward M. Rubin.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Related links

DATABASES

Entrez

DACH

even skipped

Hoxb4

LMBR1

LPA

MEF2C

Shh

FURTHER INFORMATION

Encode

Gene Ontology Consortium database

SHADOWER

VISTA Tools

Glossary

NEUTRAL RATE

Genetic variation that does not affect the fitness of the organism is not subject to selection and evolves at the neutral rate.

GENE DESERTS

Gene-poor regions in the genome that are larger than 500 kb. Gene deserts often contain sporadic evidence of transcription.

VISTA

A suite of tools for aligning genomic sequences and visualizing the location of conserved sequences (see the VISTA Tools entry under Related links).

PHYLOGENETIC SHADOWING

An approach that combines comparisons of sequences from multiple, closely related species with a molecular phylogenetic model of sequence evolution to identify significantly conserved elements.

POSITIVE SELECTION

A sequence change in a species that results in increased fitness is subject to positive selection. As a consequence, the change normally becomes fixed, leading to adaptive evolution of that species.
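
The size criterion in the GENE DESERTS entry above can be illustrated with a minimal sketch. It assumes gene annotations are available as (start, end) coordinates on a single chromosome and, as a simplification, reports strictly gene-free gaps rather than gene-poor regions. The function name `find_gene_deserts`, the interval representation and the toy coordinates are hypothetical and are not taken from the article.

```python
from typing import List, Tuple

# (start, end) in base pairs, half-open; a hypothetical representation
Interval = Tuple[int, int]

# 500 kb threshold, taken from the glossary definition of a gene desert
DESERT_MIN_BP = 500_000


def find_gene_deserts(genes: List[Interval], chrom_length: int,
                      min_size: int = DESERT_MIN_BP) -> List[Interval]:
    """Return gene-free gaps larger than min_size on one chromosome.

    Simplification: the glossary speaks of gene-poor regions; this sketch
    only reports gaps that contain no annotated gene at all.
    """
    deserts = []
    prev_end = 0  # end of the right-most gene seen so far
    for start, end in sorted(genes):
        if start - prev_end > min_size:
            deserts.append((prev_end, start))
        # max() handles overlapping or nested gene annotations
        prev_end = max(prev_end, end)
    # Gap between the last gene and the end of the chromosome
    if chrom_length - prev_end > min_size:
        deserts.append((prev_end, chrom_length))
    return deserts


# Toy coordinates, invented for illustration only
genes = [(100_000, 150_000), (900_000, 950_000), (1_000_000, 1_200_000)]
deserts = find_gene_deserts(genes, chrom_length=2_000_000)
print(deserts)  # [(150000, 900000), (1200000, 2000000)]
```

In practice the gene coordinates would come from a genome annotation, and a gene-poor (rather than gene-free) criterion would need an explicit gene-density cutoff, which the glossary does not specify.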

About this article

Cite this article

Boffelli, D., Nobrega, M. & Rubin, E. Comparative genomics at the vertebrate extremes. Nat Rev Genet 5, 456–465 (2004). https://doi.org/10.1038/nrg1350

  • DOI: https://doi.org/10.1038/nrg1350
