Opinion

A gene-centric approach to genome-wide association studies

Abstract

Genic variants are more likely to alter gene function and affect disease risk than those that occur outside genes. Variants in genes, however, might not be sufficiently covered by the existing approaches to genome-wide association studies. Our analysis of the HapMap ENCODE data indicates that this concern is valid, and that an alternative approach that focuses on genic variants provides a more complete coverage of functionally important regions and a greater genotyping efficiency. We therefore argue that resources should be developed to make gene-centric genome-wide association studies feasible.

Figure 1: Approaches to constructing SNP sets for genome-wide association studies.
Figure 2: Coverage of genic and non-genic SNPs by a quasi-random SNP set.
Figure 3: Coverage of genic SNPs by a non-genic SNP set.
Figure 4: Predicted power of genome-wide association studies using a quasi-random SNP set.
Figure 5: Relative efficiencies of approaches to genome-wide association studies.

Acknowledgements

We thank N. Risch, D. Thomas, X. Liu and I. Cheng for their helpful comments, as well as L. Edblad and L. Woldin for assistance in the preparation of the manuscript. We also thank anonymous reviewers for their helpful suggestions. This work was supported by grants from the US National Institutes of Health.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Related links

FURTHER INFORMATION

Affymetrix 500k Array Set

EntrezSNP

Ensembl

HapMap ENCODE

HapMart

International HapMap Project

National Institute of Environmental Health Science Environmental Genome Project

SeattleSNPs Program for Genomic Applications

Tagger

Wellcome Trust Exon Resequencing Project

Witte Laboratory Homepage

Glossary

Genome-wide significance criterion

The level of significance that an association must reach to reject the null hypothesis of no association, taking into account the large number of tests being conducted.
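
As an illustration only, one common way to set such a criterion is the Bonferroni correction, which divides the desired experiment-wide error rate by the number of tests. The glossary does not say which correction the authors use, so the function name and the example figures below (a 0.05 error rate and one million tests) are assumptions.

```python
def bonferroni_threshold(family_wise_alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction.

    Assumption: Bonferroni is used here only as a familiar example;
    the source text does not name a specific correction.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_wise_alpha / n_tests


# Hypothetical study: one million SNPs tested, 0.05 experiment-wide error rate.
threshold = bonferroni_threshold(0.05, 1_000_000)
```

The more tests a study performs, the smaller the per-test threshold becomes, which is why reducing the number of genotyped SNPs can matter for power.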

Linkage analysis

A method for localizing genes that is based on the co-inheritance of genetic markers and phenotypes in families over several generations.

Linkage disequilibrium

The non-random association of alleles of different linked polymorphisms in a population.
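
A minimal sketch of how linkage disequilibrium between two biallelic loci is often quantified, using the standard coefficient D and the squared correlation r². The function name and the haplotype frequencies in the usage lines are hypothetical; the glossary itself does not specify a measure.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # D is the deviation of the observed haplotype frequency from the
    # value expected if the two loci were independent.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom


# Hypothetical frequencies: independent loci versus perfectly correlated loci.
d_indep, r2_indep = ld_stats(0.25, 0.5, 0.5)
d_full, r2_full = ld_stats(0.5, 0.5, 0.5)
```

An r² near 1 means one SNP is a good proxy for the other, which is the basis on which a genotyped SNP can be said to cover an ungenotyped one.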

Minor allele frequency

The frequency of the less-common allele at a polymorphic locus. It has a value that lies between 0 and 0.5, and can vary between populations.
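
The definition can be sketched in a few lines for a biallelic locus, assuming allele counts are available; the function name and the counts are hypothetical.

```python
def minor_allele_frequency(count_allele_1: int, count_allele_2: int) -> float:
    """Frequency of the less-common allele at a biallelic locus."""
    total = count_allele_1 + count_allele_2
    if total == 0:
        raise ValueError("no alleles observed")
    # Taking the smaller count keeps the result between 0 and 0.5.
    return min(count_allele_1, count_allele_2) / total


# Hypothetical sample of 100 chromosomes.
maf = minor_allele_frequency(30, 70)
```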

Multiple-hypothesis testing

The practice of testing more than one hypothesis within an experiment. As a result, the probability of an unusual result from within the entire experiment occurring by chance is higher than the individual probability for one test alone.
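
To make the inflation concrete, the sketch below computes the chance of at least one false positive across several tests. It assumes the tests are independent, which is a simplification (SNPs in linkage disequilibrium are correlated); the function name and the numbers are illustrative.

```python
def family_wise_error_rate(per_test_alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among n independent tests."""
    # Each test avoids a false positive with probability (1 - alpha);
    # under independence these probabilities multiply.
    return 1.0 - (1.0 - per_test_alpha) ** n_tests


# Hypothetical experiment: 20 tests, each at the 0.05 level.
fwer_20 = family_wise_error_rate(0.05, 20)
```

With 20 tests at the 0.05 level, the chance of at least one spurious result is roughly 64% rather than 5%.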

Odds ratio

A measurement of association that is commonly used in case–control studies. It is defined as the odds of exposure to the susceptible genetic variant in cases compared with that in controls. If the odds ratio is statistically significantly greater or less than one, then the genetic variant is associated with the disease.
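
A small sketch of the calculation from a 2×2 case–control table follows. The confidence interval uses the standard log-based (Woolf) approximation, which is an assumption of this example rather than something the glossary prescribes; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_with_ci(
    carrier_cases: int,
    noncarrier_cases: int,
    carrier_controls: int,
    noncarrier_controls: int,
    z: float = 1.96,  # normal quantile for an approximate 95% interval
) -> tuple[float, float, float]:
    """Return (odds_ratio, ci_lower, ci_upper) for a 2x2 table."""
    cells = (carrier_cases, noncarrier_cases, carrier_controls, noncarrier_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    # Odds of carrying the variant in cases divided by the odds in controls.
    odds_ratio = (carrier_cases * noncarrier_controls) / (
        noncarrier_cases * carrier_controls
    )
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 40 of 100 cases and 20 of 100 controls carry the variant.
or_est, or_lower, or_upper = odds_ratio_with_ci(40, 60, 20, 80)
```

If the interval excludes one, as it does for these illustrative counts, the odds ratio differs significantly from one at the corresponding level.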

Power

The probability of rejecting the null hypothesis when it is false. In genome-wide association studies, the null hypothesis is that there is no association between a variant and the phenotype of interest.
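
As a rough illustration, power can be approximated for a two-sided test comparing allele frequencies between cases and controls using a normal approximation. This is a simplified sketch, not the authors' power calculation; the function name, the allele frequencies, the sample sizes and the significance level are all assumptions.

```python
from statistics import NormalDist


def allelic_test_power(
    freq_cases: float,
    freq_controls: float,
    n_case_alleles: int,
    n_control_alleles: int,
    alpha: float,
) -> float:
    """Approximate power of a two-sided z-test on allele frequencies."""
    std_normal = NormalDist()
    # Standard error of the difference between the two sample frequencies.
    se = (
        freq_cases * (1 - freq_cases) / n_case_alleles
        + freq_controls * (1 - freq_controls) / n_control_alleles
    ) ** 0.5
    if se == 0:
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    shift = abs(freq_cases - freq_controls) / se
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical study: 2,000 cases and 2,000 controls (4,000 alleles each),
# tested at a stringent per-test level of 5e-8.
example_power = allelic_test_power(0.35, 0.30, 4000, 4000, 5e-8)
```

When the two frequencies are equal, the function returns alpha, the false-positive rate; power rises as the frequency difference or the sample size grows.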

HapMap

A catalogue of common genetic variation in the human genome that was developed by the International HapMap Project.

About this article

Cite this article

Jorgenson, E., Witte, J. A gene-centric approach to genome-wide association studies. Nat Rev Genet 7, 885–891 (2006). https://doi.org/10.1038/nrg1962
