Abstract
Genetic variation influences gene expression, and this variation in gene expression can be efficiently mapped to specific genomic regions and variants. Here we have used gene expression profiling of Epstein-Barr virus–transformed lymphoblastoid cell lines of all 270 individuals genotyped by the HapMap Consortium to elucidate the detailed features of genetic variation underlying gene expression variation. We find that gene expression is heritable and that differentiation between populations is in agreement with earlier small-scale studies. A detailed association analysis of over 2.2 million common SNPs per population (5% frequency in HapMap) with gene expression identified at least 1,348 genes with association signals in cis and at least 180 in trans. Replication in at least one independent population was achieved for 37% of cis signals and 15% of trans signals. Our results strongly support an abundance of cis-regulatory variation in the human genome. Detection of trans effects is limited but suggests that regulatory variation may be the key primary effect contributing to phenotypic variation in humans. We also explore several methodologies that improve the current state of analysis of gene expression variation.
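To make the idea of a permutation-thresholded association test concrete, the following is a minimal sketch and not the authors' pipeline. It assumes genotypes coded as allele counts (0, 1, 2), a simple linear regression of expression on genotype summarized by R², and a gene-level empirical p-value obtained by shuffling expression values across individuals. The function names, the toy data, and the number of permutations are all hypothetical choices made for illustration.

```python
import random


def r_squared(genotypes, expression):
    """Squared Pearson correlation, i.e. R^2 of a one-predictor linear regression."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    see = sum((e - mean_e) ** 2 for e in expression)
    if sgg == 0 or see == 0:
        return 0.0  # a constant vector carries no association signal
    sge = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    return sge * sge / (sgg * see)


def permutation_p_value(snps, expression, n_perm=1000, seed=0):
    """Return (best observed R^2, empirical p-value) for one gene.

    snps: list of genotype vectors, one per SNP tested against the gene.
    Expression values are shuffled across individuals, which keeps the
    correlation structure among SNPs intact; the null statistic is the
    best R^2 over all SNPs in each permutation.
    """
    rng = random.Random(seed)
    observed = max(r_squared(g, expression) for g in snps)
    shuffled = list(expression)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_best = max(r_squared(g, shuffled) for g in snps)
        if null_best >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data (hypothetical): 60 individuals, one SNP that drives expression
# and one unrelated SNP.
data_rng = random.Random(1)
causal = [data_rng.choice([0, 1, 2]) for _ in range(60)]
unrelated = [data_rng.choice([0, 1, 2]) for _ in range(60)]
expression = [2.0 * g + data_rng.gauss(0, 0.5) for g in causal]

best_r2, p_value = permutation_p_value([causal, unrelated], expression)
significant = p_value <= 0.001  # 0.001 mirrors the threshold named in the supplementary tables
```

Taking the maximum statistic over all SNPs within each permutation is one common way to account for testing many SNPs per gene; a rank-based statistic such as Spearman correlation could be substituted for `r_squared` without changing the permutation logic.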
Acknowledgements
We thank the HapMap Consortium for data availability; M. Smith for assistance with software development; and M. Gibbs, J. Orwick and C. Gerringer for technical support. Funding was provided by the Wellcome Trust (to E.T.D. and P.D.), the US National Institutes of Health ENDGAME (to E.T.D. and S.T.), Cancer Research UK (to S.T.), and the Medical Research Council (to M.D.). S.T. is a Royal Society Wolfson Research Merit Award holder.
Author information
Contributions
B.E.S. performed the majority of the analysis, coordinated the efforts on the project, performed part of the experimental work, and wrote part of the manuscript. E.T.D. and P.D. helped with the analysis, wrote part of the manuscript, and led the project. S.T. and M.D. performed the normalization and helped with statistical analysis. A.C.N., A.D., C.P.B., P.F. and S.M. performed specific parts of the analysis. M.S.F. helped with the analysis and performed part of the experimental work. C.E.I. performed most of the experimental work. C.B. wrote some of the scripts and performed part of the analysis. D.K. provided advice on the permutation analysis.
Supplementary information
Supplementary Text and Figures
Supplementary Figs. 1–6, Supplementary Table 1, and Supplementary Methods (PDF 1606 kb)
Supplementary Table 2
Number and source category of SNPs used in trans analysis. (PDF 833 kb)
Supplementary Table 3
Significant cis- 1 Mb associations, linear regression, individual population analysis, 0.001 permutation threshold. (PDF 1406 kb)
Supplementary Table 4
Significant cis- 1 Mb associations, linear regression, multiple population analysis, 0.001 permutation threshold. (PDF 647 kb)
Supplementary Table 5
Significant cis- 1 Mb associations, Spearman rank correlation, individual population analysis, 0.001 permutation threshold. (PDF 27 kb)
Supplementary Table 6
Significant trans associations, linear regression, individual population analysis, 0.001 permutation threshold. (PDF 44 kb)
About this article
Cite this article
Stranger, B., Nica, A., Forrest, M. et al. Population genomics of human gene expression. Nat Genet 39, 1217–1224 (2007). https://doi.org/10.1038/ng2142
DOI: https://doi.org/10.1038/ng2142