An imputed genotype resource for the laboratory mouse

Mammalian Genome

Abstract

We have created a high-density SNP resource encompassing 7.87 million polymorphic loci across 49 inbred strains of the laboratory mouse by combining data from public databases and training a hidden Markov model to impute missing genotypes in the combined data. The strong linkage disequilibrium found in dense sets of SNP markers in the laboratory mouse provides the basis for accurate imputation. Using genotypes from eight independent SNP resources, we empirically validated the quality of the imputed genotypes and demonstrated that they are highly reliable for most inbred strains. The imputed SNP resource will be useful for studies of natural variation and complex traits, and it will facilitate association study designs by providing high-density SNP genotypes for large numbers of mouse strains. We anticipate that this resource will continue to evolve as new genotype data become available for laboratory mouse strains. The data are available for bulk download or query at http://cgd.jax.org/.
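The idea of imputing missing genotypes with a hidden Markov model can be illustrated with a toy sketch. This is not the model trained for the resource, whose details are not given in the abstract. It is a simplified "copying" HMM in which each hidden state is a fully genotyped reference strain and the most likely path is found by Viterbi decoding. The function name `impute_by_copying`, the parameters `switch` and `error`, and the example data are all hypothetical and chosen for illustration only.

```python
import math


def impute_by_copying(target, references, switch=0.05, error=0.01):
    """Fill None entries in `target` using a toy copying HMM.

    Hidden states are the reference strains (assumption: fully genotyped).
    `switch` is an assumed per-site probability of changing the copied strain;
    `error` is an assumed probability that an observed call disagrees with it.
    """
    n_states = len(references)
    n_sites = len(target)
    if n_states < 2:
        raise ValueError("need at least two reference strains")

    log_stay = math.log(1.0 - switch)
    log_switch = math.log(switch / (n_states - 1))

    def log_emit(state, site):
        observed = target[site]
        if observed is None:
            return 0.0  # a missing call is equally compatible with every state
        if observed == references[state][site]:
            return math.log(1.0 - error)
        return math.log(error)

    def log_trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Viterbi forward pass with a uniform prior over reference strains.
    score = [-math.log(n_states) + log_emit(s, 0) for s in range(n_states)]
    backpointers = []
    for site in range(1, n_sites):
        new_score, pointers = [], []
        for cur in range(n_states):
            best_prev = max(
                range(n_states),
                key=lambda prev: score[prev] + log_trans(prev, cur),
            )
            new_score.append(
                score[best_prev] + log_trans(best_prev, cur) + log_emit(cur, site)
            )
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Trace back the most likely sequence of copied strains.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()

    # Keep observed calls; fill gaps from the strain copied at that site.
    return [
        references[path[i]][i] if target[i] is None else target[i]
        for i in range(n_sites)
    ]


if __name__ == "__main__":
    refs = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]]  # two made-up strains
    print(impute_by_copying([0, None, 0, 1, None, 1], refs))
```

In this made-up example the decoded path follows the first reference strain for three sites and the second for the remaining three, so the two gaps are filled with 0 and 1 respectively. The sketch relies on the same property the abstract names: neighbouring SNPs are strongly correlated, so a strain's observed calls near a gap indicate which local haplotype it carries.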

Acknowledgments

This work was supported by the U.S. National Institute of General Medical Sciences as part of the Center of Excellence in Systems Biology (1P50 GM076468). The authors thank Tim Wiltshire for sharing genotyping data prior to its publication and Jesse Hammer and Susan Moxley for graphics assistance.

Author information

Corresponding author

Correspondence to Gary A. Churchill.

Electronic supplementary material

Below is the link to the electronic supplementary material.

(DOC 89 kb)

About this article

Cite this article

Szatkiewicz, J.P., Beane, G.L., Ding, Y. et al. An imputed genotype resource for the laboratory mouse. Mamm Genome 19, 199–208 (2008). https://doi.org/10.1007/s00335-008-9098-9

  • DOI: https://doi.org/10.1007/s00335-008-9098-9
