Abstract
Amplification artifacts introduced during library preparation for the Illumina Genome Analyzer increase the likelihood that an appreciable proportion of the resulting sequences will be duplicates and cause an uneven distribution of read coverage across the targeted sequencing regions. These features complicate genome assembly and variation analysis from short reads, particularly when the sequences come from genomes with base compositions at the extremes of high or low G+C content. Here we present an amplification-free method of library preparation in which the cluster amplification step, rather than PCR, enriches for fully ligated template strands, reducing the incidence of duplicate sequences, improving read mapping and single nucleotide polymorphism calling, and aiding de novo assembly. We illustrate this by generating and analyzing DNA sequences from extremely (G+C)-poor (Plasmodium falciparum), (G+C)-neutral (Escherichia coli) and (G+C)-rich (Bordetella pertussis) genomes.
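The two quantities the abstract turns on, G+C content and the proportion of duplicate sequences, can be illustrated with a minimal sketch. This is not the authors' analysis pipeline: the function names `gc_fraction` and `duplicate_fraction` are hypothetical, and duplicates are assumed here to be reads with identical sequence, whereas mapping-based pipelines typically define duplicates by alignment position.

```python
from collections import Counter


def gc_fraction(seq: str) -> float:
    """Fraction of G or C among unambiguous (A/C/G/T) bases.

    Ambiguous bases such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    acgt = gc + sum(seq.count(base) for base in "AT")
    return gc / acgt if acgt else 0.0


def duplicate_fraction(reads: list[str]) -> float:
    """Fraction of reads that are exact copies of another read in the set.

    For each distinct sequence seen n times, n - 1 reads count as duplicates.
    """
    if not reads:
        return 0.0
    counts = Counter(reads)
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(reads)
```

For example, `gc_fraction("ATGC")` evaluates to 0.5, and `duplicate_fraction(["AC", "AC", "GT", "AC"])` evaluates to 0.5, because two of the four reads repeat a sequence already present.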
Acknowledgements
This work was supported by the Wellcome Trust (grant WT079643). We thank C. Newbold and S. Kyes (University of Oxford) for providing DNA from P. falciparum 3D7.
Author information
Contributions
I.K. planned and performed experiments; Z.N., M.J.S. and M.B. analyzed data; M.A.Q. prepared standard sequencing libraries; I.K. and D.J.T. devised the project; D.J.T., Z.N. and I.K. wrote the manuscript.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–2, Supplementary Table 1, Supplementary Methods (PDF 1188 kb)
About this article
Cite this article
Kozarewa, I., Ning, Z., Quail, M. et al. Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes. Nat Methods 6, 291–295 (2009). https://doi.org/10.1038/nmeth.1311
DOI: https://doi.org/10.1038/nmeth.1311