Genome-Wide Analysis of Repetitive Elements in Papaya

Tropical Plant Biology

Abstract

Papaya (Carica papaya L.) is an important fruit crop cultivated in tropical and subtropical regions worldwide. A first draft of its genome sequence was recently released. Together with the Arabidopsis, rice, poplar and grapevine genomes, and others in the pipeline, it offers a good opportunity to gain insight into the organization of plant genomes. Here we report a detailed analysis of repetitive elements in the papaya genome, including transposable elements (TEs), tandemly arrayed sequences, and high copy number genes. These repetitive sequences account for ∼56% of the papaya genome: TEs are the most abundant at 52%, followed by high copy number genes at 3% and tandem repeats at 1.3%. Most common types of TEs are represented in the papaya genome, with retrotransposons the dominant class, accounting for 40% of the genome. The most prevalent retrotransposons are Ty3-gypsy (27.8%) and Ty1-copia (5.5%). Among the tandem repeats, microsatellites are the most abundant in number but represent only 0.19% of the genome. Minisatellites and satellites are less numerous but represent 0.68% and 0.43% of the genome, respectively, because their repeat units are longer. Although papaya has a smaller overall gene repertoire than many other angiosperms, a significant fraction of its genes (>2%) belong to large gene families with a copy number greater than 20. This repeat database clarifies a major part of the papaya genome organization and partly explains why papaya has a smaller gene repertoire than Arabidopsis.
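As a quick consistency check on the figures quoted in the abstract, the class-level percentages can be summed and compared with the stated totals. The following is a minimal sketch: the dictionary and variable names are illustrative assumptions, and only the numbers themselves come from the abstract.

```python
# Genome fractions (percent of the papaya genome) as quoted in the abstract.
# The dictionary and variable names are illustrative, not from the paper.
repeat_classes = {
    "transposable elements": 52.0,
    "tandem repeats": 1.3,
    "high copy number genes": 3.0,
}

tandem_repeat_types = {
    "microsatellites": 0.19,
    "minisatellites": 0.68,
    "satellites": 0.43,
}

# Sum of the three repeat classes; the abstract reports this as "~56%".
total_repeats = round(sum(repeat_classes.values()), 2)

# Sum of the three tandem-repeat types; the abstract reports 1.3%.
tandem_total = round(sum(tandem_repeat_types.values()), 2)

print(f"All repeat classes: {total_repeats}% of the genome")
print(f"Tandem repeats:     {tandem_total}% of the genome")
```

The three tandem-repeat types add up to the 1.3% quoted for tandem repeats as a whole, and the three classes together give the rounded ∼56% total.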

Notes

  1. We use the estimate from Table 2 because a large fraction of the retrotransposon matches in Table 1 have not been classified.

Acknowledgements

We thank Ning Jiang for suggestions and discussion on the construction and characterization of the papaya repeat database. We appreciate financial support from the U.S. National Institutes of Health (R01-GM083873 to S.S.), the U.S. National Science Foundation (DBI-0553417 to R.M. and A.H.P.), and the U. Hawaii and U.S. Department of Defense (W81XWH0520013 to M.A.).

Author information

Corresponding author

Correspondence to Niranjan Nagarajan.

Additional information

Nagarajan and Navajas-Pérez contributed equally to this work.

About this article

Cite this article

Nagarajan, N., Navajas-Pérez, R., Pop, M. et al. Genome-Wide Analysis of Repetitive Elements in Papaya. Tropical Plant Biol. 1, 191–201 (2008). https://doi.org/10.1007/s12042-008-9015-0

  • DOI: https://doi.org/10.1007/s12042-008-9015-0
