Gene Discovery Using Computational and Microarray Analysis of Transcription in the Drosophila melanogaster Testis

  1. Justen Andrews1,
  2. Gerard G. Bouffard2,
  3. Chris Cheadle3,
  4. Jining Lü1,
  5. Kevin G. Becker3, and
  6. Brian Oliver1,4
  1. Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
  2. Bioinformatics Group, National Institute of Health Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Gaithersburg, Maryland 20877, USA
  3. DNA Array Unit, National Institute on Aging, National Institutes of Health, Baltimore, Maryland 21224, USA

Abstract

Identification and annotation of all the genes in the sequenced Drosophila genome is a work in progress. Wild-type testis function requires many genes and is thus of potentially high value for the identification of transcription units. We therefore undertook a survey of the repertoire of genes expressed in the Drosophila testis by computational and microarray analysis. We generated 3141 high-quality testis expressed sequence tags (ESTs). The testis ESTs computationally collapsed into a set of 1560 cDNAs that was used for further analysis. Of those, 11% correspond to named genes, and 33% provide biological evidence for a predicted gene. A surprising 47% fail to align with existing ESTs, and 16% fail to align with predicted genes in the current genome release. EST frequency and microarray expression profiles indicate that the testis mRNA population is highly complex and shows an extended range of transcript abundance. Furthermore, >80% of the genes expressed in the testis showed onefold overexpression relative to ovaries or gonadectomized flies. Additionally, >3% showed more than threefold overexpression at p < 0.05. Surprisingly, 22% of the genes most highly overexpressed in testis match Drosophila genomic sequence, but not predicted genes. These data strongly support the idea that sequencing additional cDNA libraries from defined tissues, such as testis, will be an important tool for refined annotation of the Drosophila genome. Additionally, these data suggest that the number of genes in Drosophila will significantly exceed the conservative estimate of 13,601.

[The sequence data described in this paper have been submitted to the dbEST data library under accession nos. AI944400–AI947263 and BE661985–BE662262.]

[The microarray data described in this paper have been submitted to the GEO data library under accession nos. GPLS, GSM3–GSM10.]

Footnotes

  • 4 Corresponding author.

  • E-MAIL oliver{at}helix.nih.gov; FAX (301) 496-5239.

  • Article and publication are at www.genome.org/cgi/doi/10.1101/gr.159800.

    • Received August 10, 2000.
    • Accepted October 12, 2000.