REGANOR

A Gene Prediction Server for Prokaryotic Genomes and a Database of High Quality Gene Predictions for Prokaryotes

  • Application Note
Applied Bioinformatics

Abstract

With >1000 prokaryotic genome sequencing projects ongoing or already finished, comprehensive comparative analysis of the gene content of these genomes has become viable. For such an analysis to be meaningful, gene prediction for the various genomes should be as accurate as possible. Improving the state of genome annotation requires automated gene identification methods that cope with the influence of artifacts, such as genomic GC content, and current annotations still leave room for improvement.

We present a web server and a database of high-quality gene predictions. The web server is a resource for gene identification in prokaryotic genome sequences and implements REGANOR, our previously described, accurate gene finding method. We also provide novel gene predictions for 241 complete, or almost complete, prokaryotic genomes. Using several examples, we demonstrate how this resource can easily be used to identify promising candidates for genes currently missing from genome annotations. All data sets are available online.

Fig. 1
Table I
Fig. 2
Fig. 3

Acknowledgements

The authors would like to thank Ross Overbeek and Gordon Pusch for initiating the project and for their valuable comments and discussion. We would also like to thank Niels Larsen for making the SearchforRNAs program available to us. Burkhard Linke is funded by the Deutsche Forschungsgemeinschaft (DFG PU28/25-3). Lutz Krause is supported by the DFG Graduiertenkolleg 635 Bioinformatik. Heiko Neuweger is funded by the EU (GOCECT-2004-505403).

The authors have no conflicts of interest that are directly relevant to the content of this article.

Author information

Corresponding author

Correspondence to Folker Meyer.

Additional information

Availability: The gene finding server is accessible via https://www.cebitec.uni-bielefeld.de/groups/brf/software/reganor/cgi-bin/reganor_upload.cgi. The server software is available with the GenDB genome annotation system (version 2.2.1 onwards) under the GNU General Public License. The software can be downloaded from https://sourceforge.net/projects/gendb/. More information on installing GenDB and REGANOR, and on the system requirements, can be found on the GenDB project page: http://www.cebitec.uni-bielefeld.de/groups/brf/software/wiki/GenDBWiki/AdministratorDocumentation/GenDBInstallation

These authors contributed equally to this article.

About this article

Cite this article

Linke, B., McHardy, A.C., Neuweger, H. et al. REGANOR. Appl-Bioinformatics 5, 193–198 (2006). https://doi.org/10.2165/00822942-200605030-00008

  • DOI: https://doi.org/10.2165/00822942-200605030-00008
