  • Brief Communication

GenePRIMP: a gene prediction improvement pipeline for prokaryotic genomes

Abstract

We present 'gene prediction improvement pipeline' (GenePRIMP; http://geneprimp.jgi-psf.org/), a computational process that performs evidence-based evaluation of gene models in prokaryotic genomes and reports anomalies including inconsistent start sites, missed genes and split genes. We found that manual curation of gene models using the anomaly reports generated by GenePRIMP improved their quality, and demonstrate the applicability of GenePRIMP in improving finishing quality and comparing different genome-sequencing and annotation technologies.
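As a rough illustration of what an evidence-based anomaly report might contain, the sketch below flags two of the anomaly types named in the abstract: candidate split genes and candidate missed genes. It is not GenePRIMP's actual method, which this page does not describe. The `GeneModel` fields, the function names, the rule for split genes (adjacent same-strand genes whose best hits are non-overlapping parts of one reference protein) and the 300-bp intergenic threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GeneModel:
    """A predicted gene. All field names are hypothetical."""
    gene_id: str
    start: int                                   # 1-based, inclusive
    end: int                                     # 1-based, inclusive
    strand: str                                  # "+" or "-"
    best_hit: Optional[str] = None               # ID of best-matching reference protein
    hit_range: Optional[Tuple[int, int]] = None  # aligned region on that protein


def candidate_split_genes(genes: List[GeneModel]) -> List[Tuple[str, str]]:
    """Flag neighbouring same-strand genes whose best hits are the same
    reference protein but cover non-overlapping parts of it (assumed rule)."""
    ordered = sorted(genes, key=lambda g: g.start)
    flagged = []
    for left, right in zip(ordered, ordered[1:]):
        if left.strand != right.strand:
            continue
        if left.best_hit is None or left.best_hit != right.best_hit:
            continue
        if left.hit_range is None or right.hit_range is None:
            continue
        first, second = sorted([left.hit_range, right.hit_range])
        if first[1] < second[0]:  # aligned regions on the reference do not overlap
            flagged.append((left.gene_id, right.gene_id))
    return flagged


def candidate_missed_genes(genes: List[GeneModel],
                           min_gap: int = 300) -> List[Tuple[int, int]]:
    """Flag intergenic regions of at least min_gap bp (assumed threshold)
    as places where a gene may have been missed."""
    ordered = sorted(genes, key=lambda g: g.start)
    gaps = []
    covered_to = ordered[0].end if ordered else 0
    for gene in ordered[1:]:
        gap_start, gap_end = covered_to + 1, gene.start - 1
        if gap_end - gap_start + 1 >= min_gap:
            gaps.append((gap_start, gap_end))
        covered_to = max(covered_to, gene.end)
    return gaps


# Toy gene calls, invented for the example.
genes = [
    GeneModel("g1", 100, 400, "+", "refA", (1, 100)),
    GeneModel("g2", 420, 900, "+", "refA", (105, 260)),
    GeneModel("g3", 1500, 2100, "-", "refB", (1, 200)),
]

print(candidate_split_genes(genes))
print(candidate_missed_genes(genes))
```

In this toy input, g1 and g2 are reported as a candidate split gene pair and the region between g2 and g3 as a candidate missed-gene region. In a workflow like the one the abstract describes, such reports would be passed to a curator for manual review and would not be applied automatically.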

Figure 1: GenePRIMP analysis of gene calls in the M. palustris genome by three gene callers.
Figure 2: The GenePRIMP processing pipeline.

Acknowledgements

We acknowledge the help and support of I. Anderson, K. Mavromatis, X. Zhao and V. Markowitz. GenePRIMP was developed under the auspices of the US Department of Energy's Office of Science, Biological and Environmental Research Program and by the University of California, Lawrence Berkeley National Laboratory under contract DE-AC02-05CH11231, Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 and Los Alamos National Laboratory under contract DE-AC02-06NA25396. Validation and improvement of the system were supported by US National Institutes of Health Data Analysis and Coordination Center contract U01-HG004866. The work conducted by the US Department of Energy Joint Genome Institute is supported by the Office of Science of the US Department of Energy under contract DE-AC02-05CH11231.

Author information

Contributions

N.N.I. and N.C.K. conceived the initial approach. N.N.I. and A.P. designed the system. A.P. implemented the GenePRIMP code base and web portal. S.D.H. contributed to the development of the web portal. N.N.I., N.M., G.O. and A.L. manually curated the genomes sequenced at the Department of Energy Joint Genome Institute and contributed to testing and validation.

Corresponding author

Correspondence to Amrita Pati.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–9, Supplementary Table 1 and Supplementary Data 1–5 (PDF 3711 kb)

About this article

Cite this article

Pati, A., Ivanova, N., Mikhailova, N. et al. GenePRIMP: a gene prediction improvement pipeline for prokaryotic genomes. Nat Methods 7, 455–457 (2010). https://doi.org/10.1038/nmeth.1457

  • DOI: https://doi.org/10.1038/nmeth.1457
