Prediction of small, noncoding RNAs in bacteria using heterogeneous data

Journal of Mathematical Biology

Abstract

sRNAFinder is a new gene prediction system for systematic identification of noncoding genes in bacteria. Most noncoding RNAs in prokaryotes belong to a class of genes denoted as small RNAs (sRNAs). In the model organism Escherichia coli, over 70 sRNA genes have been identified, and the existence of many more has been hypothesized. While various sources of information have proven useful for prediction of novel sRNA genes, most computational approaches do not take advantage of the disparate sources of data available for identifying these noncoding RNA genes. We present a general probabilistic method for predicting sRNA genes in bacteria. The method, based on a general Markov model, is implemented in the computational tool sRNAFinder. sRNAFinder incorporates heterogeneous data sources for gene prediction, including primary sequence data, transcript expression data from microarray experiments, and conserved RNA structure information as determined from comparative genomics analysis. We demonstrate that sRNAFinder improves upon current tools for identifying small, noncoding genes in bacteria.
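
The abstract describes the method only at a high level. As a rough illustration of how a Markov-model gene finder might combine heterogeneous evidence, the sketch below labels each genomic position as "background" or "sRNA" with Viterbi decoding over a two-state hidden Markov model. It is not sRNAFinder's actual model: the state names, the three binary indicators (sequence signal, expression, conservation), every probability value, and the assumption that the evidence sources are conditionally independent given the state are all hypothetical choices made for the example.

```python
import math

STATES = ("background", "sRNA")

# Hypothetical probability that each binary evidence source fires (value 1)
# in each state. Order: (sequence signal, expressed, conserved structure).
P_ON = {"background": (0.2, 0.2, 0.2), "sRNA": (0.8, 0.8, 0.8)}

# Hypothetical transition and initial probabilities, stored as logs.
LOG_TRANS = {
    "background": {"background": math.log(0.9), "sRNA": math.log(0.1)},
    "sRNA": {"background": math.log(0.1), "sRNA": math.log(0.9)},
}
LOG_INIT = {"background": math.log(0.9), "sRNA": math.log(0.1)}


def log_emission(state, obs):
    """Sum the per-source log-likelihoods of one position's evidence.

    Summing logs corresponds to multiplying probabilities, i.e. treating
    the sources as conditionally independent given the state (an assumption).
    """
    total = 0.0
    for p_on, value in zip(P_ON[state], obs):
        total += math.log(p_on if value else 1.0 - p_on)
    return total


def viterbi(observations):
    """Return the most probable state path for a list of evidence tuples."""
    score = [{s: LOG_INIT[s] + log_emission(s, observations[0]) for s in STATES}]
    back = [{}]
    for t in range(1, len(observations)):
        score.append({})
        back.append({})
        for s in STATES:
            # Best predecessor state for being in state s at position t.
            prev = max(STATES, key=lambda p: score[t - 1][p] + LOG_TRANS[p][s])
            score[t][s] = (score[t - 1][prev] + LOG_TRANS[prev][s]
                           + log_emission(s, observations[t]))
            back[t][s] = prev
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for t in range(len(observations) - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    path.reverse()
    return path


# Toy input: ten positions, with evidence concentrated at positions 3-6.
# Position 4 lacks expression evidence but is still supported by the others.
toy = [(0, 0, 0)] * 3 + [(1, 1, 1), (1, 0, 1), (1, 1, 1), (1, 1, 1)] + [(0, 0, 0)] * 3
print(viterbi(toy))
```

On the toy input, positions 3 through 6 are decoded as "sRNA" even though one of them is missing expression evidence, which is the practical benefit of pooling several data sources rather than relying on any single one.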

Author information

Correspondence to Brian Tjaden.

About this article

Cite this article

Tjaden, B. Prediction of small, noncoding RNAs in bacteria using heterogeneous data. J. Math. Biol. 56, 183–200 (2008). https://doi.org/10.1007/s00285-007-0079-5

  • DOI: https://doi.org/10.1007/s00285-007-0079-5
