  • Brief Communication

Metagenomic microbial community profiling using unique clade-specific marker genes

Abstract

Metagenomic shotgun sequencing data can identify microbes populating a microbial community and their proportions, but existing taxonomic profiling methods are inefficient for increasingly large data sets. We present an approach that uses clade-specific marker genes to unambiguously assign reads to microbial clades more accurately and >50× faster than current approaches. We validated our metagenomic phylogenetic analysis tool, MetaPhlAn, on terabases of short reads and provide the largest metagenomic profiling to date of the human gut. It can be accessed at http://huttenhower.sph.harvard.edu/metaphlan/.
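The idea of assigning reads through clade-specific markers can be illustrated with a small sketch. It is not MetaPhlAn's implementation. It assumes reads have already been aligned to a marker catalog, and it assumes a simple abundance estimate: reads per clade divided by that clade's total marker length, rescaled to sum to 1. The function name `relative_abundance` and all marker and clade names are hypothetical.

```python
from collections import defaultdict


def relative_abundance(read_hits, marker_to_clade, marker_lengths):
    """Toy estimate of clade relative abundance from read-to-marker hits.

    read_hits:       iterable of marker IDs, one entry per aligned read
    marker_to_clade: dict mapping each marker ID to the single clade it is unique to
    marker_lengths:  dict mapping each marker ID to its length in nucleotides
    """
    # Each marker belongs to exactly one clade, so a hit assigns the read
    # to that clade without ambiguity. Hits to unknown markers are ignored.
    reads_per_clade = defaultdict(int)
    for marker in read_hits:
        clade = marker_to_clade.get(marker)
        if clade is not None:
            reads_per_clade[clade] += 1

    # Total marker sequence available per clade, used for normalization
    # (an assumption of this sketch, so clades with more marker sequence
    # are not over-counted).
    length_per_clade = defaultdict(int)
    for marker, clade in marker_to_clade.items():
        length_per_clade[clade] += marker_lengths[marker]

    coverage = {
        clade: count / length_per_clade[clade]
        for clade, count in reads_per_clade.items()
    }
    total = sum(coverage.values())
    if total == 0:
        return {}
    return {clade: value / total for clade, value in coverage.items()}


if __name__ == "__main__":
    # Hypothetical catalog: two markers for clade_A, one for clade_B.
    markers = {"m1": "clade_A", "m2": "clade_A", "m3": "clade_B"}
    lengths = {"m1": 1000, "m2": 1000, "m3": 500}
    hits = ["m1", "m2", "m1", "m2", "m3", "not_a_marker"]
    print(relative_abundance(hits, markers, lengths))
```

Because each marker maps to one clade, the per-read work is a single lookup, which gives an intuition for why a marker-based approach can be much faster than comparing every read against full reference genomes.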

Figure 1: Comparison of MetaPhlAn to existing methods.
Figure 2: Composition of healthy vaginal microbiota.
Figure 3: The gut microbiota in asymptomatic Western populations as inferred by MetaPhlAn on 224 samples combining the HMP and MetaHIT cohorts.

Acknowledgements

We would like to thank F. Stewart and E. DeLong for their helpful input during this study; D. Gevers, S. Sykes and K. Huang for their feedback on the methodology and J. Reyes and G. Weingart for their assistance with the implementation. This work was supported by US National Institutes of Health grant 1R01HG005969 and National Science Foundation grant DBI-1053486 to C.H.

Author information

Contributions

N.S., A.B., O.J. and C.H. conceived the method; N.S. implemented the software; N.S. and C.H. performed the experiments; N.S., L.W., V.N. and C.H. analyzed the data; and N.S. and C.H. wrote the manuscript.

Corresponding author

Correspondence to Curtis Huttenhower.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–7, Supplementary Tables 1 and 2 and Supplementary Notes 1–3 (PDF 2101 kb)

About this article

Cite this article

Segata, N., Waldron, L., Ballarini, A. et al. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat Methods 9, 811–814 (2012). https://doi.org/10.1038/nmeth.2066

  • DOI: https://doi.org/10.1038/nmeth.2066
