Approaching a complete repository of sequence-verified protein-encoding clones for Saccharomyces cerevisiae

  1. Yanhui Hu1,
  2. Andreas Rolfs1,
  3. Bhupinder Bhullar1,
  4. Tellamraju V. S. Murthy1,
  5. Cong Zhu2,
  6. Michael F. Berger2,3,
  7. Anamaria A. Camargo4,
  8. Fontina Kelley1,
  9. Seamus McCarron1,
  10. Daniel Jepson1,
  11. Aaron Richardson1,
  12. Jacob Raphael1,
  13. Donna Moreira1,
  14. Elena Taycher1,
  15. Dongmei Zuo1,
  16. Stephanie Mohr5,
  17. Michael F. Kane6,
  18. Janice Williamson1,
  19. Andrew Simpson7,
  20. Martha L. Bulyk2,3,8,9,
  21. Edward Harlow1,
  22. Gerald Marsischky1,
  23. Richard D. Kolodner6, and
  24. Joshua LaBaer1,5,10
  1. 1 Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA;
  2. 2 Division of Genetics, Department of Medicine, Brigham & Women’s Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA;
  3. 3 Harvard University Graduate Biophysics Program, Cambridge, Massachusetts 02138, USA;
  4. 4 Ludwig Institute for Cancer Research, Sao Paulo SP Brazil 01509-010;
  5. 5 DF/HCC DNA Resource Core, Harvard Medical School, Cambridge, Massachusetts 02141, USA;
  6. 6 Ludwig Institute for Cancer Research, University of California San Diego, School of Medicine, La Jolla, California 92093, USA;
  7. 7 Ludwig Institute for Cancer Research, New York, New York 10158, USA;
  8. 8 Department of Pathology, Brigham & Women’s Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA;
  9. 9 Harvard-MIT Division of Health Sciences & Technology (HST), Harvard Medical School, Boston, Massachusetts 02115, USA

Abstract

The availability of an annotated genome sequence for the yeast Saccharomyces cerevisiae has made possible the proteome-scale study of protein function and protein–protein interactions. These studies rely on the availability of cloned open reading frame (ORF) collections that can be used for cell-free or cell-based protein expression. Several yeast ORF collections are available, but their use and data interpretation can be hindered by reliance on now out-of-date annotations, the inflexible presence of N- or C-terminal tags, and/or the unknown presence of mutations introduced during the cloning process. High-throughput biochemical and genetic analyses would benefit from a “gold standard” (fully sequence-verified, high-quality) ORF collection, which allows for high confidence in and reproducibility of experimental results. Here, we describe Yeast FLEXGene, a S. cerevisiae protein-coding clone collection that covers over 5000 predicted protein-coding sequences. The clone set covers 87% of the current S. cerevisiae genome annotation and includes full sequencing of each ORF insert. Availability of this collection makes possible a wide variety of studies from purified proteins to mutation suppression analysis, which should contribute to a global understanding of yeast protein function.

Footnotes

  • 10 Corresponding author.

    10 E-mail Joshua_labaer{at}hms.harvard.edu; fax (617) 324-0824.

  • [Supplemental material is available online at www.genome.org]

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6037607

    • Received October 13, 2006.
    • Accepted January 3, 2007.