Approaching a complete repository of sequence-verified protein-encoding clones for Saccharomyces cerevisiae
- Yanhui Hu1,
- Andreas Rolfs1,
- Bhupinder Bhullar1,
- Tellamraju V. S. Murthy1,
- Cong Zhu2,
- Michael F. Berger2,3,
- Anamaria A. Camargo4,
- Fontina Kelley1,
- Seamus McCarron1,
- Daniel Jepson1,
- Aaron Richardson1,
- Jacob Raphael1,
- Donna Moreira1,
- Elena Taycher1,
- Dongmei Zuo1,
- Stephanie Mohr5,
- Michael F. Kane6,
- Janice Williamson1,
- Andrew Simpson7,
- Martha L. Bulyk2,3,8,9,
- Edward Harlow1,
- Gerald Marsischky1,
- Richard D. Kolodner6, and
- Joshua LaBaer1,5,10
- 1 Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA;
- 2 Division of Genetics, Department of Medicine, Brigham & Women’s Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA;
- 3 Harvard University Graduate Biophysics Program, Cambridge, Massachusetts 02138, USA;
- 4 Ludwig Institute for Cancer Research, Sao Paulo SP Brazil 01509-010;
- 5 DF/HCC DNA Resource Core, Harvard Medical School, Cambridge, Massachusetts 02141, USA;
- 6 Ludwig Institute for Cancer Research, University of California San Diego, School of Medicine, La Jolla, California 92093, USA;
- 7 Ludwig Institute for Cancer Research, New York, New York 10158, USA;
- 8 Department of Pathology, Brigham & Women’s Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA;
- 9 Harvard-MIT Division of Health Sciences & Technology (HST), Harvard Medical School, Boston, Massachusetts 02115, USA
Abstract
The availability of an annotated genome sequence for the yeast Saccharomyces cerevisiae has made possible the proteome-scale study of protein function and protein–protein interactions. These studies rely on the availability of cloned open reading frame (ORF) collections that can be used for cell-free or cell-based protein expression. Several yeast ORF collections are available, but their use and data interpretation can be hindered by reliance on now out-of-date annotations, the inflexible presence of N- or C-terminal tags, and/or the unknown presence of mutations introduced during the cloning process. High-throughput biochemical and genetic analyses would benefit from a “gold standard” (fully sequence-verified, high-quality) ORF collection, which allows for high confidence in and reproducibility of experimental results. Here, we describe Yeast FLEXGene, a S. cerevisiae protein-coding clone collection that covers over 5000 predicted protein-coding sequences. The clone set covers 87% of the current S. cerevisiae genome annotation and includes full sequencing of each ORF insert. Availability of this collection makes possible a wide variety of studies, from analysis of purified proteins to mutation suppression analysis, which should contribute to a global understanding of yeast protein function.
Footnotes
- 10 Corresponding author. E-mail Joshua_labaer{at}hms.harvard.edu; fax (617) 324-0824.
- [Supplemental material is available online at www.genome.org]
- Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6037607
- Received October 13, 2006.
- Accepted January 3, 2007.
- Copyright © 2007, Cold Spring Harbor Laboratory Press