Review
The TB Structural Genomics Consortium: A decade of progress
Introduction
The Tuberculosis Structural Genomics Consortium (TBSGC, http://www.webtb.org) is an international collaboration of researchers whose primary objective is to comprehensively determine the three-dimensional structures of proteins from Mycobacterium tuberculosis (Mtb), with the aim of aiding TB diagnosis and treatment. The Consortium, comprising 460 members from 93 research centers spanning 15 countries, employs state-of-the-art technologies for gene cloning, protein expression and structure determination. Since the inception of the TBSGC in 2000, approximately 250 Mtb protein structures have been elucidated by Consortium members, accounting for over one-third of the Mtb structures deposited in the Protein Data Bank (PDB, http://www.rcsb.org), many of which have been previously reviewed.1, 2, 3, 4 The wealth of information from atomic-resolution details of proteins, particularly in complex with their cognate substrates or cofactors, as well as from protein–protein interaction networks, may aid other scientists in interpreting their genetic and biochemical data and may ultimately be exploited in rational structure-based design of therapeutics for TB.
In addition to structural elucidation of Mtb proteins, the TBSGC is actively developing bioinformatics resources that can be used for data mining to complement structural information. Highlighted below are several useful and unique databases available on the TBSGC’s website:
- (1) The Gene Expression Correlation Grid, or gecGrid, server (http://www.webtb.org/gecGrid/) compiles four Mtb H37Rv gene expression datasets,5, 6, 7, 8 comprising 553 experiments, to infer approximately 7,700,000 pairwise coexpression relationships between genes.9 gecGrid is also a useful tool for assessing correlation among networks or systems of genes. Gene expression correlations are represented in matrices, where each entry has an associated correlation coefficient, a measure of the level of positive (or negative) coexpression.
- (2) The Prolinks database (http://prolinks.mbi.ucla.edu) combines four algorithms (phylogenetic profile, Rosetta Stone, gene neighbor and gene cluster) to predict functional linkages between proteins; it covers 83 organisms, including Mtb, and 10 million high-confidence links.10 The Phylogenetic Profile method uses the presence and absence of proteins across multiple genomes to detect functional linkages11, 12; the Rosetta Stone method uses a gene fusion event in a second organism to infer functional relatedness13, 14; the Gene Cluster method uses genome proximity to predict functional linkage15, 16, 17; and the Gene Neighbor method uses both gene proximity and phylogenetic distribution to infer linkage.18, 19, 20 The Proteome Navigator tool allows users to browse predicted linkage networks interactively, providing accompanying annotation from additional public databases.
- (3) The ProKnow program (http://proknow.mbi.ucla.edu) constitutes a knowledgebase in which protein features such as three-dimensional fold, sequence, motif and functional linkages are extracted and then related to annotated functions from Gene Ontology functional terms.21 With the advent of structural genomics, ProKnow is a useful resource for the functional assessment of proteins that have structural information but unknown functions. ProKnow has been applied to the Mtb H37Rv genome; the results for all genes, together with their function-based similarities and links, show that ProKnow is able to assign functions to around 50% of the genes in the genome with high confidence. These high-confidence linkages have been incorporated into the functional information on the TBSGC website.
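Two of the ideas in the list above can be illustrated with a small sketch: the matrix of pairwise correlation coefficients described for gecGrid, and the presence/absence comparison that underlies the Phylogenetic Profile method. This is only an illustration under stated assumptions. The text does not say which correlation coefficient gecGrid uses, so a Pearson coefficient is assumed here; the simple match fraction stands in for whatever scoring Prolinks actually applies; and the gene names, expression values and profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_matrix(expression):
    """Map each unordered gene pair to its correlation across experiments."""
    return {
        (g1, g2): pearson(expression[g1], expression[g2])
        for g1, g2 in combinations(sorted(expression), 2)
    }


def profile_similarity(profile_a, profile_b):
    """Fraction of genomes where two proteins are both present or both absent."""
    matches = sum(a == b for a, b in zip(profile_a, profile_b))
    return matches / len(profile_a)


# Hypothetical expression values: one list per gene, one value per experiment.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA (positive coexpression)
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises (negative coexpression)
}
matrix = coexpression_matrix(expression)

# Hypothetical presence (1) / absence (0) of two proteins across five genomes.
similarity = profile_similarity([1, 0, 1, 1, 0], [1, 0, 1, 0, 0])
```

Here `matrix` holds one coefficient per gene pair, positive for genes that rise and fall together and negative for genes that move in opposite directions, while `similarity` is high when two proteins tend to be present or absent in the same genomes.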
The TBSGC has also established integrated core facilities at Texas A&M University (TAMU), Los Alamos National Laboratory (LANL) and Lawrence Berkeley National Laboratory (LBNL) to provide technical support for Consortium members; these facilities are described in the first section of this review. The remaining sections highlight a compilation of recent Mtb protein structures determined by TBSGC groups, many of which are associated with metabolic pathways and are thus potentially attractive anti-TB therapeutic targets. In particular, these include structural studies on urease (Habel and Hung), chorismate-utilizing enzymes (Johnston et al.), arginine biosynthesis enzymes (Sankaranarayanan and James), crotonase, malate synthase, fumarase and phosphoenolpyruvate carboxykinase (Krieger et al.) and the heme degrader MhuD (Chim et al.). Additionally, atomic-resolution details of proteins from systems involving disulfide bond formation of secreted proteins (Chim et al.) and toxin–antitoxin gene pairs (Miallau and Eisenberg) can ultimately be exploited for rational structure-based drug design.
Section snippets
The TBSGC structure determination pipeline (Li-Wei Hung, Chang-Yub Kim, Hongye Li, James C. Sacchettini and Thomas C. Terwilliger)
The TBSGC has established an effective pipeline as a core resource for Mtb structural biology. The purpose of the pipeline is to provide an integrated set of facilities for cloning, protein expression, purification and X-ray data collection capabilities for the entire TBSGC project. The current pipeline consists of a high-throughput cloning and expression facility at the Institute of Biosciences and Technology, TAMU, a protein production facility at LANL, and crystallization and data collection
Solution and crystal structures of Mtb UreA - the implications in bio-molecular assembly and drug discovery of Mtb urease (Jeff E. Habel and Li-Wei Hung)
Found in a broad range of plants and bacteria, urease is a nickel-containing enzyme catalyzing the hydrolysis of urea into ammonia and carbamate (which then decomposes with water to form ammonia and carbon dioxide). Urease has been implicated as a potential virulence enhancing factor in diseases of the human gastrointestinal and urinary tract.27, 28, 29 For Mtb, urease is thought to increase survival in the lung tissue possibly through pH modulation (as seen in Helicobacter pylori) and nitrogen
Studies on chorismate-utilizing enzymes in Mtb (Jodie M. Johnston, Esther M. Bulloch, Richard J. Payne, Alexandra Manos-Turvey, Edward N. Baker and J. Shaun Lott)
Chorismate is a central metabolic intermediate that is utilized by many bacteria in the production of a variety of aromatic compounds, including the aromatic amino acids, folate, and salicylate. Because the biochemical pathways that catalyze the biosynthesis and utilization of chorismate are found in plants, bacteria and some protists but not in more complex eukaryotes, they have long been considered to be potentially useful targets for therapeutic intervention, with a strong inherent
Ornithine acetyltransferase (Rv1653) and ornithine carbamoyltransferase (Rv1656) in the arginine biosynthesis pathway of Mtb (Ramasamy Sankaranarayanan and Michael N.G. James)
The X-ray crystal structures of two enzymes involved in the arginine biosynthetic pathway of Mtb, ornithine acetyltransferase (Mtb OAT, E.C.2.3.1.35) and ornithine carbamoyltransferase (Mtb OTC, E.C.2.1.3.3), are reported here. Mtb OAT reversibly catalyzes the transfer of the acetyl group from N-acetylornithine to L-glutamate to produce N-acetylglutamate.67 Although there are two other enzymes, N-acetylglutamate synthase (E.C.2.3.1.1) and N-acetylornithine deacetylase (E.C.3.5.1.16)
Structure of key enzymes in persistence related metabolic shift (Inna Krieger, John Bruning, Stephanie Swanson, Haelee Kim and James C. Sacchettini)
Mtb demonstrates remarkable metabolic versatility, allowing it to survive and persist under the hypoxic, acidic, and nutrient poor conditions inside the host macrophage. It accomplishes this by adapting its growth rate to suit its environment. During the persistent state of infection, Mtb’s growth drops to a virtually non-replicating state by significantly altering its metabolism to conserve energy and effectively use resources scavenged from the host.72 Our focus has been to study the
Iron availability
Iron availability is essential in supporting microbial viability and growth; in Mtb, siderophores (small, high-affinity iron-chelating molecules) are secreted to sequester Fe³⁺ from human transferrin and lactoferrin.90, 91, 92 Alternatively, pathogenic bacteria have also evolved mechanisms in which host-heme acquisition can overcome iron requirements. Staphylococcus aureus,93 Serratia marcescens,94, 95 Shigella dysenteriae,96, 97, 98 and Neisseria spp.99, 100, 101 possess distinct heme-uptake
Toxin–Antitoxin complexes from Mtb (Linda Miallau and David Eisenberg)
Toxin–antitoxin (TA) gene pairs were initially discovered on plasmids and were found to be essential for the maintenance of foreign genetic elements in host cells through the process termed “plasmid addiction”.131 Bacteria acquire plasmids that encode both toxin and antitoxin genes in an operon. Under normal physiological conditions both proteins are expressed and form a tight complex incapable of inhibiting bacterial growth. However, when the plasmid encoding those systems is lost, for
Conclusions
In the last decade, the TBSGC project has yielded useful bioinformatics resources as well as structural information. The richness of information available in molecular structures also provides the starting-point for many new research avenues and future drug design projects unanticipated in the present day. For example, it is possible that a future finding will implicate one of the biosynthetic enzymes highlighted in this review as pivotal during Mtb pathogenesis, prompting a large-scale drug
Acknowledgements
The UCLA authors wish to acknowledge Duilio Cascio, Dan Anderson, Andrew Min, Mark Arbing and the PETC laboratory.
References (152)
- et al. The TB structural genomics consortium: a resource for Mycobacterium tuberculosis biology. Tuberculosis (Edinb) (2003)
- et al. DnaE2 polymerase contributes to in vivo survival and the emergence of drug resistance in Mycobacterium tuberculosis. Cell (2003)
- et al. Comparative expression studies of a complex phenotype: cord formation in Mycobacterium tuberculosis. Tuberculosis (Edinb) (2004)
- et al. Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem Sci (1998)
- et al. Inference of protein function from protein structure. Structure (2005)
- Rational protein crystallization by mutational surface engineering. Structure (2004)
- et al. Inhibition of polo-like kinase 1 by blocking polo-box domain-dependent protein-protein interactions. Chem Biol (2008)
- et al. Cell surface-associated Tat modulates HIV-1 infection and spreading through a specific interaction with gp120 viral envelope protein. Blood (2005)
- et al. The structure of 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase from Mycobacterium tuberculosis reveals a common catalytic scaffold and ancestry for type I and type II enzymes. J Mol Biol (2005)
- et al. p-Aminobenzoate biosynthesis in Escherichia coli. Purification of aminodeoxychorismate lyase and cloning of pabC. J Biol Chem (1991)