Communication
An Evolving Hierarchical Family Classification for Glycosyltransferases

https://doi.org/10.1016/S0022-2836(03)00307-3

Abstract

Glycosyltransferases are a ubiquitous group of enzymes that catalyse the transfer of a sugar moiety from an activated sugar donor onto saccharide or non-saccharide acceptors. Although many glycosyltransferases catalyse chemically similar reactions, presumably through transition states with substantial oxocarbenium ion character, they display remarkable diversity in their donor, acceptor and product specificity and thereby generate a potentially infinite number of glycoconjugates, oligo- and polysaccharides. We have performed a comprehensive survey of glycosyltransferase-related sequences (over 7200 to date) and present here a classification of these enzymes akin to that proposed previously for glycoside hydrolases, into a hierarchical system of families, clans, and folds. This evolving classification rationalises structural and mechanistic investigation, harnesses information from a wide variety of related enzymes to inform cell biology and overcomes recurrent problems in the functional prediction of glycosyltransferase-related open-reading frames.

Section snippets

The problems of current nomenclature

Genome sequencing is unveiling a vast number of glycosyltransferase sequences. Current estimates suggest that about 1% of the ORFs of each genome are dedicated to the task of glycosidic bond synthesis (P.M.C. & B.H., unpublished results). Furthermore, protein glycosylation, a glycosyltransferase-catalysed process, massively expands the functional proteome of higher organisms. It is a huge drawback, and not merely to glycobiology, that glycosyltransferases have often proved extremely

Sequence families: historical perspectives

To overcome the limitations of the IUBMB system and to reflect the likely increase in sequence data, Campbell and colleagues proposed the classification of glycosyltransferases into families on the basis of similarities in amino acid sequence,8 a scheme inspired by the analogous and widely accepted classification of glycoside hydrolases.9, 10, 11 In 1997, 27 families of glycosyltransferases were described based on the analysis of the 600 sequences available at that time.8, 12

Enzymes not included in the classification

The IUBMB classification features one class of glycosyltransferase not considered here: the enzymes that utilise disaccharides, oligosaccharides or polysaccharides as sugar donors, such as cyclodextrin glucanotransferases (EC 2.4.1.19), dextransucrase (EC 2.4.1.5), xyloglucan endotransferases (EC 2.4.1.207), etc. Unlike the glycosyltransferases discussed here, these enzymes are transglycosidases which are structurally, mechanistically and evolutionarily related to glycosidases.

An updated and evolving sequence classification for glycosyltransferases

Table 1 summarises the content of the 65 glycosyltransferase families identified so far, including families GT1–GT27 described previously.8, 12 These continuously updated families, together with their links to appropriate databases (including GenBank, SwissProt, Enzyme, Taxonomy, Protein Data Bank, etc.), are available from the Carbohydrate-Active enZymes (CAZy) database.† Our classification is probably incomplete, as it is likely that some

Genomic annotation

A feature of the sequence-based classification is that a given family contains enzymes that display the same stereochemical outcome (Figure 1). Even in the most general case, instead of annotating an ORF as “putative glycosyltransferase” (now commonplace), one may annotate it as “putative retaining (or inverting) glycosyltransferase from family GTxx”. Whilst superficially a small improvement, such annotations would massively improve our ability to dissect diverse cellular processes such as
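The annotation scheme described above can be sketched in a few lines of Python. This is only an illustration: the function name `annotate_orf` and the three-entry lookup table are assumptions introduced here, not part of the original work, and any real family-to-mechanism table should be taken from the CAZy database itself.

```python
from typing import Optional

# Illustrative subset only (an assumption for this sketch); the authoritative
# family -> stereochemical outcome assignments are those listed in CAZy.
FAMILY_MECHANISM = {
    "GT1": "inverting",
    "GT2": "inverting",
    "GT8": "retaining",
}


def annotate_orf(family: Optional[str]) -> str:
    """Build an annotation string for an ORF given its GT family, if known."""
    if family is None:
        # No family assignment: fall back to the commonplace generic label.
        return "putative glycosyltransferase"
    mechanism = FAMILY_MECHANISM.get(family)
    if mechanism is None:
        # Family known, but its mechanism is absent from our small table.
        return f"putative glycosyltransferase from family {family}"
    return f"putative {mechanism} glycosyltransferase from family {family}"


if __name__ == "__main__":
    for fam in ("GT8", "GT2", None):
        print(annotate_orf(fam))
```

The point of the sketch is that the family assignment alone carries the stereochemical outcome, so the richer annotation costs nothing beyond a table lookup.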

Glycosyltransferases: fold, clans, families, stereochemistry and specificities

As already noted,8, 12, 16 distant similarities between some families are revealed with sensitive sequence-similarity detection methods such as hydrophobic cluster analysis17 or PSI-BLAST.18 These distant similarities indicate interfamily relatedness, presumably as a result of evolutionary divergence. 3-D structure comparison is, arguably, the most powerful means to establish relatedness of proteins, even in the absence of detectable sequence similarity, and the recent elucidation of the

The CAZy classification highlights sequence “pitfalls” including so-called “conserved” motifs and modularity

One benefit of the sequence family classification is that it allows one to assess other potential diagnostics of glycosyltransferase activity. For example, it has been observed that a number of glycosyltransferases contain a so-called DxD motif,30, 31, 32, 33 although, confusingly, none of the elements of this conserved motif is invariant. In the GT-A fold structures, this motif binds one of the ribose hydroxyl groups and a divalent metal ion coordinated to the phosphate groups.19, 21, 34
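To make the pitfall concrete, here is a minimal sketch of a naive scan for the strict D-x-D pattern (Asp, any residue, Asp). The function name `find_dxd` and the example sequence are hypothetical, chosen only for illustration. A pattern this short matches many unrelated proteins, and because none of the motif's positions is invariant, a strict scan would also miss genuine variants, so a hit (or its absence) should not be read as evidence about glycosyltransferase activity.

```python
import re

# Strict D-x-D pattern; the lookahead lets overlapping occurrences be reported.
DXD_PATTERN = re.compile(r"(?=(D.D))")


def find_dxd(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based position, matched tripeptide) for every D-x-D occurrence."""
    return [(m.start(), m.group(1)) for m in DXD_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical peptide, not taken from any real protein.
    print(find_dxd("MKDADLLDEDG"))
```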

Concluding remarks

Structurally, glycosyltransferases could be mistaken for “dull”, as they seem to adopt one of only two folds. Given the large number of nucleotide-sugar donors, the huge variety of acceptors (almost any class of molecule can be glycosylated: proteins, sugars, lipids, steroids, nucleic acids, antibiotics, etc.) and the resulting astronomical number of products, the two structural templates prove to be amongst the most ingenious and versatile scaffolds in nature. The almost infinite variety

Acknowledgements

We thank Chris Whitfield (Guelph, Ontario, Canada), Warren Wakarchuk (Ottawa, Ontario, Canada), Chris West (Gainesville, FL, USA) and Rafael Oriol (Villejuif, France) for useful discussions and/or for sharing unpublished observations with us. This work was funded by grant QLK5-CT2001-00443 (EDEN) of the European Commission and by the Wellcome Trust. G.J.D. is a Royal Society University Research Fellow.

References (50)

  • S.G. Withers et al.

    One step closer to a sweet conclusion

    Chem. Biol.

    (2002)
  • A.A. Costa et al.

    Characterization of a gene which encodes a mannosyltransferase homolog of Paracoccidioides brasiliensis

    Microb. Infect.

    (2002)
  • J. Stolz et al.

    The components of the Saccharomyces cerevisiae mannosyltransferase complex M-Pol I have distinct functions in mannan synthesis

    J. Biol. Chem.

    (2002)
  • L.C. Pedersen et al.

    Heparan/chondroitin sulfate biosynthesis: structure and mechanism of human glucuronyltransferase I

    J. Biol. Chem.

    (2000)
  • P.L. DeAngelis et al.

    Identification and molecular cloning of a heparosan synthase from Pasteurella multocida type D

    J. Biol. Chem.

    (2002)
  • P.L. DeAngelis et al.

    Identification and molecular cloning of a chondroitin synthase from Pasteurella multocida type F

    J. Biol. Chem.

    (2000)
  • B.J. Gibbons et al.

    Crystal structure of the autocatalytic initiator of glycogen biosynthesis, glycogenin

    J. Mol. Biol.

    (2002)
  • A.M. Mulichak et al.

    Structure of the UDP-glucosyltransferase GtfB that modifies the heptapeptide aglycone in the biosynthesis of vancomycin group antibiotics

    Structure

    (2001)
  • P.M. Rudd et al.

    Glycosylation and the immune system

    Science

    (2001)
  • L. Wells et al.

    Glycosylation of nucleocytoplasmic proteins: signal transduction and O-GlcNAc

    Science

    (2001)
  • C.R. Bertozzi et al.

    Chemical glycobiology

    Science

    (2001)
  • G.J. Davies et al.

    Structural enzymology of carbohydrate-active enzymes: implications for the post-genomic era

    Biochem. Soc. Trans.

    (2002)
  • Recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology on the Nomenclature and Classification of Enzymes

    (1992)
  • J.A. Campbell et al.

    A classification of nucleotide-diphospho-sugar glycosyltransferases based on amino acid sequence similarities

    Biochem. J.

    (1997)
  • B. Henrissat

    A classification of glycosyl hydrolases based on amino acid sequence similarities

    Biochem. J.

    (1991)