Abstract
The specificities of the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases, which link the carbohydrate GalNAc to the side chain of certain serine and threonine residues in mucin type glycoproteins, are presently unknown. The specificity seems to be modulated by sequence context, secondary structure and surface accessibility. The sequence context of glycosylated threonines was found to differ from that of glycosylated serines, and the sites were found to cluster. Non-clustered sites had a sequence context different from that of clustered sites. Charged residues were disfavoured at positions -1 and +3. A jury of artificial neural networks was trained to recognize the sequence context and surface accessibility of 299 known and verified mucin type O-glycosylation sites extracted from O-GLYCBASE. The cross-validated NetOglyc network system correctly found 83% of the glycosylated and 90% of the non-glycosylated serine and threonine residues in independent test sets, thus proving more accurate than matrix statistics and vector projection methods. Predictions of O-glycosylation sites in the envelope glycoprotein gp120 from the primate lentiviruses HIV-1, HIV-2 and SIV are presented. The most conserved O-glycosylation signals in these evolutionarily related glycoproteins were found in their first hypervariable loop, V1. However, the strain variation for HIV-1 gp120 was significant. A computer server, available through WWW or E-mail, has been developed for prediction of mucin type O-glycosylation sites in proteins based on the amino acid sequence. The server addresses are http://www.cbs.dtu.dk/services/NetOGlyc/ and netOglyc@cbs.dtu.dk.
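As a rough illustration of the kind of input such a predictor works on, the sketch below extracts the sequence context around every serine and threonine in a protein, combines the outputs of several predictors into one jury decision, and computes the two accuracy figures quoted above. It is not the NetOglyc implementation: the function names, the window size (`flank=4`), the padding symbol, the plain averaging of scores and the 0.5 cut-off are all assumptions made for this example, since the abstract does not specify them.

```python
from statistics import mean


def candidate_windows(seq, flank=4, pad="-"):
    """Return (position, residue, window) for every Ser/Thr in seq.

    Positions are 1-based. The window spans -flank..+flank around the
    candidate residue and is padded at the sequence ends.
    The window size is an assumption, not a value from the paper.
    """
    seq = seq.upper()
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue in "ST":
            windows.append((i + 1, residue, padded[i:i + 2 * flank + 1]))
    return windows


def jury_score(scores, threshold=0.5):
    """Average the outputs of several predictors and apply a cut-off.

    Plain averaging and the 0.5 threshold are hypothetical choices.
    """
    average = mean(scores)
    return average, average > threshold


def sensitivity_specificity(tp, fn, tn, fp):
    """Fraction of true sites found and fraction of non-sites rejected."""
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    for position, residue, window in candidate_windows("APTSG", flank=2):
        print(position, residue, window)
    print(jury_score([0.9, 0.6, 0.3]))
```

In this form, the 83% and 90% figures in the abstract correspond to the first and second values returned by `sensitivity_specificity`, evaluated on residues held out from training.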
Cite this article
Hansen, J.E., Lund, O., Tolstrup, N. et al. NetOglyc: Prediction of mucin type O-glycosylation sites based on sequence context and surface accessibility. Glycoconj J 15, 115–130 (1998). https://doi.org/10.1023/A:1006960004440
DOI: https://doi.org/10.1023/A:1006960004440