Abstract
Comparative protein structure modeling of targets that share only remote sequence similarity with their templates can be improved by optimally combining multiple template structures and by improving the quality of the target-template alignment. The recently developed MMM and M4T methods were designed to address these two problems. Here we describe new developments in both the alignment generation and the template selection parts of the modeling algorithms. We introduced a new scoring function in MMM that delivers more accurate target-template alignments. This was achieved by developing a novel statistical pairwise potential that combines local and non-local terms and incorporating it into the composite scoring function. The non-local term of the statistical potential uses a shuffled reference state definition, which helped to eliminate most of the false positive signal from the background distribution of pairwise contacts. The accuracy of the scoring function was further increased by using BLOSUM mutation table scores.
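The idea of a shuffled reference state can be illustrated with a minimal sketch. The abstract does not give the functional form, distance cutoffs, or weights used in MMM, so everything below is an assumption made for illustration: contacts are supplied as index pairs, the reference counts are obtained by permuting residue identities over a fixed contact map, and the energy is taken as the negative log ratio of observed to reference counts. The function names (`contact_counts`, `shuffled_reference`, `contact_potential`) and the pseudocount are hypothetical.

```python
import math
import random
from collections import Counter


def pair_key(a, b):
    """Order-independent key for a residue-type pair."""
    return (a, b) if a <= b else (b, a)


def contact_counts(sequence, contacts):
    """Count residue-type pairs over a list of (i, j) contact indices."""
    counts = Counter()
    for i, j in contacts:
        counts[pair_key(sequence[i], sequence[j])] += 1
    return counts


def shuffled_reference(sequence, contacts, n_shuffles=200, seed=0):
    """Average pair counts after shuffling residues over the same contact map.

    Shuffling keeps the composition and the contact map of the protein
    fixed, so only the assignment of residue types to positions changes.
    """
    rng = random.Random(seed)
    residues = list(sequence)
    totals = Counter()
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        totals.update(contact_counts(residues, contacts))
    return {pair: n / n_shuffles for pair, n in totals.items()}


def contact_potential(sequence, contacts, n_shuffles=200, seed=0, pseudocount=0.5):
    """Illustrative energy: -ln(observed / shuffled reference) per pair type.

    The pseudocount (an assumption of this sketch) avoids log(0) for
    pair types that are never observed in contact.
    """
    observed = contact_counts(sequence, contacts)
    reference = shuffled_reference(sequence, contacts, n_shuffles, seed)
    potential = {}
    for pair in set(observed) | set(reference):
        obs = observed.get(pair, 0) + pseudocount
        ref = reference.get(pair, 0.0) + pseudocount
        potential[pair] = -math.log(obs / ref)
    return potential


# Toy example: every contact connects two leucines.
toy_sequence = "LLLLKKKK"
toy_contacts = [(0, 1), (0, 2), (1, 3), (2, 3)]
toy_potential = contact_potential(toy_sequence, toy_contacts)
```

In this toy case the leucine-leucine pair is observed more often than the shuffled background predicts, so it receives a negative (favorable) value, while the lysine-leucine pair, which never occurs in contact, receives a positive one. A real potential would pool counts over many structures and would typically depend on distance as well.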
Abbreviations
- MMM: Multiple mapping method
- M4T: Multiple mapping method with multiple templates
Acknowledgements
This work was supported by NIH GM62519-04. This work is dedicated to the memory of Elliot Steinberger, who passed away recently.
Cite this article
Rykunov, D., Steinberger, E., Madrid-Aliste, C.J. et al. Improved scoring function for comparative modeling using the M4T method. J Struct Funct Genomics 10, 95–99 (2009). https://doi.org/10.1007/s10969-008-9044-9
DOI: https://doi.org/10.1007/s10969-008-9044-9