Lazy structure-activity relationships (lazar) for the prediction of rodent carcinogenicity and Salmonella mutagenicity

  • Full–length article
Molecular Diversity

Abstract

lazar is a new tool for the prediction of toxic properties of chemical structures. It derives predictions for query structures from a database with experimentally determined toxicity data. lazar generates predictions by searching the database for compounds that are similar with respect to a given toxic activity and calculating the prediction from their activities. Apart from the prediction, lazar provides the rationales (structural features and similar compounds) for the prediction and a reliable confidence index that indicates whether a query structure falls within the applicability domain of the training database.
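The general idea of a similarity-based ("lazy") prediction can be illustrated with a minimal Python sketch. It is not lazar's actual algorithm: the abstract does not specify the similarity measure or the confidence formula, so the sketch assumes Tanimoto similarity on sets of substructure identifiers, a similarity-weighted vote, and the absolute vote score as a stand-in confidence value. The names `tanimoto`, `predict` and `min_similarity` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: each compound is described by a set of substructure identifiers.
Features = FrozenSet[str]


def tanimoto(a: Features, b: Features) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict(query: Features,
            database: List[Tuple[Features, int]],
            min_similarity: float = 0.3) -> Dict[str, object]:
    """Similarity-weighted vote over database compounds.

    Activities are +1 (active) or -1 (inactive). Compounds below
    `min_similarity` are ignored; the threshold is an illustrative choice.
    """
    neighbours = [(tanimoto(query, feats), activity)
                  for feats, activity in database]
    neighbours = [(s, a) for s, a in neighbours if s >= min_similarity]
    if not neighbours:
        # No similar compounds: the query lies outside the data we can use.
        return {"prediction": None, "confidence": 0.0, "neighbours": []}
    total = sum(s for s, _ in neighbours)
    score = sum(s * a for s, a in neighbours) / total
    return {
        # A tied vote (score == 0) is reported as "inactive" with confidence 0.
        "prediction": "active" if score > 0 else "inactive",
        # Stand-in confidence: 1.0 if all neighbours agree, 0.0 if evenly split.
        "confidence": abs(score),
        # Returned as the rationale: which similar compounds drove the result.
        "neighbours": neighbours,
    }


if __name__ == "__main__":
    toy_db = [
        (frozenset({"N=O", "c:c", "c-N"}), 1),
        (frozenset({"N=O", "c:c"}), 1),
        (frozenset({"C-O", "C-C"}), -1),
    ]
    print(predict(frozenset({"N=O", "c:c", "C-C"}), toy_db))
```

Returning the neighbours alongside the prediction mirrors the abstract's point that a prediction should come with its rationale and an indication of how well the database covers the query.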

Leave-one-out (LOO) crossvalidation experiments were carried out for 10 carcinogenicity endpoints ({female|male} {hamster|mouse|rat} carcinogenicity and the aggregate endpoints {hamster|mouse|rat} carcinogenicity and rodent carcinogenicity) and for Salmonella mutagenicity, all from the Carcinogenic Potency Database (CPDB). An external validation of Salmonella mutagenicity predictions was performed with a dataset of 3895 structures. Leave-one-out and external validation experiments indicate that Salmonella mutagenicity can be predicted with 85% accuracy for compounds within the applicability domain of the CPDB. The LOO accuracy of lazar predictions of rodent carcinogenicity is 86%; the accuracies for the other carcinogenicity endpoints vary between 78% and 95% for structures within the applicability domain.
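The validation protocol can be sketched in the same spirit: each compound is removed from the database in turn, predicted from the remaining compounds, and counted only if the prediction's confidence reaches a threshold. This is a generic illustration under stated assumptions, not the paper's code. The function `loo_accuracy`, the parameter `min_confidence`, and the toy predictor `vote_of_overlapping` are hypothetical, and the data are invented for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Example = Tuple[frozenset, int]           # (features, activity in {+1, -1})
Prediction = Tuple[Optional[int], float]  # (predicted activity or None, confidence)


def loo_accuracy(data: Sequence[Example],
                 predict: Callable[[frozenset, List[Example]], Prediction],
                 min_confidence: float = 0.0) -> Tuple[float, int]:
    """Leave-one-out accuracy over predictions with confidence >= min_confidence.

    Returns (accuracy, number of predictions counted).
    """
    correct = counted = 0
    for i, (features, activity) in enumerate(data):
        # Hold out compound i and predict it from all other compounds.
        training = [ex for j, ex in enumerate(data) if j != i]
        predicted, confidence = predict(features, training)
        if predicted is None or confidence < min_confidence:
            continue  # treated as outside the applicability domain
        counted += 1
        correct += int(predicted == activity)
    return (correct / counted if counted else float("nan")), counted


def vote_of_overlapping(query: frozenset,
                        training: List[Example]) -> Prediction:
    """Toy stand-in predictor (not lazar): unweighted vote among
    training compounds that share at least one feature with the query."""
    votes = [a for f, a in training if f & query]
    if not votes:
        return None, 0.0
    mean = sum(votes) / len(votes)
    return (1 if mean > 0 else -1), abs(mean)


if __name__ == "__main__":
    toy = [
        (frozenset("ab"), 1), (frozenset("ac"), 1), (frozenset("bc"), 1),
        (frozenset("xy"), -1), (frozenset("xz"), -1),
        (frozenset("abx"), -1),  # borderline compound with mixed neighbours
    ]
    print(loo_accuracy(toy, vote_of_overlapping))
    print(loo_accuracy(toy, vote_of_overlapping, min_confidence=0.3))
```

In this toy data the only misprediction is the borderline compound, whose neighbours disagree; raising `min_confidence` excludes it, which is the sense in which accuracies are reported "within the applicability domain".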

Abbreviations

CCRIS: chemical carcinogenesis research information system

CPDB: carcinogenic potency database

DSSTox: distributed structure-searchable toxicity project

lazar: lazy structure-activity relationships

LOO: leave-one-out crossvalidation

k-nn: k-nearest-neighbours

(Q)SAR: (quantitative) structure-activity relationships

Author information

Corresponding author

Correspondence to Christoph Helma.

About this article

Cite this article

Helma, C. Lazy structure-activity relationships (lazar) for the prediction of rodent carcinogenicity and Salmonella mutagenicity. Mol Divers 10, 147–158 (2006). https://doi.org/10.1007/s11030-005-9001-5

  • DOI: https://doi.org/10.1007/s11030-005-9001-5
