AMBIT applicability domain estimation
The ambit-model package implements the methods described in:
- Jaworska, J., Nikolova-Jeliazkova, N., & Aldenberg, T. (2005). QSAR applicability domain estimation by projection of the training set descriptor space: a review. Alternatives to Laboratory Animals ATLA, 33(5), 445–459.
- Nikolova-Jeliazkova, N., & Jaworska, J. (2005). An approach to determining applicability domains for QSAR group contribution models: an analysis of SRC KOWWIN. Alternatives to Laboratory Animals ATLA, 33(5), 461–470.
- Netzeva, T. I., Worth, A., Aldenberg, T., Benigni, R., Cronin, M. T. D., Gramatica, P., … Yang, C. (2005). Current status of methods for defining the applicability domain of (quantitative) structure-activity relationships. The report and recommendations of ECVAM Workshop 52. Alternatives to Laboratory Animals ATLA.
- Jaworska, J., & Nikolova-Jeliazkova, N. (2007). How can structural similarity analysis help in category formation? SAR and QSAR in Environmental Research, 18(3-4), 195–207. doi:10.1080/10629360701306050
The appdomain-example project is a command-line application that demonstrates how to use the ambit-model package.
Alternatively, the applicability domain algorithms are available in the Ambit Discovery desktop application and as REST web services in the Ambit web application.
The applicability domain is estimated from the training set data only, so it is independent of the model. The estimate is then reported for each compound in the test set. The same file may be given as both the training set and the test set. Input file formats are recognised by file extension (e.g. .csv, .sdf, .cml).
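The split between training and test data can be illustrated with a minimal Python sketch of a descriptor-range check, one of the approaches covered in the review cited above. This is not the ambit-model API: the function names `fit_ranges` and `estimate` and the descriptor values are hypothetical, and the metric used here (the number of descriptors outside the training range) is an assumption chosen for simplicity.

```python
# Illustrative sketch only; names and data are hypothetical, not ambit-model code.
from typing import List, Sequence, Tuple


def fit_ranges(training: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Learn per-descriptor (min, max) bounds from the training set alone."""
    columns = zip(*training)
    return [(min(col), max(col)) for col in columns]


def estimate(ranges: Sequence[Tuple[float, float]],
             row: Sequence[float]) -> Tuple[int, int]:
    """Return (metric, flag) for one test compound.

    metric: number of descriptors outside the training range (assumed metric).
    flag:   0 if in domain, 1 if out of domain.
    """
    violations = sum(1 for (lo, hi), x in zip(ranges, row) if x < lo or x > hi)
    return violations, (1 if violations > 0 else 0)


# Made-up descriptor values, two descriptors per compound.
training = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0]]
test = [[2.5, 11.5], [4.0, 11.0], [0.5, 13.0]]

ranges = fit_ranges(training)                       # uses the training set only
results = [estimate(ranges, row) for row in test]   # reported for the test set

# Using the training set as the test set flags nothing as out of domain here.
self_check = [estimate(ranges, row) for row in training]
```

No model predictions enter the calculation: only the descriptor values of the training compounds define the domain.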
The result file contains all properties from the test set, the metric computed by the applicability domain method, and a flag indicating whether the molecule is out of domain (0 = in domain, 1 = out of domain). The output file format is likewise recognised by file extension (e.g. .csv, .sdf, .cml).
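As a rough illustration of that layout for .csv output, the Python sketch below appends a metric column and a flag column to the test set properties. The column names `ad_metric` and `out_of_domain`, the helper `write_results`, and all values are hypothetical; the actual column names written by the application may differ.

```python
# Illustrative sketch only; column names and values are assumptions.
import csv
import io


def write_results(test_rows, metrics, flags, out):
    """Write test set properties plus the domain metric and 0/1 flag as CSV."""
    fieldnames = list(test_rows[0].keys()) + ["ad_metric", "out_of_domain"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row, metric, flag in zip(test_rows, metrics, flags):
        writer.writerow({**row, "ad_metric": metric, "out_of_domain": flag})


# Made-up test set properties and results (0 = in domain, 1 = out of domain).
test_rows = [{"name": "mol-1", "logP": 1.2}, {"name": "mol-2", "logP": 7.9}]
buffer = io.StringIO()
write_results(test_rows, metrics=[0.4, 3.1], flags=[0, 1], out=buffer)
csv_text = buffer.getvalue()
```

Every original property is kept, so the result file can be filtered on the flag column without joining back to the test set.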