lazar

Contact: Andreas Maunz

Categories: Prediction, Descriptor calculation, Validation

Exposed methods:

predict
  Input: chemical structure(s)
  Output: prediction(s), confidence(s), neighbors, relevant fragments
  Input format: Plain text in custom tab separated format
  Output format: Plain text in YAML format
  User-specified parameters: None
  Reporting information: Neighbors and significant features for each prediction

leave-one-out validation
  Input: chemical structures and activities
  Output: actual vs predicted values, validation statistics
  Input format: Plain text in custom tab separated format
  Output format: Plain text in YAML format
  User-specified parameters: None
  Reporting information: Neighbors and significant features for each prediction
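
The leave-one-out validation method listed above can be illustrated with a minimal Python sketch. It is not lazar's C++ implementation: the function names, the toy dataset, and the majority-class stand-in predictor are assumptions made for illustration only. Each compound is removed from the training set in turn, predicted from the remaining compounds, and the actual and predicted values are collected together with a simple accuracy statistic.

```python
def leave_one_out(dataset, predict):
    """Leave-one-out validation.

    dataset: list of (compound, activity) pairs.
    predict: callable (training, compound) -> predicted activity, or None
             if no prediction can be made (hypothetical interface).
    Returns the list of (actual, predicted) pairs and the accuracy over
    the compounds for which a prediction was made.
    """
    results = []
    for i, (compound, actual) in enumerate(dataset):
        # Train on everything except the compound being predicted.
        training = dataset[:i] + dataset[i + 1:]
        results.append((actual, predict(training, compound)))
    scored = [(a, p) for a, p in results if p is not None]
    accuracy = sum(a == p for a, p in scored) / len(scored) if scored else None
    return results, accuracy


def majority_class(training, compound):
    """Stand-in predictor (assumption): always returns the most frequent class."""
    activities = [activity for _, activity in training]
    if not activities:
        return None
    return max(set(activities), key=activities.count)


# Toy data (made up for illustration): SMILES strings with a binary activity.
toy = [("CCO", 0), ("CCN", 0), ("c1ccccc1N", 1), ("CCC", 0)]
results, accuracy = leave_one_out(toy, majority_class)
print(results, accuracy)
```

Any predictor with the same call signature can be substituted for the stand-in, which is what makes the loop independent of the prediction method.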

Description:

Lazar is a k-nearest-neighbor approach that predicts chemical endpoints from a training set based on structural
fragments. As training input it uses a SMILES file, precomputed fragments with their occurrences, and target
class information for each compound. It also supports regression, in which case the target activities are
continuous values. Lazar uses activity-specific similarity (i.e. each fragment contributes with its
significance for the target activity), which is the basis for each prediction and for the confidence index
reported with it.
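
To make the idea of activity-specific similarity concrete, the following Python sketch weights a Tanimoto-style index by fragment significance. The weighted Tanimoto form, the function name, and the example fragments and significance values are assumptions for illustration; the text above only states that each fragment contributes with its significance for the target activity.

```python
def weighted_tanimoto(frags_a, frags_b, significance):
    """Similarity of two fragment sets, each fragment weighted by its significance.

    frags_a, frags_b: iterables of fragment identifiers.
    significance: dict mapping fragment -> weight for the target activity
                  (fragments missing from the dict contribute nothing).
    """
    a, b = set(frags_a), set(frags_b)
    common = sum(significance.get(f, 0.0) for f in a & b)
    total = sum(significance.get(f, 0.0) for f in a | b)
    return common / total if total > 0 else 0.0


# Hypothetical fragments and significance values, made up for illustration.
significance = {"c:c": 0.9, "N=O": 0.95, "C-O": 0.1}
query = {"c:c", "N=O"}
neighbor = {"c:c", "C-O"}
print(weighted_tanimoto(query, neighbor, significance))
```

With equal weights for all fragments this reduces to the ordinary Tanimoto index, so the weighting is what makes the similarity depend on the endpoint being predicted.
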
For classification, weighted nearest-neighbor voting is the default prediction method, whereas for regression a
kernel model based on activity-specific similarity is used by default. A kernel model is also available for
classification, as well as a multilinear model for regression.
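
Weighted nearest-neighbor voting for classification can be sketched in Python as follows. This is an illustration under stated assumptions, not lazar's actual code: the similarity threshold of 0.3, the binary 0/1 activity encoding, and the confidence formula (absolute vote balance divided by total neighbor similarity) are all hypothetical choices.

```python
def weighted_vote(neighbors, min_similarity=0.3):
    """Similarity-weighted vote over neighbors.

    neighbors: list of (similarity, activity) pairs, activity in {0, 1}.
    min_similarity: hypothetical cutoff below which neighbors are ignored.
    Returns (prediction, confidence); prediction is None if no neighbor
    carries any weight.
    """
    used = [(s, a) for s, a in neighbors if s >= min_similarity]
    total = sum(s for s, _ in used)
    if total == 0:
        return None, 0.0
    # Active neighbors vote with +similarity, inactive ones with -similarity.
    score = sum(s if a == 1 else -s for s, a in used)
    prediction = 1 if score > 0 else 0  # a tie falls back to class 0
    confidence = abs(score) / total
    return prediction, confidence


# Made-up neighbor list: (similarity to query, known activity).
neighbors = [(0.9, 1), (0.6, 1), (0.5, 0), (0.1, 0)]
print(weighted_vote(neighbors))
```

In this sketch the confidence is low when similar neighbors disagree and high when they agree, which mirrors the role of a per-prediction confidence index described above.
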
The software is implemented in C++ and was developed for Linux. Lazar depends on the OpenBabel
(http://openbabel.org) chemistry toolbox, the GNU Scientific Library, and R with the R package kernlab.
Lazar is a plugin for Ruby on Rails that exposes its functionality as a web service, in which case it also
provides a graphical user interface (GUI); it can still be executed from the command line. The input currently
accepted consists of flat files whose lines contain, respectively, a SMILES string, a YAML-formatted fragment
with occurrence numbers, or an id followed by the target activity name and value. Lazar's output is YAML,
yielding rich information about the query compound, predicted and database activities, neighbors, and
significant fragments. For further information we refer the reader to the literature listed under References
[HEL06, MAU08].
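
As an illustration of the activity input described above (an id followed by the target activity name and value), a reader might parse such a file in Python roughly as follows. This is a sketch only: the tab separator is inferred from the "custom tab separated format" stated for the exposed methods, and the function name, the endpoint name "mutagenicity", and the sample lines are hypothetical. The exact file layout should be checked against lazar itself.

```python
def parse_activities(lines):
    """Parse lines of the assumed form: id <TAB> activity name <TAB> value.

    Returns a dict: compound id -> {activity name: value as float}.
    Blank lines are skipped; malformed lines raise ValueError.
    """
    activities = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        compound_id, endpoint, value = line.split("\t")
        activities.setdefault(compound_id, {})[endpoint] = float(value)
    return activities


# Made-up sample lines for illustration.
sample = ["1\tmutagenicity\t1", "2\tmutagenicity\t0", ""]
print(parse_activities(sample))
```

Storing values as floats covers both cases mentioned in the description: class labels for classification and continuous activities for regression.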

Background (publication date, popularity/level of familiarity, rationale of approach, further comments)
Published 2006 (classification) and 2008 (regression); currently shipped with many
classification and regression endpoint datasets. A web-based prototype is available
at lazar.in-silico.de. Provides self-contained, information-rich predictions suitable
for one-click interfaces. Usable without expert knowledge; provides automatic
applicability domain estimation.

Bias (instance-selection bias, feature-selection bias, combined instance-selection/feature-selection bias, independence assumptions?, ...)
Feature-selection bias

Lazy learning/eager learning
Lazy learning

Interpretability of models (black box model?, ...)
Intuitive (neighbors, significant fragments, visual depiction).

Type of Descriptor:

Interfaces: Standalone application

Priority: Medium

Development status: Production

Homepage: http://lazar.in-silico.de

Dependencies:
External components: OpenBabel, R (with the kernlab package), GSL - GNU Scientific Library


Technical details

Data: No

Software: Yes

Programming language(s): C++

Operating system(s): Linux

Input format: custom tab delimited

Output format: custom YAML

License: GPL


References

[HEL06] C. Helma. Lazy structure-activity relationships (lazar) for the prediction of rodent carcinogenicity and Salmonella mutagenicity. Molecular Diversity, 10:147-158, 2006.
[MAU08] A. Maunz, C. Helma. Prediction of chemical toxicity with local support vector regression and activity-specific kernels. SAR and QSAR in Environmental Research, 19(5-6):413-431, 2008.