
Model


Provides access to OpenTox prediction models


Component description

Provides different representations for QSAR/toxicology models. Models are the output of learning algorithms. To use a model for prediction, a dataset with compatible descriptors/features is required. If the dataset_service parameter is given, a new dataset is created. If the result_dataset parameter is given instead, the specified dataset is updated with the predicted feature values. In other words, a new "column" for the predictions is added to the input dataset. If neither of the two parameters is given, a default dataset service is used and a new dataset is created.

REST operations

| Description | Method | URI | Parameters | Result | Status codes |
|---|---|---|---|---|---|
| Get a list of all available models | GET | /model | (optional) ?query=URI-of-the-owl:sameAs-entry | List of model URIs or RDF representation. If query is specified, returns all models for which owl:sameAs is given by the query | 200,404,503 |
| Get the representation of a model | GET | /model/{id} | - | Representation of the model in a supported MIME type | 200,404,503 |
| Delete a model | DELETE | /model/{id} | - | - | 200,404,503 |
| Apply a model to predict a dataset | POST | /model/{id} | dataset_uri=dataseturi, result_dataset=result_dataseturi, dataset_service=datasetserviceuri | URI of created prediction dataset (predictions are features), task URI for time consuming computations | 200,202,400,404,500,503 |
| Apply a model to predict a compound | POST | /model/{id} | compound_uri=compounduri | Prediction in a supported MIME type, task URI for time consuming computations | 200,202,400,404,500,503 |
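
The optional query parameter on GET /model could be used roughly as follows. This is only a sketch that builds the request without sending it: the helper name model_list_request and the example owl:sameAs value are hypothetical, and the text/uri-list Accept value is taken from the curl examples on this page.

```python
from urllib.parse import urlencode
from urllib.request import Request


def model_list_request(service_root, same_as=None):
    """Build (but do not send) a GET /model request (hypothetical helper)."""
    url = service_root.rstrip("/") + "/model"
    if same_as is not None:
        # The owl:sameAs URI is percent-encoded so it survives as one value.
        url += "?" + urlencode({"query": same_as})
    return Request(url, headers={"Accept": "text/uri-list"}, method="GET")


# Example: filter by a made-up owl:sameAs entry.
req = model_list_request("http://opentox.ntua.gr:3000",
                         same_as="http://example.org/some-model")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would perform the call; that step is left out here because the service addresses on this page may no longer be reachable.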

Notes:

  • dataset_uri=datasetURI is a mandatory parameter.
  • result_dataset=result_datasetURI points to a resulting dataset. This dataset will be updated with the predicted feature values.
  • dataset_service=datasetserviceURI points to a dataset service. A new dataset with predicted feature values will be created.
  • Either the result_dataset or the dataset_service parameter may be present.
  • If neither result_dataset nor dataset_service is specified, the model service uses a pre-configured dataset service.
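
The parameter rules in these notes can be sketched as a small helper that assembles the form body for a prediction POST. This is a minimal sketch, not part of the OpenTox API: the function name build_prediction_params is hypothetical, and rejecting a request that sets both result_dataset and dataset_service is an assumption, since the notes do not say how a service treats that case.

```python
from urllib.parse import urlencode


def build_prediction_params(dataset_uri, result_dataset=None, dataset_service=None):
    """Assemble the form-encoded body for POST /model/{id} (hypothetical helper)."""
    if not dataset_uri:
        raise ValueError("dataset_uri is mandatory")
    # Assumption: the two optional parameters are treated as mutually exclusive.
    if result_dataset and dataset_service:
        raise ValueError("give result_dataset or dataset_service, not both")

    params = {"dataset_uri": dataset_uri}
    if result_dataset:
        params["result_dataset"] = result_dataset
    elif dataset_service:
        params["dataset_service"] = dataset_service
    # With neither optional parameter set, nothing extra is sent and the model
    # service is expected to use its pre-configured dataset service.
    return urlencode(params)


body = build_prediction_params(
    "http://apps.ideaconsult.net:8080/ambit2/dataset/1038",
    dataset_service="http://apps.ideaconsult.net:8080/ambit2/dataset/",
)
print(body)
```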

Model variables

REST operations

| Description | Method | URI | Parameters | Result | Status codes |
|---|---|---|---|---|---|
| List of independent variables | GET | /model/{id}/independent | - | URI-list/RDF of features used as independent variables | 200,404,503 |
| List of dependent variables | GET | /model/{id}/dependent | - | URI-list/RDF of features used as dependent variables | 200,404,503 |
| List of predicted features | GET | /model/{id}/predicted | - | URI-list/RDF of features, where predictions are stored | 200,404,503 |

This facilitates extracting specified feature values per compound, e.g. /compound/{cid}?feature=/model/{id}/independent&feature=/model/{id}/predicted will return a representation of the compound with values for the independent variables as well as the predicted feature values.
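
A compound URI of this shape could be assembled as in the following sketch. The helper name compound_with_model_features is hypothetical, and two choices are assumptions rather than documented behaviour: the repeated feature parameter is joined with a plain "&", and "/" and ":" are left unescaped so the result matches the form shown in the text.

```python
from urllib.parse import urlencode


def compound_with_model_features(compound_uri, model_uri):
    """Append the model's independent and predicted feature lists as
    repeated 'feature' query parameters (hypothetical helper)."""
    model_uri = model_uri.rstrip("/")
    query = urlencode(
        [
            ("feature", model_uri + "/independent"),
            ("feature", model_uri + "/predicted"),
        ],
        safe="/:",  # assumption: keep URI delimiters readable, as in the text
    )
    return compound_uri + "?" + query


# Made-up ids 42 and 7, for illustration only.
url = compound_with_model_features("/compound/42", "/model/7")
print(url)
```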

 

Example curl calls

Get a list of all available model URIs at TUM (-X GET is the default and can be omitted here):

curl -X GET -H 'Accept:text/uri-list' http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/model

Get an RDF XML representation of all available models at NTUA (RDF XML is the default representation; you could also paste the URI into the address bar of your browser):

curl http://opentox.ntua.gr:3000/model

Get the RDF XML representation of model TUMOpenToxModel_kNN_9 (RDF XML is the default representation):

curl -X GET http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/model/TUMOpenToxModel_kNN_9

Make a prediction for ambit dataset 1038 with model TUMOpenToxModel_kNN_95:

curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/1038' \
-d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset/' \
http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/model/TUMOpenToxModel_kNN_95


Model representation

RDF representation defined in opentox.owl (examples)

The RDF representation contains metadata (e.g. algorithm, training dataset, parameters, dependent/independent variables) about the model. You should use PMML (if supported) to retrieve a portable version of the model.

Metadata

  • training algorithm
  • training dataset
  • training parameters
  • independent variables
  • dependent variables

Supported MIME types

Mandatory:

  • application/rdf+xml (default)

Optional:

  • application/xml (PMML)
  • text/xml (PMML)
  • text/x-yaml
  • text/x-json
  • application/json
  • ...
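
One way a client might ask for an optional representation while still accepting the mandatory one is standard HTTP content negotiation with quality values. The sketch below only builds the Accept header value. The function name accept_header is hypothetical, and whether a given OpenTox service honours q-values is an assumption.

```python
DEFAULT_TYPE = "application/rdf+xml"  # the mandatory, default representation


def accept_header(preferred=None):
    """Return an Accept header value that prefers `preferred` and falls
    back to RDF/XML (hypothetical helper)."""
    if preferred is None or preferred == DEFAULT_TYPE:
        return DEFAULT_TYPE
    # A lower q-value marks RDF/XML as acceptable but less preferred.
    return f"{preferred}, {DEFAULT_TYPE};q=0.5"


# Ask for PMML first, RDF/XML otherwise.
print(accept_header("application/xml"))
```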

HTTP status codes

| Interpretation | Nr | Name |
|---|---|---|
| Success | 200 | OK |
| Asynchronous task accepted | 202 | Accepted |
| Dataset_id is wrong | 400 | Bad Request |
| Model for specific id not found | 404 | Not Found |
| Prediction error | 500 | Internal Server Error |
| Service not available | 503 | Service Unavailable |
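
A client could map these codes onto its own handling of a prediction POST along the following lines. This is a sketch under stated assumptions: the function name interpret_prediction_response is hypothetical, and it assumes the response body is a single URI (the result dataset for 200, the task for 202), which this page implies but does not spell out.

```python
STATUS_MEANING = {
    200: "Success",
    202: "Asynchronous task accepted",
    400: "Dataset_id is wrong",
    404: "Model for specific id not found",
    500: "Prediction error",
    503: "Service not available",
}


def interpret_prediction_response(status, body):
    """Classify the response to POST /model/{id} (hypothetical helper).

    Returns ("result", uri) for a finished prediction and ("task", uri)
    for an accepted asynchronous task; raises RuntimeError otherwise.
    """
    uri = body.strip()  # assumption: the body holds exactly one URI
    if status == 200:
        return ("result", uri)
    if status == 202:
        return ("task", uri)
    meaning = STATUS_MEANING.get(status, "Unexpected status")
    raise RuntimeError(f"{status}: {meaning}")


# Made-up task URI, for illustration only.
kind, uri = interpret_prediction_response(202, "http://example.org/task/1\n")
print(kind, uri)
```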

 


URI returned on Model POST

Posted by Jeliazkova Nina at Oct 01, 2009 06:32 PM
According to the current API 1.0, the Model returns a URI to the newly calculated features (prediction results). The reason for this proposal was the idea that the dataset consists of Compound URIs and Feature Definition URIs and is able to construct FeatureValue URIs like /feature/compound/{id}/feature/{id}. This will be difficult if the URIs reside on different servers. Therefore I would suggest that the Model return a new Dataset URI that includes the Compound and Feature Definition URIs from the initial Dataset, but also adds Feature Definitions that correspond to the predicted values.

URI returned on Model POST

Posted by Helma Christoph at Oct 01, 2009 09:07 PM
My predictions return not only a prediction_feature, but also a lot of additional information (similarities, neighbors, substructures with statistical significance, etc.) that does not fit very well into our dataset definition (it is in fact an aggregation of datasets and features). Any suggestions on how to deal with such a situation?

Model XML representation

Posted by Sopasakis Pantelis at Oct 02, 2009 01:39 PM
There has been a discussion about adopting the PMML schema for the representation of (some) models. For cases where PMML does not support a model, there has been a proposal that we should design XML schemas that resemble PMML as much as possible. If the models are represented by the proposed XML schema, then the client has no information about the parameters of the trained model, and these parameters are stored internally on the server. On the other hand, if we use the PMML schema, we lose the information about the dataset URI that was used to train the model, the URI of the algorithm and other parameters. If we decide to provide PMML-compliant XML, we should include this information in it, e.g.:
<?xml version="1.0" ?>
<PMML ...>
<OpenToxModel id="http://someserver.com/model/123" name="123" algorithmId="http://someserver.com/[…]/svc" >
<DatasetID>http://someOtherServer.com/dataset/3345</DatasetID>
<AlgorithmParameters>
<!-- Tuning Parameters of the algorithm that the user provided -->
</AlgorithmParameters>
<User>chung</User>
<TimeStamp>xxx</TimeStamp>
</OpenToxModel>
<!-- The data dictionary provides a list of the variables
 involved in the model -->
<DataDictionary>
</DataDictionary>
<!-- The Model element of the PMML schema provides the parameters
of the trained model
-->
<Model>
...
</Model>
</PMML>

Model XML representation

Posted by Sopasakis Pantelis at Oct 03, 2009 09:31 PM
It would also be a good idea to provide just the XML representation http://opentox.org/[…]/XML%20schema%20for%20Model%20(API%201.1)/view?searchterm=xml%20schema and in the future also provide a PMML representation for *some* models. I know it's not an easy task to do that for some models, and PMML does not support lots of models. I'm not in favour of prioritising PMML support for Models.