
Algorithm

Provides access to OpenTox algorithms

REST operations

| Description | Method | URI | Parameters | Result | Status codes |
|---|---|---|---|---|---|
| Get URIs of all available algorithms | GET | /algorithm | (optional) ?sameas=URI-of-the-owl:sameAs-entry | List of all algorithm URIs or RDF representation. If the query parameter is present, returns all algorithms for which owl:sameAs is given by the query (that is, algorithms of a specific type) | 200, 404, 503 |
| Get the ontology representation of an algorithm | GET | /algorithm/{id} | - | Algorithm representation in one of the supported MIME types | 200, 404, 503 |
| Apply the algorithm | POST | /algorithm/{id} | dataset_uri, prediction_feature, parameter (specified by the algorithm provider), dataset_service=datasetservice_uri | model URI, dataset URI, or feature URI; redirect to task URI for time-consuming computations | 200, 303, 404, 503 |

 

Notes:

  • dataset_service=datasetservice_uri points to a dataset service. It is relevant if the output of the algorithm is a dataset (e.g. with calculated descriptors). If the dataset_service parameter is not specified, the model service uses a pre-configured dataset service.
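
The three REST operations can be illustrated as prepared HTTP requests. The following is a minimal sketch using only the Python standard library, not a reference client: the service root `http://algorithm.example.org` and the function names are hypothetical, and the requests are only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical service root; substitute a real OpenTox algorithm service.
BASE = "http://algorithm.example.org"


def list_algorithms(sameas=None):
    """GET /algorithm, optionally filtered by an owl:sameAs entry."""
    url = BASE + "/algorithm"
    if sameas is not None:
        url += "?" + urlencode({"sameas": sameas})
    return Request(url, method="GET")


def describe_algorithm(algorithm_id, mime="application/rdf+xml"):
    """GET /algorithm/{id}, asking for one of the supported MIME types."""
    return Request(f"{BASE}/algorithm/{algorithm_id}",
                   headers={"Accept": mime}, method="GET")


def apply_algorithm(algorithm_id, dataset_uri, prediction_feature=None,
                    dataset_service=None, **parameters):
    """POST /algorithm/{id} with form-encoded parameters."""
    form = {"dataset_uri": dataset_uri}
    if prediction_feature is not None:
        form["prediction_feature"] = prediction_feature
    if dataset_service is not None:
        form["dataset_service"] = dataset_service
    form.update(parameters)  # algorithm-specific parameters
    return Request(f"{BASE}/algorithm/{algorithm_id}",
                   data=urlencode(form).encode("ascii"), method="POST")
```

Passing one of these objects to `urllib.request.urlopen` would perform the call; the sketch stops short of that so it can be read and tried without a running service.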

Algorithm representation

 

Parameters

Input parameters:

  • dataset_uri is mandatory for all kinds of prediction algorithms (machine learning or otherwise), as well as for data processing algorithms.
  • prediction_feature is mandatory for prediction (classification/regression) and other supervised learning algorithms. Its value is the URI of the feature that holds the endpoint to predict.
  • parameter contains all the algorithm-specific parameters
  • TODO: Update AlgorithmTypes ontology to include information of input and output parameters

Algorithm types

Algorithm types are defined in the algorithm types ontology http://www.opentox.org/algorithms.owl

Data cleanup algorithms, subclass of http://www.opentox.org/algorithms.owl#DataCleanup

  • input parameters: dataset_uri, parameter
  • output parameters: dataset_uri

Feature selection algorithms, subclass of http://www.opentox.org/algorithmTypes.owl#FeatureSelection

  • input parameters: dataset_uri, parameter
  • output parameters: feature_uri

 

Sample curl calls for descriptor selection:

Select the 40 most informative descriptors (according to Information Gain) from dataset http://apps.ideaconsult.net:8080/ambit2/dataset/1037:

curl -X POST -d "dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/1037" -d "prediction_feature=http://apps.ideaconsult.net:8080/ambit2/feature/26701" \
 -d 'numToSelect=40' http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/InfoGainAttributeEval

 

Supervised learning algorithms, subclass of http://www.opentox.org/algorithmTypes.owl#Supervised

  • input parameters: dataset_uri, parameter, prediction_feature
  • output parameters: dataset_uri

 

Sample curl calls for learning models:

Learn a decision tree model:

curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/1037' -d 'prediction_feature=http://apps.ideaconsult.net:8080/ambit2/feature/26701' \
-d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset' http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/J48

Available options of J48 can be found at: http://opentox.informatik.tu-muenchen.de/trac/TUMOpenTox/wiki/j48.

Learn a model with the k nearest neighbor algorithm (k=5):

curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/1037' -d 'prediction_feature=http://apps.ideaconsult.net:8080/ambit2/feature/26701' \
 -d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset' -d 'KNN=5' \
http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/kNNclassification

 

 

Descriptor calculation algorithms, subclass of http://www.opentox.org/algorithmTypes.owl#DescriptorCalculation

  • input parameters: dataset_uri, parameter
  • output parameters: dataset_uri
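
The input and output parameters listed for the four algorithm types can be collected in one lookup table. The sketch below is only an illustration of that idea: the dictionary keys are short names chosen here, `parameter` stands for the algorithm-specific parameters, and `missing_mandatory` is a hypothetical helper that applies the mandatory-parameter rules from the Parameters section.

```python
# Input and output parameters per algorithm type, transcribed from the lists
# in this section. The keys are short names chosen for this sketch.
ALGORITHM_TYPES = {
    "DataCleanup": {
        "inputs": {"dataset_uri", "parameter"},
        "output": "dataset_uri",
    },
    "FeatureSelection": {
        "inputs": {"dataset_uri", "parameter"},
        "output": "feature_uri",
    },
    "Supervised": {
        "inputs": {"dataset_uri", "parameter", "prediction_feature"},
        "output": "dataset_uri",
    },
    "DescriptorCalculation": {
        "inputs": {"dataset_uri", "parameter"},
        "output": "dataset_uri",
    },
}


def missing_mandatory(algorithm_type, form):
    """Return the mandatory parameter names absent from a posted form."""
    required = ["dataset_uri"]
    # prediction_feature is mandatory wherever the type lists it as an input.
    if "prediction_feature" in ALGORITHM_TYPES[algorithm_type]["inputs"]:
        required.append("prediction_feature")
    return [name for name in required if not form.get(name)]
```

A service could run such a check before starting a calculation and answer 400 when the returned list is not empty.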

 

Sample curl calls for calculating descriptors:

Calculate all CDK features:

curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/662?feature_uris[]=http://apps.ideaconsult.net:8080/ambit2/feature/26701' \
-d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset' http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/CDKPhysChem

Calculate WienerNumbers using CDK:

curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/662?feature_uris[]=http://apps.ideaconsult.net:8080/ambit2/feature/26701' \
-d 'WienerNumbersDescriptor=true' -d 'ALL=false' -d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset' \
 http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/CDKPhysChem

Available descriptors can be found here: http://opentox.informatik.tu-muenchen.de/trac/TUMOpenTox/wiki/CDKPhysChem

Calculate structural descriptors with FreeTreeMiner with a minimum support of 80%:

curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/662?feature_uris[]=http://apps.ideaconsult.net:8080/ambit2/feature/26701' \
-d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset' -d 'minSup=0.8' http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/FTM
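
Several of the sample calls pass a dataset_uri that itself carries a query string (`?feature_uris[]=...`). curl's `-d` sends the value as written, while `--data-urlencode` percent-encodes it. The sketch below shows the encoded form of the FreeTreeMiner call body and checks that the nested URI survives a round trip. Whether a given service needs the encoded form is not stated in this document, so treat this as a precaution rather than a requirement.

```python
from urllib.parse import parse_qs, urlencode

# Dataset URI from the FreeTreeMiner sample call, including its own query part.
dataset_uri = (
    "http://apps.ideaconsult.net:8080/ambit2/dataset/662"
    "?feature_uris[]=http://apps.ideaconsult.net:8080/ambit2/feature/26701"
)

form = {
    "dataset_uri": dataset_uri,
    "dataset_service": "http://apps.ideaconsult.net:8080/ambit2/dataset",
    "minSup": "0.8",
}

# Percent-encode every value so that '?', '[', ']' and the inner '=' cannot be
# confused with the separators of the form body itself.
body = urlencode(form)

# Decoding the body again yields the original nested URI unchanged.
decoded = parse_qs(body)
print(body)
print(decoded["dataset_uri"][0] == dataset_uri)
```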

 

An algorithm service shall provide separate URLs for algorithms with default (or no) parameters and for algorithms with specific parameter values. URLs of the second type are created on the fly, when an algorithm is invoked with specific parameters or a dataset. For example, when calculating descriptors for the dataset http://dataset.service.eu/dataset/6 with a given set of parameters, the calculation service creates the following feature:
<ot:feature>
  <ot:NumericFeature rdf:about="http://dataset.service.eu/feature/1">
    <dc:creator>Name of creator</dc:creator>
    <ot:hasSource rdf:resource="http://algorithm.service.org/algorithm/FTM1/C"/>
    <owl:sameAs rdf:resource="http://www.opentox.org/api/1.2#TUM_FTM_C"/>
    <ot:units>count</ot:units>
    <dc:title>TUM_FTM_C</dc:title>
    <rdf:type rdf:resource="http://www.opentox.org/api/1.2#Feature"/>
  </ot:NumericFeature>
</ot:feature>
and internally the algorithm service creates a new algorithm entry:
http://algorithm.service.org/algorithm/FTM1/C
with a representation like below: 
<ot:Algorithm rdf:about="http://algorithm.service.org/algorithm/FTM1">
    <ot:parameters>
      <ot:Parameter>
        <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">minSup</dc:title>
        <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string"> Specifies the min support for mining (fraction). Is to be between 0 and 1</dc:description>
        <ot:paramScope rdf:datatype="http://www.w3.org/2001/XMLSchema#string">optional</ot:paramScope>
        <ot:paramValue rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.8</ot:paramValue>
      </ot:Parameter>
    </ot:parameters>
    <owl:sameAs>http://www.blueobelisk.org/ontologies/chemoinformatics-algorithms/#subtree</owl:sameAs>
    <dc:contributor>contributor.name@domain.org</dc:contributor>
    <ot:parameters>
      <ot:Parameter>
        <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">dataset_service</dc:title>
        <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">URI to the dataset service to be used</dc:description>
        <ot:paramScope rdf:datatype="http://www.w3.org/2001/XMLSchema#string">optional</ot:paramScope>
        <ot:paramValue rdf:datatype="http://www.w3.org/2001/XMLSchema#string"></ot:paramValue>
      </ot:Parameter>
    </ot:parameters>
    <ot:isA>http://www.opentox.org/algorithms.owl#DescriptorCalculation</ot:isA>
    <ot:parameters>
      <ot:Parameter>
        <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">dataset_uri</dc:title>
        <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">URI to the dataset to be used</dc:description>
        <ot:paramScope rdf:datatype="http://www.w3.org/2001/XMLSchema#string">mandatory</ot:paramScope>
        <ot:paramValue rdf:datatype="http://www.w3.org/2001/XMLSchema#string">http://dataset.service.eu/dataset/6</ot:paramValue>
      </ot:Parameter>
    </ot:parameters>
    <ot:parameters>
      <ot:Parameter>
        <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">hydrogen</dc:title>
        <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Include hydrogen atoms. </dc:description>
        <ot:paramScope rdf:datatype="http://www.w3.org/2001/XMLSchema#string">optional</ot:paramScope>
        <ot:paramValue rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</ot:paramValue>
      </ot:Parameter>
    </ot:parameters>
    <ot:isA>http://www.opentox.org/algorithms.owl#PatternMining</ot:isA>
    <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">OpenTox REST interface to the FTM algorithm implementation of TUM.</dc:description>
    <dc:contributor>contributor.name@domain.org</dc:contributor>
    <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">FreeTreeMiner </dc:title>
    <dc:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">http://algorithm.service.org/algorithm/FTM1</dc:identifier>
    <dc:creator>creator.name@domain.org</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">Tue Jun 22 16:14:24 CEST 2010</dc:date>
  </ot:Algorithm>
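
A client can read the parameter entries out of such a representation. The sketch below uses plain XML parsing from the Python standard library rather than a real RDF parser, so it only works for this particular serialisation shape. The fragment above omits its namespace declarations, so the `ot` namespace URI used here is an assumption, and `read_parameters` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace URIs. rdf and dc are the standard ones; the ot URI is an
# assumption, because the fragment above does not show its declarations.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ot": "http://www.opentox.org/api/1.1#",
}

# Trimmed copy of the representation above, wrapped in an rdf:RDF root.
DOC = f"""<rdf:RDF xmlns:rdf="{NS['rdf']}" xmlns:dc="{NS['dc']}" xmlns:ot="{NS['ot']}">
  <ot:Algorithm rdf:about="http://algorithm.service.org/algorithm/FTM1">
    <ot:parameters>
      <ot:Parameter>
        <dc:title>minSup</dc:title>
        <ot:paramScope>optional</ot:paramScope>
        <ot:paramValue>0.8</ot:paramValue>
      </ot:Parameter>
    </ot:parameters>
    <ot:parameters>
      <ot:Parameter>
        <dc:title>dataset_uri</dc:title>
        <ot:paramScope>mandatory</ot:paramScope>
        <ot:paramValue>http://dataset.service.eu/dataset/6</ot:paramValue>
      </ot:Parameter>
    </ot:parameters>
  </ot:Algorithm>
</rdf:RDF>"""


def read_parameters(rdf_xml):
    """Return {title: (scope, value)} for every ot:Parameter element."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for param in root.iterfind(".//ot:Parameter", NS):
        title = param.findtext("dc:title", default="", namespaces=NS).strip()
        scope = param.findtext("ot:paramScope", default="", namespaces=NS).strip()
        value = param.findtext("ot:paramValue", default="", namespaces=NS).strip()
        result[title] = (scope, value)
    return result


print(read_parameters(DOC))
```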
  • If a client would like to use exactly the same algorithm settings to calculate http://dataset.service.eu/feature/1, it uses http://algorithm.service.org/algorithm/FTM1, available via ot:hasSource. This works in a uniform way for all kinds of algorithms, regardless of whether they have parameters;
  • All the complexity is hidden within the algorithm service;
  • If a calculation with the generic http://algorithm.service.org/algorithm/FTM algorithm is initiated with a specific set of parameters, the service might look up internally whether such a set already exists and reuse it; otherwise a new algorithm URL is created along with the calculation;
  • A calculation with the http://algorithm.service.org/algorithm/FTM1 algorithm is initiated without parameters, since these are already known;
  • If a calculation with the http://algorithm.service.org/algorithm/FTM1 algorithm is initiated with a specific set of parameters, the service shall return error 400 "Bad Request".
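
The lookup-and-reuse behaviour described in these bullets could be modelled as a small in-memory registry. This is a toy sketch under stated assumptions, not the behaviour of any actual OpenTox service: the class name, the generic URL `.../algorithm/FTM`, and the scheme of appending a counter to form parameterised URLs are all hypothetical, and the returned status is only a marker for what an HTTP layer would send.

```python
class AlgorithmRegistry:
    """Toy model of generic versus parameterised algorithm URLs."""

    def __init__(self):
        self._by_settings = {}  # (generic URL, frozen parameters) -> URL
        self._fixed = set()     # URLs whose parameters are already fixed
        self._counter = {}      # generic URL -> number of URLs derived from it

    def resolve(self, algorithm_url, parameters):
        """Return (status, url) for a calculation request."""
        if algorithm_url in self._fixed:
            # Parameterised algorithm: its settings are already known, so
            # passing parameters again is a client error.
            if parameters:
                return 400, None
            return 200, algorithm_url
        if not parameters:
            # Generic algorithm run with its default parameters.
            return 200, algorithm_url
        key = (algorithm_url, frozenset(parameters.items()))
        if key not in self._by_settings:
            # First time this parameter set is seen: create a new URL.
            count = self._counter.get(algorithm_url, 0) + 1
            self._counter[algorithm_url] = count
            new_url = f"{algorithm_url}{count}"
            self._by_settings[key] = new_url
            self._fixed.add(new_url)
        return 200, self._by_settings[key]


registry = AlgorithmRegistry()
generic = "http://algorithm.service.org/algorithm/FTM"  # assumed generic URL
print(registry.resolve(generic, {"minSup": "0.8"}))
```

Keying the lookup on a frozen set of parameter pairs makes the order in which parameters are posted irrelevant, which is one simple way to detect that a set already exists.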

 

Return results:

 

  • Restrict results to URLs
  • Agree on how the resulting URL is returned. There are several options (not mutually exclusive):
      - If no content is returned, the URL is in the Location HTTP header. This is mandatory for redirect responses, such as those returning task IDs, but the header can be present in any other response.
      - If content is returned, then depending on the content type it can be text/uri-list or an RDF representation, perhaps containing only a simple RDF node with the URL as the node identifier and the object type.
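
On the client side, the first two options can be handled by one small function. The sketch below is an assumption-laden illustration: `result_uris` is a hypothetical helper, it expects header names already lower-cased in a plain dict, and it leaves RDF bodies to a proper RDF parser.

```python
def result_uris(headers, body=""):
    """Extract result URIs from a response.

    headers: dict with lower-case header names (an assumption of this sketch).
    body:    decoded response body, possibly empty.
    """
    content_type = headers.get("content-type", "").split(";")[0].strip()
    if content_type == "text/uri-list" and body.strip():
        # text/uri-list: one URI per line; lines starting with '#' are comments.
        return [line.strip() for line in body.splitlines()
                if line.strip() and not line.startswith("#")]
    if "location" in headers:
        # No usable content: fall back to the redirect target.
        return [headers["location"]]
    # RDF representations are not handled here.
    return []


print(result_uris({"location": "http://algorithm.example.org/task/1"}))
```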

 

Supported MIME types

Mandatory:

  • application/rdf+xml (default)

Optional:

  • application/xml (PMML)
  • text/xml (PMML)
  • text/x-yaml
  • text/x-json
  • application/json
  • ...
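
How a service might pick one of these types from a request's Accept header can be sketched as follows. This is a simplified illustration, not a full implementation of HTTP content negotiation: it ignores wildcards other than `*/*`, the `SUPPORTED` list describes a hypothetical service that offers every type named above, and `choose_mime` is an invented name.

```python
# Types named in the list above; the first one is the mandatory default.
SUPPORTED = [
    "application/rdf+xml",
    "application/xml",
    "text/xml",
    "text/x-yaml",
    "text/x-json",
    "application/json",
]
DEFAULT = SUPPORTED[0]


def choose_mime(accept):
    """Return the supported type with the highest q-value, or None."""
    if not accept:
        return DEFAULT
    best, best_q = None, 0.0
    for item in accept.split(","):
        parts = [part.strip() for part in item.split(";")]
        mime, q = parts[0], 1.0
        for part in parts[1:]:
            if part.startswith("q="):
                try:
                    q = float(part[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        if mime == "*/*":
            mime = DEFAULT  # anything goes: offer the default type
        if mime in SUPPORTED and q > best_q:
            best, best_q = mime, q
    return best


print(choose_mime("text/x-yaml;q=0.2, application/json;q=0.9"))
```

A return value of None means that none of the requested types is available; how the service reports that is left to the caller.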

HTTP status codes

| Interpretation | Nr | Name |
|---|---|---|
| Success | 200 | OK |
| No algorithm in the respective category found, or specific algorithm not found | 404 | Not Found |
| Incorrect dataset URI, or incorrect parameters | 400 | Bad Request |
| Model building error | 500 | Internal Server Error |
| Model building in progress (redirect to task URI) | 303 | See Other (redirect) |
| Service not available | 503 | Service Unavailable |
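
A client could turn these status codes into a next action along the following lines. The interpretations are transcribed from the table; the action strings and the function name `next_step` are hypothetical choices made for this sketch.

```python
# Status codes and interpretations transcribed from the table above.
INTERPRETATION = {
    200: "Success",
    303: "Model building in progress (redirect to task URI)",
    400: "Incorrect dataset URI, or incorrect parameters",
    404: "No algorithm in the respective category found, "
         "or specific algorithm not found",
    500: "Model building error",
    503: "Service not available",
}


def next_step(status, location=None):
    """Map a status code to a hypothetical client action."""
    if status not in INTERPRETATION:
        raise ValueError(f"status {status} is not defined for this interface")
    if status == 200:
        return "done"
    if status == 303:
        if not location:
            raise ValueError("303 without a task URI to follow")
        return "poll " + location
    if status == 503:
        return "retry later"
    # 400, 404 and 500 are reported back to the caller.
    return "give up: " + INTERPRETATION[status]


print(next_step(303, "http://algorithm.example.org/task/1"))
```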

Background:

This is a generic interface for OpenTox algorithms. As algorithms can be used for a wide variety of purposes (e.g. model building, feature calculation, feature selection, similarity calculation, substructure matching), required and optional input parameters and algorithm results (e.g. model or dataset URIs, literal values) have to be specified in the algorithm representation together with a definition of the algorithm.

 

Feature Selection

Posted by Sopasakis Pantelis at Oct 02, 2009 02:27 PM
There is no POST for feature selection algorithms. I propose the following RESTful operation:

POST /algorithm/preprocessing/featureselection/{feat_sel_algorithm_id}
posted parameters: dataset_uri, feature_definition
supported mime types: text/uri-list
returns: URI list of (selected) feature definitions.

questions on proposal by CH

Posted by Tobias Girschick at Oct 08, 2009 02:29 PM
I am not totally clear how you envision the algorithm ontology. Here
http://opentox.org/dev/apis/api-1.1/Algorithm you propose that a
GET on /algorithm/{id} returns an algorithm_ontology_uri. Could you
specify what I do get back from something like that:

GET /algorithm_ontology/{id}

Would this return the XML (or whatever we choose) representation of the
algorithm?

But if this works out, I basically agree with a simplified API
(/algorithm/{id})

Algorithm - Dataset compatibility

Posted by Jeliazkova Nina at Dec 17, 2009 09:54 AM
Algorithm Ontology to be extended to include information what kind of features/datasets are compatible with particular types of algorithms.

e.g.
- algorithm requires compound structures
- algorithm requires numerical features only