Algorithm
Provides access to OpenTox algorithms
REST operations
Description | Method | URI | Parameters | Result | Status codes |
Get URIs of all available algorithms | GET | /algorithm | (optional) ?sameas=URI-of-the-owl:sameAs-entry | List of all algorithm URIs or RDF representation, or algorithms of specific types if the query parameter is present (returns all algorithms for which owl:sameAs is given by the query) | 200,404,503 |
Get the ontology representation of an algorithm | GET | /algorithm/{id} | - | Algorithm representation in one of the supported MIME types | 200,404,503 |
Apply the algorithm | POST | /algorithm/{id} | dataset_uri, prediction_feature, parameter (specified by the algorithm provider), dataset_service=datasetservice_uri | model URI, dataset URI or feature URI; redirect to task URI for time-consuming computations | 200,303,404,503 |
Notes:
- dataset_service=datasetservice_uri points to a dataset service. It is relevant if the output of the algorithm is a dataset (e.g. with calculated descriptors). If the dataset_service parameter is not specified, the model service uses a pre-configured dataset service.
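The "Apply the algorithm" operation can be prepared on the client side roughly as follows. This is a minimal sketch, not part of the API definition: the helper name build_apply_request and the example.org URIs are hypothetical, the request is only constructed and never sent, and algorithm-specific parameters are assumed to travel as plain form fields next to dataset_uri.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_apply_request(algorithm_uri, dataset_uri, prediction_feature=None,
                        dataset_service=None, **algorithm_params):
    """Build (but do not send) a POST request that applies an algorithm."""
    fields = {"dataset_uri": dataset_uri}
    if prediction_feature is not None:
        fields["prediction_feature"] = prediction_feature
    if dataset_service is not None:
        # Optional: without it the service uses its pre-configured dataset service.
        fields["dataset_service"] = dataset_service
    # Algorithm-specific parameters, as defined by the algorithm provider.
    fields.update(algorithm_params)
    body = urlencode(fields).encode("ascii")
    return Request(algorithm_uri, data=body, method="POST")


# Hypothetical service and resource URIs, for illustration only.
request = build_apply_request(
    "http://algorithm.example.org/algorithm/J48",
    dataset_uri="http://dataset.example.org/dataset/1",
    prediction_feature="http://dataset.example.org/feature/2",
)
print(request.get_method(), request.full_url)
print(parse_qs(request.data.decode("ascii")))
```

Passing the resulting object to urllib.request.urlopen would perform the call; that step is left out here so the sketch runs without a live service.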
Algorithm representation
- RDF representation defined in OpenTox API ontology (examples)
- All algorithms are subclasses of http://www.opentox.org/api/1.1#Algorithm
- Algorithm type in RDF representation is set by direct subclassing (rdf:type) of a class from the algorithm types ontology (ota: http://www.opentox.org/algorithms.owl), e.g. <myalgorithm> rdf:type ota:Classification.
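A client can read the algorithm type back from an RDF/XML representation. The sketch below is illustrative only: the sample document and the myalgorithm URI are made up, the ota namespace is assumed to be http://www.opentox.org/algorithms.owl#, and the code handles only this simple RDF/XML shape rather than arbitrary RDF serializations.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OTA = "http://www.opentox.org/algorithms.owl#"  # assumed ota: namespace

# Hypothetical representation, trimmed to the type statement.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ot="http://www.opentox.org/api/1.1#">
  <ot:Algorithm rdf:about="http://algorithm.example.org/algorithm/myalgorithm">
    <rdf:type rdf:resource="http://www.opentox.org/algorithms.owl#Classification"/>
  </ot:Algorithm>
</rdf:RDF>"""


def algorithm_types(rdf_xml):
    """Return the ota: class names attached via rdf:type, without the namespace."""
    root = ET.fromstring(rdf_xml)
    types = []
    for node in root.iter(f"{{{RDF}}}type"):
        resource = node.get(f"{{{RDF}}}resource", "")
        if resource.startswith(OTA):
            types.append(resource[len(OTA):])
    return types


print(algorithm_types(SAMPLE))
```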
Parameters
Input parameters:
- dataset_uri is mandatory for all kinds of prediction algorithms (machine learning or otherwise), as well as for data processing algorithms.
- prediction_feature is mandatory for prediction (classification/regression) and other supervised learning algorithms. The URI of the feature with the endpoint to predict is expected as value.
- parameter contains all the algorithm-specific parameters
- TODO: Update AlgorithmTypes ontology to include information of input and output parameters
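The rules for mandatory input parameters can be checked before a calculation starts. The following is a sketch under stated assumptions: the category keys "supervised" and "data_processing" and the function name are hypothetical, and a real service would derive the category from the algorithm types ontology.

```python
# Mandatory form fields per kind of algorithm, following the list above.
MANDATORY = {
    "supervised": ("dataset_uri", "prediction_feature"),
    "data_processing": ("dataset_uri",),
}


def missing_parameters(kind, posted):
    """Return the mandatory parameters that are absent or empty in `posted`."""
    return [name for name in MANDATORY[kind] if not posted.get(name)]


posted = {"dataset_uri": "http://dataset.example.org/dataset/1"}
print(missing_parameters("supervised", posted))
print(missing_parameters("data_processing", posted))
```

A non-empty result would correspond to the 400 "Bad request" case (incorrect parameters) in the status code table.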
Algorithm types
Algorithm types are defined in the algorithm types ontology http://www.opentox.org/algorithms.owl
Data cleanup algorithms , subclass of http://www.opentox.org/algorithms.owl#DataCleanup
- input parameters: dataset_uri , parameter
- output parameters: dataset_uri
Feature selection algorithms , subclass of http://www.opentox.org/algorithmTypes.owl#FeatureSelection
- input parameters: dataset_uri , parameter
- output parameters: feature_uri
Sample curl calls for descriptor selection:
Select the 40 most informative descriptors (according to Information Gain) from dataset http://apps.ideaconsult.net:8080/ambit2/dataset/1037:
curl -X POST -d "dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/1037" \
  -d "prediction_feature=http://apps.ideaconsult.net:8080/ambit2/feature/26701" \
  -d 'numToSelect=40' \
  http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/InfoGainAttributeEval
Supervised learning algorithms , subclass of http://www.opentox.org/algorithmTypes.owl#Supervised
- input parameters: dataset_uri , parameter, prediction_feature
- output parameters: dataset_uri
Sample curl calls for learning models:
Learn a decision tree model:
curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/1037' \
  -d 'prediction_feature=http://apps.ideaconsult.net:8080/ambit2/feature/26701' \
  -d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset' \
  http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/J48
Available options of J48 can be found at: http://opentox.informatik.tu-muenchen.de/trac/TUMOpenTox/wiki/j48.
Learn a model with the k nearest neighbor algorithm (k=5):
curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/1037' \
  -d 'prediction_feature=http://apps.ideaconsult.net:8080/ambit2/feature/26701' \
  -d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset' \
  -d 'KNN=5' \
  http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/kNNclassification
Descriptor calculation algorithms subclass of http://www.opentox.org/algorithmTypes.owl#DescriptorCalculation
- input parameters: dataset_uri , parameter
- output parameters: dataset_uri
Sample curl calls for calculating descriptors:
Calculate all CDK features:
curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/662?feature_uris[]=http://apps.ideaconsult.net:8080/ambit2/feature/26701' \
  -d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset' \
  http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/CDKPhysChem
Calculate WienerNumbers using CDK:
curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/662?feature_uris[]=http://apps.ideaconsult.net:8080/ambit2/feature/26701' \
  -d 'WienerNumbersDescriptor=true' -d 'ALL=false' \
  -d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset' \
  http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/CDKPhysChem
Available descriptors can be found here: http://opentox.informatik.tu-muenchen.de/trac/TUMOpenTox/wiki/CDKPhysChem
Calculate structural descriptors with FreeTreeMiner with a minimum support of 80%:
curl -X POST -d 'dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/662?feature_uris[]=http://apps.ideaconsult.net:8080/ambit2/feature/26701' \
  -d 'dataset_service=http://apps.ideaconsult.net:8080/ambit2/dataset' \
  -d 'minSup=0.8' \
  http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/FTM
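The descriptor calculation calls pass a dataset URI that carries a feature_uris[] query parameter. Composing such a URI can be sketched as follows. The helper name is hypothetical, the dataset URI is assumed to have no query string yet, and leaving brackets, colons and slashes unescaped is a choice made here only to mirror the sample calls.

```python
from urllib.parse import urlencode


def restrict_to_features(dataset_uri, feature_uris):
    """Append one feature_uris[] query parameter per feature URI."""
    query = urlencode([("feature_uris[]", uri) for uri in feature_uris],
                      safe="[]:/")
    return f"{dataset_uri}?{query}"


uri = restrict_to_features(
    "http://apps.ideaconsult.net:8080/ambit2/dataset/662",
    ["http://apps.ideaconsult.net:8080/ambit2/feature/26701"],
)
print(uri)
```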
An Algorithm service shall provide separate URLs for algorithms with default (or without) parameters and for algorithms with specific parameter values. URLs of the second type are created on the fly, when an algorithm is invoked with specific parameters or a dataset. For example, when calculating descriptors, depending on the dataset http://dataset.service.eu/dataset/6 and a set of parameters, the calculation service creates the following feature:
<ot:feature>
  <ot:NumericFeature rdf:about="http://dataset.service.eu/feature/1">
    <dc:creator>Name of creator</dc:creator>
    <ot:hasSource rdf:resource="http://algorithm.service.org/algorithm/FTM1/C"/>
    <owl:sameAs rdf:resource="http://www.opentox.org/api/1.2#TUM_FTM_C"/>
    <ot:units>count</ot:units>
    <dc:title>TUM_FTM_C</dc:title>
    <rdf:type rdf:resource="http://www.opentox.org/api/1.2#Feature"/>
  </ot:NumericFeature>
</ot:feature>
and internally the algorithm service creates a new algorithm entry:
http://algorithm.service.org/algorithm/FTM1/C
with a representation like below:
<ot:Algorithm rdf:about="http://algorithm.service.org/algorithm/FTM1">
  <ot:parameters>
    <ot:Parameter>
      <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">minSup</dc:title>
      <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Specifies the minimum support for mining (fraction). Must be between 0 and 1.</dc:description>
      <ot:paramScope rdf:datatype="http://www.w3.org/2001/XMLSchema#string">optional</ot:paramScope>
      <ot:paramValue rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.8</ot:paramValue>
    </ot:Parameter>
  </ot:parameters>
  <owl:sameAs>http://www.blueobelisk.org/ontologies/chemoinformatics-algorithms/#subtree</owl:sameAs>
  <dc:contributor>contributor.name@domain.org</dc:contributor>
  <ot:parameters>
    <ot:Parameter>
      <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">dataset_service</dc:title>
      <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">URI to the dataset service to be used</dc:description>
      <ot:paramScope rdf:datatype="http://www.w3.org/2001/XMLSchema#string">optional</ot:paramScope>
      <ot:paramValue rdf:datatype="http://www.w3.org/2001/XMLSchema#string"></ot:paramValue>
    </ot:Parameter>
  </ot:parameters>
  <ot:isA>http://www.opentox.org/algorithms.owl#DescriptorCalculation</ot:isA>
  <ot:parameters>
    <ot:Parameter>
      <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">dataset_uri</dc:title>
      <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">URI to the dataset to be used</dc:description>
      <ot:paramScope rdf:datatype="http://www.w3.org/2001/XMLSchema#string">mandatory</ot:paramScope>
      <ot:paramValue rdf:datatype="http://www.w3.org/2001/XMLSchema#string">http://dataset.service.eu/dataset/6</ot:paramValue>
    </ot:Parameter>
  </ot:parameters>
  <ot:parameters>
    <ot:Parameter>
      <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">hydrogen</dc:title>
      <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Include hydrogen atoms.</dc:description>
      <ot:paramScope rdf:datatype="http://www.w3.org/2001/XMLSchema#string">optional</ot:paramScope>
      <ot:paramValue rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</ot:paramValue>
    </ot:Parameter>
  </ot:parameters>
  <ot:isA>http://www.opentox.org/algorithms.owl#PatternMining</ot:isA>
  <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">OpenTox REST interface to the FTM algorithm implementation of TUM.</dc:description>
  <dc:contributor>contributor.name@domain.org</dc:contributor>
  <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">FreeTreeMiner</dc:title>
  <dc:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">http://algorithm.service.org/algorithm/FTM1</dc:identifier>
  <dc:creator>creator.name@domain.org</dc:creator>
  <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">Tue Jun 22 16:14:24 CEST 2010</dc:date>
</ot:Algorithm>
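A client that wants to inspect the recorded parameter values can walk the ot:Parameter entries of such a representation. The sketch below is illustrative: the sample is a shortened copy of the representation above wrapped in an rdf:RDF element, the ot namespace URI is an assumption (the snippet above does not show its xmlns declarations), and read_parameters is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ot": "http://www.opentox.org/api/1.1#",  # assumed namespace URI
}

# Shortened sample; datatype attributes and descriptions are omitted.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ot="http://www.opentox.org/api/1.1#">
  <ot:Algorithm rdf:about="http://algorithm.service.org/algorithm/FTM1">
    <ot:parameters>
      <ot:Parameter>
        <dc:title>minSup</dc:title>
        <ot:paramScope>optional</ot:paramScope>
        <ot:paramValue>0.8</ot:paramValue>
      </ot:Parameter>
    </ot:parameters>
    <ot:parameters>
      <ot:Parameter>
        <dc:title>dataset_uri</dc:title>
        <ot:paramScope>mandatory</ot:paramScope>
        <ot:paramValue>http://dataset.service.eu/dataset/6</ot:paramValue>
      </ot:Parameter>
    </ot:parameters>
  </ot:Algorithm>
</rdf:RDF>"""


def read_parameters(rdf_xml):
    """Map each parameter title to a (scope, value) pair."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for param in root.iterfind(".//ot:Parameter", NS):
        title = param.findtext("dc:title", default="", namespaces=NS).strip()
        scope = param.findtext("ot:paramScope", default="", namespaces=NS).strip()
        value = param.findtext("ot:paramValue", default="", namespaces=NS).strip()
        result[title] = (scope, value)
    return result


for title, (scope, value) in read_parameters(SAMPLE).items():
    print(title, scope, value)
```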
- If a client would like to use exactly the same algorithm settings to calculate http://dataset.service.eu/feature/1, it will use http://algorithm.service.org/algorithm/FTM1, available via ot:hasSource, in a uniform way for all kinds of algorithms, regardless of the existence of parameters;
- All the complexity is hidden within the algorithm service;
- A calculation with the http://algorithm.service.org/algorithm/FTM1 algorithm is initiated without parameters, since these are already known;
- If a calculation with the http://algorithm.service.org/algorithm/FTM1 algorithm is initiated with a specific set of parameters, the service shall throw an error 400 "Bad request".
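The bookkeeping described above can be sketched as a small in-memory registry. Everything here is an assumption made for illustration: the class and method names, the scheme of appending a counter to the generic URI, and the use of status 200 where a real service may answer 303 with a task URI.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class AlgorithmEntry:
    uri: str
    # None marks a generic algorithm; a dict marks an entry with fixed values.
    fixed_parameters: Optional[dict] = None


class AlgorithmRegistry:
    """In-memory stand-in for the bookkeeping inside an algorithm service."""

    def __init__(self):
        self._entries = {}
        self._counter = count(1)

    def register_generic(self, uri):
        self._entries[uri] = AlgorithmEntry(uri)
        return self._entries[uri]

    def invoke(self, uri, posted):
        """Return (HTTP status, URI of the algorithm entry that was used)."""
        entry = self._entries.get(uri)
        if entry is None:
            return 404, None
        if entry.fixed_parameters is not None:
            # Parameters are already known, so posting any is a client error.
            return (400, None) if posted else (200, entry.uri)
        # Generic algorithm: record the posted values under a new URI.
        derived = AlgorithmEntry(f"{uri}{next(self._counter)}", dict(posted))
        self._entries[derived.uri] = derived
        return 200, derived.uri


registry = AlgorithmRegistry()
generic = "http://algorithm.service.org/algorithm/FTM"
registry.register_generic(generic)
status, derived_uri = registry.invoke(
    generic,
    {"dataset_uri": "http://dataset.service.eu/dataset/6", "minSup": "0.8"},
)
print(status, derived_uri)
print(registry.invoke(derived_uri, {}))
print(registry.invoke(derived_uri, {"minSup": "0.5"}))
```

The point of the design is that clients treat the derived URI like any other algorithm URI, while the parameter handling stays inside the service.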
Return results:
- Restrict results to URL
- Agree on how the resulting URL is returned. There are several options (not mutually exclusive)
Supported MIME types
Mandatory:
- application/rdf+xml (default)
Optional:
- application/xml (PMML)
- text/xml (PMML)
- text/x-yaml
- text/x-json
- application/json
- ...
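A service has to pick one of these MIME types for each response. The sketch below shows one possible selection based on the HTTP Accept header. It is an assumption, not a rule from this page: the function name is hypothetical, only the types listed above are considered, wildcards other than */* are ignored, and the caller decides what to do when nothing matches (the function returns None).

```python
SUPPORTED = (
    "application/rdf+xml", "application/xml", "text/xml",
    "text/x-yaml", "text/x-json", "application/json",
)
DEFAULT = "application/rdf+xml"


def choose_mime(accept_header, supported=SUPPORTED):
    """Pick the preferred supported MIME type from an Accept header value."""
    if not accept_header:
        return DEFAULT
    ranked = []
    for position, item in enumerate(accept_header.split(",")):
        parts = [part.strip() for part in item.split(";")]
        mime, quality = parts[0].lower(), 1.0
        for part in parts[1:]:
            if part.startswith("q="):
                try:
                    quality = float(part[2:])
                except ValueError:
                    quality = 0.0
        if mime and quality > 0:
            # Higher quality first; ties keep the order given by the client.
            ranked.append((-quality, position, mime))
    for _, _, mime in sorted(ranked):
        if mime == "*/*":
            return DEFAULT
        if mime in supported:
            return mime
    return None


print(choose_mime("text/html, application/json;q=0.8"))
print(choose_mime(None))
```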
HTTP status codes
Interpretation | Nr | Name |
Success | 200 | OK |
No algorithm in the respective category found, or specific algorithm not found | 404 | Not Found |
Incorrect dataset URI, or incorrect parameters | 400 | Bad Request |
Model building error | 500 | Internal Server Error |
Model building in progress (redirect to task URI) | 303 | See Other |
Service not available | 503 | Service Unavailable |
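On the client side, the table translates into a small dispatch on the response status. This is a sketch only: the action labels and the function name are hypothetical, and the task URI is assumed to arrive in the standard HTTP Location header of the 303 response.

```python
def interpret_status(status, location=None):
    """Translate a response status into the next client action."""
    if status == 200:
        return ("done", None)
    if status == 303:
        # Long-running computation: follow the task URI and poll it.
        return ("poll_task", location)
    if status == 400:
        return ("fix_request", None)
    if status == 404:
        return ("not_found", None)
    if status in (500, 503):
        return ("server_error", None)
    return ("unexpected", None)


print(interpret_status(303, "http://algorithm.example.org/task/1"))
print(interpret_status(400))
```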
Background:
This is a generic interface for OpenTox algorithms. As algorithms can be used for a wide variety of purposes (e.g. model building, feature calculation, feature selection, similarity calculation, substructure matching), required and optional input parameters and algorithm results (e.g. model or dataset URIs, literal values) have to be specified in the algorithm representation together with a definition of the algorithm.
Feature Selection
POST /algorithm/preprocessing/featureselection/{feat_sel_algorithm_id}
posted parameters: dataset_uri, feature_definition
supported mime types: text/uri-list
returns: URI list of (selected) feature definitions.
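The text/uri-list response can be turned into a list of feature URIs with a few lines. This is a minimal sketch: the function name and the example.org URIs are hypothetical, and lines starting with "#" are treated as comments, as the text/uri-list format allows.

```python
def parse_uri_list(body):
    """Return the URIs in a text/uri-list body, skipping comments and blanks."""
    uris = []
    for line in body.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            uris.append(line)
    return uris


# Hypothetical response body with two selected features.
body = (
    "# selected features\r\n"
    "http://dataset.example.org/feature/1\r\n"
    "http://dataset.example.org/feature/2\r\n"
)
print(parse_uri_list(body))
```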