API proposal for applicability domain estimation

An API proposal that attempts to unify different approaches to applicability domain (AD) estimation.

Applicability domain in OpenTox framework:

An applicability domain procedure is an OpenTox Algorithm.
An applicability domain "model" is created by POSTing a dataset URI to an applicability domain algorithm URI. This creates an ot:Model with type ota:ApplicabilityDomain and returns an "AD-model" URI.
Alternatively, for an AD embedded in a predictive model, just declare an additional rdf:type of ota:ApplicabilityDomain on the model.
An applicability domain estimation is done by POSTing a dataset to the "AD-model" URI. This generates another dataset with an extra feature that tells whether the corresponding compound belongs to the applicability domain (or, in fuzzy terms, to what degree it belongs to that set).
For models with an embedded AD, POSTing a dataset to the model generates both prediction results and AD estimates.
All models provide the estimation results as specified below.
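The workflow above can be sketched as two HTTP requests. This is a minimal sketch, not a reference client: the host names and URIs are hypothetical, and the form parameter name `dataset_uri` and the `text/uri-list` response type are assumptions that should be checked against the service in use. The requests are only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post(target_uri: str, dataset_uri: str) -> Request:
    """Build (but do not send) a form-encoded POST carrying a dataset URI."""
    # 'dataset_uri' is an assumed parameter name, not defined by this proposal.
    body = urlencode({"dataset_uri": dataset_uri}).encode("ascii")
    return Request(
        target_uri,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "text/uri-list",  # assumed response type
        },
    )


# Hypothetical URIs, for illustration only.
AD_ALGORITHM = "http://example.org/algorithm/leverage"
TRAINING_SET = "http://example.org/dataset/1"
QUERY_SET = "http://example.org/dataset/2"

# Step 1: POST the training dataset to the AD algorithm; the service is
# expected to answer with the URI of the new AD model.
create_model = build_post(AD_ALGORITHM, TRAINING_SET)

# Step 2: POST a dataset to the AD model; the service is expected to answer
# with the URI of a dataset that holds the AD estimates.
ad_model_uri = "http://example.org/model/leverage-ad-model"  # stands in for the step 1 result
estimate = build_post(ad_model_uri, QUERY_SET)
```

For a model with an embedded AD, only the second request is needed, addressed to the predictive model itself.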

Applicability domain RDF representation:

A predictive model can be assigned an external or an embedded applicability domain.

In case of AD external to the model:

@prefix ot:      <http://www.opentox.org/api/1.1#> .
@prefix ota:     <http://www.opentox.org/algorithmTypes.owl#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

</model/mlr-model> ot:hasDomain </model/leverage-ad-model>.

</model/mlr-model> rdf:type ot:Model.
</model/mlr-model> ot:algorithm </algorithm/mlr>.
</algorithm/mlr> rdf:type ot:Algorithm.
</algorithm/mlr> rdf:type ota:Regression.

</model/leverage-ad-model> rdf:type ot:Model.
</model/leverage-ad-model> ot:algorithm </algorithm/leverage>.
</algorithm/leverage> rdf:type ot:Algorithm.
</algorithm/leverage> rdf:type ota:ApplicabilityDomain.

In case of AD embedded in the model:

@prefix ot:      <http://www.opentox.org/api/1.1#> .
@prefix ota:     <http://www.opentox.org/algorithmTypes.owl#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<lazar-model> ot:hasDomain <lazar-model>.

<lazar-model> rdf:type ot:Model.
<lazar-model> ot:algorithm </algorithm/lazar>.

</algorithm/lazar> rdf:type ot:Algorithm.

</algorithm/lazar> rdf:type ota:ApplicabilityDomain.
</algorithm/lazar> rdf:type ota:LazyLearning.
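The two cases can be told apart by looking at where ot:hasDomain points. The following is a minimal sketch under stated assumptions: triples are held as plain (subject, predicate, object) string tuples rather than parsed RDF, and the function name `ad_kind` is hypothetical.

```python
HAS_DOMAIN = "ot:hasDomain"


def ad_kind(model, triples):
    """Return 'embedded', 'external' or None, based on the model's ot:hasDomain."""
    domains = {o for s, p, o in triples if s == model and p == HAS_DOMAIN}
    if not domains:
        return None  # no applicability domain declared for this model
    # A model that names only itself as its domain has an embedded AD;
    # any other target is treated as an external AD model.
    return "embedded" if domains == {model} else "external"


# The ot:hasDomain and rdf:type statements from the two examples above.
triples = [
    ("/model/mlr-model", "ot:hasDomain", "/model/leverage-ad-model"),
    ("/model/mlr-model", "rdf:type", "ot:Model"),
    ("lazar-model", "ot:hasDomain", "lazar-model"),
    ("lazar-model", "rdf:type", "ot:Model"),
]

print(ad_kind("/model/mlr-model", triples))
print(ad_kind("lazar-model", triples))
```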

Results from applicability domain estimation

By analogy with ot:predictedVariables, which specifies the features where prediction results are stored, one can specify which features hold the results of AD estimation (suggestions for better property names than ot:adMembership and ot:adMetric are welcome!):

@prefix ot:      <http://www.opentox.org/api/1.1#> .

# the estimated value, e.g. leverage
ot:Model ot:adMetric ot:Feature.

# the decision for AD membership, based on the estimated value, e.g. "out-of-domain" if leverage > threshold
# the value type still has to be agreed on: boolean, numeric, string, nominal?
ot:Model ot:adMembership ot:Feature.
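How an ot:adMembership value might be derived from an ot:adMetric value can be sketched as a simple threshold test. This is only an illustration: the function name, the thresholds, and the boolean result type are assumptions (the value type is still open, as noted above). Because the direction of the comparison depends on the metric (a similarity or confidence score grows towards the domain, a distance-like score grows away from it), the sketch takes the direction as a parameter.

```python
def ad_membership(metric: float, threshold: float, higher_is_inside: bool) -> bool:
    """Turn a numeric AD metric into a crisp in-domain decision."""
    if higher_is_inside:
        # e.g. a confidence or similarity score
        return metric >= threshold
    # e.g. a distance-like score such as leverage
    return metric <= threshold


# Hypothetical thresholds, for illustration only.
print(ad_membership(0.0, threshold=0.3, higher_is_inside=False))
print(ad_membership(0.666, threshold=0.5, higher_is_inside=True))
```

Publishing both the metric and the decision lets clients apply their own threshold when the default one does not suit them.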

Subsequently, the same ot:dataEntry and ot:FeatureValue RDF constructions that are used elsewhere to specify property values are used to specify AD results as well:

@prefix ot:      <http://www.opentox.org/api/1.1#> .
@prefix dc:      <http://purl.org/dc/elements/1.1/> .
@prefix :        <http://ambit.uni-plovdiv.bg:8080/ambit2/> .
@prefix ota:     <http://www.opentox.org/algorithmTypes.owl#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix ac:      <http://ambit.uni-plovdiv.bg:8080/ambit2/compound/> .
@prefix ad:      <http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix af:      <http://ambit.uni-plovdiv.bg:8080/ambit2/feature/> .


ad:1  a       ot:Dataset ;
      ot:dataEntry
              [ a       ot:DataEntry ;
                ot:compound ac:1 ;
                ot:values
                        [ a       ot:FeatureValue ;
                          ot:feature af:1 ;
                          ot:value "3.14"^^xsd:double
                        ] ;
                ot:values
                        [ a       ot:FeatureValue ;
                          ot:feature af:9999 ;
                          ot:value "0.0"^^xsd:double
                        ]
              ] .

af:1
      a       ot:Feature , ot:NumericFeature ;
      dc:title "MLR-prediction" ;
      ot:hasSource <http://opentox.ntua.gr/model/mlr> ;
      ot:units "" .


af:9999
      a       ot:Feature , ot:NumericFeature ;
      dc:title "AD-leverage" ;
      ot:hasSource <http://opentox.ntua.gr/model/leverage-ad> ;
      ot:units "" .


ac:1
      a       ot:Compound .

ot:NumericFeature
      a       owl:Class ;
      rdfs:subClassOf ot:Feature .

ot:DataEntry
      a       owl:Class .

ot:hasSource
      a       owl:ObjectProperty .

ot:units
      a       owl:DatatypeProperty .

ot:values
      a       owl:ObjectProperty .

ot:compound
      a       owl:ObjectProperty .

dc:title
      a       owl:AnnotationProperty .

ot:feature
      a       owl:ObjectProperty .

ot:Dataset
      a       owl:Class .

dc:description
      a       owl:AnnotationProperty .

ot:dataEntry
      a       owl:ObjectProperty .

ot:Compound
      a       owl:Class .

dc:identifier
      a       owl:AnnotationProperty .

ot:FeatureValue
      a       owl:Class .

ot:Feature
      a       owl:Class .

dc:type
      a       owl:AnnotationProperty .

ot:value
      a       owl:DatatypeProperty .

The representation of AD results is the same when the AD is embedded in the model itself, except that ot:hasSource for the features representing the predicted values and the AD estimation points to the same ot:Model object:

ad:1  a       ot:Dataset ;
      ot:dataEntry
              [ a       ot:DataEntry ;
                ot:compound ac:1 ;
                ot:values
                        [ a       ot:FeatureValue ;
                          ot:feature af:lazar_prediction ;
                          ot:value "1.0"^^xsd:double
                        ] ;
                ot:values
                        [ a       ot:FeatureValue ;
                          ot:feature af:10000 ;
                          ot:value "0.666"^^xsd:double
                        ]
              ] .

af:10000
      a       ot:Feature , ot:NumericFeature ;
      dc:title "AD-lazar" ;
      ot:hasSource <http://in-silico.ch/model/lazar> ;
      ot:units "" .


af:lazar_prediction
      a       ot:Feature , ot:NumericFeature ;
      dc:title "prediction-lazar" ;
      ot:hasSource <http://in-silico.ch/model/lazar> ;
      ot:units "".

ac:1
      a       ot:Compound .
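A client reading such a dataset would typically pair each prediction with its AD estimate. The sketch below assumes the dataset has already been parsed into plain Python dictionaries; that in-memory layout and the function name `ad_rows` are hypothetical and only mirror the ot:dataEntry, ot:values and ot:hasSource structure shown above.

```python
def ad_rows(entries, features, predicted, ad_metric):
    """Return (compound, prediction, AD value, embedded?) for each data entry."""
    # The AD is embedded when both features come from the same model.
    embedded = features[predicted]["hasSource"] == features[ad_metric]["hasSource"]
    return [
        (e["compound"], e["values"].get(predicted), e["values"].get(ad_metric), embedded)
        for e in entries
    ]


# Values copied from the embedded-AD example above.
features = {
    "af:lazar_prediction": {"hasSource": "http://in-silico.ch/model/lazar"},
    "af:10000": {"hasSource": "http://in-silico.ch/model/lazar"},
}
entries = [
    {"compound": "ac:1", "values": {"af:lazar_prediction": 1.0, "af:10000": 0.666}},
]

rows = ad_rows(entries, features, "af:lazar_prediction", "af:10000")
print(rows)
```

Which feature holds the prediction and which holds the AD estimate would come from ot:predictedVariables and the proposed ot:adMetric property on the model.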
