ToxCreate

ToxCreate Application

Try the ToxCreate demo application at www.toxcreate.net/test

 

Issue tracker: http://github.com/helma/opentox-toxmodel/issues

Contains:

  • Bug reports, feature requests and comments
  • Development priorities (next steps), which may change based on user votes
  • Required contributions from other participants

Please use the issue tracker for bug reports, feature requests, comments and votes for development priorities.

 

TUM services

Planned TUM contributed services (/algorithm and /model) with current status:

Regression algorithms for model learning:

Descriptor calculation algorithms:

  • JOELIB2 [under development; planned for approx. 20.01.]
  • CDKPhysChem [under development; planned for approx. 20.01.]

Descriptor selection (Feature selection) algorithms:

Model service for predictions:

 

TUM issue tracker: http://lxkramer13.informatik.tu-muenchen.de/trac/TUMOpenTox-dev/report

 

TUM complete service overview: http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm

 



NTUA Services

Issue Tracker: An issue tracker is available online at http://github.com/sopasakis/yaqp/issues (hosted by GitHub).

Documentation/Examples: Detailed documentation and usage examples can be found at the new location of our web site, which is still under development.

 

 

Training of Regression Models

Use Case: Training of Regression Models; Build a model that predicts a numeric value for a certain feature

Assumes: Dataset services; a dataset with at least two numeric features, one of which is declared as the feature to be predicted.

Intended Audience: Scientists in the life sciences and toxicology, QSAR experts, people interested in machine learning/statistics, pharmaceutical industry R&D and other related fields.

Input Information Needed:

  • URI of an existing dataset provided by a related service including at least two numeric features.
  • The target feature needs to be declared.
  • Other model-specific parameters used to calibrate the algorithm that produces the model (for instance, the parameter γ in the SVM case).
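
The inputs listed above could be packaged into a training request roughly as follows. This is only a sketch: the form parameter names `dataset_uri` and `prediction_feature`, the `text/uri-list` response type, the helper name `build_training_request` and all `example.org` URIs are assumptions made for illustration, not something this page specifies. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_training_request(algorithm_uri, dataset_uri, prediction_feature, **tuning):
    """Build (but do not send) a POST request asking an algorithm service
    to train a model. Parameter names are assumed, see the text above."""
    form = {
        "dataset_uri": dataset_uri,                # existing dataset
        "prediction_feature": prediction_feature,  # declared target feature
    }
    # Model-specific tuning parameters, e.g. gamma for an SVM.
    form.update({name: str(value) for name, value in tuning.items()})
    body = urlencode(form).encode("ascii")
    return Request(
        algorithm_uri,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "text/uri-list",  # assumed: the service answers with the model URI
        },
    )


# Hypothetical URIs, for illustration only.
request = build_training_request(
    "http://example.org/algorithm/svm",
    "http://example.org/dataset/1",
    "http://example.org/feature/42",
    gamma=0.5,
)
```

Sending the request (for example with `urllib.request.urlopen(request)`) would, under these assumptions, return the URI of the trained model or an error corresponding to one of the exception events.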

 

Exception Events:

  • Provided dataset URI does not exist
  • Unacceptable tuning parameters (for example γ < 0 or tolerance < 0)
  • Provided prediction feature is not a valid feature of the dataset
  • Dataset contains fewer than two numeric attributes
  • The provided prediction feature is not numeric in this dataset
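
The exception events above amount to a handful of checks that a training service could run before any learning starts. The sketch below is one possible arrangement, not the behaviour of any listed service: the dataset is represented as a plain dictionary (feature name to list of values), `None` stands for a dataset URI that could not be resolved, and the function and parameter names are hypothetical.

```python
def is_numeric(values):
    """True if every value is an int or float (booleans are excluded)."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


def check_regression_request(features, prediction_feature, gamma=None, tolerance=None):
    """Return the list of exception events triggered by a training request.

    features: dict mapping feature name -> list of values, or None when the
    dataset URI could not be resolved. An empty result means the request is
    acceptable under these checks.
    """
    if features is None:
        return ["Provided dataset URI does not exist"]

    problems = []
    for name, value in (("gamma", gamma), ("tolerance", tolerance)):
        if value is not None and value < 0:
            problems.append(f"Unacceptable tuning parameter: {name} < 0")

    numeric = {name for name, values in features.items() if is_numeric(values)}
    if len(numeric) < 2:
        problems.append("Dataset contains fewer than two numeric attributes")

    if prediction_feature not in features:
        problems.append(
            "Provided prediction feature is not a valid feature of the dataset"
        )
    elif prediction_feature not in numeric:
        problems.append(
            "The provided prediction feature is not numeric in this dataset"
        )
    return problems
```

Collecting all problems rather than stopping at the first one lets a client fix several mistakes in a single round trip; a real service might instead map each case to its own HTTP status code.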

 

Expected Result: URI of trained model

Subsequent Events: Once the model is generated the following use cases can include it:

  • Use the model for prediction
  • Validate the model using the training data or other external data

 

List of Model Training Services:

 

 

Training of Classification Models

Use Case: Training of Classification Models; Build a model that predicts a nominal value (a category) for a certain feature. A feature is called nominal when it takes values from a finite set, not necessarily numeric.

Assumes: Dataset services; a dataset with at least one nominal feature and at least one other numeric or nominal feature.

Intended Audience: Scientists in the life sciences and toxicology, QSAR experts, people interested in machine learning/statistics, pharmaceutical industry R&D and other related fields.

Input Information Needed:

  • URI of an existing dataset provided by the user, as described in the 'Assumes' section
  • The target feature needs to be declared.
  • Other model-specific parameters used to calibrate the algorithm that produces the model (for instance, the parameter γ in the case of support vector classifiers).

 

Exception Events:

  • Provided dataset URI does not exist
  • Unacceptable tuning parameters (for example γ < 0 or tolerance < 0)
  • Provided prediction feature is not a valid feature of (or it is not contained in) the dataset
  • Dataset is not valid for classification.
  • The provided prediction feature is not nominal in this dataset

 

Expected Result: URI of trained model

Subsequent Events: Once the model is generated the following use cases can exploit it:

  • Use the model for prediction
  • Validate the model using the training data or other external data

 

List of Model Training Services:

http://opentox.ntua.gr:3000/algorithm/svm This service is not implemented yet because it depends on dataset services for the "NominalFeature" characterization that are not implemented either.

 

 

Domain of Applicability

Use Case: Domain of Applicability Calculation Services; Build a resource (proposal: a model-type resource; needs to be agreed) that decides whether a compound, or a set of compounds, can be used with a given model; formally, whether the compound lies in the domain of applicability of that model.

Assumes: Dataset and Model services

Intended Audience: Scientists in the life sciences and toxicology, QSAR experts, people interested in machine learning/statistics, pharmaceutical industry R&D and other related fields.

Input Information Needed:

  • URI of an existing dataset provided by the user, as described in the 'Assumes' section, OR
  • URI of an existing compound, together with a set of services able to calculate, for this compound, the features that are independent features in the model under consideration
  • URI of a trained model
  • Tuning parameters, if the service requires them (some applicability domain calculation services do not)

    Exception Events:

    • Compound URI or Dataset URI not found
    • Features could not be calculated for a given compound
    • The provided dataset does not contain the independent features of the model whose DoA is to be calculated
    • Unacceptable tuning parameters (if present)

     

    Expected Result: URI of DoA model

    Subsequent Events: Once the DoA model is generated the following use cases can exploit it:

    • Use the DoA model to tell whether a prediction model is appropriate for the prediction concerning a certain compound
    • Use the DoA model to find one or more appropriate models for a certain compound or dataset

     

    List of DoA Services:

    http://opentox.ntua.gr:3000/algorithm/doa This service is not implemented yet. We are designing a DoA service based on the leverage method, which uses only the training data to make its decision; other methods also take the algorithm into account. A first implementation of the service will be available no later than 2010/02/04.
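
A leverage-based decision of the kind described above can be sketched in a few lines. This is an illustration under stated assumptions, not the planned NTUA implementation: it uses the textbook leverage h = xᵀ(XᵀX)⁻¹x with an intercept column prepended to the training descriptors, and a warning threshold of 3p/n (p = number of descriptors + 1, n = number of training compounds), which is a common convention but is not given on this page. All function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a·x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular: descriptors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def leverage(training, query):
    """Leverage h = x^T (X^T X)^-1 x of a query compound, computed from
    the training descriptors only (an intercept column is prepended)."""
    rows = [[1.0] + list(r) for r in training]
    x = [1.0] + list(query)
    p = len(x)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    z = solve(xtx, x)  # z = (X^T X)^-1 x
    return sum(xi * zi for xi, zi in zip(x, z))


def in_domain(training, query, factor=3.0):
    """True if the query's leverage does not exceed factor * p / n."""
    p = len(query) + 1
    threshold = factor * p / len(training)
    return leverage(training, query) <= threshold


# Toy training set with a single descriptor per compound.
training = [[1.0], [2.0], [3.0], [4.0], [5.0]]
```

With this toy training set, a query at the centre of the data (`[3.0]`) has a low leverage and is accepted, while a query far outside the training range (`[10.0]`) is rejected. Only the descriptor matrix is needed, which matches the remark that the method relies on the training data alone.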

     

     

    Data CleanUp Services

    Use Case: Data preprocessing services used to clean a dataset of unwanted features (e.g. String features) and/or missing values. We recognize two main types of cleanup service: those that remove all features of a certain type from a dataset, thus creating a new one, and those that compensate for missing values by substituting the mean or median of the other values of the same feature.
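
The two kinds of cleanup can be illustrated on an in-memory dataset. The sketch below assumes a dataset represented as a dictionary from feature name to a list of values, with `None` marking a missing value; this representation and the function names are hypothetical and say nothing about how the actual services store or exchange data.

```python
from statistics import mean, median


def drop_string_features(dataset):
    """Return a new dataset without any feature that holds string values."""
    return {
        name: values
        for name, values in dataset.items()
        if not any(isinstance(v, str) for v in values)
    }


def fill_missing(dataset, strategy="mean"):
    """Return a new dataset in which each missing value (None) is replaced
    by the mean or median of the other values of the same feature.
    Expects numeric features only, so drop string features first."""
    aggregate = {"mean": mean, "median": median}[strategy]
    cleaned = {}
    for name, values in dataset.items():
        present = [v for v in values if v is not None]
        # A feature with no values at all cannot be filled and stays as is.
        fill = aggregate(present) if present else None
        cleaned[name] = [fill if v is None else v for v in values]
    return cleaned


raw = {
    "name": ["benzene", "toluene", "phenol", "aniline"],
    "logP": [1.0, None, 3.0, 5.0],
    "mw": [10.0, 20.0, None, 90.0],
}
numeric_only = drop_string_features(raw)
by_mean = fill_missing(numeric_only, "mean")
by_median = fill_missing(numeric_only, "median")
```

Both functions return new dictionaries and leave the input untouched, mirroring the description of a cleanup service that creates a new dataset instead of modifying the original. The values in `raw` are made up for the example.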

    Assumes: Dataset services, Available datasets

    Intended Audience: Scientists in the life sciences and toxicology, QSAR experts, people interested in machine learning/statistics, pharmaceutical industry R&D and other related fields.

    Input Information Needed:

    • URI of an existing dataset provided by the user
    • Service-specific parameters such as the type of cleanup to be applied to the dataset.

     

    Exception Events:

    • Dataset URI not found
    • Dataset representation is generated but uploading of the cleaned-up dataset to a remote server failed
    • Unacceptable tuning parameters (if present)

     

    Expected Result: URI of cleaned dataset

    Subsequent Events: Once the cleaned-up dataset is generated, the following use cases can exploit it:

    • Use the dataset to build a regression or classification model

     

    List of Data CleanUp Services:

    These services are not implemented yet but will be ready no later than 2010/02/04.

     

    IDEA services

    Issue tracker: https://sourceforge.net/tracker/?group_id=191756

    Ontology service

    http://ambit.uni-plovdiv.bg:8080/ontology

    Dataset services

     

    http://ambit.uni-plovdiv.bg:8080/ambit2/dataset

    Algorithm services

    http://ambit.uni-plovdiv.bg:8080/ambit2/algorithm

     

     

    Weka machine learning algorithms

    These algorithms automatically recognize numeric and nominal attributes, even if not declared explicitly in RDF, and ignore e.g. string attributes when only numeric ones are required.
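
How the services recognize attribute types is not described here, so the following is only a guess at what such a recognition step might look like when values arrive as untyped strings. The rule used (parseable as a number means numeric, few distinct values means nominal, anything else is a string) and the `max_categories` cut-off are assumptions chosen for illustration, as are the function names.

```python
def infer_type(values, max_categories=10):
    """Guess whether a column of raw string values is numeric, nominal
    or free-text ("string"). Heuristic only, see the text above."""
    present = [v for v in values if v not in (None, "")]
    if not present:
        return "string"
    try:
        for v in present:
            float(v)
        return "numeric"
    except (TypeError, ValueError):
        pass
    # Non-numeric values: a small set of distinct values is treated as nominal.
    if len(set(present)) <= max_categories:
        return "nominal"
    return "string"


def select_numeric(features, max_categories=10):
    """Keep only the features inferred to be numeric, converted to float.
    Missing values are kept as None."""
    return {
        name: [float(v) if v not in (None, "") else None for v in values]
        for name, values in features.items()
        if infer_type(values, max_categories) == "numeric"
    }


# Made-up values, for illustration only.
features = {
    "logP": ["1.2", "3.4", "0.5"],
    "activity": ["active", "inactive", "active"],
    "smiles": ["c1ccccc1", "Cc1ccccc1", "Oc1ccccc1"],
}
types = {name: infer_type(v, max_categories=2) for name, v in features.items()}
numeric = select_numeric(features, max_categories=2)
```

A count-based cut-off cannot tell a short free-text column from a genuine category on small datasets, so an explicit declaration in the RDF, where available, would remain the more reliable source.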

     

    pKa estimation
    No dataset or target feature required.

     

    Toxtree

    No dataset or target feature required

     

    Model services
