
Build a Weight of Evidence for Drug Candidate Molecules

by Roman Affentranger, Douglas Connect, Switzerland


In this workshop exercise we will take a look at how OpenTox can be used in a Drug Discovery application to prioritize drug candidates according to their predicted toxicities. For the exercise we assume that a number of active compounds have been identified, and we want to categorize these molecules according to their toxicity profile. Ideally, we would identify a subset of compounds that are non-toxic according to all the predictions we use, and these compounds could be candidates for further optimization.

The exercise covers how to use OpenTox to gather toxicity predictions that can be used, for example, in a Weight-of-Evidence approach. It does not guide you through the process of combining the predictions to reach conclusions.
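To make the selection goal concrete, here is a minimal sketch of picking out compounds that are non-toxic according to every prediction. It assumes each prediction has already been reduced to a toxic / non-toxic call. The compound IDs, model names and values are invented for illustration and are not OpenTox output, and the simple flag count is only one possible way to order compounds, not a Weight-of-Evidence method.

```python
# Hypothetical predictions: compound ID -> {model name: True if predicted toxic}.
# IDs, model names and values are invented for illustration only.
predictions = {
    "CMPD-1": {"model_a": False, "model_b": False, "model_c": False},
    "CMPD-2": {"model_a": True, "model_b": False, "model_c": False},
    "CMPD-3": {"model_a": False, "model_b": True, "model_c": True},
}


def non_toxic_in_all(preds):
    """Return the IDs of compounds that no model flags as toxic."""
    return [cid for cid, calls in preds.items() if not any(calls.values())]


def count_toxic_flags(preds):
    """Return the number of toxic calls per compound, for a simple ordering."""
    return {cid: sum(calls.values()) for cid, calls in preds.items()}


candidates = non_toxic_in_all(predictions)
flags = count_toxic_flags(predictions)
ranking = sorted(flags, key=flags.get)  # fewest toxic calls first

print("Non-toxic in all predictions:", candidates)
print("Ordered by number of toxic calls:", ranking)
```

In practice the calls would come from the prediction table downloaded in step 5, and how to weigh conflicting predictions remains a judgment left to the reader.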

When you are done with the exercise, please fill out the Evaluation Form.

The exercise is focused on antimalarial compounds taken from the Tres Cantos Antimalarial Compound Set (TCAMS).

In the workshop we will cover the following activities:

  1. Creation of an OpenTox dataset from a local file
  2. Selection of models available in OpenTox through ToxPredict
  3. Application of the selected models to the uploaded dataset
  4. Obtaining simple statistics for the predictions obtained for the uploaded dataset
  5. Adding the predicted values to a tabular view of the dataset and downloading the table
  6. Additional steps I) Applying an individual model to a dataset
  7. Additional steps II) Searching a dataset according to prediction values
  8. Additional steps III) Obtaining additional predictions
  9. Additional steps IV) Downloading data from other dataset services in OpenTox
  10. Complement your findings with predicted associations with hepatobiliary adverse events



For this exercise, you need Java installed. If you are not sure whether you have Java installed or not, you can check here:

To download and install Java, go to



In 2010, the Tres Cantos Medicines Development Campus of GlaxoSmithKline deposited a collection of compounds, the Tres Cantos Antimalarial Compound Set (TCAMS), at the ChEMBL Neglected Tropical Disease Database, making the data publicly available. Each of the 13533 chemicals in the TCAMS inhibits growth of the 3D7 strain of Plasmodium falciparum, the malaria-causing parasite, by at least 80% at a concentration of 2 µM. 5267 compounds in TCAMS also show the same effect against the multi-drug resistant P. falciparum strain DD2. Evidence for liver toxicity is provided by growth inhibition data against human hepatoma HepG2 cells. Over the whole TCAMS, 3137 compounds inhibit HepG2 growth by 40% or more at a concentration of 10 µM.

In this tutorial we will work on a subset of 87 compounds of the TCAMS, and we will use several OpenTox models in combination to help prioritize these molecules for further investigation as drug candidates. The subset of 87 TCAMS compounds can be downloaded here. The file contains the identifiers of the compounds, as used in TCAMS, along with their SMILES and the cytotoxicity data of each compound against two P. falciparum strains (3D7 and DD2) and against human hepatoma HepG2 cells. Cytotoxicity is given as percentage growth inhibition, measured at a compound concentration of 2 µM for the two P. falciparum strains and 10 µM for HepG2 cells.

P. falciparum 3D7 is one of the strains responsible for many infections, in particular in children. P. falciparum DD2 is a multi-drug resistant strain of Plasmodium. The cytotoxicity data against HepG2 serve as an in vitro marker for general liver toxicity. We will add toxicity predictions to complement this experimental toxicity information.
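Before adding predictions, the experimental data alone already allow a first categorization. The following sketch applies the TCAMS thresholds quoted above (at least 80% growth inhibition of the parasite, 40% or more HepG2 growth inhibition as a liver toxicity flag). The column names, the CSV layout and the three example rows are assumptions made for illustration; the actual file may be laid out differently.

```python
import csv
import io

# Invented example rows. The column names are assumptions, not the real
# file header, and the SMILES strings are placeholders, not TCAMS compounds.
SAMPLE = """id,smiles,inhib_3d7,inhib_dd2,inhib_hepg2
CMPD-1,CCO,95,88,12
CMPD-2,CCN,91,45,8
CMPD-3,CCC,99,97,63
"""

ACTIVE_MIN = 80.0  # % growth inhibition of P. falciparum at 2 µM
HEPG2_MIN = 40.0   # % HepG2 growth inhibition at 10 µM, used as a toxicity flag


def classify(row):
    """Label one compound from its three growth inhibition percentages."""
    return {
        "id": row["id"],
        "active_3d7": float(row["inhib_3d7"]) >= ACTIVE_MIN,
        "active_dd2": float(row["inhib_dd2"]) >= ACTIVE_MIN,
        "hepg2_flag": float(row["inhib_hepg2"]) >= HEPG2_MIN,
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
labels = [classify(row) for row in rows]

# Compounds active against both strains and without a HepG2 flag.
preferred = [
    lab["id"]
    for lab in labels
    if lab["active_3d7"] and lab["active_dd2"] and not lab["hepg2_flag"]
]
print(preferred)
```

To try this on the downloaded file, replace `io.StringIO(SAMPLE)` with an opened file handle and adjust the column names and delimiter to match the file.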

Start the tutorial with step 1: Creation of an OpenTox dataset from a local file.

