Drug Discovery Predictive Toxicology Application II: Building a Model to Predict Kinase Inhibitor Activity

A tutorial on a potential application of the OpenTox Framework in drug discovery.

By: Roman Affentranger (Douglas Connect) and Nina Jeliazkova (IdeaConsult)


The OpenTox framework can support drug discovery in several ways. This tutorial illustrates two potential use cases: 1) using predictive toxicology models in OpenTox to prioritize compounds in a drug discovery application, and 2) using the OpenTox model building functionality to create models that predict protein inhibitor activity.

This exercise uses the data on antimalarial compounds made available at the ChEMBL Neglected Tropical Disease (NTD) archive. Subsets of the antimalarials are extracted and used to build a model with the OpenTox prototype application ToxCreate. Of the 13,519 compounds contained in the TCAMS dataset, 857 are annotated with a protein (class) target, and 233 of these 857 are annotated as Ser/Thr kinase inhibitors. We will use this information to create a dataset for building a model that predicts whether or not a given compound is likely to be a kinase inhibitor. The dataset for the model building therefore needs to consist of two columns: the SMILES string of the compound and its classification (1 = Ser/Thr kinase inhibitor, 0 = otherwise).
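The two-column layout described above could be prepared along the following lines. This is only a minimal sketch in Python, not the procedure used in the tutorial: the input keys `smiles` and `target_class`, the annotation text "Ser/Thr kinase", the output header names and the CSV format are all assumptions made for illustration, and the example records are placeholder molecules rather than compounds from the TCAMS dataset.

```python
import csv
import io

# Assumed annotation text; the actual wording in the source data may differ.
KINASE_LABEL = "Ser/Thr kinase"


def build_training_rows(annotated_rows, smiles_key="smiles", target_key="target_class"):
    """Turn annotated compound records into (SMILES, label) pairs.

    The key names are hypothetical. A compound is labelled 1 if its target
    annotation mentions a Ser/Thr kinase, otherwise 0.
    """
    rows = []
    for row in annotated_rows:
        smiles = (row.get(smiles_key) or "").strip()
        if not smiles:
            continue  # skip records without a structure
        target = (row.get(target_key) or "").strip().lower()
        label = 1 if KINASE_LABEL.lower() in target else 0
        rows.append((smiles, label))
    return rows


def write_training_csv(rows, handle):
    """Write the two-column dataset with an assumed header line."""
    writer = csv.writer(handle)
    writer.writerow(["SMILES", "kinase_inhibitor"])
    writer.writerows(rows)


# Placeholder records for illustration only (not real antimalarials).
example = [
    {"smiles": "CCO", "target_class": "Ser/Thr kinase"},
    {"smiles": "c1ccccc1", "target_class": "GPCR"},
    {"smiles": "", "target_class": "Ser/Thr kinase"},
]

training_rows = build_training_rows(example)
buffer = io.StringIO()
write_training_csv(training_rows, buffer)
print(buffer.getvalue())
```

Applied to the 857 target-annotated compounds, a script like this would yield 233 rows labelled 1 and 624 rows labelled 0. The file format that ToxCreate actually accepts should be checked in the tutorial steps before uploading.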

The tutorial is available via the link below or as a PDF download. (Note that the PDF file is outdated; we will provide an updated version as soon as possible.)

Step 1: Selecting a subset to create a model with ToxCreate