
Data integration. A case study

Nina Jeliazkova, IdeaConsult, Bulgaria

An important prerequisite for successfully implementing the 3Rs principles (Reduction, Refinement and Replacement) is universal access to high-quality experimental data on chemical properties. Unfortunately, even today the state of the art is characterised by life sciences data that are highly fragmented and unconnected, both physically and ontologically, and that are frequently inaccurate and difficult or even impossible to find or access. We present an AMBIT web services-based case study of the integration and comparison of 67 datasets containing physico-chemical and/or experimental toxicity data from various public and commercial sources. The datasets can be retrieved as a whole or searched by chemical identifier, property, structure, substructure or similarity. Content from different datasets can be easily collated thanks to a universal database structure and an ontology that establishes shared terminology and meaning for the data fields. The datasets are also “model-ready” and can be used as input to OpenTox-compliant predictive models. Queries, submission of new data and data modifications are carried out through the web services interface, which implements the OpenTox framework API.
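The collation step described above can be illustrated with a minimal sketch. It is not the AMBIT implementation: the field names, the shared terms and the choice of a CAS-style identifier as the join key are all assumptions made for illustration. The idea is only that once every dataset-specific field name is mapped to a shared term, records from different sources can be merged per compound while keeping track of which dataset each value came from.

```python
from collections import defaultdict

# Hypothetical mapping from dataset-specific field names to shared terms.
# In the abstract this role is played by the ontology; these names are invented.
FIELD_TO_TERM = {
    "CAS": "cas_number",
    "CASRN": "cas_number",
    "LogP": "octanol_water_partition",
    "log Kow": "octanol_water_partition",
    "LC50_mg_L": "lc50_mg_per_l",
}


def normalise(record):
    """Rename the fields of one record to shared terms; unmapped fields are kept as-is."""
    return {FIELD_TO_TERM.get(field, field): value for field, value in record.items()}


def collate(datasets, key="cas_number"):
    """Merge records from several datasets, grouped by a shared identifier.

    `datasets` maps a dataset name to a list of records (dicts).
    The result maps identifier -> term -> {dataset name: value}, so that
    conflicting values from different sources remain distinguishable.
    """
    merged = defaultdict(dict)
    for name, records in datasets.items():
        for record in records:
            row = normalise(record)
            ident = row.pop(key, None)
            if ident is None:
                continue  # a record without the identifier cannot be collated
            for term, value in row.items():
                merged[ident].setdefault(term, {})[name] = value
    return dict(merged)
```

Keeping the source dataset next to each value is a deliberate choice in this sketch: when two sources disagree on a property, both values survive the merge and can be inspected rather than one silently overwriting the other.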

Chemical structures (including ECHA’s list of pre-registered substances) were collected from various public sources and/or generated by name-to-structure conversion. In this process, inconsistencies between chemical structures were detected and flagged automatically by built-in heuristics. We report the overlap between the datasets in terms of the number of common compounds, as well as their mutual similarity, or coverage of the “chemical domain”, calculated by structure-based and descriptor-based methods. Several computational resources and predictive methods are integrated through the uniform OpenTox application programming interface. The similarity between a user-supplied set of chemicals and the existing datasets can be accessed online, either programmatically or via a web browser.
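The overlap and coverage measures mentioned above can be sketched as follows. The abstract does not name the similarity measure or the fingerprints used, so this example assumes the Tanimoto coefficient on binary fingerprints, represented here as sets of on-bit indices, and assumes compounds are matched by some unique identifier string. The function names and the 0.7 threshold are illustrative, not taken from AMBIT.

```python
def common_compounds(ids_a, ids_b):
    """Identifiers present in both datasets (assumes one unique identifier per compound)."""
    return set(ids_a) & set(ids_b)


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def coverage(query_fps, reference_fps, threshold=0.7):
    """Fraction of query compounds with at least one reference compound
    whose Tanimoto similarity reaches the threshold (an assumed cut-off)."""
    if not query_fps:
        return 0.0
    hits = sum(
        1
        for q in query_fps
        if any(tanimoto(q, r) >= threshold for r in reference_fps)
    )
    return hits / len(query_fps)
```

With these helpers, `common_compounds` gives the exact overlap between two datasets, while `coverage` gives a softer measure of how much of one dataset's chemical space is represented in another. The pairwise loop is quadratic in the dataset sizes, which is acceptable for a sketch but would need indexing or screening for large collections.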

(presenting author: Nina Jeliazkova)
