
In silico pKa prediction

Robert Körner, Iurii Sushko, Sergii Novotarskyi and Igor Tetko, eADMET GmbH, Germany

The biopharmaceutical profile of a compound depends directly on the dissociation constants of its acidic and basic groups, commonly expressed as pKa, the negative decadic logarithm of the acid dissociation constant Ka: pKa = − log10 Ka. The acid dissociation constant (also called the protonation or ionization constant) is the equilibrium constant of the deprotonation reaction; it relates the concentrations of the deprotonated and protonated forms of a compound. The pKa value of a compound strongly influences its pharmacokinetic and biochemical properties. Its accurate estimation is therefore of great interest in areas such as biochemistry, medicinal chemistry, pharmaceutical chemistry, and drug development. Beyond the pharmaceutical industry, it is also relevant to environmental ecotoxicology and to the agrochemical and specialty chemical industries.
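
As a small worked illustration of the definition above, the following Python sketch converts between Ka and pKa. The function names are hypothetical, and the input value (Ka ≈ 1.75e-5 for acetic acid in water at 25 °C) is a commonly quoted textbook figure used only as an example; it is not taken from this work.

```python
import math


def pka_from_ka(ka: float) -> float:
    """Return pKa = -log10(Ka) for a positive dissociation constant."""
    if ka <= 0:
        raise ValueError("Ka must be positive")
    return -math.log10(ka)


def ka_from_pka(pka: float) -> float:
    """Invert the definition: Ka = 10 ** (-pKa)."""
    return 10.0 ** (-pka)


# Illustrative input only (assumption, not from the abstract):
# textbook Ka of acetic acid in water at 25 °C.
acetic_ka = 1.75e-5
acetic_pka = pka_from_ka(acetic_ka)
print(f"pKa = {acetic_pka:.2f}")  # prints "pKa = 4.76"
```

Because the scale is logarithmic, a tenfold change in Ka shifts pKa by exactly one unit.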

In the literature, a vast number of different approaches for pKa prediction can be found (Rupp et al., Comb. Chem. High Throughput Screening: submitted 2010). These approaches fall into two classes. On the one hand, there are direct calculations, so-called ab initio methods, which try to determine the pKa value by quantum chemical or mechanical computation. On the other hand, there are statistical models trained on chemical or structural descriptors. These descriptors can be, for example, quantum chemical, semi-empirical, graph topological, or simple statistical in nature. This type of modeling is called QSPR (Quantitative Structure-Property Relationship).

In our recent work, we developed such a QSPR model, using localized molecular descriptors to train multiple linear regression models and artificial neural networks that estimate dissociation constants (pKa). The performance of our approach is similar to that of a semi-empirical model based on frontier electron theory (Tehan et al., QSAR & Comb. Sci. 21(5): 457–472, 473–485).
An example performed with OCHEM, an online chemical database with a modeling environment, shows how such a prediction model can be built. OCHEM is a publicly accessible database for chemical compound data and predictive models. It is built on a “wiki”-oriented structure in which users can collect and organize chemical compounds together with data on their physico-chemical and biological properties. In addition, users can develop, apply, and distribute predictive models. OCHEM is unique in its combination of compound data and predictive models.
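
To make the QSPR idea described above more concrete (descriptors in, pKa out), here is a minimal sketch of a multiple linear regression fit in plain Python. It is not the authors' model and does not use OCHEM: the two descriptors per ionizable site are hypothetical, the data are synthetic and noise-free, and the function names `fit_mlr` and `predict` are invented for illustration.

```python
def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    X is a list of descriptor rows; an intercept column is added internally.
    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("descriptors are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(n):
            if k != col:
                f = A[k][col] / A[col][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coefs, descriptors):
    """Apply the fitted linear model to one descriptor vector."""
    return coefs[0] + sum(c * d for c, d in zip(coefs[1:], descriptors))


# Synthetic toy data (assumption): two made-up descriptors per site, with
# target values generated from y = 10 - 20*d1 + 3*d2, so the fit should
# recover these coefficients up to rounding error.
X = [[0.10, 1.0], [0.20, 0.0], [0.30, 2.0], [0.40, 1.0], [0.25, 3.0]]
y = [10.0 - 20.0 * d1 + 3.0 * d2 for d1, d2 in X]

coefs = fit_mlr(X, y)
estimate = predict(coefs, [0.15, 2.0])
```

A real model would be trained on measured pKa values and validated on held-out compounds; the linear fit shown here is only the simplest of the two learning methods named above.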

This study is partially supported by the BMBF GO-Bio project 0313883.
