# Predictive uncertainty by analogy – does it make sense?

**Ullrika Sahlin, Linnaeus University, Sweden**

**Nina Jeliazkova, IdeaConsult, Bulgaria**

**Tom Aldenberg, RIVM, Netherlands**

**Jonna Stålring, AstraZeneca, Sweden**

Tomas Öberg, Linnaeus University, Sweden

A QSAR predicts by analogy, on the premise that “similar chemicals have similar properties and behavior”. When QSARs are used for decision making (e.g., for regulatory use or drug discovery), assessments of the predictive uncertainty for a chemical compound and of model reliability are required. A predictive distribution from QSAR regression can be assessed in several ways [1]: directly, e.g., by the use of probabilistic (e.g., Bayesian) QSAR, or indirectly by assessing the predictive variance, e.g., from the PRedictive Error Sum of Squares (PRESS). Whereas PRESS generates the same predictive error for every compound predicted by a QSAR, predictive error could alternatively be allowed to vary from compound to compound. Such an assessment could be based on an analogy stating that “compounds that are similar are predicted with similar predictive error”. Assessing predictive uncertainty by analogy reasoning exists within QSAR modeling, but just as the analogy does not always hold for QSARs, predicting uncertainty by analogy needs to be motivated both theoretically and empirically. Here we challenge the traditional PRESS with a weighted PRESS (DPRESS) [2], where predictive error is estimated as a weighted average or a nearest-neighbor average. We provide arguments for using analogy to assess predictive uncertainty and perform experimental tests to answer 1) whether DPRESS generates more reliable predictions than PRESS and 2) whether any similarity measure performs better than others. We show results based on published QSAR data, where we compare and test the reliability of QSAR predictions derived from different ways of assessing predictive error, including different measures of similarity.
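The contrast between a global and a localized error estimate can be illustrated with a minimal Python sketch. It is not the DPRESS procedure of [2] and not the implementation used in this work: the function names, the Gaussian similarity kernel, the Euclidean distance, the `bandwidth` and `k` parameters, and the toy data are all assumptions made for illustration. It assumes leave-one-out residuals are already available for the training compounds.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors (assumed similarity basis)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def global_press_variance(loo_residuals):
    """Mean squared leave-one-out residual: the same value for every query compound."""
    return sum(e * e for e in loo_residuals) / len(loo_residuals)


def weighted_press_variance(query, descriptors, loo_residuals, bandwidth=1.0):
    """Similarity-weighted mean of squared residuals.

    The Gaussian kernel is an illustrative choice; a query far from all
    training compounds can make the weights underflow to zero.
    """
    weights = [math.exp(-(euclidean(query, x) / bandwidth) ** 2) for x in descriptors]
    total = sum(weights)
    return sum(w * e * e for w, e in zip(weights, loo_residuals)) / total


def knn_press_variance(query, descriptors, loo_residuals, k=2):
    """Mean squared residual over the k training compounds nearest to the query."""
    ranked = sorted(zip(descriptors, loo_residuals),
                    key=lambda pair: euclidean(query, pair[0]))
    nearest = ranked[:k]
    return sum(e * e for _, e in nearest) / len(nearest)


# Hypothetical toy data: two well-predicted compounds and two poorly predicted ones.
descriptors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
loo_residuals = [0.1, -0.1, 1.0, -1.0]

global_var = global_press_variance(loo_residuals)
near_good = weighted_press_variance((0.0, 0.0), descriptors, loo_residuals)
near_bad = weighted_press_variance((5.0, 5.0), descriptors, loo_residuals)
knn_good = knn_press_variance((0.0, 0.0), descriptors, loo_residuals, k=2)
knn_bad = knn_press_variance((5.0, 5.0), descriptors, loo_residuals, k=2)
```

In this toy setting the global estimate assigns one error variance to all compounds, while both localized estimates give a smaller variance to a query near the well-predicted compounds than to one near the poorly predicted compounds. Whether such localization yields more reliable predictions on real QSAR data is the empirical question raised above.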

[1] Sahlin U, Filipsson M, Öberg T., *A risk assessment perspective of current practice in characterizing uncertainties in QSAR regression predictions*, Mol. Inf. 2011; In press.

[2] Clark R., *DPRESS: Localizing estimates of predictive uncertainty*, J Cheminf. 2009; **1**:11.

(presenting author: Ullrika Sahlin)