
Validation statistics

This is a list of statistics provided by the validation service.

 

Meta information

| Value | Synonym | Description |
| --- | --- | --- |
| training_dataset_uri |  | URI of the training dataset |
| test_dataset_uri |  | URI of the test dataset |
| prediction_dataset_uri |  | Dataset that contains the model predictions |
| prediction_feature |  | Predicted feature |
| uri |  | URI of the validation object itself |
| model_uri |  | URI of the validated model |
| real_runtime |  | Time needed for the validation |
| id |  | ID of the validation object |
| created_at |  | Creation timestamp of the validation object |

General validation information

| Value | Synonym | Description |
| --- | --- | --- |
| num_instances |  | Number of instances in the test dataset |
| num_without_class |  | Number of instances with missing class values |
| percent_without_class |  | Percentage of instances with missing class values (0-100) |
| num_unpredicted |  | Number of instances that could not be predicted |
| percent_unpredicted |  | Percentage of instances that could not be predicted (0-100) |
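The counts above can be sketched in a few lines of Python. This is a minimal illustration, assuming that missing class values and failed predictions are both represented as None; the function name and data layout are illustrative, not the service's actual API:

```python
def general_stats(actual, predicted):
    """Sketch of the general validation counts over a test dataset."""
    n = len(actual)  # num_instances
    # None in the actual values marks a missing class value (assumption)
    num_without_class = sum(1 for y in actual if y is None)
    # None in the predictions marks an unpredicted instance (assumption)
    num_unpredicted = sum(1 for f in predicted if f is None)
    return {
        "num_instances": n,
        "num_without_class": num_without_class,
        "percent_without_class": 100.0 * num_without_class / n,
        "num_unpredicted": num_unpredicted,
        "percent_unpredicted": 100.0 * num_unpredicted / n,
    }
```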

Classification information

| Value | Synonym | Description |
| --- | --- | --- |
| num_correct |  | Number of correctly classified instances |
| num_incorrect |  | Number of incorrectly classified instances |
| percent_correct | accuracy | accuracy = num_correct / num_predictions, i.e. in terms of accuracy, non-classified instances are NOT counted as misclassifications. percent_correct is given as a percentage (0-100), while accuracy lies between 0 and 1. |
| percent_incorrect |  | Percentage of incorrectly classified instances (0-100) |
| weighted_area_under_roc |  | Mean of the area_under_roc values of all class values, weighted by the number of instances with the respective actual class value |
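The accuracy convention above (unpredicted instances excluded from the denominator) can be illustrated with a short Python sketch; the function name is hypothetical, and None marks an unpredicted instance:

```python
def accuracy_stats(actual, predicted):
    """Sketch of accuracy/percent_correct over predicted instances only."""
    # Keep only instances that actually received a prediction: unpredicted
    # (None) instances do not count as misclassifications.
    pairs = [(y, f) for y, f in zip(actual, predicted) if f is not None]
    num_predictions = len(pairs)
    num_correct = sum(1 for y, f in pairs if y == f)
    accuracy = num_correct / num_predictions
    return {
        "num_correct": num_correct,
        "num_incorrect": num_predictions - num_correct,
        "accuracy": accuracy,              # between 0 and 1
        "percent_correct": 100.0 * accuracy,  # between 0 and 100
    }
```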

Classification information (each value available once for each class value)

| Value | Synonym | Description |
| --- | --- | --- |
| area_under_roc |  | Area under the ROC curve |
| f_measure |  | Harmonic mean of precision and recall: 2 · precision · recall / (precision + recall) |
| precision | positive-predictive-value (PPV) | num_true_positives / (num_true_positives + num_false_positives) |
| num_false_positives |  | Number of false positive predictions |
| num_false_negatives |  | Number of false negative predictions |
| num_true_positives |  | Number of true positive predictions |
| num_true_negatives |  | Number of true negative predictions |
| true_negative_rate | specificity | num_true_negatives / (num_true_negatives + num_false_positives) |
| true_positive_rate | sensitivity, recall | num_true_positives / (num_true_positives + num_false_negatives) |
| false_negative_rate |  | num_false_negatives / (num_false_negatives + num_true_positives) |
| false_positive_rate |  | num_false_positives / (num_false_positives + num_true_negatives) |
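All of the per-class rates derive from the four confusion counts. The following minimal Python sketch (the helper name is hypothetical) computes them from plain counts:

```python
def class_metrics(tp, fp, tn, fn):
    """Per-class rates from the true/false positive/negative counts."""
    precision = tp / (tp + fp)      # positive predictive value (PPV)
    recall = tp / (tp + fn)         # true_positive_rate / sensitivity
    specificity = tn / (tn + fp)    # true_negative_rate
    return {
        "precision": precision,
        "true_positive_rate": recall,
        "true_negative_rate": specificity,
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        # harmonic mean of precision and recall
        "f_measure": 2 * precision * recall / (precision + recall),
    }
```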

Classification confusion matrix (each value available once for each pair of class values)

| Value | Synonym | Description |
| --- | --- | --- |
| confusion_matrix_predicted |  | Predicted class value |
| confusion_matrix_actual |  | Actual class value |
| confusion_matrix_value |  | Number of instances with the above actual/predicted class values |
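The confusion matrix entries can be sketched as one count per (actual, predicted) pair. A minimal Python illustration (function name and output layout are assumptions, not the service's actual representation):

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """One entry per observed (actual, predicted) class-value pair."""
    counts = Counter(zip(actual, predicted))
    return [
        {
            "confusion_matrix_actual": a,
            "confusion_matrix_predicted": p,
            "confusion_matrix_value": v,  # number of instances with this pair
        }
        for (a, p), v in sorted(counts.items())
    ]
```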

Regression information

| Value | Synonym | Description |
| --- | --- | --- |
| root_mean_squared_error | RMSE, PRESS | Sqrt( Sum(i=1 to n){(yi − fi)²} / n ). RMSE_CV and S_PRESS (mean and standard deviation of root_mean_squared_error) are available in the crossvalidation report. |
| weighted_root_mean_squared_error |  | Each squared prediction error is weighted according to the confidence ci: Sqrt( Sum(i=1 to n){(yi − fi)² · ci} / (n · cmean) ) |
| mean_absolute_error | MAE | Sum(i=1 to n){abs(yi − fi)} / n |
| weighted_mean_absolute_error |  | Each absolute prediction error is weighted according to the confidence ci: Sum(i=1 to n){abs(yi − fi) · ci} / (n · cmean) |
| sum_squared_error | residual_sum_of_squares, SS_ERR | Sum(i=1 to n){(yi − fi)²} |
| total_sum_of_squares | SS_TOT | Sum(i=1 to n){(yi − ymean)²} |
| r_square |  | 1 − SS_ERR / SS_TOT = 1 − Sum(i=1 to n){(yi − fi)²} / Sum(i=1 to n){(yi − ymean)²} (see http://web.maths.unsw.edu.au/~adelle/Garvan/Assays/GoodnessOfFit.html and http://en.wikipedia.org/wiki/Coefficient_of_determination#Definitions). How can R² be negative? See http://www.graphpad.com/faq/viewfaq.cfm?faq=711 |
| weighted_r_square |  | R² with confidence-weighted predictions: 1 − Sum(i=1 to n){(yi − fi)² · ci} / Sum(i=1 to n){(yi − ymean)² · ci} |
| target_variance_actual |  | Variance of the actual endpoint values: 1 / (n−1) · Sum(i=1 to n){(yi − ymean)²} |
| target_variance_predicted |  | Variance of the predicted endpoint values: 1 / (n−1) · Sum(i=1 to n){(fi − fmean)²} |
| sample_correlation_coefficient |  | Pearson's product-moment correlation coefficient (see http://en.wikipedia.org/wiki/Correlation_and_dependence#Pearson.27s_product-moment_coefficient) |
| concordance_correlation_coefficient |  | Defined in http://ukpmc.ac.uk/abstract/MED/2720055: 2 · Sum(i=1 to n){(yi − ymean)(fi − fmean)} / ( Sum(i=1 to n){(yi − ymean)²} + Sum(i=1 to n){(fi − fmean)²} + n · (ymean − fmean)² ) |
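The regression statistics all reduce to a few sums over the actual values yi, predictions fi, and confidences ci. The following minimal Python sketch of the formulas above is illustrative (the function name and input layout are assumptions):

```python
import math

def regression_stats(actual, predicted, confidence):
    """Sketch of the regression statistics from the formulas above."""
    n = len(actual)
    y_mean = sum(actual) / n
    f_mean = sum(predicted) / n
    c_mean = sum(confidence) / n
    sq = [(y - f) ** 2 for y, f in zip(actual, predicted)]
    ab = [abs(y - f) for y, f in zip(actual, predicted)]
    ss_err = sum(sq)                                    # sum_squared_error
    ss_tot = sum((y - y_mean) ** 2 for y in actual)     # total_sum_of_squares
    s_yf = sum((y - y_mean) * (f - f_mean)
               for y, f in zip(actual, predicted))
    s_ff = sum((f - f_mean) ** 2 for f in predicted)
    return {
        "root_mean_squared_error": math.sqrt(ss_err / n),
        "weighted_root_mean_squared_error":
            math.sqrt(sum(s * c for s, c in zip(sq, confidence)) / (n * c_mean)),
        "mean_absolute_error": sum(ab) / n,
        "weighted_mean_absolute_error":
            sum(a * c for a, c in zip(ab, confidence)) / (n * c_mean),
        "sum_squared_error": ss_err,
        "r_square": 1 - ss_err / ss_tot,
        "concordance_correlation_coefficient":
            2 * s_yf / (ss_tot + s_ff + n * (y_mean - f_mean) ** 2),
    }
```

With uniform confidences, the weighted variants coincide with their unweighted counterparts, since ci / cmean = 1 for every instance.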

Crossvalidation information

| Value | Synonym | Description |
| --- | --- | --- |
| crossvalidation_uri |  | URI of the crossvalidation (if available) |
| crossvalidation_fold |  | Fold of the crossvalidation (if available) |
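The note on RMSE_CV and S_PRESS in the regression section says they are the mean and standard deviation of the per-fold root_mean_squared_error values. Assuming that mapping (RMSE_CV as the mean, S_PRESS as the sample standard deviation, which the source does not state explicitly), the aggregation can be sketched as:

```python
import statistics

def crossvalidation_rmse_report(fold_rmses):
    """Aggregate per-fold RMSE values into the crossvalidation report
    statistics (assumed mapping: mean -> RMSE_CV, stdev -> S_PRESS)."""
    return {
        "RMSE_CV": statistics.mean(fold_rmses),
        "S_PRESS": statistics.stdev(fold_rmses),
    }
```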
