Validation statistics
This is a list of statistics provided by the validation service.
Meta information

| Value | Synonym | Description |
|-------|---------|-------------|
| training_dataset_uri | | URI of the training dataset |
| test_dataset_uri | | URI of the test dataset |
| prediction_dataset_uri | | Dataset that contains the model predictions |
| prediction_feature | | Predicted feature |
| uri | | URI of the validation object itself |
| model_uri | | URI of the validated model |
| real_runtime | | Time needed for the validation |
| id | | ID of the validation object |
| created_at | | Time when the validation object was created |
General validation information

| Value | Synonym | Description |
|-------|---------|-------------|
| num_instances | | Number of instances in the test dataset |
| num_without_class | | Number of instances with missing class values |
| percent_without_class | | Percentage of instances with missing class values |
| num_unpredicted | | Number of instances that could not be predicted by the model |
| percent_unpredicted | | Percentage of instances that could not be predicted by the model |
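The counts and percentages above can be sketched as follows (a minimal illustration assuming the class values are available as a plain Python list, with `None` marking a missing value; this is not the validation service's implementation):

```python
# Illustration only: counting instances with missing class values.
# `actual` is a hypothetical list of class values; None marks a missing value.
actual = ["active", None, "inactive", "active"]

num_instances = len(actual)
num_without_class = sum(1 for v in actual if v is None)
percent_without_class = 100.0 * num_without_class / num_instances

# 4 instances, 1 without a class value -> 25.0 percent
```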
Classification information

| Value | Synonym | Description |
|-------|---------|-------------|
| num_correct | | Number of correctly classified instances |
| num_incorrect | | Number of incorrectly classified instances |
| percent_correct | accuracy | accuracy = num_correct / num_predictions, i.e. in terms of accuracy, unpredicted instances are NOT counted as misclassifications. percent_correct is given in percent (0-100), while accuracy is given between 0 and 1. |
| percent_incorrect | | Percentage of incorrectly classified instances |
| weighted_area_under_roc | | Weighted mean (weighted by the number of instances with the actual class value) of the area_under_roc values of all class values |
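A minimal sketch of how percent_correct/accuracy treat unpredicted instances, per the description above (assuming plain Python lists, with `None` marking an unpredicted instance; not the service's actual code):

```python
def classification_stats(actual, predicted):
    """actual: list of class values; predicted: list of class values,
    with None marking an instance the model could not predict."""
    # Only predicted instances count towards accuracy.
    pairs = [(a, p) for a, p in zip(actual, predicted) if p is not None]
    num_predictions = len(pairs)
    num_correct = sum(1 for a, p in pairs if a == p)
    num_incorrect = num_predictions - num_correct
    accuracy = num_correct / num_predictions            # 0-1
    return {
        "num_correct": num_correct,
        "num_incorrect": num_incorrect,
        "accuracy": accuracy,
        "percent_correct": 100.0 * accuracy,            # 0-100
        "percent_incorrect": 100.0 * num_incorrect / num_predictions,
        "num_unpredicted": len(actual) - num_predictions,
    }

stats = classification_stats(
    ["a", "a", "b", "b", "a"],
    ["a", "b", "b", None, "a"],
)
# 4 predictions, 3 correct -> accuracy 0.75, percent_correct 75.0
```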
Classification information (each value available once for each class value)

| Value | Synonym | Description |
|-------|---------|-------------|
| area_under_roc | | Area under the ROC curve |
| f_measure | | Harmonic mean of precision and recall: 2 * precision * recall / (precision + recall) |
| precision | positive-predictive-value (PPV) | num_true_positives / (num_true_positives + num_false_positives) |
| num_false_positives | | Number of false positives |
| num_false_negatives | | Number of false negatives |
| num_true_positives | | Number of true positives |
| num_true_negatives | | Number of true negatives |
| true_negative_rate | specificity | num_true_negatives / (num_true_negatives + num_false_positives) |
| true_positive_rate | sensitivity, recall | num_true_positives / (num_true_positives + num_false_negatives) |
| false_negative_rate | | num_false_negatives / (num_false_negatives + num_true_positives) |
| false_positive_rate | | num_false_positives / (num_false_positives + num_true_negatives) |
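The per-class counts and rates above can be sketched as follows (a sketch assuming plain Python lists, treating one class value as the positive class; not the service's implementation):

```python
def per_class_stats(actual, predicted, class_value):
    """Per-class counts and rates, with `class_value` as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == class_value and p == class_value)
    fp = sum(1 for a, p in pairs if a != class_value and p == class_value)
    fn = sum(1 for a, p in pairs if a == class_value and p != class_value)
    tn = sum(1 for a, p in pairs if a != class_value and p != class_value)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "num_true_positives": tp,
        "num_false_positives": fp,
        "num_false_negatives": fn,
        "num_true_negatives": tn,
        "precision": precision,                 # positive predictive value
        "true_positive_rate": recall,           # sensitivity, recall
        "true_negative_rate": tn / (tn + fp),   # specificity
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "f_measure": 2 * precision * recall / (precision + recall),
    }

stats_a = per_class_stats(["a", "a", "a", "b", "b"],
                          ["a", "a", "b", "a", "b"], "a")
# tp=2, fp=1, fn=1, tn=1 -> precision = recall = f_measure = 2/3
```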
Classification confusion matrix (each value available once for each pair of class values)

| Value | Synonym | Description |
|-------|---------|-------------|
| confusion_matrix_predicted | | Predicted class value |
| confusion_matrix_actual | | Actual class value |
| confusion_matrix_value | | Number of instances with the above actual/predicted class values |
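A sketch of how the confusion matrix entries relate to the predictions, one entry per (actual, predicted) pair of class values (illustration only, assuming plain Python lists):

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """One entry per (actual, predicted) pair of class values,
    mirroring the confusion_matrix_* values above."""
    class_values = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    return [
        {"confusion_matrix_actual": a,
         "confusion_matrix_predicted": p,
         "confusion_matrix_value": counts[(a, p)]}
        for a in class_values
        for p in class_values
    ]

cm = confusion_matrix(["a", "a", "b"], ["a", "b", "b"])
# 2 class values -> 4 entries; e.g. actual "b" / predicted "a" occurs 0 times
```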
Regression information

| Value | Synonym | Description |
|-------|---------|-------------|
| root_mean_squared_error | RMSE, PRESS | Sqrt( Sum(i=1 to n){(yi - fi)²} / n ). RMSE_CV and S_PRESS are available in the crossvalidation report (mean and standard deviation of root_mean_squared_error). |
| weighted_root_mean_squared_error | | Each squared compound prediction error is weighted according to the confidence: Sqrt( Sum(i=1 to n){(yi - fi)² * ci} / (n * cmean) ) |
| mean_absolute_error | MAE | Sum(i=1 to n){abs(yi - fi)} / n |
| weighted_mean_absolute_error | | Each compound prediction error is weighted according to the confidence: Sum(i=1 to n){abs(yi - fi) * ci} / (n * cmean) |
| sum_squared_error | residual_sum_of_squares, SS_ERR | Sum(i=1 to n){(yi - fi)²} |
| total_sum_of_squares | SS_TOT | Sum(i=1 to n){(yi - ymean)²} |
| r_square | R² | 1 - SS_ERR / SS_TOT = 1 - Sum(i=1 to n){(yi - fi)²} / Sum(i=1 to n){(yi - ymean)²} (see http://web.maths.unsw.edu.au/~adelle/Garvan/Assays/GoodnessOfFit.html, http://en.wikipedia.org/wiki/Coefficient_of_determination#Definitions). How can R² be negative? See http://www.graphpad.com/faq/viewfaq.cfm?faq=711 |
| weighted_r_square | | r² with confidence-weighted predictions: 1 - Sum(i=1 to n){(yi - fi)² * ci} / Sum(i=1 to n){(yi - ymean)² * ci} |
| target_variance_actual | | Variance of the actual endpoint values: 1 / (n-1) * Sum(i=1 to n){(yi - ymean)²} |
| target_variance_predicted | | Variance of the predicted endpoint values: 1 / (n-1) * Sum(i=1 to n){(fi - fmean)²} |
| sample_correlation_coefficient | | Pearson product-moment correlation coefficient (defined e.g. in http://en.wikipedia.org/wiki/Correlation_and_dependence#Pearson.27s_product-moment_coefficient) |
| concordance_correlation_coefficient | | Defined in http://ukpmc.ac.uk/abstract/MED/2720055: ( 2 * Sum(i=1 to n){(yi - ymean)(fi - fmean)} ) / ( Sum(i=1 to n){(yi - ymean)²} + Sum(i=1 to n){(fi - fmean)²} + n * (ymean - fmean)² ) |
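The unweighted regression statistics above can be sketched as follows (a minimal illustration assuming plain Python lists of actual values `y` and predicted values `f`; the confidence-weighted variants are omitted, and this is not the service's implementation):

```python
from math import sqrt

def regression_stats(y, f):
    """Unweighted regression statistics from the table above."""
    n = len(y)
    ymean = sum(y) / n
    fmean = sum(f) / n
    ss_err = sum((yi - fi) ** 2 for yi, fi in zip(y, f))        # SS_ERR
    ss_tot = sum((yi - ymean) ** 2 for yi in y)                 # SS_TOT
    s_yf = sum((yi - ymean) * (fi - fmean) for yi, fi in zip(y, f))
    s_ff = sum((fi - fmean) ** 2 for fi in f)
    return {
        "root_mean_squared_error": sqrt(ss_err / n),
        "mean_absolute_error": sum(abs(yi - fi) for yi, fi in zip(y, f)) / n,
        "sum_squared_error": ss_err,
        "total_sum_of_squares": ss_tot,
        "r_square": 1 - ss_err / ss_tot,
        "target_variance_actual": ss_tot / (n - 1),
        "target_variance_predicted": s_ff / (n - 1),
        "sample_correlation_coefficient": s_yf / sqrt(ss_tot * s_ff),
        "concordance_correlation_coefficient":
            2 * s_yf / (ss_tot + s_ff + n * (ymean - fmean) ** 2),
    }

s = regression_stats([1, 2, 3, 4], [1, 2, 3, 5])
# ss_err = 1, ss_tot = 5 -> RMSE = 0.5, MAE = 0.25, r² = 0.8
```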
Crossvalidation information

| Value | Synonym | Description |
|-------|---------|-------------|
| crossvalidation_uri | | URI of the crossvalidation (if available) |
| crossvalidation_fold | | Fold of the crossvalidation (if available) |